export/import question outlook

  • Thread starter Thread starter WebBuilder451
  • Start date Start date
W

WebBuilder451

well, it is a web system i'm building and this is the place where i get the
best respone to questions...........
I'm attempting to read and parse exported outlook contacts and i notice
excel has no problem figuring out columns and fields, but i'm having trouble
with extra CR/LF in fields. Does anyone know how excel figures this out? Is
there an extra char that signifies end of line? or is it parsing csv files in
a unique way?
 
Hi Mark, I appreciate your help (again)

Fair question. Some how, because i suspect excel is better at opening csv
files correctly, the exported outlook csv file opens up with all fields
matching correctly with their column headers. When i attempt to read the file
i'm encountering CR/LR that are throwing my import off. I've tried three ways
and i still get messed up with the CR/LR that appear in the middle of the
fields:

String[] str =
File.ReadAllLines(@"c:\websites\outlookImportTest\website\Files\uploaded.csv");
//give me uneven records for the reason's noted above.
String[] fr = str[0].Split((char)10);
// will give me the correct column headers
// but when i dump the lines i get more records than i started with

// i tried
var strX =
File.ReadAllText(@"c:\websites\outlookImportTest\website\Files\uploaded.csv");
// and attempt to find the end of line but i get the same problem, no
supprise here i know.

// i can read by byte, which i notice is very slow, but i still can fine a
way to count out the fields correctly.

So during this process i opened the csv file in excel to see if it were
corrupted. Excel had no problem. Hence my question, if i knew how excel does
it i can use the same logic.
note: Attempts to replace the CR/LF's leave me with a single flat record
--
thanks

kes
 
WebBuilder451 laid this down on his screen :
well, it is a web system i'm building and this is the place where i get the
best respone to questions...........
I'm attempting to read and parse exported outlook contacts and i notice
excel has no problem figuring out columns and fields, but i'm having trouble
with extra CR/LF in fields. Does anyone know how excel figures this out? Is
there an extra char that signifies end of line? or is it parsing csv files in
a unique way?

When you look at the csv file with notepad, you will probably see more
lines than original records (ignoring the extra header). A record tat
has fields that contain a newline is split over two (or more) lines.

What about the next strategy?
1. read the first line, containing the headers. This not only gives you
the names of the fields, but also the *number* of fields.
2. read a line
3. try to match the fields in the line to the headers.
4. If not all fields have values (you know how many to expect), then
the last filled field is really partially filled! Add a CRLF, *read the
next line* and *continue* processing this *same* record, reading more
lines as needed
5. repeat at step 2 until the file is done

Hans Kesting
 
I recall I saw somewhere on codeproject.com (may be
http://www.codeplex.com/ ) normal CSV parser.
It supports commas in the data, Double quotes. Like "AAA""BBB" in data is
actually AAA"BBB



George.

Mark Rae said:
So during this process I opened the csv file in excel to see if it were
corrupted. Excel had no problem. Hence my question, if i knew how Excel
does
it I can use the same logic.

string strLineFromFile = String.Empty;
string[] strSplitLine;

using (StreamReader objSR = new
StreamReader(@"c:\websites\outlookImportTest\website\Files\uploaded.csv");
{
while ((strLineFromFile = objSR.ReadLine()) != null)
{
strSplitLine = strLineFromFile.Split(',');
//string strFirstField = strSplitLine[0].Trim(); // etc
}
}

Watch out for commas in the actual data...
 
Back
Top