trouble reading CSV file with Jet OLEDB

  • Thread starter Thread starter Andy Fish
  • Start date Start date
A

Andy Fish

Hi,

I am reading a CSV file using the provider "Microsoft.Jet.OLEDB.4.0;Data".
my code is the same as a million and one examples I have seen:

OleDbConnection conn = new
OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename +
";Extended Properties=\"text;HDR=No;FMT=Delimited\"");
OleDbCommand cmd = new OleDbCommand("SELECT * FROM " +
Path.GetFileName(filename), conn);
OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
objAdapter1.SelectCommand = cmd;
DataSet ds = new DataSet();
objAdapter1.Fill(ds, "CSVData");

mostly it works, but some specific cases are giving me problems. for
instance, the following 2-line CSV file

foo,bar
1,a

ds.Tables["CSVData"].Rows[0][0] returns a System.DBNull object (should be
"foo"), whereas the other 3 values in the file appear correctly.

I guess it is something to do with the fact that the value in the second
line happens to be numeric, because if I put 2 text values in the second
line, the first one is read correctly. Note that I am using "HDR=No" in the
extended properties to indicate that the file has no header row. This is
just one example - other screwy things happen when the file contains values
which could be interpreted as dates and basically it's just very unreliable.

Is there any way of turning of this "clever" (i.e. nondeterministic)
behaviour. I just want to read a CSV file and get the values that are in it
as strings.

TIA

Andy
 
¤ Hi,
¤
¤ I am reading a CSV file using the provider "Microsoft.Jet.OLEDB.4.0;Data".
¤ my code is the same as a million and one examples I have seen:
¤
¤ OleDbConnection conn = new
¤ OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + filename +
¤ ";Extended Properties=\"text;HDR=No;FMT=Delimited\"");
¤ OleDbCommand cmd = new OleDbCommand("SELECT * FROM " +
¤ Path.GetFileName(filename), conn);
¤ OleDbDataAdapter objAdapter1 = new OleDbDataAdapter();
¤ objAdapter1.SelectCommand = cmd;
¤ DataSet ds = new DataSet();
¤ objAdapter1.Fill(ds, "CSVData");
¤
¤ mostly it works, but some specific cases are giving me problems. for
¤ instance, the following 2-line CSV file
¤
¤ foo,bar
¤ 1,a
¤
¤ ds.Tables["CSVData"].Rows[0][0] returns a System.DBNull object (should be
¤ "foo"), whereas the other 3 values in the file appear correctly.
¤
¤ I guess it is something to do with the fact that the value in the second
¤ line happens to be numeric, because if I put 2 text values in the second
¤ line, the first one is read correctly. Note that I am using "HDR=No" in the
¤ extended properties to indicate that the file has no header row. This is
¤ just one example - other screwy things happen when the file contains values
¤ which could be interpreted as dates and basically it's just very unreliable.
¤
¤ Is there any way of turning of this "clever" (i.e. nondeterministic)
¤ behaviour. I just want to read a CSV file and get the values that are in it
¤ as strings.

Yes, create a schema.ini file and define the layout of your text file:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetschema_ini_file.asp


Paul
~~~~
Microsoft MVP (Visual Basic)
 
Hi Andy,

If you do not have a schema.ini file defined for your CSV content, then
provider tries to predict datatype of the columns based on a scanning result
of the first 25 rows (I believe this is correct number). It looks like your
column was defined as a numeric one and for all non-numeric values in this
column provider just returns NULL. If you create Schema.ini file, then you
could specify number of rows to scan to define type of the column. Just set
it to 0, define column as string type and it will work
 
Back
Top