OleDbDataAdapter for csv/Excel

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi Guys

I just want to let you know that my colleagues and I in the financial industry spend so much time trying to get data from csv/excel files into DataTables. Time and again, I find that OleDb provider fails us in so far as its interpretation of data type. Just today, I found that my "stable" file parser failed for a field. I predefined a DataTable and all columns so if there's any data type mismatch in the file, the user will know about it right away.

This morning, a file came in with a column with 60 rows that's all integer except the last 5 rows, where the data is floats

My DataTable defines this column to accept data of type double. OleDb sees the first 55 rows as int so it truncates all data values as it puts the data into my DataTable, causing miscalculations to happen later. Of course, int can fit into a column of type double, the problem goes unnoticed until the data is audited -- it's too late and too costly to fix then

Other than manually parsing these files (and there are many file formats in my organization), is there anything in the ADO.NET stack that can help with my *very common* situation

Thanks
Paul
 
¤ Hi Guys,
¤
¤ I just want to let you know that my colleagues and I in the financial industry spend so much time trying to get data from csv/excel files into DataTables. Time and again, I find that OleDb provider fails us in so far as its interpretation of data type. Just today, I found that my "stable" file parser failed for a field. I predefined a DataTable and all columns so if there's any data type mismatch in the file, the user will know about it right away.
¤
¤ This morning, a file came in with a column with 60 rows that's all integer except the last 5 rows, where the data is floats.
¤
¤ My DataTable defines this column to accept data of type double. OleDb sees the first 55 rows as int so it truncates all data values as it puts the data into my DataTable, causing miscalculations to happen later. Of course, int can fit into a column of type double, the problem goes unnoticed until the data is audited -- it's too late and too costly to fix then.
¤
¤ Other than manually parsing these files (and there are many file formats in my organization), is there anything in the ADO.NET stack that can help with my *very common* situation?

Are you using schema.ini files for your Text files? If you don't, the ISAM driver attempts to
determine data types based upon row scanning. It isn't an exact science. For more info:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/odbcjetsdk_98.asp

Excel can pose a different problem, especially when you have mixed mode (numeric and alphanumeric)
data in a column. If you encounter Null values where there should be data, add the IMEX argument to
your connection string:

ConnectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=e:\My Documents\Book20.xls;Extended Properties=""Excel 8.0;HDR=NO;IMEX=1;"""


Paul ~~~ (e-mail address removed)
Microsoft MVP (Visual Basic)
 
Back
Top