Specifying format of DateTime columns read from CSV files

  • Thread starter Thread starter Joel Gordon
  • Start date Start date
J

Joel Gordon

Hi,

I am reading a CSV file into a DataTable with the following code :


internal static DataTable ReadFromCSVFile( string csvFileName ) {
if ( !File.Exists( csvFileName ) ) {
throw new ApplicationException( " Cannot find CSV file : \"" +
csvFileName + "\"." );
}
string csvPath = Path.GetDirectoryName( csvFileName );
OleDbConnection connection =
new OleDbConnection( "Provider=Microsoft.Jet.OLEDB.4.0;" +
"Data Source=" + csvPath + ";" +
"Extended Properties=\"Text;HDR=Yes\"" );
return ReadDataTable( connection, Path.GetFileName(csvFileName) );
}


private static DataTable ReadDataTable( OleDbConnection connection,
string tableName ) {
connection.Open();
try {
// Read all data into a data table.
OleDbDataAdapter adapter =
new OleDbDataAdapter( "SELECT * FROM [" + tableName + "]",
connection );
DataTable dataTable = new System.Data.DataTable( tableName );
adapter.Fill( dataTable );
return dataTable;
} finally {
connection.Close();
}
}

How do I specify the format of columns containing DateTime values ?

If my CSV file contains the following :
Ints,Doubles,Dates1,Dates2
34,34.5,01/02/2003,12:43:00 p.m.,01/02/2003

Then the "Dates2" column in the DataTables is of type DateTime, but the
"Dates1" column is of type string.



Thanks in advance,
Joel Gordon.
 
Hi Joel,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to put all the data in Date1
and Date2 into DateTime columns. If there is any misunderstanding, please
feel free to let me know.

I think we can use typed DataSet to achieve this.

1. We create a typed DataSet with a DataTable which contains all the
columns with the column data type set.
2. Create the OleDbDataAdapter as you did in your last post.

OleDbDataAdapter adapter = new OleDbDataAdapter( "SELECT * FROM ["
+ tableName + "]", connection );

3. Add a DataTableMapping object with each column mapping to the data
adapter before filling data.

this.sqlDataAdapter1.TableMappings.AddRange(new
System.Data.Common.DataTableMapping[] {
new System.Data.Common.DataTableMapping("Table", "Table1", new
System.Data.Common.DataColumnMapping[] {
new System.Data.Common.DataColumnMapping("Ints", "IntsCol"),
new System.Data.Common.DataColumnMapping("Doubles", "DoublesCol"),
new System.Data.Common.DataColumnMapping("Dates1", "Dates2Col"),
new System.Data.Common.DataColumnMapping("Dates2", "Dates2Col")})});

That will put all the data from source table to the corresponding
destination columns with data type you have defined in typed DataSet schema.

HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Kevin said:
Hi Joel,

First of all, I would like to confirm my understanding of your issue.
From your description, I understand that you need to put all the data
in Date1 and Date2 into DateTime columns. If there is any
misunderstanding, please feel free to let me know.

I think we can use typed DataSet to achieve this.

1. We create a typed DataSet with a DataTable which contains all the
columns with the column data type set.
2. Create the OleDbDataAdapter as you did in your last post.

OleDbDataAdapter adapter = new OleDbDataAdapter( "SELECT *
FROM [" + tableName + "]", connection );

3. Add a DataTableMapping object with each column mapping to the data
adapter before filling data.

this.sqlDataAdapter1.TableMappings.AddRange(new
System.Data.Common.DataTableMapping[] {
new System.Data.Common.DataTableMapping("Table", "Table1", new
System.Data.Common.DataColumnMapping[] {
new System.Data.Common.DataColumnMapping("Ints", "IntsCol"),
new System.Data.Common.DataColumnMapping("Doubles", "DoublesCol"),
new System.Data.Common.DataColumnMapping("Dates1", "Dates2Col"),
new System.Data.Common.DataColumnMapping("Dates2", "Dates2Col")})});

That will put all the data from source table to the corresponding
destination columns with data type you have defined in typed DataSet
schema.

HTH.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Hi Kevin,

Thanks very much for the response.
From your description, I understand that you need to put all the data
in Date1 and Date2 into DateTime columns.
That's correct.

Unfortunately your solution assumes that the code knows the names and
types of the columns in the CSV file (I am trying to use the Jet OLEDB
4.0 driver to read any arbitary CSV file into a DataTable).

In old posting in microsoft.public.dotnet.framework.odbcnet I did find
mention of using a schema.ini file when reading CSV files to specify
that the columns were tab delimited rather than comma delimited.

Questions :

1. Is it possible to specify the format of dates (possibly using the
schema.ini or programatically) so that dates of a certain form
(e.g.01/02/2003,12:43:00 p.m.) are recognized as dates ?

2. When writing a DataTable to a CSV file how do you specify how
DateTime columns (and columns of other types) are formatted when they
are written to a CSV file (I am attempting to use the Jet OLEDB 4.0
driver to write any arbitary DataTable to a CSV file) ?


Thanks,
Joel.
 
Hi Joel,

1. We cannot specify the format of dates to recognize certain form. The
only way to get column data as specified data type, is to use schema.ini,
which has to be applied at design time.

2. As far as I know, the text driver used to read CSV file is read-only,
which means that we cannot use update command to write to that file. So, to
output data in the DataTable to a CSV file, we have to go through each row
and each column and write to the text file directly and seperate data with
comma. When writing DateTime type, we can use DateTime.ToLongDateString,
DateTime.ToLongTimeString or DateTime.ToString(string, IFormatProvider); to
control the output format.

For more information, please check the following link:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/
frlrfsystemdatetimeclasstostringtopic.asp

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Kevin said:
Hi Joel,

1. We cannot specify the format of dates to recognize certain form. The
only way to get column data as specified data type, is to use schema.ini,
which has to be applied at design time.

2. As far as I know, the text driver used to read CSV file is read-only,
which means that we cannot use update command to write to that file. So, to
output data in the DataTable to a CSV file, we have to go through each row
and each column and write to the text file directly and seperate data with
comma. When writing DateTime type, we can use DateTime.ToLongDateString,
DateTime.ToLongTimeString or DateTime.ToString(string, IFormatProvider); to
control the output format.

For more information, please check the following link:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/cpref/html/
frlrfsystemdatetimeclasstostringtopic.asp

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Thanks for the info - where do I find documentation regarding how the Jet
Driver determines the types of columns read from a CSV file and also
documentation about the schema.ini file ?

With regards point 2 above - I did find the following code in an old posting in
microsoft.public.general.discussion (Subject: Save an acess table into Csv)
showing how you code write an access table to CSV in one go.

Sub ExportAccessToTextADO()

Dim cnn As New ADODB.Connection
Dim sqlString As String

cnn.Open _
"Provider=Microsoft.Jet.OLEDB.4.0;" & _
"Data Source=d:\My Documents\DB1.mdb;" & _
"Jet OLEDB:Engine Type=4;"

sqlString = "SELECT * INTO [Text;DATABASE=d:\My
Documents\TextFiles].[Plot.csv] FROM [tblPlot]"

cnn.Execute sqlString

cnn.Close
Set cnn = Nothing

End Sub


I was hoping a similar method could be used to write a DataTable to a CSV file?


Regards,
Joel.
 
Joel, the Jet Driver does not have any way to determine the Column Types ( Integer, String, etc.) because there is no standard
in CSV formating that allows including Field Types(Column Types) from the exporting application. The only thing that is
consistant, is that the first row is "supposed" to be the column names. That is why Access's importing wizard will ask you if
the first row contains Column Names. And all rows following are actual field data broken up by commas "," to show a new
field. Each field value is "supposed" to be enclosed in double Quotes. And the end of each row is marked with a Carriage Return
and/or Linefeed.
So, Jet sets everything to Type: String and a default field size of 255.
Otherwise, you have to go into the Table Designer and make the changes yourself. Or, do it in code.
(painful......)
james
 
Thanks for James's quick response!

Hi Joel,

You can try to set MaxScanRows value in connection string or in schema.ini
to a higher value to increase the row for the text driver to determine the
type. This seems to be the only thing we can do to affect the recognition.

The following is the document for schema.ini.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odbc/htm/od
bcjetschema_ini_file.asp

Unfortunately, the DataTable doesn't support SELECT INTO to put data to csv
files. I think we have to loop through the whole table and write to text
file directly.

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."
 
Back
Top