Loading UTF-8 CSV into ADO.NET Recordset

  • Thread starter: Guest

I have a program that loads a UTF-8 CSV file into an ADO recordset (when I
open the CSV file in Notepad, it shows that the file is UTF-8 encoded). It
works fine when the file contains only Latin characters, but not when the
file contains Japanese characters. How can I get the Japanese characters
loaded into the recordset from the CSV file?

Here's the CSV file:

"年","月","日","M01 = Compound 新規の メトリック (1)"
"1998","9月","9/2/1998","724.8996"
"1998","9月","9/3/1998","582.1596"
"1998","9月","9/4/1998","2382.8796"

Here's the schema.ini file:

[CSVFile.csv]
ColNameHeader=True
Format=CSVDelimited
MaxScanRows=2
Col1="年" Text
Col2="月" Text
Col3="日" Text
Col4="M01_=_Compound_新規の_メトリック_(1)" Double

Here's the current code:

// Jet text driver: Data Source is the folder; each file in it acts as a table.
string connect = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
    @"c:\temp" + ";Extended Properties=\"text;HDR=YES;FMT=Delimited\"";

ADODB.ConnectionClass adoConn = new ADODB.ConnectionClass();
adoConn.Open(connect, "", "", 0);

ADODB.RecordsetClass adoRS = new ADODB.RecordsetClass();

// The CSV file name is used as the table name.
string sourceData = @"Select * from CSVFile.csv";

adoRS.Open(sourceData, adoConn, ADODB.CursorTypeEnum.adOpenStatic,
    ADODB.LockTypeEnum.adLockReadOnly, (int)ADODB.CommandTypeEnum.adCmdText);
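
One thing that may be worth checking with this setup: the Jet text driver
takes the file's encoding from the CharacterSet setting (in schema.ini or
the registry default), not from the file itself, so a schema.ini without
that line is usually read with the ANSI code page. A commonly reported
adjustment is to declare the UTF-8 code page (65001) in schema.ini. The
fragment below is only a sketch of that idea; it assumes the installed Jet
version accepts CharacterSet=65001, and it has not been tested against this
particular file.

```ini
[CSVFile.csv]
ColNameHeader=True
Format=CSVDelimited
CharacterSet=65001
MaxScanRows=2
```

The Col1 to Col4 lines would stay as they are. Whether the Japanese column
names inside schema.ini are read correctly may also depend on how schema.ini
itself is saved, which is a separate thing to verify.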
 
You could try my CSV parser. The end result is not an ADO recordset,
but it works in much the same way, you can load it into a DataTable
with one line of code, and it is faster than the Jet connection
you're using.

http://www.geocities.com/shriop/index.html

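// Open the file itself (not the folder), comma-delimited, decoded as UTF-8.
using (CsvReader reader = new CsvReader(@"c:\temp\CSVFile.csv", ',',
    Encoding.UTF8))
{
    // Read the first record as the column names.
    reader.ReadHeaders();
}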
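
If pulling in a third-party parser is not an option, the same idea (decode
the file explicitly as UTF-8, then fill a DataTable) can be sketched with
the base class library alone. This is not the CsvReader library above; the
class name Utf8CsvLoader and its two methods are made up for illustration.
It assumes comma-delimited fields, optional double quotes with "" as the
escape, no line breaks inside a field, unique header names, and no row with
more fields than the header.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Text;

// Hypothetical helper, not part of any library mentioned in this thread.
public static class Utf8CsvLoader
{
    // Splits one CSV line into fields, handling quoted fields and "" escapes.
    // Assumption: a field never contains an embedded line break.
    public static List<string> SplitLine(string line)
    {
        List<string> fields = new List<string>();
        StringBuilder current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (inQuotes)
            {
                if (c == '"' && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // doubled quote inside a quoted field
                    i++;
                }
                else if (c == '"')
                {
                    inQuotes = false;    // closing quote
                }
                else
                {
                    current.Append(c);
                }
            }
            else if (c == '"')
            {
                inQuotes = true;         // opening quote
            }
            else if (c == ',')
            {
                fields.Add(current.ToString());
                current.Length = 0;      // start the next field
            }
            else
            {
                current.Append(c);
            }
        }
        fields.Add(current.ToString());
        return fields;
    }

    // Reads the file as UTF-8; the first line supplies the column names
    // and every column is stored as a string.
    public static DataTable Load(string path)
    {
        DataTable table = new DataTable();
        using (StreamReader reader = new StreamReader(path, Encoding.UTF8))
        {
            string header = reader.ReadLine();
            if (header == null) return table; // empty file
            foreach (string name in SplitLine(header))
                table.Columns.Add(name, typeof(string));

            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (line.Length == 0) continue; // skip blank lines
                table.Rows.Add(SplitLine(line).ToArray());
            }
        }
        return table;
    }
}
```

Usage would be along the lines of
DataTable table = Utf8CsvLoader.Load(@"c:\temp\CSVFile.csv");
Every value arrives as text, so the fourth column would still need a
double.Parse (or a typed column) if numbers are required.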
 