N
Norman Diamond
With Visual Studio 2005 SP1 and DotNet Framework 2 SP1, Excel 2003 can read
a file that was saved by Excel 2003, but the OleDb driver can't.
Example: Excel row 2 maps to row 1 in OleDb, and Excel columns 1 to 24 (A
to X) map to columns 0 to 23 in OleDb. Counting these in OleDb, row 1
columns 0 to 4 are OK, columns 5 and 6 get corrupted to DbNull, columns 7 to
12 are OK, columns 13 and 14 get corrupted to DbNull, etc.
If I do Interop to Excel 2003 instead of using OleDb, it works. Counting
these in Excel interop, row 2 columns 1 to 24 are entirely OK.
Back to the OleDb problem. One worksheet has 40 columns instead of 24. The
worksheet with 40 columns works. Several of the additional columns read as
doubles instead of text but ToString() works on them. But all of the
24-column worksheets get corrupted by OleDb, and the corrupted column
numbers vary.
Does anyone know how to get OleDb to work?
OleDbConnection connection = new OleDbConnection(
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
fileName + ";Extended Properties='Excel 8.0;HDR=NO'");
connection.Open();
DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow["TABLE_NAME"].ToString();
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);
int headerRowNum = 1;
DataTable dataTable = new DataTable(tableName);
dataAdapter.FillSchema(dataTable, SchemaType.Source);
DataColumnCollection dataColumns = dataTable.Columns;
int columnCount = dataColumns.Count;
foreach (DataColumn dataColumn in dataColumns)
{
dataColumn.DataType = typeof(string);
}
dataAdapter.Fill(dataTable);
int rowCount = dataTable.Rows.Count;
if (!((rowCount > headerRowNum) &&
((columnCount == 24) || (columnCount == 40))))
{
dataTable.Dispose();
dataAdapter.Dispose();
continue;
}
DataRow headerRow = dataTable.Rows[headerRowNum];
// ... examine tableName in the debugger; it is correct including $ ...
// ... examine headerRow in the debugger; some elements are null ...
string x = (string)headerRow[5]; // crashes
Using Odbc, besides the obvious changes to the DotNet class names and the
connection string, one more change is this:
int headerRowNum = 0;
instead of
int headerRowNum = 1;
The reason of course is that HDR=NO in Odbc means HDR=YES.
So with those changes, I read the Excel file in Odbc instead of OleDb. The
corruption in Odbc is identical to the corruption in OleDb.
Does anyone know how to get either OleDb or Odbc to read an Excel file?
I think I have more news though. Although Excel 2003 saved the .xls file,
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here's
the reason: In all of the cells which I'm trying to interpret, either the
cell value was alphabetic, or the cell value had an apostrophe followed by
digits, so the cell values were already text. For some other cells, Excel
2003 displays them with a little triangle in one corner, and a tooltip
explains that the cell is text due to one of those reasons. However, for
the cells which I'm trying to interpret, Excel 2003 doesn't display that
triangle ... at first. However, if I put the mouse cursor in one of those
cells and double-click, but do not type anything, and then move the mouse
cursor to another cell and click, then suddenly Excel 2003 displays that
little triangle in one corner of the cell that I double-clicked, and a
tooltip explains that the cell is text due to one of those reasons.
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb starts
correctly reading the cell that I double-clicked.
So, was there a bug in Excel 97 or 2000 or XP, which corrupted the format
when saving some cells which were text due to having apostrophe plus digits,
but which saved some other cells correctly? And Excel 2003 contains a hack
to read corrupted .xls files correctly? But OleDb is missing that hack so
it gets all confused when reading files which were saved by Excel 97 or 2000
or XP (or saved by Excel 2003 without having double-clicked a corrupted
cell)?
Is there some way to get the same hack copied into OleDb so it will start
working?
a file that was saved by Excel 2003, but the OleDb driver can't.
Example: Excel row 2 maps to row 1 in OleDb, and Excel columns 1 to 24 (A
to X) map to columns 0 to 23 in OleDb. Counting these in OleDb, row 1
columns 0 to 4 are OK, columns 5 and 6 get corrupted to DbNull, columns 7 to
12 are OK, columns 13 and 14 get corrupted to DbNull, etc.
If I do Interop to Excel 2003 instead of using OleDb, it works. Counting
these in Excel interop, row 2 columns 1 to 24 are entirely OK.
Back to the OleDb problem. One worksheet has 40 columns instead of 24. The
worksheet with 40 columns works. Several of the additional columns read as
doubles instead of text but ToString() works on them. But all of the
24-column worksheets get corrupted by OleDb, and the corrupted column
numbers vary.
Does anyone know how to get OleDb to work?
OleDbConnection connection = new OleDbConnection(
"Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
fileName + ";Extended Properties='Excel 8.0;HDR=NO'");
connection.Open();
DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow["TABLE_NAME"].ToString();
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);
int headerRowNum = 1;
DataTable dataTable = new DataTable(tableName);
dataAdapter.FillSchema(dataTable, SchemaType.Source);
DataColumnCollection dataColumns = dataTable.Columns;
int columnCount = dataColumns.Count;
foreach (DataColumn dataColumn in dataColumns)
{
dataColumn.DataType = typeof(string);
}
dataAdapter.Fill(dataTable);
int rowCount = dataTable.Rows.Count;
if (!((rowCount > headerRowNum) &&
((columnCount == 24) || (columnCount == 40))))
{
dataTable.Dispose();
dataAdapter.Dispose();
continue;
}
DataRow headerRow = dataTable.Rows[headerRowNum];
// ... examine tableName in the debugger; it is correct including $ ...
// ... examine headerRow in the debugger; some elements are null ...
string x = (string)headerRow[5]; // crashes
Using Odbc, besides the obvious changes to the DotNet class names and the
connection string, one more change is this:
int headerRowNum = 0;
instead of
int headerRowNum = 1;
The reason of course is that HDR=NO in Odbc means HDR=YES.
So with those changes, I read the Excel file in Odbc instead of OleDb. The
corruption in Odbc is identical to the corruption in OleDb.
Does anyone know how to get either OleDb or Odbc to read an Excel file?
I think I have more news though. Although Excel 2003 saved the .xls file,
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here's
the reason: In all of the cells which I'm trying to interpret, either the
cell value was alphabetic, or the cell value had an apostrophe followed by
digits, so the cell values were already text. For some other cells, Excel
2003 displays them with a little triangle in one corner, and a tooltip
explains that the cell is text due to one of those reasons. However, for
the cells which I'm trying to interpret, Excel 2003 doesn't display that
triangle ... at first. However, if I put the mouse cursor in one of those
cells and double-click, but do not type anything, and then move the mouse
cursor to another cell and click, then suddenly Excel 2003 displays that
little triangle in one corner of the cell that I double-clicked, and a
tooltip explains that the cell is text due to one of those reasons.
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb starts
correctly reading the cell that I double-clicked.
So, was there a bug in Excel 97 or 2000 or XP, which corrupted the format
when saving some cells which were text due to having apostrophe plus digits,
but which saved some other cells correctly? And Excel 2003 contains a hack
to read corrupted .xls files correctly? But OleDb is missing that hack so
it gets all confused when reading files which were saved by Excel 97 or 2000
or XP (or saved by Excel 2003 without having double-clicked a corrupted
cell)?
Is there some way to get the same hack copied into OleDb so it will start
working?