OleDb can't read Excel file?

  • Thread starter Thread starter Norman Diamond
  • Start date Start date
N

Norman Diamond

With Visual Studio 2005 SP1 and DotNet Framework 2 SP1, Excel 2003 can read
a file that was saved by Excel 2003, but the OleDb driver can't.

Example: Excel row 2 maps to row 1 in OleDb, and Excel columns 1 to 24 (A
to X) map to columns 0 to 23 in OleDb. Counting these in OleDb, row 1
columns 0 to 4 are OK, columns 5 and 6 get corrupted to DbNull, columns 7 to
12 are OK, columns 13 and 14 get corrupted to DbNull, etc.

If I do Interop to Excel 2003 instead of using OleDb, it works. Counting
these in Excel interop, row 2 columns 1 to 24 are entirely OK.

Back to the OleDb problem. One worksheet has 40 columns instead of 24. The
worksheet with 40 columns works. Several of the additional columns read as
doubles instead of text but ToString() works on them. But all of the
24-column worksheets get corrupted by OleDb, and the corrupted column
numbers vary.

Does anyone know how to get OleDb to work?

OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
fileName + ";Extended Properties='Excel 8.0;HDR=NO'");
connection.Open();
DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow[2].ToString();
string tableNameNoDollar = tableName.Substring(0, tableName.Length - 1);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);
DataTable dataTable = new DataTable(tableName);
for (int columnIndex = 1; columnIndex <= 24; columnIndex++)
{
dataTable.Columns.Add("F" + columnIndex.ToString(), typeof(string));
}
dataAdapter.Fill(dataTable);
int columnCount = dataTable.Columns.Count;
int rowCount = dataTable.Rows.Count;
if (!((rowCount >= 2) && ((columnCount == 24) || (columnCount == 40))))
{
dataTable.Dispose();
dataAdapter.Dispose();
continue;
}
DataRow headerRow = dataTable.Rows[1];
// ... examine headerRow in the debugger; some elements are null ...
string x = headerRow[5].ToString(); // crashes
 
Are you sure that your tabel name is correct in your Select

SQL syntax "SELECT * FROM [sheet1$]". I.e. excel worksheet name followed by
a "$" and wrapped in "[" "]" brackets.
(copied from www.connectionstrings.com)

Cor
 
Cor Ligthert said:
Are you sure that your tabel name is correct in your Select

SQL syntax "SELECT * FROM [sheet1$]". I.e. excel worksheet name followed
by a "$" and wrapped in "[" "]" brackets.
(copied from www.connectionstrings.com)


Here are the relevant parts of my original posting:

DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow[2].ToString();
string tableNameNoDollar = tableName.Substring(0, tableName.Length - 1);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);

Please observe that the tableName part of the SELECT statement is tableName
not tableNameNoDollar. Though my original posting neglected to mention that
in the debugger I checked whether each value of tableName was correct (each
iteration through the loop) and it was always correct.

I think I have more news though. Although Excel 2003 saved the .xls file,
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here's
the reason: In all of the cells which I'm trying to interpret, either the
cell value was alphabetic, or the cell value had an apostrophe followed by
digits, so the cell values were already text. For some other cells, Excel
2003 displays them with a little triangle in one corner, and a tooltip
explains that the cell is text due to one of those reasons. However, for
the cells which I'm trying to interpret, Excel 2003 doesn't display that
triangle ... at first. However, if I put the mouse cursor in one of those
cells and double-click, but do not type anything, and then move the mouse
cursor to another cell and click, then suddenly Excel 2003 displays that
little triangle in one corner of the cell that I double-clicked, and a
tooltip explains that the cell is text due to one of those reasons.
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb starts
correctly reading the cell that I double-clicked.

So, was there a bug in Excel 97 or 2000 or XP, which corrupted the format
when saving some cells which were text due to having apostrophe plus digits,
but which saved some other cells correctly? And Excel 2003 contains a hack
to read corrupted .xls files correctly? But OleDb is missing that hack so
it gets all confused when reading files which were saved by Excel 97 or 2000
or XP (or saved by Excel 2003 without having double-clicked a corrupted
cell)?

Is there some way to get the same hack copied into OleDb so it will start
working?
 
Using Odbc, besides the obvious changes to the DotNet class names and the
connection string, one more change is this:
DataRow headerRow = dataTable.Rows[0];
instead of
DataRow headerRow = dataTable.Rows[1];
The reason of course is that HDR=NO in Odbc means HDR=YES.

So with those changes, I read the Excel file in Odbc instead of OleDb. The
corruption in Odbc is identical to the corruption in OleDb.

Does anyone know how to get either OleDb or Odbc to read an Excel file?
 
The Cells in excel has to be converted to text format and the data inside updated.


Although Excel 2003 saved the .xls file,
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here's
the reason: In all of the cells which I'm trying to interpret, either the
cell value was alphabetic, or the cell value had an apostrophe followed by
digits, so the cell values were already text. For some other cells, Excel
2003 displays them with a little triangle in one corner, and a tooltip
explains that the cell is text due to one of those reasons. However, for
the cells which I'm trying to interpret, Excel 2003 doesn't display that
triangle ... at first. However, if I put the mouse cursor in one of those
cells and double-click, but do not type anything, and then move the mouse
cursor to another cell and click, then suddenly Excel 2003 displays that
little triangle in one corner of the cell that I double-clicked, and a
tooltip explains that the cell is text due to one of those reasons.
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb starts
correctly reading the cell that I double-clicked.






Norman Diamond wrote:

Re: OleDb can't read Excel file?
24-Jan-08


Here are the relevant parts of my original posting

DataTable schemaTable = connection.GetSchema("Tables")
DataRow[] schemaRows = schemaTable.Select()
foreach (DataRow schemaRow in schemaRows

string tableName = schemaRow[2].ToString()
string tableNameNoDollar = tableName.Substring(0, tableName.Length - 1)
OleDbDataAdapter dataAdapter = new OleDbDataAdapter
"SELECT * FROM [" + tableName + "]", connection)

Please observe that the tableName part of the SELECT statement is tableNam
not tableNameNoDollar. Though my original posting neglected to mention tha
in the debugger I checked whether each value of tableName was correct (eac
iteration through the loop) and it was always correct

I think I have more news though. Although Excel 2003 saved the .xls file
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here'
the reason: In all of the cells which I'm trying to interpret, either th
cell value was alphabetic, or the cell value had an apostrophe followed b
digits, so the cell values were already text. For some other cells, Exce
2003 displays them with a little triangle in one corner, and a toolti
explains that the cell is text due to one of those reasons. However, fo
the cells which I'm trying to interpret, Excel 2003 doesn't display tha
triangle ... at first. However, if I put the mouse cursor in one of thos
cells and double-click, but do not type anything, and then move the mous
cursor to another cell and click, then suddenly Excel 2003 displays tha
little triangle in one corner of the cell that I double-clicked, and
tooltip explains that the cell is text due to one of those reasons
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb start
correctly reading the cell that I double-clicked

So, was there a bug in Excel 97 or 2000 or XP, which corrupted the forma
when saving some cells which were text due to having apostrophe plus digits
but which saved some other cells correctly? And Excel 2003 contains a hac
to read corrupted .xls files correctly? But OleDb is missing that hack s
it gets all confused when reading files which were saved by Excel 97 or 200
or XP (or saved by Excel 2003 without having double-clicked a corrupte
cell)

Is there some way to get the same hack copied into OleDb so it will star
working?

Previous Posts In This Thread:

OleDb can't read Excel file?
With Visual Studio 2005 SP1 and DotNet Framework 2 SP1, Excel 2003 can read
a file that was saved by Excel 2003, but the OleDb driver can't

Example: Excel row 2 maps to row 1 in OleDb, and Excel columns 1 to 24 (A
to X) map to columns 0 to 23 in OleDb. Counting these in OleDb, row 1
columns 0 to 4 are OK, columns 5 and 6 get corrupted to DbNull, columns 7 to
12 are OK, columns 13 and 14 get corrupted to DbNull, etc.

If I do Interop to Excel 2003 instead of using OleDb, it works. Counting
these in Excel interop, row 2 columns 1 to 24 are entirely OK.

Back to the OleDb problem. One worksheet has 40 columns instead of 24. The
worksheet with 40 columns works. Several of the additional columns read as
doubles instead of text but ToString() works on them. But all of the
24-column worksheets get corrupted by OleDb, and the corrupted column
numbers vary.

Does anyone know how to get OleDb to work?

OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
fileName + ";Extended Properties='Excel 8.0;HDR=NO'");
connection.Open();
DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow[2].ToString();
string tableNameNoDollar = tableName.Substring(0, tableName.Length - 1);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);
DataTable dataTable = new DataTable(tableName);
for (int columnIndex = 1; columnIndex <= 24; columnIndex++)
{
dataTable.Columns.Add("F" + columnIndex.ToString(), typeof(string));
}
dataAdapter.Fill(dataTable);
int columnCount = dataTable.Columns.Count;
int rowCount = dataTable.Rows.Count;
if (!((rowCount >= 2) && ((columnCount == 24) || (columnCount == 40))))
{
dataTable.Dispose();
dataAdapter.Dispose();
continue;
}
DataRow headerRow = dataTable.Rows[1];
// ... examine headerRow in the debugger; some elements are null ...
string x = headerRow[5].ToString(); // crashes

Are you sure that your tabel name is correct in your SelectSQL syntax "SELECT
Are you sure that your tabel name is correct in your Select

SQL syntax "SELECT * FROM [sheet1$]". I.e. excel worksheet name followed by
a "$" and wrapped in "[" "]" brackets.
(copied from www.connectionstrings.com)

Cor

Re: OleDb can't read Excel file?


Here are the relevant parts of my original posting:

DataTable schemaTable = connection.GetSchema("Tables");
DataRow[] schemaRows = schemaTable.Select();
foreach (DataRow schemaRow in schemaRows)
{
string tableName = schemaRow[2].ToString();
string tableNameNoDollar = tableName.Substring(0, tableName.Length - 1);
OleDbDataAdapter dataAdapter = new OleDbDataAdapter(
"SELECT * FROM [" + tableName + "]", connection);

Please observe that the tableName part of the SELECT statement is tableName
not tableNameNoDollar. Though my original posting neglected to mention that
in the debugger I checked whether each value of tableName was correct (each
iteration through the loop) and it was always correct.

I think I have more news though. Although Excel 2003 saved the .xls file,
now I guess that Excel 97 or 2000 or XP first created the .xls file. Here's
the reason: In all of the cells which I'm trying to interpret, either the
cell value was alphabetic, or the cell value had an apostrophe followed by
digits, so the cell values were already text. For some other cells, Excel
2003 displays them with a little triangle in one corner, and a tooltip
explains that the cell is text due to one of those reasons. However, for
the cells which I'm trying to interpret, Excel 2003 doesn't display that
triangle ... at first. However, if I put the mouse cursor in one of those
cells and double-click, but do not type anything, and then move the mouse
cursor to another cell and click, then suddenly Excel 2003 displays that
little triangle in one corner of the cell that I double-clicked, and a
tooltip explains that the cell is text due to one of those reasons.
Furthermore, if I hit Ctrl+S to save the file, then suddenly OleDb starts
correctly reading the cell that I double-clicked.

So, was there a bug in Excel 97 or 2000 or XP, which corrupted the format
when saving some cells which were text due to having apostrophe plus digits,
but which saved some other cells correctly? And Excel 2003 contains a hack
to read corrupted .xls files correctly? But OleDb is missing that hack so
it gets all confused when reading files which were saved by Excel 97 or 2000
or XP (or saved by Excel 2003 without having double-clicked a corrupted
cell)?

Is there some way to get the same hack copied into OleDb so it will start
working?

Both Odbc and OleDb can't read Excel file?
Using Odbc, besides the obvious changes to the DotNet class names and the
connection string, one more change is this:
DataRow headerRow = dataTable.Rows[0];
instead of
DataRow headerRow = dataTable.Rows[1];
The reason of course is that HDR=NO in Odbc means HDR=YES.

So with those changes, I read the Excel file in Odbc instead of OleDb. The
corruption in Odbc is identical to the corruption in OleDb.

Does anyone know how to get either OleDb or Odbc to read an Excel file?




Submitted via EggHeadCafe - Software Developer Portal of Choice
Get Started with SQLite and Visual Studio
http://www.eggheadcafe.com/tutorial...b-b6f46d4f2c6a/get-started-with-sqlite-a.aspx
 
Back
Top