Reading Excel XML files

  • Thread starter Thread starter Terry Olsen
  • Start date Start date
T

Terry Olsen

Is there any way to connect to and query Excel files that are in XML format?
I've not run into these before but now I've got these files that have the
..xls extension but are actually XML files. I've read in other discussions
about this where the advice was to open them in Excel and save them in Excel
(.xls) format. That isn't feasable here because these files are
automatically generated by reporting software and I need to query the data
to show only the data that pertains to my local group.

Is there a driver or connection string that will allow OleDb to recognize
these files? Or am I stuck with parsing them with the XML namespace?
 
The Jet Engine (that runs Access) can also be used to get data from Excel.
There are some limitations, of course, as the data in the sheet really has
to adhere to a table type format. If your software has tons of formatting,
it can be problematic. For pure tabular type data, each sheet is like a
table. Check out www.connectionstrings.com for the correct format of conn
string for Excel.

Can you query the XML? Certainly, but I would search and see if someone has
already mapped through the schema for the proper XPath for the start of the
data. Once you can find the root node for the data you desire, yanking it
out is a piece of cake.

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA
http://gregorybeamer.spaces.live.com
Co-author: Microsoft Expression Web Bible (upcoming)

************************************************
Think outside the box!
************************************************
 
I have the correct connection string. Ive connected to countless Excel
spreadsheets and am experienced in reading data from them. But they've
always been binary files. However, these new files are not binary files.
They are XML formatted files. Excel loads them and displays them just as it
would a binary .xls file. For example, Here's a file called
MySpreadsheet.xls and when opening it up in a text editor, it looks like
this:

---------------------------------------
<?xml version="1.0"?>
<?mso-application progid='Excel.Sheet'?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
xmlns:html="http://www.w3.org/TR/REC-html40">
<DocumentProperties xmlns="urn:schemas-microsoft-com:office:office"/>
<ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
<ProtectStructure>False</ProtectStructure><ProtectWindows>False</ProtectWindows>
</ExcelWorkbook><Styles><Style ss:ID="Default" ss:Name="Normal"><Alignment
ss:Vertical="Bottom"/>
<Borders/><Font/><Interior/><NumberFormat/><Protection/></Style>
<Style ss:ID="s23"><Alignment ss:Horizontal="Center"
ss:Vertical="Bottom"/><Borders>
<Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="2"
ss:Color="#000000"/>
<Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"
ss:Color="#000000"/>
<Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"
ss:Color="#000000"/>
<Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"
ss:Color="#000000"/>
</Borders><Interior ss:Color="#C0C0C0"
ss:Pattern="Solid"/></Style></Styles>
<Worksheet ss:Name="Missed Details"><Table>

<ROWS> & <CELLS> (removed for brevity)

</Table></Worksheet></Workbook>
--------------------------------------------------------

When I try to connect to this XLS file using OleDb, I get an exception that
says "External table is not in the expected format."

I'd rather use OleDb to read these file because it's so much simpler using a
SELECT statement than it is walking through it using the XmlDocument
methods.
 
Back
Top