datatable from xml doc and massive time differences on desktop / server

  • Thread starter: Hermit Dave

Hermit Dave

Hello All,

I have an XML document which can contain massive amounts of data with no
fixed child node sequence.
We have some 9000-odd fields and any of them can be part of the XML.

The way I build the data table is that I process one record at a time. I
keep a list of columns in a hash table (with ordinal position) and in a string
collection (for retrieval in the correct order). For each field, if the
column does not exist, I add the column, and then I set the value at that
ordinal position in the data table.
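
A minimal sketch of the approach described above, not the actual code from this thread. It assumes a hypothetical layout in which every child of the root element is a record and every child of a record is a field; the class name `RecordTableBuilder` and the method `Build` are made up for illustration. It keeps only the hash table of ordinals, since `DataTable.Columns` already preserves the order in which columns were added.

```csharp
using System;
using System.Collections;
using System.Data;
using System.IO;
using System.Xml;

public class RecordTableBuilder
{
    // Assumed layout (illustrative only): <root><record><field>value</field>...</record>...</root>
    public static DataTable Build(TextReader input)
    {
        DataTable table = new DataTable("records");
        Hashtable ordinals = new Hashtable();   // field name -> column ordinal
        XmlTextReader reader = new XmlTextReader(input);
        DataRow row = null;

        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Depth == 1)
            {
                // Start of a record: begin a new row.
                row = table.NewRow();
                if (reader.IsEmptyElement)
                {
                    table.Rows.Add(row);    // <record/> has no end tag to wait for
                }
            }
            else if (reader.NodeType == XmlNodeType.Element && reader.Depth == 2)
            {
                // A field: look up its column, adding the column on first sight.
                string name = reader.Name;
                object ordinal = ordinals[name];
                if (ordinal == null)
                {
                    ordinal = table.Columns.Add(name, typeof(string)).Ordinal;
                    ordinals[name] = ordinal;
                }
                row[(int)ordinal] = reader.ReadString();
            }
            else if (reader.NodeType == XmlNodeType.EndElement && reader.Depth == 1)
            {
                // End of a record: commit the row.
                table.Rows.Add(row);
            }
        }
        reader.Close();
        return table;
    }
}
```

Fields that a record does not contain are simply left as `DBNull` in that row.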

For an XML doc containing 85000-odd records with, say, 5 child fields, on my desktop
I can process the whole thing and display it in a paged list in 4 mins and 40
secs; however, on the server it takes around 12 mins.
Same input file, same code, same binaries.

The desktop, which performs faster, is a Core 2 Duo 2.4 GHz with 2 GB of RAM
running XP.
The server runs 4 x 2.6 GHz dual-core Opterons with 16 GB of RAM, running Windows
Server 2003 Enterprise Edition.

What do you guys think of it? TIA

Hermit

PS: It's coded in C# on VS.NET 2003 using standard framework classes
(DataTable, XmlTextReader, XmlDocument etc.)
 

Where is the source XML read from on your desktop, and on your server?
It could well be an I/O bottleneck, not a problem with your code as
such. Perhaps the server reads it from an SMB share, or something like
that?
 
Pavel,

Well, the desktop has a standard IDE / SATA drive.
The server uses a SAN.

I don't think it's an I/O issue, as my team leader tried the hard way of assuming a
column exists for a given node (catching any exception and handling it
correctly), and the server timings dropped to about two and a half minutes.
I guess it was something else. Maybe something in the code was being
optimised a lot better for the Intel architecture.
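
A rough sketch of what that "assume the column exists" variant might look like; the real code was not posted, so the class `OptimisticSetter` and the method `SetField` are hypothetical names. It relies on the `DataRow` string indexer throwing an `ArgumentException` when the named column cannot be found, so the exception path runs only the first time a field name is seen.

```csharp
using System;
using System.Data;

public class OptimisticSetter
{
    // Hypothetical helper: write the value as if the column were already there,
    // and add the column only if the write fails.
    public static void SetField(DataTable table, DataRow row, string name, string value)
    {
        try
        {
            row[name] = value;
        }
        catch (ArgumentException)
        {
            // Column not found: create it, then retry the assignment.
            table.Columns.Add(name, typeof(string));
            row[name] = value;
        }
    }
}
```

With only a handful of distinct fields in the file, the catch block would run a handful of times in total, however many records there are.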

Regards,

Hermit
