Data Insertion

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

Hi All,

I am working with C#, SQL CE and ADO.net using the .net compact framework on a PDA.

I need to populate quite large tables from XML.
Initially I just used Dataset.ReadXML but this quickly ran into memory problems with large amounts of data.
I am now using XmlTextReader to read the data in piecemeal fashion.

It has been suggested to me that I read a single record from the XML and then insert each record one by one and work my way through the XML.
This seems far too clumsy.
I would rather extract a certain amount of data to a dataset and then update the table regularly.
Is this possible or is there another simpler more efficient way of adding the data?

Thanks,
Dave
 
Dave,

First of all, how large are your data tables and how big is the XML file?

Next, are you using schema to load your XML? If you do not have schema,
inference is used.
This involves loading entire XML file into XmlDocument, which takes huge
amount of memory and a lot of time.

If you do have a schema (loaded from schema file, embedded into XML or
created programmatically), DataSet.ReadXml() will use XmlTextReader to load
records one by one and add them to the DataSet, so you won't have to.
However, data will be stored in memory anyway, so if you have a lot of
records, it might be just too much.

Using XmlTextReader and adding records directly to SQL CE is the way to go
in this case.

Best regards,

Ilya

This posting is provided "AS IS" with no warranties, and confers no rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQRhjr1NH4yZoruSRyv/2H+WPk1pw==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>
Subject: Data Insertion
Date: Wed, 24 Mar 2004 01:56:08 -0800
Lines: 15
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49318
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework

Hi All,

I am working with C#, SQL CE and ADO.net using the .net compact framework
on a PDA.

I need to populate quite large tables from XML.
Initially I just used Dataset.ReadXML but this quickly ran into memory
problems with large amounts of data.
I am now using XmlTextReader to read the data in piecemeal fashion.

It has been suggested to me that I read a single record from the XML and
then insert each record one by one and work my way through the XML.
This seems far too clumsy.
I would rather extract a certain amount of data to a dataset and then
update the table regularly.
Is this possible or is there another simpler more efficient way of adding
the data?

Thanks,
Dave
 
The XML File is 4.6 MB
There is no Schema

You recomend using the xmltextreader an adding the records direct to SQL CE
Do you mean individually
This seems to take a long time

Would adding them in batches to a dateset prove quicker
Is this possible

Thanks
Dave
 
4.6 MB is pretty bug, but device should be able to handle it.
It'll take few minutes to load, but should be just fine.
You _have_ to use schema to accomplish that, though.
Inference is OK for prototyping and, may be, loading 1-2K XML files; it's
not good for anything else and should be avoided.

Here are your options:
1. Change your XML file to contain embedded schema. This might be
impossible if you have no control over XML.
2. Create schema using inference (or schema designer) on a desktop and load
it from separate file before loading XML.
You can also reformat you XML to improve loading performance and
minimize XML size. You can use utility below to do that.
3. Create schema programmatically (i.e. add tables, columns, relations,
etc.) before loading XML. This is not very flexible, though, but might be
OK if your schema is fixed.
4. Create typed DataSet using desktop's designer, remove unsupported stuff
manually and use it in device project.
This is pretty much the same as #3, but instead of coding you would
have to comment out unsupported code.

Now to the XmlTextReader solution...

I do not recommend using it unless you really desperate. It's inflexible
and would require significant coding on your part (not to mention testing).
It is, however, the fastest solution and it uses the least possible amount
of memory.
It would work even for 50 MB XML, but probably not worth is with 4.6 MB.
It won't be slower. DataAdapter.Fill() will add records one by one anyway
as SQL CE does not support batching.
Thus, loading small batches into the DataSet would not improve performance.
I see no good way to accomplish this, anyway. You might be able to use
Fragment mode, but it's not exactly what you need.

Best regards,

Ilya

PS Utility source code...

using System;
using System.Data;
using System.IO;
using System.Text;
using System.Xml;


namespace FixSchema
{
class DoFix
{
const int OK = 0;
const int ERROR = 1;

static void DumpRelation ( DataRelation dr) {
string parentColumns = "";
string childColumns = "";

foreach (DataColumn dc in dr.ParentColumns)
parentColumns += String.Format("'{0}' ", dc.ColumnName);

foreach (DataColumn dc in dr.ChildColumns)
childColumns += String.Format("'{0}' ", dc.ColumnName);

Console.WriteLine ("Relation '{0}': '{1}' ({4})->'{2}' ({4}),
{3}", dr.RelationName, dr.ChildTable, dr.ParentTable, dr.Nested ? "nested"
: "related",
parentColumns, childColumns );
}

static int Main(string[] args)
{
if (args.Length == 0) {

Console.WriteLine ("This utility loads XML file, changes relations from
nexted to related,\nmaps all columns as attributes and saves resulted
schema and data.");
Console.WriteLine ("Usage: {0} xml_file",
Path.GetFileName(System.Reflection.Assembly.GetExecutingAssembly().GetName()
.CodeBase));

return ERROR;
}

try {
string fileName = args[0];

DataSet ds = new DataSet();

Console.WriteLine ("Loading XML file '{0}'", fileName);

XmlReadMode mode = ds.ReadXml (fileName);

if (XmlReadMode.InferSchema == mode ) {
Console.WriteLine ("WARNING! WARNING! WARNING! WARNING! WARNING!
WARNING!\nXML has been loaded in inference mode, all type information is
now lost.\nConsider changing schema to restore type information.");
}

Console.WriteLine ("Done loading file, changing column mappings to
attributes...");

foreach (DataTable dt in ds.Tables) {
foreach (DataColumn dc in dt.Columns){
if (dc.ColumnMapping != MappingType.Attribute) {
Console.WriteLine ("Changing mapping '{0}' of column '{1}' from
table '{2}' to 'Attribute'",
dc.ColumnMapping, dc.ColumnName, dt.TableName );

dc.ColumnMapping = MappingType.Attribute;
}
}
}

Console.WriteLine ("Now changing relations from nested to related...");

foreach (DataRelation r in ds.Relations ) {

DumpRelation (r);

if (r.Nested && (r.ParentColumns.Length == 1) &&
(r.ChildColumns.Length == 1)) {

Console.WriteLine ("Changing relation '{0}' from nested to
related...", r.RelationName);

r.Nested = false;
}
}

Console.WriteLine ("Done processing relations, saving schema...");

ds.WriteXmlSchema (fileName + ".xsd");

Console.WriteLine ("Schema saved to file '{0}'", fileName + ".xsd");

Console.WriteLine ("Now saving data with schema...");

ds.WriteXml (fileName + ".xml", XmlWriteMode.WriteSchema);

Console.WriteLine ("Data saved to file '{0}'", fileName + ".xml");
}
catch (Exception e) {

Console.WriteLine ("Error while processing: {0}", e);

return ERROR;
}

return OK;
}
}
}


This posting is provided "AS IS" with no warranties, and confers no rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQSTUsR44/NKI2IQW+5cd6h8lFY3w==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>
References: <[email protected]>
Subject: RE: Data Insertion
Date: Thu, 25 Mar 2004 01:41:05 -0800
Lines: 12
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49423
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework

The XML File is 4.6 MB.
There is no Schema.

You recomend using the xmltextreader an adding the records direct to SQL CE
 
Please see inline...

Best regards,

Ilya

This posting is provided "AS IS" with no warranties, and confers no rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQTVKa17g320Z3SRJiYVf230tNpDQ==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>
References: <[email protected]>
<[email protected]>
Subject: RE: Data Insertion
Date: Fri, 26 Mar 2004 09:06:16 -0800
Lines: 211
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49591
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework

Hi Ilya,
Thanks for your answer.
Could I clarify a couple of points.
The physical database structure is already in place.

Do you mean the database data is coming from, SQL CE database on device or
both?
The XML Schema will be fixed.
I can determine the XML file and whether it contains a schema.
The need for the embedded schema confuses me a little.
Where is the schema used ?

In this particular case XML Schema is used to create a DataSet schema (set
of tables, columns, relations, constraints and so on).
As soon as you load XML schema into the DataSet, DataSet will have all the
tables, columns and so on, but no data.
XML schema could be inside XML file with data (embedded schema) or it could
be loaded separately.
You do not actually need XML schema, but you do need DataSet schema it
generates as you load it into the DataSet.
You can avoid using XML schema if you create DataSet schema
programmatically.
Do you mean the schema is used with the dataadapter or dataset ?

It is used to create tables, columns and other DataSet elements.
If it is intended to generate the database itself , that has already been
done .

Since DataSet is essentially in memory relational database, yes, it is
intended to do this.
Using DataSet without schema is pretty much like creating an empty SQL
database and trying to fill it with data.
Won't work, you need to create some tables and columns first, right?

Now, you're probably wondering how it's possible to load XML without XML
schema into the DataSet without schema (created programmatically or loaded
from XML schema file)…
It is possible because before loading any data DataSet would parse entire
XML in attempt to infer schema from XML structure using predefined set of
rules.
This is very slow process which consumes huge amount of memory, works only
for some XML files and might produce some unpredictable results.
Nobody should use inference to load XML into the DataSet.

Another question :
you say : "It'll take few minutes to load, but should be just fine."
Do you mean loading the 4.6 mb file into a dataset using
DataSet.ReadXml() ?

Correct. As soon as you have data in the DataSet, you could copy it to the
SQL CE database.
I am unclear as to how exactly you are suggesting updaing the database.

1. Load XML schema and data into the DataSet.

DataSet data = new DataSet(); // New DataSet, no
tables or columns, could not hold any data.

data.ReadXmlSchema (schemaFile); // Now you have tables,
columns and so on, but no data.
//
Remove if you have embedded schema in dataFile.
data.ReadXml (dataFile); // Now you have
data.

2. Create Connection, DataAdapter and Command.

SqlCeConnection con = new SqlCeConnection("Data Source = " +
dataFile );
//
Create a new connection
con.Open(); // Open
this connection

SqlCeDataAdapter da = new SqlCeDataAdapter(); // Prepare
data adapter

SqlCeCommand cmd = con.CreateCommand(); // Prepare
command

3. Execute DataAdapter.Update() for each table:

foreach (DataTable table in data.Tables ) {

cmd.CommandText = "Inset Into ..."; // Set insert
command, add table name and parameters...

// Create and add parameters to command here...

da.InsertCommand = cmd; // Use
command as insert command on DataAdapter

da.Update(table);
// Update table in the SQL CE with data in DataSet's table
}

4. Close connection and dispose of all used objects as you normally do.
To simplify ..If I have an XML file with a schema could you clarify what
steps I follow to get it into the database.

Please see above.
Thanks for all your help,
Dave.



----- \"Ilya Tumanov [MS]\" wrote: -----

4.6 MB is pretty bug, but device should be able to handle it.
It'll take few minutes to load, but should be just fine.
You _have_ to use schema to accomplish that, though.
Inference is OK for prototyping and, may be, loading 1-2K XML files;
it's
not good for anything else and should be avoided.

Here are your options:
1. Change your XML file to contain embedded schema. This might be
impossible if you have no control over XML.
2. Create schema using inference (or schema designer) on a desktop and
load
it from separate file before loading XML.
You can also reformat you XML to improve loading performance and
minimize XML size. You can use utility below to do that.
3. Create schema programmatically (i.e. add tables, columns,
relations,
etc.) before loading XML. This is not very flexible, though, but might
be
OK if your schema is fixed.
4. Create typed DataSet using desktop's designer, remove unsupported
stuff
manually and use it in device project.
This is pretty much the same as #3, but instead of coding you
would
have to comment out unsupported code.

Now to the XmlTextReader solution...

I do not recommend using it unless you really desperate. It's
inflexible
and would require significant coding on your part (not to mention
testing).
It is, however, the fastest solution and it uses the least possible
amount
of memory.
It would work even for 50 MB XML, but probably not worth is with 4.6
MB.
It won't be slower. DataAdapter.Fill() will add records one by one
anyway
as SQL CE does not support batching.
Thus, loading small batches into the DataSet would not improve
performance.
I see no good way to accomplish this, anyway. You might be able to use
Fragment mode, but it's not exactly what you need.

Best regards,

Ilya

PS Utility source code...

using System;
using System.Data;
using System.IO;
using System.Text;
using System.Xml;


namespace FixSchema
{
class DoFix
{
const int OK = 0;
const int ERROR = 1;

static void DumpRelation ( DataRelation dr) {
string parentColumns = "";
string childColumns = "";

foreach (DataColumn dc in dr.ParentColumns)
parentColumns += String.Format("'{0}' ", dc.ColumnName);

foreach (DataColumn dc in dr.ChildColumns)
childColumns += String.Format("'{0}' ", dc.ColumnName);

Console.WriteLine ("Relation '{0}': '{1}' ({4})->'{2}'
({4}),
{3}", dr.RelationName, dr.ChildTable, dr.ParentTable, dr.Nested ?
"nested"
: "related",
parentColumns, childColumns );
}

static int Main(string[] args)
{
if (args.Length == 0) {

Console.WriteLine ("This utility loads XML file, changes relations
from
nexted to related,\nmaps all columns as attributes and saves resulted
schema and data.");
Console.WriteLine ("Usage: {0} xml_file",

Path.GetFileName(System.Reflection.Assembly.GetExecutingAssembly().GetName()
.CodeBase));

return ERROR;
}

try {
string fileName = args[0];

DataSet ds = new DataSet();

Console.WriteLine ("Loading XML file '{0}'", fileName);

XmlReadMode mode = ds.ReadXml (fileName);

if (XmlReadMode.InferSchema == mode ) {
Console.WriteLine ("WARNING! WARNING! WARNING! WARNING! WARNING!
WARNING!\nXML has been loaded in inference mode, all type information
is
now lost.\nConsider changing schema to restore type information.");
}

Console.WriteLine ("Done loading file, changing column mappings to
attributes...");

foreach (DataTable dt in ds.Tables) {
foreach (DataColumn dc in dt.Columns){
if (dc.ColumnMapping != MappingType.Attribute) {
Console.WriteLine ("Changing mapping '{0}' of column '{1}' from
table '{2}' to 'Attribute'",
dc.ColumnMapping, dc.ColumnName, dt.TableName );

dc.ColumnMapping = MappingType.Attribute;
}
}
}

Console.WriteLine ("Now changing relations from nested to
related...");

foreach (DataRelation r in ds.Relations ) {

DumpRelation (r);

if (r.Nested && (r.ParentColumns.Length == 1) &&
(r.ChildColumns.Length == 1)) {

Console.WriteLine ("Changing relation '{0}' from nested to
related...", r.RelationName);

r.Nested = false;
}
}

Console.WriteLine ("Done processing relations, saving schema...");

ds.WriteXmlSchema (fileName + ".xsd");

Console.WriteLine ("Schema saved to file '{0}'", fileName +
".xsd");

Console.WriteLine ("Now saving data with schema...");

ds.WriteXml (fileName + ".xml", XmlWriteMode.WriteSchema);

Console.WriteLine ("Data saved to file '{0}'", fileName + ".xml");
}
catch (Exception e) {

Console.WriteLine ("Error while processing: {0}", e);

return ERROR;
}

return OK;
}
}
}


This posting is provided "AS IS" with no warranties, and confers no
rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQSTUsR44/NKI2IQW+5cd6h8lFY3w==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>>
References:
Date: Thu, 25 Mar 2004 01:41:05 -0800
Lines: 12
Message-ID: <[email protected]>> MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49423
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
There is no Schema.

You recomend using the xmltextreader an adding the records direct to
SQL CE
 
Please see inline...

Best regards,

Ilya

This posting is provided "AS IS" with no warranties, and confers no rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQWkb4hx5KE3N2cSMGF3sTfJ9I/KA==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
Subject: RE: Data Insertion
Date: Tue, 30 Mar 2004 12:01:08 -0800
Lines: 20
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49902
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework

Hi Ilya,
Thank you for your help.
I have followed your suggestions.
Loading a file of more than 4 mb into a dataset is using up lots of memory.

It's not surprising considering 4 MB is from 6 to 12% of entire device
memory.
The loading and insertion of the data is taking some twenty minutes.
Doesn't this seem very , very long?

It is long. Few questions:

Which device are you using? Today's mainstream is PXA 255 400 MHz based
devices running PPC 2003.
Which version of the framework do you have? SP2 includes major performance
fix for loading XML into the DataSet.
Is your database located on a flash card? If yes, data insertion might be
several times slower compared to RAM.

How long it takes to load data to DataSet from XML? How long it takes to
push it from DataSet to SQL CE?
Are there any benchmarks for data insertion with SQL CE and ADO.net?

I'm not aware of publicly available benchmarks, but we're running internal
benchmarks constantly.
Here's sample data we have (Toshiba e750, PXA 255 400 MHz, SP2):

Load ~1 MB XML file into the DataSet with schema preloaded (3 nested
tables, 300 records in each table, 33 columns of various types including
unique primary key in each record) - 22.8 seconds

Insert data described above from DataSet into SQL CE database in RAM using
parameterized manually created insert command - 29.5 seconds

Load data from SQL CE into the DataSet - 7.5 seconds.

I would say, your 4.5 MB should be done in about (22.8 + 29.5) * 4.5/1= 236
seconds ~ 4 minutes.
Let's add one minute for increased garbage collection which would result in
5 minutes.

20 minutes seems to be excessive to me, but this might be if you running
slower device and using flash card.

If you have a fast device, running SP2 and not using flash card, that means
something's wrong.
Please check if you're using parameterized query to insert data and not
using command builder.
It does not seem pratcical to load all the data into a dataset even using
an embedded schema , reading the schema and then reading the data >ignoring
the schema .
The dataset solution does not seem to be a practical solution.

In this particular case whatever you do with DataSet is a pure overhead.
You do not need DataSet per se; you just need to get data from XML to SQL
CE.
Dumping it into the DataSet and into the SQL CE is not the most efficient
way to do it.
Do you have any idea if this technology is intended to be used on a PDA
with a file of this size.

Generally, I would recommend using replication to get data into SQL CE.
Do you have any other suggestions?

The fastest solution is to use SQL replication. This would allow you to
create and use database with 30-50K records, may be more.
If I was to use the XMLTextReader method then repeatedly using a command
object with ExecuteNonQuery and adding the records one by one >is this
going to be any slower than the dataset method?

Since SQL CE does not support batching, DataAdapter adds records from
DataSet to SQL CE one by one anyway.
So, it would be faster if carefully programmed. Before you do that, let's
see if we can optimize DataSet path somehow.
 
Ilya

I have upped the spec of the PDA
The resulting time is just over 6 minutes
But the spec of the machine is still inferior to that described by you as standard
I have a Dell Axim 300 mz with 32 mb ram(same as the old machine but this one is new and running pocket pc 2003)
I know it's a poor machine but the statistics are looking better now and aren't too inconsistent with those you provided
I am using 1.1 of the .net framework , version 1.1.4322
I am using a parametrised query

cmd.CommandText = "INSERT INTO test(col_1 ,col_2....) VALUES (?,?.....)"
cmd.Parameters.Add(new SqlCeParameter("p1", SqlDbType.NVarChar, 255, "col_1"))
cmd.Parameters.Add(new SqlCeParameter("p2", SqlDbType.NVarChar, 50, "col_2"))

As those in charge of the development want perhaps even larger data I may have to go with an XMlTextreader solution or even accessing the base tables

Thanks Ilya for all your help


----- \"Ilya Tumanov [MS]\" wrote: ----

Please see inline..

Best regards

Ily

This posting is provided "AS IS" with no warranties, and confers no rights
-------------------
Thread-Topic: Data Insertio
thread-index: AcQWkb4hx5KE3N2cSMGF3sTfJ9I/KA=
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframewor
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>> References: <[email protected]><[email protected]><[email protected]><[email protected]><[email protected]><[email protected]>> Subject: RE: Data Insertio
Date: Tue, 30 Mar 2004 12:01:08 -080
Lines: 2
Message-ID: <[email protected]>> MIME-Version: 1.
Content-Type: text/plain
charset="Utf-8
Content-Transfer-Encoding: 7bi
X-Newsreader: Microsoft CDO for Windows 200
Content-Class: urn:content-classes:messag
Importance: norma
Priority: norma
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.
Newsgroups: microsoft.public.dotnet.framework.compactframewor
Path: cpmsftngxa06.phx.gb
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:4990
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.18
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframewor
Thank you for your help
I have followed your suggestions
Loading a file of more than 4 mb into a dataset is using up lots of memory

It's not surprising considering 4 MB is from 6 to 12% of entire device
memory
The loading and insertion of the data is taking some twenty minutes
Doesn't this seem very , very long

It is long. Few questions

Which device are you using? Today's mainstream is PXA 255 400 MHz based
devices running PPC 2003
Which version of the framework do you have? SP2 includes major performance
fix for loading XML into the DataSet
Is your database located on a flash card? If yes, data insertion might be
several times slower compared to RAM

How long it takes to load data to DataSet from XML? How long it takes to
push it from DataSet to SQL CE
Are there any benchmarks for data insertion with SQL CE and ADO.net

I'm not aware of publicly available benchmarks, but we're running internal
benchmarks constantly
Here's sample data we have (Toshiba e750, PXA 255 400 MHz, SP2)

Load ~1 MB XML file into the DataSet with schema preloaded (3 nested
tables, 300 records in each table, 33 columns of various types including
unique primary key in each record) - 22.8 second

Insert data described above from DataSet into SQL CE database in RAM using
parameterized manually created insert command - 29.5 second

Load data from SQL CE into the DataSet - 7.5 seconds

I would say, your 4.5 MB should be done in about (22.8 + 29.5) * 4.5/1= 236
seconds ~ 4 minutes.
Let's add one minute for increased garbage collection which would result in
5 minutes.

20 minutes seems to be excessive to me, but this might be if you running
slower device and using flash card.

If you have a fast device, running SP2 and not using flash card, that means
something's wrong.
Please check if you're using parameterized query to insert data and not
using command builder.
It does not seem pratcical to load all the data into a dataset even using
an embedded schema , reading the schema and then reading the data >ignoring
the schema .
The dataset solution does not seem to be a practical solution.

In this particular case whatever you do with DataSet is a pure overhead.
You do not need DataSet per se; you just need to get data from XML to SQL
CE.
Dumping it into the DataSet and into the SQL CE is not the most efficient
way to do it.
Do you have any idea if this technology is intended to be used on a PDA
with a file of this size.

Generally, I would recommend using replication to get data into SQL CE.
Do you have any other suggestions?

The fastest solution is to use SQL replication. This would allow you to
create and use database with 30-50K records, may be more.
If I was to use the XMLTextReader method then repeatedly using a command
object with ExecuteNonQuery and adding the records one by one >is this
going to be any slower than the dataset method?

Since SQL CE does not support batching, DataAdapter adds records from
DataSet to SQL CE one by one anyway.
So, it would be faster if carefully programmed. Before you do that, let's
see if we can optimize DataSet path somehow.
 
Well, it's not a stellar performance, but it's not that bad considering
you're using entry level device and dealing with XML which is the most
inefficient way to store and transfer data ever invented (but the most
flexible).

VS 2003 you're using includes RTM version of a Compact Framework, so you're
not using SP2.
You might be able improve loading time it by a minute or two by upgrading
your device to SP2.

You can download it here:

http://www.microsoft.com/downloads/details.aspx?familyid=10600643-09b3-46d8-
ba28-bc494bc20d26&displaylang=en

Best regards,

Ilya

This posting is provided "AS IS" with no warranties, and confers no rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQXR10g3+NFlYUSRpyAC69mYpr3Mw==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
Subject: RE: Data Insertion
Date: Wed, 31 Mar 2004 09:41:14 -0800
Lines: 136
Message-ID: <[email protected]>
MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49989
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework

Ilya,

I have upped the spec of the PDA.
The resulting time is just over 6 minutes.
But the spec of the machine is still inferior to that described by you as
standard.
I have a Dell Axim 300 mz with 32 mb ram(same as the old machine but this
one is new and running pocket pc 2003).
I know it's a poor machine but the statistics are looking better now and
aren't too inconsistent with those you provided.
I am using 1.1 of the .net framework , version 1.1.4322.
I am using a parametrised query:

cmd.CommandText = "INSERT INTO test(col_1 ,col_2....) VALUES (?,?.....)";
cmd.Parameters.Add(new SqlCeParameter("p1", SqlDbType.NVarChar, 255,
"col_1"));
cmd.Parameters.Add(new SqlCeParameter("p2", SqlDbType.NVarChar, 50,
"col_2"));

As those in charge of the development want perhaps even larger data I may
have to go with an XMlTextreader solution or even accessing the base tables.

Thanks Ilya for all your help.


----- \"Ilya Tumanov [MS]\" wrote: -----

Please see inline...

Best regards,

Ilya

This posting is provided "AS IS" with no warranties, and confers no
rights.
--------------------
Thread-Topic: Data Insertion
thread-index: AcQWkb4hx5KE3N2cSMGF3sTfJ9I/KA==
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
From: "=?Utf-8?B?RGF2ZQ==?=" <[email protected]>>
References:
<[email protected]><lVUadJdEEHA.3924@cpmsft
ngxa06.phx.gbl><[email protected]><jFDCcVqE
(e-mail address removed)><3B85E9D8-03CA-4BEB-BACB-B6480F8DEEB2@microsoft
.com> said:
Date: Tue, 30 Mar 2004 12:01:08 -0800
Lines: 20
Message-ID: <[email protected]>> MIME-Version: 1.0
Content-Type: text/plain;
charset="Utf-8"
Content-Transfer-Encoding: 7bit
X-Newsreader: Microsoft CDO for Windows 2000
Content-Class: urn:content-classes:message
Importance: normal
Priority: normal
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.3790.0
Newsgroups: microsoft.public.dotnet.framework.compactframework
Path: cpmsftngxa06.phx.gbl
Xref: cpmsftngxa06.phx.gbl microsoft.public.dotnet.framework.compactframework:49902
NNTP-Posting-Host: tk2msftcmty1.phx.gbl 10.40.1.180
X-Tomcat-NG: microsoft.public.dotnet.framework.compactframework
Thank you for your help.
I have followed your suggestions.
Loading a file of more than 4 mb into a dataset is using up lots of
memory.

It's not surprising considering 4 MB is from 6 to 12% of entire device
memory.
The loading and insertion of the data is taking some twenty minutes.
Doesn't this seem very , very long?

It is long. Few questions:

Which device are you using? Today's mainstream is PXA 255 400 MHz
based
devices running PPC 2003.
Which version of the framework do you have? SP2 includes major
performance
fix for loading XML into the DataSet.
Is your database located on a flash card? If yes, data insertion might
be
several times slower compared to RAM.

How long it takes to load data to DataSet from XML? How long it takes
to
push it from DataSet to SQL CE?
Are there any benchmarks for data insertion with SQL CE and ADO.net?

I'm not aware of publicly available benchmarks, but we're running
internal
benchmarks constantly.
Here's sample data we have (Toshiba e750, PXA 255 400 MHz, SP2):

Load ~1 MB XML file into the DataSet with schema preloaded (3 nested
tables, 300 records in each table, 33 columns of various types
including
unique primary key in each record) - 22.8 seconds

Insert data described above from DataSet into SQL CE database in RAM
using
parameterized manually created insert command - 29.5 seconds

Load data from SQL CE into the DataSet - 7.5 seconds.

I would say, your 4.5 MB should be done in about (22.8 + 29.5) *
4.5/1= 236
seconds ~ 4 minutes.
Let's add one minute for increased garbage collection which would
result in
5 minutes.

20 minutes seems to be excessive to me, but this might be if you
running
slower device and using flash card.

If you have a fast device, running SP2 and not using flash card, that
means
something's wrong.
Please check if you're using parameterized query to insert data and
not
using command builder.
It does not seem pratcical to load all the data into a dataset even
using
an embedded schema , reading the schema and then reading the data
ignoring the schema .
The dataset solution does not seem to be a practical solution.

In this particular case whatever you do with DataSet is a pure
overhead.
You do not need DataSet per se; you just need to get data from XML to
SQL
CE.
Dumping it into the DataSet and into the SQL CE is not the most
efficient
way to do it.
Do you have any idea if this technology is intended to be used on a
PDA
with a file of this size.

Generally, I would recommend using replication to get data into SQL
CE.
Do you have any other suggestions?

The fastest solution is to use SQL replication. This would allow you
to
create and use database with 30-50K records, may be more.
If I was to use the XMLTextReader method then repeatedly using a
command
object with ExecuteNonQuery and adding the records one by one >is this
going to be any slower than the dataset method?

Since SQL CE does not support batching, DataAdapter adds records from
DataSet to SQL CE one by one anyway.
So, it would be faster if carefully programmed. Before you do that,
let's
see if we can optimize DataSet path somehow.
 
Back
Top