juliensellgren
Hi, I have a huge in-memory DataTable (300,000 records), and I am
finding that performance for adding rows at this stage degrades
significantly if the table has a Primary Key, or is attached to a
DataView with a Sort. For example:
// Fill a large DataTable (300,000 rows)...
// Add a PK, or an index (without this, performance is good)
largeDataTable.DefaultView.Sort = "ItemID";
// Time the append operation
DateTime ts = DateTime.Now;
System.Diagnostics.Debug.Write("Adding 10,000 rows: ");
// Append a bunch of rows
for (int i = 0; i < 10000; i++)
{
    DataRow newRow = largeDataTable.NewRow();
    newRow["ItemID"] = System.Guid.NewGuid().ToString();
    newRow["Value"] = "Blah";
    largeDataTable.Rows.Add(newRow);
    newRow.AcceptChanges();
}
// Output time for the operation
System.Diagnostics.Debug.WriteLine(DateTime.Now.Subtract(ts).TotalMilliseconds
    + " Milliseconds");
It takes about 30,000 milliseconds to append this many rows with the
index on, but 171 milliseconds without the index. If the table was
small to begin with, the append speed is good, which leads me to
believe that index maintenance time is proportional to the size of the
target table, as opposed to the number of records being appended. In
other words, append performance degrades as the table grows.
A workaround seems to be to temporarily remove the index and add it
back after the append operation, but I am wondering if anyone has any
other thoughts? With this much data, re-creating the index takes ~5000
milliseconds. I have tried DataTable.LoadDataRow(),
DataRowCollection.InsertAt(), and DataTable.ImportRow(), and all suffer
from the same index maintenance issues.
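For clarity, here is roughly what that workaround looks like (just a
sketch, assuming the only index is the DefaultView.Sort from the
example above):
// Drop the index, bulk-append, then rebuild the index afterwards
largeDataTable.DefaultView.Sort = "";        // remove the index
for (int i = 0; i < 10000; i++)
{
    DataRow newRow = largeDataTable.NewRow();
    newRow["ItemID"] = System.Guid.NewGuid().ToString();
    newRow["Value"] = "Blah";
    largeDataTable.Rows.Add(newRow);
    newRow.AcceptChanges();
}
largeDataTable.DefaultView.Sort = "ItemID";  // rebuild the index (~5000 ms)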
cheers,
- Julien