Stuart Carnie
First, the index performance improvement is awesome: I've seen a 75x
improvement in test cases.
Using the RTM version of Whidbey and the code from the November 2005 MSDN
article
(http://msdn.microsoft.com/msdnmag/issues/05/11/DataPoints/default.aspx), I
ran the same tests in my environment with ADO.NET 1.1 and 2.0. I'd like to
raise the point that memory usage is significantly higher (about 2.3x) in
2.0 than in 2003 (ADO.NET 1.1) for loading the same data.
I tested a load of 1,000,000 rows using the code from that article, with two
modifications: Unique = false (to speed up the ADO.NET 1.1 load, which
otherwise takes 30 minutes) and a Console.ReadLine at the end.
Results (using Process Explorer v9.25 for memory usage):
.NET v1.1
Time to load: 6.8s
Mem usage (Private, Working, Virtual): 73,592K, 72,828K, 147,368K
- - - - - - -
.NET v2.0
Time to load: 11.3s
Mem usage (Private, Working, Virtual): 168,220K, 161,264K, 241,792K
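As a quick sanity check on the "2.3x" figure, the ratios can be worked out directly from the numbers above. This is only arithmetic on the reported values, written in Python as a calculator; the variable names are mine and nothing here measures anything new.

```python
# Process Explorer figures copied from the results above (in K).
v11 = {"private": 73_592, "working": 72_828, "virtual": 147_368}
v20 = {"private": 168_220, "working": 161_264, "virtual": 241_792}

# Ratio of .NET 2.0 to .NET 1.1 for each memory counter.
ratios = {name: v20[name] / v11[name] for name in v11}
for name, ratio in ratios.items():
    print(f"{name:8s} {ratio:.2f}x")

# Load time ratio (11.3s vs 6.8s).
load_ratio = 11.3 / 6.8
print(f"load time {load_ratio:.2f}x")
```

The private bytes ratio is the one that rounds to 2.3x; working set is slightly lower and virtual size grows noticeably less.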
When digging a little deeper using .NET Memory Profiler v2.5, I found these
major differences:
ADO.NET 1.1 (top 5 classes by Bytes):
Class                   Total Instances   Total Bytes
-----------------------------------------------------
DataRow                         500,000    20,000,000
Int32[]                          10,293     8,717,856
Object[]                         10,551     3,379,952
DataRow[]                             2     3,145,760
ArrayListEnumerator...           20,530       497,720
                                           ----------
                                           35,741,288
ADO.NET 2.0 (top 5 classes by Total Bytes):
Class                   Total Instances   Total Bytes
-----------------------------------------------------
DataRow                         500,000    32,000,000
RBTree<int>.Node[]                  225    16,095,884
RBTree<DataRow>.Node[]              225    16,095,884
Int32[]                             472     4,457,268
DataRow[]                             2     2,097,184
                                           ----------
                                           70,746,220
The instance size of DataRow has increased by 60%.
2.0 also introduces two new classes, both RBTree node arrays. Given the
massive performance improvements, I'm sure these binary trees are necessary.
They appear to hold references to all the rows in the data set: assuming
each Node is about 32 bytes, dividing 16,095,884 by 32 gives a figure close
enough to 500,000.
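The per-instance numbers behind those two observations can be reproduced from the profiler tables. Again this is plain arithmetic in Python on the reported totals; the 32-byte Node size is my estimate from the post, not something the profiler reported, and the variable names are illustrative.

```python
# Totals copied from the .NET Memory Profiler tables above.
rows = 500_000
datarow_bytes_v11 = 20_000_000
datarow_bytes_v20 = 32_000_000
rbtree_array_bytes = 16_095_884   # same total for both RBTree node arrays

# Bytes per DataRow instance in each version, and the relative growth.
per_row_v11 = datarow_bytes_v11 / rows
per_row_v20 = datarow_bytes_v20 / rows
growth = per_row_v20 / per_row_v11 - 1
print(f"DataRow: {per_row_v11:.0f} -> {per_row_v20:.0f} bytes ({growth:.0%} larger)")

# Assumption: each RBTree Node occupies about 32 bytes.
node_bytes = 32
nodes_per_tree = rbtree_array_bytes / node_bytes
print(f"Estimated nodes per tree: {nodes_per_tree:,.0f}")

# Combined size of the two RBTree node arrays.
rbtree_total = 2 * rbtree_array_bytes
print(f"Both RBTree arrays: {rbtree_total:,} bytes")
```

With those inputs the estimate comes out a little over one node per row in each tree, which is consistent with the trees referencing every row.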
Anyways, I just wanted to bring this up, as it could have an impact for
some, if memory is tight.
Cheers,
Stuart