C# Dictionary size limit


orsula

Hi Guys,

I have a class A composed of several string and int members.
I would like to manage a huge amount (several thousands) of A objects
in a dictionary where each object has its unique key.

Is there any known limitation or best practice that you think I should
consider?

Thanks,
Orsula.z
 

Jon Skeet [C# MVP]

I have a class A composed of several string and int members.
I would like to manage a huge amount (several thousands) of A objects
in a dictionary where each object has its unique key.

Is there any known limitation or best practice that you think I should
consider?

"Several thousand" is actually reasonably small. Basically if you've
got enough room in memory for the objects and a bit of overhead, the
dictionary should be fine.

Jon
 

Cor Ligthert[MVP]

Orsula,

Be aware that, as with every system collection, you are only adding
references to the objects, not the values themselves, so the dictionary
itself can in fact be very small.
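For example (a minimal sketch - the A class here is just a placeholder):

using System;
using System.Collections.Generic;

class A
{
    public int Value;
}

class ReferenceDemo
{
    static void Main()
    {
        A a = new A { Value = 1 };

        Dictionary<string, A> map = new Dictionary<string, A>();
        map.Add("k", a);

        // The dictionary stores a reference to the object, not a copy of it,
        // so a change made through the original reference shows up via the dictionary.
        a.Value = 99;
        Console.WriteLine(map["k"].Value); // 99
    }
}

The per-entry cost inside the dictionary is just the key plus one object reference; the bulk of the memory lives in the objects themselves.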

Cor
 

orsula

"Several thousand" is actually reasonably small. Basically if you've
got enough room in memory for the objects and a bit of overhead, the
dictionary should be fine.

Jon

Thanks (again :) ) Jon
 

Arne Vajhøj

orsula said:
I have a class A composed of several string and int members.
I would like to manage a huge amount (several thousands) of A objects
in a dictionary where each object has its unique key.

Is there any known limitation or best practice that you think I should
consider?

No.

My understanding is that you would be able to stuff 2GB/16B = 125 million
entries into a Dictionary<string,A> on 32-bit Windows.

Arne
 

Jon Skeet [C# MVP]

No.

My understanding is that you would be able to stuff 2GB/16B = 125 million
entries into a Dictionary<string,A> on 32-bit Windows.

Not quite. For one thing I think there's an upper limit of about 1GB
per object (including the entries array) in the CLR. I've also found
that for some reason (using a Dictionary<int,int> to avoid creating
any extra objects) the memory per entry averages around 40 bytes
rather than 16. No idea why at the moment - 16 would make sense to me
as well, although there's bound to be *some* extra overhead for the
buckets etc. Possibly alignment issues? Hmm...
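For reference, the kind of measurement I mean is roughly this (only a sketch - GC.GetTotalMemory figures are approximate, and the per-entry number depends on where the dictionary last resized):

using System;
using System.Collections.Generic;

class MemoryPerEntry
{
    static void Main()
    {
        const int count = 1000000;

        long before = GC.GetTotalMemory(true);

        // int keys and values mean no extra per-entry objects are allocated,
        // so the difference is (roughly) just the dictionary's own arrays.
        Dictionary<int, int> map = new Dictionary<int, int>();
        for (int i = 0; i < count; i++)
        {
            map[i] = i;
        }

        long after = GC.GetTotalMemory(true);
        Console.WriteLine("~{0:F1} bytes per entry", (after - before) / (double) count);

        GC.KeepAlive(map); // keep the dictionary alive until after the measurement
    }
}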

Jon
 

Cor Ligthert [MVP]

A little bit difficult, because then all pairs would be referencing null,
and that is not possible.

Cor
 

Brian Gideon

Not quite. For one thing I think there's an upper limit of about 1GB
per object (including the entries array) in the CLR. I've also found
that for some reason (using a Dictionary<int,int> to avoid creating
any extra objects) the memory per entry averages at around 40 bytes
rather than 16. No idea why at the moment - 16 would make sense to me
as well, although there's bound to be *some* extra overhead for the
buckets etc. Possibly alignment issues? Hmm...

Jon

I just Reflector'd it. In addition to the key and the value the hash
code (int) and a next pointer (int) are also stored on each entry.
That would account for 16B. There is also an int[] array that is used
for indexing the entries array. That gets us to 20B. The resizing
algorithm uses the first prime number greater than twice the
hashtable's current size as the new size. That would easily get you to 40.
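Paraphrasing what Reflector shows (a sketch of the layout, not the actual BCL source):

// Each slot in the entries array looks roughly like this on 32-bit:
struct Entry<TKey, TValue>
{
    public int hashCode;  // cached hash code of the key          - 4 bytes
    public int next;      // index of the next entry in the chain - 4 bytes
    public TKey key;      // 4 bytes for an int key or a reference
    public TValue value;  // 4 bytes for an int value or a reference
}

// The dictionary also keeps an int[] buckets array of the same length as the
// entries array, which is where the extra 4 bytes per slot (16B -> 20B) come from.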

Since we're discussing very large dictionaries I noticed that the
prime table is preloaded up to 7,199,369 (which is not to say that all
primes up to that value are included). Once it reaches that size the
next resize operation will start using the naive algorithm for
primality testing, but will *probably* yield 14,398,753, 28,797,523,
and 57,595,063. That gives us jumps to 288, 576, and 1152 MB
respectively. If the 1GB limit is indeed in play for arrays then we
might expect ~28 million to be the upper limit. Pure speculation.
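A back-of-the-envelope sketch of those jumps (using a naive trial-division check rather than the BCL's actual code, so the exact primes may differ slightly from the values above):

using System;

class GrowthEstimate
{
    // Naive trial-division primality test - roughly the kind of check the
    // resize path falls back to once the preloaded prime table runs out.
    static bool IsPrime(int n)
    {
        if (n < 2) return false;
        for (int i = 2; i * i <= n; i++)
        {
            if (n % i == 0) return false;
        }
        return true;
    }

    // First prime strictly greater than n.
    static int NextPrimeAbove(int n)
    {
        int candidate = n + 1;
        while (!IsPrime(candidate)) candidate++;
        return candidate;
    }

    static void Main()
    {
        int size = 7199369; // the last preloaded prime mentioned above

        for (int step = 0; step < 3; step++)
        {
            size = NextPrimeAbove(size * 2);
            // ~20 bytes per slot: 16 for the entry itself plus 4 for the bucket index.
            Console.WriteLine("{0:N0} slots -> roughly {1:N0} MB", size, size * 20.0 / 1e6);
        }
    }
}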
 

Jon Skeet [C# MVP]

I just Reflector'd it.  In addition to the key and the value the hash
code (int) and a next pointer (int) are also stored on each entry.
That would account for 16B.  There is also an int[] array that is used
for indexing the entries array.  That gets us to 20B.

For some reason I didn't think that the int[] was the same length as
the main array - but I see that it is.

The resizing algorithm uses the first prime number greater than twice the
hashtable's current size as the new size. That would easily get you to 40.

Ah, I've just worked out a flaw in my test - I was only outputting the
result when the amount of memory changed, which would always give the
worst case. Printing out the best case as well shows around 20 bytes,
which now makes sense.

Since we're discussing very large dictionaries I noticed that the
prime table is preloaded up to 7,199,369 (which is not to say that all
primes up to that value are included).  Once it reaches that size the
next resize operation will start using the naive algorithm for
primality testing, but will *probably* yield 14,398,753, 28,797,523,
and 57,595,063. That gives us jumps to 288, 576, and 1152 MB
respectively.  If the 1GB limit is indeed in play for arrays then we
might expect ~28 million to be the upper limit.  Pure speculation.

Don't forget that the 1GB limit applies to individual objects AFAIK -
so only the 16 bytes per entry of the "main" array. The int[] is
separate. 57,595,063*16 = 921 521 008, which should be okay. However,
you've also got to have enough room for the old version as well, so
your whole process needs 1.5x as much memory, unless you preallocate
that capacity.
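That is, something along these lines (a sketch - pick whatever expected capacity suits your data):

using System.Collections.Generic;

class Preallocate
{
    static void Main()
    {
        const int expected = 10000000;

        // Passing the expected capacity up front makes the dictionary allocate its
        // arrays once at (roughly) the right size, so it never has to hold the old
        // and the new arrays at the same time during a resize.
        Dictionary<int, int> map = new Dictionary<int, int>(expected);

        for (int i = 0; i < expected; i++)
        {
            map[i] = i;
        }
    }
}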

Unfortunately I don't know the exact details of all the limits, so I
won't analyse this any further, for fear of giving the wrong
information.

Jon
 

Brian Gideon

Don't forget that the 1GB limit applies to individual objects AFAIK -
so only the 16 bytes per entry of the "main" array. The int[] is
separate. 57,595,063*16 = 921 521 008, which should be okay. However,
you've also got to have enough room for the old version as well, so
your whole process needs 1.5x as much memory, unless you preallocate
that capacity.

Ah yes. Good points. Also, for the sake of full disclosure, that
57,595,063 value was based on a quick statistical test for primality, so
it may not even be prime for all I know. I'm sure there is a prime
somewhere in that ballpark, so it's probably not going to skew the
numbers too much.

Unfortunately I don't know the exact details of all the limits, so I
won't analyse this any further, for fear of giving the wrong
information.

Understood.
 

Brian Gideon

A little bit difficult, because then all pairs should be referencing to null
and that is not possible.

Cor

Well, as long as TValue is a reference type, the value could be null. I
haven't actually tested that because I've never done it, but
it should be possible according to the documentation.
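For example (a quick sketch, just to show that a null value compiles and behaves as expected):

using System;
using System.Collections.Generic;

class NullValueDemo
{
    static void Main()
    {
        // TValue is string here - a reference type - so null is a valid value.
        // (A null key is not allowed, but a null value is.)
        Dictionary<string, string> map = new Dictionary<string, string>();
        map.Add("present but null", null);

        Console.WriteLine(map.ContainsKey("present but null")); // True
        Console.WriteLine(map["present but null"] == null);     // True
    }
}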
 
