fastest searchable datastructure?

  • Thread starter: Pieter
IMHO

The Hashtable will not work for the TS, as the object limit in .NET is 2 GB;
the hashtable approach will blow up with an out-of-memory exception at the
amounts the TS wants to store.

So the only approach that will probably work is the database approach also
proposed in this thread.

Michel
 
Dictionaries are close to O(1), which is as fast as you can get.

However, as others have pointed out, 10^6 is a lot of space, never mind 10^9.

If you are looking at 32-bit pointers (references), you are using 4 MB just
to hold the references, and 8 MB at 64-bit. Exactly how big are your
objects? At 10^6 objects you can say you will need one MEGABYTE of RAM per
BYTE of size for each of your objects, i.e. an object whose size is 20 bytes
will cost you 20 MB of RAM to hold 10^6 of them.
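
To make that arithmetic concrete, here is a minimal sketch that fills a
Dictionary with 10^6 small objects and prints a rough memory figure. The
Record type is a made-up payload in the ballpark of the 20-byte example, and
GC.GetTotalMemory only gives a coarse estimate:

using System;
using System.Collections.Generic;

class MemoryEstimate
{
    // Made-up small payload, roughly the "20 bytes per object" case above.
    class Record
    {
        public int A;
        public int B;
        public double C;
    }

    static void Main()
    {
        long before = GC.GetTotalMemory(true);

        Dictionary<int, Record> table = new Dictionary<int, Record>(1000000);
        for (int i = 0; i < 1000000; i++)
        {
            table[i] = new Record { A = i, B = i, C = i };
        }

        long after = GC.GetTotalMemory(true);
        Console.WriteLine("Roughly {0} MB for 10^6 entries",
                          (after - before) / (1024 * 1024));

        // Keep the dictionary alive until after the second measurement.
        GC.KeepAlive(table);
    }
}

Expect the printed figure to land above the raw field sizes, since object
headers, references and the dictionary's own bucket/entry arrays all add to
the bill.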

I would seriously look at your data architecture and see if you can lower
the number of instances you need to hold at once. The only application I
can think of that would need such a huge quantity of data might be a game
terrain database. In those cases you really do not need all of the terrain
loaded at once and can section it off into blocks. In the cases where you do
need all that data available at once, you end up shrinking the map size to
something reasonable or implementing a lot of compression.
 
On Jan 18, 3:04 pm, TheSilverHammer wrote:

I would seriously look at your data architecture and see if you can lower
the number of instances you need to hold at once. The only application I
can think of that would need such a huge quantity of data might be a game
terrain database. In those cases you really do not need all of the terrain
loaded at once and can section it off into blocks. In the cases where you do
need all that data available at once, you end up shrinking the map size to
something reasonable or implementing a lot of compression.

I've got a datastructure of 100 million records in memory. I probably
shouldn't say what it's for, but it basically ends up being a 10,000
squared lookup, with a single byte result.

Doing this in memory (for the low, low cost of 100M of memory) gave me
a ~100,000% performance improvement over a database lookup :)

Jon
 
Jon Skeet said:
I've got a datastructure of 100 million records in memory. I probably
shouldn't say what it's for, but it basically ends up being a 10,000
squared lookup, with a single byte result.

Doing this in memory (for the low, low cost of 100M of memory) gave me
a ~100,000% performance improvement over a database lookup :)

That of course would not be a searchable datastructure, more like an indexed
array where the location is calculated by offset ;) If it were a key per
item, then each item would require at least a 4-byte key, making the
in-memory size at least 500 MB.
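
For what it's worth, a minimal sketch of that kind of offset-calculated
lookup, assuming the two keys really are dense integers in 0..9,999; the
flat byte array and the x * Size + y mapping are a guess at the general
shape, not Jon's actual code:

class PairLookup
{
    const int Size = 10000;

    // One flat array of Size * Size = 100,000,000 bytes; the (x, y) "key"
    // is simply turned into an offset into it.
    private readonly byte[] data = new byte[Size * Size];

    public byte Get(int x, int y)
    {
        return data[x * Size + y];
    }

    public void Set(int x, int y, byte value)
    {
        data[x * Size + y] = value;
    }
}

A Get(x, y) then costs one multiply, one add and one array read, with no
hashing, probing or per-item key storage at all.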
 
I know it's not always good netiquette to pry into others' projects in these
newsgroups, but is anyone else really interested to know what this actual
application is for...?
 
Oh, it's good to have a healthy curiosity ;-)

It's a Reinforcement Learning project: Artificial Intelligence.
I'm making an agent that can play a board game using RL. To do this, it must
learn from what it did:
- the agent comes into different states
- every state has different actions
- the agent must choose between these actions
- at the end it gets its reward (win or lose); this reward is given to the
actions it chose.
By learning (exploration) and exploitation it must maximize its wins :-)

The whole datastructure is needed to put my states in: I must be able to
search very fast whether it has already been in that state :-)
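
If it helps, a minimal sketch of one way to do that state lookup with a
Dictionary; the string key and the double value estimate are placeholders of
mine, not Pieter's actual representation:

using System.Collections.Generic;

class StateTable
{
    // Maps an encoded board state to a learned value estimate.
    // The string key is a placeholder; any cheap, unique encoding works.
    private readonly Dictionary<string, double> values =
        new Dictionary<string, double>();

    public bool HasSeen(string stateKey)
    {
        return values.ContainsKey(stateKey);
    }

    public double GetValue(string stateKey)
    {
        double value;
        return values.TryGetValue(stateKey, out value) ? value : 0.0;
    }

    public void Update(string stateKey, double reward, double learningRate)
    {
        // Nudge the stored estimate toward the observed reward
        // (say win = 1.0, loss = 0.0).
        double old = GetValue(stateKey);
        values[stateKey] = old + learningRate * (reward - old);
    }
}

Whether a string key scales to the state counts discussed earlier in the
thread is a separate question; the per-object and per-key overheads from the
previous posts apply here too.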

It has to be finished by Monday evening ;-)
 