Finding Close matches in a collection class

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

In an appliction I want to store lots of items in memory and locate sets of
the based on the some starting substring.

If I was doing it in sql, I would use "select * from table where Key like
'Fred%'

From what I've read the SortedDictionary should be able to do this (as it is
implemented as a tree), but there seems no way of getting at this
functionality (or the innards of the thing).

Will any of the collection classes do this? Are there any other collection
classes which will do something like this?

(I don't want to use a for each solution as I have to do LOTS of searches on
LARGE datasets).

THanks for your help

Iain
 
Iain said:
In an appliction I want to store lots of items in memory and locate sets of
the based on the some starting substring.

If I was doing it in sql, I would use "select * from table where Key like
'Fred%'

From what I've read the SortedDictionary should be able to do this (as it is
implemented as a tree), but there seems no way of getting at this
functionality (or the innards of the thing).

Will any of the collection classes do this? Are there any other collection
classes which will do something like this?

I'm not aware of any. You could implement a predicate delegate to use
with something like the FindAll() method of a collection class like
SortedDictionary, but that would not be very much different than just
enumerating the whole collection, since there's no feedback from the
delegate to help narrow the search. It's just a binary "is this one or
isn't it?".

So, I think you'd have to implement your own binary search.

Now, that said, you can at least short-cut the sorting part, as the
SortedDictionary<> class (as well as some other sorted collection
classes) has a Values property that returns a collection of the values
in the collection, in the already-sorted order. So you only have to
write the binary search part, which shouldn't be that difficult.

Pete
 
Peter Duniho said:
I'm not aware of any. You could implement a predicate delegate to use
with something like the FindAll() method of a collection class like
SortedDictionary, but that would not be very much different than just
enumerating the whole collection, since there's no feedback from the
delegate to help narrow the search. It's just a binary "is this one or
isn't it?".

So, I think you'd have to implement your own binary search.

Now, that said, you can at least short-cut the sorting part, as the
SortedDictionary<> class (as well as some other sorted collection
classes) has a Values property that returns a collection of the values
in the collection, in the already-sorted order. So you only have to
write the binary search part, which shouldn't be that difficult.
Seems a shame they didn't provide access to the innards of the
SortedDictionary.

On that (ish), when you access the Values collection on a dictionary, is it
simply a pointer access or is an array created when the call is made? I've
not found the documentation clear on that...

Thanks!

Iain
 
Iain said:
Seems a shame they didn't provide access to the innards of the
SortedDictionary.

Not sure what you mean here. The point of encapsulation is to _not_
"provide access to the innards". :) That said, to some extent the
SortedDictionary.Values property does give you access to the innards.
On that (ish), when you access the Values collection on a dictionary, is it
simply a pointer access or is an array created when the call is made? I've
not found the documentation clear on that...

Did you try looking at the documentation for the SortedDictionary.Values
property?

From http://msdn2.microsoft.com/en-us/library/zazzsd83.aspx :

The returned SortedDictionary.ValueCollection is not
a static copy; instead, the SortedDictionary.ValueCollection
refers back to the values in the original SortedDictionary.
Therefore, changes to the SortedDictionary continue to be
reflected in the SortedDictionary.ValueCollection

Pete
 
Peter Duniho said:
Not sure what you mean here. The point of encapsulation is to _not_
"provide access to the innards". :) That said, to some extent the
SortedDictionary.Values property does give you access to the innards.

Well, encapsulation is about design decisions. One design decision would be
to expose the internal tree structure in a read only fashion (in order to do
all the things with trees which SortedDictionary doesn't do - prefix sort
order, post fix sort order, range retrieval and so on). Doing this would
provide value but make the dictionary riskier to use (a node could become
quite separated from the rest of the tree after some operations). SO the
choice was the dumbed down version (which, actually, I agree with). However
they could have provided a more primitive version with raw access for the
benefit of those of us who know the risks!

Maybe V3 :) .
Did you try looking at the documentation for the SortedDictionary.Values
property?

From http://msdn2.microsoft.com/en-us/library/zazzsd83.aspx :

The returned SortedDictionary.ValueCollection is not
a static copy; instead, the SortedDictionary.ValueCollection
refers back to the values in the original SortedDictionary.
Therefore, changes to the SortedDictionary continue to be
reflected in the SortedDictionary.ValueCollection
Thanks.

Iain
 
Back
Top