best algorithm for getting distinct list

  • Thread starter Thread starter Bob
  • Start date Start date
B

Bob

It's not especially important, but I always like to know the best way of doing
things for when I encounter a case where performance becomes a factor... say I
have a string array and I want to get a distinct list, that is, get a list
without duplicates. In SQL this is easy because it's done for you. In code, what
would be the best practice? Array.Sort(stringarray) followed by n comparisons
collecting on the changes? Or perhaps adding each string as a key to a
collection or hashtable and just letting duplicates refusal do the work? I have
always wondered, are there circumstances when catching errors in quantity like
this degrade performance? Is it common, avoided?

Bob
 
It's not especially important, but I always like to know the best way of doing
things for when I encounter a case where performance becomes a factor... say I
have a string array and I want to get a distinct list, that is, get a list
without duplicates. In SQL this is easy because it's done for you. In code, what
would be the best practice? Array.Sort(stringarray) followed by n comparisons
collecting on the changes? Or perhaps adding each string as a key to a
collection or hashtable and just letting duplicates refusal do the work? I have
always wondered, are there circumstances when catching errors in quantity like
this degrade performance? Is it common, avoided?

Bob

Catching exceptions is always expensive and should be avoided where
possible. Personally, I would use the hashtable approach, but I would
probably check if it existed first using the hashtable contains
method... The check would probably be less expensive then catching an
exception in terms of performance.
 
Hi Bob,

I think just as you said,
Array.sort
Extra Arraylist
Loop 1 time through the sorted array from 0 to array.length
Add the first from the arry to the arraylist and every time the item is
another than the previous
Would go very fast I think.

For me are exceptions only for if you cannot manage the exception, let say a
file which is incomplete readed from disk.

And for no other things, have you seen how long it takes when there is an
exception in your debugger?

I hope this helps,

Cor
 
I sort of knew I shouldn't be lazy and let exception handling do the work, but I
wasn't sure. Thanks for the answers.

Bob
 
Back
Top