Terrance said:
Hello, I have a question in regards to hashtables that I was hoping someone
can share some light on. I'm currently working with VB.net and I'm trying to
learn as much as possible about the developer software. Can someone clarify
what a hashtable is? What is it's purpose? Why/when should one be used?
After doing some reading I kind of understand it's like an array but what
makes it so different from any other type of array that a developer can
declare in an application?
<extract moreThanYouWantedToKnow="maybe" ISBN="1590593863"
legal="yes">
In general, a hash table is a mapping of keys to values. The syntax is
usually modeled on arrays, so you can say Hash[Key] = Value or Value =
Hash[Key]. (As in this example, people with a Perl background will
often refer to a hash table as simply a hash.) The difference between
a hash table and an array is that arrays are indexed by numbers which
map directly to positions in a sequential stream of locations while
hash tables are indexed by the value of the key. Some hash table
implementations can only use strings as keys; the FCL implementation
can use any reference or value type as a key or as a value.
When you store a value in a hash table, a hash function reduces the
key to an integer, the hashed key. The hashed key, modulo the number
of buckets, specifies a bucket. A bucket contains zero or more key /
value pairs. The store operation looks in the bucket for a pair with a
(cached) hash value that equals the hashed key. If the hash values
match, the store operation compares the actual keys. (Comparing
integers is a fast way to avoid doing a slow operation like string
comparison. A hash match doesn’t mean equality, but a hash mismatch
guarantees inequality.) If the keys match, the store operation sets
the key / value pair to the new value. If the keys don’t match, the
store operation goes back to scanning the bucket for a matching hash
value. If the scan sails past the end of the bucket, the store
operation simply adds a new key/value pair.
When you retrieve a value from a hash, the same hash function is used
to reduce the key to an integer, which again determines which bucket
to look in. If the appropriate bucket does not contain the key, the
hash table has no value for the key.
Obviously, this is more expensive than array indexing. However, it is
much cheaper than either doing a linear search through a long list of
keys, or maintaining a key/value list sorted in key order. Hash tables
are comparatively expensive for small collections, but (assuming a
good hash function, one that doesn’t favor certain results over
others, and that gives different results for similar keys) hash table
cost goes up much slower than collection size.
Hash tables are useful for many common programming tasks. Obviously,
they make good symbol tables in compilers, interpreters, and web
browsers (both JavaScript interpreters and CSS style sheets need
symbol tables), as well as in any sort of program that maps strings to
array indices, or that applies macros or templates. An object in a
language like JavaScript, where you create a field simply by assigning
a value to it, is basically just a hash table. Somewhat less
obviously, hash tables can be used for sparse arrays and sets. Also,
a program that often generates the same string expression (perhaps
extracting XML tags from various documents) can use a hash table to
save memory by replacing the new string with the first copy.
</extract>