Composite Keys with <TKey,TValue> pairs generic collections

D

Dave

I seem to be missing one of my favorite STL techniques for key, value
collections - being able to use Composite Keys. What i mean by C.K.
is the ability for a key to compare multiple values when sorting /
searching collections.

E.g. IP Address, and port as key, rather than the single IP Address
value.

in STL i would create an object as key, and override the "<" operator
to compare the composite key (multiple values) - thereby achieving
what i needed. (STL often used equivalence rather than equality for
its collections - hence less than operator overloading)

Can I do something similar in C#?

thanks tons, dave

p.s. i could crudely append values IPAddr and Port# together making a
pseudo composite key, but not pretty.
 
T

Tom Shelton

I seem to be missing one of my favorite STL techniques for key, value
collections - being able to use Composite Keys. What i mean by C.K.
is the ability for a key to compare multiple values when sorting /
searching collections.

E.g. IP Address, and port as key, rather than the single IP Address
value.

in STL i would create an object as key, and override the "<" operator
to compare the composite key (multiple values) - thereby achieving
what i needed. (STL often used equivalence rather than equality for
its collections - hence less than operator overloading)

Can I do something similar in C#?

thanks tons, dave

p.s. i could crudely append values IPAddr and Port# together making a
pseudo composite key, but not pretty.

Hmmm... are you looking for something like the Comparer property on the
System.Collections.Generic.Dictionary class?
 
N

Nicholas Paldino [.NET/C# MVP]

Dave,

The closest thing you can do is create an implementation of
IEqualityComparer<TKey> which is used to compare the instances of TKey in
the Dictionary<TKey, TValue> implementation.

You could simplify the process by creating a generic implementation of
IEqualityComparer<TKey> which you assign delegates to (in order to simplify
the whole thing, and not have to create specific implementations each time)
which will do the comparison and hash code generation for you.
 
M

Marc Gravell

In C# 3 (VS 2008), I believe (IIRC) that anonymous types include
Equals and GetHashCode implementations that span the members, so you
could use "new {IP=ip, Port=port}" as the key, and ToDictionary/
ToLookup etc, and it should work.

Haven't checked, though.

Marc
 
J

Jon Skeet [C# MVP]

Marc Gravell said:
In C# 3 (VS 2008), I believe (IIRC) that anonymous types include
Equals and GetHashCode implementations that span the members, so you
could use "new {IP=ip, Port=port}" as the key, and ToDictionary/
ToLookup etc, and it should work.

Haven't checked, though.

Yes, they do. I only remember offhand because the hash algorithm *used*
to be simple XOR-ing, which is horrible. It's better now.
 
L

Lasse Vågsæther Karlsen

Dave said:
I seem to be missing one of my favorite STL techniques for key, value
collections - being able to use Composite Keys. What i mean by C.K.
is the ability for a key to compare multiple values when sorting /
searching collections.

E.g. IP Address, and port as key, rather than the single IP Address
value.

in STL i would create an object as key, and override the "<" operator
to compare the composite key (multiple values) - thereby achieving
what i needed. (STL often used equivalence rather than equality for
its collections - hence less than operator overloading)

Can I do something similar in C#?

thanks tons, dave

p.s. i could crudely append values IPAddr and Port# together making a
pseudo composite key, but not pretty.

Here is my own implementation pre-C#3 that allowed me to do that kind of
stuff.

Note that it suffers from the problem Jon Skeet alludes to, where the
hash code is simply xoring the individual members.

--
Lasse Vågsæther Karlsen
mailto:[email protected]
http://presentationmode.blogspot.com/
PGP KeyID: 0xBCDEA2E3



namespace LVK.GenericTypes
{

#region Structure<T1, T2>

/// <summary>
/// Defines a structure with two fields.
/// </summary>
/// <typeparam name="T1">
/// The type of the <see cref="Structure{T1, T2}.Value1"/> field.
/// </typeparam>
/// <typeparam name="T2">
/// The type of the <see cref="Structure{T1, T2}.Value2"/> field.
/// </typeparam>
public struct Structure<T1, T2> : IComparable<Structure<T1, T2>>
{
#region Public Fields

/// <summary>
/// The first value in the structure.
/// </summary>
public T1 Value1;

/// <summary>
/// The second value in the structure.
/// </summary>
public T2 Value2;

#endregion

#region Construction & Destruction

/// <summary>
/// Constructs a new structure of type <see
cref="Structure{T1,T2}"/> and initializes the values.
/// </summary>
/// <param name="value1">
/// The first value in the structure.
/// </param>
/// <param name="value2">
/// The second value in the structure.
/// </param>
public Structure(T1 value1, T2 value2)
{
Value1 = value1;
Value2 = value2;
}

#endregion

#region Equals and GetHashCode

/// <summary>
/// Indicates whether this instance and a specified object are
equal.
/// </summary>
/// <param name="other">Another <see cref="Structure{T1, T2}"/>
to compare to.</param>
/// <returns>
/// <c>true</c> if <paramref name="other"/> and this instance
represent the same value; otherwise, <c>false</c>.
/// </returns>
public Boolean Equals(Structure<T1, T2> other)
{
// Compare Value1
if ((ReferenceEquals(Value1, null)) !=
(ReferenceEquals(other.Value1, null)))
return false;
if (!ReferenceEquals(Value1, null) && Value1 is
IComparable<T1> &&
((IComparable<T1>)Value1).CompareTo(other.Value1) != 0)
return false;

// Compare Value2
if ((ReferenceEquals(Value2, null)) !=
(ReferenceEquals(other.Value2, null)))
return false;
if (!ReferenceEquals(Value2, null) && Value2 is
IComparable<T2> &&
((IComparable<T2>)Value2).CompareTo(other.Value2) != 0)
return false;

return true;
}

/// <summary>
/// Indicates whether this instance and a specified object are
equal.
/// </summary>
/// <param name="obj">Another object to compare to.</param>
/// <returns>
/// <c>true</c> if obj and this instance are the same type and
represent the same value; otherwise, <c>false</c>.
/// </returns>
public override Boolean Equals(Object obj)
{
if (obj is Structure<T1, T2>)
return Equals((Structure<T1, T2>)obj);
else
return false;
}

/// <summary>
/// Returns the hash code for this instance.
/// </summary>
/// <returns>
/// A 32-bit signed integer that is the hash code for this
instance.
/// </returns>
/// <remarks>
/// Note that this method returns a hash code based on the
values of the fields in the structure.
/// Thus, if the values of these fields change, the structure
will have a new hash code, usually
/// this will not pose a problem since a dictionary that uses
one of these as a key will make its own
/// copy of the contents of the struct.
/// </remarks>
public override Int32 GetHashCode()
{
Int32 hashcode = 0;
if (!ReferenceEquals(Value1, null))
hashcode ^= Value1.GetHashCode();
if (!ReferenceEquals(Value2, null))
hashcode ^= Value2.GetHashCode();
return hashcode;
}

#endregion

#region IComparable<Structure<T1,T2>> Members

/// <summary>
/// Compares the current object with another object of the same
type.
/// </summary>
/// <param name="other">An object to compare with this
object.</param>
/// <returns>
/// A 32-bit signed integer that indicates the relative order
of the objects being compared.
/// If <c>this</c> object is less than <paramref
name="other"/>, a negative value will be returned.
/// If <c>this</c> object is greater than <paramref
name="other"/>, a positive value will be returned.
/// Otherwise, zero will be returned to indicate that
<c>this</c> is equal to <paramref name="other"/>.
/// </returns>
public Int32 CompareTo(Structure<T1, T2> other)
{
Int32 result;

if ((ReferenceEquals(Value1, null)) !=
(ReferenceEquals(other.Value1, null)))
return (ReferenceEquals(Value1, null)) ? -1 : +1;
if (!ReferenceEquals(Value1, null) && Value1 is
IComparable<T1>)
{
result = ((IComparable<T1>)Value1).CompareTo(other.Value1);
if (result != 0)
return result;
}
if ((ReferenceEquals(Value2, null)) !=
(ReferenceEquals(other.Value2, null)))
return (ReferenceEquals(Value2, null)) ? -1 : +1;
if (!ReferenceEquals(Value2, null) && Value2 is
IComparable<T2>)
{
result = ((IComparable<T2>)Value2).CompareTo(other.Value2);
if (result != 0)
return result;
}

return 0;
}

#endregion

#region Overridden Methods

/// <summary>
/// Returns the value of the structure as a string, containing
the values of the individual fields all
/// concatenated to a string in braces.
/// </summary>
/// <returns>
/// A <see cref="String"></see> containing the values of all
the fields in the structure.
/// </returns>
public override String ToString()
{
return String.Format("{{ {0}, {1} }}", Value1, Value2);
}

#endregion

#region Operators

/// <summary>
/// Determines wether two <see cref="Structure{T1, T2}"/>
contains the same values
/// in their fields.
/// </summary>
/// <param name="value1">
/// The first <see cref="Structure{T1, T2}"/> value to compare
to <paramref name="value2"/>.
/// </param>
/// <param name="value2">
/// The second <see cref="Structure{T1, T2}"/> value to compare
to <paramref name="value1"/>.
/// </param>
/// <returns>
/// <c>true</c> if the two structures <paramref name="value1"/>
and <paramref name="value2"/>
/// contains the same values in their fields; otherwise,
<c>false</c>.
/// </returns>
/// <remarks>
/// Equivalent to <c><paramref name="value1"/>.Equals(<paramref
name="value2"/>)</c>.
/// </remarks>
public static Boolean operator ==(Structure<T1, T2> value1,
Structure<T1, T2> value2)
{
return value1.Equals(value2);
}

/// <summary>
/// Determines wether two <see cref="Structure{T1, T2}"/> does
not contain the same values
/// in their fields.
/// </summary>
/// <param name="value1">
/// The first <see cref="Structure{T1, T2}"/> value to compare
to <paramref name="value2"/>.
/// </param>
/// <param name="value2">
/// The second <see cref="Structure{T1, T2}"/> value to compare
to <paramref name="value1"/>.
/// </param>
/// <returns>
/// <c>true</c> if the two structures <paramref name="value1"/>
and <paramref name="value2"/>
/// does not contain the same values in their fields;
otherwise, <c>false</c>.
/// </returns>
/// <remarks>
/// Equivalent to <c>!(<paramref name="value1"/> == <paramref
name="value2"/>)</c>.
/// </remarks>
public static Boolean operator !=(Structure<T1, T2> value1,
Structure<T1, T2> value2)
{
return !(value1 == value2);
}

#endregion
}

#endregion
}
 
C

Chris Shepherd

Dave said:
I seem to be missing one of my favorite STL techniques for key, value
collections - being able to use Composite Keys. What i mean by C.K.
is the ability for a key to compare multiple values when sorting /
searching collections.

E.g. IP Address, and port as key, rather than the single IP Address
value.

in STL i would create an object as key, and override the "<" operator
to compare the composite key (multiple values) - thereby achieving
what i needed. (STL often used equivalence rather than equality for
its collections - hence less than operator overloading)

Can I do something similar in C#?

thanks tons, dave

p.s. i could crudely append values IPAddr and Port# together making a
pseudo composite key, but not pretty.

This isn't strictly answering your composite Key question, but you can use the
IPEndPoint class to tie IP/Port together.

IPEndPoint i = new IPEndPoint(IPAddress.Parse("192.168.0.1"), 80);
http://msdn2.microsoft.com/en-us/library/system.net.ipendpoint(VS.80).aspx

Chris.
 
J

Jeroen Mostert

Jon said:
Yes, they do. I only remember offhand because the hash algorithm *used*
to be simple XOR-ing, which is horrible. It's better now.
Why is that horrible, and how is it better? Assuming the hash functions of
the individual members do their jobs properly and generate nicely
distributed hash codes, how would XOR'ing them be any worse than another
operation? Conversely, if the hash functions of the individual members
*don't* produce nicely distributed hash codes, what sort of algorithm would
minimize the damage for the composite hash code?
 
J

Jon Skeet [C# MVP]

Jeroen Mostert said:
Why is that horrible, and how is it better? Assuming the hash functions of
the individual members do their jobs properly and generate nicely
distributed hash codes, how would XOR'ing them be any worse than another
operation? Conversely, if the hash functions of the individual members
*don't* produce nicely distributed hash codes, what sort of algorithm would
minimize the damage for the composite hash code?

It's quite common to store the same type for both members of a pair. At
that point, XOR gives the unpleasant behaviour. Consider a map with the
following pairs of strings in:

{"hello", "hello"} => hash of 0, XOR of x and x is always 0
{"there", "there"} => hash of 0 again
{"a", "a"} => and again
{"a", "b"} => some value k
{"b", "a"} => k again

Values don't tend to be uniformly distributed - values (x, y) and
(y, x) are likely to come up in the same map in various scenarios, as
are various (x, x) values.

Hash collisions are of course a bad thing, so creating a situation
where common scenarios will lead to collisions is not a good idea.

Personally I tend to use something like:

int hash = 17;
hash = 23*hash + firstField.GetHashCode();
hash = 23*hash + secondField.GetHashCode();
hash = 23*hash + thirdField.GetHashCode();

etc

17 and 23 are arbitrary coprime numbers here.

If you know more about what you're hashing (preferrably in terms of
both its hashcode algorithm and likely values) then you can do a lot
better - but in the meantime, the coprime version is nicer than XOR.
 
J

Jeroen Mostert

Jon said:
It's quite common to store the same type for both members of a pair. At
that point, XOR gives the unpleasant behaviour. Consider a map with the
following pairs of strings in:

{"hello", "hello"} => hash of 0, XOR of x and x is always 0
{"there", "there"} => hash of 0 again
{"a", "a"} => and again
{"a", "b"} => some value k
{"b", "a"} => k again

Values don't tend to be uniformly distributed - values (x, y) and
(y, x) are likely to come up in the same map in various scenarios, as
are various (x, x) values.
Nice. I hadn't considered that -- I haven't had too many scenarios yet where
such collisions could occur, although I've had many where the types were equal.

This could be an common scenario if you're handling coordinates of some
form, though. If you had need to hash those for some purpose, a simple
XOR'ing of values could be disastrous.
Personally I tend to use something like:

int hash = 17;
hash = 23*hash + firstField.GetHashCode();
hash = 23*hash + secondField.GetHashCode();
hash = 23*hash + thirdField.GetHashCode();

etc

17 and 23 are arbitrary coprime numbers here.
Good. Don't forget to wrap that in an unchecked { }, though. :)
 
J

Jon Skeet [C# MVP]

Nice. I hadn't considered that -- I haven't had too many scenarios yet where
such collisions could occur, although I've had many where the types were equal.

This could be an common scenario if you're handling coordinates of some
form, though. If you had need to hash those for some purpose, a simple
XOR'ing of values could be disastrous.

Exactly :) With something as general as anonymous types, it just wasn't
a good idea.
Good. Don't forget to wrap that in an unchecked { }, though. :)

Unchecked is the default apart from for constants. No need to do it
explicitly - unless you've changed the project settings, of course.
 
D

Dave

Exactly :) With something as general as anonymous types, it just wasn't
a good idea.



Unchecked is the default apart from for constants. No need to do it
explicitly - unless you've changed the project settings, of course.

An additional question is, how best to reverse the sort order of a
generic IComparable<> derived class implementation?

I guess the solution involves supplanting the implementation of
IComparable.CompareTo() with a Descending order version, but...
I thought of overriding the IComparable.CompareTo() method in a
derived class and flipping the values returned, but
IComparable.CompareTo() is not defined as virtual - so that's a no go.

These responses are exactly what i was after - thanks so much to all!

dave
 
J

Jon Skeet [C# MVP]

An additional question is, how best to reverse the sort order of a
generic IComparable<> derived class implementation?

Are you using something which allows you to pass in an IComparer<T>? If
so, you could use a class I happened to write this evening:
ReverseComparer. See below for the code. You can get an appropriate
Comparer<T> for a T:IComparable<T> using Comparer<T>.Default.

Of course, none of this helps if the API you're using doesn't allow the
use of an IComparer<T>, I'm afraid.


namespace MiscUtil.Collections
{
/// <summary>
/// Implementation of IComparer[T] based on another one;
/// this simply reverses the original comparison.
/// </summary>
/// <typeparam name="T"></typeparam>
public sealed class ReverseComparer<T> : IComparer<T>
{
readonly IComparer<T> originalComparer;

/// <summary>
/// Returns the original comparer; this can be useful to
/// avoid multiple
/// reversals.
/// </summary>
public IComparer<T> OriginalComparer
{
get { return originalComparer; }
}

/// <summary>
/// Creates a new reversing comparer.
/// </summary>
/// <param name="original">The original comparer to
/// use for comparisons.</param>
public ReverseComparer(IComparer<T> original)
{
original.ThrowIfNull("original");
this.originalComparer = original;
}

/// <summary>
/// Returns the result of comparing the specified
/// values using the original
/// comparer, but reversing the order of comparison.
/// </summary>
public int Compare(T x, T y)
{
return originalComparer.Compare(y, x);
}
}
}

If you're using C# 3, there's an associated extension method:

/// <summary>
/// Extensions to IComparer
/// </summary>
public static class ComparerExt
{
/// <summary>
/// Reverses the original comparer; if it was already a
/// reverse comparer,
/// the previous version was reversed (rather than reversing
/// twice).
/// In other words, for any comparer X,
/// X==X.Reverse().Reverse().
/// </summary>
public static IComparer<T> Reverse<T>
(this IComparer<T> original)
{
ReverseComparer<T> originalAsReverse =
original as ReverseComparer<T>;
if (originalAsReverse != null)
{
return originalAsReverse.OriginalComparer;
}
return new ReverseComparer<T>(original);
}
}
 
J

Jeroen Mostert

Jon said:
Unchecked is the default apart from for constants. No need to do it
explicitly - unless you've changed the project settings, of course.
I still prefer to use unchecked explicitly when I want to communicate
explicitly to maintainers that I'm relying on the wraparound behavior. I
don't usually turn on overflow checking myself, but I'd like the program to
be correct when it's turned on regardless.

I'm used to C, where there's no kind of overflow checking whatsoever (unless
the hardware happens to enforce it), so I like this opportunity to be explicit.
 
J

Jon Skeet [C# MVP]

I still prefer to use unchecked explicitly when I want to communicate
explicitly to maintainers that I'm relying on the wraparound behavior. I
don't usually turn on overflow checking myself, but I'd like the program to
be correct when it's turned on regardless.

That makes a lot of sense, yes.
I'm used to C, where there's no kind of overflow checking whatsoever (unless
the hardware happens to enforce it), so I like this opportunity to be explicit.

:)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top