Hashtable corrupted (.Net bug?)

  • Thread starter Thread starter Guest
  • Start date Start date
Hello

I ran you test program and got the same result.
But when I modified your Equals method (see below) I got this message
Equals called with an object with value
System.Collections.Hashtable+bucket[]
0

This means that the problem is like stefan said, happens when an object is
removed whose hashcode happens to be zero.
You should fix your Equals method.

public override bool Equals(object obj)
{
SdtsMsgKey key = obj as SdtsMsgKey;
if(key == null)
{
if(obj == null)
Trace.WriteLine("Equals called with a null object");
else
Trace.WriteLine("Equals called with an object with value " +
obj.GetType().FullName);
Trace.WriteLine(this.GetHashCode().ToString());
return false;
}
return ((this.ClientID == key.ClientID) &&
(this.MsgReference == key.MsgReference));
}
Ken said:
Hi Jon,
I have finally managed to re-produce the problem using my test program. I
paste the program below. Could you please take a look at it? It basically
creates two threads: one adding element to the common Hashtable, and the
other removing element from the Hashtable. The Hashtable is synchronized
using Hashtable.Synchronized(myHashtable). Initially I thought this was the
cause of the problem, but later after I change to use
lock(myHashtable.SyncRoot) before adding or removing element, the problem
still existed, so I suspect this is a bug with the .Net Framework. The
NullReferenceException will happen after the program runing for some time
(within one hour).

//Here is the test program

using System;
using System.Threading;
using System.Collections;
using System.Diagnostics;

namespace HashtableTest
{
/// <summary>
/// Summary description for Class1.
/// </summary>
class HashtableTesting
{
static Hashtable PendingAckMsgs = null;

/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main(string[] args)
{
TextWriterTraceListener myWriter = new
TextWriterTraceListener(System.Console.Out);
Trace.Listeners.Add(myWriter);

Hashtable pendingAckMsgsUnsync = new Hashtable();
PendingAckMsgs = Hashtable.Synchronized(pendingAckMsgsUnsync);

//
// TODO: Add code to start application here
//
Thread addingThread = new Thread(new ThreadStart(AddEle));
Thread removingThread = new Thread(new ThreadStart(RemoveEle));

addingThread.IsBackground = true;
removingThread.IsBackground = true;

addingThread.Start();
removingThread.Start();

Console.ReadLine();
}

static void AddEle()
{
Trace.WriteLine("Start of adding thread.");
try
{
byte refNum = 0;
long clientID = 0;
while (true)
{
SdtsMsgKey key = new SdtsMsgKey();
key.ClientID = clientID;
key.MsgReference = refNum;

PendingAckMsgs.Add(key, "object");

refNum++;
clientID++;

Thread.Sleep(500);
}
}
catch(Exception e)
{
Trace.WriteLine(e.ToString());
}
}

static void RemoveEle()
{
Trace.WriteLine("Start of removing thread.");
try
{
Thread.Sleep(500);

byte refNum = 0;
long clientID = 0;
while (true)
{
SdtsMsgKey key = new SdtsMsgKey();
key.ClientID = clientID;
key.MsgReference = refNum;

PendingAckMsgs.Remove(key);

refNum++;
clientID++;
Thread.Sleep(1000);
}
}
catch(Exception e)
{
Trace.WriteLine(e.ToString());
}
}
}
}


//Here is the self-implemented key class

using System;
using System.Text;

namespace HashtableTest
{
/// <summary>
/// Key for outgoing SDTS messages
/// </summary>
public class SdtsMsgKey
{
/// <summary>
/// Destination ISSI
/// </summary>
public long ClientID
{
get
{
return clientID;
}
set
{
clientID = value;
}
}

/// <summary>
/// Message reference is a unique identifier for the messages to
each destination
/// </summary>
public byte MsgReference
{
get
{
return msgReference;
}
set
{
msgReference = value;
}
}

/// <summary>
/// Serves as a hash function for a particular type,
/// suitable for use in hashing algorithms and data structures like
a hash table.
/// EXPLANATION:
/// ClientID 3 bytes (shift left 3 bytes x 8 bits) bitwise-OR
MsgReference 1 byte
/// Makes 4 bytes exactly an int.
/// </summary>
/// <returns>Hash code</returns>
public override int GetHashCode()
{
return ((int) (ClientID << 24) | (int) MsgReference);
}

/// <summary>
/// Determines whether the specified Object is equal to the current
Object.
/// Required for use in data structures like a hash table.
/// </summary>
/// <param name="obj"></param>
/// <returns></returns>
public override bool Equals(object obj)
{
SdtsMsgKey key = obj as SdtsMsgKey;

return ((this.ClientID == key.ClientID) &&
(this.MsgReference == key.MsgReference));
}

/// <summary>
/// Get the string representation of the object.
/// </summary>
/// <returns>String representation of the object</returns>
public override string ToString()
{
StringBuilder msgString = new StringBuilder();
msgString.AppendFormat("ClientID={0}\n", ClientID);
msgString.AppendFormat("MsgReference={0}\n", MsgReference);
return msgString.ToString();
}

private long clientID = 0;
private byte msgReference = 0;
}
}



Jon Skeet said:
Hmm... that does seem very odd. I'll have a look at it more closely
tonight. Do you have a test program which demonstrates the problem
(even after several hours)? Anything to help reproduce it would be
good.
 
Sherif ElMetainy said:
I disagree that this should be considered a bug in hashtable, since Equals
method should return false if the bucket array is passed.

That doesn't mean that the hashtable should pass it. It shouldn't be
passing a null key *at all*. Yes, with an appropriate Equals method it
doesn't cause a problem, but that doesn't mean it's not a bug.
 
Hello

It is not passing null, it is passing a bucket array as stephan mentioned,
but even if it passes null to a correctly implemented Equals method, it
shouldn't be a problem.

So the question is should Hashtable implementation take into consideration a
misbehaving Equals method? Should it assume that the Equals method is
implemented correctly?

Personally I favour the second choice. Note that the Equals method is
throwing a NullReferenceException, and the documetation of Object.Equals
method says that it should never throw any exceptions.

Best regards,
Sherif
 
Hi Stefan,
Thank you so much for your explanation on the cause of the problem. Base on
this investigation, we want to come out a workaround solution. To do that,
could you help me answer a few questions below:

1. Does the problem only happen in the scenario that you described? Will it
happen for other hashcode? For example, if more than one objects with same
hashcode are added to the hashtable, then remove one of them, and then add
another object with the same hashcode to the hashtable again, will it pass
the internal bucket array to the Equal method?

2. If I change my Equal method to check null, will it affect the performance
very much? Or in other words, will the Equal method be called very often?

3. If I change my Equal method to check null, will it cause memory leak?
Because from our log file, we see that the null key will stay in the
hashtable for a long time (every insersion to the hashtable will cause the
NullReferenceException).

4. Is it confirmed that dotnet 2.0 beta 1 (2.0.40607) will fix the problem?
If so, when will this version be released?

Thank you for your help.

Regards,
Ken

Stefan Simek said:
It really seems to be a bug in the hashtable! It has nothing to do with the
threads though. All you need to crash it, is the following:

1. Add two distinct objects with hashcode 0
2. Remove one of them
3. Add it again

This causes the Hashtable to send it's internal buckets array to the Equals
comparision function, which is certainly a bug, though it won't show if you
have your Equals function written properly (that is, test if the compared
object is of the same type, or for null after an 'as' cast).

Here's a pretty simple code that shows the problem:

using System;
using System.Collections;

namespace HashtableTest
{
/// <summary>
/// Summary description for Class1.
/// </summary>
class HashtableTesting
{
class TestKey
{
int value;

public TestKey(int value) { this.value = value; }

public override int GetHashCode()
{
return 0; // changing this to anything else than zero removes the
problem
}

public override bool Equals(object obj)
{
TestKey key = obj as TestKey;

// uncommeting this also removes the following also removes the problem
//if (key == null)
// return false;

return value == key.value;
}
}

/// <summary>
/// The main entry point for the application.
/// </summary>
[STAThread]
static void Main(string[] args)
{
Hashtable test = new Hashtable();

object key0 = new TestKey(0), key1 = new TestKey(1);

test.Add(key0, "object");
test.Add(key1, "object");
test.Remove(key0);
test.Add(key0, "object");
}
}
}

This is apparently fixed in dotnet 2.0 beta 1 (2.0.40607)

HTH,
Stefan

Ken said:
Hi Stefan,
I have finally managed to re-produce the problem. Could you take a look at
my latest reply to Jon's post? (the 1/12/2005 one)

Regards,
Ken
 
Hi Sherif,
Could you take a look at my reply to Stefan? I raised a few questions in the
reply. Here can you clarify my understanding of the implementation of the
Hashtable: Does each hashcode have its own bucket? When you said passing a
bucket array, did you mean passing the bucket corresponding to the hashcode
but not all the buckets?

Regards,
Ken

Sherif ElMetainy said:
Hello

It is not passing null, it is passing a bucket array as stephan mentioned,
but even if it passes null to a correctly implemented Equals method, it
shouldn't be a problem.

So the question is should Hashtable implementation take into consideration a
misbehaving Equals method? Should it assume that the Equals method is
implemented correctly?

Personally I favour the second choice. Note that the Equals method is
throwing a NullReferenceException, and the documetation of Object.Equals
method says that it should never throw any exceptions.

Best regards,
Sherif
 
Sherif ElMetainy said:
It is not passing null, it is passing a bucket array as stephan mentioned,
but even if it passes null to a correctly implemented Equals method, it
shouldn't be a problem.

Passing the bucket array is even worse - that's a structure which is
internal to the hashtable, and should *never* be exposed to the outside
world.
So the question is should Hashtable implementation take into consideration a
misbehaving Equals method? Should it assume that the Equals method is
implemented correctly?

While it should assume that Equals is implemented correctly, it
shouldn't be exposing its internal data structures, nor calling Equals
without any cause.

Put it this way - if every time it needed to make a key comparison it
called Equals 100 times, would you count that as a bug? I would, but
your reasoning so far suggests that it wouldn't be.

Do you believe the Hashtable authors really *wanted* to pass the bucket
array? Do you believe that was their intention? If it wasn't, it's a
bug. The class is not behaving as designed.

If you believe it *was* their intention, I'd like to know what good it
does.
Personally I favour the second choice. Note that the Equals method is
throwing a NullReferenceException, and the documetation of Object.Equals
method says that it should never throw any exceptions.

Sure, but that's not my point.
 
Ken said:
Hi Stefan,
Thank you so much for your explanation on the cause of the problem. Base on
this investigation, we want to come out a workaround solution. To do that,
could you help me answer a few questions below:

1. Does the problem only happen in the scenario that you described? Will it
happen for other hashcode? For example, if more than one objects with same
hashcode are added to the hashtable, then remove one of them, and then add
another object with the same hashcode to the hashtable again, will it pass
the internal bucket array to the Equal method?

No it happens when the hashcode is zero
2. If I change my Equal method to check null, will it affect the performance
very much? Or in other words, will the Equal method be called very often?

You equals method will be called the same number of times, but there will be
an extra check inside the Equals method itself. I don't think it will affect
peformance much since things like network and database connections or disk
access affect performance much more.
3. If I change my Equal method to check null, will it cause memory leak?
Because from our log file, we see that the null key will stay in the
hashtable for a long time (every insersion to the hashtable will cause the
NullReferenceException).
No it will not, it is not a null key that's causing the problem. The problem
is that Equals is passed the internal buckets array in some cases
4. Is it confirmed that dotnet 2.0 beta 1 (2.0.40607) will fix the problem?
If so, when will this version be released?
I am not sure about that, but even if it is fixed, your Equals method should
check for null anyways

Best regards,
Sherif
 
Passing the bucket array is even worse - that's a structure which is
internal to the hashtable, and should *never* be exposed to the outside
world.
You have a point there I agree
Put it this way - if every time it needed to make a key comparison it
called Equals 100 times, would you count that as a bug? I would, but
your reasoning so far suggests that it wouldn't be.

This is not my point
if(obj != null && obj != buckets && key.Equals(obj)) // the first 2 checks
here are redundant, since they are rare and Equals should do them anyways
or
if(key.Equals(obj)) // for a correctly implemented Equals method, this is
not a problem

Best regards,
Sherif
 
If you believe it *was* their intention, I'd like to know what good it
does.

I don't know, we have to ask them :)
But may be they were aware that in the *very rare* event that 2 different
keys with the hash code zero are added, then one of them is deleted, an
extra *redundant* check is needed for every add operation that a properly
implemented Equals method should do anyways. So may be they left this on
purpose so that they improve performance.

// 2 extra redundant checks that a peroperly implemented Equals should do
anyways
// The 2 extra checks would happen with *every* call to Equals
// obj == buckets is true in very rare event
if(obj != null && obj != buckets && key.Equals(obj)) {
// Do something
}

if(keyEquals(obj)) // this one performs better
{
// Do something
}

Anyways my point is that, bug or no bug, Ken's (the original poster) problem
is solved if Equals is implemented correctly.

Best regards,
Sherif
 
Sherif ElMetainy said:
You have a point there I agree


This is not my point
if(obj != null && obj != buckets && key.Equals(obj)) // the first 2 checks
here are redundant, since they are rare and Equals should do them anyways
or
if(key.Equals(obj)) // for a correctly implemented Equals method, this is
not a problem

But again, just because a well-implemented Equals method wouldn't care
doesn't mean it isn't a bug. It only isn't a bug if it's the designed
behaviour, which I very much doubt that it is.
 
Sherif ElMetainy said:
I don't know, we have to ask them :)
But may be they were aware that in the *very rare* event that 2 different
keys with the hash code zero are added, then one of them is deleted, an
extra *redundant* check is needed for every add operation that a properly
implemented Equals method should do anyways.

I would rather hope that isn't the case, actually.
So may be they left this on purpose so that they improve performance.

Why would any of that involve calling Equals with the hash bucket
array? What can that possibly achieve, given that the Equals method
being called isn't one which is meant to know about hash bucket arrays?

Anyways my point is that, bug or no bug, Ken's (the original poster)
problem is solved if Equals is implemented correctly.

Yes - I agreed with that earlier. I was only disagreeing with your
suggestion that it shouldn't be regarded as a bug.
 
Hi Sherif,
I am doubting whether checking Null in the Equal method can solve the
problem. Yes, it won't throw the NullReferenceException any more, but under
that error scenario (two elements with hashcode 0 are inserted, one is
removed and inserted again), it seems that every element with hashcode 0 can
be added to the hashtable as the Equal method will always return false (the
bucket instead of key will be passed in the Equal method, so the "as"
statement will always return null). This can lead to duplicated key with
hashcode 0 in the hashtable.

Regards,
Ken
 
If I use string as the key (something like "clientID:ReferenceNumber"), will
it permanantly solve the problem?
 
Ken said:
I am doubting whether checking Null in the Equal method can solve the
problem. Yes, it won't throw the NullReferenceException any more, but under
that error scenario (two elements with hashcode 0 are inserted, one is
removed and inserted again), it seems that every element with hashcode 0 can
be added to the hashtable as the Equal method will always return false (the
bucket instead of key will be passed in the Equal method, so the "as"
statement will always return null).

No - there's an *extra* call to Equals, not a *replacement* one, I
believe.
 
I have a simlilar problem of a segv in a hast table find method when the process runs for a long time... wht was the solution you decided? mine happens on a 64 bit architecture and on linux. code in c,c++... could u mind to comment pls.
 
Back
Top