Hi, I have an ArrayList in C# and I need to remove duplicate groups of data
in it. After the ArrayList is populated, each group of data is 5 elements
in order (0-4, 5-9, 10-14, ...): name (string), number (string),
user (string), startdate (datetime), enddate (datetime).
I would say this is a great example of how _not_ to store your data, as
well as of the hazards of using a type-agnostic data structure like
ArrayList.
I think it's a really bad idea to store data items in a collection in the
way you've done it, for a variety of reasons, but mainly because a
collection ought to have a direct mapping between a single data type and
an element in the collection. There's no way to describe what your
collection actually contains, because each element is different from 80%
of the other elements.
As far as your specific question goes...
I would first suggest that you stop using an ArrayList like that. Create
a new data structure that contains each of the data elements you're
referencing, and then store that data structure in a collection. Then,
make sure the new data structure implements IComparable, so that you can
do things like sorting and comparing for equality.
Then you can just store the data in a List<>, sorting the list after
you've received all of the data and removing the duplicates with a linear
scan through the list. Yet another alternative would be to use a
SortedList<>, adding items only if they already aren't in the list. This
avoids having to sort at the end, but of course you have the overhead of
inserting elements as you go along.
Alternatively, you can use a Dictionary<> containing your data, with the
data structure itself being its own key. That provides a fast and
convenient way to ensure that you only ever have one instance of any given
group of data. You'd have to override the Object.GetHashCode() method for
your new data structure though, basing your hash code on the data in the
object.
A dictionary would be faster than the sorted list, with the trade-off that
it requires you also implement GetHashCode(), and of course you don't get
a sorted list of data in the end (but that may not be needed anyway).
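As a sketch of the Dictionary<> approach (the DataItem fields here are
trimmed down for illustration; the point is the Equals()/GetHashCode()
overrides, which let the dictionary detect duplicate groups):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Illustrative two-field version of the data structure
    struct DataItem
    {
        public readonly string Name;
        public readonly int Number;

        public DataItem(string name, int number)
        {
            Name = name;
            Number = number;
        }

        public override bool Equals(object obj)
        {
            if (!(obj is DataItem))
            {
                return false;
            }
            DataItem other = (DataItem)obj;
            return Name == other.Name && Number == other.Number;
        }

        public override int GetHashCode()
        {
            // Combine the hash codes of the fields; any reasonable
            // combination will do
            return Name.GetHashCode() ^ Number.GetHashCode();
        }
    }

    static void Main()
    {
        Dictionary<DataItem, object> seen = new Dictionary<DataItem, object>();
        DataItem[] items = {
            new DataItem("a", 1), new DataItem("a", 1), new DataItem("b", 2)
        };

        foreach (DataItem di in items)
        {
            if (!seen.ContainsKey(di))
            {
                seen.Add(di, null);   // the key is all we care about
            }
        }

        Console.WriteLine(seen.Count);  // 2 -- the duplicate was dropped
    }
}
```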
If for some reason you _must_ have this data in an ArrayList that looks
like what you've described, I would still do the above, but then convert
the results back to an ArrayList as needed. You could either brute-force
the problem by scanning the ArrayList considering five elements at a time
(easy to write, but terrible performance), or manually sort the list,
again using five elements at a time for the sort (harder to write, but
good performance). But the framework won't give you any help there, and
while it's not hard to implement your own sort, it would be a pain
(especially given the "five elements at a time" requirement that would
lead you to have to do that in the first place) and certainly much more
trouble than just creating a new structure that implements IComparable.
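If you do need the flat ArrayList back at the end, the conversion might
look something like this (DataItem and its fields are illustrative
stand-ins for whatever structure you define):

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

class Program
{
    // Illustrative stand-in for whatever structure you define
    struct DataItem
    {
        public string Name;
        public int Number;
        public string User;
        public DateTime StartDate;
        public DateTime EndDate;
    }

    // Flatten each DataItem back into the five-elements-per-group
    // ArrayList layout described in the question
    static ArrayList ToFlatArrayList(List<DataItem> list)
    {
        ArrayList result = new ArrayList();
        foreach (DataItem di in list)
        {
            result.Add(di.Name);
            result.Add(di.Number);
            result.Add(di.User);
            result.Add(di.StartDate);
            result.Add(di.EndDate);
        }
        return result;
    }

    static void Main()
    {
        List<DataItem> list = new List<DataItem>();
        list.Add(new DataItem { Name = "a", Number = 1, User = "u",
            StartDate = DateTime.MinValue, EndDate = DateTime.MaxValue });

        ArrayList flat = ToFlatArrayList(list);
        Console.WriteLine(flat.Count);  // 5 -- one group of five elements
    }
}
```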
If you have a data structure like this:
struct DataItem : IComparable<DataItem>
{
    public readonly string Name;
    public readonly int Number;
    public readonly string User;
    public readonly DateTime StartDate;
    public readonly DateTime EndDate;

    public DataItem(string name, int number, string user,
        DateTime startDate, DateTime endDate)
    {
        Name = name;
        Number = number;
        User = user;
        StartDate = startDate;
        EndDate = endDate;
    }

    // Must be a public instance method to satisfy IComparable<DataItem>
    public int CompareTo(DataItem diOther)
    {
        // Compare field by field, returning at the first difference
        int compareResult = Name.CompareTo(diOther.Name);
        if (compareResult != 0)
        {
            return compareResult;
        }

        compareResult = Number.CompareTo(diOther.Number);
        if (compareResult != 0)
        {
            return compareResult;
        }

        compareResult = User.CompareTo(diOther.User);
        if (compareResult != 0)
        {
            return compareResult;
        }

        compareResult = StartDate.CompareTo(diOther.StartDate);
        if (compareResult != 0)
        {
            return compareResult;
        }

        return EndDate.CompareTo(diOther.EndDate);
    }
}
Then you can write code like this:
void AddDataItem(SortedList<DataItem, object> list, DataItem di)
{
    if (!list.ContainsKey(di))
    {
        // All we really care about is the key, so don't bother
        // with a non-null value
        list.Add(di, null);
    }
}
Or, not doing any of the duplicate-removing work until the end (i.e. using
a List<>):
// Do this for each new data item
void AddDataItem(List<DataItem> list, DataItem di)
{
    list.Add(di);
}

// Once you've got all the data, do this
void RemoveDuplicates(List<DataItem> list)
{
    if (list.Count > 0)
    {
        list.Sort();

        int idi = 1;
        DataItem di = list[0];

        while (idi < list.Count)
        {
            // Remove the current element if it duplicates the previous
            // one; otherwise advance to it
            if (di.CompareTo(list[idi]) == 0)
            {
                list.RemoveAt(idi);
            }
            else
            {
                di = list[idi++];
            }
        }
    }
}
Some notes:
* The above isn't optimized. For example, you could improve the
removal performance by going backwards. But you said you've only got
hundreds of elements, and they come from a web service, so it seems
unlikely you'd get enough of a performance improvement to make it worth
obfuscating the code here.
* At least in the case of the List<> class, you can sort without the
elements implementing IComparable, as long as you provide a comparison
delegate at the time you do the sort. So strictly speaking you don't need
to implement IComparable for your data structure (the CompareTo() method
could just be passed directly to the Sort() method as the comparer
method). Since the code has to go _somewhere_ and since there are other
potential benefits to implementing IComparable, I prefer doing so. I
think that using the Sort() overloads that take comparers is more useful
for data types you don't have control over.
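For example, a sort using a comparison delegate instead of IComparable
might look like this (the fields are illustrative only):

```csharp
using System;
using System.Collections.Generic;

class Program
{
    // Illustrative structure with no IComparable implementation
    struct DataItem
    {
        public string Name;
        public int Number;
    }

    static void Main()
    {
        List<DataItem> list = new List<DataItem>();
        list.Add(new DataItem { Name = "b", Number = 2 });
        list.Add(new DataItem { Name = "a", Number = 1 });

        // Supply the comparison inline rather than implementing
        // IComparable on the type itself
        list.Sort(delegate(DataItem x, DataItem y)
        {
            int result = x.Name.CompareTo(y.Name);
            return result != 0 ? result : x.Number.CompareTo(y.Number);
        });

        Console.WriteLine(list[0].Name);  // "a" sorts first
    }
}
```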
Pete