Aggregate unions

  • Thread starter: mota

Hi,

In VS2010 (.NET 4) I get a runtime error:

Unable to cast object of type '<UnionIterator> ... to type
'System.Collections.Generic.HashSet ...

when aggregating unions of HashSets in a collection, like:

Dictionary<CustomType1,
HashSet<CustomType2>>.Values.Aggregate((workingSet, nextSet) =>
[System.InvalidCastException] (HashSet<CustomType2>)
workingSet.Union(nextSet)) [/System.InvalidCastException]

The debugger stops at the expression marked by [System.InvalidCastException].
 
OK, here is a copy-paste example:

List<HashSet<int>> lhsi = new List<HashSet<int>>();
HashSet<int> hsi = new HashSet<int>();
hsi.Add(1);
hsi.Add(2);
lhsi.Add(hsi);
hsi = new HashSet<int>();
hsi.Add(2);
hsi.Add(3);
lhsi.Add(hsi);
Console.WriteLine(lhsi.Aggregate((workingSet, nextSet) =>
(HashSet<int>) workingSet.Union(nextSet)).Count);
 
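The Enumerable.Union() method you're using here is not really suitable
for creating a union of two HashSet<T> instances. It returns a lazily
evaluated IEnumerable<T> (the <UnionIterator> object named in the
exception), not a HashSet<T>, which is why the cast fails. You can get it
to work by creating a new HashSet<T> from the return value of the Union()
method: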

lhsi.Aggregate((workingSet, nextSet) =>
new HashSet<int>(workingSet.Union(nextSet)))

…but it's not really a very efficient way to accomplish the operation,
because all that the Enumerable type knows about the inputs is that they
are an enumeration of ints. Internally, it has to recreate all the
structure that the HashSet<T> type already has, and then of course you
have to take the output and create yet another HashSet<T>.
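To make the cast failure concrete, here is a minimal sketch. The class and method names (UnionCastDemo, UnionReturnsHashSet, UnionViaLinq) are made up for illustration and are not from the original post:

```csharp
using System.Collections.Generic;
using System.Linq;

static class UnionCastDemo
{
    // Reports whether Enumerable.Union hands back an actual HashSet<int>.
    // It does not: the result is a lazily evaluated IEnumerable<int>.
    public static bool UnionReturnsHashSet(HashSet<int> a, HashSet<int> b)
    {
        IEnumerable<int> lazyUnion = a.Union(b);
        return lazyUnion is HashSet<int>;
    }

    // The workaround from the post: copy each intermediate union into a
    // new HashSet<int>. Like any unseeded Aggregate, this throws
    // InvalidOperationException if 'sets' is empty.
    public static HashSet<int> UnionViaLinq(IEnumerable<HashSet<int>> sets)
    {
        return sets.Aggregate((workingSet, nextSet) =>
            new HashSet<int>(workingSet.Union(nextSet)));
    }
}
```

For the sets {1, 2} and {2, 3} from the example, UnionReturnsHashSet returns false and UnionViaLinq returns a set with three elements.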

A much more efficient approach would be to simply use the regular
HashSet<T> features, and specifically the UnionWith() method:

HashSet<int> aggregate = null;

foreach (HashSet<int> set in lhsi)
{
    if (aggregate == null)
    {
        aggregate = new HashSet<int>(set);
    }
    else
    {
        aggregate.UnionWith(set);
    }
}

Pete
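The UnionWith() loop above could also be wrapped in a reusable helper. This is only a sketch: SetUnions and UnionAll are hypothetical names, and unlike the loop in the post it returns an empty set rather than null for an empty input, and it uses the default equality comparer:

```csharp
using System.Collections.Generic;

static class SetUnions
{
    // Hypothetical helper: unions every set in 'sets' into a fresh
    // HashSet<T>, leaving the input sets unmodified.
    public static HashSet<T> UnionAll<T>(IEnumerable<HashSet<T>> sets)
    {
        var result = new HashSet<T>();
        foreach (HashSet<T> set in sets)
        {
            result.UnionWith(set);
        }
        return result;
    }
}
```

With the list from the example, SetUnions.UnionAll(lhsi).Count is 3.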
 
Peter Duniho said:
A much more efficient approach would be to simply use the regular
HashSet<T> features, and specifically the UnionWith() method:

Thanks, I'll stick with UnionWith(). I just realized lambda functions can
have more than one statement :)

This works, although if no initial accumulator seed is specified, the
first set in the collection is used as the accumulator and gets modified:

lhsi.Aggregate(new HashSet<int>(), (workingSet, nextSet) =>
{ workingSet.UnionWith(nextSet); return workingSet; });
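The difference between the seeded and unseeded Aggregate calls can be sketched as follows. The class and method names are invented for the illustration:

```csharp
using System.Collections.Generic;
using System.Linq;

static class AggregateSeedDemo
{
    // Seeded: the union accumulates in a new, empty set, so the input
    // sets are left untouched.
    public static HashSet<int> UnionSeeded(List<HashSet<int>> sets)
    {
        return sets.Aggregate(new HashSet<int>(), (workingSet, nextSet) =>
            { workingSet.UnionWith(nextSet); return workingSet; });
    }

    // Unseeded: Aggregate uses the first element as the accumulator, so
    // sets[0] itself is modified and returned.
    public static HashSet<int> UnionUnseeded(List<HashSet<int>> sets)
    {
        return sets.Aggregate((workingSet, nextSet) =>
            { workingSet.UnionWith(nextSet); return workingSet; });
    }
}
```

After UnionUnseeded(lhsi), the returned set is the same object as lhsi[0], which now holds all three elements. After UnionSeeded(lhsi) on fresh input, lhsi[0] still holds only its original two.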
 