Does linq solve my problem

Michael C · Feb 8, 2009

Stefan Hoffmann said:
Have you read Frans Bouma's post? The grouping on an anonymous type is a
very smart solution.

I like it:

.GroupBy(i => new { Tab = i.Tab, Value = i.Value })

It's not a bad solution, but it's the third solution I would pick. It's not
as efficient as the other methods because it creates a new anonymous object
for each row. The first method I suggested, overriding GetHashCode, is not
always possible: you can only do it once per class, you might want to group
by different properties at different times, and GetHashCode has to agree
with Equals, so two objects that are considered equal must return the same
value. The custom comparer can be done in a few lines of code, and fewer if
you use an existing class.
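
For illustration, a minimal sketch of the custom comparer option. The Row class, TabValueComparer and ComparerDemo are hypothetical names made up for this example; only the Tab and Value members come from the thread, and both are assumed to be strings.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type; only the Tab and Value names come from the thread.
public class Row
{
    public string Tab;
    public string Value;
}

// Hypothetical comparer: two rows are equal when both Tab and Value match.
public class TabValueComparer : IEqualityComparer<Row>
{
    public bool Equals(Row x, Row y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.Tab == y.Tab && x.Value == y.Value;
    }

    public int GetHashCode(Row r)
    {
        if (r == null) return 0;
        // Equal rows must produce equal hashes; unequal rows may still collide.
        int h1 = r.Tab == null ? 0 : r.Tab.GetHashCode();
        int h2 = r.Value == null ? 0 : r.Value.GetHashCode();
        unchecked { return (h1 * 397) ^ h2; }
    }
}

public static class ComparerDemo
{
    // Groups on the rows themselves, so no extra key object is created per row.
    public static List<int> CountPerGroup(IEnumerable<Row> rows)
    {
        return rows.GroupBy(r => r, new TabValueComparer())
                   .Select(g => g.Count())
                   .ToList();
    }
}
```

Because the comparer is passed to GroupBy, the Row class itself stays untouched, and a different comparer can be used when grouping by other properties.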

Michael

Michael C · Feb 8, 2009

Frans Bouma said:
that's what grouping does. see my code in another post.

From what I understand grouping doesn't do what he wants. For this data:

TAB Value
A1 c1
A1 c1
A1 c1
A1 c1
A1 c1
A3 23
A3 23
A3 23
A6 45
A6 45
A1 c1
A1 c1
A1 c1

he wants 5,3,2,3. Grouping will give him 8,3,2
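
To make the difference concrete, here is a small sketch that runs GroupBy with an anonymous-type key over the sample data above. The GroupByDemo class and the string[] row representation are assumptions made for this example, not the original poster's types.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // The sample rows from the post as { Tab, Value } pairs, in original order.
    public static List<string[]> SampleRows()
    {
        var rows = new List<string[]>();
        rows.AddRange(Enumerable.Repeat(new[] { "A1", "c1" }, 5));
        rows.AddRange(Enumerable.Repeat(new[] { "A3", "23" }, 3));
        rows.AddRange(Enumerable.Repeat(new[] { "A6", "45" }, 2));
        rows.AddRange(Enumerable.Repeat(new[] { "A1", "c1" }, 3));
        return rows;
    }

    // GroupBy merges every row with the same key, wherever it appears.
    public static List<int> GroupedCounts()
    {
        return SampleRows()
            .GroupBy(r => new { Tab = r[0], Value = r[1] })
            .Select(g => g.Count())
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", GroupedCounts())); // prints 8,3,2
    }
}
```

The two separate runs of A1/c1 (5 rows and 3 rows) end up in a single group of 8, which is why plain grouping cannot produce 5,3,2,3.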

Michael

Michael C · Feb 8, 2009

Tony Johansson said:
Hello!

No I haven't lost interest in this issue.

You are absolutely right, I'm not allowed to move rows.
I only want to count the number of consecutive rows where the TAB and the
value are the same.
Is this possible?

Yep, I posted a solution previously. Here it is again in case you can't see
the post.

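using System;
using System.Collections.Generic;
using System.Linq;

public class CountConsec<T>
{
    public T Key;
    public int Count;
}

// Extension methods have to live in a static class.
public static class ConsecExtensions
{
    // Yields one entry per run of consecutive items that share the same key.
    public static IEnumerable<CountConsec<TResult>> GetConsecCounts<T, TResult>(
        this IEnumerable<T> source, Func<T, TResult> selector)
    {
        // The default comparer copes with null keys, unlike val.Equals(prev).
        EqualityComparer<TResult> comparer = EqualityComparer<TResult>.Default;
        TResult prev = default(TResult);
        int count = 0;
        foreach (T item in source)
        {
            TResult val = selector(item);
            // The key changed: emit the finished run and start a new one.
            if (count > 0 && !comparer.Equals(val, prev))
            {
                yield return new CountConsec<TResult> { Key = prev, Count = count };
                count = 0;
            }
            count++;
            prev = val;
        }
        // Emit the final run, unless the source was empty.
        if (count > 0)
            yield return new CountConsec<TResult> { Key = prev, Count = count };
    }
}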

Usage would then be:

MyCollection.GetConsecCounts(i => i.Tab).Select(i => i.Count);

Michael

Frans Bouma [C# MVP] · Feb 8, 2009

Michael said:
It's not a bad solution, but it's the third solution I would pick. It's not
as efficient as the other methods because it creates a new anonymous object
for each row. The first method I suggested, overriding GetHashCode, is not
always possible: you can only do it once per class, you might want to group
by different properties at different times, and GetHashCode has to agree
with Equals, so two objects that are considered equal must return the same
value. The custom comparer can be done in a few lines of code, and fewer if
you use an existing class.

Why is the grouping less efficient? It uses the grouping logic built
into Linq to Objects, which is what you re-implemented as your code
effectively does a grouping loop.

If the TS wants real efficient code, he should build his own loop etc.
but he wanted a linq-based solution so a 3-liner linq query is the one
to use.

FB

--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------

Frans Bouma [C# MVP] · Feb 8, 2009

Michael said:
From what I understand grouping doesn't do what he wants. For this data:

TAB Value
A1 c1
A1 c1
A1 c1
A1 c1
A1 c1
A3 23
A3 23
A3 23
A6 45
A6 45
A1 c1
A1 c1
A1 c1

he wants 5,3,2,3. Grouping will give him 8,3,2

Michael

Agreed, I didn't anticipate this situation, his suggested test data did
give the results he wanted. (Yay for example driven testing!)

With the sequence given above, it's not doable with normal LINQ operators
(in a short query like I proposed earlier), because all operators give
information about elements at whatever position, e.g. the count of A1, c1,
but not the different groups of A1, c1. So the TS indeed should build a
loop and count on the fly, using a couple of dictionaries, and eventually
build an extension method out of that if he needs it multiple times.
However, the requirement seems so unique that a loop would do.

FB


Michael C · Feb 8, 2009

Frans Bouma said:
Why is the grouping less efficient? It uses the grouping logic built into
Linq to Objects, which is what you re-implemented as your code effectively
does a grouping loop.

As I said, because it creates a new object for each iteration. You're also
doing string comparisons instead of int comparison.

If the TS wants real efficient code, he should build his own loop etc.

That's not as easy as it sounds, as writing an efficient grouping can be
difficult. I think the LINQ query will be quite an efficient solution if
optimised. MS built in the GetHashCode and Equals methods for a reason.

but he wanted a linq-based solution so a 3-liner linq query is the one to
use.

It's not a bad solution but as I said it's the third solution I would pick.

Michael

Michael C · Feb 8, 2009

Frans Bouma said:
Agreed, I didn't anticipate this situation, his suggested test data did
give the results he wanted. (Yay for example driven testing!)

Yep, it would have been better if his sample data demonstrated the actual
problem.

With the sequence given above, it's not doable to do this with normal linq
operators (in a short query like I proposed earlier) because all operators
give information about elements on whatever position, e.g. the count of
A1, c1, but not the different groups of A1, c1. So TS indeed should build
a loop and count on the fly, using a couple of dictionaries

You wouldn't need a dictionary at all because you can just detect when
values change.
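
A minimal sketch of that idea: one pass that compares each row's key with the previous one. The RunLengthDemo class is a made-up name, and joining Tab and Value into a single string key is an assumption to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public static class RunLengthDemo
{
    // Counts runs of consecutive equal keys in one pass, without a dictionary.
    public static List<int> CountRuns(IList<string> keys)
    {
        var counts = new List<int>();
        for (int i = 0; i < keys.Count; i++)
        {
            if (i > 0 && keys[i] == keys[i - 1])
                counts[counts.Count - 1]++;   // same key as the previous row: extend the run
            else
                counts.Add(1);                // first row or key changed: start a new run
        }
        return counts;
    }

    public static void Main()
    {
        // Tab and Value from the sample data, joined into one key per row.
        var keys = new List<string>
        {
            "A1|c1", "A1|c1", "A1|c1", "A1|c1", "A1|c1",
            "A3|23", "A3|23", "A3|23",
            "A6|45", "A6|45",
            "A1|c1", "A1|c1", "A1|c1"
        };
        Console.WriteLine(string.Join(",", CountRuns(keys))); // prints 5,3,2,3
    }
}
```

Only the previous key has to be remembered, so the memory used does not depend on how many distinct keys the data contains.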

and eventually build an extension method of that if he needs it multiple
times however the requirement seems so unique, a loop would do.

That's true, I probably wouldn't use the solution I posted because it's a
little too specific. However I would think that a GroupConsecutive extension
method would be something that could be used from time to time. This could
return an IGrouping object that did the same as GroupBy except only grouped
consecutive items.
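
A rough sketch of what such a method might look like. GroupConsecutive is not part of LINQ; the method, the ConsecutiveGroup helper class and the demo class are all hypothetical names for this example.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: a minimal IGrouping implementation backed by a list.
public class ConsecutiveGroup<TKey, TElement> : IGrouping<TKey, TElement>
{
    private readonly List<TElement> items = new List<TElement>();
    public ConsecutiveGroup(TKey key) { Key = key; }
    public TKey Key { get; private set; }
    public void Add(TElement item) { items.Add(item); }
    public IEnumerator<TElement> GetEnumerator() { return items.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class GroupConsecutiveExtensions
{
    // Like GroupBy, except that a new group starts whenever the key changes.
    public static IEnumerable<IGrouping<TKey, T>> GroupConsecutive<T, TKey>(
        this IEnumerable<T> source, Func<T, TKey> keySelector)
    {
        EqualityComparer<TKey> comparer = EqualityComparer<TKey>.Default;
        ConsecutiveGroup<TKey, T> current = null;
        foreach (T item in source)
        {
            TKey key = keySelector(item);
            // Key differs from the open group: hand that group to the caller.
            if (current != null && !comparer.Equals(key, current.Key))
            {
                yield return current;
                current = null;
            }
            if (current == null) current = new ConsecutiveGroup<TKey, T>(key);
            current.Add(item);
        }
        // Emit the last open group, unless the source was empty.
        if (current != null) yield return current;
    }
}

public static class GroupConsecutiveDemo
{
    public static void Main()
    {
        var tabs = new[] { "A1", "A1", "A3", "A1" };
        var counts = tabs.GroupConsecutive(t => t).Select(g => g.Count());
        Console.WriteLine(string.Join(",", counts)); // prints 2,1,1
    }
}
```

Since the result implements IGrouping, it should compose with the usual operators (Select, Count, Where) in the same way as the output of GroupBy.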

Michael
