Does LINQ solve my problem?

  • Thread starter: Tony Johansson

Tony Johansson

Hello!

I have a collection of objects with the following fields that matter for
this question:
string TAB;
string value;

Here is an example of how the data could look.
TAB Value
A1 c1
A1 c1
A1 c1
A1 c1
A1 c1
A3 23
A3 23
A3 23
A6 45
A6 45
A1 7
A1 7
A1 7

What I want is to get the following answer from a LINQ query, but I don't
think it's possible to write such a query:
5,3,2,3
So I want to group on TAB and Value in a way, but I'm not allowed to move any
rows. In this case I have 5 rows
in sequence with TAB = A1 and value = c1.
Then I have 3 rows with TAB = A3 and value = 23.
Then I have 2 rows with TAB = A6 and value = 45.
Then I have 3 rows with TAB = A1 and value = 7.

Does anybody have a good algorithm to solve this if it's not possible to use
LINQ?

//Tony
 
Tony Johansson said:
Does anybody have a good algorithm to solve this if it's not possible to
use LINQ?

Well, the algorithm would be pretty simple: just loop through all the
items in the collection, incrementing a counter for each item. Before
incrementing, compare the two properties with those of the preceding item;
as soon as they differ, output the counter and reset it to zero. After
exiting the loop, output the final value of the counter.
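
The loop described above might look roughly like this in C#. It is only a sketch: the `Row` class, the `RunCounter` helper and the `SampleRows` data builder are hypothetical names made up for illustration, not code from the original poster.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical row type standing in for the poster's objects.
public class Row
{
    public string TAB;
    public string Value;
    public Row(string tab, string value) { TAB = tab; Value = value; }
}

public static class RunCounter
{
    // Returns the length of each run of consecutive rows with equal TAB and Value.
    public static List<int> CountRuns(IEnumerable<Row> rows)
    {
        var counts = new List<int>();
        Row prev = null;
        int counter = 0;
        foreach (Row row in rows)
        {
            // A change in either property ends the current run.
            if (prev != null && (row.TAB != prev.TAB || row.Value != prev.Value))
            {
                counts.Add(counter);
                counter = 0;
            }
            counter++;
            prev = row;
        }
        // Output the last run after the loop (nothing to output for empty input).
        if (counter > 0) counts.Add(counter);
        return counts;
    }

    // Same shape as the example data: 5 x (A1,c1), 3 x (A3,23), 2 x (A6,45), 3 x (A1,7).
    public static List<Row> SampleRows()
    {
        var rows = new List<Row>();
        for (int i = 0; i < 5; i++) rows.Add(new Row("A1", "c1"));
        for (int i = 0; i < 3; i++) rows.Add(new Row("A3", "23"));
        for (int i = 0; i < 2; i++) rows.Add(new Row("A6", "45"));
        for (int i = 0; i < 3; i++) rows.Add(new Row("A1", "7"));
        return rows;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", CountRuns(SampleRows()))); // 5,3,2,3
    }
}
```

Because the rows are only ever compared with their immediate predecessor, nothing is reordered, which is the "don't move any rows" requirement.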
 
Hi,

I don't understand what you mean by not being allowed to move rows, but you
can try this:

var result = yourCollection.GroupBy(i => i.TAB).Select(i => i.Count());
//(this will yield 5, 3, 2, 3 for the example You posted)

Hope You find this useful.
-Zsolt
 
Tony Johansson said:
Does anybody have a good algorithm to solve this if it's not possible to
use LINQ?

I don't have VS 2008 here so this is likely to need some fixing but
something like this should work

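using System;
using System.Collections.Generic;
using System.Linq;

public class CountConsec<T>
{
    public T Key;
    public int Count;
}

public static class ConsecExtensions
{
    // Counts runs of consecutive items whose selected keys are equal.
    public static IEnumerable<CountConsec<TResult>> GetConsecCounts<T, TResult>(
        this IEnumerable<T> source, Func<T, TResult> selector)
    {
        TResult prev = default(TResult);
        int count = 0;
        foreach (T item in source)
        {
            TResult val = selector(item);
            // A change of key closes the current run (this comparison is null-safe).
            if (count > 0 && !EqualityComparer<TResult>.Default.Equals(val, prev))
            {
                yield return new CountConsec<TResult>() { Key = prev, Count = count };
                count = 0;
            }
            count++;
            prev = val;
        }
        // Emit the final run, if any.
        if (count > 0)
            yield return new CountConsec<TResult>() { Key = prev, Count = count };
    }
}

usage would then be:

// To compare both fields, use i => new { i.TAB, i.Value } as the selector.
MyCollection.GetConsecCounts(i => i.TAB).Select(i => i.Count);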

You could do this in other ways too, e.g. by returning an IGrouping interface
that exposes the collection of grouped items. That would be much more
flexible, but what's here should work OK.

Michael
 
miher said:
Hi,

I don't understand what You mean by not being allowed to move rows, but
You can try this :

var result = yourCollection.GroupBy(i => i.TAB).Select(i => i.Count());
//(this will yield 5, 3, 2, 3 for the example You posted)

No, it will yield 8, 3, 2. If you look at his data the A1 is repeated but he
wants to count them as 2 groups.

Michael
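
To see why a plain GroupBy on TAB gives 8, 3, 2, here is a small check using only the TAB column of the example data. The `GroupByDemo` class and its members are made-up names for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupByDemo
{
    // TAB column of the example data, in its original order.
    public static readonly string[] Tabs =
    {
        "A1", "A1", "A1", "A1", "A1", "A3", "A3", "A3", "A6", "A6", "A1", "A1", "A1"
    };

    // GroupBy collects every item with the same key, wherever it sits in the sequence.
    public static List<int> CountsPerTab()
    {
        return Tabs.GroupBy(t => t).Select(g => g.Count()).ToList();
    }

    public static void Main()
    {
        // The two separate A1 runs (5 and 3 rows) end up in one group of 8.
        Console.WriteLine(string.Join(", ", CountsPerTab())); // 8, 3, 2
    }
}
```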
 
One solution to your problem using LINQ can be found here:
http://pastey.net/107674-3xo2
 
Hi,

Thanks, it seems I hadn't examined the example closely enough. As you said, in
that case this query won't work.
-Zsolt
 
hi,
This simple modification works fine for me:

var result = yourCollection
    .GroupBy(i => i.TAB + i.Value)
    .Select(i => i.Count());

If GetHashCode() is implemented:

var result = yourCollection
    .GroupBy(i => i.GetHashCode())
    .Select(i => i.Count());

e.g.:

public override int GetHashCode()
{
    return Convert.ToInt32(
        this.Tab.Replace("A", "") +
        this.Value.Replace("c", "")
    );
}


mfG
--> stefan <--
 
Stefan Hoffmann said:
var result = yourCollection
.GroupBy(i => i.TAB + i.Value)
.Select(i => i.Count());

That's fairly dangerous code. TAB "A1" with Value "C7" will group together
with TAB "A1C" and Value "7".

Michael
 
hi Michael,

Michael said:
That's fairly dangerous code. A1 C7 will group together with A1C 7
Sure, therefore I mentioned the GetHashCode() example (obviously with a
bad implementation, too)... but it solves the concrete problem.


mfG
--> stefan <--
 
Tony said:
Does anybody have a good algorithm to solve this if it's not possible to
use LINQ?

var q = from v in sequence
group v by new { v.TAB, v.Value} into g
select g.Count();

proof:

using System;
using System.Collections.Generic;
using System.Linq;

class Program
{
    static void Main(string[] args)
    {
        List<Bucket> sequence = new List<Bucket>()
        {
            new Bucket("A1", "c1"),
            new Bucket("A1", "c1"),
            new Bucket("A1", "c1"),
            new Bucket("A1", "c1"),
            new Bucket("A1", "c1"),
            new Bucket("A3", "23"),
            new Bucket("A3", "23"),
            new Bucket("A3", "23"),
            new Bucket("A6", "45"),
            new Bucket("A6", "45"),
            new Bucket("A1", "7"),
            new Bucket("A1", "7"),
            new Bucket("A1", "7")
        };

        // Group on both fields at once by using an anonymous type as the key.
        var q = from v in sequence
                group v by new { v.TAB, v.Value } into g
                select g.Count();

        foreach (var v in q)
        {
            Console.WriteLine(v);
        }
    }
}


public class Bucket
{
    public Bucket(string tab, string value)
    {
        this.TAB = tab;
        this.Value = value;
    }

    public string TAB { get; set; }
    public string Value { get; set; }
}

gives 5, 3, 2, 3

FB


--
------------------------------------------------------------------------
Lead developer of LLBLGen Pro, the productive O/R mapper for .NET
LLBLGen Pro website: http://www.llblgen.com
My .NET blog: http://weblogs.asp.net/fbouma
Microsoft MVP (C#)
------------------------------------------------------------------------
 
Michael said:
You could do this other ways also, eg by returning an IGrouping interface
which returned a collection of the grouped items. This would be much more
flexible but what's here should work ok.

You can just group on an anonymous type, which is 3 lines of code ;).
Grouping on an anonymous type to group on more than 1 element is often
overlooked as a valid grouping mechanism.

FB

 
Frans Bouma said:
You can just group on an anonymous type, which is 3 lines of code ;).
Grouping on an anonymous type to group on more than 1 element is often
overlooked as a valid grouping mechanism.

I was assuming that the OP wanted to do things in sequence, so if the same 2
values occurred further down the list then they would be considered a
different group. This is the statement that made me think this:

"So I want to group on TAB and Value in a way but don't allowed to move any
rows."

I am presuming he means you don't "move" the rows around to group them
together. It's a bit vague and I should probably have found out first before
going to all that trouble. He appears to have lost interest anyway.

Michael
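
The difference Michael describes only shows up when the same pair of values recurs later in the list, which does not happen in the original example. The sketch below uses made-up data where it does; `SequenceVsGroupDemo` and its methods are hypothetical names for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SequenceVsGroupDemo
{
    // Made-up data: the pair (A1, c1) occurs in two separate runs.
    public static List<KeyValuePair<string, string>> Rows()
    {
        return new List<KeyValuePair<string, string>>
        {
            new KeyValuePair<string, string>("A1", "c1"),
            new KeyValuePair<string, string>("A1", "c1"),
            new KeyValuePair<string, string>("A3", "23"),
            new KeyValuePair<string, string>("A1", "c1"),
            new KeyValuePair<string, string>("A1", "c1")
        };
    }

    // Grouping on an anonymous type merges equal pairs regardless of position.
    public static List<int> GroupedCounts(List<KeyValuePair<string, string>> rows)
    {
        return rows.GroupBy(r => new { TAB = r.Key, Value = r.Value })
                   .Select(g => g.Count())
                   .ToList();
    }

    // Counting in sequence starts a new count whenever a row differs from the previous one.
    public static List<int> ConsecutiveCounts(List<KeyValuePair<string, string>> rows)
    {
        var counts = new List<int>();
        for (int i = 0; i < rows.Count; i++)
        {
            bool sameAsPrevious = i > 0
                && rows[i].Key == rows[i - 1].Key
                && rows[i].Value == rows[i - 1].Value;
            if (sameAsPrevious) counts[counts.Count - 1]++;
            else counts.Add(1);
        }
        return counts;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", GroupedCounts(Rows())));     // 4, 1
        Console.WriteLine(string.Join(", ", ConsecutiveCounts(Rows()))); // 2, 1, 2
    }
}
```

On the data from the original question both approaches happen to print 5, 3, 2, 3, because (A1, c1) and (A1, 7) are different pairs.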
 
Stefan Hoffmann said:
hi Michael,


Sure, therefore I mentioned the GetHashCode() example (obviously with a
bad implementation, too)... but it solves the concrete problem.


mfG
--> stefan <--


Could have also used:

var result = yourCollection
.GroupBy(i => i.TAB + " " + i.Value)
.Select(i => i.Count());

That gives the distinct keys "A1 C7" and "A1C 7" (notice the spaces)...

Mythran
 
Mythran said:
Could have also used:

var result = yourCollection
.GroupBy(i => i.TAB + " " + i.Value)
.Select(i => i.Count());

A1 C7 and A1C 7 (notice the spaces)...

No, that can fail if the source data contains a space. You need to use a
delimiter that is not used in the source data but it reeks of dodgy to me.
Any method that fails when you change the source data is flawed imo even if
you are very sure that will never occur.

Michael
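
The collisions discussed in the last few posts can be checked directly. This is a sketch with made-up values; `KeyCollisionDemo`, `PlainKey` and `SpacedKey` are hypothetical helpers that mirror the two key expressions from the posts above.

```csharp
using System;

public static class KeyCollisionDemo
{
    // Key built by plain concatenation, as in GroupBy(i => i.TAB + i.Value).
    public static string PlainKey(string tab, string value) { return tab + value; }

    // Key built with a space delimiter, as in GroupBy(i => i.TAB + " " + i.Value).
    public static string SpacedKey(string tab, string value) { return tab + " " + value; }

    public static void Main()
    {
        // Different rows, same key: "A1" + "C7" and "A1C" + "7" are both "A1C7".
        Console.WriteLine(PlainKey("A1", "C7") == PlainKey("A1C", "7"));     // True
        // The delimiter separates that pair...
        Console.WriteLine(SpacedKey("A1", "C7") == SpacedKey("A1C", "7"));   // False
        // ...but fails as soon as the data itself contains a space.
        Console.WriteLine(SpacedKey("A1 C", "7") == SpacedKey("A1", "C 7")); // True
        // An anonymous type compares property by property, so these stay distinct.
        var a = new { TAB = "A1 C", Value = "7" };
        var b = new { TAB = "A1", Value = "C 7" };
        Console.WriteLine(a.Equals(b));                                      // False
    }
}
```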
 
Stefan Hoffmann said:
Sure, therefore I mentioned the GetHashCode() example (obviously with a
bad implementation, too)... but it solves the concrete problem.

Ok, fair enough, I did miss that. I think you'll always end up with a
similar problem of duplicate hash codes, though. I believe the correct method
is to override both the GetHashCode method and the Equals method. You should
not group on the hash code; just group on the object itself. GroupBy will
call GetHashCode first, and if the hash code returned is the same for two
objects it will then call the Equals method to make sure. That way it
doesn't matter if GetHashCode doesn't always return unique values (although
obviously you would try to make sure this is rare). If you don't have the
source code, you should create a custom comparer. This comes from the
Joe Rattz book Pro LINQ.

Michael
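
A custom comparer along the lines Michael mentions could look like the sketch below. The `Item` class, `ItemComparer` and `ComparerDemo` are assumptions for illustration; only `IEqualityComparer<T>` and the `GroupBy` overload that takes a comparer are standard .NET. Note that this still groups equal items from anywhere in the list, so it does not count rows in sequence.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical item type; assume its source code cannot be changed.
public class Item
{
    public string TAB { get; set; }
    public string Value { get; set; }
}

// Equal hash codes only narrow the search; Equals makes the final decision.
public class ItemComparer : IEqualityComparer<Item>
{
    public bool Equals(Item x, Item y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.TAB == y.TAB && x.Value == y.Value;
    }

    public int GetHashCode(Item obj)
    {
        // Collisions are acceptable here because Equals resolves them.
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + (obj.TAB == null ? 0 : obj.TAB.GetHashCode());
            hash = hash * 31 + (obj.Value == null ? 0 : obj.Value.GetHashCode());
            return hash;
        }
    }
}

public static class ComparerDemo
{
    // Group on the object itself and let the comparer define equality.
    public static List<int> CountGroups(IEnumerable<Item> items)
    {
        return items.GroupBy(i => i, new ItemComparer())
                    .Select(g => g.Count())
                    .ToList();
    }

    public static void Main()
    {
        var items = new List<Item>
        {
            new Item { TAB = "A1", Value = "c1" },
            new Item { TAB = "A1", Value = "c1" },
            new Item { TAB = "A1", Value = "7" },
            new Item { TAB = "A1", Value = "c1" }
        };
        Console.WriteLine(string.Join(", ", CountGroups(items))); // 3, 1
    }
}
```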
 
Hello!

No, I haven't lost interest in this issue.

You are absolutely right, I'm not allowed to move rows.
I only want to count the number of rows in sequence where the TAB and the
value are the same.
Is this possible?

//Tony
 
Tony said:
Hello!

No, I haven't lost interest in this issue.

You are absolutely right, I'm not allowed to move rows.
I only want to count the number of rows in sequence where the TAB and the
value are the same.
Is this possible?

That's what grouping does. See my code in another post.

FB


 
hi Michael,

Michael said:
Ok, fair enough I did miss that. Although I think you'll always end up with
a similar problem of duplicate hash codes.
That's true.

Have you read Frans Bouma's post? The grouping on an anonymous type is a
very smart solution.

I like it:

.GroupBy(i => new { TAB = i.TAB, Value = i.Value })


mfG
--> stefan <--
 
hi Frans,
var q = from v in sequence
group v by new { v.TAB, v.Value } into g
select g.Count();
Pretty cool. So this will give him the correct output:

var q = from v in samples
        group v by new { v.TAB, v.Value } into g
        select new { Key = g.Key, Count = g.Count() };

// The loop variable must match the name used in the format arguments.
foreach (var r in q)
    Console.WriteLine(String.Format(
        "I have {0} rows with TAB = {1} and value = {2}.",
        r.Count, r.Key.TAB, r.Key.Value
    ));


mfG
--> stefan <--
 