Strange results attempting IDENTITY in LINQ

I tried to add a one-up sequence to a LINQ select, such as:

var i = 0;
var tokenized =
from line in lines
select new {
index = (i = i + 1)
,tokens = line.Split(new char[] {' '})
};

However, I get strange, though somewhat explainable, results. It seems that
every time I access 'index' above, the value increments. It behaves as if
the expression itself, rather than its value, had been assigned to the
member. I have tried many different ways to reduce the result to a value,
without much success:

index = Convert.ToInt32((new int[] {i=i+1})[0].ToString())

In C# I defined a class for holding the values:

class Tokenized
{
:
public int Index { get { return _index; } }
:
}
and assigned _index using a static value at construction, and still saw the
same issue.

The objective was to assign a value to each line so that the lines could be
referred to directly in later queries. But since the values keep changing,
this does not work.

This is a cool and interesting feature and I can think of many uses for
this, but it is irritating in that I can't seem to find a way to force it to
take the value rather than the expression.

Can someone tell me how to set up the equivalent of IDENTITY(1,1) in LINQ?
 
Code to reproduce in LinqPAD:

//var lines = File.ReadAllLines(@"C:\Documents and Settings\kenglish\My Documents\Genealogy\Hans Ancestry.GED");
var lines = new string[] {
"0 HEAD"
,"1 CHAR ANSI"
,"1 SOUR Ancestry.com Family Trees"
,"2 VERS Ancestry.com Family Trees (1)"
,"2 NAME Ancestry.com Family Trees"
,"2 CORP The Generations Network"
,"1 GEDC"
,"2 VERS 5.5"
,"2 FORM LINEAGE-LINKED"
,"0 @P2570715294@ INDI "
,"1 SEX F"
,"1 NAME Catharina"
,"1 FAMS @F198@"
,"0 @P2577568933@ INDI "
,"1 BIRT "
,"2 DATE 1515"
,"2 PLAC Cambridge,,Kent,England"
,"1 DEAT "
,"2 DATE 1590"
,"2 PLAC Cambridge,,Kent,England"
,"1 SEX F"
,"1 NAME John /Fosten/"
,"1 FAMS @F252@"
};

var i = Convert.ToInt32(0);

var tokenized =
from line in lines
select new {
index = (new int[] {i, i=i+1, i+2}).Min()
,tokens = line.Split(new char[] {' '})
,depth = Convert.ToInt32(line.Split(new char[]{' '})[0])
};

tokenized.Dump();

var tokenized2 =
from tl in tokenized
select new {
index = tl.index
,depth = tl.depth
,tokens = tl.tokens
,prevDepth = (tokenized.Where(L2 => L2.index == (tl.index-1)))
};

tokenized2.Dump();

Note that the first select has index values of: 0,1,2,...22
and the second select has index values of: 23, 47, 71, ...551
as if each iteration of the where clause incremented the index value.
 
Probably a user error.
Here is my test program:
================
class Program
{
static void Main(string[] args)
{
List<string> lines = new List<string>()
{
"e gweuir gwoergew roiwegrwel",
"ewqhk few9p oi cowicn cnu",
"le chien chat bouh",
"a fish in the sky",
"selamat paggi"
};
var i = 0;
var tokenized = from line in lines
select new
{
index = (i = i + 1),
tokens = line.Split(new char[] { ' ' })
};

foreach (var v in tokenized)
Console.WriteLine("{0}, {1}, {2}, {3}", v, v.index, v.index,
v.tokens[0]);

Console.ReadLine();
}
}
================
and its output
================
{ index = 1, tokens = System.String[] }, 1, 1, e
{ index = 2, tokens = System.String[] }, 2, 2, ewqhk
{ index = 3, tokens = System.String[] }, 3, 3, le
{ index = 4, tokens = System.String[] }, 4, 4, a
{ index = 5, tokens = System.String[] }, 5, 5, selamat
================

--
Regards,
Lloyd Dupont
NovaMind Software
Mind Mapping at its best
www.nova-mind.com
 
It's not a user error; it is very reproducible. Your example does not use a
where clause to look up a record from a second query. If you look at the
code I posted, you will see that this is where things get strange. The
results of my first select look completely normal. It is when I try to
reference the value in a useful way that things go nuts.

 
There is nothing wrong here, user error again.
You declare 'i' at the top level.

A simplified version of your error would be:

== pseudo code with the same "error" ===

int i = 0;
foreach(var v in something)
i ++;
foreach(var v in something)
i ++;

====
at the end, as expected, i = something.Count * 2
 
Oops, sorry, I only just understood your problem.

Good point indeed.

But, after a little reflection, it seems to be in line with what I read
somewhere (and can't find again): LINQ queries are evaluated on demand (as
opposed to having their results stored in a hidden variable).

Hence it is evaluated a second time in the second statement, so that it can
be enumerated.
 
Correction: the last sentence of my previous post ("so that it can be
enumerated") should read:
Hence it is evaluated a second time in the second statement AS IT IS
enumerated.
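
To make the on-demand evaluation concrete, here is a minimal sketch of the behaviour described above. The three sample lines and the names afterDefinition, firstPass and secondPass are made up for illustration, and it is written as top-level statements (C# 9 or later; LINQPad's statements mode should behave the same way).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lines = new[] { "0 HEAD", "1 CHAR ANSI", "1 GEDC" };   // illustrative sample data
var i = 0;

// Defining the query runs nothing yet; the lambda captures the variable i itself.
var query = lines.Select(line => new { index = (i = i + 1), line });

int afterDefinition = i;                                     // the counter has not moved
List<int> firstPass = query.Select(x => x.index).ToList();   // first enumeration runs the selector
List<int> secondPass = query.Select(x => x.index).ToList();  // second enumeration runs it again

Console.WriteLine(afterDefinition);               // 0
Console.WriteLine(string.Join(",", firstPass));   // 1,2,3
Console.WriteLine(string.Join(",", secondPass));  // 4,5,6
```

The values inside each object never change once it has been created. What changes is that every new enumeration of the query creates fresh objects and bumps the shared counter again.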
 
Yep, I think that is what I am seeing.

So I guess my question should be: is there any way to force the evaluation
or scalarization of the expression so that the first result just contains
the values?

I've tried all kinds of things. I thought for sure that :

index = (new int[] {i, i=i+1, i+2}).Min()

would work ... but it did not.
 
Kevin_E said:
So I guess my question should be: is there any way to force the evaluation
or scalarization of the expression so that the first result just contains
the values?

Well, with an anonymous type I don't know.
But if you return a known type, how about something like

List<Result> results = new List<Result>(from .....);

as in
==================
class Program
{
class Result
{
public int Index;
public string Text;
public override string ToString()
{
return string.Concat("Result: ", Index, " ", Text);
}
}

static void Main(string[] args)
{
var lines = new string[] {
"0 HEAD"
,"1 CHAR ANSI"
,"1 SOUR Ancestry.com Family Trees"
,"2 VERS Ancestry.com Family Trees (1)"
,"2 NAME Ancestry.com Family Trees"
,"2 CORP The Generations Network"
,"1 GEDC"
,"2 VERS 5.5"
,"2 FORM LINEAGE-LINKED"
,"0 @P2570715294@ INDI "
,"1 SEX F"
,"1 NAME Catharina"
,"1 FAMS @F198@"
,"0 @P2577568933@ INDI "
,"1 BIRT "
,"2 DATE 1515"
,"2 PLAC Cambridge,,Kent,England"
,"1 DEAT "
,"2 DATE 1590"
,"2 PLAC Cambridge,,Kent,England"
,"1 SEX F"
,"1 NAME John /Fosten/"
,"1 FAMS @F252@"
};


int i = 0;
List<Result> results = new List<Result>(from line in lines
select new Result()
{
Index = i++,
Text = line.Split(new char[] { ' ' })[1]
});

foreach (var v in results)
Console.WriteLine(v);
foreach (var v in results)
Console.WriteLine(v);

Console.ReadLine();
}
}
==================
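
For the IDENTITY(1,1) part of the question, another option that seems to fit is the Select overload that passes each element's zero-based position to the selector, which avoids the shared counter altogether. A minimal sketch, assuming a short made-up sample of lines; the names numbered, position, firstPass and secondPass are illustrative only:

```csharp
using System;
using System.Linq;

var lines = new[] { "0 HEAD", "1 CHAR ANSI", "1 GEDC" };   // illustrative sample data

// Select((element, position) => ...) supplies a zero-based position,
// so position + 1 gives a sequence that starts at 1 and steps by 1.
var numbered = lines.Select((line, position) => new
{
    index = position + 1,
    tokens = line.Split(' ')
});

// The query is still re-run on each enumeration, but the position restarts
// from zero every time, so both passes produce the same numbers.
int[] firstPass = numbered.Select(x => x.index).ToArray();
int[] secondPass = numbered.Select(x => x.index).ToArray();

Console.WriteLine(string.Join(",", firstPass));   // 1,2,3
Console.WriteLine(string.Join(",", secondPass));  // 1,2,3
```

This overload is only available in method syntax; the from ... select query syntax has no way to ask for the position.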
 
Kevin_E said:
Code to reproduce in LinqPAD:

It's very straightforward. Each time you iterate through "tokenized" it
will increment i for each line.

Here's where you do it first:

tokenized.Dump();

And here's where you do it again:

var tokenized2 =
from tl in tokenized
...

You could avoid this by saving the results in a list:

var list = tokenized.ToList();

Then do:

list.Dump();

var tokenized2 =
from tl in list
...
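
Filled out, that suggestion might look something like the sketch below. It uses a shortened, made-up sample of lines, and the prevDepth lookup is only one guess at what the second query was meant to compute. ToList() works on anonymous types as well, so no named class is needed.

```csharp
using System;
using System.Linq;

var lines = new[] { "0 HEAD", "1 CHAR ANSI", "2 VERS 5.5", "1 GEDC" };   // illustrative sample data
var i = 0;

// ToList() runs the query once and stores the results, so each index is fixed.
var list = (from line in lines
            select new
            {
                index = i++,
                depth = Convert.ToInt32(line.Split(' ')[0]),
                tokens = line.Split(' ')
            }).ToList();

// Later queries read the stored list; the counter is never touched again.
var withPrev = (from tl in list
                select new
                {
                    tl.index,
                    tl.depth,
                    // depth of the preceding line, or null for the first line
                    prevDepth = list.Where(p => p.index == tl.index - 1)
                                    .Select(p => (int?)p.depth)
                                    .FirstOrDefault()
                }).ToList();

foreach (var row in withPrev)
    Console.WriteLine($"{row.index} {row.depth} {row.prevDepth}");
```

With the list in place, the index values stay 0 to 3 no matter how many later queries look them up.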
 