BUG: DataRowView.Delete() deletes the wrong row in the table

Nigel Norris

There appears to be a rather fundamental bug in DataViews. The DataRowView
keeps track of the corresponding DataRow by two mechanisms - an index, and a
reference. Sometimes it accesses the row by one means, sometimes the other.

If you delete a row from a view, the index for other DataRowViews gets out
of sync. Subsequent operations on these rows can then fail in various ways.
Here's a small example to demonstrate. The code deletes two rows from a
table. The first delete works OK, but the second one deletes the wrong row,
even though it reads the correct data from the row (it prints the correct
name).

Am I missing anything?

----------

using System;
using System.Data;

class Test
{
    static void Main()
    {
        // Create a table with three rows.
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));

        table.Rows.Add(new object[]{"Fred"});
        table.Rows.Add(new object[]{"Joe"});
        table.Rows.Add(new object[]{"Mary"});

        // Get DataRowView references for the first two rows.
        DataRowView rowViewFred = table.DefaultView[0];
        DataRowView rowViewJoe = table.DefaultView[1];

        // Now delete the two rows (and print
        // names to verify we're deleting the right ones)
        Console.WriteLine("Deleting:" + rowViewFred["Name"]);
        rowViewFred.Delete();
        Console.WriteLine("Deleting:" + rowViewJoe["Name"]);
        rowViewJoe.Delete();

        // Print the remaining table - it should contain only
        // the third record ("Mary"). However it will contain "Joe"
        Console.WriteLine("Dump table:");
        foreach (DataRow row in table.Rows)
        {
            Console.WriteLine(row["Name"]);
        }
        Console.Read();
    }
}
 
I'm not surprised, there are other bugs in the dataview. Like, if you add
rows to the middle of the underlying table, the dataview for that table will
still show that new row at the end.

Try deleting the row through the DataTable reference.
 
Marina,

Thanks - yes, if I do rowView.Row.Delete() it does indeed work correctly.

In fact I have to confess that because of this and related bugs I've already
adopted a policy of always operating on the DataRow directly, instead of
using the DataRowView. My purpose in posting was partly to warn people, and
partly in the hope that someone from Microsoft would pick up on it.

Unfortunately data binding insists on working via a DataView and
DataRowView, so this and related problems can still bite you when you have
controls bound to the DataRowView. I've not so far figured out a completely
satisfactory workaround for that case.
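
For anyone following along, the workaround discussed above (deleting through the DataRow instead of the DataRowView) might look roughly like this. It is only a sketch: the class and method names (RowDeleteWorkaround, BuildTable, DeleteViaRows) are made up for illustration, and the only real API members relied on are DataRowView.Row and DataRow.Delete(). It does nothing for the data-binding case mentioned above.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class RowDeleteWorkaround
{
    // Hypothetical helper: the same three-row table as the example at the top of the thread.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(new object[] { "Fred" });
        table.Rows.Add(new object[] { "Joe" });
        table.Rows.Add(new object[] { "Mary" });
        return table;
    }

    // Hypothetical helper: deletes "Fred" and "Joe" via their DataRow objects
    // and returns the names still readable from the table afterwards.
    public static List<string> DeleteViaRows()
    {
        DataTable table = BuildTable();

        // Capture the DataRow behind each DataRowView before anything is deleted,
        // so no DataRowView is used after the table has changed.
        DataRow fred = table.DefaultView[0].Row;
        DataRow joe = table.DefaultView[1].Row;

        // The rows were never accepted (RowState is Added), so Delete()
        // removes them from the table outright.
        fred.Delete();
        joe.Delete();

        List<string> remaining = new List<string>();
        foreach (DataRow row in table.Rows)
        {
            // Skip rows that are only marked as deleted; their values are not readable.
            if (row.RowState == DataRowState.Deleted) continue;
            remaining.Add((string)row["Name"]);
        }
        return remaining;
    }

    public static void Main()
    {
        foreach (string name in DeleteViaRows())
        {
            Console.WriteLine(name);
        }
    }
}
```

Holding on to the DataRow rather than the DataRowView is the design choice here: the DataRow reference does not depend on the row's position in any view.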


 
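Marina said:
I'm not surprised, there are other bugs in the dataview. Like, if you add
rows to the middle of the underlying table, the dataview for that table will
still show that new row at the end.
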
In v1.x, if a DataView has no explicit sort criteria and a row is added
directly to the underlying DataTable (instead of via the DataView), you get
the behavior described above: a row inserted into the middle of the
underlying table shows up at the end of the view.

We understand that this is unintuitive, so in v2.0 the order of rows in a
DataView without any sort criteria is exactly the same as the order of rows
in the DataTable.Rows collection, at all times. In the above case, the row
added to the middle of the underlying table would end up in the middle of
the DataView as well.

Some more conceptual differences between the v1.x and v2.0 DataView:
a DataView maintains a cache of DataRowView objects, each of which wraps a
DataRow object. The management of this cache has changed in v2.0, resulting
in some interesting differences.

DataRowView:
In v1.x, any change that causes an underlying DataRow to change position,
such as an add to or delete from the Rows collection, causes the cache of
DataRowView objects to be discarded and a new cache of new DataRowView
objects to be created. As you would guess, this has a cost, and managing
large DataViews with rapidly changing DataRows is expensive.

In v2.0, the cache management algorithm has changed and the existing
DataRowView objects are reused, resulting in huge performance gains.
DataViews are much more lightweight and scale to large numbers of rows.

In v1.x, when the DataView is modified because of changes in the underlying
DataTable, the existing DataRowView objects are invalidated, so users should
not keep a reference to a DataRowView object and go on using it, as it is
stale; it always references the same physical row, even if that row has
changed position or been deleted from the DataTable. Get a fresh reference
using DataView[n].
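
One way to read the advice above is to never hold a DataRowView across a change to the table, and to look the row up in the view again immediately before using it. The following is a rough sketch under that reading, not an official recommendation: FreshReferenceSketch, BuildTable and DeleteByName are hypothetical names, and matching on the "Name" column is just the convention of the example in this thread.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class FreshReferenceSketch
{
    // Hypothetical helper: the same three-row table as the original example.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(new object[] { "Fred" });
        table.Rows.Add(new object[] { "Joe" });
        table.Rows.Add(new object[] { "Mary" });
        return table;
    }

    // Hypothetical helper: fetches a fresh DataRowView via view[i] and deletes
    // the first one whose "Name" matches. Returns false if no row matched.
    public static bool DeleteByName(DataView view, string name)
    {
        for (int i = 0; i < view.Count; i++)
        {
            DataRowView fresh = view[i];   // fresh reference, used immediately
            if (name.Equals(fresh["Name"]))
            {
                fresh.Delete();
                return true;               // stop: the view has just changed
            }
        }
        return false;
    }

    // Collects the names currently visible through the view.
    public static List<string> NamesIn(DataView view)
    {
        List<string> names = new List<string>();
        for (int i = 0; i < view.Count; i++)
        {
            names.Add((string)view[i]["Name"]);
        }
        return names;
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        DeleteByName(table.DefaultView, "Fred");
        DeleteByName(table.DefaultView, "Joe");
        foreach (string name in NamesIn(table.DefaultView))
        {
            Console.WriteLine(name);
        }
    }
}
```

The helper returns as soon as it has deleted a row, so no DataRowView obtained before the change is ever touched afterwards.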

Hope it helps.

Thanks,
Kawarjit Bedi [MSFT]
This posting is provided "AS IS" with no warranties, and confers no rights.
 
Kawarjit,

Thanks very much for the response, and the insight into ADO V2.0.

However, I'm really not convinced by your explanation of the problem I'm
seeing:
In v1.x, when the DataView is modified because of changes in the underlying
DataTable, the existing DataRowView objects are invalidated, so users should
not keep a reference to a DataRowView object and go on using it, as it is
stale; it always references the same physical row, even if that row has
changed position or been deleted from the DataTable. Get a fresh reference
using DataView[n].

In my example:



 
Kawarjit,

First off, thanks very much for the response and the insight into V2.0.

However, I have to question your explanation of my problem:
In v1.x, when the DataView is modified because of changes in the underlying
DataTable, the existing DataRowView objects are invalidated, so users should
not keep a reference to a DataRowView object and go on using it, as it is
stale; it always references the same physical row, even if that row has
changed position or been deleted from the DataTable. Get a fresh reference
using DataView[n].

What you are saying, I think, is that in my example code:

rowViewFred.Delete();
Console.WriteLine ("Deleting:" + rowViewJoe["Name"]);
rowViewJoe.Delete();

After the first line, 'rowViewJoe' becomes invalid because I've changed the
underlying table, and I shouldn't use it. There's nothing in the
documentation that implies that. And the object quite happily allows me to
go on using it - surely, if it were invalid, attempting to use it should
throw an exception?

Also, you say that the DataRowView always references the same physical row.
That's exactly my point - it doesn't. In the second line, the code returns
what you'd expect - the field value from row 'Joe'. But the third line
deletes a *different* physical row - 'Mary'. It does know the correct row,
but for the delete case uses a different mechanism to access it, and gets it
wrong.
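
The mismatch described above can be checked mechanically: remember which DataRow the DataRowView reports via its Row property, call Delete() on the view, and then see whether that same DataRow is the one that was actually deleted. This is only a sketch; StaleViewCheck, BuildTable and DeletesRowItReads are hypothetical names, and what the check reports will depend on the framework version it runs on.

```csharp
using System;
using System.Data;

public static class StaleViewCheck
{
    // Hypothetical helper: the same three-row table as the original example.
    public static DataTable BuildTable()
    {
        DataTable table = new DataTable();
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(new object[] { "Fred" });
        table.Rows.Add(new object[] { "Joe" });
        table.Rows.Add(new object[] { "Mary" });
        return table;
    }

    // Hypothetical helper: returns true if DataRowView.Delete() removed the
    // same DataRow that the view exposes through its Row property.
    public static bool DeletesRowItReads(DataRowView rowView)
    {
        DataRow expected = rowView.Row;
        rowView.Delete();

        // A deleted row ends up Deleted (if it had been accepted earlier)
        // or Detached (if it was newly added, as in this example).
        return expected.RowState == DataRowState.Deleted
            || expected.RowState == DataRowState.Detached;
    }

    public static void Main()
    {
        DataTable table = BuildTable();
        DataRowView rowViewFred = table.DefaultView[0];
        DataRowView rowViewJoe = table.DefaultView[1];

        Console.WriteLine("Fred deleted correctly: " + DeletesRowItReads(rowViewFred));
        Console.WriteLine("Joe deleted correctly: " + DeletesRowItReads(rowViewJoe));
        Console.WriteLine("Rows left: " + table.Rows.Count);
    }
}
```

If the second check comes back false while one row has still disappeared from the table, the view has deleted a different physical row from the one it reads values from, which is the behaviour reported in this thread.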

I'm sorry, I can't see any way of explaining this as intended behaviour. I
really do think you have a serious bug here.

Regards,

Nigel





 
Nigel,

There is an anomaly in how a stale DataRowView object behaves in v1.x; I'll
follow up on that.

Please note that this issue has been addressed in v2.0.

Thanks,

Kawarjit Bedi [MSFT]

This posting is provided "AS IS" with no warranties, and confers no rights.
