DataSet and serialization is dangerous

Hi ABad,
Sorry for the late reply. I live in Brazil and we just had Carnival here (4
days). The one in Rio is very famous. Do you like Carnival?
Let's get to work.
Imagine a server giving the dataset to a client application. The client
binds this dataset to a datagrid. The client user deletes the father row and,
afterwards, decides to create the same row again. What must the client return
to the server? Isn't it dataset.GetChanges?
When I pass the result of dataset.GetChanges to a local method, it works
fine, but when I send it to a remote server through remoting, it doesn't work
(the child row vanishes). I think serialization doesn't work correctly in
this case.
I changed your example program again (the code is below), creating a new
column in the father table. You will see that the father row was inserted, but
there is no child.
Code:
Imports System
Imports System.Data

Module DiffGramRoundTrip

    ' Builds the father/child schema, including the primary key and the relation.
    Sub CreateSchema(ByVal InDS As DataSet)
        Dim FatherTable As New DataTable("FatherTable")
        Dim ChildTable As New DataTable("ChildTable")
        Dim fatherCol1 As New DataColumn("FatherCol1")
        ' New column added to the father table
        Dim fatherCol2 As New DataColumn("FatherCol2")
        Dim childCol1 As New DataColumn("ChildCol1")
        Dim childCol2 As New DataColumn("ChildCol2")
        FatherTable.Columns.Add(fatherCol1)
        FatherTable.Columns.Add(fatherCol2)
        ChildTable.Columns.Add(childCol1)
        ChildTable.Columns.Add(childCol2)
        FatherTable.PrimaryKey = New DataColumn() {fatherCol1}
        InDS.Tables.Add(FatherTable)
        InDS.Tables.Add(ChildTable)
        InDS.Relations.Add("FatherChildRelation", fatherCol1, childCol1)
    End Sub

    ' Creates the server copy with a single, accepted father row.
    Function CreateOriginalDataset() As DataSet
        Dim dsOriginal As New DataSet
        CreateSchema(dsOriginal)
        dsOriginal.Tables("FatherTable").Rows.Add(New Object() {"Father1", "Col1"})
        dsOriginal.AcceptChanges()
        Return dsOriginal
    End Function

    ' Client workflow: delete the father, re-add it with the same key, add a child.
    Sub WorkWithDataset(ByVal WorkingDataSet As DataSet)
        WorkingDataSet.Tables("FatherTable").Rows(0).Delete()
        WorkingDataSet.Tables("FatherTable").Rows.Add(New Object() {"Father1", "ColChanged"})
        WorkingDataSet.Tables("ChildTable").Rows.Add(New Object() {"Father1", "Child1"})
    End Sub

    Sub Main()
        'dsOriginal is the server copy
        Dim dsOriginal As DataSet = CreateOriginalDataset()

        'dsWorking is the client copy
        'starts off with original copy
        Dim dsWorking As DataSet = dsOriginal.Copy()
        'client copy is modified
        WorkWithDataset(dsWorking)

        'get changes to send to the server <===
        Dim dsChanges As DataSet = dsWorking.GetChanges()

        'persist out the changes (remoting process) <===
        dsChanges.WriteXml("myFileName.xml", XmlWriteMode.DiffGram)

        'read back in the changes (remoting process) <===
        Dim dsRead As New DataSet
        CreateSchema(dsRead)
        dsRead.ReadXml("myFileName.xml", XmlReadMode.DiffGram)

        'merge client side changes to the server copy
        dsOriginal.Merge(dsRead.GetChanges())
        dsOriginal.AcceptChanges()

        Console.WriteLine("")
        Console.WriteLine("**** ORIGINAL *****************************************************")
        Console.WriteLine("")
        dsOriginal.WriteXml(Console.Out)

        Console.WriteLine("")
        Console.WriteLine("**** WORKING *****************************************************")
        Console.WriteLine("")
        dsWorking.WriteXml(Console.Out)
        Console.ReadLine()
    End Sub

End Module
 
Hi ABad,
Did you see my response?

ABad said:
In other words, change your client-side logic from:
- Delete Father1 from Father Table
- Add Father1 to Father Table
- Add Child1 to Child Table with relationship to Father1 in Father Table

To:
- Delete Father1 from Father Table
- Undo Father1 delete or rollback
- Add Child1 to Child Table with relationship to Father1 in Father Table
 
Hey Mauricio,

I just got back from a week in Mexico, had some fun.

Ok. I'm going to state some of my observations on this problem, and I
probably can't offer anything else from here on in, but this is my take.
Take it with a grain of salt and if you don't like it then submit another
message to the newsgroup and see if anyone else may have some input.

1. The dataset deserialization is based on XML.
2. The XML that is deserialized could be created by hand, a Java client, a
Perl client, etc. The thing to note here is that it doesn't have to come from
a dataset! Think interoperability.
3. The diffgram is a stateful data representation, meaning it knows its
previous state and its current state.
4. Because of 1, 2 and 3, the deserialization process is complex; look at the
XmlReadMode enumeration. It has to deal with missing schema, conflicting
schemas, constraints, mismatched data, etc. Also read about the dataset
merge, because ReadXml deserialization is closely related to it.
5. An in-memory merge between two datasets is much different from an XML
serialization/ReadXml merge. An in-memory merge can assume and track much
more. XML can only provide so much when serialized and deserialized, and it
was meant for interoperability. .NET 2.0 will have binary serialization and
this may help you in situations like this.
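
To make point 3 concrete, here is a minimal sketch that prints the version
information each row carries before it is written out as a diffgram.
DumpRowVersions and the RowVersionDump module are hypothetical names, not part
of the example program; the sketch only relies on DataRow.RowState,
DataRow.HasVersion and the DataRowVersion indexer.

```vbnet
Imports System
Imports System.Data

Module RowVersionDump

    ' Hypothetical helper: lists, for every row, its RowState and whichever
    ' of the Original/Current versions it actually has.
    Sub DumpRowVersions(ByVal ds As DataSet)
        For Each table As DataTable In ds.Tables
            For Each row As DataRow In table.Rows
                Console.WriteLine("{0} [{1}]", table.TableName, row.RowState)
                For Each col As DataColumn In table.Columns
                    ' Deleted rows only have an Original version.
                    If row.HasVersion(DataRowVersion.Original) Then
                        Console.WriteLine("  {0} original = {1}", _
                            col.ColumnName, row(col, DataRowVersion.Original))
                    End If
                    ' Added rows only have a Current version.
                    If row.HasVersion(DataRowVersion.Current) Then
                        Console.WriteLine("  {0} current  = {1}", _
                            col.ColumnName, row(col, DataRowVersion.Current))
                    End If
                Next
            Next
        Next
    End Sub

End Module
```

Calling DumpRowVersions(dsChanges) right after GetChanges in the example
program should show which rows carry an original version, a current version,
or both.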

Now back to your problem.

1. Your server is behind an explicit boundary, where clients send in a
stateful, serialized representation via XML. Your server can only depend on
the diffgram your clients send in to fully describe what the client had
originally and what the final outcome of the client's workflow was.
2. Your server has no idea about the client's workflow and the number of
deletes that took place. The client could have deleted/added Father1 100
times, but when the diffgram is submitted the server only knows that Father1
existed in the original dataset and that Father1 is being inserted in the
proposed changes.
3. When the client receives a dataset with a row that exists and has a
primary key, the only changes to that row that make sense to the server are:
no changes, modified, or deleted.
4. With all this in mind, go back and take a look at the XML diffgram that
your client is submitting to the server (I've included it below). The
dataset used for deserialization on the server only knows this information
coming in, nothing else. Your diffgram states that Father1 existed
originally, but it's proposing to insert another Father1.
5. The only thing the dataset can do in this case, when deserializing, is
either not accept the changes or *infer* that a delete has taken place and
that another row is being inserted with the identical primary key. The
dataset does not go ahead and guess that the delete has happened, so it
rejects your changes. (I know that it accepts the change to the 2nd column in
the father table, but what actually happens is that the merge logic applies
the change to the existing Father1 row and throws away the rest of the
changes. Attach events to the dataset that reads in the XML file and you can
see for yourself.)
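
For the last remark in point 5, attaching the events could look roughly like
the sketch below. AttachTrace and OnRowEvent are hypothetical names; the
sketch assumes the standard DataTable.RowChanged and DataTable.RowDeleted
events, which share the DataRowChangeEventHandler signature.

```vbnet
Imports System
Imports System.Data

Module MergeTrace

    ' Hypothetical helper: subscribes to row-level events on every table so
    ' each change made while the diffgram is read gets logged.
    Sub AttachTrace(ByVal ds As DataSet)
        For Each table As DataTable In ds.Tables
            AddHandler table.RowChanged, AddressOf OnRowEvent
            AddHandler table.RowDeleted, AddressOf OnRowEvent
        Next
    End Sub

    ' Logs the table, the action reported by the event, and the row state.
    Private Sub OnRowEvent(ByVal sender As Object, ByVal e As DataRowChangeEventArgs)
        Console.WriteLine("{0}: {1} (RowState = {2})", _
            e.Row.Table.TableName, e.Action, e.Row.RowState)
    End Sub

End Module
```

In the example program this would be called as AttachTrace(dsRead) after
CreateSchema(dsRead) and before dsRead.ReadXml(...), so the tables already
exist when the handlers are attached.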

So what can your client do?

1. It can send over the delete first, separately, and then do the re-insert
and child addition.
2. It could do what I did in my first merge example with the addition of the
primary key: keep the original dataset in memory on the client and, when the
client submits, merge the changed dataset into the original dataset in
memory and submit the changes from the original dataset.
3. It could first check, on additions, whether a row with the same primary
key already exists but is in the deleted state. If so, bring back the
original version and then continue to make changes to it.
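
Option 3 could be sketched along these lines. AddOrRestoreFather is a
hypothetical helper, and the column names are taken from the example program;
it assumes that calling RejectChanges on a deleted row restores its original
values, so the later edit shows up as a modification instead of an insert.

```vbnet
Imports System
Imports System.Data

Module RestoreInsteadOfInsert

    ' Hypothetical helper: if a deleted row with the same key exists, undo the
    ' delete and edit that row; otherwise add a new row as usual.
    Function AddOrRestoreFather(ByVal table As DataTable, _
                                ByVal key As String, _
                                ByVal col2 As String) As DataRow
        ' Select returns an array, so changing row states inside the loop is safe.
        Dim deletedRows As DataRow() = table.Select("", "", DataViewRowState.Deleted)
        For Each row As DataRow In deletedRows
            ' Values of a deleted row are only readable through the Original version.
            If CStr(row("FatherCol1", DataRowVersion.Original)) = key Then
                row.RejectChanges()      ' row goes back to Unchanged
                row("FatherCol2") = col2 ' row becomes Modified
                Return row
            End If
        Next
        Return table.Rows.Add(New Object() {key, col2})
    End Function

End Module
```

In WorkWithDataset, the second statement would then become
AddOrRestoreFather(WorkingDataSet.Tables("FatherTable"), "Father1", "ColChanged")
in place of the direct Rows.Add call.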

There might be a more elegant solution but I can't come up with one at the
moment. These solutions, like any other workable solution, avoid sending an
XML diffgram in an invalid state like the one shown in the diffgram below.

Good Luck!


<?xml version="1.0" standalone="yes"?>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata"
xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1">
<NewDataSet>
<FatherTable diffgr:id="FatherTable2" msdata:rowOrder="1"
diffgr:hasChanges="inserted">
<FatherCol1>Father1</FatherCol1>
<FatherCol2>ColChanged</FatherCol2>
</FatherTable>
<ChildTable diffgr:id="ChildTable1" msdata:rowOrder="0"
diffgr:hasChanges="inserted">
<ChildCol1>Father1</ChildCol1>
<ChildCol2>Child1</ChildCol2>
</ChildTable>
</NewDataSet>
<diffgr:before>
<FatherTable diffgr:id="FatherTable1" msdata:rowOrder="0">
<FatherCol1>Father1</FatherCol1>
<FatherCol2>Col1</FatherCol2>
</FatherTable>
</diffgr:before>
</diffgr:diffgram>
 
Hi ABad,
Thank you very much for your help. I was a little confused, but now I
understand. I decided to work with two datasets on the client side (original
and working dataset), but do you think the serialization has a problem?
I think the serialization fails to handle this case, doesn't it?
 
I myself don't think XML serialization really has a problem. There really
are only two choices when a diffgram is in this state: delete and reinsert,
or don't accept the proposed changes. The developers took the safe route and
don't accept the changes.

A couple of problems I did have are:
- there was no clear way to know exactly what was going on without
attaching handlers to all the events.
- there is no way to override the behavior.
- there is no documentation telling us that this is the behavior. Without
that we are still only assuming that it might be the correct behavior.
- there is no binary serialization of datasets, which in the remoting
scenario would be extremely helpful and efficient, but that is coming in 2.0
(I think). This may fix your specific problem if it's a true binary
serialization of the complete internals of the dataset. I would test this
scenario with 2.0 when it comes out, and I might test it with the 2.0 beta.
If I do so later today, I will post my results.

ABad
 