caching dataset -- how?

  • Thread starter: Chuck

Chuck

I am working on an application for use on laptop computers (field use).
When the laptop is docked, the primary data source will be a remote dataset
hosted on the server which is connected to the database. When the laptop is
disconnected the application will use an XML cache of a subset of the data
to fill and save a local copy of the dataset. The application may be
started and stopped several times and the laptop will be turned off and on
as well. This kills any way of using memory caching.

Now, the problem is synching the local dataset with the remote one. I don't
want to directly access the database as it is in a state of flux and I want
the remote service to act as a facade.

I will have added and modified records in the local dataset. Deletes are
generally not allowed. The remote dataset may have records that are
different or new to the local copy.

The only thing I have come up with is to merge the local records that are
added or modified into the remote 'master' dataset then update the master
into the database. Afterwards, I can then merge the master back into the
local copy.

This feels clumsy to me. Can anyone come up with a better way to do this
sync?

If the application were permanently connected to the server, I would not have a
local cache copy, so the problem would not arise.
Chuck
 
Hi Chuck,

Well, datasets are disconnected by nature.
Why are you using two of them in the first place?
Why don't you send a "real" dataset to the client? He modifies it (caching it
on disk, etc.) and then, when he reconnects, sends only the changes
(GetChanges()) back to the server, where they are Updated to the database and
the updated dataset is sent back to the client.
 
Given the fact that the remote application can start and stop multiple
times, I don't really see any easy way to GetChanges() on the remote
dataset. The only way would be to cache the original 'local' dataset
on the laptop so that changes could be tracked, but that's more or less
the same solution as the current one.

Essentially, if I'm reading Chuck correctly, the problem is that
RowState information disappears in 'truly disconnected' datasets, so
you have to do some kind of dataset merge when you reconnect in order
to get the RowState info back. (I'm not sure if DiffGrams would help
here).

On the other hand, I'm also not sure what's particularly 'clumsy' about
re-merging the dataset, so maybe I'm missing something. Handling
deletes would be pretty ugly, but since that's not necessary I think the
re-merge is fairly straightforward and even elegant.

PS. Ironically, I need to implement a feature very similar to this in a
current project, so I'm very interested in any ideas you or others might
have on it.
 
Writing the DataSet to an XML file using DiffGram does preserve row change
information. When read back in, the DataSet is restored. So, I am able to
maintain a local DataSet between re-boots. Any changes made to the local
DataSet are available when the laptop is docked to the LAN.
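
A minimal sketch of the idea, in Python rather than ADO.NET so that it stays self-contained. In ADO.NET itself this is DataSet.WriteXml with XmlWriteMode.DiffGram and DataSet.ReadXml with XmlReadMode.DiffGram. The JSON file, the "state" field and the helper names below are assumptions for illustration only; the point is that the change marker is stored with each row, so pending edits survive a restart.

```python
import json
import os
import tempfile

def save_rows(path, rows):
    # Each row carries its own state, so pending edits are written to disk too.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(rows, f)

def load_rows(path):
    # Reading the file back restores both the data and the change markers.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def pending_changes(rows):
    # Rows that still have to be sent to the service after docking.
    return [r for r in rows if r["state"] != "unchanged"]

rows = [
    {"id": 1, "name": "pump", "state": "unchanged"},
    {"id": 2, "name": "valve", "state": "modified"},
    {"id": 3, "name": "gauge", "state": "added"},
]
path = os.path.join(tempfile.mkdtemp(), "cache.json")
save_rows(path, rows)       # application shuts down, laptop reboots
restored = load_rows(path)  # application starts again
```

After the reload, pending_changes(restored) still yields rows 2 and 3, which is what makes the later upload possible.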

If it was possible to connect directly to the database, then the
GetChanges()/ DataAdapter.Update()/etc. combination would work great.
However, the requirement is that the application only talk to a remote
service. The service maintains its own copy of the DataSet and acts as a
hub for all of the clients. One of the services it provides is telling the
clients when the database has changed. The service will also check for
and attempt to resolve conflicts and validity problems before attempting to
save the data into the database. As the application design develops, I am
sure I will think of other things it will do.

.NET remoting does lots of great things, but I do not want to
marshal-by-value the entire DataSet just to do a merge.
It would be nice if I could download only those records in the remote
DataSet that have been added/changed while the application was disconnected,
without doing a full Fill(). As they will be marked as unchanged, I don't
see how to identify them.

So, once again, does anyone have a better way of synching the two DataSets?
Chuck
 
Hi Chuck,

One thing is not clear to me.
Does the service have access to the database at all times?
Why does it have to have a DataSet? Because it does not have database access
all the time?

--
Miha Markic - RightHand .NET consulting & development
miha at rthand com
www.rhand.com

 
Yes the service does have continuous access to the database. The current
idea is that they will be on the same machine.

I was planning on using ADO.NET in the service to get data from and save
data to the database. My thought was that I would need a singleton DataSet
to manage things for all of the clients.

If I get your thought, I would create the data structures I need when I
need them and then dispose of them when the synch is complete. This seems
like it could cause a lot of activity on the database as several clients
make changes at the same time. The only other way I can see to keep from
having a permanent DataSet is to dynamically instantiate it or some other
structure as a transport between the database and the client when needed.

Remember that direct connection between the client and the database via
ADO.NET DataAdapters is not allowed. If it was allowed, I wouldn't need a
remote service.

Hopefully you still have another idea that eliminates the master DataSet.
If so, please let me know.

Chuck
 
Hi Chuck,

OK, I will try to explain myself better.
Here is the scenario:
The client requests certain data from the webservice.
The webservice queries the database and produces a result in the form of a dataset.
The dataset containing the data is sent to the client.
Now the client makes some modifications to the dataset he received.
As the dataset is disconnected by nature, he can persist it in several
ways (WriteXml, serialization).
When the client decides to store the changes to the database, he produces a
dataset containing only the changes (the GetChanges() method, to save network
transfer) and sends it to the webservice.
The webservice invokes (several) Update methods on the dataset it received,
and the data is safely in the database.
Optionally, the updated dataset is sent back to the client so he can
consolidate the changes.
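
The round trip above can be sketched as follows. This is Python standing in for the ADO.NET calls (GetChanges on the client, DataAdapter Update on the server), and the dict-based "database", the "state" field and the function names are all assumptions made for the illustration, not part of any real API.

```python
def get_changes(rows):
    # Client side: keep only rows that were added or modified.
    return [r for r in rows if r["state"] in ("added", "modified")]

def apply_to_database(database, changes):
    # Server side: write each change and return the rows as stored.
    stored = []
    for change in changes:
        row = {"id": change["id"], "name": change["name"]}
        database[row["id"]] = row
        stored.append(dict(row, state="unchanged"))
    return stored

def consolidate(rows, stored):
    # Client side: replace pending rows with what the server stored.
    by_id = {r["id"]: r for r in rows}
    for r in stored:
        by_id[r["id"]] = r
    return sorted(by_id.values(), key=lambda r: r["id"])

database = {1: {"id": 1, "name": "pump"}, 3: {"id": 3, "name": "gauge"}}
client_rows = [
    {"id": 1, "name": "pump mk2", "state": "modified"},
    {"id": 2, "name": "valve", "state": "added"},
    {"id": 3, "name": "gauge", "state": "unchanged"},
]
changes = get_changes(client_rows)              # only rows 1 and 2 travel
stored = apply_to_database(database, changes)   # server posts them
client_rows = consolidate(client_rows, stored)  # client is clean again
```

Only the two changed rows cross the wire in each direction; row 3 never leaves the laptop.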

There is no need for singleton dataset whatsoever in most cases.

As per:
"This seems
like it could cause a lot of activity on the database as several clients
make changes at the same time."

I don't think there is a way to store data in a database without database activity.
Sooner or later you'll have to store the data back in the database.
Plus, if you don't store data in the db ASAP, some other app (perhaps a SQL query)
won't see the changes.
Plus, you might lose modifications.
Plus, if you do the caching manually, you'll have to take care of transactions
and all the other things that database servers do.
Note also that database servers do a lot of caching, plus the system does
some caching, plus ... you get the point.
I don't think the database will be a bottleneck, especially if you
are dealing with webservices.
I could go on, really :)
 
That works. If it will support this scenario then I will do it your way.

Three clients (A, B, C) are all in sync with the database. A and B undock their
laptops and go to work. A, B and C all add and change records in their
datasets. As C is still connected, his updates to the database happen right
away. Now the database is no longer the same as at the last sync. A and B
return to the office and dock. Now the service must tell the newly attached
applications that the database has changed. A and B submit their changes,
which must be posted, causing two change events. A, B, and C must then
re-sync to the database: C to the change events from A and B; B to the
updated database and the change event from A; and A to the updated database and B's
change event. Hopefully, this can be done with only minimal transport of
data. Don't forget validation and conflict resolution.

The big problem is to prevent transporting the entire database every time
someone re-syncs. The events can be handled from the datasets sent to the
remote service. After successfully posting the updates, just accept the
changes and send them to anyone who wants them. The client can then merge
them into the local dataset.
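
A rough sketch of that client-side merge, again in Python rather than ADO.NET (where DataSet.Merge with its preserveChanges argument plays this role). The conflict policy shown here, keeping the local edit and flagging the row for review, is only an assumption; the thread leaves the actual resolution rules open. The "state" field and function name are illustrative.

```python
def merge_remote(local_rows, remote_rows):
    # Merge rows broadcast by the service into the local copy, by primary key.
    local = {r["id"]: r for r in local_rows}
    conflicts = []
    for remote in remote_rows:
        mine = local.get(remote["id"])
        if mine is not None and mine["state"] != "unchanged":
            # Local edit not yet posted: keep it and flag the row for review.
            conflicts.append(remote["id"])
            continue
        # New row, or a row with no local edits: take the server version.
        local[remote["id"]] = dict(remote, state="unchanged")
    merged = sorted(local.values(), key=lambda r: r["id"])
    return merged, conflicts

local_rows = [
    {"id": 1, "name": "pump", "state": "unchanged"},
    {"id": 2, "name": "valve (field edit)", "state": "modified"},
]
remote_rows = [
    {"id": 1, "name": "pump mk2"},
    {"id": 2, "name": "valve mk2"},
    {"id": 4, "name": "sensor"},
]
merged, conflicts = merge_remote(local_rows, remote_rows)
```

Here row 1 is refreshed, row 4 is added, and row 2 keeps the field edit while being reported as a conflict.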

How do you handle changes that were posted to the database? I only want to
send those records that are out of sync. I also don't want a lot of
bookkeeping to keep track of prior changes.

Chuck
 
Hi Chuck,

Are you going to cache the whole database on the clients?
In that case you might consider database replication - it will do it for
you.
If you do it manually, you'll run into an awful lot of problems.

If you still want to do it manually, you might add a LastChangedDate column
to each table, where the time of the last change is stored.
That way, when retrieving the latest changes, the client sends the
server only the date of its last sync, and the webservice collects
all rows with a later LastChangedDate and sends them back to the client.
How are you going to notify the clients of database changes?
Are you going to use a push model (is that possible with webservices at all?) or a pull model?

Anyway, you should take a look at replication...
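
The LastChangedDate idea above could look roughly like this. It is a Python sketch with an in-memory list standing in for a table; in practice the filter would be a WHERE clause on the server. The field and function names are assumptions. Taking the next sync marker from the server's own timestamps, rather than from the laptop clock, is also an assumption added here to sidestep clock differences between machines.

```python
from datetime import datetime

def rows_changed_since(table, last_sync):
    # Server side: only rows changed after the client's last sync.
    return [row for row in table if row["last_changed"] > last_sync]

def next_sync_marker(delta, last_sync):
    # Client side: remember the newest server timestamp that was received.
    return max((row["last_changed"] for row in delta), default=last_sync)

server_table = [
    {"id": 1, "name": "pump", "last_changed": datetime(2004, 1, 5, 9, 0)},
    {"id": 2, "name": "valve", "last_changed": datetime(2004, 1, 12, 14, 30)},
    {"id": 3, "name": "gauge", "last_changed": datetime(2004, 1, 15, 8, 15)},
]
last_sync = datetime(2004, 1, 10, 0, 0)

delta = rows_changed_since(server_table, last_sync)  # rows 2 and 3 only
last_sync = next_sync_marker(delta, last_sync)       # 2004-01-15 08:15
```

The client sends one timestamp and gets back only the rows it is missing, so no per-client bookkeeping is needed on the server.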
 
I don't want to replicate the entire database onto the laptops. Only those
rows that are needed by the user should be copied. That's another problem.
It is probable that the entire database could become too large for the
limited amount of hard drive space available on the laptops.
Additional tables and fields might work. A controller table could contain
one datetime field for each working table with the datetime last changed, and
each row in the tables already has a last-changed field.
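
A small sketch of how such a controller table might be used, in Python with a plain dict standing in for the table. The table names and the function name are made up for the example. The client compares each working table's last-changed time against its own last sync and only asks for rows from the tables that moved.

```python
from datetime import datetime

def tables_to_sync(controller, last_sync):
    # Names of the working tables that changed after the client's last sync.
    return sorted(name for name, changed in controller.items()
                  if changed > last_sync)

# Hypothetical controller contents: one last-changed time per working table.
controller = {
    "inspections": datetime(2004, 1, 15, 8, 15),
    "equipment": datetime(2004, 1, 3, 17, 0),
    "sites": datetime(2004, 1, 11, 10, 45),
}
last_sync = datetime(2004, 1, 10, 0, 0)

stale = tables_to_sync(controller, last_sync)
```

With these values only "inspections" and "sites" need a row-level query; "equipment" can be skipped entirely.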

I will look into replication, but I don't even know if it can be controlled
from within my application. Everything has to work off of one sync menuitem
control.

Chuck
 