A DataSet Theory In Need of Debunking

  • Thread starter: James Alexander

James Alexander

We are looking at using a DataSet object as an in-memory store/queue of
"jobs" (represented as rows). Each job has several features associated with
it (represented as columns) that dictate how the job should proceed at an
undetermined later date. Via Remoting, another piece of logic does some work
for each job and notifies the job of its success. When the job is finished
and the remoting piece has finished notifying it of success or failure for
each "feature", the job will signal a worker thread to do some work with it,
and it will then be removed from the DataSet.

This DataSet could contain thousands of items at a time and will be read
from and written to constantly. In theory, each time one of the columns is
updated for a particular row (a job), I'd like it to check whether the
other features have completed and, if so, notify the worker thread that some
work needs to be done. Most of this will be done via events and whatnot.
I'm just curious whether anyone can think of any limits that could prevent me
from accomplishing this with a DataSet that I may not have already thought of.
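
One way the update-then-check idea above could look, as a minimal sketch rather than a tested design. It assumes each feature is a nullable boolean column (DBNull meaning "no result reported yet") and relies on the DataTable.ColumnChanged event. The names JobTableSketch, FeatureA, FeatureB and the onReady callback are made up for illustration.

```csharp
using System;
using System.Data;

// Hypothetical schema: one row per job, one nullable boolean column per "feature".
public static class JobTableSketch
{
    public static readonly string[] FeatureColumns = { "FeatureA", "FeatureB" };

    public static DataTable CreateJobsTable()
    {
        var table = new DataTable("Jobs");
        table.Columns.Add("JobId", typeof(int));
        foreach (string name in FeatureColumns)
            table.Columns.Add(name, typeof(bool)); // DBNull = not reported yet
        return table;
    }

    // True once every feature column holds a result (success or failure).
    public static bool AllFeaturesReported(DataRow row)
    {
        foreach (string name in FeatureColumns)
            if (row.IsNull(name)) return false;
        return true;
    }

    // Calls onReady whenever a column change leaves the row with all features reported.
    public static void Watch(DataTable table, Action<DataRow> onReady)
    {
        table.ColumnChanged += (sender, e) =>
        {
            if (AllFeaturesReported(e.Row)) onReady(e.Row);
        };
    }

    public static void Main()
    {
        DataTable jobs = CreateJobsTable();
        Watch(jobs, row => Console.WriteLine("Job {0} is ready for the worker", row["JobId"]));

        DataRow job = jobs.Rows.Add(1, DBNull.Value, DBNull.Value);
        job["FeatureA"] = true;   // still waiting on FeatureB
        job["FeatureB"] = false;  // every feature has reported, so the callback runs
    }
}
```

The handler runs on whichever thread performs the write, and it fires again if a finished row is touched a second time, so a real version would need its own guard and synchronization.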

I appreciate any comments. Thanks!

- James
 
Concurrency can become a problem if this is a web-based solution and more
than one user is changing data in the DataSet.

Let's say Linda views the DataSet and alters the record in Row(12).

Rob, meanwhile, views the DataSet and also alters the record in Row(12).

Whichever of them alters the row LAST determines the record that will be
written to disk (XML).

Further, say that Linda and Rob are altering different columns in the
DataSet, and Rob's info is dependent on Linda's input. This can become a
very confusing mess if the issue is not resolved.

You could possibly put boolean columns in the DataSet that become true only
while someone is editing the DataSet. When the boolean column is True, edit
links are disabled; when it is False, edit links are enabled.
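
The boolean-flag idea could be sketched roughly as follows. This assumes the flag is kept per row in a column called IsBeingEdited (a made-up name) and that all users share one process, so a plain lock can make the check-and-set atomic. RowEditFlagSketch, TryBeginEdit and EndEdit are hypothetical helpers, not part of ADO.NET.

```csharp
using System;
using System.Data;

public static class RowEditFlagSketch
{
    private static readonly object Gate = new object();
    public const string FlagColumn = "IsBeingEdited"; // hypothetical column name

    public static DataTable CreateTable()
    {
        var table = new DataTable("Records");
        table.Columns.Add("Value", typeof(string));
        DataColumn flag = table.Columns.Add(FlagColumn, typeof(bool));
        flag.DefaultValue = false; // new rows start out unclaimed
        return table;
    }

    // Claims the row for editing; returns false if someone else already holds it.
    public static bool TryBeginEdit(DataRow row)
    {
        lock (Gate) // check and set must happen as one step
        {
            if ((bool)row[FlagColumn]) return false;
            row[FlagColumn] = true;
            return true;
        }
    }

    // Releases the claim so the edit links can be enabled again.
    public static void EndEdit(DataRow row)
    {
        lock (Gate) { row[FlagColumn] = false; }
    }

    public static void Main()
    {
        DataTable table = CreateTable();
        DataRow row = table.NewRow();
        row["Value"] = "original";
        table.Rows.Add(row);

        Console.WriteLine(TryBeginEdit(row)); // Linda claims the row: True
        Console.WriteLine(TryBeginEdit(row)); // Rob is refused: False
        EndEdit(row);                         // Linda finishes
        Console.WriteLine(TryBeginEdit(row)); // Rob can now edit: True
    }
}
```

A flag like this only prevents overlapping edits. It does nothing about a user who abandons an edit without releasing the row, so some timeout would probably be needed as well.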

Severin
 
Michael,

I appreciate the feedback. I didn't go into too much detail, but the
DataSet is in fact bound to a db: it's possible to add jobs directly to
the db, and jobs added to the DataSet need to be persisted to the db. So
there is a need for it there.

- James


Michael Lang said:
This sounds like there is no associated database that is populating the
DataSet or being updated from the DataSet. Is that the case? If so, then
the following applies...

I don't see any advantage to using a DataSet to store in-memory application
data. The DataSet class is bloated with database-related features. If you
merely have a group of job-related data, just create a custom collection
class based on CollectionBase. You can apply logic to any of the collection
manipulation methods, which the DataSet does not let you do in the same way:
DataSet itself can be subclassed (typed DataSets do exactly that), but
DataRowCollection is sealed, so you are limited to the change events that
DataTable exposes.

Exactly what do you feel you are gaining by using a DataSet?

    // This C# example could easily be converted to VB.NET
    using System;
    using System.Collections;

    public class Job
    {
        public event EventHandler Complete;
        // other events ...

        // make each "feature", as you call it, a "property"
        public string Property1 { get; set; }
        public int Property2 { get; set; }

        // uses the enum below to store the current state
        public JobStatus Status { get; set; }
    }

    public enum JobStatus { Pending, Phase1, Phase2, Complete }

    public class Jobs : CollectionBase
    {
        // Go through List rather than InnerList so that the CollectionBase
        // hooks (OnInsert, OnValidate, ...) run and custom logic can be applied.
        public void Add(Job job)
        { this.List.Add(job); }

        public Job this[int index]
        {
            get { return (Job)this.List[index]; }
        }
        //... other collection methods ...
        //... queue utilities ...
        //... etc ...
    }

This Jobs collection can be bound to any data-aware controls if you need
that. You may want to consider the other collection types that may apply to
how you need to use it. There is a queue-type collection already in the
framework (the Queue class). Take a look at "System.Collections".
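
As a rough illustration of the queue suggestion, here is a sketch built on System.Collections.Queue. The JobQueue wrapper and the cut-down Job class are assumptions made for the example. The lock on SyncRoot is there because checking Count and then calling Dequeue is not atomic on its own.

```csharp
using System;
using System.Collections;

// Cut-down stand-in for the Job class; only here to make the sketch self-contained.
public class Job
{
    public string Property1;
    public Job(string property1) { Property1 = property1; }
}

// Hypothetical wrapper that gives typed, thread-safe access to a Queue.
public class JobQueue
{
    private readonly Queue _inner = new Queue();

    public void Enqueue(Job job)
    {
        lock (_inner.SyncRoot) { _inner.Enqueue(job); }
    }

    // Returns the oldest job, or null when the queue is empty.
    public Job TryDequeue()
    {
        lock (_inner.SyncRoot)
        {
            return _inner.Count > 0 ? (Job)_inner.Dequeue() : null;
        }
    }
}

public static class JobQueueDemo
{
    public static void Main()
    {
        var queue = new JobQueue();
        queue.Enqueue(new Job("first"));
        queue.Enqueue(new Job("second"));

        // Jobs come back in the order they were added.
        Job next;
        while ((next = queue.TryDequeue()) != null)
            Console.WriteLine(next.Property1);
    }
}
```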

--
Michael Lang, MCSD
See my .NET open source projects
http://sourceforge.net/projects/dbobjecter (code generator)
http://sourceforge.net/projects/genadonet ("generic" ADO.NET)

 
If jobs can be added to the database directly, why bother with a DataSet?
You would need to reload it before every use to keep it current. Also, some
sort of locking would be required.

I'd just use SQL procs and a SQL table for the queue. SQL Server will
handle all the interlocks, and you only need to return one row per query.
That is much more efficient than reloading the whole DataSet before every
DataSet request.

If you wanted to use a DataSet as an in-memory queue that was just backed up
to a database, you might have a case. You would need to implement locking
for updating the DataSet, as this is not built in.
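
The locking mentioned above might look something like this sketch. It assumes every access to the table goes through one wrapper, so a single lock object can serialize reads and writes. LockedJobTable and its JobId and Done columns are hypothetical names chosen for the example.

```csharp
using System;
using System.Data;
using System.Threading;

// Hypothetical wrapper: all access to the DataTable goes through one lock.
public class LockedJobTable
{
    private readonly object _gate = new object();
    private readonly DataTable _table = new DataTable("Jobs");

    public LockedJobTable()
    {
        _table.Columns.Add("JobId", typeof(int));
        _table.Columns.Add("Done", typeof(bool));
    }

    public void Add(int jobId)
    {
        lock (_gate) { _table.Rows.Add(jobId, false); }
    }

    public void MarkDone(int jobId)
    {
        lock (_gate)
        {
            foreach (DataRow row in _table.Rows)
                if ((int)row["JobId"] == jobId) row["Done"] = true;
        }
    }

    public int CountDone()
    {
        lock (_gate)
        {
            int done = 0;
            foreach (DataRow row in _table.Rows)
                if ((bool)row["Done"]) done++;
            return done;
        }
    }
}

public static class LockedJobTableDemo
{
    public static void Main()
    {
        var jobs = new LockedJobTable();
        var threads = new Thread[4];
        for (int t = 0; t < threads.Length; t++)
        {
            int offset = t * 100; // each thread works on its own range of ids
            threads[t] = new Thread(() =>
            {
                for (int i = 0; i < 100; i++)
                {
                    jobs.Add(offset + i);
                    jobs.MarkDone(offset + i);
                }
            });
            threads[t].Start();
        }
        foreach (Thread thread in threads) thread.Join();
        Console.WriteLine(jobs.CountDone()); // 400
    }
}
```

A single coarse lock keeps the sketch simple. With thousands of rows and constant updates it could become a bottleneck, which is part of the argument for letting the database handle the queue.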


-- bruce (sqlwork.com)





