Thread safety of DataTable class - Filling on background thread OK?

  • Thread starter: Alan Cobb

Alan Cobb

Hi,

The DataTable documentation says:
"This type is safe for multithreaded read operations.
You must synchronize any write operations."

So I should be able to do the following safely?
Create and fill a DataTable on a worker thread and then
pass that same DataTable object over to the GUI thread.
Then as long as the worker thread no longer touches
the DataTable, the GUI thread can read and write to it
all it wants?

Or the GUI thread could create the DataTable and pass it
to the worker thread for filling?

Or both threads could even write to the same DataTable
over time, as long as they used some kind of lock to
ensure only one wrote at a time?

It would be nice if the documentation explicitly said which
methods were thread safe. That is, which involved only
"read" operations and hence could be done at any time
from any thread.

Thanks,
Alan
 
Hi Alan,

Alan Cobb said:
Hi,

The DataTable documentation says:
"This type is safe for multithreaded read operations.
You must synchronize any write operations."

So I should be able to do the following safely?
Create and fill a DataTable on a worker thread and then
pass that same DataTable object over to the GUI thread.
Then as long as the worker thread no longer touches
the DataTable, the GUI thread can read and write to it
all it wants?

Yes.


Or the GUI thread could create the DataTable and pass it
to the worker thread for filling?

It could, as long as no fields are data-bound to UI controls.

Or both threads could even write to the same DataTable
over time, as long as they used some kind of lock to
ensure only one wrote at a time?

Yes, unless the data is data-bound. In that case, you have to do all writes
from the thread that created the UI controls.

It would be nice if the documentation explicitly said which
methods were thread safe. That is, which involved only
"read" operations and hence could be done at any time
from any thread.

Only a handful of controls/components out there are thread-safe, because of the
performance hit. You have to assume that methods are not thread-safe unless
otherwise specified.
And note that read-only methods are not thread-safe by default.
 
Thanks Cor and Miha,

Actually my current situation does not require a lot of
multi-threading, but I was curious.

I am now filling a DataTable asynchronously using a
BackgroundWorker component. When the filling is done the
DataTable gets handed back to the GUI thread and from
then on the GUI thread deals with it alone. This works fine
and it makes up for the lack of "built-in" asynchronous
Load/Fill support in ADO.NET 2.0.

Alan
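
For readers who want to see the hand-off described above, here is a minimal sketch, not the poster's actual code. BuildTable is a hypothetical stand-in for a slow DataAdapter.Fill call, and the class and method names are made up for illustration. The worker creates and fills the table, then hands it over through e.Result and never touches it again.

```csharp
using System;
using System.ComponentModel;
using System.Data;
using System.Threading;

public static class BackgroundFillSketch
{
    // Hypothetical stand-in for a slow DataAdapter.Fill call.
    public static DataTable BuildTable(int rowCount)
    {
        var table = new DataTable("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        for (int i = 0; i < rowCount; i++)
        {
            table.Rows.Add(i, "Item " + i);
        }
        return table;
    }

    // Fills a table on a worker thread and returns it once the worker is done.
    public static DataTable FillInBackground(int rowCount)
    {
        DataTable result = null;
        using (var done = new ManualResetEvent(false))
        using (var worker = new BackgroundWorker())
        {
            // Runs on a thread-pool thread; only the worker touches the table here.
            worker.DoWork += (sender, e) => { e.Result = BuildTable(rowCount); };

            // In a WinForms app this handler runs on the GUI thread.
            // In this console demo there is no GUI thread, so it runs on a pool thread.
            worker.RunWorkerCompleted += (sender, e) =>
            {
                if (e.Error == null)
                {
                    result = (DataTable)e.Result; // ownership passes to the receiver
                }
                done.Set();
            };

            worker.RunWorkerAsync();
            // Console demo only: never block a real GUI thread like this.
            done.WaitOne();
        }
        return result;
    }

    public static void Main()
    {
        DataTable table = FillInBackground(1000);
        Console.WriteLine(table.Rows.Count);
    }
}
```

In a real form, the RunWorkerCompleted handler would assign the table to a grid or BindingSource instead of signalling an event.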
 
Alan Cobb said:
Thanks Cor and Miha,

Actually my current situation does not require a lot of
multi-threading, but I was curious.

I am now filling a DataTable asynchronously using a
BackgroundWorker component. When the filling is done the
DataTable gets handed back to the GUI thread and from
then on the GUI thread deals with it alone. This works fine
and it makes up for the lack of "built-in" asynchronous
Load/Fill support in ADO.NET 2.0.

Sure, this is a good approach. I do likewise with save (I create a copy using
GetChanges() beforehand).
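
A minimal sketch of the GetChanges() idea mentioned above, assuming a small in-memory table; the class and method names are hypothetical. GetChanges() returns an independent copy containing only the changed rows (or null when nothing has changed), so that copy can be given to a worker thread for saving while the GUI thread keeps the original.

```csharp
using System;
using System.Data;

public static class GetChangesSketch
{
    // Builds a small table whose rows are all Unchanged, as they would be after a Fill.
    public static DataTable BuildTable()
    {
        var table = new DataTable("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.PrimaryKey = new[] { table.Columns["Id"] };
        table.Rows.Add(1, "One");
        table.Rows.Add(2, "Two");
        table.AcceptChanges();
        return table;
    }

    // Edits the live table, then snapshots only the pending changes.
    public static DataTable EditAndSnapshot(DataTable live)
    {
        live.Rows[0]["Name"] = "Uno"; // row state becomes Modified
        live.Rows.Add(3, "Three");    // row state is Added

        // Independent copy of the changed rows; null if there are none.
        return live.GetChanges();
    }

    public static void Main()
    {
        DataTable live = BuildTable();
        DataTable pending = EditAndSnapshot(live);
        // 'pending' could now be passed to a worker thread for the save,
        // while the GUI thread continues to own 'live'.
        Console.WriteLine(pending == null ? 0 : pending.Rows.Count);
    }
}
```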
 
A related question:

It is not clear to me how to asynchronously fill a DataTable
that lives inside a DataSet.

Assume a BackgroundWorker thread has asynchronously filled
a DataTable (which might take 10 seconds in my case, all done
without blocking the GUI thread) and then returned it to the
GUI thread. One way to then get the newly filled DataTable into
the DataSet is to call DataSet.Merge( DataTable ). But that takes
as much time (10 seconds) as the initial Fill, and it blocks the GUI
thread the whole time, so the initial asynchronous Fill has not
achieved "non-blocking-ness".

Another possibility is to pass the BackgroundWorker thread a
reference to the member DataTable inside the DataSet and have
the background thread fill it directly. But if controls are bound
to the DataSet, that isn't allowed, as you have said. I guess
there is no way to temporarily unbind the DataSet and just
assign a new filled DataTable to the internal DataTable member
of the DataSet, rather than Merging it in?

Any suggestions?

Thanks,
Alan
 
Alan Cobb said:
A related question:

It is not clear to me how to asynchronously fill a DataTable
that lives inside a DataSet.

Assume a BackgroundWorker thread has asynchronously filled
a DataTable (which might take 10 seconds in my case, all done
without blocking the GUI thread) and then returned it to the
GUI thread. One way to then get the newly filled DataTable into
the DataSet is to call DataSet.Merge( DataTable ). But that takes
as much time (10 seconds) as the initial Fill,

It sounds weird. It shouldn't take that much time. Are you sure? How many
records are we talking about?


and it blocks the GUI
thread the whole time, so the initial asynchronous Fill has not
achieved "non-blocking-ness".

Another possibility is to pass the BackgroundWorker thread a
reference to the member DataTable inside the DataSet and have
the background thread fill it directly. But if controls are bound
to the DataSet, that isn't allowed, as you have said. I guess
there is no way to temporarily unbind the DataSet and just
assign a new filled DataTable to the internal DataTable member
of the DataSet, rather than Merging it in?

One way is to unbind the controls before filling the data. But I would
really rather use the Merge way.
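
A minimal sketch of the Merge route, assuming an untyped DataSet with one table named "Items"; the names are hypothetical. The separately filled table has to carry the same TableName as the target table so that DataSet.Merge(DataTable) merges into it rather than adding a new table. The sketch says nothing about how long Merge takes for large tables.

```csharp
using System;
using System.Data;

public static class MergeSketch
{
    // Creates an empty "Items" table with a primary key so Merge can match rows.
    public static DataTable NewItemsTable()
    {
        var table = new DataTable("Items");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Name", typeof(string));
        table.PrimaryKey = new[] { table.Columns["Id"] };
        return table;
    }

    public static DataSet MergeFilledTable(int rowCount)
    {
        // The DataSet the GUI is working with; its "Items" table starts empty.
        var dataSet = new DataSet("Demo");
        dataSet.Tables.Add(NewItemsTable());

        // Stand-in for the table a background worker filled and handed back.
        DataTable filled = NewItemsTable();
        for (int i = 0; i < rowCount; i++)
        {
            filled.Rows.Add(i, "Item " + i);
        }

        // Runs on the calling thread (the GUI thread in a real app).
        dataSet.Merge(filled);
        return dataSet;
    }

    public static void Main()
    {
        DataSet dataSet = MergeFilledTable(1000);
        Console.WriteLine(dataSet.Tables["Items"].Rows.Count);
    }
}
```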
 
It sounds weird. It shouldn't take that much time. Are you sure? How many
records are we talking about?

About 150,000 records. The private bytes used by my app go up
about 100 MB because of the Fill. Is a >10 (actually more like 15)
second Fill reasonable for that many records?

Maybe in-memory DataTables and DataSets aren't normally used
with that many records?

and it blocks the GUI

One way is to unbind the controls before filling the data. But I would
really rather use the Merge way.

I noticed the method BindingSource . SuspendBinding. Is that ever
used for that purpose? Or when you say "unbind the controls" that
would be something like temporarily setting
BindingSource . DataSource to null or typeof( MyDataSetType )
or something like that?

Thanks,
Alan
 
About 150,000 records. The private bytes used by my app go up
about 100 MB because of the Fill. Is a >10 (actually more like 15)
second Fill reasonable for that many records?

Maybe in-memory DataTables and DataSets aren't normally used
with that many records?

That's huge.
The real question is the design of your application. Do you really need that
many records at client side?
Can it be done differently?

I noticed the method BindingSource . SuspendBinding. Is that ever
used for that purpose?

I guessed so, but a while ago I tested and I recall that it wasn't working
as expected.
Not 100% sure on this, I guess I have to test a bit more.

Or when you say "unbind the controls" that
would be something like temporarily setting
BindingSource . DataSource to null or typeof( MyDataSetType )
or something like that?

Yes, that's what I meant - disconnecting the datasource - that will always work.
 
It sounds weird. It shouldn't take that much time. Are you sure? How many
records are we talking about?

That's huge.
The real question is the design of your application. Do you really need that
many records at client side? Can it be done differently?

Yes it probably can and should be done differently.
One advantage of pulling it all into memory is that once it's there
it's fast to iterate over all the records to compute some statistics,
but there are probably other ways I can do that like queries.

I also currently bind all 150,000 records to a grid, which
apparently is a somewhat naive design. That bind to the grid
also takes an additional 15 seconds, during which the GUI thread
is again blocked and unresponsive. Apparently there is also no
way to do the bind asynchronously and no simple stock way to
get some kind of automatic paging of individual screenfuls of
rows. I guess the "solution" is to just load and bind less.

It has been said before: SQL's SELECT statement has a WHERE
clause for a reason :).

I guessed so, but a while ago I tested and I recall that it wasn't working
as expected.
Not 100% sure on this, I guess I have to test a bit more.

Or when you say "unbind the controls" that

Yes, that's what I meant - disconnecting the datasource - that will always work.

BTW: When I tried "unbinding" the controls by setting
BindingSource.DataSource to null or typeof( MyDataSetType ) I got
"Cannot bind to the property or column..." exceptions. What did work
was to set DataSource to point at an empty instance of my typed
DataSet.

Thanks,
Alan
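
The workaround above might look roughly like the sketch below. It assumes an untyped DataSet with one hypothetical table "Items" and uses DataSet.Clone() to get an empty instance with the same schema; a typed DataSet would use a new empty instance of its own type instead. It needs a reference to System.Windows.Forms, and the Task.Run(...).Wait() call only keeps this console demo short. A real form would fill on a BackgroundWorker and rebind in its RunWorkerCompleted handler.

```csharp
using System;
using System.Data;
using System.Threading.Tasks;
using System.Windows.Forms;

public static class RebindSketch
{
    public static DataSet NewDataSet()
    {
        var dataSet = new DataSet("Demo");
        DataTable items = dataSet.Tables.Add("Items");
        items.Columns.Add("Id", typeof(int));
        return dataSet;
    }

    // Returns { item count while parked on the empty DataSet, item count after rebinding }.
    public static int[] FillWhileUnbound(int rowCount)
    {
        DataSet live = NewDataSet();
        using (var source = new BindingSource(live, "Items"))
        {
            // "Unbind": point the BindingSource at an empty DataSet with the same
            // schema, so the "Items" data member still resolves.
            source.DataSource = live.Clone();
            int whileParked = source.Count;

            // Nothing is bound to 'live' now, so another thread may fill it.
            Task.Run(() =>
            {
                for (int i = 0; i < rowCount; i++)
                {
                    live.Tables["Items"].Rows.Add(i);
                }
            }).Wait(); // demo only; do not block a real GUI thread

            // Rebind once the worker no longer touches the DataSet.
            source.DataSource = live;
            return new[] { whileParked, source.Count };
        }
    }

    public static void Main()
    {
        int[] counts = FillWhileUnbound(1000);
        Console.WriteLine(counts[0] + " -> " + counts[1]);
    }
}
```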
 