How efficient are datatables? Can they have indexes?

  • Thread starter Thread starter damiensawyer
  • Start date Start date
D

damiensawyer

Hi,

Can someone please tell me if ado.net datatables are efficient for
searching? I am thinking of a model whereby at intial startup, I
populate 'all' of the data for the application into a multi table
dataset (say 5-10 tables each holding between 1000 and 100000 rows).

During execution, objects are instatiated from data in those datatables
as needed and then disposed of by a 'clean up' thread if they've been
inactive for a given duration.

For this to work, I would need to be able to very quickly seek to the
appropriate records in the datatable. Are they indexed? Can I use
multiple indexes? How about composite indexes? Are they fast? Is there
a better way?

One possible answer might be to create objects and put them in a hash
table... however, the creation of business objects is exactly what I'm
trying to get around.

Thanks very much in advance,



Damien Sawyer
 
Damiensawyer,

You have completely told what you want to do, however not how you become the
data. Is your purpose to type the data in or is there a more efficient
datasource?.

Cor
 
If you look at the datatable object, you will see that there is no way to
add indexes.
Now, internally, I am guessing there is some sort of indexing for the
primary keys.

However, datatables are not meant to contain 100K worth of rows.

I think you need to redesign the way your application works, and when and
how it accesses data. You should just be going after the data directly
through the database when you need it. Databases have all sorts of query
rewriting and optimization features that make accessing data faster.
 
Marina,

Why you think there is a database. When this is from a textfile, than it is
maybe a solution to think about.

Cor
 
I assumed there would be one.

If not, then I would question why a text file is being used in this case.
And, I would suggest that the poster puts the data into a database before
working further with it. An Access database if nothing better is available.
 
The DataTable.PrimaryKey or using a DataView class with the RowFilter or
Sort fields are effective ways to create and keep alive an index. The
DataRowCollection and DataView classes have Find method which search on the
index. DataTable.Select method will create an index as needed, but there is
nothing to keep that index alive after Select has returned.

By cleanup, I assume you mean by deleting individual rows from the table.
In V1.1 DataTable does not efficiently delete rows and becomes worse with
the more rows you have. That's been fixed in V2.0 :)

DataTable is multithread safe only for readers, so if you have background
cleanup thread - make sure no one is reading or writing on the
DataSet/DataTable while making changes.
 
Hi All,

Thanks very much for the responses.

To start from the top - yes, I am getting the data from SQL server. I'm
aware that it is indeed preferable to retrieve data as you need it with
discreet SQL statements... however it doesn't really work in this case.

I am writing a 'modeling' application which needs to load large amounts of
data into quite a complex object model for the purpose of performing
iterative aggregations. It is these objects that I'm looking to cleanup -
not the datatable rows.

I've just finished a similar application that 'goes to the db as needed'...
It suffers however from serious performance limitations brought about by
'lazy loading' of data. The answer I tried for this was to bring all the
data in at once and work on it in RAM... however I then got hit on the other
side of the coin by my objects staying in scope, therefore not being GC'd
and giving the app a huge memory footprint. I think that I can live with
100k rows in a data table if the objects instatiated from that data destroy
themselves when possible..

That's very interesting about indexes on the datavowcollection and
dataviews. Do you know if the indexes are limited to the primary keys? Or
will they create themselves to suit the where clause of the query?

Thanks very much everyone :-)



Damien Sawyer
 
Back
Top