Defining a block of consecutive rows in a datatable

  • Thread starter Thread starter prodata
  • Start date Start date
P

prodata

I've got a large set of time-series data which is organised _strictly_
in time sequence and contained in a (VB2005) datatable.

I want to iterate through this data to extract various hourly means,
which means identifying blocks of consecutive rows and computing
hourly aggregate functions (mean, SD etc).

I do have this working but by progressively working through the table
from top to bottom, identifying a block of rows with the same hour
value, copying this block of rows to a separate scratchpad datatable
and using various datatable.compute aggregate functions on all of the
rows in the scratchpad table.

I'm sure this isn't the most efficient approach in that it would be
better to work on the full original datatable and to pass a filter
expression to the datatable.compute command to indicate which
particular block of row numbers should be used in the calculation, but
being only an occasional ADO.Net user I don't know how to build the
filter.

What I want to say for the filter argument is something like 'Where
Rownumber Between R1 And R2" or 'Rownumber >=R1 AND Rownumber <=R2".
(I will know which row numbers constitute the relevant block and I
don't want to filter by datetime value - each hourly block would be
little more than 0.01% of an annual datatable, so not efficient to
repeatedly test the whole datatable just to select a particular hour's
worth of data.)

So I guess the question is whether there is any way of accessing the
row number property from a filter expression. (I guess I could
manually build in a row number column into the table and filter on
that, but maybe this is unnecesssary.)

Anyone able to help please?
 
By the way a block of consecutive rows in a datatable don't exist, in fact
they are ordered in the ways they are added without looking at the content.

But you can use a DataView to get a view on the rows in a consecutive way by
using the sort and rowfilter properties.
 
By the way a block of consecutive rows in a datatable don't exist, in fact
they are ordered in the ways they are added without looking at the content.

Yes I understand what you're saying. But in fact I control the order
that the rows are added in when creating the initial datatable so - at
least assuming that this order doesn't change during the lifetime of
the datatable - I can be confident that when I imagine that a given
block of rows is in strict datetime order then it really is.

But perhaps I might take from your comment that what I'm seeking to do
actually represents a relatively unusual and artificial situation and
so maybe is not well provided for by ADO.Net functions. Maybe I would
be better adding a column to the datatable that represents integer
hours since the very first entry in the datatable (which would be
calculated as each individual datarow was added) and then simply
performing the aggregate functions on each valid hour value in turn
(ie using the hour value as the filter in the .compute command).
 
prodata said:
Yes I understand what you're saying. But in fact I control the order
that the rows are added in when creating the initial datatable so - at
least assuming that this order doesn't change during the lifetime of
the datatable - I can be confident that when I imagine that a given
block of rows is in strict datetime order then it really is.

But perhaps I might take from your comment that what I'm seeking to do
actually represents a relatively unusual and artificial situation and
so maybe is not well provided for by ADO.Net functions. Maybe I would
be better adding a column to the datatable that represents integer
hours since the very first entry in the datatable (which would be
calculated as each individual datarow was added) and then simply
performing the aggregate functions on each valid hour value in turn
(ie using the hour value as the filter in the .compute command).


Datatable? Why bother? Why are you not using a List<T> of accessor
objects? Why are you not using Linq with a where clause to query the
List<T> of objects, along with using the Linq aggregate functions?
 
Why not, looks for me without your current need something you could do.

Seems to me vital for the data you are storing, as far as I understand it
from your reply.
 
Why do you create an extra collections when you have already a Ilist
implementing collection of rows.

Are you afraid that you don't use all the memory in a computer.

A datatable is a very good IListSource implementing Type
 
Why Not? Looks for me, without knowing all your current needs, something you
could do.
 
Cor said:
Why do you create an extra collections when you have already a Ilist
implementing collection of rows.

I moved away from datatables and datasets the moment I went to ADO.NET
Entity Framework and Linq-2-Objects.
Are you afraid that you don't use all the memory in a computer.

Accessing a datatable has been proven to be a slow means of accessing
data, much slower to access than a collections of objects when it comes
to querying.
A datatable is a very good IListSource implementing Type

I prefer the List<T> of objects with its ability to be quired using
Linq-2-Objects. Objects are much more flexible. I'll take objects every
time over datatable and dataset.
 
Back
Top