Sequential date-time data and ado.net

John Dann · Oct 16, 2004

I'd appreciate some general advice on using ado.net in a VB.net
environment for a particular application:

I need to store a large series of instrument-generated data for later
retrieval and analysis and a database (probably Access to start with
for ease-of-development, but maybe a more scalable DBMS later) seems
to offer a number of advantages in being able to retrieve data
flexibly. But because this is a rather different type of application
from most business databases, I'm struggling a little for relevant
advice.

The idea is that new records will be created at defined time intervals
(eg every 10mins, but conceivably as often as every minute) over
potentially long periods of time, eg a year or more. So, in extremis,
we could be looking at a million records or more, though 100 000 might
be more typical. Each record will have its own explicit DateTime stamp
as a column and records will be mostly retrieved by datetime so this
column would be made the primary index. The records will be added to
the database only in strict time sequence, though the processes of
data collection and record creation may be decoupled, ie new data
records created and added potentially in a batch some time after
actual data collection. The database will probably only need to be
queried by one or two users at a time.

That's the background, but a couple of questions, one general and one
specific:

Q1 The general one is whether anyone would disagree with the use of
database such as Access or MSDE for this type of application. I don't
want to get too hung up on the exact choice of database right now.
Provided eg MSDE could do the job ultimately and Access is adequate
for smaller scale prototyping then I can live with that. It's more the
principle of using an established database tool here rather than
creating eg a custom binary file structure and having to explicitly
code the update and query procedures where I'd appreciate any
comments.

Q2 One specific task will be to find out the datetime stamp for the
last record previously added prior to doing the next update of the
database. Not being an experienced database programmer I'm concerned
that the brute force approach to doing this could be very slow. In
other words, if I just ask for the MAX value of the datetime column is
the database going to have to look through all of eg a million records
to find the answer? Or are they cleverer than that and maybe maintain
a separate index of eg min/max values, ie so a column MAX can be
quickly returned?

Or alternatively is there some other way of returning the datetime
column value for the last record previously added?

JGD

David Browne · Oct 16, 2004

John Dann said:
I'd appreciate some general advice on using ado.net in a VB.net
environment for a particular application:

I need to store a large series of instrument-generated data for later
retrieval and analysis and a database (probably Access to start with
for ease-of-development, but maybe a more scalable DBMS later) seems
to offer a number of advantages in being able to retrieve data
flexibly. But because this is a rather different type of application
from most business databases, I'm struggling a little for relevant
advice.

The idea is that new records will be created at defined time intervals
(eg every 10mins, but conceivably as often as every minute) over
potentially long periods of time, eg a year or more. So, in extremis,
we could be looking at a million records or more, though 100 000 might
be more typical. Each record will have its own explicit DateTime stamp
as a column and records will be mostly retrieved by datetime so this
column would be made the primary index. The records will be added to
the database only in strict time sequence, though the processes of
data collection and record creation may be decoupled, ie new data
records created and added potentially in a batch some time after
actual data collection. The database will probably only need to be
queried by one or two users at a time.

That's the background, but a couple of questions, one general and one
specific:

Q1 The general one is whether anyone would disagree with the use of
database such as Access or MSDE for this type of application. I don't
want to get too hung up on the exact choice of database right now.
Provided eg MSDE could do the job ultimately and Access is adequate
for smaller scale prototyping then I can live with that. It's more the
principle of using an established database tool here rather than
creating eg a custom binary file structure and having to explicitly
code the update and query procedures where I'd appreciate any
comments.

Any time you need to share data, even with one or two users, at runtime, I
would use a database.

Q2 One specific task will be to find out the datetime stamp for the
last record previously added prior to doing the next update of the
database. Not being an experienced database programmer I'm concerned
that the brute force approach to doing this could be very slow. In
other words, if I just ask for the MAX value of the datetime column is
the database going to have to look through all of eg a million records
to find the answer? Or are they cleverer than that and maybe maintain
a separate index of eg min/max values, ie so a column MAX can be
quickly returned?

Both Access and SQL Server (MSDE) can cheaply return the max value from an
indexed column. But Access lacks the locking semantics to ensure you can
get the latest value and lock the table for long enough to insert new
records.

MSDE is SQL Server, and you can develop againt it using access, especially
using Access ADP projects. This might give you the ease-of-use you are
looking for with Access, but still have a more robust database back end.

David

Cor Ligthert · Oct 17, 2004

John,

I would use the MSDE database in your situation with as only reason that MS
Access is comparing with MSDE very slow, and therefore probably not the
right one for you.

As well would I good investigate how to synchonize the timesettings on the
different computers, wherefor you probably will need some kind of service
for that on the worldclock.

When you do that I would use the datetime and not the timestamp, what that
can be something personal of me. However have a look at this as well.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlref/ts_ta-tz_6fn4.asp

Just my thought,

Cor

John Dann · Oct 17, 2004

As well would I good investigate how to synchonize the timesettings on the
different computers, wherefor you probably will need some kind of service
for that on the worldclock.

When you do that I would use the datetime and not the timestamp, what that
can be something personal of me.

I should probably have made clearer (because the terms timestamp and
even datetime are used much more widely than their specific meaning in
the database context) that the timestamp I'm talking about here is a
datetime parameter whose value is set when each data item is
collected. This may be long before the data ever sees a database. It's
therefore a time reference for the time-point at which the data was
first collected not when the database record was created. There is
only a single source of instrumental data and the timestamp is created
by that source. The database column is therefore simply being used to
hold an externally-created timestamp, and does not map to a timestamp
in any SQL/database sense.

Sorry for any confusion.

JGD

Sequential date-time data and ado.net

John Dann

David Browne

Cor Ligthert

John Dann