OT: Fastest Database?

  • Thread starter: Jim Hubbard

Jim Hubbard

What is the fastest database to use for searching vast amounts of data (650
billion records - 7 fields)?
 
A lot depends on how the indexes are configured and what hardware the
database runs on. Our largest and fastest database is Oracle running
 
Our test machine is a simple Pentium 4 with 2 GB of RAM, running Windows XP
Pro or 2000 Advanced Server.

To do the original calculations, and fill the database, took us several
days.

With so much data, we are considering pre-fetching the results for all
queries and creating another database from them. All possible queries are
known, so looking up the results by query would be faster than calculating
them from 650bn records on the fly.

Of course this could take several months, unless we can create a hive of PCs
to work on the possible results.
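The precompute-everything approach described above can be sketched as a simple map from query to answer. This is a minimal illustration, not the poster's actual setup: `run_query` is a hypothetical stand-in for the real 650-billion-record calculation, and a real "hive" would spread the work across machines rather than local worker threads.

```python
# Sketch of precomputing answers for every known query, assuming the
# full query list can be enumerated up front.
from concurrent.futures import ThreadPoolExecutor

def run_query(q):
    # Hypothetical placeholder for the expensive on-the-fly calculation.
    return sum(q)

def precompute(queries, workers=4):
    """Run every known query once; the resulting dict gives O(1) lookups."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        answers = list(pool.map(run_query, queries))
    return dict(zip(queries, answers))
```

Once built, the dict (or a table persisted from it) replaces the expensive calculation entirely for known queries.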

Jim
 
Jim,

The fastest database for the problem as you describe it is, of course, your
own purpose-built database, which has no general-purpose overhead at all.

Just my thought,

Cor
 
Jim Hubbard said:
What is the fastest database to use for searching vast amounts of data
(650 billion records - 7 fields)?

Jim, Jim, Jim ... everybody in this MICROSOFT newsgroup knows full well that
SQLServer is god's gift to databases! ;-)

Seriously, though, I think you will find an unintended bias towards MSSQL in
a MS newsgroup .. just because almost everybody here uses it and likes it,
and I would speculate that many here may never have used another RDBMS. You
may want to spread this question around if you haven't already.

With a DB that size, it sounds like a serious investment of time and may
warrant your own investigation .. even if it does take a couple of days for
each. Personally, I would test SQLServer, Oracle, PostgreSQL, and even MySQL
(if you are just looking up data).

Also, if you were so inclined and the speed and size gains were worth it,
you could use an open-source DB and gut whatever is not necessary.

Good luck,
John MacIntyre
http://www.johnmacintyre.ca
Specializing in: Databases, Web Applications, and Windows Software
 
I do not know enough about your requirements.
However, as you describe it here, it seems you have a huge amount of static
data from which you only need to retrieve some subset that is known
beforehand.
You also have only a few columns.

It might very well be that the absolutely fastest and most efficient
solution is to create your own structure to hold and work on the data.
Any database is a general-purpose store built to answer general queries
over general data.
Since you know exactly what data you have, its size, and what you want to
do with it, creating your own structure might very well be the fastest
possible solution...
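The "roll your own structure" idea above is practical here because the records are fixed-width: seven 4-byte fields make 28 bytes per record, so a sorted binary blob plus binary search can beat a general-purpose engine for pure lookups. This is a sketch under assumed details (field layout and a lookup keyed on the first field are my inventions, not from the thread):

```python
# Minimal custom storage: fixed-width records packed into one blob,
# searched with binary search on the first field.
import struct

RECORD = struct.Struct("<7I")  # seven unsigned 32-bit fields = 28 bytes

def pack_sorted(records):
    """Pack records (tuples of 7 ints) into one sorted, contiguous blob."""
    return b"".join(RECORD.pack(*r) for r in sorted(records))

def lookup(blob, key):
    """Binary-search the blob for a record whose first field equals key."""
    lo, hi = 0, len(blob) // RECORD.size
    while lo < hi:
        mid = (lo + hi) // 2
        rec = RECORD.unpack_from(blob, mid * RECORD.size)
        if rec[0] < key:
            lo = mid + 1
        elif rec[0] > key:
            hi = mid
        else:
            return rec
    return None
```

At 28 bytes per record there is no per-row overhead at all, which is exactly the advantage a hand-rolled structure has over a general RDBMS.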

Regards, TEK
 
Thanks for your post.

The database is comprised of 7 fields, each of which holds a maximum of 4
bytes. The total number of unique records is almost 650 billion.

As we do know the queries beforehand, I think the route we will try is to
have several machines work around the clock running the queries and storing
the queries and corresponding answers in a separate database. Then, when a
predetermined query is executed, we will already know the answer and can
simply return the correct, predetermined result.

Think of the database as a genome project where each field is an arrangement
of the 4 main bases that unite to form a "step" on the DNA ladder. The
database is being used to determine the % chance of any given sequence being
created from two known DNA strands, and the possible variants that would
apply in a given situation.

Predetermined results seem to be the way to go. It will take quite some
time, but, as time moves along, the database will get faster and faster by
checking for predetermined answers and only "doing the math" if a
predetermined answer is not found. This new answer should also be added to
the expanding answer database.
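The check-first, compute-on-miss, store-the-result pattern described above is classic memoization. A minimal sketch (the `compute` callable and in-memory dict are stand-ins I'm assuming for the real calculation and the separate answer database):

```python
# Sketch of the expanding "answer database": look up a predetermined
# answer first, "do the math" only on a miss, and store the new result
# so future queries get faster over time.
class AnswerCache:
    def __init__(self, compute):
        self.compute = compute  # the expensive 650bn-record calculation
        self.answers = {}       # stands in for the separate answer DB

    def query(self, q):
        if q not in self.answers:           # no predetermined answer yet
            self.answers[q] = self.compute(q)
        return self.answers[q]
```

Each miss permanently converts one expensive calculation into a cheap lookup, which is why the system speeds up as it runs.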

Thanks again for your post.

Jim
 