merge multiple databases

  • Thread starter Thread starter DIOS
  • Start date Start date
D

DIOS

I have multiple Access97 databases that i want to merge into one single big
database. I have some ideas on how to do it but would like any suggestions
on doing the mrege more efficiently. Currently i am using a VB app to
connect to
the various databases via DAO. The tables look like so:

tblDrive
--------
DriveID (primary key)
fldTitle
fldDesc

tblFiles
--------
FileID (primary key)
DriveID
fldFTitle
fldFSize

tblSubFiles
-----------
SubFileID (primary key)
FileID
DriveID

Basically each drive has many files and each file has many subfiles. I was
going to iterate through each record in
tblDrive and then do a query with the DriveID and get all proper records
from tblFiles and add those to the new database
with a new DriveID. Then for my tblFiles recordset I was going to perform a
query with the same DriveID and the same
FileID and take those records and add them to the new database. This seems
like rather inefficient to me but i cant
see how to do it in a more elegant manner.

tia
AGP
 
Create links to the tables. If you need to move data, then via
Access create a make table query that uses the link to create
a new table.

I typically name the links, lnkNames and then create the new
table and give it name, tNames. You'll have to be careful with
the links inside of Access. Make sure you're working with
backups while you're playing around. You don't want to delete
a whole table on accident.

I tend to use DAO to do everything, but it can be done in ADO
as well.

If you plan on reusing the tables over and over, say, you're
going to delete everything in the table, and recreate it, you'll
need to set up a loop to go through the tabledefs to delete
a specific table, before executing the make table query.

The make table queries are ran with a DAO.Execute statement,
as it's an action query. Select queries can be run opened as
recordsets. Action queries need to be executed.

Let me know if you need any other help.

--
Jim Carlock
http://www.microcosmotalk.com/
Post replies to the newsgroup.


:
I have multiple Access97 databases that i want to merge into one single big
database. I have some ideas on how to do it but would like any suggestions
on doing the mrege more efficiently. Currently i am using a VB app to
connect to
the various databases via DAO. The tables look like so:

tblDrive
--------
DriveID (primary key)
fldTitle
fldDesc

tblFiles
--------
FileID (primary key)
DriveID
fldFTitle
fldFSize

tblSubFiles
-----------
SubFileID (primary key)
FileID
DriveID

Basically each drive has many files and each file has many subfiles. I was
going to iterate through each record in
tblDrive and then do a query with the DriveID and get all proper records
from tblFiles and add those to the new database
with a new DriveID. Then for my tblFiles recordset I was going to perform a
query with the same DriveID and the same
FileID and take those records and add them to the new database. This seems
like rather inefficient to me but i cant
see how to do it in a more elegant manner.

tia
AGP
 
The database merge is going to be done completely through code.
The end user does not even know that that databases are Access97
databases. I will keep the same table structure in the new database.
I just want to dump all database records into the new database but because
some IDs may be the same then i just want to account for that. Again,
i have a plan but its based on iterating through each record. There should
be
an easier way to do this.

AGP
 
DIOS said:
The database merge is going to be done completely through code.
The end user does not even know that that databases are Access97
databases. I will keep the same table structure in the new database.
I just want to dump all database records into the new database but because
some IDs may be the same then i just want to account for that. Again,
i have a plan but its based on iterating through each record. There should
be
an easier way to do this.

AGP

There is no easier way (afaik). How often to you need to do this? If
only once in a while, efficiency shouldn't be an issue. Reliability and
control are far more important (imo) for these 'once in a while'
operations. If you need to worry about over-writing specific records,
there is no better way than to have code look at each record.
 
DIOS said:
I have multiple Access97 databases that i want to merge into one single big
database. I have some ideas on how to do it but would like any suggestions
on doing the mrege more efficiently. Currently i am using a VB app to
connect to
the various databases via DAO. The tables look like so:

<snip>

After reading this thread, let me throw this out at you. I apologize in advance
for word wrap:

sSQL = "INSERT INTO tblOne "
sSQL = sSQL & "IN '" & connLocal.Properties("Data Source").value & "' "
sSQL = sSQL & "SELECT tblOne.* "
sSQL = sSQL & "FROM tblOne "
sSQL = sSQL & "IN '" & connServer.Properties("Data Source").value & "' ; "
connLocal.Execute sSQL

Let me explain that a bit. connLocal is a connection to one Access database.
connServer is a connection to another Access database. The Data Source property
of the connection objects is the path to the actual mdb file. You would be
executing an insert into a table in a specific database all the record from a
select statement on another database.

Now, this may or may not work for you because of your autonumber situations.
Well, if you have autonumber and this may mess everything up, then add in some
columns into the new database to hold the oldDriveID (bad name). Then, after
you have all your records in your new database, you can issue some cleanup
routines to update the ids of the child tables to the new ids, based on the
oldDriveId field.

Does that help?

Matt
 
Okay, first some comments on the process I'm describing below.
You need to create a new Access datafile (mdb) to store all in.
This file can be installed into the folder that other access files are
located in. This will work not only with Access databases, but
with just about any type of data system in the world.

With that in mind:

(1) Create the new mdb file.
(2) Open the newly created Access database. Create the links
inside the new Access mdb. Keep in mind that having all the
mdb's in the same folder will help greatly. Name the links:

lnkDrive
lnkFiles
lnkSubFiles

If there are any other files, continue creating these links.
The goal is to gather all the information into one file.

(3) Once all the links are created, you are ready to start
creating the Make tables. Do this inside Access. It's very
easy to do. Click on the Queries icon. Click on
"Create new query", then select lnkDrive as the table to
work with. Click on the (*) asterisk inside the field list.
Right click inside the query and click on Query Type...,
then click on Make Table. You'll be prompted for a file
name. Type in tDrive. Save the query and exit it.

(4) Test it out inside of Access, but NOTE: when working with
queries inside Access, make sure you have back ups of the files
you are going to play around with. When playing with these queries,
it's way to easy to delete a whole table, or modify and totally
obliterate a whole table. I can't emphasize it enough. Make sure you
have a valid backup before you start playing. :-)

The power of the Internal Access queries should be classified as
the 8th wonder of the world.

(5) Once you have your set of Make Table queries created, it's
time to start setting up the code inside of VB to run the action
queries.

(6) Once all the tables are localized into the new access file,
you can start messing with the queries there to present the data
to yourself. Once you have a proper query in Access, DAO
works extremily efficiently and fast handling the internal Access
queries.

I don't know what your full understanding is of Access, so
pardon me if I am oversimplifying things.

Take it one step at a time, and break it up into the proper
steps, it's really amazing.
---------------------------------------
Some important concepts to think about and understand
when working with Access:
---------------------------------------
When you work with Access, think of tables as tables of
information, and think of queries as one of two types. They
are either Action queries or Select queries. Action queries
perform some operation (modification) on the data, and the
Select queries just present the data (including sorting and
sorting is not considered an action). It's important to keep
the concepts of Action vs (I think it might be called Passive)
as separate notions in your head.

One other thing I'll mention is that I've used such methods in
extracting data from ODBC sources, and they were very very
large tables of data, which covered accounts receivables
where one particular Home Depot account had thousands
of Items that were sold, and amounted to millions of dollars.
I learned some great things about Access when I was doing
such things:

(1) Access is probably the best file extraction organism in the
universe and I would classify it as the 8th wonder of the world.
(2) Internal Access queries, rather than using SQL syntax
through VB is much much faster.
(3) I've set and configured 30 or so internal access queries to
be called in sequence and they all start with the initial make
table query to copy data from an original database to a local
access mdb.

I'll wait for a reply from you before I continue with this. Let me
know if this helps at all.

--
Jim Carlock
http://www.microcosmotalk.com/
Post replies to the newsgroup.


:
The database merge is going to be done completely through code.
The end user does not even know that that databases are Access97
databases. I will keep the same table structure in the new database.
I just want to dump all database records into the new database but because
some IDs may be the same then i just want to account for that. Again,
i have a plan but its based on iterating through each record. There should
be
an easier way to do this.

AGP
 
Jim,

Thanks for all the info. i appreciate it. i have a very good understanding
of how
Acess databases are structured. In the end what i need is just an algorithm
to
dump each table from each database into another new database. the
autu-increment
on the primary keys is what the problem is. I can make links but in the end
i still
have to iterate through each record like Ken halter stated. I dont see any
way around
this and thats why i was asking for suggestions. Since this will be done not
very often i
may just stick to my original plan of looping through records and processing
the merging
that way.

Also like i said, all this has to be done through code and I have yet to do
any linking
through code so i may just INSERT each tabel into my new database, do the
processing
on that table, and then DROP it.

Again thanks for the info.

AGP
 
Yes, its only once in a while so like you said i may just stick
to the record looping process i described. Thanx.

AGP
 
Heh, I don't have a clue about what Merging is except in the
concept of mail merge, whereby you have a letter in word and
you are mass mailing the letter to a group and you merge the
contents of the database into a letter.

DIOS said:
In the end what i need is just an algorithm to dump each table
from each database into another new database. the
auto-increment on the primary keys is what the problem is.

Links are just tabledefs without an empty connect property.

'purpose: connect links to remote mdb
Dim DBLinkPath As String 'fully qualified path to the mdb
Dim td As DAO.TableDef, sDBLinkPath

gSubConnect
sDBLinkPath = App.Path & "\" & gsMdbName
'Loop through tabledefs to find links:
For Each td In gDAO_DB.TableDefs
If td.Connect <> "" Then
'We have a link here, so update it
td.Connect = ";DATABASE=" & sDBLinkPath
td.RefreshLink
End If
Next td
Set td = Nothing
gSubDisconnect
'double check the links are good
mnuPerformImport.Enabled = Me.DBLinksAreGood
'------------------

I'm pretty sure the AutoIncrement can be turned on via code
as well.

DIOS said:
I can make links but in the end i still have to iterate through
each record like Ken halter stated. I dont see any way around
this and thats why i was asking for suggestions.

Since this will be done not very often i may just stick to my
original plan of looping through records and processing the
merging that way.

Let Access do the work on the tables for you. Create a set
of predefined queries in Access to do everything you need.
Then call those queries and executes queries in the VB code.
But set the queries up in Access and get them working in
Access. It works very very well.
Also like i said, all this has to be done through code and
I have yet to do any linking through code so i may just
INSERT each table into my new database, do the processing
on that table, and then DROP it.
*********************************************

You do know that you can link a newer Access database up
to any previous Access database around. I did this last year
using an Access XP mdb which was saved as an Access 2k
mdb which in turn connected to an Access 1.0 mdb.
 
Hmmm, I can create queries in Access or on the fly but I dont think there
is a concise way of creating a query that will merge my databases into
one big database. Each table has a primary index which is tied in with the
other
tables. Each database can possibly have the same primary keys. So when
i merge the databases i have to keep track of what the new keys are and then
transfer those to the the other tables and then merge those into the new
database.
Sounds easy but I just dont see how to do it with a standard update query.

AGP
 
DIOS said:
Hmmm, I can create queries in Access or on the fly but I dont think there
is a concise way of creating a query that will merge my databases into
one big database. Each table has a primary index which is tied in with the
other
tables. Each database can possibly have the same primary keys. So when
i merge the databases i have to keep track of what the new keys are and then
transfer those to the the other tables and then merge those into the new
database.
Sounds easy but I just dont see how to do it with a standard update query.

I dont know if you saw my other post, but see if this would work for you:

Add 2 columns to tblDrive -- dbId and oldDriveID.
Add 2 columns to tblFiles -- dbId and oldFileID.
Add 1 column to tblSubFiles -- dbId

dbId would be text. You'd update every record in every table, set dbId to some
unique identifier for each of your old databases. Would machine name work? How
about Pathname? You could even use a numbering scheme (NOT autonumber! You are
seeing why autonumber sucks sometimes, and GUIDs may work better for you in the
long run).

After you do that for each client datbase, then you can issue 3 sql statements
per database, one per table, like I showed you in my other post INSERTing all
records into the new datbase from teh old datbase, letting the new datbase
assign all brand new autonumber values in DriveId (but you have oldDriveID,
remember?).

Now you ahve all the information you need, and I think you'll just need to issue
3 update statements -- here are 2 of them...

UPDATE tblSubFiles INNER JOIN tblFiles ON (tblSubFiles.FileID =
tblFiles.oldFileId AND tblSubFiles.dbId = tblFiles.dbId) SET tblSubFiles.FileID
= tblFiles.FileId

UPDATE tblSubFiles INNER JOIN tblDrive ON (tblSubFiles.DriveId =
tblDrive.oldDriveID AND tblSubFiles.dbId = tblDrive.dbId) SET
tblSubFiles.DriveId = tblDrive.DriveID

' do the same thing for tblFiles

Does that work for you?

Matt
 
That is a very interesting algoithm. It might work so ill give it a shot.
I will create the tempID fiields and then run my queries and updates against
those
and just inputa temp number for tha actual Ids until i fill them in. I have
the iteration
algorithm already written but this eems like it may be more efficient and
faster.
I will post my comparisons.

AGP
 
"DIOS" indicated:
Hmmm, I can create queries in Access or on the fly but I dont
think there is a concise way of creating a query that will merge
my databases into one big database. Each table has a primary
index which is tied in with the other tables.

Well, you may or may not be able to do it with one query. I've
done such things in the past with a few queries. What I did, was
create such as the following:

BuildBigTable-0001-qm
BuildBigTable-0002-qm
BuildBigTable-0003-qm
BuildBigTable-0004-qm
BuildBigTable-0005-qu
BuildBigTable-0006-qu
BuildBigTable-0007-qu

You can drop two tables into the initial make table query and
link them together. You can even do three tables, but I tend
to avoid doing such things unless the two other queries link to
the "master" query in a 1 to 1 relationship or a 1 to many
relationship. I prefer to do things in steps though rather than
making it all one glob from the outset. It's easier to follow
and understand, but someone that is really really good with
designing queries could probably get away with the one big
glob query.

You can mess with the queries inside of Access... which you
should be doing to start with. Once you get the initial make
table queries set up, run it once, use the newly created tables,
connect them in a select query and start messing with the
select query until you get the results you want and then start
throwing the stuff into a new BigTable.

You can create a Macro inside of Access to do test runs of
all the queries.

--
Jim Carlock
http://www.microcosmotalk.com/
Post replies to the newsgroup.


:
That is a very interesting algoithm. It might work so ill give it a shot.
I will create the tempID fiields and then run my queries and updates against
those
and just inputa temp number for tha actual Ids until i fill them in. I have
the iteration
algorithm already written but this eems like it may be more efficient and
faster.
I will post my comparisons.

AGP
 
DIOS said:
That is a very interesting algoithm. It might work so ill give it a shot.
I will create the tempID fiields and then run my queries and updates against
those
and just inputa temp number for tha actual Ids until i fill them in. I have
the iteration
algorithm already written but this eems like it may be more efficient and
faster.

In my experience, it should be IMMENSELY faster. We were doing somewhat the
same thing a while ago, copying down a "library" of materials and other things.
Initially, we were selecting everything into a recordset, then inserting into
the local database. Once we changed the logic to select from one DB directly
into the other, it sped the process up over 100x if I remember correctly.
I will post my comparisons.

Please do! I'd be interested to see how it fares.

Matt
 
One other thing...I was using a DAO database object to connect to each
database (at most
10 databases) and then doing my processes by getting recordsets of each
database. However,
if i do my process as suggested this will not work as using the INSERT INTO
requires that the
database be connected externally but it will be tied up with the database
object. Or I can just
copy one table from each database to the master database. but I cant do that
via the DAO
database objects can I?

AGP
 
I think i stand corrected. Even though i have the databases opned
exclusively and as
read-only via a DAO database object, it must recognize this and let me do
inserts from
the databases using the following SQL

sSQL="INSERT INTO tblDrive (tempDiscID, fld2) SELECT DiscID, fld2 FROM
tblDrive IN 'dbpath';"

and then running it on the new database like so

dbNew.Execute sSQL

Is that thte way it's supposed to work. I didnt know that you could hvae the
db's open via a DAO db object
and then run this type of SQL on them. It seems that the IN describes an
"external" database but in this case
i have ownership of the db's and the DAO model must recognize this.

AGP
 
DIOS said:
I think i stand corrected. Even though i have the databases opned
exclusively and as
read-only via a DAO database object, it must recognize this and let me do
inserts from
the databases using the following SQL

Yeah, I dont know what's happening behind the scenes. I would theorize that
possibly because you are reading from the db that is opened by you read only,
but you aren't using that connection object, then it doesn't mind that someone
else (even if it's you) has it opened exclusively... It's not trying to CHANGE
that database, so maybe it doesn't care.

I guess you could check this by trying to open up another connection in a
different app to that database, non-exclusive and read only...

Don't know.

Matt
 
My results were astonishing. As YYZ commented the SQL version was way
faster.
Here are my results:

Test applied to merging 8 databases with three tables each. 4 medium sized
databases
and 4 small databases. Four runs were applied and the times were avearged.

Record Looping method: 410 seconds
SQL UPDATE method: 57 seconds

The record looping method had more status updates via a label but the
difference wa still
heavily favoring the SQL method.

Thanx for all the help guys.

AGP
 
DIOS said:
My results were astonishing. As YYZ commented the SQL version was way
faster.
Here are my results:

Test applied to merging 8 databases with three tables each. 4 medium sized
databases
and 4 small databases. Four runs were applied and the times were avearged.

Record Looping method: 410 seconds
SQL UPDATE method: 57 seconds

The record looping method had more status updates via a label but the
difference wa still
heavily favoring the SQL method.

Glad to hear it. Sometimes I think that it is more important to let the user
know what is happening rather than speeding a process up, but in this case those
results seem to be too dramatic to go back.

I would recommend 1 of 2 approaches. You may be able to still show realtime
statuses if you execute the queries asynchronously. I've never done it myself,
and it may not work out like I think it will, but I'd look into it. I tried to
do it for you, but my ADO help is trashed for some reason, and I don't have time
to look it up on MSDN right now.

Or, you can show an animation while it's going.
http://vbaccelerator.com/home/VB/Code/Controls/AVI_Player/ComCtl32_AVI_Player/ar
ticle.asp

I use that all the time in my current application.

Matt
 
If you put the SQL into queries inside of Access, I'm thinking that
it should get even faster... IE make the query in the new Access file,
and then call that query from the VB application, with an .Execute
if it's an action query, or with the .RecordSet object if it's a Select
query.

If the query inside of Access was named "qSelectAllRecords", then
you can open it with :

Set rs = gDAO_DB.OpenRecordset("qSelectAllRecords")

--
Jim Carlock
http://www.microcosmotalk.com/
Post replies to the newsgroup.


:
My results were astonishing. As YYZ commented the SQL version was way
faster.
Here are my results:

Test applied to merging 8 databases with three tables each. 4 medium sized
databases
and 4 small databases. Four runs were applied and the times were avearged.

Record Looping method: 410 seconds
SQL UPDATE method: 57 seconds

The record looping method had more status updates via a label but the
difference wa still
heavily favoring the SQL method.

Thanx for all the help guys.

AGP
 
Back
Top