Database Bloat


Sean Nijenhuis

Hi, thanks for your time.

I'm currently working on a database that imports a rather large single-field
text file (400,000 records, about 50 MB).

The file is a telephone accounts listing: number called, date and time,
duration, etc. However, the number/extension from which the calls are made is
listed only once, at the top of each "page".

What I'm doing is using ADO to look for that "first" line, identify the
telephone number in it, and then .Edit it into a field on each subsequent
record until the next "first" line. This works fine; however, the database
grows to 1.7 GB.

Now sure, compact and repair will reduce it down to 108 MB, but here's the
problem: I need to perform more calculations and identify more fields before
I can "leave" the database to do the compact and repair.

Also, this is only the test data (50 MB); the actual data this needs to be
performed on is 178 MB. I immediately hit the 2 GB limit.

1) Is there any "neater" way of identifying or analysing the records in order
to prevent the bloat?

2) Can an update query be written in such a way that it updates the current
record with data from the previous record that it has just updated?

Any help will be much appreciated!

Thanks
Sean
 
If nobody suggests a viable solution for you within Access, I suggest you
check out the Monarch for Windows software. I've used it many times to "trap"
data in the kind of head-of-page format you describe, and to output .txt or
.xls files containing the line-by-line data you need, which can then be
imported into Access without additional handling.
See the website: http://monarch.datawatch.com/
 
Try this:

Create a blank database.
Link to two tables in separate databases.

Import to one table.
Update by doing an append to the second table.

(If you know how to use code to link, make the links 'exclusive' rather than
'shared'.)

This will run slower, but it should take less space.
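
Picking up the "link via code" idea, a rough DAO sketch might look like the
following. The paths and table names are placeholders, and using the
dbAttachExclusive attribute is my reading of how to get an 'exclusive' rather
than 'shared' link, so treat this as a starting point only:

    ' Sketch only: paths and table names are placeholders.
    Sub LinkExclusiveAndAppend()
        Dim db As DAO.Database
        Dim tdf As DAO.TableDef

        Set db = CurrentDb

        ' Link to the output table held in the second database.
        Set tdf = db.CreateTableDef("tblCallsLinked")
        tdf.Connect = ";DATABASE=C:\data\calls_out.mdb"
        tdf.SourceTableName = "tblCalls"
        tdf.Attributes = dbAttachExclusive   ' exclusive rather than shared
        db.TableDefs.Append tdf

        ' Push the imported rows across with an append query.
        db.Execute "INSERT INTO tblCallsLinked SELECT * FROM tblImport", dbFailOnError
    End Sub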

You can create the tables and databases as temporary files if you wish.
It may run faster if you do the processing inside a single transaction.
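
For the transaction idea, a minimal DAO sketch (the actual record-by-record
processing is elided):

    Sub ProcessInsideTransaction()
        Dim wrk As DAO.Workspace
        Set wrk = DBEngine.Workspaces(0)

        On Error GoTo Bail
        wrk.BeginTrans
        ' ... record-by-record processing goes here ...
        wrk.CommitTrans
        Exit Sub

    Bail:
        wrk.Rollback
    End Sub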

I do text processing like this in VB, rather than using the database engine:
read line, process, write line, repeat.

(david)
 
It would seem that the initial import you create is a rather large mess.
You then have to process this mess record by record, creating even more
records.

It would seem to me that you should open the text file directly, and do your
import from the text file, parsing out each record.

I am sure that, like any developer, you have a nice grab bag of parsing
routines that strip out fields via some delimiter (all developers have a
bunch of string processing routines...right?).
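
If the grab bag is missing, VBA's built-in Split function (Access 2000 and
later) covers most of it. A small helper of the kind described might look
like this; the name and exact behaviour are just an example:

    ' Return the Nth field (1-based) from a delimited line, or "" if missing.
    Function NthField(strLine As String, strDelim As String, n As Integer) As String
        Dim parts() As String
        parts = Split(strLine, strDelim)
        If n >= 1 And n - 1 <= UBound(parts) Then
            NthField = Trim$(parts(n - 1))
        Else
            NthField = ""
        End If
    End Function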

Anyway, the way to read in a text file line by line directly is:

Dim F As Integer
Dim strFile As String
Dim strOneLine As String

strFile = "c:\data\test.txt"

F = FreeFile
Open strFile For Input As #F

Do While EOF(F) = False
    Line Input #F, strOneLine
    ' process the one line here
Loop

Close #F


As mentioned, if you have some nice routines to work with delimiters etc.,
then parsing out each line can work very well indeed.
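
To tie that back to the original head-of-page problem, the processing inside
the loop could spot the header line, remember the extension, and append one
record per detail line. Everything below about the file layout (the
"Extension:" test, the field names, the table name) is a guess and will need
adjusting to the real report:

    Sub ImportCallLog()
        Dim F As Integer
        Dim strOneLine As String
        Dim strExtension As String
        Dim db As DAO.Database
        Dim rs As DAO.Recordset

        Set db = CurrentDb
        Set rs = db.OpenRecordset("tblCalls", dbOpenDynaset)

        F = FreeFile
        Open "c:\data\test.txt" For Input As #F

        Do While EOF(F) = False
            Line Input #F, strOneLine

            If Left$(strOneLine, 10) = "Extension:" Then
                ' "First" line of a page: remember the extension.
                strExtension = Trim$(Mid$(strOneLine, 11))
            ElseIf Len(Trim$(strOneLine)) > 0 Then
                ' Detail line: one record, with the extension carried forward.
                rs.AddNew
                rs!Extension = strExtension
                rs!RawLine = strOneLine
                rs.Update
            End If
        Loop

        Close #F
        rs.Close
    End Sub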

Another trick, before you do the above, is to read in the file line by line,
strip out the junk lines, and then write the text file back out to ANOTHER
text file. The result is a nice, clean, importable file, and you don't have
to do the text-parsing part yourself, as the ms-access import will handle
things quite well once you strip out all the junk lines from the text file.
So I would probably consider pre-processing the txt file BEFORE you do the
import.
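
A rough sketch of that pre-processing pass; the IsJunkLine test is a
placeholder for whatever actually identifies the junk lines in the real
report:

    Sub CleanTextFile()
        Dim fIn As Integer, fOut As Integer
        Dim strOneLine As String

        fIn = FreeFile
        Open "c:\data\raw.txt" For Input As #fIn
        fOut = FreeFile
        Open "c:\data\clean.txt" For Output As #fOut

        Do While EOF(fIn) = False
            Line Input #fIn, strOneLine
            If Not IsJunkLine(strOneLine) Then
                Print #fOut, strOneLine
            End If
        Loop

        Close #fOut
        Close #fIn
    End Sub

    ' Placeholder: treat blank lines and page headers as junk.
    Function IsJunkLine(strLine As String) As Boolean
        IsJunkLine = (Len(Trim$(strLine)) = 0) Or (InStr(strLine, "Page ") = 1)
    End Function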
 