Automating Excel Flat Imports into a Normalized Database

  • Thread starter Thread starter bcd
  • Start date Start date
B

bcd

I receive several excel spreadsheets a day that I need to import into
a normalized database. The spreadsheets have a different file name
everytime they are emailed to me, but the data is consistant. The
data is in a flat file format though, and the information that is
included (shipping info, user info, order number) are all broken down
into their respective normalized tables.

These emailed spreadsheets are the primary source of data for my
database.

I'm looking for advice for the best way to to do the following:

1. Automate the import of the spreadsheets.
2. Automate the appends into the multiple tables (I'm not even sure if
I have worked out the kinks doing this unautomated).

I've spent a decent amount of time reading this newsgroup trying to
find a good solution, but I'm a little stumped since all of my data is
driven by these imported spreadsheets. I feel like I could get my
arms around it if it was just a one time thing. To make matters
worse, I'm not 100% sure how I can maintain the integrety of the data
since I will get many of these "orders" with the same user and ship to
info. If there were a way to make this form driven I would, but the
spreadsheet method is really the only way for now. To be honest, I've
almost given up on using a normalized database and am going down the
path of just using a primarily flat file structure. I know that
almost seems criminal so I figured I would ask the question.

Thanks for any help.

BCD
 
Hi BCD,

In principle what you do is

a) import or link the worksheet to temporary Access table (let's call it
T).

b) run a query on T that extracts (say) all unique values from the
shipping information fields, compares them with what's already in the
existing shipping info table, and appends any necesssary new records.

c) run another query on T that pulls out all the unique values from the
user information fields and appends any new users to the users table.

d) ... and so on.

The queries must be run in the right order (e.g. users before orders,
orders before order detail lines) or the relation integrity constraints
in your database will prevent records being added.

In practice the biggest problem is in ensuring that the data in Excel
truly is consistent. It's difficult enough to ensure consistency at the
level of only having numbers in columns that should be imported into
number fields (there are still typists who can't tell the difference
between 0 and O); it's almost impossible to prevent typing mistakes and
variant forms of names (e.g. if one order is to be shipped to "97 Acacia
Ave" and another to "97, Acacia Avenue").

And if the data isn't consistent, it's somewhere between hellishly
difficult and totally impossible to import it automatically to an
acceptable standard.
 
Back
Top