repeated imports

  • Thread starter Thread starter quaddawg via AccessMonster.com
  • Start date Start date
Q

quaddawg via AccessMonster.com

This question is largely about basic db design--it could probably go in New
Users, but it does involve importing, so here goes:

I've inherited a database in which the designer would import from Excel
worksheets test score data from over a hundred schools with thousands of
students. This would be a one shot deal in that he would create a single
table with the student info and test results and run reports.

I want to expand the database to track students' progress by creating a one
to many relationship between newly created tblStudent and tblTests tables:
when the new batch of scores comes in, I would like to import the results as
new records in tblTests (as well as any new students into tblStudent and
their accompanying results in tblTests).
What is the best (or any, really) way to import this data, associating it
with the existing records? The Excel worksheets have all the data except the
autonumber primary key. How do I go about setting a primary (or foreign) key
and adding these records?

Thanks in advance for your insight
 
For automatic importing of new results for existing students,
tblStudents and the Excel sheet must have fields in common that
identify, with certainty, which row of the worksheet relates to which
existing Student record.

Your "autonumber primary key" is useless for this because it only exists
in tblStudents, so you'll need to look for another field or combination
of fields. Do the Excel sheets include some kind of unique StudentID
assigned by the education system? If so, and if the schools can be
trusted to enter it correctly, that might do the job.

Otherwise, you'll need to use a compound key. This can be really
difficult because it's quite possible for a school to have two students
with the same name and even the same date of birth. In principle you can
add more fields to the key - e.g. mother's maiden name and the student's
home address - but (a) this information may not be included in the Excel
sheets, and (b) home addresses can and do change.

A further complication is that if the Excel data was entered by hand
there will be typing mistakes and inconsistencies in the key fields, so
the data will need to be inspected and cleaned up before reliable
matching is possible.

Once you've got all that sorted out, it's quite simple to create

1) an append query that adds a record to tblStudents for each new
Student

2) an append query that adds new records to tblTests.
 
Unfortunately, there is no supplied unique key, and I won't have any say on
how the excel data comes to me (I don't think I would trust it anyway). I've
considered a compound key, but I've run into records even missing last names!
I suppose I'll have to do a lot of massaging and come up with the right key,
and then do the append queries.
Thanks

John said:
For automatic importing of new results for existing students,
tblStudents and the Excel sheet must have fields in common that
identify, with certainty, which row of the worksheet relates to which
existing Student record.

Your "autonumber primary key" is useless for this because it only exists
in tblStudents, so you'll need to look for another field or combination
of fields. Do the Excel sheets include some kind of unique StudentID
assigned by the education system? If so, and if the schools can be
trusted to enter it correctly, that might do the job.

Otherwise, you'll need to use a compound key. This can be really
difficult because it's quite possible for a school to have two students
with the same name and even the same date of birth. In principle you can
add more fields to the key - e.g. mother's maiden name and the student's
home address - but (a) this information may not be included in the Excel
sheets, and (b) home addresses can and do change.

A further complication is that if the Excel data was entered by hand
there will be typing mistakes and inconsistencies in the key fields, so
the data will need to be inspected and cleaned up before reliable
matching is possible.

Once you've got all that sorted out, it's quite simple to create

1) an append query that adds a record to tblStudents for each new
Student

2) an append query that adds new records to tblTests.
This question is largely about basic db design--it could probably go in New
Users, but it does involve importing, so here goes:
[quoted text clipped - 15 lines]
Thanks in advance for your insight
 
Real life is a nuisance, isn't it<g>. One approach might be along these
lines:

1) import from Excel into a "staging" table which includes all the Excel
fields plus your autonumber Student ID

2) use a query joining this to your existing data (on, say first name, last
name, school, and any other sensible fields) to identify records in the
staging table that almost certainly correspond to students already in the
database and update the Student ID field in the staging table accordingly

3) use more queries joining the staging table and tblStudents on useful
combinations of fields to identify records in the staging table that may
correspond to existing Student records, and update the Student Id field in
the staging table manually as you identify each one.

At each stage, records in the staging table that have no Student ID are ones
that have yet to be processed. Eventually you'll be left with (a) records
you're sure relate to new students and (b) records that you can't place.
Append (a) to tblStudents and update the staging table student ID values.
(b) you can either research or ignore.

4) B|y this point you have reconciled the Excel data adn tblStudents and are
ready to run another query to append the test restult.

quaddawg via AccessMonster.com said:
Unfortunately, there is no supplied unique key, and I won't have any say
on
how the excel data comes to me (I don't think I would trust it anyway).
I've
considered a compound key, but I've run into records even missing last
names!
I suppose I'll have to do a lot of massaging and come up with the right
key,
and then do the append queries.
Thanks

John said:
For automatic importing of new results for existing students,
tblStudents and the Excel sheet must have fields in common that
identify, with certainty, which row of the worksheet relates to which
existing Student record.

Your "autonumber primary key" is useless for this because it only exists
in tblStudents, so you'll need to look for another field or combination
of fields. Do the Excel sheets include some kind of unique StudentID
assigned by the education system? If so, and if the schools can be
trusted to enter it correctly, that might do the job.

Otherwise, you'll need to use a compound key. This can be really
difficult because it's quite possible for a school to have two students
with the same name and even the same date of birth. In principle you can
add more fields to the key - e.g. mother's maiden name and the student's
home address - but (a) this information may not be included in the Excel
sheets, and (b) home addresses can and do change.

A further complication is that if the Excel data was entered by hand
there will be typing mistakes and inconsistencies in the key fields, so
the data will need to be inspected and cleaned up before reliable
matching is possible.

Once you've got all that sorted out, it's quite simple to create

1) an append query that adds a record to tblStudents for each new
Student

2) an append query that adds new records to tblTests.
This question is largely about basic db design--it could probably go in
New
Users, but it does involve importing, so here goes:
[quoted text clipped - 15 lines]
Thanks in advance for your insight
 
Back
Top