Don't import data thats already been imported

  • Thread starter Thread starter BTU_needs_assistance_43
  • Start date Start date
B

BTU_needs_assistance_43

I've just finished some code designed to bring Excel files into my Access
Database and after it does it moves old files into a folder called "Processed
Reports". However it has come to my attention that some supervisors will add
data to their reports that is most current, but then go back as much as 2
months later to add new data from recent work to the old file instead of just
making a new report. What I need to do and am having trouble finding is a way
to import recently modified data (thats the easy part with
msoLastModifiedLastMonth) but I need it to do it without adding the old data
from those Excel files again, only the new data. Or if it is going to add the
old data again, delete the old data that was already stored in the Database
and add the new data to overwrite the old with the same data PLUS the new
data. The end result should be I have all the data stored in the Database
with no repeated data.

To import the data I'm using the Application.Filesearch command, then
defining the name of the worksheet within the reports I want the data from
and then having my code read the data by columns until theres no data in the
next column. Then it just dumps it all into my Database and moves onto the
next Excel file.

I assume that I'll have to write in code in the last phase to tell it to
check for identical records, but I'm having a heck of a time finding code to
do this. How do I tell Access to check for identical data, but not just one
line, for a section or column or row because all lines for each chunk will
have the same name in the first cell for each set of files?
 
Here's the thing, this is the system they've been using for years and I have
already floated changing it by them and didn't even get to finish my
sentence. We're talking crusty old men set in their way being the best. The
way they file reports isn't going to change so I must adapt to them. I was
playing around with the properties of a query and found Unique Values Yes/No
and its default status is no, but when I switch it to yes it makes it so that
in the query there is only one copy of the records with no repeats, although
the repeated value remains in the original table in the database.

This database imports lots of data, as many as sixty rows with over twenty
columns per file, and multiple files a day. Not so much on a given day but
this holds records going back over 6 years and will this program will be used
for years after I finish updating it. I don't know what the limit for
information in a table is but should I be worried that it will eventually max
out a table? And if that is the case I would like to keep it from replicating
records simply to keep the table from overflowing. If it was not going to
ever run out of space in my lifetime, then I suppose it wouldn't really
matter if there were hundreds of duplicate records because they'd get weeded
out by my queries.
 
How can you identify a record that is a duplicate? Are there specific fields
that in combination say - this is a duplicate record so don't import it?

If the number of fields that identifies a unique record is 10 or less, you can
use a compound unique index to keep any record that is a duplicate from being
added to the table.

To create a multiple field unique index (Compound index)
--Open up the table in design mode
--Select View: Index from the menu
--Enter Name for Index in first blank cell under Index Name
--Select one field in the index under Field Name
--Set Unique to Yes
--Move down one line and select the next FieldName
--Continue moving down and selecting fieldnames until all needed are included.
--Close the index window and close and save the table

Of course if the uniqueness of a record is determined by over 10 fields then
you need to come up with some other method.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
There is one consistant cell in each series of data, the name of the shot.

Shot Name Sta BCY LCY .... Date
owe123a 1 1000 1500 .... 1/1/09
owe123a 2 1100 1600 .... 1/1/09
owe123b 3 1050 1500 .... 2/5/09
owe123b 4 1150 1800 .... 2/5/09
owe123b 4 1150 1800 .... 2/5/09 <---------
Duplicate. Needs GONE
owe123c 5 1200 2000 .... 2/25/09
owe123c 6 900 1200 .... 2/25/09

B124a 1 2500 3400 .... 2/16/09
B124a 2 2400 3400 .... 2/16/09
....

Where the owe123a is the initial data. Then maybe right away or a month
later they will come back and add owe123b and maybe another month later
owe123c, depending on the date.

If the code detects that owe123a has two records with a value under "Sta" of
"1", thats when I need the record deleted. But I need it to ignore dates
getting repeated, names, and the large numbers too. There may be multiple
shots happening on a single day so date cant be a factor and the large
numbers may have similar values though that is EXTREMELY unlikely.

The main factors in determining if it is a duplicate record would be if a
Shot Name had more than one similar Sta value.
 
So it sounds as if the combination of Shot Name and Sta determines if a record
is unique.

So if you build a unique index based on Shot Name plus Sta you could keep the
duplicates out. The problem is that you have existing duplicates.

To strip out the duplicates I would do the following.
Make a backup of the database (recovery if something goes wrong)

Open the database in design mode

==Copy your current table
==Paste the table and when asked paste JUST the structure into a new table name
==Open the new table and add the index based on the two fields. See previous post
==Use an append query to add all the records from the current table to the new
table. You will get a warning that xx records cannot be added due to ...
The records that won't be added should be the duplicates (one will be added,
all others won't be)

==Rename the new table to the name of the old table. Answer yes to the
replace warning.

==Compact your database to recover the now unused space.

==Try importing one of your files and see what happens.
==Now try importing that same source file again and see what happens.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Yes the Shot Name and Sta are what make it unique. I actually dont have any
duplicate records yet because I have yet to impliment the program but I'm
trying to trouble shoot before I do so I don't end up damaging the existing
database and I can already see where this could be a major problem. I'd
prefer to just not have duplicates in my database at all BUT if thats not
reasonable for my level of expertise, I could live with leaving them in one
table and drawing them out with the Unique Values command in a data query.
It's just that I would prefer to keep the whole system as clean as possible,
IF possible.
 
Back
Top