Best way to delete duplcate records in a table.

  • Thread starter Thread starter Ghost
  • Start date Start date
G

Ghost

I have a table that has duplicate records in it. I ran a duplicate query
that returned all duplicates.

What is the best way to delete the duplicates since you apprantly can not do
so while in the query?

Any responses are greatly appreciated.
 
Ghost,
Without any specific information about your table/fields, that would be
hard to say.
Give us a bit of info about the relevant fields in your table, and... how
you would determine that a record is a duplicate or not. Then we might be
able to find some logic (if possible) to assist in their removal.

As a very general rule though, dupes uaually can't be found
"automagically"...

John V. Smith 37 Main Street....
and
John V. Smith 37 Main St....
and
John Smith 37 Main Street.....

None of these records would "logically" qualify as a duplicate.
Unfortunately, this is where the human brain has to be applied, to make that
decision.
So finding dupes is usually human manual comparing and editing.
AND...
It is better to try to avoid duplicates up front, at user entry time,
rather than hunt them down after the fact.
Something like...
You just created a new record with LastName = "Smith", and FirstName =
"John". If, at that point, you show the user all the records you have
already, with Smith and John would help them decide if the new record is a
dupe or not.
Of course, this is not perfect, and is used an example of how...
using your own logic... you might be able prevent dupe entries for at new
records.
--
hth
Al Campagna
Microsoft Access MVP
http://home.comcast.net/~cccsolutions/index.html
"Find a job that you love... and you'll never work a day in your life."
 
A lot depends on what your data looks like, and what you are using to
determine "duplicate records". Personally, I put an autonumber ID field in
every table so that I can uniquely identify each record. Then I put
composite indices on those fields that should combine to create a unique
index.

1. If you are using all the fields in your table to identify duplicates (no
memo fields please), then I would just create a maketable query by grouping
on all of the fields. Create a new table, then delete the old one and rename
the new one.

2. If you are using a subset of your fields to determine duplicates, then
the issue becomes how do you know which record to delete, because the
remaining fields may (probably) contain different information. As long as
you are not joining to another table, the "Find duplicates" wizard should
generate a query that is updateable, which would allow you to review and
delete the "duplicate" record.

HTH
Dale
 
Al:

Thanks for reply. Sorry for incompleteness.

I am comparing the following fields for duplications:

LastName; FirstName; StreetAddress; HomePhoneNumber.

As noted, I ran a duplicate records query that return all duplicates:
however, I want to globally delete them. The duplicates came from an
imported table.

Any further comments most welcomed.

Ghost
 
Dale:

See my comment just posted.
Dale Fye said:
A lot depends on what your data looks like, and what you are using to
determine "duplicate records". Personally, I put an autonumber ID field
in
every table so that I can uniquely identify each record. Then I put
composite indices on those fields that should combine to create a unique
index.

1. If you are using all the fields in your table to identify duplicates
(no
memo fields please), then I would just create a maketable query by
grouping
on all of the fields. Create a new table, then delete the old one and
rename
the new one.

2. If you are using a subset of your fields to determine duplicates, then
the issue becomes how do you know which record to delete, because the
remaining fields may (probably) contain different information. As long as
you are not joining to another table, the "Find duplicates" wizard should
generate a query that is updateable, which would allow you to review and
delete the "duplicate" record.

HTH
Dale
 
Ghost,

Don't see the post you are referring to. Is it supposed to be in the
Formscoding forum, or somewhere else?

Dale
 
Ghost,
I don't have any experience with delete duplicates queries, but
let me just say how I'd start out.
Caveat... trying to "automagically" delete dupes is dangerous
business, so First...back up your data, and Secondly... back it up again.
:-D
I'll also set aside the legitimate arguments against tryting to delete
dupes programmatically.

First, I'd build a query that... by whatever logic you use... identifies
"dupes" I'm thinking a Totals query. Your table should have an
incrementing unique keyfield, such as an autonumber, or I often use DOC
(DateOfCreation). This is so you can determine the oldest of the two
records.
So, we could use Count on any field ( = 2 or greater, must be a dupe),
and use Max on your key field.
Raw data...
RecID SomeData MoreData
1234 Bob Smith
1324 Bob Smith

After query....
RecID SomeData MoreData Count
1324 Bob Smith 2

Using this query as a pass-thru you would use RecID 1324 to find and
delete that record from the original table.
--
hth
Al Campagna
Microsoft Access MVP
http://home.comcast.net/~cccsolutions/index.html
"Find a job that you love... and you'll never work a day in your life."
 
Hi,

From Al’s data sample, I want “John Smith†but not “John V. Smithâ€.

Delete * From Table1 Where FirstName Not In (Select col1,col2,…From Table1
Where FirstName = ‘John’ AND LastName = ‘Smith’)

It is better to use the ID, if you know the ID that you don’t want to delete
than,

DELETE * FROM Table1 WHERE ID NOT IN (Select col1,col2,…From Table1 Where ID
= nnnn1)

Use the subquery and a IN clause, where in this case I use the NOT IN to omit
the record which I don’t want to delete.
 
Hi Dave,

Bugs? My post did not appear in the forum.

From the original poster...
"Best way to delete duplcate records in a table. 10/11/2007 7:06 AM PST"

From Server....
"Subject: Best way to delete duplcate records in a table. 10/12/2007 1:07
AM PST

By: unknown "
Doug,
How do you respond without the post to which you are responding being
included?
[quoted text clipped - 27 lines]
 
Back
Top