quick way to remove duplicates

G

Guest

I need a quick way to remove duplicates that are found with the Duplicate
Query. I have been using the "select each record and delete" method. There
has to be a quicker and easier way. Please help
 
R

Randy

Cheyenne said:
I need a quick way to remove duplicates that are found with the Duplicate
Query. I have been using the "select each record and delete" method.
There
has to be a quicker and easier way. Please help

Cheyenne, Have you tried to create a Delete query based on a Duplicate
Sub-Query?

-Randy
 
R

Randy

Cheyenne said:
No I haven't. Could you please give instructions on how to do this?
Thanks.

Cheyenne, first you need a saved Find Duplicates Query (which you already
have according to your original post). Then you need to create a new Query
to delete the results of the previous Query, below the steps:

1) Create a new Query on "Design View." 2) A little window called "Show
Table" comes up. Select the record source for the query, by clicking on the
"Queries" tab of that window and selecting the name of the existent query
you use to find the duplicates.3) Click "Add" and then "Close" buttons. 4)
Then double-click the asterisk "*" on the tiny window. This will include all
the fields of your Find Duplicates query. You will see how a new column is
added to the grid. 5) Then go to the menu on the top and click "Delete
Query" under the "Query" Menu Category. 6) On the top menu go to Save (under
File category). 7) A new window comes up for you to type the name of this
query, which you may want to call "Delete XXXXX Duplicates".

That's it. Just double-click the query to automatically delete the duplicate
records.

Alternative, you may use the "Select All Records" (under Edit category) on
the top menu, and the delete button, to preview and delete the records using
just the Find Duplicates query.

-Randy
 
G

Guest

I tried both methods and it doesn't allow me to delete at all. When I open
the Find Duplicates Query and select all records I can't delete because the
delete option on the toolbar is not available or "grayed out". I also tried
creating a new query in Design View. I followed all the steps that you gave
me but when I try to run the Delete Query, I get the error message "Could Not
Delete From Specified Tables".
I don't know what is causing this because the original table does not
contain any indexes or primary keys or any imported links or anything. Any
suggestions? Thanks.
 
R

Randy

Cheyenne said:
I tried both methods and it doesn't allow me to delete at all. When I open
the Find Duplicates Query and select all records I can't delete because
the
delete option on the toolbar is not available or "grayed out". I also
tried
creating a new query in Design View. I followed all the steps that you
gave
me but when I try to run the Delete Query, I get the error message "Could
Not
Delete From Specified Tables".
I don't know what is causing this because the original table does not
contain any indexes or primary keys or any imported links or anything.
Any
suggestions? Thanks.

Cheyenne, You said you were deleting records one by one? So that was
manually, from the table itself, by eye? We should target issues one at the
time. Did you have any problems before anything I recommended? If you cannot
delete records from the SubQuery (Find Duplicates) directly, you cannot
delete them from an upper lever like from the Delete Query. So, How many
tables are involved here?

-Randy
 
G

Guest

Yes. After importing new data to my table at the end of the week, I would
run a Find Duplicate Query to locate the duplicate data for deletion. I
would note the data found in the results and go back to the original table
and find each duplicate record and delete directly from the original table.
As this takes a lot of time, I was looking for an easier way to delete the
duplicate data. I followed your suggestions, but got the error messages that
I described in my earlier post. I was wondering if you had any suggestions
 
R

Randy

Cheyenne said:
Yes. After importing new data to my table at the end of the week, I would
run a Find Duplicate Query to locate the duplicate data for deletion. I
would note the data found in the results and go back to the original table
and find each duplicate record and delete directly from the original
table.
As this takes a lot of time, I was looking for an easier way to delete the
duplicate data. I followed your suggestions, but got the error messages
that
I described in my earlier post. I was wondering if you had any
suggestions
on how to help resolve. Thanks.

Cheyenne, Open your Find Duplicates query in design mode. On the top menu
click on "Properties" from the "View" category menu. A window will come up.
Check on the property "Recordset Type" and chage its value to "Dynaset".
From the File menu click on "Save" and then on "Close". Run the Find
Duplicates Query and see if you can now delete records from that query.

-Randy
 
R

Randy

Cheyenne said:
Nope. Sorry Randy, that doesn't work either. It was already set to
Dynaset.
Thanks

Cheyenne, Something is very wrong. It is on your side. I will need you to
send me the Find Duplicates query code. To view the code open the query in
design mode and click on "SQL View" from the "View" category on the top
menu. Please copy that code and send it for reviewing.

-Randy
 
G

Guest

Here's the code. Hope you can help. Thanks.

SELECT First(MDV.[ACCOUNT NUMBER]) AS [ACCOUNT NUMBER Field],
Count(MDV.[ACCOUNT NUMBER]) AS NumberOfDups
FROM MDV
GROUP BY MDV.[ACCOUNT NUMBER]
HAVING (((Count(MDV.[ACCOUNT NUMBER]))>1));
 
R

Randy

Cheyenne said:
Here's the code. Hope you can help. Thanks.

SELECT First(MDV.[ACCOUNT NUMBER]) AS [ACCOUNT NUMBER Field],
Count(MDV.[ACCOUNT NUMBER]) AS NumberOfDups
FROM MDV
GROUP BY MDV.[ACCOUNT NUMBER]
HAVING (((Count(MDV.[ACCOUNT NUMBER]))>1));

Cheyenne, This code does not conform with a valid Access Find Duplicate
standard query. Please recreate a query using the Find Duplicate Query
Wizard. When prompted specify the MDV table. Then specify ACCOUNT NUMBER as
the field that may contains duplicates. Then select other fields you would
want to see reported. Then type a name for the query (like "Find Duplicate
Accounts"). Click on Finish.

Run the Query and see if you can edit/or delete the record(s).

-Randy
 
D

Dave

I think there are two issues here:

1. When you try to create a Deletion query from a select query that joins
the basic table with a Find Duplicates query based on the same table, the
deletion will not run. You will also see that the basic query (the Select
query that joins the table and the find-duplicates query) is not updateable.

2. When you do perform the deletion, you probably don't want to delete all
the records that are duplicated, only some of them; presumably you want to
leave one copy.

You were on the right track by creating a First Of query for the duplicates,
but in order to get the deletion done, you need to use a sub-query, and you
will need to add a primary key to your basic table. I think most folks here
would recommend that you ALWAYS use an Autonumber primary key in your
tables, even if you don't see an obvious use for it when you start out --
I've never regretted it.

Try this (I'm translating from queries that work for me to SQL statements
that I think use your table and field names, so beware of typos: the idea
works)

0. Back up your data.

1. Add an Autonumber primary key field to your table (the table is "MDV",
right? So let's say you name it MDV_ID)

2. Create a Find Duplicates query to find duplicates of [Account Number]
in MDV. It should look like this:

SELECT MDV_ID, [AccountNumber]
FROM MDV
WHERE ((([AccountNumber]) In (SELECT [AccountNumber] FROM [MDV] As Tmp
GROUP BY [AccountNumber] HAVING Count(*)>1 )))
ORDER BY [AccountNumber];

Access will give this query a name like "Find Duplicates for MDV" or
something similar

3. Create a new query based on the find duplicates query, grouping on
Account Number and finding the First MDV_ID. It should look something like
this:

SELECT [Account Number], First([MDV_ID]) AS FirstOfMDV_ID
FROM [Find duplicates for MDV]
GROUP BY [Account Number];

Save this query as qryFirstAcctDupes

4. Now (and finally!) create a deletion query for your MDV table by
dragging the asterisk into the first column and changing the query type to
delete. Then drag the MDV_ID field to the second column, and in the
criterion row type:

In (Select FirstOfMDV_ID From qryFirstAcctDupes)

The complete query SQL should look something like this:

DELETE *
FROM MDV
WHERE (((MDV_ID) In (Select [FirstOfMDV_ID] From qryFirstAcctDupes)));

When you click the Run button, this query should delete some presumably
large number of records, but if you have situations where there were more
than two records with the same Account Number, you will need to run the
query again, maybe several times, to get rid of all of them. Eventually you
will find that the query reports that it is about to delete 0 records, and
then you're done!

HTH
 
R

Randy

Randy said:
Cheyenne, Have you tried to create a Delete query based on a Duplicate
Sub-Query?

-Randy

Cheyenne, as Dave pointed out, there is an issue to my first sugestion of
using a Delete Query based on Find Duplicates Query, since it may ended up
deleting even the original records. I missed that detail. Below is the
shortest, easiest and fastest way to solve your "remove dupicates" issue. It
deletes one (1) or more duplicates if present all at once (one run).

DELETE MDV.ID
FROM MDV
WHERE (((MDV.ID)
NOT IN (SELECT Min(MDV.ID) AS OldestID
FROM MDV GROUP BY MDV.[ACCOUNT NUMBER];)));

INSTRUCTIONS:
Create a new query in "Design Mode." Close the "Show Table" window. Switch
to SQL View by clicking on "SQL View" from the "View" category menu. Delete
any text already present on that window and paste there the code above using
copy & paste. Save the query by clicking on "Save" from the "File" category
menu. Name the query "qryDedupAccounts". Close the query and run it.

HOW IT WORKS:
The query gets all unique first-created accounts. It does that by retrieving
the lowest (oldest) ID of records with the same account number. Then the
query inverts that selection, resulting in all the other records, which are
the ones that were later appended duplicated accounts. That final selection
is deleted.

REQUIREMENTS
Your accounts table (MDV) requires a Primary Key field (autonumber and
no-duplicates).

-Randy
 
G

Guest

thanks Dave. I'll try it now and see what happens. thanks.

Dave said:
I think there are two issues here:

1. When you try to create a Deletion query from a select query that joins
the basic table with a Find Duplicates query based on the same table, the
deletion will not run. You will also see that the basic query (the Select
query that joins the table and the find-duplicates query) is not updateable.

2. When you do perform the deletion, you probably don't want to delete all
the records that are duplicated, only some of them; presumably you want to
leave one copy.

You were on the right track by creating a First Of query for the duplicates,
but in order to get the deletion done, you need to use a sub-query, and you
will need to add a primary key to your basic table. I think most folks here
would recommend that you ALWAYS use an Autonumber primary key in your
tables, even if you don't see an obvious use for it when you start out --
I've never regretted it.

Try this (I'm translating from queries that work for me to SQL statements
that I think use your table and field names, so beware of typos: the idea
works)

0. Back up your data.

1. Add an Autonumber primary key field to your table (the table is "MDV",
right? So let's say you name it MDV_ID)

2. Create a Find Duplicates query to find duplicates of [Account Number]
in MDV. It should look like this:

SELECT MDV_ID, [AccountNumber]
FROM MDV
WHERE ((([AccountNumber]) In (SELECT [AccountNumber] FROM [MDV] As Tmp
GROUP BY [AccountNumber] HAVING Count(*)>1 )))
ORDER BY [AccountNumber];

Access will give this query a name like "Find Duplicates for MDV" or
something similar

3. Create a new query based on the find duplicates query, grouping on
Account Number and finding the First MDV_ID. It should look something like
this:

SELECT [Account Number], First([MDV_ID]) AS FirstOfMDV_ID
FROM [Find duplicates for MDV]
GROUP BY [Account Number];

Save this query as qryFirstAcctDupes

4. Now (and finally!) create a deletion query for your MDV table by
dragging the asterisk into the first column and changing the query type to
delete. Then drag the MDV_ID field to the second column, and in the
criterion row type:

In (Select FirstOfMDV_ID From qryFirstAcctDupes)

The complete query SQL should look something like this:

DELETE *
FROM MDV
WHERE (((MDV_ID) In (Select [FirstOfMDV_ID] From qryFirstAcctDupes)));

When you click the Run button, this query should delete some presumably
large number of records, but if you have situations where there were more
than two records with the same Account Number, you will need to run the
query again, maybe several times, to get rid of all of them. Eventually you
will find that the query reports that it is about to delete 0 records, and
then you're done!

HTH

Cheyenne said:
No I haven't. Could you please give instructions on how to do this? Thanks.
 
G

Guest

ok, sorry. I've been out of the office. i'll try it and i'll check back in
to let you know what happened. thanks


Randy said:
Randy said:
Cheyenne, Have you tried to create a Delete query based on a Duplicate
Sub-Query?

-Randy

Cheyenne, as Dave pointed out, there is an issue to my first sugestion of
using a Delete Query based on Find Duplicates Query, since it may ended up
deleting even the original records. I missed that detail. Below is the
shortest, easiest and fastest way to solve your "remove dupicates" issue. It
deletes one (1) or more duplicates if present all at once (one run).

DELETE MDV.ID
FROM MDV
WHERE (((MDV.ID)
NOT IN (SELECT Min(MDV.ID) AS OldestID
FROM MDV GROUP BY MDV.[ACCOUNT NUMBER];)));

INSTRUCTIONS:
Create a new query in "Design Mode." Close the "Show Table" window. Switch
to SQL View by clicking on "SQL View" from the "View" category menu. Delete
any text already present on that window and paste there the code above using
copy & paste. Save the query by clicking on "Save" from the "File" category
menu. Name the query "qryDedupAccounts". Close the query and run it.

HOW IT WORKS:
The query gets all unique first-created accounts. It does that by retrieving
the lowest (oldest) ID of records with the same account number. Then the
query inverts that selection, resulting in all the other records, which are
the ones that were later appended duplicated accounts. That final selection
is deleted.

REQUIREMENTS
Your accounts table (MDV) requires a Primary Key field (autonumber and
no-duplicates).

-Randy
 
G

Guest

Dave, I tried to follow the steps that you gave me but after running the find
duplicates query I get a code that looks like this and not like the one that
you posted.

SELECT First([ACCOUNT NUMBER]) AS [ACCOUNT NUMBER Field], Count([ACCOUNT
NUMBER]) AS NumberOfDups
FROM MDV
GROUP BY [ACCOUNT NUMBER]
HAVING Count([ACCOUNT NUMBER])>1;

I am running ACCESS 2000 could this be the reason why the codes are not
similiar?


Dave said:
I think there are two issues here:

1. When you try to create a Deletion query from a select query that joins
the basic table with a Find Duplicates query based on the same table, the
deletion will not run. You will also see that the basic query (the Select
query that joins the table and the find-duplicates query) is not updateable.

2. When you do perform the deletion, you probably don't want to delete all
the records that are duplicated, only some of them; presumably you want to
leave one copy.

You were on the right track by creating a First Of query for the duplicates,
but in order to get the deletion done, you need to use a sub-query, and you
will need to add a primary key to your basic table. I think most folks here
would recommend that you ALWAYS use an Autonumber primary key in your
tables, even if you don't see an obvious use for it when you start out --
I've never regretted it.

Try this (I'm translating from queries that work for me to SQL statements
that I think use your table and field names, so beware of typos: the idea
works)

0. Back up your data.

1. Add an Autonumber primary key field to your table (the table is "MDV",
right? So let's say you name it MDV_ID)

2. Create a Find Duplicates query to find duplicates of [Account Number]
in MDV. It should look like this:

SELECT MDV_ID, [AccountNumber]
FROM MDV
WHERE ((([AccountNumber]) In (SELECT [AccountNumber] FROM [MDV] As Tmp
GROUP BY [AccountNumber] HAVING Count(*)>1 )))
ORDER BY [AccountNumber];

Access will give this query a name like "Find Duplicates for MDV" or
something similar

3. Create a new query based on the find duplicates query, grouping on
Account Number and finding the First MDV_ID. It should look something like
this:

SELECT [Account Number], First([MDV_ID]) AS FirstOfMDV_ID
FROM [Find duplicates for MDV]
GROUP BY [Account Number];

Save this query as qryFirstAcctDupes

4. Now (and finally!) create a deletion query for your MDV table by
dragging the asterisk into the first column and changing the query type to
delete. Then drag the MDV_ID field to the second column, and in the
criterion row type:

In (Select FirstOfMDV_ID From qryFirstAcctDupes)

The complete query SQL should look something like this:

DELETE *
FROM MDV
WHERE (((MDV_ID) In (Select [FirstOfMDV_ID] From qryFirstAcctDupes)));

When you click the Run button, this query should delete some presumably
large number of records, but if you have situations where there were more
than two records with the same Account Number, you will need to run the
query again, maybe several times, to get rid of all of them. Eventually you
will find that the query reports that it is about to delete 0 records, and
then you're done!

HTH

Cheyenne said:
No I haven't. Could you please give instructions on how to do this? Thanks.
 
G

Guest

Wow! Randy, it finally worked!! I followed your last suggestion and it
worked like a charm. I got rid of all the duplicates and it was as simple as
pie. Thanks a bunch for all your help and info. What a relief. Thanks.

Randy said:
Randy said:
Cheyenne, Have you tried to create a Delete query based on a Duplicate
Sub-Query?

-Randy

Cheyenne, as Dave pointed out, there is an issue to my first sugestion of
using a Delete Query based on Find Duplicates Query, since it may ended up
deleting even the original records. I missed that detail. Below is the
shortest, easiest and fastest way to solve your "remove dupicates" issue. It
deletes one (1) or more duplicates if present all at once (one run).

DELETE MDV.ID
FROM MDV
WHERE (((MDV.ID)
NOT IN (SELECT Min(MDV.ID) AS OldestID
FROM MDV GROUP BY MDV.[ACCOUNT NUMBER];)));

INSTRUCTIONS:
Create a new query in "Design Mode." Close the "Show Table" window. Switch
to SQL View by clicking on "SQL View" from the "View" category menu. Delete
any text already present on that window and paste there the code above using
copy & paste. Save the query by clicking on "Save" from the "File" category
menu. Name the query "qryDedupAccounts". Close the query and run it.

HOW IT WORKS:
The query gets all unique first-created accounts. It does that by retrieving
the lowest (oldest) ID of records with the same account number. Then the
query inverts that selection, resulting in all the other records, which are
the ones that were later appended duplicated accounts. That final selection
is deleted.

REQUIREMENTS
Your accounts table (MDV) requires a Primary Key field (autonumber and
no-duplicates).

-Randy
 
D

Dave

Glad you got it done, and Randy's revised scheme is much more elegant
anyway, so I won't go further with the one I suggested.

Cheyenne said:
Dave, I tried to follow the steps that you gave me but after running the find
duplicates query I get a code that looks like this and not like the one that
you posted.

SELECT First([ACCOUNT NUMBER]) AS [ACCOUNT NUMBER Field], Count([ACCOUNT
NUMBER]) AS NumberOfDups
FROM MDV
GROUP BY [ACCOUNT NUMBER]
HAVING Count([ACCOUNT NUMBER])>1;

I am running ACCESS 2000 could this be the reason why the codes are not
similiar?


Dave said:
I think there are two issues here:

1. When you try to create a Deletion query from a select query that joins
the basic table with a Find Duplicates query based on the same table, the
deletion will not run. You will also see that the basic query (the Select
query that joins the table and the find-duplicates query) is not updateable.

2. When you do perform the deletion, you probably don't want to delete all
the records that are duplicated, only some of them; presumably you want to
leave one copy.

You were on the right track by creating a First Of query for the duplicates,
but in order to get the deletion done, you need to use a sub-query, and you
will need to add a primary key to your basic table. I think most folks here
would recommend that you ALWAYS use an Autonumber primary key in your
tables, even if you don't see an obvious use for it when you start out --
I've never regretted it.

Try this (I'm translating from queries that work for me to SQL statements
that I think use your table and field names, so beware of typos: the idea
works)

0. Back up your data.

1. Add an Autonumber primary key field to your table (the table is "MDV",
right? So let's say you name it MDV_ID)

2. Create a Find Duplicates query to find duplicates of [Account Number]
in MDV. It should look like this:

SELECT MDV_ID, [AccountNumber]
FROM MDV
WHERE ((([AccountNumber]) In (SELECT [AccountNumber] FROM [MDV] As Tmp
GROUP BY [AccountNumber] HAVING Count(*)>1 )))
ORDER BY [AccountNumber];

Access will give this query a name like "Find Duplicates for MDV" or
something similar

3. Create a new query based on the find duplicates query, grouping on
Account Number and finding the First MDV_ID. It should look something like
this:

SELECT [Account Number], First([MDV_ID]) AS FirstOfMDV_ID
FROM [Find duplicates for MDV]
GROUP BY [Account Number];

Save this query as qryFirstAcctDupes

4. Now (and finally!) create a deletion query for your MDV table by
dragging the asterisk into the first column and changing the query type to
delete. Then drag the MDV_ID field to the second column, and in the
criterion row type:

In (Select FirstOfMDV_ID From qryFirstAcctDupes)

The complete query SQL should look something like this:

DELETE *
FROM MDV
WHERE (((MDV_ID) In (Select [FirstOfMDV_ID] From qryFirstAcctDupes)));

When you click the Run button, this query should delete some presumably
large number of records, but if you have situations where there were more
than two records with the same Account Number, you will need to run the
query again, maybe several times, to get rid of all of them. Eventually you
will find that the query reports that it is about to delete 0 records, and
then you're done!

HTH

Cheyenne said:
No I haven't. Could you please give instructions on how to do this? Thanks.

:

Cheyenne wrote:

I need a quick way to remove duplicates that are found with the Duplicate
Query. I have been using the "select each record and delete" method.
There
has to be a quicker and easier way. Please help

Cheyenne, Have you tried to create a Delete query based on a Duplicate
Sub-Query?

-Randy
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Top