Problem with delete duplicates query

  • Thread starter Thread starter Guest
  • Start date Start date
G

Guest

I am importing data from Excel into an Access 2003 table. I want to check for
duplicates and delete them if found. I've followed the procedure to do this
from the Microsoft Access Assistance paper on "Find, eliminate, or hide
duplicate records in Access". The totals query identifies the duplicates
correctly, but the problem is the delete query. I have to define one field in
the totals query that separates the records and use this field as the single
field in the delete query to define which records to delete. When I run the
delete query I get an error message that the data is found in more than one
record so the delete query doesn't delete any records. There is not one field
that will always be a distinct field- rather it is a combination of 10 fields
I am using to define a duplicate record. When I try setting more than one
field as "first" in order to run the totals query I get less than the
appropriate number of duplicates. What am I doing wrong?
 
Tracey said:
I am importing data from Excel into an Access 2003 table. I want to check for
duplicates and delete them if found. I've followed the procedure to do this
from the Microsoft Access Assistance paper on "Find, eliminate, or hide
duplicate records in Access". The totals query identifies the duplicates
correctly, but the problem is the delete query. I have to define one field in
the totals query that separates the records and use this field as the single
field in the delete query to define which records to delete. When I run the
delete query I get an error message that the data is found in more than one
record so the delete query doesn't delete any records. There is not one field
that will always be a distinct field- rather it is a combination of 10 fields
I am using to define a duplicate record. When I try setting more than one
field as "first" in order to run the totals query I get less than the
appropriate number of duplicates. What am I doing wrong?

Short answer: I would import everything into the Access Table
(duplicates and all), and then attach an Autonumber field to serve as
the primary key and delete using that.

Long answer (and maybe not everything I suggest here is necessary, but I
think it will work) ...

Suppose your Table looks like this, though I'm using only 8 fields in
this example. I've added an Autonumber field at the beginning, which is
not used in locating duplicates (none of the Autonumbers are duplicates).

[Table1] Table Datasheet View (before deleting but after adding an
Autonumber field):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-1775235345 2 8 4 x 6 3 9 c
-760197351 1 4 5 a 15 7 8 b
-588098952 1 4 5 a 15 7 8 a
-459449852 2 8 4 x 6 3 9 d
-74945867 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

I define a Query to smoke out the duplicate records. It's kind of
hairy, but it checks all of the fields you say must match. In your own
Query, of course you'll need to include each of the matching fields in
the list. The "Find Duplicates Query Wizard" will do most of the work
for you.

[Q_Table1_Duplicates] SQL:

SELECT T1.F1, T1.F2, T1.F3, T1.F4, T1.F5,
T1.F6, T1.F7, T1.F8, T1.Table1_ID
FROM Table1 AS T1
WHERE (((T1.F1) In (
SELECT [F1] FROM [Table1] As Tmp
GROUP BY [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8]
HAVING Count(*)>1 And [F2] = [T1].[F2]
And [F3] = [T1].[F3] And [F4] = [T1].[F4]
And [F5] = [T1].[F5] And [F6] = [T1].[F6]
And [F7] = [T1].[F7] And [F8] = [T1].[F8])))
ORDER BY T1.F1, T1.F2, T1.F3, T1.F4, T1.F5, T1.F6, T1.F7, T1.F8;

This lists the duplicate records, including the (unique) primary key
field, [Table1_ID] in my example.

[Q_Table1_Duplicates] Query Datasheet View:

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-74945867 1 4 5 a 15 7 8 a
-588098952 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
-1775235345 2 8 4 x 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

The next Query identifies the records to be deleted, including all but
one of each set of duplicates.

[Q_ToBeDeleted] SQL:

SELECT DISTINCTROW Q1.Table1_ID
FROM Q_Table1_Duplicates AS Q1
INNER JOIN Q_Table1_Duplicates AS Q2
ON (Q1.F8 = Q2.F8) AND (Q1.F7 = Q2.F7)
AND (Q1.F6 = Q2.F6) AND (Q1.F5 = Q2.F5)
AND (Q1.F4 = Q2.F4) AND (Q1.F3 = Q2.F3)
AND (Q1.F2 = Q2.F2) AND (Q1.F1 = Q2.F1)
GROUP BY Q1.Table1_ID
HAVING (((Max(Q2.Table1_ID))>[Q1]![Table1_ID]));

The key values of the records to be zapped are listed by this Query.

Table1_ID
---------
-1775235345
-588098952
-74945867

At this point, you know which records to delete, so the Delete Query is
fairly straightforward:

[Q_DeleteDuplicates] SQL:

DELETE DISTINCTROW Table1.*
FROM Table1 INNER JOIN Q_ToBeDeleted
ON Table1.Table1_ID = Q_ToBeDeleted.Table1_ID;

.... and running it deletes 3 of the duplicate records, leaving the
others in your Table.

[Table1] Table Datasheet View (after running the Delete Query):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
---------- -- -- -- -- -- -- -- --
-760197351 1 4 5 a 15 7 8 b
-459449852 2 8 4 x 6 3 9 d
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c


-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
Thanks for the help on this, but I am having a new problem now. When I run
the final query to actually delete the records (the query based on the table
and the query identifying the duplicates), I get a message that "Could not
delete from specified tables". I did the troubleshooting to verify that the
table wasn't "read-only", that I am the admin, etc. but can't get it to work.
Any ideas?

Vincent Johns said:
Tracey said:
I am importing data from Excel into an Access 2003 table. I want to check for
duplicates and delete them if found. I've followed the procedure to do this
from the Microsoft Access Assistance paper on "Find, eliminate, or hide
duplicate records in Access". The totals query identifies the duplicates
correctly, but the problem is the delete query. I have to define one field in
the totals query that separates the records and use this field as the single
field in the delete query to define which records to delete. When I run the
delete query I get an error message that the data is found in more than one
record so the delete query doesn't delete any records. There is not one field
that will always be a distinct field- rather it is a combination of 10 fields
I am using to define a duplicate record. When I try setting more than one
field as "first" in order to run the totals query I get less than the
appropriate number of duplicates. What am I doing wrong?

Short answer: I would import everything into the Access Table
(duplicates and all), and then attach an Autonumber field to serve as
the primary key and delete using that.

Long answer (and maybe not everything I suggest here is necessary, but I
think it will work) ...

Suppose your Table looks like this, though I'm using only 8 fields in
this example. I've added an Autonumber field at the beginning, which is
not used in locating duplicates (none of the Autonumbers are duplicates).

[Table1] Table Datasheet View (before deleting but after adding an
Autonumber field):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-1775235345 2 8 4 x 6 3 9 c
-760197351 1 4 5 a 15 7 8 b
-588098952 1 4 5 a 15 7 8 a
-459449852 2 8 4 x 6 3 9 d
-74945867 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

I define a Query to smoke out the duplicate records. It's kind of
hairy, but it checks all of the fields you say must match. In your own
Query, of course you'll need to include each of the matching fields in
the list. The "Find Duplicates Query Wizard" will do most of the work
for you.

[Q_Table1_Duplicates] SQL:

SELECT T1.F1, T1.F2, T1.F3, T1.F4, T1.F5,
T1.F6, T1.F7, T1.F8, T1.Table1_ID
FROM Table1 AS T1
WHERE (((T1.F1) In (
SELECT [F1] FROM [Table1] As Tmp
GROUP BY [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8]
HAVING Count(*)>1 And [F2] = [T1].[F2]
And [F3] = [T1].[F3] And [F4] = [T1].[F4]
And [F5] = [T1].[F5] And [F6] = [T1].[F6]
And [F7] = [T1].[F7] And [F8] = [T1].[F8])))
ORDER BY T1.F1, T1.F2, T1.F3, T1.F4, T1.F5, T1.F6, T1.F7, T1.F8;

This lists the duplicate records, including the (unique) primary key
field, [Table1_ID] in my example.

[Q_Table1_Duplicates] Query Datasheet View:

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-74945867 1 4 5 a 15 7 8 a
-588098952 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
-1775235345 2 8 4 x 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

The next Query identifies the records to be deleted, including all but
one of each set of duplicates.

[Q_ToBeDeleted] SQL:

SELECT DISTINCTROW Q1.Table1_ID
FROM Q_Table1_Duplicates AS Q1
INNER JOIN Q_Table1_Duplicates AS Q2
ON (Q1.F8 = Q2.F8) AND (Q1.F7 = Q2.F7)
AND (Q1.F6 = Q2.F6) AND (Q1.F5 = Q2.F5)
AND (Q1.F4 = Q2.F4) AND (Q1.F3 = Q2.F3)
AND (Q1.F2 = Q2.F2) AND (Q1.F1 = Q2.F1)
GROUP BY Q1.Table1_ID
HAVING (((Max(Q2.Table1_ID))>[Q1]![Table1_ID]));

The key values of the records to be zapped are listed by this Query.

Table1_ID
---------
-1775235345
-588098952
-74945867

At this point, you know which records to delete, so the Delete Query is
fairly straightforward:

[Q_DeleteDuplicates] SQL:

DELETE DISTINCTROW Table1.*
FROM Table1 INNER JOIN Q_ToBeDeleted
ON Table1.Table1_ID = Q_ToBeDeleted.Table1_ID;

.... and running it deletes 3 of the duplicate records, leaving the
others in your Table.

[Table1] Table Datasheet View (after running the Delete Query):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
---------- -- -- -- -- -- -- -- --
-760197351 1 4 5 a 15 7 8 b
-459449852 2 8 4 x 6 3 9 d
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c


-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
Tracey,

I'm sorry I didn't get back sooner. I have a couple of ideas (but
please don't play around with them until AFTER you have made a backup
copy of your database).

First, is the Table from which you're trying to delete records
linked to another Table that is (for example) read only? If you make a
copy of it, can you delete from the copy using the Delete Query? Can
you delete records (from the copy) in Table Datasheet View, by selecting
a record and pressing the Delete key?

Were you able to copy my example Table into your database and delete
from it the way I did? You can do that by copying it from the message
to Notepad, saving it to disk, and in Access using File --> Get External
Data --> Import... and asking the wizard to import from a text file (the
one you saved from Notepad). Make a copy of the Table (you may want to
try deleting from it several times) and try to delete a record by
opening it in Table Datasheet View, clicking at the left edge of a
record, and using the Delete key.

If that works, restore the Table from your backup and try running the
Delete Query on your unlinked, non-read-only Table.

You might look at the Help topic called "Delete a group of records
with a query" (in Access 2000; I'm not sure what it's called in Access
2003, but it should have a similar title).

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
Thanks for the help on this, but I am having a new problem now. When I run
the final query to actually delete the records (the query based on the table
and the query identifying the duplicates), I get a message that "Could not
delete from specified tables". I did the troubleshooting to verify that the
table wasn't "read-only", that I am the admin, etc. but can't get it to work.
Any ideas?

:

Tracey wrote:

I am importing data from Excel into an Access 2003 table. I want to check for
duplicates and delete them if found. I've followed the procedure to do this
from the Microsoft Access Assistance paper on "Find, eliminate, or hide
duplicate records in Access". The totals query identifies the duplicates
correctly, but the problem is the delete query. I have to define one field in
the totals query that separates the records and use this field as the single
field in the delete query to define which records to delete. When I run the
delete query I get an error message that the data is found in more than one
record so the delete query doesn't delete any records. There is not one field
that will always be a distinct field- rather it is a combination of 10 fields
I am using to define a duplicate record. When I try setting more than one
field as "first" in order to run the totals query I get less than the
appropriate number of duplicates. What am I doing wrong?

Short answer: I would import everything into the Access Table
(duplicates and all), and then attach an Autonumber field to serve as
the primary key and delete using that.

Long answer (and maybe not everything I suggest here is necessary, but I
think it will work) ...

Suppose your Table looks like this, though I'm using only 8 fields in
this example. I've added an Autonumber field at the beginning, which is
not used in locating duplicates (none of the Autonumbers are duplicates).

[Table1] Table Datasheet View (before deleting but after adding an
Autonumber field):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-1775235345 2 8 4 x 6 3 9 c
-760197351 1 4 5 a 15 7 8 b
-588098952 1 4 5 a 15 7 8 a
-459449852 2 8 4 x 6 3 9 d
-74945867 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

I define a Query to smoke out the duplicate records. It's kind of
hairy, but it checks all of the fields you say must match. In your own
Query, of course you'll need to include each of the matching fields in
the list. The "Find Duplicates Query Wizard" will do most of the work
for you.

[Q_Table1_Duplicates] SQL:

SELECT T1.F1, T1.F2, T1.F3, T1.F4, T1.F5,
T1.F6, T1.F7, T1.F8, T1.Table1_ID
FROM Table1 AS T1
WHERE (((T1.F1) In (
SELECT [F1] FROM [Table1] As Tmp
GROUP BY [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8]
HAVING Count(*)>1 And [F2] = [T1].[F2]
And [F3] = [T1].[F3] And [F4] = [T1].[F4]
And [F5] = [T1].[F5] And [F6] = [T1].[F6]
And [F7] = [T1].[F7] And [F8] = [T1].[F8])))
ORDER BY T1.F1, T1.F2, T1.F3, T1.F4, T1.F5, T1.F6, T1.F7, T1.F8;

This lists the duplicate records, including the (unique) primary key
field, [Table1_ID] in my example.

[Q_Table1_Duplicates] Query Datasheet View:

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-74945867 1 4 5 a 15 7 8 a
-588098952 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
-1775235345 2 8 4 x 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

The next Query identifies the records to be deleted, including all but
one of each set of duplicates.

[Q_ToBeDeleted] SQL:

SELECT DISTINCTROW Q1.Table1_ID
FROM Q_Table1_Duplicates AS Q1
INNER JOIN Q_Table1_Duplicates AS Q2
ON (Q1.F8 = Q2.F8) AND (Q1.F7 = Q2.F7)
AND (Q1.F6 = Q2.F6) AND (Q1.F5 = Q2.F5)
AND (Q1.F4 = Q2.F4) AND (Q1.F3 = Q2.F3)
AND (Q1.F2 = Q2.F2) AND (Q1.F1 = Q2.F1)
GROUP BY Q1.Table1_ID
HAVING (((Max(Q2.Table1_ID))>[Q1]![Table1_ID]));

The key values of the records to be zapped are listed by this Query.

Table1_ID
---------
-1775235345
-588098952
-74945867

At this point, you know which records to delete, so the Delete Query is
fairly straightforward:

[Q_DeleteDuplicates] SQL:

DELETE DISTINCTROW Table1.*
FROM Table1 INNER JOIN Q_ToBeDeleted
ON Table1.Table1_ID = Q_ToBeDeleted.Table1_ID;

.... and running it deletes 3 of the duplicate records, leaving the
others in your Table.

[Table1] Table Datasheet View (after running the Delete Query):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
---------- -- -- -- -- -- -- -- --
-760197351 1 4 5 a 15 7 8 b
-459449852 2 8 4 x 6 3 9 d
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c


-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
Thanks for getting back with me. In the interim, a colleague wrote some code
for the that deletes the identified records. Thanks for all your help. This
discussion group has saved my bacon on many occasions!

Vincent Johns said:
Tracey,

I'm sorry I didn't get back sooner. I have a couple of ideas (but
please don't play around with them until AFTER you have made a backup
copy of your database).

First, is the Table from which you're trying to delete records
linked to another Table that is (for example) read only? If you make a
copy of it, can you delete from the copy using the Delete Query? Can
you delete records (from the copy) in Table Datasheet View, by selecting
a record and pressing the Delete key?

Were you able to copy my example Table into your database and delete
from it the way I did? You can do that by copying it from the message
to Notepad, saving it to disk, and in Access using File --> Get External
Data --> Import... and asking the wizard to import from a text file (the
one you saved from Notepad). Make a copy of the Table (you may want to
try deleting from it several times) and try to delete a record by
opening it in Table Datasheet View, clicking at the left edge of a
record, and using the Delete key.

If that works, restore the Table from your backup and try running the
Delete Query on your unlinked, non-read-only Table.

You might look at the Help topic called "Delete a group of records
with a query" (in Access 2000; I'm not sure what it's called in Access
2003, but it should have a similar title).

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
Thanks for the help on this, but I am having a new problem now. When I run
the final query to actually delete the records (the query based on the table
and the query identifying the duplicates), I get a message that "Could not
delete from specified tables". I did the troubleshooting to verify that the
table wasn't "read-only", that I am the admin, etc. but can't get it to work.
Any ideas?

:

Tracey wrote:


I am importing data from Excel into an Access 2003 table. I want to check for
duplicates and delete them if found. I've followed the procedure to do this
from the Microsoft Access Assistance paper on "Find, eliminate, or hide
duplicate records in Access". The totals query identifies the duplicates
correctly, but the problem is the delete query. I have to define one field in
the totals query that separates the records and use this field as the single
field in the delete query to define which records to delete. When I run the
delete query I get an error message that the data is found in more than one
record so the delete query doesn't delete any records. There is not one field
that will always be a distinct field- rather it is a combination of 10 fields
I am using to define a duplicate record. When I try setting more than one
field as "first" in order to run the totals query I get less than the
appropriate number of duplicates. What am I doing wrong?

Short answer: I would import everything into the Access Table
(duplicates and all), and then attach an Autonumber field to serve as
the primary key and delete using that.

Long answer (and maybe not everything I suggest here is necessary, but I
think it will work) ...

Suppose your Table looks like this, though I'm using only 8 fields in
this example. I've added an Autonumber field at the beginning, which is
not used in locating duplicates (none of the Autonumbers are duplicates).

[Table1] Table Datasheet View (before deleting but after adding an
Autonumber field):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-1775235345 2 8 4 x 6 3 9 c
-760197351 1 4 5 a 15 7 8 b
-588098952 1 4 5 a 15 7 8 a
-459449852 2 8 4 x 6 3 9 d
-74945867 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

I define a Query to smoke out the duplicate records. It's kind of
hairy, but it checks all of the fields you say must match. In your own
Query, of course you'll need to include each of the matching fields in
the list. The "Find Duplicates Query Wizard" will do most of the work
for you.

[Q_Table1_Duplicates] SQL:

SELECT T1.F1, T1.F2, T1.F3, T1.F4, T1.F5,
T1.F6, T1.F7, T1.F8, T1.Table1_ID
FROM Table1 AS T1
WHERE (((T1.F1) In (
SELECT [F1] FROM [Table1] As Tmp
GROUP BY [F1],[F2],[F3],[F4],[F5],[F6],[F7],[F8]
HAVING Count(*)>1 And [F2] = [T1].[F2]
And [F3] = [T1].[F3] And [F4] = [T1].[F4]
And [F5] = [T1].[F5] And [F6] = [T1].[F6]
And [F7] = [T1].[F7] And [F8] = [T1].[F8])))
ORDER BY T1.F1, T1.F2, T1.F3, T1.F4, T1.F5, T1.F6, T1.F7, T1.F8;

This lists the duplicate records, including the (unique) primary key
field, [Table1_ID] in my example.

[Q_Table1_Duplicates] Query Datasheet View:

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
----------- -- -- -- -- -- -- -- --
-74945867 1 4 5 a 15 7 8 a
-588098952 1 4 5 a 15 7 8 a
208438707 1 4 5 a 15 7 8 a
-1775235345 2 8 4 x 6 3 9 c
2035178854 2 8 4 x 6 3 9 c

The next Query identifies the records to be deleted, including all but
one of each set of duplicates.

[Q_ToBeDeleted] SQL:

SELECT DISTINCTROW Q1.Table1_ID
FROM Q_Table1_Duplicates AS Q1
INNER JOIN Q_Table1_Duplicates AS Q2
ON (Q1.F8 = Q2.F8) AND (Q1.F7 = Q2.F7)
AND (Q1.F6 = Q2.F6) AND (Q1.F5 = Q2.F5)
AND (Q1.F4 = Q2.F4) AND (Q1.F3 = Q2.F3)
AND (Q1.F2 = Q2.F2) AND (Q1.F1 = Q2.F1)
GROUP BY Q1.Table1_ID
HAVING (((Max(Q2.Table1_ID))>[Q1]![Table1_ID]));

The key values of the records to be zapped are listed by this Query.

Table1_ID
---------
-1775235345
-588098952
-74945867

At this point, you know which records to delete, so the Delete Query is
fairly straightforward:

[Q_DeleteDuplicates] SQL:

DELETE DISTINCTROW Table1.*
FROM Table1 INNER JOIN Q_ToBeDeleted
ON Table1.Table1_ID = Q_ToBeDeleted.Table1_ID;

.... and running it deletes 3 of the duplicate records, leaving the
others in your Table.

[Table1] Table Datasheet View (after running the Delete Query):

Table1_ID F1 F2 F3 F4 F5 F6 F7 F8
---------- -- -- -- -- -- -- -- --
-760197351 1 4 5 a 15 7 8 b
-459449852 2 8 4 x 6 3 9 d
208438707 1 4 5 a 15 7 8 a
286939153 2 8 4 y 6 3 9 c
2035178854 2 8 4 x 6 3 9 c


-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
I'm pleased that you solved your problem, but do you know why you
couldn't delete those records? If so, the answer might help someone else.

-- Vincent Johns <[email protected]>
Please feel free to quote anything I say here.
 
I never figured out why it wouldn't work. My colleague just wrote new code
since we couldn't figure out why it wasn't working. Thanks again.
 
Back
Top