Delete duplicate records

  • Thread starter Thread starter Opal
  • Start date Start date
O

Opal

I am trying to create a query that will delete
a duplicate record in a table. I want to delete
the older record and keep the new one created.

I have used the Find Duplicates Query Wizard
but I am not getting the desired results and I
think its due to the parameters I am adding.

Below is my SQL:

DELETE MSCompl.MSComplID, MSCompl.SupName,
MSCompl.SupLink, MSCompl.CompletionDate,
MSCompl.Shift, MSCompl.GroupName, MSCompl.
MSCourseID, MSCompl.Deficiency
FROM MSCompl
WHERE (((MSCompl.SupLink)=[Forms]![HoldingInfo]![
txtHoldSupNumber]) AND ((MSCompl.CompletionDate)
Is Null Or (MSCompl.CompletionDate)<Date()) AND
((MSCompl.MSCourseID) In (SELECT [MSCourseID]
FROM [MSCompl] As Tmp GROUP BY [MSCourseID]
HAVING Count(*)>1 )));

Can someone help me create a query that will only
delete the older duplicate record? Could I take advantage
of the Primary key and remove the duplicate record that
has the lower primary key #?
 
Perhaps the following will work for you

STEP 1: BACKUP your data before attempting the following.
STEP 2: BACKUP your data before attempting the following.

Without a backup you cannot restore the data if this does not work the
way you expect.

DELETE
FROM MSCompl
WHERE MSComp1ID NOT IN
(SELECT Max(Temp.MSComplID)
FROM MSCompl as Temp
GROUP BY Temp.CourseID)
AND MSCompl.SupLink=[Forms]![HoldingInfo]![txtHoldSupNumber]
AND (MSCompl.CompletionDate Is Null
Or MSCompl.CompletionDate)<Date())

If that proves too slow, post back and I will try to construct a faster
routine. This one will delete records based on MSComplID. If you want
to do it based on


'====================================================
John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
'====================================================
 
Thank, you John, I am actually working with a "test"
database and the data is dummy data. I want to
get everything working before incorporating in the
"live" database. I have tried your SQL, but get an
error -

"Query must have at least one destination field" ?
 
That is a strange error.

Did you copy and paste what I posted? Or did you enter the query yourself?

DELETE
FROM MSCompl
WHERE MSComplID NOT IN
(SELECT Max(Temp.MSComplID)
FROM MSCompl as Temp
GROUP BY Temp.CourseID)
AND MSCompl.SupLink=[Forms]![HoldingInfo]![txtHoldSupNumber]
AND (MSCompl.CompletionDate Is Null
Or MSCompl.CompletionDate)<Date())

If you look closely at what I posted, then you will note that in one place I
refer to MSComp1ID (that is MSComp ONE ID) and another place I have MSComp EL
ID). Also, I assumed that MSComplID field is the primary key field.

If you are running this as a delete query then you should not get the message
you did. If you want to run this as a SELECT query then need to have a field
list or return all field using the asterisk * to signal all fields.

SELECT MSCompl.*
FROM MSCompl
WHERE MSComplID NOT IN
(SELECT Max(Temp.MSComplID)
FROM MSCompl as Temp
GROUP BY Temp.CourseID)
AND MSCompl.SupLink=[Forms]![HoldingInfo]![txtHoldSupNumber]
AND (MSCompl.CompletionDate Is Null
Or MSCompl.CompletionDate)<Date())


John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
John,

This is really strange: I did copy your SQL exactly as
you posted it. When I got the error, I thought that the "1"
was a typo and changed it to an "l" and I got the same error.

The select query does not produce the same error.

You are correct, the MSComplID is the primary key.

With the select query however, the results show all
values, not the max value.....or the value to the
corresponding MSCourseID...
 
I went to Allen Browne's website to re-familiarize
myself with sub-queries and tried to fashion the
following to suit my needs:

DELETE FROM Table1
WHERE ID <> (SELECT Min(ID) AS MinOfID FROM Table1 AS Dupe
WHERE (Dupe.Surname = Table1.Surname)
AND (Dupe.FirstName = Table1.FirstName));

into:

DELETE FROM MSCompl
WHERE MSComplID <> (SELECT Max(MSComplID) AS MaxOfMSComplID FROM
MSCompl AS Dupe
WHERE (Dupe.TSMSCourseID = TSMSCompl.TSMSCourseID)
AND MSCompl.SupLink=[Forms]![HoldingInfo]![txtHoldSupNumber]
AND (MSCompl.CompletionDate Is Null
Or MSCompl.CompletionDate)<Date());

....and I still get that "Query must have at least one destination
field"
error..... :-S

I even simplified it down to just:

DELETE FROM MSCompl
WHERE MSComplID <> (SELECT Max(MSComplID) AS MaxOfMSComplID FROM
MSCompl AS Dupe
WHERE (Dupe.TSMSCourseID = TSMSCompl.TSMSCourseID));

and still get that error.... I must be missing something obvious but
can't see
the forest for the trees..... :-(
 
Ok, lets try breaking this down into steps.

Step 1: Construct a query with the values you want to keep

SELECT Max(Temp.MSComplID)
FROM MSCompl as Temp
GROUP BY Temp.CourseID

IF that returns all the records then you must need to group by some other
field than CourseID.

Also is MSCompl a table or is it a query? If it is a query, then are you
returning fields in the select clause of the query?

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Hi John,

MSCompl is a table.

This provides the data I am looking for.

I have also added the criteria and still
get the max value I am looking for..

so, so far, so good.....
 
So you now have a query that identifies ALL the records you want to KEEP.

I can only guess what it looks like as you did not post the SQL.

If that query is returning the records you want to KEEP then you should be
able to use it as a subquery as long as the query is only returning ONE column
of information.

DELETE
FROM MSCompl
WHERE MSComplID NOT IN
(Subquery goes here)

If that still gives you an error, try building a new table and populating it
with the records you wish to retain.

First copy MsCompl structure.
== Copy the table
== Paste the table
Choose Structure only when you paste

Now use an append query to add all the records you want to keep to the table
INSERT INTO MSCompl_A
SELECT *
FROM MSCompl
WHERE MSComplID IN
(Subquery goes Here)

Now you can check the records in MSCompl_A and if everything is ok, you can
delete MsCompl and rename MSCompl_A to MSCompl. If you have relationships set
up you will have to remove them before deleting and renaming and you will have
to reset them afterwords.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Sorry, John, wasn't thinking :-(

My SQL is:

SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING (((Temp.MSCourseID)=[Forms]![frmMSCompl]![MSCourse]) AND
((Temp.SupLink)=[Forms]![frmMSCompl]![SupLink]));

When I make it:

DELETE
FROM MSCompl
WHERE MSComplID NOT IN
)SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING (((Temp.MSCourseID)=[Forms]![frmMSCompl]![MSCourse]) AND
((Temp.SupLink)=[Forms]![frmMSCompl]![SupLink])));

I still get the desitination field error..... I will
try your other suggestion now....
 
I can't see any reason for this to fail or give you the error message it is
giving you.

DELETE
FROM MSCompl
WHERE MSComplID NOT IN
(SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING Temp.MSCourseID)=[Forms]![frmMSCompl]![MSCourse] AND
Temp.SupLink=[Forms]![frmMSCompl]![SupLink])

Perhaps the error message is bogus. Is MSCompl in any relationships with
other tables? If so, are the relationships enforced? That could cause a
failure, but it should give you a different message.

It is possible that the query you are working with has gotten mucked up. Have
you tried creating a NEW query and entering the SQL into the new query.

I am really stumped and I am grasping at straws at this point.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Sorry for making you grasp at straws. Each time I tried this it was as a new
query. I will try working with the relationships and see if that makes a
difference.
 
Actually,

Found a better solution; instead I will base my form on the query that
shows only the Max rather than struggle with deleting the max.

Thanks for your support!
 
I did finally get the same error message and it went away and I could not get
it back. I just added a field name from the table I wanted to delete to the
Delete clause. Later I removed the field from the Delete clause and the query
still worked.

You might try this variant note the addition of the * on the first line.

By the way, you do realize that deleting information means it is gone permanently.

DELETE MSCompl.*
FROM MSCompl
WHERE MSComplID NOT IN
(SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING Temp.MSCourseID)=[Forms]![frmMSCompl]![MSCourse] AND
Temp.SupLink=[Forms]![frmMSCompl]![SupLink])

Another variation
DELETE MsCompl.MSComplID
FROM MSCompl
WHERE MSComplID NOT IN
(SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING Temp.MSCourseID)=[Forms]![frmMSCompl]![MSCourse] AND
Temp.SupLink=[Forms]![frmMSCompl]![SupLink])

The strange thing here is that you should not have to have any field reference
of any type in the DELETE clause. You can have as many fields from the table
you are deleting in the DELETE clause as you wish, but you can only have only
that one table referenced in the DELETE clause.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Hi John,

I missed this response. I tried your options
and although I no longer get the destination
field error, the query deletes all the records
in the table with the exception of the Max
MSComplID

Not exactly what I was looking for....
 
Hmm. A challenge. Perhaps you need the following

DELETE MsCompl.MsComplID
FROM MsCompl
WHERE MSCourseID=[Forms]![frmMSCompl]![MSCourse]
AND SupLink=[Forms]![frmMSCompl]![SupLink]
AND MsComplID NOT IN
(SELECT Max(Temp.MSComplID) AS MaxOfMSComplID
FROM MSCompl AS Temp
WHERE Temp.MSCourseID=[Forms]![frmMSCompl]![MSCourse] AND
Temp.SupLink=[Forms]![frmMSCompl]![SupLink]
GROUP BY Temp.MSCourseID, Temp.SupLink
HAVING Max(Temp.MSComplID) is not Null)

As always, backup first.

John Spencer
Access MVP 2002-2005, 2007-2009
The Hilltop Institute
University of Maryland Baltimore County
 
Back
Top