Delete duplicate rows of data when two columns are the same

  • Thread starter Thread starter Katerinia
  • Start date Start date
K

Katerinia

I have a worksheet and need to delete duplicate rows when two of my columns
are the same.

So as below:
For every "EMPLOYEE", look at the "JOB ADP" columns - if its the rows are
the same, look at the "MASTER COST CENTER" column and see if thats the
cooresponding rows are the same. If yes, delete that record (row). If no,
leave it.
T U W
X
1 Job ADP EMPLOYEE Master Cost Center Strd Hours
2 123500 111111111 98140 40
3 123500 111111111 98140 37.5
4 409997 333333333 17280 40
5 409997 444444444 16582 40
6 409997 666666666 17275 37.5
7 409997 666666666 17280 40
----------------------------------------------------------------------------
example: should look like this

Job ADP EMPLOYEE Master Cost Center Strd Hours
123500 111111111 98140 40
409997 333333333 17280 40
409997 444444444 16582 40
409997 666666666 17275 37.5
409997 666666666 17280 40


Hope you can help, I am nearing the end of a very long project and would
appreciate this so much!
 
Hi,

Questions.

Is it always the first row of any duplicates you want to keep?
Will there ever be more than 2 duplicate rows?
--
Mike

When competing hypotheses are otherwise equal, adopt the hypothesis that
introduces the fewest assumptions while still sufficiently answering the
question.
 
Hi

befoere you start make sure you really know which record you want to delete
- in the example you got rid of the one with fewer standard hours - is that
always the case?

Excel 2007 includes a duplicate row removal feature and prior to that you
should consider downloading the 'Asap Utilities' a free excel add-on with
lots of stuff like this.

In the meantime try the following:

create a column at the end of your data set as follows:


Column AA: =Jobadp&employee&master cost centre (use the correct columns/rows
for your sheet - from the example it would be =T2 & U2 & W2)
fill this down to the bottom of your data

Column AB: = countif($AA:$AA,AA2) - the AA2 should reflect the first row of
your data (2 in the example)

- when you recalculate column AB will show a number greater than 1 for every
row you want to delete. Either manually scan down and delete those greater
than 1 or :-

Select all the values in column AB and 'copy', then 'paste special, values'
into column AC - make sure the rows match up exactly. Now select the ENTIRE
table including all your data upto AC and sort on AC - find the first row in
AC that is greater then 1 and delete the rest.

simple huh?

regards
Reg
 
yes, the stnd hours is data that was just shoved in there and not accurate,
they have to reenter that data. So the first record is fine.

there could be more than two duplicates yes.
 
Katerinia,

Try this macro. your original post wasn't very clear because of line-wrap
but I think the 2 columns were interested in are T & W.

If that's incorrect then alter column T in these lines to the correct one
LastRow = sht.Cells(Cells.Rows.Count, "T").End(xlUp).Row
Set MyRange = sht.Range("T3:T" & LastRow)

I get the second column as an offset from column T in this line
If c = c.Offset(-1) And c.Offset(, 3) = c.Offset(-1, 3) Then

Column W is offset 3 from column T so if that's wrong change the 3 but NOT
the -1


Sub stance()
Dim MyRange As Range
Dim CopyRange As Range
Dim LastRow As Long
Dim c As Range
Set sht = Sheets("Sheet1") ' Change to suit
LastRow = sht.Cells(Cells.Rows.Count, "T").End(xlUp).Row
Set MyRange = sht.Range("T3:T" & LastRow)
For Each c In MyRange
If c = c.Offset(-1) And c.Offset(, 3) = c.Offset(-1, 3) Then
If CopyRange Is Nothing Then
Set CopyRange = c.EntireRow
Else
Set CopyRange = Union(CopyRange, c.EntireRow)
End If
End If
Next
If Not CopyRange Is Nothing Then
CopyRange.Delete
End If
End Sub

--
Mike

When competing hypotheses are otherwise equal, adopt the hypothesis that
introduces the fewest assumptions while still sufficiently answering the
question.
 
Actually Colomns T, U and W are the important ones.

For each SS#(EMPLOYEE) in Column U, i need to check T and W to see if any
rows match. If all three do, then delete.

Thanks for your help on this.. Hope this explains it more.
 
Katerinia,

This now check columns T, U and W and if they are the same then as the
previous rows(s) then only the top record is retained, the rest are deleted.
One thing I never considered is that in your test data column T is sorted and
I have assumed this will always be the case.

Dim MyRange As Range
Dim CopyRange As Range
Dim LastRow As Long
Dim c As Range
Set sht = Sheets("Sheet1") ' Change to suit
LastRow = sht.Cells(Cells.Rows.Count, "T").End(xlUp).Row
Set MyRange = sht.Range("T3:T" & LastRow)
For Each c In MyRange
If c = c.Offset(-1) And c.Offset(, 3) = c.Offset(-1, 3) And _
c.Offset(, 1) = c.Offset(-1, 1) Then
If CopyRange Is Nothing Then
Set CopyRange = c.EntireRow
Else
Set CopyRange = Union(CopyRange, c.EntireRow)
End If
End If
Next
If Not CopyRange Is Nothing Then
CopyRange.delete
End If
End Sub
--
Mike

When competing hypotheses are otherwise equal, adopt the hypothesis that
introduces the fewest assumptions while still sufficiently answering the
question.
 
Back
Top