create list of duplicate data

  • Thread starter Thread starter eggman2001
  • Start date Start date
E

eggman2001

Hello,

I have two columns of data that have duplicates across the columns. I
have found many articles on how to remove duplicate data, but how
would I create a column that includes only data that is duplicated
across the two columns?
 
One simple formulas tinker ..

Assume your 2 source data cols are cols A and B, data in row1 down
In C1: =IF(COUNTIF(B:B,A1)>0,ROW(),"")
In D1: =IF(ROWS($1:1)>COUNT(C:C),"",INDEX(A:A,SMALL(C:C,ROWS($1:1))))
Copy C1:D1 down to the last row of data in col A. Minimize/hide col C. Col D
returns the desired results, neatly packed at the top

Success? Celebrate it, hit the stars below
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:27,000 Files:200 Subscribers:70
xdemechanik
 
Great. Just one note - you want the longer column of data to be in
column B, since it seems to match duplicates to the bottom of the data
in column B. When I had it the other way around (column A was longer),
it didn't capture all duplicates.

Thanks a lot!
 
Smart solution Max. It had me thinking this one, but I was probably
over-complicating it by trying to do it in one formula/column, without
the 'helper' column you've used in C:C.

I've not frequented usenet for a while as I've not had a computer at
home, but I treated myself to one of the new Apple MacBook Pros, and I'm
back 'lurking'. I hope to be able to post the occasional solution,
though:-)

Nice solution to the OP's question though. Good effort.

Jay
 
Smart solution Max. It had me thinking this one, but I was probably
over-complicating it by trying to do it in one formula/column, without
the 'helper' column you've used in C:C.

I've not frequented usenet for a while as I've not had a computer at
home, but I treated myself to one of the new Apple MacBook Pros, and I'm
back 'lurking'. I hope to be able to post the occasional solution,
though:-)

Nice solution to the OP's question though. Good effort.

Jay
 
Glad to hear. But I think it works fine as responsed earlier. As long as you
remember to copy "down to the last row of data in col A" -- as mentioned in
the response -- it doesn't matter whether col A is longer or shorter than
col B.
--
Max
Singapore
http://savefile.com/projects/236895
Downloads:27,000 Files:200 Subscribers:70
xdemechanik
 
I've used this scenario as a 'team challenge' at work.

In order to help develop my staff's technical skills (Excel, SQL, SAS
etc.) and hopefully have a bit of fun, I set them weekly 'challenges',
in which I pose a problem and they have to find a solution, and the
first to do it gets a small prize.

So I've used the OP's original question, re-worded it a little, created
a file with two columns of random numbers (with some duplicates) and
tasked the team (7 analysts) with finding a solution that doesn't use
VBA, only formulas (with a helper column allowed).

I like your use of SMALL and ROW with the absolute/relative reference to
iterate through the list of row numbers in column C. Without using VBA
Excel doesn't lend itself to iterative processes and/or loops and your
solution is pretty neat.
 
Hello,

Another approach:
Enter into C1:
=1-ISERROR(MATCH(A1,B:B,0))
Into C2:
=1-ISERROR(MATCH(A2,B:B,0))+C1
and copy down as far as column A goes.
Enter into D1:
=INDEX(A:A,MATCH(ROW(),C:C,0))
and copy down as far as necessary.

This approach is about 4x faster dependend on how often MATCH finds a
value. A successful MATCH stops while COUNT will look at the whole
input each time.

Regards,
Bernd
 
Back
Top