Deleting duplicates - kind of

  • Thread starter Thread starter Tamara
  • Start date Start date
T

Tamara

My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.

Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)

Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?
 
In these 40K rows of data, what field (column) or combination of
fields/columns would constitute a unique identification. I mean that like
this: You mentioned Group Name - would the Group Name be considered unique;
all entries with the same Group Name would be considered as different parts
of the same row and so could be consolidated into one single row?

With that information someone here (probably end up being me) can come up
with recommendations or, in my case, some VBA code to run as a macro to
consolidate the data for you and eliminate the extra entries.
 
Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.


Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data

Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space

'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub
 
I kind of gave up on having someone help me and was doing it the very long,
long, way! Thank you so much!!

JLatham said:
Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.


Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data

Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space

'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub


Tamara said:
My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.

Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)

Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?
 
That worked perfectly!!! There is still some clean up to do, of course, as
some of the group names had minor punctuation differences. All in all
though, that knocked out over 10k records of nothing but crap!! Thank you so
much!!! You just saved me HOURS!!

Tamara

JLatham said:
Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.


Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data

Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space

'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub


Tamara said:
My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.

Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)

Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?
 
Always nice to turn hours of work into minutes, isn't it?

Glad I could help. Thanks for letting us know it worked.

Tamara said:
That worked perfectly!!! There is still some clean up to do, of course, as
some of the group names had minor punctuation differences. All in all
though, that knocked out over 10k records of nothing but crap!! Thank you so
much!!! You just saved me HOURS!!

Tamara

JLatham said:
Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.


Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data

Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space

'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub


Tamara said:
My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.

Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)

Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?
 
Excellent JLatham!!! Also worked for me. That's some know how.

JLatham said:
Always nice to turn hours of work into minutes, isn't it?

Glad I could help. Thanks for letting us know it worked.

Tamara said:
That worked perfectly!!! There is still some clean up to do, of course, as
some of the group names had minor punctuation differences. All in all
though, that knocked out over 10k records of nothing but crap!! Thank you so
much!!! You just saved me HOURS!!

Tamara

JLatham said:
Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.


Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data

Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space

'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub


:

My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.

Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)

Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?
 
Back
Top