Tamara,
Perhaps this will do what you need. I've made my best guesses based on your
initial post.
Start by making a copy of your workbook so that if things do go bad you'll
still have the source information. Press [Alt]+[F11] to put the code below
(with any changes to the 1st 2 Const values you need to make) into one of the
workbooks. Once the VB Editor opens, use Insert | Module to start a new code
module. Copy the code below and paste it into the code module. Select the
sheet with data on it and use Tools | Macro | Macros and the [Run] button to
run the code. As noted in it, it will take some time - somewhere between 9
and 20 minutes probably, depending on your setup.
Sub CollateAndRemoveExtraEntries()
'sheet to be processed must be selected
'before calling this Macro
'Tested Time-To-Run:
' Excel 2003 on Win XP
' Single Core AMD 3200+: 9m 37s
' Excel 2007 on Win XP
' Dual Core AMD 4800+: 18m 07s
' Excel 2007 on Vista
' Intel CoreDuo 6600: 14m 10s
'
'There's often help to be found at
' HelpFrom @ jlathamsite. com
'
'change these Const values to
'match your worksheet layout
Const uniqueIDColumn = "A" ' where to look for dupes
'assumes all columns from A to 'lastUsedColumn' are involved
Const lastUsedColumn = "W" ' last column of data
Dim dataOffsets() As Long ' offsets to columns of data
Dim sortRange As Range
Dim baseCell As Range
Dim anyColOffset As Long
Dim LC As Long ' loop counter
Dim RO As Long ' row offset pointer
Dim BCO As Long ' row offset from baseCell
Dim lastUsedRow As Long
Dim tempString As String ' work space
'set up to hold data offsets
ReDim dataOffsets(1 To _
Range(lastUsedColumn & "1").Column)
'fill offset array, assumes all columns
'from A to 'lastUsedColumn' are involved
For LC = LBound(dataOffsets) To UBound(dataOffsets)
dataOffsets(LC) = Cells(1, LC).Column - _
Range(uniqueIDColumn & 1).Column
Next
'begin by sorting the data
'borrow sortRange for the sorting
Set sortRange = Range("A1:" & _
Range(uniqueIDColumn & Rows.Count).End(xlUp).Address)
'speed up the process
Application.ScreenUpdating = False
'sort the range assuming it has a header row and
'sort by the uniqueIDColumn column.
tempString = uniqueIDColumn & "1"
sortRange.Sort Key1:=Range(tempString), _
Order1:=xlAscending, Header:=xlYes, _
OrderCustom:=1, MatchCase:=False, _
Orientation:=xlTopToBottom
Set sortRange = Nothing ' free up resources
'now we can start collating data
Set baseCell = Range(uniqueIDColumn & 1)
BCO = 1 ' initialize
lastUsedRow = Range(uniqueIDColumn & _
Rows.Count).End(xlUp).Row
Do Until baseCell.Row > lastUsedRow
Set baseCell = Range(uniqueIDColumn & BCO)
RO = 1 ' reset/initialize
Do While baseCell.Offset(RO, 0).Row <= lastUsedRow And _
baseCell.Offset(RO, 0) = baseCell
'matched item, collate
For LC = LBound(dataOffsets) To UBound(dataOffsets)
If dataOffsets(LC) <> 0 Then
'not the unique ID column
If IsEmpty(baseCell.Offset(0, dataOffsets(LC))) And _
Not IsEmpty(baseCell.Offset(RO, dataOffsets(LC))) Then
'copy data
baseCell.Offset(0, dataOffsets(LC)) = _
baseCell.Offset(RO, dataOffsets(LC))
End If
End If
Next ' end column content loop
RO = RO + 1 ' ready to look at next row
Loop ' end of matched loop
'move the base cell
BCO = baseCell.Row + RO
Loop ' end of comparisons & collating
'now on to remove the extra entries
'again assumes that row 1 has labels
'work from bottom up to 2nd row with data in it
For RO = lastUsedRow To 2 Step -1
If Range(uniqueIDColumn & RO) = _
Range(uniqueIDColumn & RO - 1) Then
Range(uniqueIDColumn & RO).EntireRow.Delete
End If
Next
Application.ScreenUpdating = True
End Sub
Tamara said:
My boss had me merge together a ton of data.
Group Name, Renewal Month, Broker Name, Broker Phone, Address,
Address 2, City, State, County, Zip, Employer Phone, Employer Key Contact,
EE's, SIC Code, Current Carrier, EE Rate, EE Rate2, Benefit Design, Status,
RFP Date, Comments 1, Comments 2, Comments 3 - Are all the columns.
Due to merging data from 10 different spreadsheets into one big spreadsheet
(40k records), I have been trying to find a way to consolidate all
information for each unique record onto one line. Some are entirely
duplicate and some just need to consolidate the information. (i.e. Group name
is the same for three records but one record has the address and another has
the SIC code and the third is blank)
Is there a simple way to do this? I tried running the EasyFilter Add-In
that was mentioned on another post and it took all of my memory and timed out
after an hour. I looked at the pearson guy's site but it wouldn't
consolidate AND delete duplicates. Any ideas?