Combining Worksheets that have similar (not exact) matches

  • Thread starter Thread starter robertuva
  • Start date Start date
R

robertuva

Okay - I thought maybe some people would have some experience an
insight on something like this:

I have two worksheets with the following header row in each:

PropertyID
PropertyName
Address1
Address2
City
State
Country
ZipCode

This is a database of hotels I have to match up, but here is the issu
- there was a complete overhaul, and they did not link up any of th
fields in the change. Now I am stuck with completely differen
PropertyID (in all cases), PropertyName (in most cases), Address1 (i
most cases), Address2 (in most cases). The City, State, Country an
ZipCode fields obviously stay the same.

Because of what this company needs to do they now need to backtrac
slightly and merge the files so that the same property has bot
PropertyID fields associated. Like I said, the problem is that th
PropertyName, etc. can be off by a bit, but they could be very similar
Here is an example of what I mean:

Database1:
57238 WHISKEY PETES HOTEL AND CASINO 100 W Primm Blvd
1 WESTWARD HO CASINO 2900 Las Vegas Boulevard South
57804 WESTGATE FLAMINGO BAY 5625 W Flamingo Rd
3094 WELLESLEY SUITES LAS VEGAS 1550 E Flamingo Rd
4633 VL LAS VEGAS NELLIS AFB 4244 N Las Vegas Blvd
4885 VILLA ROMA MOTEL 220 Convention Center Dr.
45 VENETIAN RESORT HOTEL CASINO 3355 Las Vegas Blvd S
4765 VAGABOND INN LAS VEGAS 3265 Las Vegas Blvd S

Database2:

109787 WILD WILD WEST GAMBLING HALL 3330 W TROPICANA AVENUE
136353 WESTIN CASUARINA HTL AND SPA 160 EAST FLAMINGO
198288 WESTGATE FLAMINGO BAY 5625 WEST FLAMINGO ROAD
149423 WELLESLEY INN & SUITES 1550 E. FLAMINGO ROAD
192018 VILLAGER LDG LAS VEGAS NELLIS 4244 LAS VEGAS BLVD NORTH A
172792 VILLA ROMA MOTEL 220 CONVENTION CENTER DR
147594 VENETIAN 3355 LAS VEGAS BLVD SOUTH
120554 VAGABOND INN LAS VEGAS 3265 LAS VEGAS BLVD S


If you notice, some don't exist in one that exist in another. As well
look at certain instances like Venetian. In Database1, it is "VENETIA
RESORT HOTEL CASINO" versus "VENETIAN" in Database2. Also, look at th
address line. For VILLA ROMA MOTEL it is "220 Convention Center Dr." i
Database1, but it is "220 CONVENTION CENTER DR" in Database2.

As I said, there are similarities, but mostly the addresses and/o
hotel names won't match up perfectly.

Does anyone have any recommendations on how I should approach this? An
help is VERY much appreciated. Bear in mind I have like 60,000 lin
items for this. It is pretty tough to manually do that.

Bobb
 
Robert,

This won't be a lot of help, but if I were given your challenge I would do
the following:

a) find some old "correct" data...with 60,000 records there must be a
back-up file somewhere...use the back-up data to clean up your current
database...it might not clean it up completely, but it should assist a great
deal.

b) try to find a pattern as to how the data was "mixed" and determine if
database "A" or database "B" or both is mixed.

Without more information there is little anyone here can do for you. All we
know is that you have two databases that appear "scrambled." So the trick
now is to find old data that will assist you to restoring the databases, or
find a pattern by which the databases were scrambled.

Good luck.

Regards,
Kevin



robertuva said:
Okay - I thought maybe some people would have some experience and
insight on something like this:

I have two worksheets with the following header row in each:

PropertyID
PropertyName
Address1
Address2
City
State
Country
ZipCode

This is a database of hotels I have to match up, but here is the issue
- there was a complete overhaul, and they did not link up any of the
fields in the change. Now I am stuck with completely different
PropertyID (in all cases), PropertyName (in most cases), Address1 (in
most cases), Address2 (in most cases). The City, State, Country and
ZipCode fields obviously stay the same.

Because of what this company needs to do they now need to backtrack
slightly and merge the files so that the same property has both
PropertyID fields associated. Like I said, the problem is that the
PropertyName, etc. can be off by a bit, but they could be very similar.
Here is an example of what I mean:

Database1:
57238 WHISKEY PETES HOTEL AND CASINO 100 W Primm Blvd
1 WESTWARD HO CASINO 2900 Las Vegas Boulevard South
57804 WESTGATE FLAMINGO BAY 5625 W Flamingo Rd
3094 WELLESLEY SUITES LAS VEGAS 1550 E Flamingo Rd
4633 VL LAS VEGAS NELLIS AFB 4244 N Las Vegas Blvd
4885 VILLA ROMA MOTEL 220 Convention Center Dr.
45 VENETIAN RESORT HOTEL CASINO 3355 Las Vegas Blvd S
4765 VAGABOND INN LAS VEGAS 3265 Las Vegas Blvd S

Database2:

109787 WILD WILD WEST GAMBLING HALL 3330 W TROPICANA AVENUE
136353 WESTIN CASUARINA HTL AND SPA 160 EAST FLAMINGO
198288 WESTGATE FLAMINGO BAY 5625 WEST FLAMINGO ROAD
149423 WELLESLEY INN & SUITES 1550 E. FLAMINGO ROAD
192018 VILLAGER LDG LAS VEGAS NELLIS 4244 LAS VEGAS BLVD NORTH A
172792 VILLA ROMA MOTEL 220 CONVENTION CENTER DR
147594 VENETIAN 3355 LAS VEGAS BLVD SOUTH
120554 VAGABOND INN LAS VEGAS 3265 LAS VEGAS BLVD S


If you notice, some don't exist in one that exist in another. As well,
look at certain instances like Venetian. In Database1, it is "VENETIAN
RESORT HOTEL CASINO" versus "VENETIAN" in Database2. Also, look at the
address line. For VILLA ROMA MOTEL it is "220 Convention Center Dr." in
Database1, but it is "220 CONVENTION CENTER DR" in Database2.

As I said, there are similarities, but mostly the addresses and/or
hotel names won't match up perfectly.

Does anyone have any recommendations on how I should approach this? Any
help is VERY much appreciated. Bear in mind I have like 60,000 line
items for this. It is pretty tough to manually do that.

Bobby


------------------------------------------------



~~Now Available: Financial Statements.xls, a step by step guide to
creating financial statements
 
Back
Top