Company Name and Normalization

  • Thread starter: Carlos Alvarez

Carlos Alvarez

I maintain a database that collects thousands of
organization names submitted by individuals. When looking
for matching organization names, the challenge is that the
same organization shows up under many variations that a
query usually won't match. Examples of this would be "IBM" vs
"International Business Machines", "UConn" vs "University
of Connecticut", "FDA" vs "Food and Drug Administration".

A normalized table with a primary key would be ideal, where
a user could then choose from a combo box of organization
names to identify an organization they are affiliated with.

I know I can't be the first one to be dealing with this
issue, so I am looking to draw on experience from a pro here.

I thought about looking for a subscription service
that would provide organization name, address (city and state
would suffice), and Federal Tax ID Number (which would be
used as the primary key), with monthly updates, but I
really don't know if this type of subscription even exists.

Any advice on this?

Thanks,

Carlos
 
There are companies that can help with this. It's not cheap, but if your db is big enough, it may make sense. They will remove dupes, fix up and verify addresses, add ZIP+4 codes, etc. Search for "database cleaning" or similar.

It may also make sense to roll your own. If the range of organizations you care about is limited, you can brainstorm the likely ways each name gets spelled, misspelled, or abbreviated (FDA, F.D.A., F D A, etc.) and run your own batch job at night that corrects them to one canonical form. You'll probably want to write something like this in C.
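
To make the roll-your-own idea a bit more concrete, here is a rough sketch of the lookup such a nightly job might run on each name. Everything in it is an assumption for illustration: the `ALIASES` table, the function names `squeeze_key` and `canonical_name`, and the rule of dropping punctuation and spaces and upper-casing before comparing. A real table would come out of your own brainstorming and would probably live in the database rather than in the source.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alias table: squeezed key -> canonical organization name. */
struct alias {
    const char *key;
    const char *canonical;
};

static const struct alias ALIASES[] = {
    { "FDA",   "Food and Drug Administration" },
    { "IBM",   "International Business Machines" },
    { "UCONN", "University of Connecticut" },
};

/* Copy src into dst keeping only letters and digits, upper-cased, so that
   "F.D.A.", "F D A" and "fda" all become "FDA". Output is truncated to fit. */
void squeeze_key(const char *src, char *dst, size_t dst_size)
{
    size_t n = 0;

    assert(dst_size > 0);
    for (; *src != '\0' && n + 1 < dst_size; src++) {
        unsigned char c = (unsigned char)*src;
        if (isalnum(c))
            dst[n++] = (char)toupper(c);
    }
    dst[n] = '\0';
}

/* Return the canonical name for raw, or NULL if no alias matches. */
const char *canonical_name(const char *raw)
{
    char key[128];
    size_t i;

    squeeze_key(raw, key, sizeof key);
    for (i = 0; i < sizeof ALIASES / sizeof ALIASES[0]; i++) {
        if (strcmp(key, ALIASES[i].key) == 0)
            return ALIASES[i].canonical;
    }
    return NULL;
}
```

The batch job would call `canonical_name()` on each submitted name and update the row when it gets a non-NULL result. A NULL result just means the name isn't in the table yet, so those rows are the ones to leave alone or flag for a human to look at. Note this only catches variants you have listed; it does nothing for misspellings like "Universty of Connecticut" unless you add them as keys too.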
 