String comparison algorithms

  • Thread starter: almurph

almurph

Hi,


I hope you can help me with this one. I'm looking for some good string
comparison algorithms. I want to compare two fairly short strings (fewer
than 50 characters each) and get back a percentage indicating how similar
they are. Two strings that are absolutely identical should return 100%;
strings that are radically different should return a number near 0%:


Token examples:

String1      String2    % Comparison
Albatross    Car        5
Car          Car        100



I would appreciate any
comments/code-samples/suggestions/user-experiences that you may have...


Thanks in advance,
Al.

PS: I have already implemented Levenshtein distance, so no worries there.
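
One way to turn an edit distance into the kind of percentage asked for above is to divide it by the length of the longer string. The following is a minimal sketch, not the poster's implementation: the function names `levenshtein` and `similarity_percent` and the choice of normalization are assumptions made for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    # previous[j] is the distance between the prefix of a seen so far and b[:j]
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string costs i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or match
        previous = current
    return previous[-1]


def similarity_percent(a: str, b: str) -> float:
    """Assumed normalization: 100% for identical strings, 0% when every position differs."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * (1 - levenshtein(a, b) / longest)


if __name__ == "__main__":
    print(similarity_percent("Car", "Car"))
    print(similarity_percent("Albatross", "Car"))
```

With this normalization "Car" against "Car" gives 100, while "Albatross" against "Car" gives roughly 22 rather than 5, because the shared "a" and "r" still count. Other normalizations (for example dividing by the sum of the lengths) would give different percentages.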
 
It all depends on how you really want to rate the similarity.

In the example where you return 5, I don't see any similarity at all.
 
MySQL has got a similar command: soundex

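// $ax holds the search term; escape it before interpolating it into the query.
$sql = "
SELECT title FROM entries
WHERE
(
    title LIKE '$ax%'                     -- titles starting with the term
    OR soundex(title) LIKE soundex('$ax') -- titles that sound like the term
)
LIMIT 20
";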
 
nime said:
MySQL has got a similar command: soundex

But soundex is not a string comparison method. It compares words by
their sound (hence the name). It doesn't give a "score" of how well two
words match.
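
To illustrate the point, here is a rough sketch of the classic American Soundex coding in Python. It is a simplified illustration and not MySQL's exact `SOUNDEX` implementation (which, among other differences, can return codes longer than four characters); the names `CODES`, `soundex` and `sounds_alike` are assumptions made for this example.

```python
# Consonant groups used by American Soundex; vowels, H, W and Y get no digit.
CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for ch in letters:
        CODES[ch] = digit


def soundex(word: str) -> str:
    """Return a four-character code: first letter plus up to three digits."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    last_code = CODES.get(letters[0], "")
    for ch in letters[1:]:
        code = CODES.get(ch, "")
        if code and code != last_code:
            result += code
        if ch not in "HW":  # H and W do not separate letters with the same code
            last_code = code
    return (result + "000")[:4]  # pad with zeros, keep four characters


def sounds_alike(a: str, b: str) -> bool:
    """The only comparison Soundex offers: the codes match or they do not."""
    return soundex(a) == soundex(b)


if __name__ == "__main__":
    print(soundex("Robert"), soundex("Rupert"), sounds_alike("Robert", "Rupert"))
    print(soundex("Car"), soundex("Albatross"), sounds_alike("Car", "Albatross"))
```

"Robert" and "Rupert" get the same code even though their spellings differ, and "Car" and "Albatross" simply get different codes. In both cases the result is a yes/no answer, not a percentage.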
 
I think there is a dynamic programming algorithm to determine this; it's
called the length between the two strings. Search for it and I am sure you
will find some code.
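
The post does not name the algorithm precisely. As one example of a dynamic programming measure between two strings, here is a sketch based on the length of the longest common subsequence (LCS). This is an assumption about the kind of algorithm meant, not necessarily the one the poster had in mind; the function names and the percentage formula are illustrative only.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    # table[i][j] = LCS length of the prefixes a[:i] and b[:j]
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, start=1):
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                # extend the best subsequence of the two shorter prefixes
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                # drop the last character of one string or the other
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[len(a)][len(b)]


def lcs_similarity_percent(a: str, b: str) -> float:
    """Assumed scoring: shared characters as a share of all characters."""
    total = len(a) + len(b)
    if total == 0:
        return 100.0  # two empty strings are treated as identical
    return 100.0 * 2 * lcs_length(a, b) / total


if __name__ == "__main__":
    print(lcs_similarity_percent("Car", "Car"))
    print(lcs_similarity_percent("Albatross", "Car"))
```

Each table cell depends only on its three neighbours above and to the left, which is what makes this a dynamic programming algorithm. For strings under 50 characters the table has at most about 2,500 cells.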


---------
- G Himangi, Sky Software http://www.ssware.com
 