"Fuzzy" Search

  • Thread starter Thread starter Bruce Rodtnick
  • Start date Start date
B

Bruce Rodtnick

I was looking at another type database and this database had what it
called a Fuzzy Search. In other words if you were searching a memo
field and put in the word "Mother" it could come up with "Other", or
"Brother", etc., things that rhymed with "Mother".

Can Access do something like this? I'd like to incorporate that into my
database.

Bruce Rodtnick
 
You are probably refering to soundex.

I have used used soundex on small ms-acces sfiles (small tables with only
35,000 reords). for these small tables, it works well. I think hwen tbles
get largeer (say above 150,000 reocrs), then soundex is not good, since you
get too many mantaches for a give search.

The sourndex code is not built into ms-access. So, for reasons of speed, you
have to use the after update event of the lastname (and possbility first
name) field to write out the soundex code to another field. That way, you
can use high-speed indexing to search for a soundex match. Unlike foxpro, we
can't index on a expression, so we have to actauly store the value. The
soundex code I used follows:


--
Albert D. Kallal (MVP)
Edmonton, Alberta Canada
(e-mail address removed)
http://www.attcanada.net/~kallal.msn

Function strSoundex(strSource As Variant) As Variant

'
' Produces a code based on the "Soundex" method
'
'
' Parms:
' strSource Passed string
'
'
' Written by Albert D. Kallal
'
'
' See page 392 of Knuths' book 'Sorting and Searching', Volume 3 of 'The
' Art of Computer Programming", Addison/Wesley publisher.
'
' Method used:
' 1. Change all lowercase to uppercase
' 2. Retain first letter of input string (must be alpha)
' 3. Ignore the letters A, E, H, I, O, U, W, Y, and any other
' non-alphabetic characters:
' 4. Subsitiute the following
' 1 = B F P V
' 2 = C G J K Q S X Z
' 3 = D T
' 4 = L
' 5 = M N
' 6 = R
' 5. Ignore identical letters next to each other
' 6. Add trailing zeros if the length is less than "max.soundlen"
' characters (in this example it is 4 numeric + 1 alpha = 5).
' Stop when the string reaches max.soundlen (ie:truncate the rest)
'
'START:--------------------------------------------------------------------
'
If IsNull(strSource) = True Then
strSoundex = Null
Exit Function
End If

Const maxsoundlen As Integer = 4 ' Max length of resulting soundex
code
Dim TransTable As String
Dim Offset As Integer ' Offset for easy TABLE translates
Dim SourceLen As Integer
Dim charptr As Integer
Dim testchar As String
Dim lastchar As String
Dim intLookup As Integer
Dim strDigit As String

' Conversion table
' [ABCDEFGHIJKLMNOPQRSTUVWXYZ]
TransTable = "01230120022455012623010202"
Offset = Asc("A") - 1 ' Offset for easy TABLE translates
'
' note: for "AEHIOUWY" are ignored by translating to zero
'
strSource = UCase(strSource) ' convert string to uppercase

SourceLen = Len(strSource) ' find/set string length to process

strSoundex = Left$(strSource, 1) ' Get/set first character
lastchar = strSoundex
charptr = 2 ' We skiped the first char above

Do While (charptr <= SourceLen) And (Len(strSoundex) < maxsoundlen)
'
testchar = Mid$(strSource, charptr, 1) ' get 1 char to test
'
' if different than last character, then process
'
If testchar <> lastchar Then
'
intLookup = Asc(testchar) - Offset
If (intLookup > 0) And (intLookup <= 26) Then
strDigit = Mid$(TransTable, intLookup, 1) ' table translate to
soundex
If strDigit <> "0" Then
strSoundex = strSoundex & strDigit
lastchar = testchar
End If
End If
End If
'
charptr = charptr + 1 ' move on to next character
Loop

'strSoundex = Left(strSoundex & "00000", maxsoundlen) ' pad with
trailing zeros (5 CHARS)

End Function
 
Soundex? Not for this.


Albert D. Kallal said:
You are probably refering to soundex.

I have used used soundex on small ms-acces sfiles (small tables with only
35,000 reords). for these small tables, it works well. I think hwen tbles
get largeer (say above 150,000 reocrs), then soundex is not good, since you
get too many mantaches for a give search.

The sourndex code is not built into ms-access. So, for reasons of speed, you
have to use the after update event of the lastname (and possbility first
name) field to write out the soundex code to another field. That way, you
can use high-speed indexing to search for a soundex match. Unlike foxpro, we
can't index on a expression, so we have to actauly store the value. The
soundex code I used follows:


--
Albert D. Kallal (MVP)
Edmonton, Alberta Canada
(e-mail address removed)
http://www.attcanada.net/~kallal.msn

Function strSoundex(strSource As Variant) As Variant

'
' Produces a code based on the "Soundex" method
'
'
' Parms:
' strSource Passed string
'
'
' Written by Albert D. Kallal
'
'
' See page 392 of Knuths' book 'Sorting and Searching', Volume 3 of 'The
' Art of Computer Programming", Addison/Wesley publisher.
'
' Method used:
' 1. Change all lowercase to uppercase
' 2. Retain first letter of input string (must be alpha)
' 3. Ignore the letters A, E, H, I, O, U, W, Y, and any other
' non-alphabetic characters:
' 4. Subsitiute the following
' 1 = B F P V
' 2 = C G J K Q S X Z
' 3 = D T
' 4 = L
' 5 = M N
' 6 = R
' 5. Ignore identical letters next to each other
' 6. Add trailing zeros if the length is less than "max.soundlen"
' characters (in this example it is 4 numeric + 1 alpha = 5).
' Stop when the string reaches max.soundlen (ie:truncate the rest)
'
'START:--------------------------------------------------------------------
'
If IsNull(strSource) = True Then
strSoundex = Null
Exit Function
End If

Const maxsoundlen As Integer = 4 ' Max length of resulting soundex
code
Dim TransTable As String
Dim Offset As Integer ' Offset for easy TABLE translates
Dim SourceLen As Integer
Dim charptr As Integer
Dim testchar As String
Dim lastchar As String
Dim intLookup As Integer
Dim strDigit As String

' Conversion table
' [ABCDEFGHIJKLMNOPQRSTUVWXYZ]
TransTable = "01230120022455012623010202"
Offset = Asc("A") - 1 ' Offset for easy TABLE translates
'
' note: for "AEHIOUWY" are ignored by translating to zero
'
strSource = UCase(strSource) ' convert string to uppercase

SourceLen = Len(strSource) ' find/set string length to process

strSoundex = Left$(strSource, 1) ' Get/set first character
lastchar = strSoundex
charptr = 2 ' We skiped the first char above

Do While (charptr <= SourceLen) And (Len(strSoundex) < maxsoundlen)
'
testchar = Mid$(strSource, charptr, 1) ' get 1 char to test
'
' if different than last character, then process
'
If testchar <> lastchar Then
'
intLookup = Asc(testchar) - Offset
If (intLookup > 0) And (intLookup <= 26) Then
strDigit = Mid$(TransTable, intLookup, 1) ' table translate to
soundex
If strDigit <> "0" Then
strSoundex = strSoundex & strDigit
lastchar = testchar
End If
End If
End If
'
charptr = charptr + 1 ' move on to next character
Loop

'strSoundex = Left(strSoundex & "00000", maxsoundlen) ' pad with
trailing zeros (5 CHARS)

End Function
 
Back
Top