MD5 Hash with single quote = grief in dao.findfirst

  • Thread starter Thread starter Stephen Rasey
  • Start date Start date
S

Stephen Rasey

(this is a cross-posting from microsoft.public.excel.programming with some
editing)

I know when you need to create a query string and the data contains a single
quote, you must double the quote as an escape sequence. I am doing that,
but my dao recordset.FindFirst gives a NoMatch. Am I blundering?

The situation. I am using a wwGetMD5Hash2 function See:
http://excelsig.org/VBA/wwHash.htm#wwGetMD5Hash2
I use this to track whole sets of assumptions in my portfolio models.
Hundreds of assumption cells combined into one 16 character Hash. That
Hash I write to a database table called H70. The HASH field is a No
Duplicates index. Each record has an IDHash, a Long integer, random
autonumber.

When I run the model, I check the Hash result against the H70 table in the
database using a FindFirst method. If the Hash cannot be found, I create
a new record. If it can
be found, I retrieve the IDHash long integer value.

It works 95% of the time. But if the Hash string contains a single quote,
my DAO rsH70.FindFirst fails to find the Hash string. I double the single
quote. I do not get a SQL error. But it will not find the record.

Here is an example. The 14th character in the hash stored in a row of the
H70
table contains a single quote.
Field:HASH in Table H70 contains
~8æQzzûù¦ÿ5'?;

The Hash Value in Spreadsheet read into variable strHash (as seen in the
Immediate window)
~8æQzzûù¦ÿ5'?;

I have a small function to replace a single quote with two single quotes as
I build the criteria SQL string.
Dim sql as string
sql is set to the value
sql = "HASH = '" & wwQuoteFix(strHash) & "'"

Function wwQuoteFix(str1 As String) As String
wwQuoteFix = Replace(str1, "'", "''")
End Function


In the H70Check function below, the value of sql in the find first statement
is:
HASH = '~8æQzzûù¦ÿ5''?;'
so I think I have correctly doubled the single quote within the string.

yet the DAO recordset.NoMatch is true

Function H70Check(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched table
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With rsH70ProspSched 'a Public DAO.Recordset
.FindFirst sql
If .NoMatch Then
H70Check = 0
Else
H70Check = !IDHashProspSched
End If
End With
End Function

When I attempt to write the Hash as a new record, I error because the Access
Index already has the Hash of that value.

Maybe the non-printing characters are part of the problem. Other Hashes
with non-printing characters work if they do not have single quotes.

I am prepared to be very embarrased at some dumb mistake I made. I'll
accept that. I am out of ideas.
I hope someone sees the problem.

I am on the lookout for a Hash function that provided a more restricted set
of characters. the wwGetMD5Hash2 function will return asc(0), asc(10),
asc(39) and some other characters that might be problematic in Excel and
Access.

Stephen Rasey
WiserWays, LLC
Houston
http://excelsig.org
 
You don't show the code where you write the value to the database.
Check to see that you are not truncating at that stage.
 
Here is the writing portion. I have been getting the H70Write Error
message.


Set rsH70ProspSched = dbTS.OpenRecordset("select * from H70HashProspSched")

Function H70Write(strHash As String) As Long
On Error GoTo e1
With rsH70ProspSched
.AddNew
!Hash = strHash
!DateAdded = Now()
H70Write = !IDHashProspSched
.Update
End With
Exit Function

e1:
wwErr1 "H70Write", "Error writing Hash:" & strHash
H70Write = 0
End Function

I showed in the original message what was in the database. The only thing
I haven't done is show the Asc() values for each of the characters.
Here is another example. Text to search for and Asc() numbers.
-s3ë=­'[+@û;ÒëN 151, 115, 2, 51, 235, 61, 173, 39, 91, 43, 64, 251,
59, 210, 235, 78


This is what is in the database
-s3ë=­'[+@û;ÒëN

The problem is always with a hash with an Asc()=39

Your question about truncating made me think about the size of the database
field. It is a Text 16. Just the size of the Hash. But maybe not big
enough to handle a doubled single quote as an escape sequence. It would
not seem logical that it would apply the 16 character truncation until after
the doubled single quote was resolved, but.....

I'll try it again with an increased field size.

Stephen Rasey
WiserWays, LLC
Houston
 
I set the Hash Field size to Text 20.

From the immediate window.
?strhash
-s3ë=­'[+@û;ÒëN
?sql
HASH = '-s3ë=­''[+@û;ÒëN'

What is in the database
-s3ë=­'[+@û;ÒëN

The Asc() codes for the strHash.
151, 115, 2, 51, 235, 61, 173, 39, 91, 43, 64, 251, 59, 210, 235, 78,


With the Field size at Text 20, it still errors. So that wasn't it.

I'm dumbfounded.

Stephen Rasey
 
I am still dumbfounded, but happier.

I tried using an ADO recordset to do the check the Hash. It worked.
One of the possibilities is there is a bug in the DAO 3.6 library.

I am using libraries VBA, Excel 11.0, Office 11.0, Forms 2.0, Access 11.0,
DAO 3.6, ADO Ext. 2.7,
and ActiveX Data Objects 2.7 defined in that order.

I am willing to close the thread, but I'll be glad to discuss anything I
might have done wrong .
Stephen Rasey
WiserWays, LLC
Houston
http://excelsig.org

Sub TestH70Ado()
H02_Define_H_Recordsets
Dim strH As String
Dim ID As Long

ID = H70Check("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114
works
ID = H70CheckADO("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70Check("-s3ë=­'[+@û;ÒëN") 'Answer is supposed to be
1106461086, returns 0
ID = H70CheckADO("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
End Sub

Sub H02_Define_H_Recordsets
Set adoConn1 = Nothing
Set adorsH70 = Nothing
Set adoConn1 = New ADODB.Connection
Set adorsH70 = New ADODB.Recordset
adoConn1.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _
& "Data Source=" & gstrDBFullPath

adorsH70.CursorLocation = adUseServer
adorsH70.Open "select * from H70HashProspSched", adoConn1,
adOpenDynamic, adLockReadOnly
end sub


Function H70CheckADO(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With adorsH70
.MoveFirst 'With ADO, to do a find first, you must do a
MoveFirst, then Find.
.Find sql
If .EOF Then
H70CheckADO = 0
Else
H70CheckADO = !IDHashProspSched
End If
End With
End Function

Function H70Check(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
'Bug of unknown cause: if strHash has a single quote, it fails to find
the record.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
'sql = "HASH = '" & strHash & "'" 'Rasey 040804 v24j
With rsH70ProspSched
.MoveFirst 'Tried a .MoveFirst to see if that
would fix the DAO problem. It didn't.
.FindFirst sql
If .NoMatch Then
H70Check = 0
Else
H70Check = !IDHashProspSched
End If
End With

End Function

Function wwQuoteFix(str1 As String) As String
wwQuoteFix = Replace(str1, "'", "''")
End Function
 
I tried using an ADO recordset to do the check the Hash. It worked.
One of the possibilities is there is a bug in the DAO 3.6 library.

Normally, Jet is case-insensitive in string comparisons, and therefore DAO
comparisons will be too. I guess that u with little dots and other accents
all equate to the same character.

Other databases can be forced to carry out true binary comparisons, and
this is probably why ADO fared better. There was a post here recently about
collating orders in text searches in SQL Server, with a MSDN URL but I
can't find it just now.

Bottom line: it's not a bug. It's working according to design and is
accurately documented. The web page you refer to does not make any claim of
this method working with Jet-DAO and I doubt that anyone else would either.

Hope that helps


Tim F
 
RE: case-insensitive. I had not considered that. It does not fit the
data, however.
If the DAO.FindFirst was case insensitive, it would have found an incorrect
wrong record, not failed to find one at all.
It was only the strings that contained a single quote that caused me
trouble.

RE: "the web page". What web page did I reference? Not that it matters
much. If I had a nickle for every important note missing from a help file
or web page... I might not be rich, but I would have another steady source
of income.

Stephen Rasey
WiserWays, LLC
Houston
http://excelsig.org
 
RE: case-insensitive. I had not considered that. It does not
fit the data, however.
If the DAO.FindFirst was case insensitive, it would have found an
incorrect wrong record, not failed to find one at all.

Okay:- I obviously misunderstood the problem (not a rare occurrence!).
It was only the strings that contained a single quote that caused me
trouble.

Now there was definitely an article recently that cited a couple of MSDN
articles about searching in SQL Server and jet, where special characters
are ignored such as "'" and "-". This is apparently to improve handling of
names like O'Connor and Handley-Page. Sorry I cannot find the reference to
it now... :-(
RE: "the web page". What web page did I reference? Not that it
matters much.

Your original post:-
The situation. I am using a wwGetMD5Hash2 function See:
http://excelsig.org/VBA/wwHash.htm#wwGetMD5Hash2
I use this to track whole sets of assumptions in my portfolio models.

All the best


Tim F
 
Tim Ferguson said:
Your original post:-

Ah. That is MY web page. It is a function that creates a hash and
returns a string. It has nothing to do with and database reads or writes.

Stephen Rasey
 
FWIW, I don't have any problem with "single quotes' in
DAO.findfirst, in Access 2000 or A97.

Two thoughts:
1) You haven't set Option Compare Binary at the top of the code module?

2) If you check using AscW() instead of Asc() do you see anything different?

(david)


Stephen Rasey said:
I set the Hash Field size to Text 20.

From the immediate window.
?strhash
-s3ë=­'[+@û;ÒëN
?sql
HASH = '-s3ë=­''[+@û;ÒëN'

What is in the database
-s3ë=­'[+@û;ÒëN

The Asc() codes for the strHash.
151, 115, 2, 51, 235, 61, 173, 39, 91, 43, 64, 251, 59, 210, 235, 78,


With the Field size at Text 20, it still errors. So that wasn't it.

I'm dumbfounded.

Stephen Rasey

david epsom dot com dot au said:
You don't show the code where you write the value to the database.
Check to see that you are not truncating at that stage.




of
the quotes
as restricted
set
 
david epsom dot com dot au said:
1) You haven't set Option Compare Binary at the top of the code module?

2) If you check using AscW() instead of Asc() do you see anything different?

No, I don't have Option Compare Binary. That might have been my blunder.
I'll check it out. Thanks

Stephen Rasey
WiserWays
Houston
http://excelsig.org
 
No
Option Compare Binary
was not the solution. Binary is the default anyway. It tried the Sub
TestH70Ado using an Option Compare Text and it did not help.

I could not use Option Compare Database. Odd, huh? 'Compare Database' is
only available from Access according to the help.

I am doing all this from Excel VBA. I have the Access 11.0 library
referenced, but it still will not take Option Compare Database. I doubt my
problem is caused by running DAO.recordset from excel. If it is, then it
certainly is a bug.

I have the ADO working. I have always prefered DAO (simpler setups), but I
guess I'll have to prefer ADO of DAO if I can't trust DAO.

Stephen Rasey
WiserWays
Houston
http://excelsig.org

Sub TestH70Ado()
H02_Define_H_Recordsets
Dim strH As String
Dim ID As Long

ID = H70Check("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70CheckADO("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70Check("-s3ë=­'[+@û;ÒëN") 'Answer is supposed to be
1106461086, returns 0
ID = H70CheckADO("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
End Sub

Function H70Check(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With rsH70ProspSched
.Requery
.MoveFirst 'rasey 040813 test
.FindFirst sql
If .NoMatch Then
H70Check = 0
Else
H70Check = !IDHashProspSched
End If
End With
End Function

Function H70CheckADO(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With adorsH70
.Requery
.MoveFirst 'With ADO, to do a find first, you must do a
MoveFirst, then Find.
.Find sql
If .EOF Then
H70CheckADO = 0
Else
H70CheckADO = !IDHashProspSched
End If
End With

End Function

Function wwQuoteFix(str1 As String) As String
wwQuoteFix = Replace(str1, "'", "''")

End Function
 
I am doing all this from Excel VBA.

Interesting... Yes, option compare database is an Access
option - it picks up the database sort order. And I was
assuming Access - which defaults VBA to Database Order.

I THINK that if they are different, DAO uses the VBA sort
order rather than the Database sort order. And I would
expect ADO to go the other way. But I could be wrong.

If your hash routine assumes that 'a' and 'A' are different,
you may want to check your ADO code to see that you are
getting a Binary comparison.

(david)

Stephen Rasey said:
No
Option Compare Binary
was not the solution. Binary is the default anyway. It tried the Sub
TestH70Ado using an Option Compare Text and it did not help.

I could not use Option Compare Database. Odd, huh? 'Compare Database' is
only available from Access according to the help.

I am doing all this from Excel VBA. I have the Access 11.0 library
referenced, but it still will not take Option Compare Database. I doubt my
problem is caused by running DAO.recordset from excel. If it is, then it
certainly is a bug.

I have the ADO working. I have always prefered DAO (simpler setups), but I
guess I'll have to prefer ADO of DAO if I can't trust DAO.

Stephen Rasey
WiserWays
Houston
http://excelsig.org

Sub TestH70Ado()
H02_Define_H_Recordsets
Dim strH As String
Dim ID As Long

ID = H70Check("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70CheckADO("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70Check("-s3ë=­'[+@û;ÒëN") 'Answer is supposed to be
1106461086, returns 0
ID = H70CheckADO("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
End Sub

Function H70Check(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With rsH70ProspSched
.Requery
.MoveFirst 'rasey 040813 test
.FindFirst sql
If .NoMatch Then
H70Check = 0
Else
H70Check = !IDHashProspSched
End If
End With
End Function

Function H70CheckADO(strHash As String) As Long
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
Dim sql As String
sql = "HASH = '" & wwQuoteFix(strHash) & "'" 'Rasey 040804 v24j
With adorsH70
.Requery
.MoveFirst 'With ADO, to do a find first, you must do a
MoveFirst, then Find.
.Find sql
If .EOF Then
H70CheckADO = 0
Else
H70CheckADO = !IDHashProspSched
End If
End With

End Function

Function wwQuoteFix(str1 As String) As String
wwQuoteFix = Replace(str1, "'", "''")

End Function



david epsom dot com dot au said:
FWIW, I don't have any problem with "single quotes' in
DAO.findfirst, in Access 2000 or A97.

Two thoughts:
1) You haven't set Option Compare Binary at the top of the code module?

2) If you check using AscW() instead of Asc() do you see anything different?

(david)
 
Yuck! my checks are case insensitive on both DAO and ADO!

'Option compare binary test
ID = H70CheckADO("aAaAaAaBcDeFG") 'Answer is -2030670587
ID = H70CheckADO("aAaAaAABcDeFG") 'answer 0. Returns -2030670587
ADO error
ID = H70Check("aAaAaAABcDeFG") 'answer 0.
Returns -2030670587 DAO error
ID = H70CheckADO("AAAAAABCDEFG") 'answer is 1092326831
ID = H70CheckADO("AAaAAABCDEFG") 'answer is 0 Returns 1092326831
ADO error
ID = H70Check("AAaAAABCDEFG") 'answer is 0 Returns
1092326831 DAO error
ID = H70CheckADO("AAbAAABCDEFG") 'answer is 0 returns 0.

With the full 16 character hash there are 256^16 = 3*10^38 values
With a case insensitive 16 character hash there are 128^16 = 5*10^33
values.
So even if I leave it case insensitive, the odds of a collision on a billion
record table are less than 1 in 10^24.

To recap, I am doing this in Excel VBA.
I used
Option Explicit
Option Base 1
Option Compare Binary

Set adoConn1 = New ADODB.Connection
Set adorsH70 = New ADODB.Recordset
adoConn1.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _
& "Data Source=" & gstrDBFullPath
adorsH70.CursorLocation = adUseServer
adorsH70.Open "select * from H70HashProspSched", adoConn1, adOpenDynamic,
adLockOptimistic

Public dbTS As DAO.Database
Set rsH70ProspSched = dbTS.OpenRecordset("select * from
H70HashProspSched")


I've have spent far too much time on this. But it is fixed!!!
<--------------------------

Function H70CheckBin(strHash As String) As Long
'A Case Sensitive SQL match for the HASH.
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
'technique from http://support.microsoft.com/?kbid=209674
Dim sql As String
Dim rsh70 As DAO.Recordset
Set rsh70 = dbTS.OpenRecordset("Select * from H70HashProspSched as H
WHERE (0 = StrComp(H.HASH, '" & wwQuoteFix(strHash) & "',0))")
With rsh70
If .EOF Then
H70CheckBin = 0
Else
H70CheckBin = !IDHashProspSched
End If
End With
Set rsh70 = Nothing
End Function

ID = H70CheckBin("aAaAaAaBcDeFG") 'Answer is -2030670587 correct
ID = H70CheckBin("aAaAaAABcDeFG") 'Answer is 0 correct!
ID = H70CheckBin("AAAAAABCDEFG") 'answer is 1092326831 correct
ID = H70CheckBin("AAaAAABCDEFG") 'answer is 0
correct.
ID = H70CheckBin("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70CheckBin("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
It even works on the single quote.

I still say there is a problem with a single quote in a
dao.recordset.findfirst. But the solution to fix the case insensitivity
also fixes the single quote problem.

I'm done. Back to product.
Thanks, David.

Stephen Rasey
WiserWays
Houston
http://excelsig.org


Other links in my search.
http://www.4guysfromrolla.com/webtech/sqlguru/q022400-1.shtml
Access How To Articles.
http://www.kayodeok.btinternet.co.uk/favorites/kbofficeaccesshowto.htm
 
I think what you just said was that, given the odds, you
won't worry about hash collisions on inserts. But given
the odds, you could just as easily UCASE the hash value???
If you are using a Text field, I think that UCASEing the
hash value would be a much clearer/cleaner idea ???? Or
using a Binary field if you want to retain case ????

(david)

Stephen Rasey said:
Yuck! my checks are case insensitive on both DAO and ADO!

'Option compare binary test
ID = H70CheckADO("aAaAaAaBcDeFG") 'Answer is -2030670587
ID = H70CheckADO("aAaAaAABcDeFG") 'answer 0. Returns -2030670587
ADO error
ID = H70Check("aAaAaAABcDeFG") 'answer 0.
Returns -2030670587 DAO error
ID = H70CheckADO("AAAAAABCDEFG") 'answer is 1092326831
ID = H70CheckADO("AAaAAABCDEFG") 'answer is 0 Returns 1092326831
ADO error
ID = H70Check("AAaAAABCDEFG") 'answer is 0 Returns
1092326831 DAO error
ID = H70CheckADO("AAbAAABCDEFG") 'answer is 0 returns 0.

With the full 16 character hash there are 256^16 = 3*10^38 values
With a case insensitive 16 character hash there are 128^16 = 5*10^33
values.
So even if I leave it case insensitive, the odds of a collision on a billion
record table are less than 1 in 10^24.

To recap, I am doing this in Excel VBA.
I used
Option Explicit
Option Base 1
Option Compare Binary

Set adoConn1 = New ADODB.Connection
Set adorsH70 = New ADODB.Recordset
adoConn1.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _
& "Data Source=" & gstrDBFullPath
adorsH70.CursorLocation = adUseServer
adorsH70.Open "select * from H70HashProspSched", adoConn1, adOpenDynamic,
adLockOptimistic

Public dbTS As DAO.Database
Set rsH70ProspSched = dbTS.OpenRecordset("select * from
H70HashProspSched")


I've have spent far too much time on this. But it is fixed!!!
<--------------------------

Function H70CheckBin(strHash As String) As Long
'A Case Sensitive SQL match for the HASH.
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
'technique from http://support.microsoft.com/?kbid=209674
Dim sql As String
Dim rsh70 As DAO.Recordset
Set rsh70 = dbTS.OpenRecordset("Select * from H70HashProspSched as H
WHERE (0 = StrComp(H.HASH, '" & wwQuoteFix(strHash) & "',0))")
With rsh70
If .EOF Then
H70CheckBin = 0
Else
H70CheckBin = !IDHashProspSched
End If
End With
Set rsh70 = Nothing
End Function

ID = H70CheckBin("aAaAaAaBcDeFG") 'Answer is -2030670587 correct
ID = H70CheckBin("aAaAaAABcDeFG") 'Answer is 0 correct!
ID = H70CheckBin("AAAAAABCDEFG") 'answer is 1092326831 correct
ID = H70CheckBin("AAaAAABCDEFG") 'answer is 0
correct.
ID = H70CheckBin("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70CheckBin("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
It even works on the single quote.

I still say there is a problem with a single quote in a
dao.recordset.findfirst. But the solution to fix the case insensitivity
also fixes the single quote problem.

I'm done. Back to product.
Thanks, David.

Stephen Rasey
WiserWays
Houston
http://excelsig.org


Other links in my search.
http://www.4guysfromrolla.com/webtech/sqlguru/q022400-1.shtml
Access How To Articles.
http://www.kayodeok.btinternet.co.uk/favorites/kbofficeaccesshowto.htm




david epsom dot com dot au said:
Interesting... Yes, option compare database is an Access
option - it picks up the database sort order. And I was
assuming Access - which defaults VBA to Database Order.

I THINK that if they are different, DAO uses the VBA sort
order rather than the Database sort order. And I would
expect ADO to go the other way. But I could be wrong.

If your hash routine assumes that 'a' and 'A' are different,
you may want to check your ADO code to see that you are
getting a Binary comparison.

(david)
 
The database looks like it correctly checks case on insert.

The MD5 Hash is a 16 character string containing any of 256 characters. So
it is a 16 digit base 256 "number"

The only problem would be if I generated a HashB that differed from an
already stored HashA in different case. The H70CheckADO function would
INCORRECTLY report a hit and I would use an incorrect record insted of store
a new one. The odd of that happening once in a billion record table are
in the magnitude of 1 in 10^24.

Let's compare that with using a Long Integer Autonumber Random primary key.
2^32 is 4.3*10^9. The chances of a random collision in a billion record
table is not 1 in 4, but it is virtually guaranteed.

It is the old "same birthday in a room ful of people" problem. The
birthday paradox statement is:
You have N number of people in a room randomly selected. At what is the
lowest value N is there a >50% chance there will be two or more people that
share a birthday. I think N is 29. [edit: nope it is 23]
http://www.people.virginia.edu/~rjh9u/birthday.html
http://mathforum.org/dr.math/faq/faq.birthdayprob.html)
http://science.howstuffworks.com/question261.htm
http://science.howstuffworks.com/fr...p://burtleburtle.net/bob/crypto/exchange.html

With the long integer case, the billionth record has a 1:4 of picking a PK
allready used. The record before that is also 1:4. In fact the last
million records also has a 1:4.

Even a million record table has a VERY High probability of a collision. I
would guess there is a 50% probability of a collision at SQRT(2^32) = about
64,000 records.

So I literally did not need to loose sleep over the case insensitivity of
the ADO recordset Find method.

I will check the performance of the H70CheckBin routine against the
H70CheckADO. If there is a big penalty on performance for case sensitive,
then I'll switch back to the ADO method.

I've learned something along the way and I have that d***ed single quote
problem solved.

Stephen Rasey
WiserWays
Houston
http://excelsig.org
 
I forgot to add in the previous reply, that with a HASH of 256 different
characters, I do not know what a UCASE() will do to some of the less common
characters.

I already do a replace for Chr(0) and make it an "A". Chr(0) ends the
string display in Excel. Maybe UCASE will be cheap means of keeping things
fast. I'll keep it in mind, but I've got to make progress on the product.

Stephen Rasey


david epsom dot com dot au said:
I think what you just said was that, given the odds, you
won't worry about hash collisions on inserts. But given
the odds, you could just as easily UCASE the hash value???
If you are using a Text field, I think that UCASEing the
hash value would be a much clearer/cleaner idea ???? Or
using a Binary field if you want to retain case ????

(david)

Stephen Rasey said:
Yuck! my checks are case insensitive on both DAO and ADO!

'Option compare binary test
ID = H70CheckADO("aAaAaAaBcDeFG") 'Answer is -2030670587
ID = H70CheckADO("aAaAaAABcDeFG") 'answer 0. Returns -2030670587
ADO error
ID = H70Check("aAaAaAABcDeFG") 'answer 0.
Returns -2030670587 DAO error
ID = H70CheckADO("AAAAAABCDEFG") 'answer is 1092326831
ID = H70CheckADO("AAaAAABCDEFG") 'answer is 0 Returns 1092326831
ADO error
ID = H70Check("AAaAAABCDEFG") 'answer is 0 Returns
1092326831 DAO error
ID = H70CheckADO("AAbAAABCDEFG") 'answer is 0 returns 0.

With the full 16 character hash there are 256^16 = 3*10^38 values
With a case insensitive 16 character hash there are 128^16 = 5*10^33
values.
So even if I leave it case insensitive, the odds of a collision on a billion
record table are less than 1 in 10^24.

To recap, I am doing this in Excel VBA.
I used
Option Explicit
Option Base 1
Option Compare Binary

Set adoConn1 = New ADODB.Connection
Set adorsH70 = New ADODB.Recordset
adoConn1.Open "Provider=Microsoft.Jet.OLEDB.4.0;" _
& "Data Source=" & gstrDBFullPath
adorsH70.CursorLocation = adUseServer
adorsH70.Open "select * from H70HashProspSched", adoConn1, adOpenDynamic,
adLockOptimistic

Public dbTS As DAO.Database
Set rsH70ProspSched = dbTS.OpenRecordset("select * from
H70HashProspSched")


I've have spent far too much time on this. But it is fixed!!!
<--------------------------

Function H70CheckBin(strHash As String) As Long
'A Case Sensitive SQL match for the HASH.
'return the IDHashProspSched if the strHash is found in
H70HashProspSched
'return 0 if not found. Protect against 0 being a legitimate ID.
'technique from http://support.microsoft.com/?kbid=209674
Dim sql As String
Dim rsh70 As DAO.Recordset
Set rsh70 = dbTS.OpenRecordset("Select * from H70HashProspSched as H
WHERE (0 = StrComp(H.HASH, '" & wwQuoteFix(strHash) & "',0))")
With rsh70
If .EOF Then
H70CheckBin = 0
Else
H70CheckBin = !IDHashProspSched
End If
End With
Set rsh70 = Nothing
End Function

ID = H70CheckBin("aAaAaAaBcDeFG") 'Answer is -2030670587 correct
ID = H70CheckBin("aAaAaAABcDeFG") 'Answer is 0 correct!
ID = H70CheckBin("AAAAAABCDEFG") 'answer is 1092326831 correct
ID = H70CheckBin("AAaAAABCDEFG") 'answer is 0
correct.
ID = H70CheckBin("$¥¨3¶¯ô¼.Ë©£F³Ñ") 'Answer is -1915054114 works
ID = H70CheckBin("-s3ë=­'[+@û;ÒëN") 'Answer is 1106461086 works.
It even works on the single quote.

I still say there is a problem with a single quote in a
dao.recordset.findfirst. But the solution to fix the case insensitivity
also fixes the single quote problem.

I'm done. Back to product.
Thanks, David.

Stephen Rasey
WiserWays
Houston
http://excelsig.org


Other links in my search.
http://www.4guysfromrolla.com/webtech/sqlguru/q022400-1.shtml
Access How To Articles.
http://www.kayodeok.btinternet.co.uk/favorites/kbofficeaccesshowto.htm




david epsom dot com dot au said:
I am doing all this from Excel VBA.

Interesting... Yes, option compare database is an Access
option - it picks up the database sort order. And I was
assuming Access - which defaults VBA to Database Order.

I THINK that if they are different, DAO uses the VBA sort
order rather than the Database sort order. And I would
expect ADO to go the other way. But I could be wrong.

If your hash routine assumes that 'a' and 'A' are different,
you may want to check your ADO code to see that you are
getting a Binary comparison.

(david)
 
PMFJI, but why not store your 16 base-256 digits as a string of 32
base-16 digits (0..F)?
 
What function would I use to convert the string? is there a CHex() in VBA?
I do not think so. The Hex() functions require a numeric input, not a
string.

On a practical matter, The HASH is a unique key key field. A 16 character
primary key is smaller than a 32 character primary key. So it would
increase the size of some of my tables.

On the other hand, I was originally looking for a Hash function that
returned a base 10, base 36 or something less than 256. Stephen Bullen had
an MD5 Hash on the newsgroup in February and I used it after modifying it to
fix a bug and change its handeling of empty cells.

Stephen Rasey
WiserWays
Houston
http://excelsig.org
 
Back
Top