Checksum, CRC or something better?

  • Thread starter: Bob

Bob

Hi there,

I am working on an application to be used in our local Forensics
department...

Among other things the app will connect to a digital camera and download the
images to the hard drive. For various reasons I need to be able to say with
certainty that the image on the hard drive is exactly what was on the
camera... any ideas on best way to achieve this...?

I could do a checksum of each image and compare this but not sure how to
implement in code or if there is a better way. Merely checking the file size
matches is not sufficient to stand up in Court.

Any suggestions appreciated, especially if sample code included (can be
classic VB if necessary). Does .Net perhaps have something like this built
into the Framework?

Cheers
 

Hi Bob,
Have a look at
http://www.vbaccelerator.com/home/NET/Code/Libraries/CRC32/article.asp ,
maybe you can use some of this.
 
Hi Bob,

Here is a sample I wrote for you.

Imports System
Imports System.IO
Imports System.Security.Cryptography

Module Module1
    ' Hash the whole file by streaming it, so no buffer has to be sized by hand.
    Function HashFile(ByVal fileName As String) As Byte()
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Try
            Return New MD5CryptoServiceProvider().ComputeHash(fs)
        Finally
            fs.Close()
        End Try
    End Function

    Sub Main()
        Dim tmpHash() As Byte = HashFile("c:\dafen.jpg")
        Dim tmpNewHash() As Byte = HashFile("c:\dafen1.jpg")

        ' Compare the two digests byte by byte.
        Dim bEqual As Boolean = False
        If tmpNewHash.Length = tmpHash.Length Then
            Dim i As Integer = 0
            Do While (i < tmpNewHash.Length) AndAlso (tmpNewHash(i) = tmpHash(i))
                i += 1
            Loop
            If i = tmpNewHash.Length Then
                bEqual = True
            End If
        End If
        If bEqual Then
            Console.WriteLine("The two hash values are the same")
        Else
            Console.WriteLine("The two hash values are not the same")
        End If
    End Sub
End Module

HOW TO: Compute and Compare Hash Values by Using Visual Basic .NET
http://support.microsoft.com/default.aspx?scid=kb;en-us;301053

If you have any related questions, you may post them here.

Regards,
Peter Huang
Microsoft Online Partner Support
Get Secure! www.microsoft.com/security
This posting is provided "as is" with no warranties and confers no rights.
--------------------
 
Hi Bob,

You are right... what you want to do must be done correctly... no mistakes allowed.

Checksums should be used only for detecting accidental errors, not where evil people may come in. Checksums are simple functions that can be exploited to create fake documents with the same checksum.
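
To make that point concrete, here is a minimal sketch. It uses a deliberately simplistic additive checksum (not CRC32, and not anything from the Framework); the function name SimpleChecksum and the sample strings are made up for illustration. Two different messages built from the same bytes in a different order produce the same sum, so the program prints True.

```vbnet
Imports System
Imports System.Text

Module ChecksumDemo
    ' Hypothetical toy checksum: the sum of all bytes, modulo 65536.
    Function SimpleChecksum(ByVal data() As Byte) As Integer
        Dim total As Integer = 0
        Dim b As Byte
        For Each b In data
            total = (total + b) Mod 65536
        Next b
        Return total
    End Function

    Sub Main()
        Dim original() As Byte = Encoding.ASCII.GetBytes("PAY 19 DOLLARS")
        Dim forged() As Byte = Encoding.ASCII.GetBytes("PAY 91 DOLLARS")
        ' Swapping two bytes changes the content but not the sum.
        Console.WriteLine(SimpleChecksum(original) = SimpleChecksum(forged))
    End Sub
End Module
```

A real CRC is harder to fool by accident than this toy sum, but it was likewise designed to catch transmission errors rather than deliberate tampering.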

You should use a hash algorithm. A hash algorithm works like a checksum: you provide the input (any file or buffer of any length), and the algorithm returns the hash value (also called the digest, a fixed-size array of bytes). The difference is that hash algorithms are complex cryptographic functions, and several are widely accepted as secure. Technically a hash algorithm is a one-way function: a mathematical function that can't practically be reversed, so the digest can't be used to recover the input (or learn anything about it). A single-bit difference in the input results in a completely different digest. Most important, it is computationally infeasible to find another input that produces the same hash. Hash algorithms are widely used for data integrity verification.

You don't need to implement the hash algorithm, because the Framework already has several (MD5 and SHA, with different digest sizes in bits). SHA512Managed, for instance, returns a 64-byte digest for any file or buffer you supply as input. Example:

Dim path As String = "c:\myfile.dat"
' FileStream needs a FileMode; open the existing file read-only.
Dim stream As New IO.FileStream(path, IO.FileMode.Open, IO.FileAccess.Read)
Dim sha As New Security.Cryptography.SHA512Managed
Dim digest() As Byte = sha.ComputeHash(stream)
stream.Close()

If you are not going to store the digest as a file, and prefer a string for a database, convert it to Base64 encoding:
Dim hashString As String = Convert.ToBase64String(digest)

That's it. When you want to verify whether a file is the same or was changed (by a trojan horse, for instance), you compute the hash and compare the new digest with the one you had stored. If they match, you can be sure the file is exactly the same; if they don't, you can be sure it was changed somehow.
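
The verification step described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the helper name HashFileBase64 and both file paths are hypothetical, and it assumes the camera's storage is reachable as an ordinary file path, which may not be the case for every camera.

```vbnet
Imports System
Imports System.IO
Imports System.Security.Cryptography

Module VerifyDemo
    ' Returns the SHA-512 digest of a file as a Base64 string.
    Function HashFileBase64(ByVal fileName As String) As String
        Dim stream As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Try
            Dim sha As New SHA512Managed()
            Return Convert.ToBase64String(sha.ComputeHash(stream))
        Finally
            stream.Close()
        End Try
    End Function

    Sub Main()
        ' Hypothetical paths: the image on the camera and its local copy.
        Dim sourceHash As String = HashFileBase64("e:\dcim\img0001.jpg")
        Dim copyHash As String = HashFileBase64("c:\evidence\img0001.jpg")
        If String.Equals(sourceHash, copyHash) Then
            Console.WriteLine("Copy matches the source")
        Else
            Console.WriteLine("Copy differs from the source")
        End If
    End Sub
End Module
```

Storing sourceHash alongside the image would let the same comparison be repeated later against the stored value.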

Forensics is another story... A hash is good for checking data integrity, but that's only a small piece of the big puzzle; computer forensics is a very complex world. Most jobs must be done in a lab and require special hardware to make data recovery readings on hard disks. It's possible to compromise a system for a while and then return it exactly to the way it was, leaving no tracks... A hash can tell you the system is no longer compromised, but not what nasty stuff was done before settings and files were returned to normal.

To stand up in Court... it depends on what it is standing up for. If you want to prove two files are different, my answer is yes, the Court will accept it based on the hash values. Hashes are the basis of digital signatures and code-signing certificates, where another major technology is present, public key cryptography, and those stand up too. But that's it. Hash values are no proof that it was Alice who hacked Bob's server, or that the keylogger found on the office PC was placed by the boss. Private forensics may not stand up in court and is not recommended if you are taking legal action. If you want to go the legal way, let the police do the forensics. If you just want to know quickly what was done on your system, save any important data that may be there, and format so that the system is back in business quickly, take it to a private lab... One way or another, don't do forensics on serious cases if you are not an expert.

Regards,
Mario
 
Hi,

Actually, the other answers that you received are on target, technically.
However, they do nothing to address the legal issue...

And, there is no way to do what you want -- UNLESS the camera ITSELF
generates the CRC. Your software could then compare and store (perhaps
encrypted) the results. The issue here is that there is no chain of
evidence with a conventional digital camera. You only have the image
information AFTER it is stored on the PC. And, the original image may be
purported to have come from a camera, but... How do you prove that?

A better legal remedy would be to capture the image, encrypt it with a
digital certificate, and include a copy of some sort of (other) written
document (probably a physically signed image) that certifies the actual
chain of handling. You should talk to your legal eagles to get a specific
recommendation on this additional work.

BTW, this all is IMO. I'm not a lawyer.

Dick

--
Richard Grier (Microsoft Visual Basic MVP)

See www.hardandsoftware.net for contact information.

Author of Visual Basic Programmer's Guide to Serial Communications, 3rd
Edition ISBN 1-890422-27-4 (391 pages) published February 2002.
 

You can use this code to calculate a CRC checksum:

<http://vbaccelerator.com/article.asp?id=930>
 

You need a digital camera that supports watermarking and/or
signatures. As hinted by the other guys in this thread, the jury's
still out on aspects of this technology.

refs: http://citeseer.nj.nec.com/context/381517/0

I've had a quick look for suitable cameras, but haven't had any luck.
I'm sure someone manufactures them, as the problem was identified
years ago... Best of luck.

Rgds,
 
Mario,

I was just after suggestions on how best to compare 2 files and guarantee they are identical... the actual use is for fingerprint images taken at crime scenes by Forensic Scene Of Crime Officers (SOCOs).

I am writing software that handles getting the images off the camera, making sure the local copy matches exactly, and packaging it with case notes, the hash, and other data. Then I'll be using remoting or similar to transfer it back to base for fingerprint identification.

The chain of evidence will be preserved as much as is possible by ensuring the images are written to the hard drive in a folder that the SOCO has no access to - which just covers their butts really. Then once a job is submitted only the fingerprint experts will have access to it.

All the legal stuff has been sorted out but I couldn't give a rats about that... I'm just interested in the technology side of things...

Cheers for the info...



 
Hi,

I don't need to prove it came from a certain camera... just that the local
(or server side) copy is exactly the same as the source image on the camera.
Otherwise you are right, it would be a great deal more complex.

Cheers
 
Yep, that would be ideal but no such camera (of the quality required) exists
just yet... at least one is under development apparently though... not sure
who is doing it though...

That link was interesting, thanks.
 
Thanks Lasse,

I'll have a look but I think the hashing method might be better as suggested
by the others...


Cheers
 
Hi Bob,

I look forward to your result.
If you have any related question, please post here.

Regards,
Peter Huang
 
Hi again Peter,

Your code worked fine and even the tiniest change in the image is picked
up... as required. Thanks for your help.

I have a question which is sort of related - at least, it's for the same app
but it relates to Windows Image Acquisition 2.0... if you have any
experience with this then I'd be interested in hearing your input.

I have posted the questions with a subject of Windows Image Acquisition 2.0
for XP.

Thanks again
 
Hi Bob,

Did the project work?
If you have any related question, please post here.

Regards,
Peter Huang
 
Hi Bob,

For the new question you posted in the newsgroup
microsoft.public.dotnet.languages.vb,
Jeffery Tan has followed up in the new post.
If you have any more concerns about this question, please post here.

Regards,
Peter Huang
 
Hi again,

I need to be able to store and display the hash value... is there any way I
can represent it as text?

Cheers
 
Hi Boaz,

The hash value is a byte array, and some of the bytes cannot be represented as
recognizable ASCII or Unicode characters.
So you may try to store and display the byte array itself, for example as
hexadecimal values.

I have modified the sample as follows.

Imports System
Imports System.IO
Imports System.Security.Cryptography

Module Module1
    ' Hash the whole file by streaming it, so no buffer has to be sized by hand.
    Function HashFile(ByVal fileName As String) As Byte()
        Dim fs As New FileStream(fileName, FileMode.Open, FileAccess.Read)
        Try
            Return New MD5CryptoServiceProvider().ComputeHash(fs)
        Finally
            fs.Close()
        End Try
    End Function

    ' Store the raw digest bytes in a file next to the image.
    Sub SaveHash(ByVal fileName As String, ByVal hash() As Byte)
        Dim fs As New FileStream(fileName, FileMode.Create, FileAccess.Write)
        Try
            fs.Write(hash, 0, hash.Length)
        Finally
            fs.Close()
        End Try
    End Sub

    ' Display the digest as two-digit hexadecimal values.
    ' (Decoding the bytes with Encoding.Unicode does not give readable text.)
    Sub PrintHash(ByVal hash() As Byte)
        Dim b As Byte
        For Each b In hash
            Console.Write("{0:X2} ", b)
        Next b
        Console.WriteLine()
    End Sub

    Sub Main()
        Dim tmpHash() As Byte = HashFile("c:\dafen.jpg")
        SaveHash("c:\dafen.jpg.dat", tmpHash)
        PrintHash(tmpHash)

        Dim tmpNewHash() As Byte = HashFile("c:\dafen1.jpg")
        ' Save the second file's own hash, not the first one's.
        SaveHash("c:\dafen1.jpg.dat", tmpNewHash)
        PrintHash(tmpNewHash)

        ' Compare the two digests byte by byte.
        Dim bEqual As Boolean = False
        If tmpNewHash.Length = tmpHash.Length Then
            Dim i As Integer = 0
            Do While (i < tmpNewHash.Length) AndAlso (tmpNewHash(i) = tmpHash(i))
                i += 1
            Loop
            If i = tmpNewHash.Length Then
                bEqual = True
            End If
        End If
        If bEqual Then
            Console.WriteLine("The two hash values are the same")
        Else
            Console.WriteLine("The two hash values are not the same")
        End If
    End Sub
End Module
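
If a single text value is preferred for storage or display, one possible approach is a compact hexadecimal string. The sketch below is an illustration only: the helper name ToHex and the three sample bytes are made up, and it relies on BitConverter.ToString, which separates bytes with hyphens. Convert.ToBase64String, mentioned earlier in the thread, is another option.

```vbnet
Imports System

Module HexDemo
    ' Converts a digest to a compact hex string with no separators.
    Function ToHex(ByVal hash() As Byte) As String
        Return BitConverter.ToString(hash).Replace("-", "")
    End Function

    Sub Main()
        ' Sample bytes standing in for a real digest.
        Dim sample() As Byte = {10, 255, 16}
        Console.WriteLine(ToHex(sample)) ' prints 0AFF10
    End Sub
End Module
```

A hex string is longer than Base64 for the same digest, but it is easy to read aloud and to compare by eye.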



Regards,
Peter Huang
 
Hi Bob,

If you have any questions on this issue, please post here.

Regards,
Peter Huang
 