What is MD5 Hash?

  • Thread starter: Kenny

Kenny

I know it's something added to a file when it's created and have found a few
definitions using Google but I still don't understand it! Is it a form of
error detection/correction and why is it used? Can someone explain it in
layman's terms please?
 
A hash is simply a way to boil data down to a short value, chosen so that
different inputs spread evenly across the possible values. For example, say
that you want to store numbers between 1 and 10,000 and later search through
them to find one. Rather than search through all of the numbers each time,
it is faster to divide them into individual buckets and search only the
bucket where the number must be. So, the hash algorithm you choose could
take the last two digits of the number.

For example, you are storing the numbers 3429, 1014, 8, and 9929. They would
fall into the following buckets:

Bucket 08: 8
Bucket 14: 1014
Bucket 29: 3429, 9929

So, if you wanted to find where you have 9929, you start at bucket 29, and
then only have to search through two numbers to find what you want, rather
than searching through all 4. With this trivial example, you didn't save
much time, but with a larger set of numbers this can save you quite a bit of
time indeed.
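
The bucket idea above can be sketched in a few lines of Python. This is only
an illustration: the names bucket_for, store and contains are made up for
this example, and the "hash" is just the last two digits, as described.

```python
from collections import defaultdict

def bucket_for(number):
    """Toy hash: the last two digits of the number pick the bucket."""
    return number % 100

def store(numbers):
    """Group numbers into buckets keyed by their toy hash."""
    buckets = defaultdict(list)
    for n in numbers:
        buckets[bucket_for(n)].append(n)
    return buckets

def contains(buckets, number):
    """Look only in the one bucket where the number could be."""
    return number in buckets.get(bucket_for(number), [])

buckets = store([3429, 1014, 8, 9929])
print(buckets[29])              # the numbers that share bucket 29
print(contains(buckets, 9929))  # found after checking only bucket 29
```

A lookup for a number that was never stored, such as 1234, only has to check
bucket 34 before giving up.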

Now, imagine a hash algorithm that, instead of taking the last two digits
of whatever you are inputting, generates its "bucket" from every single bit
of the application code. Maybe you sum all of the numbers in the code
together (all the machine sees of a program is numbers) and take the last 15
digits. Now each program generates a hash that depends on every bit of the
program, so if somebody changes even one line of code, it will almost
certainly generate a different hash value. MD5 is a hash algorithm of this
kind, with far more thorough scrambling than a simple sum; it always
produces a 128-bit value, usually written as 32 hexadecimal digits.
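
Here is a rough sketch of that "sum everything and keep the last 15 digits"
idea. The function name toy_hash and the two sample programs are invented
for illustration; this is not how MD5 itself works, only the simplified
scheme described above.

```python
def toy_hash(data: bytes) -> int:
    """Sum every byte and keep the last 15 decimal digits of the total."""
    return sum(data) % 10**15

original = b"print('hello world')"
modified = b"print('hello World')"  # a single character changed

print(toy_hash(original))
print(toy_hash(modified))
print(toy_hash(original) == toy_hash(modified))  # the two hashes differ
```

A plain sum is much weaker than a real algorithm: swapping two bytes leaves
the total unchanged, which is one reason real hash algorithms scramble the
input far more thoroughly.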

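Obviously, this is not perfect. Because the hash is much shorter than the
data, two different sets of input can generate the exact same hash code.
With a good algorithm, however, it is extremely unlikely for this to happen
by accident - unlikely enough that you are never going to have to worry
about it in your lifetime. Deliberate collisions are another matter:
researchers have shown how to construct pairs of different files with the
same MD5 hash, so MD5 is no longer recommended where intentional tampering
is a concern.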

As for what you can use it for, it enables you to be really, really sure
that your application has not been changed since the developer created it.
The hash code is computed on the program you actually have and compared to
the hash code generated when the program was made by the manufacturer. If
these are the same, you can be sufficiently convinced that nothing has
changed along the way.
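
As a small illustration of that comparison, Python's standard hashlib module
can compute real MD5 hashes. The byte strings below are made-up stand-ins
for a program's contents, and md5_hex is just a helper name chosen for this
sketch.

```python
import hashlib

def md5_hex(data: bytes) -> str:
    """Return the MD5 hash of the data as 32 hexadecimal characters."""
    return hashlib.md5(data).hexdigest()

# Hypothetical program contents at the time the developer released it.
released = b"program code, version 1.0"
published_hash = md5_hex(released)

# What arrives on your machine: one intact copy and one altered copy.
intact_copy = b"program code, version 1.0"
altered_copy = b"program code, version 1.0 (altered)"

print(md5_hex(intact_copy) == published_hash)   # hashes match
print(md5_hex(altered_copy) == published_hash)  # hashes differ
```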
 
Given the responses I'll have to chime in, in an effort to minimize possible
further confusion.
Currently its primary use is to figure out whether a file is corrupt or has
been tampered with, i.e. whether you can trust it, or whether it might be
what is crashing a program or the entire system.
If someone offers up a file for download and the hash you compute does not
match the published one, the file is either corrupt or, in a worst case
scenario, has been altered; for instance, a backdoor may have been inserted.
For example, the Windows System File Checker basically compares each
protected system file against a stored hash and then tells you whether the
file is corrupt, i.e. altered in some way, or not.
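
Checking a download by hand could look roughly like the sketch below. The
helper names md5_of_file and looks_intact are invented here, and a throwaway
temporary file stands in for a real download; in practice the published
hash would come from the site offering the file.

```python
import hashlib
import os
import tempfile

def md5_of_file(path, chunk_size=65536):
    """Hash a file in chunks so large downloads need not fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def looks_intact(path, published_hash):
    """True if the file's MD5 equals the hash published with the download."""
    return md5_of_file(path) == published_hash.strip().lower()

# Demo with a throwaway file standing in for a real download.
contents = b"pretend this is a downloaded installer"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(contents)
    demo_path = tmp.name

published = hashlib.md5(contents).hexdigest()
print(looks_intact(demo_path, published))  # file matches the published hash

with open(demo_path, "ab") as f:
    f.write(b"!")                          # simulate corruption or tampering
print(looks_intact(demo_path, published))  # hash no longer matches

os.remove(demo_path)
```

A mismatch only says that the file differs from what was published; it does
not say whether the cause was a bad download or deliberate alteration.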
 