Hi,
I have to write a program which will generate 100,000 different ID
numbers (for example 2345-9341-0903-3432-3432 ...). They must be really
different; I mean that no two may be similar (like
2345-9341-0903-3432-3431). Is there an algorithm for that?
Thanks.
Consider the maximum number of IDs - 100000. That's 6 decimal digits.
Your format indicated that you want a string of four-digit numbers
separated by "-" signs. (Even if that's not the case, I'll use it to
demonstrate the principle.)
So let's say the output format is:
DDDD-DDDD-DDDD-DDDD-DDDD
We could convert our integer (the counter going from 1 to 100000) into
an ASCII string with leading zeros, then take each character and place
it at a fixed position in the output string.
e.g.
D1D2-D34D-DDDD-D5DD-D6DD
(The numbers indicate the digit positions in the input string)
The "D" digits could therefore be filled with random data. See that?
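Here is a minimal Python sketch of that interleaving, in case it helps. The names TEMPLATE, make_id and extract_counter are my own, and using the secrets module for the filler digits is just one choice of random source. Note that uniqueness comes from the embedded counter, not from the random digits.

```python
import secrets

# Layout from the example above: a digit 1-6 marks where that digit of the
# zero-padded counter goes; "D" marks a filler position.
TEMPLATE = "D1D2-D34D-DDDD-D5DD-D6DD"


def make_id(counter: int) -> str:
    """Scatter the 6 counter digits into TEMPLATE; random digits elsewhere."""
    if not 0 <= counter <= 999_999:
        raise ValueError("counter must fit in 6 decimal digits")
    digits = f"{counter:06d}"  # leading zeros, e.g. 42 -> "000042"
    out = []
    for slot in TEMPLATE:
        if slot == "-":
            out.append("-")
        elif slot == "D":
            out.append(str(secrets.randbelow(10)))  # random filler digit
        else:
            out.append(digits[int(slot) - 1])  # counter digit number `slot`
    return "".join(out)


def extract_counter(id_string: str) -> int:
    """Read the counter digits back out of their template positions."""
    if len(id_string) != len(TEMPLATE):
        raise ValueError("ID has the wrong length")
    found = {}
    for ch, slot in zip(id_string, TEMPLATE):
        if slot not in "-D":
            found[int(slot)] = ch
    return int("".join(found[i] for i in range(1, 7)))
```

Calling extract_counter(make_id(n)) gives back n for any counter in range, whatever the filler digits happen to be.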
Ok, you can actually do something more useful than simply using random
data. You can use that extra bandwidth for testing the validity of the
original number.
In the example I gave, there are 14 digit positions that we can make
use of (the D positions that carry no number).
Taking our original number, we can then produce a hash (SHA, MD5 etc.)
of that value, convert it to a string of decimal digits, and pad or
truncate it so that it has exactly 14 digits. We can then insert those
digits into the empty positions. When we parse the ID (and extract the
number digits) we can then apply the same hash and test that it
matches the digits we found. As a side effect, two IDs whose numbers
differ in a single digit will almost certainly differ in several hash
digits too, which takes care of the "not similar" requirement.
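A rough Python sketch of the hash variant follows. The choice of SHA-256, reducing the digest modulo 10**14 to get 14 decimal digits, and the names check_digits, make_id and parse_id are all my own assumptions; any stable hash and any fixed way of turning it into digits would do.

```python
import hashlib

# Same layout as before: 1-6 are counter digits, "D" now holds a check digit.
TEMPLATE = "D1D2-D34D-DDDD-D5DD-D6DD"


def check_digits(counter: int, n: int = 14) -> str:
    """n decimal digits derived from the SHA-256 of the zero-padded counter."""
    digest = hashlib.sha256(f"{counter:06d}".encode("ascii")).digest()
    return f"{int.from_bytes(digest, 'big') % 10**n:0{n}d}"


def make_id(counter: int) -> str:
    """Build the ID: counter digits in numbered slots, hash digits in D slots."""
    if not 0 <= counter <= 999_999:
        raise ValueError("counter must fit in 6 decimal digits")
    digits = f"{counter:06d}"
    filler = iter(check_digits(counter))
    out = []
    for slot in TEMPLATE:
        if slot == "-":
            out.append("-")
        elif slot == "D":
            out.append(next(filler))
        else:
            out.append(digits[int(slot) - 1])
    return "".join(out)


def parse_id(id_string: str):
    """Return the counter if the check digits match, otherwise None."""
    if len(id_string) != len(TEMPLATE):
        return None
    number, check = {}, []
    for ch, slot in zip(id_string, TEMPLATE):
        if slot == "-":
            if ch != "-":
                return None
        elif ch not in "0123456789":
            return None
        elif slot == "D":
            check.append(ch)
        else:
            number[int(slot)] = ch
    counter = int("".join(number[i] for i in range(1, 7)))
    return counter if "".join(check) == check_digits(counter) else None
```

With this, make_id is deterministic, so the same counter always yields the same ID, and parse_id rejects an ID in which a check digit has been altered. This is a validity check rather than a security measure: anyone who knows the scheme can compute valid IDs unless a secret key is mixed into the hash.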
As an additional note: 14 digits for a "checksum" is a little extreme.
If we reduce that number to 6, we are left with 8 digits that are now
spare. We can reserve those for future expansion (supporting more
than 100000 IDs). The most useful thing we can do is to introduce a
"format version code" - a single-digit number that will appear in the
first position.
This leads to a well-known method of allocating IDs:
1) We need to identify the version of the ID format
2) We need to encode the original numeric value
3) We need to encode a checksum of that value
e.g.:
Example: 1162-2341-2763-1532-1623
Version: 1... .... .... .... ....
Number:  .1.2 .34. .... .5.. .6..
Hash:    ..6. 2..1 27.. 1... ....
Unused:  .... .... ..63 ..32 1.23
We could parse that number and immediately see that it is in "format
1". We therefore know where the digits for the number and the digits
for the hash are.
i.e. Format=1 Number=123456 Hash=621271 (The rest are unused - random)
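To show how the version digit drives the parsing, here is a small Python sketch. The LAYOUTS table and the split_fields function are hypothetical names of mine; the layout string for format 1 is simply the diagram above written with one role letter per digit (V=version, N=number, H=hash, U=unused).

```python
# One layout per format version, keyed by the digit in the first position.
# A later format would add another entry with its own arrangement.
LAYOUTS = {
    "1": "VNHN-HNNH-HHUU-HNUU-UNUU",
}


def split_fields(id_string: str) -> dict:
    """Split an ID into its fields according to its format version digit."""
    layout = LAYOUTS.get(id_string[:1])
    if layout is None or len(layout) != len(id_string):
        raise ValueError("unknown format version or wrong length")
    fields = {"V": "", "N": "", "H": "", "U": ""}
    for ch, role in zip(id_string, layout):
        if role == "-":
            if ch != "-":
                raise ValueError("separator expected")
        elif ch not in "0123456789":
            raise ValueError("digit expected")
        else:
            fields[role] += ch
    return {
        "version": int(fields["V"]),
        "number": int(fields["N"]),
        "hash": fields["H"],
        "unused": fields["U"],
    }


print(split_fields("1162-2341-2763-1532-1623"))
```

For the example ID this yields version 1, number 123456 and hash "621271", with the remaining seven digits collected as unused. The hash digits in the example are only illustrative, so this sketch separates the fields without verifying them; a real parser would recompute the hash from the number and compare.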
Probably more than you're after, but if you're thinking about the
format, you might as well think about its future too...
Rgds,