compress specific data

  • Thread starter Thread starter Romain TAILLANDIER
  • Start date Start date
R

Romain TAILLANDIER

Hi group

My customer want me i crypt a 16 alphanumeric char in 8 byte of memory.
the format is :
AAA AAA AAA L NNN NNN
where the A are A-Z or 0-9
L are A-Z
N are 0-9

So i caculate the number of possibilities :
36^9 * 26 * 10^6 = 2,640,558,873,378,816,000,000
< 2^64

So i think it is possible.
but i can't lists all the possibilities. So i create some 'ASCII' specific
code :
the AlphaNum (36 possibilities) coded on 6 bits
Letters (26 possibilities) on 5 bits
Numbers (10 possibilities) on 4 bits
and finally i get 10 bytes ....

what ever i can do i can't get down under 9 bytes.
the only possibility is to list all the 2,640,. ..... ... possibility but it
is not realist.

any ideas ?
calculate an hashcode on the 16 alphanum code woud be save me ?

ROM
 
Romain TAILLANDIER said:
My customer want me i crypt a 16 alphanumeric char in 8 byte of memory.
the format is :
AAA AAA AAA L NNN NNN
where the A are A-Z or 0-9
L are A-Z
N are 0-9

So i caculate the number of possibilities :
36^9 * 26 * 10^6 = 2,640,558,873,378,816,000,000
< 2^64

So i think it is possible.
but i can't lists all the possibilities. So i create some 'ASCII' specific
code :
the AlphaNum (36 possibilities) coded on 6 bits
Letters (26 possibilities) on 5 bits
Numbers (10 possibilities) on 4 bits
and finally i get 10 bytes ....

what ever i can do i can't get down under 9 bytes.
the only possibility is to list all the 2,640,. ..... ... possibility but it
is not realist.

any ideas ?

Well, the most obvious suggestion is to convert each character into its
appropriate base, ie 0-35 for the first nine characters, then 0-25 for
the next, then 0-9 for the last 6. Then just take the first result +
36*second + (36*36)*third etc (or the other way round depending on what
endianness you want).

Let me know if you need code to do this.
 
Well, the most obvious suggestion is to convert each character into its
appropriate base, ie 0-35 for the first nine characters, then 0-25 for
the next, then 0-9 for the last 6. Then just take the first result +
36*second + (36*36)*third etc (or the other way round depending on what
endianness you want).

that exactly the night say to me :)
It was so simple that i can't see it on the moment :)

thank you john skeet :)

ROM
 
Hello Romain.

The problem you have is, basically, rounding error.
So i caculate the number of possibilities :
36^9 * 26 * 10^6 = 2,640,558,873,378,816,000,000
< 2^64

So i think it is possible.

Technically, it is possible. But look at this calculation a litle closer:
the AlphaNum (36 possibilities) coded on 6 bits
Letters (26 possibilities) on 5 bits
Numbers (10 possibilities) on 4 bits
and finally i get 10 bytes ....

The problem is this: while it takes 6 bits to store 36 possibilities, 6 bits
actually hold 64 possible numbers. So, you are "wasting" 28 possibilities
per position.
Similarly, 5 bits holds 32 possibilities, of which you are use 26, and 4
bits hold 16 possibilities, of which you are using 10.

You may be able to do this as a "moving base" number. Remember that a
number, in any base, is equivalent to multiplying the position of the digit
times the base value to the power of the position (from the left). So the
decimal (base 10) number 520 is equal to (((5 * 10) + 2) * 10) + 0

In this way, your 16 byte value can be coded as an eight byte number by
multiplying each "digit" by the base. The complication is that you don't
have a single base to work with.

So, you need a moving base. It goes something like this:

public decimal newvalue(oldvalue, insertedvalue, newbase)
{
return (oldvalue * newbase) + insertedvalue;
}

int[] baselist = {36, 36, 36, 36, 36, 36, 36, 36, 36, 26, 10, 10, 10, 10,
10, 10};

// loop through the string, one character at a time
decimal register = 0;
for (int ct = 0; ct < 16; ct++)
{
char mychar = sourcestring[ct];
int numvar = GetNumericEquivalent(mychar, baselist[ct]); // write a
little function that returns the "numeric value" of each character
register = newvalue(register, numver, baselist[ct]);
}
return register;

The resulting value will be stored in the binary 'register' value.

Note: the GetNumericEquivalent routine above is simple:
int GetNumericEquivalent(char mychar, int mybase)
{
string base10 = "0123456789";
string base26= "ABCDEF ... XYZ";
string base36 = base10 + base26;

if (mybase = 10)
return base10.IndexOf(mychar);
else if (mybase = 26)
return base26.IndexOf(mychar);
else
return base36.IndexOf(mychar);
}

By the way, this code is NOT object oriented. It is entirely procedural. I
did this because I felt that it was more important to illustrate the
algorithm than it is to encapsulate the classes in this case. If you'd
like, we could create a flyweight pattern to encapsulate the data values
with their respective bases. However, the algorithm would have been
obscured. Apologies to any purists out there.

--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik

Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--
 
Back
Top