Sure a simple parsing question (hex conversion)

i_robot73

I have a file containing hex values for dates (MMDDYYYY)<status code><??>,
such as:

303530313230303102003035303232303031020030353033323030310200303530343230303102003035303732303031020030353038323030310200

breaking that down:

30353031323030310200
30353032323030310200
30353033323030310200
30353034323030310200...

How do I break it down, convert and, say, throw it into an Array/
HashTable/etc.?? Never messed w/ HEX (binary??) files before.

Appreciate any and all help.

David D.
 
Stanimir Stoyanov

Hi David,

Below is sample code I wrote to show how to interpret the data blocks. You
should implement appropriate error handling for blocks that fail to parse.

string content =
    "303530313230303102003035303232303031020030353033323030310200303530343230303102003035303732303031020030353038323030310200";

// TODO: Implement proper exception handling
// Constants based on the block format
const int datePartLength = 8;
const int otherPartLength = 1;
const int blockLength = (datePartLength + otherPartLength * 2) * 2;

for (int i = 0; i < content.Length; i += blockLength)
{
    // Each block is extracted and converted here
    string blockRaw = content.Substring(i, blockLength);
    string blockReadable = string.Empty;

    // Every two hex digits become one character
    for (int j = 0; j < blockRaw.Length; j += 2)
        blockReadable += (char)Convert.ToByte(blockRaw.Substring(j, 2), 16);

    string datePart = blockReadable.Substring(0, datePartLength);

    DateTime date = DateTime.ParseExact(datePart, "MMddyyyy",
        System.Globalization.CultureInfo.InvariantCulture);
    int status = (int)blockReadable[datePartLength];

    // Make use of date and status here
}
 
Jeff Johnson

Do you mean your file contains an ASCII representation of hex digits (in
other words, if you open your file in Notepad you see exactly what you wrote
above)?

If so, you can use the Convert.ToByte() overload which takes a String and an
Int32 representing the base of the number in the string. Then you can build
a byte array and convert it to a string with the ASCIIEncoding class. At
that point you'll just have to substring to get out the values you need.

If, however, you were just making a visual representation of your file for
posting purposes and it actually contains "real bytes" then you're simply
skipping the conversion and you can decode your bytes with ASCIIEncoding
directly.

In other words, take a look at the ASCIIEncoding class first, especially
methods like GetString().
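
The two cases described above can be sketched roughly as follows. This is only an illustration, not code from the thread: the class name HexTextDemo and the helper HexToBytes are made up here, and the sketch assumes the hex text has an even number of digits with no separators.

```csharp
using System;
using System.Text;

public static class HexTextDemo
{
    // Hypothetical helper: turns a string of hex digits ("3035...") into the bytes it represents.
    public static byte[] HexToBytes(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new FormatException("Expected an even number of hex digits.");

        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            // Convert.ToByte(string, int) parses the two-digit substring in base 16.
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        }
        return bytes;
    }

    public static void Main()
    {
        // Case 1: the file holds hex digits as text, so convert them to bytes first.
        byte[] record = HexToBytes("30353031323030310200");

        // Case 2 starts here: with real bytes in hand, decode the ASCII part directly.
        string datePart = Encoding.ASCII.GetString(record, 0, 8);

        Console.WriteLine(datePart);   // 05012001
        Console.WriteLine(record[8]);  // 2
    }
}
```

If the file already contains real bytes, the HexToBytes step is skipped and only the GetString call is needed.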
 
i_robot73

The file is from an 'in-house' vendor provided solution:

file opened in a hex editor:

3035303132303031020030353032323030310200303530333230303102003035303432303031020030353037323030310200...

file opened in wordpad:

050120010502200105032001...
 
i_robot73

Code works great, though it throws some 'odd' characters (the '0200'... looks
like a square). Guess the last ? is... how do I READ in the file :p

FileStream to byte[]?
 
i_robot73

With the lovely help of all here, I've learned a lot today.


File is read as binary (decimal?) into a byte array
Each byte is converted & appended to a string as 2-digit hex

Then use the code above to parse:



string FILE = @"<path to file>";
FileStream fs = File.OpenRead(FILE);
byte[] data = new byte[fs.Length];
fs.Read(data, 0, (int)fs.Length);
fs.Close();

//Convert DEC (binary) byte[] to HEX
string content = null;
foreach (byte dec in data)
{
    // "X2" pads to two digits, so the byte 2 becomes "02" rather than "2"
    content += dec.ToString("X2");
}

Woohoo....Thanks again!
 
Jeff Johnson

File is read as binary (decimal?)

Okay, let me try to clear up this misunderstanding. In the programming world
we sometimes use words in ways that aren't always technically correct.
Computers deal in nothing but binary (in bytes). So if you think about it,
EVERY file is a "binary" file. However, when talking about data (files), we
use the term "binary" to refer to files that contain characters with no
printable representation. In general, this means that the file contains
bytes in the range of 0-31, with the exception of a few bytes which control
printing, such as a carriage return (13), line feed (10), and tab (9). If a
file contains nothing but bytes from 32-255 (plus 9, 10, and/or 13) then we
refer to it as a "text" file, because you'll more than likely be able to
open it up in Notepad and see all of the characters in it without "weird"
stuff. The file you described in your original post would be considered a
"binary" file because it contains 2 and 0.

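As a rough sketch of that rule of thumb (the class and method names are made up for illustration, and this is only the informal heuristic described above, not a standard test):

```csharp
using System;

public static class TextOrBinary
{
    // Hypothetical helper applying the rule of thumb above: any byte below 32
    // other than tab (9), line feed (10) or carriage return (13) makes the data "binary".
    public static bool LooksBinary(byte[] data)
    {
        foreach (byte b in data)
        {
            if (b < 32 && b != 9 && b != 10 && b != 13)
                return true;
        }
        return false;
    }

    public static void Main()
    {
        // One record from the original post: "05012001" followed by the bytes 2 and 0.
        byte[] record = { 0x30, 0x35, 0x30, 0x31, 0x32, 0x30, 0x30, 0x31, 0x02, 0x00 };
        byte[] plain = { 0x48, 0x69, 0x0D, 0x0A }; // "Hi" followed by CR LF

        Console.WriteLine(LooksBinary(record)); // True
        Console.WriteLine(LooksBinary(plain));  // False
    }
}
```
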
Now, as to binary, decimal, and hexadecimal. These are number systems (or,
apparently more correctly, "numeral systems":
http://en.wikipedia.org/wiki/Numeral_system). They are merely a
REPRESENTATION of a number made for the ease of human beings. It is possible
to represent (or display) the same amount of something in several different
ways. For example, consider the number of bars written below:

| | | | | | | | | | | |

In the decimal system, which is what humanity has basically adopted globally
(because we have 10 TOES, not fingers, or so anthropologists believe), this
amount is represented as "12".

In the binary system, this number is represented as "1100" (or, if you like
your binary numbers in 8-bit chunks, "00001100").

In the hexadecimal system, it's "C", or if you prefer 2 digits, "0C".

When switching between systems (changing "bases"), there is no conversion of
the amount. Twelve things are always twelve things, but there can be
conversions of the way the amount is DISPLAYED. This is what it means when
you "convert" decimal to hex, for example.
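
To make that concrete, here is a small sketch (the class name BaseDemo is just for illustration) that displays the same amount in the three systems and then reads the representations back:

```csharp
using System;

public static class BaseDemo
{
    public static void Main()
    {
        int amount = 12; // one value, displayed several ways below

        Console.WriteLine(amount.ToString());                           // 12
        Console.WriteLine(Convert.ToString(amount, 2));                 // 1100
        Console.WriteLine(Convert.ToString(amount, 2).PadLeft(8, '0')); // 00001100
        Console.WriteLine(amount.ToString("X"));                        // C
        Console.WriteLine(amount.ToString("X2"));                       // 0C

        // Parsing any of those representations gives back the same amount.
        Console.WriteLine(Convert.ToInt32("1100", 2) == amount);        // True
        Console.WriteLine(Convert.ToInt32("0C", 16) == amount);         // True
    }
}
```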

When you wrote this:

//Convert DEC (binary) byte[] to HEX
string content = null;
foreach (byte dec in data)
{
    content += dec.ToString("X2");
}

what you really meant was "make a 2-digit hexadecimal representation of the
values in the byte array and store it in a string." Saying that your bytes
are "in decimal" is incorrect. If anything, they are "in binary," but it
doesn't matter. The computer knows that twelve is twelve, and forty is
forty.

Back to the original issue: I don't know why you've turned your data into a
string of hex digits. I thought you were looking to extract "05012001" from
the string and turn it into the date 2001-05-01. (Which, by the way, is just
a way to DISPLAY a date, but that's another story. If you're American you
probably prefer 5/1/2001, and if you're European you probably want
1/5/2001.) You should do what I recommended: take the first 8 bytes and use
ASCIIEncoding.GetString() to get a string from them and then use
DateTime.ParseExact() with the "MMddyyyy" format to get a date.
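
A minimal sketch of that approach, working on the raw bytes without any hex string in between. The names RecordParser and Parse are invented for this example, and the 10-byte layout (8 ASCII date bytes, 1 status byte, 1 byte of unknown meaning) is an assumption read off the hex dump earlier in the thread:

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

public static class RecordParser
{
    // Assumed layout: 8 ASCII date bytes, 1 status byte, 1 byte of unknown meaning.
    private const int DateLength = 8;
    private const int RecordLength = 10;

    // Hypothetical helper: returns one (date, status) pair per complete record.
    public static List<KeyValuePair<DateTime, byte>> Parse(byte[] data)
    {
        var records = new List<KeyValuePair<DateTime, byte>>();
        for (int offset = 0; offset + RecordLength <= data.Length; offset += RecordLength)
        {
            // Decode the 8 date bytes as ASCII text, e.g. "05012001".
            string datePart = Encoding.ASCII.GetString(data, offset, DateLength);
            DateTime date = DateTime.ParseExact(datePart, "MMddyyyy", CultureInfo.InvariantCulture);

            // The status is already a number; it needs no text decoding.
            byte status = data[offset + DateLength];
            records.Add(new KeyValuePair<DateTime, byte>(date, status));
        }
        return records;
    }

    public static void Main()
    {
        // Two records as they would appear in the file: "05012001" and "05022001",
        // each followed by the bytes 02 00.
        byte[] data =
        {
            0x30, 0x35, 0x30, 0x31, 0x32, 0x30, 0x30, 0x31, 0x02, 0x00,
            0x30, 0x35, 0x30, 0x32, 0x32, 0x30, 0x30, 0x31, 0x02, 0x00
        };

        foreach (var record in Parse(data))
        {
            string shown = record.Key.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
            Console.WriteLine(shown + " status " + record.Value);
        }
        // 2001-05-01 status 2
        // 2001-05-02 status 2
    }
}
```

With a real file, the byte array would come from something like File.ReadAllBytes(path) instead of the literal above. Any trailing partial record is silently ignored in this sketch, so real code may want to report it.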
 
