problem with asc function

guoqi zheng · Dec 31, 2005

Dear sir,

I am writing a binary newsgroup yEnc file decoder, and I need to convert
each character back to a byte. When I do this, I always get the error:
"Arithmetic operation resulted in an overflow."

I wrote a little test with the Asc function and got exactly the same error.
See my code example below.

Does anyone know what I am doing wrong with the Asc function here?

Dim ary As New ArrayList
Dim i As Integer
For i = 0 To 255
    Dim bc As String = Chr(i)
    Dim bytevalue As Byte = Asc(bc)
    ary.Add(bytevalue)
Next

Me.ListBox1.DataSource = ary

Regards,

Guoqi Zheng
http://www.ureader.com

Jon Skeet [C# MVP] · Dec 31, 2005

guoqi zheng said:
> I am writing a binary newsgroup yEnc file decoder, and I need to convert
> each character back to a byte. When I do this, I always get the error:
> "Arithmetic operation resulted in an overflow."
>
> I wrote a little test with the Asc function and got exactly the same error.
> See my code example below.
>
> Does anyone know what I am doing wrong with the Asc function here?
>
> Dim ary As New ArrayList
> Dim i As Integer
> For i = 0 To 255
>     Dim bc As String = Chr(i)
>     Dim bytevalue As Byte = Asc(bc)
>     ary.Add(bytevalue)
> Next
>
> Me.ListBox1.DataSource = ary

Well, the above doesn't raise any errors for me, but I don't think it's
a good idea to use Asc or Chr anyway. That is converting the character
to a byte using the default encoding of the current thread - is that
definitely what you want to do? It sounds unlikely to me.

Where did the character come from, *exactly*? If it was originally
decoded from binary, was the correct character encoding definitely
used? What are you going to use the byte for?

You need to answer these questions *very carefully*. I suggest you look
at
http://www.pobox.com/~skeet/csharp/unicode.html
first, to understand encodings etc.

You *may* just need to use CByte instead...

guoqi zheng · Dec 31, 2005

> That is converting the character to a byte using the default encoding of
> the current thread - is that definitely what you want to do?

I think I know why I get the error but you don't: we have different
code pages/encodings.

I am trying to decode yEnc-encoded binary posts on a newsgroup; the decoded
bytes will be written directly to an output stream. I am not sure what
encoding was originally used for the binary data. What I know is that I need
to remove \r\n, convert the rest back to bytes, and write them to a binary
file.

Actually, this has always been my question: what encoding should I use to
convert the bytes I receive from NNTP to a string? I use ISO-8859-1 for now.

Regards,

Guoqi Zheng
http://www.ureader.com

Jon Skeet [C# MVP] · Dec 31, 2005

guoqi zheng said:
>> the current thread - is that definitely what you want to do?
>
> I think I know why I get the error but you don't: we have different
> code pages/encodings.

Almost certainly.

> I am trying to decode yEnc-encoded binary posts on a newsgroup; the decoded
> bytes will be written directly to an output stream. I am not sure what
> encoding was originally used for the binary data. What I know is that I need
> to remove \r\n, convert the rest back to bytes, and write them to a binary
> file.

Without knowing the encoding, you can't recognise the "\r\n".

> Actually, this has always been my question: what encoding should I use to
> convert the bytes I receive from NNTP to a string? I use ISO-8859-1 for now.

Well, it sounds to me like you don't really need to convert the bytes
at all. If you assume that the "\r\n" are encoded as bytes 13 and 10
respectively, you should be able to do it all without ever treating it
as text data.
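
A minimal sketch of that byte-only approach, in VB.NET. It assumes the
decoding rule as I read it in the yEnc draft (subtract 42 modulo 256, with
"=" escaping the next byte, which is first reduced by 64) and assumes the
=ybegin/=yend header lines have already been stripped. The module and
function names are made up for illustration.

```vb
Imports System
Imports System.IO

Module YEncSketch

    ' Decodes the body bytes of a yEnc block without ever converting them to text.
    ' Assumption: the =ybegin / =ypart / =yend lines were removed by the caller.
    Function DecodeYEncBody(ByVal input As Byte()) As Byte()
        Dim output As New MemoryStream()
        Dim escaped As Boolean = False

        For Each b As Byte In input
            ' CR (13) and LF (10) only delimit lines; they carry no data.
            If b = 13 OrElse b = 10 Then
                Continue For
            End If

            Dim value As Integer = b
            If escaped Then
                ' An escaped byte was shifted by an extra 64 when encoded.
                value = (value - 64 + 256) Mod 256
                escaped = False
            ElseIf b = 61 Then
                ' "=" (61) marks the next byte as escaped and is not data itself.
                escaped = True
                Continue For
            End If

            ' Undo the +42 shift, wrapping around within 0..255.
            output.WriteByte(CByte((value - 42 + 256) Mod 256))
        Next

        Return output.ToArray()
    End Function

    Sub Main()
        ' "Hi" followed by byte 214, split over two lines; 214 encodes to an escaped 0.
        Dim encoded As Byte() = New Byte() {114, 147, 13, 10, 61, 64}
        Dim decoded As Byte() = DecodeYEncBody(encoded)
        Console.WriteLine(BitConverter.ToString(decoded)) ' prints 48-69-D6
    End Sub

End Module
```

Because nothing here goes through a String, the result does not depend on
the code page of the machine that runs it.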

If you *have* to treat it as character data, using 8859-1 is probably a
good bet. In theory I believe it doesn't contain characters for bytes
128-159, but in practice I believe the encoding treats them as Unicode
128-159.

Alternatively, just cast each character to a byte using CByte.
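
A small VB.NET sketch of both suggestions, under the assumption that the
data really was decoded with ISO-8859-1 in the first place. It checks that
all 256 byte values survive a trip through a String, then converts a single
character back to a byte. As far as I know, VB.NET's CByte does not accept a
Char directly, so the sketch goes through AscW, which returns the Unicode
code point rather than a code-page-dependent value. The module and variable
names are made up for illustration.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic

Module Latin1Sketch

    Sub Main()
        Dim latin1 As Encoding = Encoding.GetEncoding("ISO-8859-1")

        ' Every possible byte value, 0 through 255.
        Dim original(255) As Byte
        For i As Integer = 0 To 255
            original(i) = CByte(i)
        Next

        ' Bytes -> String -> bytes through the same encoding.
        Dim asText As String = latin1.GetString(original)
        Dim roundTripped As Byte() = latin1.GetBytes(asText)

        Dim same As Boolean = True
        For i As Integer = 0 To 255
            If original(i) <> roundTripped(i) Then same = False
        Next
        Console.WriteLine("Round trip preserved all bytes: " & same.ToString())

        ' Per-character conversion: AscW gives the code point, and CByte
        ' throws OverflowException if that code point is above 255.
        Dim c As Char = asText.Chars(233)
        Dim b As Byte = CByte(AscW(c))
        Console.WriteLine(b.ToString())
    End Sub

End Module
```

Unlike Asc, neither step consults the current thread's default code page,
so the result should be the same on every machine.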

I've just had a look at the yEnc spec, and unfortunately it seems to
have been written by someone who doesn't appreciate the difference
between binary data and text data, and also doesn't understand that
ASCII doesn't have any values > 127...

guoqi zheng · Jan 1, 2006

> If you *have* to treat it as character data, using 8859-1 is probably a
> good bet. In theory I believe it doesn't contain characters for bytes
> 128-159,

Now I understand why I sometimes see strange results when using ISO-8859-1:
it doesn't have characters for 128-159.

I use Windows-1252 now, and everything works.

> text data, and also doesn't understand that ASCII doesn't have any values >
> 127...

The author of yEnc does understand that ASCII (7-bit) values are < 128, but
he realized that many Usenet servers can transfer 8-bit characters, which is
why he invented yEnc, which uses all 8 bits.

Thanks for your help

regards,

Guoqi Zheng
http://www.ureader.com

Jon Skeet [C# MVP] · Jan 1, 2006

guoqi zheng said:
>> good bet. In theory I believe it doesn't contain characters for bytes
>> 128-159,
>
> Now I understand why I sometimes see strange results when using ISO-8859-1:
> it doesn't have characters for 128-159.
>
> I use Windows-1252 now, and everything works.

Hmm. Have you tested it on data created on systems with a Far East
region, or the like? Windows-1252 is the default Western European code
page, but that doesn't mean it'll work everywhere.

Again, it sounds like you're trying to treat binary data as text data
when it's really just not. I would try to avoid treating it as text in
the first place.

>> text data, and also doesn't understand that ASCII doesn't have any values >
>> 127...
>
> The author of yEnc does understand that ASCII (7-bit) values are < 128, but
> he realized that many Usenet servers can transfer 8-bit characters, which is
> why he invented yEnc, which uses all 8 bits.

If he understood that, he wouldn't talk about taking the "ASCII value"
of a character when that character is read from an input stream of
*bytes* (not characters) in the first place. At the very least he's
being very loose with his use of terminology.
