Hex 00 in StringBuilder

  • Thread starter: Richard Van Dyke
Jay,

Jay B. Harlow said:
"Both System.String & System.Text.StringBuilder fully support having a null
char in the string."

ACK.

"If sb.length = 1920, then sb.tostring.length should also be 1920!

I know the debugger in VS.NET has trouble with null chars in strings, as it
relies on Win32 APIs that treat the null char as a string terminator;
however, it is an API problem, not a String or StringBuilder problem. Also,
System.Console, System.Diagnostics.Debug & Trace will have problems, as
they all rely on a Win32 API that treats the null char as a string
terminator..."

I have played around with that too and I got the same results. As
expected, the strings will be visually terminated at the null character,
but I didn't find any place inside the IDE which showed the wrong
/length/ (the number of characters, not the characters) of the string or
the string builder, respectively.
 
Jay,

Thanks for the information.

I was doing a cmrcv (an SNA CPIC call) using the CPIC DLL shipped with Host
Integration. In the call I passed a StringBuilder for the returned data.
After the receive, the returned length (an integer passed to the DLL and
filled in by the DLL) contained 1920, which is what I expected, as the
application sends that many bytes.

I need to pass the data received as a string to a class, so I passed
StringBuilder.ToString. Before doing so, I printed
StringBuilder.ToString.Length to our log file, which showed 180. I also
printed StringBuilder.ToString itself, which showed only 180 bytes of data,
right up to the character before the X'00'. I used WriteLine to write the
string to the log file.

We found out today that the sending application (a C app) was placing an
X'00' just after the last byte of data in a field that was used by a routine
to clear data. In certain situations the app would move 1 byte too many to
the output buffer, which is why we only got this data occasionally.

Thanks for all of your input.....
Rick

Jay B. Harlow said:
Rick,
A null char (ChrW(0)) in your StringBuilder should not be terminating the
string.

Both System.String & System.Text.StringBuilder fully support having a null
char in the string.

If sb.length = 1920, then sb.tostring.length should also be 1920!

I know the debugger in VS.NET has trouble with null chars in strings, as it
relies on Win32 APIs that treat the null char as a string terminator;
however, it is an API problem, not a String or StringBuilder problem. Also,
System.Console, System.Diagnostics.Debug & Trace will have problems, as
they all rely on a Win32 API that treats the null char as a string
terminator...

Are you calling a Win32 API that is treating the null char as a string
terminator?

If you have a StringBuilder, you can use its Replace method to replace the
null char with a different character.

sb.Replace(ChrW(0), " "c)
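
A minimal sketch of that approach, assuming a small console program (the module name and the sample data are made up for illustration). It locates the null char with IndexOf before replacing it, so a log entry can record where it was:

```vbnet
Imports System
Imports System.Text

Module NullCharReplaceDemo
    Sub Main()
        ' Hypothetical sample data: 1920 characters with a null char at index 180
        Dim sb As New StringBuilder(New String("a"c, 1920))
        sb.Chars(180) = ChrW(0)

        ' Find the position of the first null char (-1 if there is none)
        Dim raw As String = sb.ToString()
        Console.WriteLine(raw.IndexOf(ChrW(0)))      ' prints 180

        ' Replace every null char with a space; the length does not change
        sb.Replace(ChrW(0), " "c)
        Dim cleaned As String = sb.ToString()
        Console.WriteLine(cleaned.Length)            ' prints 1920
        Console.WriteLine(cleaned.IndexOf(ChrW(0)))  ' prints -1
    End Sub
End Module
```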


Here is an example that demonstrates that null chars are allowed in both
String & StringBuilder:
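' Requires Imports System.Text (StringBuilder) and System.Diagnostics (Debug)
Dim sb As New StringBuilder(1920)
For i As Integer = 1 To 1920
    sb.Append("a"c) ' fill the builder with 1920 characters
Next
sb.Chars(180) = ChrW(0) ' force a null char
' Both lengths include the characters after the null char
Debug.WriteLine(sb.Length, "sb.length")
Debug.WriteLine(sb.ToString().Length, "sb.tostring().length")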

Will report:

sb.length: 1920
sb.tostring().length: 1920

What is "receive the data using CPIC call to cmrecv...." that you are
calling? Is this somehow changing lengths of strings on you?

The above is based on VS.NET 2003.

Hope this helps
Jay
 
Rick,
Ah! There's the Rub!

I take it that CPIC is called via Declare statements? (you mentioned a C
app).

When you marshal strings (via Declare statements), the default is to copy up
to the terminating null! (Everything after the null is not considered part
of the string!!!)

Instead of defining the "buffer" as String or StringBuilder, define the
"buffer" as an IntPtr, then use Marshal.PtrToStringAnsi or
Marshal.PtrToStringUni, depending on whether you called an ANSI or a UNICODE
API, to convert the buffer into a String. This String can then be passed to
the StringBuilder if needed...

You use Marshal.AllocHGlobal & Marshal.FreeHGlobal to alloc a buffer that
you can pass to your API.

The Marshal class can be found in System.Runtime.InteropServices namespace.
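
A rough sketch of the IntPtr approach, under stated assumptions: the native receive call is not shown, so Marshal.Copy stands in for the DLL filling the buffer, and the module and variable names are hypothetical. Note that the one-argument PtrToStringAnsi overload also stops at the first null, so the sketch uses the overload that takes an explicit length (the length the DLL reported):

```vbnet
Imports System
Imports System.Runtime.InteropServices

Module NullCharMarshalDemo
    Sub Main()
        Const BufferSize As Integer = 1920

        ' Stand-in for the data the native call would write:
        ' 1920 bytes of "a" with a X'00' at offset 180 (hypothetical data)
        Dim source(BufferSize - 1) As Byte
        For i As Integer = 0 To BufferSize - 1
            source(i) = CByte(AscW("a"c))
        Next
        source(180) = 0

        ' Allocate an unmanaged buffer that could be passed to the API as IntPtr
        Dim nativeBuffer As IntPtr = Marshal.AllocHGlobal(BufferSize)
        Try
            ' Simulates the DLL filling the buffer
            Marshal.Copy(source, 0, nativeBuffer, BufferSize)

            ' Without a length, conversion stops at the first null
            Dim truncated As String = Marshal.PtrToStringAnsi(nativeBuffer)
            ' With the returned length, the whole buffer is converted
            Dim full As String = Marshal.PtrToStringAnsi(nativeBuffer, BufferSize)

            Console.WriteLine(truncated.Length) ' prints 180
            Console.WriteLine(full.Length)      ' prints 1920
        Finally
            ' Always release the unmanaged memory
            Marshal.FreeHGlobal(nativeBuffer)
        End Try
    End Sub
End Module
```

In a real call, the Declare statement would take the buffer parameter as IntPtr, and the length passed to PtrToStringAnsi would be the value the DLL filled in rather than a constant.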

Adam Nathan's book ".NET and COM - The Complete Interoperability Guide" from
SAMS press has a complete description on how to get this to work.

Hope this helps
Jay
 
Jay,

Thanks for your answer.

I appreciate it very much. However, I have worked with EBCDIC for a long
time and therefore know its conversion problems very well, and because I am
a European, I also know about the extended ASCII code sets of the old DOS
environment. Therefore, I am very happy with the Unicode standards.

As you probably know, I am Dutch, and what is stranger is that code set 437,
which as far as I understand was/is the US standard, completely covered the
characters needed for Dutch. (Although Microsoft always installs code page
850 (International) on computers in Holland.)

Code set 437 contains the guilder sign (which is not used anymore). I never
figured out whether that guilder sign also has another meaning in the US.
That is the reason for this question. However, maybe it was just a stupid
mistake by IBM, who as far as I know designed that code page in those days.

Cor
 
Cor,
Ahhh! I follow you now!

No, we (at least I) don't use the guilder sign in the US. Of course, if I
were talking about Dutch currency (pre-Euro), then I would use the guilder...

You can use
System.Globalization.CultureInfo.CurrentCulture.TextInfo.ANSICodePage to
find the code page that Chr uses. For me it's 1252.

My System.Globalization.CultureInfo.CurrentCulture.TextInfo also has a
couple other interesting properties. The EBCDICCodePage is 37, while the
OEMCodePage is 437, and the MacCodePage is 10000...
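
A small sketch for checking these properties on your own machine, assuming a console program (the module name is made up). The values depend on the current culture, so no particular output is implied:

```vbnet
Imports System
Imports System.Globalization

Module CodePageDemo
    Sub Main()
        ' TextInfo for the culture the current thread is running under
        Dim info As TextInfo = CultureInfo.CurrentCulture.TextInfo

        ' Each property returns the code page identifier as an Integer
        Console.WriteLine("ANSI code page:   {0}", info.ANSICodePage)
        Console.WriteLine("OEM code page:    {0}", info.OEMCodePage)
        Console.WriteLine("EBCDIC code page: {0}", info.EBCDICCodePage)
        Console.WriteLine("Mac code page:    {0}", info.MacCodePage)
    End Sub
End Module
```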

Cor said:
"As you probably know I am Dutch and more strange is that the code set 437,"

For some reason I was thinking you were Spanish not Dutch.

Jay
 
Hi,
"The encoding on the file may be different than the encoding that Chr uses.

Also Chr & ChrW are independent of any Stream! (in that you can use them
without having a Stream)."

Chr and ChrW are identical for all values between 0 and 127 (decimal),
regardless of Code Page. Only for values higher than 127 is there variance.

You ABSOLUTELY should not rely on either Chr or ChrW for data higher than
127. Use only Byte or Byte array data. (IMO).
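
A minimal sketch of that comparison, assuming the .NET Framework on Windows as used in this thread (other runtimes may need a code page provider registered before Chr works above 127); the module name is made up:

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module ChrCompareDemo
    Sub Main()
        ' Count the codes in 0..127 where Chr and ChrW disagree
        Dim mismatches As Integer = 0
        For code As Integer = 0 To 127
            If Chr(code) <> ChrW(code) Then mismatches += 1
        Next
        Console.WriteLine(mismatches)       ' prints 0

        ' ChrW maps the number straight to that Unicode code point
        Console.WriteLine(AscW(ChrW(128)))  ' prints 128

        ' Chr goes through the ANSI code page, so the result varies by system.
        ' On code page 1252, byte 128 is the euro sign (U+20AC).
        Console.WriteLine(AscW(Chr(128)))
    End Sub
End Module
```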

Dick
--
Richard Grier (Microsoft Visual Basic MVP)

See www.hardandsoftware.net for contact information.

Author of Visual Basic Programmer's Guide to Serial Communications, 3rd
Edition ISBN 1-890422-27-4 (391 pages) published February 2002.
 
Hi,

I always use strings for this sort of thing. The reason for using a
StringBuilder object is the immutable nature of strings. However,
StringBuilder actually incurs even more overhead if you have to do any
significant parsing of the content.

Dick

 
Dick,
"Chr and ChrW are identical for all values between 0 and 127 (decimal),
regardless of Code Page. Only for values higher than 127 is there
variance."

I believe I have stated that!!!

"regardless of Code Page."

Except EBCDIC!

"You ABSOLUTELY should not rely on either Chr or ChrW for data higher than
127. Use only Byte or Byte array data. (IMO)."

I don't believe I ever have; I have been recommending ChrW only. From 0 to
127, Chr & ChrW are the same, while above 127 they are different.

I'm really not sure why you insist on repeating what I have stated, making
it sound like I stated something incorrectly. I do thank you for reaffirming
what I have stated.

Hope this helps
Jay
 