Problem with .net and Strings

  • Thread starter: Travis Ellis

Travis Ellis

I am having a problem with .NET strings.

Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();
if( foo == "" )
{
    Console.WriteLine("Empty"); // This never happens
}

I have noticed that when a string is built from null characters
((byte)0 or '\u0000'), Trim() does not remove them as whitespace.
In my example above the string looks like "" but has a .Length of 4, so
it does not compare equal to "" with the == operator.
If I do this instead:

if( foo.CompareTo("") == 0 )
{
    Console.WriteLine("Empty"); // this works
}

I noticed that CompareTo reports equality with an empty string when the
string was created from a byte or char array that is zero-filled.
Does anybody know why this is happening?

I can do a similar thing in Java and not have the same problem:

String foo = new String( new byte[]{ 0, 0, 0, 0 } ).trim();
System.out.println(foo.length()); // this prints 0 b/c the string is empty

Yet in C# the equivalent prints 4, even though CompareTo still reports that
foo is equal to "".
 
Travis Ellis said:
I am having a problem with .net Strings.

Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();
if( foo == "" )
{
    Console.WriteLine("Empty"); // This never happens
}

Sure. The null character isn't treated as a whitespace character in
.NET.

Java's String.trim() method treats all characters which have Unicode
values of 32 or less as whitespace; .NET has a somewhat stricter
definition.

If you want to trim nulls as well, you can use the overload of String.Trim
which takes a character array (as a params argument), e.g.:

x = x.Trim(' ', '\0');
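
To illustrate the Java side of this, here is a minimal sketch of the rule
described above: String.trim() strips leading and trailing characters whose
code is U+0020 or lower, so NUL is removed even though Character.isWhitespace
does not classify it as whitespace. The class and method names (TrimDemo,
lengthAfterTrim) are made up for this example.

```java
public class TrimDemo {
    // Length remaining after String.trim(). trim() removes leading and
    // trailing chars with a code of U+0020 or lower, which includes NUL.
    static int lengthAfterTrim(String s) {
        return s.trim().length();
    }

    public static void main(String[] args) {
        String nuls = "\0\0\0\0";   // four NUL characters
        String padded = "\0A\0";    // 'A' surrounded by NULs

        System.out.println(nuls.length());               // length before trimming
        System.out.println(lengthAfterTrim(nuls));       // length after trimming
        System.out.println(lengthAfterTrim(padded));     // only 'A' remains

        // trim() does not consult Character.isWhitespace, which reports
        // false for NUL.
        System.out.println(Character.isWhitespace('\0'));
    }
}
```

This is why the Java snippet in the original post reports an empty string,
while the parameterless Trim() in .NET leaves the four nulls in place.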
 
Travis Ellis said:
Encoding ascii = new ASCIIEncoding();
string foo = ascii.GetString( new byte[]{ 0, 0, 0, 0 } ).Trim();

if( foo.CompareTo("") == 0 )
{
    Console.WriteLine("Empty"); // this works
}

This is weird. The following prints 'This will display.'.

using System;
using System.Text;

class Test
{
    static void Main() {
        byte[] bytes = new byte[] { 65, 0, 0, 0, 0 };
        string s = Encoding.ASCII.GetString(bytes);

        if ("A".CompareTo(s) == 0) {
            Console.WriteLine("This will display.");
        }
        if ("A" == s) {
            Console.WriteLine("This won't.");
        }

        Console.Read();
    }
}

Changing the byte array declaration to

byte[] bytes = new byte[] { 65, 0, 65, 0, 0 };

stops that from happening, however. But then this expression evaluates to
true:

"AA".CompareTo(s) == 0

And so does this:

"\0AA".CompareTo(s) == 0

Conclusion: String.CompareTo, using my current culture, trims null
characters from both strings before performing the comparison, even though
this doesn't seem to be documented.
 
Complete speculation on my part, but I think it's more likely that it just
uses the data in chunks of 16 bits and ignores that dangling 8 bits of null
that it can't do anything with anyway.
--
Phil Wilson
[Microsoft MVP-Windows Installer]

C# Learner said:
Conclusion: String.CompareTo, using my current culture, trims null
characters from both strings before performing the comparison, even though
this doesn't seem to be documented.
 