XmlReader and first char

  • Thread starter: Steve B.

Steve B.

Hi,

I have a string that contains an XSL transformation:

string transform = @"
<?xml version=""1.0""?>
<xsl .....>
";

Notice that the first characters are \r\n.

I write this string into a MemoryStream, then I create a new XmlReader like
this:

XmlReader reader = XmlReader.Create(myMemoryStream);

This line throws an exception: the character 0x00 is not valid.
If I slightly change the code:

string transform = @"<?xml version=""1.0""?>
<xsl .....>
";

(the content now starts with the XML declaration)

The code then works correctly.

So my question is : why does the first \r\n make the XmlReader throw an
Exception ? I thought spaces are ignored (W3C specs).

Thanks in advance for any clarifications
Steve
 
When you stream, the streamreader will attempt to determine if the entire
transmission is bogus or not. A null char is a definite signal that the
stream is bad or non-existent. I am not sure why the white space in the
first position would be seen as a null char, but I am not overly surprised
either.

Remember that files, with the exception of ascii files, begin with real
characters, not white space. It appears the mentality of working with binary
files was extended to XML (incorrectly? perhaps).

--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

*************************************************
Think outside of the box!
*************************************************
 
So my question is : why does the first \r\n make the XmlReader throw an
Exception ? I thought spaces are ignored (W3C specs).

Spaces are ignored (to some extent) in most of XML, but looking at the
specs I can't see anything in the definition of the prolog part of the
XML which allows whitespace before the XMLDecl part.
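To see the prolog rule in isolation, here is a minimal sketch that feeds the parser through a StringReader, so no byte encoding is involved at all. The class name, the CanParse helper and the tiny <root/> document are made up for illustration; they are not from the original post.

```csharp
using System;
using System.IO;
using System.Xml;

class LeadingWhitespaceDemo
{
    // Hypothetical helper: reads the whole document and reports
    // whether the parser accepted it.
    static bool CanParse(string xml)
    {
        try
        {
            using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
            {
                while (reader.Read()) { }
            }
            return true;
        }
        catch (XmlException ex)
        {
            Console.WriteLine(ex.Message);
            return false;
        }
    }

    static void Main()
    {
        string doc = "<?xml version=\"1.0\"?><root/>";

        // Declaration at the very start: should parse.
        Console.WriteLine(CanParse(doc));

        // Whitespace before the declaration: expected to be rejected,
        // since the declaration is no longer the first thing in the document.
        Console.WriteLine(CanParse("\r\n" + doc));

        // Trimming the leading whitespace should make it parse again.
        Console.WriteLine(CanParse(("\r\n" + doc).TrimStart()));
    }
}
```

If the second call fails here too, the leading \r\n is a problem regardless of how the stream was encoded; the 0x00 message would then be a separate, encoding-related symptom.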
 
Cowboy (Gregory A. Beamer) said:
When you stream, the streamreader will attempt to determine if the entire
transmission is bogus or not. A null char is a definite signal that the
stream is bad or non-existent. I am not sure why the white space in the
first position would be seen as a null char, but I am not overly surprised
either.

My guess is that the stream has been created using a UTF-16 (or
similar) encoding, giving 0 as the first byte.

Remember that files, with the exception of ascii files, begin with real
characters, not white space.

On what grounds? There's no reason why a text file stored in UTF-16 shouldn't
start with spaces, for instance. It wouldn't be valid XML, but it's a
perfectly reasonable text file.
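The UTF-16 guess above can be checked with a short sketch. It assumes the string was written to the MemoryStream as UTF-16 without a byte order mark, which the original post does not confirm; the <root/> document and the class name are placeholders.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

class EncodingDemo
{
    static void Main()
    {
        // Placeholder document with the leading \r\n from the question.
        string transform = "\r\n<?xml version=\"1.0\"?><root/>";

        // Every ASCII character carries a zero byte in UTF-16:
        // little-endian puts it second, big-endian puts it first.
        byte[] le = Encoding.Unicode.GetBytes(transform);
        byte[] be = Encoding.BigEndianUnicode.GetBytes(transform);
        Console.WriteLine(BitConverter.ToString(le, 0, 4)); // 0D-00-0A-00
        Console.WriteLine(BitConverter.ToString(be, 0, 4)); // 00-0D-00-0A

        // One way to sidestep encoding detection entirely: hand the parser
        // characters instead of bytes, and trim the leading whitespace.
        using (XmlReader reader = XmlReader.Create(new StringReader(transform.TrimStart())))
        {
            while (reader.Read())
            {
                if (reader.NodeType == XmlNodeType.Element)
                {
                    Console.WriteLine(reader.Name);
                }
            }
        }
    }
}
```

A plausible reading, not verified against the XmlReader source: without a byte order mark the reader has to guess the encoding from the first bytes, and the guess relies on the document starting with "<?xml". With \r\n in front, the guess may fall back to a single-byte-compatible encoding, at which point the zero bytes look like invalid 0x00 characters.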
 
Jon Skeet said:
My guess is that the stream has been created using a UTF-16 (or
similar) encoding, giving 0 as the first byte.

That would make sense. I will file that one away in a place where it is
accessible. :-)

On what grounds? There's no reason why a text file stored in UTF-16
shouldn't start with spaces, for instance. It wouldn't be valid XML, but it's a
perfectly reasonable text file.


Okay, you have me on point #2. :-)

I got a bit too focused on XML.


--
Gregory A. Beamer
MVP; MCP: +I, SE, SD, DBA

 