why these character substitutions?

Mike

I found some code that writes text entries to an XML file. Part of the logic
includes a call to a method that processes every single character prior to
being written out to the XML file (see the code below). I'm just wondering
what the benefit is of making the substitutions made by this logic.

switch(c) {
case ">":
s += @"&gt;";
break;
case "<":
s += @"&lt;";
break;
case "\"":
s += @"&quot;";
break;
case "'":
s += @"&apos;";
break;
default:
s += c;
break;
}

return s;
 
This converts special characters into entity references that an XML parser
can process. For example, if you have
<node>text here < more text </node>, this is not valid XML: the '<'
character is in the node text, so an XML parser would try to read it as the
start of a new tag and reject the document.
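
To make that failure concrete, here is a minimal sketch that feeds a string
to XmlDocument.LoadXml and reports whether it parses. The class name
EscapeDemo and the method IsWellFormed are made up for this illustration.

```csharp
using System;
using System.Xml;

public static class EscapeDemo
{
    // Hypothetical helper: returns true if the string parses as
    // well-formed XML, false if the parser rejects it.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            new XmlDocument().LoadXml(xml);
            return true;
        }
        catch (XmlException)
        {
            // Thrown for input such as a raw '<' inside element text.
            return false;
        }
    }
}
```

IsWellFormed("<node>text here < more text </node>") returns false, while the
same node with the character written as &lt; returns true.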

However, there are easier ways to deal with this than doing it character by
character.
 
<<<there are easier ways to deal with this than doing it character by
character>>>

I think so too - the Replace() function might be a bit better. Do you have
another idea?
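
A rough sketch of what the Replace() approach could look like. The class and
method names are made up here, and it also escapes '&', which the switch in
the original post does not handle but which XML reserves as well.

```csharp
public static class XmlEscaper
{
    // Hypothetical helper: escapes the five characters XML reserves.
    public static string Escape(string text)
    {
        // "&" has to be replaced first; otherwise the ampersands added
        // by the later replacements would be escaped a second time.
        return text
            .Replace("&", "&amp;")
            .Replace("<", "&lt;")
            .Replace(">", "&gt;")
            .Replace("\"", "&quot;")
            .Replace("'", "&apos;");
    }
}
```

For example, XmlEscaper.Escape("a < b & c") returns "a &lt; b &amp; c".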
 
Yes, the HttpUtility class (in System.Web) has an HtmlEncode method, which
will take care of the special-character encoding for you.
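
A minimal sketch of that call, wrapped in a made-up helper class. Keep in
mind that HtmlEncode targets HTML rather than XML, so its output for some
characters may differ from the entity names in the switch above; on the
classic .NET Framework it also needs a reference to System.Web.

```csharp
using System.Web;

public static class HtmlEncodeDemo
{
    // Hypothetical wrapper around the framework's HttpUtility.HtmlEncode.
    public static string Encode(string text)
    {
        return HttpUtility.HtmlEncode(text);
    }
}
```

HtmlEncodeDemo.Encode("text here < more text") returns
"text here &lt; more text".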
 
Or use an XmlTextWriter, which is specifically meant for writing XML text,
for instance to a string.

System.Text.StringBuilder sb = new System.Text.StringBuilder();
System.IO.StringWriter sw = new System.IO.StringWriter(sb);
System.Xml.XmlTextWriter xw = new System.Xml.XmlTextWriter(sw);

Now you can use methods like
WriteStartElement / WriteEndElement
WriteAttributeString
WriteElementString
and many others.

Get the result with sb.ToString().
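
Put together, a small sketch along these lines (the WriterDemo class and
BuildNode method are only illustrative names) shows the writer doing the
escaping on its own:

```csharp
using System.IO;
using System.Text;
using System.Xml;

public static class WriterDemo
{
    // Hypothetical helper: writes one <node> element containing the
    // given text and returns the resulting XML as a string.
    public static string BuildNode(string text)
    {
        StringBuilder sb = new StringBuilder();
        using (StringWriter sw = new StringWriter(sb))
        using (XmlTextWriter xw = new XmlTextWriter(sw))
        {
            // The writer escapes reserved characters in the text itself.
            xw.WriteElementString("node", text);
        }
        return sb.ToString();
    }
}
```

WriterDemo.BuildNode("text here < more text") returns
"<node>text here &lt; more text</node>", so no manual substitution is needed.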


Hans Kesting
 