HttpUtility.HtmlEncode and HtmlDecode problems

In working with the System.Web.HttpUtility class, I've come across some
inconsistencies in encoding and decoding. If I start with the following code:

string s = @"&amp; &quot; &apos; &lt; &gt; &reg; &trade; &copy; &eacute;";
s = System.Web.HttpUtility.HtmlDecode(s);
s = System.Web.HttpUtility.HtmlEncode(s);

The value of the string after the call to HtmlDecode is:
& \" &apos; < > ® ™ © é

This indicates that it doesn't decode &apos; to a single quote ('), although
it decodes all the other entities correctly. Further, the value of the string
after the call to HtmlEncode is:
& " &apos; < > ® ™ © é

This indicates that it doesn't re-encode all the characters correctly
(notice the double-encoded &apos;, now &amp;apos;, a result of the initial
decoding error, as well as the failure to encode the ™ symbol at all).

Does anyone have any insight into why this is happening?
 
J80127 said:
Does anyone have any insight into why this is happening?

These methods are somewhat broken and famous for not encoding certain
characters like the apostrophe.
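
Part of the background is that &apos; is an XML/XHTML entity and is not in
the HTML 4 entity list. One possible workaround is sketched below. It is only
a sketch under assumptions: HtmlEntityWorkaround, Decode and Encode are
hypothetical names, not framework APIs, and System.Net.WebUtility.HtmlDecode
(.NET 4.0 and later) is swapped in for HttpUtility.HtmlDecode, which should
work the same way here. The idea is to rewrite &apos; as a numeric reference
before decoding, and to encode every non-ASCII character as a numeric
reference so that symbols like ™ are not skipped.

```csharp
using System;
using System.Globalization;
using System.Net;
using System.Text;

// Hypothetical helper for illustration; not part of the framework.
public static class HtmlEntityWorkaround
{
    // Rewrite &apos; as the numeric reference &#39; BEFORE decoding, so the
    // decoder only needs to understand numeric references. Doing it before
    // (not after) the decode leaves a literal "&amp;apos;" untouched.
    public static string Decode(string html)
    {
        return WebUtility.HtmlDecode(html.Replace("&apos;", "&#39;"));
    }

    // Encode markup characters explicitly and every non-ASCII character as a
    // decimal numeric reference.
    public static string Encode(string text)
    {
        var sb = new StringBuilder(text.Length);
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            switch (c)
            {
                case '&': sb.Append("&amp;"); break;
                case '<': sb.Append("&lt;"); break;
                case '>': sb.Append("&gt;"); break;
                case '"': sb.Append("&quot;"); break;
                case '\'': sb.Append("&#39;"); break;
                default:
                    if (c < 128)
                    {
                        sb.Append(c); // plain ASCII passes through
                    }
                    else if (char.IsHighSurrogate(c) && i + 1 < text.Length
                             && char.IsLowSurrogate(text[i + 1]))
                    {
                        // Combine a surrogate pair into one code point.
                        AppendReference(sb, char.ConvertToUtf32(c, text[i + 1]));
                        i++;
                    }
                    else
                    {
                        AppendReference(sb, c);
                    }
                    break;
            }
        }
        return sb.ToString();
    }

    private static void AppendReference(StringBuilder sb, int codePoint)
    {
        sb.Append("&#")
          .Append(codePoint.ToString(CultureInfo.InvariantCulture))
          .Append(';');
    }
}
```

With this, HtmlEntityWorkaround.Encode("™ 'x'") gives "&#8482; &#39;x&#39;",
and HtmlEntityWorkaround.Decode("&apos;") gives a plain single quote. Numeric
references are used on the encoding side because they do not depend on which
named entities a given consumer happens to know.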

See also
http://lab.msdn.microsoft.com/productfeedback/viewfeedback.aspx?feedbackid=7cf0356f-d2ff-47eb-858c-faf6226dee03

Cheers,
 