The right way to encode HTML output

  • Thread starter: ViperDK

ViperDK

What is the best way to do this?
I store all data in its original form in the database. To keep output
fields (especially fields that anyone can fill in) from doing bad things, such
as breaking the page design or, even worse, attacking my site with injected
JavaScript, I use code like
Output.Text = HttpUtility.HtmlEncode(userDefinedData); // Output is a WebControls.Label
plus my own functions that allow line breaks and handle links.
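
A minimal sketch of the "encode, then allow line breaks" idea described above. The helper name EncodeWithLineBreaks is hypothetical, and System.Net.WebUtility.HtmlEncode is assumed here in place of HttpUtility.HtmlEncode so the example runs outside a web project. Link handling is left out. The key point is the order: encode first, then add back only the markup you trust.

```csharp
using System;
using System.Net;

static class OutputHelper
{
    // Hypothetical helper: encode everything, then turn newlines into <br />.
    public static string EncodeWithLineBreaks(string raw)
    {
        if (string.IsNullOrEmpty(raw))
        {
            return string.Empty;
        }

        // Every markup character from the user is neutralized here.
        string encoded = WebUtility.HtmlEncode(raw);

        // Normalize Windows and old Mac line endings, then insert trusted markup.
        return encoded.Replace("\r\n", "\n")
                      .Replace("\r", "\n")
                      .Replace("\n", "<br />");
    }
}

static class Program
{
    static void Main()
    {
        string userDefinedData = "Hi <b>there</b>\nbye";
        Console.WriteLine(OutputHelper.EncodeWithLineBreaks(userDefinedData));
        // prints: Hi &lt;b&gt;there&lt;/b&gt;<br />bye
    }
}
```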

But this approach does not seem very safe: sooner or later you forget an
HtmlEncode call, and then there is a security risk.
I saw that an "HtmlControls.HtmlGenericControl" (an HTML label element marked
runat="server") has the very useful properties "InnerText" and "InnerHtml".
InnerHtml works like the Text property of the Label WebControl, but InnerText
automatically converts all special characters to their HTML entities. Why isn't
there something like "InnerText" on the Label WebControl? It seems very useful
to me, since you are more likely to forget an HtmlEncode before a Label than to
use the wrong property when setting the control's text.
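
To illustrate the "encode by default" design asked about above, here is a framework-free sketch. TextElement is a hypothetical stand-in, not an ASP.NET type. It only mimics the InnerText/InnerHtml pairing, and it assumes System.Net.WebUtility for the encoding so that it runs without System.Web.

```csharp
using System;
using System.Net;

// Hypothetical stand-in for a server control; not an ASP.NET type.
class TextElement
{
    // Raw markup, written to the page as-is.
    public string InnerHtml { get; set; }

    // Plain text: encoded on the way in, decoded on the way out.
    public string InnerText
    {
        get { return WebUtility.HtmlDecode(InnerHtml); }
        set { InnerHtml = WebUtility.HtmlEncode(value); }
    }
}

static class Program
{
    static void Main()
    {
        var element = new TextElement();
        element.InnerText = "<b>bold?</b>";
        Console.WriteLine(element.InnerHtml);
        // prints: &lt;b&gt;bold?&lt;/b&gt;
    }
}
```

With this shape, the safe path is the obvious one, and writing raw markup requires deliberately choosing InnerHtml.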

Or is the best solution to HtmlEncode all user input before writing
it to the database? On the one hand, it could be easier and more secure to
rely on well-formed data in the database; on the other hand, I think
producing valid HTML is the business of the ASP.NET application,
not of the database.
 
The best thing to do is always to validate any user input before passing it on
to your application. I would do this long before it gets to the database.

As for HtmlEncode, it takes the approach of neutralizing known bad characters
(it escapes them as HTML entities rather than removing them).
This is good. Even better, however, is explicitly allowing only known good
characters, which by definition excludes bad characters (both known and
unknown). You can do this using regular expression validators.
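
As a rough sketch of the allow-list idea, a server-side check might look like the following. The class name InputValidator and the rule itself (letters, digits, space, underscore, dot, hyphen, 1 to 50 characters) are assumptions chosen for illustration; the right pattern depends on the field.

```csharp
using System;
using System.Text.RegularExpressions;

static class InputValidator
{
    // Assumed rule: only these characters, 1 to 50 of them.
    // \A and \z anchor the whole string, so a trailing newline is rejected too.
    private static readonly Regex Allowed =
        new Regex(@"\A[A-Za-z0-9 _.\-]{1,50}\z");

    public static bool IsAllowed(string input)
    {
        return input != null && Allowed.IsMatch(input);
    }
}

static class Program
{
    static void Main()
    {
        Console.WriteLine(InputValidator.IsAllowed("Viper_DK"));  // True
        Console.WriteLine(InputValidator.IsAllowed("<script>"));  // False
    }
}
```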
 
Chris,

A few problems with this approach:

1. Screening data on the way in is not a sure thing. Your app is not
necessarily the only thing pushing data into the database. Even if it is now,
it might not be in the future. Even if it remains the only client, you can't
stop the DBA from doing whatever he/she likes to the data. I'm not
suggesting that validation should not be performed but, rather, that it's
not sufficient for adequate protection against the problems that
HTML-encoding solves.

2. Sometimes you must accept data that could cause display problems in a
given UI. It is the responsibility of the UI tier (or its developer <g>) to
ensure that these problems do not manifest. For example, in an app with
both web and Windows UIs, why should the Windows app deal with potential
HTML or JavaScript inclusions in user text? In fact, there might be cases
where such text should be accepted and/or rendered exactly as originally
submitted by a user. However, each UI should take care of rendering such text
appropriately (don't render, HTML-encode, or leave as-is).

3. HTML-encoding takes care of more than just the security-related issues
involved in rendering user input. Some characters must be mapped to their
HTML representations (e.g.: &, <, >) if they are to be displayed correctly
in the browser. They should not be stored in the database in their HTML
representations, since other clients (other application UIs, reporting tools,
the DBA) might need to read the stored text as well.
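
A small sketch of why the encoding belongs in the UI tier. The Rendering class and its method names are hypothetical, and System.Net.WebUtility is assumed so the example runs in a console project. The stored value stays raw, each UI renders it its own way, and encoding text that was already stored encoded corrupts it.

```csharp
using System;
using System.Net;

static class Rendering
{
    // Web UI: encode at output time.
    public static string ForWeb(string stored)
    {
        return WebUtility.HtmlEncode(stored);
    }

    // Windows UI or reporting tool: show the stored text unchanged.
    public static string ForDesktop(string stored)
    {
        return stored;
    }
}

static class Program
{
    static void Main()
    {
        string stored = "AT&T <Support>";  // raw text, as kept in the database

        Console.WriteLine(Rendering.ForWeb(stored));      // AT&amp;T &lt;Support&gt;
        Console.WriteLine(Rendering.ForDesktop(stored));  // AT&T <Support>

        // If the database held the encoded form instead, the web tier's
        // normal encoding step would encode it a second time.
        string storedEncoded = WebUtility.HtmlEncode(stored);
        Console.WriteLine(Rendering.ForWeb(storedEncoded));
        // prints: AT&amp;amp;T &amp;lt;Support&amp;gt;
    }
}
```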

Nicole
 