Alter Rendered HTML for page


Kersh

I'm trying to implement XHTML standards in my ASP.NET web pages, but
whenever I use web controls I get problems because of the very strict
nature of W3C XHTML (the Transitional version is picky, but Strict 1.1 is
very severe!)

e.g. align="Center" fails validation because of the capital "C" - this
makes the asp:Calendar control unusable because it happens to emit this
attribute on each of its cells. Also, <script> tags require a 'type'
attribute...etc

My question is - is there a generic event that would allow me to
'intercept' the rendered html and manipulate it so these standards can
be met?

P.S. I admit that the best answer would be for Microsoft to include
DTD selection options for the HTML produced from web forms pages.
Visual Studio 2002 only allows HTML 4.0 Transitional and that's what
I'm using - can't say I'm not (very) disappointed either.

Microsoft staff please take note! I work for the UK government and
XHTML forms part of their e-government standards. There is a lot of
business to be gained by making XHTML output part of ASP.NET - think
accessibility - all for the sake of a few inconsistencies in case
etc...
 
Can arbitrary HTML be reliably translated into XHTML?

I would question whether you're going to have general success in making a
non-XHTML toolset comply with standards it doesn't understand. I would think
that it would be more effective for the British Government to simply dump
ASP.NET and send a nice letter to Microsoft telling them how much business
they just lost. You could then spend your time developing applications for
toolsets which actually care what your requirements are.
 
Hmmm...thanks for replying John, but that's not a very useful comment
unfortunately.

The HTML I am transforming is not arbitrary - ASP.NET still produces
good HTML and the aspx page is still designed by myself - hence could
hardly be described as "arbitrary" - it's 98% XHTML - just happens to
contain some server control rendered attributes & tags that are
invalid.

I just need to tweak a few common tags/attributes I'm finding
(align="Center", type="text/css" etc). If I can intercept the HTML at
an event like page_prerender I can easily write a component to
validate the HTML output. The problem is that it would have to
encapsulate the entire page's HTML not just that of the individual
controls - because there are things like <script> tags that need
modifying that are inserted to facilitate the postbacks...

ASP.NET is the best tool available for our requirements (fast, neat,
scalable, reusable development). Try to remember that server
controls are only a small part of this technology - even if I never
used them I'd still be using ASP.NET - but I anticipate and hope for
the ability to implement multiple DTDs up to XHTML strict 1.1 as part
of the next framework release.

Any answers containing some useful code?
 
Keith, if your ASP.NET page contains arbitrary Server Controls, then it will
generate arbitrary HTML. A control can generate whatever HTML it likes -
ASP.NET has no say in the matter.

Also, if you need to translate some small, fixed number of ASP.NET pages
which you are designing, then I'm sure you can do the translation. But if
others are working on the project, or if the project has more than a few
pages or exists over more than a short span of time, then you will rapidly
approach "arbitrary".

Also, to the extent that you are not the person controlling the HTML to be
translated, then you will not have control over when the generator of that
HTML should decide to change the details of how it's generated. You'll be in
the same position as some programmers I've met who wrote code which depended
on the precise text of an error message generated by another piece of code,
and who were disappointed when the error message changed to use proper
grammar.

It's because of things like these that I asked whether there was some
general algorithm for translating valid HTML into valid XHTML. It seems to
me that only such an algorithm will succeed in the long run.

Also, it sounds to me as though you are at the beginning of a process which
may involve you using your translator for the next three years or so - until
you can deploy the hypothetical version of ASP.NET which is fully XHTML
compliant, along with compliant versions of all of the controls used by all
of your pages. Keep in mind that compliance to your requirements may perhaps
not be achieved in version 2.0 of ASP.NET, even if it _is_ an obvious
feature to include in a 2.0 release. Talk to someone about Microsoft Visual
C++ and how one is always waiting for the "next release" for full standards
compliance.

My recommendation about the Government clearly indicating their requirements
to Microsoft ( ;-) ) was not entirely facetious. Given what I perceive as
the likely course you're about to embark upon (making someone else's code do
something they never intended that it should do), I thought that it may be
early enough in Microsoft's development cycle that forceful action could
cause them to actually meet your requirements in the 2.0 timeframe. "I can't
use your product and so won't be buying any" is more likely to cause the
desired reaction than "I can use your product, but only with this workaround
I've written and maintain, and we'll be buying it anyway, so you needn't
lose sleep over it".

BTW, you say that this is a UK standard, but I wonder if there are similar
EU standards? Collective action might be even more effective.

Any answers containing some useful code?

Oh, you wanted code, did you? :-)

Well, before I go into code, I'll quickly mention that the "political"
solutions I discussed might be effective in causing Microsoft to give you a
supported solution to your problem in addition to focusing their attention
on the issue for the 2.0 release. It might also put you in continued contact
with a person at Microsoft who may act as a conduit into Microsoft for your
concerns on this matter. In other words, you might require them to provide
you with a workaround as a condition of purchase.

You have a few choices given the appropriate translation algorithm. You can
derive all of your pages from a single base class, which itself derives from
System.Web.UI.Page. This class would override the Render method:

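// Requires: using System.Web.UI;

/// <summary>
/// Render - override to modify the output of a derived page
/// </summary>
/// <param name="writer">The HtmlTextWriter to send the output to</param>
protected override void Render(HtmlTextWriter writer)
{
    // Render the page into an in-memory buffer instead of the response
    System.IO.StringWriter sw = new System.IO.StringWriter();
    HtmlTextWriter localWriter = new HtmlTextWriter(sw);
    base.Render(localWriter);

    string output = sw.ToString();

    // Do what you like with the output (ToUpper is only a placeholder to
    // show that the hook works; substitute your own translation here)
    output = output.ToUpper();

    // Send the modified markup to the real writer
    writer.Write(output);
}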
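
As a rough illustration of what could replace the ToUpper placeholder, here is a minimal sketch of the two fix-ups mentioned earlier in the thread (lower-casing align values and adding a missing type attribute to script tags). The class name XhtmlFixups and the method Fix are hypothetical, the regular expressions are assumptions about what the controls emit rather than a general HTML-to-XHTML translator, and the lambda syntax assumes a newer C# compiler than the one shipped with Visual Studio 2002.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: patches a few known non-XHTML patterns in rendered markup.
public static class XhtmlFixups
{
    // Matches align="Center", valign="Top", etc. so the value can be lower-cased.
    private static readonly Regex AlignValue =
        new Regex("align=\"([A-Za-z]+)\"");

    // Matches an opening <script ...> tag that carries no type attribute.
    private static readonly Regex ScriptWithoutType =
        new Regex(@"<script\b(?![^>]*\btype\s*=)([^>]*)>", RegexOptions.IgnoreCase);

    public static string Fix(string html)
    {
        // Lower-case the attribute value only; the rest of the tag is untouched.
        html = AlignValue.Replace(
            html,
            m => "align=\"" + m.Groups[1].Value.ToLowerInvariant() + "\"");

        // Add the type attribute, keeping any attributes already present.
        html = ScriptWithoutType.Replace(html, "<script type=\"text/javascript\"$1>");
        return html;
    }
}
```

Inside the Render override, the placeholder line would then become something like output = XhtmlFixups.Fix(output);. Pattern-based patching of this kind only covers the cases it was written for, so it would need revisiting whenever the controls change what they emit.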

The other option involves using Response.Filter. Replace that with a stream
of your choice to modify the output as it comes out. That will be less
convenient to program for your needs, since the output does not come out all
at once.
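
For comparison, a minimal sketch of the Response.Filter approach follows. Because the output arrives in pieces, this version simply buffers everything and rewrites it once when the stream is closed. The class name RewritingFilter is hypothetical, UTF-8 is assumed as the response encoding, and the sketch relies on ASP.NET closing the filter at the end of the response, which is worth verifying in your environment.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical filter: buffers all output, rewrites it once, then passes it on.
public class RewritingFilter : Stream
{
    private readonly Stream inner;                  // the original output stream
    private readonly Func<string, string> rewrite;  // the translation to apply
    private readonly MemoryStream buffer = new MemoryStream();
    private bool closed;

    public RewritingFilter(Stream inner, Func<string, string> rewrite)
    {
        this.inner = inner;
        this.rewrite = rewrite;
    }

    // Collect each chunk; nothing is sent on until Close.
    public override void Write(byte[] data, int offset, int count)
    {
        buffer.Write(data, offset, count);
    }

    // Intentionally empty: flushing early would emit partial markup.
    public override void Flush() { }

    public override void Close()
    {
        if (!closed)
        {
            closed = true;
            // Assumption: the response is UTF-8 encoded.
            string html = Encoding.UTF8.GetString(buffer.ToArray());
            byte[] bytes = Encoding.UTF8.GetBytes(rewrite(html));
            inner.Write(bytes, 0, bytes.Length);
            inner.Flush();
        }
        base.Close();
    }

    // The filter is write-only, so the remaining members are not supported.
    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] b, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
}
```

In a page or module it would be installed with something like Response.Filter = new RewritingFilter(Response.Filter, html => html.Replace("align=\"Center\"", "align=\"center\""));. Buffering the whole response gives up streaming, which is the price of seeing the markup all at once.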

Good Luck,
John Saunders
Internet Engineer
(e-mail address removed)

P.S. Out of curiosity and self-interest, I wonder: with which version of the
XHTML standard does the UK Government require compliance? Since I work at a
UK-based company creating web sites, this might become interesting to me.
 
Firstly, it's Kersh - not Keith.
It's a nickname. My Surname is 'Kershaw'.

OK dude. I didn't want to get into a discussion. You don't know how to
alter the HTML output from Server Controls in an ASP.NET page? That's
OK then.

I don't have time for a crusade to get Microsoft to standardise the
output of ASP.NET WebControls to XHTML. We simply live in hope.

And here's the definition of Arbitrary for you:
Determined by chance, whim, or impulse, and not by necessity, reason,
or principle: stopped at the first motel we passed, an arbitrary
choice.

ALL my HTML AND ASP.NET server controls are inserted by a team working
to the same XHTML principles, and we have examined the controls to
determine exactly which ones will produce a few non-XHTML
tags/attributes and which won't. NOTHING chance, random, or impulsive
about that mate, it's all cause, effect, & logic. I.e. NON-ARBITRARY
no matter how many pages & controls I develop, because we work to
standards and procedures.

NO RANDOM FACTORS in there at all. I can also clearly identify 5-6
modifications to the HTML produced by them that would make ANY GENERIC
PAGE that uses these controls XHTML compliant. So if there isn't an
algorithm, I'm going to write a basic version of it.

OK - maybe I'd have to modify the algorithm if I upgraded to ASP.NET
2.0, but I'll do that when I come to it. As for your error text
example - don't take me for a total amateur - I would be writing a
sort of crude but generic XHTML parser - but it would never be that
idiotic!

To be honest John I'd quite like an answer - not a lecture about why I
shouldn't be asking the question. But you don't have an answer for me,
do you?
 
Kersh,

Did you notice that my most recent post included some code in answer to your
question?
 
No I didn't John. But I will test it out and I'm sure it will work
fine.
Apologies for my ramblings there - especially seeing as you had an
answer for me. I was in the wrong on that one - although I still feel
I know when to use the word arbitrary if you fancy throwing a
dictionary at me!

The UK government advocate adoption of XHTML standards generally as
part of its E-gif (interoperability framework) - therefore mostly
XHTML 1.0 transitional will do - although many local government
organisations seem not to be able to do what they want under the
restrictions demanded by XHTML and so they ignore the standards.

One example would be the APLAWS project (Accessible and Personalised
Local Government Websites, http://www.aplaws.org.uk). This is supposed
to supply software to run sites according to XHTML & accessibility
guidelines supplied by the RNIB, Bobby and W3C but unfortunately
drastically fails to meet any of them. (It is also ludicrously
difficult to install, set up, use, and manage. We got a working example
running but had to abandon it because everyone who used it, whether from
a technical perspective or not, found it a nightmare. My theory is that
there are a few London Borough Councillors with shares in Red Hat Linux
and Oracle!)

However, I feel that these standards should be adhered to and so will
do so with all my webpages, with a little help from you John and my
XHTML checking algorithm...

If you want to read more check out the website of the deputy prime
minister, the IDeA, or search for E-gif standards.
 
Thanks for the response, Kersh.

I'll check out the references you provided. I've been very concerned ever
since I heard about governments demanding adherence to incomplete standards.

And, BTW, I was dead serious about how to get MS to actually adhere to the
(eventual) standard, especially considering how the EU anti-trust people
feel about them
(http://news.com.com/2100-1016_3-5060463.html?tag=fd_nbs_ent).
 