HTML parser

  • Thread starter: Dmitriy Lapshin [C# / .NET MVP]

Dmitriy Lapshin [C# / .NET MVP]

Hi Mark,

If you're looking for a simple solution, you could use Microsoft IE's HTML
parser, which is pretty good at dealing with unclosed tags. I'd also say
such a routine won't be short, since the parser has to keep track of each
unclosed tag and its context in order to insert the right closing tags in
the right places.
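
To give a feel for the bookkeeping involved, here is a minimal sketch in Python using the standard library's html.parser. It is not a full repair tool: the AUTO_CLOSE set, the TagCloser class and the close_tags function are names made up for this illustration, and the rule "a new <p>, <br> or <li> closes any of those still open" is an assumed simplification of real HTML nesting rules. Comments and doctype declarations are simply dropped.

```python
from html.parser import HTMLParser

# Hypothetical rule for this sketch: these tags cannot nest inside each
# other, so opening one closes any of them that is still open.
AUTO_CLOSE = {"p", "br", "li"}


class TagCloser(HTMLParser):
    def __init__(self):
        # Keep entities such as &amp; untouched so they can be re-emitted.
        super().__init__(convert_charrefs=False)
        self.out = ""        # repaired markup built so far
        self.open_tags = []  # stack of tags still waiting for an end tag

    def _close_top(self):
        # Put the end tag before any trailing whitespace so it lands
        # on the same line as the text it closes.
        tag = self.open_tags.pop()
        body = self.out.rstrip()
        self.out = body + f"</{tag}>" + self.out[len(body):]

    def handle_starttag(self, tag, attrs):
        if tag in AUTO_CLOSE:
            while self.open_tags and self.open_tags[-1] in AUTO_CLOSE:
                self._close_top()
        self.open_tags.append(tag)
        self.out += self.get_starttag_text()

    def handle_startendtag(self, tag, attrs):
        # Self-closed tags such as <br/> need no end tag.
        self.out += self.get_starttag_text()

    def handle_endtag(self, tag):
        if tag not in self.open_tags:
            return  # stray end tag with no matching start tag: drop it
        # Close anything opened after the matching start tag first.
        while self.open_tags[-1] != tag:
            self._close_top()
        self.open_tags.pop()
        self.out += f"</{tag}>"

    def handle_data(self, data):
        self.out += data

    def handle_entityref(self, name):
        self.out += f"&{name};"

    def handle_charref(self, name):
        self.out += f"&#{name};"


def close_tags(html):
    """Return html with an end tag inserted for every unclosed start tag."""
    parser = TagCloser()
    parser.feed(html)
    parser.close()
    while parser.open_tags:  # close whatever is still open at the end
        parser._close_top()
    return parser.out
```

With these assumptions, close_tags("<b>bold <i>both</b>") returns "<b>bold <i>both</i></b>". Even this toy version needs a stack plus per-tag rules, which is why a robust routine ends up longer than one would hope.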
 
Mark said:
Hi, I am using a program that is ultra paranoid about start and end HTML
tags.

For example:

<p>This is a test
<br>A new line

The above code causes the program to fail.

<p>This is a test</p>
<br>A new line</br>

The above code works fine.

Does anyone have a short routine that looks for a start HTML tag and, if a
matching end tag does not exist, inserts one? I could write one myself,
but I'd rather not reinvent the wheel ;)

Thanks in advance
Mark
 
If you need to clean up existing HTML, consider using TidyCOM or the C#
wrapper for it (http://www.mattstan.pwp.blueyonder.co.uk/tidy/tidycs.html).
If you're manipulating existing HTML, use the DOM exposed by MSHTML. If
you're creating HTML from scratch, just use an XML DOM.

r.
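
The "create it from scratch with an XML DOM" option can be sketched like this. The example uses Python's xml.dom.minidom purely as a stand-in for whichever XML DOM you have at hand (the same createElement / appendChild pattern exists in .NET's XmlDocument), and build_fragment is a hypothetical helper name. The point is that a DOM serializer cannot emit an unclosed tag, so the output always satisfies a strict consumer.

```python
from xml.dom.minidom import Document


def build_fragment():
    """Build the two-line example as a DOM tree and serialize it."""
    doc = Document()
    body = doc.createElement("body")
    doc.appendChild(body)

    # <p>This is a test</p>
    p = doc.createElement("p")
    p.appendChild(doc.createTextNode("This is a test"))
    body.appendChild(p)

    # <br>A new line</br>, the form the strict program accepts
    br = doc.createElement("br")
    br.appendChild(doc.createTextNode("A new line"))
    body.appendChild(br)

    # Serializing an element (rather than the document) omits the
    # <?xml ...?> declaration.
    return body.toxml()


print(build_fragment())
# prints: <body><p>This is a test</p><br>A new line</br></body>
```

Note that an element with no children would be written as <br/>, which is still well-formed but may or may not suit the consuming program.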

 