Parsing HTML

  • Thread starter: Guest

Guest

I have a large number of HTML files (10,000+) and need to modify them
programmatically on a regular basis.

How can I extract the text content of each file, ignoring the tags?

For example, if I have a line such as:
<TD WIDTH="524"><B><FONT SIZE="2" FACE="Times New Roman"
COLOR="#000000">Direction2</FONT></B></TD>

I need to return "Direction2" and nothing else.

Any ideas?

guy
 
The only way I know of is to use MSHTML. It exposes the HTML object
model, so you can load a document into an HTML DOM and then read the inner
text of any node, and so on.
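
To make the "inner text" idea concrete, here is a minimal sketch. It does not use MSHTML; it swaps in Python's standard-library html.parser, which reports the text between tags through a callback. The names TextExtractor and inner_text are made up for this illustration, and the sketch assumes the files contain no script or style blocks (their contents would be treated as text too).

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text that sits outside a tag.
        text = data.strip()
        if text:
            self.chunks.append(text)


def inner_text(html):
    """Return the text of an HTML fragment with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.chunks)


# The line from the question, including its mid-tag line break.
sample = ('<TD WIDTH="524"><B><FONT SIZE="2" FACE="Times New Roman"\n'
          'COLOR="#000000">Direction2</FONT></B></TD>')
print(inner_text(sample))  # prints: Direction2
```

For 10,000+ files you would call inner_text on the contents of each file in a loop. Something like this may be enough if you only need the text; a full DOM such as MSHTML is the better fit if you also need to modify nodes and write the HTML back out.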
 