Strip HTML Tags

  • Thread starter: Guest

Guest

Does anyone have an example of how to strip HTML tags from an already existing text file?

Thanks
 
AdamD,

I think that the easiest way to do this would be to load the file into
an instance of HTMLDocument (you can access it through COM interop by
setting a reference to mshtml in the project references, there should
already be a managed wrapper there). Once you have that, you can just get
the InnerText property of the tag representing the body, and it will return
the text, sans tags.
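
A rough sketch of that approach, assuming the project references the Microsoft.mshtml interop assembly and runs on Windows. The class name MshtmlTextExtractor and its method names are made up for illustration; only the mshtml types and members come from the interop library.

```csharp
using System.IO;
using mshtml; // requires a project reference to Microsoft.mshtml (COM interop, Windows only)

public static class MshtmlTextExtractor
{
    // Hypothetical helper: parses the markup with MSHTML and returns the body text.
    public static string GetBodyText(string html)
    {
        // Create the COM document and use its IHTMLDocument2 interface.
        IHTMLDocument2 doc = (IHTMLDocument2)new HTMLDocument();

        // Feed the markup to the parser, then close the write stream.
        doc.write(html);
        doc.close();

        // innerText of the body element is the rendered text without tags.
        return doc.body.innerText;
    }

    // Hypothetical helper: reads an existing file and strips its markup.
    public static string GetBodyTextFromFile(string path)
    {
        return GetBodyText(File.ReadAllText(path));
    }
}
```

Since MSHTML is a full parser, this should also cope with things like comments and entities, at the cost of a COM dependency.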

Hope this helps.
 
Many thanks.

I also found that I could use a .NET regular expression to strip everything between the HTML tag delimiters < and >, including the delimiters themselves.
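
For anyone finding this later, a minimal sketch of the regular expression route. The class name HtmlStripper and the exact pattern are my own choices for illustration, not necessarily what was used above.

```csharp
using System.Text.RegularExpressions;

public static class HtmlStripper
{
    // Matches a '<', then any run of characters that are not '>', then the closing '>'.
    // The negated class also matches newlines, so tags split across lines are covered.
    private static readonly Regex TagPattern = new Regex("<[^>]*>");

    // Returns the input with every tag-like span removed.
    public static string StripTags(string html)
    {
        return TagPattern.Replace(html, string.Empty);
    }
}
```

For a file on disk, something like HtmlStripper.StripTags(System.IO.File.ReadAllText(path)) should do. Note that this is only a simple filter: it leaves entities such as &amp;amp; untouched, keeps the contents of script and style blocks, and can be fooled by a '>' inside an attribute value.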
 