Rob
Hi,
I've written a small VB application that parses an HTML document and
removes code I don't need and re-writes the file. I'm looking for the
regex pattern that will remove the following code:
<head>
<meta http-equiv=Content-Type content="text/html; charset=windows-1252">
<!--[if !mso]>
<style>
v\:* {behavior:url(#default#VML);}
o\:* {behavior:url(#default#VML);}
w\:* {behavior:url(#default#VML);}
.shape {behavior:url(#default#VML);}
</style>
<![endif]-->
<title>T1 CRR Online Manual</title>
<style>
<!--
-->
</style>
</head>
...of course the page continues after the </head> tag but I want to
remove everything within the head tags including the head tags. This has
to work on any HTML file that I parse so the contents within the head
tags may be different. This is what I've got so far:
pattern = "<head>[.|\s|\n]*<\/head>"
returntext = Regex.Replace(returntext, pattern, "")
...but this doesn't work. Anyone out there with a solution?
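In case it helps, here is a minimal sketch of the call boiled down to a console program. The module name, the Main wrapper and the short sample string are only for illustration and are not from my real application. As far as I can tell, inside the square brackets the dot and the pipe are treated as literal characters, so the class only allows dots, pipes and whitespace between the two tags, and the text comes back unchanged:

```vb
Imports System
Imports System.Text.RegularExpressions

Module HeadStripRepro
    Sub Main()
        ' Hypothetical sample input; the real files are much larger.
        Dim returntext As String = "<html><head>" & vbLf &
            "<title>T1 CRR Online Manual</title>" & vbLf &
            "</head><body>content</body></html>"
        Dim original As String = returntext

        ' Same pattern and call as in the snippet above.
        Dim pattern As String = "<head>[.|\s|\n]*<\/head>"
        returntext = Regex.Replace(returntext, pattern, "")

        ' Prints True: the pattern never matched, so nothing was removed.
        Console.WriteLine(returntext = original)
    End Sub
End Module
```

The match gets past the line break after <head> but stops at the first "<" of <title>, which is not in the character class, so </head> is never reached.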
Thanks
Rob