Regular Expressions

  • Thread starter: Jonathan Wolfson

Jonathan Wolfson

Given an xml string s that looks like:

<?xml version="1.0" encoding="iso-8859-1"?>
<!DOCTYPE foo PUBLIC "-//Netscape Communications//DTD RSS 0.91//EN"
"http://www.goo.com/xml_dtds/zoo.dtd">
<xxx version="1.01">
<zzz>ggg</zzz>
</xxx>

How do you remove the stuff within a <!DOCTYPE ... > tag such that the
result is:

<?xml version="1.0" encoding="iso-8859-1"?>
<xxx version="1.01">
<zzz>ggg</zzz>
</xxx>

s = System.Text.RegularExpressions.Regex.Replace(s, PATTERN, "")?

In other words, what should I put in for PATTERN?

I tried "<!DOCTYPE*>" but that did not work.

Thanks,
Jon
 
Hi Jon,

I am not sure about the regular expression, but you can
achieve the same result using XmlDocument. First load
the XML into an XmlDocument and then remove its second node
(the DOCTYPE declaration).

regards
Sreejumon[MVP]
DOTNET makes IT happen
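
For reference, the XmlDocument route described above could look roughly like the sketch below. The class and method names (DoctypeRemover, RemoveDoctype) are made up for illustration. It uses the DocumentType property rather than counting child nodes, so it does not depend on the DOCTYPE being exactly the second node. Setting XmlResolver to null is an assumption on my part, meant to stop the parser from trying to download the external DTD.

```csharp
using System;
using System.Xml;

public static class DoctypeRemover
{
    // Hypothetical helper: parses the XML and drops the DOCTYPE node, if any.
    public static string RemoveDoctype(string xml)
    {
        XmlDocument doc = new XmlDocument();

        // Assumption: with no resolver the external DTD is not fetched.
        doc.XmlResolver = null;

        // Throws XmlException if the input is not well-formed XML.
        doc.LoadXml(xml);

        // DocumentType is null when the document has no DOCTYPE declaration.
        XmlDocumentType doctype = doc.DocumentType;
        if (doctype != null)
        {
            doc.RemoveChild(doctype);
        }

        // Serialize the remaining nodes back to a string.
        return doc.OuterXml;
    }
}
```

Note that this re-serializes the whole document, so insignificant whitespace such as the line breaks between elements is not kept unless PreserveWhitespace is set to true before loading. It also only works when the input parses as XML in the first place.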
 
Hi Jonathan

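A regex pattern that works (i.e. matches) is "<!DOCTYPE[^>]*>". Your
attempt failed because in a regex "*" is not a wildcard: it means "zero or
more of the preceding character", so "<!DOCTYPE*>" only matches "<!DOCTYP"
followed by any number of E's and then ">". The working call is: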

s = System.Text.RegularExpressions.Regex.Replace(s, "<!DOCTYPE[^>]*>", "")

If you want to find out more about Regular Expressions, check under
VS.NET->.NET Framework->Reference->Regular Expression Language Elements in
the VS.NET Help.

Peter
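
As a self-contained illustration of the pattern above, here is a small C# sketch. The class and method names (DoctypeRegex, Strip, NaiveMatches) are hypothetical. The trailing \r?\n? is an addition that is not part of the pattern given above; it is there only so that no empty line is left where the declaration used to be.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DoctypeRegex
{
    // "<!DOCTYPE" literally, then any run of characters other than '>'
    // (a negated class also matches line breaks), then the closing '>'.
    // The optional \r?\n? (an addition) swallows one following line break.
    private const string Pattern = @"<!DOCTYPE[^>]*>\r?\n?";

    // Hypothetical helper: returns the input with the DOCTYPE declaration removed.
    public static string Strip(string xml)
    {
        return Regex.Replace(xml, Pattern, "");
    }

    // Shows why the original attempt fails: '*' quantifies the 'E',
    // so this needs '>' immediately after "<!DOCTYP" plus zero or more E's.
    public static bool NaiveMatches(string xml)
    {
        return Regex.IsMatch(xml, "<!DOCTYPE*>");
    }
}
```

One limitation to keep in mind: [^>]* stops at the first '>', so a DOCTYPE with an internal subset (declarations inside [ ... ]) would only be partly removed. For input like that, parsing the document is probably the safer route.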
 