Regular Expression Help!

  • Thread starter: Steve Peterson

Steve Peterson

Hi - I'll admit it.. I'm clueless using regular expressions. That's why I
was hoping someone could help me out a bit. I need to search an html file
and find strings that are between tags. For example, everything in between
the <body></body> tags. As you well know, this string could span many lines,
white spaces, tabs, etc. So I need an expression that will match the
<body> tag, ignore EVERYTHING, then match the end </body> tag.

Anyone to the rescue?? :)

Thanks a lot in advance
Steve
 
Hi,

I have been using this for doing the job in C#:

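// Requires: using System.Text.RegularExpressions;
private string getBodyText(string inputString)
{
    // [^>]* skips any attributes on the opening tag; (?<body>...) is a named
    // group, and the lazy .*? stops at the first closing </body> tag.
    string pattern = @"<body\b[^>]*>(?<body>.*?)</body>";
    // Singleline lets '.' match newlines, so the body can span many lines.
    Regex r = new Regex(pattern,
        RegexOptions.IgnoreCase | RegexOptions.Singleline | RegexOptions.Compiled);
    Match m = r.Match(inputString);
    if (m.Success)
    {
        return m.Groups["body"].Value;
    }
    else
    {
        // No <body> element found: fall back to the whole input.
        return inputString;
    }
}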
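
Since the question was about text between tags in general, here is a minimal sketch of the same idea for an arbitrary tag name. The class name TagText and the method GetTagContents are hypothetical names made up for this example. It assumes the tag is not nested inside itself and that the markup is reasonably well formed; a regex is not a full HTML parser.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagText
{
    // Hypothetical helper: returns the inner text of every
    // <tagName ...>...</tagName> pair found in html, in document order.
    public static List<string> GetTagContents(string html, string tagName)
    {
        // Escape the tag name so it is always treated as literal text.
        string tag = Regex.Escape(tagName);

        // \b       : the tag name must end here (so "p" does not match "<pre>")
        // [^>]*    : skip any attributes on the opening tag
        // (?<inner>.*?) : lazily capture everything up to the first closing tag
        string pattern = "<" + tag + @"\b[^>]*>(?<inner>.*?)</" + tag + @"\s*>";

        // Singleline makes '.' match newlines, so content may span many lines.
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

        List<string> results = new List<string>();
        foreach (Match m in Regex.Matches(html, pattern, options))
        {
            results.Add(m.Groups["inner"].Value);
        }
        return results;
    }
}
```

Calling TagText.GetTagContents(html, "body") returns a list with one entry per body element (normally just one), and passing "p" returns the contents of each paragraph. Because the match is lazy, nested tags of the same name are cut off at the first closing tag, which is where a DOM-based approach becomes the safer choice.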

Regards,
Svend
 
Did the trick! Thanks!!!
Steve


 
Hi Steve,

You could host the Microsoft Web Browser (ShDocVw) and use the Document
Object Model to do the work for you.

In your case it would be
Browser.Document.Body.InnerHtml
(just as it would in JavaScript or VBScript).

Regards,
Fergus
 
