Multiline Regex Question


David Elliott

I have an expression that works for single line but not multiline. What am I missing?

expression = "<div(?<data1>.*?)>(?<data2>.*?)</div>";
MatchCollection mc = Regex.Matches(data, expression,
RegexOptions.Multiline |
RegexOptions.IgnoreCase |
RegexOptions.IgnorePatternWhitespace);


Expression works if data looks like this:
<div id="bodyblktx10">5679664</div>
Result:
data1 -=> id="bodyblktx10"
data2 -=> 5679664

Expression DOESN'T work if data looks like this:
<div id="bodyblktx10">

5679664

</div>


Thanks,
Dave
 
1) RegexOptions.Multiline doesn't do what you think. It only changes how
^ and $ are evaluated: they match at the start and end of every line
instead of only at the start and end of the whole string.

2) By default, . does not match \n, so .*? can never cross a line
break.
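
To see both points in isolation, here is a minimal sketch. The class name MultilineDemo, the Count helper and the sample text are made up for illustration and are not from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class MultilineDemo
{
    // Hypothetical helper: counts matches of a pattern under the given options.
    public static int Count(string input, string pattern, RegexOptions options)
    {
        return Regex.Matches(input, pattern, options).Count;
    }

    public static void Main()
    {
        // Sample text (an assumption for this demo): three lines separated by \n.
        string text = "one\ntwo\nthree";

        // Without Multiline, ^ and $ anchor to the whole string only,
        // so a single-word line pattern finds nothing.
        Console.WriteLine(Count(text, @"^\w+$", RegexOptions.None));      // 0

        // With Multiline, ^ and $ also match at each line boundary.
        Console.WriteLine(Count(text, @"^\w+$", RegexOptions.Multiline)); // 3

        // Multiline does not change ".": it still refuses to match \n.
        Console.WriteLine(Regex.IsMatch(text, "one.two", RegexOptions.Multiline)); // False
    }
}
```

The last line is the situation in the question: the option is set, but . still stops at the line break.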

Therefore (and oddly enough)
you want to replace RegexOptions.Multiline with
RegexOptions.Singleline.

The official documentation for RegexOptions.Singleline states:

Specifies single-line mode. Changes the meaning of the period
character (.) so that it matches every character (instead of every
character except \n).
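
Applied to the code from the question, the change might look like the sketch below. The pattern and the other two options are copied from the original post. The DivMatcher class, the FindDivs method and the Trim() calls are assumptions added here so the example runs on its own and prints tidy values.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DivMatcher
{
    // Same pattern as in the question; only the options differ.
    private const string Expression = "<div(?<data1>.*?)>(?<data2>.*?)</div>";

    // Hypothetical wrapper around the Regex.Matches call from the question,
    // with RegexOptions.Multiline swapped for RegexOptions.Singleline.
    public static MatchCollection FindDivs(string data)
    {
        return Regex.Matches(data, Expression,
            RegexOptions.Singleline |
            RegexOptions.IgnoreCase |
            RegexOptions.IgnorePatternWhitespace);
    }

    public static void Main()
    {
        // The multi-line input that failed in the question.
        string data = "<div id=\"bodyblktx10\">\n\n5679664\n\n</div>";

        foreach (Match m in FindDivs(data))
        {
            // Trim() is an addition: with Singleline the captured groups
            // include the surrounding spaces and line breaks.
            Console.WriteLine("data1 -=> " + m.Groups["data1"].Value.Trim());
            Console.WriteLine("data2 -=> " + m.Groups["data2"].Value.Trim());
        }
    }
}
```

Note that data2 now contains the line breaks around the number, so trim it (or tighten the pattern) if you only want the value itself.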
 