question about RegEx group capture in C#

  • Thread starter: Jose

Jose

There's something for me to learn with this example, I'm sure :)

Given this text:
"[Contact].[Region].[All ContactRegion].[ASIA
PACIFIC].[Japan].[Japan]"
and my first attempt at capturing the groups:
"(?:\[)(.+?)(?:\])"
RegExTest gives me what I expect: 6 captured groups: Contact, Region,
All ContactRegion, ASIA PACIFIC, Japan, Japan.

However, with this C# code, I get just 2 groups: "[Contact]" and
"Contact":

Regex r = new Regex(@"(?:\[)(.+?)(?:\])");
Match m = r.Match(LongParameterValue);
GroupCollection gc = m.Groups; //group count is 2

Am I capturing this incorrectly in C#?

Thanks.
 
Hey,

I haven't done this in a while, but I believe that when you're matching,
you're looking only for the first occurrence. Match returns just that one;
you need a global search (the equivalent of the /g flag in other regex
flavors) so that it finds all of the instances of your pattern.
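
For what it's worth, .NET has no /g flag; the usual way to get every match is Regex.Matches. Here is a minimal sketch, assuming the value sits on a single line. The names BracketParts and Extract are made up for illustration and are not from the original post.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class BracketParts
{
    // Returns the text inside each [...] pair, in order of appearance.
    // (Hypothetical helper, not part of the original code.)
    public static List<string> Extract(string input)
    {
        var parts = new List<string>();
        // Regex.Matches returns every non-overlapping match, not just the first.
        foreach (Match m in Regex.Matches(input, @"\[(.+?)\]"))
        {
            // Group 0 is the whole match; group 1 is the text between the brackets.
            parts.Add(m.Groups[1].Value);
        }
        return parts;
    }

    public static void Main()
    {
        string value = "[Contact].[Region].[All ContactRegion].[ASIA PACIFIC].[Japan].[Japan]";
        foreach (string part in Extract(value))
        {
            Console.WriteLine(part); // one part per line, six lines in total
        }
    }
}
```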

Hope this helps. Just an idea.

Nick Harris, MCSD
http://www.VizSoft.net
 
Hi,
[inline]

Jose said:
There's something for me to learn with this example, I'm sure :)

Given this text:
"[Contact].[Region].[All ContactRegion].[ASIA
PACIFIC].[Japan].[Japan]"
and my first attempt at capturing the groups:
"(?:\[)(.+?)(?:\])"

You don't need the non-capturing groups. Simplified, this becomes:
@"\[(.+?)\]"

RegExTest gives me what I expect: 6 captured groups: Contact, Region,
All ContactRegion, ASIA PACIFIC, Japan, Japan.

If you look at the regex, there are 2 groups:
- the first group is implicit (the entire match, if there is a match)
- the second group is your (.+?)
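
To see those two groups concretely, here is a small sketch that lists the groups of the first match only. GroupDemo and FirstMatchGroups are illustrative names, and the input is shortened for brevity.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GroupDemo
{
    // Returns the value of every group in the FIRST match, group 0 first.
    // (Hypothetical helper for illustration.)
    public static string[] FirstMatchGroups(string input)
    {
        Match m = Regex.Match(input, @"\[(.+?)\]");
        var values = new string[m.Groups.Count];
        for (int i = 0; i < m.Groups.Count; i++)
        {
            values[i] = m.Groups[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string[] groups = FirstMatchGroups("[Contact].[Region]");
        Console.WriteLine(groups.Length); // 2
        Console.WriteLine(groups[0]);     // [Contact]  (implicit group: the entire match)
        Console.WriteLine(groups[1]);     // Contact    (the (.+?) group)
    }
}
```

This is why the group count is 2 no matter how many bracketed parts the input has: the count reflects the groups in the pattern, not the number of matches in the text.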

It is possible for a capture group to contain more than one captured string.
For this to work, the capture group itself must be repeated, and the pattern
anchored with ^ and $ so that it parses the entire string, as in:
@"^(?:\[(.+?)\]\.?)+$"

If you execute this, there will still be 2 groups, but the second group will
contain 6 captures.

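Regex r = new Regex(@"^(?:\[(.+?)\]\.?)+$");
Match m = r.Match(LongParameterValue);
// Enumerate the Captures collection of the second group (index 1);
// a Group cannot be enumerated directly.
foreach (Capture c in m.Groups[1].Captures)
{
    Console.WriteLine(c.Value); // prints a part
}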
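
One thing that is easy to trip over with a repeated group: Group.Value only gives the last capture, while Group.Captures keeps all of them. A minimal sketch, assuming the value is on one line; CaptureDemo and AllCaptures are names invented for this example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CaptureDemo
{
    private const string Pattern = @"^(?:\[(.+?)\]\.?)+$";

    // Returns every capture of group 1 from a single whole-string match.
    // Returns an empty array when the input does not match at all.
    public static string[] AllCaptures(string input)
    {
        Group g = Regex.Match(input, Pattern).Groups[1];
        var values = new string[g.Captures.Count];
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = g.Captures[i].Value;
        }
        return values;
    }

    public static void Main()
    {
        string value = "[Contact].[Region].[All ContactRegion].[ASIA PACIFIC].[Japan].[Japan]";
        Console.WriteLine(AllCaptures(value).Length);                  // 6
        Console.WriteLine(Regex.Match(value, Pattern).Groups[1].Value); // Japan (last capture only)
    }
}
```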


Now to make it more complicated, there is another way, which I like more.
Using your original regex: @"\[(.+?)\]"
The problem was that it only matches one part (which is in the second group,
first capture), so another solution is to apply the regex repeatedly until
all parts have been matched.

Regex r = new Regex(@"\[(.+?)\]");
Match m = r.Match(LongParameterValue);
while (m.Success)
{
    Console.WriteLine(m.Groups[1].Value); // print a part
    // note that this is the same as using m.Groups[1].Captures[0].Value;
    m = m.NextMatch(); // continue searching after the previous match
}


HTH,
Greetings
 