Questions on a Regular Expression

Ioannis Vranos · Feb 25, 2005

Given the regular expression:

S"^([a-zA-Z]+|[a-zA-z]+\\s[a-zA-Z]+)$"

1) Isn't the "[a-zA-Z]+|[a-zA-z]+" part redundant? As far as I can
understand it means exactly the same as "[a-zA-Z]+" alone.

2) Isn't the parenthesis grouping redundant?

3) How can we define the parenthesis characters themselves as expected
characters in a match?

Thanks in advance.

Carl Daniel [VC++ MVP] · Feb 25, 2005

Ioannis said:
Given the regular expression:

S"^([a-zA-Z]+|[a-zA-z]+\\s[a-zA-Z]+)$"

1) Isn't the "[a-zA-Z]+|[a-zA-z]+" part redundant? As far as I can
understand it means exactly the same as "[a-zA-Z]+" alone.

No, because of the alternative - it's

[a-zA-Z]+

-or-

[a-zA-z]+\\s[a-zA-Z]+

2) Isn't the parenthesis grouping redundant?

Since it's the entire expression, yes. If this expression was embedded
inside a larger regex then no - it defines the limits of the alternative.

3) How can we define the parenthesis characters themselves as expected
characters in a match?

Just escape them: \\(. You shouldn't need to escape the right paren in
most cases - just the left.
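
The escaping advice is easy to check by trying it. The sketch below uses Python's `re` module as a stand-in, which is an assumption: the thread is about .NET regular expressions written in Managed C++, where each backslash is doubled inside the S"..." literal. In Python, at least, a right parenthesis with no matching left one is rejected as unbalanced, so escaping both parentheses is the safer habit.

```python
import re

# Literal parentheses around a word: both parens are escaped.
# This is a raw string, so one backslash here corresponds to "\\(" in a C++ literal.
literal_parens = re.compile(r"^\([a-zA-Z]+\)$")

print(bool(literal_parens.match("(hello)")))  # True
print(bool(literal_parens.match("hello")))    # False

# Leaving the right paren unescaped is rejected by Python's re module.
try:
    re.compile(r"^\([a-zA-Z]+)$")
    right_paren_needs_escape = False
except re.error:
    right_paren_needs_escape = True

print(right_paren_needs_escape)  # True
```

Other engines may be more lenient about a stray right parenthesis, so it is worth running the same check in whichever engine is actually in use.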

-cd

Ioannis Vranos · Feb 26, 2005

Carl said:
1) Isn't the "[a-zA-Z]+|[a-zA-z]+" part redundant? As far as I can
understand it means exactly the same as "[a-zA-Z]+" alone.

No, because of the alternative - it's

[a-zA-Z]+

-or-

[a-zA-z]+\\s[a-zA-Z]+

I did not understand what you mean by the above. Could you explain in
more detail?

Since it's the entire expression, yes. If this expression was embedded
inside a larger regex then no - it defines the limits of the alternative.

Just escape them: \\(. You shouldn't need to escape the right paren in
most cases - just the left.

Ok, thanks for the info.

Carl Daniel [VC++ MVP] · Feb 26, 2005

Ioannis said:
Carl said:

1) Isn't the "[a-zA-Z]+|[a-zA-z]+" part redundant? As far as I can
understand it means exactly the same as "[a-zA-Z]+" alone.

No, because of the alternative - it's

[a-zA-Z]+

-or-

[a-zA-z]+\\s[a-zA-Z]+

I did not understand what you mean by the above. Could you explain
in more detail?

The alternation operator has low precedence - lower than concatenation, so

(bob|joe|sue)

parses as 'bob' or 'joe' or 'sue', not as 'bo'+('b' or 'j')+'o'+('e' or
's')+'ue'.

Similarly,

[a-zA-Z]+|[a-zA-Z]+\\s[a-zA-Z]+

parses as

'[a-zA-Z]+' or '[a-zA-Z]+\\s[a-zA-Z]+'

instead of

('[a-zA-Z]+' or '[a-zA-Z]+')\\s[a-zA-Z]+

Does that make sense?
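
The precedence rule can be seen by comparing the two readings directly. This is a minimal sketch in Python's `re` module (an assumption; the thread itself targets .NET, but alternation precedence is the same in both). The names `pattern` and `misread` are made up for the illustration: `misread` spells out what the expression would mean if `|` bound more tightly than concatenation.

```python
import re

# Alternation binds more loosely than concatenation:
# "bob|joe|sue" is three whole-word alternatives.
names = re.compile(r"^(bob|joe|sue)$")
candidates = ["bob", "joe", "sue", "bojoeue", "bobsue"]
accepted = [w for w in candidates if names.match(w)]
print(accepted)  # ['bob', 'joe', 'sue']

# The thread's pattern: one word, OR two words separated by one whitespace character.
pattern = re.compile(r"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$")

# Hypothetical misreading: the alternation covers only the first word,
# so the whitespace and second word become mandatory.
misread = re.compile(r"^([a-zA-Z]+|[a-zA-Z]+)\s[a-zA-Z]+$")

for text in ["John", "John Smith"]:
    print(text, bool(pattern.match(text)), bool(misread.match(text)))
```

A single word such as "John" is accepted by `pattern` but not by `misread`, which is the practical difference between the two parses.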

The original expression could be factored, since the alternatives have a
common prefix:

[a-zA-Z]+(\\s[a-zA-Z]+)?

I would expect a DFA-based regex engine might well do that factoring as a
matter of course when computing the DFA.
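
As a spot check on the factoring, the sketch below compares the original alternation with the factored form on a handful of inputs. It uses Python's `re` module as a stand-in for the .NET engine (an assumption), and writes the factored form with a single `\s` so that it mirrors the original pattern exactly. The sample strings are made up for the illustration, and agreement on them is evidence of equivalence, not a proof.

```python
import re

original = re.compile(r"^([a-zA-Z]+|[a-zA-Z]+\s[a-zA-Z]+)$")
factored = re.compile(r"^[a-zA-Z]+(\s[a-zA-Z]+)?$")

samples = ["John", "John Smith", "John  Smith", "John Smith Jr", "", "John ", "J0hn"]

# Map each sample to (matches original, matches factored).
results = {s: (bool(original.match(s)), bool(factored.match(s))) for s in samples}

for s, (a, b) in results.items():
    print(repr(s), a, b)

print(all(a == b for a, b in results.values()))  # True
```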

-cd

Ioannis Vranos · Feb 26, 2005

Carl said:
The alternation operator has low precedence - lower than concatenation, so

(bob|joe|sue)

parses as 'bob' or 'joe' or 'sue', not as 'bo'+('b' or 'j')+'o'+('e' or
's')+'ue'.

Similarly,

[a-zA-Z]+|[a-zA-Z]+\\s[a-zA-Z]+

parses as

'[a-zA-Z]+' or '[a-zA-Z]+\\s[a-zA-Z]+'

instead of

('[a-zA-Z]+' or '[a-zA-Z]+')\\s[a-zA-Z]+

Does that make sense?

The original expression could be factored, since the alternatives have a
common prefix:

[a-zA-Z]+(\\s[a-zA-Z]+)?

I would expect a DFA-based regex engine might well do that factoring as a
matter of course when computing the DFA.

Thanks for the explanation.

Serge Baltic · Feb 26, 2005

IV> S"^([a-zA-Z]+|[a-zA-z]+\\s[a-zA-Z]+)$"

Note that the [A-z] character set listed above (in the second group) includes
non-alphabetic characters.
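
To see which characters slip in, the sketch below lists the ASCII characters that sit between 'Z' and 'a' and then compares the mistyped class with the intended one. It is written in Python as a stand-in (an assumption; the range semantics are the same for ASCII in .NET), and the names `wide` and `strict` are made up for the illustration.

```python
import re

# Characters with ASCII codes between 'Z' (90) and 'a' (97), exclusive.
between = [chr(c) for c in range(ord("Z") + 1, ord("a"))]
print(between)  # ['[', '\\', ']', '^', '_', '`']

wide = re.compile(r"^[a-zA-z]+$")    # the typo from the thread: A-z
strict = re.compile(r"^[a-zA-Z]+$")  # the intended class: A-Z

# An underscore is accepted by the mistyped class but not by the intended one.
print(bool(wide.match("foo_bar")), bool(strict.match("foo_bar")))  # True False
```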

Ioannis Vranos · Feb 26, 2005

Serge said:
IV> S"^([a-zA-Z]+|[a-zA-z]+\\s[a-zA-Z]+)$"

Note that the [A-z] character set listed above (in the second group)
includes non-alphabetic characters.

Thanks for the correction; it was just a typo of mine. It was meant to be:

S"^([a-zA-Z]+|[a-zA-Z]+\\s[a-zA-Z]+)$"
