Regex question - why doesn't "(\d+).{\1}" match "3abc"?

  • Thread starter: Jon Shemitz

Jon Shemitz

"(\d+).{3}" does match "3abc", as expected, and "(\d+).{\2}" doesn't
compile. So, why doesn't "(\d+).{\1}" match "3abc"?
 
Jon Shemitz said:
"(\d+).{3}" does match "3abc", as expected, and "(\d+).{\2}" doesn't
compile. So, why doesn't "(\d+).{\1}" match "3abc"?

Why would it match? I haven't seen anywhere that the number in {n} can be a
backreference. Have you seen that somewhere?

John Saunders
 
John said:
Why would it match? I haven't seen anywhere that the number in {n} can be a
backreference. Have you seen that somewhere?

No, I haven't. But "(\d+).{\1}" is not disallowed the way that
"(\d+).{\2}" is, so it looked like the engine understands
backreferences in count clauses at SOME level.

OTOH, it compiles "(\d+).{c}" as well - looks like the {} parser is
not as robust as it could be.
 
Jon Shemitz said:
No, I haven't. But "(\d+).{\1}" is not disallowed the way that
"(\d+).{\2}" is, so it looked like the engine understands
backreferences in count clauses at SOME level.

No, it doesn't understand backreferences in count clauses. It understands
backreferences well enough to know that there is a \1 but not a \2.
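
One way to see what the parser may be doing with "{\1}" is to try the same patterns in an engine you can poke at. The sketch below uses Python's re module purely as a stand-in; the thread never names the engine, so treat any parallel as an assumption rather than a description of that engine. In Python, a "{" that does not start a valid count clause is taken as a literal brace, so ".{\1}" reads as: any character, a literal "{", whatever group 1 captured, a literal "}".

```python
import re

# Assumption: Python's re module stands in for the engine in the thread,
# which is never named; other engines may parse these patterns differently.
pattern = re.compile(r"(\d+).{\1}")

# Read as a count clause, the pattern would need to match "3abc". It does not.
count_style = pattern.fullmatch("3abc") is not None

# Read as literal braces around a backreference, it should match "3a{3}".
literal_style = pattern.fullmatch("3a{3}") is not None

# "{c}" is not a valid count clause either, so it is also taken literally.
literal_c = re.fullmatch(r"(\d+).{c}", "3x{c}") is not None

# A reference to a group that does not exist is rejected when compiling.
try:
    re.compile(r"(\d+).{\2}")
    group_2_compiles = True
except re.error:
    group_2_compiles = False

print(count_style, literal_style, literal_c, group_2_compiles)
```

If the engine in question behaves the same way, that would explain all three observations at once: "{\1}" and "{c}" compile because the braces are ordinary characters, and "{\2}" fails only because group 2 does not exist.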

John Saunders
 
My answer would be: because even if you pretend that the backreference does
get evaluated, you're then asking for (\d+).{"3"} instead of (\d+).{3}. That
is, the string "3" instead of the value 3. There is no implicit mechanism
that I know of for converting the string value of a backreference into a
numeric value, so there is no support for using backreferences in a count
clause.

The best I can think of would be for you to process the string twice. Once
with the (\d+) regex to get the "3". Then, manually build the string
"(\d+).{3}" and see if your input matches that. At least because the "3" is
a string and the second regex is also a string, you get to skip converting
the "3" to an integer and back! Sort of ironic, that, given the nature of
why you can't use a backreference in a count clause in the first place...
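
A minimal sketch of that two-pass idea, written in Python's re module as an assumption since the thread does not name its engine. The function name matches_counted is made up for illustration. It also varies the suggestion slightly: the second pattern pins the captured digits literally instead of reusing \d+, so the engine cannot backtrack to a shorter number and then count one of the digits as an ordinary character.

```python
import re

def matches_counted(text: str) -> bool:
    """Return True if text is a decimal count N followed by exactly N characters.

    Hypothetical helper for illustration; not taken from the thread.
    """
    # Pass 1: capture the leading digits. The result is a string such as "3".
    head = re.match(r"[0-9]+", text)
    if head is None:
        return False
    count = head.group()

    # Pass 2: splice that string into a count clause, giving e.g. "3.{3}".
    # No integer conversion is needed because the pattern is itself a string.
    second_pass = re.compile(count + ".{" + count + "}")
    return second_pass.fullmatch(text) is not None

print(matches_counted("3abc"))
print(matches_counted("3ab"))
```

fullmatch requires the whole input to be consumed, so "3abcd" is rejected here; use match instead if trailing characters should be allowed.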
 