Florian Haag
Hi,
I'm not sure whether this is the right group; I'm trying to achieve the
following with .NET's Regex class:
I want to match strings while ignoring how much whitespace separates the parts.
In a simple case, this would of course mean something like
a\s+b
which would match not only "a b", but also "a  b", "a   b" etc.
However, a case like
a\s+b?\s+c
no longer works for me, as it would only match "a  c"
(two spaces in between), not "a c" (one space in between), if the "b"
is omitted. I can work around this by using an expression like
a\s+(b\s+)?c
, which would already require modifying the input,
though, as users of the target application will not bother to include
the second whitespace in the optional part of the expression when they
enter it (using a very simplified and otherwise limited syntax,
which I'd like to convert to a regex).
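To make the difference concrete, here is a minimal sketch of the two patterns, taking the final token to be "c" as in the example strings. It uses Python's re module purely as a stand-in for .NET's Regex class (an assumption made so the example is runnable); the constructs used here, \s+, ? and plain groups, should behave the same way in both. The helper name matches is hypothetical.

```python
import re

# Assumption: Python's re stands in for .NET's Regex; the constructs
# used below (\s+, ?, groups) are common to both engines.
naive = re.compile(r"a\s+b?\s+c")      # needs two separate whitespace runs
grouped = re.compile(r"a\s+(b\s+)?c")  # second run belongs to the optional part


def matches(pattern, text):
    """Return True if the whole text matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


for text in ["a b c", "a  c", "a c"]:
    print(repr(text), matches(naive, text), matches(grouped, text))
```

Only the last string, with a single space and no "b", tells the two patterns apart: the first pattern rejects it, the second accepts it.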
Things get even more complicated in cases like this:
(a|b\s+)(c|\s+d)
It seems to me that I cannot evaluate this directly but instead have to
replace it with
ac|a\s+d|b\s+c|b\s+d
in order to make it match "b d" (one space in between), too, not only
"b  d" (two spaces in between).
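The behaviour described above can be checked with a short sketch. Again Python's re module is only a stand-in for .NET's Regex class (an assumption), and the helper name matches is hypothetical.

```python
import re

# Assumption: Python's re stands in for .NET's Regex for these patterns.
factored = re.compile(r"(a|b\s+)(c|\s+d)")      # as entered
expanded = re.compile(r"ac|a\s+d|b\s+c|b\s+d")  # alternatives written out by hand


def matches(pattern, text):
    """Return True if the whole text matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


for text in ["ac", "a d", "b c", "b  d", "b d"]:
    print(repr(text), matches(factored, text), matches(expanded, text))
```

The factored form rejects the one-space "b d" because both of its \s+ need at least one character each, while the expanded form accepts it.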
I wonder whether this can be done by putting each \s+ into a named
group and then using an alternation construct that references the groups
of any possibly adjacent whitespace, thereby determining whether another \s+
is required to match.
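One possible reading of that idea, as a rough sketch only: capture the whitespace after "b" in a named group (called ws here, a hypothetical name) and let a conditional skip the second \s+ when that group has already matched. The example uses Python's re module as a stand-in (an assumption); Python writes the named group as (?P<ws>...), whereas .NET writes it as (?<ws>...), and both support the conditional form (?(ws)yes|no). It covers only this one pattern, not the general conversion.

```python
import re

# Assumption: Python's re stands in for .NET's Regex.
# (?P<ws>\s+) records the whitespace that follows "b".
# (?(ws)|\s+) means: if ws took part in the match, require nothing more;
# otherwise require a whitespace run before "d".
conditional = re.compile(r"(?:a|b(?P<ws>\s+))(?:c|(?(ws)|\s+)d)")


def matches(pattern, text):
    """Return True if the whole text matches the pattern (hypothetical helper)."""
    return pattern.fullmatch(text) is not None


for text in ["ac", "a d", "b c", "b d", "b  d", "ad", "bd"]:
    print(repr(text), matches(conditional, text))
```

With this form both "b d" and "b  d" are accepted without writing out every alternative, while "ad" and "bd", which contain no whitespace at all, are still rejected.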
But maybe there's another, simpler (and maybe even faster?) way to
achieve this?
Thanks in advance,
Florian