Problem with a regular expression when parsing a SQL statement

  • Thread starter: robert kurz

robert kurz

Hello NG,

I am trying to parse a SQL statement with regular expressions.
My goal is to extract the individual parts of the statement. I thought
the grouping feature of regular expressions should be able to do this.

My pattern looks like this:

A) string strPattern = "(select.+)(from.+)(where.+)?(order.+)?";

My search strings look like this:

1) string strText = "select * from x";
2) string strText = "select * from x where x.a=1";
3) string strText = "select * from x where x.a=1 order by x.a";
4) string strText = "select * from x order by x.a";

The resulting groups are:

3)
group 0: the statement
group 1: select *
group 2: from x where x.a=1 order by x.a

I don't understand the behaviour of the optional groups. I expected
the optional patterns, e.g. (where.+)?, to be captured when they are
present and to be left out when they are not.

Where is my mistake?

Without the ? in (where.+)? and (order.+)?, I get the result I want
for 3).

Thanks for helping, robert
 
robert kurz wrote:

I don't understand the behaviour of the optional groups. I expected
the optional patterns, e.g. (where.+)?, to be captured when they are
present and to be left out when they are not.

Where is my mistake?

Matching is greedy by default, meaning that as much as possible is
matched. If you use
(from.+?)
instead, the match is non-greedy: after "from", at least one arbitrary
(".") character is matched, but as few characters as possible rather
than as many as possible.
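
To make the difference concrete, here is a minimal sketch. The class and method names (GreedyDemo, FirstGroup) are made up for illustration and are not part of any library; only Regex.Match and Match.Groups come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyDemo
{
    // Returns the text captured by the first group,
    // or an empty string if the pattern does not match at all.
    public static string FirstGroup(string pattern, string text)
    {
        Match m = Regex.Match(text, pattern);
        return m.Success ? m.Groups[1].Value : "";
    }
}
```

For statement 3), FirstGroup("(from.+)", ...) gives "from x where x.a=1 order by x.a", because the greedy .+ runs to the end of the line. FirstGroup("(from.+?)", ...) gives only "from " (the keyword plus a single character), because nothing after the group forces the lazy .+? to continue. Once something has to follow the group, as in "(from.+?) where", the result is "from x". So a non-greedy group usually needs a following token or an end anchor to be useful.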
 
Martin Honnen said:

Matching is greedy by default, meaning that as much as possible is
matched. If you use
(from.+?)
instead, the match is non-greedy: after "from", at least one arbitrary
(".") character is matched, but as few characters as possible rather
than as many as possible.

Hello Martin,

Thank you for your answer.

In my opinion the problem is divided into two parts. The first is that
non-optional groups rank one level higher than optional ones. The
second is that .+ is greedy, so the optional group is never used.

Am I on the right track, or am I wrong? If I'm right, I have no idea
how to solve my SQL problem.

robert
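
One possible way to get all four parts, as a minimal sketch rather than a real SQL parser: make every clause non-greedy, require whitespace between clauses, and anchor the pattern with ^ and $ so the match must cover the whole statement. The optional groups are then used whenever their keyword is present. The class name SqlClauseSplitter, the method Split and the group names are assumptions chosen for illustration. The sketch also assumes single-line statements of the simple form shown in 1) to 4); subqueries, keywords inside string literals and clauses such as "group by" are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SqlClauseSplitter
{
    // Lazy quantifiers (.+?) keep each clause as short as possible; the
    // trailing $ forces them to grow until the rest of the statement fits.
    // The last clause can stay greedy because nothing follows it.
    private static readonly Regex ClausePattern = new Regex(
        @"^(?<select>select\s.+?)\s+(?<from>from\s.+?)" +
        @"(?:\s+(?<where>where\s.+?))?(?:\s+(?<order>order\s+by\s.+))?$",
        RegexOptions.IgnoreCase);

    // Returns only the clauses that are present in the statement.
    // An empty dictionary means the statement did not match at all.
    public static Dictionary<string, string> Split(string sql)
    {
        var parts = new Dictionary<string, string>();
        Match m = ClausePattern.Match(sql.Trim());
        if (!m.Success)
        {
            return parts;
        }

        foreach (string name in new[] { "select", "from", "where", "order" })
        {
            Group g = m.Groups[name];
            // Group.Success is false for an optional group that was skipped.
            if (g.Success)
            {
                parts[name] = g.Value;
            }
        }
        return parts;
    }
}
```

For statement 4), Split returns "select *", "from x" and "order by x.a" under the keys select, from and order, with no where entry. For statement 3) it additionally returns "where x.a=1". Checking Group.Success is what makes an absent clause "disappear" instead of showing up as an empty string.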
 