Regex to recognize math/string functions

Tim Conner

Hi,

Thanks to Peter, Chris and Steven, who answered my previous question about
using a regex to split a string. Actually, it was as easy as creating a regex
with the pattern "/*-+()," and most of my string was split.
I am fascinated by how powerful this Regex class is, so I wonder if it
could go a step further.

My question: can a regex be used to validate a set of different functions?
Example: suppose I have to verify the correctness of an input string, which
may contain one or more of the following functions:

Round ( NumericValue, Decimals)
Lower( StringValue )
Upper( StringValue )
Abs(NumericValue)

... there will be about 15 functions, but let's name just these four.

Note: I only want to validate the input. I don't intend to evaluate
these functions, just validate the input in terms of:
1.- Data type of parameters.
2.- Matching parentheses.
(The evaluation of the functions will be done by third-party code.)


So, if I receive:
Abs("VB is great").

I would reject that sentence because the characters between the parentheses
are a string, not a numeric value.

But if I instead receive:
Upper( "C# is the best thing since sliced bread")

I would accept the sentence because the parameter is of the proper type.

Also:
Round( 1234.56, 2

would be invalid, due to the missing closing parenthesis.

Finally, the functions can be nested.


So, the question is: can Regex handle this, or should I start looking at
parser libraries?


Thanks in advance,
 
Hi Tim,

I think you COULD use RegExp to perform such a validation, but there are
more suitable tools for such tasks: lexical analyzers and parsers. These are
state machines controlled by so-called syntax graphs describing what is valid
for the grammar and what is not. I suppose RegExp uses a similar engine behind
the scenes, building a syntax graph from the regular expression you
provide, but the expression can grow enormously for complex
grammars.
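
As a rough illustration of the lexical-analysis step, here is a minimal
tokenizer sketch in Python (used only for brevity; the same idea should carry
over to the .NET Regex class). The token names, the TOKEN_SPEC table and the
tokenize helper are all hypothetical, and it assumes unsigned decimal numbers
and double-quoted strings without escaped quotes. A tokenizer only splits the
input into pieces; checking nesting and parameter types would still be up to a
parser consuming these tokens.

```python
import re

# Hypothetical token set for the function-call language in the question.
# Assumption: numbers are unsigned decimals; strings are double-quoted and
# contain no escaped quotes.
TOKEN_SPEC = [
    ("NUMBER", r"\d+(?:\.\d+)?"),
    ("STRING", r'"[^"]*"'),
    ("NAME",   r"[A-Za-z_]\w*"),
    ("LPAREN", r"\("),
    ("RPAREN", r"\)"),
    ("COMMA",  r","),
    ("SKIP",   r"\s+"),
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (kind, lexeme) pairs; raise ValueError on an unexpected character."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace is recognized but not emitted
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

For example, tokenize('Round( 1234.56, 2') yields five tokens and no RPAREN,
so the missing parenthesis only shows up once something checks the token
sequence against the grammar.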
 
Tim,

Taken individually, each function's form could be validated by a
regular expression. For 15 functions, you would need to write 15
regexes.
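
A minimal sketch of the one-regex-per-function idea, in Python for brevity.
It assumes unsigned decimal numbers, double-quoted strings without escaped
quotes, and an integer for Round's Decimals parameter; none of that is
specified in the question, and FLAT_CALLS and is_valid_flat_call are made-up
names. It only handles flat calls, so a nested call is rejected even when it
is well formed.

```python
import re

# Assumed literal forms (not specified in the question): unsigned decimal
# numbers and double-quoted strings without escaped quotes.
NUM = r"\d+(?:\.\d+)?"
STR = r'"[^"]*"'

# One pattern per function, covering flat (non-nested) calls only.
# Assumption: Round's second parameter is a plain integer.
FLAT_CALLS = {
    "Round": re.compile(rf"\s*Round\s*\(\s*{NUM}\s*,\s*\d+\s*\)\s*"),
    "Lower": re.compile(rf"\s*Lower\s*\(\s*{STR}\s*\)\s*"),
    "Upper": re.compile(rf"\s*Upper\s*\(\s*{STR}\s*\)\s*"),
    "Abs":   re.compile(rf"\s*Abs\s*\(\s*{NUM}\s*\)\s*"),
}


def is_valid_flat_call(text):
    """True if text is exactly one well-formed, non-nested call to a known function."""
    return any(pattern.fullmatch(text) for pattern in FLAT_CALLS.values())
```

With these patterns, Upper( "C# is the best thing since sliced bread") is
accepted, while Abs("VB is great") and Round( 1234.56, 2 are rejected.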

Taken together, however, the complexity of matching arbitrarily
nested function calls will quickly turn any regex-based solution into
an unmaintainable mess. This is assuming it's even possible to do
with regexes. Assuming the following would be valid input in your
system, I have no idea how to write a generic regex to validate
it:

Upper(Lower(Upper(Lower("())()(((()()())"))))

I would suggest investigating lexers and parsers. They're not that
hard to write, and can handle the above input with ease (and much
more complex input as well). For a gentle introduction to writing a
parser from scratch, here's a good site:

"Let's Build a Compiler" by Jack Crenshaw:
http://compilers.iecc.com/crenshaw/

It's written in Pascal, but it shouldn't be too hard to port to C#.
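
To give a feel for how small such a parser can be, here is a rough
recursive-descent validator sketch, written in Python rather than Pascal or
C# purely for brevity. Everything in it is an assumption for illustration:
the SIGNATURES table, the Validator class, the is_valid helper, exact-case
function names, unsigned decimal numbers, and double-quoted strings without
escaped quotes. It only checks parameter types and parentheses; it evaluates
nothing.

```python
# Hypothetical signature table: name -> (parameter types, return type).
# Assumption: Round takes two numeric arguments; names are case-sensitive.
SIGNATURES = {
    "Round": (("num", "num"), "num"),
    "Lower": (("str",), "str"),
    "Upper": (("str",), "str"),
    "Abs":   (("num",), "num"),
}


class Validator:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def peek(self):
        """Current character, or '' at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

    def skip_spaces(self):
        while self.peek().isspace():
            self.pos += 1

    def expect(self, char):
        self.skip_spaces()
        if self.peek() != char:
            raise ValueError(f"expected {char!r} at position {self.pos}")
        self.pos += 1

    def expression(self):
        """Parse one literal or function call; return its type ('num' or 'str')."""
        self.skip_spaces()
        ch = self.peek()
        if ch == '"':
            return self.string_literal()
        if ch.isdigit():
            return self.number_literal()
        if ch.isalpha():
            return self.call()
        raise ValueError(f"unexpected input at position {self.pos}")

    def string_literal(self):
        # Everything up to the next quote is string content, parentheses included.
        end = self.text.find('"', self.pos + 1)
        if end == -1:
            raise ValueError("unterminated string literal")
        self.pos = end + 1
        return "str"

    def number_literal(self):
        while self.peek().isdigit():
            self.pos += 1
        if self.peek() == ".":
            self.pos += 1
            if not self.peek().isdigit():
                raise ValueError(f"expected digit at position {self.pos}")
            while self.peek().isdigit():
                self.pos += 1
        return "num"

    def call(self):
        start = self.pos
        while self.peek().isalnum():
            self.pos += 1
        name = self.text[start:self.pos]
        if name not in SIGNATURES:
            raise ValueError(f"unknown function {name!r}")
        param_types, return_type = SIGNATURES[name]
        self.expect("(")
        for index, expected in enumerate(param_types):
            if index > 0:
                self.expect(",")
            actual = self.expression()  # recursion is what handles nesting
            if actual != expected:
                raise ValueError(f"{name}: argument {index + 1} should be {expected}")
        self.expect(")")
        return return_type


def is_valid(text):
    """True if the whole input is one well-typed expression with matched parentheses."""
    parser = Validator(text)
    try:
        parser.expression()
        parser.skip_spaces()
        return parser.pos == len(text)
    except ValueError:
        return False
```

Under those assumptions, is_valid accepts the nested Upper(Lower(...)) input
above, because the parentheses inside the string literal are consumed as
string content, and it rejects Abs("VB is great") and Round( 1234.56, 2.
Supporting the other functions would mostly mean adding rows to the table.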

Chris.
 