George Neuner
Hi all,
Does anyone have any hard data on the size / complexity limits of the
.NET regex implementation - either as interpreted or as compiled to an
assembly?
I have an application that scans structured documents for
user-specified words and phrases (including looking for common misspellings
and/or mispunctuations of words in the user's pattern) which I am
planning to port to .NET. The current app generates a separate
pattern to be applied to each document section and then searches over
a database of documents picking out those that match.
Currently pattern complexity is limited only by available memory as
the regex implementation uses interpreted DFAs. I have seen some
users construct tremendously complicated patterns which have many
hundreds of states in the resulting DFA. Although this is far from
typical usage, the current implementation allows it and I would like
to preserve as much of the current functionality as possible in any
new version.
Preferably I would like to compile each generated regex to an assembly
and dynamically load it for use (I know this requires auxiliary
AppDomains to unload). Basically I'd like to get a feel for how
complex a single regex may be before I go doing a lot of coding using
them and then trying to figure out why it doesn't work.
Thanks,
George