Regular Expression question

Natalia DeBow

Hi,

I am stuck trying to come up with a regular expression for the following
pattern:
A string that contains "/*" but does not contain "*/" after it.
Basically, I am searching for C-style multiline comments and would much
rather use Regex than string methods.

Here is what I have, but it does not seem to work.
Match match = Regex.Match(textLine, @"\s*(/[*])(?<comment>(?![*]/).*)$");

Something is wrong with the part that should make the match succeed only if
the string "*/" is not found.

Any input would be greatly appreciated!

Thanks,

Natalia
 
Guest

Maybe not what you want, but it may be better to test that it IS a match on the regex "/\*[.\s\S]*"
but DOESN'T match "/\*[.\s\S]*\*/"
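
A minimal sketch of that two-test idea, assuming the second pattern is meant to require a closing "*/" somewhere after the opener. The original attempt fails because (?![*]/) only inspects the two characters immediately after "/*"; moving ".*" inside the lookahead, as in the second function, checks the rest of the line instead. The function names are made up for illustration, and the snippet assumes a recent .NET SDK that accepts top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Two separate tests: the line contains "/*", and no "*/" follows an opener.
static bool OpensUnclosedTwoStep(string textLine)
{
    bool opens = Regex.IsMatch(textLine, @"/\*");
    bool closes = Regex.IsMatch(textLine, @"/\*[\s\S]*\*/");
    return opens && !closes;
}

// Single pattern: some "/*" has no "*/" anywhere after it on the line.
static bool OpensUnclosedLookahead(string textLine)
{
    return Regex.IsMatch(textLine, @"/\*(?!.*\*/)");
}

Console.WriteLine(OpensUnclosedTwoStep("int a; /* start"));
Console.WriteLine(OpensUnclosedLookahead("/* done */"));
```

The two versions disagree on a line such as "/* a */ /* b": the two-step test sees a closed comment and returns false, while the lookahead version finds the second, unclosed opener. Neither knows about "/*" inside string literals.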
 
Natalia DeBow

Hi sevenfifteen,

Thanks so much for replying to my message. Your suggestion did indeed
work. Thanks!

I am stuck here on another problem, trying to come up with a regular
expression for the following case:
a. if a substring "/*" is detected, no "*/" follows it;
b. if a substring "/*" is detected, neither "//" nor "/" precedes
it.

Part a is working fine, but when I add part b to it, things become
broken once again.

I have tried a few regular expressions, but nothing seems to work.

Here is what I have tried so far:
@"\s*(?!//?.*)(/[*])(?!.*[*]/)(?<comment>.*)"
@"\s*(?!.*//?)(/[*])(?!.*[*]/)(?<comment>.*)"
@"\s*(?<!//?.*)(?<=/[*])(?!.*[*]/)(?<comment>.*)"

Not sure where my logic is wrong.

Any help would be greatly appreciated!

Thanks,
Natalia
 
Justin Rogers

Trying to match comments in /* */ pairs appears to be what you are doing.
Based on the semantics of something like C#, it requires more logic than what
you are using.

"(?ms)^(?!.*//.*).*/\\*(?<comments>.*)\\*/"

The above is a C# escaped string that will match comment structures
not preceded by line comments "//".
 
Justin Rogers

"(?ms)^.*?(?<!//[^\\n]*)/\\*(?<comments>.*)\\*/"

The above is a little better from what I can tell. You may
also need to make the capture lazy, (?<comments>.*?), so that
you don't eat past the end of a comment.
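
A rough sketch of how that pattern might be exercised, using the lazy form of the comments group and a verbatim string in place of the escaped one. The helper name and the sample text are invented for illustration, and the snippet assumes a recent .NET SDK that accepts top-level statements.

```csharp
using System;
using System.Text.RegularExpressions;

// Returns true and the comment body for the first /* */ pair whose opener
// is not preceded by "//" on the same line.
static bool TryFirstBlockComment(string source, out string comments)
{
    Match m = Regex.Match(
        source,
        @"(?ms)^.*?(?<!//[^\n]*)/\*(?<comments>.*?)\*/");
    comments = m.Groups["comments"].Value; // empty string when nothing matched
    return m.Success;
}

string sample = "// not /* a */ here\nint y; /* real */";
if (TryFirstBlockComment(sample, out string body))
{
    Console.WriteLine(body);
}
```

The variable-length lookbehind (?<!//[^\n]*) is accepted by the .NET engine; many other regex flavours reject it, so the pattern is not portable as written.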


--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

 
Justin Rogers

Eck, this one matches single-line and multi-line comments.

Regex regex = new Regex(
    "(?ms)" +
    "^.*?((?<lineComment>//)|/\\*)" +
    "(?<comments>.*?)" +
    "(?(lineComment)$|\\*/)");

You can tell which you are matching by checking the lineComment
group for a value.
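
For illustration, checking the lineComment group could look roughly like this. The pattern is the one above written as verbatim strings; the helper name, the "line:"/"block:" labels, and the sample text are assumptions made for the sketch, which also assumes a recent .NET SDK that accepts top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Labels each match as "line" or "block" depending on whether the
// lineComment group took part in the match.
static List<string> DescribeComments(string source)
{
    var regex = new Regex(
        @"(?ms)" +
        @"^.*?((?<lineComment>//)|/\*)" +
        @"(?<comments>.*?)" +
        @"(?(lineComment)$|\*/)");

    var found = new List<string>();
    foreach (Match m in regex.Matches(source))
    {
        string kind = m.Groups["lineComment"].Success ? "line" : "block";
        found.Add(kind + ":" + m.Groups["comments"].Value);
    }
    return found;
}

string sample = "int x = 1; // set x\n/* block\n comment */\n";
foreach (string entry in DescribeComments(sample))
{
    Console.WriteLine(entry);
}
```

With Windows line endings the line-comment text would keep a trailing "\r", because $ stops before "\n" only; trimming the captured value is one way around that.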



 
Natalia DeBow

Hi there,

Thanks so much, Justin, for your very helpful comments. They are
greatly appreciated!

I have another question for .NET RegEx experts.

I am reading in a C# file line by line and I am trying to detect
comments that start with either // or ///. What I am particularly
interested in is the comment text itself, because I want some statistics
on the amount of comments in the file (comment bytes).

So, I tried several regular expressions, but they don't seem to work in
all the cases.

Here are the cases that I need to cover:

a. /// comments or // comments
b. /// <xml-tag> comments </xml-tag>
c. /// <xml-tag> comments <another xml-tag> comments </another xml-tag>
comments </xml-tag>
d. /// <xml-tag>
e. /// </xml-tag>

I need to be able to capture the comments and not the xml tags.

Here are a few regular expressions that I have tried, without
success.

@"^.*?///?\s*((</?.+>)*(?<comments>.*))*$"
@"///?\s*(</?.+>)*(?<comments>.*)"

I am having difficulty capturing multiple comments if they are separated
by xml tags. For some odd reason, if I have more than one set of tags,
the returned result is always the rightmost set of comments.

Thanks so much for any input!
Natalia
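
One likely reason for the "rightmost only" result: when a group sits inside a repeated construct, Group.Value in .NET returns only the last capture, while the earlier ones stay available through Group.Captures. A possible sketch along those lines follows; the pattern, the helper name, and the trimming of whitespace are all assumptions rather than a tested answer to every case listed, and the snippet assumes a recent .NET SDK that accepts top-level statements.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// After the // or /// marker, alternate between xml tags (skipped) and
// runs of non-tag text (captured), then read every capture of the group.
static List<string> CommentText(string line)
{
    var parts = new List<string>();
    Match m = Regex.Match(line, @"///?(?:<[^>]*>|(?<comments>[^<]+))*");
    if (!m.Success)
    {
        return parts;
    }
    foreach (Capture c in m.Groups["comments"].Captures)
    {
        string text = c.Value.Trim();
        if (text.Length > 0)
        {
            parts.Add(text); // keep only non-blank pieces between tags
        }
    }
    return parts;
}

foreach (string part in CommentText("/// <summary> Gets the <c>name</c> value. </summary>"))
{
    Console.WriteLine(part);
}
```

Lines that hold only a tag, such as cases d and e, yield an empty list. A literal "<" inside the comment text (for example "a < b") ends the capture early, so that case would need extra handling.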



