Regular expression to detect comments

  • Thread starter: Bob Altman

Bob Altman

Hi all,

I need some help from a RegEx guru: Can some kind soul come up with a regular expression that detects comments in VB source lines? I think it should be something that detects an apostrophe that is not enclosed within quotes.

Thanks!
 
Well, gee, I guess I shouldn't write code into a message without checking it for accuracy (duh).
Below is the corrected code, illustrating my test cases and, finally, the output. It's still not a regex, but at least it works, and it may help somebody work out the regex logic.

Chris Langsenkamp

--------------------------------------------------------------------------------

Module Module1
    Sub Main()
        ' normal comment line
        Console.WriteLine(" " & FindComment("'This is a test comment"))
        ' normal comment line with a double quote in it
        Console.WriteLine(" " & FindComment("'This is a " & Chr(34) & "test comment"))
        ' code line ending in a comment
        Console.WriteLine(" " & FindComment("x=1 'This is a test comment"))
        ' code line ending in a comment containing an apostrophe
        Console.WriteLine(" " & FindComment("x=1 'This is a test 'comment"))
        ' code line ending in a comment containing a double quote
        Console.WriteLine(" " & FindComment("x=1 'This is a test " & Chr(34) & "comment"))
        ' code line ending in a comment containing two double quotes
        Console.WriteLine(" " & FindComment("x=1 'This is a test " & Chr(34) & "comment" & Chr(34)))
        ' code line containing two double quotes followed by comment containing an apostrophe
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "1" & Chr(34) & " 'This is a test ' comment"))
        ' code line containing two double quotes containing an apostrophe followed by comment
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test comment"))
        ' code line containing two double quotes containing an apostrophe followed by comment containing an apostrophe
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test ' comment"))
        ' code line containing two double quotes containing an apostrophe followed by comment containing a double quote
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test " & Chr(34) & "comment"))
        ' code line containing two double quotes containing an apostrophe followed by comment containing two double quotes
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test " & Chr(34) & "comment" & Chr(34)))
        ' code line containing two double quotes containing an apostrophe followed by comment containing two double quotes containing an apostrophe
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test " & Chr(34) & "comm'nt" & Chr(34)))
        ' code line containing two double quotes containing an apostrophe followed by comment containing two double quotes containing an apostrophe followed by an apostrophe
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test " & Chr(34) & "comm'nt" & Chr(34) & "'"))
        ' code line containing two double quotes containing an apostrophe followed by comment containing an apostrophe and containing two double quotes containing an apostrophe followed by an apostrophe
        Console.WriteLine(" " & FindComment("x=" & Chr(34) & "they're" & Chr(34) & " 'This is a test '" & Chr(34) & "comm'nt" & Chr(34) & "'"))
    End Sub

    Function FindComment(ByVal ThisLine As String) As String
        Dim LineArray() As String
        Dim ThisComment As String = ""
        ' Integer rather than Byte, so lines longer than 255 characters do not overflow
        Dim a As Integer = InStrRev(ThisLine, "'") ' last apostrophe

        Console.WriteLine(ThisLine)

        ' Walk backwards through the apostrophes; the last assignment wins,
        ' so the leftmost qualifying apostrophe is the one reported.
        Do While a > 0
            If InStr(Mid(ThisLine, a), Chr(34)) = 0 Then
                ThisComment = a.ToString & ": " & Mid(ThisLine, a) 'found comment at end of line and has no double quotes in it
            Else
                ' might be a comment with one or more double quotes in it - let's find out
                If InStr(Mid(ThisLine, 1, a), Chr(34)) = 0 Then
                    ThisComment = a.ToString & ": " & Mid(ThisLine, a) 'found comment at end of line and no matching double quote before apostrophe
                Else
                    ' there's at least one double quote preceding...let's see if they are all matched pairs
                    LineArray = Split(Mid(ThisLine, 1, a), Chr(34))
                    If LineArray.Length Mod 2 <> 0 Then
                        ThisComment = a.ToString & ": " & Mid(ThisLine, a) 'found comment at end of line
                        ' all preceding double quotes are matched pairs because the split returns an odd number of elements
                    End If
                End If
            End If
            If a > 1 Then a = InStrRev(ThisLine, "'", a - 1) Else Exit Do
        Loop

        If ThisComment <> "" Then
            Return ThisComment
        Else
            Return "No Comment Found"
        End If

    End Function
End Module

--------------------------------------------------------------------------------

Console Output:
'This is a test comment
1: 'This is a test comment
'This is a "test comment
1: 'This is a "test comment
x=1 'This is a test comment
5: 'This is a test comment
x=1 'This is a test 'comment
5: 'This is a test 'comment
x=1 'This is a test "comment
5: 'This is a test "comment
x=1 'This is a test "comment"
5: 'This is a test "comment"
x="1" 'This is a test ' comment
7: 'This is a test ' comment
x="they're" 'This is a test comment
13: 'This is a test comment
x="they're" 'This is a test ' comment
13: 'This is a test ' comment
x="they're" 'This is a test "comment
13: 'This is a test "comment
x="they're" 'This is a test "comment"
13: 'This is a test "comment"
x="they're" 'This is a test "comm'nt"
13: 'This is a test "comm'nt"
x="they're" 'This is a test "comm'nt"'
13: 'This is a test "comm'nt"'
x="they're" 'This is a test '"comm'nt"'
13: 'This is a test '"comm'nt"'

--------------------------------------------------------------------------------
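The parity test used above (an apostrophe starts a comment only when an even number of double quotes precedes it) can also be done in a single left-to-right pass. What follows is only a minimal sketch under that assumption: `FindCommentStart` and `CommentScan` are hypothetical names that are not part of the code above, the function returns a 0-based index (or -1) rather than the 1-based position printed in the output, and it does not handle `Rem` comments or a Nothing argument.

```vb
Module CommentScan
    ' Hypothetical helper: returns the 0-based index of the apostrophe that
    ' starts a comment, or -1 if the line contains no comment.
    Function FindCommentStart(ByVal sourceLine As String) As Integer
        Dim inString As Boolean = False
        For i As Integer = 0 To sourceLine.Length - 1
            Dim ch As Char = sourceLine(i)
            If ch = """"c Then
                ' Each double quote toggles the state. A doubled quote ("")
                ' inside a literal toggles twice and leaves it unchanged.
                inString = Not inString
            ElseIf ch = "'"c AndAlso Not inString Then
                ' First apostrophe outside a string literal starts the comment
                Return i
            End If
        Next
        Return -1
    End Function

    Sub Main()
        Console.WriteLine(FindCommentStart("x=""they're"" 'This is a test")) ' prints 12
        Console.WriteLine(FindCommentStart("x=""they're"""))                  ' prints -1
    End Sub
End Module
```

Because the state is carried along as the line is scanned, each character is examined once, and the first apostrophe found outside a string literal is by construction the start of the comment.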
 
Hello Bob,

Thanks very much for your post.

I agree with Chris that it is hard to come up with a regular expression that verifies whether a line contains a VB comment. A function should be a much more convenient way to achieve it.

Best regards,
Yanhong Huang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.

--------------------
!From: "Bob Altman" <[email protected]>
!Subject: Regular expression to detect comments
!Date: Tue, 19 Aug 2003 15:50:54 -0700
!Newsgroups: microsoft.public.dotnet.framework
 
Hello Chris,

Thanks very much for sharing it in the community.

Best regards,
Yanhong Huang
Microsoft Online Partner Support


--------------------
!From: "Chris Langsenkamp" <[email protected]>
!Subject: Re: Regular expression to detect comments REVISED
!Date: Tue, 19 Aug 2003 20:58:18 -0500
!Newsgroups: microsoft.public.dotnet.framework
 
Hello Bob,

You could also refer to http://www.regexplib.com/REDetails.aspx?regexp_id=324.

Hope that helps.

Best regards,
Yanhong Huang
Microsoft Online Partner Support


--------------------
!From: (e-mail address removed) (Yan-Hong Huang[MSFT])
!Date: Wed, 20 Aug 2003 06:29:19 GMT
!Subject: RE: Regular expression to detect comments
!Newsgroups: microsoft.public.dotnet.framework
 
Well, it turns out that the regular expression at www.regexlib.com doesn't work correctly. It incorrectly detects apostrophes enclosed within quote marks as comments.

For what it's worth, here's the VB code that I wrote to detect comments. (I posted the original message in this thread looking for a more "elegant" way to detect comments using a regular expression.)

' Is it a comment? Loop, looking for an apostrophe preceded by an even number
' of quote marks. In this code, variable t contains the text to examine.
Dim q As Int32 = t.IndexOf("'"c)
Do While q <> -1
    ' Count the quote marks preceding q
    Dim count As Int32 = 0
    Dim i As Int32 = t.IndexOf(""""c)
    Do While i <> -1 AndAlso i < q
        count += 1
        i = t.IndexOf(""""c, i + 1) ' Look for the next quote
    Loop

    ' Bail out of the loop if we found an even number of quote marks
    If count Mod 2 = 0 Then Exit Do

    ' Find the next apostrophe
    q = t.IndexOf("'"c, q + 1)
Loop

' Did we find a comment?
If q <> -1 Then
    ' q is the index of the apostrophe
End If
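
For anyone who still wants the regular-expression route, the same rule can probably be written as a pattern that skips over complete string literals before it accepts an apostrophe. This is a sketch only, not the regexlib.com expression: the pattern and the names `CommentRegex`, `CommentPattern`, and `CommentIndex` are illustrative assumptions, it expects a single line with no line breaks, and it ignores `Rem` comments.

```vb
Imports System.Text.RegularExpressions

Module CommentRegex
    ' Assumed pattern: starting at the beginning of the line, consume either a
    ' character that is neither kind of quote or a complete "..." literal, then
    ' require an apostrophe. The rest of the line is captured as the comment.
    Private ReadOnly CommentPattern As New Regex("^(?:[^""']|""[^""]*"")*(?<comment>'.*)$")

    ' Returns the 0-based index of the comment apostrophe, or -1 if there is none.
    Function CommentIndex(ByVal sourceLine As String) As Integer
        Dim m As Match = CommentPattern.Match(sourceLine)
        If m.Success Then Return m.Groups("comment").Index
        Return -1
    End Function

    Sub Main()
        Console.WriteLine(CommentIndex("x=""they're"" 'This is a test ""comm'nt""")) ' prints 12
        Console.WriteLine(CommentIndex("x=""they're"""))                            ' prints -1
    End Sub
End Module
```

The two alternatives inside the group start with different characters, so the prefix can only stop at an apostrophe that sits outside a string literal. An escaped quote (`""`) inside a literal is read as two adjacent literals, which gives the same answer for this purpose.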
 