Regex Question

Stevo

What is the easiest way to use a regex to match text that is outside the bounds of an HTML tag?

Example string:
test1 <font color="red"> test2 </font> test3

I only want test1 and test3 captured. So what expression can be used to achieve this? Thanks.
 
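You can use the pattern "<(?:.|\n)*?>" with Regex.Split(x) and then read the
pieces from the returned array. The group is non-capturing because
Regex.Split also puts any captured text into the array. Note that this
splits on every tag, so the array will hold test2 as well as test1 and test3.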
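
As a rough sketch of that suggestion (the module name SplitDemo and the bracketed output format are only for illustration), the split could look like this. It uses a non-capturing group so that Regex.Split returns only the text between tags:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitDemo
    Sub Main()
        Dim x As String = "test1 <font color=""red""> test2 </font> test3"

        ' Split on every tag; (?:...) groups without capturing, so the
        ' tag text itself is not added to the result array.
        Dim parts() As String = Regex.Split(x, "<(?:.|\n)*?>")

        ' Brackets make the leading/trailing spaces visible.
        For Each part As String In parts
            Console.WriteLine("[" & part & "]")
        Next
        ' Prints:
        ' [test1 ]
        ' [ test2 ]
        ' [ test3]
    End Sub
End Module
```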

Brandon

I need to modify the string data as a whole, not just retrieve the fields.
Using split makes it easy to get all the data outside the bounds of the
tags, but then how would I modify that data and place it back into the
initial string?
 
In that case, you can try splitting the matches and non-matches into
separate arrays/collections, changing the non-matched strings that you need,
and then reassembling everything at the end. Here's a roundabout way of
doing that:

Dim strHtml As String = "test1 <font color=""red""> test2 </font> test3"
' Lazy quantifiers keep each opening/closing tag pair as its own match;
' nested tags are not handled.
Dim regex As New System.Text.RegularExpressions.Regex("<.*?>.*?<.*?>")
Dim mchMatches As System.Text.RegularExpressions.MatchCollection
Dim strNonMatches() As String
Dim mchTemp As System.Text.RegularExpressions.Match
Dim strTemp As String
Dim strFinalString As String = ""
Dim i As Integer = 0

mchMatches = regex.Matches(strHtml)
strNonMatches = regex.Split(strHtml)

For Each strTemp In strNonMatches
    MessageBox.Show("Split: " & strTemp)
Next

' Split returns one more piece than there are matches, so interleave them:
' piece 0, match 0, piece 1, match 1, ...
For i = 0 To mchMatches.Count - 1
    mchTemp = mchMatches(i)

    strFinalString += strNonMatches(i) & " [changed]"
    strFinalString += mchTemp.ToString()
    MessageBox.Show("Match: " & mchTemp.ToString())
Next

' Append the piece that follows the last match.
strFinalString += strNonMatches(strNonMatches.Length - 1) & " [changed]"

MessageBox.Show(strFinalString)
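
A shorter alternative to the split-and-rebuild approach, offered only as a sketch: Regex.Replace accepts a MatchEvaluator, so the outside text can be changed in place in a single pass. The names OutsideTagsDemo, ChangeOutsideTags and Evaluate are made up for this example, upper-casing stands in for whatever change you actually need, and like the code above it assumes the tags are not nested.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module OutsideTagsDemo
    ' Hypothetical helper: changes text outside of elements and leaves
    ' each <tag>...</tag> span untouched.
    Function ChangeOutsideTags(ByVal html As String) As String
        ' Group 1: an opening tag, its content, and the next tag.
        ' Group 2: a run of text that contains no "<".
        Dim pattern As String = "(<.*?>.*?<.*?>)|([^<]+)"
        Return Regex.Replace(html, pattern, AddressOf Evaluate, RegexOptions.Singleline)
    End Function

    ' Called once per match; the return value replaces the matched text.
    Private Function Evaluate(ByVal m As Match) As String
        If m.Groups(1).Success Then
            Return m.Value            ' tagged span: keep as-is
        End If
        Return m.Value.ToUpper()      ' outside text: apply the change
    End Function

    Sub Main()
        Console.WriteLine(ChangeOutsideTags("test1 <font color=""red""> test2 </font> test3"))
        ' Prints: TEST1 <font color="red"> test2 </font> TEST3
    End Sub
End Module
```

Because the evaluator returns tagged spans unchanged, no separate reconstruction step is needed.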

HTH,

Brandon
 
That helps a bunch. Thanks!

