recommendation for parsing string

  • Thread starter: awrightus


Not looking for exact syntax, just a recommendation. Using VB Express
2008. I have a string that's X characters in length (perhaps
2000-4000 characters or so, essentially sentences). I need to insert
a carriage return (vbCrLf) at no more than the 69th character. If the
69th character is a space, then that's the simplest scenario. But if
the 69th character is in the middle of a word, I need to search back
until I find a space and then insert the vbCrLf. I know how to use
substring() to search through the string for spaces, but I'm getting a
little lost on an approach to subsequently inject that carriage return
at every subsequent 69th character (or backing up when it's in the
middle of a word). Just wondering if someone could give me a
suggestion as to the functions I need to be looking at. Thanks for
any insight.

Do you want to split the whole string to pieces or just the first 69
characters and ignore the rest?



No, I need to break up the entire string.


You could use a regular expression to split the string into lines, then
use a StringBuilder to put them together with line breaks between them:

MatchCollection lines = Regex.Matches(text, @"(.{1,69})(?: |$)");
StringBuilder builder = new StringBuilder();
foreach (Match line in lines) builder.AppendLine(line.Value);

That's a very cool solution. Here is the same code in VB:

Dim lines As MatchCollection = Regex.Matches(TextToSplit, _
    "(.{1,69})(?: |$)")
Dim builder As New StringBuilder()
For Each k As Match In lines
    builder.AppendLine(k.Value)
Next

I wrote some code but it was over 20 lines so I won't post it here. :-)


That regular expression couldn't handle situations when there were no spaces
so maybe I can post my code.

Dim TextToSplit As String =
TextToSplit = TextToSplit.Replace(Chr(10), "")
TextToSplit = TextToSplit.Replace(Chr(13), " ")
Dim Previous As Integer = 0
Dim x As String = ""

For i As Integer = 69 To TextToSplit.Length - 1 Step 69

    ' Character is space
    If TextToSplit(i) = " " Then
        x = TextToSplit.Substring(Previous, i - Previous)
        ListBox1.Items.Add(x.Length.ToString + " " + x)
        Previous = i + 1
    Else
        ' Character is inside a word: search backwards for a space
        Dim OriginalI As Integer = i
        While True

            i = i - 1

            ' No space found
            If i = Previous Then
                i = OriginalI
                x = TextToSplit.Substring(Previous, i - Previous)
                ListBox1.Items.Add(x.Length.ToString + " " + x)
                Previous = i
                Exit While
            End If

            ' Found space
            If TextToSplit(i) = " " Then
                x = TextToSplit.Substring(Previous, i - Previous)
                ListBox1.Items.Add(x.Length.ToString + " " + x)
                Previous = i + 1
                Exit While
            End If
        End While
    End If
Next

' Remaining text after the last break
x = TextToSplit.Substring(Previous)
ListBox1.Items.Add(x.Length.ToString + " " + x)

I threw this together. Doesn't parse sentences exactly, but separates words...

Public Class Form1

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim Words As String = "Now is the time for all good men to come to the aid of their party. "
        Words += Words
        Words += Words
        Dim Word() As String = Split(Words.Trim(), " ")
        Dim Sentence As String = ""   ' wrapped output
        Dim ctr As Integer = 0        ' length of the current line
        For Each w In Word
            ' Start a new line if this word would push the line past 69 characters
            If ctr > 0 AndAlso ctr + 1 + w.Length > 69 Then
                Sentence += vbCrLf
                ctr = 0
            End If
            ' Separate words on the same line with a single space
            If ctr > 0 Then
                Sentence += " "
                ctr += 1
            End If
            Sentence += w
            ctr += w.Length
        Next
        ' Show the result (assumption: any output control would do)
        MessageBox.Show(Sentence)
    End Sub
End Class

- David

You probably will need Substring() to get chunks of the string and
then look for the space with the LastIndexOf() function.

I know you said you don't need code, but I couldn't resist the fun.
Therefore, here follows a possible solution, which accepts the text to
be broken up, the max line size and a list of acceptable delimiters.
It returns a list of lines, which can then be turned into an array and
joined with String.Join() using ControlChars.CrLf as "glue".

In your case, it would be called like this:

Dim L As IList(Of String) = Wordwrap(Sample, 69, new Char(){" "c})
Dim R As String = String.Join(ControlChars.CrLf, L.ToArray)

Function Wordwrap( _
    Text As String, _
    Size As Integer, _
    Delimiters As IList(Of Char) _
) As IList(Of String)

    Dim Delims() As Char = Delimiters.ToArray
    Dim Result As New List(Of String)

    Dim Max As Integer = Text.Length - Size
    Dim Pos As Integer = 0
    Do While Pos < Max
        Dim Chunk As String = Text.Substring(Pos, Size)
        If Chunk.Length = Size Then
            Dim Break As Integer = Chunk.LastIndexOfAny(Delims) + 1
            If Break > 0 AndAlso Break < Size Then
                'Found one of the delimiters
                Chunk = Chunk.Remove(Break)
            End If
        End If
        Result.Add(Chunk)
        Pos += Chunk.Length
    Loop
    'Whatever is left fits in one line
    If Pos < Text.Length Then Result.Add(Text.Substring(Pos))
    Return Result
End Function

Hope this helps.


A string is immutable, so every insert into it creates a new string,
and that takes time.

I would simply take the substrings and build the long string with a
StringBuilder in a loop.

At the end, simply: Dim x = mystringbuilder.ToString

That would most probably give the best performance and use the least memory.
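
A minimal sketch of that StringBuilder approach, in case it helps. The module and function names (WrapSketch, WrapWithBuilder) are made up for illustration and are not from anyone's post. It assumes MaxWidth is greater than zero, and it hard-breaks a word that is longer than the limit:

```vb
Imports System.Text
Imports Microsoft.VisualBasic

Module WrapSketch
    ' Hypothetical helper (not from the thread): wraps Text so that no line
    ' is longer than MaxWidth characters, breaking at spaces where possible.
    Function WrapWithBuilder(ByVal Text As String, ByVal MaxWidth As Integer) As String
        Dim sb As New StringBuilder(Text.Length + 64)
        Dim pos As Integer = 0

        Do While Text.Length - pos > MaxWidth
            ' Search backwards from the character just past the limit
            ' down to the start of the current line.
            Dim breakAt As Integer = Text.LastIndexOf(" "c, pos + MaxWidth, MaxWidth + 1)

            If breakAt > pos Then
                ' Break at the space and skip over it.
                sb.Append(Text, pos, breakAt - pos)
                pos = breakAt + 1
            Else
                ' No usable space: hard break in the middle of the word.
                sb.Append(Text, pos, MaxWidth)
                pos += MaxWidth
            End If
            sb.Append(vbCrLf)
        Loop

        ' Whatever is left fits on the last line.
        sb.Append(Text, pos, Text.Length - pos)
        Return sb.ToString()
    End Function
End Module
```

For the original question it would be called as WrapWithBuilder(MyText, 69). The source string is only read, never modified, so the only new allocations are the builder and the final ToString.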

Teemu said:
That regular expression couldn't handle situations when there were no spaces

That is true.

This regular expression does:

"(.{1,69})(?: |$)|([^ ]{69})"
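
For anyone following along in VB, here is a rough sketch of how that pattern could be used. The names RegexWrapSketch and WrapWithRegex are hypothetical. It assumes the text contains no line breaks of its own (the "." does not match a newline by default), and it trims the space that the first alternative consumes at the end of each match:

```vb
Imports System.Text
Imports System.Text.RegularExpressions

Module RegexWrapSketch
    ' Hypothetical helper name; the pattern is the one quoted above.
    Function WrapWithRegex(ByVal Text As String) As String
        Dim builder As New StringBuilder()
        For Each m As Match In Regex.Matches(Text, "(.{1,69})(?: |$)|([^ ]{69})")
            ' m.Value may end with the space matched by (?: |$); drop it
            ' so that no line exceeds 69 characters.
            builder.AppendLine(m.Value.TrimEnd(" "c))
        Next
        Return builder.ToString()
    End Function
End Module
```

The second alternative only applies when a run of 69 non-space characters leaves no place to break, so ordinary text is still split at spaces.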