chop, separate, split a STRING into sections?

  • Thread starter: Terry Moreau

Terry Moreau

This seems like a rather common and simple task for .NET, but I can't seem
to find any "elegant" way to do it without using Mid or Substring and
a loop such as Do While or For...Next.

Here's a sample of what I'm looking for: break apart a string into
sections of 4 characters each...

DIM strMyInputString as String = "01234567ABCDEFGHIJ"
DIM strMyOutputString() as String = {some kind sectioning function or
method that operates on strMyInputString}

and here's the desired output string array:

(0) = 0123
(1) = 4567
(2) = ABCD
(3) = EFGH
(4) = IJ

String.Split won't do it because there's no delimiter.
Regex has a Split that, from what I can tell, does nothing more
than .NET's built-in Split.
Something in Regex that I'm missing, maybe?
Select? Well, maybe it could be used, but I can't find a good VB.NET
example of how to use it!

Such a simple task, and there's no 'single function' or elegant
'one-liner' that can chop a string into regular-sized sections?

What about converting the string to a character array and then using Group
or Aggregate?
Well, I can't find good examples of these either, so here I am posting
for ideas!
 
Hmmm... If you're using VS 2008, you could create an extension method :)

' First file: the calling code
Option Strict On
Option Explicit On

Module Module1

    Sub Main()
        Dim a() As String = "01234567ABCDEFGHIJ".Chunk(4)

        For index As Integer = 0 To a.Length - 1
            Console.WriteLine("({0}) = {1}", index, a(index))
        Next

    End Sub

End Module

' Second file: the extension method itself
Option Strict On
Option Explicit On

Imports System.Runtime.CompilerServices

Module StringExtensions
    <Extension()> _
    Public Function Chunk(ByVal str As String, ByVal chunkSize As Integer) As String()
        Dim los As New List(Of String)

        ' Mid uses 1-based positions
        Dim i As Integer = 1
        Dim temp As String

        While True
            ' Mid returns an empty string once i is past the end of str
            temp = Mid(str, i, chunkSize)
            If temp = String.Empty Then Exit While
            los.Add(temp)
            i += chunkSize
        End While

        Return los.ToArray()
    End Function

End Module

HTH
 
Thanks Tom !!!

I always wondered how to do compiler "Extensions"

Is that little baby you cooked up going to forevermore be within my
VisualStudio2008, or am I going to need to paste it into every new
project I make that needs that function?

Is there a way to make it part of ALL my new projects?

And of course if I want to modify, or delete it... how can that be
done?


;-) DANGER! when you show yourself as clever, you just get asked
more questions in here!
 
I'm not the OP but I want to say thanks too for making me aware of that
<Extension()> thing. I hadn't bumped into it before.

Thanks!

Bob
 
Extension methods are a new feature of VB9 and C# 3.0. They were added to
support link. As for it becoming part of VS - well, it's just a module. If
you want to reuse it, you can put it in a class library and then reference it
from any project you want to use it in.
 
LOL... Should be "support LINQ", not link :)

LINQ stands for Language INtegrated Query.
 
Terry said:
seems like a rather common and simple task for .NET, but I can't seem
to find any "elegant" way to do it without using MID or SUBSTRING and
code looping such as DO WHILE or FOR NEXT.
Tom said:
<Extension()> _
Public Function Chunk(ByVal str As String, ByVal chunkSize As Integer) As String()
Dim los As New List(Of String)

Dim i As Integer = 1
Dim temp As String

While True
temp = Mid(str, i, chunkSize)
If temp = String.Empty Then Exit While
los.Add(temp)
i += chunkSize
End While

Return los.ToArray()
End Function

That function still uses Mid and a While loop. Terry Moreau asked for
an "elegant" solution. Does turning the function into an extension method
make it an "elegant" solution?

I do admit it's an elegant example of how to use extension methods :)
 
Where is Bill McCarthy? Not here helping with real .NET issues? Maybe he needs
to stop trolling the classic VB groups and make himself useless over here.

 
Well, first off, I don't see the problem with Mid, Substring, or looping :)
But I would definitely want something like this encapsulated somehow. I
figure a custom extension method looks good, because it appears to be a
member of String even though it's not. And it will probably end up being
faster than using a LINQ-type query...
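
For comparison, here is roughly what a LINQ-type version could look like. This is only a sketch: the names LinqChunkSketch and ChunkWithLinq are made up for illustration, it assumes chunkSize is greater than zero, and it needs VB9 or later. It groups character positions by (position \ chunkSize) and rebuilds one string per group, so it does noticeably more work than the Mid loop.

```vbnet
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module LinqChunkSketch
    ' Hypothetical helper (not from the thread): assumes chunkSize > 0.
    Public Function ChunkWithLinq(ByVal s As String, ByVal chunkSize As Integer) As String()
        ' Positions 0..Length-1, grouped so that 0-3 share key 0, 4-7 share key 1, and so on.
        Dim groups As IEnumerable(Of IGrouping(Of Integer, Integer)) = _
            Enumerable.Range(0, s.Length).GroupBy(Function(pos) pos \ chunkSize)

        ' Turn each group of positions back into a string of the characters at those positions.
        Return groups.Select(Function(g) New String(g.Select(Function(pos) s(pos)).ToArray())).ToArray()
    End Function

    Sub Main()
        ' Prints 0123,4567,ABCD,EFGH,IJ
        Console.WriteLine(String.Join(",", ChunkWithLinq("01234567ABCDEFGHIJ", 4)))
    End Sub
End Module
```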
 
Hi. Try this

'--------------------------
Imports System.Text.RegularExpressions

Module Module1
    Sub Main()
        Dim cSourceText As String = "HELLO WORLD EXAMPLE."
        Dim cExp As String = ".{4}"

        Dim oMatches As MatchCollection

        oMatches = Regex.Matches(cSourceText, cExp)
        For Each oMatch As Match In oMatches
            MsgBox(oMatch.Value)
        Next
    End Sub
End Module

'--------------------------

Just change the 4 in {4} to the number of characters you want in each group.

hope this helps

Bo
 
Not a "one-liner" but a move towards elegance.
No doubt it can be improved upon

Public Function Chop(ByVal sIn As String, ByVal iCars As Integer) As List(Of String)
    Dim los As New List(Of String)
    If sIn.Length <= iCars Then
        los.Insert(0, sIn)
    Else
        los = Chop(sIn.Remove(0, iCars), iCars)
        los.Insert(0, sIn.Substring(0, iCars))
    End If
    Return los
End Function
 
Just to throw my version in there. Basically the same as Tom's, only
using a For loop and Substring instead of the While loop and Mid.

/////////////////
<System.Runtime.CompilerServices.Extension()> _
Public Function Chunk(ByVal s As String, ByVal chunkSize As Integer) As String()
    Dim strings = New List(Of String)()

    For i = 0 To s.Length - 1 Step chunkSize
        strings.Add(s.Substring(i, Math.Min(chunkSize, s.Length - i)))
    Next

    Return strings.ToArray()
End Function
/////////////////

At first I was thinking that LINQ's Skip(...) and Take(...) could be
used here, but those return an IEnumerable(Of Char) instead of a
string, so I decided Substring would be best. Also note the
Math.Min(...) part: this prevents the Substring's length parameter from
being larger than the number of characters remaining in the string.
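
For anyone curious, the Skip/Take idea can still be made to work by turning the IEnumerable(Of Char) back into a String. The sketch below is only an illustration: SkipTakeSketch and ChunkWithSkipTake are made-up names, and it assumes chunkSize is greater than zero. Take simply stops at the end of the string, so no Math.Min is needed, but every chunk walks the string from the start again, which is why Substring is the better choice.

```vbnet
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Linq

Module SkipTakeSketch
    ' Hypothetical helper (not from the thread): assumes chunkSize > 0.
    Public Function ChunkWithSkipTake(ByVal s As String, ByVal chunkSize As Integer) As String()
        Dim parts As New List(Of String)()

        For i As Integer = 0 To s.Length - 1 Step chunkSize
            ' Skip/Take give back IEnumerable(Of Char); ToArray plus New String(...) rebuilds a String.
            parts.Add(New String(s.Skip(i).Take(chunkSize).ToArray()))
        Next

        Return parts.ToArray()
    End Function

    Sub Main()
        ' Prints 0123,4567,ABCD,EFGH,IJ
        Console.WriteLine(String.Join(",", ChunkWithSkipTake("01234567ABCDEFGHIJ", 4)))
    End Sub
End Module
```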

Thanks,

Seth Rowe [MVP]
http://sethrowe.blogspot.com/
 
| I always wondered how to do compiler "Extensions"

They are rather simple, but the way of implementing them differs between VB
and C#. Just for those bilingual readers, the C# version is:

//////////////////
public static string[] Chunk(this string s, int chunkSize)
{
    // body omitted here; same logic as the VB version
}
//////////////////

Notice the "Extension" attribute is not there; C# uses the "this"
keyword instead. The other difference is that in C# you put the extension
methods in a static class, as that's their version of a module.

| Is that little baby you cooked up going to forevermore be within my
| VisualStudio2008, or am I going to need to paste it into every new
| project I make that needs that function?

Extension methods are nothing more than a compiler trick, basically a
shortened syntax for calling into a member of a module. With that
said, this extension method will only appear where the module method
would appear. So you would have to add a reference to the DLL (or copy
in the code file) and add the necessary Imports statement to make it
visible to what you are working on.

Also note that you can use the same methods as-is in VB 2.0 by simply
referencing them as a member of a module
(StringExtensions.Chunk("somestring", 4)). Like I mentioned previously, the
whole string.Chunk(...) syntax is just a trick.
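
To make the "trick" concrete, here is a small sketch that calls the same function both ways. The StringExtensions module repeats the Chunk function from earlier in the thread; the ExtensionCallDemo module and the variable names are just made up for this illustration.

```vbnet
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Runtime.CompilerServices

Module StringExtensions
    <Extension()> _
    Public Function Chunk(ByVal s As String, ByVal chunkSize As Integer) As String()
        Dim strings As New List(Of String)()

        For i As Integer = 0 To s.Length - 1 Step chunkSize
            strings.Add(s.Substring(i, Math.Min(chunkSize, s.Length - i)))
        Next

        Return strings.ToArray()
    End Function
End Module

Module ExtensionCallDemo
    Sub Main()
        ' Extension syntax: looks like a member of String.
        Dim viaExtension() As String = "01234567ABCDEFGHIJ".Chunk(4)

        ' Plain module call: the same function, called the ordinary way.
        Dim viaModule() As String = StringExtensions.Chunk("01234567ABCDEFGHIJ", 4)

        ' Both lines print 0123,4567,ABCD,EFGH,IJ
        Console.WriteLine(String.Join(",", viaExtension))
        Console.WriteLine(String.Join(",", viaModule))
    End Sub
End Module
```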

| Is there a way to make it part of ALL my new projects?

I suppose you could modify the MSBuild templates for new projects to
automatically import it, but I don't know of a simpler way to include
it in all new projects.

| And of course if I want to modify, or delete it... how can that be
| done?

This is basically answered by the above statements - you'd modify /
delete it just like you would a module that was part of your project.

I would recommend you check out the following links:

http://blogs.msdn.com/vbteam/archive/2007/01/05/extension-methods-part-1.aspx
http://msdn.microsoft.com/en-us/magazine/cc163317.aspx

Thanks,

Seth Rowe [MVP]
http://sethrowe.blogspot.com/
 
Anything can be a one-liner. ;)

Here's a one-liner that splits the string into an array, and it even
manages to get the last two characters in a string too. :)

Dim strMyOutputString() As String = (From m In Regex.Matches(strMyInputString, "[\W\w]{1,4}") Select DirectCast(m, Match).Value).ToArray

It could have been more elegant if only the MatchCollection class would
have implemented IEnumerable<Match> and not only IEnumerable.
 
Appr3nt1c3 said:
Dim cExp As String = ".{4}"
oMatches = Regex.Matches(cSourceText, cExp)

Pretty close, but it only works correctly if the length of the string is
evenly divisible by four. With the string provided by the OP, it will
not get the last "IJ" string.

You need to change the pattern to ".{1,4}" to get the last string also.
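
A quick way to see the difference is to count the matches. The sketch below is only an illustration (RegexPatternSketch and CountMatches are made-up names). It also shows one more thing to watch: "." does not match a line feed unless RegexOptions.Singleline is set, so a pattern like "[\W\w]{1,4}" is safer if the input can contain line breaks.

```vbnet
Option Strict On

Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module RegexPatternSketch
    ' Hypothetical helper: how many chunks does a pattern produce for a given input?
    Public Function CountMatches(ByVal input As String, ByVal pattern As String, ByVal options As RegexOptions) As Integer
        Return Regex.Matches(input, pattern, options).Count
    End Function

    Sub Main()
        Dim input As String = "01234567ABCDEFGHIJ"

        Console.WriteLine(CountMatches(input, ".{4}", RegexOptions.None))      ' 4: the trailing "IJ" is lost
        Console.WriteLine(CountMatches(input, ".{1,4}", RegexOptions.None))    ' 5: "IJ" is included

        Dim twoLines As String = "0123" & vbLf & "4567"

        ' Without Singleline the line feed is skipped, so the chunks are "0123" and "4567".
        Console.WriteLine(CountMatches(twoLines, ".{1,4}", RegexOptions.None))        ' 2
        ' With Singleline the line feed counts as a character: 9 characters give 3 chunks.
        Console.WriteLine(CountMatches(twoLines, ".{1,4}", RegexOptions.Singleline))  ' 3
        Console.WriteLine(CountMatches(twoLines, "[\W\w]{1,4}", RegexOptions.None))   ' 3
    End Sub
End Module
```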
 
... (history removed) ...
Anything can be a one-liner. ;)

Here's a one-liner that splits the string into an array, and it even
manages to get the last two characters in a string too. :)

Dim strMyOutputString() As String = (From m In
Regex.Matches(strMyInputString, "[\W\w]{1,4}") Select DirectCast(m,
Match).Value).ToArray

It could have been more elegant if only the MatchCollection class would
have implemented IEnumerable<Match> and not only IEnumerable.

Can someone please help me understand this? I don't understand the "From
.... In ... Select" part and searches have been useless.

Thanks, Bob
 
Imports System.Text.RegularExpressions
Module Module1
Sub Main()
Dim cSourceText As String = "HELLO WORLD EXAMPLE."
Dim cExp As String = ".{4}"

While I think a regular expression would work, that particular
expression will only return the correct result if the length of your input
string is a multiple of 4 characters. What you need instead is ".{1,4}". This
will match at least 1 character and no more than 4.

Here is my entry (and no loops):

Imports System.Linq
Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions

Module ExtensionMethods
    <Extension()> _
    Public Function Chunk(ByVal input As String, ByVal chunkSize As Integer) As String()
        ' Builds a pattern such as ".{1,4}": between 1 and chunkSize characters per match
        Dim pattern As String = String.Format(".{{1,{0}}}", chunkSize)
        Return Regex.Matches(input, pattern).Cast(Of Match)().Select(Function(mt) mt.Value).ToArray()
    End Function
End Module
 
Bob said:
Can someone please help me understand this? I don't understand the "From
... In ... Select" part and searches have been useless.

That is part of LINQ and is only available in VB9 and later. Here's a page that
has many more samples:

http://msdn.microsoft.com/en-us/vbasic/bb688088.aspx

And here is some more information:

http://msdn.microsoft.com/en-us/vbasic/aa904594.aspx
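
In short, "From m In source" names each item of the source m, and "Select" says what to produce from each m. A rough sketch of the same one-liner written three ways is below. QuerySyntaxSketch and the function names are made up for illustration, and Cast(Of Match)() is used here in place of the DirectCast in the original one-liner.

```vbnet
Option Strict On

Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Text.RegularExpressions

Module QuerySyntaxSketch
    ' Query syntax: for each Match m in the source, select its Value.
    Public Function ViaQuery(ByVal input As String) As String()
        Dim matches As IEnumerable(Of Match) = Regex.Matches(input, "[\W\w]{1,4}").Cast(Of Match)()
        Return (From m In matches Select m.Value).ToArray()
    End Function

    ' Method syntax: the same query written as a call to Select with a lambda.
    Public Function ViaLambda(ByVal input As String) As String()
        Dim matches As IEnumerable(Of Match) = Regex.Matches(input, "[\W\w]{1,4}").Cast(Of Match)()
        Return matches.Select(Function(item) item.Value).ToArray()
    End Function

    ' Plain loop: what both versions above amount to.
    Public Function ViaLoop(ByVal input As String) As String()
        Dim result As New List(Of String)()
        For Each found As Match In Regex.Matches(input, "[\W\w]{1,4}")
            result.Add(found.Value)
        Next
        Return result.ToArray()
    End Function

    Sub Main()
        ' Each line prints 0123,4567,ABCD,EFGH,IJ
        Console.WriteLine(String.Join(",", ViaQuery("01234567ABCDEFGHIJ")))
        Console.WriteLine(String.Join(",", ViaLambda("01234567ABCDEFGHIJ")))
        Console.WriteLine(String.Join(",", ViaLoop("01234567ABCDEFGHIJ")))
    End Sub
End Module
```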

Chris
 