Use of Mid Statement in VB.Net


Harry Strybos

Hi All

If anyone has the time, I think feedback on the following may be of interest
to all:

I have to build a lot of bank files which are generally of the fixed length
field type eg aba format. The beauty of the old VB6 Mid statement (which is
available in VB.Net) is that it allows me to replace n chars at a specified
position in a string eg

'(air code)
Private Enum MyFields
mfRecordType = 1
mfTransactionType = 2
'etc
End Enum

Dim buffer as New String(120, " "c)
Dim pointer as Integer = 1

Mid(buffer, pointer, MyFields.mfRecordType) = "A"
pointer += MyFields.mfRecordType

Mid(buffer, pointer, MyFields.mfTransactionType ) = "AB"
pointer += MyFields.mfTransactionType 'etc etc

There appears to be no native VB.Net equivalent to this method. The StringBuilder
Class does not have this method either. It should be noted that the Replace
method for either String or StringBuilder would not be appropriate for the
above.

The above algorithm that I use will actually write several thousand records
to a file very quickly (around or under 1 second).

As I have stated, this whole question is of an academic nature. Having said
that, is there any equivalent method (as used above) for the Mid Statement
in VS (any language)?

Thanks for your time.
 
Harry Strybos said:
Dim buffer as New String(120, " "c)

Oops... Dim buffer as New String(" "c, 120)
 
As you probably know, it is impossible to change the content of a String
(="immutable") in .Net. The out-of-every-standard Mid statement can't do
anything else. It pretends to change some chars but actually returns
a new String.

Dim s1, s2 As String

s1 = "abc"
s2 = s1
MsgBox(s1 Is s2)
Mid(s1, 2, 1) = "X"
MsgBox(s1 Is s2)

You'll get "True" and "False" in the boxes.

So, you'll have to stick either with the StringBuilder or the & operator
for concatenation. If I understood correctly, you have to build
a new String every time, not change the content of a previous one, right?
To make this considerably faster, you can use one StringBuilder object
with maximum capacity (120 in this case), and set its Length property
to 0 each time you build a new record. You can easily write a
function that also does the padding of blanks. I'd not use one of
the built-in methods like AppendFormat or PadRight because it's probably
much slower. Instead something like:

Module whatever
    <Runtime.CompilerServices.Extension()> _
    Sub AddWithPadding(ByVal sb As System.Text.StringBuilder, _
                       ByVal Text As String, ByVal TotalLength As Integer)
        sb.Append(Text)
        sb.Append(New String(" "c, TotalLength - Text.Length))
    End Sub
End Module

I've measured the time with a little test:

Shared Sub test()
    Dim sb As New System.Text.StringBuilder(10)
    Dim watch = Stopwatch.StartNew

    For i = 1 To 30000000
        sb.Length = 0
        sb.AddWithPadding("X", 5)
        sb.AddWithPadding("Y", 5)
    Next
    watch.Stop()

    MsgBox(sb.ToString & " " & watch.Elapsed.ToString)
End Sub

This took 5.95 s here.

I was guessing that the creation of a new String may be too much overhead,
therefore I replaced the 2nd sb.Append line by a loop:

For i = 1 To TotalLength - Text.Length
sb.Append(" "c)
Next

However, it was slightly slower (6.16 s). Anyway, by far the quickest version that
came to mind was this one:

sb.Append(Main.StringsWithBlanks(TotalLength - Text.Length))

In this version, Main.StringsWithBlanks just returns a String from an array that
has been filled once at the start by Strings with as many blanks as the index
that the String is stored at. Total execution time: 2.4 s. So, if I didn't completely
miss the point of your post, that's the version of my choice.
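
A minimal sketch of how such a lookup table might look. The module name, the MaxBlanks limit of 120 and the BuildBlanks helper are assumptions for illustration, not the code that was actually timed above; it is written as a plain Sub rather than an extension method to keep it short:

```vbnet
Imports System.Text

Module PaddingSketch
    ' Hypothetical upper bound on the padding ever needed (one full record).
    Private Const MaxBlanks As Integer = 120

    ' Blanks(n) holds a String of exactly n spaces; filled once at startup.
    Private ReadOnly Blanks As String() = BuildBlanks()

    Private Function BuildBlanks() As String()
        Dim table(MaxBlanks) As String
        For n As Integer = 0 To MaxBlanks
            table(n) = New String(" "c, n)
        Next
        Return table
    End Function

    ' Appends text, then pads with blanks up to totalLength characters.
    ' Assumes text.Length <= totalLength and the padding is <= MaxBlanks.
    Public Sub AddWithPadding(ByVal sb As StringBuilder, _
                              ByVal text As String, ByVal totalLength As Integer)
        sb.Append(text)
        sb.Append(Blanks(totalLength - text.Length))
    End Sub
End Module
```

The table costs a few kilobytes once and avoids allocating a fresh padding String on every call.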

My extensive dealing with padding is because it's the only problem I see in your
case. If padding is not an issue, you can just use a StringBuilder (keeping the
".Length = 0" hint in mind). I wouldn't be surprised if it worked even quicker
than using the Mid statement (because it creates a new string). ..... tested:

Shared Sub test2()
    Dim s = Space(10)
    Dim watch = Stopwatch.StartNew

    For i = 1 To 30000000
        Mid(s, 1, 1) = "X"
        Mid(s, 6, 1) = "Y"
    Next
    watch.Stop()

    MsgBox(s & " " & watch.Elapsed.ToString)
End Sub

Execution time: 11.6 s. Questions? ;-)
 
As you probably know, it is impossible to change the content of a String
(="immutable") in .Net. The out-of-every-standard Mid statement can't do
anything else. It pretends to change some chars but actually returns
a new String.

Actually, VB Mid$ is extremely fast. It seems
to be treating the string as an integer
array... which it is... which allows characters
to be replaced without creating a new string.
(Assuming Mid$ is being used on a string
buffer of sufficient length. Offhand I don't know
what happens if one tries to Mid$ 10 characters
into an 8-char. string.)
 
mayayana said:
Actually, VB Mid$ is extremely fast.

Talking about the Mid _statement_? In that case it was slower in my
test case.
It seems
to be treating the string as an integer
array...
?

which it is... which allows characters
to be replaced without creating a new string.

It does create a new string as you can reproduce yourself with my first
example. Do you have an example to show the opposite?
 
Talking about the Mid _statement_? In that case it was slower in my
test case.


It does create a new string as you can reproduce yourself with my first
example. Do you have an example to show the opposite?

mayayana is playing a game with you. He is talking about the vb6 mid
statement...

As for the Mid statement in VB.NET - that maps to
Microsoft.VisualBasic.CompilerServices.StringType.MidStmtStr. This method
basically takes your string ByRef because, as you say, it returns a new string.
It does some checks, uses a StringBuilder to build the new string,
and then sets your string reference to the result of the StringBuilder's ToString.
 
It's not clear from your original post, but the Mid statement is still
available in VB (.NET) ?
(It can also be used from other .NET languages, is well-tested, and is
easier than writing from scratch).
--
David Anton
http://www.tangiblesoftwaresolutions.com
Convert VB to C#, C++, or Java
Convert C# to VB, C++, or Java
Convert C++ to C#, VB, or Java
Convert Java to C#, C++, or VB
 
Armin Zingler said:
Talking about the Mid _statement_? In that case it was slower in my
test case.

I believe he is talking about the VB6 Mid statement, and looking for
something as fast in .Net.
 
Actually, VB Mid$ is extremely fast.
Talking about the Mid _statement_?

Isn't that what Harry Strybos was talking about?
He wants something like a VB Mid *statement* to
edit a string without reallocating.
It does create a new string as you can reproduce yourself with my first
example.
Do you have an example to show the opposite?

You may be right about your code. But it's a VB.Net
sample. You're mixing up VB Mid with VB.Net Mid.
Mid in VB.Net *does* create a new string. But as I
understood Harry's question, he's looking for
something in VB.Net that has the same efficiency as
Mid in VB. That's what he hasn't been able to find.
The two Mids are two different things:

http://www.vbdotnetheaven.com/UploadFile/ggaganesh/KeyChanges04262005080509AM/KeyChanges.aspx

Your code can't be tested in VB. The Is operator
compares object pointers. In VB a string is not an
object. If I run your code I get a type mismatch error.
But I can write similar code sample to show how
VB Mid works:

Dim s1, s2 As String
s1 = "abc"
s2 = s1
MsgBox StrPtr(s1) & vbCrLf & StrPtr(s2)
Mid(s1, 2, 1) = "X"
MsgBox StrPtr(s1) & vbCrLf & StrPtr(s2)

The string pointers for each variable are the
same in each message box. If I replace the Mid
line with: s1 = "axc"
then I get a new pointer for s1.

I first discovered the usefulness of VB Mid when
I was writing a parsing routine that needed to
tokenize a very big string and output an RTF string.
I was colorcoding various types of code in a
RichTextBox, typically 20-100 KB files. My aim was to
get a routine that could take that very long string of plain
text and colorcode it as an RTF string, then put that
back into the RTB window, in under 1/4 second so that
it wouldn't be noticeable.

The RTB text read and replace is nearly instant, but
string building/tokenizing time adds up.
I ended up using a routine that tokenized the input
string of plain text by pointing an integer array at it
and tokenizing numerically.
I then built the new RTF string by using
a large string buffer and writing the new string to it, 1
character or word at a time, with RTF encoding added,
using the Mid statement.
It's very efficient. Before settling on that method I
also tested a method of using an array for both
strings, since array ops are very efficient. I don't
remember whether that method was slower or the same,
but it wasn't faster.

So VB Mid is extremely efficient. Whether or not
there's something as fast as *VB* Mid in VB.Net
is something I leave to others to figure out. It seems
like a rather difficult thing to test as an accurate
comparison. But if you have a method to actually
replace characters without creating a new string then
that should have comparable efficiency...at least
insofar as .Net can be said to have potential for
efficiency.
 
Hi,
There appears no native VB.Net equivalent to this method. The
StringBuilder Class does not have this method either. It should be noted
that the Replace method for either String or StringBuilder would not be
appropriate for the above.

IMO it could be done using CopyTo (to replace a string at this location),
Insert (if the string is longer than the one that is removed), Remove (if
the string is shorter than the one being replaced)...

Using an extension method would make this new method appear as if
it were part of StringBuilder...
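
As a rough sketch of that idea, an extension method can overwrite characters through the StringBuilder's read/write Chars property. The name Overwrite and the 0-based start position are assumptions chosen for illustration; this is not an existing framework method:

```vbnet
Imports System.Runtime.CompilerServices
Imports System.Text

Module StringBuilderOverwriteSketch
    ' Overwrites characters in place, starting at the 0-based position start.
    ' Note: the Mid statement is 1-based; this hypothetical helper is 0-based.
    <Extension()> _
    Public Sub Overwrite(ByVal sb As StringBuilder, _
                         ByVal start As Integer, ByVal text As String)
        ' Like Mid, never grow the buffer: copy only as much as fits.
        Dim count As Integer = Math.Min(text.Length, sb.Length - start)
        For i As Integer = 0 To count - 1
            sb.Chars(start + i) = text.Chars(i)
        Next
    End Sub
End Module
```

With a builder pre-filled with blanks, e.g. New StringBuilder(New String(" "c, 120)), a call such as record.Overwrite(0, "A") replaces the first character without changing the builder's length.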
 
It's not clear from your original post, but the Mid statement
is still available in VB (.NET) ? (It can also be used from other
.NET languages, is well-tested, and is easier than writing from
scratch).

What the OP is asking for is a VB.Net statement that performs the same
functionality as the VB6 Mid$ statement and is as fast as the VB6 Mid$
statement. By comparison the VB.Net Mid statement is painfully slow,
hundreds of times slower in some cases.

Mike
 
mayayana is playing a game with you. He is talking about the vb6 mid
statement...

I'm not playing a game with him. You feel
in competition with people not using .Net, but
the feeling is not necessarily mutual. Yet you
attack anyway. I only post here when I think
I've got something useful to add. This is a
VB.Net group. (When you come charging into
the VB group to fight about VB.Net, though....that's
different, and I'm more than happy to criticize both
you and VB.Net in that case. :)

In this case
Harry Strybos was looking for something comparable
to VB Mid in VB.Net. Armin Zingler misunderstood
the two Mids, no one else had offered to clarify, and I
was in a position to set the record straight.

The last time you did this was when I posted
some VBScript Shell code, suggesting that the OP
could adapt it, after no one else had answered his
question. That time, as well, you jumped right in
to fight about it, assuming my post was a dig without
really reading it, yet never offering any useful VB.Net
solution to the OP.

If you look at my posts here you'll see that a) my
posts are not frequent, b) I'm generally only posting
when no one else has offered an answer and c) in
those cases I have something to offer, like
maybe a VBScript snippet that the OP might be
able to adapt to VB.Net.

So if you want to argue the validity of my criticisms,
when I make them, that's OK with me. But I would
thank you to just give it a little space and read a post
fully before just "reaching for your gun". There's plenty
of room for sheer passion for understanding to co-exist
with disagreement.
And while you're at it, you might consider offering an
answer to the OP while you're here. He wants a
replacement in VB.Net as efficient as VB Mid, and I'm
guessing that you're probably well qualified to address that.
 
David said:
It's not clear from your original post, but the Mid statement is still
available in VB (.NET) ?

I think he knows it because he wrote "the old VB6 Mid statement (which
is available in VB.Net)".
 
Harry only mentioned that the Mid statement was already there in VB6.
He is now working with VB.Net, and he needs a solution for VB.Net.
Therefore all my statements are related to VB.Net and all examples
are written for VB.Net

If you're trying to apply my statements to VB6 or trying to execute my
code in VB6, it's really your own problem.
 
Actually, VB Mid$ is extremely fast. It seems
to be treating the string as an integer
array... which it is... which allows characters
to be replaced without creating a new string.
(Assuming Mid$ is being used on a string
buffer of sufficient length. Offhand I don't know
what happens if one tries to Mid$ 10 characters
into an 8-char. string.)

In .NET you CAN NOT modify a string. If it appears a string is being
modified, a new string is being created.
 
Harry only mentioned that the Mid statement was
already there in VB6. He is now working with VB.Net
and he needs a solution for VB.Net.

Don't tell lies Zingler in your attempt to score points. Harry also said
that he knows the Mid statement is also available in VB.Net. In fact you've
already admitted that yourself in your curt reply to David Anton. Harry in
fact said, "The beauty of the old VB6 Mid statement (which is available in
VB.Net) is that . . ." He then went on to say that he wants to deal with
several thousand records in a very fast time, although he did not specify
how many times he would need to use the Mid statement on each record. So, it
would appear that either the OP meant to say, "which is /not/ available in
VB.Net", whereas in fact it is, or that he actually did know it was
available in VB.Net but (as indicated by his use of the phrase "very fast
time") he knew that the VB.Net version is painfully slow by comparison. The
OP has asked a question, Zingler. Why don't you just help him out and leave
it at that, instead of trolling around telling lies.

Mike
 
Mike - my reading of the op's post is that the "very fast time" was referring
to his current vb.net solution. I take the whole thing to mean, that
basically the VB.NET version is fast enough, but can I speed it up to match
the VB6 version using the vb6 mid....

The answer to that is probably - but, he won't be able to use the mid
statement. That works on strings and strings are immutable in .NET.
Immutability gives strings memory and performance benefits in some areas,
but kills you when you are trying to dynamically build strings.

In this scenario, I would probably write a method that uses a StringBuilder as
a buffer - which does allow you to directly modify the underlying buffer. In
fact, as was suggested above I would probably make the method an extension
method... It would make the code look a bit cleaner (something like the
following pseudocode)

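While Not Done
    ' pre-fill the record with blanks so every position can be overwritten
    Dim record As New StringBuilder(New String(" "c, recordLength))
    ' restart at the first position for each new record
    Dim pointer As Integer = 0

    ' add the field (Mid here is a hypothetical extension method)
    record.Mid(pointer, "A")
    pointer += firstField

    record.Mid(pointer, "AB")
    pointer += secondField

    'etc, etc

    ' write it out
    WriteRecord(record.ToString())
End While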
 
Harry said:
<snip>

Personally I'd refrain from using the Mid statement. Since strings are
immutable in .Net, VB.Net's version of Mid has to replace my original
string with a new one, which may be highly inefficient under intense use.
Instead, there are at least two approaches I'd prefer:

a) save each element in a list and then join the list elements to get
the final string

Ex:
Public Enum FieldSizes
Id = 4
Name = 20
Status = 15
End Enum

Sub AddField( _
ByVal A As List(Of String), _
ByVal V As String, _
ByVal S As FieldSizes _
)
A.Add(V.PadRight(S, " "c))
End Sub

'...
Dim Buffer As New List(Of String)
AddField(Buffer, "12", FieldSizes.Id)
AddField(Buffer, "John", FieldSizes.Name)
AddField(Buffer, "Unemployed", FieldSizes.Status)
Debug.Print("[" & String.Join("", Buffer.ToArray()) & "]")
'...

b) Insert the values in a pre-filled string buffer.
Ex:
Public Enum FieldPos
Id = 0
Name = 4
Status = 24
FullSize = 39
End Enum

Sub SetField(ByVal S As StringBuilder, ByVal V As String, ByVal P As FieldPos)
S.Remove(P, V.Length)
S.Insert(P, V)
End Sub

'...
Dim Buffer As New StringBuilder(New String(" "c, FieldPos.FullSize))
SetField(Buffer, "John", FieldPos.Name)
SetField(Buffer, "Unemployed", FieldPos.Status)
SetField(Buffer, "25", FieldPos.Id)
Debug.Print("[" & Buffer.ToString() & "]")
'...


Hope this helps.

Regards,

Branco.
 
Perhaps you can use Char array instead, or two strings, one In string, and
the other Out string that represents the processed result.
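
A minimal sketch of the Char array variant, assuming 0-based field positions passed in parallel arrays; the module and function names are made up for illustration:

```vbnet
Module CharArrayRecordSketch
    ' Builds one fixed-length record from (position, text) pairs.
    ' positions(f) is the 0-based start of the field holding values(f).
    Public Function BuildRecord(ByVal recordLength As Integer, _
                                ByVal positions As Integer(), _
                                ByVal values As String()) As String
        ' A Char array is mutable, unlike a String.
        Dim buffer(recordLength - 1) As Char
        For i As Integer = 0 To buffer.Length - 1
            buffer(i) = " "c
        Next

        For f As Integer = 0 To positions.Length - 1
            ' String.CopyTo writes the characters straight into the array.
            ' It throws if the text does not fit at that position.
            values(f).CopyTo(0, buffer, positions(f), values(f).Length)
        Next

        ' Only one String is allocated per record, at the very end.
        Return New String(buffer)
    End Function
End Module
```

For many records the array could be allocated once and re-blanked between records instead of being created on every call.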
 