file parsing algorithms in vb.net?

  • Thread starter: Christoph Bisping

Christoph Bisping

Hello!

Maybe someone is able to give me a little hint on this:
I've written a vb.net app which is mainly an interpreter for specialized
CAD/CAM files.
These files mainly contain simple movement and drawing instructions like
"move to's" and "change color's" optionally followed by one or more numeric
(int or float) arguments. My problem is that the parsing algorithm I've
currently implemented is extremely slow.

Basically I'm approaching these files as follows:

Read 2 bytes from disk and check if these bytes match any known command.
This is done in a large Select Case-statement (about 30 "cases").
If this statement finds a "move to" command, for example, then I'm trying to
extract the arguments.
Please see the following code extract:

Do While ((IsNumeric(strThisChar)) Or (strThisChar = ",") Or _
          (strThisChar = ".") Or (strThisChar = "-") Or (strThisChar = " "))

    If ((IsNumeric(strThisChar)) Or (strThisChar = ".") Or _
        (strThisChar = "-")) Then
        ' this character belongs to the current argument

        If ((Len(aryValues(lValueIdx)) = 0) And (strThisChar = ".")) Then
            aryValues(lValueIdx) = "0"
            strThisChar = "," ' Workaround: this value will become float
        End If

        aryValues(lValueIdx) += strThisChar ' Here I'm simply concatenating the values
        strThisChar = m_HPGLFile.ReadChar ' this is my input filestream

    Else ' finished one argument
        lValueIdx += 1
        ReDim Preserve aryValues(lValueIdx)
        strThisChar = m_HPGLFile.ReadChar
    End If

Loop

As you can see, I'm building an array with all arguments that are following
the command.
How would you code such a file parsing algorithm? I strongly believe that my
extensive Select Case-statement and the "argument reader" shown above are my
main bottlenecks. These functions are called thousands of times during file
parsing...

Any hints would be greatly appreciated ;-)

Greetings,
Christoph Bisping
 
Christoph,
One thing I would suggest is that you do *not* read bytes, parse them,
read bytes, parse them, etc.
It would probably speed up your code tremendously to read the entire
file into memory (a collection or array of bytes or something else),
*then* close the file and parse the data you've already read into
memory. All those file accesses might be costing you a lot of performance,
and you really shouldn't leave files open for any longer than absolutely
necessary.
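
To make the idea concrete, here is a rough sketch of what "read first, parse afterwards" could look like. It is only an illustration under a few assumptions, not a drop-in replacement for your code: it assumes the file is plain ASCII text, that .NET 2.0 or later is available (for List(Of T)), and that arguments are separated by commas or spaces as in your loop. The names HpglArgumentSketch, LoadFile, ReadArguments and ParseNumber are made up for this example. Unlike your loop, it skips empty arguments instead of storing them, and it will throw a FormatException on a lone "-" or ".".

```vb
Imports System.Collections.Generic
Imports System.Globalization
Imports System.IO
Imports System.Text

Module HpglArgumentSketch

    ' Reads the whole file in one call; the file is closed again before parsing starts.
    ' Assumption: the file is plain ASCII text.
    Function LoadFile(ByVal path As String) As String
        Return File.ReadAllText(path, Encoding.ASCII)
    End Function

    ' Scans the numeric arguments that start at index pos in data.
    ' Stops at the first character that cannot belong to an argument list
    ' and leaves pos pointing at that character.
    Function ReadArguments(ByVal data As String, ByRef pos As Integer) As List(Of Double)
        Dim values As New List(Of Double)()
        Dim startIdx As Integer = -1 ' start of the number being scanned, -1 = none

        While pos < data.Length
            Dim c As Char = data(pos)
            If Char.IsDigit(c) OrElse c = "."c OrElse c = "-"c Then
                ' this character belongs to the current argument
                If startIdx < 0 Then startIdx = pos
            ElseIf c = ","c OrElse c = " "c Then
                ' separator: finish the current argument, if there is one
                If startIdx >= 0 Then
                    values.Add(ParseNumber(data, startIdx, pos - startIdx))
                    startIdx = -1
                End If
            Else
                Exit While
            End If
            pos += 1
        End While

        ' an argument may run up to the terminating character or the end of the data
        If startIdx >= 0 Then
            values.Add(ParseNumber(data, startIdx, pos - startIdx))
        End If
        Return values
    End Function

    ' Converts one slice of the input; the invariant culture treats "." as the
    ' decimal separator regardless of the machine's regional settings.
    Private Function ParseNumber(ByVal data As String, ByVal start As Integer, _
                                 ByVal length As Integer) As Double
        Return Double.Parse(data.Substring(start, length), CultureInfo.InvariantCulture)
    End Function

End Module
```

For example, with data = "PA10,20.5;" and pos = 2 (just past the two-character command), ReadArguments returns the two values 10 and 20.5 and leaves pos at 9, the index of the ";". Each number is cut out with a single Substring call and the results go into a List, so there is no per-character string concatenation and no ReDim Preserve; whether that is where your time actually goes is something you would have to measure.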

--
==================================================================
Sam J. Marrocco
Sr. Visual Effects Artist/R&D
Travelling Pictures/GTN
Inferno, Flame, Maya, All that cool stuff!
"The fact that no one understands you doesn't make you an artist."
==================================================================
 