TextFieldParser - reading tab delimited file

  • Thread starter: al jones

al jones

I'm using TextFieldParser to read a data file which contains, for example:

Amondó Szegi Amondo Szegi
andré nossek André Nossek
© Characte Character

Note the vowels with diacriticals and the copyright symbol. It is dropping
these (and other similar) characters, which apparently fall outside the
ASCII range.

The code is simple and looks like:
Using MyReader As New TextFieldParser(Application.StartupPath & "\designers.txt")
    MyReader.TextFieldType = FileIO.FieldType.Delimited
    MyReader.CommentTokens = New String() {"#"}
    MyReader.Delimiters = New String() {vbTab}
    MyReader.TrimWhiteSpace = True
    Dim currentRow As String()
    intElement = 0
    While Not MyReader.EndOfData
        Try
            currentRow = MyReader.ReadFields()
            ' Rows starting with "UNKNOWN" set strUnknownDesigner and are not stored.
            If Microsoft.VisualBasic.Left(currentRow(0), 7) = "UNKNOWN" Then
                strUnknownDesigner = currentRow(1)
                Continue While
            End If
            arDesigner(intElement, 0) = currentRow(0)
            arDesigner(intElement, 1) = currentRow(1)
            arDesignerCounter(intElement) = 0
            intElement += 1
        Catch ex As MalformedLineException
            MsgBox("Designer Line " & ex.Message & " is not valid and will be skipped.")
        End Try
    End While
End Using

I can’t see any reason in the documentation for it dropping copyright or
the French and German (etc…) vowels with accents.

Comments or suggestions anyone??

Thanks //al
 
al said:
I'm using TextFieldParser to read a data file which contains, for
example:

Amondó Szegi Amondo Szegi
andré nossek André Nossek
© Characte Character

Note the vowels with diacriticals and the copyright symbol. It is
dropping these (and other similar) characters, which apparently fall
outside the ASCII range.

It appears to be an encoding problem where the file uses (I'm guessing)
ISO-8859-1 or maybe Windows-1252 whereas the .NET framework defaults to
Unicode. Does a TextFieldParser have a setting for that (or have a
.BaseClass that does)?

Or perhaps you can arrange for the file to be encoded with Unicode?

Andrew
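
To make the encoding suggestion concrete: TextFieldParser has a constructor overload that takes a default System.Text.Encoding, which is used when no byte order mark is found in the file. The following is only a minimal sketch. It assumes the file really is Windows-1252 (code page 1252), which is a guess, and the module and procedure names are hypothetical.

```vb
Imports System
Imports System.Text
Imports Microsoft.VisualBasic
Imports Microsoft.VisualBasic.FileIO

Module DesignerFileSketch
    ' Reads a tab-delimited file, decoding it as Windows-1252 unless a
    ' byte order mark at the start of the file indicates another encoding.
    Sub PrintDesigners(ByVal filePath As String)
        ' Assumption: the file is Windows-1252; substitute the real encoding.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        Using parser As New TextFieldParser(filePath, ansi)
            parser.TextFieldType = FieldType.Delimited
            parser.Delimiters = New String() {vbTab}

            While Not parser.EndOfData
                Dim fields As String() = parser.ReadFields()
                Console.WriteLine(String.Join(" | ", fields))
            End While
        End Using
    End Sub
End Module
```

If the accented characters and the copyright symbol come through with this overload, the file is most likely not stored as UTF-8, and either passing the encoding explicitly or re-saving the file as UTF-8 should be enough.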
 
Andrew said:
It appears to be an encoding problem where the file uses (I'm guessing)
ISO-8859-1 or maybe Windows-1252 whereas the .NET framework defaults to
Unicode. Does a TextFieldParser have a setting for that (or have a
.BaseClass that does)?

Or perhaps you can arrange for the file to be encoded with Unicode?

Andrew

Possibly my confusion comes from the fact that I maintain these files (there
are three of them) within VS 2005, so I would have expected them to be
Unicode. The characters exist within the files (the three-line examples are
cut & pasted from the file itself), so I don't understand why reading them
would literally eliminate the characters.

I've been over the TextFieldParser docs and see nothing that indicates that
it shouldn't take the data as presented.
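
One way to test the guess about how the files are stored is to look at their first bytes. The sketch below checks only for the UTF-8 byte order mark (EF BB BF); the module and function names are hypothetical, and it does not try to identify any other encoding.

```vb
Imports System
Imports System.IO

Module BomCheckSketch
    ' Returns True if the file starts with the UTF-8 byte order mark EF BB BF.
    Function HasUtf8Bom(ByVal filePath As String) As Boolean
        Dim buffer(2) As Byte ' three bytes: indexes 0 to 2
        Using stream As FileStream = File.OpenRead(filePath)
            Dim count As Integer = stream.Read(buffer, 0, buffer.Length)
            Return count = 3 AndAlso buffer(0) = &HEF _
                AndAlso buffer(1) = &HBB AndAlso buffer(2) = &HBF
        End Using
    End Function
End Module
```

A result of False does not prove the file is not UTF-8, since UTF-8 files can be saved without a byte order mark. But if there is no mark and a character such as é is stored as a single byte, the file is in a single-byte code page rather than UTF-8, which would be consistent with the characters disappearing when the parser decodes the file as UTF-8.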
 