M
Mike
Harry said:Hi All
I have a .csv file that contains newline chars embedded between quotes in a
line of text eg
BSPADV1,John.public,9413,"Sharrock Ashley
TEST STREET 1
TEST NSW 2200",Address Insufficient,,,Mbase Print
Report,R7TDKPFMDBCLKE07CGJMFKKW6VVB/21,Sharrock Ashley
There are actually some 19 columns of data but when a StreamReader.ReadLine
method tries to read line by line, it only returns data up to the first
embedded newline chars. Interestingly, the data does display correctly in
Notepad, so I guess Notepad must ignore newline chars inside quotation
marks.
Is there anyway to read the above line and get the full line of data? My
only thought so far is to use the ReadToEnd method and then try and remove
the newline chars between quotes programatically before Spltting on the
"real" newline chars.
Thanks for any help
Harry, have you tried the builtin FileIO.TextFieldParser class?
It handles embedded eol characters, including embedded eol or commas
within quoted fields broken into two lines. Its handled for you. <g>
plus I presumed that it is cached as it reads, and Windows RTL already
caches text files, so it should be pretty efficient.
Here is an example console application with a ReaderCVS class wrapper
and example usage:
'--------------------------------------------------------------
' File : readcsf.vb
'--------------------------------------------------------------
Option Strict on
Option Explicit On
imports system
imports system.console
imports system.collections.generic
Module module1
public Class ReaderCVS
public shared data as new List(Of List(Of String))
'
' Read cvs file with max_fields, optional eolfilter
'
public shared function ReadCSF( _
byval fn as string, _
byval max_fields as integer, _
optional byval eolfilter as boolean = true) as boolean
try
data.clear
Dim rdr As FileIO.TextFieldParser
rdr = My.Computer.FileSystem.OpenTextFieldParser(fn)
rdr.SetDelimiters(",")
dim flds as new List(Of String)
While Not rdr.EndOfData()
dim lines as string() = rdr.ReadFields()
For Each fld as string In lines
if eolfilter then
'fld = fld.Replace(vbCr,"\r").Replace(vbLf,"\n")
fld = fld.Replace(vbCr," ").Replace(vbLf,"")
end if
flds.Add(fld)
if flds.count = max_fields then
data.Add(flds)
flds = new List(Of String)
end if
Next
End While
if flds.count > 0 then data.Add(flds)
return true
catch ex as exception
writeline(ex.message)
return false
end try
end function
sub Dump()
writeline("------- DUMP ")
dim nrow as integer = 0
for each flds as List(Of String) in data
nrow += 1
dim nfld as integer = 0
for each f as string in flds
nfld += 1
writeline("- r{0,-3} f{1,-3} => [{2}]",nrow, nfld, f)
next
next
end sub
end class
sub main(ByVal args() As String)
dim csf as new ReaderCvs
csf.readcsf("harry1.csv", 19) ' 19 fields
csf.dump()
readkey(true)
end sub
end Module
I used your example text as the harry1.csv test file.
I provided a eolfilter boolean to filter the <CR><LF> and played with
\r\n and space/blank replacements.
--