Split Function revisited :-(

  • Thread starter: mannyGonzales

mannyGonzales

Hey guys,

Earlier I posted this common task of reading a CSV file.
My data read as: "1","2","3"

Unfortunately it now reads as:
"1","Text with, comma", "2"

embedded commas!
--------------------------------------------
I currently have:
Dim instr As String
Dim indata() As String

While filename.Peek > -1

instr = filename.ReadLine
indata = Split(instr.Replace("""", ""), ",")

<logic>

End While

How do I account for this qualifier?
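
To make the failure concrete, here is a small stand-alone sketch of the same strip-then-split approach applied to the sample line above. The module name NaiveSplitDemo is made up for illustration, and String.Split stands in for the Split() call in the loop.

```vbnet
Imports System

Module NaiveSplitDemo
    Sub Main()
        ' The sample line from above: "1","Text with, comma", "2"
        Dim line As String = """1"",""Text with, comma"", ""2"""

        ' Strip every quote, then split on commas (same idea as the loop above).
        Dim indata() As String = line.Replace("""", "").Split(","c)

        ' The embedded comma is treated as a delimiter, so there is one piece too many.
        Console.WriteLine(indata.Length)
        For Each piece As String In indata
            Console.WriteLine("[" & piece & "]")
        Next
    End Sub
End Module
```

This prints 4, followed by [1], [Text with], [ comma] and [ 2]: three fields come back as four pieces.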

Thanks
 
Hi Manny

This isn't a trivial task but it's not rocket science either. ;-)

You might find this useful. There's more comment than code, so don't be put
off by its length.

Regards,
Fergus

<code>
'=========================================================================
'This function splits a string in exactly the same way
'as Split() except that delimiters (commas) enclosed by
'the specified quoting characters (double-quote) are not
'treated as delimiters.
'
'Like Split(), this function removes <only> characters
'which occur as delimiters. The quoting characters and
'leading and trailing spaces are retained.
'
'For normal strings, this function will behave entirely
'as expected. For example (using ' as the quoting char)
' [Cat, 'Apple, Orange', 'Spade, Trowel', Dog]
'will result in
' [Cat] ['Apple, Orange'] ['Spade, Trowel'] [Dog]
'
'It is not necessary for a quoting character to occur at
'the start of a substring. If a quoting character is
'embedded, it will still act to quote embedded commas.
'
'For example:
' [A, B', 'C, D] will result in [A] [B' , 'C] and [D].
'The [', '] is embedded within the B and C and the substring
'thus formed runs from the B to the C inclusive.
'
'Note, therefore, that the following examples:
' [A', ''B,C'' ,'D] result [A', ''B,C'' ,'D]
' ['A, 'B', 'C', D'] result ['A, 'B', 'C', D']
'give single strings as output. This is correct behaviour.
'On the other hand,
' ['A, 'B,' 'C,' D'] will result in ['A, 'B] [' 'C] [' D']
'because the comma after the B is not enclosed within quotes.
'
'If there is a closing quote missing, it is assumed that
'it would have been at the end of the entire string.
'
'cComma and cQuote are named thus for convenience.
'They can be any character. They can even be the
'same character but then no splitting will occur.
'
Public Function SplitQuoted (sStr As String, _
        Optional cComma As Char = ","c, _
        Optional cQuote As Char = """"c _
        ) As String()

    'If there are no quotes, do it the easy way.
    If sStr.IndexOf (cQuote) < 0 Then _
        Return sStr.Split (cComma)

    Dim alParts As New ArrayList

    Dim StartPos As Integer = 0
    Do
        Dim PosOfComma As Integer = sStr.IndexOf (cComma, StartPos)
        If PosOfComma < 0 Then
            'Add the remainder of the string (or an
            'empty string if there's a comma at the end)
            alParts.Add (sStr)
            Exit Do
        End If

        Dim PosOfQuote As Integer = sStr.IndexOf (cQuote, StartPos)
        If PosOfQuote < 0 Then _
            PosOfQuote = sStr.Length

        If PosOfComma < PosOfQuote Then
            'Extract the substring.
            alParts.Add (sStr.Substring (0, PosOfComma))
            'Remove the substring and comma.
            sStr = sStr.Substring (PosOfComma + 1)
            StartPos = 0
        Else
            'The comma comes after a quote.
            'Find the closing quote and loop around to
            'look for the next comma after it.

            'Move to the closing quote.
            PosOfQuote = sStr.IndexOf (cQuote, PosOfQuote + 1)
            If PosOfQuote < 0 Then _
                PosOfQuote = sStr.Length - 1
            'Look for the next comma after the closing quote.
            StartPos = PosOfQuote + 1
        End If
    Loop

    'Turn the ArrayList back into an array of strings.
    Dim O As Object = alParts.ToArray (GetType (String))
    Return DirectCast (O, String())
End Function
</code>
 
Hi Manny,

I put something together; maybe it is useful:

\\\
Dim a As String = """aaa"",""bbb"",""ccc"",""ddd"""
Dim b() As String = Split(a, """,""")
b(0) = b(0).Substring(1)
b(b.Length - 1) = b(b.Length - 1).Substring(0, b(b.Length - 1).Length - 1)
MessageBox.Show(b(0).ToString & b(1).ToString & b(2).ToString & b(3).ToString)
///

I hope this helps a little bit.
Cor
 
mannyGonzales said:
Earlier I posted this common task of reading a CSV file.
My data read as: "1","2","3"

Unfortunately it now reads as:
"1","Text with, comma", "2"

embedded commas!

You can split at the occurrences of """,""".
 
Then strip the leading " from the first element and the trailing " from the last one, since those two will each still have one.
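
A rough sketch of that idea, assuming every field is wrapped in double quotes and that stray spaces may surround the separating commas (as in the sample line). It swaps the plain Split for Regex.Split so that `", "` is matched as well as `","`. The name SplitAllQuoted is made up for illustration, and doubled quotes inside a field are not handled.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module QuotedCsvSketch
    ' Hypothetical helper: assumes EVERY field is wrapped in double quotes.
    Public Function SplitAllQuoted(ByVal line As String) As String()
        ' Split on: quote, optional spaces, comma, optional spaces, quote.
        Dim parts() As String = Regex.Split(line.Trim(), """\s*,\s*""")

        ' The first piece still has its opening quote, the last its closing quote.
        parts(0) = parts(0).TrimStart(""""c)
        Dim last As Integer = parts.Length - 1
        parts(last) = parts(last).TrimEnd(""""c)
        Return parts
    End Function

    Sub Main()
        ' Sample line: "1","Text with, comma", "2"
        For Each field As String In SplitAllQuoted("""1"",""Text with, comma"", ""2""")
            Console.WriteLine("[" & field & "]")
        Next
    End Sub
End Module
```

For the sample line this prints [1], [Text with, comma] and [2]. If some fields in the file are unquoted, this shortcut will not split them correctly.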
 
So how would I call this function with a string such as
"1","a,comma","3"?

I keep getting length errors when I simply call the
function with only a string.

thanks again
manny
 
Hi Manny,

Could you show me the code you are using to call it?

When I called it with your example string, it worked perfectly.

Dim asParts() As String
asParts = SplitQuoted ("""1"",""a,comma"",""3""")
For I As Integer = 0 To asParts.Length - 1
    Console.WriteLine (asParts(I))
Next

Output
"1"
"a,comma"
"3"

Note that the quotes are still there. This lets you Trim each string
before you remove them, in case there are spaces before or after a
comma which shouldn't be part of the string.

'Remove extraneous spaces and the quotes.
For I As Integer = 0 To asParts.Length - 1
    asParts(I) = asParts(I).Trim.Replace ("""", "")
Next
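
One caveat: Replace removes every quote character, including any that sit inside a field. If only the surrounding pair should go, a narrower cleanup could look like the sketch below. The name StripOuterQuotes is hypothetical, and the sample array simply imitates pieces as SplitQuoted would return them.

```vbnet
Imports System

Module UnquoteSketch
    ' Hypothetical helper: trims spaces, then removes ONE surrounding pair of
    ' double quotes if both are present. Quotes inside the field are left alone.
    Public Function StripOuterQuotes(ByVal field As String) As String
        Dim s As String = field.Trim()
        If s.Length >= 2 AndAlso s.StartsWith("""") AndAlso s.EndsWith("""") Then
            Return s.Substring(1, s.Length - 2)
        End If
        Return s
    End Function

    Sub Main()
        ' Pieces with their quotes (and one leading space) still attached.
        Dim asParts() As String = {"""1""", """Text with, comma""", " ""2"""}
        For I As Integer = 0 To asParts.Length - 1
            asParts(I) = StripOuterQuotes(asParts(I))
            Console.WriteLine(asParts(I))
        Next
    End Sub
End Module
```

With these sample pieces the loop prints 1, Text with, comma and 2, each on its own line.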

Regards,
Fergus
 