string.split question

  • Thread starter: Senthil

Senthil

Code
----------------------
string Line = "\"A\",\"B\",\"C\",\"D\"";

string Line2 = Line.Replace("\",\"","\"\",\"\"");

string [] CSVColumns = Line2.Split("\",\"".ToCharArray());





I expect 4 values in my CSVColumns array ("A", "B", "C", "D"), but it
returns 18 values, with a lot of empty strings.



Let me know if I am missing anything.



Thanks
 
Code
----------------------
string Line = "\"A\",\"B\",\"C\",\"D\"";

string Line2 = Line.Replace("\",\"","\"\",\"\"");

string [] CSVColumns = Line2.Split("\",\"".ToCharArray());

The problem is here. ToCharArray() turns your delimiter into a set of
individual characters, so Split breaks on each of them, not on the string
as a whole. The char[] overload of String.Split can't split on a *string*.

Instead, you can use the following:

string [] CSVColumns =
System.Text.RegularExpressions.Regex.Split(Line2, "\",\"");
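
For readers on .NET Framework 2.0 or later, String.Split also has an overload that takes a string[] of separators together with a StringSplitOptions value, so a whole string can act as a single delimiter without Regex. Below is a minimal sketch of that approach. The class and method names (SplitOnStringDemo, SplitOnDelimiter) are made up for illustration, and it works on the original Line, without the Replace step.

```csharp
using System;

static class SplitOnStringDemo
{
    // Hypothetical helper: treats the whole 'delimiter' string as one separator.
    public static string[] SplitOnDelimiter(string text, string delimiter)
    {
        return text.Split(new string[] { delimiter }, StringSplitOptions.None);
    }

    static void Main()
    {
        string line = "\"A\",\"B\",\"C\",\"D\"";

        // Split on the three-character sequence quote-comma-quote.
        string[] columns = SplitOnDelimiter(line, "\",\"");

        foreach (string column in columns)
        {
            // The first and last fields keep one outer quote, so trim it.
            Console.WriteLine(column.Trim('"'));
        }
    }
}
```

Because the delimiter consumes the quotes between fields, only the opening quote of the first field and the closing quote of the last field remain, which is why the sketch trims them.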

In any case, for this input the following achieves the same effect more
efficiently.

string line = "\"A\",\"B\",\"C\",\"D\"";
string[] csvColumns = line.Split(',');
 
Senthil said:

string Line2 = Line.Replace("\",\"","\"\",\"\"");

Why is this line here? I don't understand the purpose of adding more quotes to your string.


string [] CSVColumns = Line2.Split("\",\"".ToCharArray());



I expect 4 values in my CSVColumns Array ("A", "B", "C", "D"). But it
returns me 18 values with lot of empty strings.

string.Split splits the string on *each* character given. You seem to expect "," (quote, comma, quote) to be treated as one three-character delimiter, but it's actually three one-character delimiters. The empty strings you're getting are the zero-length strings between adjacent quotes and commas in your input string.
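
To make the count concrete, here is a small sketch that reproduces the behaviour described above. The class and helper names (CharSplitDemo, CountEmpty) are made up for illustration; the strings are the ones from the original post.

```csharp
using System;

static class CharSplitDemo
{
    // Hypothetical helper: counts the zero-length entries in a split result.
    public static int CountEmpty(string[] parts)
    {
        int empty = 0;
        foreach (string part in parts)
        {
            if (part.Length == 0) { empty++; }
        }
        return empty;
    }

    static void Main()
    {
        string line = "\"A\",\"B\",\"C\",\"D\"";
        string line2 = line.Replace("\",\"", "\"\",\"\"");

        // ToCharArray() yields { '"', ',', '"' }: each char is its own delimiter.
        string[] parts = line2.Split("\",\"".ToCharArray());

        Console.WriteLine(parts.Length);      // 18
        Console.WriteLine(CountEmpty(parts)); // 14
    }
}
```

Line2 contains 17 delimiter characters (quotes and commas) and 4 letters, so the split produces 18 pieces, of which only the 4 letters are non-empty.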
 
http://weblogs.asp.net/justin_rogers/archive/2004/03/14/89545.aspx

An algorithm for splitting on a string-based delimiter.


--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

 
Senthil,
In addition to the other comments:

There are three Split functions in .NET:

Use Microsoft.VisualBasic.Strings.Split if you need to split a string based
on a specific word (string). It is the Split function from VB6.

Use System.String.Split if you need to split a string based on a collection
of specific characters. Each individual character is its own delimiter.

Use System.Text.RegularExpressions.Regex.Split to split based
on matching patterns.
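
One caveat with the pattern-based option: the delimiter is interpreted as a regular expression, so characters such as | or . have special meaning. A minimal sketch of one way to guard against that with Regex.Escape follows. The class and method names (RegexSplitDemo, SplitLiteral) and the sample input are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

static class RegexSplitDemo
{
    // Hypothetical helper: splits on a literal delimiter via Regex.Split.
    public static string[] SplitLiteral(string text, string delimiter)
    {
        // Regex.Escape neutralises characters that are special in a pattern.
        return Regex.Split(text, Regex.Escape(delimiter));
    }

    static void Main()
    {
        // "||" would be read as regex alternation if it were not escaped.
        string[] parts = SplitLiteral("a||b||c", "||");
        foreach (string part in parts)
        {
            Console.WriteLine(part);
        }
    }
}
```

The delimiter in this thread (quote, comma, quote) contains no special regex characters, so escaping makes no difference there; it only matters for other delimiters.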


By referencing the Microsoft.VisualBasic assembly in C# you can use the
Strings.Split function to split a string based on a word.

Something like:
string [] CSVColumns = Strings.Split(Line2, "\",\"", -1,
CompareMethod.Binary);

Hope this helps
Jay
 

I am trying to split a CSV file. I can't split it on the comma alone,
because a column value can itself contain a comma. So I replace every ","
with "","" and then split on ",". That is why I need the Replace call.
Now I understand what I missed. Thanks for your help. I will use the
method suggested in the previous message.

Thanks

Senthil
 
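// input holds one line of the CSV file.
int offset = 0;
int[] offsets = new int[input.Length];   // positions of the splitting commas

int quotes = 0;
for (int i = 0; i < input.Length; i++) {
    if (input[i] == '"') { quotes++; }               // track quote parity
    else if (input[i] == ',' && quotes % 2 == 0) {   // comma outside quotes
        offsets[offset++] = i;
    }
}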

The breaker above uses matching quotes to decide whether or not a comma
can be a splitting character. The first `offset` entries of offsets can then
be used to grab particular columns as needed, or to provide a full string[]
split.
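
As a rough sketch of the "full string[] split" mentioned above, the same quote-parity test can cut the fields directly instead of storing offsets first. The class and method names (QuoteAwareSplitDemo, SplitCsvLine) and the sample line are made up for illustration. It assumes double-quoted fields and does not unescape doubled quotes inside a field.

```csharp
using System;
using System.Collections.Generic;

static class QuoteAwareSplitDemo
{
    // Hypothetical helper: splits on commas that fall outside double quotes.
    public static string[] SplitCsvLine(string input)
    {
        List<string> fields = new List<string>();
        int start = 0;   // index where the current field begins
        int quotes = 0;  // number of quote characters seen so far

        for (int i = 0; i < input.Length; i++)
        {
            if (input[i] == '"') { quotes++; }
            else if (input[i] == ',' && quotes % 2 == 0)
            {
                // An even quote count means this comma is outside any quoted field.
                fields.Add(input.Substring(start, i - start));
                start = i + 1;
            }
        }

        // Whatever follows the last splitting comma is the final field.
        fields.Add(input.Substring(start));
        return fields.ToArray();
    }

    static void Main()
    {
        string[] fields = SplitCsvLine("\"A\",\"B, with comma\",\"C\"");
        foreach (string field in fields)
        {
            Console.WriteLine(field.Trim('"'));
        }
    }
}
```

Each returned field still carries its surrounding quotes, so the caller decides whether to trim them, as Main does here.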


--
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

 