regular expression for comma delimit

  • Thread starter: msnews.microsoft.com

hi all,

i'm trying to retrieve contents that are comma delimited like the following.

aaa, bbb, ccc, ddd

what would my regular expression match string look like?
also, what if there is whitespace (cr, lf, tab) between the fields? how would i ignore it?

aaa(cr,lf, spaces), bbb, ccc.........

thanks,
 
I think this may work for you. It returns as fields anything between commas
(including an empty field for ,,). Commas between quote pairs are no longer considered
delimiters but just part of the string until the next unquoted comma.
I could drive myself nuts trying to figure out a test matrix for this, but let me
know if it works the way you need. HTH

MyFile.csv Sample:
"Field 1", Field2, Fie ld3
Row1c1,"two,"\three,,four
....

In your ButtonX:
ParseCSV(@"c:\myfile.txt");

Add following static method to some class:

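// Requires: using System; using System.IO; using System.Text.RegularExpressions;
public static void ParseCSV(string file)
{
    using (StreamReader sr = new StreamReader(file))
    {
        string line = null;
        // Split on commas that are followed by an even number of quotes,
        // i.e. commas that are not inside a quoted field.
        string pattern = ",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))";
        Regex rx = new Regex(pattern);
        while ((line = sr.ReadLine()) != null)
        {
            string[] fields = rx.Split(line);
            for (int i = 0; i < fields.Length; i++)
            {
                // Print each field with its index.
                Console.WriteLine("{0} {1}", i, fields[i]);
            }
            Console.WriteLine();
        }
    }
}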
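
To check the split approach without a file on disk, here is a minimal sketch that applies the same pattern to in-memory strings and trims each field. The class name CsvSplitDemo and the method SplitLine are made up for illustration; only the pattern comes from the post above. Note that the surrounding quotes stay part of a quoted field.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CsvSplitDemo
{
    // Same pattern as in the post: a comma that is not inside a quoted section.
    private static readonly Regex Delimiter =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*(?![^\"]*\"))");

    // Hypothetical helper: split one line and trim whitespace around each field.
    public static string[] SplitLine(string line)
    {
        string[] fields = Delimiter.Split(line);
        for (int i = 0; i < fields.Length; i++)
        {
            fields[i] = fields[i].Trim();
        }
        return fields;
    }

    public static void Main()
    {
        // Plain comma-delimited data: four fields.
        foreach (string field in SplitLine("aaa, bbb, ccc, ddd"))
        {
            Console.WriteLine(field);
        }

        // The comma inside the quotes is not treated as a delimiter: two fields.
        foreach (string field in SplitLine("\"a,b\", c"))
        {
            Console.WriteLine(field);
        }
    }
}
```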
 
hi william,

to state it a little more clearly... this is what i'm doing..

<code>
----------------------------------------------------------
public class Fields
{
    [Match("(.*?),[\\s]+", IgnoreCase=true)]
    public string[] Field;
}

public class CsvScraper : HttpGetClientProtocol
{
    public CsvScraper()
    {
        this.Url = @"file://c:\mydata.txt";
    }

    [HttpMethod(typeof(TextReturnReader), typeof(UrlParameterWriter))]
    public Fields GetFields()
    {
        return ((Fields)(this.Invoke("GetFields", (this.Url + ""), new object[0])));
    }
}


<mydata.txt>
----------------------------------------------------------
aaa, bbb, ccc, ddd

<usage>
----------------------------------------------------------
CsvScraper scraper = new CsvScraper();
Fields flds = scraper.GetFields();

----------------------------------------------------------

in this case the flds.Field array returns only 3 elements, with "ddd" missing.

what would be the proper regular expression for this?

thanks,
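
The missing "ddd" is consistent with how the pattern itself behaves: (.*?),[\s]+ only matches text that is followed by a comma and at least one whitespace character, and the last field has no trailing comma. The sketch below runs the pattern through Regex.Matches directly rather than through MatchAttribute, so it assumes the attribute collects capture group 1 of each match; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchPatternCheck
{
    // Hypothetical helper: return capture group 1 of every match of the pattern.
    public static List<string> CaptureAll(string pattern, string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern, RegexOptions.IgnoreCase))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // The pattern from the Fields class, applied to the sample data.
        List<string> fields = CaptureAll("(.*?),[\\s]+", "aaa, bbb, ccc, ddd");
        Console.WriteLine(fields.Count);                  // prints 3
        Console.WriteLine(string.Join("|", fields));      // prints aaa|bbb|ccc
    }
}
```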
 
Hi maersa. Did you not read my post? I spent some time on it for you.
TMK, it does exactly what you want. Change as needed for your style. I get
four fields from your test data. The regex is shown in the method. Have
you tried it?

<mydata.txt>
----------------------------------------------------------
aaa, bbb, ccc, ddd

Returns 4 fields. Trim() to remove any spaces.
0 aaa
1 bbb
2 ccc
3 ddd
 
hi,

yes, i did try it, but notice that you are using the "split" method which is
a little different from matching patterns.

since i'm using the MatchAttribute, I'm restricted to a pattern if i'm
correct.

thanks,
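
If a match pattern really is required, one possible sketch is a pattern that captures each field and accepts either a comma or the end of the input as the terminator, so the last field is not dropped. This is an assumption-laden example, not a tested MatchAttribute setup: it uses Regex.Matches directly, the names FieldMatcher and CaptureFields are made up, it skips empty fields, and it does not handle quoted commas. The same pattern string could presumably be tried in the attribute, but that is untested here.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldMatcher
{
    // Group 1 is the field without surrounding whitespace (spaces, tabs, CR, LF).
    // The field must end at a comma or at the end of the input.
    private const string Pattern = "\\s*([^,\\s][^,]*?)\\s*(?:,|$)";

    // Hypothetical helper: collect capture group 1 of every match.
    public static List<string> CaptureFields(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            result.Add(m.Groups[1].Value);
        }
        return result;
    }

    public static void Main()
    {
        // Sample data from the thread: the last field is captured too.
        Console.WriteLine(string.Join("|", CaptureFields("aaa, bbb, ccc, ddd")));

        // Line breaks and tabs around the commas are ignored.
        Console.WriteLine(string.Join("|", CaptureFields("aaa\r\n, bbb,\tccc\r\n")));
    }
}
```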



William Stacey said:
Hi maersa. Did you not read my post? I spent some time on it for you.
TMK, it does exactly what you want. Change as needed for your style. I get
four fields from your test data. The regex is shown in the method. Have
you tried it?

<mydata.txt>
----------------------------------------------------------
aaa, bbb, ccc, ddd

Returns 4 fields. Trim() to remove any spaces.
0 aaa
1 bbb
2 ccc
3 ddd

--
William Stacey, MVP
http://mvp.support.microsoft.com

 
yes, i did try it, but notice that you are using the "split" method which is
a little different from matching patterns.

Do you have to match? If so, why? From what I understand, you just need to
split at commas while ignoring quoted commas - no? If that is not the case,
please explain further with exact sample data and exact needs as to what to
get and what not to get. TIA
 