Parsing

Michel Racicot

How can I easily parse the following line?

Hi There How "Are you?"

I need to obtain the following values:

1- Hi
2- There
3- How
4- Are you?

Can I use some string tokenisation for this? Can I use the String.Split
member function? How should I handle the quotation marks?

Thank you
 
What's the deal with "Are you?" being on one line? The rest you can split on
a blank, but I don't see your reason for "Are you?". Was that a typo?
 
The most direct way would be to use a regular expression (the Regex class in
the System.Text.RegularExpressions namespace).

If you use this pattern for a regular expression:
(?<=").+(?=")|[^\s"]+

You will get the following matches on your text:
Hi
There
How
Are you?

In plain English, the pattern reads as "one or more characters preceded by
a quotation mark and followed by a quotation mark, OR one or more of any
character except for a whitespace character or quotation mark." It's not
very intuitive, but it works.
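
For anyone who wants to try the pattern quickly, here is a minimal sketch. It is written in Python rather than .NET, which is an assumption made for brevity; the pattern only uses lookarounds and character classes, which both regex engines support. The names PATTERN, line and tokens are just illustrative.

```python
import re

# Pattern from the post above: text sitting between quotation marks,
# or a run of characters that are neither whitespace nor quotes.
PATTERN = r'(?<=").+(?=")|[^\s"]+'

line = 'Hi There How "Are you?"'
tokens = re.findall(PATTERN, line)

# Print the tokens in the same numbered form used in the question.
for number, token in enumerate(tokens, start=1):
    print(f"{number}- {token}")
```

On the sample line this yields the four values asked for in the question.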
 
It looks like he wants to keep anything surrounded by quotation marks as a
single value. Unfortunately, String.Split alone won't work for that, because
it would also split on the space inside the quotes.
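
A small sketch of the problem, using Python's str.split as an assumed stand-in for String.Split on a blank (the variable names are illustrative):

```python
line = 'Hi There How "Are you?"'

# Splitting on whitespace knows nothing about quotes, so the quoted
# phrase is cut at the space inside it and the quote marks are kept.
parts = line.split()
print(parts)
```

The result has five pieces instead of four, with the quoted phrase broken into `"Are` and `you?"`.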
 
Right, I assumed it was a typo (actually didn't see the quote :-) and he
wanted a simple split. Yes, that regex will work just fine if that's what he
wants to do.
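
One caveat that may be worth checking: because .+ is greedy, the pattern from the thread appears to handle only a single quoted phrase per line. The sketch below (Python again, as an assumption; ALT_PATTERN and split_quoted are hypothetical names, not something proposed in the thread) compares it with one possible alternative that matches each quoted phrase separately.

```python
import re

# Pattern quoted earlier in the thread.
THREAD_PATTERN = r'(?<=").+(?=")|[^\s"]+'
# Hypothetical alternative: one whole quoted phrase, or a run of non-whitespace.
ALT_PATTERN = r'"([^"]*)"|(\S+)'

def split_quoted(text):
    """Return tokens, treating each "quoted phrase" as a single token."""
    tokens = []
    for match in re.finditer(ALT_PATTERN, text):
        quoted, bare = match.groups()
        # Group 1 is set for a quoted phrase, group 2 for a bare word.
        tokens.append(quoted if quoted is not None else bare)
    return tokens

line = 'say "Are you?" and "fine thanks" ok'
greedy_result = re.findall(THREAD_PATTERN, line)
alt_result = split_quoted(line)

print(greedy_result)
print(alt_result)
```

With two quoted phrases, the thread's pattern merges everything between the first and last quotation mark into one match, while the alternative returns each phrase on its own. For the single-phrase line in the original question, both give the same four values.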
 