regex - should be simple

  • Thread starter: lit
I want to capture everything up to the _first_ semicolon, and
everything after. Can't get it to work. Tried

(.*);(.*)
(.*?);(.*)

and many variations.

Testing against the string "one;two;three"
should return a group "one" and subsequent group "two;three"

An expression that works would be much appreciated.

- Leo
 
You need to restrict what's matched by the first group, then things should
work. Like this:

([^;]*);(.*)


Oliver Sturm
 
Oliver Sturm said:
You need to restrict what's matched by the first group, then things should
work. Like this:

([^;]*);(.*)

(.*?);(.*)

already does this.

From the docs:

*? -- Specifies the first match that consumes as few repeats as possible
(equivalent to lazy *).
http://msdn.microsoft.com/library/d...l/cpconRegularExpressionsLanguageElements.asp


Here's an example of using this pattern. Note the named groups and
ExplicitCapture option. I find they make life much easier.

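using System;
using System.Text.RegularExpressions;

public class Program
{
    static void Main(string[] args)
    {
        // The lazy first group stops at the first semicolon;
        // the greedy second group takes the rest of the input.
        Regex pattern = new Regex("(?<first>.*?);(?<remainder>.*)",
            RegexOptions.ExplicitCapture);
        Match match = pattern.Match("one;two;three;");
        Console.WriteLine(match.Groups["first"].Value);     // one
        Console.WriteLine(match.Groups["remainder"].Value); // two;three;
    }
}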

David
 
David said:
(.*?);(.*)

already does this.

I know that. But the OP was stating it didn't work for him, so I thought
I'd just suggest an alternative - one that I actually like to use because
it specifies more clearly how the match works, thereby making it more
readable.

But you're right, that expression, which the OP also gave, should really
work with the sample string given.


Oliver Sturm
 
Thanks David, Oliver,
You are right, both solutions work. I must have made a mistake when I
tested (.*?);(.*) earlier.

regards,

leo
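
For anyone landing on this thread later, here is a minimal C# sketch that runs the three patterns discussed above against the sample string. The class name SplitDemo and the helper Split are made up for illustration. Note that the greedy (.*);(.*) does match, but it splits at the last semicolon ("one;two" and "three"), while (.*?);(.*) and ([^;]*);(.*) both give "one" and "two;three".

```csharp
using System;
using System.Text.RegularExpressions;

public static class SplitDemo
{
    // Hypothetical helper: returns groups 1 and 2, or null when nothing matches.
    public static string[] Split(string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        return m.Success ? new[] { m.Groups[1].Value, m.Groups[2].Value } : null;
    }

    public static void Main()
    {
        string input = "one;two;three";
        // Greedy, lazy, and negated-class versions of the first group.
        foreach (string p in new[] { "(.*);(.*)", "(.*?);(.*)", "([^;]*);(.*)" })
        {
            string[] g = Split(p, input);
            Console.WriteLine("{0} -> [{1}] [{2}]", p, g[0], g[1]);
        }
    }
}
```

The negated class has the side benefit that the first group can never run past a semicolon, whatever follows it in a longer pattern.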
 
I want to capture everything up to the _first_ semicolon, and
everything after.

I've got one, too. I'd like to capture VB string literals. For example,
with the input like this,

if mystring = "far ""out!""" then mystring = "This is ""just"" a ""test string"" for now."

The regex needs to return:

far ""out!""
This is ""just"" a ""test string"" for now.


I've stumped several people with this one, and tried several variants
myself, and still don't have an answer.

If you like what to me is a challenge...

Rob
 
How about:

"[^"]*(""[^"]*)+"

David
 
David Browne said:
How about:

"[^"]*(""[^"]*)+"

Oops, should be

"[^"]*(""[^"]*)*"

Note that this is the pattern itself, not the VB string literal for it,
which would be

dim p as string = """[^""]*(""""[^""]*)*"""

David
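
A quick way to see why the + had to become a *: with +, the pattern insists on at least one doubled quote inside the literal, so a plain literal with no embedded quotes is skipped. A small C# sketch, assuming nothing beyond the two patterns above (the class and constant names are made up):

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuantifierDemo
{
    // First attempt: requires at least one "" inside the literal.
    public const string WithPlus = "\"[^\"]*(\"\"[^\"]*)+\"";
    // Corrected version: zero or more "" sequences.
    public const string WithStar = "\"[^\"]*(\"\"[^\"]*)*\"";

    public static void Main()
    {
        string plain = "x = \"hello\"";   // a literal with no embedded quotes
        Console.WriteLine(Regex.IsMatch(plain, WithPlus)); // False
        Console.WriteLine(Regex.IsMatch(plain, WithStar)); // True
    }
}
```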
 
Interesting. I'm working on something else right now, or I'd have tested
it already, but how about this input string?

system.console.writeline("""Hi"" seems to be a ""short"" enough ""word.""")

Rob
 
Worked for me.

David
 
David said:
"[^"]*(""[^"]*)*"

I fancy this:

"(""|[^"])*"

because I find it simpler. A VB string consists of two things, "" and
characters that are not ". Right?


Oliver Sturm
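
Both patterns can be tried side by side with something like the C# sketch below. It wraps the interior of each pattern in a named group (body) so the result comes back without the outer quotes, which is what Rob asked for. The group name, the class name, and the helper Bodies are assumptions added for illustration; they are not part of either pattern as posted.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class VbLiteralDemo
{
    // David's pattern, with the interior captured as "body".
    public const string Unrolled = "\"(?<body>[^\"]*(?:\"\"[^\"]*)*)\"";
    // Oliver's pattern, same treatment.
    public const string Alternation = "\"(?<body>(?:\"\"|[^\"])*)\"";

    // Hypothetical helper: the text between the outer quotes of each literal found.
    public static List<string> Bodies(string pattern, string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
            result.Add(m.Groups["body"].Value);
        return result;
    }

    public static void Main()
    {
        string input = "if mystring = \"far \"\"out!\"\"\" then mystring = "
                     + "\"This is \"\"just\"\" a \"\"test string\"\" for now.\"";
        foreach (string pattern in new[] { Unrolled, Alternation })
            foreach (string body in Bodies(pattern, input))
                Console.WriteLine(body);
    }
}
```

On Rob's first sample, each pattern yields far ""out!"" and This is ""just"" a ""test string"" for now. Neither pattern knows anything about VB comments, so a stray quote inside a comment would still throw the matching off.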
 