Roger Frost
Hi all
I've been messing with this since early yesterday. I thought it might come
to me in my sleep, but no such luck.
Here is the basic problem: I need to split a given string into substrings
of its "token" and "non-token" parts.
For instance, the string "This is {blue}, this is {red}, and this is {green}."
should result in:
"This is "
"{blue}"
", this is "
"{red}"
", and this is "
"{green}"
"."
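For the simple case above, the desired output can be sketched with Regex.Split, which in .NET includes text matched by a capturing group in its result array. This is only a sketch under the assumption that tokens are never nested; the class and method names (TokenSplitSketch, SplitTokens) are hypothetical and not part of the original program.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

static class TokenSplitSketch
{
    // Hypothetical helper: splits input into "{token}" and literal parts, in order.
    // Assumes tokens are NOT nested, i.e. no "{" or "}" appears inside a token.
    public static string[] SplitTokens(string input)
    {
        // The capturing group makes Regex.Split keep each "{...}" delimiter
        // in the returned array instead of discarding it.
        return Regex.Split(input, @"(\{[^{}]*\})")
                    // Split yields empty strings when a token sits at the start
                    // or end of the input, or next to another token; drop them.
                    .Where(part => part.Length > 0)
                    .ToArray();
    }
}
```

Calling TokenSplitSketch.SplitTokens on the example string gives the seven parts listed above, in their original order, so joining them rebuilds the input.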
Now, I can do this in two parts, separating the tokens from the literals
(the output includes "}" and/or "{" on the literals, but I can deal with
this). What I can't seem to do is combine the two to get the above results,
which is what I need: it lets me rebuild the string in the correct order
easily with minimal code. Never mind that, though; the important part is that
I need to do this with regular expressions.
Here is a complete example program:
using System;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Press Enter To Start.");
            Console.ReadLine();

            string mystr = "{id}: The best {item:{category}} of all {items} in {country}{industry} of the world.";
            // Just some validation to make things simpler
            mystr = mystr.Replace("}{", "} {");

            string matchTokens = @"{(.+?)(}?)}";
            string matchLiterals =
                @"^([^{}]+?){|}([^{}]+?){|}([^{}]+?)$|^([^{}]+?)$";

            Regex findTokens = new Regex(matchTokens);
            Regex findLiterals = new Regex(matchLiterals);

            MatchCollection tokens = findTokens.Matches(mystr);
            MatchCollection literals = findLiterals.Matches(mystr);

            // Print every token match, then every literal match
            foreach (Match m in tokens)
            { Console.WriteLine(m.Value); }
            Console.WriteLine();

            foreach (Match m in literals)
            { Console.WriteLine(m.Value); }
            Console.WriteLine();

            Console.WriteLine("Press Enter To Exit.");
            Console.ReadLine();
        }
    }
}
I've tried the following pattern:
string matchTokens =
@"({(.+?)(}?)})|^([^{}]+?){|}([^{}]+?){|}([^{}]+?)$|^([^{}]+?)$";
It's just a combination of the two, but it outputs the same as the matchTokens
pattern in the example.
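A likely reason the combined pattern only returns tokens is that each literal alternative has to consume a "{" or "}" delimiter, and those characters have already been consumed by the token alternative. One possible way around this is to let the literal alternative match only runs of non-brace characters. The sketch below assumes tokens nest at most one level deep, as in {item:{category}}; the pattern and the names CombinedPatternSketch and Tokenize are hypothetical, not taken from the program above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class CombinedPatternSketch
{
    // First alternative: a "{...}" token that may contain one nested "{...}".
    // Second alternative: a run of characters containing no braces (a literal).
    // Neither alternative needs a delimiter owned by the other, so every
    // token and literal is returned as its own match, in input order.
    public const string Pattern = @"\{(?:[^{}]|\{[^{}]*\})*\}|[^{}]+";

    // Hypothetical helper: returns token and literal parts in input order.
    public static List<string> Tokenize(string input)
    {
        var parts = new List<string>();
        foreach (Match m in Regex.Matches(input, Pattern))
        {
            parts.Add(m.Value);
        }
        return parts;
    }
}
```

With this pattern the Replace("}{", "} {") step should not be needed, since adjacent tokens such as {country}{industry} come back as two separate matches. Unbalanced braces are skipped silently rather than reported, so input validation would still have to happen elsewhere.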
If any