My rules:
coming string always starts with '(' and ends with ')'.
and the 'a' and 'o' chars between id's C1,C2,C3.. are my operands.
after i can parse and get the divided strings i will do an bool
operation like, (C1aC2) string1:C1 operand:a string2:C2 ,do '&'
operation - > C1 & C2 -> 1 & 1 = 1 . This point is not so important.
Lets say that all the operands are 'a'. Mainly the last & last string
will be (C1aC2). because coming string could not be simple. Most
complex could be ((C1aC2)aC3)a(C6a(C4aC5)).
What i want is, as you see if we compare the paranthesis,i have to
find the most middle operand and the strings between it -> string1:
((C1aC2)aC3) operand:a string2
C6a(C4aC5))
then recursively, i will divide new strings until they form the
simplest one like,
((C1aC2)aC3)
string1
C1aC2) operand:a string2:C3
(C6a(C4aC5))
string1:C6 operand:a string2
C4aC5)
((C1aC2)aC3)
then;
(C1aC2)
string1:C1 operand:a string2:C2
(C4aC5)
string1:C4 operand:a string2:C5
this is last point of my parser, i only need to divide string phase by
phase. thanks for reading
Okay, so you do end up dividing them to the very end. And no spaces in
the input either, right?
I think that what you really want to get in the end here is a parse
tree, not just a bunch of strings (it will be easier to process it
later on). In your case, the only three kinds of nodes in the tree
would be "a", "o", and "C". Let's define them:
// Quick & dirty... in production code you'll probably want to make
those immutable, use properties over fields, and add constructors
abstract class Node { }
class ANode : Node { public Node Left; public Node Right; }
class ONode: Node { public Node Left; public Node Right; }
class CNode: Node { public int N; }
Now, your original example:
((C1aC2)aC3)o(C6a(C4aC5))
would be described with the following tree:
new ONode {
Left = new ANode {
Right = new ANode {
Left = new CNode { N = 1 },
Right = new CNode { N = 2 }
},
Left = new CNode { N = 3 }
},
Right = new ANode {
Left = new CNode { N = 6 },
Right = new ANode {
Left = new CNode { N = 4 },
Right = new CNode { N = 5 },
}
}
}
It probably easiest to get the tree with a simple hand-written
recursive descent parser operating directly on strings. Something
along these lines:
static Node Parse(string input)
{
// add a sentinel value for easy parsing.
input += '\0';
int index = 0;
var tree = ParseAOExpr(input, ref index);
// make sure input was fully parsed
if (input[index] != '\0')
{
throw new Exception("Expected end of input at " + index);
}
return tree;
}
// AOExpr is an expression of form "X", "XaY" or "XoY", where X
and Y are CExprs or groups.
static Node ParseAOExpr(string input, ref int index)
{
// parse X
var x = ParseCExprOrGroup(input, ref index);
// see if we have an operator there or not
char op = input[index];
switch (op)
{
case 'a':
case 'o':
{
// consume the operator
++index;
// parse Y;
var y = ParseCExprOrGroup(input, ref index);
if (op == 'a')
{
return new ANode { Left = x, Right = y };
}
else
{
return new ONode { Left = x, Right = y };
}
}
default:
return x;
}
}
// CExpr is an expression of form "Cddd", where ddd is a sequence
of digits of arbitrary length.
// Group is an expression of form "(X)", where X is an AExpr
static Node ParseCExprOrGroup(string input, ref int index)
{
switch (input[index])
{
case '(':
{
// consume opening brace
++index;
// recursively parse what follows the opening
brace
var x = ParseAOExpr(input, ref index);
// index should point at the closing brace now
if (input[index] != ')')
{
throw new Exception("Closing brace expected at
" + index);
}
// consume closing brace
++index;
return x;
}
case 'C':
{
// consume 'C'
++index;
// consume all digits following 'C'
string sn = "";
while (char.IsDigit(input[index]))
{
sn += input[index];
++index;
}
if (sn.Length == 0)
{
throw new Exception("At least one digit
expected at " + index);
}
return new CNode { N = int.Parse(sn) };
}
default:
throw new Exception("Expected 'C' or '(' at " +
index);
}
}
Again, this is quick & dirty - you'd probably want to use a distinct
exception type for error reporting, not hardcode the messages in, etc.