Regex on RTF files

  • Thread starter: Ganesh



I have an RTF file as follows:

{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} World \par}

I want a regex to get the value next to \expnd; in this case the value is
'1' ({\expnd1 Hello}). But the next '\expnd' is not valid RTF, so we should
not take that. I need only an \expnd<value> that appears within { }.

I did this in PHP, but couldn't do it in C#. Can someone help me?

Hello Ganesh,


Regex rx = new Regex(@"\expnd(\d+)", RegexOptions.None);


That should get you started. But I don't understand what you mean by:
- "but the next '\expnd' is not a valid RTF"
- "I need only within the { }"

So maybe you can provide some samples of what to match and what not to
match, and why, so that we can help you better.

Hi Jesse Houwing,

Thanks very much for your reply. Check my new string:

string str = @"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

In the RTF specification, the keyword \expnd always starts with
{\expnd<value> <data>}.

So in the string there are two \expnd keywords that start with {, and
another \expnd that does not start with {, which is not valid RTF. When I
run the regex I want to get only the valid \expnd, which is
{\expnd<value> <data>}, and the value next to \expnd. In this string there
are two values, 1 and 1. There can be multiple \expnd keywords in a string,
and I want to get all the values next to the \expnd keyword.

This is my problem. Your sample did work, but how do I get all the values?


Ganesh said:
In the RTF specification, the keyword \expnd always starts with
{\expnd<value> <data>}

Then why don't you just match against "{\expnd(\d+)" (i.e. include the
curly brace)?

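To illustrate the suggestion, here is a minimal sketch comparing a pattern without the brace against one that includes it. The sample string is hypothetical (the bare \expnd is given a digit so the difference shows up), and the patterns double the backslash and escape the brace so that the regex engine treats both as literal characters. It is written as a top-level program, which assumes C# 9 or later.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical sample: two braced \expnd groups and one bare \expnd2.
string sample = @"{\expnd1 Hello} {\expnd3 World} \expnd2 ganesh\par}";

// Without the brace, every \expnd<digits> matches, including the bare one.
string[] loose = Regex.Matches(sample, @"\\expnd(\d+)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

// With the escaped opening brace, only the braced groups match.
string[] braced = Regex.Matches(sample, @"\{\\expnd(\d+)")
    .Cast<Match>()
    .Select(m => m.Groups[1].Value)
    .ToArray();

Console.WriteLine(string.Join(",", loose));   // 1,3,2
Console.WriteLine(string.Join(",", braced));  // 1,3
```

Note that this only checks for a brace immediately before \expnd; it does not verify that the group is closed or properly nested, which would need a real RTF parser.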
Hello Ganesh,

After getting a Match, you can get the next match by calling NextMatch on
your match object:

Regex rx = new Regex(@"{\expnd(\d+)", RegexOptions.None);
Match m = rx.Match(rtfText);
while (m.Success)
{
    string value = m.Groups[1].Value;
    m = m.NextMatch();
}

Or you can use

MatchCollection mc = rx.Matches(rtfText);
foreach (Match m in mc)
{
    string value = m.Groups[1].Value;
}
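As a rough sketch of how the NextMatch loop can collect every value into a list: the helper name ExpndValues is hypothetical, and the pattern used here doubles the backslash and escapes the brace, which is my adjustment so that the regex engine treats them as literals. The input is the string posted earlier in the thread. It is written as a top-level program, which assumes C# 9 or later.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// The string from the thread, split in two only to keep the lines short.
string sample =
    @"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}" +
    @"\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

List<string> found = ExpndValues(sample);
Console.WriteLine(string.Join(",", found));  // 1,1

// Hypothetical helper: returns the digits after every "{\expnd" in the text.
static List<string> ExpndValues(string text)
{
    var rx = new Regex(@"\{\\expnd(\d+)");
    var values = new List<string>();

    // Match returns the first match; NextMatch continues after the previous one.
    for (Match m = rx.Match(text); m.Success; m = m.NextMatch())
    {
        values.Add(m.Groups[1].Value);
    }
    return values;
}
```

The bare "\expnd ganesh" is skipped both because it has no leading brace and because no digits follow it.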


Hi Jesse Houwing,

Thanks for the code again. Check my code:

-- Begin Code --

sRtfData = @"{\rtf1\ansi\ansicpg1252\deff0\deflang1033{\fonttbl{\f0\fnil\fcharset0Microsoft Sans Serif;}}\viewkind4\uc1\pard\f0\fs17 {\expnd1 Hello} {\expnd1 World} \expnd ganesh\par}";

Match m = rx.Match(sRtfData);

while (m.Success)
{
    string value = m.Groups[1].Value;
    m = m.NextMatch();
}

-- End Code --

m.Success is always false.



Hello Ganesh,

Where is your Regex definition?

Hello Ganesh,

Ahh, I think I found what was going wrong: even if you use a @"..." string,
you still need to escape the backslash in the pattern so that the regex
matches a literal \. Stupid me ;)

Regex rx = new Regex(@"{\\expnd(\d+)", RegexOptions.None);

That should work.
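For anyone wondering why the single backslash fails: the @ prefix only stops the C# compiler from interpreting the backslash, but the regex engine still does, and in .NET regex syntax \e stands for the escape character (U+001B). A small sketch of the difference follows; the variable names are mine, and it is written as a top-level program, which assumes C# 9 or later.

```csharp
using System;
using System.Text.RegularExpressions;

string rtf = @"{\expnd1 Hello}";

// Single backslash: the regex engine reads \e as the ESC character (U+001B),
// so the pattern looks for "{", ESC, "xpnd", digits and fails on real RTF.
bool single = Regex.IsMatch(rtf, @"{\expnd(\d+)");

// The same pattern does match text that really contains an ESC character.
bool esc = Regex.IsMatch("{\u001Bxpnd1", @"{\expnd(\d+)");

// Doubled backslash: the regex engine now sees an escaped, literal backslash.
bool doubled = Regex.IsMatch(rtf, @"{\\expnd(\d+)");

// Regex.Escape builds the escaped form of a literal prefix for you.
string escaped = Regex.Escape(@"{\expnd");

Console.WriteLine($"{single} {esc} {doubled} {escaped}");
// False True True \{\\expnd
```

Using Regex.Escape on the literal part of a pattern is one way to avoid this kind of mistake when the text to match contains backslashes or braces.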


Hi Jesse,

That worked. Thanks very much for your help. At last my problem is solved.

And thanks to everyone.

