Translating JavaScript function with Regex to CSharp

  • Thread starter: Jon Maz

Jon Maz

Hi All,

Am getting frustrated trying to port the following (pretty simple) function
to CSharp. The problem is that I'm lousy at Regular Expressions....

//from http://support.microsoft.com/default.aspx?scid=kb;EN-US;246800
function fxnParseIt()
{
    var sInputString = 'asp and database';

    sText = sInputString;
    sText = sText.replace(/"/g,"");
    if (sText.search(/(formsof|near|isabout)/i) == -1)
    {
        sText = sText.replace(/ (and not|and) /gi,'" $1 "');
        sText = sText.replace(/ (or not|or) /gi,'" $1 "');
        sText = '"' + sText + '"';
    }

    sInputString = sText;
}
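
For anyone checking a port against the original, here is a minimal sketch of the same logic rewritten to take the input as a parameter and return the result (the KB version works on a local variable and returns nothing). The name parseIt and the sample inputs are illustrative assumptions, not part of the KB article.

```javascript
// Hypothetical wrapper around the KB logic: same regexes, but parameterised.
function parseIt(sInputString) {
    // Strip any double quotes the user typed.
    var sText = sInputString.replace(/"/g, "");
    // Leave the text alone if it already uses full-text keywords.
    if (sText.search(/(formsof|near|isabout)/i) == -1) {
        // Close the phrase before each operator and reopen it after.
        sText = sText.replace(/ (and not|and) /gi, '" $1 "');
        sText = sText.replace(/ (or not|or) /gi, '" $1 "');
        // Quote the whole expression.
        sText = '"' + sText + '"';
    }
    return sText;
}

console.log(parseIt('asp and database'));            // "asp" and "database"
console.log(parseIt('asp and not database'));        // "asp" and not "database"
console.log(parseIt('asp or "sql server"'));         // "asp" or "sql server"
console.log(parseIt('formsof(inflectional, run)'));  // returned unchanged
```

A C# port should produce the same strings for the same inputs.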

If anyone can give me a hand, it would be much appreciated!

TIA,

JON
 
To save typing, we'll assume:
RegexOptions opts = RegexOptions.Compiled | RegexOptions.IgnoreCase;

The line:
sText = sText.replace(/"/g,"");
should be:
Regex Quotes = new Regex("\"", opts); // matches all quotes
sText = Quotes.Replace(sText, ""); // replace quotes with nothing (empty string)

The block starting with:
if (sText.search(/(formsof|near|isabout)/i) == -1)
should be
Regex foo = new Regex("(formsof|near|isabout)", opts);
Regex andnot = new Regex(" (and not|and) ", opts);
Regex ornot = new Regex(" (or not|or) ", opts);
if (!foo.IsMatch(sText)) {
    sText = andnot.Replace(sText, new MatchEvaluator(TheMatch));
    sText = ornot.Replace(sText, new MatchEvaluator(TheMatch));
    sText = "\"" + sText + "\""; // wraps the whole phrase in double quotes, as the JavaScript does
}

Elsewhere in the class that's doing this, you'll need:

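private string TheMatch(Match m) {
    // m is the operator with its surrounding spaces, e.g. " and ";
    // adding a quote on each side gives the same result as '" $1 "' in the JavaScript
    return "\"" + m.ToString() + "\"";
}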

For more information on .NET regular expressions, see:

http://msdn.microsoft.com/library/d...stemtextregularexpressionsregexclasstopic.asp

In general, though, you create a Regex object to hold the pattern, and then
you call methods on it to see if stuff matches the pattern or to perform
replacements. Match and replace operations that were one step in Perl or
JavaScript are thus two steps in C#. Sort of a bummer. So,
$mystr =~ m/whatever/;
is equivalent to
Regex R = new Regex("whatever");
R.IsMatch(mystr);

and
$mystr =~ s/whatever/somethingelse/g;
is equivalent to
Regex R = new Regex("whatever");
mystr = R.Replace(mystr, "somethingelse"); // returns a new string with every match replaced

On the whole, I prefer the terser Perl-style syntax, but what are you gonna
do. I guess if you really hate having to create a regex object and then use
it, you could always write your own small wrapper class around a string
(System.String is sealed, so you can't derive from it) and give it .IsMatch
and .Replace methods, thus hiding all the actual regexp messiness from the
caller. But that seems like more trouble than it's probably worth.
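
For comparison, the one-step JavaScript forms mentioned above look roughly like this. It is only a sketch; the variable names and sample string are placeholders of mine. Note that without the g flag only the first match is replaced.

```javascript
var mystr = 'whatever and whatever';

// Match test in one step (the counterpart of m/whatever/).
var matched = /whatever/.test(mystr);

// Replace in one step; the string itself is not modified, a new one is returned.
var firstOnly = mystr.replace(/whatever/, 'somethingelse');   // first match only
var all = mystr.replace(/whatever/g, 'somethingelse');        // every match

console.log(matched);    // true
console.log(firstOnly);  // somethingelse and whatever
console.log(all);        // somethingelse and somethingelse
```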
 
Hi Taruntius,

Fantastic, thanks for your help! Below is the (almost identical) code that
I am now using!

Cheers,

JON

---------------------------------------------------------

// requires: using System.Text.RegularExpressions;
public static string ParseFullTextIndexSearchTerm(string inputString)
{
    RegexOptions opts = RegexOptions.Compiled | RegexOptions.IgnoreCase;
    string outputString = inputString; // work on a copy of the input

    Regex Quotes = new Regex("\"", opts); // matches all quotes
    outputString = Quotes.Replace(outputString, ""); // replace quotes with nothing (empty string)

    Regex keyWords = new Regex("(formsof|near|isabout)", opts);
    Regex andnot = new Regex(" (and not|and) ", opts);
    Regex ornot = new Regex(" (or not|or) ", opts);
    if (!keyWords.IsMatch(outputString))
    {
        outputString = andnot.Replace(outputString, new MatchEvaluator(TheMatch));
        outputString = ornot.Replace(outputString, new MatchEvaluator(TheMatch));
        outputString = "\"" + outputString + "\"";
    }

    return outputString;
}

public static string TheMatch(Match m)
{
    // m includes the spaces around the operator, so this yields e.g. '" and "'
    return "\"" + m.ToString() + "\"";
}
 
Hi,

1.) I wouldn't recommend using RegexOptions.Compiled. It doesn't seem
that the OP is parsing strings thousands of characters long, so
RegexOptions.Compiled will almost certainly slow things down (it causes
a dynamic assembly to be generated, which is not what you want unless
you really know you need it).

2.) You don't need to create the Regex objects; there are static methods
in the Regex class that do it for you.

3.) There is absolutely no need to use the match evaluator; you can use
substitutions ($1, $2, etc.) in the replacement string.
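
The same distinction exists in the original JavaScript, which may help when mapping between the two: a replacement string containing $1 versus a callback, which is roughly the counterpart of a MatchEvaluator. A small sketch, with variable names of my own choosing:

```javascript
var s = 'asp and database';

// Substitution: $1 is replaced by the text captured by the first group.
var viaSubstitution = s.replace(/ (and not|and) /gi, '" $1 "');

// Callback: receives the whole match and then each captured group.
var viaCallback = s.replace(/ (and not|and) /gi, function (match, group1) {
    return '" ' + group1 + ' "';
});

console.log(viaSubstitution);                  // asp" and "database
console.log(viaSubstitution === viaCallback);  // true
```

Both forms give the same string here, so the callback adds nothing for this pattern.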

So I would suggest rewriting the function as: (not tested)

public static string Parse(string s)
{
    s = s.Replace("\"", ""); // no need for regex

    if (!Regex.Match(s, "(formsof|near|isabout)",
        RegexOptions.IgnoreCase).Success)
    {
        s = Regex.Replace(s, " (and not|and) ", "\" $1 \"",
            RegexOptions.IgnoreCase);
        s = Regex.Replace(s, " (or not|or) ", "\" $1 \"",
            RegexOptions.IgnoreCase);
        s = '"' + s + '"';
    }

    return s;
}

HTH,
Stefan
 