RegEx - Translate

  • Thread starter: Thomas

Thomas

Hi,

I want to categorize using one Regular Expression:

input: "Earth"
output: "Planet"

input: "Moon"
output: "Planet"

input: "Cat"
output: "Animal"

input: "Dog"
output: "Animal"

input: "Tomato"
output: "Vegetable"

input: "Carrot"
output: "Vegetable"

And if no match is found, the output should be an empty string:
input: "^%&%#"
output: ""

Something like this:
"if (Earth|Moon) then Planet ElseIf(Cat|Dog) then Animal... Else "" "

Any ideas?

- Thomas
 
Thomas said:
I want to categorize using one Regular Expression:

Is there any reason you particularly want to use a regular expression
here rather than something else which is more suitable, like a
Hashtable?
 
Thomas said:
Yes - I want the expression to be stored as a dynamic setting for the
application.

Why not store the mapping as a dynamic setting instead, and load that
in at runtime? Certainly if you're using XML for your configuration it
should be very easy to work out a mapping schema - and most other
configuration techniques make it quite easy too.
 
Yes - I want the expression to be stored as a dynamic setting for the
application.
Hi,

I am curious what that has to do with a regular expression. Can you tell us
how you plan to do this?

Cor
 
Please illustrate with an example - I'm very interested!

The setting must be stored as a single string, but it can be structured, e.g.
with XML.

This is the current setting I have come up with:

It categorizes company name endings into LLC, BUSINESS_CORP, etc. I had
to escape the < and > characters to store the pattern in an attribute.

<add key="Designations.Match"
value="(?&lt;LLC&gt;\bLimited\s+Company\b\W*$)|(?&lt;LLC&gt;\bLimited\s+Liability\s+Company\b\W*$)|(?&lt;LLC&gt;\bLC\b\W*$)|(?&lt;LLC&gt;\bLLC\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bCorporation\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bIncorporated\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bLimited\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bCorp\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bInc\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bLtd\b\W*$)|(?&lt;BUSINESS_CORP&gt;\bCorporation\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bProfessional\s+Corporation\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bProfessional\s+Corporation\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bProfessional\s+Corp\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bProf\s+Corp\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bPC\b\W*$)|(?&lt;PROFESSIONAL_CORP&gt;\bPC\b\W*$)|(?&lt;LLP&gt;\bRegistered\s+Limited\s+Liability\s+Partnership\b\W*$)|(?&lt;LLP&gt;\bLLP\b\W*$)|(?&lt;LLP&gt;\bLimited\s+Liability\s+Partnership\b\W*$)|(?&lt;LLP&gt;\bLimited\s+LP\b\W*$)|(?&lt;LLP&gt;\bLimited\s+LP\b\W*$)|(?&lt;LLLP&gt;\bRegistered\s+Limited\s+Partnership\b\W*$)|(?&lt;LLLP&gt;\bLimited\s+Partnership\b\W*$)|(?&lt;LLLP&gt;\bLP\b\W*$)|(?&lt;LLLP&gt;\bRegistered\s+Limited\s+Liability\s+Limited\s+Partnership\b\W*$)|(?&lt;LLLP&gt;\bLimited\s+Liability\s+Limited\s+Partnership\b\W*$)|(?&lt;LLLP&gt;\bLimited\s+LLP\b\W*$)|(?&lt;LLLP&gt;\bLLLP\b\W*$)" />
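
For anyone following along, here is a rough sketch of how a pattern built from named groups like this could be turned into a category name in C#. The class name Categorizer, the method Categorize and the short sample pattern are made up for illustration and are not taken from the actual application; the idea is only that the name of the group that matched serves as the category.

```csharp
using System;
using System.Text.RegularExpressions;

public static class Categorizer
{
    // Hypothetical helper: returns the name of the first named group that
    // took part in the match, or "" when the pattern does not match at all.
    public static string Categorize(Regex regex, string input)
    {
        Match match = regex.Match(input);
        if (!match.Success)
        {
            return "";
        }

        foreach (string name in regex.GetGroupNames())
        {
            // Skip numbered groups such as "0" (the whole match).
            if (int.TryParse(name, out _))
            {
                continue;
            }
            if (match.Groups[name].Success)
            {
                return name;
            }
        }
        return "";
    }

    public static void Main()
    {
        // Shortened sample pattern based on the first post; in the real
        // application the pattern would be read from the configuration.
        var regex = new Regex(
            @"^(?<Planet>Earth|Moon)$|^(?<Animal>Cat|Dog)$|^(?<Vegetable>Tomato|Carrot)$");

        Console.WriteLine(Categorize(regex, "Moon"));   // prints "Planet"
        Console.WriteLine(Categorize(regex, "^%&%#"));  // prints an empty line
    }
}
```

.NET allows the same group name to appear in several alternatives, which is what the long setting above relies on, so the same lookup should work for it unchanged.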
 
Thomas said:
Please illustrate with an example - I'm very interested!

The setting must be stored as a single string, but it can be structured, e.g.
with XML.

Okay, here's an example XML format you could use:

<map>
  <mapping from="Something" to="Another" />
  <mapping from="Earth" to="Planet" />
  <mapping from="Moon" to="Planet" />
</map>

You'd then convert the XML into a hashtable in any of various ways, e.g.
by loading it into an XmlDocument and using XPath to find all the mapping
nodes.

Your conversion would then just be:

// "map" is the Hashtable built from the <mapping> elements.
public string Convert (string input)
{
    if (map.ContainsKey(input))
    {
        return (string)map[input];
    }
    // This would depend on exactly what you wanted to do with
    // non-matching input
    return input;
}
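
Putting the loading step and the lookup together, a minimal sketch might look like the following. The class name MappingTable is invented for this example, the XML is passed in as a plain string (as it would be if it came from a single configuration setting), and non-matching input is mapped to an empty string here because that is what the original question asked for; treat all of that as assumptions rather than the only way to do it.

```csharp
using System;
using System.Collections;
using System.Xml;

public class MappingTable
{
    private readonly Hashtable map = new Hashtable();

    // Builds the lookup table from XML of the form
    // <map><mapping from="..." to="..." /></map>.
    public MappingTable(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath query that selects every mapping element under the root.
        foreach (XmlNode node in doc.SelectNodes("/map/mapping"))
        {
            string from = node.Attributes["from"].Value;
            string to = node.Attributes["to"].Value;
            map[from] = to;
        }
    }

    public string Convert(string input)
    {
        if (map.ContainsKey(input))
        {
            return (string)map[input];
        }
        // Assumption: unknown input maps to an empty string.
        return "";
    }

    public static void Main()
    {
        string xml =
            "<map>" +
            "<mapping from=\"Earth\" to=\"Planet\" />" +
            "<mapping from=\"Moon\" to=\"Planet\" />" +
            "<mapping from=\"Cat\" to=\"Animal\" />" +
            "</map>";

        MappingTable table = new MappingTable(xml);
        Console.WriteLine(table.Convert("Earth"));  // prints "Planet"
        Console.WriteLine(table.Convert("^%&%#"));  // prints an empty line
    }
}
```

Note that the Hashtable lookup is an exact, case-sensitive match on the whole input, so unlike a regular expression it will not pick a category out of a longer string.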
 