Regular Expressions


Guest

I need to take a short piece of HTML/XML as a string, for example:

<tagName attribute1='value' attribute2=value/>

Note that one attribute value is quoted and one is not.

The input string could have any number of attributes with any combination of
quoted and unquoted values.

The output string needs all values quoted.

Can I do this with regular expressions or can you offer another suggestion?
 
Sure, you can do that with regular expressions. For example, try replacing
every match of this expression

=(?<val>[A-Za-z0-9_-]+)

with

="${val}"

I say "for example" because you'll need to decide which characters to
allow in an unquoted attribute value. The example expression accepts
upper and lower case letters, digits, the underscore and the dash.
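
As a rough sketch of the replacement above, here it is in Python. The language is an assumption: the thread does not name one, and the (?<val>...) / ${val} syntax in the post is the .NET form, which Python spells (?P<val>...) and \g<val>. The function name quote_unquoted_values is made up for illustration.

```python
import re

# Same character class as the expression in the post: letters, digits,
# underscore and dash. Only the named-group spelling differs from .NET.
UNQUOTED_VALUE = re.compile(r'=(?P<val>[A-Za-z0-9_-]+)')


def quote_unquoted_values(tag: str) -> str:
    """Wrap every unquoted attribute value in double quotes."""
    # A value that already starts with a quote is skipped, because a quote
    # character is not in the class and so cannot follow the '=' in a match.
    return UNQUOTED_VALUE.sub(r'="\g<val>"', tag)


print(quote_unquoted_values("<tagName attribute1='value' attribute2=value/>"))
# prints: <tagName attribute1='value' attribute2="value"/>
```

One limitation of this simple pattern: an "=" that sits inside an already quoted value (for instance a query string such as 'x?id=5') would also be matched, so it suits short, well-behaved tags like the one in the question.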



Oliver Sturm
 
Here's one we worked out for getting HTML form field tag attributes. It also
checks for the two valueless form field attributes "selected" and "checked".
It works with or without the trailing slash character. It puts the name of
the attribute into group 1, the value into group 2, and any "selected" or
"checked" into group 3. You can remove the condition for "selected" or
"checked" if you like.

--
HTH,

Kevin Spencer
Microsoft MVP
.Net Developer
Ambiguity has a certain quality to it.
 
Whoa. Just realized I didn't post the regular expression!

(?i)\s+(?:(\w+)=(?:["']?([^"'/>=]*)["']?)(?<!\s*(?:selected|checked))(?=\s|/?>)|(?:\s*(selected|checked))(?=\s|/?>))
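
To see the three groups in action, here is a hedged Python sketch (Python is an assumption; the expression itself is .NET-flavoured). Python's re module rejects variable-width lookbehind, so (?<!\s*(?:selected|checked)) is rewritten as the two fixed-width checks (?<!selected)(?<!checked), which should behave the same because the \s* prefix may match nothing. The name parse_attributes and the sample tags are made up for illustration.

```python
import re

# Adaptation of the expression above for Python's re module.
# Group 1 = attribute name, group 2 = value, group 3 = "selected"/"checked".
ATTRIBUTE = re.compile(
    r"""\s+(?:(\w+)=(?:["']?([^"'/>=]*)["']?)"""
    r"""(?<!selected)(?<!checked)(?=\s|/?>)"""
    r"""|(?:\s*(selected|checked))(?=\s|/?>))""",
    re.IGNORECASE,  # stands in for the leading (?i)
)


def parse_attributes(tag: str) -> list:
    """Return one (name, value, flag) tuple per attribute found in the tag."""
    return [m.groups() for m in ATTRIBUTE.finditer(tag)]


for name, value, flag in parse_attributes(
    "<input type=\"checkbox\" name=agree value='yes' checked />"
):
    # Valueless attributes arrive in group 3 with name and value set to None.
    print(flag if flag else f'{name}="{value}"')
# prints:
# type="checkbox"
# name="agree"
# value="yes"
# checked
```

Rebuilding the tag from these tuples, with every value wrapped in quotes, is one way to get the all-quoted output asked for at the top of the thread.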

--
Kevin Spencer
 
I'll give this a try. Thank you

--
Bill
