Roberto Cavalieri
Hi everybody, I have a small problem with a simple regex. I need to extract
the href attribute value from an HTML tag.
It should be very simple; a quick Google search turns up:
.NET Framework Developer's Guide
Example: Scanning for HREFs
http://msdn.microsoft.com/en-us/library/t9e807fx.aspx
OK, let's look at the example:
private static void DumpHRefs(string inputString)
{
    Match m;
    string HRefPattern = "href\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))";
    m = Regex.Match(inputString, HRefPattern,
                    RegexOptions.IgnoreCase | RegexOptions.Compiled);
    while (m.Success)
    {
        Console.WriteLine("Found href " + m.Groups[1] + " at "
                          + m.Groups[1].Index);
        m = m.NextMatch();
    }
}
Well, this is my code:
System.Text.RegularExpressions.Regex Regex;
System.Text.RegularExpressions.Match Match;
string ToCheck = "<a href='/Jobs/796/Software-Developer-M-F.aspx'>Software Developer (M/F)</a>";
string Pattern = "{0}\\s*=\\s*(?:\"(?<1>[^\"]*)\"|(?<1>\\S+))";
Regex = new System.Text.RegularExpressions.Regex(
    string.Format(Pattern, "href"),
    System.Text.RegularExpressions.RegexOptions.IgnoreCase |
    System.Text.RegularExpressions.RegexOptions.Compiled);
for (Match = Regex.Match(ToCheck); Match.Success; Match = Match.NextMatch())
{
    string MatchValue = Match.Groups[1].Value;
}
A match is found, but the value is
*****Jobs\796\Software-Developer-M-F.aspx'>Software***** ?????
Can someone explain what is wrong with my code?
Thank you in advance, and good work to all.
See you soon
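
For anyone who wants to reproduce this, here is a minimal sketch of what seems to be going on. The pattern above only recognizes double-quoted values, while the test string uses single quotes, so the fallback alternative \S+ takes over and captures everything up to the next whitespace, including the quotes and ">Software". Assuming single-quoted values should be accepted too, one possible variant adds a third alternative for them and stops the unquoted case at ">". The class name HrefDemo and the helper ExtractAttribute are hypothetical names used only for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // Hypothetical helper: returns the first value of the given attribute,
    // or null when the attribute is not found.
    public static string ExtractAttribute(string html, string attribute)
    {
        // Three alternatives, all writing to group 1:
        //   "..."   double-quoted value
        //   '...'   single-quoted value (missing from the original pattern)
        //   bare    unquoted value, ending at whitespace or '>'
        string pattern = Regex.Escape(attribute)
            + "\\s*=\\s*(?:\"(?<1>[^\"]*)\"|'(?<1>[^']*)'|(?<1>[^\\s>]+))";

        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static void Main()
    {
        string toCheck = "<a href='/Jobs/796/Software-Developer-M-F.aspx'>Software Developer (M/F)</a>";
        // Prints the path without the surrounding quotes.
        Console.WriteLine(ExtractAttribute(toCheck, "href"));
    }
}
```

This is still only a regex approximation of HTML attribute syntax; for arbitrary pages an HTML parser would be more robust.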