Hi,
Thanks for your reply.
I did as you suggested:
System.IO.StreamReader rdr = new System.IO.StreamReader("c:\\test.html");
string inputString = "";
inputString = rdr.ReadToEnd();
Regex regex = new Regex(
@"(?<=href="").*?(?="")",
RegexOptions.IgnoreCase
| RegexOptions.Multiline
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
MatchCollection col = regex.Matches(inputString);
foreach (Match match in col)
{
Console.WriteLine("href = " + match.Groups["href"].Value);
}
Console.ReadLine();
But the only output I get is href = "" for each match.
I want to get the value inside href.
Am I doing something wrong?
Please advise.
Thanks,
Hemant
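A possible explanation, offered as a sketch rather than a definitive fix: the pattern above uses only lookarounds and defines no group named "href", so match.Groups["href"].Value is an empty string even though the match itself succeeds. Reading match.Value with the original pattern should give the URL. Alternatively, the pattern can name the group, as below. The class name HrefDemo, the helper ExtractHrefs, and the sample HTML are made up for illustration, and the pattern only handles double-quoted href values.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class HrefDemo
{
    // Hypothetical helper: returns every double-quoted href value found in html.
    public static List<string> ExtractHrefs(string html)
    {
        // (?<href>...) defines the named group that Groups["href"] reads below.
        Regex regex = new Regex(@"href=""(?<href>.*?)""", RegexOptions.IgnoreCase);
        List<string> urls = new List<string>();
        foreach (Match match in regex.Matches(html))
        {
            urls.Add(match.Groups["href"].Value);
        }
        return urls;
    }

    static void Main()
    {
        // Made-up sample input standing in for the contents of c:\test.html.
        string html = "<a href=\"http://www.site.com/page.htm\">Page</a> <a href=\"/about.htm\">About</a>";
        foreach (string url in ExtractHrefs(html))
        {
            Console.WriteLine("href = " + url);
        }
    }
}
```

With the sample input this prints "href = http://www.site.com/page.htm" and then "href = /about.htm".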
Hi,
I want to find every href in the anchor tags of an HTML file.
I have read the file into a string, but I don't know how to get the URL from the href of
each anchor tag.
I need to get all the URLs from the anchor tags.
Thanks,
Hemant
Use regular expressions.
If your link is
<a name="label">Any content</a>
and you need to get "label", use
using System.Text.RegularExpressions;
Regex regex = new Regex(
@"(?<=name="").*?(?="")",
RegexOptions.IgnoreCase
| RegexOptions.Multiline
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
If you need to get a named anchor from a URL like
<a href="http://www.site.com/page.htm#tips">Jump to Tips</a>
then use the following:
Regex regex = new Regex(
@"[#].*?(?="")",
RegexOptions.IgnoreCase
| RegexOptions.Multiline
| RegexOptions.IgnorePatternWhitespace
| RegexOptions.Compiled
);
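As a rough usage sketch for the second pattern: the match value includes the leading "#", so it can be trimmed if only the anchor name is wanted. The class name FragmentDemo and the helper FirstFragment are hypothetical. Note that the pattern matches any "#" that is later followed by a double quote, so a "#" elsewhere in the page can also match.

```csharp
using System;
using System.Text.RegularExpressions;

static class FragmentDemo
{
    // Hypothetical helper: returns the first "#fragment" that precedes a
    // double quote, or an empty string if there is none.
    public static string FirstFragment(string html)
    {
        // [#] keeps the '#' literal; a bare '#' would start a comment if
        // RegexOptions.IgnorePatternWhitespace were enabled.
        Regex regex = new Regex(@"[#].*?(?="")", RegexOptions.IgnoreCase);
        Match match = regex.Match(html);
        return match.Success ? match.Value : "";
    }

    static void Main()
    {
        string html = "<a href=\"http://www.site.com/page.htm#tips\">Jump to Tips</a>";
        Console.WriteLine(FirstFragment(html));                 // prints "#tips"
        Console.WriteLine(FirstFragment(html).TrimStart('#'));  // prints "tips"
    }
}
```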