Getting Web Page info

  • Thread starter: Kris Rockwell

Kris Rockwell

Hello,

I have put together a browser using the AxWebBrowser control. I would like
to access elements of the displayed web page and pass that information back
to the application. For example, I would like to read the page title
(<title>Page Title</title>) and show it in a label on the form.
What would be the best method for doing that? Is it even possible?

I may have posted this question before, but I haven't been able to find it
if I have. If so, I am sorry for the repost.

Regards,
Kris
 
Hi Kris,


I used this in an application of mine. Place something like the following
code in the DocumentComplete event handler of the web browser control. It
assumes that you already know which part of the web page you want. I have
seen other solutions as well.

Best wishes

Paul Bromley

' Requires at the top of the file: Imports System.Text.RegularExpressions
Dim HTMLDoc As mshtml.HTMLDocument
Dim elementColl As mshtml.IHTMLElementCollection
Dim element As mshtml.IHTMLElement
Dim sOriginal As String = ""
Dim sName As String
Dim StartPosition As Integer
Dim EndPosition As Integer

HTMLDoc = AxWebBrowser1.Document
elementColl = HTMLDoc.all.tags("TBODY")
' Each pass overwrites sOriginal, so only the last TBODY is kept
For Each element In elementColl
    sOriginal = element.innerHTML
    'ListBox1.Items.Add(sOriginal)
Next element

' Look for the table cell labelled "Name" followed by the cell with its value
Dim mName As Match = Regex.Match(sOriginal, _
    "<TD vAlign=top \?TOP\?><B>Name</B></TD>\s+<TD vAlign=top>.+</TD></TD>", _
    RegexOptions.IgnoreCase)
If (mName.Success) Then 'Output match information if a match was found
    sName = mName.Value
    ' The value sits between the first "=top>" and the closing "</TD></TD>"
    StartPosition = sName.IndexOf("=top>", 0)
    EndPosition = sName.IndexOf("</TD></TD>", 0)
    StartPosition = StartPosition + 5 ' skip past "=top>" itself
    '61 - Start of Address. 71 = Start of Address +10 (Chars to end of text)
    sName = sName.Substring(StartPosition, (EndPosition - StartPosition))
End If
 
Paul,

When I use this code in my application, mshtml.HTMLDocument (and the other
mshtml types) are reported as "undeclared". Is there a particular reference
I have to add?

Regards,
Kris
 
Hi Kris,

For the mshtml types you have to set a reference to mshtml in the usual way:
Project -> Add Reference -> .NET -> Microsoft.mshtml

Notice that Paul only reads the innerHTML of the element and then switches
to a regex. However, you can get almost every part of the document directly,
because it is exposed through the Document Object Model (DOM).
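
For the title from the original question, a rough sketch of the DOM route
could look like the code below. It is not tested against your project.
AxWebBrowser1 and Label1 are assumed control names, and the handler
signature is the one the designer normally generates for the ActiveX
wrapper, so check it against your own form. It needs the Microsoft.mshtml
reference.

```vbnet
' Sketch only: AxWebBrowser1 and Label1 are assumed names of controls on the form.
Private Sub AxWebBrowser1_DocumentComplete(ByVal sender As Object, _
        ByVal e As AxSHDocVw.DWebBrowserEvents2_DocumentCompleteEvent) _
        Handles AxWebBrowser1.DocumentComplete
    ' Document is typed as Object; it is only an HTML document
    ' when the browser is showing an HTML page.
    Dim raw As Object = AxWebBrowser1.Document
    If TypeOf raw Is mshtml.IHTMLDocument2 Then
        Dim doc As mshtml.IHTMLDocument2 = CType(raw, mshtml.IHTMLDocument2)
        ' title is the text between <title> and </title>
        Label1.Text = doc.title
    End If
End Sub
```

The TypeOf check is there so that nothing happens when the control is
showing something other than an HTML page.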

I hope this helps?

Cor
 