Read webpage with VB app

KC

Is there a simple way to read (and parse) a webpage with a VB.NET
application? Right now I can sort of read a page, but it's a framed page and
the server thinks the app is a browser that can't handle frames, so I get a
message to that effect in the response.

Ken
 
KC said:
Is there a simple way to read (and parse) a webpage with a VB.NET
application? Right now I can sort of read a page, but it's a framed page and
the server thinks the app is a browser that can't handle frames, so I get a
message to that effect in the response.

You can load each frame's HTML page. This will require you to know the
filename or parse the frameset. Alternatively, you can place a
WebBrowser control on your form and use this code:

\\\
Me.WebBrowser1.Navigate("http://www.over-the-moon.org/dollz")
///

In the WebBrowser's 'DocumentComplete' event handler:

\\\
Dim i As Integer
' Show the text of every frame in the loaded document (late-bound calls).
For i = 0 To Me.WebBrowser1.Document.frames.length - 1
    MsgBox(Me.WebBrowser1.Document.frames(i).Document.documentElement.innerText)
Next i
///
 
Hi KC,

In addition to Herfried's answer, here is a little example.

Open a new Windows Application project.

In the toolbox, right-click and select Add/Remove Items.

In the Customize Toolbox dialog, select COM Components and, in that list, Microsoft Web Browser.

When that is in the toolbox, drag it to your form.
Also drag a button to your form.

Then add this code and you have a mini web browser.

Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click
    Me.AxWebBrowser1.Navigate2("www.google.com")
End Sub

webbrowser
http://support.microsoft.com/?kbid=311303

mshtml
http://msdn.microsoft.com/library/default.asp?url=/workshop/browser/hosting/hosting.asp

I hope this helps a little bit.

Cor
 
KC said:
Is there a simple way to read (and parse) a webpage with a VB.NET
application? Right now I can sort of read a page, but it's a framed page
and the server thinks the app is a browser that can't handle frames, so
I get a message to that effect in the response.

It sounds like you just want to transfer it, and not display it? If so, the
WebBrowser control is overkill and in fact can interfere. You should look at
just using straight HTTP.
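
As a rough illustration of "straight HTTP", here is a minimal sketch that downloads a page's HTML with System.Net.WebClient and no browser control. The module name FetchPage, the function name DownloadHtml, and the User-Agent value are placeholders chosen for this example, not anything from the thread. WebClient.DownloadString needs .NET 2.0 or later, and newer .NET versions flag WebClient as obsolete in favor of HttpClient. Note that for a framed site the HTML returned is probably just the frameset page itself, with its noframes fallback text.

```vbnet
Imports System.Net

Module FetchPage
    ' Downloads the raw HTML of a page without creating any browser control.
    Function DownloadHtml(ByVal url As String) As String
        Using client As New WebClient()
            ' Assumption: some servers vary their response by User-Agent.
            ' This header value is only a placeholder for illustration.
            client.Headers.Add("User-Agent", "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)")
            Return client.DownloadString(url)
        End Using
    End Function
End Module
```

Calling FetchPage.DownloadHtml with the address of the frameset page returns its markup as a string, which can then be searched for the frame addresses.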


--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
"Programming is an art form that fights back"

Empower ASP.NET with IntraWeb
http://www.atozed.com/IntraWeb/
 
Hi Kudzu,
It sounds like you just want to transfer it, and not display it? If so, the
WebBrowser control is overkill and in fact can interfere. You should look at
just using straight HTTP.
Although I do not disagree, are we not talking about ants on the Titanic? ;)

Cor
 
Chad Z. Hower aka Kudzu said:
It sounds like you just want to transfer it, and not display it? If so, the
WebBrowser control is overkill and in fact can interfere. You should look at
just using straight HTTP.

I agree, but then you will have to parse the frameset to get the URLs of
the pages displayed in the frames.
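
Parsing the frameset for those URLs could look roughly like the sketch below. It uses a regular expression rather than MSHTML, which is a simplification swapped in for this example; it only handles quoted src attributes and is not a real HTML parser. The names FrameParser and ExtractFrameUrls are hypothetical, and the code assumes .NET 2.0 or later for List(Of String).

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FrameParser
    ' Returns the absolute URLs referenced by <frame> and <iframe> tags.
    Function ExtractFrameUrls(ByVal html As String, ByVal baseUrl As String) As List(Of String)
        Dim urls As New List(Of String)()
        Dim baseUri As New Uri(baseUrl)
        ' Matches src="..." or src='...' inside a <frame ...> or <iframe ...> tag.
        ' The \b after "frame" keeps <frameset ...> itself from matching.
        Dim pattern As String = "<i?frame\b[^>]*?\bsrc\s*=\s*[""']([^""']*)[""']"
        For Each m As Match In Regex.Matches(html, pattern, RegexOptions.IgnoreCase)
            ' Resolve relative addresses against the frameset page's own URL.
            urls.Add(New Uri(baseUri, m.Groups(1).Value).AbsoluteUri)
        Next
        Return urls
    End Function
End Module
```

Each URL returned can then be fetched separately over HTTP, which is the "load each frame's HTML page" route mentioned earlier in the thread.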
 
Cor Ligthert said:
Although I do not disagree, are we not talking about ants on the Titanic? ;)

Not in this case. WebBrowser doesn't just do HTTP or simple parsing. It loads
TONS of stuff for rendering and so on just to do this. We are not talking about
ants on the Titanic, but rather ants VERSUS the Titanic. :)


--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
 
Hi Kudzu,
Yes, but that's quite a trivial task.
It should be very easy when you use the logic from MSHTML.

Something as simple as, let's say (I did not test it, but it is not much
different):

If tagname = "iframe" Then
    ' Cast the element to the frame interface to read its src attribute.
    Url = DirectCast(DirectCast(iDocument.all.item(i), mshtml.IHTMLElement), _
        mshtml.IHTMLFrameBase).src.ToString()
End If

:-))

Cor
 
Herfried K. Wagner [MVP] said:
Mhm... but why implement it if it's already available through MSHTML?

Because unless you are implementing just one instance in a client app,
dragging in WebBrowser will use significant resources. It's like loading up a
747 to go to the corner store for a loaf of bread.


--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
 
Hi Kudzu,

It becomes smaller, first the Titanic and now it is a 747.

:-)

I agree with you when the OP wants to integrate it in his program, but
when he wants a basic approach he can use the WebBrowser because it is so
easy to navigate with that.

Do not mix up MSHTML with the WebBrowser; in my opinion those are different
things.
MSDN is not really clear about MSHTML.

You can load a doc in MSHTML and then process it using the DOM.
For that, see the sample I showed in the other thread in this message.

Cor
 
Chad Z. Hower aka Kudzu said:
Because unless you are implementing just one instance in a client app,
dragging in WebBrowser will use significant resources. It's like loading up a
747 to go to the corner store for a loaf of bread.

Full ACK. Depends on if you are lazy or not...
 
Cor Ligthert said:
It becomes smaller, first the Titanic and now it is a 747.

Different examples. :)
I agree with you when the OP wants to integrate it in his program, but
when he wants a basic approach he can use the WebBrowser because it is so
easy to navigate with that.

If he doesn't care about weight, or for that matter dependencies. In a client
app he's probably OK.
Do not mix up MSHTML with the WebBrowser; in my opinion those are different
things.
MSDN is not really clear about MSHTML.

Yes, but they are quite interdependent, and he spoke of using WebBrowser to do
the fetching.
You can load a doc in MSHTML and then process it using the DOM.

But he will get it using WebBrowser. :)



--
Chad Z. Hower (a.k.a. Kudzu) - http://www.hower.org/Kudzu/
 