Web in Windows?

  • Thread starter: Mike Smith

Mike Smith

Hi, I was wondering whether there is an HTML object in .NET.
I need to access a particular web page from my Windows application and be
able to parse it. How is this done? Is there an Inet control or something
like that, as in VB6?
 
Hi Mike,

For that you need to open your Toolbox and add the Microsoft Web Browser
control from the COM components.

Then you can drag it onto your form.

I hope this helps,

Cor
 
Hello Mike

There's a COM component called 'Microsoft Web Browser'.
I've never used it, so I don't know how it works.

You can also use the WebClient class to download the HTML
file that you wish to parse.
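
A minimal sketch of the WebClient route, assuming .NET 2.0 or later (for
DownloadString and the Using statement). The module name, the helper name
DownloadHtml, and the URL are placeholders, not anything from the original
question. Note that this returns the HTML exactly as the server sends it; no
page scripts are run.

```vbnet
Imports System
Imports System.Net

Module DownloadExample
    ' Hypothetical helper: returns the raw HTML of a page as a string.
    ' No scripts are executed; this is the markup as served.
    Function DownloadHtml(ByVal url As String) As String
        Using client As New WebClient()
            Return client.DownloadString(url)
        End Using
    End Function

    Sub Main()
        ' Placeholder URL; replace with the page you want to parse.
        Dim html As String = DownloadHtml("http://example.com/")
        Console.WriteLine("Downloaded " & html.Length.ToString() & " characters")
    End Sub
End Module
```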

Kind Regards
Jorge Cavalheiro
 
What do you want to do exactly?

If, for example, you want to parse the HTML code to find one particular part
of it, you can use regular expressions (the System.Text.RegularExpressions
namespace).

dominique
 
Yeah, I realised I have to go back to using the old controls in .NET for
this. I thought there might be new ones.

I basically want to parse the content of a particular page in an application.
This is my dilemma now.

The Internet control can return the source code of a page from the OpenUrl
method. The trouble is that the page has a script that is executed onLoad, and
I need to read the refreshed version of the source.

The webbrowser control can navigate to the URL I want, and it runs the onLoad
script and shows me the correct page to parse. But I can't find any means to
retrieve the source from this control.

So what do I do?
 
Can you maybe send an example of what you want to do?

But maybe this can help: add an extra "non-HTML" tag before and after the
part of the HTML you need (if this is possible). Browsers will simply ignore
these unknown tags.

So:

....htmlcode and stuff...
<pompom>
....code to find...
</pompom>
....other htmlcode and stuff...

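' At the top of the file:
Imports System.Text.RegularExpressions

' Singleline lets "." match line breaks, and the lazy ".*?" stops at the
' first closing tag instead of the last one.
Dim re As New Regex("<pompom>(.*?)</pompom>", RegexOptions.Singleline)
Dim m As Match = re.Match(theHtmlSourceAsString)

If m.Success Then
    ' Group 1 holds everything between the two marker tags.
    Dim myCode As String = m.Groups(1).Value
    ' ....
End If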

If the source has the correct structure, you will get a match, and that match
will have one group containing the code you want to find
(if I didn't make a regex error ;-)).


dominique
 
Hi Mike,

If you want to get information from a web page (although strictly you never
get a page from the internet, only a document, and sometimes that document is
the page), then you use mshtml to get the information from the document.

Instead of the webbrowser you can also have a look at the HTTP request
classes, but there are so many of them that I cannot tell you which one to
use.

If you want the whole content of a document, you can get it from the HTML
element's innerHTML. (The only thing you do not get is the leading HTML tag
and the closing HTML tag.) The same works for any other element. Note that
innerText gives you only the text, with the markup stripped.

When you use mshtml you have to set a reference to it, but do not add an
Imports statement for it, because this library freezes your IDE. Instead,
write the mshtml prefix every time you use it.

I hope this helps,

Cor
 
Thanks for all the input.

Yeah, I ended up using the HTTP request classes. I had to work around it
because I had to send data back to the page; it works with the GET method, so
I could just parse the result.

I wanted to use it with the POST method, so I had to call the submit routine
from a script that executes when the page loads. The trouble with this
approach is that the source code returned is always that of the original page,
before the submit executes. So I can't parse the info I need.

Of course, I'm still wondering how you execute a web page and get its source
out. The webserver control is the only one that seems to execute the page
scripts properly. Yes, I can right-click the control and choose View Source,
but how does one access this through code?
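
For reference, one way to avoid the onLoad submit script altogether might be
to send the POST directly from code. This is only a sketch, assuming the form
accepts ordinary URL-encoded fields and replies in UTF-8; the URL, the field
name "query", and the helper name PostForm are hypothetical.

```vbnet
Imports System
Imports System.Collections.Specialized
Imports System.Net
Imports System.Text

Module PostExample
    ' Hypothetical helper: POSTs form fields and returns the response body.
    ' UploadValues sends the fields as application/x-www-form-urlencoded
    ' and uses the POST method by default.
    Function PostForm(ByVal url As String, ByVal fields As NameValueCollection) As String
        Using client As New WebClient()
            Dim responseBytes As Byte() = client.UploadValues(url, fields)
            ' Assumption: the server answers in UTF-8.
            Return Encoding.UTF8.GetString(responseBytes)
        End Using
    End Function

    Sub Main()
        Dim fields As New NameValueCollection()
        ' Placeholder field name and value; use the names from the real form.
        fields.Add("query", "example")

        ' Placeholder URL; use the form's action URL.
        Dim html As String = PostForm("http://example.com/search", fields)
        Console.WriteLine("Received " & html.Length.ToString() & " characters")
    End Sub
End Module
```

This only helps when the server builds the result itself; if the page relies
on client-side script to produce the content, the script still has to run in
a browser control.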
 
Hi Mike,

I am not sure what you mean by the webserver control. If you mean the
webbrowser, you have to look at the DocumentComplete event.
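
A rough sketch of that idea follows. It assumes the managed
System.Windows.Forms.WebBrowser control from .NET 2.0 or later rather than
the COM control discussed in this thread; on the managed control the event is
named DocumentCompleted. The form class name and the URL are placeholders.
The handler reads the markup from the live DOM, so it should reflect changes
made by scripts that have already run.

```vbnet
Imports System
Imports System.Windows.Forms

Public Class BrowserForm
    Inherits Form

    Private WithEvents browser As New WebBrowser()

    Public Sub New()
        browser.Dock = DockStyle.Fill
        Controls.Add(browser)
        ' Placeholder URL; replace with the page you want to parse.
        browser.Navigate("http://example.com/")
    End Sub

    Private Sub browser_DocumentCompleted(ByVal sender As Object, _
            ByVal e As WebBrowserDocumentCompletedEventArgs) _
            Handles browser.DocumentCompleted
        ' The event also fires for each frame; only handle the top-level page.
        If Not e.Url.Equals(browser.Url) Then Return
        If browser.Document Is Nothing Then Return

        Dim roots As HtmlElementCollection = _
            browser.Document.GetElementsByTagName("html")
        If roots.Count = 0 Then Return

        ' OuterHtml is taken from the current DOM, not the downloaded file.
        Dim liveHtml As String = roots(0).OuterHtml
        Me.Text = "Captured " & liveHtml.Length.ToString() & " characters"
    End Sub

    <STAThread()> _
    Public Shared Sub Main()
        Application.Run(New BrowserForm())
    End Sub
End Class
```

If a script submits a form when the page loads, the control navigates again
and the handler runs a second time for the resulting page, so the markup from
the last call is the one to parse.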

I hope this helps,

Cor
 