Screen Scraping Questions

  • Thread starter: Nick

Nick

I need to write a VB.NET application for the network team at work that will
back up a firewall configuration each evening. I found that I may need screen
scraping to do this, but I am not quite sure how to go about it, as I have
never done it before.

The web login page is at https://firewall/auth.html and contains a form that
looks like: <form name="standardPass" onSubmit="return(processButn());"
action="auth.cgi" method="POST" target="authTgtFrm">

The certificate on the login page isn't trusted because it is self-signed. I
have used the following to get around it.

I call this from my console app:

ServicePointManager.ServerCertificateValidationCallback = _
    New RemoteCertificateValidationCallback( _
        AddressOf CertificateHandler.ValidateServerCertificate)

Which uses this class:

Imports System
Imports System.Net.Security
Imports System.Security.Cryptography.X509Certificates

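Public Class CertificateHandler

    ' Accepts every server certificate, including the firewall's self-signed one.
    ' Note: this disables certificate validation for every HTTPS request the application makes.
    Public Shared Function ValidateServerCertificate(ByVal sender As Object, _
            ByVal certificate As X509Certificate, ByVal chain As X509Chain, _
            ByVal sslPolicyErrors As SslPolicyErrors) As Boolean
        Return True
    End Function

End Class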
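
Since a handler that always returns True trusts any server, a narrower variant is worth considering. The following is only a sketch, assuming the firewall certificate's SHA-1 thumbprint is known in advance. PinnedCertificateHandler, ThumbprintsMatch, and the all-zero ExpectedThumbprint value are hypothetical names and placeholders, not part of the original code.

```vbnet
Imports System
Imports System.Net.Security
Imports System.Security.Cryptography.X509Certificates

Public Class PinnedCertificateHandler

    ' Hypothetical placeholder: replace with the firewall certificate's real SHA-1 thumbprint.
    Private Const ExpectedThumbprint As String = "0000000000000000000000000000000000000000"

    ' Compares two thumbprints, ignoring case, spaces, and colon separators.
    Public Shared Function ThumbprintsMatch(ByVal actual As String, ByVal expected As String) As Boolean
        If actual Is Nothing OrElse expected Is Nothing Then Return False
        Dim left As String = actual.Replace(" ", "").Replace(":", "")
        Dim right As String = expected.Replace(" ", "").Replace(":", "")
        Return String.Equals(left, right, StringComparison.OrdinalIgnoreCase)
    End Function

    ' Same signature as RemoteCertificateValidationCallback.
    Public Shared Function ValidateServerCertificate(ByVal sender As Object, _
            ByVal certificate As X509Certificate, ByVal chain As X509Chain, _
            ByVal policyErrors As SslPolicyErrors) As Boolean
        ' Certificates that pass normal validation are accepted as usual.
        If policyErrors = System.Net.Security.SslPolicyErrors.None Then Return True
        If certificate Is Nothing Then Return False
        ' Otherwise accept only the one certificate whose thumbprint is expected.
        Return ThumbprintsMatch(certificate.GetCertHashString(), ExpectedThumbprint)
    End Function

End Class
```

It would be registered the same way as above, with AddressOf PinnedCertificateHandler.ValidateServerCertificate in place of the original handler.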

The two input boxes in question on the auth page are named userName and pwd.
If the login is successful, I get redirected to http://firewall/main.html.
From there I need to download the configuration from http://firewall/config.bin.

Does anyone have an idea how I can go about this?

You can also try SWExplorerAutomation from http://webius.net/ to
record and generate VB.NET automation/scraping code.
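
If you would rather do it by hand, a rough sketch with HttpWebRequest might look like the following. It rests on several assumptions that would need checking against the real firewall: that the session is tracked with a cookie, that the onSubmit handler processButn() does not alter or add form fields before the POST, and that the session cookie is also sent to the http:// download URL (a cookie marked Secure would not be). FirewallBackup, BuildLoginBody, and BackupConfig are made-up names. The certificate callback from the original post still has to be registered first.

```vbnet
Imports System
Imports System.IO
Imports System.Net
Imports System.Text

Public Module FirewallBackup

    ' Builds the URL-encoded body for the login form (fields userName and pwd).
    Public Function BuildLoginBody(ByVal user As String, ByVal password As String) As String
        Return "userName=" & Uri.EscapeDataString(user) & _
               "&pwd=" & Uri.EscapeDataString(password)
    End Function

    Public Sub BackupConfig(ByVal user As String, ByVal password As String, ByVal targetPath As String)
        ' One container shared by both requests so the session cookie carries over.
        Dim cookies As New CookieContainer()

        ' Step 1: POST the credentials to the form's action (auth.cgi, relative to auth.html).
        Dim login As HttpWebRequest = CType(WebRequest.Create("https://firewall/auth.cgi"), HttpWebRequest)
        login.Method = "POST"
        login.ContentType = "application/x-www-form-urlencoded"
        login.CookieContainer = cookies
        Dim body As Byte() = Encoding.ASCII.GetBytes(BuildLoginBody(user, password))
        login.ContentLength = body.Length
        Using requestStream As Stream = login.GetRequestStream()
            requestStream.Write(body, 0, body.Length)
        End Using
        Using loginResponse As WebResponse = login.GetResponse()
            ' The response body is not needed; any cookies are now in the container.
        End Using

        ' Step 2: request the configuration file with the same cookies and save it.
        Dim download As HttpWebRequest = CType(WebRequest.Create("http://firewall/config.bin"), HttpWebRequest)
        download.CookieContainer = cookies
        Using configResponse As WebResponse = download.GetResponse()
            Using source As Stream = configResponse.GetResponseStream()
                Using target As FileStream = File.Create(targetPath)
                    Dim chunk(8191) As Byte
                    Dim count As Integer = source.Read(chunk, 0, chunk.Length)
                    While count > 0
                        target.Write(chunk, 0, count)
                        count = source.Read(chunk, 0, chunk.Length)
                    End While
                End Using
            End Using
        End Using
    End Sub

End Module
```

Calling FirewallBackup.BackupConfig with the credentials and a target file name from the console app's Main, and running that app from a scheduled task, would cover the nightly part. If the downloaded file turns out to be the login page instead of the configuration, the likely cause is that one of the assumptions above does not hold.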