Help reading HTML page

nomad

Hi,

I have been tasked with creating a program that screen scrapes a web
site. Instead of scraping the site, I am doing a form post, which is
quicker and easier. The problem I am having is retrieving information
from the last page. A colleague suggested using regular expressions to
find the values I want within the page, but as I am under tight
timescales to get the project done, I don't believe I will have time
to research regular expressions. Does anyone know of another way that
I can handle this?

Appreciate any help.
 
Why not use an HTML parser? You can use MSHTML (you would have to do
some interop in order to stream the contents into the model), or you can
use one of the open source parsers that are available. A Google search
for "HTML parser .NET" turned up a few results.
 

--
          - Nicholas Paldino [.NET/C# MVP]
          - (e-mail address removed)
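
To illustrate the parser approach suggested above, here is a minimal sketch of pulling values out of a result page without regular expressions. The thread does not name a specific .NET parser, so this uses Python's standard-library `html.parser` purely to show the idea; a .NET HTML parser would follow the same pattern. The sample page, the element ids (`orderNumber`, `total`) and the helper names (`TextById`, `text_by_id`) are all hypothetical, and the sketch assumes the values you want sit inside elements that carry an `id` and that non-void tags are properly closed.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class TextById(HTMLParser):
    """Collects the text inside the first element whose id matches target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.depth = 0      # greater than 0 while inside the target element
        self.done = False   # True once the target element has been closed
        self.parts = []     # text fragments found inside the target element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth += 1
        elif not self.done and dict(attrs).get("id") == self.target_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.depth > 0:
            self.depth -= 1
            if self.depth == 0:
                self.done = True

    def handle_data(self, data):
        if self.depth > 0:
            self.parts.append(data)


def text_by_id(html, target_id):
    """Returns the stripped text of the element with the given id, or ''."""
    parser = TextById(target_id)
    parser.feed(html)
    parser.close()
    return "".join(parser.parts).strip()


# Hypothetical final page returned by the form post.
page = """
<html><body>
  <table>
    <tr><td>Order number</td><td id="orderNumber">A-1042</td></tr>
    <tr><td>Total</td><td id="total"><b>19.99</b> GBP</td></tr>
  </table>
</body></html>
"""

print(text_by_id(page, "orderNumber"))  # A-1042
print(text_by_id(page, "total"))        # 19.99 GBP
```

The parser handles nested markup (the `<b>` inside the `total` cell) without any pattern matching, which is the main advantage over a regular expression here. Real pages with unclosed tags would need a more forgiving parser than this sketch.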



Thanks for your advice.
 