HTML Form posting & response parsing

Esa

Hi,

I'm having problems with one strange web system where submitting an
application and making queries about its handling status require a series of
form submits and response parsing - all in HTML. Luckily other interfaces
are "modern" using xml file up/downloads without any difficulties...

I'm not very used to the .NET environment yet, so I'd appreciate some clues
about the classes I should use to implement this stupid interface - stupid
because the service user has to take part in the bank's internal handling
process by posting some mid-handling results back to the bank... And the best
part is that the bank's interface documentation just talks about submitting a
form and getting a response form, without the slightest word about these
redirections/reposts...

Basically the process goes like this:

0. I receive an XML document and should return another XML document to the
caller after making the following queries with the bank.

1. I'll create my own initial HTML page and post it (https) to the other
server - no problems here.

2. I get first response from the bank - either:

a) HTML-page with <META REFRESH...> redirection to another page (in case of
errors) - in this case I should get the error code from the next page for
further processing.

b) HTML-page containing a javascript function, a form with hidden fields and
a cookie that is required in further processing at the bank. That
js-function should be called to post the form back to another asp-page at
the bank's site (at least _currently_ it should be enough to just post that
form, but the js-call is preferable). No values can be changed on this page,
so the task is just to get the form posted - I don't understand why this has
to be done by me and not by the bank automatically...

3. After submitting the 2b form, with or without the javascript function, either:

a) Same as 2a

b) Another HTML-page with hidden form fields holding the final values I need
to write into the XML document before returning it to the caller.

I'm constrained to use .NET 1.1, and this function is most likely going to
be used in other programs too, but the first case is inside a message
transformation in BizTalk 2004 (this is the reason for 1.1 constraint).

I started testing with a Windows Forms application using the WebBrowser
control just to get the content handling part started, but now I really need
a hint. Should I create my own hidden browser (with WebClient/Request etc.)
that handles all the cookies and redirections, parses values from the HTML
and posts them back to the bank? And what would be the easiest way to parse
the response HTML pages in this case?

Or can I somehow use (with .NET 1.1) a web-browser-like control (although
the final solution is not an application but a function to be used by other
programs) that handles cookies and redirections automatically and makes it
possible to call the javascript function on the bank's form?

I've already lost half my hair in the last two weeks with these undocumented
features I keep running into :-)

Thanks for all the answers in advance.

-- Esa
 
I don't think that a browser control will do you any good in a BizTalk orchestration,
so HttpWebRequest is probably your best option.

Using an HTTP proxy like Fiddler, I would capture the traffic of a successful
workflow, and program a sequence of web requests that contain the same headers
and similar payload. This allows you to ignore client-side scripting -- all
you need to know is what is actually being transmitted.
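
A minimal sketch of that replay idea, written in Python only as a language-neutral illustration (the thread itself targets .NET 1.1, where HttpWebRequest with its CookieContainer property would fill the same role). The function names and the URLs in the comments are hypothetical; the real URLs, field names and headers would come from the captured traffic.

```python
import urllib.parse
import urllib.request
from http.cookiejar import CookieJar


def make_session():
    """Return (opener, jar); the opener stores cookies and resends them later."""
    jar = CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def build_form_post(url, fields, referer=None):
    """Build a POST request shaped like a browser submitting an HTML form."""
    body = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(url, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    if referer:
        request.add_header("Referer", referer)
    return request


def post_form(opener, url, fields, referer=None):
    """Send the form and return (final_url, html_text)."""
    request = build_form_post(url, fields, referer)
    with opener.open(request, timeout=30) as response:
        charset = response.headers.get_content_charset() or "latin-1"
        return response.geturl(), response.read().decode(charset, errors="replace")


# Hypothetical usage (placeholder URLs and field names, not the bank's real ones):
#   opener, jar = make_session()
#   url1, html1 = post_form(opener, "https://bank.example/apply.asp", {"id": "42"})
#   ... read the hidden fields out of html1, then post them back ...
#   url2, html2 = post_form(opener, "https://bank.example/next.asp", fields, referer=url1)
```

Because every request goes through the same opener, the cookie set by the first response is sent along with the second post, which is what the browser would have done after running the page's javascript.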

Cheers,
 
> I don't think that a browser control will do you any good in a BizTalk
> orchestration, so HttpWebRequest is probably your best option.

That's what I was afraid of... :)

> Using an HTTP proxy like Fiddler, I would capture the traffic of a
> successful workflow, and program a sequence of web requests that contain
> the same headers and similar payload. This allows you to ignore
> client-side scripting -- all you need to know is what is actually being
> transmitted.

Thanks for the Fiddler hint, it'll make life a little bit easier.

I'll try to find some small HTML parser tool to help me with the response
HTML-page contents as I can't get them now from the WebBrowser control
directly.
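
For what it's worth, the parsing needed here is small: the form's action, the hidden fields, and the target of a META refresh on error pages. Below is a rough sketch of that logic, in Python purely for illustration (not .NET 1.1 code); the class name, function name and sample markup are made up, and the bank's real pages may well differ.

```python
from html.parser import HTMLParser


class BankPageParser(HTMLParser):
    """Collects the first form's action, hidden fields and a META refresh target."""

    def __init__(self):
        super().__init__()
        self.form_action = None
        self.hidden_fields = {}
        self.refresh_url = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser already lower-cases tag and attribute names.
        attr = {name: (value or "") for name, value in attrs}
        if tag == "form" and self.form_action is None:
            self.form_action = attr.get("action", "")
        elif tag == "input" and attr.get("type", "").lower() == "hidden":
            if "name" in attr:
                self.hidden_fields[attr["name"]] = attr.get("value", "")
        elif tag == "meta" and attr.get("http-equiv", "").lower() == "refresh":
            # The content attribute looks like "0; URL=error.asp?code=12".
            _, _, target = attr.get("content", "").partition(";")
            target = target.strip()
            if target.lower().startswith("url="):
                target = target[4:]
            self.refresh_url = target.strip("'\" ") or None


def parse_page(html):
    """Parse one response page and return the filled-in parser object."""
    parser = BankPageParser()
    parser.feed(html)
    parser.close()
    return parser
```

If refresh_url is set, the page is the error case (2a/3a) and the next request should fetch that URL; otherwise hidden_fields holds either the values to post back (2b) or the final values (3b). A relative action or refresh target would still need to be resolved against the page's own URL before it is requested.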

-- Esa
 