HttpWebRequest / Response disconnect question

lw_elite

Hi all,

I'm trying to figure something out and not quite sure if I'm doing it
the right way. First, the quick overview and I'll post more info if
it's needed.

On my web site, I've got an HTTP Module that serves binaries. It
basically just reads them from disk and streams the bytes to the
client. This works fine -- every 10k I check if the client is
connected, and if so, read another 10k, etc.
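
For reference, a minimal sketch of that kind of streaming loop. This is not the actual module: the class name BinaryStreamer, the method StreamFile, and the content type are placeholders made up for illustration, and it assumes a classic ASP.NET (System.Web) application.

```csharp
using System.IO;
using System.Web;

public static class BinaryStreamer
{
    // Hypothetical helper: streams a file to the client in 10 KB chunks and
    // stops as soon as the client is reported as disconnected.
    public static void StreamFile(HttpContext context, string path)
    {
        HttpResponse response = context.Response;
        response.BufferOutput = false; // send each chunk instead of buffering the whole file
        response.ContentType = "application/octet-stream"; // placeholder content type

        byte[] buffer = new byte[10 * 1024];
        using (FileStream file = File.OpenRead(path))
        {
            int read;
            // Check the connection before every chunk, then read and send it.
            while (response.IsClientConnected &&
                   (read = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                response.OutputStream.Write(buffer, 0, read);
                response.Flush();
            }
        }
    }
}
```

The loop only stops early if IsClientConnected turns false, so it depends entirely on the client actually dropping the connection.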

I have another process that pings a few URLs using an HttpWebRequest
and HttpWebResponse. If the content type isn't plain text or HTML, I
don't try to read the site content and instantly try to close the
connection. I was hoping this would prevent the files from being
downloaded (some are quite big), but that's not the case.
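
A simplified sketch of the client side as described above, assuming the check is a plain prefix match on the Content-Type header. The names LinkPinger and IsTextContent are made up for illustration. This is the variant that does not stop the download.

```csharp
using System;
using System.Net;

public static class LinkPinger
{
    // Hypothetical helper: returns true if the URL serves plain text or HTML.
    // The body is never read; the response is simply closed.
    public static bool IsTextContent(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        try
        {
            string contentType = response.ContentType ?? string.Empty;
            return contentType.StartsWith("text/plain", StringComparison.OrdinalIgnoreCase)
                || contentType.StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
        }
        finally
        {
            // Close() is the call that, in this scenario, does not return
            // until the server has finished sending the file.
            response.Close();
        }
    }
}
```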

If I debug both the web app and the pinging utility, I see the request
made, and the first bit of data is pushed to the app. The app sees
the content type and attempts to close the connection. The thing is,
the Close() call doesn't return until the module is done streaming.
Each check to see if the client is connected still returns true
(which is correct, since the client hasn't disconnected yet). When the
module is done streaming the content, the Close() call in my app returns.


Any ideas? Happy to post more code. Ideally I want the Close() call to
sever the connection immediately...

Thanks!
Tom
 
One solution would be to write your own response class that inherits
from HttpWebResponse and overrides its Close method.

If you look inside HttpWebResponse, you'll see that it uses an internal
ConnectStream for network communication. When you call Close() on the
response object, Close() is called on the ConnectStream. That Close invokes
InternalClose(), which checks the underlying socket for data; if there is
data, it reads it and only then closes. This is the behavior you are seeing.

You can get the ConnectStream by calling GetResponseStream().

So in your case I'd recommend subclassing HttpWebResponse and using your
own data stream for network operations.
 
Ah, the WebRequest Abort() method seems to do the trick! It throws an
exception on the server (as it should -- a thread abort) but it works.
I just need to modify my module to handle it a bit more gracefully.
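
A minimal sketch of the Abort() approach, assuming the same prefix check on the content type. AbortingPinger, ReadIfText, and IsText are hypothetical names, not the code from my tool.

```csharp
using System;
using System.IO;
using System.Net;

public static class AbortingPinger
{
    // Hypothetical helper: returns the body for text/HTML responses and
    // null for anything else, aborting the request so the rest of the
    // body is not downloaded.
    public static string ReadIfText(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        try
        {
            if (!IsText(response.ContentType))
            {
                // Abort() cancels the request instead of letting Close()
                // wait for the remaining data.
                request.Abort();
                return null;
            }

            using (StreamReader reader = new StreamReader(response.GetResponseStream()))
            {
                return reader.ReadToEnd();
            }
        }
        finally
        {
            response.Close();
        }
    }

    private static bool IsText(string contentType)
    {
        if (contentType == null) return false;
        return contentType.StartsWith("text/plain", StringComparison.OrdinalIgnoreCase)
            || contentType.StartsWith("text/html", StringComparison.OrdinalIgnoreCase);
    }
}
```

The key point is that Abort() is called on the request object, before the response is closed.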

Thanks!
 
You might want to try sending HEAD requests. If the remote host plays
along nicely, you'll get all the HTTP headers as if you had sent a GET
request but no payload, i.e. less traffic, and no reason to Abort().
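
A minimal sketch of what that could look like with HttpWebRequest. The names HeadProbe and GetContentType are made up for illustration, and it assumes the server answers HEAD requests properly.

```csharp
using System.Net;

public static class HeadProbe
{
    // Hypothetical helper: asks only for the headers and returns the
    // Content-Type value reported by the server.
    public static string GetContentType(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD"; // headers only, no response body

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            return response.ContentType;
        }
    }
}
```

Note that GetResponse() throws a WebException for error status codes, so a server that rejects HEAD would need to be handled by the caller.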


Cheers,
 
Sending the HEAD request is a great idea. I think it's a cleaner
solution, but my fear is that it will cause more network traffic than
it saves. I tried this out and it works great, but if the HEAD request
returns a content type I care about, I then need to do a GET. Since
the tool polls every link on a page, 90% of the time they'll be html or
plain text links. Of course this varies site to site, but unless the
majority of the links are binaries, I'd be generating a lot of extra
traffic to request the file. Hmmm... I do like the idea though and it
seems cleaner... thoughts?
 
Oops, I missed your requirement that you only need to avoid binary
data. In this case you're probably better off using Abort() -- at least
regarding bandwidth consumption.

Test it. Do you scan arbitrary sites or a well known group?

Cheers,
 
Generally it's a few sites in a well known group, but my idea is to
release the tool for anyone to use, in which case my primary interest
is in designing it correctly. My guess is that sticking with Abort() is
the right idea. If I test on my sites and actually meter the
bandwidth, no doubt Abort() would be more performant. That may not
hold for every site, but I'd imagine it would unless a page links to
hundreds of Word docs, for example.
 