Why can IE return a web page but TcpClient cannot?

  • Thread starter: Johann Blake

Johann Blake

Hi,

I have come across a rather bizarre problem using TcpClient to
retrieve a web page. I use TcpClient in conjunction with a
StreamWriter to write an HTTP request to the web site. The web server
returns a 301, 302 or 404 code, indicating that the page has been
redirected or could not be found. If I request the same URL with
Internet Explorer (IE), it does retrieve the page.

I then used an HTTP sniffer program to see what was really going out on
the line. The sniffer shows exactly the same HTTP headers being sent
by my program and by IE. They are identical in case, size and
content. I can't figure this out. I thought that maybe my program was
sending the HTTP request to the web server too slowly, so I used the
Flush method, set the linger state and tried a number of other things,
but the web server almost always comes back with 404. Sometimes it
comes back with a 301 or 302 instead. Yet IE is consistently able to
download the page.

I've deleted cookies and the cache but it doesn't help.

What gives??

Here is part of my code:

----------------------------------------------

TcpClient tcp = new TcpClient();
tcp.Connect("cnn.law.printthis.clickability.com", 80);

// Tried the following to no avail.
// tcp.NoDelay = true;
// LingerOption tcpClientLingerOption = new LingerOption(false, 0);
// tcp.LingerState = tcpClientLingerOption;

NetworkStream networkStream = tcp.GetStream();
StreamWriter streamWriter = new StreamWriter(networkStream);

// Tried the following to no avail.
// streamWriter.AutoFlush = true;

// Note: the request line and each header are sent as single lines,
// each terminated by CRLF; the request ends with an empty line.
string line =
    "GET /pt/cpt?action=cpt&title=CNN.com+-+Malvo+gets+life+sentence+in%A0sniper+killing+-+Oct+26%2C+2004&expire=-1&urlID=12081313&fb=Y&url=http%3A%2F%2Fwww.cnn.com%2F2004%2FLAW%2F10%2F26%2Fmalvo.plea%2Findex.html&partnerID=2013 HTTP/1.1\r\n" +
    "Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/vnd.ms-powerpoint, application/vnd.ms-excel, application/msword, */*\r\n" +
    "Accept-Language: en-us\r\n" +
    "Accept-Encoding: gzip, deflate\r\n" +
    "User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)\r\n" +
    "Host: cnn.law.printthis.clickability.com\r\n" +
    "Connection: Keep-Alive\r\n\r\n";

streamWriter.Write(line);
streamWriter.Flush();

// Even tried using a StreamReader instead of a NetworkStream, but to no avail.
// buffer (a byte[]) and bytesToRead are declared elsewhere (not shown).
int bytesRead = networkStream.Read(buffer, 0, bytesToRead);
string packet = Encoding.GetEncoding(1252).GetString(buffer, 0, bytesRead);
streamWriter.Close();


Thanks,
Johann Blake
 
Hi Johann,

301 and 302 are redirects. You have to follow them to
get the right page.

But why on earth don't you use WebClient or HttpWebRequest??
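
For illustration, a minimal sketch of the HttpWebRequest route might look like the following. The class and method names (PageFetcher, CreateRequest, Fetch) are hypothetical and not from this thread, and the redirect limit of 10 is an arbitrary assumption. HttpWebRequest follows 301 and 302 responses on its own when AllowAutoRedirect is true, which is its default; it is set explicitly here only to make the intent visible.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper for illustration; the names are not from the thread.
public static class PageFetcher
{
    // Builds a request that follows redirects automatically.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = true;            // follow 301/302 (already the default)
        request.MaximumAutomaticRedirections = 10;   // assumed limit, chosen arbitrarily
        request.UserAgent =
            "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; .NET CLR 1.1.4322)";
        return request;
    }

    // Downloads the page body as text. GetResponse throws a WebException
    // for error statuses such as 404, so callers may want to catch that.
    public static string Fetch(string url)
    {
        HttpWebRequest request = CreateRequest(url);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling PageFetcher.Fetch with the full URL (scheme, host and path) would then return the final page after any redirects, without writing the request headers by hand.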

bye
Rob
 