HttpWebResponse contains carriage returns, tab characters, etc?

  • Thread starter: Dave

Dave

I'm trying to download a webpage by using the HttpWebRequest. It returns the
html source, however, it contains "\r\n", "\t" etc throughout the text. Is
there a way to return the same HTML as when I navigate to the url in the
browser and do a "View Source"? Or do I have to manually strip these out?
The full code is below where I'm posting the necessary data to simulate a
form post.

HttpWebRequest req = (HttpWebRequest)WebRequest.Create(Url);
req.Method = "POST";
req.UserAgent = "Mozilla/4.0+";
req.ContentType = "application/x-www-form-urlencoded";

System.Text.ASCIIEncoding encoding = new System.Text.ASCIIEncoding();
byte[] PostBuffer = encoding.GetBytes(PostData);
req.ContentLength = PostBuffer.Length;
Stream stm = req.GetRequestStream();
stm.Write(PostBuffer, 0, PostBuffer.Length);
stm.Close();

// Get the response.
HttpWebResponse resp = (HttpWebResponse)req.GetResponse();
StreamReader sr = new StreamReader(resp.GetResponseStream());

// Returns the source, but with carriage returns etc.
string result = sr.ReadToEnd();
sr.Close();
resp.Close();
 
I would expect your source to contain CRs and tabs; that is probably
how the server is sending it. If you see something different in "View Source",
I suspect that is down to the browser, not the request.

So yes, you have to strip these out manually.
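
For what it's worth, a minimal sketch of what "stripping these out" could look like. The class and method names (WhitespaceCleaner, StripControlWhitespace, CollapseWhitespace) are made up for illustration and are not part of the framework; only String.Replace and Regex.Replace are real calls. Which variant is appropriate depends on what you do with the HTML afterwards.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original post.
static class WhitespaceCleaner
{
    // Removes every CR, LF and tab character outright.
    public static string StripControlWhitespace(string html)
    {
        return html.Replace("\r", "").Replace("\n", "").Replace("\t", "");
    }

    // Collapses each run of whitespace (including CR, LF, tab) to one space,
    // so words separated only by a line break do not get glued together.
    public static string CollapseWhitespace(string html)
    {
        return Regex.Replace(html, @"\s+", " ").Trim();
    }
}
```

Calling WhitespaceCleaner.StripControlWhitespace(result) on the string from ReadToEnd() would remove the line breaks and tabs. Be aware that either method also changes whitespace inside elements such as <pre>, where it is significant.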
 
I have a suspicion that you believe the string actually contains the literal
text \r\n and \t rather than the equivalent control characters, and that you
believe this because that's what the debugger is showing you. However, the
debugger shows you the string in an escaped form that is valid as a string
literal in C#.

Otherwise, something really bizarre is happening.
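
One way to check which case you are in is to test the string in code rather than trust the debugger display. The sketch below is only an illustration; EscapeCheck and its two methods are hypothetical names, and it uses nothing beyond String.IndexOfAny and String.Contains.

```csharp
using System;

// Hypothetical helper, not part of the original post.
static class EscapeCheck
{
    // True if the string holds real CR, LF or tab characters
    // (what the debugger displays as \r, \n, \t).
    public static bool HasControlCharacters(string s)
    {
        return s.IndexOfAny(new[] { '\r', '\n', '\t' }) >= 0;
    }

    // True if the string holds a backslash followed by r, n or t
    // as two separate characters, i.e. the escapes as literal text.
    public static bool HasLiteralEscapes(string s)
    {
        return s.Contains("\\r") || s.Contains("\\n") || s.Contains("\\t");
    }
}
```

If HasControlCharacters(result) is true and HasLiteralEscapes(result) is false, the response contains ordinary line breaks and tabs, exactly as "View Source" would show them, and the \r\n you see is just how the debugger renders them. Writing the string out with Console.WriteLine(result) should then show real line breaks rather than backslashes.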
 