Logging in to a website (username, password) automatically with a web crawler

  • Thread starter: shaily903

shaily903

Hi,

I want to write a C# web crawler.
Requirements: I need to check the "general mail" site daily to see whether
there is new email in my inbox. I want to automate this whole
process by writing a web crawler that will crawl the site, enter my
login username and password, and then let me know if there is any new email.
I started writing it like this:

// Needs: using System.IO; using System.Net; using System.Text;
// First, I download the page using the HttpWebRequest class provided by .NET.
// (url is assumed to be a string field set elsewhere.)
private void GetUrlData()
{
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
    using (WebResponse response = request.GetResponse())
    using (Stream stream = response.GetResponseStream())
    using (StreamReader reader = new StreamReader(stream))
    {
        // Read the stream as text, appending it to a buffer line by line.
        StringBuilder buffer = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            buffer.Append(line).Append("\r\n");
        }
        // Then write the whole page to a text file.
        CreateFile(buffer.ToString());
    }
}

private void CreateFile(string buffer)
{
    string filename = "c:\\test";
    // The using block closes the file even if Write throws.
    using (StreamWriter outStream = new StreamWriter(filename))
    {
        outStream.Write(buffer);
    }
}


Question:
1. As per my requirements above, I need to log in to the site (i.e. I need
to give a username and password to get to my inbox). How can I do that
programmatically?

Thanks
Shaily
 
I am not familiar with the website you are speaking of, but I have done
this numerous times with other sites. Basically, you have to find their
login form and submit your authentication credentials to it. The login
information is usually then stored in cookies, which you need to save so
that they are sent with every subsequent web request you make.
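
As a rough illustration of keeping the cookies around, the sketch below shares one CookieContainer between the login POST and any later request. The class and method names (MailSession, BuildFormBody, Post, Get) are made up for this example, and the form field names have to be taken from the site's actual login form, so treat it as a starting point under those assumptions rather than a working client for any particular site.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Net;
using System.Text;

// Hypothetical helper class; all names here are illustrative only.
public class MailSession
{
    // One container for the whole session, so cookies set by the login
    // response are sent again on every later request.
    private readonly CookieContainer cookies = new CookieContainer();

    // Builds an application/x-www-form-urlencoded body from name/value pairs,
    // escaping each name and value so characters like '&' or '@' survive.
    public static string BuildFormBody(IEnumerable<KeyValuePair<string, string>> fields)
    {
        var parts = new List<string>();
        foreach (var field in fields)
        {
            parts.Add(Uri.EscapeDataString(field.Key) + "=" + Uri.EscapeDataString(field.Value));
        }
        return string.Join("&", parts.ToArray());
    }

    // POSTs the form fields to the given URL and returns the response body.
    public string Post(string url, IEnumerable<KeyValuePair<string, string>> fields)
    {
        byte[] body = Encoding.UTF8.GetBytes(BuildFormBody(fields));
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "POST";
        request.ContentType = "application/x-www-form-urlencoded";
        request.CookieContainer = cookies;
        request.ContentLength = body.Length;
        using (Stream stream = request.GetRequestStream())
        {
            stream.Write(body, 0, body.Length);
        }
        return ReadBody(request);
    }

    // GETs a page, sending whatever cookies earlier requests collected.
    public string Get(string url)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = cookies;
        return ReadBody(request);
    }

    private static string ReadBody(HttpWebRequest request)
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            return reader.ReadToEnd();
        }
    }
}
```

The idea is to create one MailSession, call Post once with the login URL and the form fields, and then call Get on the same object for the inbox page, so that both requests go out with the same cookies.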
 
I am very new to writing a web crawler that logs in to a site
programmatically. Currently, this is how I am doing it:
private void login()
{
    HttpWebRequest req =
        (HttpWebRequest)WebRequest.Create("http://mail.rediff.com/cgi-bin/login.cgi");
    req.Method = "POST";
    string s = "login=shaily&passwd=mypasswd";
    req.CookieContainer = new CookieContainer();
    req.ContentType = "application/x-www-form-urlencoded";
    // Set the POST data in a buffer
    byte[] postBuffer = Encoding.GetEncoding(1252).GetBytes(s);
    // Specify the length of the buffer
    req.ContentLength = postBuffer.Length;
    // Open up a request stream, write the POST data, then close the stream
    using (Stream requestStream = req.GetRequestStream())
    {
        requestStream.Write(postBuffer, 0, postBuffer.Length);
    }

    string t;
    // Create the response object and a reader for the response
    using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
    using (StreamReader sr = new StreamReader(
        response.GetResponseStream(), Encoding.GetEncoding(1252)))
    {
        // Read the response; the using blocks close the reader and response
        t = sr.ReadToEnd();
    }
    MessageBox.Show(t);
    CreateFile(t, "c:\\Changelog");
}

private void CreateFile(string buffer, string filename)
{
    using (StreamWriter outStream = new StreamWriter(filename))
    {
        outStream.Write(buffer);
    }
}

Problem: The response I am getting does not show my inbox, where I
could check the number of new mails. It seems to me that the data is
not being posted to the form properly. Am I doing something wrong?
What should I do to log in to my inbox and retrieve the different
values from there (e.g. number of new mails, total number of mails in the inbox)?
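
One way to check whether the POST matches what the form expects is to download the login page first and list the input fields it contains, including any hidden ones. The sketch below does that with a regular expression. FormInspector and ListInputs are hypothetical names, and regex-based HTML parsing is fragile (it only handles quoted attributes), so this is meant as a quick diagnostic rather than a robust parser.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical diagnostic helper; names are illustrative only.
public static class FormInspector
{
    // Returns name -> value for every <input> tag that has a quoted
    // name attribute. Inputs without a value attribute map to "".
    public static Dictionary<string, string> ListInputs(string html)
    {
        var inputs = new Dictionary<string, string>();
        foreach (Match tag in Regex.Matches(html, @"<input\b[^>]*>", RegexOptions.IgnoreCase))
        {
            string name = ReadAttribute(tag.Value, "name");
            if (name != null)
            {
                inputs[name] = ReadAttribute(tag.Value, "value") ?? "";
            }
        }
        return inputs;
    }

    // Reads a single- or double-quoted attribute value from one tag,
    // or returns null when the attribute is absent.
    private static string ReadAttribute(string tag, string attribute)
    {
        Match m = Regex.Match(
            tag,
            @"\b" + attribute + @"\s*=\s*[""']([^""']*)[""']",
            RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }
}
```

Any field that shows up in this list but is missing from the POST string (hidden fields in particular) is a candidate reason for the login being rejected.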
 