Lost
I have a program that scrapes a website. The site displays ever-changing
numbers in the form of a table. My program constantly checks
the site to gather the new numbers and put them into an array for
processing.
The page of HTML that is received does not have carriage returns or
linefeed characters at the end of the relevant lines, but each
relevant line ends with the characters "n.l".
At the moment I process this by saving the HTML to disk as follows:
    FileStr = "TempFile.tmp"
    File2Str = "TempFile2.tmp"
    ToF = FreeFile()
    FileOpen(ToF, FileStr, OpenMode.Output)
    Dim rsp As Net.HttpWebResponse = req.GetResponse
    Dim strm As IO.Stream = rsp.GetResponseStream
    Dim reader As New IO.StreamReader(strm)
    PrintLine(ToF, reader.ReadToEnd())
    reader.Close()
    rsp.Close()
    FileClose(ToF) 'HTML saved
The file is then read in, broken into lines and saved to another file:
    FromF = FreeFile()
    FileOpen(FromF, FileStr, OpenMode.Input)
    ToF = FreeFile()
    FileOpen(ToF, File2Str, OpenMode.Output)
    Do While Not EOF(FromF)
        LineStr = LineInput(FromF)
        'Cut the line at every "n.l" marker (case-insensitive)
        Do While InStr(LCase(LineStr), "n.l") > 0
            x = InStr(LCase(LineStr), "n.l")
            If x > 0 Then
                PrintLine(ToF, LeftStr(LineStr, x - 1))
                LineStr = Mid(LineStr, x + 3) 'Skip the 3-character marker
            End If
        Loop
        PrintLine(ToF, LineStr) 'Whatever is left after the last marker
    Loop
    FileClose(FromF)
    FileClose(ToF)
The second file now contains the data and is read in one line at a
time and put into the array.
The routine works but it's slower than necessary because of the disk
accesses. Can someone please show me how to process the data while
it's still in memory? I'm using VB 2005 Express.
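
For reference, here is a rough sketch of what an in-memory version might look like. It assumes the whole page fits comfortably in a string (the result of reader.ReadToEnd()) and uses Regex.Split to cut on the "n.l" marker, ignoring case as the loop above does, as well as on any real line breaks. The names SplitInMemory and SplitRows and the sample text are made up for illustration; they are not part of the original program.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module SplitInMemory
    ' Hypothetical helper: splits the page text on the "n.l" marker
    ' (case-insensitive) and on CR/LF line breaks, without touching the disk.
    Function SplitRows(ByVal html As String) As String()
        Return Regex.Split(html, "n\.l|\r\n|\r|\n", RegexOptions.IgnoreCase)
    End Function

    Sub Main()
        ' Stand-in for reader.ReadToEnd(); this sample text is invented.
        Dim html As String = "1.25 3.50n.l2.75 4.00N.L9.10 8.20"
        Dim rows() As String = SplitRows(html)
        For Each row As String In rows
            Console.WriteLine(row)
        Next
        ' Prints three lines: "1.25 3.50", "2.75 4.00", "9.10 8.20"
    End Sub
End Module
```

In the real program the string would come straight from reader.ReadToEnd(), and the resulting array could feed the existing processing step in place of the second temp file. If the page ends with a marker, the last element of the array is an empty string, much as the loop above writes an empty final line.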