getting the source code of a web page

  • Thread starter Thread starter Southern at Heart
  • Start date Start date
S

Southern at Heart

I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page. (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet) If I
open the page in IE, view the source, I can programmatically filter out the
coordinates.
So, my question: How to get the source saved to a string in vba
thanks.
 
You don't need to get the source. You can use the IE explorer like below to
help you filter the data. The source is under the IE.Document.all property.
You can get these lines by doing a loop like this

for each itm in IE.Document
'your code here
next itm

You can use the methods getElementsByTagname(), and getElementById() to help
you filter the data. I often for debuggin use something like this

RowCount = 1
for each itm in IE.Document
Range("A" & RowCount) = itm.tagname
Range("B" & RowCount) = itm.Classname
Range("C" & RowCount) = itm.id
Range("D" & RowCount) = left(itm.innertext,1024)

RowCount = rowcount + 1
next itm

The source is partioned into items. Has you go down the ALL property the
innertext is partioned into small pices. When you do the dump above you will
see the same innertext repeated over and over again but broken into samller
pieces each time it is repeated.

If you need more help give me the URL and I will get what you need. I help
lots of people who have had problems.




Sub GetZipCodes()


ZIPCODE = InputBox("Enter 5 digit zipcode : ")

Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://zip4.usps.com/zip4/citytown_zip.jsp"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4 And _
IE.busy = True

DoEvents
Loop

Set Form = IE.document.getElementsByTagname("Form")

Set zip5 = IE.document.getElementById("zip5")
zip5.Value = ZIPCODE


Set ZipCodebutton = Form(0).onsubmit

Form(0).submit
Do While IE.busy = True
DoEvents
Loop

Set Table = IE.document.getElementsByTagname("Table")
Location = Table(0).Rows(2).innertext
IE.Quit
MsgBox ("Zip code = " & ZIPCODE & " City/State = " & Location)


End Sub
 
IO got the address. I'm not sue if I can get the GPS address from here. the
zip code is passed to an activex utility and I don't think that is available
at this URL. To get the URL there is probably a Google utility that will do
thhis. I would have to look for it. Never tried it before.

Sub GetAddress()


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.ymca.net/maps/profile.aspx?assn=6757"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4 Or _
IE.Busy = True
DoEvents
Loop
Call Dump(IE)


a = 1
Set Title = IE.document.getElementsByTagName("Title")
YName = Title.Item(0).innertext
MsgBox (YName)
Set P = IE.document.getElementsByTagName("p")
Address = P.Item(1).innertext
MsgBox (Address)

IE.Quit


End Sub
Sub Dump(IE)
RowCount = 1
Cells.ClearContents
For Each itm In IE.document.all
Range("A" & RowCount) = itm.tagName
Range("B" & RowCount) = itm.ID
Range("C" & RowCount) = itm.className
Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm

End Sub
 
I need to get the source code of a web page and asign it to a string, where I
can search out the specific data I'm needing from the page.  (a web query of
the page gives me most of the data I need, but it doesn't give me the google
map on that page, and the gps coordinates which I need in my sheet)  IfI
open the page in IE, view the source, I can programmatically filter out the
coordinates.  
So, my question:  How to get the source saved to a string in vba
thanks.

If you are repetitively interrogating a website, then capturing the
source code using the "GET" construction shown below can be used to
ones advantage. This method assigns the source code to a variable
(my_var in the construction shown below). In a subsequent step the
variable can be parsed for the desired data. This method is typically
much faster than opening IE and extracting the same information. I
have a macro that collects data on over 700 real estate addresses
found at a website. The macro takes over an hour when I run it
through IE, but only 14 minutes when I use the GET method...Ron

Sub Test()
my_url = "http://www.google.com"
Set my_obj = CreateObject("MSXML2.XMLHTTP")
my_obj.Open "GET", my_url, False
my_obj.send
my_var = my_obj.responsetext
Set my_obj = Nothing
End Sub
 
I now got the GPS locations. It was buried in the webpage at a spot I didn't
check.

Sub GetAddress()

Dim Latitude As String
Dim Longitude As String


Set IE = CreateObject("InternetExplorer.Application")
IE.Visible = True

URL = "http://www.ymca.net/maps/profile.aspx?assn=6757"

'get web page
IE.Navigate2 URL
Do While IE.readyState <> 4 Or _
IE.Busy = True
DoEvents
Loop
'Call Dump(IE)



Set Title = IE.document.getElementsByTagName("Title")
YName = Title.Item(0).innertext
MsgBox (YName)


Set P = IE.document.getElementsByTagName("p")
Address = P.Item(1).innertext
MsgBox (Address)

Set Scripts = IE.document.getElementsByTagName("Script")
RowCount = 1
For Each Script In Scripts
If InStr(Script.outerHTML, "GLatLng") > 0 Then
GLatLngStart = InStr(Script.outerHTML, "GLatLng")
GLatLng = Mid(Script.outerHTML, GLatLngStart)
'remove parenthisis
GLatLngStart = InStr(GLatLng, "(")
GLatLng = Mid(GLatLng, GLatLngStart + 1)

GLatLngEnd = InStr(GLatLng, ")")
GLatLng = Left(GLatLng, GLatLngEnd - 1)

GPS = Split(GLatLng, ",")
MsgBox ("Latitude : " & GPS(0) & vbCrLf & "Longitude : " & GPS(1))

Exit For

End If
RowCount = RowCount + 1
Next Script

IE.Quit


End Sub
Sub Dump(IE)
RowCount = 1
Cells.ClearContents
For Each itm In IE.document.all
Range("A" & RowCount) = itm.tagName
Range("B" & RowCount) = itm.ID
Range("C" & RowCount) = itm.className
Range("D" & RowCount) = Left(itm.innertext, 1024)

RowCount = RowCount + 1
Next itm

End Sub
 
Back
Top