+ Reply to Thread
Results 1 to 10 of 10

scraping data from a website

Hybrid View

  1. #1
    Registered User
    Join Date
    05-02-2010
    Location
    Houston
    MS-Off Ver
    Excel 2003
    Posts
    5

    scraping data from a website

    I am trying to scrape the hourly weather data from a NOAA web page. How can I do this? I have tried to do a web query and that isn't giving me the desired result. I only need the hourly weather data table. Here is the link, and you thank you for all your help!

    http://forecast.weather.gov/MapClick....y=9&BackDay=0

  2. #2
    Forum Moderator Leith Ross's Avatar
    Join Date
    01-15-2005
    Location
    San Francisco, Ca
    MS-Off Ver
    2000, 2003, & 2010
    Posts
    23,259

    Re: scraping data from a website

    Hello redpanda,

    After you have selected your tables using the Import Data Wizard, did you click the Properties... button to bring up the External Data Range Properties dialog?

    On this dialog there is check box named Refresh every. When you check this box by clicking it, you can then set the number of minutes between refresh cycles. Did you do this?
    Sincerely,
    Leith Ross

    Remember To Do the Following....

    1. Use code tags. Place [CODE] before the first line of code and [/CODE] after the last line of code.
    2. Thank those who have helped you by clicking the Star below the post.
    3. Please mark your post [SOLVED] if it has been answered satisfactorily.


    Old Scottish Proverb...
    Luathaid gu deanamh maille! (Rushing causes delays!)

  3. #3
    Valued Forum Contributor
    Join Date
    05-21-2009
    Location
    Great Britain
    MS-Off Ver
    Excel 2003
    Posts
    550

    Re: scraping data from a website

    I think the OP means the table of hourly data, not retrieve it hourly. I tried it with Excel 2003 and a web query on the table doesn't retrieve the expected data; selecting the entire page doesn't work either. In this case you'll need VBA to do Internet Explorer automation and extract rows and cells from the appropriate HTML table:
    Public Sub Get_Weather_Data()
    
        Dim URL As String
        Dim IE As Object
        Dim HourlyTable As Object
        Dim row As Object, cell As Object
        
        URL = "http://forecast.weather.gov/MapClick.php?w0=t&w1=td&w2=hi&w3=sfcwind&w3u=1&w4=sky&w5=pop&w6=rh&w7=thunder&w8=rain&AheadHour=48&FcstType=digital&textField1=38.89500&textField2=-77.03730&site=lwx&unit=0&dd=0&bw=0&BackDay.x=44&BackDay.y=9&BackDay=0"
        
        Set IE = CreateObject("InternetExplorer.Application")
        
        With IE
            .Visible = True
            .Navigate URL
            While .Busy Or .ReadyState <> 4: DoEvents: Wend
            Set HourlyTable = .document.getElementsByTagName("TABLE")(7)
        End With
                                
        If Not HourlyTable Is Nothing Then
            Sheet1.Cells.ClearContents
            For Each row In HourlyTable.Rows
                For Each cell In row.Cells
                    Sheet1.Range("A1").Offset(row.RowIndex, cell.cellIndex).Value = cell.innerText
                Next
            Next
        Else
            MsgBox "Table index 7 doesn't exist"
        End If
                            
    End Sub

  4. #4
    Registered User
    Join Date
    05-02-2010
    Location
    Houston
    MS-Off Ver
    Excel 2003
    Posts
    5

    Re: scraping data from a website

    Thank You for the help!

  5. #5
    Forum Contributor
    Join Date
    07-13-2007
    Posts
    151

    Re: scraping data from a website

    The NOAA site you refer to offers the data in an XML format. IIt appears that the URL to navigate to the XML version of the data employs the Lat/Long coordinates of the location you're after.

    I used the location you had in your example and retrieved the XML data into a string variable using an XMLHTTP request. From there, you can parse it down to retrieve the elements you need to capture into a spreadsheet.

    Here's a rough example to get you started. Works very fast. No need to employ Internet Explorer automation.

    Public Sub getNOAA_XML()
    Dim xmlhttp As Object
    Dim myxml As String
    Dim strURL As String
    Dim strMyTimes As String, arrMyTimes()
    Dim strMyTemps As String
    Dim temps As Variant
    Dim n As Integer
    Dim regexp As Object
    Dim times As Object
    
    strURL = "http://forecast.weather.gov/MapClick.php?lat=38.89500&lon=-77.03730&FcstType=digitalDWML"
    
    Set xmlhttp = CreateObject("microsoft.xmlhttp")
    With xmlhttp
        .Open "get", strURL, False
        .send
        x = .responsetext
    End With
    
    strMyTemps = Mid(x, InStr(1, x, "<temperature type=""hourly"""))
    strMyTemps = Mid(strMyTemps, InStr(1, strMyTemps, "<value>"))
    strMyTemps = Mid(strMyTemps, 1, InStr(1, strMyTemps, "</temperature>"))
    strMyTemps = Replace(strMyTemps, "<value>", "")
    temps = Split(strMyTemps, "</value>")
    
    strMyTimes = Mid(x, InStr(1, x, "<start-valid-time>"))
    strMyTimes = Mid(strMyTimes, 1, InStr(1, strMyTimes, "</time-layout>"))
    
    Set regexp = CreateObject("vbscript.regexp")
    With regexp
        .Global = True
        .Pattern = "<start-valid-time>(.+)</start-valid-time>"
         Set times = .Execute(strMyTimes)
    End With
    
    If times.Count = UBound(temps) Then
        ReDim arrMyTimes(times.Count)
        For n = 0 To times.Count - 1
            arrMyTimes(n) = times(n).submatches(0)
        Next n
        
        With ActiveSheet
            .Range("a1").Resize(UBound(arrMyTimes), 1) = WorksheetFunction.Transpose(arrMyTimes)
            .Range("b1").Resize(UBound(temps), 1) = WorksheetFunction.Transpose(temps)
        End With
    Else
        MsgBox "error"
       
    End If
    
    Set xmlhttp = Nothing
    Set regexp = Nothing
    
    End Sub
    .

  6. #6
    Forum Contributor
    Join Date
    07-13-2007
    Posts
    151

    Re: scraping data from a website

    I felt the previous post was a bit sloppy, so I re-did the routine using the responsexml.

    Public Sub getNOAA_XML()
    Dim xmlhttp As Object
    Dim strURL As String
    Dim n As Integer
    Dim times As Object
    Dim temps As Object
    Dim arrMyData()
    
    strURL = "http://forecast.weather.gov/MapClick.php?lat=38.89500&lon=-77.03730&FcstType=digitalDWML"
    
    Set xmlhttp = CreateObject("microsoft.xmlhttp")
    With xmlhttp
        .Open "get", strURL, False
        .send
        With .responsexml
            Set times = .getElementsByTagName("start-valid-time")
            Set temps = .getElementsByTagName("temperature")(0).ChildNodes
        End With
    End With
    
    If times.Length = temps.Length Then
        ReDim arrMyData(times.Length, 1 To 2)
        For n = 0 To times.Length - 1
            arrMyData(n, 1) = times(n).Text
            arrMyData(n, 2) = temps(n).Text
        Next n
    
        ActiveSheet.Range("a1").Resize(UBound(arrMyData, 1), 2) = arrMyData
        
    Else
        MsgBox "error"
       
    End If
    
    Set xmlhttp = Nothing
    Set times = Nothing
    Set temps = Nothing
    
    End Sub
    Any feedback on parsing XML would be greatly appreciated as I'd like to learn the subtleties of it. I'm coming across more and more sties that offer XML. From the little I've used it, it's very efficient, as compared to scraping a traditional HTML page.

  7. #7
    Forum Contributor
    Join Date
    07-01-2008
    Location
    Cincinnati, OH
    Posts
    150

    Re: scraping data from a website

    Nice one ShredDude.

  8. #8
    Registered User
    Join Date
    07-03-2012
    Location
    Cleveland, OH
    MS-Off Ver
    Excel 2010
    Posts
    3

    Re: scraping data from a website

    This thread is very helpful. I have a followup question: How can I tweak the VBA so that it waits for a page to fully load?

  9. #9
    Forum Expert Cutter's Avatar
    Join Date
    05-24-2004
    Location
    Ontario,Canada
    MS-Off Ver
    Excel 2010
    Posts
    6,451

    Re: scraping data from a website

    Welcome to the Forum. Sorry, but you've inadvertently broken one of our rules. Please read over the following rule:

    Your post does not comply with Rule 2 of our Forum RULES. Don't post a question in the thread of another member -- start your own thread. If you feel it's particularly relevant, provide a link to the other thread. It makes sense to have a new thread for your question because a thread with numerous replies can be off putting & difficult to pick out relevant replies.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1