Good morning!
I have been working on parsing data from a website into a VBA array from Excel and I am absolutely stuck.
Objective: Parse out specific data located between certain html tags from a website and load the parsed data into a VBA array
Here is a sample copy of the html code from the website (website address for the data below is: http://cosprings.craigslist.org/sear...tor&catAbb=sss)
<blockquote id="toc_rows">
<div class="sortby">
sort by:
<strong>most recent</strong>
<a href="/search/?sort=rel&areaID=210&catAbb=sss&query=tractor">best match</a>
<a href="/search/?sort=priceasc&areaID=210&catAbb=sss&query=tractor">low price</a>
<a href="/search/?sort=pricedsc&areaID=210&catAbb=sss&query=tractor">high price</a>
</div>
<h4 class="ban resultshdr">
<span class="nplink prev">« Prev</span>
<span class="nplink next">
<a href="http://cosprings.craigslist.org/search/?areaID=210&catAbb=sss&query=tractor&s=100">Next »</a>
</span>
Found: 192 Displaying: 1 - 100
<br>
<span class="pagelinks">
<b>1</b>
|
<a href="http://cosprings.craigslist.org/search/?areaID=210&catAbb=sss&query=tractor&s=100">2</a>
</span>
</h4>
class="srch row" data-pid="3647539519" data-longitude="-105.076448916774" data-latitude="40.611477699806">
<span class="i" data-id="3Ee3K73Mc5I85N75E2d2rd1b1b16a5f981f75.jpg"> </span>
<span class="pl">
<span class="itemdate">Mar 29</span>
<a href="http://cosprings.craigslist.org/grq/3647539519.html">2012 TYM T723 CAB Tractor 74HP With Loader & Bucket</a>
</span>
<span class="itempnr">
<span class="itempp"> $40532</span>
<font size="-1"> (Fort Collins, CO)</font>
</span>
<small class="gc">
<a href="/grq/">farm & garden - by dealer</a>
</small>
<span class="itempx">
<span class="p">
pic
<span class="maptag" data-pid="3647539519">map</span>
</span>
</span>
<br class="c">
</p>
...code continues on in a repeating pattern of class "srch row"
What I need loaded into the VBA array is in BOLD RED. So from the example code above, I would need the array to have the elements:
Mar 29
http://cosprings.craigslist.org/grq/3647539519.html
2012 TYM T723 CAB Tractor 74HP With Loader & Bucket
$40532
(Fort Collins, CO)
farm & garden - by dealer
For the data contained within every class "srch row" in the HTML, I need the one row (each separate line of data above should be put into a separate column of the array, on the same row) of the VBA array to be filled with all of the data in the example above and then move on to the next array row entry (for each unique data set contained within the HTML class "srch row") until there are no more "srch row" data in the HTML code.
I hope that makes sense - I have tried to make my way through the coding and I cannot get it to work right. Here is the code I have now:
With CreateObject("MSXML2.XMLHTTP")
.Open "GET", "http://cosprings.craigslist.org/search/?areaID=210&subAreaID=&query=tractor&catAbb=sss"
.send
Do: DoEvents: Loop Until .readystate = 4
Set htmlFile = New HTMLDocument
htmlFile.body.innerHTML = .responseText
.abort
End With
With htmlFile
rowCount = .getElementsByClassName("srch row").Length
On Error Resume Next
ReDim data(1 To rowCount, 1 To 5)
On Error Resume Next 'in case the search produces 0 results
x = 1
For Each orow In .getElementsByClassName("srch row")
data(x, 1) = orow.Children(0).innerText
data(x, 2) = orow.Children(1).href
data(x, 3) = orow.Children(2).innerText
data(x, 4) = orow.Children(3).innerText
data(x, 5) = orow.Children(3).innerText
x = x + 1
Next orow
End With
I am thinking that the above code is not working because the class "srch row" is nested within the HTML code?
Thank you in advance for the help!
Bookmarks