I have a list of URLs in column A, is there a command to download each single webpage to column B?
I mean downloading the webpage of the URL in A1 into B1, A2 into B2, and so forth.
Last edited by etrader; 01-28-2010 at 05:27 AM.
Come again? What do you want to see in B1? The web page? How would you like that to look? The HTML code, javascript and CSS? The page layout complete with images and Flash? In a cell?
Or do you want to strip the URL from
http://www.excelforum.com
to
www.excelforum.com
????
I mean the contents of each page (full HTML, stripped HTML, with scripts removed, whatever). I want to know whether it is possible.
It could work like importing Web Data (from the Data menu), but with two differences:
1. That feature analyzes the page's tables and inserts each table cell into a cell of the Excel spreadsheet.
2. Each page has to be handled manually.
Now I want to do this with a VBA command that extracts the content of each page (in any possible format).
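For what it's worth, a rough sketch of what such a macro could look like, assuming the URLs sit in column A of a sheet named "Sheet1" (adjust the names to your workbook). It fetches the raw HTML of each page with MSXML2.XMLHTTP and writes it to the neighbouring cell in column B, truncated to Excel's 32,767-character cell limit:

```vba
' Sketch only: fetch the raw HTML of each URL in column A into the
' cell next to it in column B. Sheet name and ranges are assumptions.
' An Excel cell holds at most 32,767 characters, so longer pages are cut off.
Sub DownloadPagesToColumnB()
    Dim http As Object
    Dim ws As Worksheet
    Dim cell As Range

    Set ws = Worksheets("Sheet1")
    Set http = CreateObject("MSXML2.XMLHTTP")

    For Each cell In ws.Range("A1", ws.Range("A" & ws.Rows.Count).End(xlUp))
        If Len(cell.Value) > 0 Then
            On Error Resume Next            ' skip URLs that fail to load
            http.Open "GET", cell.Value, False
            http.send
            If Err.Number = 0 And http.Status = 200 Then
                cell.Offset(0, 1).Value = Left$(http.responseText, 32767)
            Else
                cell.Offset(0, 1).Value = "ERROR"
            End If
            On Error GoTo 0
        End If
    Next cell
End Sub
```

This gives you the full HTML as text; stripping tags or scripts would be a further step on top of this.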
Last edited by teylyn; 01-28-2010 at 06:20 AM. Reason: removed quote
please don't quote whole posts. It's just clutter. In fact, only quote when you are referring to something particular, and then only quote the pertinent lines.
Like this:
full html, html-stripped, removed scripts, whatsoever
That's pretty broad, given what a page can contain these days. Many pages use include files, so you won't see some of the actual content in the HTML, even though you see it on the rendered page.
Also, the html of many web pages will be too much text for a single Excel cell.
As an example, the HTML for this very page you're reading right now has the CSS written directly into the page and, excluding my post, runs to over 107,000 characters. Word would need 149 pages to print it all.
You want all that in one cell?
1. Extracting data analyze the page tables and insert every cell to cell of excel spreadsheet
Web pages don't necessarily consist of tables anymore. Forget <table><tr><td> and get a grasp of CSS. You're more likely to see DIVs and SPANs in any order, and the actual order on the visible page can be quite different, since it is dictated by the CSS. So you may not be able to immediately recognise the "cells" unless you're good at CSS and can reverse-engineer the code.
If you're lucky, the page uses XML, so you can at least glean some structure.
Take a look at the underlying code for this page. Can you readily identify the pertinent parts where the actual text of the post sits?
2. Each page should be handled manually
Don't know what you mean by that. Can you explain?
I think this may be a pipe dream ....
Thanks for the good tips, teylyn.
To import a page into Excel, we go to Data > Get External Data > From Web. I call it manual because one needs to repeat this process for each URL.
I am looking for an automatic method to import webpages into Excel from a list of URLs.
When you use Data > Get External Data > From Web, the web page will be loaded into a sheet, not a cell.
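That manual import can be automated with web queries, though: each page lands on its own worksheet rather than in a cell. A minimal sketch, again assuming the URLs are in column A of a sheet named "Sheet1":

```vba
' Sketch of automating Data > Get External Data > From Web:
' each URL in column A of "Sheet1" is loaded into its own new
' worksheet via a web query. Sheet name and ranges are assumptions.
Sub ImportPagesAsWebQueries()
    Dim src As Worksheet, dest As Worksheet
    Dim cell As Range

    Set src = Worksheets("Sheet1")
    For Each cell In src.Range("A1", src.Range("A" & src.Rows.Count).End(xlUp))
        If Len(cell.Value) > 0 Then
            Set dest = Worksheets.Add(After:=Worksheets(Worksheets.Count))
            With dest.QueryTables.Add( _
                    Connection:="URL;" & cell.Value, _
                    Destination:=dest.Range("A1"))
                .WebSelectionType = xlEntirePage
                .Refresh BackgroundQuery:=False
            End With
        End If
    Next cell
End Sub
```

This mirrors what the Data menu does by hand, so you get the same table-style parsing of the page, just one sheet per URL instead of one cell.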