Hi all,

I'll try my best to describe the task I'm trying to do in Excel. Any help would be much appreciated...

I have an exported HTML bookmarks file. Each cell contains a URL, with a bunch of HTML code before and after it. I need to remove everything except the URL from each cell (i.e. delete everything before the start of the URL and everything after the end of it).

Example of a cell (URL is in quotation marks following HREF=):

<DT><A HREF="http://www.google.ca/search?q=funny+cartoons&hl=en&site=webhp&prmd=imvns&tbm=isch&tbo=u&source=univ&sa=X&ei=5DfsToP2PMLL0QGdvsCpCQ&sqi=2&ved=0CCgQsAQ&biw=1399&bih=779" ADD_DATE="1324103767">funny cartoons (images) - Google Search</A>


Desired result:

http://www.google.ca/search?q=funny+...w=1399&bih=779


I'm a novice with Excel formulas. My first thought was to somehow combine two separate formulas that would do the following, respectively:

1) Delete everything preceding "http" (this would also handle links beginning with https correctly)
2) Delete everything from "ADD_DATE" onward, including "ADD_DATE" itself plus 2 characters before it (to eliminate the quotation mark at the end of the URL and the space)
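
In case it makes the logic clearer, here is a rough sketch of those two steps in Python rather than Excel. It is only meant to illustrate what I'm after, not a finished solution. The function name extract_url and the example.com sample cell are made up for illustration, and it assumes every cell contains both "http" and "ADD_DATE" in that order.

```python
def extract_url(cell: str) -> str:
    """Return the URL from one bookmark cell, or "" if the markers are missing."""
    # Step 1: find where the URL starts (matches both http and https).
    start = cell.find("http")
    # Step 2: find "ADD_DATE", then step back 2 characters to skip the
    # closing quotation mark and the space that precede it.
    end = cell.find("ADD_DATE") - 2
    # str.find returns -1 when the text is absent, so guard against that.
    if start == -1 or end < start:
        return ""
    return cell[start:end]


# Hypothetical sample cell in the same shape as my example above.
sample = '<DT><A HREF="https://example.com/a?b=1&c=2" ADD_DATE="1324103767">title</A>'
print(extract_url(sample))
```

I'd want the Excel formula to do the equivalent: keep only the text between the start of "http" and the quotation mark just before ADD_DATE.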

It doesn't matter at all if the formula is placed in the actual cell, or in the cell next to it. I will be copying and pasting the post-formula results to a different location.

Really looking forward to your help with this - in implementing either a 2-part formula like I suggested, or perhaps a simpler, more efficient approach you can think of to achieve the same result.

Thanks in advance!

Ed