You may notice that each raw HTML cell typically contains duplicates of each href link, and cells that share the same source URL in column 2 repeat those links as well, so I have to remove the duplicates manually. Right now we take the output file (which may be thousands of lines long), sort it by column 2, remove the duplicate href links from all the HTML that comes from the same source URL, and then copy the source URL for each link into the final list and put it in the final worksheet for analysis. It would save multiple steps if the script also removed duplicates where column 2 is the same. I'm not sure if that is possible, but thought I would ask. Maybe it could be a separate script that I run on the results, one that looks for duplicates among rows with the same source URL in column 2. Thanks!
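For what it's worth, a separate post-processing pass like that might look roughly like the sketch below. It is only an illustration under several assumptions that are not confirmed anywhere in this thread: that the output file is a CSV, that the raw HTML sits in column 1 and the source URL in column 2, and that the links of interest are the `href` attributes of `<a>` tags. The names `HrefCollector`, `extract_hrefs`, `unique_links_by_source`, and `dedupe_file` are made up for this example.

```python
import csv
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Collects the href attribute of every <a> tag it sees."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def extract_hrefs(html):
    """Return every href found in one raw HTML cell, duplicates included."""
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.hrefs


def unique_links_by_source(rows, html_col=0, source_col=1):
    """Map each source URL to its href links, duplicates removed.

    Assumption: raw HTML is in column 1 (index 0) and the source URL is in
    column 2 (index 1). Links keep the order in which they first appear.
    """
    seen = {}  # source URL -> dict used as an ordered set of hrefs
    for row in rows:
        if len(row) <= max(html_col, source_col):
            continue  # skip short or blank rows
        links = seen.setdefault(row[source_col], {})
        for href in extract_hrefs(row[html_col]):
            links.setdefault(href, None)
    return {source: list(links) for source, links in seen.items()}


def dedupe_file(in_path, out_path):
    """Read the results CSV and write one (source URL, href) row per unique link."""
    with open(in_path, newline="", encoding="utf-8") as src:
        grouped = unique_links_by_source(csv.reader(src))
    with open(out_path, "w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for source, links in grouped.items():
            for href in links:
                writer.writerow([source, href])
```

Calling `dedupe_file("results.csv", "deduped.csv")` (both file names are placeholders) would write one row per unique link for each source URL, so the manual sort-and-remove step is no longer needed. Grouping in a dictionary means the input does not have to be sorted by column 2 first.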