+ Reply to Thread
Results 1 to 3 of 3

clean data - Removing Unwanted characters

  1. #1
    Forum Contributor
    Join Date
    02-19-2007
    Location
    Poole England
    MS-Off Ver
    2010
    Posts
    117

    clean data - Removing Unwanted characters

    Hello All.
    I have a problem :-(
    I have an excel sheet with approx 30,000 rows of data.
    These are all keyword phrases.
    They can be related to any subject, but for this example these are related to the root keyword phrase of "car Rent"

    I have some software which basically pulls in keyword phrases from search engines and meta tags etc, including misspelled keyword phrases.

    My problem;
    The data is uncleaned.
    In other words there might be I think it's called "Carriage return" data in there, so the row of data might be very deep (Instead of a row height of say 10.5 it could be anything, IE some could be 100 or 200 even).
    There are unwanted characters, for example; ()[]{}+?!""^*

    (If it could delete all unwanted characters except for letters/digits)
    There is a problem I see, that if it removes _ or - between words, that it will join the words together which won't be of any use. If it deletes anything with a letter either side of it or a letter and digit, or 2 digits, 1 either side it would then need to add a space to replace the hyphen.

    So for example; if there was a phrase in the list like
    car_for rent
    if it just removed the underscore, then the phrase would be
    carfor rent
    Which isn't correct. It would need to replace the underscore with a space.
    I hope I'm making sense here:-)
    So basically I'd love to have if possible a macro button that runs through my entire column of data,
    (Always in Column A , on a sheet called "AllKWs", and always starting from row3 downwards.)

    If it could go through the list and delete all unwanted characters including double spaces.
    So the end result is a keyword phrase list without a lot of junk basically.
    After it's gone through the list I suppose it needs to then look at what's left and delete any duplicate phrases last (As once some of these unwanted characters are removed, the keyword list may have duplicates).

    Once all this is completed, can a pop up window appear saying something like;
    =======================================

    Starting No. Phrases: 29,745
    Finishing No. Phrases: 29,722
    No.Deleted Characters: 12,345
    No.Deleted Carriage Returns: 234
    No.Deleted Spaces: 235
    No.Deleted Duplicates: 23

    Time Elapsed: 7.78seconds
    ======================================
    I think that's about it:-)
    Not asking for much am I :-)
    I really hope someone can help me out on this 1.
    I can't write this for sure.
    Out of my league I'm afraid:-(
    I hope it is possible as this would be very very useful for me.
    Maybe it isn't possible as it is quite complicated.
    If someone can have a look at it for me and have a go that would be brilliant

    Thanks for your time.
    Many Thanks
    John Caines

  2. #2
    Forum Moderator davesexcel's Avatar
    Join Date
    02-19-2006
    Location
    Regina
    MS-Off Ver
    MS 365
    Posts
    13,493
    you could use find and replace
    select the range
    hit Ctrl F
    enter the item to find
    the click replace enter a space to replace

    then replace all

    use the macro recorder if you want to do this automatice will look something like this



    Sub Macro2()
    Columns("A:A").Select
    Selection.Replace What:="(", Replacement:=" ", LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False
    Selection.Replace What:=")", Replacement:=" ", LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False
    Selection.Replace What:="[", Replacement:=" ", LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False
    Selection.Replace What:="]", Replacement:=" ", LookAt:=xlPart, _
    SearchOrder:=xlByRows, MatchCase:=False


    there could be a problem that others may have a solution to...
    there is a possibility when you search for "?" or "*"
    it will replace the entire cell

    hopefully somebody has a solution for that

  3. #3
    Forum Moderator davesexcel's Avatar
    Join Date
    02-19-2006
    Location
    Regina
    MS-Off Ver
    MS 365
    Posts
    13,493
    Entire code

    Please Login or Register  to view this content.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1