+ Reply to Thread
Results 1 to 6 of 6

Extracting numbers related to certain keywords from alphanumeric strings

Hybrid View

  1. #1
    Registered User
    Join Date
    03-05-2009
    Location
    London, England
    MS-Off Ver
    Excel 2007
    Posts
    4

    Extracting numbers related to certain keywords from alphanumeric strings

    Hi,

    I'm a postgraduate student at Durham University in the United Kingdom and my research is on media coverage and humanitarian assistance of Natural Hazards and is heavily Excel-based but myself and my supervisors seem to have come to a brick wall and no one in my whole department know what to do.

    We are looking for a function or formula that can return a number from an alphanumeric string that relates to a keyword in that string.

    I shall explain...

    We have one column that looks like:

    7 Vietnamese women die in quarry collapse--officials - Inquirer.net
    1 dead, 1 missing as avalanches hit stormy California - San Diego Union Tribune
    1 person dead, 1 missing skier found in SoCal avalanche - San Jose Mercury News
    Landslide kills 7 on Indonesia's Java - Reuters UK

    And we need a column that shows only the number which relates to fatalities mentioned in each title.

    I have managed to find a way of extracting the numbers from the alphanumeric strings but we have problems with titles like this:

    Mingo National Wildlife Refuge Going Through "100 Year Flood" - KFVS

    So we are looking for a way to extract numbers from the strings that contain keywords like "dead" "deaths" "fatalities" etc.

    We are also looking for a way to extract the last part of the above strings:

    - KFVS
    - Reuters UK
    - San Jose Mercury News

    and then automatically assign abbreviated codes and country names to the news agency specified.

    The final database should look something like this:

    "Landslide kills 7 on Indonesia's Java - Reuters UK" - Reuters UK - REU - UK - 7

    No one in my entire university can figure this out.

    Can you please help????

  2. #2
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Extracting numbers related to certain keywords from alphanumeric strings

    What formula did you use to extract so far that works for some and not for others like you mentioned?
    Where there is a will there are many ways.

    If you are happy with the results, please add to the contributor's reputation by clicking the reputation icon (star icon) below left corner

    Please also mark the thread as Solved once it is solved. Check the FAQ's to see how.

  3. #3
    Registered User
    Join Date
    03-05-2009
    Location
    London, England
    MS-Off Ver
    Excel 2007
    Posts
    4

    Re: Extracting numbers related to certain keywords from alphanumeric strings

    Hi NBVC,

    I have used a UDF from this site -

    http://www.ozgrid.com/VBA/ExtractNum.htm

    and then used -

    =ExtractNumber(F3,TRUE)

    It does return all numbers in the column but we don't want all the numbers, we just want numbers from the cells that include the words "dead" "deaths" "fatalities" "injuries" etc.

  4. #4
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Extracting numbers related to certain keywords from alphanumeric strings

    Yeah, that won't be easy... you could identify which strings have the key words, but sometimes the number is before, sometimes after, sometimes at the beginning of the string, other times in the middle somewhere... there seems to be no defined pattern...

    Is there any sort of consistency when any of those keywords appear (e.g. everytime kill appears, the number will immediately follow, or could there be for example, "7 killed" or "kills 7", etc...)

    Not sure this can be efficiently done..

  5. #5
    Registered User
    Join Date
    03-05-2009
    Location
    London, England
    MS-Off Ver
    Excel 2007
    Posts
    4

    Re: Extracting numbers related to certain keywords from alphanumeric strings

    There's no strict pattern but most titles use the format "7 dead" or "7 deaths".

    Occasionaly there will be a "7 feared dead".

    If we can get most of the results automatically I can always text filter and find the occasional "7 feared dead" manually.

    Any help is much appreciated.

  6. #6
    Forum Expert NBVC's Avatar
    Join Date
    12-06-2006
    Location
    Mississauga, CANADA
    MS-Off Ver
    2003:2010
    Posts
    34,898

    Re: Extracting numbers related to certain keywords from alphanumeric strings

    Ok,

    See attached to see if it will get you somewhere.

    What I did is create a list of key words in F1:F5.. (note, instead of die and dies, just use die... same with fatality and fatalies..just use fatal...)

    Then Formula in B1 is:

    =IF(OR(ISNUMBER(SEARCH($F$1:$F$5,A1))),ROUND(LOOKUP(9.999999999E+307,--("0"&MID(A1,MIN(SEARCH({0,1,2,3,4,5,6,7,8,9},A1&"0123456789")),ROW($1:$10)))),0),"")
    which is confirmed with CTRL+SHIFT+ENTER instead of ENTER.. that autogenerates the { } brackets... then copied down..

    Note: If you change references in the formula (for example expand F1:F5 to include more key words) then you must re-confirm with the CSE keys before copying down..

    Hope that helps a little...
    Attached Files Attached Files

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1