+ Reply to Thread
Results 1 to 10 of 10

A Difficult Unconcatinate Problem

Hybrid View

Guest A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest Re: A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest Re: A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest RE: A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest RE: A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest Re: A Difficult Unconcatinate... 09-13-2005, 11:05 AM
Guest Re: A Difficult Unconcatinate... 01-24-2006, 03:30 PM
Guest Re: A Difficult Unconcatinate... 01-24-2006, 07:15 PM
Guest Re: A Difficult Unconcatinate... 01-24-2006, 07:40 PM
Guest Re: A Difficult Unconcatinate... 01-24-2006, 09:15 PM
  1. #1
    RestlessAde
    Guest

    A Difficult Unconcatinate Problem

    Hi,

    I have tried to solve this problem using a mixture of text functions such as
    RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    suggestions.

    I'm trying to split out the following data stored in a single column into
    three columns: Country, Data Value and Source.

    Angola 13,294,000 Source: ibid.
    Anguilla 13,000 Source: ibid.
    Antigua and Barbuda 76,000 Source: ibid.
    Argentina 37,880,000 Source: ibid.
    Armenia 3,206,000 Source: ibid.
    Aruba 94,000 Source: ibid.
    Australia 20,125,000 Source: ibid.

    The problem I'm having relates to the fact that some countries contain more
    than one word, so I can't just search for the first " ". I think the answer
    is to somehow detect the first instance of a numeric value, but I have no
    idea how to do this.

    Thanks,
    RA

  2. #2
    Dave Peterson
    Guest

    Re: A Difficult Unconcatinate Problem

    I'd use some helper columns and formulas.

    Assumes your data is in A1:Axxx.

    In B1, put this formula:
    =MATCH(TRUE,ISNUMBER(-MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)),0)
    This is an array formula. Hit ctrl-shift-enter instead of enter. If you do it
    correctly, excel will wrap curly brackets {} around your formula. (don't type
    them yourself.)

    In C1, put this formula:
    =SEARCH("source:",A1)

    In D1, put this formula:
    =TRIM(LEFT(A1,B1-1))

    In E1:
    =--MID(A1,B1,C1-B1)

    In F1:
    =TRIM(MID(A1,C1+LEN("source:"),255))

    And select b1:F1 and drag down as far as you need.



    RestlessAde wrote:
    >
    > Hi,
    >
    > I have tried to solve this problem using a mixture of text functions such as
    > RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > suggestions.
    >
    > I'm trying to split out the following data stored in a single column into
    > three columns: Country, Data Value and Source.
    >
    > Angola 13,294,000 Source: ibid.
    > Anguilla 13,000 Source: ibid.
    > Antigua and Barbuda 76,000 Source: ibid.
    > Argentina 37,880,000 Source: ibid.
    > Armenia 3,206,000 Source: ibid.
    > Aruba 94,000 Source: ibid.
    > Australia 20,125,000 Source: ibid.
    >
    > The problem I'm having relates to the fact that some countries contain more
    > than one word, so I can't just search for the first " ". I think the answer
    > is to somehow detect the first instance of a numeric value, but I have no
    > idea how to do this.
    >
    > Thanks,
    > RA


    --

    Dave Peterson

  3. #3
    RestlessAde
    Guest

    Re: A Difficult Unconcatinate Problem

    Dave,

    That's fantastic. I don't understand how it works at the moment, although I
    shall invest some brain power into trying to understand it for future
    reference. Thanks for replying so quickly.

    Thanks also to bigwheel.

    RA

    "Dave Peterson" wrote:

    > I'd use some helper columns and formulas.
    >
    > Assumes your data is in A1:Axxx.
    >
    > In B1, put this formula:
    > =MATCH(TRUE,ISNUMBER(-MID(A1,ROW(INDIRECT("1:"&LEN(A1))),1)),0)
    > This is an array formula. Hit ctrl-shift-enter instead of enter. If you do it
    > correctly, excel will wrap curly brackets {} around your formula. (don't type
    > them yourself.)
    >
    > In C1, put this formula:
    > =SEARCH("source:",A1)
    >
    > In D1, put this formula:
    > =TRIM(LEFT(A1,B1-1))
    >
    > In E1:
    > =--MID(A1,B1,C1-B1)
    >
    > In F1:
    > =TRIM(MID(A1,C1+LEN("source:"),255))
    >
    > And select b1:F1 and drag down as far as you need.
    >
    >
    >
    > RestlessAde wrote:
    > >
    > > Hi,
    > >
    > > I have tried to solve this problem using a mixture of text functions such as
    > > RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > > suggestions.
    > >
    > > I'm trying to split out the following data stored in a single column into
    > > three columns: Country, Data Value and Source.
    > >
    > > Angola 13,294,000 Source: ibid.
    > > Anguilla 13,000 Source: ibid.
    > > Antigua and Barbuda 76,000 Source: ibid.
    > > Argentina 37,880,000 Source: ibid.
    > > Armenia 3,206,000 Source: ibid.
    > > Aruba 94,000 Source: ibid.
    > > Australia 20,125,000 Source: ibid.
    > >
    > > The problem I'm having relates to the fact that some countries contain more
    > > than one word, so I can't just search for the first " ". I think the answer
    > > is to somehow detect the first instance of a numeric value, but I have no
    > > idea how to do this.
    > >
    > > Thanks,
    > > RA

    >
    > --
    >
    > Dave Peterson
    >


  4. #4
    bigwheel
    Guest

    RE: A Difficult Unconcatinate Problem

    Well, I think I'd start with Data>Text to Columns then sort out those
    countries with more than one word afterwards (can't be too many of them) or
    you can do some checking using the CODE function to find the number part of
    the contents and use the string functions to split up the different elements

    "RestlessAde" wrote:

    > Hi,
    >
    > I have tried to solve this problem using a mixture of text functions such as
    > RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > suggestions.
    >
    > I'm trying to split out the following data stored in a single column into
    > three columns: Country, Data Value and Source.
    >
    > Angola 13,294,000 Source: ibid.
    > Anguilla 13,000 Source: ibid.
    > Antigua and Barbuda 76,000 Source: ibid.
    > Argentina 37,880,000 Source: ibid.
    > Armenia 3,206,000 Source: ibid.
    > Aruba 94,000 Source: ibid.
    > Australia 20,125,000 Source: ibid.
    >
    > The problem I'm having relates to the fact that some countries contain more
    > than one word, so I can't just search for the first " ". I think the answer
    > is to somehow detect the first instance of a numeric value, but I have no
    > idea how to do this.
    >
    > Thanks,
    > RA


  5. #5
    bj
    Guest

    RE: A Difficult Unconcatinate Problem

    try
    =MIN(IF(ISERROR(FIND({"1","2","3","4","5","6","7","8","9","0"},B2)),1000000,FIND({"1","2","3","4","5","6","7","8","9","0"},B2)))

    to find your first number

    "RestlessAde" wrote:

    > Hi,
    >
    > I have tried to solve this problem using a mixture of text functions such as
    > RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > suggestions.
    >
    > I'm trying to split out the following data stored in a single column into
    > three columns: Country, Data Value and Source.
    >
    > Angola 13,294,000 Source: ibid.
    > Anguilla 13,000 Source: ibid.
    > Antigua and Barbuda 76,000 Source: ibid.
    > Argentina 37,880,000 Source: ibid.
    > Armenia 3,206,000 Source: ibid.
    > Aruba 94,000 Source: ibid.
    > Australia 20,125,000 Source: ibid.
    >
    > The problem I'm having relates to the fact that some countries contain more
    > than one word, so I can't just search for the first " ". I think the answer
    > is to somehow detect the first instance of a numeric value, but I have no
    > idea how to do this.
    >
    > Thanks,
    > RA


  6. #6
    Ron Rosenfeld
    Guest

    Re: A Difficult Unconcatinate Problem

    On Tue, 13 Sep 2005 05:57:13 -0700, "RestlessAde"
    <RestlessAde@discussions.microsoft.com> wrote:

    >Hi,
    >
    >I have tried to solve this problem using a mixture of text functions such as
    >RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    >suggestions.
    >
    >I'm trying to split out the following data stored in a single column into
    >three columns: Country, Data Value and Source.
    >
    > Angola 13,294,000 Source: ibid.
    > Anguilla 13,000 Source: ibid.
    > Antigua and Barbuda 76,000 Source: ibid.
    > Argentina 37,880,000 Source: ibid.
    > Armenia 3,206,000 Source: ibid.
    > Aruba 94,000 Source: ibid.
    > Australia 20,125,000 Source: ibid.
    >
    >The problem I'm having relates to the fact that some countries contain more
    >than one word, so I can't just search for the first " ". I think the answer
    >is to somehow detect the first instance of a numeric value, but I have no
    >idea how to do this.
    >
    >Thanks,
    >RA


    Array formulas will do this. The first two are array formulas; the last is
    not. To enter an array formula, be sure to hold down <ctrl><shift> while
    hitting <enter>. Excel will place braces {...} around the formula.

    Assuming data is in A2.

    Country (array formula):

    =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))

    Data Value (array formula):

    =MID(A2,MATCH(FALSE,ISERROR(-MID(A2,ROW(
    INDIRECT("1:"&LEN(A2))),1)),0),-1+FIND("Source",A2)
    -MATCH(FALSE,ISERROR(-MID(A2,ROW(
    INDIRECT("1:"&LEN(A2))),1)),0))

    Source (NOT an array formula)

    =MID(A3,FIND("Source: ",A3)+8,255)


    --ron

  7. #7
    R.Douthwaite
    Guest

    Re: A Difficult Unconcatinate Problem

    "Ron Rosenfeld" wrote:

    > On Tue, 13 Sep 2005 05:57:13 -0700, "RestlessAde"
    > <RestlessAde@discussions.microsoft.com> wrote:
    >
    > >Hi,
    > >
    > >I have tried to solve this problem using a mixture of text functions such as
    > >RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > >suggestions.
    > >
    > >I'm trying to split out the following data stored in a single column into
    > >three columns: Country, Data Value and Source.
    > >
    > > Angola 13,294,000 Source: ibid.
    > > Anguilla 13,000 Source: ibid.
    > > Antigua and Barbuda 76,000 Source: ibid.
    > > Argentina 37,880,000 Source: ibid.
    > > Armenia 3,206,000 Source: ibid.
    > > Aruba 94,000 Source: ibid.
    > > Australia 20,125,000 Source: ibid.
    > >
    > >The problem I'm having relates to the fact that some countries contain more
    > >than one word, so I can't just search for the first " ". I think the answer
    > >is to somehow detect the first instance of a numeric value, but I have no
    > >idea how to do this.
    > >
    > >Thanks,
    > >RA

    >
    > Array formulas will do this. The first two are array formulas; the last is
    > not. To enter an array formula, be sure to hold down <ctrl><shift> while
    > hitting <enter>. Excel will place braces {...} around the formula.
    >
    > Assuming data is in A2.
    >
    > Country (array formula):
    >
    > =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))
    >
    > Data Value (array formula):
    >
    > =MID(A2,MATCH(FALSE,ISERROR(-MID(A2,ROW(
    > INDIRECT("1:"&LEN(A2))),1)),0),-1+FIND("Source",A2)
    > -MATCH(FALSE,ISERROR(-MID(A2,ROW(
    > INDIRECT("1:"&LEN(A2))),1)),0))
    >
    > Source (NOT an array formula)
    >
    > =MID(A3,FIND("Source: ",A3)+8,255)
    >
    >
    > --ron
    >



    Ron (or anyone else for that matter),

    The first array formula cited above has solved a very major headach I've
    been working on all day as it can be modified to give the position of the
    first number in a mixed string of numbers and letters and for that you
    deserve major kudos!

    However I hate not knowing how it is achieved! Can you explain in plain
    english how this is working?

    I guess this may help me get my head around a few other thorny issues I have
    pending and it may just help out a few others here too

    Thanks in advance

    R.Douthwaite

  8. #8
    Ron Rosenfeld
    Guest

    Re: A Difficult Unconcatinate Problem

    On Tue, 24 Jan 2006 11:27:51 -0800, R.Douthwaite
    <RDouthwaite@discussions.microsoft.com> wrote:

    >"Ron Rosenfeld" wrote:
    >
    >> On Tue, 13 Sep 2005 05:57:13 -0700, "RestlessAde"
    >> <RestlessAde@discussions.microsoft.com> wrote:
    >>
    >> >Hi,
    >> >
    >> >I have tried to solve this problem using a mixture of text functions such as
    >> >RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    >> >suggestions.
    >> >
    >> >I'm trying to split out the following data stored in a single column into
    >> >three columns: Country, Data Value and Source.
    >> >
    >> > Angola 13,294,000 Source: ibid.
    >> > Anguilla 13,000 Source: ibid.
    >> > Antigua and Barbuda 76,000 Source: ibid.
    >> > Argentina 37,880,000 Source: ibid.
    >> > Armenia 3,206,000 Source: ibid.
    >> > Aruba 94,000 Source: ibid.
    >> > Australia 20,125,000 Source: ibid.
    >> >
    >> >The problem I'm having relates to the fact that some countries contain more
    >> >than one word, so I can't just search for the first " ". I think the answer
    >> >is to somehow detect the first instance of a numeric value, but I have no
    >> >idea how to do this.
    >> >
    >> >Thanks,
    >> >RA

    >>
    >> Array formulas will do this. The first two are array formulas; the last is
    >> not. To enter an array formula, be sure to hold down <ctrl><shift> while
    >> hitting <enter>. Excel will place braces {...} around the formula.
    >>
    >> Assuming data is in A2.
    >>
    >> Country (array formula):
    >>
    >> =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))
    >>
    >> Data Value (array formula):
    >>
    >> =MID(A2,MATCH(FALSE,ISERROR(-MID(A2,ROW(
    >> INDIRECT("1:"&LEN(A2))),1)),0),-1+FIND("Source",A2)
    >> -MATCH(FALSE,ISERROR(-MID(A2,ROW(
    >> INDIRECT("1:"&LEN(A2))),1)),0))
    >>
    >> Source (NOT an array formula)
    >>
    >> =MID(A3,FIND("Source: ",A3)+8,255)
    >>
    >>
    >> --ron
    >>

    >
    >
    >Ron (or anyone else for that matter),
    >
    >The first array formula cited above has solved a very major headach I've
    >been working on all day as it can be modified to give the position of the
    >first number in a mixed string of numbers and letters and for that you
    >deserve major kudos!
    >
    >However I hate not knowing how it is achieved! Can you explain in plain
    >english how this is working?
    >
    >I guess this may help me get my head around a few other thorny issues I have
    >pending and it may just help out a few others here too
    >
    >Thanks in advance
    >
    >R.Douthwaite




    =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))

    If you separate out the various nestings:

    MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)

    This segment returns an array of each character in A2. (The ROW(INDIRECT(...)
    returns an array of "1:n" where n is the number of characters in A2.

    The next step is to put a minus sign <-> in front of each character:

    -MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)

    If the character is NOT a number, the array will return an error; otherwise it
    will return a number.

    We then Test each character position to see if there is an error:

    ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1))

    This returns an array of {TRUE,TRUE,FALSE...} depending on whether there is a
    digit.

    The MATCH function with the 0 match_type argument then finds the first FALSE
    which would be the location of the first digit.

    ================================

    Since September, however, I've become enamored of "regular expressions". One
    way of using these effectively in Excel is to download and install Longre's
    morefunc.xll add-in from http://xcell05.free.fr/

    Then, to find the position of the first number in the above, one could use the
    much simpler formula:

    =REGEX.FIND(A1,"\d")

    To extract the first number, where it might contain commas:

    =REGEX.MID(A1,"(\d+|,)+")

    Depending on your data, these can be much more flexible.


    --ron

  9. #9
    R.Douthwaite
    Guest

    Re: A Difficult Unconcatinate Problem


    --
    R.Douthwaite.
    Information Coordinator
    www.swift-research.co.uk


    "Ron Rosenfeld" wrote:

    > On Tue, 24 Jan 2006 11:27:51 -0800, R.Douthwaite
    > <RDouthwaite@discussions.microsoft.com> wrote:
    >
    > >"Ron Rosenfeld" wrote:
    > >
    > >> On Tue, 13 Sep 2005 05:57:13 -0700, "RestlessAde"
    > >> <RestlessAde@discussions.microsoft.com> wrote:
    > >>
    > >> >Hi,
    > >> >
    > >> >I have tried to solve this problem using a mixture of text functions such as
    > >> >RIGHT, LEN, LEFT, FIND etc. However, am still stuck. Would appreciate any
    > >> >suggestions.
    > >> >
    > >> >I'm trying to split out the following data stored in a single column into
    > >> >three columns: Country, Data Value and Source.
    > >> >
    > >> > Angola 13,294,000 Source: ibid.
    > >> > Anguilla 13,000 Source: ibid.
    > >> > Antigua and Barbuda 76,000 Source: ibid.
    > >> > Argentina 37,880,000 Source: ibid.
    > >> > Armenia 3,206,000 Source: ibid.
    > >> > Aruba 94,000 Source: ibid.
    > >> > Australia 20,125,000 Source: ibid.
    > >> >
    > >> >The problem I'm having relates to the fact that some countries contain more
    > >> >than one word, so I can't just search for the first " ". I think the answer
    > >> >is to somehow detect the first instance of a numeric value, but I have no
    > >> >idea how to do this.
    > >> >
    > >> >Thanks,
    > >> >RA
    > >>
    > >> Array formulas will do this. The first two are array formulas; the last is
    > >> not. To enter an array formula, be sure to hold down <ctrl><shift> while
    > >> hitting <enter>. Excel will place braces {...} around the formula.
    > >>
    > >> Assuming data is in A2.
    > >>
    > >> Country (array formula):
    > >>
    > >> =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))
    > >>
    > >> Data Value (array formula):
    > >>
    > >> =MID(A2,MATCH(FALSE,ISERROR(-MID(A2,ROW(
    > >> INDIRECT("1:"&LEN(A2))),1)),0),-1+FIND("Source",A2)
    > >> -MATCH(FALSE,ISERROR(-MID(A2,ROW(
    > >> INDIRECT("1:"&LEN(A2))),1)),0))
    > >>
    > >> Source (NOT an array formula)
    > >>
    > >> =MID(A3,FIND("Source: ",A3)+8,255)
    > >>
    > >>
    > >> --ron
    > >>

    > >
    > >
    > >Ron (or anyone else for that matter),
    > >
    > >The first array formula cited above has solved a very major headach I've
    > >been working on all day as it can be modified to give the position of the
    > >first number in a mixed string of numbers and letters and for that you
    > >deserve major kudos!
    > >
    > >However I hate not knowing how it is achieved! Can you explain in plain
    > >english how this is working?
    > >
    > >I guess this may help me get my head around a few other thorny issues I have
    > >pending and it may just help out a few others here too
    > >
    > >Thanks in advance
    > >
    > >R.Douthwaite

    >
    >
    >
    > =LEFT(A2,-1+MATCH(FALSE,ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)),0))
    >
    > If you separate out the various nestings:
    >
    > MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)
    >
    > This segment returns an array of each character in A2. (The ROW(INDIRECT(...)
    > returns an array of "1:n" where n is the number of characters in A2.
    >
    > The next step is to put a minus sign <-> in front of each character:
    >
    > -MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1)
    >
    > If the character is NOT a number, the array will return an error; otherwise it
    > will return a number.
    >
    > We then Test each character position to see if there is an error:
    >
    > ISERROR(-MID(A2,ROW(INDIRECT("1:"&LEN(A2))),1))
    >
    > This returns an array of {TRUE,TRUE,FALSE...} depending on whether there is a
    > digit.
    >
    > The MATCH function with the 0 match_type argument then finds the first FALSE
    > which would be the location of the first digit.
    >
    > ================================
    >
    > Since September, however, I've become enamored of "regular expressions". One
    > way of using these effectively in Excel is to download and install Longre's
    > morefunc.xll add-in from http://xcell05.free.fr/
    >
    > Then, to find the position of the first number in the above, one could use the
    > much simpler formula:
    >
    > =REGEX.FIND(A1,"\d")
    >
    > To extract the first number, where it might contain commas:
    >
    > =REGEX.MID(A1,"(\d+|,)+")
    >
    > Depending on your data, these can be much more flexible.
    >
    >
    > --ron
    >


    Ron,

    Thank you for the excellent explaination. I think you just opened up a whole
    new world to me... Not sure that's necessarily a good thing :-)

    R.Douthwaite

  10. #10
    Ron Rosenfeld
    Guest

    Re: A Difficult Unconcatinate Problem

    On Tue, 24 Jan 2006 15:39:02 -0800, R.Douthwaite
    <RDouthwaite@discussions.microsoft.com> wrote:

    >Ron,
    >
    >Thank you for the excellent explaination. I think you just opened up a whole
    >new world to me... Not sure that's necessarily a good thing :-)
    >
    >R.Douthwaite


    You're very welcome.
    --ron

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1