+ Reply to Thread
Results 1 to 10 of 10

Cleaning Up Data with Multiple Variation

  1. #1
    Registered User
    Join Date
    03-19-2012
    Location
    kl
    MS-Off Ver
    Excel 2007
    Posts
    29

    Cleaning Up Data with Multiple Variation

    Hi Guys,
    Really appreciate your help on the issue that I am facing when working with my dataset.

    There are variations of my current data in one column (as below). And I need to segregate the column between the Title (AM1) with the Role (Role AAAA etc). Tried using the text to column function, but due to the variation of entries formatting and Title convention - can't do it.

    AM 1 ROLE AAAA
    AM 2 ROLE AAA
    AM 2 ROLE B
    AM 2 ROLE B
    AM 2, ROLE BBBBBB
    AM 2-ROLE BBBB
    AM 2,ROLE BB
    AM 2 - ROLE

    The outcome intended from above is as below (in 2 different column)

    Title Role
    AM1 ROLE AAAA
    AM2 ROLE AAA
    AM2 ROLE B
    AM2 ROLE B
    AM2 ROLE BBBBBB
    AM2 ROLE BBBB
    AM2 ROLE BB
    AM2 ROLE
    ACTING HEAD ROLE A
    ACTING HEAD ROLE A
    ACTING HEAD ROLE B
    ACTING HEAD ROLE B
    ACTING HEAD ROLE B
    ACTING HEAD ROLE B
    ACTING HEAD ROLE B
    ACTING HEAD ROLE
    SVP ROLE A
    SVP ROLE A
    SVP ROLE B
    SVP ROLE B
    SVP ROLE B
    SVP ROLE B
    SVP ROLE B
    SVP ROLE

    Thanks in advance for your help.

    Cheers
    Bert.
    Attached Files Attached Files

  2. #2
    Forum Expert
    Join Date
    08-12-2012
    Location
    Sydney, Australia
    MS-Off Ver
    Excel 2010
    Posts
    5,636

    Re: Cleaning Up Data with Multiple Variation

    put formula in C3 and fill down

    Formula: copy to clipboard
    Please Login or Register  to view this content.
    If you are satisfied with the solution(s) provided, please mark your thread as Solved.
    Select Thread Tools-> Mark thread as Solved. To undo, select Thread Tools-> Mark thread as Unsolved.

  3. #3
    Registered User
    Join Date
    03-19-2012
    Location
    kl
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Cleaning Up Data with Multiple Variation

    it works if all role start with "role" ...
    what if we have different naming convention for the role?

    ACTING HEAD HR
    ACTING HEAD IT
    ACTING HEAD COMPLIANCE
    ACTING HEAD FINANCE
    ACTING HEAD, OPS
    ACTING HEAD-SALES
    ACTING HEAD,BRANCH
    ACTING HEAD - FINANCE SOLUTIONS

  4. #4
    Forum Expert
    Join Date
    08-12-2012
    Location
    Sydney, Australia
    MS-Off Ver
    Excel 2010
    Posts
    5,636

    Re: Cleaning Up Data with Multiple Variation

    then update your example file accordingly to eliminate me guessing

  5. #5
    Forum Expert
    Join Date
    08-12-2012
    Location
    Sydney, Australia
    MS-Off Ver
    Excel 2010
    Posts
    5,636

    Re: Cleaning Up Data with Multiple Variation

    something like thread should be what your after
    http://www.excelforum.com/excel-form...ml#post4309348

  6. #6
    Registered User
    Join Date
    03-19-2012
    Location
    kl
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Cleaning Up Data with Multiple Variation

    ok have updated the dataset to reflect the situation.
    the link that you provided were useful, but didn't address my issue. (thanks for giving the link).
    Attached Files Attached Files

  7. #7
    Forum Expert
    Join Date
    08-12-2012
    Location
    Sydney, Australia
    MS-Off Ver
    Excel 2010
    Posts
    5,636

    Re: Cleaning Up Data with Multiple Variation

    changed your example to what was mentioned in the other post

    even me reading your list i do not know all the desired outcomes without making some assumptions.....
    remember the computer can only apply rules you put in place it does not make assumptions for you
    Attached Files Attached Files

  8. #8
    Registered User
    Join Date
    03-19-2012
    Location
    kl
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Cleaning Up Data with Multiple Variation

    thanks, i think i can use this as stop gap measure. it is just that a lot of works needed to be done to establish the "directory" as per your solution.

    my apologies for not being clear/descriptive enough.

    if you look at my example, there a few ways of writing the position title - position title is a combination of grade (AM2, etc) with title (projct admin etc). In my example, there are few scenarios - space in between AM and the number e.g. AM 2; no space in between title e.g. AM1; comma in between title and "outcome needed" e.g. AM1, Project Admin. The data was generated from the system which has many stakeholder i.e. different ways of writing the position title for employees.

    my requirement/outcome needed:
    1. how do i omit the different variation of "grade" in the position title and take only the position title? your example works, but we are talking at 15-20K entries, which i think would be hard to keep on updating the "directory"

    thanks for your helps.

  9. #9
    Forum Expert
    Join Date
    08-12-2012
    Location
    Sydney, Australia
    MS-Off Ver
    Excel 2010
    Posts
    5,636

    Re: Cleaning Up Data with Multiple Variation

    i think you already know what the problem with this
    The data was generated from the system which has many stakeholder i.e. different ways of writing the position title for employees.
    letting users free form text without any structure leads to you having to "dictionary" plausible outcomes
    with no set way of entering data people are left to do whatever they want for position

    essentially what you are after is some smart parsing. this generally requires some sort of logic
    however it eludes me what logic can be applied to your data set as there is little to no framework on what can be entered and what defines a "grade"

    again some sort of "directory" or dictionary to parse data would be required
    http://www.datasciencecentral.com/pr...-big-data-sets
    (the programming language used in this article is not excel but the principle is still the same)

    you may not go with the type of dictionary i had in my previous example
    you can go with "what to exclude" ie Grade
    list of every variation you can think of for manager, mgr, assistant mgr, am, am1, am 1, etc


    ps you probably should spell check the whole list first
    ASISTANT MANAGER MARKETING
    ASSISTANT BRANCH MANAGER

    additional reading on techniques you can use
    https://support.office.com/en-us/art...f-9cfc08a0a4e3

    this is an exhaustive tasks that happens quite often
    more often then not, people just manually brute force it once and use that as a master file
    the more you can group things into manageable chunks the better/easier it becomes
    Last edited by humdingaling; 02-09-2016 at 10:19 PM.

  10. #10
    Registered User
    Join Date
    03-19-2012
    Location
    kl
    MS-Off Ver
    Excel 2007
    Posts
    29

    Re: Cleaning Up Data with Multiple Variation

    thanks a lot humdingaling!

    the link on data directory/dictionary is really useful. it might be pain the back, but i guess i may have to resort to this (because i can't control the entry point to the system). not the direct solution to my problem, but it was a useful one. and i learn more than just a solution from the references/sharing that you put up in this thread.

    will consider this thread as solved.

    thank you for your help. really appreciate it.

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Replies: 1
    Last Post: 11-18-2014, 10:36 AM
  2. Cleaning Multiple Sheets Across Time.
    By Insert Name in forum Excel Programming / VBA / Macros
    Replies: 0
    Last Post: 09-29-2014, 11:08 PM
  3. Replies: 6
    Last Post: 02-08-2014, 05:01 AM
  4. [SOLVED] Cleaning and Consolidating Multiple Worksheets
    By VKS in forum Excel Programming / VBA / Macros
    Replies: 11
    Last Post: 12-31-2012, 11:01 AM
  5. Cleaning and sorting data to multiple worksheets
    By rmf17 in forum Excel Programming / VBA / Macros
    Replies: 2
    Last Post: 02-23-2012, 08:10 AM
  6. Increase the data variation range
    By jdbaba in forum Excel General
    Replies: 2
    Last Post: 11-04-2011, 01:18 PM
  7. Data validation variation
    By RyanNIF in forum Excel General
    Replies: 1
    Last Post: 08-03-2010, 01:24 PM

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1