+ Reply to Thread
Results 1 to 6 of 6

Tidying up addresses database - removing redundant variant records

  1. #1
    Registered User
    Join Date
    07-08-2013
    Location
    Malta
    MS-Off Ver
    Excel 2007
    Posts
    7

    Unhappy Tidying up addresses database - removing redundant variant records

    Good morning/ afternoon

    I know that the following could be more tidy. Forgive my transgression in this regards and any assistance you can provide is greatly appreciated.

    In short I have about 25,000 entries of addresses to trawl through (again - already did this manually 6 months ago), tidying any redundant variants out - and am looking to save myself a lot of manual trawling

    Sample of Excel Data Attached - pre and post-processing:Sample.xls

    /\/\/\/\/\


    What I am personally hoping for is advice and perhaps snippets of code that could:

    a) Firstly determine the 'alphas'

    (start pseudo code)

    If current record in list is identified as an official record (H),
    Identify the closest match to current official record's Street (D) (and Prefix (C)?)
    where the Locality (E) is the same,
    and assign the same official database value (G) to the same column of the chosen record (G).

    (end pseudo code)

    b) and... once the 'alphas' are determined: (and optionally with the removal of 'officials' from the picture'):

    (start pseudo code)

    If current record in list is identified as an 'alpha' record (G)
    Identify all records of adequately similar match to current record's Street (D) (and Prefix (C)?)
    Where the Locality (E) is the same,
    and assign the same official value of the alpha (B) to column (F) of the variant (to facilitate database update)

    (end pseudo code)

    SOS?
    Last edited by Avestron; 07-09-2013 at 02:54 AM.

  2. #2
    Registered User
    Join Date
    07-08-2013
    Location
    Malta
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Tidying up addresses database - removing redundant variant records

    Well I am a little discouraged by the lack of response... but I am hoping its because people are too busy rather than because what I am asking for is not within the normal capabilities of Excel.

    If the later is the case then I would be happy to hear opinions about what my next step should be (even if that is to start trudging through the data manually).

    I have also modified the origional post to be more reader-friendly - hope it helps.

  3. #3
    Administrator FDibbins's Avatar
    Join Date
    12-29-2011
    Location
    Duncansville, PA USA
    MS-Off Ver
    Excel 7/10/13/16/365 (PC ver 2310)
    Posts
    53,048

    Re: Tidying up addresses database - removing redundant variant records

    Hi and welcome to the forum

    Just a reminder that ALL members here are all volunteers, no-one gets paid for anything (except on the commercial forum), we all contibute our knowledge free of charge, as and when we can.

    It could be that nobody has seen this post (I hadnt, until you bumped it), so I would suggest a little more patience, Im sure you will get the answer you want soon

    Also, to assist in helping you, it might be better if you included a sample workbook, showing the data you are working with (no sensitive data), what your expected outcome is, and how you arrived atthat.
    (your 1st thread is quite a mouthful, so maybe that put off some members)
    1. Use code tags for VBA. [code] Your Code [/code] (or use the # button)
    2. If your question is resolved, mark it SOLVED using the thread tools
    3. Click on the star if you think someone helped you

    Regards
    Ford

  4. #4
    Registered User
    Join Date
    07-08-2013
    Location
    Malta
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Tidying up addresses database - removing redundant variant records

    Thank you kindly for the welcome and also your advice. I 'was' worried that the OP was too wordy - Will try to fix (and include a sample workbook - there is little sensitive information as such)).

    I can certainly appreciate that nobody owes me a response. This is certainly not the way I wished it to sound and I apologize to those who took it that way.

    /\/\/\/\/\

    For the sake of freeing up the original post of unnnecessary background information - I am relocating it here for those who find any part of it useful for gaining a better grasp of the issue.

    Background:

    Well it seems that I've inherited a bit of a hefty problem - a database with a good 25,000-odd addresses that I need to tidy up.

    How did it get so untidy? All users could create new street entries (rather than actually search for them).
    The result: Numerous variants of the same address.

    Now that I've finally sorted out the how (choice from list - no longer adding redundant records to the list) I still have somewhat of a monster task to tidy these up - a good couple of months work do do it manually.

    I cannot help but think that there is a much better way to go about it though and this is where I ask for your advice.

    Databases:

    To help work on this matter I have combined two databases - one featuring our own data and the other featuring the official list of addresses. I am using the later list to determine an 'official' match amongst the multiple instances of variants of this address and then I am associating the variants with this 'alpha' address.

    The intention is to replace the street data of the variants with that of the 'alpha', reducing or eliminating redundancy and very much tidying up the database to make any further manual tidying a lot more managable.

    The reason I cannot simply ditch the first for the later is because clients are associated with the former - and the links binding person and address would be lost.

    Columns of data:

    (Column) - Purpose

    (A) Not in Use
    (B) Unique database address (integer);
    (C) Street Prefix (String);
    (D) Street Name (String);
    (E) Locality (String);
    (F) Replace Column 'A' With this Value if not (Integer);
    (G) Address reference for 'official' database - also assigned to 'alpha' record in database (integer);
    (H) Identifier of 'Official' Database (So as not to confuse 'official' with 'alpha') (String);

    -



    /\/\/\/\/\

    Data Sample:

    Please Login or Register  to view this content.
    (side-note, 'Triq' is the local word for Street (we use it prior to the name rather than afterwards, as in the English language))
    Last edited by Avestron; 07-09-2013 at 02:54 AM.

  5. #5
    Registered User
    Join Date
    07-08-2013
    Location
    Malta
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Tidying up addresses database - removing redundant variant records

    Good morning.

    I am also giving this some thought and am posting this for any available feedback.

    If I use the data as a source sheet and create a filter sheet then I might be able to 'at least' facilitate the manual trawling through the data, should this be necessary.

    Each cell in the filter sheet would tie into the conditions of one or more key fields and other cells on the row would appear conditionally to the key cell conditions being true.

    To put this in code I'd expect something along the following for the key cell:

    =(if (Source!E9 = "*Filter!E1*", Source!E9, ""))

    The only thing is that "*Filter!E1*" is to refer to cells containing the contents of that cell (example *SAN XMUN*)...

  6. #6
    Registered User
    Join Date
    07-08-2013
    Location
    Malta
    MS-Off Ver
    Excel 2007
    Posts
    7

    Re: Tidying up addresses database - removing redundant variant records

    Kindly note that the last post of this thread was mainly solved within another thread on formulae.

    This will certainly assist me in manually tackling street names containing a particular variable term in common with each other.

    I am currently thinking of applying this to the example excel sheet posted in the original post of this thread.

    I will re-separate the 'official' (CDR) entries from the live database entries.

    Then I will use the CDR entry itself as a search criteria...

    In the event of a perfect match I highlight the match for manual entry of 'official' code (to indicate them as 'alpha')

    In the event of a partial match I display them - and consider them as candidates for entry of the 'alpha's' database ID - I may set up to three likely permutations manually each search - to increase results...

    Yes I do get the feeling that I am missing something

+ Reply to Thread

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts

Search Engine Friendly URLs by vBSEO 3.6.0 RC 1