I am trying to create a worksheet where the end user can manually type in (or paste) company names in one column and then have excel look up the typed name and return the closest matched company name from the database tab (in the same worksheet). The database tab is being pulled from an internal database and is simply just values.
The main problem is that due to human input (such as abbreviations or punctuation), VLOOKUP is not robust enough for me to return accurate values.
For example, if the master list has " Coca-Cola " and the user types in " Coca Cola " (without the dash), the value returned might be something else. Other common issues arise with companies having "Inc" and "Corp"/ "Co." at the end.
I was thinking of creating a macro that got rid of these miscellaneous addendums to the names. (Any suggestions on how to do that nicely?)
Next, I would concatenate the whole word together in one word and try to see if at least 50% or more of the consecutive letters match with any of the names in the master database column.
*Any idea on how to go about trying to see if more than 50% of the consecutive characters in one word match with any of the words in the master database column? *
There is also the problem of abbreviations but I am not sure how to handle that. (If someone types in GE instead of General Electric.) Since the database has 1000s of names, I cannot think of every permutation.
Finally with the returned cleaner name, I would then use vlookup to retrieve relevant data from the database columns (such as revenue).
I am open to using some other programming language as long as it can easily be tied back to excel interface. I have some experience in JAVA and C.
Thanks in advance. Any and all help would be appreciated. This would be a great holiday gift! Happy holidays and a happy new year to all.
Bookmarks