Hello all,
In a previous post, I was explaining a dataset that I need to do a "basket analysis" on. The dataset is 40,000+ Rows big. And after a few days of thought, here is my thoughts on a solution.
I have a dataset that currently looks like this:
Product Type Unique ID <<3 Rows Not Needed>> Product Category Rank/Sort Apples 145 Fruit 1 Oranges 145 Fruit 2 Celery 122 Veggie 1 Bananas 122 Fruit 2 Lettuce 167 Veggie 1 Apples 555 Fruit 1 Celery 555 Veggie 2 Bannanas 555 Fruit 3 Lettuce 555 Veggie 4
To do the analysis that I want, what I want to do is transpose the data into a format like this:
Rank/Sort>> 1 2 3 4 122 Celery Banana 145 Apples Orages 167 Lettuce 555 Apple Celery Bannana Lettuce
So that I can then easily create an array based on this information for further analysis, and have two columns:
122 {Celery,Banana} 145 {Apples,Oranges} 167 {Lettuce} 555 {Apple, Celery,Bannana, Lettuce}
With the data in arrays, I can then find out patterns/compare/analyze based on who ordered Apples and Lettuce and in what order. And who ordered Just Oranges without lettuce, etc...
Ideally this would be done with formulas, as the data set will be reduced and expanded in the future... but its nothing to run a macro after it is cleaned.
I am at my wits end with this macro, and any help is much appreciated! Thanks!
Bookmarks