I need to combine projected information and actual performance into one unified spreadsheet that can be used to analyze part's and customer's performance.

I was wondering if people could provide me tips on how to format my data to do this as efficiently as possible. I have two sources of data: 20 years of projection Excel worksheets and 20 years of Gross Margin Excel reports. Because there are so many, laid out in different formats (thanks previous employees), I have to clean and combine the data.

I am familiar with pivot tables, index/match, and vlookups, but I know there are better ways to analyze all of this, and I want to get it right at the outset.

As shown in the picture below, information from the Projection Spreadsheets includes:
-Year (of projection), Part #, Division, Year of Original Sale, Customer Name, Projected Qty, Projected Revenue

Actual Gross Margin Spreadsheet includes:
-Year (of sales), Part #, Division, Customer Name, Qty Sold, Revenue, Actual Cost, Actual GM $, Actual GM %

Sample.PNG

Part Sales are only projected out for 5 years, so that means I will have 5 "Projection" rows for each part #, and up to 20 more "Actual" rows for every year that part # had sales. Plus, there are thousands of parts, so that means hundreds of thousands of rows. So you can see why I want to get this formatted right in the first place.

The information people want from this report is:
-What parts failed to launch (where projected sales was way higher than actual sales)
-What parts went obsolete quickly (projected five years of sales, only sold for one year)
-Analyze what customers tend to over-estimate how much they will buy

Should I split the data into two tables? Keep going as is? Or am I entirely on the wrong track?

Thanks in advance for your help!