Hi All,

This is my first time posting here. I have a requirement and would appreciate your advice on how to proceed.

My Requirement:

I have to write a Duplicate Detector macro for an Excel spreadsheet that has around 25,000 rows of data. The data looks like this:


S.No  Name     Query
----  -------  -------------------------------------------------------------------
1.    Query1   Select S1.A,S2.B from DB1.Table1 S1, DB2.Table2 S2
2.    Query2   Select S2.B , S1.A from Db2.Table2 S2, Db1.Table1 S1
3.    Query3   Sel S1.B, S2.A from Db2.Table2 S1, DB1.Table1 S2
4.    Query4   Sel A,B from DB3.Table3
5.    Query5   Sel B,A from DB4.Table4
6.    Query6   Select B,A
6.    Query6   from DB3.Table3;
7.    Query7   Select A from Db.Table S3 WHERE S3.ID=100 AND S3.NAME='John'
8.    Query8   Sel A from Db.Table S1 WHERE S1.NAME='John' AND S1.ID=100
9.    Query9   SEL * FROM DB1.TBL1 A LEFT OUTER JOIN DB2.TBL2 B ON A.KEY = B.KEY AND A.NAME = B.NAME WHERE B.KEY IS NULL;
10.   Query10  select * from db2.tbl2 c left outer join db1.tbl1 d on c.key = d.key and c.name = d.name where c.key is null


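I need to remove duplicates from column C (the third column), which holds SQL queries. What counts as a duplicate is a bit tricky: any two queries that are logically similar should be considered duplicates. As the sample rows show, such queries can differ in letter case, alias names, the order of columns, tables, or WHERE conditions, line breaks, a trailing semicolon, or the use of Sel instead of Select. For example, in the list above, the following groups of queries should each be treated as duplicates: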

Queries: (1 through 3), (4 and 6), (7 and 8), (9 and 10)
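
To make "logically similar" more concrete, here is a rough sketch of the kind of comparison I have in mind. It is written in Python only because that was quick to prototype; the macro itself would need the same logic in VBA. The idea is to reduce each query to a crude fingerprint: lowercase it, treat Sel as Select, replace table aliases with the real table names, drop commas and semicolons, and sort the remaining tokens. Rows with the same fingerprint become candidates for manual review. All names here (fingerprint, find_duplicate_groups, SAMPLE_ROWS) are my own, and the alias detection is a simple guess based on my sample data, not a real SQL parser.

```python
import re
from collections import defaultdict

# Words that may follow a table name and must not be mistaken for an alias.
KEYWORDS = {
    "select", "from", "where", "and", "or", "on", "join", "left", "right",
    "inner", "outer", "full", "cross", "is", "null", "not", "as",
    "group", "order", "by", "having", "union",
}

# A quoted literal, a (possibly dotted) identifier, a number, or one punctuation character.
TOKEN_RE = re.compile(r"'[^']*'|[a-z_][\w.]*|\d+|[^\s\w]")
NAME_RE = re.compile(r"[a-z_][\w.]*")


def is_name(token):
    """True for an identifier that is not a reserved word."""
    return NAME_RE.fullmatch(token) is not None and token not in KEYWORDS


def fingerprint(sql):
    """Order-insensitive key for a query. Deliberately loose: it can over-match."""
    # Lowercasing also lowercases string literals, so 'John' and 'john' compare equal.
    tokens = TOKEN_RE.findall(sql.lower())
    tokens = ["select" if t == "sel" else t for t in tokens]

    # Assumption: "<from|join|,> <table> <alias>" declares an alias (no AS handling).
    aliases = {}
    alias_positions = set()
    for i in range(len(tokens) - 2):
        before, table, alias = tokens[i:i + 3]
        if before in ("from", "join", ",") and is_name(table) and is_name(alias):
            aliases[alias] = table
            alias_positions.add(i + 2)

    resolved = []
    for i, tok in enumerate(tokens):
        if i in alias_positions or tok in (",", ";"):
            continue  # drop alias declarations and list punctuation
        head, dot, rest = tok.partition(".")
        if dot and head in aliases:
            tok = aliases[head] + "." + rest  # s1.a -> db1.table1.a
        resolved.append(tok)
    return tuple(sorted(resolved))


def find_duplicate_groups(rows):
    """Group serial numbers whose queries share a fingerprint."""
    groups = defaultdict(list)
    for serial, sql in rows:
        groups[fingerprint(sql)].append(serial)
    return [g for g in groups.values() if len(g) > 1]


# The sample rows from above; the two spreadsheet rows of Query6 are joined into one query.
SAMPLE_ROWS = [
    (1, "Select S1.A,S2.B from DB1.Table1 S1, DB2.Table2 S2"),
    (2, "Select S2.B , S1.A from Db2.Table2 S2, Db1.Table1 S1"),
    (3, "Sel S1.B, S2.A from Db2.Table2 S1, DB1.Table1 S2"),
    (4, "Sel A,B from DB3.Table3"),
    (5, "Sel B,A from DB4.Table4"),
    (6, "Select B,A\nfrom DB3.Table3;"),
    (7, "Select A from Db.Table S3 WHERE S3.ID=100 AND S3.NAME='John'"),
    (8, "Sel A from Db.Table S1 WHERE S1.NAME='John' AND S1.ID=100"),
    (9, "SEL * FROM DB1.TBL1 A LEFT OUTER JOIN DB2.TBL2 B ON A.KEY = B.KEY "
        "AND A.NAME = B.NAME WHERE B.KEY IS NULL;"),
    (10, "select * from db2.tbl2 c left outer join db1.tbl1 d on c.key = d.key "
         "and c.name = d.name where c.key is null"),
]

if __name__ == "__main__":
    for group in find_duplicate_groups(SAMPLE_ROWS):
        print(group)
```

Because the fingerprint ignores token order completely, it can also group queries that are not really equivalent (for example, it does not care which side of a join a table is on). That is acceptable for my purpose, since the result only marks rows for a person to check.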

I know this is very difficult to do reliably in code (you cannot anticipate every possible variation, so something can always be missed), and that is why I want the following:

1. I don't want to delete a row when it is identified as a duplicate (as above). Instead, the macro should highlight it in a different color, so that after the Duplicate Detector macro has run, someone can manually check only the highlighted rows and delete those that really are duplicates.

2. I don't expect the Duplicate Detector to catch 100% of the duplicates, as I know that is practically impossible.

Can anyone suggest an approach for doing this? Any help is greatly appreciated.
Let me know if you have any questions.

Thanks,
Bharath