Generate a "similarity" matrix on non-continuous values (not numbers)?

**guillm** · 05-03-2012, 02:13 PM

Dear all,

I have this matrix of non-continuous, independent data (let's call them "a", "b", "c")

[Example data table not available]

Values are not continuous numbers or measures, but rather a label for each "Variable". Additionally, "a" in Variable 1 does not relate to "a" in Variable 2.

I would like a way to assess similarity (=shared values) between "Samples". I don't care to know which "Variable" is similar or different between the two "Samples", just the number of shared values is fine.

For the example above, we see that:
- Sample 1 has 2 shared value with Sample 3 (for Variable 2)
- Sample 2 has 0 shared values with Sample 1 for any variable
- Sample 3 has 0 shared values with Sample 2
- Each sample has 3 shared values with itself

In that example, Samples 1 and 3 are more similar to each other than either is to Sample 2 (if we exclude self-similarity).

I guess a good way of outputting this is to create a "similarity" matrix:

Shared values:

[Example similarity matrix not available]

Does it sound like something possible to do in Excel?

Thanks a lot for any help. I hope I was clear enough!

All the best,

G.

**ChemistB** · 05-03-2012, 02:53 PM

Try this, where your data table (including headers) is in H1:K5 and your comparison table is in A1:E5.

In B2:
=SUM(1*(INDEX($I$2:$K$5,MATCH(B$1,$H$2:$H$5,0),)=INDEX($I$2:$K$5,MATCH($A2,$H$2:$H$5,0),)))
entered as an array formula (Ctrl+Shift+Enter instead of Enter).
Drag across and down.

See attachment
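
For readers who prefer to avoid array entry, here is a hedged sketch of an equivalent form. It assumes the same layout as above (sample names in H2:H5, values in I2:K5, comparison headers in B1:E1 and A2:A5). Each INDEX/MATCH pair pulls the whole row of values for one sample, the `=` comparison yields TRUE/FALSE per variable, and SUMPRODUCT adds up the matches. SUMPRODUCT is swapped in here for the array-entered SUM; it is not part of the original answer.

```excel
=SUMPRODUCT(--(INDEX($I$2:$K$5,MATCH(B$1,$H$2:$H$5,0),0)=INDEX($I$2:$K$5,MATCH($A2,$H$2:$H$5,0),0)))
```

Placed in B2 and filled across and down, this should be accepted with a plain Enter. Note that Excel's `=` comparison on text is not case-sensitive, so "a" and "A" count as the same value.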

**guillm** · 05-07-2012, 03:09 AM

It works perfectly, thanks a lot!

**yomlao** · 07-09-2013, 08:36 AM

Thanks for the solution indeed !

I am now struggling to adapt/expand the formula provided by ChemistB using a condition -> something like: "if the two cells that are compared are empty, don't count them as similar"...

See the objective in attachment: ComparisonTable_v2.xlsx

Any help would be much appreciated

Cheers,

Y.

**arlu1201** · 07-09-2013, 09:09 AM

yomlao,

Unfortunately, you need to post your question in a new thread; it's against the forum rules to post a question in another user's thread. If you create your own thread, any advice will be tailored to your situation, so you should include a description of what you've done and what you are trying to do. Also, if you feel that this thread is particularly relevant to what you are trying to do, you can include a link to it in your new thread.

**guillm** · 06-18-2015, 06:36 AM

Dear all,

I previously solved my initial problem using the answer above, but I have the same related question as #4, so I am posting it here.

The formula provided in #2 works perfectly, except that empty cells are treated as identical. Like the poster in #4, I am struggling to adapt the formula so that it ignores empty cells, and I would be grateful for some help with this.

Thank you very much for your help and your time.
G.
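
One possible adaptation, offered as a sketch and not as a confirmed fix from the thread: assuming the same layout as in #2 (sample names in H2:H5, values in I2:K5, comparison headers in B1:E1 and A2:A5), the equality test can be multiplied by a "not empty" test on one of the two rows. If two cells are equal and one of them is non-empty, both are non-empty, so testing one row is enough. SUMPRODUCT is used here in place of the array-entered SUM, which is an assumption of this sketch.

```excel
=SUMPRODUCT((INDEX($I$2:$K$5,MATCH(B$1,$H$2:$H$5,0),0)=INDEX($I$2:$K$5,MATCH($A2,$H$2:$H$5,0),0))*(INDEX($I$2:$K$5,MATCH($A2,$H$2:$H$5,0),0)<>""))
```

Placed in B2 and filled across and down, a pair of empty cells then contributes 0 to the count. Cells that contain only spaces are not empty as far as `<>""` is concerned, so such cells would still be counted as matching each other.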

**guillm** · 06-19-2015, 04:49 AM

(Just marked the thread as unsolved, as I was advised to post the new related question below it)
