Generate a "similarity" matrix on non-continuous values (not numbers)?

**guillm** · 05-03-2012, 02:13 PM

Dear all,

I have this matrix of non-continuous, independent data (let's call them "a", "b", "c")

	Variable 1	Variable 2	Variable 3
Sample 1	a	a	a
Sample 2	b	b	e
Sample 3	c	a	a

Values are not continous numbers, or measures, but more of a label for each "Variable". Additionally, "a" in Variable 1 does not relate to "a" in Variable 2.

I would like a way to assess similarity (=shared values) between "Samples". I don't care to know which "Variable" is similar or different between the two "Samples", just the number of shared values is fine.

For the example above, we see that:
- Sample 1 has 2 shared value with Sample 3 (for Variable 2)
- Sample 2 has 0 shared value with Sample 1 for any variable
- Sample 3 has 0 shared values with Sample 2
- Each sample has 3 shared values with itself

In that example, Sample 1 and 3 are more similar to each other than Sample 2 (if we exclude self-similarity).

I guess a good way of outputing this is to create a "similarity" matrix:

Shared values:

	Sample 1	Sample 2	Sample 3
Sample 1	3	0	2
Sample 2	0	3	0
Sample 3	2	0	3

Does it sound like something possible to do in Excel?

Thanks a lot for any help. I hope I was clear enough!

All the best,

G.

Generate a "similarity" matrix on non-continuous values (not numbers)?

LinkBack

Thread Tools

Rate This Thread

Display

Threaded View

Generate a "similarity" matrix on non-continuous values (not numbers)?

Thread Information

Users Browsing this Thread

Bookmarks

Bookmarks

Posting Permissions