`pdist`, but instead passed `X` into `linkage`. In that case scipy will implicitly calculate the distances. In other words: you won’t find rows of `X` in `Z`, but you should find entries of `pdist(X)` in the distance column of `Z` (`Z[:, 2]`), especially for `method='single'`.
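A quick way to check this yourself is sketched below. The data is random and made up for illustration (it is not from the original post), and the default euclidean metric is assumed.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage
from scipy.spatial.distance import pdist

# Hypothetical toy data: 10 samples with 3 features each.
rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3))

# Passing the raw observations: linkage computes the pairwise distances itself.
Z = linkage(X, method='single')

# The same distances, computed explicitly as a condensed vector.
d = pdist(X)

# With single linkage, every merge height in Z[:, 2] is one of the pairwise distances.
all_found = all(np.isclose(d, height).any() for height in Z[:, 2])

# Passing the condensed distances explicitly should give the same linkage matrix.
same = np.allclose(Z, linkage(d, method='single'))

print(all_found, same)
```

For other methods (e.g. `average` or `ward`) the merge heights are derived from the pairwise distances, so they generally won't all appear in `pdist(X)` verbatim.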
Wrt. the second part, I don’t know where you got that quote from. I never made that statement about cophenet. Cophenet is neither a metric nor a method. Obviously, if you’re calculating cophenet on `Z, pdist(X)`, you should pass the corresponding metric into `pdist`: `pdist(X, metric='...')`.
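As a rough sketch of what "the corresponding metric" means here: compute the distances once with the metric you want and reuse that same vector for both `linkage` and `cophenet`. The data and the choice of `'cityblock'` and `'average'` are assumptions for illustration only.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, cophenet
from scipy.spatial.distance import pdist

# Hypothetical toy data.
rng = np.random.default_rng(1)
X = rng.normal(size=(12, 4))

# Pick the metric once and compute the condensed distance vector with it.
metric = 'cityblock'
d = pdist(X, metric=metric)

# Build the linkage from those distances ...
Z = linkage(d, method='average')

# ... and compare the resulting dendrogram against the very same distances.
c, coph_dists = cophenet(Z, d)

print(c)  # cophenetic correlation coefficient
```

Reusing `d` avoids accidentally clustering with one metric and evaluating against another.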

I’m not sure I fully understand the question. If all you want to do is get from the idx back to your data, you should be able to do `X[idx]`. If there’s further information connected to that data row that is not in `X`, you can always keep a separate dictionary (or list) and look it up by the same index.
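Something along these lines might work. It is only a sketch: the tiny data set, the `ids` list and the choice of `fcluster` with `maxclust` are all assumptions made up for illustration, so adapt the names to your own code.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Hypothetical data: two obvious groups of two points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

# Hypothetical external ids, one per row of X, in the same order.
ids = ['a17', 'b02', 'c33', 'd08']

Z = linkage(X, method='ward')
labels = fcluster(Z, t=2, criterion='maxclust')  # flat cluster label per sample

# Cluster label -> list of sample indices.
clusters = {}
for idx, label in enumerate(labels):
    clusters.setdefault(int(label), []).append(idx)

# Sample indices -> data rows and external ids.
cluster_rows = {label: X[idxs] for label, idxs in clusters.items()}
cluster_ids = {label: [ids[i] for i in idxs] for label, idxs in clusters.items()}

print(cluster_ids)
```

The only thing that matters is that `ids` (or whatever lookup you use) stays in the same order as the rows of `X`.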

You stated the following:

“No matter what method and metric you pick, the linkage() function will use that method and metric (i.e. cophenet) to calculate the distances of the clusters (starting with your n individual samples (aka data points) as singleton clusters)) and in each iteration will merge the two clusters which have the smallest distance according the selected method and metric.”

However, in the documentation for scipy.cluster.hierarchy.linkage, cophenet is never mentioned. In other words, I don’t understand why cophenet should bypass method and metric. Could you please support the previous statement?

]]>Someone posted about getting the sample indices within each cluster and storing them in a dictionary. I used the solution you recommended, in which each cluster list contains the size and the values, i.e. the sample indices.

My question is: instead of returning the sample index, what can I do to return the data id itself? Prior to clustering, I tagged my data so that each data point has a unique id.

The reason is that I want to do an analysis that involves identifying and counting the data points from the same class that ended up in the same cluster.

Thanks in advance!

Lucy.

Only one minor change to your steps:

$ brew install qt pyqt

Error: No available formula with the name “pyqt”

==> Searching for similarly named formulae…

This similarly named formula was found:

pyqt5

To install it, run:

brew install pyqt5

I ran `brew install qt pyqt5` and all seems well.

Thanks again!

]]>