Datacamp notes and code: Unsupervised Learning in Python, Chapter 2: Visualization with hierarchical clustering and t-SNE
More raw data files and Jupyter notebooks:
Github: https://github.com/JinnyR/Datacamp_DataScienceTrack_Python
Datacamp track: Data Scientist with Python - Course 23 (2)
Exercise
Hierarchical clustering of the grain data
In the video, you learned that the SciPy linkage() function performs hierarchical clustering on an array of samples. Use the linkage() function to obtain a hierarchical clustering of the grain samples, and use dendrogram() to visualize the result. A sample of the grain measurements is provided in the array samples, while the variety of each grain sample is given by the list varieties.
Instruction
- Import:
  - linkage and dendrogram from scipy.cluster.hierarchy.
  - matplotlib.pyplot as plt.
- Perform hierarchical clustering on samples using the linkage() function with the method='complete' keyword argument. Assign the result to mergings.
- Plot a dendrogram using the dendrogram() function on mergings. Specify the keyword arguments labels=varieties, leaf_rotation=90, and leaf_font_size=6.
import pandas as pd

# Load the seeds dataset and keep every fifth row as the sample
df = pd.read_csv('https://s3.amazonaws.com/assets.datacamp.com/production/course_2234/datasets/seeds.csv', header=None)
sample_indices = [5 * i + 1 for i in range(42)]
df = df.iloc[sample_indices]
samples = df[list(range(7))].values
varieties = list(df[7].map({1: 'Kama wheat', 2: 'Rosa wheat', 3: 'Canadian wheat'}))

# Perform the necessary imports
from scipy.cluster.hierarchy import linkage, dendrogram
import matplotlib.pyplot as plt

# Calculate the linkage: mergings
mergings = linkage(samples, method='complete')

# Plot the dendrogram, using varieties as labels
dendrogram(
    mergings,
    labels=varieties,
    leaf_rotation=90,
    leaf_font_size=6,
)
plt.show()
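To make the value returned by linkage() more concrete, here is a minimal sketch on synthetic data. The names toy_samples, toy_labels, toy_mergings and tree are made up for illustration and the random array is only a stand-in for the grain measurements. It shows that the result has one row per merge and that dendrogram() can compute the leaf order without drawing anything.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

# Synthetic stand-in for the grain data (assumption): 6 samples, 7 features
rng = np.random.default_rng(0)
toy_samples = rng.normal(size=(6, 7))
toy_labels = [f"sample {i}" for i in range(6)]

# Complete linkage: the distance between two clusters is the largest
# pairwise distance between their members
toy_mergings = linkage(toy_samples, method='complete')

# Each row records one merge:
# [index of cluster a, index of cluster b, merge distance, size of new cluster]
print(toy_mergings.shape)  # (5, 4): n - 1 merges for n samples

# no_plot=True returns the dendrogram layout without drawing a figure
tree = dendrogram(toy_mergings, labels=toy_labels, no_plot=True)
print(tree['ivl'])  # leaf labels in left-to-right plotting order
```

The last row of toy_mergings describes the final merge, so its fourth entry equals the total number of samples.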
Exercise
Hierarchies of stocks
In chapter 1, you used k-means clustering to cluster companies according to their stock price movements. Now, you'll perform hierarchical clustering of the companies. You are given a NumPy array of price movements movements, where the rows correspond to companies, and a list of the company names companies. SciPy hierarchical clustering doesn't fit into a sklearn pipeline, so you'll need to use the normalize() function from sklearn.preprocessing instead of Normalizer.

linkage and dendrogram have already been imported from scipy.cluster.hierarchy, and PyPlot has been imported as plt.
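As a rough illustration of the normalize() step described above, the sketch below uses a small random array in place of the real price movements. The names toy_movements, toy_normalized, row_norms and toy_mergings are hypothetical, and method='complete' is an assumption carried over from the grain exercise rather than something stated for this one.

```python
import numpy as np
from sklearn.preprocessing import normalize
from scipy.cluster.hierarchy import linkage

# Hypothetical stand-in for the price movements: 5 companies x 10 days
rng = np.random.default_rng(1)
toy_movements = rng.normal(size=(5, 10))

# normalize() is a plain function (no fit/transform), and by default it
# rescales each row, here each company, to unit L2 norm
toy_normalized = normalize(toy_movements)
row_norms = np.linalg.norm(toy_normalized, axis=1)
print(np.allclose(row_norms, 1.0))  # True

# The normalized array can be passed directly to linkage()
# (method='complete' is an assumption, not taken from this exercise)
toy_mergings = linkage(toy_normalized, method='complete')
```

Because normalize() returns an ordinary NumPy array, nothing else is needed to bridge scikit-learn preprocessing and SciPy clustering.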
Instruction
- Import normalize from sklearn.preprocessing.
- Rescale the price movements for each stock by using the