HandsonML 9. Unsupervised Learning

1. Basic

Why unsupervised:

  1. Lots of datasets don't have labels.

  2. Exploit unlabeled data without human labeling

Types:

  1. dimensionality reduction

  2. clustering = identify similar instances & group them together

  3. anomaly detection

  4. density estimation

Examples of clustering applications: semi-supervised learning, customer segmentation, data analysis, anomaly detection, search engines.

2. K-means

Find K centroids and group the instances around them:

  1. place k centroids randomly

  2. repeatedly label the instances (assign each to its nearest centroid)

  3. update each centroid to the mean of its group

  4. repeat until the centroids stop moving (a NumPy sketch follows below)

  1. Good: guaranteed to converge

  2. Good: fast

  3. Bad: can converge to a sub-optimal solution

  4. Bad: K must be specified in advance
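
A minimal NumPy sketch of the loop above (Lloyd's algorithm); the function and variable names are mine, not the book's, and empty clusters are not handled:

```python
import numpy as np

def kmeans(X, k, n_iters=100, seed=42):
    rng = np.random.default_rng(seed)
    # step 1: place k centroids randomly (here: k distinct training instances)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iters):
        # step 2: label each instance with its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # step 3: move each centroid to the mean of its group
        # (no empty-cluster handling in this sketch)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # step 4: stop once the centroids stop moving
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, labels
```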

2.1. Improve sub-optimal solutions

Run the algorithm multiple times with different random initializations and keep the best solution (the one with the lowest inertia).
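
In scikit-learn this is just the n_init parameter; the run with the lowest inertia (sum of squared distances from each instance to its closest centroid) wins. A small sketch on toy data:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=1000, centers=5, random_state=42)

# n_init=10: run 10 different random initializations, keep the best solution
kmeans = KMeans(n_clusters=5, n_init=10, random_state=42)
kmeans.fit(X)
print(kmeans.inertia_)  # sum of squared distances to the closest centroids
```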

2.2. K-means as data preprocessing

(TODO: copy the figure from page 251.)
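
The book's preprocessing example around that page pipelines K-means with a classifier: KMeans.transform() replaces each instance by its distances to the k centroids, and the classifier trains on those features. A sketch on the digits dataset, assuming the book's choice of 50 clusters:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# KMeans acts as a transformer here: transform() maps each image to its
# distances to the 50 centroids, and the classifier trains on those features
pipeline = make_pipeline(
    KMeans(n_clusters=50, n_init=10, random_state=42),
    LogisticRegression(max_iter=5000),
)
pipeline.fit(X_train, y_train)
print(pipeline.score(X_test, y_test))
```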

2.3. K-means in semi-supervised learning

  1. (random pick) supervised learning (train on the first 50 labeled samples) = 83.3% accuracy

  2. k-means to identify 50 clusters (train on the 50 representative images closest to the centroids; see the sketch after this list) = 92.2% accuracy

  3. label propagation = 94.0% accuracy

  4. full labeled training set = 96.9% accuracy
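
A sketch of step 2, following the book's digits example; y_representative stands in for the 50 labels a human would actually provide:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression

X_train, y_train = load_digits(return_X_y=True)

k = 50
kmeans = KMeans(n_clusters=k, n_init=10, random_state=42)
X_dist = kmeans.fit_transform(X_train)  # (n_samples, k) distances to centroids

# each cluster's representative = the instance closest to its centroid
representative_idx = X_dist.argmin(axis=0)
X_representative = X_train[representative_idx]
y_representative = y_train[representative_idx]  # stand-in for 50 manual labels

# train on just these 50 well-chosen labeled instances
log_reg = LogisticRegression(max_iter=5000)
log_reg.fit(X_representative, y_representative)
```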

Label Propagation

  1. Label the unlabeled data by giving each instance the label of its cluster's representative (clusters found via k-means).

  2. Good performance because the propagated labels are quite accurate (~99%).
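
Continuing the sketch above (reusing kmeans, y_representative, and X_train), propagation is a single indexing step:

```python
# each instance inherits the label of its cluster's representative
y_propagated = y_representative[kmeans.labels_]

log_reg = LogisticRegression(max_iter=5000)
log_reg.fit(X_train, y_propagated)
```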

Active learning (collaboration with a human expert)

  1. Train a model on the labeled instances gathered so far.

  2. This model makes predictions on all unlabeled instances.

  3. The most uncertain instances (i.e., those whose highest estimated class probability is lowest) are given to the expert to be labeled.

  4. Iterate until the performance improvement stops being worth the labeling effort (see the sketch below).
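
A minimal uncertainty-sampling sketch of one round of this loop; the function name and the choice of classifier are mine, not the book's:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_learning_round(X_labeled, y_labeled, X_unlabeled, n_queries=10):
    # 1. train on the labeled instances gathered so far
    model = LogisticRegression(max_iter=5000).fit(X_labeled, y_labeled)
    # 2. predict on every unlabeled instance
    proba = model.predict_proba(X_unlabeled)
    # 3. most uncertain = lowest top-class probability
    top_class_proba = proba.max(axis=1)
    query_idx = np.argsort(top_class_proba)[:n_queries]
    # hand X_unlabeled[query_idx] to the human expert, then repeat (step 4)
    return model, query_idx
```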

3. DBSCAN

Defines clusters as continuous regions of high density

  1. For each instance, count the number of instances within a small distance ε of it (its ε-neighborhood).

  2. An instance is a core instance when its ε-neighborhood contains at least min_samples instances.

  3. All instances in a core instance's neighborhood belong to the same cluster; neighboring core instances chain into a single cluster.

  1. Two hyperparameters: ε (eps) and min_samples

  2. Robust to outliers (instances that are neither core instances nor in any core instance's neighborhood are flagged as anomalies)

  3. Roughly linear complexity in the number of instances (see the scikit-learn sketch below)

(TODO: add Figure 9-14 from page 257.)
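
A scikit-learn sketch on toy data; instances labeled -1 are the anomalies:

```python
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_moons

X, _ = make_moons(n_samples=1000, noise=0.05, random_state=42)

dbscan = DBSCAN(eps=0.2, min_samples=5)
dbscan.fit(X)

print(dbscan.labels_[:10])               # cluster index per instance; -1 = anomaly
print(len(dbscan.core_sample_indices_))  # how many core instances were found
```

Note that DBSCAN has no predict() method; the book's approach is to train a separate classifier (e.g., KNeighborsClassifier) on the core instances to assign new points to clusters.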

4. GMM (Gaussian mixture model)

  1. A probabilistic model

  2. GMM assumes that the instances were generated from a mixture of several Gaussian distributions whose parameters are unknown.
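
A scikit-learn sketch; GaussianMixture estimates the weights, means, and covariances of the Gaussians via the EM algorithm (the blob data here is my own toy example):

```python
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

X, _ = make_blobs(n_samples=1000, centers=3, random_state=42)

gm = GaussianMixture(n_components=3, n_init=10, random_state=42)
gm.fit(X)

print(gm.weights_)              # mixture weights of the 3 Gaussians
print(gm.means_)                # estimated means (cluster centers)
print(gm.predict_proba(X[:3]))  # soft clustering: probability per cluster
print(gm.score_samples(X[:3]))  # log density, usable for anomaly detection
```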
