In unsupervised learning, the computer is given a data set of unlabeled patterns from which it should extract "knowledge". This knowledge can take different forms, each giving rise to a different learning problem.
Clustering aims at classifying the data into a number of groups containing similar patterns.
Density estimation aims at learning a generative model of the data, which could be used for instance to generate new data instances or estimate the probability that a pattern falls within a given region.
Dimensionality reduction methods can also be seen as unsupervised learning ones.
The goal of unsupervised learning is to extract knowledge from a data set of $N$ unlabeled input vectors, $$ \{ \g x_1, \dots, \g x_N \} \in \X^N . $$ This knowledge can take different forms, such as groups of similar patterns or a probabilistic model of the data, typically a probability density function $$ p_X(x). $$