In a previous post on clustering and cluster validity (i.e. determining the number of clusters), I wrote about the different types of clustering algorithms. Another way to cluster data is with Gaussian mixtures.
Andrew W. Moore has made a nice presentation on this topic. After a short introduction to unsupervised learning, he presents the principles of Gaussian Mixture Models (GMMs). He continues with the Expectation-Maximization (EM) algorithm for maximum likelihood estimation and gives real-life examples. Finally, he suggests the book by Duda et al. as a reference.
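For readers who want the gist before opening the slides, here is a sketch of the textbook EM updates for a Gaussian mixture. The notation is my own assumption and not necessarily the one used in the presentation: n data points x_i, K components with mixing weights \pi_k, means \mu_k and covariances \Sigma_k, and responsibilities \gamma_{ik} (the posterior probability that point i belongs to component k).

```latex
% E-step: responsibilities under the current parameters
\gamma_{ik} = \frac{\pi_k \, \mathcal{N}(x_i \mid \mu_k, \Sigma_k)}
                   {\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}

% M-step: re-estimate the parameters from the responsibilities
N_k = \sum_{i=1}^{n} \gamma_{ik}, \qquad
\pi_k = \frac{N_k}{n}, \qquad
\mu_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, x_i, \qquad
\Sigma_k = \frac{1}{N_k} \sum_{i=1}^{n} \gamma_{ik} \, (x_i - \mu_k)(x_i - \mu_k)^{\top}
```

The two steps are alternated until the log-likelihood stops improving. Each iteration cannot decrease the likelihood, but the result is only a local maximum, so the starting point matters.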
Clustering with Gaussian mixtures is clean and rests on a solid mathematical foundation. Moreover, cross-validation can be used to infer the number of clusters in the data. However, the algorithm (with cross-validation) is time-consuming, since the model has to be refit for every candidate number of clusters and every fold, and may not be practical for some real-life data sets.
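To make the cross-validation idea concrete, here is a minimal sketch in plain Python. It is my own illustration, not code from the presentation, and it rests on several assumptions: the data are one-dimensional and synthetic (three well-separated groups), the helper names (`fit_gmm`, `cv_score`, and so on) are hypothetical, EM is started from evenly spaced quantiles, and each candidate number of components is scored by the average held-out log-likelihood over 5 folds.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_mixture(x, weights, means, variances):
    """Per-component log(weight * density) evaluated at x."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def fit_gmm(data, k, n_iter=50, min_var=1e-3):
    """Fit a 1-D mixture of k Gaussians with EM (illustrative, not optimised)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    centre = sum(data) / n
    spread = sum((x - centre) ** 2 for x in data) / n
    variances = [max(spread / k ** 2, min_var)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in data:
            logs = log_mixture(x, weights, means, variances)
            norm = log_sum_exp(logs)
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate weights, means and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12  # guard against empty components
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var, min_var)  # floor keeps components from collapsing
    return weights, means, variances


def avg_log_likelihood(data, params):
    """Average log-likelihood of data under a fitted mixture."""
    return sum(log_sum_exp(log_mixture(x, *params)) for x in data) / len(data)


def cv_score(data, k, n_folds=5):
    """Mean held-out log-likelihood of a k-component mixture over n_folds folds."""
    folds = [data[i::n_folds] for i in range(n_folds)]
    scores = []
    for i, held_out in enumerate(folds):
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        scores.append(avg_log_likelihood(held_out, fit_gmm(train, k)))
    return sum(scores) / n_folds


# Synthetic data: three groups centred at -5, 0 and 5 (an assumption for the demo).
rng = random.Random(0)
data = [rng.gauss(c, 0.5) for c in (-5.0, 0.0, 5.0) for _ in range(100)]
rng.shuffle(data)

scores = {k: cv_score(data, k) for k in range(1, 6)}
best_k = max(scores, key=scores.get)
for k, s in scores.items():
    print(f"k={k}: held-out log-likelihood per point = {s:.3f}")
print("selected number of components:", best_k)
```

On data like this, one would expect the held-out score to improve markedly up to three components and change little afterwards, so in practice it is worth looking at the whole curve rather than only the maximum. The sketch also shows where the cost comes from: 5 candidate values times 5 folds means 25 EM fits, even for this toy problem.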