Multiple clustering solutions analysis through least-squares consensus algorithms

Abstract
Clustering is one of the most important unsupervised learning problems: it seeks structure in a collection of unlabeled data. However, different clustering algorithms applied to the same data set produce different solutions. In many applications the problem of multiple solutions becomes crucial, and providing a limited group of good clusterings is often more desirable than a single solution. In this work we propose Least Square Consensus clustering, which allows a user to extract a small number of different clustering solutions from an initial (large) set of solutions obtained by applying any clustering algorithm to a given data set. Two different implementations are presented. In both cases, each consensus is accompanied by a quality measure defined in terms of least-squares error, and a graphical visualization is provided to make the result immediately interpretable. Numerical experiments are carried out on both synthetic and real data sets. © 2010 Springer-Verlag.
Year
2010
Publication type
Other authors
Murino, L.; Angelini, C.; Bifulco, I.; De Feis, I.; Raiconi, G.; Tagliaferri, R.
Publisher
Springer
Journal
Lecture Notes in Computer Science