2005
Conference article  Unknown

Speeding-up hierarchical agglomerative clustering in presence of expensive metrics

Nanni M.

Clustering, Data Mining

In several contexts and domains, hierarchical agglomerative clustering (HAC) offers best-quality results, but at the price of a high complexity, which limits the size of the datasets that can be handled. In some contexts, in particular, computing distances between objects is the most expensive task. In this paper we propose a pruning heuristic aimed at improving performance in these cases, which is well integrated into all phases of the HAC process and can be applied to two HAC variants: single-linkage and complete-linkage. After describing the method, we provide some theoretical evidence of its pruning power, followed by an empirical study of its effectiveness over different data domains, with a special focus on dimensionality issues.

Source: Pacific-Asia Conference on Knowledge Discovery and Data Mining, pp. 378–387, Hanoi, Vietnam, May 2005

Publisher: Springer-Verlag, Berlin Heidelberg, DEU



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:43839,
	title = {Speeding-up hierarchical agglomerative clustering in presence of expensive metrics},
	author = {Nanni, M.},
	publisher = {Springer-Verlag, Berlin Heidelberg, DEU},
	booktitle = {Pacific-Asia Conference on Knowledge Discovery and Data Mining, Hanoi, Vietnam, May 2005},
	pages = {378--387},
	year = {2005}
}