2004
Conference article  Restricted

Organizing digital libraries by automated text categorization

Avancini H, Rauber A, Sebastiani F

Hierarchical text classification, Hierarchical clustering

Text Categorization (TC) is the discipline concerned with the construction of automatic text classifiers, i.e., programs capable of assigning to a document one or more categories from a predefined set, based on the content of the document. These classifiers are themselves built automatically, by means of a general inductive process that learns the characteristics of the categories from a set of preclassified documents. In this paper we discuss a class of applications, automatic indexing with controlled vocabularies, that is of direct concern to organizing digital libraries. We exemplify this class of applications by discussing an ongoing project aimed at classifying computer science papers with respect to the ACM Classification Scheme.
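
The inductive process described in the abstract can be illustrated with a minimal sketch. It is not the method used in the paper, which this record does not specify. It assumes a one-vs-rest multinomial Naive Bayes learner with Laplace smoothing, so that a document may receive one or more categories. The function names, the tiny training set, and the ACM-style codes (F.2, H.2, I.2) are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Learn one binary (member / non-member) model per category.

    examples: list of (text, set_of_category_labels) pairs,
    i.e. the preclassified documents.
    """
    vocab = {w for text, _ in examples for w in tokenize(text)}
    categories = {c for _, cats in examples for c in cats}
    model = {}
    for cat in categories:
        word_counts = {True: Counter(), False: Counter()}
        doc_counts = {True: 0, False: 0}
        for text, cats in examples:
            member = cat in cats
            doc_counts[member] += 1
            word_counts[member].update(tokenize(text))
        model[cat] = (word_counts, doc_counts)
    return vocab, model

def log_score(tokens, counts, n_docs, n_total, vocab):
    """Log prior plus Laplace-smoothed log likelihood of the known tokens."""
    if n_docs == 0:
        return float("-inf")
    total = sum(counts.values())
    score = math.log(n_docs / n_total)
    for tok in tokens:
        if tok in vocab:  # words never seen in training are ignored
            score += math.log((counts[tok] + 1) / (total + len(vocab)))
    return score

def classify(text, vocab, model):
    """Return every category whose 'member' score beats its 'non-member' score."""
    tokens = tokenize(text)
    assigned = set()
    for cat, (word_counts, doc_counts) in model.items():
        n_total = doc_counts[True] + doc_counts[False]
        pos = log_score(tokens, word_counts[True], doc_counts[True], n_total, vocab)
        neg = log_score(tokens, word_counts[False], doc_counts[False], n_total, vocab)
        if pos > neg:
            assigned.add(cat)
    return assigned

# Hypothetical preclassified documents (illustrative labels only).
TRAINING = [
    ("sorting algorithms complexity", {"F.2"}),
    ("database query optimization", {"H.2"}),
    ("database index sorting algorithms", {"F.2", "H.2"}),
    ("neural network learning", {"I.2"}),
]

vocab, model = train(TRAINING)
print(classify("query optimization for a database", vocab, model))
```

Each category gets its own binary decision, which is what allows a document to be assigned several categories at once or none at all. A realistic system would need far more training data and a proper feature selection step.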



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:91096,
	title = {Organizing digital libraries by automated text categorization},
	author = {Avancini, H. and Rauber, A. and Sebastiani, F.},
	year = {2004}
}