2003
Conference article  Restricted

Expanding Domain-Specific Lexicons by Term Categorization

Avancini H, Lavelli A, Magnini B, Sebastiani F, Zanoli R

Term classification; Classifier Design and Evaluation; Learning; Information Search and Retrieval; Thesauruses

We discuss an approach to the automatic expansion of domain-specific lexicons by means of term categorization, a novel task employing techniques from information retrieval (IR) and machine learning (ML). Specifically, we view the expansion of such lexicons as a process of learning previously unknown associations between terms and domains. The process generates, for each c_i in a set C = {c_1, ..., c_m} of domains, a lexicon L^1_i, bootstrapping from an initial lexicon L^0_i and a set of documents given as input. The method is inspired by text categorization (TC), the discipline concerned with labelling natural language texts with labels from a predefined set of domains, or categories. However, while TC deals with documents represented as vectors in a space of terms, we formulate the task of term categorization as one in which terms are (dually) represented as vectors in a space of documents, and in which terms (instead of documents) are labelled with domains.
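
As a rough illustration of the dual representation described above, the following minimal Python sketch represents each term as a vector over documents and assigns unlabelled terms to domains. It is a sketch under stated assumptions, not the method of the article: the abstract does not say which learner is used, so a simple cosine-to-centroid rule stands in for it, and the names `term_vectors`, `expand_lexicons`, and `threshold`, as well as the toy documents and seed lexicons, are hypothetical.

```python
from collections import Counter, defaultdict
from math import sqrt


def term_vectors(docs):
    """Represent each term as a sparse vector indexed by document id."""
    vectors = defaultdict(Counter)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            vectors[term][doc_id] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors (dict: doc id -> weight)."""
    dot = sum(w * v.get(d, 0) for d, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def centroid(vectors):
    """Sum of the given vectors; cosine ignores the overall scale."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def expand_lexicons(docs, seed_lexicons, threshold=0.5):
    """Build an expanded lexicon per domain from the seed (initial) lexicons.

    ASSUMPTION: a nearest-centroid rule with a fixed similarity threshold
    replaces the learner, which the abstract leaves unspecified.
    """
    vectors = term_vectors(docs)
    centroids = {
        domain: centroid(vectors[t] for t in terms if t in vectors)
        for domain, terms in seed_lexicons.items()
    }
    expanded = {domain: set(terms) for domain, terms in seed_lexicons.items()}
    labelled = set().union(*seed_lexicons.values())
    for term, vec in vectors.items():
        if term in labelled:
            continue
        # A term may be assigned to several domains, or to none.
        for domain, c in centroids.items():
            if cosine(vec, c) >= threshold:
                expanded[domain].add(term)
    return expanded


if __name__ == "__main__":
    # Hypothetical toy corpus and seed lexicons, for illustration only.
    docs = [
        "striker scores goal in football match",
        "goalkeeper saves penalty in football match",
        "bank raises interest rate on loans",
        "investor fears interest rate rise at bank",
    ]
    seeds = {"sport": {"football", "goal"}, "economy": {"bank", "interest"}}
    for domain, lexicon in expand_lexicons(docs, seeds).items():
        print(domain, sorted(lexicon))
```

Because terms are compared only by the documents they occur in, function words that co-occur with seed terms are picked up as well; a realistic setup would presumably filter or reweight them before classification.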



BibTeX entry
@inproceedings{oai:it.cnr:prodotti:91003,
	title = {Expanding Domain-Specific Lexicons by Term Categorization},
	author = {Avancini H and Lavelli A and Magnini B and Sebastiani F and Zanoli R},
	year = {2003}
}