2015
Conference article  Restricted

Distributional correspondence indexing for cross-language text categorization

Esuli A., Fernandez A. M.

Cross-Language Text Categorization, Sentiment Analysis, Distributional Semantics

Cross-Language Text Categorization (CLTC) aims at producing a classifier for a target language when the only available training examples belong to a different source language. Existing CLTC methods usually incur high computational costs, require external linguistic resources, or demand considerable human annotation effort. This paper presents a simple yet effective CLTC method that projects features from both the source and target languages into a common vector space, using a computationally lightweight distributional correspondence profile with respect to a small set of pivot terms. Experiments on a popular sentiment classification dataset show that our method compares favorably with state-of-the-art methods while requiring significantly lower computational cost and minimal human intervention.
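
The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the correspondence function (cosine similarity between binary document-occurrence vectors), the toy documents, the pairing of pivots across languages by position, and all function names are assumptions made for illustration only.

```python
import math


def occurrence_vector(term, docs):
    """Binary vector marking which documents contain the term."""
    return [1.0 if term in doc else 0.0 for doc in docs]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def correspondence_profile(term, pivots, docs):
    """Represent a term by its similarity to each pivot term.

    The result has one dimension per pivot, so terms from different
    languages land in the same space as long as the pivot lists are
    aligned (assumed here: pivot i in one language translates pivot i
    in the other).
    """
    term_vec = occurrence_vector(term, docs)
    return [cosine(term_vec, occurrence_vector(p, docs)) for p in pivots]


# Hypothetical toy corpora: each document is a set of terms.
source_docs = [
    {"great", "excellent", "book"},
    {"great", "excellent"},
    {"bad", "poor", "book"},
    {"bad", "poor"},
]
target_docs = [
    {"gut", "ausgezeichnet", "buch"},
    {"gut", "ausgezeichnet"},
    {"schlecht", "mies", "buch"},
    {"schlecht", "mies"},
]

# Hypothetical aligned pivot terms (assumed to be translations of each other).
source_pivots = ["great", "bad"]
target_pivots = ["gut", "schlecht"]

excellent = correspondence_profile("excellent", source_pivots, source_docs)
ausgezeichnet = correspondence_profile("ausgezeichnet", target_pivots, target_docs)
poor = correspondence_profile("poor", source_pivots, source_docs)
```

In this toy setting, "excellent" and "ausgezeichnet" receive the same profile because each co-occurs with the positive pivot of its own language and never with the negative one, while "poor" receives the opposite profile. A classifier trained on source-language profiles could then, in principle, be applied to target-language profiles.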

Source: ECIR 2015 - Advances in Information Retrieval. 37th European Conference on IR Research, pp. 104–109, Vienna, Austria, 29 March - 2 April 2015

Publisher: Springer, Berlin, Germany


BibTeX entry
@inproceedings{oai:it.cnr:prodotti:329758,
	title = {Distributional correspondence indexing for cross-language text categorization},
	author = {Esuli A. and Fernandez A. M.},
	publisher = {Springer, Berlin, Germany},
	doi = {10.1007/978-3-319-16354-3_12},
	booktitle = {ECIR 2015 - Advances in Information Retrieval. 37th European Conference on IR Research, Vienna, Austria, 29 March - 2 April 2015},
	pages = {104--109},
	year = {2015}
}