2023
Journal article  Open Access

Solving imbalanced learning with outlier detection and features reduction

Lusito S., Pugnana A., Guidotti R.

Imbalanced data learning, Outlier detection, Features reduction, Features selection, Classification framework

A critical problem for several real-world applications is class imbalance. Indeed, in contexts like fraud detection or medical diagnostics, standard machine learning models fail because they are designed to handle balanced class distributions. Existing solutions typically increase the number of rare-class instances by generating synthetic records to achieve a balanced class distribution. However, these procedures generate implausible data and tend to create unnecessary noise. We propose a change of perspective: instead of relying on resampling techniques, we depend on unsupervised feature engineering approaches to represent records with a combination of features that helps the classifier capture the differences among classes, even in the presence of imbalanced data. Thus, we combine a large array of outlier detection, feature projection, and feature selection approaches to augment the expressiveness of the dataset population. We show the effectiveness of our proposal in a deep and wide set of benchmarking experiments as well as in real case studies.
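
The core idea in the abstract, enriching each record with unsupervised outlier information instead of resampling, can be illustrated with a small sketch. This is not the authors' framework, which combines many outlier detectors, projections, and selection methods. It shows a single augmentation step only, using a k-nearest-neighbour distance as a stand-in outlier score. The function names `knn_outlier_scores` and `augment`, the choice of score, and the toy data are all assumptions made for illustration.

```python
import math

def knn_outlier_scores(rows, k=2):
    """Hypothetical outlier score: mean Euclidean distance from each
    row to its k nearest other rows (larger means more isolated)."""
    scores = []
    for i, a in enumerate(rows):
        dists = sorted(math.dist(a, b) for j, b in enumerate(rows) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores

def augment(rows, k=2):
    """Append the unsupervised outlier score to every record as an
    extra feature. Class labels are never used in this step."""
    scores = knn_outlier_scores(rows, k)
    return [list(r) + [s] for r, s in zip(rows, scores)]

# Toy data (assumed): four clustered points and one isolated point.
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
X_aug = augment(X, k=2)
```

A classifier trained on `X_aug` rather than `X` sees the extra column, in which the isolated record receives a much larger score than the clustered ones. In a fuller pipeline, several such scores would presumably be computed and then reduced or selected before classification.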

Source: Machine learning (2023). doi:10.1007/s10994-023-06448-0

Publisher: Kluwer Academic Publishers, Boston, United States of America


BibTeX entry
@article{oai:it.cnr:prodotti:490298,
	title = {Solving imbalanced learning with outlier detection and features reduction},
	author = {Lusito S. and Pugnana A. and Guidotti R.},
	publisher = {Kluwer Academic Publishers, Boston, United States of America},
	doi = {10.1007/s10994-023-06448-0},
	journal = {Machine learning},
	year = {2023}
}

SoBigData-PlusPlus
SoBigData++: European Integrated Infrastructure for Social Mining and Big Data Analytics

