2019
Journal article  Open Access

Parallel Traversal of Large Ensembles of Decision Trees

Lettich F., Lucchese C., Nardini F. M., Orlando S., Perego R., Tonellotto N., Venturini R.

Learning-to-Rank; Settore INF/01 - Informatica; GPUs; Data structures; Hardware and Architecture; Efficient Machine Learning; SIMD; Signal Processing; Computational Theory and Mathematics; Decision Tree Ensembles; NUMA multiprocessors; Data models; Settore ING-INF/05 - Sistemi di Elaborazione delle Informazioni; Regression tree analysis; Parallel Algorithms

Machine-learnt models based on additive ensembles of regression trees are currently deemed the best solution to address complex classification, regression, and ranking tasks. The deployment of such models is computationally demanding: to compute the final prediction, the whole ensemble must be traversed, accumulating the contributions of all its trees. In particular, traversal cost impacts applications where the number of candidate items is large, the time budget available to apply the learnt model to them is limited, and the users' expectations in terms of quality of service are high. Document ranking in web search, where sub-optimal ranking models are deployed to find a proper trade-off between efficiency and effectiveness of query answering, is probably the most typical example of this challenging issue. This paper investigates multi/many-core parallelization strategies for speeding up the traversal of large ensembles of regression trees, thus obtaining machine-learnt models that are, at the same time, effective, fast, and scalable. Our best results are obtained by the GPU-based parallelization of the state-of-the-art algorithm, with speedups of up to 102.6x.
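To make the traversal cost described in the abstract concrete, the following is a minimal sequential sketch of scoring with an additive ensemble of regression trees: each tree is walked from root to leaf, and the leaf values are summed. It is a plain baseline for illustration only, not the state-of-the-art algorithm or any of the parallelization strategies the paper studies. The names `Node`, `score_tree`, `score_ensemble`, and `score_items`, as well as the `feature <= threshold` test convention, are assumptions introduced here.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Node:
    """Hypothetical tree node: internal nodes test a feature, leaves hold a value."""
    feature: int = -1
    threshold: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    value: float = 0.0

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def score_tree(root: Node, x: Sequence[float]) -> float:
    """Walk one tree from the root to a leaf and return the leaf's value."""
    node = root
    while not node.is_leaf():
        # Assumed convention: go left when the tested feature is <= threshold.
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def score_ensemble(trees: Sequence[Node], x: Sequence[float]) -> float:
    """Additive ensemble: the prediction is the sum of all tree contributions."""
    return sum(score_tree(tree, x) for tree in trees)


def score_items(trees: Sequence[Node], items: Sequence[Sequence[float]]) -> List[float]:
    """Score every candidate item; the cost grows with len(items) * len(trees)."""
    return [score_ensemble(trees, x) for x in items]


# Toy ensemble of two depth-1 trees (illustrative values only).
tree_a = Node(feature=0, threshold=0.5, left=Node(value=1.0), right=Node(value=2.0))
tree_b = Node(feature=1, threshold=10.0, left=Node(value=-0.5), right=Node(value=0.25))
ensemble = [tree_a, tree_b]

scores = score_items(ensemble, [[0.3, 12.0], [0.9, 3.0]])
```

In this sketch every item is scored independently and every tree contributes independently to the sum, so both loops can in principle be distributed across cores or threads.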

Source: IEEE transactions on parallel and distributed systems (Print) 30 (2019): 2075–2089. doi:10.1109/TPDS.2018.2860982

Publisher: Institute of Electrical and Electronics Engineers, New York, NY, United States of America


BibTeX entry
@article{oai:it.cnr:prodotti:398989,
	title = {Parallel Traversal of Large Ensembles of Decision Trees},
	author = {Lettich F. and Lucchese C. and Nardini F. M. and Orlando S. and Perego R. and Tonellotto N. and Venturini R.},
	publisher = {Institute of Electrical and Electronics Engineers, New York, NY, United States of America},
	doi = {10.1109/TPDS.2018.2860982},
	note = {Also available at doi:10.5281/zenodo.2668378 and doi:10.5281/zenodo.2668379},
	journal = {IEEE transactions on parallel and distributed systems (Print)},
	volume = {30},
	pages = {2075--2089},
	year = {2019}
}

BigDataGrapes
Big Data to Enable Global Disruption of the Grapevine-powered Industries

