Accelerating the XGBoost algorithm using GPU computing
- Open Access journal article
- Published 24 July 2017 by PeerJ in PeerJ Computer Science, Vol. 3, e127
- https://doi.org/10.7717/peerj-cs.127
Abstract
We present a CUDA-based implementation of a decision tree construction algorithm within the gradient boosting library XGBoost. The tree construction algorithm is executed entirely on the graphics processing unit (GPU) and shows high performance with a variety of datasets and settings, including sparse input matrices. Individual boosting iterations are parallelised, combining two approaches: an interleaved approach is used for shallow trees, switching to a more conventional radix sort-based approach at larger depths. We show speedups of between 3× and 6× using a Titan X compared to a 4-core i7 CPU, and 1.2× using a Titan X compared to 2× Xeon CPUs (24 cores). We show that it is possible to process the Higgs dataset (10 million instances, 28 features) entirely within GPU memory. The algorithm is made available as a plug-in within the XGBoost library and fully supports all XGBoost features, including classification, regression and ranking tasks.
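The radix sort-based strategy mentioned in the abstract re-partitions training instances by the tree node they currently belong to, so that each node's instances sit in a contiguous range for the next level of splits. A minimal CPU sketch of that idea, assuming per-instance node ids are already known (all names here are illustrative and not taken from the XGBoost source; the real implementation runs this on the GPU with a parallel radix sort):

```python
def partition_by_node(node_ids, n_nodes):
    """One stable counting-sort pass (a single radix pass): return the
    instance indices reordered so instances of the same node are contiguous,
    plus each node's starting offset in that order."""
    # histogram of instances per node
    counts = [0] * n_nodes
    for nid in node_ids:
        counts[nid] += 1
    # exclusive prefix sum gives each node's starting offset
    offsets = [0] * n_nodes
    total = 0
    for nid in range(n_nodes):
        offsets[nid] = total
        total += counts[nid]
    # scatter instance indices; iterating in input order keeps the sort stable
    order = [0] * len(node_ids)
    cursor = offsets[:]
    for idx, nid in enumerate(node_ids):
        order[cursor[nid]] = idx
        cursor[nid] += 1
    return order, offsets

# instances 0..5 assigned to nodes 1, 0, 2, 0, 1, 2
order, offsets = partition_by_node([1, 0, 2, 0, 1, 2], 3)
# order -> [1, 3, 0, 4, 2, 5]; offsets -> [0, 2, 4]
```

After this pass, the instances of node `k` occupy `order[offsets[k]:offsets[k+1]]`, which is what lets split evaluation at the next depth scan contiguous memory ranges per node.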
Funding Information
- Marsden Grant from the Royal Society of New Zealand (UOW1502)