
GPBoost R Package


This is the R package implementation of the GPBoost library. See https://github.com/fabsig/GPBoost for more information on the modeling background and the software implementation.


Examples

The following is a short example.

# Combine tree-boosting and grouped random effects model
library(gpboost)
data(GPBoost_data, package = "gpboost")
gp_model <- GPModel(group_data = group_data)
bst <- gpboost(data = X, label = y, gp_model = gp_model,
               nrounds = 10, objective = "regression_l2")
summary(gp_model)
pred <- predict(bst, data = X_test, group_data_pred = group_data_test)
pred$random_effect_mean + pred$fixed_effect
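To make the structure of such a model more concrete, here is a minimal base-R sketch that simulates data of the form y = F(X) + Zb + e, where F is a nonlinear function of the covariates, b are group-level random effects, and e is noise. The function F, the sample sizes, and the variances are illustrative assumptions and are not taken from the package. The names `X_sim`, `y_sim`, and `group_sim` are hypothetical; they could presumably be passed to `GPModel()` and `gpboost()` in the same way as the data in the example above.

```r
set.seed(1)
n <- 500  # number of observations (assumed)
m <- 25   # number of groups (assumed)

# Grouping variable: each observation belongs to one of m groups
group_sim <- rep(seq_len(m), each = n / m)

# Covariates and an assumed nonlinear fixed-effects function F
X_sim <- matrix(runif(2 * n), ncol = 2)
F_sim <- 2 * sin(2 * pi * X_sim[, 1]) + X_sim[, 2]^2

# Group-level random effects b ~ N(0, 1) and noise e ~ N(0, 0.5^2)
b_sim <- rnorm(m, sd = 1)
e_sim <- rnorm(n, sd = 0.5)

# Response: fixed effect + random effect of the observation's group + noise
y_sim <- F_sim + b_sim[group_sim] + e_sim

# These objects could then be used as in the example above, e.g.:
# gp_model <- GPModel(group_data = group_sim)
# bst <- gpboost(data = X_sim, label = y_sim, gp_model = gp_model,
#                nrounds = 10, objective = "regression_l2")
```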

Installation

The gpboost package is available on CRAN and can be installed as follows:

install.packages("gpboost", repos = "https://cran.r-project.org")
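After installation, one way to confirm that the package can be loaded is sketched below. The helper `is_installed()` is a hypothetical convenience function defined here for illustration; it is not part of gpboost.

```r
# Hypothetical helper: TRUE if a package namespace can be loaded, FALSE otherwise
is_installed <- function(pkg) requireNamespace(pkg, quietly = TRUE)

if (is_installed("gpboost")) {
  # Report the installed version
  message("gpboost version: ", as.character(packageVersion("gpboost")))
} else {
  message("gpboost is not installed; run install.packages(\"gpboost\")")
}
```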

Installation from source

Installing the package from CRAN is much easier. However, the package can also be built from source as described below. In short, the main steps are:

git clone --recursive https://github.com/fabsig/GPBoost
cd GPBoost
Rscript build_r.R

Below is a more complete installation guide.

Preparation

You need to install git and CMake first. Note that 32-bit R/Rtools is not supported for custom installation.
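As a rough way to check these prerequisites from within R, the following sketch looks for `git` and `cmake` on the PATH and tests whether the running R session is a 64-bit build. It only uses base R; the variable names are illustrative.

```r
# Locate the required command-line tools; Sys.which() returns "" if not found
tools <- Sys.which(c("git", "cmake"))
print(tools)

missing_tools <- names(tools)[tools == ""]
if (length(missing_tools) > 0) {
  message("Not found on PATH: ", paste(missing_tools, collapse = ", "))
}

# A pointer size of 8 bytes indicates a 64-bit build of R
is_64bit <- .Machine$sizeof.pointer == 8
message("64-bit R: ", is_64bit)
```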

Windows Preparation

NOTE: Windows users may need to run with administrator rights (either R or the command prompt, depending on the way you are installing this package).

Installing a 64-bit version of Rtools is mandatory.

After installing Rtools and CMake, be sure the following paths are added to the environment variable PATH. These may have been automatically added when installing other software.

In Windows, the default compiler is Visual Studio (or VS Build Tools), with an automatic fallback to MinGW64 (i.e., having only Rtools and CMake installed is enough). To force the use of MinGW64, add the --use-mingw (for R 3.x) or --use-msys2 (for R 4.x) flag (see below).

Mac OS Preparation

You can build with either Apple Clang or gcc. If you prefer Apple Clang, first install OpenMP (see the Installation Guide for details); CMake version 3.12 or higher is also required. If you prefer gcc, install it (see the Installation Guide for details) and set environment variables that tell R to use gcc and g++. If you install them from Homebrew, gcc and g++ are most likely in /usr/local/bin, as shown below.

# replace 8 with version of gcc installed on your machine
export CXX=/usr/local/bin/g++-8 CC=/usr/local/bin/gcc-8
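If the build is started from an R session, a quick check along the following lines may help confirm that the variables are visible to R and point to existing files. This is only a sketch: it assumes R was started from a shell in which CXX and CC were exported as above.

```r
# Read the compiler variables from the environment R was started in
compilers <- Sys.getenv(c("CXX", "CC"), unset = "")
print(compilers)

# TRUE only if the variable is set and the file it points to exists
ok <- nzchar(compilers) & file.exists(compilers)
names(ok) <- c("CXX", "CC")
print(ok)
```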

Install

Build and install the R package with the following commands:

git clone --recursive https://github.com/fabsig/GPBoost
cd GPBoost
Rscript build_r.R

The build_r.R script builds the package in a temporary directory called gpboost_r. It will destroy and recreate that directory each time you run the script. That script supports the following command-line options:

Note: for the build with Visual Studio/VS Build Tools in Windows, you should use the Windows CMD or PowerShell.

Testing

There is currently no continuous integration service set up that automatically runs unit tests. However, any contribution needs to pass all unit tests in the R-package/tests/testthat directory. These tests can be run using the run_tests_coverage_R_package.R file. In any case, make sure that you run the full set of tests by specifying the following environment variable

Sys.setenv(GPBOOST_ALL_TESTS = "GPBOOST_ALL_TESTS")

before running the tests in the R-package/tests/testthat directory.
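As an alternative to the script mentioned above, the tests could presumably also be run directly with testthat. The sketch below assumes that the working directory is the root of the cloned repository and that both gpboost and testthat are installed; if either assumption does not hold, it only prints a hint.

```r
# Enable the full test suite, as described above
Sys.setenv(GPBOOST_ALL_TESTS = "GPBOOST_ALL_TESTS")

# Assumed location of the tests relative to the repository root
test_path <- file.path("R-package", "tests", "testthat")

if (dir.exists(test_path) && requireNamespace("testthat", quietly = TRUE)) {
  # Run every test file in the directory
  testthat::test_dir(test_path)
} else {
  message("Run this from the GPBoost repository root with testthat installed.")
}
```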