Getting started with transferboost

This is a quick-start tutorial providing code snippets for getting started with transferboost.

XGBTransferLearner: transfer learning with xgboost

import warnings
warnings.filterwarnings('ignore')

import transferboost as tb
from transferboost.dataset import load_data

# Load the data
X, y1, y2 = load_data(return_X_y=True)
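
A quick look at the shapes shows what load_data returns: a feature matrix X and two targets y1 and y2 defined on the same rows (assuming array-like objects with a .shape attribute, e.g. numpy arrays or pandas objects).

# Inspect the loaded data: X is the feature matrix, y1 and y2 are two
# different targets defined on the same observations.
print(X.shape)
print(y1.shape, y2.shape)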

Train an xgboost model and perform the "transfer learning"

Train an xgboost model on the first target, y1.

import xgboost as xgb

xgb_model = xgb.XGBClassifier(
    max_depth = 2,
    reg_lambda = 0,
    n_estimators=100,
    verbosity = 0
)

xgb_model.fit(X,y1)
XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              learning_rate=0.300000012, max_delta_step=0, max_depth=2,
              min_child_weight=1, missing=nan, monotone_constraints='()',
              n_estimators=100, n_jobs=4, num_parallel_tree=1, random_state=0,
              reg_alpha=0, reg_lambda=0, scale_pos_weight=1, subsample=1,
              tree_method='exact', validate_parameters=1, verbosity=0)
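
Before transferring, it can help to see how the model fitted on y1 behaves on the second target. The snippet below is only an illustrative sketch using scikit-learn's roc_auc_score; the actual scores depend on the bundled data.

from sklearn.metrics import roc_auc_score

# Score the y1-fitted model against both targets. A gap between the two AUCs
# is what motivates transferring the model to y2 instead of reusing it as-is.
proba = xgb_model.predict_proba(X)[:, 1]
print("AUC on y1:", roc_auc_score(y1, proba))
print("AUC on y2:", roc_auc_score(y2, proba))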

Transfer Learners expect a fitted model in the constructor.

from transferboost.models import XGBTransferLearner

t_xgb_model = XGBTransferLearner(xgb_model)

Perform the "transfer learning" by fitting the XGBTransferLearner on another target, y2 in this case.

t_xgb_model.fit(X,y2)
XGBTransferLearner with base model
    XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              learning_rate=0.300000012, max_delta_step=0, max_depth=2,
              min_child_weight=1, missing=nan, monotone_constraints='()',
              n_estimators=100, n_jobs=4, num_parallel_tree=1, random_state=0,
              reg_alpha=0, reg_lambda=0, scale_pos_weight=1, subsample=1,
              tree_method='exact', validate_parameters=1, verbosity=0)
base_score = 0.5

Get the predicted probabilities with the transfer-learned model.

XGBTransferLearner.predict_proba(X) returns the probabilities, following the scikit-learn API.

t_xgb_model.predict_proba(X)
array([[0.70023316, 0.29976684],
       [0.70499377, 0.29500623],
       [0.91163901, 0.08836099],
       ...,
       [0.60772859, 0.39227141],
       [0.76717276, 0.23282724],
       [0.80914091, 0.19085909]])
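
To see what the transfer step changed, compare the base model and the transfer-learned model on the new target. This comparison uses scikit-learn's roc_auc_score and is not part of the transferboost API; it is only an illustrative check.

from sklearn.metrics import roc_auc_score

# Positive-class probabilities from the original model and from the
# transfer-learned model, both evaluated against the new target y2.
proba_base = xgb_model.predict_proba(X)[:, 1]
proba_transfer = t_xgb_model.predict_proba(X)[:, 1]

print("base model AUC on y2:       ", roc_auc_score(y2, proba_base))
print("transferred model AUC on y2:", roc_auc_score(y2, proba_transfer))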

LGBMTransferLearner: transfer learning with lightgbm

If the base model is a lightgbm classifier, the transfer learning procedure is very similar.

Train a lightgbm model and perform the "transfer learning"

As in the xgboost case, train an LGBM classifier on target y1.

import lightgbm as lgb

lgb_model = lgb.LGBMClassifier(
    max_depth = 2,
    reg_lambda = 0,
    n_estimators=100,
    verbosity = 0
)

lgb_model.fit(X,y1)
[LightGBM] [Warning] Auto-choosing col-wise multi-threading, the overhead of testing was 0.002838 seconds.
You can set `force_col_wise=true` to remove the overhead.

LGBMClassifier(max_depth=2, reg_lambda=0, verbosity=0)

Use the LGBMTransferLearner class

from transferboost.models import LGBMTransferLearner

t_lgb_model = LGBMTransferLearner(lgb_model)

Transfer-learn the model to the new target (y2)

t_lgb_model.fit(X,y2)
XGBTransferLearner with base model
    LGBMClassifier(max_depth=2, reg_lambda=0, verbosity=0)
base_score = 0.3

Get the predicted probabilities with the transfer-learned model.

LGBMTransferLearner.predict_proba(X) returns the probabilities, following the scikit-learn API.

t_lgb_model.predict_proba(X)
array([[0.67982818, 0.32017182],
       [0.92096689, 0.07903311],
       [0.81908308, 0.18091692],
       ...,
       [0.53578214, 0.46421786],
       [0.80567013, 0.19432987],
       [0.67593326, 0.32406674]])
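
The same kind of sanity check works for the LightGBM case; again this uses scikit-learn's roc_auc_score and is only an illustrative sketch, not part of the transferboost API.

from sklearn.metrics import roc_auc_score

# Compare the y1-fitted LightGBM model with its transfer-learned version on y2.
print("base model AUC on y2:       ", roc_auc_score(y2, lgb_model.predict_proba(X)[:, 1]))
print("transferred model AUC on y2:", roc_auc_score(y2, t_lgb_model.predict_proba(X)[:, 1]))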