### Python Snippet of Bayesian Optimization for Xgboost

Wed, Oct 16, 2019 2-minute read

## Introduction

Bayesian optimization is usually a faster alternative to GridSearch when we're trying to find the best combination of hyperparameters for an algorithm. In Python, there's a handy package that lets us apply it: bayes_opt.

This post is a code snippet to start using the package functions alongside xgboost to solve a regression problem.

## The Code

Preparing the environment.

import pandas as pd
import numpy as np
import xgboost as xgb
from sklearn import datasets
import bayes_opt as bopt

boston = datasets.load_boston()
dm_input = xgb.DMatrix(boston['data'], label=boston.target)


To run the Bayesian optimization, we need to create a custom function. The function must return the target metric for a given combination of hyperparameters.

def objective(max_depth, eta, max_delta_step, colsample_bytree, subsample):
    # bayes_opt passes every hyperparameter as a float,
    # so integer parameters must be cast explicitly
    cur_params = {'objective': 'reg:linear',
                  'max_depth': int(max_depth),
                  'eta': eta,
                  'max_delta_step': int(max_delta_step),
                  'colsample_bytree': colsample_bytree,
                  'subsample': subsample}

    cv_results = xgb.cv(params=cur_params,
                        dtrain=dm_input,  # dm_input accessed as a global here
                        nfold=3,
                        seed=3,
                        num_boost_round=50000,
                        early_stopping_rounds=50,
                        metrics='rmse')

    # Negated: bayes_opt maximizes, but we want the smallest RMSE
    return -1 * cv_results['test-rmse-mean'].min()


In the case of a regression problem, there's an important detail: Bayesian optimization seeks to maximize the output value, while we're trying to minimize the target RMSE metric. We have to negate the function's output so that the search properly finds the minimum RMSE.
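The sign trick can be illustrated with a toy example that needs no optimizer at all: maximizing the negated metric selects the same candidate as minimizing the metric itself (the function and values below are made up for illustration).

```python
# Toy illustration of the sign trick: minimizing a loss is
# equivalent to maximizing its negation.
candidates = [0.8, 1.5, 2.9, 4.1]

def fake_rmse(x):
    # hypothetical metric: smaller is better, minimum at x = 2.9
    return (x - 2.9) ** 2 + 3.0

best_by_min = min(candidates, key=fake_rmse)
best_by_max = max(candidates, key=lambda x: -fake_rmse(x))
assert best_by_min == best_by_max  # both pick 2.9
```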

Another detail is that the arguments of the custom function are restricted to hyperparameters only. That creates a problem, since we also have to pass the input dataset to the xgboost training function. I see two ways to handle it: we could either use the dataset as a global variable, or declare the custom function inside a class that holds the dataset as an attribute. The example below follows the class option.

class custom_bayesopt:
    def __init__(self, dm_input):
        self.dm_input = dm_input

    def objective(self, max_depth, eta, max_delta_step, colsample_bytree, subsample):
        cur_params = {'objective': 'reg:linear',
                      'max_depth': int(max_depth),
                      'eta': eta,
                      'max_delta_step': int(max_delta_step),
                      'colsample_bytree': colsample_bytree,
                      'subsample': subsample}

        cv_results = xgb.cv(params=cur_params,
                            dtrain=self.dm_input,
                            nfold=3,
                            seed=3,
                            num_boost_round=50000,
                            early_stopping_rounds=50,
                            metrics='rmse')

        return -1 * cv_results['test-rmse-mean'].min()


Now we start the Bayesian process, passing the custom function and the hyperparameter boundaries.

bopt_process = bopt.BayesianOptimization(custom_bayesopt(dm_input).objective,
                                         {'max_depth': (2, 15),
                                          'eta': (0.01, 0.3),
                                          'max_delta_step': (0, 10),
                                          'colsample_bytree': (0, 1),
                                          'subsample': (0, 1)},
                                         random_state=np.random.RandomState(1))


It's possible to register the optimization events into a log file. This is especially useful when you want to create a new Bayesian optimization instance later, since the saved information can be reloaded. The package documentation covers this in more detail.

from bayes_opt.observer import JSONLogger
from bayes_opt.event import Events

logger = JSONLogger(path="bopt.log.json")

bopt_process.subscribe(Events.OPTIMIZATION_STEP, logger)

bopt_process.maximize(n_iter=10, init_points=12)


Once the iterations finish, the winning hyperparameters are stored in the max attribute:

bopt_process.max

{'target': -2.9679186666666664,
 'params': {'colsample_bytree': 0.935372077775139,
            'eta': 0.013944731934196423,
            'max_delta_step': 0.07555792154893812,
            'max_depth': 3.2975847928232245,
            'subsample': 0.7161419730372364}}
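Note that the winning values all come back as floats, even for integer parameters like max_depth. A minimal sketch of preparing them for a final training run, using the dict returned above (the cast mirrors what the objective function does internally):

```python
# The 'params' dict reported by bopt_process.max, copied from the
# output above. bayes_opt reports every tuned value as a float.
best = {'colsample_bytree': 0.935372077775139,
        'eta': 0.013944731934196423,
        'max_delta_step': 0.07555792154893812,
        'max_depth': 3.2975847928232245,
        'subsample': 0.7161419730372364}

# Rebuild the full parameter set, casting integer-typed parameters
final_params = {'objective': 'reg:linear', **best}
for key in ('max_depth', 'max_delta_step'):
    final_params[key] = int(final_params[key])

# final_params is now ready to pass to xgb.train along with dm_input
```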