r - How to choose the nrounds using `catboost`?
If I understand catboost correctly, we need to tune the nrounds just like in xgboost, using CV. I see the following code in the official tutorial, in `In [8]`:

    params_with_od <- list(iterations = 500,
                           loss_function = 'Logloss',
                           train_dir = 'train_dir',
                           od_type = 'Iter',
                           od_wait = 30)
    model_with_od <- catboost.train(train_pool, test_pool, params_with_od)

which results in the best iterations = 211.
My questions are:

- Is it correct that this command uses the `test_pool` to choose the best `iterations` instead of using cross-validation?
- If yes, does catboost provide a command to choose the best `iterations` from CV, or do I need to do it manually?

Catboost is doing cross validation to determine the optimum number of iterations. Both train_pool and test_pool are datasets that include the target variable. Earlier in the tutorial they write

    train_path = '../R-package/inst/extdata/adult_train.1000'
    test_path = '../R-package/inst/extdata/adult_test.1000'

    column_description_vector = rep('numeric', 15)
    cat_features <- c(3, 5, 7, 8, 9, 10, 11, 15)
    for (i in cat_features)
        column_description_vector[i] <- 'factor'

    train <- read.table(train_path, header=FALSE, sep="\t", colClasses=column_description_vector)
    test <- read.table(test_path, header=FALSE, sep="\t", colClasses=column_description_vector)
    target <- c(1)
    train_pool <- catboost.from_data_frame(data=train[,-target], target=train[,target])
    test_pool <- catboost.from_data_frame(data=test[,-target], target=test[,target])

When you execute catboost.train(train_pool, test_pool, params_with_od), train_pool is used for training and test_pool is used to determine the optimum number of iterations via cross validation (that is, by validating against this one held-out set rather than by k-fold splitting).
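
To make the role of test_pool concrete, here is a minimal base-R sketch of how an `Iter`-style detector could pick the best iteration from a sequence of validation losses. It is a simplified illustration and not catboost's actual implementation: the helper `best_iteration_iter` is hypothetical, and the `valid_loss` curve is simulated (its minimum is placed at 211 only to echo the number reported in the question). Only the name `od_wait` is taken from the tutorial.

```r
# Simplified sketch of an 'Iter'-style overfitting detector (hypothetical
# helper, not catboost's implementation): stop once the validation loss has
# not improved for `od_wait` iterations and report the best iteration seen.
best_iteration_iter <- function(valid_loss, od_wait = 30) {
  best_iter <- 1
  best_loss <- valid_loss[1]
  stopped_at <- length(valid_loss)
  for (i in seq_along(valid_loss)) {
    if (valid_loss[i] < best_loss) {
      # New best validation loss: remember where it happened.
      best_loss <- valid_loss[i]
      best_iter <- i
    } else if (i - best_iter >= od_wait) {
      # No improvement for od_wait iterations: stop training here.
      stopped_at <- i
      break
    }
  }
  list(best_iter = best_iter, best_loss = best_loss, stopped_at = stopped_at)
}

# Simulated validation loss (assumption): falls until iteration 211, then rises.
iters <- 1:500
valid_loss <- 0.3 + ((iters - 211) / 500)^2

res <- best_iteration_iter(valid_loss, od_wait = 30)
print(res$best_iter)
print(res$stopped_at)
```

The point of the sketch is that the best iteration is chosen by looking at the losses on the second data set, so that data set has already influenced the fitted model.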
Now you are right to be confused, since later on in the tutorial they again use test_pool and the fitted model to make a prediction (model_best is similar to model_with_od, but uses a different overfitting detector, IncToDec):

    prediction_best <- catboost.predict(model_best, test_pool, type = 'Probability')

This might be bad practice. They might get away with it with their IncToDec overfitting detector (I am not familiar with the mathematics behind it), but for the Iter type overfitting detector you would need to have separate train, validation and test data sets (and if you want to be on the safe side, do the same for the IncToDec overfitting detector). However, this is only a tutorial showing the functionality, so I wouldn't be too pedantic about what data they have used and how.
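
As a rough sketch of the separate train, validation and test sets mentioned above, the split could look like this in base R. The toy data frame `full`, the seed and the 60/20/20 proportions are assumptions made for illustration; they are not taken from the tutorial.

```r
# Sketch: split one data frame into train / validation / test parts so that
# early stopping and the final evaluation use different rows.
set.seed(42)  # arbitrary seed, for reproducibility only

# Toy stand-in for the tutorial's data (assumption): column 1 is the target.
n <- 1000
full <- data.frame(target = rbinom(n, 1, 0.5),
                   x1 = rnorm(n),
                   x2 = rnorm(n))

# Assign each row to exactly one part (60% / 20% / 20%, an arbitrary choice).
part <- sample(rep(c("train", "valid", "test"), times = c(600, 200, 200)))

train <- full[part == "train", ]
valid <- full[part == "valid", ]
test  <- full[part == "test", ]

# The three parts share no rows and together cover the full data set.
print(c(train = nrow(train), valid = nrow(valid), test = nrow(test)))
```

With such a split, pools built from `train` and `valid` would take the places of train_pool and test_pool in catboost.train, and a pool built from `test` would be used only for the final catboost.predict.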

Here is a link with a little more detail on the overfitting detectors: https://tech.yandex.com/catboost/doc/dg/concepts/overfitting-detector-docpage/