24  Xgboost model reference - DRAFT 🛠

The following lists every model key for XGBoost models, together with the parameters each one accepts. They can be used like this:

(comment
  (ml/train df
            {:model-type <model-key>
             :param-1 0
             :param-2 1}))
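
As a concrete illustration of the template above, the following minimal sketch trains :xgboost/linear-regression on a tiny made-up dataset. It assumes that scicloj.metamorph.ml, scicloj.ml.xgboost and tech.v3.dataset are on the classpath; the dataset, the column names and the parameter values are hypothetical and chosen only for illustration.

```clojure
(require '[scicloj.metamorph.ml :as ml]
         '[scicloj.ml.xgboost]          ; loading this namespace registers the :xgboost/* model keys
         '[tech.v3.dataset :as ds]
         '[tech.v3.dataset.modelling :as ds-mod])

;; Hypothetical toy data: one feature column :x and one target column :y.
(def df
  (-> (ds/->dataset {:x [1.0 2.0 3.0 4.0 5.0 6.0]
                     :y [2.1 3.9 6.2 8.1 9.8 12.2]})
      (ds-mod/set-inference-target :y)))

;; The model key goes under :model-type; model parameters sit next to it.
;; :eta and :max-depth are taken from the :xgboost/linear-regression table;
;; the values here are illustrative, not tuned.
(def model
  (ml/train df
            {:model-type :xgboost/linear-regression
             :eta 0.1
             :max-depth 3}))

;; Predicting returns a dataset with one prediction per input row.
(def predictions
  (ml/predict df model))
```

Predicting on the training data, as done here, only checks that the pipeline runs; it says nothing about how well the model generalises.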

24.1 :xgboost/binary-hinge-loss

javadoc
user guide


24.2 :xgboost/classification

javadoc
user guide


24.3 :xgboost/count-poisson

javadoc
user guide


24.4 :xgboost/gamma-regression

javadoc
user guide


24.5 :xgboost/gpu-binary-logistic-classification

javadoc
user guide


24.6 :xgboost/gpu-binary-logistic-raw-classification

javadoc
user guide


24.7 :xgboost/gpu-linear-regression

javadoc
user guide


24.8 :xgboost/gpu-logistic-regression

javadoc
user guide


24.9 :xgboost/linear-regression

javadoc
user guide
| name | description |
|------|-------------|
| eta | Step size shrinkage used in updates to prevent overfitting. After each boosting step, we can directly get the weights of new features, and eta shrinks the feature weights to make the boosting process more conservative. |
| gamma | Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be. |
| max-depth | Maximum depth of a tree. Increasing this value makes the model more complex and more likely to overfit. A value of 0 is accepted only with the lossguide growing policy when tree_method is set to hist or gpu_hist, and indicates no limit on depth. Beware that XGBoost aggressively consumes memory when training a deep tree. |
| min-child-weight | Minimum sum of instance weight (hessian) needed in a child. If the tree partition step results in a leaf node with a sum of instance weight less than min_child_weight, the building process gives up further partitioning. In a linear regression task, this simply corresponds to the minimum number of instances needed in each node. The larger min_child_weight is, the more conservative the algorithm will be. |
| max_delta_step | Maximum delta step allowed for each leaf output. A value of 0 means there is no constraint. A positive value can help make the update step more conservative. Usually this parameter is not needed, but it might help in logistic regression when the classes are extremely imbalanced. Setting it to a value of 1-10 might help control the update. |
| subsample | Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the training data prior to growing trees, which helps prevent overfitting. Subsampling occurs once in every boosting iteration. |
| sampling_method | The method used to sample the training instances. uniform: each training instance has an equal probability of being selected; typically set subsample >= 0.5 for good results. gradient_based: the selection probability for each training instance is proportional to the regularized absolute value of gradients (more specifically, $\sqrt{g^2 + \lambda h^2}$); subsample may be set as low as 0.1 without loss of model accuracy. Note that gradient_based sampling is only supported when tree_method is set to gpu_hist; other tree methods only support uniform sampling. |
| colsample_bytree | |
| colsample_bylevel | |
| colsample_bynode | |
| lambda | L2 regularization term on weights. Increasing this value makes the model more conservative. |
| alpha | L1 regularization term on weights. Increasing this value makes the model more conservative. |
| tree_method | |
| sketch_eps | |
| scale_pos_weight | |
| updater | |
| refresh_leaf | |
| process_type | |
| grow_policy | |
| max_leaves | |
| max_bin | |
| predictor | |
| num_parallel_tree | |
| monotone_constraints | |
| interaction_constraints | |
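
The parameters above are passed in the same options map as :model-type. The sketch below builds such a map from a few of the documented parameters, leaning towards the "more conservative" direction described in the table. It assumes the names are used as keywords exactly as listed; conservative-params and train-options are hypothetical names, and the values are illustrative rather than tuned recommendations.

```clojure
;; Hypothetical parameter set; every key comes from the table above.
(def conservative-params
  {:eta 0.05             ; smaller step size shrinkage
   :gamma 1.0            ; require a larger loss reduction before splitting
   :max-depth 3          ; shallow trees
   :min-child-weight 5   ; larger minimum hessian sum per child
   :subsample 0.8        ; sample 80% of the rows in each boosting iteration
   :lambda 2.0           ; stronger L2 regularization
   :alpha 0.1})          ; some L1 regularization

;; Combine the parameters with the model key to get the options map
;; that would be passed as the second argument to ml/train.
(def train-options
  (merge {:model-type :xgboost/linear-regression}
         conservative-params))
```

Keeping the parameters in a separate map makes it easy to reuse one configuration with different model keys, under the assumption that those models accept the same parameters.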


24.10 :xgboost/logistic-binary-classification

javadoc
user guide


24.11 :xgboost/logistic-binary-raw-classification

javadoc
user guide


24.12 :xgboost/logistic-regression

javadoc
user guide


24.13 :xgboost/multiclass-softmax

javadoc
user guide


24.14 :xgboost/multiclass-softprob

javadoc
user guide


24.15 :xgboost/rank-map

javadoc
user guide


24.16 :xgboost/rank-ndcg

javadoc
user guide


24.17 :xgboost/rank-pairwise

javadoc
user guide


24.18 :xgboost/regression

javadoc
user guide


24.19 :xgboost/squared-error-regression

javadoc
user guide


24.20 :xgboost/survival-cox

javadoc
user guide


24.21 :xgboost/tweedie-regression

javadoc
user guide


source: notebooks/noj_book/xgboost.clj