Module ensemble (0.17.0)

Ensemble models. This module is styled after Scikit-Learn's ensemble module: https://scikit-learn.org/stable/modules/ensemble.html

Classes

RandomForestClassifier

RandomForestClassifier(
    num_parallel_tree: int = 100,
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 0.8,
    gamma: float = 0.0,
    max_depth: int = 15,
    subsample: float = 0.8,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    early_stop=True,
    min_rel_progress: float = 0.01,
    enable_global_explain=False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9",
)

A random forest classifier.

A random forest is a meta estimator that fits a number of decision tree classifiers on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
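
The averaging idea can be illustrated with a small pure-Python sketch. It is only a toy: `fit_stump`, `fit_forest`, and `predict` are hypothetical names, each "tree" is a single split at the mean of one feature, and none of this reflects how this module actually trains its models.

```python
import random
from statistics import mean


def fit_stump(rows):
    """Fit a one-split 'tree': split at the mean x, predict the mean y on each side."""
    threshold = mean(x for x, _ in rows)
    overall = mean(y for _, y in rows)
    left = [y for x, y in rows if x <= threshold]
    right = [y for x, y in rows if x > threshold]
    left_value = mean(left) if left else overall
    right_value = mean(right) if right else overall
    return lambda x: left_value if x <= threshold else right_value


def fit_forest(rows, num_trees=25, subsample=0.8, seed=0):
    """Fit each stump on a random sub-sample of the rows (drawn without replacement)."""
    rng = random.Random(seed)
    sample_size = max(1, int(len(rows) * subsample))
    return [fit_stump(rng.sample(rows, sample_size)) for _ in range(num_trees)]


def predict(forest, x):
    """Average the per-tree outputs; for 0/1 labels this acts like a vote share."""
    return mean(tree(x) for tree in forest)


# Toy data: label 0 for x < 5, label 1 otherwise.
data = [(x, 0.0 if x < 5 else 1.0) for x in range(10)]
forest = fit_forest(data)
print(predict(forest, 0), predict(forest, 9))
```

Each stump sees a different sub-sample, so the individual split points differ slightly; averaging smooths out those differences.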

Parameters
Name	Description
`num_parallel_tree`	`Optional[int]` Number of parallel trees constructed during each iteration. Default to 100. Minimum value is 2.
`tree_method`	`Optional[str]` Specify which tree method to use. Default to "auto". If this parameter is left at "auto", XGBoost will choose the most conservative option available. Possible values: "auto", "exact", "approx", "hist".
`min_tree_child_weight`	`Optional[float]` Minimum sum of instance weight (hessian) needed in a child. Default to 1.
`colsample_bytree`	`Optional[float]` Subsample ratio of columns when constructing each tree. Default to 1.0. The value should be between 0 and 1.
`colsample_bylevel`	`Optional[float]` Subsample ratio of columns for each level. Default to 1.0. The value should be between 0 and 1.
`colsample_bynode`	`Optional[float]` Subsample ratio of columns for each split. Default to 0.8. The value should be between 0 and 1.
`gamma`	`Optional[float]` (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.
`max_depth`	`Optional[int]` Maximum tree depth for base learners. Default to 15. The value should be greater than 0.
`subsample`	`Optional[float]` Subsample ratio of the training instance. Default to 0.8. The value should be greater than 0 and less than 1.
`reg_alpha`	`Optional[float]` L1 regularization term on weights (xgb's alpha). Default to 0.0.
`reg_lambda`	`Optional[float]` L2 regularization term on weights (xgb's lambda). Default to 1.0.
`early_stop`	`Optional[bool]` Whether training should stop after the first iteration in which the relative loss improvement is less than `min_rel_progress`. Default to True.
`min_rel_progress`	`Optional[float]` Minimum relative loss improvement necessary to continue training when early_stop is set to True. Default to 0.01.
`enable_global_explain`	`Optional[bool]` Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.
`xgboost_version`	`Optional[str]` Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
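
The ranges listed in the table can be checked before a model is constructed. The sketch below is a hypothetical helper, not part of this module: `check_forest_params` is an invented name, "between 0 and 1" is assumed to be inclusive, and the allowed string values are taken from the `typing.Literal` annotations in the signature.

```python
ALLOWED_TREE_METHODS = {"auto", "exact", "approx", "hist"}
ALLOWED_XGBOOST_VERSIONS = {"0.9", "1.1"}


def check_forest_params(params):
    """Return a list of problems found against the documented ranges (hypothetical helper)."""
    problems = []
    # Documented minimum for a random forest is 2 trees.
    if params.get("num_parallel_tree", 100) < 2:
        problems.append("num_parallel_tree must be at least 2")
    # Column subsample ratios: assumed inclusive bounds.
    for name in ("colsample_bytree", "colsample_bylevel", "colsample_bynode"):
        if name in params and not 0 <= params[name] <= 1:
            problems.append(f"{name} must be between 0 and 1")
    # Row subsample ratio: documented as strictly between 0 and 1.
    if "subsample" in params and not 0 < params["subsample"] < 1:
        problems.append("subsample must be greater than 0 and less than 1")
    if params.get("tree_method", "auto") not in ALLOWED_TREE_METHODS:
        problems.append("tree_method must be one of " + ", ".join(sorted(ALLOWED_TREE_METHODS)))
    if params.get("xgboost_version", "0.9") not in ALLOWED_XGBOOST_VERSIONS:
        problems.append("xgboost_version must be one of " + ", ".join(sorted(ALLOWED_XGBOOST_VERSIONS)))
    return problems


print(check_forest_params({"num_parallel_tree": 1, "subsample": 1.0}))
```

An empty list means the keyword arguments fall inside the documented ranges; the constructor itself may apply additional or different checks.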

RandomForestRegressor

RandomForestRegressor(
    num_parallel_tree: int = 100,
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree=1.0,
    colsample_bylevel=1.0,
    colsample_bynode=0.8,
    gamma=0.0,
    max_depth: int = 15,
    subsample=0.8,
    reg_alpha=0.0,
    reg_lambda=1.0,
    early_stop=True,
    min_rel_progress=0.01,
    enable_global_explain=False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9",
)

A random forest regressor.

A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.

Parameters
Name	Description
`num_parallel_tree`	`Optional[int]` Number of parallel trees constructed during each iteration. Default to 100. Minimum value is 2.
`tree_method`	`Optional[str]` Specify which tree method to use. Default to "auto". If this parameter is left at "auto", XGBoost will choose the most conservative option available. Possible values: "auto", "exact", "approx", "hist".
`min_tree_child_weight`	`Optional[float]` Minimum sum of instance weight (hessian) needed in a child. Default to 1.
`colsample_bytree`	`Optional[float]` Subsample ratio of columns when constructing each tree. Default to 1.0. The value should be between 0 and 1.
`colsample_bylevel`	`Optional[float]` Subsample ratio of columns for each level. Default to 1.0. The value should be between 0 and 1.
`colsample_bynode`	`Optional[float]` Subsample ratio of columns for each split. Default to 0.8. The value should be between 0 and 1.
`gamma`	`Optional[float]` (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.
`max_depth`	`Optional[int]` Maximum tree depth for base learners. Default to 15. The value should be greater than 0.
`subsample`	`Optional[float]` Subsample ratio of the training instance. Default to 0.8. The value should be greater than 0 and less than 1.
`reg_alpha`	`Optional[float]` L1 regularization term on weights (xgb's alpha). Default to 0.0.
`reg_lambda`	`Optional[float]` L2 regularization term on weights (xgb's lambda). Default to 1.0.
`early_stop`	`Optional[bool]` Whether training should stop after the first iteration in which the relative loss improvement is less than `min_rel_progress`. Default to True.
`min_rel_progress`	`Optional[float]` Minimum relative loss improvement necessary to continue training when early_stop is set to True. Default to 0.01.
`enable_global_explain`	`Optional[bool]` Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.
`xgboost_version`	`Optional[str]` Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
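
In upstream XGBoost, the three `colsample_by*` ratios are cumulative: columns are sampled per tree, then per level from that subset, then per split from the level's subset. Assuming these parameters are passed through unchanged (an assumption, not something this page states), the fraction of columns considered at a split can be estimated as a simple product. `effective_column_ratio` is a hypothetical helper name.

```python
def effective_column_ratio(colsample_bytree=1.0, colsample_bylevel=1.0, colsample_bynode=0.8):
    """Approximate fraction of columns available at each split (product of the three ratios)."""
    return colsample_bytree * colsample_bylevel * colsample_bynode


# With the random forest defaults, about 80% of the columns are candidates per split.
print(effective_column_ratio())

# With every ratio at 0.5 and 64 input columns, roughly 8 columns remain per split.
print(64 * effective_column_ratio(0.5, 0.5, 0.5))
```

The exact count XGBoost uses depends on its internal rounding, so treat the product as an estimate rather than a guarantee.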

XGBClassifier

XGBClassifier(
    num_parallel_tree: int = 1,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    early_stop: bool = True,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    min_rel_progress: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9",
)

XGBoost classifier model.

Parameters
Name	Description
`num_parallel_tree`	`Optional[int]` Number of parallel trees constructed during each iteration. Default to 1.
`booster`	`Optional[str]` Specify which booster to use: gbtree or dart. Default to "gbtree".
`dart_normalized_type`	`Optional[str]` Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".
`tree_method`	`Optional[str]` Specify which tree method to use. Default to "auto". If this parameter is left at "auto", XGBoost will choose the most conservative option available. Possible values: "auto", "exact", "approx", "hist".
`min_tree_child_weight`	`Optional[float]` Minimum sum of instance weight (hessian) needed in a child. Default to 1.
`colsample_bytree`	`Optional[float]` Subsample ratio of columns when constructing each tree. Default to 1.0.
`colsample_bylevel`	`Optional[float]` Subsample ratio of columns for each level. Default to 1.0.
`colsample_bynode`	`Optional[float]` Subsample ratio of columns for each split. Default to 1.0.
`gamma`	`Optional[float]` (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.
`max_depth`	`Optional[int]` Maximum tree depth for base learners. Default to 6.
`subsample`	`Optional[float]` Subsample ratio of the training instance. Default to 1.0.
`reg_alpha`	`Optional[float]` L1 regularization term on weights (xgb's alpha). Default to 0.0.
`reg_lambda`	`Optional[float]` L2 regularization term on weights (xgb's lambda). Default to 1.0.
`early_stop`	`Optional[bool]` Whether training should stop after the first iteration in which the relative loss improvement is less than `min_rel_progress`. Default to True.
`learning_rate`	`Optional[float]` Boosting learning rate (xgb's "eta"). Default to 0.3.
`max_iterations`	`Optional[int]` Maximum number of rounds for boosting. Default to 20.
`min_rel_progress`	`Optional[float]` Minimum relative loss improvement necessary to continue training when early_stop is set to True. Default to 0.01.
`enable_global_explain`	`Optional[bool]` Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.
`xgboost_version`	`Optional[str]` Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
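
The interaction of `early_stop`, `min_rel_progress`, and `max_iterations` can be sketched as a stopping rule over a sequence of per-iteration losses. This is an illustration under stated assumptions: `iterations_run` is a hypothetical function, and the relative improvement is assumed to be `(previous - current) / previous`, which this page does not define.

```python
def iterations_run(losses, early_stop=True, min_rel_progress=0.01, max_iterations=20):
    """Return how many boosting iterations would run for a given loss sequence."""
    losses = losses[:max_iterations]  # never run more than max_iterations rounds
    for i in range(1, len(losses)):
        previous, current = losses[i - 1], losses[i]
        # Assumed definition of relative loss improvement.
        rel_progress = (previous - current) / previous if previous else 0.0
        if early_stop and rel_progress < min_rel_progress:
            return i + 1  # stop after the first iteration with too little progress
    return len(losses)


losses = [1.0, 0.5, 0.4, 0.399, 0.39]
print(iterations_run(losses))                    # stops once the loss plateaus
print(iterations_run(losses, early_stop=False))  # runs through every recorded loss
```

With `early_stop=False`, only `max_iterations` limits the number of rounds.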

XGBRegressor

XGBRegressor(
    num_parallel_tree: int = 1,
    booster: typing.Literal["gbtree", "dart"] = "gbtree",
    dart_normalized_type: typing.Literal["tree", "forest"] = "tree",
    tree_method: typing.Literal["auto", "exact", "approx", "hist"] = "auto",
    min_tree_child_weight: int = 1,
    colsample_bytree: float = 1.0,
    colsample_bylevel: float = 1.0,
    colsample_bynode: float = 1.0,
    gamma: float = 0.0,
    max_depth: int = 6,
    subsample: float = 1.0,
    reg_alpha: float = 0.0,
    reg_lambda: float = 1.0,
    early_stop: bool = True,
    learning_rate: float = 0.3,
    max_iterations: int = 20,
    min_rel_progress: float = 0.01,
    enable_global_explain: bool = False,
    xgboost_version: typing.Literal["0.9", "1.1"] = "0.9",
)

XGBoost regression model.

Parameters
Name	Description
`num_parallel_tree`	`Optional[int]` Number of parallel trees constructed during each iteration. Default to 1.
`booster`	`Optional[str]` Specify which booster to use: gbtree or dart. Default to "gbtree".
`dart_normalized_type`	`Optional[str]` Type of normalization algorithm for DART booster. Possible values: "tree", "forest". Default to "tree".
`tree_method`	`Optional[str]` Specify which tree method to use. Default to "auto". If this parameter is left at "auto", XGBoost will choose the most conservative option available. Possible values: "auto", "exact", "approx", "hist".
`min_tree_child_weight`	`Optional[float]` Minimum sum of instance weight (hessian) needed in a child. Default to 1.
`colsample_bytree`	`Optional[float]` Subsample ratio of columns when constructing each tree. Default to 1.0.
`colsample_bylevel`	`Optional[float]` Subsample ratio of columns for each level. Default to 1.0.
`colsample_bynode`	`Optional[float]` Subsample ratio of columns for each split. Default to 1.0.
`gamma`	`Optional[float]` (min_split_loss) Minimum loss reduction required to make a further partition on a leaf node of the tree. Default to 0.0.
`max_depth`	`Optional[int]` Maximum tree depth for base learners. Default to 6.
`subsample`	`Optional[float]` Subsample ratio of the training instance. Default to 1.0.
`reg_alpha`	`Optional[float]` L1 regularization term on weights (xgb's alpha). Default to 0.0.
`reg_lambda`	`Optional[float]` L2 regularization term on weights (xgb's lambda). Default to 1.0.
`early_stop`	`Optional[bool]` Whether training should stop after the first iteration in which the relative loss improvement is less than `min_rel_progress`. Default to True.
`learning_rate`	`Optional[float]` Boosting learning rate (xgb's "eta"). Default to 0.3.
`max_iterations`	`Optional[int]` Maximum number of rounds for boosting. Default to 20.
`min_rel_progress`	`Optional[float]` Minimum relative loss improvement necessary to continue training when early_stop is set to True. Default to 0.01.
`enable_global_explain`	`Optional[bool]` Whether to compute global explanations using explainable AI to evaluate global feature importance to the model. Default to False.
`xgboost_version`	`Optional[str]` Specifies the XGBoost version for model training. Default to "0.9". Possible values: "0.9", "1.1".
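
To see how `learning_rate` and `max_iterations` trade off, consider a deliberately simplified boosting loop in which every round's "tree" is a single constant fitted to the current residual. `boosted_constant` is a hypothetical name and the model is a toy, not XGBoost's algorithm; it only shows the shrinkage effect of the learning rate.

```python
def boosted_constant(target, learning_rate=0.3, max_iterations=20):
    """Approach a single target value by adding a shrunken residual correction each round."""
    prediction = 0.0
    for _ in range(max_iterations):
        residual = target - prediction          # what the next weak learner should fit
        prediction += learning_rate * residual  # shrink the correction by the learning rate
    return prediction


# After n rounds the prediction equals target * (1 - (1 - learning_rate) ** n).
print(boosted_constant(10.0))
print(boosted_constant(10.0, learning_rate=0.1))
```

A smaller learning rate leaves more of the target unexplained after the same number of rounds, which is why lowering `learning_rate` usually goes together with raising `max_iterations`.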