XGBoostRegressor¶
In [ ]:
XGBoostRegressor(name: str,
                 cursor=None,
                 max_ntree: int = 10,
                 max_depth: int = 5,
                 nbins: int = 32,
                 objective: str = 'squarederror',
                 split_proposal_method: str = 'global',
                 tol: float = 0.001,
                 learning_rate: float = 0.1,
                 min_split_loss: float = 0,
                 weight_reg: float = 0,
                 sample: float = 1)
Creates an XGBoostRegressor object by using the Vertica XGBoost algorithm on the data.
Parameters¶
Name | Type | Optional | Description |
---|---|---|---|
name | str | ❌ | Name of the model to be stored in the database. |
cursor | DBcursor | ✓ | Vertica DB cursor. |
max_ntree | int | ✓ | Maximum number of trees that will be created. |
max_depth | int | ✓ | Maximum depth of each tree, an integer between 1 and 100, inclusive. |
nbins | int | ✓ | Number of bins to use for finding splits in each column. More bins lead to a longer runtime but more fine-grained, and possibly better, splits. |
objective | str | ✓ | The objective/loss function that will be used to iteratively improve the model. |
split_proposal_method | str | ✓ | Approximate splitting strategy. Can be 'global' or 'local' (not yet supported). |
tol | float | ✓ | Approximation error of the quantile summary structures used in the approximate split-finding method. |
learning_rate | float | ✓ | Weight applied to each tree's prediction. This reduces each tree's impact, allowing later trees to contribute and keeping earlier trees from dominating the improvements. |
min_split_loss | float | ✓ | Each split must improve the model's objective function value by at least this much in order not to be pruned. A value of 0 turns this parameter off (trees are still pruned based on positive/negative objective function values). |
weight_reg | float | ✓ | Regularization term that is applied to the weights of the leaves in the regression tree. The higher this value is, the more sparse/smooth the weights will be, which often helps prevent overfitting. |
sample | float | ✓ | Fraction of rows to use in training per iteration. |
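For example, any of these defaults can be overridden at creation time. The following is a minimal sketch, assuming a vertica_python connection; the connection details, schema, and model name are placeholders, not part of this page.

In [ ]:
import vertica_python
from verticapy.learn.ensemble import XGBoostRegressor

# Hypothetical connection details, replace with your own.
conn_info = {"host": "localhost", "port": 5433, "user": "dbadmin",
             "password": "", "database": "testdb"}
cursor = vertica_python.connect(**conn_info).cursor()

# Override a few defaults; unspecified parameters keep the
# values shown in the signature above.
model = XGBoostRegressor(name = "public.xgb_example",
                         cursor = cursor,
                         max_ntree = 50,
                         learning_rate = 0.05,
                         min_split_loss = 0.1,
                         sample = 0.8)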
Attributes¶
After the object is created, all the parameters become attributes. The model also creates extra attributes when it is fit:
Name | Type | Description |
---|---|---|
input_relation | str | Training relation. |
X | list | List of the predictors. |
y | str | Response column. |
test_relation | str | Relation to use to test the model. All model methods are abstractions that simplify the process. The testing relation will be used by the methods to evaluate the model. If empty, the training relation will be used instead. This attribute can be changed at any time. |
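As a sketch of how these attributes are populated, assuming the titanic demo dataset with the column names below (assumptions, not shown on this page):

In [ ]:
# fit() trains the model in-database and stores the training
# metadata on the Python object.
model.fit("public.titanic",          # input_relation
          ["age", "fare", "parch"],  # X: predictor columns (assumed)
          "survived")                # y: response column (assumed)

print(model.input_relation)  # training relation
print(model.X)               # list of the predictors
print(model.y)               # response column

# test_relation defaults to the training relation; it can be
# reassigned at any time, e.g. to a hypothetical holdout relation.
model.test_relation = "public.titanic_test"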
Methods¶
Name | Description |
---|---|
deploySQL | Returns the SQL code needed to deploy the model. |
drop | Drops the model from the Vertica DB. |
export_graphviz | Converts the input tree to graphviz. |
features_importance | Computes the model's feature importance using the Gini index. |
fit | Trains the model. |
get_attr | Returns the model attribute. |
get_params | Returns the model parameters. |
get_tree | Returns a tablesample with all the input tree information. |
plot | Draws the model. |
plot_tree | Draws the input tree. The module anytree must be installed on the machine. |
predict | Predicts using the input relation. |
regression_report | Computes a regression report using multiple metrics to evaluate the model (R2, MSE, max error, etc.). |
score | Computes the model score. |
set_cursor | Sets a new DB cursor. |
set_params | Sets the parameters of the model. |
shapExplainer | Creates a shapExplainer for the model. |
to_sklearn | Converts the Vertica model to an sklearn model. |
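Putting a few of these methods together, here is a hedged sketch of a typical evaluate-and-predict workflow on a fitted model; the metric name, the scoring vDataFrame, and the prediction column name are assumptions:

In [ ]:
# Multi-metric evaluation on the test relation.
report = model.regression_report()

# Single-metric score; 'r2' is one of the available methods.
r2 = model.score(method = "r2")

# Add the model's predictions to a vDataFrame of scoring data,
# then inspect the SQL Vertica would use to deploy the model.
predictions = model.predict(titanic_vdf, name = "pred_fare")
sql = model.deploySQL()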
Example¶
In [10]:
from verticapy.learn.ensemble import XGBoostRegressor
model = XGBoostRegressor(name = "public.xgb_titanic",
                         max_ntree = 20,
                         max_depth = 15)
display(model)
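The model created above exists only as a Python object until it is trained; a follow-up fit call, assuming the titanic relation and column names used earlier on this page, would complete the example:

In [ ]:
model.fit("public.titanic", ["age", "fare"], "survived")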