Model.fit

In [ ]:
Model.fit(input_relation: (str, vDataFrame),
          X: list,
          y: str)

Trains the model.

Parameters

Name            Type              Optional  Description
input_relation  str / vDataFrame            Training relation.
X               list                        List of the predictors.
y               str                         Response column.

Returns

object : model grid (the tested models with their averaged scores and timings)

Example

In [1]:
from verticapy.learn.delphi import AutoML

model = AutoML("titanic_autoML", stepwise = False)
model.fit("public.titanic", y = "survived")
Starting AutoML

Testing Model - LogisticRegression

Model: LogisticRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'penalty': 'none', 'solver': 'bfgs'}; Test_score: 0.058747133750518495; Train_score: 0.030520459056835366; Time: 10.177567720413208;
Model: LogisticRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'penalty': 'l1', 'solver': 'cgd', 'C': 1.0}; Test_score: 0.301029995663981; Train_score: 0.301029995663981; Time: 0.5177769660949707;
Model: LogisticRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'penalty': 'l2', 'solver': 'bfgs', 'C': 1.0}; Test_score: 0.03467372215888777; Train_score: 0.040084656014095; Time: 8.65127698580424;
Model: LogisticRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'penalty': 'enet', 'solver': 'cgd', 'C': 1.0, 'l1_ratio': 0.5}; Test_score: 0.301029995663981; Train_score: 0.301029995663981; Time: 0.4305996100107829;

Grid Search Selected Model
LogisticRegression; Parameters: {'solver': 'bfgs', 'penalty': 'l2', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Test_score: 0.03467372215888777; Train_score: 0.040084656014095; Time: 8.65127698580424;

Testing Model - RandomForestClassifier

Model: RandomForestClassifier; Parameters: {'max_features': 'max', 'max_leaf_nodes': 32, 'max_depth': 5, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.6912644770548286; Train_score: 0.018363257686033533; Time: 0.5188576380411783;
Model: RandomForestClassifier; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 64, 'max_depth': 4, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.04780020077573837; Train_score: 0.03799784570322354; Time: 0.6012808481852213;
Model: RandomForestClassifier; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 32, 'max_depth': 4, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.1121156175279119; Train_score: 0.036462521024479964; Time: 0.6061913967132568;
Model: RandomForestClassifier; Parameters: {'max_features': 'max', 'max_leaf_nodes': 64, 'max_depth': 4, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.32044589382468697; Train_score: 0.026115669482827335; Time: 0.5158038934071859;
Model: RandomForestClassifier; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 1000, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.10587022372781703; Train_score: 0.03147628534489167; Time: 0.6188173294067383;

Grid Search Selected Model
RandomForestClassifier; Parameters: {'n_estimators': 10, 'max_features': 'auto', 'max_leaf_nodes': 64, 'sample': 0.632, 'max_depth': 4, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.04780020077573837; Train_score: 0.03799784570322354; Time: 0.6012808481852213;

Testing Model - NaiveBayes

Model: NaiveBayes; Parameters: {'alpha': 0.01}; Test_score: 49.545500338212236; Train_score: 48.0569530726675; Time: 0.32554197311401367;
Model: NaiveBayes; Parameters: {'alpha': 1.0}; Test_score: 48.6479075941986; Train_score: 48.543435159082904; Time: 0.3659666379292806;
Model: NaiveBayes; Parameters: {'alpha': 10.0}; Test_score: 48.274806620868695; Train_score: 48.7333732981366; Time: 0.36309019724527997;

Grid Search Selected Model
NaiveBayes; Parameters: {'alpha': 10.0, 'nbtype': 'auto'}; Test_score: 48.274806620868695; Train_score: 48.7333732981366; Time: 0.36309019724527997;

Final Model

LogisticRegression; Best_Parameters: {'solver': 'bfgs', 'penalty': 'l2', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Best_Test_score: 0.03467372215888777; Train_score: 0.040084656014095; Time: 8.65127698580424;


Out[1]:
    model_type              avg_score             avg_train_score       avg_time             score_std              score_train_std
1   LogisticRegression      0.03467372215888777   0.040084656014095     8.65127698580424     0.0027533715028225498  0.0017338790478024836
2   RandomForestClassifier  0.04780020077573837   0.03799784570322354   0.6012808481852213   0.007854641111027778   0.006098402782038309
3   LogisticRegression      0.058747133750518495  0.030520459056835366  10.177567720413208   0.049502886401589015   0.007393253247850581
4   RandomForestClassifier  0.10587022372781703   0.03147628534489167   0.6188173294067383   0.11364772306051597    0.005364074296552674
5   RandomForestClassifier  0.1121156175279119    0.036462521024479964  0.6061913967132568   0.13038154999087198    0.008716486006951588
6   LogisticRegression      0.301029995663981     0.301029995663981     0.5177769660949707   0.0                    0.0
7   LogisticRegression      0.301029995663981     0.301029995663981     0.4305996100107829   0.0                    0.0
8   RandomForestClassifier  0.32044589382468697   0.026115669482827335  0.5158038934071859   0.3207862599458925     0.0026885404517372454
9   RandomForestClassifier  0.6912644770548286    0.018363257686033533  0.5188576380411783   0.36540898419926204    0.0023180986540416245
10  NaiveBayes              48.274806620868695    48.7333732981366      0.36309019724527997  1.2913814331123625     0.6408280844475029
11  NaiveBayes              48.6479075941986      48.543435159082904    0.3659666379292806   1.828739911336018      0.87773409521784
12  NaiveBayes              49.545500338212236    48.0569530726675      0.32554197311401367  1.050975009868308      0.5687428669096175
Rows: 1-12 | Columns: 8
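In the grid above, the reported metric is an error (log loss), so lower is better: the grid search keeps the best candidate within each model family, and the final model is the family whose best candidate has the lowest averaged test score. A minimal sketch of that selection logic in plain Python, using the averaged scores from the grid above (this illustrates the ranking only; it is not the VerticaPy API):

```python
# Each tuple: (model_type, avg_score), copied from the result grid above.
# The metric is an error (log loss), so lower is better.
results = [
    ("LogisticRegression", 0.03467372215888777),
    ("RandomForestClassifier", 0.04780020077573837),
    ("LogisticRegression", 0.058747133750518495),
    ("RandomForestClassifier", 0.10587022372781703),
    ("RandomForestClassifier", 0.1121156175279119),
    ("LogisticRegression", 0.301029995663981),
    ("LogisticRegression", 0.301029995663981),
    ("RandomForestClassifier", 0.32044589382468697),
    ("RandomForestClassifier", 0.6912644770548286),
    ("NaiveBayes", 48.274806620868695),
    ("NaiveBayes", 48.6479075941986),
    ("NaiveBayes", 49.545500338212236),
]

# Best candidate per family, mirroring each "Grid Search Selected Model" line.
best_per_family = {}
for model_type, score in results:
    if model_type not in best_per_family or score < best_per_family[model_type]:
        best_per_family[model_type] = score

# Final model: the family whose best candidate has the smallest test error.
best_model, best_score = min(best_per_family.items(), key=lambda item: item[1])
print(best_model, best_score)
# → LogisticRegression 0.03467372215888777
```

This matches the "Final Model" line in the log: LogisticRegression wins with an averaged test score of about 0.0347, ahead of the best RandomForestClassifier (about 0.0478), while NaiveBayes is far behind on this metric.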