
verticapy.machine_learning.model_selection.hp_tuning.enet_search_cv

verticapy.machine_learning.model_selection.hp_tuning.enet_search_cv(input_relation: str | vDataFrame, X: str | list[str], y: str, metric: str = 'auto', cv: int = 3, estimator_type: Literal['logit', 'enet', 'auto'] = 'auto', cutoff: float = -1.0, print_info: bool = True, **kwargs) → TableSample

Computes the k-fold grid search using multiple ENet models.

Parameters

input_relation: SQLRelation

Relation used to train the model.

X: SQLColumns

List of the predictor columns.

y: str

Response column.

metric: str, optional

Metric used for the model evaluation; an example call setting it appears after the parameter descriptions below.

  • auto:

    logloss for classification & RMSE for regression.

For Classification

  • accuracy:

    Accuracy.

    \[Accuracy = \frac{TP + TN}{TP + TN + FP + FN}\]
  • auc:

    Area Under the Curve (ROC).

    \[AUC = \int_{0}^{1} TPR(FPR) \, dFPR\]
  • ba:

    Balanced Accuracy.

    \[BA = \frac{TPR + TNR}{2}\]
  • bm:

    Informedness

    \[BM = TPR + TNR - 1\]
  • csi:

    Critical Success Index

    \[index = \frac{TP}{TP + FN + FP}\]
  • f1:

    F1 Score.

    \[F_1 = 2 \times \frac{Precision \times Recall}{Precision + Recall}\]

  • fdr:

    False Discovery Rate

    \[FDR = 1 - PPV\]
  • fm:

    Fowlkes-Mallows index

    \[FM = \sqrt{PPV \times TPR}\]
  • fnr:

    False Negative Rate

    \[FNR = \frac{FN}{FN + TP}\]
  • for:

    False Omission Rate

    \[FOR = 1 - NPV\]
  • fpr:

    False Positive Rate

    \[FPR = \frac{FP}{FP + TN}\]
  • logloss:

    Log Loss

    \[Loss = -\frac{1}{N} \sum_{i=1}^{N} \left( y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right)\]
  • lr+:

    Positive Likelihood Ratio.

    \[LR+ = \frac{TPR}{FPR}\]
  • lr-:

    Negative Likelihood Ratio.

    \[LR- = \frac{FNR}{TNR}\]
  • dor:

    Diagnostic Odds Ratio.

    \[DOR = \frac{TP \times TN}{FP \times FN}\]
  • mcc:

    Matthews Correlation Coefficient.

    \[MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}\]

  • mk:

    Markedness

    \[MK = PPV + NPV - 1\]
  • npv:

    Negative Predictive Value

    \[NPV = \frac{TN}{TN + FN}\]
  • prc_auc:

    Area Under the Curve (PRC)

    \[AUC = \int_{0}^{1} Precision(Recall) \, dRecall\]
  • precision:

    Precision

    \[Precision = \frac{TP}{TP + FP}\]
  • pt:

    Prevalence Threshold.

    \[PT = \frac{\sqrt{FPR}}{\sqrt{TPR} + \sqrt{FPR}}\]
  • recall:

    Recall.

    \[Recall = \frac{TP}{TP + FN}\]
  • specificity:

    Specificity.

    \[Specificity = \frac{TN}{TN + FP}\]
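
To make the count-based definitions above concrete, here is a tiny, self-contained computation from raw confusion-matrix counts (illustrative values only):

TP, TN, FP, FN = 40, 45, 5, 10

accuracy  = (TP + TN) / (TP + TN + FP + FN)   # 0.85
precision = TP / (TP + FP)                    # PPV ~ 0.889
recall    = TP / (TP + FN)                    # TPR = 0.8
tnr       = TN / (TN + FP)                    # specificity = 0.9
ba        = (recall + tnr) / 2                # balanced accuracy = 0.85
f1        = 2 * precision * recall / (precision + recall)  # ~0.842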

For Regression

  • max:

    Max Error.

    \[ME = \max_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
  • mae:

    Mean Absolute Error.

    \[MAE = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
  • median:

    Median Absolute Error.

    \[MedAE = \text{median}_{i=1}^{n} \left| y_i - \hat{y}_i \right|\]
  • mse:

    Mean Squared Error.

    \[MSE = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2\]
  • msle:

    Mean Squared Log Error.

    \[MSLE = \frac{1}{n} \sum_{i=1}^{n} (\log(1 + y_i) - \log(1 + \hat{y}_i))^2\]
  • r2:

    R squared coefficient.

    \[R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}\]
  • r2a:

    R2 adjusted

    \[\text{Adjusted } R^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}\]
  • var:

    Explained Variance.

    \[VAR = 1 - \frac{Var(y - \hat{y})}{Var(y)}\]
  • rmse:

    Root-mean-squared error

    \[RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2}\]
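
The regression metrics can be checked the same way; a minimal, self-contained example with made-up values:

y_true = [3.0, 5.0, 2.5, 7.0]
y_pred = [2.5, 5.0, 4.0, 8.0]

n = len(y_true)
errors = [yt - yp for yt, yp in zip(y_true, y_pred)]

mae  = sum(abs(e) for e in errors) / n   # 0.75
mse  = sum(e ** 2 for e in errors) / n   # 0.875
rmse = mse ** 0.5                        # ~0.935
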
cv: int, optional

Number of folds.

estimator_type: str, optional

Estimator type: 'logit' (logistic regression), 'enet' (ElasticNet), or 'auto', which selects between the two depending on whether the task is classification or regression.

cutoff: float, optional

The model cutoff (logit only).

print_info: bool, optional

If set to True, prints the model information at each step.
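
For example, a call that sets these options explicitly might look like the following sketch (data, X, and y are assumed to match the winequality example below):

result = enet_search_cv(
    input_relation = data,
    X = ["fixed_acidity", "volatile_acidity"],
    y = "good",
    metric = "ba",             # optimize balanced accuracy instead of logloss
    cv = 5,                    # 5-fold cross-validation
    estimator_type = "logit",  # force logistic regression models
    cutoff = 0.5,              # probability threshold for the logit models
    print_info = False,        # silence per-step output
)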

Returns

TableSample

Result of the ENet search.

Examples

We import verticapy:

import verticapy as vp

Hint

By assigning an alias to verticapy, we mitigate the risk of code collisions with other libraries. This precaution is necessary because verticapy uses commonly known function names like “average” and “median”, which can potentially lead to naming conflicts. The use of an alias ensures that the functions from verticapy are used as intended without interfering with functions from other libraries.

For this example, we will use the Wine Quality dataset.

import verticapy.datasets as vpd

data = vpd.load_winequality()
[Output: the winequality vDataFrame — columns: fixed_acidity, volatile_acidity, citric_acid, residual_sugar, chlorides, free_sulfur_dioxide, total_sulfur_dioxide, density, pH, sulphates, alcohol, quality, good, color]

Rows: 1-100 | Columns: 14

Note

VerticaPy offers a wide range of sample datasets that are ideal for training and testing purposes. You can explore the full list of available datasets in the Datasets page, which provides detailed information on each dataset and how to use them effectively. These datasets are invaluable resources for honing your data analysis and machine learning skills within the VerticaPy environment.

Now we can conveniently use the enet_search_cv() function to perform the k-fold grid search using multiple ENet models. Note that no estimator is passed: the function builds the candidate models itself (as described in the note below, LogisticRegression models for a classification task such as this one).

from verticapy.machine_learning.model_selection import enet_search_cv

result = enet_search_cv(
    input_relation = data,
    X = [
        "fixed_acidity",
        "volatile_acidity",
        "citric_acid",
        "residual_sugar",
        "chlorides",
        "density",
    ],
    y = "good",
    cv = 3,
)
[Output: a TableSample listing each tested hyperparameter combination, sorted by avg_score (logloss, lower is better); the displayed columns include avg_score, avg_train_score, avg_time, score_std, and score_train_std, with the best avg_score ≈ 0.2885]

Rows: 1-110 | Columns: 6
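
The returned TableSample can be consumed directly; a small sketch, assuming TableSample's dict-style column access and to_pandas() conversion:

best_score = min(result["avg_score"])   # logloss: lower is better
print(f"best avg logloss: {best_score:.4f}")

df = result.to_pandas()                 # continue the analysis in pandas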

Note

In VerticaPy, Elastic Net Cross-Validation (EnetCV) utilizes multiple ElasticNet models for regression tasks and LogisticRegression models for classification tasks. It systematically tests various combinations of hyperparameters, such as the regularization terms (L1 and L2), to identify the set that optimizes the model’s performance. This process helps in automatically fine-tuning the model to achieve better accuracy and generalization on diverse datasets.
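
Conceptually, this search is close to a plain grid search over the ElasticNet regularization grid. A rough, hypothetical sketch of an equivalent call (the grid below is illustrative; the grid actually used by enet_search_cv is internal and may differ):

from verticapy.machine_learning.model_selection import grid_search_cv
from verticapy.machine_learning.vertica import LogisticRegression

result = grid_search_cv(
    LogisticRegression(penalty = "enet"),
    param_grid = {
        "C": [0.01, 0.1, 1.0, 10.0],              # regularization strength
        "l1_ratio": [0.0, 0.25, 0.5, 0.75, 1.0],  # L1 / L2 mix
    },
    input_relation = data,
    X = ["fixed_acidity", "volatile_acidity", "citric_acid"],
    y = "good",
    metric = "logloss",
    cv = 3,
)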

See also

grid_search_cv() : Computes the k-fold grid search of an estimator.
randomized_search_cv() : Computes the k-fold randomized search of an estimator.