Python API for Vertica Data Science at Scale
What's New in VerticaPy v0.4.x
VerticaPy v0.4.x offers several new features:
- New Method to_sklearn: Convert your Vertica Models to sklearn models.
- New Method shapExplainer: Get a SHAP explainer for your Vertica linear models.
- It's easier to split a vDataFrame into training and test sets. The method vDataFrame.train_test_split is now available.
- The function train_test_split of the 'model_selection' package has been removed.
- get_model_attribute was renamed to get_attr.
- Integrating with GeoPandas to easily draw maps.
- New Method vDataFrame.to_geopandas which allows you to export your vDataFrame to the corresponding GeoPandas DataFrame.
- New datasets: load_world and load_cities allow you to load the different countries and cities as Geospatial objects.
- Changes to methods: vDataFrame.isin and vDataFrame[].isin. They are now more intuitive and return a new vDataFrame containing the search results.
- Changes to method: vDataFrame.discretize. It is possible to use random forest parameters as inputs.
- New Methods vDataFrame.iv_woe and vDataFrame[].iv_woe to compute IV (Information Value) / WOE (Weight Of Evidence) tables.
- Functions set_display_parameters, sql_on_off and time_on_off were removed. They are now all part of a more flexible function, set_option, which makes global options easier to set.
- vDataFrame.corr Kendall rank correlation method was corrected; it now computes Kendall Tau-B.
- 'stratified' and 'systematic' sampling techniques were added to the vDataFrame.sample method.
- Introducing vDataFrame Magic Methods.
- Method simplification: vDataFrame.select, vDataFrame.groupby, vDataFrame.case_when and vDataFrame.decode. The ambiguous parameters are removed and they are now easier to use.
- You can now join time series with interpolation: vDataFrame.join.
- New method: vDataFrame.corr_pvalue. You can now get the correlation coefficient p-value. Kendall Tau A & C are now available.
- Many statistical SQL functions are now available for easy feature creation.
- It is now possible to use customized expressions in the different graphics computations.
- set_option lets you change the colors used in graphics.
- set_option lets you seed the randomness of many functions and methods.
- It is now easier to set up DB connections: Functions read_dsn and vertica_conn have new parameters to use an input file location.
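
For readers unfamiliar with the Kendall Tau-B coefficient mentioned in the vDataFrame.corr item above, here is a minimal pure-Python sketch of the statistic. It is not VerticaPy's implementation (which runs in the database); the function name kendall_tau_b is hypothetical and the O(n^2) pairwise loop is only meant to make the tie handling visible.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Illustrative Kendall Tau-B; hypothetical helper, not a VerticaPy function."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    # Compare every pair of observations once.
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from both factors
        elif dx == 0:
            ties_x += 1  # tied in x only
        elif dy == 0:
            ties_y += 1  # tied in y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    # Tau-B corrects the denominator for ties in each variable.
    denom = sqrt((concordant + discordant + ties_x)
                 * (concordant + discordant + ties_y))
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom


tau_ties = kendall_tau_b([1, 2, 2, 3], [1, 2, 3, 3])
tau_perfect = kendall_tau_b([1, 2, 3, 4], [1, 2, 3, 4])
tau_reversed = kendall_tau_b([1, 2, 3, 4], [4, 3, 2, 1])
```

Without ties, the denominator reduces to the total number of pairs and the value coincides with Tau-A.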
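
As an illustration of what an IV / WOE table like the one mentioned in the iv_woe item above contains, the following sketch computes Weight Of Evidence per category and the total Information Value in plain Python. It is not VerticaPy code: the helper name iv_woe_table is hypothetical, the sign convention (non-events over events) is an assumption that differs between references, and categories with an empty class are given a WOE of 0 purely for simplicity.

```python
from collections import Counter
from math import log


def iv_woe_table(categories, target):
    """Illustrative IV/WOE table for a binary target (1 = event, 0 = non-event)."""
    events, non_events = Counter(), Counter()
    for category, label in zip(categories, target):
        if label == 1:
            events[category] += 1
        else:
            non_events[category] += 1
    total_events = sum(events.values())
    total_non_events = sum(non_events.values())
    if total_events == 0 or total_non_events == 0:
        raise ValueError("target must contain both events and non-events")

    table = {}
    total_iv = 0.0
    for category in sorted(set(events) | set(non_events)):
        pct_events = events[category] / total_events
        pct_non_events = non_events[category] / total_non_events
        if pct_events == 0 or pct_non_events == 0:
            woe = 0.0  # assumption: skip undefined logarithms
        else:
            # Assumed convention: ln(share of non-events / share of events).
            woe = log(pct_non_events / pct_events)
        iv = (pct_non_events - pct_events) * woe
        table[category] = {"woe": woe, "iv": iv}
        total_iv += iv
    return table, total_iv


cats = ["a"] * 4 + ["b"] * 4
labels = [1, 0, 0, 0, 1, 1, 1, 0]
woe_table, information_value = iv_woe_table(cats, labels)
```

Flipping the convention changes the sign of each WOE value but leaves the Information Value unchanged, since both factors of each term change sign together.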