Estimating Lithium-ion Battery Health

Introduction to lithium-based batteries, their cycle characteristics, and aging

Lithium-ion (or Li-ion) batteries are rechargeable batteries used in a variety of electronic devices, ranging from electric vehicles and smartphones to satellites.

However, despite their wide adoption, research isn't mature enough to avoid problems with battery health and safety. Given the ubiquity of consumer electronics using the technology, this has led to outcomes that range from poor user experience to public safety concerns (see, for example, the Samsung Galaxy Note 7 explosions in 2016).

Dataset

In this example of predictive maintenance, we propose a data-driven method to estimate the health of a battery using the Li-ion battery dataset released by NASA (csv).

This dataset includes information on Li-ion batteries over several charge and discharge cycles at room temperature. Charging was at a constant current (CC) at 1.5A until the battery voltage reached 4.2V and then continued in a constant voltage (CV) mode until the charge current dropped to 20mA. Discharge was at a constant current (CC) level of 2A until the battery voltage fell to 2.7V.
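The charge and discharge protocol described above can be sketched as a pair of small decision functions. This is only an illustrative sketch: the names `charge_phase` and `discharge_finished` are hypothetical helpers, not part of the dataset or of VerticaPy, and the thresholds are simply the values quoted in the paragraph above.

```python
def charge_phase(voltage, current, v_max=4.2, i_cutoff=0.020):
    """Return the phase of a CC-CV charge for a single measurement.

    Hypothetical helper: thresholds follow the protocol described above
    (constant current until 4.2 V, then constant voltage until 20 mA).
    """
    if voltage < v_max:
        return "CC"    # constant-current phase: voltage is still rising
    if current > i_cutoff:
        return "CV"    # constant-voltage phase: current tapers off
    return "done"      # current has dropped to the 20 mA cutoff


def discharge_finished(voltage, v_cutoff=2.7):
    """Return True once the voltage has fallen to the 2.7 V cutoff."""
    return voltage <= v_cutoff


print(charge_phase(3.9, 1.5), charge_phase(4.2, 0.5), charge_phase(4.2, 0.02))
```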

You can download the Jupyter notebook of this study here.

The dataset includes the following:

  • Voltage_measured: Battery's terminal voltage (Volts) for charging and discharging cycles
  • Current_measured: Battery's output current (Amps) for charging and discharging cycles
  • Temperature_measured: Battery temperature (degree Celsius)
  • Current_charge: Current measured at charger for charging cycles and at load for discharging cycles (Amps)
  • Voltage_charge: Voltage measured at charger for charging cycles and at load for discharging ones (Volts)
  • Start_time: Starting time of the cycle
  • Time: Time in seconds after the starting time for the cycle (seconds)
  • Capacity: Battery capacity (Ah) for discharging until 2.7V. Battery capacity is the product of the current drawn from the battery and the time for which the battery can supply the load before its voltage drops below a cutoff value.
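To make the capacity definition concrete, here is a minimal sketch that integrates the discharge current over time (trapezoidal rule) until the voltage drops below the cutoff. The function name `discharge_capacity_ah` and the sample values are made up for illustration; this is not how the dataset's Capacity column was necessarily computed.

```python
def discharge_capacity_ah(times_s, currents_a, voltages_v, v_cutoff=2.7):
    """Integrate the discharge current over time (trapezoidal rule) until the
    voltage drops below the cutoff; return the result in ampere-hours."""
    capacity_as = 0.0  # accumulated charge in ampere-seconds
    for k in range(1, len(times_s)):
        if voltages_v[k] < v_cutoff:
            break
        dt = times_s[k] - times_s[k - 1]
        mean_current = (abs(currents_a[k]) + abs(currents_a[k - 1])) / 2
        capacity_as += mean_current * dt
    return capacity_as / 3600.0


# Synthetic example: a constant 2 A discharge sampled every 600 s for one hour.
times = [0, 600, 1200, 1800, 2400, 3000, 3600]
currents = [-2.0] * len(times)  # discharge current is negative in the dataset
voltages = [4.2, 4.0, 3.8, 3.6, 3.4, 3.1, 2.8]
capacity = discharge_capacity_ah(times, currents, voltages)
print(capacity)
```

With a constant current, the integral reduces to current multiplied by time, which matches the definition in the list above.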

Initialization

This example uses the following version of VerticaPy:

In [1]:
import verticapy as vp
vp.__version__
Out[1]:
'0.9.0'

Connect to Vertica. This example uses an existing connection called "VerticaDSN." For details on how to create a connection, see the connection tutorial.

In [2]:
vp.connect("VerticaDSN")

Before we import the data, we'll drop any existing schema of the same name and create it again.

In [13]:
vp.drop("battery_data", method="schema")
vp.create_schema("battery_data", True)
Out[13]:
True

Since our data is in a .csv file, we'll ingest it with read_csv().

In [15]:
battery5 = vp.read_csv("data/battery5_data.csv")

Understanding the Data

Let's examine our data. Here, we use vDataFrame.head() to retrieve the first five rows of the dataset.

In [16]:
display(battery5.head(5))
| | Voltage_measured (Float) | Current_measured (Float) | Temperature_measured (Float) | Current_charge (Numeric(9,6)) | Voltage_charge (Numeric(8,5)) | Time (Float) | type (Varchar(20)) | start_time (Timestamp) | ambient_temp (Int) | Capacity (Float) |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.00336524422388181 | -0.00149573390003958 | 23.369433909575 | 0.0 | 0.003 | 2.547 | charge | 2008-05-28 11:09:42.000045 | 24 | [null] |
| 2 | 0.236356184152679 | -0.00348442627618414 | 23.3720475048121 | 0.0 | 0.003 | 0.0 | charge | 2008-05-28 11:09:42.000045 | 24 | [null] |
| 3 | 2.45567932466456 | -2.01259011577148 | 38.418742770556 | -1.9982 | 1.685 | 3250.766 | discharge | 2008-04-05 22:46:35.000484 | 24 | 1.8027776247196 |
| 4 | 2.47131476433413 | -2.01438891947297 | 38.0765742576909 | -1.9982 | 1.695 | 3270.922 | discharge | 2008-04-20 15:37:05.000280 | 24 | 1.81396938871035 |
| 5 | 2.47244830185273 | -2.00928251876441 | 38.4230398541122 | -1.9982 | 1.684 | 3290.234 | discharge | 2008-04-04 09:57:19.000765 | 24 | 1.82461955268645 |

Rows: 1-5 | Columns: 10

Let's perform a few aggregations with vDataFrame.describe() to get a high-level overview of the dataset.

In [17]:
battery5.describe()
Out[17]:
| | count | mean | std | min | approx_25% | approx_50% | approx_75% | max |
|---|---|---|---|---|---|---|---|---|
| "Voltage_measured" | 591458 | 4.10394503838216 | 0.213468670467274 | 0.00336524422388181 | 4.09215314651007 | 4.20514178323903 | 4.20590319772839 | 8.39314118056463 |
| "Current_measured" | 591458 | 0.369404334640848 | 0.907557734956734 | -4.4796596195982 | 0.0429636729663627 | 0.173576699239392 | 1.21853946684833 | 1.53130142697453 |
| "Temperature_measured" | 591458 | 26.3697006309627 | 2.77242440009329 | 23.2148017857281 | 24.4903730883168 | 25.4763331289557 | 27.3467360778907 | 41.4502319190386 |
| "Current_charge" | 591458 | 0.634475697682678 | 0.737028608855502 | -4.468 | 0.057 | 0.261 | 1.498 | 1.9984 |
| "Voltage_charge" | 591458 | 4.02493041771349 | 1.20882522698251 | 0.0 | 4.24 | 4.305 | 4.656 | 5.002 |
| "Time" | 591458 | 4763.85693356587 | 3147.85902580761 | 0.0 | 1927.86585492228 | 4492.69086367347 | 7535.1736750753 | 10807.328 |
| "ambient_temp" | 591458 | 24.0 | 0.0 | 24.0 | 24.0 | 24.0 | 24.0 | 24.0 |
| "Capacity" | 50285 | 1.5603447176546 | 0.182379527870387 | 1.28745252213794 | 1.38622876779783 | 1.53823659894256 | 1.74687061798379 | 1.85648742081816 |

Rows: 1-8 | Columns: 9

To get a better idea of the changes between cycles, we look at aggregations of their start times, durations, and the voltages at the beginning and end of each cycle.

In [18]:
battery5['start_time'].describe()
Out[18]:
| | value |
|---|---|
| name | "start_time" |
| dtype | timestamp |
| count | 591458 |
| min | 2008-04-02 13:08:17.000920 |
| max | 2008-05-28 11:09:42.000045 |

Rows: 1-5 | Columns: 2

To see how the voltage changes during the cycle, we extract the initial and final voltage measurements for each cycle.

In [19]:
battery5.analytic(func="first_value",
                  columns="Voltage_measured",
                  by="start_time",
                  order_by={"Time":"asc"},
                  name="first_voltage_measured")

battery5.analytic(func="first_value",
                  columns="Voltage_measured",
                  by="start_time",
                  order_by={"Time":"desc"},
                  name="last_voltage_measured")

cycling_info = battery5.groupby(columns = ['start_time',
                                           'type',
                                           'first_voltage_measured',
                                           'last_voltage_measured'], 
                                expr = ["COUNT(*) AS nr_of_measurements",
                                        "MAX(Time) AS cycle_duration"]).sort('start_time')
cycling_info['cycle_id'] = "ROW_NUMBER() OVER(ORDER BY start_time)"
cycling_info
Out[19]:
| | start_time (Datetime) | type (Varchar(20)) | first_voltage_measured (Float) | last_voltage_measured (Float) | nr_of_measurements (Integer) | cycle_duration (Float) | cycle_id (Integer) |
|---|---|---|---|---|---|---|---|
| 1 | 2008-04-02 13:08:17.000920 | charge | 3.873017221301 | 4.191077562802 | 789 | 7597.875 | 1 |
| 2 | 2008-04-02 15:25:41.000593 | discharge | 4.19149180750529 | 3.2771699768252 | 197 | 3690.234 | 2 |
| 3 | 2008-04-02 16:37:51.000984 | charge | 3.32505465684485 | 4.18906184108551 | 940 | 10516.0 | 3 |
| 4 | 2008-04-02 19:43:48.000405 | discharge | 4.18977321384661 | 3.30024488712225 | 196 | 3672.344 | 4 |
| 5 | 2008-04-02 20:55:40.000811 | charge | 3.35260365999878 | 4.18739822580616 | 937 | 10484.547 | 5 |
| 6 | 2008-04-03 00:01:06.000687 | discharge | 4.1881867359913 | 3.32745100986863 | 195 | 3651.641 | 6 |
| 7 | 2008-04-03 01:12:38.000670 | charge | 3.37879897651295 | 4.18805495607423 | 933 | 10397.89 | 7 |
| 8 | 2008-04-03 04:16:37.000375 | discharge | 4.18846111885557 | 3.31418185890703 | 194 | 3631.563 | 8 |
| 9 | 2008-04-03 05:27:49.000125 | charge | 3.37287091743927 | 4.18843843091541 | 937 | 10495.203 | 9 |
| 10 | 2008-04-03 08:33:25.000702 | discharge | 4.18829852476105 | 3.30549673128648 | 194 | 3629.172 | 10 |
| 11 | 2008-04-03 09:44:35.000078 | charge | 3.36677489940929 | 4.18869468353545 | 952 | 10792.672 | 11 |
| 12 | 2008-04-03 12:55:10.000686 | discharge | 4.1888158079486 | 3.30232911297376 | 195 | 3652.281 | 12 |
| 13 | 2008-04-03 14:06:43.000234 | charge | 3.36103565660438 | 4.1886386513968 | 952 | 10789.985 | 13 |
| 14 | 2008-04-03 17:17:16.000015 | discharge | 4.18839163793879 | 3.2937406695499 | 195 | 3650.828 | 14 |
| 15 | 2008-04-03 18:28:47.000125 | charge | 3.35394264777825 | 4.18770867464187 | 920 | 10127.562 | 15 |
| 16 | 2008-04-03 21:28:14.000718 | discharge | 4.18892789151196 | 3.31623074077415 | 191 | 3572.453 | 16 |
| 17 | 2008-04-03 22:38:27.000452 | charge | 3.38652722446754 | 4.18851266939266 | 921 | 10147.953 | 17 |
| 18 | 2008-04-04 01:38:15.000217 | discharge | 4.18902925088212 | 3.29741093541615 | 190 | 3550.594 | 18 |
| 19 | 2008-04-04 02:48:06.000155 | charge | 3.3775954668098 | 4.18864223128874 | 921 | 10162.094 | 19 |
| 20 | 2008-04-04 05:48:08.000609 | discharge | 4.18922342707642 | 3.28840448309529 | 190 | 3551.25 | 20 |
| 21 | 2008-04-04 06:58:00.000296 | charge | 3.37165307123999 | 4.18849913982909 | 920 | 10119.375 | 21 |
| 22 | 2008-04-04 09:57:19.000765 | discharge | 4.18891605591124 | 3.26627557506147 | 189 | 3530.25 | 22 |
| 23 | 2008-04-04 11:06:50.000375 | charge | 3.36116950771313 | 4.18794856221449 | 913 | 10013.235 | 23 |
| 24 | 2008-04-04 15:05:59.000905 | charge | 3.64784226605724 | 4.18910818568153 | 897 | 9586.875 | 24 |
| 25 | 2008-04-04 17:56:27.000609 | discharge | 4.18987624726632 | 3.29619985946309 | 187 | 3491.016 | 25 |
| 26 | 2008-04-04 19:05:19.000234 | charge | 3.39188020778012 | 4.1885368567609 | 912 | 9952.359 | 26 |
| 27 | 2008-04-04 22:01:54.000670 | discharge | 4.1892867947688 | 3.2717574155336 | 186 | 3470.281 | 27 |
| 28 | 2008-04-04 23:10:25.000420 | charge | 3.381025252311 | 4.18921413863253 | 951 | 10758.031 | 28 |
| 29 | 2008-04-05 02:20:26.000702 | discharge | 4.18883776589273 | 3.26485508408503 | 186 | 3470.0 | 29 |
| 30 | 2008-04-05 03:28:57.000234 | charge | 3.37527210719229 | 4.18812121755175 | 911 | 9921.75 | 30 |
| 31 | 2008-04-05 06:25:01.000890 | discharge | 4.18911030176852 | 3.30911120151836 | 185 | 3450.297 | 31 |
| 32 | 2008-04-05 07:33:12.000734 | charge | 3.41078105386178 | 4.18856544470492 | 914 | 10000.25 | 32 |
| 33 | 2008-04-05 10:30:32.000311 | discharge | 4.18921207304098 | 3.28814905286893 | 184 | 3429.407 | 33 |
| 34 | 2008-04-05 11:38:22.000140 | charge | 3.40508272130911 | 4.18742091933392 | 911 | 9936.031 | 34 |
| 35 | 2008-04-05 14:34:41.000468 | discharge | 4.18889996942811 | 3.26075410090353 | 183 | 3409.484 | 35 |
| 36 | 2008-04-05 15:42:11.000531 | charge | 3.39671667970665 | 4.18808460430234 | 913 | 9989.532 | 36 |
| 37 | 2008-04-05 18:39:25.000217 | discharge | 4.18877979985679 | 3.22946682178441 | 182 | 3390.422 | 37 |
| 38 | 2008-04-05 19:46:36.000125 | charge | 3.38694300850099 | 4.18903852934241 | 922 | 10158.782 | 38 |
| 39 | 2008-04-05 22:46:35.000484 | discharge | 4.18927259151622 | 3.22437889485908 | 182 | 3390.281 | 39 |
| 40 | 2008-04-18 17:34:22.000890 | charge | 3.57954991793827 | 4.18827954788396 | 933 | 10289.312 | 40 |
| 41 | 2008-04-18 21:10:19.000795 | discharge | 4.18761406056344 | 3.2587403321634 | 192 | 3591.734 | 41 |
| 42 | 2008-04-18 22:53:58.000343 | charge | 3.38908568014577 | 4.18807434830974 | 927 | 10283.078 | 42 |
| 43 | 2008-04-19 02:29:09 | discharge | 4.1876287811084 | 3.21804621479576 | 190 | 3552.297 | 43 |
| 44 | 2008-04-19 04:12:06.000343 | charge | 3.38074306476638 | 4.18759054347373 | 924 | 10281.39 | 44 |
| 45 | 2008-04-19 07:47:15.000702 | discharge | 4.18744722631293 | 3.25730962683977 | 189 | 3531.578 | 45 |
| 46 | 2008-04-19 09:29:52.000703 | charge | 3.41207028563506 | 4.18736331290686 | 914 | 10081.797 | 46 |
| 47 | 2008-04-19 13:01:42.000561 | discharge | 4.18734402638716 | 3.28025674749459 | 187 | 3492.907 | 47 |
| 48 | 2008-04-19 14:43:41.000265 | charge | 3.43689487357187 | 4.18808201292117 | 950 | 10797.328 | 48 |
| 49 | 2008-04-19 18:27:29.000827 | discharge | 4.18743818763633 | 3.27976545115869 | 188 | 3511.594 | 49 |
| 50 | 2008-04-19 20:09:47.000750 | charge | 3.43154231233248 | 4.18725399564513 | 916 | 10099.218 | 50 |
| 51 | 2008-04-19 23:41:55.000686 | discharge | 4.18774138117799 | 3.20506891560499 | 184 | 3431.93700000001 | 51 |
| 52 | 2008-04-20 01:22:53.000953 | charge | 3.42283385671646 | 4.18779334645046 | 915 | 10078.719 | 52 |
| 53 | 2008-04-20 04:54:41.000734 | discharge | 4.18826112254005 | 3.27686692235656 | 184 | 3431.125 | 53 |
| 54 | 2008-04-20 06:35:39.000765 | charge | 3.45698074554817 | 4.18821101680172 | 942 | 10617.375 | 54 |
| 55 | 2008-04-20 10:16:28.000859 | discharge | 4.18733876728823 | 3.23113002773099 | 183 | 3412.406 | 55 |
| 56 | 2008-04-20 11:57:08.000765 | charge | 3.44313195223531 | 4.18808188252633 | 939 | 10564.157 | 56 |
| 57 | 2008-04-20 15:37:05.000280 | discharge | 4.18756030628817 | 3.22202577252683 | 183 | 3410.688 | 57 |
| 58 | 2008-04-20 17:17:44.000030 | charge | 3.4409458223878 | 4.1879382116848 | 918 | 10143.984 | 58 |
| 59 | 2008-04-20 20:50:38.000920 | discharge | 4.18805176248713 | 3.29008279373079 | 183 | 3410.718 | 59 |
| 60 | 2008-04-20 22:31:18.000045 | charge | 3.47204922011407 | 4.18875223773467 | 949 | 10790.453 | 60 |
| 61 | 2008-04-21 02:15:02.000921 | discharge | 4.18766748426054 | 3.25117618524149 | 182 | 3392.719 | 61 |
| 62 | 2008-04-21 17:51:26.000312 | charge | 3.48232393873656 | 4.18841262203038 | 941 | 10442.891 | 62 |
| 63 | 2008-04-22 14:15:41.000186 | charge | 8.39314118056463 | 4.20193509673243 | 582 | 1674.484 | 63 |
| 64 | 2008-04-22 15:33:49.000875 | discharge | 4.20107017733787 | 3.24928348231755 | 371 | 3470.672 | 64 |
| 65 | 2008-04-22 17:04:53.000218 | charge | 3.40589014320277 | 4.20218728438283 | 3659 | 10113.656 | 65 |
| 66 | 2008-04-22 20:26:18.000920 | discharge | 4.19962233454958 | 3.26008759919901 | 365 | 3414.109 | 66 |
| 67 | 2008-04-22 21:56:21.000405 | charge | 3.45074772916021 | 4.20215882826111 | 3666 | 10137.89 | 67 |
| 68 | 2008-04-23 01:18:11.000795 | discharge | 4.19985332370216 | 3.28114660142084 | 363 | 3395.219 | 68 |
| 69 | 2008-04-23 02:47:55.000453 | charge | 3.47771592026094 | 4.20111410827477 | 3850 | 10651.657 | 69 |
| 70 | 2008-04-23 06:18:19.000920 | discharge | 4.19910346708143 | 3.30937152015476 | 362 | 3386.0 | 70 |
| 71 | 2008-04-23 07:47:54.000718 | charge | 3.49976557470266 | 4.2027772821499 | 3623 | 10032.485 | 71 |
| 72 | 2008-04-23 11:08:00.000312 | discharge | 4.199998536865 | 3.29559446765643 | 360 | 3368.187 | 72 |
| 73 | 2008-04-23 12:37:17.000515 | charge | 3.50696989511128 | 4.20136149792883 | 3819 | 10574.297 | 73 |
| 74 | 2008-04-23 16:06:25.000170 | discharge | 4.19967522099872 | 3.31719824535035 | 360 | 3368.266 | 74 |
| 75 | 2008-04-23 17:35:42.000828 | charge | 3.51567901685022 | 4.20127896701083 | 3780 | 10461.735 | 75 |
| 76 | 2008-04-23 21:02:58.000295 | discharge | 4.19918472488032 | 3.34781850650399 | 359 | 3358.719 | 76 |
| 77 | 2008-04-23 22:32:06.000671 | charge | 3.53563482408275 | 4.20211117202117 | 3605 | 9993.06200000001 | 77 |
| 78 | 2008-04-24 01:51:33.000890 | discharge | 4.200012480335 | 3.33782192390486 | 357 | 3339.5 | 78 |
| 79 | 2008-04-24 03:20:23.000359 | charge | 3.54413544728103 | 4.20114809274631 | 3826 | 10602.281 | 79 |
| 80 | 2008-04-24 06:50:00.000031 | discharge | 4.19923736565479 | 3.36437247937979 | 356 | 3331.265 | 80 |
| 81 | 2008-04-24 08:18:41.000703 | charge | 3.56023628074418 | 4.20106102920315 | 3847 | 10667.969 | 81 |
| 82 | 2008-04-24 11:49:24.000280 | discharge | 4.19939578111871 | 3.33601449365092 | 355 | 3321.188 | 82 |
| 83 | 2008-04-24 13:17:56.000250 | charge | 3.55535631522079 | 4.20109580259623 | 3789 | 10504.781 | 83 |
| 84 | 2008-04-24 16:45:56.000015 | discharge | 4.19957869802607 | 3.35565962794655 | 355 | 3321.844 | 84 |
| 85 | 2008-04-24 18:14:28.000937 | charge | 3.56107827487204 | 4.20157433593331 | 3696 | 10246.781 | 85 |
| 86 | 2008-04-24 21:38:11.000077 | discharge | 4.19954695751269 | 3.36149019771128 | 354 | 3311.828 | 86 |
| 87 | 2008-04-24 23:06:34.000155 | charge | 3.56758154714462 | 4.20145257180874 | 3666 | 10166.219 | 87 |
| 88 | 2008-04-25 12:03:44.000093 | discharge | 4.19409146934082 | 3.34696812197577 | 179 | 3327.391 | 88 |
| 89 | 2008-04-25 17:02:35.000703 | charge | 3.56480826801985 | 4.20113835250844 | 3900 | 10804.922 | 89 |
| 90 | 2008-04-25 20:03:55.000920 | discharge | 4.20071523580125 | 3.37121004922177 | 354 | 3312.078 | 90 |
| 91 | 2008-04-25 21:32:20.000468 | charge | 3.56807729676087 | 4.20069482424571 | 3882 | 10758.985 | 91 |
| 92 | 2008-04-26 01:04:35.000718 | discharge | 4.19909808071803 | 3.37995181911327 | 352 | 3293.172 | 92 |
| 93 | 2008-04-26 02:32:41.000093 | charge | 3.58046215772645 | 4.20144464421736 | 3719 | 10317.515 | 93 |
| 94 | 2008-04-26 05:57:35.000140 | discharge | 4.19939853441153 | 3.39186701578951 | 350 | 3275.328 | 94 |
| 95 | 2008-04-26 07:25:23.000030 | charge | 3.59335214699607 | 4.20069890818846 | 3895 | 10805.734 | 95 |
| 96 | 2008-04-26 10:58:25.000843 | discharge | 4.19912291561297 | 3.38932180124305 | 349 | 3265.016 | 96 |
| 97 | 2008-04-28 17:20:47.000859 | charge | 3.63755649878175 | 4.20177883048787 | 3585 | 9946.391 | 97 |
| 98 | 2008-04-29 12:15:56.000953 | discharge | 4.1904094780814 | 3.2399604434797 | 356 | 3328.937 | 98 |
| 99 | 2008-04-29 13:44:46.000545 | charge | 3.43574506975784 | 4.20225518963933 | 3642 | 10063.547 | 99 |
| 100 | 2008-04-29 17:05:27.000780 | discharge | 4.19993384679187 | 3.31112763973491 | 356 | 3329.266 | 100 |

Rows: 1-100 | Columns: 7
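For readers less familiar with window functions, the per-cycle summary computed above can be sketched in plain Python. This is an illustrative equivalent under stated assumptions, not VerticaPy's implementation: the function name `first_last_per_cycle` and the sample measurements are made up.

```python
from collections import defaultdict


def first_last_per_cycle(rows):
    """rows: iterable of (start_time, time_s, voltage) tuples.

    Returns {start_time: (first_voltage, last_voltage, n_measurements, duration)}.
    """
    cycles = defaultdict(list)
    for start_time, time_s, voltage in rows:
        cycles[start_time].append((time_s, voltage))

    summary = {}
    for start_time, samples in cycles.items():
        samples.sort()  # order by Time ascending within the cycle
        summary[start_time] = (
            samples[0][1],   # like FIRST_VALUE ... ORDER BY Time ASC
            samples[-1][1],  # like FIRST_VALUE ... ORDER BY Time DESC
            len(samples),    # like COUNT(*)
            samples[-1][0],  # like MAX(Time)
        )
    return summary


rows = [  # made-up measurements for two cycles, deliberately out of order
    ("c1", 0.0, 3.87), ("c1", 20.0, 4.00), ("c1", 10.0, 3.95),
    ("c2", 0.0, 4.19), ("c2", 15.0, 3.90), ("c2", 30.0, 3.28),
]
summary = first_last_per_cycle(rows)
print(summary)
```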

We can see from the "cycle_duration" column that charging seems to take longer than discharging. Let's visualize this trend with an animated graph.

In [20]:
import warnings
warnings.filterwarnings('ignore')
cycling_info.animated(ts="start_time",
                      columns= ["type","cycle_duration"], 
                      by="type", 
                      kind="bar",)
Out[20]:

The animated graph shows how the cycles change over time. Another way to verify that charging cycles are longer than discharging cycles is to look at the average duration of each type of cycle.

In [21]:
cycling_info.bar(["type"], 
                 method = "avg", 
                 of = "cycle_duration")
Out[21]:
<AxesSubplot:xlabel='AVG("cycle_duration")', ylabel='"type"'>
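The same comparison can be checked by hand on a few cycles. The sketch below averages the durations of the first four cycles from the cycle table above; the helper name `average_by_type` is hypothetical and the sample is far too small to stand in for the full aggregation.

```python
from collections import defaultdict


def average_by_type(cycles):
    """cycles: iterable of (cycle_type, duration_s). Returns {type: mean duration}."""
    totals = defaultdict(lambda: [0.0, 0])  # type -> [sum of durations, count]
    for cycle_type, duration in cycles:
        totals[cycle_type][0] += duration
        totals[cycle_type][1] += 1
    return {t: s / n for t, (s, n) in totals.items()}


# Durations (seconds) of cycles 1-4 from the cycle table.
first_cycles = [
    ("charge", 7597.875),
    ("discharge", 3690.234),
    ("charge", 10516.0),
    ("discharge", 3672.344),
]
averages = average_by_type(first_cycles)
print(averages)
```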

In general, charging cycles are longer than discharging cycles. Let's examine how voltage changes between cycles and their transitions.

In [22]:
cycling_info.groupby('type',['MIN(first_voltage_measured) AS min_first_voltage',
                             'AVG(first_voltage_measured) AS avg_first_voltage',
                             'MAX(first_voltage_measured) AS max_first_voltage',
                             'MIN(last_voltage_measured)  AS min_last_voltage',
                             'AVG(last_voltage_measured)  AS avg_last_voltage',
                             'MAX(last_voltage_measured)  AS max_last_voltage'])
Out[22]:
| | type (Varchar(20)) | min_first_voltage (Float) | avg_first_voltage (Float) | max_first_voltage (Float) | min_last_voltage (Float) | avg_last_voltage (Float) | max_last_voltage (Float) |
|---|---|---|---|---|---|---|---|
| 1 | charge | 0.236356184152679 | 3.62261106314154 | 8.39314118056463 | 4.18725399564513 | 4.1986688743734 | 4.21343968400049 |
| 2 | discharge | 4.18453480301131 | 4.19597100862757 | 4.22292005793579 | 3.20506891560499 | 3.4713048542903 | 3.6211910328537 |

Rows: 1-2 | Columns: 7

From this table, it looks like batteries are charged until they are almost full (4.2V) and discharging doesn't begin until they are fully charged.

The end-of-life (EOL) criterion for a battery is usually defined as the point at which its capacity falls below 70%-80% of its rated capacity. Since the manufacturer's rated capacity for this battery is 2Ah, we consider the battery to be at EOL when its capacity drops to 2Ah x 70% = 1.4Ah.
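As a quick check of the arithmetic, the threshold can be written as a small sketch. The names `eol_threshold` and `is_end_of_life` are hypothetical helpers; the 70% fraction is the lower end of the range quoted above, and the two capacities tested are the minimum and maximum from the summary statistics shown earlier.

```python
RATED_CAPACITY_AH = 2.0  # rated capacity stated above
EOL_FRACTION = 0.70      # lower end of the 70%-80% range


def eol_threshold(rated_capacity_ah, fraction):
    """Capacity (Ah) below which the battery is considered at end of life."""
    return rated_capacity_ah * fraction


def is_end_of_life(capacity_ah,
                   rated_capacity_ah=RATED_CAPACITY_AH,
                   fraction=EOL_FRACTION):
    return capacity_ah < eol_threshold(rated_capacity_ah, fraction)


threshold = eol_threshold(RATED_CAPACITY_AH, EOL_FRACTION)
print(threshold)
print(is_end_of_life(1.85648742081816), is_end_of_life(1.28745252213794))
```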

Let's plot the capacity curve of the battery with its smoothed version and observe when it reaches the degradation criteria.

In [23]:
# Visualize the capacity degradation curve along with its smoothed version
discharging_data = battery5[battery5['type'] == 'discharge']
d_cap = discharging_data[['start_time', 'Capacity']].groupby(['start_time', 'Capacity'])
d_cap["discharge_id"] = "ROW_NUMBER() OVER(ORDER BY start_time, Capacity)"
d_cap.rolling(func = 'mean',
              columns = 'capacity',
              window = (-100, -1),
              name = 'smooth_capacity')


import matplotlib.pyplot as plt
from matplotlib.pyplot import axhline

fig = plt.figure()
ax = d_cap.plot(ts = 'discharge_id', columns = ['Capacity', 'smooth_capacity'])
ax.axhline(y=1.4, label='End-of-life criteria')
ax.set_title('Capacity degradation curve of the battery, its smoothed version and its end-of-life threshold')
ax.legend() 
plt.show()
<Figure size 432x288 with 0 Axes>

The sudden increases in battery capacity come from the self-charging property of Li-ion batteries. The smoothed graph makes the downward trend in the battery's capacity very clear.

An important observation here is that the battery meets the EOL criteria around the 125th cycle.
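The smoothing and the threshold crossing can be sketched in plain Python. This sketch assumes that the rolling window (-100, -1) means "the 100 rows preceding the current one"; the helper names and the capacity values are made up, and a window of 3 is used to keep the example short.

```python
def trailing_mean(values, window=100):
    """Mean of up to `window` values preceding each position (current excluded)."""
    smoothed = []
    for i in range(len(values)):
        previous = values[max(0, i - window):i]
        smoothed.append(sum(previous) / len(previous) if previous else None)
    return smoothed


def first_below(values, threshold):
    """1-based position of the first non-missing value below threshold, or None."""
    for position, value in enumerate(values, start=1):
        if value is not None and value < threshold:
            return position
    return None


# Made-up, steadily degrading capacities (Ah).
capacities = [1.86, 1.80, 1.74, 1.68, 1.62, 1.56, 1.50, 1.44, 1.38, 1.32]
smooth = trailing_mean(capacities, window=3)
print(first_below(capacities, 1.4), first_below(smooth, 1.4))
```

Because a trailing average only looks backward, the smoothed curve lags the raw one, so it crosses the threshold later than the raw capacity does.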

Goal and Problem Modeling

Understanding battery health is important, but at the time of writing, there's no direct way to measure it. In our case, we'll create a degradation model to find the relationship between a battery's overall health and the other properties in the dataset, which include charge and discharge cycle duration, average voltage and current, etc.

One possible definition of the battery's overall health ("state of health" or "SoH") is the following:

Let $C_{rated}$ be the rated capacity of the battery when it's new (2Ah in our case), and $C_{act}(t)$ be the actual capacity of the battery at a specific time $t$. The state of health of the battery is defined as: $SoH(t) = \frac{C_{act}(t)}{C_{rated}}$

In order to find this relationship, we'll clean and prepare our data by adding some extra features. These extra features will help us understand how the battery behaves during and between each cycle and pinpoint what might be the primary causes of battery degradation.
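As a worked example, here is a minimal sketch that assumes the state of health is the ratio of actual to rated capacity, a common definition. The function name `state_of_health` is hypothetical; the first capacity is the maximum from the summary statistics shown earlier, and the second is the EOL threshold.

```python
def state_of_health(actual_capacity_ah, rated_capacity_ah=2.0):
    """Ratio of actual to rated capacity (1.0 means 'as new')."""
    if rated_capacity_ah <= 0:
        raise ValueError("rated capacity must be positive")
    return actual_capacity_ah / rated_capacity_ah


soh_best = state_of_health(1.85648742081816)  # highest capacity in the dataset
soh_eol = state_of_health(1.4)                # capacity at the EOL threshold
print(soh_best, soh_eol)
```

Under this definition, the 1.4Ah end-of-life threshold corresponds to a state of health of 0.7.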

Data Preparation

Outlier detection

Let's start by finding and removing the global outliers from our dataset.

In [24]:
battery5.outliers(columns = ["Voltage_measured","Current_measured","Temperature_measured","Capacity"],
                  name = "global_outlier",
                  threshold = 4.0)
battery5.filter("global_outlier = 0").drop('global_outlier')
7803 elements were filtered.
Out[24]:
| | Voltage_measured (Float) | Current_measured (Float) | Temperature_measured (Float) | Current_charge (Numeric(9,6)) | Voltage_charge (Numeric(8,5)) | Time (Float) | type (Varchar(20)) | start_time (Timestamp) | ambient_temp (Int) | Capacity (Float) | first_voltage_measured (Float) | last_voltage_measured (Float) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 4.191077562802 | -0.00289244415825233 | 24.5070404981243 | 0.0 | 0.003 | 7597.875 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 2 | 4.19145297803654 | -0.00185593256166973 | 24.5143627247609 | 0.0 | 0.003 | 7579.813 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 3 | 4.19138815949231 | -0.00035058934384185 | 24.5186725989278 | 0.0 | 0.003 | 7561.75 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 4 | 4.19129900116523 | -0.000942382137392054 | 24.5324979494391 | 0.0 | 0.003 | 7543.797 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 5 | 4.19082169589584 | -0.00339135418490738 | 24.5393972895358 | 0.0 | 0.003 | 7525.782 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 6 | 4.19117480835044 | -0.00329235713967974 | 24.5484975198514 | 0.0 | 0.003 | 7507.86 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 7 | 4.19085954561673 | -0.00187706505598586 | 24.5560428090257 | 0.0 | 0.003 | 7489.891 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 8 | 4.19083893634414 | -0.00200329339068276 | 24.5675754256341 | 0.0 | 0.003 | 7472.11 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 9 | 4.19098209980588 | -0.00267139684978806 | 24.57598626898 | 0.0 | 0.003 | 7454.375 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 10 | 4.19084885245013 | -0.00131944726054101 | 24.5931622876176 | 0.0 | 0.003 | 7436.719 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 11 | 4.19103086559357 | -0.00519602031106074 | 24.5965952874239 | -0.002 | 0.003 | 7419.0 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 12 | 4.19088542845538 | 0.00166965239240655 | 24.6067700461602 | 0.0 | 0.003 | 7401.391 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 13 | 4.19037884805604 | -0.000585054454874253 | 24.6067947045446 | 0.0 | 0.003 | 7383.813 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 14 | 4.19038472665687 | -0.00233421183923724 | 24.6235176792457 | 0.0 | 0.003 | 7366.328 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 15 | 4.19039339764219 | -0.000672934960173503 | 24.6455055765759 | 0.0 | 0.003 | 7348.875 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 16 | 4.18992934989144 | -0.000980390869617877 | 24.6514220273314 | 0.0 | 0.003 | 7331.391 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 17 | 4.18950409720076 | -0.00118803975546709 | 24.6498614594621 | 0.0 | 0.003 | 7313.969 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 18 | 4.18937661468013 | -0.00135290178219755 | 24.6758638366935 | 0.0 | 0.003 | 7296.61 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 19 | 4.18866542676157 | -0.00175735415797255 | 24.6839175921507 | -0.002 | 0.003 | 7279.297 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 20 | 4.18799981651757 | -0.0010559213440312 | 24.6985048214946 | 0.0 | 0.003 | 7262.11 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 21 | 4.18711484139018 | 0.000633490859740153 | 24.72235444727 | 0.0 | 0.003 | 7244.828 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 22 | 4.18557368432219 | -0.00427117174588695 | 24.7396655257138 | 0.0 | 0.003 | 7227.625 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 23 | 4.18354651785376 | -0.0028237916834498 | 24.7485471119508 | -0.002 | 0.003 | 7210.485 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 24 | 4.18072221326857 | -0.000370081430176359 | 24.7577774172585 | 0.0 | 0.003 | 7193.328 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 25 | 4.17596639053681 | -0.00278431886048148 | 24.768309354088 | -0.002 | 0.003 | 7176.235 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 26 | 4.16748106370293 | -0.00319665758847235 | 24.7299187674368 | -0.002 | 0.003 | 7159.172 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 27 | 4.13560691014186 | -0.000347027949984981 | 24.4645469045721 | -0.002 | 0.003 | 7142.282 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 28 | 4.2056012283435 | 0.0111602216542072 | 24.1820599066707 | 0.012 | 4.229 | 7125.25 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 29 | 4.20685291152986 | 0.024036235247763 | 24.1720437644251 | 0.021 | 4.234 | 7108.188 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 30 | 4.20771241601534 | 0.0325929302871062 | 24.1877671946696 | 0.031 | 4.24 | 7091.172 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 31 | 4.20740341858764 | 0.0342967714484646 | 24.1801854196917 | 0.031 | 4.24 | 7074.157 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 32 | 4.20734197683408 | 0.034537123609793 | 24.1833955872426 | 0.031 | 4.24 | 7057.266 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 33 | 4.20746955475074 | 0.0309711450494699 | 24.1834450856869 | 0.031 | 4.24 | 7040.36 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 34 | 4.20729355146729 | 0.0298876170956639 | 24.1855313143001 | 0.031 | 4.24 | 7023.578 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 35 | 4.20750887264527 | 0.031138735257609 | 24.1903490936464 | 0.031 | 4.24 | 7006.766 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 36 | 4.20728056530391 | 0.0334838962294185 | 24.1783019359187 | 0.031 | 4.24 | 6988.735 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 37 | 4.20698446872691 | 0.0320836679055509 | 24.1759824543016 | 0.032 | 4.24 | 6970.75 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 38 | 4.20724251314163 | 0.031169216223246 | 24.1806191674626 | 0.031 | 4.24 | 6952.844 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 39 | 4.20709369371337 | 0.0336632872514224 | 24.1723066302381 | 0.031 | 4.24 | 6934.969 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 40 | 4.20719024894999 | 0.0311218760839418 | 24.1719975661511 | 0.032 | 4.24 | 6917.188 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 41 | 4.20704595249473 | 0.032811414430697 | 24.1709896911355 | 0.032 | 4.24 | 6899.422 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 42 | 4.20688651932817 | 0.0322522051940632 | 24.1670618808059 | 0.032 | 4.24 | 6881.719 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 43 | 4.20697163051597 | 0.0332390898132783 | 24.1691396191406 | 0.032 | 4.24 | 6864.047 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 44 | 4.20689004904342 | 0.0306191615299643 | 24.1820567708101 | 0.032 | 4.24 | 6846.375 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 45 | 4.20694806595952 | 0.0352225293605038 | 24.1760758094498 | 0.032 | 4.24 | 6828.828 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 46 | 4.20687946516883 | 0.0344207457923223 | 24.1804828953539 | 0.032 | 4.24 | 6811.297 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 47 | 4.20699969325079 | 0.034253721675264 | 24.1853530656035 | 0.032 | 4.24 | 6793.875 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 48 | 4.20675653903863 | 0.0354057522318628 | 24.1837467486538 | 0.032 | 4.24 | 6776.313 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 49 | 4.20704426915562 | 0.0330528526099921 | 24.1822902313404 | 0.032 | 4.24 | 6758.813 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 50 | 4.20696909556644 | 0.0338365092366179 | 24.1800061071641 | 0.032 | 4.24 | 6741.391 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 51 | 4.20654085193066 | 0.0360401200134923 | 24.1732855645009 | 0.032 | 4.24 | 6723.953 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 52 | 4.20656436705972 | 0.0340194533944495 | 24.17740198346 | 0.032 | 4.24 | 6706.641 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 53 | 4.20646968386925 | 0.0340628642428077 | 24.1820411432499 | 0.032 | 4.24 | 6689.266 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 54 | 4.20678914598611 | 0.0314857329540061 | 24.1883264597638 | 0.032 | 4.24 | 6672.063 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 55 | 4.2064859012248 | 0.0342237215438859 | 24.1888729689955 | 0.032 | 4.24 | 6654.86 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 56 | 4.20661658613999 | 0.0335000801897238 | 24.1856883117915 | 0.034 | 4.24 | 6637.703 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 57 | 4.20635171791732 | 0.0325710254671784 | 24.1862263875155 | 0.034 | 4.24 | 6620.61 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 58 | 4.2063489363237 | 0.035312163886625 | 24.177563661987 | 0.034 | 4.24 | 6603.61 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 59 | 4.20641744902928 | 0.0343881206372867 | 24.1739298308112 | 0.034 | 4.24 | 6586.594 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 60 | 4.20621267088159 | 0.0331547851430177 | 24.1763387774702 | 0.034 | 4.24 | 6569.672 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 61 | 4.20622577087419 | 0.0325042552992473 | 24.1870476355556 | 0.034 | 4.24 | 6552.797 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 62 | 4.20646350024093 | 0.0337295519185693 | 24.1748434519362 | 0.034 | 4.24 | 6535.953 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 63 | 4.20648105422095 | 0.0313382808804656 | 24.1759380365863 | 0.034 | 4.24 | 6519.094 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 64 | 4.20634632746409 | 0.0349185048618721 | 24.1737782316603 | 0.034 | 4.24 | 6502.344 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 65 | 4.20639318348748 | 0.035542823946244 | 24.1716002606096 | 0.034 | 4.24 | 6485.61 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 66 | 4.2062691225032 | 0.0324105478606577 | 24.1867260295496 | 0.034 | 4.24 | 6468.907 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 67 | 4.20609244133014 | 0.0335819694043686 | 24.178178369056 | 0.034 | 4.24 | 6452.235 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 68 | 4.20599763007148 | 0.0351963167823277 | 24.1777201352157 | 0.034 | 4.24 | 6435.61 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 69 | 4.20598211171639 | 0.0361503555121967 | 24.1891798102475 | 0.034 | 4.24 | 6419.032 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 70 | 4.20621348550372 | 0.0361536392907116 | 24.1870496151921 | 0.034 | 4.24 | 6402.5 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 71 | 4.20602076897859 | 0.0349090275130683 | 24.1824107609171 | 0.034 | 4.24 | 6386.016 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 72 | 4.20602867074416 | 0.035745022230073 | 24.186938269049 | 0.034 | 4.24 | 6369.657 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 73 | 4.20569841496413 | 0.037208903051535 | 24.1886245737354 | 0.034 | 4.24 | 6353.235 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 74 | 4.20594634653494 | 0.0370380897167019 | 24.188399700189 | 0.034 | 4.24 | 6336.86 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 75 | 4.20619414661552 | 0.0348104588670674 | 24.1877109302333 | 0.035 | 4.24 | 6320.532 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 76 | 4.20593487353446 | 0.0351408793344125 | 24.1877573869173 | 0.034 | 4.24 | 6304.188 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 77 | 4.20591016686029 | 0.0365640238810361 | 24.1921915053681 | 0.034 | 4.24 | 6287.953 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 78 | 4.20595893948843 | 0.0345202183036124 | 24.201516780888 | 0.035 | 4.24 | 6271.688 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 79 | 4.2058072842583 | 0.0344256100610601 | 24.1936162894055 | 0.035 | 4.24 | 6255.485 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 80 | 4.20588904143454 | 0.0368906827769293 | 24.192721408531 | 0.035 | 4.24 | 6239.453 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 81 | 4.20579817335753 | 0.0355500968484574 | 24.1921143204839 | 0.035 | 4.24 | 6223.407 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 82 | 4.20553262152085 | 0.0332976502611164 | 24.1965288209733 | 0.035 | 4.24 | 6207.407 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 83 | 4.20578899496832 | 0.0360840665765015 | 24.1978175312049 | 0.035 | 4.24 | 6191.422 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 84 | 4.20573033955374 | 0.036155674919553 | 24.1875969620508 | 0.035 | 4.24 | 6175.547 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 85 | 4.20579774475407 | 0.0351673334201199 | 24.1940963474781 | 0.035 | 4.24 | 6159.641 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 86 | 4.20576910181899 | 0.0339703596990981 | 24.1942546795043 | 0.035 | 4.24 | 6143.782 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 87 | 4.2057181294822 | 0.0331537391682792 | 24.1993407260038 | 0.035 | 4.24 | 6128.094 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 88 | 4.20573822570888 | 0.0363087417022344 | 24.1949644574508 | 0.035 | 4.24 | 6112.36 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 89 | 4.20559203664264 | 0.0373902673533125 | 24.2009077090942 | 0.035 | 4.24 | 6096.75 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 90 | 4.20562353857301 | 0.0335769632274627 | 24.1927444861888 | 0.035 | 4.24 | 6081.157 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 91 | 4.20546731965457 | 0.0346859744628728 | 24.1915551735615 | 0.035 | 4.24 | 6065.547 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 92 | 4.20537687870527 | 0.037665490300918 | 24.196301520393 | 0.035 | 4.24 | 6049.985 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 93 | 4.20548293121094 | 0.0374308352065955 | 24.1967840976199 | 0.035 | 4.24 | 6034.469 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 94 | 4.20534852210767 | 0.0344402350833955 | 24.1903905511658 | 0.035 | 4.24 | 6019.063 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 95 | 4.20562043351923 | 0.0390353653971384 | 24.2029403194218 | 0.035 | 4.24 | 6003.532 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 96 | 4.20565734241993 | 0.0371507488034734 | 24.2075578619311 | 0.035 | 4.24 | 5987.0 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 97 | 4.20585145123122 | 0.0331785271914327 | 24.1971574193043 | 0.035 | 4.24 | 5970.516 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 98 | 4.20592919754219 | 0.0323801262632588 | 24.1915008980647 | 0.035 | 4.24 | 5953.969 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 99 | 4.20665372851333 | 0.0438702025910752 | 24.1906421043214 | 0.045 | 4.245 | 5937.453 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |
| 100 | 4.20640135676992 | 0.0478223537685991 | 24.1975398247626 | 0.045 | 4.245 | 5921.016 | charge | 2008-04-02 13:08:17.000920 | 24 | [null] | 3.873017221301 | 4.191077562802 |

Rows: 1-100 of 583655 | Columns: 12

Feature engineering

Since measurements like voltage and temperature vary over the course of each cycle, we'll create features that summarize each cycle as a whole.

In [25]:
sample_cycle = battery5[battery5['Capacity'] == '1.83514614292266']
sample_cycle["Voltage_measured"].plot(ts = "Time")
Out[25]:
<AxesSubplot:xlabel='"Time"', ylabel='"Voltage_measured"'>
In [26]:
sample_cycle["Temperature_measured"].plot(ts = "Time")
Out[26]:
<AxesSubplot:xlabel='"Time"', ylabel='"Temperature_measured"'>

We'll define new features that describe each discharge cycle: the minimum and maximum temperature, the minimum voltage, and the time needed to reach the minimum voltage and the maximum temperature.

In [27]:
# filter for discharge cycles
discharging_data = battery5[battery5['type'] == 'discharge']


# define new features
discharge_cycle_metrics = discharging_data.groupby(columns = ['start_time'], 
                                                   expr = ['MIN(Temperature_measured) AS min_temp',
                                                           'MAX(Temperature_measured) AS max_temp',
                                                           'MIN(Voltage_measured) AS min_volt']).join(
                                                           discharging_data, 
                                                           how = "left",
                                                           on = {"min_volt":"voltage_measured"},
                                                           expr1 = ["*"],
                                                           expr2 = ["Time AS time_to_reach_minvolt"]).join(
                                                           discharging_data, 
                                                           how = "left",
                                                           on = {"max_temp":"temperature_measured"},
                                                           expr1 = ["*"],
                                                           expr2 = ["Time AS time_to_reach_maxtemp"])

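# calculate the state of health (SOH): capacity relative to the rated capacity
# (the 0.5 factor assumes a rated capacity of 2 Ah)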
discharging_data = discharging_data.groupby(['start_time','Capacity'])
discharging_data['SOH'] = discharging_data['Capacity'] * 0.5
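The per-cycle features and the SOH calculation above can be sketched outside the database in plain Python. This is only an illustration under stated assumptions: the row layout, the helper name cycle_features, the toy values, and the 2 Ah rated capacity (implied by the 0.5 factor) are not part of the original notebook.

```python
from collections import defaultdict

# Assumption: rated capacity of the cell, consistent with the 0.5 factor above.
RATED_CAPACITY_AH = 2.0


def cycle_features(rows):
    """Summarize discharge measurements (a list of dicts) into one feature dict per cycle."""
    cycles = defaultdict(list)
    for row in rows:
        cycles[row["start_time"]].append(row)

    features = {}
    for start_time, cycle in cycles.items():
        coldest = min(cycle, key=lambda r: r["temperature"])
        hottest = max(cycle, key=lambda r: r["temperature"])
        lowest = min(cycle, key=lambda r: r["voltage"])
        features[start_time] = {
            "min_temp": coldest["temperature"],
            "max_temp": hottest["temperature"],
            "min_volt": lowest["voltage"],
            # time at which the extreme value was observed within the cycle
            "time_to_reach_minvolt": lowest["time"],
            "time_to_reach_maxtemp": hottest["time"],
            # capacity is constant within a cycle, so any row can supply it
            "SOH": cycle[0]["capacity"] / RATED_CAPACITY_AH,
        }
    return features


# Toy measurements for a single made-up cycle (not taken from the dataset).
toy_rows = [
    {"start_time": "c1", "time": 0.0, "voltage": 4.1, "temperature": 24.0, "capacity": 1.8},
    {"start_time": "c1", "time": 10.0, "voltage": 3.5, "temperature": 30.0, "capacity": 1.8},
    {"start_time": "c1", "time": 20.0, "voltage": 2.7, "temperature": 28.0, "capacity": 1.8},
]
toy_features = cycle_features(toy_rows)
print(toy_features["c1"])
```

Each cycle collapses to a single row, which is the shape the model is trained on later.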
In [28]:
# define the final dataset and save it to db
final_df = discharge_cycle_metrics.join(discharging_data,
                     on_interpolate = {"start_time":"start_time"},
                     how = "left",
                     expr1 = ["*"],
                     expr2 = ["SOH AS SOH"])

# normalize the features
final_df.normalize(method = "minmax",
                   columns = ["min_temp",
                              "max_temp",
                              "min_volt",
                              "time_to_reach_minvolt",
                              "time_to_reach_maxtemp"])

# save it to db
final_df.to_db(name = "battery_data.finaldata_battery_5")
Out[28]:
 | start_time (Datetime) | min_temp (Float) | max_temp (Float) | min_volt (Float) | time_to_reach_minvolt (Float) | time_to_reach_maxtemp (Float) | SOH (Float)
1 | 2008-04-02 15:25:41.000593 | 0.323719818945165 | 0.734249807626277 | 0.0139094927117295 | 1.0 | 0.828363213887708 | 0.92824371040908
2 | 2008-04-02 19:43:48.000405 | 0.428584253762227 | 0.914351158360577 | 0.00288513334234004 | 0.723648022938414 | 0.829373513631507 | 0.923163624859965
3 | 2008-04-03 00:01:06.000687 | 0.442660578483156 | 0.795738011577447 | 0.0531177643719836 | 0.927547868717847 | 0.815449759308832 | 0.917674597111705
4 | 2008-04-03 04:16:37.000375 | 0.418765254851753 | 0.661205504813534 | 0.053907085311971 | 0.712096944880247 | 0.815696769338588 | 0.917631263791055
5 | 2008-04-03 08:33:25.000702 | 0.379860520846069 | 0.54435119574023 | 0.0415520261704376 | 0.710988351889733 | 0.814384169529302 | 0.91732275410602
6 | 2008-04-03 12:55:10.000686 | 0.378615452772031 | 0.618246536336501 | 0.00994014897430571 | 0.952280165949336 | 0.815281735044404 | 0.917830830033775
7 | 2008-04-03 17:17:16.000015 | 0.396431419261175 | 0.673902853713857 | 0.0102850192996644 | 0.963925850411508 | 0.814619690720465 | 0.91757307146133
8 | 2008-04-03 21:28:14.000718 | 0.43597699765424 | 0.654370511378216 | 0.0279191282597782 | 0.712816802582512 | 0.816549097551729 | 0.912878395283275
9 | 2008-04-04 01:38:15.000217 | 0.411844366919643 | 0.509666695146447 | 0.0558933456651759 | 0.927841997602092 | 0.801548982663343 | 0.912386926494565
10 | 2008-04-04 05:48:08.000609 | 0.392346756721263 | 0.46136681915143 | 0.0626804878161233 | 0.940644184708088 | 0.801526723038568 | 0.91230663424847
11 | 2008-04-04 09:57:19.000765 | 0.354394096577351 | 0.389886787277656 | 0.0368858551345366 | 0.95254518517081 | 0.802031513884262 | 0.912309776343225
12 | 2008-04-04 17:56:27.000609 | 0.44443933699264 | 0.649026660804619 | 0.0428709831060093 | 0.700735686081397 | 0.80224477545065 | 0.907100967883695
13 | 2008-04-04 22:01:54.000670 | 0.442323086774679 | 0.641544023693975 | 0.00961448352971318 | 0.70037545400461 | 0.801818252317873 | 0.906876078877455
14 | 2008-04-05 02:20:26.000702 | 0.359258760394804 | 0.417315395909952 | 0.0161943983378912 | 0.916016197101527 | 0.787613739502074 | 0.90672024573679
15 | 2008-04-05 06:25:01.000890 | 0.363661789774252 | 0.357247610892989 | 0.0284350239012328 | 0.867651705310755 | 0.787972765708115 | 0.901299001815325
16 | 2008-04-05 10:30:32.000311 | 0.32212275666383 | 0.294657404834105 | 0.0485072420239769 | 0.687639976542463 | 0.786739151664158 | 0.901053450123075
17 | 2008-04-05 14:34:41.000468 | 0.332846583617716 | 0.34710768730742 | 0.0110259577402384 | 0.688246427850185 | 0.78745720407624 | 0.901289750413105
18 | 2008-04-05 18:39:25.000217 | 0.340158365928614 | 0.311033039002458 | 0.102954389925526 | 0.676922769032413 | 0.774049729437851 | 0.901534157141705
19 | 2008-04-05 22:46:35.000484 | 0.332179071683428 | 0.324922775983912 | 0.0887123121145032 | 0.676581336946166 | 0.773645465929849 | 0.9013888123598
20 | 2008-04-18 21:10:19.000795 | 0.388811246836356 | 0.621924709762114 | 0.0027031540718514 | 0.989775230951819 | 0.844272383129789 | 0.92351299746646
21 | 2008-04-19 02:29:09 | 0.496042567511909 | 0.954918308943699 | 0.111471486290777 | 0.725021635150403 | 0.830999902344872 | 0.92370865564182
22 | 2008-04-19 07:47:15.000702 | 0.393630359580302 | 0.674627759274884 | 0.0394251573345608 | 0.724974331948401 | 0.83094389425673 | 0.918088710673945
23 | 2008-04-19 13:01:42.000561 | 0.291845192023113 | 0.365563891951362 | 0.0320866541187549 | 0.917522622149906 | 0.817379884192507 | 0.91289037378881
24 | 2008-04-19 18:27:29.000827 | 0.244769471060847 | 0.302665925764945 | 0.0359102068035489 | 0.928855984188602 | 0.816325065199159 | 0.91255682175392
25 | 2008-04-19 23:41:55.000686 | 0.235989158976155 | 0.178712955673615 | 0.11316749241918 | 0.701484046995125 | 0.803130852127159 | 0.91279075211019
26 | 2008-04-20 04:54:41.000734 | 0.210170632880145 | 0.125029767570445 | 0.0630232199900964 | 0.892421602523323 | 0.8019862765823 | 0.907015563754195
27 | 2008-04-20 10:16:28.000859 | 0.173844095301586 | 0.124037038749743 | 0.00245314992176609 | 0.701474343774201 | 0.803119363288565 | 0.907384597057945
28 | 2008-04-20 15:37:05.000280 | 0.15730581031276 | 0.0 | 0.100271345729499 | 0.688530247062198 | 0.787793252605094 | 0.906984694355175
29 | 2008-04-20 20:50:38.000920 | 0.168767762097435 | 0.0743395026185572 | 0.0286015262891781 | 0.867755408484375 | 1.0 | 0.90138283258391
30 | 2008-04-21 02:15:02.000921 | 0.170862890502185 | 0.14903299932949 | 0.0 | 0.689762556119488 | 0.789252335106444 | 0.902038520058675
31 | 2008-04-22 15:33:49.000875 | 0.16503534007358 | 0.384704893083431 | 0.0289639136370065 | 0.733303334208645 | 0.834085373559587 | 0.925901275835225
32 | 2008-04-22 20:26:18.000920 | 0.2880753085512 | 0.483944197394596 | 0.00533888165450636 | 0.888422662600209 | 0.813957646396526 | 0.91535192283697
33 | 2008-04-23 01:18:11.000795 | 0.228979159082955 | 0.488129054060406 | 0.00215983162788205 | 0.865036080820553 | 0.996780252984226 | 0.909952054475065
34 | 2008-04-23 06:18:19.000920 | 0.0947362875119447 | 0.356324687933399 | 0.00504423123168378 | 0.693410360735431 | 0.976271239990349 | 0.904653981851425
35 | 2008-04-23 11:08:00.000312 | 0.102341318820024 | 0.314299728929359 | 0.0247490568230476 | 0.682551243619374 | 0.977370578233246 | 0.90230495229026
36 | 2008-04-23 16:06:25.000170 | 0.220890027835113 | 0.884646172331595 | 0.0578301053112184 | 0.676657143359631 | 0.9635258096759 | 0.89968853259861
37 | 2008-04-23 21:02:58.000295 | 0.157421332451862 | 0.847834319585283 | 0.00355294966215798 | 0.813639332491175 | 0.935925311060305 | 0.894221616768525
38 | 2008-04-24 01:51:33.000890 | 0.0873225264896192 | 0.753827594237671 | 0.0324267834755304 | 0.659145861849179 | 0.936172321090061 | 0.891461524212635
39 | 2008-04-24 06:50:00.000031 | 0.0165070808273147 | 0.665969339357761 | 0.0423249126032093 | 0.648182435108194 | 0.916279397065751 | 0.886516857900915
40 | 2008-04-24 11:49:24.000280 | 0.0654221017512719 | 0.944709783419898 | 0.0331189377242385 | 0.802732912173115 | 0.923011856481428 | 0.88651887753947
41 | 2008-04-24 16:45:56.000015 | 0.234744236740332 | 0.945379086211842 | 0.0240890792477136 | 0.642336244501761 | 0.937316896634919 | 0.883936055333105
42 | 2008-04-24 21:38:11.000077 | 0.303354011196639 | 0.933553785415536 | 0.0156354560010492 | 0.636403937809631 | 0.937069886605163 | 0.881157535204655
43 | 2008-04-25 12:03:44.000093 | 0.226623179244017 | 0.94534710828012 | 0.0869302444943104 | 0.639019562299833 | 0.940592651738836 | 0.88380864624692
44 | 2008-04-25 20:03:55.000920 | 0.178200264274937 | 0.940595063122483 | 0.0332832296626233 | 0.631163592059612 | 0.916605392860836 | 0.881334179871575
45 | 2008-04-26 01:04:35.000718 | 0.0389668488819446 | 0.888349688690873 | 0.0160235096624839 | 0.619442101183975 | 0.896140899116508 | 0.875865243533205
46 | 2008-04-26 05:57:35.000140 | 0.00623800515679308 | 0.951302099715039 | 0.0244851369896549 | 0.608279758414057 | 0.876371480107076 | 0.870924802398
47 | 2008-04-26 10:58:25.000843 | 0.0 | 0.915961223013637 | 0.0530576944503355 | 0.596340551518948 | 0.875855918475201 | 0.868045675592355
48 | 2008-04-29 12:15:56.000953 | 1.0 | 0.990973753021709 | 0.999999999999999 | 0.437791134530913 | 0.490912328672695 | 0.89681200742868
49 | 2008-04-29 17:05:27.000780 | 0.723747536999461 | 0.979633820922971 | 0.664387674341174 | 0.568500797786695 | 0.64567560115348 | 0.891594511163665
50 | 2008-04-29 22:00:04.000750 | 0.471506270063886 | 0.979947294693648 | 0.0813507807321112 | 0.636603460289872 | 0.726310732873014 | 0.88368210381395
51 | 2008-04-30 02:43:55.000311 | 0.334097869769861 | 0.955806940381717 | 0.0349830799067385 | 0.625402911087567 | 0.937743419767696 | 0.878508892517655
52 | 2008-04-30 07:41:42.000578 | 0.234330298471277 | 0.884769687822221 | 0.0254769055510789 | 0.614382477923656 | 0.910827225100815 | 0.873435308991895
53 | 2008-04-30 12:28:49.000265 | 0.230685524308048 | 0.898227464526044 | 0.0483323829445847 | 0.602585180634554 | 0.917873473420572 | 0.870858625372855
54 | 2008-04-30 17:20:16.000140 | 0.324315591612487 | 0.989306678887164 | 0.044849613014181 | 0.596718977134966 | 0.925199044128629 | 0.86821125282391
55 | 2008-04-30 22:14:54.000061 | 0.264171957836078 | 0.889817899762874 | 0.0277642651297464 | 0.585651847220361 | 0.898003527073448 | 0.86316086196179
56 | 2008-05-01 03:15:33.000140 | 0.220465525373666 | 0.957376200834802 | 0.0322054887294106 | 0.574166872354735 | 0.877605812203444 | 0.857903269466505
57 | 2008-05-01 08:16:03.000343 | 0.180802666129163 | 0.788075610687912 | 0.00198624915378047 | 0.568178772142295 | 0.877561292953895 | 0.855266675567595
58 | 2008-05-01 13:14:24.000875 | 0.216199997437025 | 0.937214904047385 | 0.0475228393380138 | 0.557375448546548 | 0.885459869486794 | 0.85300724982426
59 | 2008-05-01 18:11:10.000984 | 0.364250528564828 | 0.932932874611771 | 0.0467025958667824 | 0.551339438680798 | 0.885336364471916 | 0.85015551363213
60 | 2008-05-01 23:10:15.000265 | 0.306781595181409 | 0.821135626163582 | 0.00869959359197743 | 0.545218525631968 | 0.878077572638182 | 0.847289930089895
61 | 2008-05-02 03:53:34.000765 | 0.270661107859143 | 0.863510322781565 | 0.0139251799488709 | 0.534463111689531 | 0.858532904033731 | 0.842451454330465
62 | 2008-05-02 08:44:40.000859 | 0.209546592041358 | 0.828438006182668 | 0.0187293435929473 | 0.522922343303595 | 0.83813590721614 | 0.837237079557985
63 | 2008-05-02 13:43:05.000092 | 0.252228799753503 | 0.852281269815156 | 0.0288364545821562 | 0.51745457831318 | 0.859183459519077 | 0.837284623993955
64 | 2008-05-02 19:07:48.000359 | 0.35367493918292 | 0.980040930330346 | 0.029278562754402 | 0.506092713063022 | 0.825233223423444 | 0.831858187990705
65 | 2008-05-02 23:24:17.000687 | 0.32454085352701 | 0.890914842846689 | 0.0475063910763083 | 0.500587348091522 | 0.839179237370901 | 0.829506934374125
66 | 2008-05-03 04:13:08.000609 | 0.26513167349577 | 0.951127823524921 | 0.0368352987171585 | 0.48927339249468 | 0.832762002964121 | 0.82692702866675
67 | 2008-05-03 09:10:19.000812 | 0.223008359705714 | 0.924050553158268 | 0.0471544006499553 | 0.477400288792113 | 0.811523448719569 | 0.821326891065065
68 | 2008-05-03 14:04:34.000186 | 0.21977160162137 | 0.875000447949457 | 0.0150271255636396 | 0.471885220599695 | 0.812285302328788 | 0.818928921511185
69 | 2008-05-03 18:58:13.000468 | 0.239550034287987 | 0.949888877502296 | 0.0115051419069284 | 0.465953520358874 | 0.805453033627831 | 0.816367520354955
70 | 2008-05-03 23:42:30.000092 | 0.250309744459717 | 0.896470683803569 | 0.0329263602495097 | 0.454971293627349 | 0.80620483450328 | 0.81387644577668
71 | 2008-05-04 04:41:40 | 0.24143262091293 | 0.952005506971703 | 0.00674747021435481 | 0.448954083752139 | 0.798755040727933 | 0.81106274276058
72 | 2008-05-04 09:40:11.000906 | 0.230969846624399 | 0.950556849513042 | 0.0332823415006429 | 0.43727019285758 | 0.778021277329075 | 0.80566283018829
73 | 2008-05-04 14:38:13.000359 | 0.226844754848272 | 0.912940760318118 | 0.000214905204002643 | 0.431916440713017 | 0.778874323594628 | 0.80328156987442
74 | 2008-05-04 19:37:07.000889 | 0.222140772965595 | 0.993978362427611 | 0.0174136689030392 | 0.42083900112618 | 0.772444882296849 | 0.80075711169588
75 | 2008-05-05 00:35:53.000968 | 0.20755947997225 | 0.983619243549681 | 0.0367209746290569 | 0.409203019884932 | 0.751307573442401 | 0.795184615700165
76 | 2008-05-05 05:28:13.000936 | 0.19342703088219 | 0.902618250256257 | 0.00434372490350577 | 0.4035454356352 | 0.751509346170196 | 0.79289449841211
77 | 2008-05-05 11:41:31.000500 | 0.200941357664232 | 0.97744058072263 | 0.00629863608030866 | 0.397291709749978 | 0.737418285635505 | 0.792471535368705
78 | 2008-05-05 20:40:08.000561 | 0.366532091718883 | 0.941072705651864 | 0.0235488500907497 | 0.415030410500826 | 0.77164856217185 | 0.797763194624795
79 | 2008-05-06 01:38:46.000265 | 0.17456444760124 | 0.956705398438246 | 0.0302839537483173 | 0.386707315076319 | 0.7244258452913 | 0.78736508739005
80 | 2008-05-06 06:37:06.000296 | 0.0556927229898643 | 0.971881305942316 | 0.00106451769165605 | 0.375743888335334 | 0.704196872738135 | 0.782450997546895
81 | 2008-05-06 11:25:21.000030 | 0.0843293021566817 | 0.898848218062114 | 0.0341582768200796 | 0.364069700661699 | 0.711657437299663 | 0.77988297366852
82 | 2008-05-06 16:23:36.000295 | 0.224741143793619 | 0.98968940274526 | 0.0226324623759667 | 0.363851378190919 | 0.731538872485381 | 0.77974078340921
83 | 2008-05-06 21:21:49.000265 | 0.325474499816528 | 0.983634106966202 | 0.0296786283232174 | 0.358668645315133 | 0.397229466573224 | 0.777344676795305
84 | 2008-05-07 02:17:25.000155 | 0.290200380282382 | 0.939073995620731 | 0.0397912952865622 | 0.346671825545791 | 0.725278173504441 | 0.77443705399452
85 | 2008-05-07 07:04:02.000108 | 0.228872834557864 | 0.955571575917293 | 0.0336105310387475 | 0.335186850680165 | 0.697823439528498 | 0.76911829947128
86 | 2008-05-07 12:01:49.000905 | 0.132056746824507 | 0.856959673290695 | 0.0182950187655526 | 0.32429013358303 | 0.677696430417849 | 0.763957129125515
87 | 2008-05-07 16:59:29.000937 | 0.183789996055938 | 0.99243326022676 | 0.00229833928415934 | 0.324318030343185 | 0.692000752518928 | 0.76426263148782
88 | 2008-05-07 21:56:09.000656 | 0.298577656610435 | 0.945139244411773 | 0.0758108639972606 | 0.312805158717404 | 0.342926034857136 | 0.76132366264219
89 | 2008-05-08 02:53:49.000936 | 0.317469453758104 | 0.958468938950578 | 0.105725522716331 | 0.30158580952456 | 0.329642065233626 | 0.758742996924495
90 | 2008-05-09 12:25:07 | 0.26153032403767 | 0.970944215530526 | 0.0467016480369344 | 0.409695458346802 | 0.457646396525775 | 0.80290944956533
91 | 2008-05-09 20:28:09.000734 | 0.438303895765181 | 0.999469949112742 | 0.317973618936032 | 0.329871304967988 | 0.363132747785526 | 0.78192457233722
92 | 2008-05-10 01:21:56.000500 | 0.442987730974907 | 0.968201501542679 | 0.322469163968933 | 0.307157884139903 | 0.336239530795832 | 0.77404576238839
93 | 2008-05-10 06:18:12 | 0.372705665613852 | 0.957392000995891 | 0.200386873167171 | 0.312871868361253 | 0.343005020622465 | 0.76618778159343
94 | 2008-05-10 11:16:04.000843 | 0.30303373332762 | 0.969799920153192 | 0.0687295788603243 | 0.318310523688898 | 0.349444514654014 | 0.763476413424485
95 | 2008-05-10 16:11:56.000109 | 0.272101737639658 | 0.954107785168923 | 0.0341836349819164 | 0.312833661928867 | 0.684326208338599 | 0.758478650856815
96 | 2008-05-10 21:07:54.000625 | 0.288748655091365 | 0.97123956156155 | 0.0533210486349886 | 0.301765925562954 | 0.329855326800014 | 0.755948798007135
97 | 2008-05-11 02:05:21.000795 | 0.29523397030756 | 0.959790670889455 | 0.0822720264930192 | 0.289798215456382 | 0.315685280499994 | 0.753281948034055
98 | 2008-05-11 06:55:27.000734 | 0.303754154194601 | 0.995893518830806 | 0.0629563323799934 | 0.284747688965679 | 0.309705340012178 | 0.75077267633357
99 | 2008-05-11 11:52:47.000186 | 0.295458907672402 | 0.967188876674558 | 0.0845860524686891 | 0.272741772426721 | 0.295490056410198 | 0.74542220252001
100 | 2008-05-11 16:49:58.000920 | 0.30860091114399 | 0.978100488949248 | 0.110578608812177 | 0.261920255291742 | 0.651341034684804 | 0.7429341922806
Rows: 1-100 | Columns: 7
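
The five feature columns above all lie between 0.0 and 1.0, which is what min-max scaling produces. As a minimal sketch of that rescaling in plain Python (the helper name and toy values are made up, and the handling of a constant column is a choice made here, not necessarily what the library does):

```python
def minmax_normalize(values):
    """Rescale a list of numbers linearly so the smallest maps to 0.0 and the largest to 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0.0 (assumption).
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Made-up maximum temperatures for three cycles, in degrees Celsius.
toy_max_temps = [34.0, 36.5, 39.0]
scaled = minmax_normalize(toy_max_temps)
print(scaled)
```

Putting all features on the same scale keeps any one of them from dominating the linear models tested in the next section only because of its units.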

Machine Learning

AutoML tests several models and returns test and training scores for each. We can use this to find the best model for our dataset.

In [29]:
from verticapy.learn.delphi import AutoML

model = AutoML("battery_data.battery_autoML", 
               estimator = "native")
model.fit("battery_data.finaldata_battery_5", 
          X = ["min_temp",
               "max_temp",
               "min_volt",
               "time_to_reach_minvolt",
               "time_to_reach_maxtemp"],
          y = "SOH")
Starting AutoML

Testing Model - LinearRegression

Model: LinearRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'newton'}; Test_score: 0.02360579740139729; Train_score: 0.016785972410622945; Time: 0.2098848819732666;
Model: LinearRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'bfgs'}; Test_score: 0.02008096687524301; Train_score: 0.018674852037929093; Time: 0.4777239163716634;

Grid Search Selected Model
LinearRegression; Parameters: {'solver': 'bfgs', 'penalty': 'none', 'max_iter': 100, 'tol': 1e-06}; Test_score: 0.02008096687524301; Train_score: 0.018674852037929093; Time: 0.4777239163716634;

Testing Model - ElasticNet

Model: ElasticNet; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'cgd', 'C': 1.0, 'l1_ratio': 0.5}; Test_score: 0.0946090154981508; Train_score: 0.09513062482287396; Time: 0.24749835332234701;

Grid Search Selected Model
ElasticNet; Parameters: {'solver': 'cgd', 'penalty': 'enet', 'max_iter': 100, 'l1_ratio': 0.5, 'C': 1.0, 'tol': 1e-06}; Test_score: 0.0946090154981508; Train_score: 0.09513062482287396; Time: 0.24749835332234701;

Testing Model - Ridge

Model: Ridge; Parameters: {'tol': 1e-06, 'max_iter': 100, 'C': 1.0}; Test_score: 0.017973726231011983; Train_score: 0.019579406509925495; Time: 0.2110426425933838;

Grid Search Selected Model
Ridge; Parameters: {'solver': 'newton', 'penalty': 'l2', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Test_score: 0.017973726231011983; Train_score: 0.019579406509925495; Time: 0.2110426425933838;

Testing Model - Lasso

Model: Lasso; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'cgd', 'C': 1.0}; Test_score: 0.09783993033532816; Train_score: 0.09335034317945679; Time: 0.23836588859558105;

Grid Search Selected Model
Lasso; Parameters: {'solver': 'cgd', 'penalty': 'l1', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Test_score: 0.09783993033532816; Train_score: 0.09335034317945679; Time: 0.23836588859558105;

Testing Model - LinearSVR

Model: LinearSVR; Parameters: {'tol': 1e-06, 'fit_intercept': True, 'intercept_mode': 'regularized', 'max_iter': 100}; Test_score: 0.09971802597621725; Train_score: 0.09811156476489769; Time: 2.4670310020446777;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'fit_intercept': True, 'intercept_mode': 'unregularized', 'max_iter': 100}; Test_score: 0.06589853834497636; Train_score: 0.06575896381072349; Time: 5.280991156895955;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'fit_intercept': True, 'intercept_mode': 'regularized', 'max_iter': 100}; Test_score: 0.09637031608421139; Train_score: 0.09901206907082098; Time: 3.5590200424194336;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'fit_intercept': True, 'intercept_mode': 'unregularized', 'max_iter': 100}; Test_score: 0.06709229053393247; Train_score: 0.062144273052144205; Time: 4.722416639328003;

Grid Search Selected Model
LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'max_iter': 100, 'fit_intercept': True, 'intercept_scaling': 1.0, 'intercept_mode': 'unregularized', 'acceptable_error_margin': 0.1}; Test_score: 0.06589853834497636; Train_score: 0.06575896381072349; Time: 5.280991156895955;

Testing Model - RandomForestRegressor

Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 1000, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.011514903525649154; Train_score: 0.007959577346479244; Time: 0.33117159207661945;
Model: RandomForestRegressor; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 128, 'max_depth': 6, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.01665041966752498; Train_score: 0.006759015139569025; Time: 0.28825847307840985;
Model: RandomForestRegressor; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 128, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.015214971131485872; Train_score: 0.013147941569538838; Time: 0.29858922958374023;
Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 32, 'max_depth': 6, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.010831393646800738; Train_score: 0.004767620085266784; Time: 0.32145269711812335;
Model: RandomForestRegressor; Parameters: {'max_features': 'auto', 'max_leaf_nodes': 64, 'max_depth': 5, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.01494343232768997; Train_score: 0.0072498732577846395; Time: 0.321212371190389;

Grid Search Selected Model
RandomForestRegressor; Parameters: {'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 32, 'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 0.010831393646800738; Train_score: 0.004767620085266784; Time: 0.32145269711812335;

Final Model

RandomForestRegressor; Best_Parameters: {'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 32, 'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Best_Test_score: 0.010831393646800738; Train_score: 0.004767620085266784; Time: 0.32145269711812335;


Starting Stepwise
[Model 0] aic: -1812.6614333263835; Variables: ['"min_temp"', '"min_volt"', '"max_temp"', '"time_to_reach_maxtemp"', '"time_to_reach_minvolt"']
[Model 1] aic: -1852.5506153813367; (-) Variable: "time_to_reach_maxtemp"

Selected Model

[Model 1] aic: -1852.5506153813367; Variables: ['"min_temp"', '"min_volt"', '"max_temp"', '"time_to_reach_minvolt"']
Out[29]:
 | model_type | avg_score | avg_train_score | avg_time | score_std | score_train_std
1 | RandomForestRegressor | 0.010831393646800738 | 0.004767620085266784 | 0.32145269711812335 | 0.004525936982396824 | 0.0012905134207646044
2 | RandomForestRegressor | 0.011514903525649154 | 0.007959577346479244 | 0.33117159207661945 | 0.004197829562031868 | 0.0024142794847140043
3 | RandomForestRegressor | 0.01494343232768997 | 0.0072498732577846395 | 0.321212371190389 | 0.0025104874654537676 | 0.0007409047272686373
4 | RandomForestRegressor | 0.015214971131485872 | 0.013147941569538838 | 0.29858922958374023 | 0.002417235148630619 | 0.0003135860896100186
5 | RandomForestRegressor | 0.01665041966752498 | 0.006759015139569025 | 0.28825847307840985 | 0.00462907267005986 | 0.00038960505811702905
6 | Ridge | 0.017973726231011983 | 0.019579406509925495 | 0.2110426425933838 | 0.0009900193296704554 | 0.00029552531378825963
7 | LinearRegression | 0.02008096687524301 | 0.018674852037929093 | 0.4777239163716634 | 0.0032627890508790873 | 0.0013790855201005533
8 | LinearRegression | 0.02360579740139729 | 0.016785972410622945 | 0.2098848819732666 | 0.001361551445041374 | 0.0004692360462666707
9 | LinearSVR | 0.06589853834497636 | 0.06575896381072349 | 5.280991156895955 | 0.0022906667129320574 | 0.0010075870394134958
10 | LinearSVR | 0.06709229053393247 | 0.062144273052144205 | 4.722416639328003 | 0.0033057805828847113 | 0.0010907647350367879
11 | ElasticNet | 0.0946090154981508 | 0.09513062482287396 | 0.24749835332234701 | 0.0046935490843438964 | 0.0022381056909534904
12 | LinearSVR | 0.09637031608421139 | 0.09901206907082098 | 3.5590200424194336 | 0.003839661674935492 | 0.0012152334737706163
13 | Lasso | 0.09783993033532816 | 0.09335034317945679 | 0.23836588859558105 | 0.007282035381479695 | 0.0038882949641852096
14 | LinearSVR | 0.09971802597621725 | 0.09811156476489769 | 2.4670310020446777 | 0.0018548327940873743 | 0.001225713718922675
Rows: 1-14 | Columns: 8
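
The ranking in this table can be reproduced by hand. The sketch below takes the selected model of each family from the log above and sorts by test score, assuming the score is an error measure where lower is better (the log is consistent with this, since the final model has the lowest test score). The variable names are illustrative only.

```python
# (model family, test score, train score) for the model selected in each grid search above
selected = [
    ("LinearRegression", 0.02008096687524301, 0.018674852037929093),
    ("ElasticNet", 0.0946090154981508, 0.09513062482287396),
    ("Ridge", 0.017973726231011983, 0.019579406509925495),
    ("Lasso", 0.09783993033532816, 0.09335034317945679),
    ("LinearSVR", 0.06589853834497636, 0.06575896381072349),
    ("RandomForestRegressor", 0.010831393646800738, 0.004767620085266784),
]

# Assumption: lower score is better, so sort ascending by test score.
ranked = sorted(selected, key=lambda entry: entry[1])
best_name, best_test, best_train = ranked[0]

# The gap between test and train scores is a rough indicator of overfitting.
gap = best_test - best_train
print(best_name, best_test, gap)
```

The random forest has the lowest test score, but its train score is less than half of its test score, which is a larger relative gap than for the linear models.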

We can visualize the performance and efficiency differences of each model with a plot.

In [30]:
model.plot()
Out[30]:
<AxesSubplot:xlabel='time', ylabel='score'>

Let's check the type and hyperparameters of the most performant model.

In [31]:
# take the best model and its parameters
best_model = model.best_model_
params = best_model.get_params()
print(best_model.type)
RandomForestRegressor

We can now define the model using those hyperparameters and train it.

In [32]:
from verticapy.learn.ensemble import RandomForestRegressor

# define a regression model based on the selected parameters
model_rf = RandomForestRegressor(name = "btr_rf1", **params)
model_rf.fit(final_df,
             X = ["min_temp",
                  "max_temp",
                  "min_volt",
                  "time_to_reach_minvolt",
                  "time_to_reach_maxtemp"],
             y = "SOH")
model_rf.regression_report()
Out[32]:
value
explained_variance0.997659518246552
max_error0.0375041491629448
median_absolute_error0.00184377468342722
mean_absolute_error0.00278192359172081
mean_squared_error2.11456072683787e-05
root_mean_squared_error0.004598435306534028
r20.997653174398044
r2_adj0.9975807415090947
aic-1795.8434267288283
bic-1777.6213819828436
Rows: 1-10 | Columns: 2
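
Some of the figures in this report follow from one another through standard formulas. The sketch below recomputes the root mean squared error from the mean squared error, and the adjusted R² from R². The sample count of 168 discharge cycles is an assumption that is not stated in the text; it was chosen because it makes the adjusted R² agree with the reported value.

```python
import math

# Values copied from the regression report above.
mse = 2.11456072683787e-05
r2 = 0.997653174398044

n_samples = 168   # assumption: number of discharge cycles, not stated in the text
n_features = 5    # the five predictors passed to fit()

# RMSE is the square root of MSE.
rmse = math.sqrt(mse)

# Adjusted R² penalizes R² for the number of predictors.
r2_adj = 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_features - 1)

print(rmse, r2_adj)
```

Because the target is the SOH, an RMSE of about 0.0046 corresponds to an average error of roughly half a percentage point of state of health.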

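With an R² of about 0.998 and a root mean squared error of about 0.0046, the model fits the data well. Keep in mind that this report is computed on the same relation the model was fitted on, so it likely overstates performance on unseen cycles. Let's use our model to predict the SoH of the battery. We can visualize our predictions by plotting them against the true values.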

In [33]:
# take the predicted values and plot them alongside the true ones
result = model_rf.predict(final_df, 
                          name = "SOH_estimates")
result.plot(ts = 'start_time', 
            columns = ['SOH', 'SOH_estimates'])
Out[33]:
<AxesSubplot:xlabel='"start_time"'>