Predicting Popularity on Spotify

This example uses the publicly available Spotify dataset from Kaggle to predict the popularity of Polish songs and artists on Spotify. We'll also use a model to group artists together based on how similar their songs are.

You can download the Jupyter notebook of this study here.

The "tracks" dataset (tracks.csv) has the following features:

  • id - ID of the track, generated by Spotify

Numerical:

  • acousticness (range: [0,1])
  • danceability (range: [0,1])
  • energy (range: [0,1])
  • duration_ms (range: [200000,300000])
  • instrumentalness (range: [0,1])
  • valence (range: [0,1])
  • popularity (range: [0,100])
  • tempo (range: [50,150])
  • liveness (range: [0,1])
  • loudness (range: [-60,0])
  • speechiness (range: [0,1])

Dummy:

  • mode (0 = Minor, 1 = Major)
  • explicit (0 = No explicit content and 1 = Explicit content)

Categorical:

  • key - keys on an octave encoded as integers in range [0,11] (C = 0, C# = 1, etc.)
  • time_signature - predicted time signature
  • artists - list of contributing artists
  • id_artists - list of IDs of contributing artists
  • release_date - date of release (yyyy-mm-dd)
  • name - track name
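
As a small illustration of the key and mode encodings listed above, the following sketch turns a (key, mode) pair into a readable label. The helper name describe_key and the sharp-based spellings of the notes beyond C and C# are assumptions made for this example; they are not part of the dataset or of VerticaPy.

```python
# Pitch-class names indexed by the integer stored in the `key` column.
# Sharp spellings are an assumption; the dataset only stores the integers.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def describe_key(key: int, mode: int) -> str:
    """Return a label such as 'C# minor' for a (key, mode) pair."""
    if not 0 <= key <= 11:
        raise ValueError(f"key must be in [0, 11], got {key}")
    if mode not in (0, 1):
        raise ValueError(f"mode must be 0 (minor) or 1 (major), got {mode}")
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


print(describe_key(0, 1))  # C major
print(describe_key(1, 0))  # C# minor
```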

The "artists" dataset (artists.csv) has the following features:

  • id - ID of the artist
  • name - artist name
  • followers - how many followers the artist has
  • popularity - popularity of the artist, based on their tracks
  • genres - list of genres covered by the artist's tracks

Import libraries

Start by importing VerticaPy and loading the SQL extension, which allows you to query the Vertica database with SQL.

In [1]:
import verticapy as vp
%load_ext verticapy.sql

This example uses the following version of VerticaPy:

In [2]:
vp.__version__
Out[2]:
'0.9.0'

Connect to Vertica. This example uses an existing connection called "VerticaDSN." For details on how to create a connection, see the connection tutorial.

In [3]:
vp.connect("VerticaDSN")

Create a new schema, "spotify."

In [4]:
vp.drop("spotify", method = "schema")
vp.create_schema("spotify")
Out[4]:
True

Data Loading

Load the datasets into vDataFrame objects with read_csv() and then view them with display().

In [5]:
# load datasets as vDataFrame objects
artists = vp.read_csv("data/artists.csv", schema = "spotify", parse_nrows = 100)
tracks  = vp.read_csv("data/tracks.csv" , schema = "spotify", parse_nrows = 100)
The table "spotify"."artists" has been successfully created.
The table "spotify"."tracks" has been successfully created.
In [6]:
display(artists)
display(tracks)
[Output of display(artists): 100-row preview of "spotify"."artists"]
Columns (type): id (Varchar(44)), followers (Numeric(7,2)), genres (Varchar(36)), name (Varchar(114)), popularity (Int)
Rows: 1-100 | Columns: 5
[Output of display(tracks): 100-row preview of "spotify"."tracks"]
Columns (type): id (Varchar(44)), name (Varchar(96)), popularity (Int), duration_ms (Int), explicit (Int), artists (Varchar(100)), id_artists (Varchar(156)), release_date (Varchar(20)), danceability (Numeric(6,4)), energy (Numeric(8,6)), key (Int), loudness (Numeric(8,4)), mode (Int), speechiness (Numeric(7,5)), acousticness (Numeric(6,4)), instrumentalness (Float), liveness (Numeric(7,5)), valence (Numeric(7,5)), tempo (Numeric(9,4)), time_signature (Int)
Rows: 1-100 | Columns: 20

Data Exploration

Our "artists" dataset is too broad for us to use right now; we're only concerned with Polish artists, so let's extract and save them to our Vertica database.

In [7]:
# select Polish artists from the 'artists' dataset using the 'genres' column
polish_artists = artists.search("genres ilike '%disco polo%' or genres ilike '%polish%'") 

# save it to the database
polish_artists.to_db('"spotify"."polish_artists"', relation_type = "table")
Out[7]:
[Output: 100-row preview of "spotify"."polish_artists"]
Columns (type): id (Varchar(44)), followers (Numeric(7,2)), genres (Varchar(36)), name (Varchar(114)), popularity (Integer)
Rows: 1-100 | Columns: 5
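
To make the filter condition concrete, here is a minimal plain-Python sketch of what the two ILIKE patterns do: a case-insensitive substring match on the genres text. The helper names and the sample genre strings are hypothetical and only serve the illustration; the actual filtering in this example is done by Vertica.

```python
def ilike_contains(value: str, needle: str) -> bool:
    """Emulate SQL `value ILIKE '%needle%'` for a needle without wildcards."""
    return needle.lower() in value.lower()


def is_polish(genres: str) -> bool:
    """Mirror the condition: genres ILIKE '%disco polo%' OR genres ILIKE '%polish%'."""
    return ilike_contains(genres, "disco polo") or ilike_contains(genres, "polish")


# Hypothetical genre strings, not taken from the dataset.
samples = ["['polish pop', 'polish rock']", "['Disco Polo']", "['k-pop']"]
for genres in samples:
    print(genres, is_polish(genres))
```

Both patterns are needed because the string "disco polo" does not itself contain "polish".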

We can visualize the top 60 most-followed Polish artists with a bar chart.

In [8]:
# make a highchart of the top 60 most-followed Polish artists
polish_artists.hchart(x = "name",
                      y = "followers", 
                      aggregate = False,
                      kind = "bar",
                      max_cardinality = 60,
                      height = 900,
                      width = 1500)
Out[8]:

We can do the same with the most popular tracks. For example, we can graph Monika Brodka's most popular tracks like so:

In [9]:
# find Monika Brodka's songs
brodka_tracks = tracks.search("artists ilike '%brodka%'")

# plot Brodka's tracks ordered by popularity
brodka_tracks.hchart(x = "name",
                     y = "popularity",
                     aggregate = False,
                     kind = "bar",
                     max_cardinality = 25) 
Out[9]:

To get an idea of what makes Monika Brodka's songs popular, let's create a boxplot of the numerical feature distribution of her tracks.

In [10]:
# list of the relevant numerical features
numerical_features = ['danceability', 
                      'energy', 
                      'speechiness', 
                      'acousticness', 
                      'instrumentalness', 
                      'valence', 
                      'liveness'] 

# create a boxplot of the above features
brodka_tracks[numerical_features].hchart(kind = "boxplot")
Out[10]:
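
For readers who want to see what a boxplot summarizes, the sketch below computes a five-number summary (minimum, quartiles, maximum) in plain Python. It is only an illustration: the function name and the sample values are hypothetical, and the quantile method used here (the standard library's "inclusive" interpolation) is an assumption that may differ from how the plotted quartiles are computed in the database.

```python
from statistics import quantiles


def five_number_summary(values):
    """Return min, first quartile, median, third quartile and max of a sample."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {"min": data[0], "q1": q1, "median": median, "q3": q3, "max": data[-1]}


# Hypothetical danceability values, not taken from the dataset.
print(five_number_summary([0.41, 0.55, 0.62, 0.70, 0.48, 0.66]))
```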

Timing is a classic factor for success, so let's look at the popularity of Monika's songs over time with a smooth curve.

In [11]:
# extract year from the date
brodka_tracks['release_year'] = "year(release_date::date)"

# smooth the popularity using rolling mean
brodka_tracks.rolling(func = 'mean',
                      columns = 'popularity',
                      window = (-3, 3),
                      order_by = 'release_year',
                      name = 'smoothed_popularity')

# plot the smoothed curve for popularity of her songs
brodka_tracks.plot(ts = 'release_date', columns=['smoothed_popularity']) 
Out[11]:
<AxesSubplot:xlabel='"release_date"', ylabel='"smoothed_popularity"'>
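
The smoothing step can be sketched in plain Python as a centered moving average. This assumes that window = (-3, 3) means "three rows before through three rows after the current row" and that the window simply shrinks at the edges of the data, as SQL window frames do; the function name and inputs are hypothetical.

```python
def rolling_mean(values, preceding=3, following=3):
    """Mean over a sliding window of `preceding` rows before and `following` rows after."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - preceding)                # window shrinks at the start
        hi = min(len(values), i + following + 1)  # and at the end
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# Hypothetical popularity values, already ordered by release year.
print(rolling_mean([10, 20, 30], preceding=1, following=1))  # [15.0, 20.0, 25.0]
```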

Numerical-feature Analysis

Bringing it all together, let's look at how these numerical features change over the years across all tracks, and how they correlate with each other.

In [12]:
# extract year from date
tracks['release_year'] = "year(release_date::date)"

# get the average of numerical features during the year
yearly_aggs = tracks.groupby('release_year', ['AVG(danceability) AS danceability',
                                              'AVG(energy) AS energy', 
                                              'AVG(speechiness) AS speechiness', 
                                              'AVG(acousticness) AS acousticness', 
                                              'AVG(instrumentalness) AS instrumentalness',
                                              'AVG(valence) AS valence', 
                                              'AVG(liveness) AS liveness'])

# plot the curves for numerical features along the different years
yearly_aggs.plot(ts='release_year', 
                 columns=numerical_features)
Out[12]:
<AxesSubplot:xlabel='"release_year"'>
In [13]:
# correlation of numerical features
tracks[numerical_features].corr()
Out[13]:
                   | "danceability" | "energy" | "speechiness" | "acousticness" | "instrumentalness" | "valence" | "liveness"
"danceability"     | 1.0 | 0.225378443312902 | 0.198422110210451 | -0.228879028726738 | -0.219560144635743 | 0.517690962189664 | -0.107989997860069
"energy"           | 0.225378443312902 | 1.0 | -0.0588034003217321 | -0.710939277579459 | -0.190319675033783 | 0.360195098649349 | 0.12611905796149
"speechiness"      | 0.198422110210451 | -0.0588034003217321 | 1.0 | 0.0735189869046661 | -0.101150709419063 | 0.0439632547745622 | 0.208192661333212
"acousticness"     | -0.228879028726738 | -0.710939277579459 | 0.0735189869046661 | 1.0 | 0.199817518163958 | -0.166786499501141 | -0.00448565029093398
"instrumentalness" | -0.219560144635743 | -0.190319675033783 | -0.101150709419063 | 0.199817518163958 | 1.0 | -0.170613868369725 | -0.0372257235123641
"valence"          | 0.517690962189664 | 0.360195098649349 | 0.0439632547745622 | -0.166786499501141 | -0.170613868369725 | 1.0 | -0.000143636296234245
"liveness"         | -0.107989997860069 | 0.12611905796149 | 0.208192661333212 | -0.00448565029093398 | -0.0372257235123641 | -0.000143636296234245 | 1.0
Rows: 1-7 | Columns: 8
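
Each entry of the matrix is a correlation coefficient between two features. Assuming the default Pearson method, the coefficient for a pair of columns can be sketched in plain Python as follows; the function name and the toy inputs are hypothetical, and in this example the real computation happens inside Vertica.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Toy values: perfectly linear relationships give +1 and -1.
print(pearson([1, 2, 3], [2, 4, 6]))
print(pearson([1, 2, 3], [3, 2, 1]))
```

Read this way, the value of about -0.71 between energy and acousticness indicates a strong negative linear relationship, while values near zero (such as valence versus liveness) indicate almost none.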

Feature Engineering

To expand our analysis, let's take into account some descriptive features. Since our goal is to predict popularity, some useful features might be:

  • number of followers
  • popularity of the track's artist
  • the number of artists per track

Additionally, we manipulate our data a bit to make things easier later on:

  • converting the duration from milliseconds to minutes
  • extracting the year from the release date
In [66]:
%%sql
CREATE TABLE spotify.polish_tracks AS
SELECT * FROM spotify.tracks 
WHERE id_artists IN (SELECT t.id_artists FROM spotify.tracks t JOIN spotify.polish_artists p
                     ON t.id_artists LIKE '%' || p.id || '%');
CREATE TABLE spotify.polish_tracks_clean AS
SELECT 
    x.*, 
    x.duration_ms / 60000 AS duration_minute,
    (CASE 
         WHEN LENGTH(x.release_date) = 4 THEN (release_date::INT) 
         ELSE YEAR(x.release_date::date) 
     END) AS release_year,
    y.followers AS artists_followers,
    y.popularity AS artist_popularity
FROM spotify.polish_tracks AS x LEFT JOIN spotify.artists AS y
ON x.id_artists LIKE '%' || y.id || '%';
CREATE
CREATE
Out[66]:
Execution: 634.41s
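
The two derived columns in the query above can be sketched in plain Python. This is only an illustration of the logic, assuming release_date is either a four-digit year or a full yyyy-mm-dd date, as in the previewed rows; the function names are hypothetical.

```python
from datetime import date


def duration_minutes(duration_ms: int) -> float:
    """Convert a duration from milliseconds to minutes."""
    return duration_ms / 60000


def release_year(release_date: str) -> int:
    """Mirror the CASE expression: a bare year is cast directly, a full date is parsed."""
    if len(release_date) == 4:
        return int(release_date)
    return date.fromisoformat(release_date).year


print(duration_minutes(258907))
print(release_year("1980"), release_year("2012-01-01"))
```

The length check matters because casting a bare year such as '1980' to a date is not reliable, which is why the query handles that case separately.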
In [27]:
polish_tracks = vp.vDataFrame("spotify.polish_tracks_clean")

# count the number of artists per track
polish_tracks.regexp(column = "artists", 
                     pattern = ",", 
                     method = "count",
                     name = "nb_singers")
polish_tracks["nb_singers"].add(1)
Out[27]:
[Output: preview of polish_tracks with the new nb_singers column]
Columns (type): id (Varchar(44)), name (Varchar(96)), popularity (Int), duration_ms (Int), explicit (Int), artists (Varchar(100)), id_artists (Varchar(156)), release_date (Varchar(20)), danceability (Numeric(6,4)), energy (Numeric(8,6)), key (Int), loudness (Numeric(8,4)), mode (Int), speechiness (Numeric(7,5)), acousticness (Numeric(6,4)), instrumentalness (Float), liveness (Numeric(7,5)), valence (Numeric(7,5)), tempo (Numeric(9,4)), time_signature (Int), duration_minute (Numeric(36,18)), release_year (Int), artists_followers (Numeric(7,2)), artist_popularity (Int), nb_singers (Integer)
6108D5BLpRFI6PT2XYuq7eDg2827326701996-01-010.6960.9395-6.53800.03030.5075.37e-060.2510.974137.92144.55445199612829.0361
6208D6EqghkRLD6iFHKKWJ4Y5521312012020-01-270.7580.6870-8.81300.05690.30.02630.120.475125.01843.552202040274.0622
6308ER472xnuMlbtkKSh3PB41116686701975-09-070.3370.2910-14.60710.08870.930.00.1090.416178.15942.7811166666666667197512184.0341
6408ULDx4KeBpXe7OHEWjgVP4527729302014-04-160.6870.933-4.41810.03930.01470.001160.330.96153.06844.6215520143064.0371
6508uPk6Ge77W31WGGYNEPHA215636901978-01-010.4690.5475-8.84210.3150.7811.28e-060.1030.73692.54442.6061519785059.0422
6608uPk6Ge77W31WGGYNEPHA215636901978-01-010.4690.5475-8.84210.3150.7811.28e-060.1030.73692.54442.6061519781.012
6708xY8SSfwHc7SIXWroGW9j16167360019940.3240.2462-14.30800.03910.6570.00.1070.59124.35442.7893333333333334199436250.0541
68095PNli2gmoYwlvgotxxEs28269067019950.5760.6167-5.14400.0280.05496.02e-060.0960.146119.80444.48445199577205.0531
6909FEC2H0Ls5bXMSFj2bLzL5019846002018-10-240.7080.8070-8.00310.2190.2980.0001250.1160.917169.84843.3076666666666665201874908.0653
7009GKW7x0iMuxzzM7IeQ01g715352101964-02-210.5210.2090-12.89110.03390.7360.00.1870.54882.01842.55868333333333319643058.0311
7109PwmLzRfWj5hkmIiw1a9X2125131601969-09-070.3230.3850-9.22810.03130.7147.9e-060.3780.379119.00244.1886196912184.0341
7209d4etyrEWf5BguIIUfRW91221252301974-08-160.3080.4299-8.70300.0320.4340.00.1380.31483.69543.54205197412962.0361
730ADOK8ffxUlSDu51nIu5KU23274533019990.6620.2991-11.21400.02750.410.00.3630.553103.03744.57555199961203.0451
740AGuYcLlnbR7AOQSbTGzIz2923296002000-12-310.4470.5732-8.93700.03030.07990.0001110.2910.562201.07743.8826666666666667200015007.0401
750AHy3nZXS725AAHDC0qjPR2220233601966-09-280.4820.3639-10.88510.03770.8780.00.3080.369117.99943.3722666666666665196615347.0351
760AJJJnR5VwJfeYGvnyG74v719153001967-03-290.6320.3037-14.100.02720.7740.002170.1020.404104.85943.1921666666666666196769339.0461
770ATVUbPlX65HB5H1WbIugt5418135612020-12-040.8910.8997-9.83610.2880.1850.00.07780.888154.91243.0226202027699.0541
780AUeyoLQYwYUDsQRJZKvZd22248907019800.4240.8244-5.89510.04340.00054.31e-050.1330.423195.9944.14845198057656.0421
790AXtdoJaVk6W8pcYstecgw33193227020000.5950.8062-7.01910.04420.1998.02e-050.02040.718120.17243.22045200044870.0431
800AYmWvKRGtgePrpo3NdYXh4821853002019-02-010.6410.3824-11.05510.02990.9360.4370.1070.237112.99643.6421666666666668201990157.0611
810ApseFDjbipbKK0dBD4Viz3526701312005-11-150.8370.6979-5.89900.2940.07760.00.1530.54590.01144.450216666666667200580933.0461
820Aq0weZZ0YzbrNlsA8UV3P2329410701997-01-050.570.5967-8.59110.02710.01570.0003310.1440.211140.84344.9017833333333331997509.0121
830Au3WxSl0fm6IvSZHCShjZ2215718701998-01-010.5040.9234-6.53300.04150.01410.01750.1510.643139.0842.6197833333333334199829108.0401
840AwC0oZxr25SOcThearmN42830038302005-04-290.9270.7057-6.2710.05770.01350.00.07550.74999.02445.006383333333333200520468.0351
850B1uuoyiPSkV2f8cAw473e023077301946-01-010.2670.1936-17.21410.07810.9910.880.190.34679.24413.84621666666666719461805.0295
860B1uuoyiPSkV2f8cAw473e023077301946-01-010.2670.1936-17.21410.07810.9910.880.190.34679.24413.846216666666667194612.085
870B1uuoyiPSkV2f8cAw473e023077301946-01-010.2670.1936-17.21410.07810.9910.880.190.34679.24413.84621666666666719461.055
880B1uuoyiPSkV2f8cAw473e023077301946-01-010.2670.1936-17.21410.07810.9910.880.190.34679.24413.84621666666666719462671.0475
890B1uuoyiPSkV2f8cAw473e023077301946-01-010.2670.1936-17.21410.07810.9910.880.190.34679.24413.846216666666667194644235.0435
900BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.680.8689-2.45400.4130.3480.00.2020.819173.87444.898033333333333200952237.0503
910BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.680.8689-2.45400.4130.3480.00.2020.819173.87444.898033333333333200922905.0373
920BleuvVMt39PmQaukAdEcv37228053020100.4150.5045-8.65810.02920.6221.76e-050.120.383156.07443.800883333333333201038499.0421
930Bn854XtnJv15m9f1ZvWbe4419401302011-05-310.490.886-4.64200.04470.2990.00.0990.619160.01543.23355201136693.0491
940BsQm2iFzKTzAemliJtu8x3920123102016-11-250.560.7914-6.32310.03030.00540.00.08580.605129.99243.35385201636719.0451
950C2VNcmHdRRvQtetkaZVn532311440019900.5990.3951-14.30700.02580.6683.77e-050.1160.353102.0345.190666666666667199077205.0531
960C2XO15BhHDglDB724brOi43316627019980.840.54110-12.03500.210.09511.2e-050.1060.68180.98845.277116666666666199870621.0461
970CAinfqSGq1peA63s4gNam2118793301988-12-310.6330.29810-10.69310.03250.8890.00.08640.591116.33533.132216666666667198829541.0431
980CBj18X3J7MabwDqKeMxuo47239373019990.3840.7680-6.10110.1210.1240.00.06820.74119.97343.98955199935719.0471
990CLO6b6Ohrnmp9fDLvYj3U25162987020030.2750.348-13.43500.03390.9820.00.2210.27291.16542.7164520033095.0392
1000CMxaedU31OgxQvrA1VE1J5422586702019-10-250.8620.6153-5.01900.2280.2390.00.07970.413114.95943.76445201935006.0492
Rows: 1-100 | Columns: 25

Define the list of predictors and the response, normalize the predictors with min-max scaling, and then save the final dataset to the database.

In [28]:
# define predictors and response
predictors = ['duration_minute', 
              'release_year', 
              'danceability', 
              'energy', 
              'loudness',
              'speechiness', 
              'acousticness', 
              'instrumentalness', 
              'liveness', 
              'valence', 
              'artists_followers', 
              'artist_popularity', 
              'nb_singers']
response = 'popularity'

# normalize the features
polish_tracks.normalize(method = "minmax",
                        columns = predictors)

# save the final dataset to the database
polish_tracks.to_db('"spotify"."polish_tracks_data_final"', relation_type = "table")
Out[28]:
Columns (name - type):
  • id - Varchar(44)
  • (unnamed) - Varchar(96)
  • popularity - Int
  • duration_ms - Int
  • explicit - Int
  • (unnamed) - Varchar(100)
  • (unnamed) - Varchar(156)
  • release_date - Varchar(20)
  • danceability - Float
  • energy - Float
  • key - Int
  • loudness - Float
  • mode - Int
  • speechiness - Float
  • acousticness - Float
  • instrumentalness - Float
  • liveness - Float
  • valence - Float
  • tempo - Numeric(9,4)
  • time_signature - Int
  • duration_minute - Float
  • release_year - Float
  • artists_followers - Float
  • artist_popularity - Float
  • nb_singers - Float
10004Uy71ku11n3LMpuyf593425890702012-01-010.6173245614035090.59936138807723660.79767505538545810.0024051601618020.1778894472361810.001514841351074720.0612919896640830.365659943567771140.05640.168255848386141550.8928571428571430.0542600762282040.50.0
2008UO76dicJJrUb5g6WID042216813020010.6765350877192980.88653593196172390.92579593259154200.0236142997704170.1648241206030150.00.0694573643410850.74187480405476119.98140.137093574178264730.7619047619047620.4712643678160920.6111111111111110.0
300D7fT49Q54HNiaaLY32UA4119623102017-06-070.5274122807017540.67266118424355610.90468722774002410.0980649393243690.1989949748743720.00.0997416020671830.120075242972097120.06150.121856677524429970.9523809523809520.7542190610525890.8194444444444440.222222222222222
400De7REGurYkmfrKAh6CU13640086701992-10-300.1732456140350880.102328523661777110.46028427052995800.0259101344703180.5125628140703520.7082906857727740.0550904392764860.001881074302435114.59230.2733491264435890.6547619047619050.1168632393985770.5555555555555560.0
500KCwnrvIXX8GRU3ZMOIBW5520782102019-12-060.8388157894736840.69274332017953460.87660866751300600.043074231988630.172864321608040.001019447287615150.072971576227390.596614066255617123.98540.130436778205507840.9761904761904760.2340666046437180.750.0
600g7zzRDf2DOf4fKYzFTUA4619285702018-08-240.7565789473684210.7981745338434210.86585517636223310.3374877008855360.5698492462311560.00.0784496124031010.726199184867802112.34340.119358898430559660.9642857142857140.2814441343297020.5972222222222220.0
700i4Q7OfN0mGnUyuyQcAdZ29792401978-01-010.56250.67466939783715470.84058945062604200.0780583797966550.6321608040201010.00.2082687338501290.93729752325216898.84240.0490798045602605860.4880952380952384.0014805478e-050.0555555555555560.111111111111111
800i4Q7OfN0mGnUyuyQcAdZ29792401978-01-010.56250.67466939783715470.84058945062604200.0780583797966550.6321608040201010.00.2082687338501290.93729752325216898.84240.0490798045602605860.4880952380952380.0505987215269650.5833333333333330.111111111111111
900l8ILQ8qgsew789dGn1Gh732624001963-03-250.5570175438596490.60136960167083420.83202648545042700.0066688531759050.5115577889447249.71340839303992e-060.1452196382428940.782631413940851102.42840.21810260586319220.309523809523810.0069525724518070.1250.0
1000vlVv7WLZjyRS783REKmn24243107019980.9024122807017540.50597945597493710.82550469220620810.3812178856455670.0521608040201010.00.1979328165374680.35938969589298893.0540.156559076103050030.7261904761904760.4121224853195680.5138888888888890.0
1100wJpDUdKoWQ2ibqTpJh1Y3221165301998-08-300.813596491228070.92469199024008260.83531227441316300.0106045698043070.3155778894472360.0004278403275332650.2165374677002580.989549587208695130.10340.133273615635179140.7261904761904760.0798895591368810.3750.0
12010cqDI9IKD8f3Ji8TnT9h1419121301967-04-140.4956140350877190.415609844263036100.82129788664028110.0161801683612110.7879396984924620.00.0703875968992250.52241613543735135.1540.118141841871483560.3571428571428570.137090723567720.4861111111111110.0
130123TtFcZ7pZIss52i7nAK2317416001995-02-270.6293859649122810.62747637838760500.6992009558658810.0127910790423090.0148743718592960.000658137154554760.0376227390180880.746054969171282126.6340.105517471128220310.690476190476190.0113141862489120.3888888888888890.0
14015JvDPIBquLQTX6VZPwB2114221101960-09-070.2445175438596490.29310881505356900.72740397779603210.0122444517328090.6894472361809056.86796315250768e-060.6310077519379840.405371512174731112.21430.081865561148948770.2738095238095240.12187509378470.4722222222222220.0
1501FN979nyz1gmdsfWZv5OL4518229301995-08-310.5449561403508770.68671867939874120.85694371841784310.0029517874713020.0624120603015086.01842374616172e-050.2051679586563310.36461490228864112.02940.11153834764583950.690476190476190.1670318017666540.5277777777777780.0
1601LN8YM6L9V2a135AnhZqf5716833302020-12-050.6118421052631580.45878643652538970.75436238269484500.0381545862031270.1175879396984923.57215967246674e-050.1369509043927650.247570279026022144.04640.10120373112229790.9880952380952380.0884327201064390.6388888888888890.111111111111111
1701LN8YM6L9V2a135AnhZqf5716833302020-12-050.6118421052631580.45878643652538970.75436238269484500.0381545862031270.1175879396984923.57215967246674e-050.1369509043927650.247570279026022144.04640.10120373112229790.9880952380952380.0094935125996620.8750.111111111111111
1801j7wulLc7rJVk0STUPEKu520016001967-03-060.3859649122807020.24290347521362440.73910337789062300.0090740133377060.9618090452261310.0005179119754350050.1472868217054260.287281847632982132.140.124765324252294930.3571428571428570.0164460850514690.2083333333333330.0
1901lYEV7KZT34GOAW2MIgHF3545830701969-06-110.2872807017543860.54915604823729130.82137256366216110.0381545862031270.5356783919597990.002088024564994880.2072351421188630.185912843557321140.60330.31587207580692920.3809523809523810.8718725928593580.5833333333333330.0
2001uK4bh8p175iZj7uSkp5k45204893020000.6052631578947370.53309033948850820.80491872650785400.0059035749426040.4221105527638192.55885363357216e-060.0956072351421190.34266903542689996.0740.128269173822919750.750.7723257605313970.7361111111111110.0
21025zlWO2FRZZrFhRw5kqa94019189302013-11-150.6644736842105260.97891375726722350.93537948373285500.1330490871323930.0012060301507543.87922210849539e-060.0935400516795870.186957884836451110.04340.118645247260882440.9047619047619050.2961195642387680.5833333333333330.0
22028DQ7NS44vnLRaoon0mwg18232640019870.3399122807017540.96987679609603320.94797500809001110.1494479064174050.0012060301507540.0003346980552712380.0560206718346250.435677709269516119.50440.148810334616523540.5952380952380950.2363674559587050.5277777777777780.0
2302GvdetV646M4C1Ip8CKg5920950701978-07-220.5197368421052630.23587472763603150.67774375824558800.0499617360883350.5326633165829150.00080143295803480.1968992248062020.300867384261678123.0940.131684927450399750.4880952380952380.1188239648670010.3333333333333330.0
2402IpFuItRz1yXbYAtHsXJs4021843602000-11-200.5241228070175440.66563243666596450.86299255719015310.0331256149557230.227135678391962.97850562947799e-050.0760723514211890.320723168565158112.77140.138295084394432920.750.1332793133459380.5416666666666670.0
2502JLJi7sryFu44hhNWOv1c3025400012010-09-240.8541666666666670.70278438814752300.84382545490752510.4118290149775880.2231155778894470.00.0486821705426360.45239836973560590.01740.16462318625999410.8690476190476190.1061392715304660.6111111111111110.222222222222222
2602RqhFaAYzWScfVzKfTZ7L5320278402020-01-070.6052631578947370.71081724252191510.91133348268737700.1778725265114250.0372864321608040.00.019224806201550.750235134287804103.81140.126707876813740020.9880952380952380.5007452757520280.7638888888888890.111111111111111
2702hewGMWcG5Vp8nwhcmTA019219267019940.5405701754385960.79616632024982240.74472904687227710.0837433038154590.0083417085427140.00.1989664082687340.14306615111296993.82440.13891027539236010.6785714285714290.481248061782860.5833333333333330.0
2802yg6Eksd8zZpYxsT1Lpsp2910421301999-01-010.4813596491228070.82026488337299670.8522390660393810.0241609270799170.0065326633165831.40225179119754e-060.0443410852713180.923711986623472184.25440.0537355641101569460.7380952380952380.9336454488160620.6527777777777780.0
2903UV1fQG0MIm4CPszfttTJ3518383402006-06-260.6589912280701750.65358315510437740.90620566051825900.0301738274844210.0293467336683420.00.066873385012920.495245062179956125.99140.112679153094462540.8214285714285710.5540950151556080.6111111111111110.0
3004GlMx0s7MvdMr2MEXX8s83425584002008-04-180.5186403508771930.87247843680653910.74833843626316210.014868262818410.0002010050251260.1637666325486180.0469250645994830.876685129062598155.97540.165985342019543990.8452380952380950.576763402458910.5833333333333330.0
3104PxH7CFGAaAvo6j0zZAOr4222940702014-11-070.7061403508771930.77106365032984910.90394045752122100.0473379250027330.028944723618090.00.2930232558139530.525551259274741100.03140.146416938110749180.9166666666666670.158788751838180.6250.222222222222222
3204PxH7CFGAaAvo6j0zZAOr4222940702014-11-070.7061403508771930.77106365032984910.90394045752122100.0473379250027330.028944723618090.00.2930232558139530.525551259274741100.03140.146416938110749180.9166666666666670.6559126876944470.6944444444444440.222222222222222
3304QM4iqGabg7QlphUoWiMB2626740801995-05-080.5833333333333330.87047022321294110.88079058073830700.0465726467694330.1206030150753777.05220061412487e-050.3095607235142120.6018392726512791.5740.17454915605567070.690476190476190.5241139221511960.6527777777777780.0
3404izaHyfYxsE3AQ68kgB4I2414582702000-04-200.7247807017543860.7700595435330590.88036741094765100.1658467257024160.1788944723618090.00.073281653746770.620650015675619131.90940.084542493337281620.750.0080929944079310.3333333333333330.0
3504kSSveBKXiNeRaOXu2iog4322993301999-06-050.6710526315789470.43669608699581300.77123938963980800.014212310047010.2994974874371860.00.1700258397932820.197408297627756105.42140.14680633698549010.7380952380952380.5007452757520280.7638888888888890.111111111111111
3604wTGEh2gJTG72yUoizwFV10205693019800.4550438596491230.85340040766736100.86261917208075110.0345468459604240.0001005025125630.003316274309109520.3064599483204130.493154979621695112.03740.128861415457506650.5119047619047620.576763402458910.5833333333333330.0
37053zfKbxSoqRm2RCdJdQMS2820140001997-01-010.3947368421052630.95883162133124520.89176810295472110.0264567617798190.0010050251256280.02067553735926310.0976744186046510.590343818580834166.19340.125683298785904650.7142857142857140.1343697167952140.4305555555555560.0
38053zgCcuWJ9x4dJJgaF4VS10509704019820.1359649122807020.19972688295127100.60859283598436810.0159615174374110.788944723618090.4104401228249740.1989664082687340.011495454070436135.56840.35392137992300860.5357142857142860.0922041155227430.4027777777777780.0
39054VUAvYt59A1yGSrBAgac2220321301976-02-120.5230263157894740.48690142683575850.79473775919149710.0306111293320210.7336683417085430.02773797338792220.1121447028423770.818162817431289147.66840.127025466390287220.4642857142857140.380440763082340.5833333333333330.0
4005IAxc7j9T9KcEu0lAgbfB4621601302016-01-030.7894736842105260.928708417427278110.95745898986881710.0792609598775550.1256281407035180.00.1906976744186050.982234298254781133.93340.136501332543677830.940476190476190.1070396046537220.5833333333333330.0
4105JJTD8KdxbZsMnOn6imMv42230853020000.6622807017543860.91063449508489770.92360540661638400.1264895594183890.0795979899497490.000306038894575230.1028423772609820.84010868429303168.03240.147487414865265040.750.1965627282094370.5833333333333330.0
4205RcrI18Qi65dwc3vegSNP4321600002017-10-270.6337719298245610.88251950477452800.90488636646503910.1089974855143760.0876381909547740.00.3002583979328170.63841571742083879.99440.13649170861711580.9523809523809520.451627102027750.6527777777777780.111111111111111
4305hJSmfNiPXSzwSLeG1Rfb3923250702016-10-210.4166666666666670.42565091223102560.86620366913100800.0115884989614080.8261306532663320.004083930399181170.1018087855297160.47643431915560799.84240.148711874444773460.940476190476190.0618028670608130.50.0
4405hk8ivnCt0HP3aI1uWJfi4329506702015-03-060.5899122807017540.76303079595545820.81930649939013810.0178200502897120.5396984924623120.00.3188630490956070.684397533702581106.97240.195025170269469940.9285714285714290.3245800946350150.5555555555555560.0
4505v1D17vjtCNRyalBVGbKU3448116002007-06-100.4078947368421050.785121145485034110.77233465262738700.0405597463649280.8331658291457290.05097236438075740.9793281653746770.337443829031247108.14240.33279019840094760.8333333333333330.4488560767483970.5972222222222220.0
4605yf5sbF36dhwGRfj18sg52722564702007-11-190.9342105263157890.69073510658593670.87357180195653810.1888050727014320.1095477386934670.00.2485788113695090.73037934998432492.97140.14363340242819070.8333333333333330.2047457559296940.4861111111111110.0
47062pHiX7lpKRDmFvmJor3o38182813020060.502192982456140.99397535921920720.90543399795882810.0882256477533620.0012060301507541.19754350051177e-060.3250645994832040.398056223220817146.55340.1119233047083210.8214285714285710.4288586777107530.5972222222222220.0
48063FiQaY2FuZe3a44rZuSX3722118702012-08-080.7807017543859650.7057967085379200.90127697707415410.0228490215371160.1949748743718591.69907881269191e-060.1648578811369510.72933430870519495.00840.140331655315368680.8928571428571430.112141492352170.4444444444444440.0
49063ZobFSzYA0O7fmfi4oik4221268902002-03-120.7752192982456140.952806980550451100.91952306275359110.029080572865420.1527638190954770.00.3571059431524550.84951405580520499.98640.13404056855196920.7738095238095240.5156407870912240.6805555555555560.0
5006AIE9kwssJ3cmPbMuHobb3032313302007-05-110.0844298245614040.9176632426624970.89716974087073410.0380452607412270.0023115577889457.93244626407369e-050.114211886304910.38447068659212166.2140.215802487414865270.8333333333333330.3440172863959670.5555555555555560.0
5106K6svVGbiQILn9BAzZNgo3737412012013-12-130.6074561403508770.76303079595545800.80349986309212700.0507270143216360.0149748743718591.45342886386899e-060.1100775193798450.335353746472986133.99240.25354826769321880.9047619047619050.0313315926892950.5972222222222220.111111111111111
5206LDaZRAWOc0fNqkdOQo9S1877853019980.7697368421052630.454770009338193100.75232121076344810.013993659123210.3015075376884420.8147389969293760.0559173126614990.471209112759954147.90140.034221202250518210.7261904761904760.7064613907145640.6388888888888890.0
5306SnvuakcuOcj3odFZpIpJ1813763901969-09-060.6425438596491230.540119087066120.78560227018146500.0085273860282060.0726633165829150.00.119379844961240.80457728080259294.85840.078480900207284570.3809523809523810.0715164610906040.4027777777777780.0
5406V1fPvTfXdIBMUItnLmQ71411949301967-10-160.4133771929824560.33226898012872650.63410748512682600.013884333661310.7658291457286430.0005414534288638690.092506459948320.32699341623994179.07830.065047379330766960.3571428571428570.1085101487550390.4861111111111110.0
5506kPUkFJv1RfEVjgEF3oRz2325022701997-01-010.7576754385964910.73591991244188700.77696462798396910.0132283808899090.0088442211055280.02241555783009210.0894056847545220.863099592433901129.87940.161830026650873570.7142857142857140.3192381181036980.6527777777777780.0
5607Iyx8PziAv997SrJanQgl11279013019810.3300438596491230.138476368346537100.52629875787220300.0095113151853070.3557788944723628.86386898669396e-060.213436692506460.06907722855052897.96440.18314036126739710.5238095238095240.2094074807678840.5833333333333330.0
5707QfWWYlJsbXiEbbKSvqcd3432099312010-04-260.501096491228070.957827514534446100.88616732631369300.2926642615065050.0139698492462317.21596724667349e-060.3633074935400520.66454174939910184.01140.214218241042345270.8690476190476190.1001370507087620.7083333333333330.222222222222222
5807QfWWYlJsbXiEbbKSvqcd3432099312010-04-260.501096491228070.957827514534446100.88616732631369300.2926642615065050.0139698492462317.21596724667349e-060.3633074935400520.66454174939910184.01140.214218241042345270.8690476190476190.073817312405590.5416666666666670.222222222222222
5907rj0QlIw4hRcvhIJGY2Lf612317301965-10-120.5416666666666670.18767760138968400.65150723122495200.0275500163988190.972864321608048.84339815762538e-060.119379844961240.28205664123732967.15240.067771690849866740.3333333333333330.0312115482728610.4305555555555560.0
600804d5chEAcksTIZ7IVh6O2920000001967-10-160.2960526315789470.43870430058941190.69397356433425400.0253635071608180.8793969849246235.03582395087001e-050.5090439276485790.339533911589508101.97540.124646875925377560.3571428571428570.1085101487550390.4861111111111110.0
6108D5BLpRFI6PT2XYuq7eDg2827326701996-01-010.6973684210526320.94075769898886450.86530754486844400.0076527823330050.5095477386934675.496417604913e-060.2434108527131780.985369422092173137.92140.17888658572697660.7023809523809520.1283274811680320.50.0
6208D6EqghkRLD6iFHKKWJ4Y5521312012020-01-270.7653508771929820.6877227861955400.80867746994249900.0367333551984260.3015075376884420.02691914022517910.1080103359173130.46389382380604125.01840.13435963873260290.9880952380952380.4028790652541440.8611111111111110.111111111111111
6308ER472xnuMlbtkKSh3PB41116686701975-09-070.3037280701754390.289092387866373100.66445124835088210.071498852082650.9346733668341710.00.0966408268733850.402236388337339178.15940.100118448326917380.4523809523809520.12187509378470.4722222222222220.0
6408ULDx4KeBpXe7OHEWjgVP4527729302014-04-160.68750.93172073781767430.91807930699723710.0174920739040120.0147738693467340.001187308085977480.3250645994832040.970738844184345153.06840.181867041753035220.9166666666666670.0306413372947990.5138888888888890.0
6508uPk6Ge77W31WGGYNEPHA215636901978-01-010.4484649122807020.54714783464369350.80795559206432210.3189023723625230.7849246231155781.31013306038895e-060.0904392764857880.73664959765910892.54440.092346757477050640.4880952380952380.0505987215269650.5833333333333330.111111111111111
6608uPk6Ge77W31WGGYNEPHA215636901978-01-010.4484649122807020.54714783464369350.80795559206432210.3189023723625230.7849246231155781.31013306038895e-060.0904392764857880.73664959765910892.54440.092346757477050640.4880952380952380.00.0138888888888890.111111111111111
6708xY8SSfwHc7SIXWroGW9j16167360019940.2894736842105260.24491168880722220.67189405819829200.0172734229802120.6603015075376880.00.0945736434108530.584073570906051124.35440.100483417234231570.6785714285714290.3626241709432490.750.0
68095PNli2gmoYwlvgotxxEs28269067019950.5657894736842110.61643120362281770.90000746770218800.0051382967093040.0551758793969856.16171954964176e-060.0832041343669250.120075242972097119.80440.175777317145395320.690476190476190.7723257605313970.7361111111111110.0
6909FEC2H0Ls5bXMSFj2bLzL5019846002018-10-240.7105263157894740.80821560181140900.82884026585019810.213949928938450.2994974874371860.0001279426816786080.1038759689922480.925802069181733169.84840.123506810778797740.9642857142857140.749347258485640.9027777777777780.222222222222222
7009GKW7x0iMuxzzM7IeQ01g715352101964-02-210.5054824561403510.20775973732566200.70716650486645310.0115884989614080.7396984924623120.00.1772609819121450.54018183718256982.01840.090238377257921230.3214285714285710.0305813150865820.4305555555555560.0
7109PwmLzRfWj5hkmIiw1a9X2125131601969-09-070.2883771929824560.3844825335622700.79834714858238110.0087460369520060.7175879396984928.08597748208802e-060.3746770025839790.36356986100951119.00240.162636215575954980.3809523809523810.12187509378470.4722222222222220.0
7209d4etyrEWf5BguIIUfRW91221252301974-08-160.2719298245614040.42866323262142290.81141562741144500.0095113151853070.4361809045226130.00.1266149870801030.29564217786602683.69540.133917678412792420.440476190476190.1296579734501770.50.0
730ADOK8ffxUlSDu51nIu5KU23274533019990.6600877192982460.29812934903756410.74891096009757800.0045916693998030.4120603015075380.00.3591731266149870.545407043578221103.03740.17982380811371040.7380952380952380.612246531216550.6250.0
740AGuYcLlnbR7AOQSbTGzIz2923296002000-12-310.4243421052631580.57325461136046420.80559081970477700.0076527823330050.0803015075376880.0001136131013306040.2847545219638240.554812415090396201.07740.14904723127035830.750.1501155427508180.5555555555555560.0
750AHy3nZXS725AAHDC0qjPR2220233601966-09-280.4627192982456140.36239218403269490.75710054016379210.0157428665136110.8824120603015080.00.3023255813953490.353119448218205117.99940.126376221498371330.3452380952380950.153516801216450.4861111111111110.0
760AJJJnR5VwJfeYGvnyG74v719153001967-03-290.627192982456140.30214577622475970.67707166504866500.0042636930141030.7778894472361810.002221084953940630.0894056847545220.389695892987773104.85940.118376517619188630.3571428571428570.6936366455588570.6388888888888890.0
770ATVUbPlX65HB5H1WbIugt5418135612020-12-040.9111842105263160.90059342711690870.78321260548129310.2893844976495030.1859296482412060.00.0643927648578810.895495872086947154.91240.110844684631329580.9880952380952380.2770825205325970.750.0
780AUeyoLQYwYUDsQRJZKvZd22248907019800.3991228070175440.8252854173569940.88131331989146910.0219744178419150.0005025125628144.4114636642784e-050.1214470284237730.409551677291253195.9940.160852827953805150.5119047619047620.576763402458910.5833333333333330.0
790AXtdoJaVk6W8pcYstecgw33193227020000.5866228070175440.8072114950146120.85333432902695810.0228490215371160.28.20880245649949e-050.0050645994832040.717838854634758120.17240.119632810186556110.750.4488560767483970.5972222222222220.0
800AYmWvKRGtgePrpo3NdYXh4821853002019-02-010.6370614035087720.38147021317187340.75286884225723710.0072154804854050.940703517587940.4472876151484140.0945736434108530.215173999372975112.99640.13836467278649690.9761904761904760.9018937006692480.8472222222222220.0
810ApseFDjbipbKK0dBD4Viz3526701312005-11-150.8519736842105260.69776385416352990.88121375052896200.2959440253635070.0779899497487440.00.1421188630490960.53704671334517790.01140.174256736748593430.809523809523810.8096195592369180.6388888888888890.0
820Aq0weZZ0YzbrNlsA8UV3P2329410701997-01-050.5592105263157890.59634906768683970.81420356956164610.0041543675522030.0157788944723620.0003387922210849540.13281653746770.188002926115582140.84340.194314480307965640.7142857142857140.0050818802957090.1666666666666670.0
830Au3WxSl0fm6IvSZHCShjZ2215718701998-01-010.4868421052631580.92469199024008240.86543200657157800.0198972340658140.0141708542713570.01791197543500510.1400516795865630.639460758699969139.0840.092952324548415750.7261904761904760.2911777357622320.5555555555555560.0
840AwC0oZxr25SOcThearmN42830038302005-04-290.9506578947368420.7057967085379270.87197869215642310.0376079588936260.013567839195980.00.0620155038759690.75023513428780499.02440.198960615931299960.809523809523810.2047457559296940.4861111111111110.0
850B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.018046677270590.4027777777777780.444444444444444
860B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.0001100407150650.1111111111111110.444444444444444
870B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.00.0694444444444440.444444444444444
880B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.0267098826565830.6527777777777780.444444444444444
890B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.442503726378760.5972222222222220.444444444444444
900BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.6798245614035090.86946611641614290.96696786398825100.4260413250245980.3497487437185930.00.1927648578811370.823388023826941173.87440.194147912348238080.8571428571428570.5225533447375530.6944444444444440.222222222222222
910BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.6798245614035090.86946611641614290.96696786398825100.4260413250245980.3497487437185930.00.1927648578811370.823388023826941173.87440.194147912348238080.8571428571428570.2291247761671820.5138888888888890.222222222222222
920BleuvVMt39PmQaukAdEcv37228053020100.3892543859649120.5039712423813450.81253578273965110.0064502022521050.6251256281407041.8014329580348e-050.1080103359173130.367750026126032156.07440.145414569144210830.8690476190476190.385122495323270.5833333333333330.0
930Bn854XtnJv15m9f1ZvWbe4419401302011-05-310.4714912280701750.88151539797772960.91250342269683600.0233956488466160.3005025125628140.00.0863049095607240.614379768000836160.01540.120214687592537760.8809523809523810.367055810649940.6805555555555560.0
940BsQm2iFzKTzAemliJtu8x3920123102016-11-250.5482456140350880.79214989306262640.87065939810320410.0076527823330050.0054271356783920.00.072661498708010.599749190093009129.99240.125558187740598150.940476190476190.3673159068855480.6250.0
950C2VNcmHdRRvQtetkaZVn532311440019900.5910087719298250.39452360153025910.67191895053891900.0027331365475020.6713567839195983.85875127942682e-050.1038759689922480.336398787752116102.0340.20714613562333430.6309523809523810.7723257605313970.7361111111111110.0
960C2XO15BhHDglDB724brOi43316627019980.8552631578947370.541123193862899100.72847434844298400.2041106373674430.0955778894472361.22824974411464e-050.0935400516795870.67917232730692980.98840.21098608232158720.7261904761904760.7064613907145640.6388888888888890.0
970CAinfqSGq1peA63s4gNam2118793301988-12-310.6282894736842110.297125242240765100.76187986956413510.0100579424948070.8934673366834170.00.073281653746770.585118612185181116.33530.115713651169677230.6071428571428570.2955093384552280.5972222222222220.0
980CBj18X3J7MabwDqKeMxuo47239373019990.3552631578947370.76905543673625100.87618549772235110.1068109762763750.1246231155778890.00.0544702842377260.74082976277563119.97340.153794788273615650.7380952380952380.3573122055160410.6527777777777780.0
990CLO6b6Ohrnmp9fDLvYj3U25162987020030.2357456140350880.33929772770631980.69362507156547900.0115884989614080.9869346733668340.00.2124031007751940.25175044414254491.16540.097246076399170860.7857142857142860.0309514520372540.5416666666666670.111111111111111
1000CMxaedU31OgxQvrA1VE1J5422586702019-10-250.8793859649122810.61542709682601830.90311901028053700.2237892205094570.2402010050251260.00.0663565891472870.399101264499948114.95940.14379626887770210.9761904761904760.3501795664395830.6805555555555560.111111111111111
Rows: 1-100 | Columns: 25

Machine Learning

We can use AutoML to easily get a well-performing model.

In [29]:
# define a random seed so models tested by AutoML produce consistent results
vp.set_option("random_state", 2)

AutoML tests several machine learning models, tunes each one with a grid search, and keeps the best-performing one.

In [30]:
from verticapy.learn.delphi import AutoML
# define the model
auto_model = AutoML('spotify.automl_spotify_polish',
                    estimator = "native",
                    preprocess_data = False,
                    stepwise = False,
                    cv = 2)
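
The cv = 2 argument asks for two-fold cross-validation: each candidate is scored twice, each time on a different held-out half of the data. The following is a minimal pure-Python sketch of that idea, not VerticaPy's implementation. The helper names (k_fold_indices, cross_validate, baseline_mae) and the toy data are hypothetical, and using the population standard deviation for the spread across folds is an assumption.

```python
import statistics

def k_fold_indices(n_rows, k):
    """Split row indices 0..n_rows-1 into k contiguous folds of near-equal size."""
    fold_sizes = [n_rows // k + (1 if i < n_rows % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def cross_validate(n_rows, k, score_fn):
    """Call score_fn(train_idx, test_idx) once per fold; return (mean, std) of the scores."""
    folds = k_fold_indices(n_rows, k)
    scores = []
    for i, test_idx in enumerate(folds):
        # every row that is not in the held-out fold is used for training
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        scores.append(score_fn(train_idx, test_idx))
    return statistics.mean(scores), statistics.pstdev(scores)

# Hypothetical toy target and scoring function: the mean absolute error of a
# baseline that always predicts the mean of the training rows.
toy_target = [10, 20, 30, 40, 50, 60]

def baseline_mae(train_idx, test_idx):
    train_mean = statistics.mean([toy_target[j] for j in train_idx])
    return statistics.mean([abs(toy_target[j] - train_mean) for j in test_idx])

avg_score, score_std = cross_validate(len(toy_target), k=2, score_fn=baseline_mae)
print(avg_score, score_std)
```

With only two folds, each model is trained on half of the data at a time, which keeps the search fast but makes the per-fold scores noisier than a larger number of folds would.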

Train the model.

In [31]:
%%time
auto_model.fit('spotify.polish_tracks_data_final', 
               predictors, 
               response) 
Starting AutoML

Testing Model - LinearRegression

Model: LinearRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'newton'}; Test_score: 8.066647640522543; Train_score: 8.118985304784605; Time: 0.20426690578460693;
Model: LinearRegression; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'bfgs'}; Test_score: 8.06657408402635; Train_score: 8.118986556025963; Time: 0.7867480516433716;

Grid Search Selected Model
LinearRegression; Parameters: {'solver': 'bfgs', 'penalty': 'none', 'max_iter': 100, 'tol': 1e-06}; Test_score: 8.06657408402635; Train_score: 8.118986556025963; Time: 0.7867480516433716;

Testing Model - ElasticNet

Model: ElasticNet; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'cgd', 'C': 1.0, 'l1_ratio': 0.5}; Test_score: 14.822426878049042; Train_score: 15.035727014744669; Time: 0.27442634105682373;

Grid Search Selected Model
ElasticNet; Parameters: {'solver': 'cgd', 'penalty': 'enet', 'max_iter': 100, 'l1_ratio': 0.5, 'C': 1.0, 'tol': 1e-06}; Test_score: 14.822426878049042; Train_score: 15.035727014744669; Time: 0.27442634105682373;

Testing Model - Ridge

Model: Ridge; Parameters: {'tol': 1e-06, 'max_iter': 100, 'C': 1.0}; Test_score: 8.064233984706952; Train_score: 8.121069572782197; Time: 0.20085954666137695;

Grid Search Selected Model
Ridge; Parameters: {'solver': 'newton', 'penalty': 'l2', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Test_score: 8.064233984706952; Train_score: 8.121069572782197; Time: 0.20085954666137695;

Testing Model - Lasso

Model: Lasso; Parameters: {'tol': 1e-06, 'max_iter': 100, 'solver': 'cgd', 'C': 1.0}; Test_score: 9.76754307664328; Train_score: 10.121086467655486; Time: 0.2544424533843994;

Grid Search Selected Model
Lasso; Parameters: {'solver': 'cgd', 'penalty': 'l1', 'max_iter': 100, 'C': 1.0, 'tol': 1e-06}; Test_score: 9.76754307664328; Train_score: 10.121086467655486; Time: 0.2544424533843994;

Testing Model - LinearSVR

Model: LinearSVR; Parameters: {'tol': 1e-06, 'fit_intercept': True, 'intercept_mode': 'regularized', 'max_iter': 100}; Test_score: 8.06441805263914; Train_score: 8.120720846232265; Time: 2.197774648666382;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'fit_intercept': True, 'intercept_mode': 'unregularized', 'max_iter': 100}; Test_score: 8.065481552212296; Train_score: 8.119616043007014; Time: 2.323397994041443;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'fit_intercept': True, 'intercept_mode': 'regularized', 'max_iter': 100}; Test_score: 8.06441805263914; Train_score: 8.120720846232265; Time: 2.182707905769348;
Model: LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'fit_intercept': True, 'intercept_mode': 'unregularized', 'max_iter': 100}; Test_score: 8.065481552212296; Train_score: 8.119616043007014; Time: 2.6499600410461426;

Grid Search Selected Model
LinearSVR; Parameters: {'tol': 1e-06, 'C': 1.0, 'max_iter': 100, 'fit_intercept': True, 'intercept_scaling': 1.0, 'intercept_mode': 'regularized', 'acceptable_error_margin': 0.1}; Test_score: 8.06441805263914; Train_score: 8.120720846232265; Time: 2.197774648666382;

Testing Model - RandomForestRegressor

Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 128, 'max_depth': 5, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.113663323363556; Train_score: 6.39265180458105; Time: 0.42195308208465576;
Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 1000, 'max_depth': 5, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.113663323363556; Train_score: 6.39265180458105; Time: 0.3885120153427124;
Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 128, 'max_depth': 5, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.181553675673483; Train_score: 6.5483753528126805; Time: 0.40996646881103516;
Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 64, 'max_depth': 4, 'min_samples_leaf': 1, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.266855044387853; Train_score: 6.809449561850816; Time: 0.3967854976654053;
Model: RandomForestRegressor; Parameters: {'max_features': 'max', 'max_leaf_nodes': 64, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.064327407880318; Train_score: 6.210967309985071; Time: 0.4239314794540405;

Grid Search Selected Model
RandomForestRegressor; Parameters: {'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 64, 'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Test_score: 7.064327407880318; Train_score: 6.210967309985071; Time: 0.4239314794540405;

Final Model

RandomForestRegressor; Best_Parameters: {'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 64, 'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}; Best_Test_score: 7.064327407880318; Train_score: 6.210967309985071; Time: 0.4239314794540405;


CPU times: user 1.51 s, sys: 113 ms, total: 1.62 s
Wall time: 33.3 s
Out[31]:
    model_type             avg_score           avg_train_score     avg_time             score_std             score_train_std
1   RandomForestRegressor  7.064327407880318   6.210967309985071   0.4239314794540405   0.021951610316419245  0.05495843581963396
2   RandomForestRegressor  7.113663323363556   6.39265180458105    0.42195308208465576  0.04420745773425167   0.05405953173125159
3   RandomForestRegressor  7.113663323363556   6.39265180458105    0.3885120153427124   0.04420745773425167   0.05405953173125159
4   RandomForestRegressor  7.181553675673483   6.5483753528126805  0.40996646881103516  0.0457345096188477    0.05011685394254436
5   RandomForestRegressor  7.266855044387853   6.809449561850816   0.3967854976654053   0.05954893368677364   0.20818778505776214
6   Ridge                  8.064233984706952   8.121069572782197   0.20085954666137695  0.15216905564050348   0.1513846089750132
7   LinearSVR              8.06441805263914    8.120720846232265   2.197774648666382    0.15018542501699342   0.1512890154897334
8   LinearSVR              8.06441805263914    8.120720846232265   2.182707905769348    0.15018542501699342   0.1512890154897334
9   LinearSVR              8.065481552212296   8.119616043007014   2.323397994041443    0.14813750528375574   0.15105738388572493
10  LinearSVR              8.065481552212296   8.119616043007014   2.6499600410461426   0.14813750528375574   0.15105738388572493
11  LinearRegression       8.06657408402635    8.118986556025963   0.7867480516433716   0.1451403695610597    0.15092076870560675
12  LinearRegression       8.066647640522543   8.118985304784605   0.20426690578460693  0.14523553927522023   0.15092178039328827
13  Lasso                  9.76754307664328    10.121086467655486  0.2544424533843994   0.1918344551389224    0.22492932355164136
14  ElasticNet             14.822426878049042  15.035727014744669  0.27442634105682373  0.00898900231672321   0.08108192620896942
Rows: 1-14 | Columns: 8
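
The final model in the log is the candidate with the lowest test score, which suggests the score is an error metric where lower is better (an inference from the output, not something the log states). Under that assumption, the selection can be sketched in plain Python using the test and train scores of each "Grid Search Selected Model" line, rounded to three decimals:

```python
# (model_type, test_score, train_score), copied from the
# "Grid Search Selected Model" lines of the AutoML log and rounded
selected = [
    ("LinearRegression", 8.067, 8.119),
    ("ElasticNet", 14.822, 15.036),
    ("Ridge", 8.064, 8.121),
    ("Lasso", 9.768, 10.121),
    ("LinearSVR", 8.064, 8.121),
    ("RandomForestRegressor", 7.064, 6.211),
]

# Assumption: lower score means a better model, so sort ascending by test score
ranked = sorted(selected, key=lambda row: row[1])
best_name, best_test, best_train = ranked[0]

# A test score noticeably above the train score can hint at overfitting
gap = best_test - best_train

for name, test_score, train_score in ranked:
    print(f"{name:<22} test={test_score:<7} train={train_score}")
print(f"best: {best_name}, test - train gap: {gap:.3f}")
```

The random forest is the only candidate whose train score is clearly lower than its test score; for the linear models the two are nearly equal. That gap may be worth keeping in mind when judging how well the model generalizes.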

The numbers are hard to grasp on their own, so let's plot the performance and efficiency of each model.

In [32]:
# visualize the performance of different models
auto_model.plot()
Out[32]:
<AxesSubplot:xlabel='time', ylabel='score'>

Extract the best model according to AutoML. From here, we can look at the model type and its hyperparameters.

In [33]:
# extract the model type and hyperparameters
best_model = auto_model.best_model_
bm_type = best_model.type
hyperparams = best_model.get_params()

print(bm_type)
print(hyperparams)
RandomForestRegressor
{'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 64, 'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 2, 'min_info_gain': 0.0, 'nbins': 32}
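
A hyperparameter dictionary like this one can be handed to a constructor as keyword arguments with Python's ** unpacking, and a modified copy can be built without changing the original. The sketch below uses a hypothetical stand-in class, ToyRegressor, so that it runs without a database connection; it is not VerticaPy's RandomForestRegressor.

```python
class ToyRegressor:
    """Hypothetical stand-in for a model class; not part of VerticaPy."""

    def __init__(self, name, n_estimators=10, max_depth=5, **other_params):
        self.name = name
        self.n_estimators = n_estimators
        self.max_depth = max_depth
        # any remaining keyword arguments are collected here
        self.other_params = other_params

# Values copied from the printed hyperparameters
hyperparams = {'n_estimators': 10, 'max_features': 'max', 'max_leaf_nodes': 64,
               'sample': 0.632, 'max_depth': 6, 'min_samples_leaf': 2,
               'min_info_gain': 0.0, 'nbins': 32}

# ** unpacks the dictionary into keyword arguments
same = ToyRegressor("toy_model", **hyperparams)

# Build a modified copy; the original dictionary stays untouched
tweaked_params = {**hyperparams, "n_estimators": 20}
tweaked = ToyRegressor("toy_model_20_trees", **tweaked_params)

print(same.n_estimators, same.max_depth)
print(tweaked.n_estimators, hyperparams["n_estimators"])
```

Copying the dictionary before changing a value keeps the AutoML result available for comparison if you later experiment with different settings.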

Thanks to AutoML, we know the best model type and its hyperparameters. Let's create a new model with this information in mind.

In [47]:
from verticapy.learn.ensemble import RandomForestRegressor

# define the model
rf_model = RandomForestRegressor('spotify.randomforest_spotify', **hyperparams)

# train the model
rf_model.fit(polish_tracks, predictors, response) 

# use the model to predict
rf_model.predict(polish_tracks, 
                 name = 'estimated_popularity')
Out[47]:
Columns (name, type):
id                    Varchar(44)
(name not shown)      Varchar(96)
popularity            Int
duration_ms           Int
explicit              Int
(name not shown)      Varchar(100)
(name not shown)      Varchar(156)
release_date          Varchar(20)
danceability          Float
energy                Float
key                   Int
loudness              Float
mode                  Int
speechiness           Float
acousticness          Float
instrumentalness      Float
liveness              Float
valence               Float
tempo                 Numeric(9,4)
time_signature        Int
duration_minute       Float
release_year          Float
artists_followers     Float
artist_popularity     Float
nb_singers            Float
estimated_popularity  Float
10004Uy71ku11n3LMpuyf593425890702012-01-010.6173245614035090.59936138807723660.79767505538545810.0024051601618020.1778894472361810.001514841351074720.0612919896640830.365659943567771140.05640.168255848386141550.8928571428571430.0542600762282040.50.040.4301647912923
2008UO76dicJJrUb5g6WID042216813020010.6765350877192980.88653593196172390.92579593259154200.0236142997704170.1648241206030150.00.0694573643410850.74187480405476119.98140.137093574178264730.7619047619047620.4712643678160920.6111111111111110.034.869262824777
300D7fT49Q54HNiaaLY32UA4119623102017-06-070.5274122807017540.67266118424355610.90468722774002410.0980649393243690.1989949748743720.00.0997416020671830.120075242972097120.06150.121856677524429970.9523809523809520.7542190610525890.8194444444444440.22222222222222245.0804559629553
400De7REGurYkmfrKAh6CU13640086701992-10-300.1732456140350880.102328523661777110.46028427052995800.0259101344703180.5125628140703520.7082906857727740.0550904392764860.001881074302435114.59230.2733491264435890.6547619047619050.1168632393985770.5555555555555560.030.6638002980626
500KCwnrvIXX8GRU3ZMOIBW5520782102019-12-060.8388157894736840.69274332017953460.87660866751300600.043074231988630.172864321608040.001019447287615150.072971576227390.596614066255617123.98540.130436778205507840.9761904761904760.2340666046437180.750.052.0762280321545
600g7zzRDf2DOf4fKYzFTUA4619285702018-08-240.7565789473684210.7981745338434210.86585517636223310.3374877008855360.5698492462311560.00.0784496124031010.726199184867802112.34340.119358898430559660.9642857142857140.2814441343297020.5972222222222220.045.0804559629553
700i4Q7OfN0mGnUyuyQcAdZ29792401978-01-010.56250.67466939783715470.84058945062604200.0780583797966550.6321608040201010.00.2082687338501290.93729752325216898.84240.0490798045602605860.4880952380952384.0014805478e-050.0555555555555560.1111111111111112.86773809523809
800i4Q7OfN0mGnUyuyQcAdZ29792401978-01-010.56250.67466939783715470.84058945062604200.0780583797966550.6321608040201010.00.2082687338501290.93729752325216898.84240.0490798045602605860.4880952380952380.0505987215269650.5833333333333330.11111111111111110.6410158019808
900l8ILQ8qgsew789dGn1Gh732624001963-03-250.5570175438596490.60136960167083420.83202648545042700.0066688531759050.5115577889447249.71340839303992e-060.1452196382428940.782631413940851102.42840.21810260586319220.309523809523810.0069525724518070.1250.07.30127856572452
1000vlVv7WLZjyRS783REKmn24243107019980.9024122807017540.50597945597493710.82550469220620810.3812178856455670.0521608040201010.00.1979328165374680.35938969589298893.0540.156559076103050030.7261904761904760.4121224853195680.5138888888888890.024.9533724588221
1100wJpDUdKoWQ2ibqTpJh1Y3221165301998-08-300.813596491228070.92469199024008260.83531227441316300.0106045698043070.3155778894472360.0004278403275332650.2165374677002580.989549587208695130.10340.133273615635179140.7261904761904760.0798895591368810.3750.024.9533724588221
12010cqDI9IKD8f3Ji8TnT9h1419121301967-04-140.4956140350877190.415609844263036100.82129788664028110.0161801683612110.7879396984924620.00.0703875968992250.52241613543735135.1540.118141841871483560.3571428571428570.137090723567720.4861111111111110.013.4855833789985
130123TtFcZ7pZIss52i7nAK2317416001995-02-270.6293859649122810.62747637838760500.6992009558658810.0127910790423090.0148743718592960.000658137154554760.0376227390180880.746054969171282126.6340.105517471128220310.690476190476190.0113141862489120.3888888888888890.023.1748101052984
14015JvDPIBquLQTX6VZPwB2114221101960-09-070.2445175438596490.29310881505356900.72740397779603210.0122444517328090.6894472361809056.86796315250768e-060.6310077519379840.405371512174731112.21430.081865561148948770.2738095238095240.12187509378470.4722222222222220.03.61489291707544
1501FN979nyz1gmdsfWZv5OL4518229301995-08-310.5449561403508770.68671867939874120.85694371841784310.0029517874713020.0624120603015086.01842374616172e-050.2051679586563310.36461490228864112.02940.11153834764583950.690476190476190.1670318017666540.5277777777777780.024.4145764534479
1601LN8YM6L9V2a135AnhZqf5716833302020-12-050.6118421052631580.45878643652538970.75436238269484500.0381545862031270.1175879396984923.57215967246674e-050.1369509043927650.247570279026022144.04640.10120373112229790.9880952380952380.0884327201064390.6388888888888890.11111111111111152.3733282506672
1701LN8YM6L9V2a135AnhZqf5716833302020-12-050.6118421052631580.45878643652538970.75436238269484500.0381545862031270.1175879396984923.57215967246674e-050.1369509043927650.247570279026022144.04640.10120373112229790.9880952380952380.0094935125996620.8750.11111111111111155.4733496073336
1801j7wulLc7rJVk0STUPEKu520016001967-03-060.3859649122807020.24290347521362440.73910337789062300.0090740133377060.9618090452261310.0005179119754350050.1472868217054260.287281847632982132.140.124765324252294930.3571428571428570.0164460850514690.2083333333333330.011.5052223490586
1901lYEV7KZT34GOAW2MIgHF3545830701969-06-110.2872807017543860.54915604823729130.82137256366216110.0381545862031270.5356783919597990.002088024564994880.2072351421188630.185912843557321140.60330.31587207580692920.3809523809523810.8718725928593580.5833333333333330.014.9059819728965
2001uK4bh8p175iZj7uSkp5k45204893020000.6052631578947370.53309033948850820.80491872650785400.0059035749426040.4221105527638192.55885363357216e-060.0956072351421190.34266903542689996.0740.128269173822919750.750.7723257605313970.7361111111111110.034.389324210261
21025zlWO2FRZZrFhRw5kqa94019189302013-11-150.6644736842105260.97891375726722350.93537948373285500.1330490871323930.0012060301507543.87922210849539e-060.0935400516795870.186957884836451110.04340.118645247260882440.9047619047619050.2961195642387680.5833333333333330.040.4301647912923
22028DQ7NS44vnLRaoon0mwg18232640019870.3399122807017540.96987679609603320.94797500809001110.1494479064174050.0012060301507540.0003346980552712380.0560206718346250.435677709269516119.50440.148810334616523540.5952380952380950.2363674559587050.5277777777777780.018.5862492866963
2302GvdetV646M4C1Ip8CKg5920950701978-07-220.5197368421052630.23587472763603150.67774375824558800.0499617360883350.5326633165829150.00080143295803480.1968992248062020.300867384261678123.0940.131684927450399750.4880952380952380.1188239648670010.3333333333333330.011.7498709352586
2402IpFuItRz1yXbYAtHsXJs4021843602000-11-200.5241228070175440.66563243666596450.86299255719015310.0331256149557230.227135678391962.97850562947799e-050.0760723514211890.320723168565158112.77140.138295084394432920.750.1332793133459380.5416666666666670.034.09439364668
2502JLJi7sryFu44hhNWOv1c3025400012010-09-240.8541666666666670.70278438814752300.84382545490752510.4118290149775880.2231155778894470.00.0486821705426360.45239836973560590.01740.16462318625999410.8690476190476190.1061392715304660.6111111111111110.22222222222222235.0806689337132
2602RqhFaAYzWScfVzKfTZ7L5320278402020-01-070.6052631578947370.71081724252191510.91133348268737700.1778725265114250.0372864321608040.00.019224806201550.750235134287804103.81140.126707876813740020.9880952380952380.5007452757520280.7638888888888890.11111111111111152.7615971869234
2702hewGMWcG5Vp8nwhcmTA019219267019940.5405701754385960.79616632024982240.74472904687227710.0837433038154590.0083417085427140.00.1989664082687340.14306615111296993.82440.13891027539236010.6785714285714290.481248061782860.5833333333333330.021.3180307423213
2802yg6Eksd8zZpYxsT1Lpsp2910421301999-01-010.4813596491228070.82026488337299670.8522390660393810.0241609270799170.0065326633165831.40225179119754e-060.0443410852713180.923711986623472184.25440.0537355641101569460.7380952380952380.9336454488160620.6527777777777780.025.7209218589767
2903UV1fQG0MIm4CPszfttTJ3518383402006-06-260.6589912280701750.65358315510437740.90620566051825900.0301738274844210.0293467336683420.00.066873385012920.495245062179956125.99140.112679153094462540.8214285714285710.5540950151556080.6111111111111110.035.0794305188842
3004GlMx0s7MvdMr2MEXX8s83425584002008-04-180.5186403508771930.87247843680653910.74833843626316210.014868262818410.0002010050251260.1637666325486180.0469250645994830.876685129062598155.97540.165985342019543990.8452380952380950.576763402458910.5833333333333330.035.0806689337132
3104PxH7CFGAaAvo6j0zZAOr4222940702014-11-070.7061403508771930.77106365032984910.90394045752122100.0473379250027330.028944723618090.00.2930232558139530.525551259274741100.03140.146416938110749180.9166666666666670.158788751838180.6250.22222222222222240.8800274671513
3204PxH7CFGAaAvo6j0zZAOr4222940702014-11-070.7061403508771930.77106365032984910.90394045752122100.0473379250027330.028944723618090.00.2930232558139530.525551259274741100.03140.146416938110749180.9166666666666670.6559126876944470.6944444444444440.22222222222222240.8800274671513
3304QM4iqGabg7QlphUoWiMB2626740801995-05-080.5833333333333330.87047022321294110.88079058073830700.0465726467694330.1206030150753777.05220061412487e-050.3095607235142120.6018392726512791.5740.17454915605567070.690476190476190.5241139221511960.6527777777777780.025.4977094719485
3404izaHyfYxsE3AQ68kgB4I2414582702000-04-200.7247807017543860.7700595435330590.88036741094765100.1658467257024160.1788944723618090.00.073281653746770.620650015675619131.90940.084542493337281620.750.0080929944079310.3333333333333330.034.4777416980164
3504kSSveBKXiNeRaOXu2iog4322993301999-06-050.6710526315789470.43669608699581300.77123938963980800.014212310047010.2994974874371860.00.1700258397932820.197408297627756105.42140.14680633698549010.7380952380952380.5007452757520280.7638888888888890.11111111111111133.8012231517393
3604wTGEh2gJTG72yUoizwFV10205693019800.4550438596491230.85340040766736100.86261917208075110.0345468459604240.0001005025125630.003316274309109520.3064599483204130.493154979621695112.03740.128861415457506650.5119047619047620.576763402458910.5833333333333330.014.610474284499
37053zfKbxSoqRm2RCdJdQMS2820140001997-01-010.3947368421052630.95883162133124520.89176810295472110.0264567617798190.0010050251256280.02067553735926310.0976744186046510.590343818580834166.19340.125683298785904650.7142857142857140.1343697167952140.4305555555555560.024.4145764534479
38053zgCcuWJ9x4dJJgaF4VS10509704019820.1359649122807020.19972688295127100.60859283598436810.0159615174374110.788944723618090.4104401228249740.1989664082687340.011495454070436135.56840.35392137992300860.5357142857142860.0922041155227430.4027777777777780.011.773544254012
39054VUAvYt59A1yGSrBAgac2220321301976-02-120.5230263157894740.48690142683575850.79473775919149710.0306111293320210.7336683417085430.02773797338792220.1121447028423770.818162817431289147.66840.127025466390287220.4642857142857140.380440763082340.5833333333333330.021.6044790501583
4005IAxc7j9T9KcEu0lAgbfB4621601302016-01-030.7894736842105260.928708417427278110.95745898986881710.0792609598775550.1256281407035180.00.1906976744186050.982234298254781133.93340.136501332543677830.940476190476190.1070396046537220.5833333333333330.045.0804559629553
4105JJTD8KdxbZsMnOn6imMv42230853020000.6622807017543860.91063449508489770.92360540661638400.1264895594183890.0795979899497490.000306038894575230.1028423772609820.84010868429303168.03240.147487414865265040.750.1965627282094370.5833333333333330.034.869262824777
4205RcrI18Qi65dwc3vegSNP4321600002017-10-270.6337719298245610.88251950477452800.90488636646503910.1089974855143760.0876381909547740.00.3002583979328170.63841571742083879.99440.13649170861711580.9523809523809520.451627102027750.6527777777777780.11111111111111145.0804559629553
4305hJSmfNiPXSzwSLeG1Rfb3923250702016-10-210.4166666666666670.42565091223102560.86620366913100800.0115884989614080.8261306532663320.004083930399181170.1018087855297160.47643431915560799.84240.148711874444773460.940476190476190.0618028670608130.50.045.0804559629553
4405hk8ivnCt0HP3aI1uWJfi4329506702015-03-060.5899122807017540.76303079595545820.81930649939013810.0178200502897120.5396984924623120.00.3188630490956070.684397533702581106.97240.195025170269469940.9285714285714290.3245800946350150.5555555555555560.040.8800274671513
4505v1D17vjtCNRyalBVGbKU3448116002007-06-100.4078947368421050.785121145485034110.77233465262738700.0405597463649280.8331658291457290.05097236438075740.9793281653746770.337443829031247108.14240.33279019840094760.8333333333333330.4488560767483970.5972222222222220.033.6127140914514
4605yf5sbF36dhwGRfj18sg52722564702007-11-190.9342105263157890.69073510658593670.87357180195653810.1888050727014320.1095477386934670.00.2485788113695090.73037934998432492.97140.14363340242819070.8333333333333330.2047457559296940.4861111111111110.034.6960824675478
47062pHiX7lpKRDmFvmJor3o38182813020060.502192982456140.99397535921920720.90543399795882810.0882256477533620.0012060301507541.19754350051177e-060.3250645994832040.398056223220817146.55340.1119233047083210.8214285714285710.4288586777107530.5972222222222220.035.0794305188842
48063FiQaY2FuZe3a44rZuSX3722118702012-08-080.7807017543859650.7057967085379200.90127697707415410.0228490215371160.1949748743718591.69907881269191e-060.1648578811369510.72933430870519495.00840.140331655315368680.8928571428571430.112141492352170.4444444444444440.040.4301647912923
49063ZobFSzYA0O7fmfi4oik4221268902002-03-120.7752192982456140.952806980550451100.91952306275359110.029080572865420.1527638190954770.00.3571059431524550.84951405580520499.98640.13404056855196920.7738095238095240.5156407870912240.6805555555555560.034.869262824777
5006AIE9kwssJ3cmPbMuHobb3032313302007-05-110.0844298245614040.9176632426624970.89716974087073410.0380452607412270.0023115577889457.93244626407369e-050.114211886304910.38447068659212166.2140.215802487414865270.8333333333333330.3440172863959670.5555555555555560.035.0794305188842
5106K6svVGbiQILn9BAzZNgo3737412012013-12-130.6074561403508770.76303079595545800.80349986309212700.0507270143216360.0149748743718591.45342886386899e-060.1100775193798450.335353746472986133.99240.25354826769321880.9047619047619050.0313315926892950.5972222222222220.11111111111111140.4301647912923
5206LDaZRAWOc0fNqkdOQo9S1877853019980.7697368421052630.454770009338193100.75232121076344810.013993659123210.3015075376884420.8147389969293760.0559173126614990.471209112759954147.90140.034221202250518210.7261904761904760.7064613907145640.6388888888888890.022.8083311858699
5306SnvuakcuOcj3odFZpIpJ1813763901969-09-060.6425438596491230.540119087066120.78560227018146500.0085273860282060.0726633165829150.00.119379844961240.80457728080259294.85840.078480900207284570.3809523809523810.0715164610906040.4027777777777780.012.3261274758088
5406V1fPvTfXdIBMUItnLmQ71411949301967-10-160.4133771929824560.33226898012872650.63410748512682600.013884333661310.7658291457286430.0005414534288638690.092506459948320.32699341623994179.07830.065047379330766960.3571428571428570.1085101487550390.4861111111111110.013.2248451239649
5506kPUkFJv1RfEVjgEF3oRz2325022701997-01-010.7576754385964910.73591991244188700.77696462798396910.0132283808899090.0088442211055280.02241555783009210.0894056847545220.863099592433901129.87940.161830026650873570.7142857142857140.3192381181036980.6527777777777780.024.7301600717939
5607Iyx8PziAv997SrJanQgl11279013019810.3300438596491230.138476368346537100.52629875787220300.0095113151853070.3557788944723628.86386898669396e-060.213436692506460.06907722855052897.96440.18314036126739710.5238095238095240.2094074807678840.5833333333333330.012.780501473184
5707QfWWYlJsbXiEbbKSvqcd3432099312010-04-260.501096491228070.957827514534446100.88616732631369300.2926642615065050.0139698492462317.21596724667349e-060.3633074935400520.66454174939910184.01140.214218241042345270.8690476190476190.1001370507087620.7083333333333330.22222222222222235.0794305188842
5807QfWWYlJsbXiEbbKSvqcd3432099312010-04-260.501096491228070.957827514534446100.88616732631369300.2926642615065050.0139698492462317.21596724667349e-060.3633074935400520.66454174939910184.01140.214218241042345270.8690476190476190.073817312405590.5416666666666670.22222222222222235.0794305188842
5907rj0QlIw4hRcvhIJGY2Lf612317301965-10-120.5416666666666670.18767760138968400.65150723122495200.0275500163988190.972864321608048.84339815762538e-060.119379844961240.28205664123732967.15240.067771690849866740.3333333333333330.0312115482728610.4305555555555560.08.58681052013858
600804d5chEAcksTIZ7IVh6O2920000001967-10-160.2960526315789470.43870430058941190.69397356433425400.0253635071608180.8793969849246235.03582395087001e-050.5090439276485790.339533911589508101.97540.124646875925377560.3571428571428570.1085101487550390.4861111111111110.013.190075690601
6108D5BLpRFI6PT2XYuq7eDg2827326701996-01-010.6973684210526320.94075769898886450.86530754486844400.0076527823330050.5095477386934675.496417604913e-060.2434108527131780.985369422092173137.92140.17888658572697660.7023809523809520.1283274811680320.50.024.4145764534479
6208D6EqghkRLD6iFHKKWJ4Y5521312012020-01-270.7653508771929820.6877227861955400.80867746994249900.0367333551984260.3015075376884420.02691914022517910.1080103359173130.46389382380604125.01840.13435963873260290.9880952380952380.4028790652541440.8611111111111110.11111111111111155.9854708194548
6308ER472xnuMlbtkKSh3PB41116686701975-09-070.3037280701754390.289092387866373100.66445124835088210.071498852082650.9346733668341710.00.0966408268733850.402236388337339178.15940.100118448326917380.4523809523809520.12187509378470.4722222222222220.011.6068359628288
6408ULDx4KeBpXe7OHEWjgVP4527729302014-04-160.68750.93172073781767430.91807930699723710.0174920739040120.0147738693467340.001187308085977480.3250645994832040.970738844184345153.06840.181867041753035220.9166666666666670.0306413372947990.5138888888888890.040.8800274671513
6508uPk6Ge77W31WGGYNEPHA215636901978-01-010.4484649122807020.54714783464369350.80795559206432210.3189023723625230.7849246231155781.31013306038895e-060.0904392764857880.73664959765910892.54440.092346757477050640.4880952380952380.0505987215269650.5833333333333330.11111111111111110.4222738664969
6608uPk6Ge77W31WGGYNEPHA215636901978-01-010.4484649122807020.54714783464369350.80795559206432210.3189023723625230.7849246231155781.31013306038895e-060.0904392764857880.73664959765910892.54440.092346757477050640.4880952380952380.00.0138888888888890.1111111111111112.86773809523809
6708xY8SSfwHc7SIXWroGW9j16167360019940.2894736842105260.24491168880722220.67189405819829200.0172734229802120.6603015075376880.00.0945736434108530.584073570906051124.35440.100483417234231570.6785714285714290.3626241709432490.750.022.0225455927807
68095PNli2gmoYwlvgotxxEs28269067019950.5657894736842110.61643120362281770.90000746770218800.0051382967093040.0551758793969856.16171954964176e-060.0832041343669250.120075242972097119.80440.175777317145395320.690476190476190.7723257605313970.7361111111111110.025.4977094719485
6909FEC2H0Ls5bXMSFj2bLzL5019846002018-10-240.7105263157894740.80821560181140900.82884026585019810.213949928938450.2994974874371860.0001279426816786080.1038759689922480.925802069181733169.84840.123506810778797740.9642857142857140.749347258485640.9027777777777780.22222222222222245.0804559629553
7009GKW7x0iMuxzzM7IeQ01g715352101964-02-210.5054824561403510.20775973732566200.70716650486645310.0115884989614080.7396984924623120.00.1772609819121450.54018183718256982.01840.090238377257921230.3214285714285710.0305813150865820.4305555555555560.05.49629465913261
7109PwmLzRfWj5hkmIiw1a9X2125131601969-09-070.2883771929824560.3844825335622700.79834714858238110.0087460369520060.7175879396984928.08597748208802e-060.3746770025839790.36356986100951119.00240.162636215575954980.3809523809523810.12187509378470.4722222222222220.013.190075690601
7209d4etyrEWf5BguIIUfRW91221252301974-08-160.2719298245614040.42866323262142290.81141562741144500.0095113151853070.4361809045226130.00.1266149870801030.29564217786602683.69540.133917678412792420.440476190476190.1296579734501770.50.013.7043253144824
730ADOK8ffxUlSDu51nIu5KU23274533019990.6600877192982460.29812934903756410.74891096009757800.0045916693998030.4120603015075380.00.3591731266149870.545407043578221103.03740.17982380811371040.7380952380952380.612246531216550.6250.025.7209218589767
740AGuYcLlnbR7AOQSbTGzIz2923296002000-12-310.4243421052631580.57325461136046420.80559081970477700.0076527823330050.0803015075376880.0001136131013306040.2847545219638240.554812415090396201.07740.14904723127035830.750.1501155427508180.5555555555555560.034.09439364668
750AHy3nZXS725AAHDC0qjPR2220233601966-09-280.4627192982456140.36239218403269490.75710054016379210.0157428665136110.8824120603015080.00.3023255813953490.353119448218205117.99940.126376221498371330.3452380952380950.153516801216450.4861111111111110.013.190075690601
760AJJJnR5VwJfeYGvnyG74v719153001967-03-290.627192982456140.30214577622475970.67707166504866500.0042636930141030.7778894472361810.002221084953940630.0894056847545220.389695892987773104.85940.118376517619188630.3571428571428570.6936366455588570.6388888888888890.015.3641314115338
770ATVUbPlX65HB5H1WbIugt5418135612020-12-040.9111842105263160.90059342711690870.78321260548129310.2893844976495030.1859296482412060.00.0643927648578810.895495872086947154.91240.110844684631329580.9880952380952380.2770825205325970.750.051.9031221393765
780AUeyoLQYwYUDsQRJZKvZd22248907019800.3991228070175440.8252854173569940.88131331989146910.0219744178419150.0005025125628144.4114636642784e-050.1214470284237730.409551677291253195.9940.160852827953805150.5119047619047620.576763402458910.5833333333333330.014.9059819728965
790AXtdoJaVk6W8pcYstecgw33193227020000.5866228070175440.8072114950146120.85333432902695810.0228490215371160.28.20880245649949e-050.0050645994832040.717838854634758120.17240.119632810186556110.750.4488560767483970.5972222222222220.034.09439364668
800AYmWvKRGtgePrpo3NdYXh4821853002019-02-010.6370614035087720.38147021317187340.75286884225723710.0072154804854050.940703517587940.4472876151484140.0945736434108530.215173999372975112.99640.13836467278649690.9761904761904760.9018937006692480.8472222222222220.042.9220527846565
810ApseFDjbipbKK0dBD4Viz3526701312005-11-150.8519736842105260.69776385416352990.88121375052896200.2959440253635070.0779899497487440.00.1421188630490960.53704671334517790.01140.174256736748593430.809523809523810.8096195592369180.6388888888888890.034.8690265263156
820Aq0weZZ0YzbrNlsA8UV3P2329410701997-01-050.5592105263157890.59634906768683970.81420356956164610.0041543675522030.0157788944723620.0003387922210849540.13281653746770.188002926115582140.84340.194314480307965640.7142857142857140.0050818802957090.1666666666666670.024.4145764534479
830Au3WxSl0fm6IvSZHCShjZ2215718701998-01-010.4868421052631580.92469199024008240.86543200657157800.0198972340658140.0141708542713570.01791197543500510.1400516795865630.639460758699969139.0840.092952324548415750.7261904761904760.2911777357622320.5555555555555560.024.637788840476
840AwC0oZxr25SOcThearmN42830038302005-04-290.9506578947368420.7057967085379270.87197869215642310.0376079588936260.013567839195980.00.0620155038759690.75023513428780499.02440.198960615931299960.809523809523810.2047457559296940.4861111111111110.034.4856784749792
850B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.018046677270590.4027777777777780.4444444444444441.87941937277902
860B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.0001100407150650.1111111111111110.4444444444444440.47034210575268
870B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.00.0694444444444440.4444444444444440.47034210575268
880B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.0267098826565830.6527777777777780.4444444444444441.87941937277902
890B1uuoyiPSkV2f8cAw473e023077301946-01-010.2269736842105260.19169402857687960.59955691633684310.0599103531212420.9959798994974870.90071647901740.1803617571059430.32908349879820379.24410.147428190701806320.1071428571428570.442503726378760.5972222222222220.4444444444444441.3571971505568
900BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.6798245614035090.86946611641614290.96696786398825100.4260413250245980.3497487437185930.00.1927648578811370.823388023826941173.87440.194147912348238080.8571428571428570.5225533447375530.6944444444444440.22222222222222235.4709516456447
910BHIwbxTN0iSClgDaO36Ft3429388202009-02-210.6798245614035090.86946611641614290.96696786398825100.4260413250245980.3497487437185930.00.1927648578811370.823388023826941173.87440.194147912348238080.8571428571428570.2291247761671820.5138888888888890.22222222222222235.4709516456447
920BleuvVMt39PmQaukAdEcv37228053020100.3892543859649120.5039712423813450.81253578273965110.0064502022521050.6251256281407041.8014329580348e-050.1080103359173130.367750026126032156.07440.145414569144210830.8690476190476190.385122495323270.5833333333333330.033.9973005576168
930Bn854XtnJv15m9f1ZvWbe4419401302011-05-310.4714912280701750.88151539797772960.91250342269683600.0233956488466160.3005025125628140.00.0863049095607240.614379768000836160.01540.120214687592537760.8809523809523810.367055810649940.6805555555555560.040.4301647912923
940BsQm2iFzKTzAemliJtu8x3920123102016-11-250.5482456140350880.79214989306262640.87065939810320410.0076527823330050.0054271356783920.00.072661498708010.599749190093009129.99240.125558187740598150.940476190476190.3673159068855480.6250.045.0804559629553
950C2VNcmHdRRvQtetkaZVn532311440019900.5910087719298250.39452360153025910.67191895053891900.0027331365475020.6713567839195983.85875127942682e-050.1038759689922480.336398787752116102.0340.20714613562333430.6309523809523810.7723257605313970.7361111111111110.026.8362368045648
960C2XO15BhHDglDB724brOi43316627019980.8552631578947370.541123193862899100.72847434844298400.2041106373674430.0955778894472361.22824974411464e-050.0935400516795870.67917232730692980.98840.21098608232158720.7261904761904760.7064613907145640.6388888888888890.026.0365054773228
970CAinfqSGq1peA63s4gNam2118793301988-12-310.6282894736842110.297125242240765100.76187986956413510.0100579424948070.8934673366834170.00.073281653746770.585118612185181116.33530.115713651169677230.6071428571428570.2955093384552280.5972222222222220.021.0755577778261
980CBj18X3J7MabwDqKeMxuo47239373019990.3552631578947370.76905543673625100.87618549772235110.1068109762763750.1246231155778890.00.0544702842377260.74082976277563119.97340.153794788273615650.7380952380952380.3573122055160410.6527777777777780.024.637788840476
990CLO6b6Ohrnmp9fDLvYj3U25162987020030.2357456140350880.33929772770631980.69362507156547900.0115884989614080.9869346733668340.00.2124031007751940.25175044414254491.16540.097246076399170860.7857142857142860.0309514520372540.5416666666666670.11111111111111131.9652849415299
1000CMxaedU31OgxQvrA1VE1J5422586702019-10-250.8793859649122810.61542709682601830.90311901028053700.2237892205094570.2402010050251260.00.0663565891472870.399101264499948114.95940.14379626887770210.9761904761904760.3501795664395830.6805555555555560.11111111111111152.5464341434452
Rows: 1-100 | Columns: 26

View the regression report and the importance of each feature.

In [48]:
rf_model.regression_report()
Out[48]:
                         value
explained_variance       0.856500452337576
max_error                35.5696078431373
median_absolute_error    3.91965797527483
mean_absolute_error      4.76133210498922
mean_squared_error       39.2157935438824
root_mean_squared_error  6.262251475618206
r2                       0.856498475694773
r2_adj                   0.8560006054476506
aic                      13827.520352964559
bic                      13914.66239558891
Rows: 1-10 | Columns: 2
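
The metrics in this report follow standard definitions. As a rough illustration, here is a minimal pure-Python sketch that computes several of them. The helper regression_metrics and the toy values are made up for this example and are not part of VerticaPy; the last lines only check that the reported root_mean_squared_error is the square root of the reported mean_squared_error.

```python
import math
from statistics import median


def regression_metrics(y_true, y_pred):
    """Compute common regression metrics for two equal-length lists."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    abs_errors = [abs(e) for e in errors]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "max_error": max(abs_errors),
        "median_absolute_error": median(abs_errors),
        "mean_absolute_error": sum(abs_errors) / n,
        "mean_squared_error": mse,
        "root_mean_squared_error": math.sqrt(mse),
        "r2": 1 - ss_res / ss_tot,
    }


# made-up toy values, not taken from the Spotify data
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 39]
metrics = regression_metrics(actual, predicted)
print(metrics)

# consistency check on the two values reported above
reported_mse = 39.2157935438824
reported_rmse = 6.262251475618206
print(math.isclose(math.sqrt(reported_mse), reported_rmse, rel_tol=1e-9))
```

On the real data, an RMSE of about 6.26 means the predictions are typically off by roughly six popularity points on the 0 to 100 scale.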
In [49]:
rf_model.features_importance()
Out[49]:
                   importance  sign
release_year       90.24       1
artist_popularity  5.21        1
instrumentalness   1.98        1
artists_followers  0.54        1
loudness           0.54        1
acousticness       0.31        1
energy             0.3         1
speechiness        0.24        1
duration_minute    0.21        1
nb_singers         0.13        1
valence            0.13        1
danceability       0.1         1
liveness           0.09        1
Rows: 1-13 | Columns: 3

To see how our model performs, let's plot the popularity and estimated popularity of songs by other Polish artists like Brodka, Akcent, and Maanam.

In [51]:
# results for Brodka
polish_tracks.search("LOWER(artists) LIKE '%brodka%'",
                     usecols = ['popularity', 'name', 'estimated_popularity']).plot(
                               ts='name', columns=['popularity', 'estimated_popularity'])
Out[51]:
<AxesSubplot:xlabel='"name"'>
In [52]:
# results for Akcent
polish_tracks.search("LOWER(artists) LIKE '%akcent%'",
                     usecols = ['popularity', 'name', 'estimated_popularity']).plot(
                               ts='name', columns=['popularity', 'estimated_popularity'])
Out[52]:
<AxesSubplot:xlabel='"name"'>

Group Artists using Track Features

While our tracks don't have an explicit "genre" feature, we can approximate the effect by grouping artists based on their tracks' numerical features.

Let's start by taking the averages of these numerical features for each artist.

In [55]:
# group by artist
artists_features = polish_tracks.groupby(['id_artists',
                                          'artists'], expr=['AVG(danceability) AS danceability',
                                                            'AVG(energy) AS energy', 
                                                            'AVG(speechiness) AS speechiness', 
                                                            'AVG(acousticness) AS acousticness', 
                                                            'AVG(instrumentalness) AS instrumentalness',
                                                            'AVG(valence) AS valence', 
                                                            'AVG(liveness) AS liveness'])

# save relation to the database as "artists_features"
artists_features.to_db('"spotify"."artists_features"')
Out[55]:
id_artists (Varchar(156)) | artists (Varchar(100)) | danceability (Float) | energy (Float) | speechiness (Float) | acousticness (Float) | instrumentalness (Float) | valence (Float) | liveness (Float)
10.6677631578947370.7790965047042410.3090630807915160.1116080402010050.0009263050153531220.6985055909708430.145271317829457
20.1151315789473680.2268377664648410.01393899639225950.9783919597989950.00.09097084334831250.157105943152454
30.7269736842105270.5848018395236520.01377500819940950.1998994974874370.02146519959058340.8322708746995510.048372093023256
40.68750.6616160094787680.0761998469443530.1085427135678391.97543500511771e-060.8348834778973770.083100775193798
50.6721491228070180.7349158056450880.0115884989614080.4623115577889450.00.7930818267321560.10077519379845
60.4195574162679430.6368784693030860.03523261476688830.4657469164001830.002573704289569180.5652628278817010.184947145877378
70.5559210526315790.9256960970368810.1133705039903790.0073366834170850.00.7241091023095410.089405684754522
80.7478070175438590.8689640630177420.09844757844101850.1643216080402010.0005731832139201640.6185599331173580.14687338501292
90.3739035087719290.3513470092679060.0312124193724720.6527638190954780.00.5595151008464830.572093023255814
100.5515350877192980.5973531744836380.28501147917350.1638190954773870.00.3363987877521160.561757105943152
110.623903508771930.8413511261057730.0037170657046030.0004020100502510.05834186284544520.3510293656599440.67235142118863
120.6809210526315790.8815153979777290.1680332349404180.345728643216080.00.4544884522938660.160723514211886
130.6392543859649120.5943408540932410.0642833715972450.7768844221105530.00.8902706656912950.1328165374677
140.6984649122807020.5461437278468940.0243795780037170.3758793969849250.00.8902706656912950.236175710594315
150.5394736842105270.4723418782821740.01355635727560950.2915577889447240.00.4409029156651690.246511627906977
160.7796052631578950.6847104658051430.2773586968404940.4613065326633170.00.642595882537360.257881136950904
170.4002192982456140.6796899318211480.5299005138296710.6733668341708540.00.6749921621904070.093540051679587
180.6776315789473680.6495667279171810.1734995080354210.056783919597991.48413510747185e-060.4304525028738640.037416020671835
190.8366228070175440.8654496892289460.04963375970263470.1114907872696820.001321183896281130.9487929773226040.144289405684755
200.7017543859649120.751985621190670.0330162894938230.0261306532663320.0002313203684749230.8662347162712930.2671834625323
210.6052631578947370.7108172425219150.1778725265114250.0372864321608040.00.7502351342878040.01922480620155
220.8234649122807020.6073942424516270.157100688750410.6361809045226131.18730808597748e-060.3604347371721180.12764857881137
230.1998050682261210.9208986978966190.09477302819382180.0005695142378561110.1307180382122140.297093624087040.290152167671548
240.9166666666666670.7730718639234470.0936919208483660.2190954773869350.001740020470829070.821297941268680.341602067183463
250.8464912280701750.6676406502595620.0392478408221270.1437185929648244.82088024564995e-050.483749608109520.262015503875969
260.8149122807017540.7357190910825280.2817317153164970.2436783919597990.00.7197199289371930.168124031007752
270.497807017543860.5431314074564970.1210232863233850.1185929648241210.0006018423746161720.4513533284564740.11421188630491
280.5559210526315790.5732546113604650.01087788345905750.7256281407035180.002055527123848520.5887762566621390.143152454780362
290.4462719298245610.6515749415107790.0360774024270250.0242211055276380.0002139201637666330.6927578639356250.297157622739018
300.4528508771929820.3995441355142530.0154148901279110.870351758793973.13203684749232e-050.1880029261155820.106976744186047
310.8706140350877190.7406057441602820.3159870267118550.2954773869346731.37154554759468e-060.8171177761521580.0642549526270457
320.7565789473684210.798174533843420.3374877008855360.5698492462311560.00.7261991848678020.078449612403101
330.5532163742690060.5242541996766780.05874421486097450.3366030150753770.0007916751961787790.6604730553523530.151138673557278
340.7927631578947370.6746693978371540.2380015305564670.0979899497487440.00.3572996133347270.209302325581395
350.5109649122807020.7831129318914360.1625669618454140.0146733668341710.00.2109938342564530.099741602067183
360.7708333333333330.7319034852546920.0585984475784410.3577889447236180.00.7857665377782420.139018087855297
370.5652412280701760.6365133395587950.07800371706570450.2175376884422113.11412487205732e-050.263245898212980.068888888888889
380.9287280701754390.6274763783876050.1636602164644150.0117587939698490.00.4868847319469120.152454780361757
390.6721491228070180.7891375726722290.0925986662293650.0293467336683420.00.5955690249764870.297157622739018
400.4522823261858350.1802620867496210.0304653620494880.8304113158384520.001414724212441720.4322561667111780.186865728777873
410.7094298245614040.6816981454147460.0355307751175250.5829145728643221.6171954964176e-060.7962169505695470.304392764857881
420.7971491228070180.6686447570563610.1483546517984040.3045226130653270.00.2015884627442780.085478036175711
430.4728618421052630.6242130312980090.0103039247840820.3345728643216080.07370059109518940.3968282997178390.243152454780362
440.3004385964912280.7670472231426530.057067891111840.1798994974874370.00.5945239836973560.198966408268734
450.7631578947368420.5782751453444590.0345468459604240.2201005025125630.4360286591606960.1576967290207960.381912144702842
460.7214912280701750.3774537859846770.014977588280310.5286432160804021.7093142272262e-060.1859128435573210.155555555555556
470.8453947368421050.7198542036931040.257133486388980.2152763819095482.11873080859775e-060.3468492005434210.0931782945736435
480.6557017543859650.9056139611009030.0724827812397510.1256281407035180.00.3468492005434210.259948320413437
490.8168859649122810.5180287375365240.328741663933530.2814070351758790.00.5328665482286550.037519379844961
500.6798245614035090.7936560532578250.06909369192084850.1603015075376895.88536335721597e-060.463893823806040.20671834625323
510.8651315789473680.8092197086082080.1800590357494260.0497487437185930.00.6676768732364930.335400516795866
520.400097465886940.8709164929004070.1189825443679170.03332216638749317.56795177982486e-050.6019553883489510.180361757105943
530.5065789473684210.3332730869255250.09582376735541750.7447236180904520.0001632497441146370.4325425854321250.101602067183462
540.6074561403508770.8855318251649250.4293210888816010.2804020100502510.00.6070644790469220.07266149870801
550.7933114035087720.4658151841029820.357166284027550.2175879396984920.00.7742710837078070.068423772609819
560.4089912280701750.4889096404293560.0524762217120370.1527638190954770.02210849539406350.265335980771240.084134366925065
570.4605263157894740.3185461872391410.023650741591050.6515912897822450.02293759808938930.2939004424008080.184840654608096
580.5058479532163740.5655564592516730.02140956962209830.3474036850921270.3275332650972360.6641934023060580.132575366063738
590.8552631578947370.8102238154050060.2598666229364820.1467336683417090.00.9132615738321660.395348837209302
600.6502192982456140.853400407667360.1494479064174050.2874371859296480.0003561924257932450.5882537360225730.336434108527132
610.5650584795321640.3678589877041550.05507573825054990.785371300949191.47617422950074e-060.5206743999721320.208693654895205
620.6732456140350880.5873121065156490.0071061550235050.2502512562814070.00.8536942209217260.185529715762274
630.5350877192982460.7168418833027080.4063627418825840.0925628140703520.00.7293343087051940.06749354005168
640.7061403508771930.7710636503298490.0473379250027330.028944723618090.00.5255512592747410.293023255813953
650.7258771929824560.9518028737536520.0359680769651250.0642211055276380.0004667349027635620.9320723168565160.034108527131783
660.6003289473684210.7745780241186460.02339564884661650.0757788944723623.03480040941658e-060.4304525028738630.117829457364341
670.8377192982456140.6355092327619970.3353011916475350.2603015075376880.00.820252899989550.287855297157623
680.5043859649122810.63249691237160.0194599322182140.4050251256281410.00.3959661406625560.10594315245478
690.4923245614035090.8393429125121750.2347217666994640.0482412060301519.58034800409417e-060.2559306092590660.706459948320413
700.5734649122807020.6997720677571270.1494479064174050.1437185929648240.00.821297941268680.171059431524548
710.8673245614035090.6776817182275510.167304398527750.1408375209380230.00.4844463022956070.291645133505599
720.5540935672514620.4781991679301680.01271819540104230.5905192629815740.2177482088024570.4788727488069110.132540913006029
730.7006578947368420.8016889076322150.3271017820050290.4110552763819090.00.7889016616156330.435658914728682
740.6726973684210530.4912525562885530.01169782442330820.7700167504187610.2384206772432620.3515518862995090.0921274763135228
750.5778508771929820.6917392133827350.0122444517328090.5236180904522616.16171954964176e-060.7700909185912840.254780361757106
760.7061403508771930.6816981454147460.3014102984585110.5356783919597990.04104401228249740.821297941268680.079896640826873
770.6176900584795320.5961259328431060.01191647534710830.2423673925181460.009126209484817470.7145876150996850.098409417169107
780.8223684210526320.5210410579269210.1909915819394340.3547738693467340.00.4618037412477790.301291989664083
790.7763157894736840.3784578927814760.0228490215371160.6824120603015080.03868986693961110.8254781063852020.091472868217054
800.6491228070175440.8282977377473870.0074341314092050.0568844221105530.00.920576862786080.061808785529716
810.5464912280701750.8789047203060520.04587296381327220.0353969849246232.38894575230297e-060.7163757968439750.179555555555556
820.6067251461988310.9507987669568530.1127874348602460.09742043551088771.79119754350051e-060.615773156373010.195176571920758
830.6337719298245610.5109999899589320.3680988302175580.2522613065326630.00.523461176716480.126614987080103
840.5333333333333330.3818718558905930.0087023067672460.7015075376884421.82599795291709e-050.606646462535270.145012919896641
850.811403508771930.5822915725316540.4227615611675960.145728643216084.28863868986694e-060.6687219145156230.236175710594315
860.4111058897243110.3003957615217670.02406982252833350.7838478104809760.00987866281620120.4628736644621270.202726713424388
870.6414473684210530.5521683686276870.2445610582704710.5527638190954770.00.2318946598390640.086821705426357
880.6140350877192980.7274854153487760.01034218869574740.3154974874371860.002433797338792220.915769672902080.214346253229974
890.2105263157894740.329256659738330.0242702525418170.7618090452261310.0005547594677584440.1273905319260110.10594315245478
900.2452485380116960.0237069614724220.02037097773404770.9976549413735340.9314227226202660.0427770230257430.0833074935400513
910.8410087719298250.9528069805504510.0486498305455340.0220100502512560.00.9122165325530360.011886304909561
920.7807017543859650.705796708537920.0228490215371160.1949748743718591.69907881269191e-060.7293343087051940.164857881136951
930.7686403508771930.8965769999297130.1363288509893950.0559798994974870.00.6843975337025810.097674418604651
940.7236842105263160.8021909610306150.1942713457964360.0768844221105530.00.6206500156756190.049819121447028
950.7949561403508770.8272936309505880.3210888816005250.0997989949748740.00.5213710941582190.104909560723514
960.8930921052631580.6018716550692330.371925221384060.2211055276381910.00.8960183927265130.145839793281654
970.8037280701754390.595344960890040.2073904012244450.5587939698492460.00.7356045563799770.096640826873385
980.7173793859649120.6974500707895290.2726440362960530.1864384422110550.009375309621289660.6576836660048070.218695090439276
990.8980263157894740.5451396210500950.5747239532087020.1055276381909552.80450358239509e-060.2966872191451560.102842377260982
1000.6151315789473680.4989507083973450.0168361211326120.7899497487437192.58955987717503e-060.8421987668512910.09250645994832
Rows: 1-100 | Columns: 9
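
The groupby call above computes one average per feature for each artist. To make the aggregation concrete, here is a minimal pure-Python sketch of the same idea. The helper average_by_artist, the artist names, and the feature values are all made up for illustration; in the notebook itself the work is done by the database.

```python
from collections import defaultdict


def average_by_artist(tracks, features):
    """Return {artist: {feature: mean value}} for a list of track dicts."""
    sums = defaultdict(lambda: [0.0] * len(features))
    counts = defaultdict(int)
    for track in tracks:
        artist = track["artists"]
        counts[artist] += 1
        for i, feature in enumerate(features):
            sums[artist][i] += track[feature]
    return {
        artist: {f: sums[artist][i] / counts[artist] for i, f in enumerate(features)}
        for artist in sums
    }


# hypothetical toy tracks, not taken from the Spotify data
toy_tracks = [
    {"artists": "Artist A", "danceability": 0.5, "energy": 0.75},
    {"artists": "Artist A", "danceability": 0.25, "energy": 0.25},
    {"artists": "Artist B", "danceability": 1.0, "energy": 0.5},
]

artist_means = average_by_artist(toy_tracks, ["danceability", "energy"])
print(artist_means)
```

Each artist ends up as a single row of averaged features, which is the shape the clustering step expects.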

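Grouping artists is a clustering problem, so we use an elbow curve to find a suitable number of clusters. The curve shows how the quality of the clustering changes as the number of clusters grows; a good choice is the point where adding more clusters brings little further improvement.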

In [56]:
from verticapy.learn.model_selection import elbow

# define numerical features
preds = ["danceability",
         "energy",
         "speechiness",
         "acousticness",
         "instrumentalness",
         "liveness",
         "valence"]

# elbow curve (stored under its own name so it does not shadow the imported elbow function)
elbow_curve = elbow('"spotify"."artists_features"',
                    preds,
                    n_cluster = (1, 30))
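
To show what an elbow curve measures, here is a minimal pure-Python sketch that runs a small k-means for several values of k and records the total within-cluster sum of squares. It is only an illustration under stated assumptions: the toy points are made up, the deterministic farthest-point seeding stands in for the kmeanspp initialization Vertica uses, and VerticaPy's elbow function may plot a related quantity rather than this exact one.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding: take the first point, then repeatedly add the
    point that is farthest from the centers chosen so far."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    return centers


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [
        min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
        for p in points
    ]


def kmeans_wcss(points, k, max_iterations=300):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    centers = farthest_point_init(points, k)
    for _ in range(max_iterations):
        labels = assign(points, centers)
        new_centers = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # keep the previous center if a cluster ends up empty
            new_centers.append(
                tuple(sum(col) / len(members) for col in zip(*members))
                if members else centers[j]
            )
        if new_centers == centers:  # assignments are stable, so stop
            break
        centers = new_centers
    labels = assign(points, centers)
    return sum(squared_distance(p, centers[label]) for p, label in zip(points, labels))


# made-up 2-D points forming three tight groups
toy_points = [
    (0.1, 0.1), (0.2, 0.1), (0.1, 0.2),
    (0.8, 0.8), (0.9, 0.8), (0.8, 0.9),
    (0.1, 0.9), (0.2, 0.9), (0.1, 0.8),
]

wcss_by_k = {k: kmeans_wcss(toy_points, k) for k in range(1, 5)}
for k, wcss in wcss_by_k.items():
    print(k, round(wcss, 4))
```

On this toy data the value drops sharply up to k = 3 and barely changes afterwards, which is the "elbow" that suggests three clusters.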

Let's define and use the Vertica k-means algorithm to create a model that can group artists together.

In [57]:
from verticapy.learn.cluster import KMeans

# define k-means
model = KMeans('"spotify"."KMeans_spotify"', 
               n_cluster = 7)

We can train our new model on the "artists_features" relation we saved earlier.

In [58]:
# train the model
model.fit('"spotify"."artists_features"', 
          X = preds)
Out[58]:

=======
centers
=======
danceability| energy |speechiness|acousticness|instrumentalness|liveness|valence 
------------+--------+-----------+------------+----------------+--------+--------
   0.69555  | 0.78801|  0.10285  |   0.12729  |     0.01264    | 0.16900| 0.77472
   0.76995  | 0.64462|  0.27348  |   0.20702  |     0.00206    | 0.14808| 0.51245
   0.47486  | 0.36216|  0.03955  |   0.73322  |     0.01880    | 0.15942| 0.35397
   0.54658  | 0.69307|  0.06738  |   0.13429  |     0.09020    | 0.15264| 0.35517
   0.60919  | 0.75848|  0.17245  |   0.17270  |     0.06267    | 0.71443| 0.54388
   0.64666  | 0.63465|  0.15365  |   0.51753  |     0.01172    | 0.18339| 0.61588
   0.35545  | 0.24647|  0.02482  |   0.85935  |     0.78889    | 0.15544| 0.26981


=======
metrics
=======
Evaluation metrics:
     Total Sum of Squares: 236.02931
     Within-Cluster Sum of Squares: 
         Cluster 0: 19.761467
         Cluster 1: 12.785339
         Cluster 2: 12.796322
         Cluster 3: 23.034665
         Cluster 4: 5.8827394
         Cluster 5: 19.168113
         Cluster 6: 3.8413151
     Total Within-Cluster Sum of Squares: 97.26996
     Between-Cluster Sum of Squares: 138.75935
     Between-Cluster SS / Total SS: 58.79%
 Number of iterations performed: 35
 Converged: True
 Call:
kmeans('"spotify"."KMeans_spotify"', '"spotify"."artists_features"', '"danceability", "energy", "speechiness", "acousticness", "instrumentalness", "liveness", "valence"', 7
USING PARAMETERS max_iterations=300, epsilon=0.0001, init_method='kmeanspp', distance_method='euclidean')
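
The evaluation metrics in this summary are related by simple arithmetic: the per-cluster values add up to the total within-cluster sum of squares, and the between-cluster sum of squares is what remains of the total. A short sketch that re-derives the summary lines from the numbers printed above (the variable names are our own):

```python
# values copied from the model summary above
within_cluster_ss = [19.761467, 12.785339, 12.796322, 23.034665,
                     5.8827394, 19.168113, 3.8413151]
total_ss = 236.02931

total_within = sum(within_cluster_ss)   # Total Within-Cluster Sum of Squares
between_ss = total_ss - total_within    # Between-Cluster Sum of Squares
ratio = between_ss / total_ss           # Between-Cluster SS / Total SS

print(round(total_within, 5))
print(round(between_ss, 5))
print(f"{ratio:.2%}")
```

A ratio of about 59% means that the seven clusters account for a little over half of the total variation in the artists' averaged features.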

Plot the result of the k-means algorithm:

In [59]:
model.plot()
Out[59]:
<AxesSubplot:xlabel='Dim1 (44.2%)', ylabel='Dim2 (18.6%)'>
In [60]:
# predict the genres
pred_genres = model.predict('"spotify"."artists_features"', 
                            X = ["danceability",
                                 "energy",
                                 "speechiness",
                                 "acousticness",
                                 "instrumentalness",
                                 "liveness",
                                 "valence"], 
                            name="pred_genres")
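
Predicting with a k-means model amounts to assigning each row to its nearest center. The sketch below illustrates this with the centers printed in the model summary above. It assumes that cluster IDs follow the zero-based row order of that table; the helper nearest_cluster and the example vector are made up for illustration and are not VerticaPy functions.

```python
import math

# feature order used by the model:
# danceability, energy, speechiness, acousticness, instrumentalness, liveness, valence
CENTERS = [
    (0.69555, 0.78801, 0.10285, 0.12729, 0.01264, 0.16900, 0.77472),
    (0.76995, 0.64462, 0.27348, 0.20702, 0.00206, 0.14808, 0.51245),
    (0.47486, 0.36216, 0.03955, 0.73322, 0.01880, 0.15942, 0.35397),
    (0.54658, 0.69307, 0.06738, 0.13429, 0.09020, 0.15264, 0.35517),
    (0.60919, 0.75848, 0.17245, 0.17270, 0.06267, 0.71443, 0.54388),
    (0.64666, 0.63465, 0.15365, 0.51753, 0.01172, 0.18339, 0.61588),
    (0.35545, 0.24647, 0.02482, 0.85935, 0.78889, 0.15544, 0.26981),
]


def nearest_cluster(vector, centers=CENTERS):
    """Return the index of the center closest to vector (Euclidean distance)."""
    distances = [
        math.sqrt(sum((v - c) ** 2 for v, c in zip(vector, center)))
        for center in centers
    ]
    return distances.index(min(distances))


# hypothetical artist: quiet, acoustic, mostly instrumental tracks
example_artist = (0.30, 0.20, 0.03, 0.90, 0.80, 0.15, 0.25)
print(nearest_cluster(example_artist))
```

Under these assumptions the example artist lands in the cluster whose center has the highest acousticness and instrumentalness.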

Let's see how our model groups these artists together:

In [63]:
# observe the results
pred_genres['artists',
            'pred_genres'].sort({'pred_genres':'desc'})
Out[63]:
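    | artists      | pred_genres
    | Varchar(100) | Integer
  1 |              | 6
  2 |              | 6
  3 |              | 6
  4 |              | 6
  5 |              | 6
  6 |              | 6
  7 |              | 6
  8 |              | 6
  9 |              | 6
 10 |              | 6
 11 |              | 6
 12 |              | 6
 13 |              | 6
 14 |              | 6
 15 |              | 6
 16 |              | 6
 17 |              | 6
 18 |              | 6
 19 |              | 6
 20 |              | 6
 21 |              | 6
 22 |              | 6
 23 |              | 6
 24 |              | 6
 25 |              | 5
 26 |              | 5
 27 |              | 5
 28 |              | 5
 29 |              | 5
 30 |              | 5
 31 |              | 5
 32 |              | 5
 33 |              | 5
 34 |              | 5
 35 |              | 5
 36 |              | 5
 37 |              | 5
 38 |              | 5
 39 |              | 5
 40 |              | 5
 41 |              | 5
 42 |              | 5
 43 |              | 5
 44 |              | 5
 45 |              | 5
 46 |              | 5
 47 |              | 5
 48 |              | 5
 49 |              | 5
 50 |              | 5
 51 |              | 5
 52 |              | 5
 53 |              | 5
 54 |              | 5
 55 |              | 5
 56 |              | 5
 57 |              | 5
 58 |              | 5
 59 |              | 5
 60 |              | 5
 61 |              | 5
 62 |              | 5
 63 |              | 5
 64 |              | 5
 65 |              | 5
 66 |              | 5
 67 |              | 5
 68 |              | 5
 69 |              | 5
 70 |              | 5
 71 |              | 5
 72 |              | 5
 73 |              | 5
 74 |              | 5
 75 |              | 5
 76 |              | 5
 77 |              | 5
 78 |              | 5
 79 |              | 5
 80 |              | 5
 81 |              | 5
 82 |              | 5
 83 |              | 5
 84 |              | 5
 85 |              | 5
 86 |              | 5
 87 |              | 5
 88 |              | 5
 89 |              | 5
 90 |              | 5
 91 |              | 5
 92 |              | 5
 93 |              | 5
 94 |              | 5
 95 |              | 5
 96 |              | 5
 97 |              | 5
 98 |              | 5
 99 |              | 5
100 |              | 5
Rows: 1-100 | Columns: 2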

Conclusion

We were able to predict the popularity of Polish songs with a RandomForestRegressor model suggested by AutoML. We then created a k-means model to group artists into "genres" (clusters) based on the similarity of their tracks' features.



About the Author


Jehona Kryeziu
Data Scientist


Jehona is a Junior Data Scientist working for Vertica. She works on data projects, identifying use cases and implementing solutions that deliver value and insights to businesses, with a focus on machine learning, analytics, and architecture solutions. She follows modern trends in data science and is curious about the needs that arise from recent developments in technology and the growth in data volume. Her other interests include mathematics and Natural Language Processing.