Predicting Popularity on Spotify

This example uses the publicly available Spotify dataset from Kaggle to predict the popularity of Polish songs and artists on Spotify. We'll also use a model to group artists together based on how similar their songs are.

You can download the Jupyter notebook of this study here.

The "tracks" dataset (tracks.csv) has the following features:

  • id represents the Id of the track generated by Spotify

Numerical:

  • acousticness (range: [0,1])
  • danceability (range: [0,1])
  • energy (range: [0,1])
  • duration_ms (range: [200000,300000])
  • instrumentalness (range: [0,1])
  • valence (range: [0,1])
  • popularity (range: [0,100])
  • tempo (range: [50,150])
  • liveness (range: [0,1])
  • loudness (range: [-60,0])
  • speechiness (range: [0,1])
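
The documented ranges above can double as a quick sanity check on incoming rows. The following is a minimal sketch in plain Python, not part of the original notebook: the `DOCUMENTED_RANGES` dictionary simply restates the list above, and the `out_of_range` helper and the sample track are hypothetical names and values invented for illustration.

```python
# Ranges restated from the feature list above.
DOCUMENTED_RANGES = {
    "acousticness": (0, 1),
    "danceability": (0, 1),
    "energy": (0, 1),
    "duration_ms": (200000, 300000),
    "instrumentalness": (0, 1),
    "valence": (0, 1),
    "popularity": (0, 100),
    "tempo": (50, 150),
    "liveness": (0, 1),
    "loudness": (-60, 0),
    "speechiness": (0, 1),
}


def out_of_range(track):
    """Return the sorted names of features whose value is outside its documented range."""
    return sorted(
        name
        for name, (low, high) in DOCUMENTED_RANGES.items()
        if name in track and not (low <= track[name] <= high)
    )


# Hypothetical track, invented for illustration only.
sample = {"danceability": 0.62, "energy": 0.6, "loudness": -9.3,
          "tempo": 191.3, "popularity": 34}
print(out_of_range(sample))  # ['tempo']
```

A check like this only flags values; whether a flagged row is an error or simply an unusual track is a separate decision.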

Dummy:

  • mode (0 = Minor, 1 = Major)
  • explicit (0 = No explicit content and 1 = Explicit content)

Categorical:

  • key - keys on an octave encoded as integers in range [0,11] (C = 0, C# = 1, etc.)
  • time_signature - predicted time signature
  • artists - list of contributing artists
  • id_artists - list of IDs of contributing artists
  • release_date - date of release (yyyy-mm-dd)
  • name - track name
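
To make the key and mode encodings concrete, here is a small sketch that turns the two integers into a readable label. It is an illustration rather than code from the notebook: the text above only gives C = 0 and C# = 1, so the remaining pitch names (spelled with sharps) are an assumption based on standard pitch-class notation, and `key_name` is a hypothetical helper.

```python
# Pitch classes 0..11. Only C = 0 and C# = 1 come from the text above;
# the remaining names follow standard pitch-class order (assumption).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def key_name(key, mode):
    """Translate the integer 'key' and the dummy 'mode' into a label such as 'C Major'."""
    if not 0 <= key <= 11:
        raise ValueError(f"key must be in [0, 11], got {key}")
    quality = "Major" if mode == 1 else "Minor"  # mode: 0 = Minor, 1 = Major
    return f"{PITCH_CLASSES[key]} {quality}"


print(key_name(0, 1))  # C Major
print(key_name(1, 0))  # C# Minor
```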

The "artists" dataset (artists.csv) has the following features:

  • id - ID of the artist
  • name - artist name
  • followers - how many followers the artist has
  • popularity - popularity of the artists based on their tracks
  • genres - list of genres covered by the artist's tracks

Import libraries

Start by importing VerticaPy and loading the SQL extension, which allows you to query the Vertica database with SQL.

In [1]:
import verticapy as vp
%load_ext verticapy.sql

This example uses the following version of VerticaPy:

In [2]:
vp.__version__
Out[2]:
'0.9.0'

Connect to Vertica. This example uses an existing connection called "VerticaDSN." For details on how to create a connection, see the connection tutorial.

In [3]:
vp.connect("VerticaDSN")

Create a new schema, "spotify."

In [4]:
vp.drop("spotify", method = "schema")
vp.create_schema("spotify")
Out[4]:
True

Data Loading

Load each dataset into a vDataFrame with read_csv(), then view both with display().

In [5]:
# load datasets as vDataFrame objects
artists = vp.read_csv("data/artists.csv", schema = "spotify", parse_nrows = 100)
tracks  = vp.read_csv("data/tracks.csv" , schema = "spotify", parse_nrows = 100)
The table "spotify"."artists" has been successfully created.
The table "spotify"."tracks" has been successfully created.
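
Both calls pass parse_nrows = 100, which, going by the VerticaPy documentation, has read_csv() identify the column data types from the first 100 rows before ingesting the whole file. The sketch below illustrates the general idea of sample-based type inference using only the standard library. It is not VerticaPy's implementation; `infer_type`, `infer_schema`, and the sample CSV text are hypothetical and invented for illustration.

```python
import csv
import io


def infer_type(values):
    """Pick the narrowest of int, float, varchar that fits every non-empty value."""
    values = [v for v in values if v != ""]
    if not values:
        return "varchar"
    for type_name, caster in (("int", int), ("float", float)):
        try:
            for v in values:
                caster(v)
            return type_name
        except ValueError:
            continue
    return "varchar"


def infer_schema(csv_text, sample_rows=100):
    """Infer one type per column from at most 'sample_rows' data rows."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = {name: [] for name in reader.fieldnames}
    for i, row in enumerate(reader):
        if i >= sample_rows:
            break
        for name, value in row.items():
            columns[name].append(value)
    return {name: infer_type(vals) for name, vals in columns.items()}


# Hypothetical CSV content, invented for illustration only.
sample_csv = "id,followers,name,popularity\na1,12.0,Foo,3\nb2,0.0,Bar,0\n"
print(infer_schema(sample_csv))
# {'id': 'varchar', 'followers': 'float', 'name': 'varchar', 'popularity': 'int'}
```

In this sketch, a small sample can produce a type that is too narrow for rows that appear later in the file, so the sample size is a trade-off between speed and safety.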
In [6]:
display(artists)
display(tracks)
Output of display(artists), columns and inferred types:

  • id - Varchar(44)
  • followers - Numeric(7,2)
  • genres - Varchar(36)
  • name - Varchar(114)
  • popularity - Int

Rows: 1-100 | Columns: 5
Output of display(tracks), columns and inferred types:

  • id - Varchar(44)
  • name - Varchar(96)
  • popularity - Int
  • duration_ms - Int
  • explicit - Int
  • artists - Varchar(100)
  • id_artists - Varchar(156)
  • release_date - Varchar(20)
  • danceability - Numeric(6,4)
  • energy - Numeric(8,6)
  • key - Int
  • loudness - Numeric(8,4)
  • mode - Int
  • speechiness - Numeric(7,5)
  • acousticness - Numeric(6,4)
  • instrumentalness - Float
  • liveness - Numeric(7,5)
  • valence - Numeric(7,5)
  • tempo - Numeric(9,4)
  • time_signature - Int

Rows: 1-100 | Columns: 20
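
Note that release_date is loaded as Varchar(20) rather than as a date. Assuming some rows carry only a year or a year and month instead of a full yyyy-mm-dd value (an assumption about the file, not something stated in the feature list), a tolerant parser is needed before the column can be used as a date. The following is a minimal plain-Python sketch; `parse_release_date` is a hypothetical helper, and defaulting missing parts to 1 is a design choice made for illustration.

```python
from datetime import date


def parse_release_date(text):
    """Parse 'yyyy', 'yyyy-mm', or 'yyyy-mm-dd'; missing month/day default to 1."""
    parts = [int(p) for p in text.strip().split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unexpected release_date format: {text!r}")
    # Pad with 1s so that a bare year becomes January 1st of that year.
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


print(parse_release_date("2012-01-01"))  # 2012-01-01
print(parse_release_date("1980"))        # 1980-01-01
```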

Data Exploration

Our "artists" dataset is too broad for us to use right now; we're only concerned with Polish artists, so let's extract and save them to our Vertica database.
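
The search() call in the next cell filters on two ILIKE patterns. In SQL, ILIKE is a case-insensitive LIKE and % matches any sequence of characters, so '%polish%' keeps any row whose genres text contains "polish" in any casing. The sketch below mirrors that logic in plain Python to show which rows would pass. It is only an approximation of the SQL semantics, and the helper names and sample rows are hypothetical.

```python
def ilike_contains(text, needle):
    """Rough equivalent of: text ILIKE '%needle%' (case-insensitive substring match)."""
    return needle.lower() in text.lower()


def is_polish(genres):
    """Same condition as the filter: genres mention 'disco polo' or 'polish'."""
    return ilike_contains(genres, "disco polo") or ilike_contains(genres, "polish")


# Hypothetical rows, invented for illustration only.
rows = [
    {"name": "Artist A", "genres": "['Polish hip hop']"},
    {"name": "Artist B", "genres": "['disco polo']"},
    {"name": "Artist C", "genres": "['german techno']"},
    {"name": "Artist D", "genres": "[]"},
]

print([r["name"] for r in rows if is_polish(r["genres"])])
# ['Artist A', 'Artist B']
```

Because the match is a plain substring test, any genre label containing "polish" passes, which is why a single pattern covers many sub-genres.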

In [7]:
# keep only Polish artists, using the information in the 'genres' column
polish_artists = artists.search("genres ilike '%disco polo%' or genres ilike '%polish%'") 

# save it to the database
polish_artists.to_db('"spotify"."polish_artists"', relation_type = "table")
Out[7]:
Columns: id (Varchar(44)), followers (Numeric(7,2)), genres (Varchar(36)), name (Varchar(114)), popularity (Integer)
1004reCzVFOidvBuYrYia9Y12726.027
200drc18J6PkIXn24widBC53.00
300ekfPE5ZS3NwF8H8o8GBk17574.047
40138Hlfmap3fEvKTqmZhE91289.05
5018gIUaP08hROTOiVdiEQ3584.08
601BlTZ696EkKe6xr56Gu6G362.03
701TgMAgIALWvVXlKjUwpfn1063.09
80244q9rIqAIzBFXKNNRN6O27393.040
902Cq85QmaYHDi4dW7AxTRZ749.09
1002ESuuto8Jwyo4PeiJ1Xim149.01
1102JmHOSFJi2bLjGnO274di8817.031
1202LrsTMdnHVvKmXxN0epQF753.022
1302eZEXslMzAjHDkygNJHSX8010.026
1402keDoJak6YO12KBJFMFNm1873.015
1502tQ309SzZZ0bYs2yyO60G11734.030
160338weYyACbkc5ERuLnFTa270.017
17033WIygOyXwUjc1vfCGxJ2126.00
1803Dy3XKBUsC3vJLCuF0T7I152.01
1903KLzHVK6la8dVop1iVI5x63817.050
2003ZzgzybQr8UyvWCMSCvRy1363.014
2103jLJnyfZXs1ssrIALfGRm2633.022
2203ohDYwWFrXfgp0VEtSTiF706.09
2303qKjVTzyKc3SyTjHaOpFc2376.022
2403rREATXGWcD2CfG3OXDZY10155.044
2503xKZpOUZOQjf7g5WBN4ee3676.030
2603yP3BHBnpGyvddEoIGnsx2089.016
2704Lio76CKJCMPbK5hV6J4w1876.05
2804Loj16dRX1yZodeEQlCOv308.01
2904WxKoI0kS5JclvQ8rn8qp13.00
3004bDWf1u7HxKdskC3N2nIk27127.037
3105AVHcWP9DF6y6LEU845uz1545.015
3205Fgqq7GfWeNol1TR5H3og15868.035
3305UsyksBcAUVdfyREMxbDm295.011
34063D0MKbIbbBjKgtYRGBga7458.036
350690wuO0NVERuqxuoi2mTF319.04
3606O52v4thQuBoLC6jWatGW21.00
3706UcKJxYJXthEwn0c8XOCt11024.020
3806wBGqhkbyUAtVNMbbcK1x607.08
39070tdNOiP3pIsGlqNfVkG386130.051
40072HrG3T5BaaBj4YhKIkxv1166.08
4107ILo13zpakvXxTL3VtqwS540.010
42098RsUTij7grC7evZUhWwA720.031
4309MjLGtslj39ILxA1MqUny556.07
4409ScR35g0VzipHacuPtXZd440.07
4509Z3SI4GkhYjpCB6884vC810395.025
4609j4UTVH7vk7fVfVB71roU348.01
4709u3N07jllcbrnkvjxNBSl1117.015
480AYJ3eg4zKi9ilGrhVaINs2186.018
490AZgkXW6n0zfyOhVAnIopA1109.033
500At3wjxYzZL9WwqbFR0JL824.00
510BBB9DjvskQV0oReJMxTP130889.045
520BQIhJ61mCyaOrVrMJ7e8k5.00
530CEw36eWG0dYKCXOX8eUoO77804.047
540CI4rQj50Dcr30HpiD2LF6165.08
550CgCy79P84g1meaXcwwFqZ80.00
560CsrftI3Zs3nvfSW6MRglc50.00
570D5kXlS7UOApMpTyuSrFAW40370.039
580D9mwbJP5sUH7XYXg4F7u9580.04
590E6TslMisIITlZ1QjjPXeo110.00
600EDBV0NVPOftbsEM0fg7WZ2004.018
610EMDndPZcpfg9Qqgos0S7G73.00
620EPzUAW8kwuPedmmVP6n9S99964.053
630EQaqT3oKtxAGR0Y5c1Jme3572.011
640EYfWGAHPugeWUKKvoMU79336.04
650EalHNt7jDL6Tc7XSunjXu21.0