VerticaPy

Python API for Vertica Data Science at Scale

Linear Regression

Linear regression is one of the most popular regression algorithms and produces good predictions on well-prepared data. It computes the coefficients that express the response column as a linear combination of its predictors.
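For reference, the fitted relationship and the ordinary least squares objective can be written as below. The symbols (y_i for the response, x_{ij} for the j-th predictor of row i, \beta_j for the coefficients, \varepsilon_i for the error term, n rows and p predictors) are notation assumed here for illustration. Regularized variants would add a penalty term to the sum.

```latex
y_i = \beta_0 + \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i ,
\qquad
\hat{\beta} = \arg\min_{\beta} \sum_{i=1}^{n}
  \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} \beta_j x_{ij} \Bigr)^{2}
```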

You must verify the Gauss-Markov assumptions when using linear regression algorithms:

  • Linearity: the model must be linear in the parameters estimated by the OLS method.
  • Non-collinearity: the regressors aren't perfectly correlated with each other.
  • Exogeneity: the regressors aren't correlated with the error term.
  • Homoscedasticity: the variance of the error term is constant, whatever the values of the regressors.
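Of these assumptions, homoscedasticity is one that can be probed with a few lines of code. The sketch below is a rough split-sample comparison, not a formal statistical test: it fits a one-predictor OLS line in plain Python, sorts the residuals by the predictor, and compares the residual variance of the upper half with that of the lower half. The helper names and the toy data are made up for illustration.

```python
from statistics import mean, pvariance


def fit_simple_ols(xs, ys):
    """Closed-form OLS for y = intercept + slope * x."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def residual_variance_ratio(xs, ys):
    """Residual variance for the upper half of x divided by that of the lower half."""
    intercept, slope = fit_simple_ols(xs, ys)
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ordered = [r for _, r in sorted(zip(xs, residuals))]
    half = len(ordered) // 2
    return pvariance(ordered[half:]) / pvariance(ordered[:half])


# Toy data (hypothetical): the same line with two kinds of noise.
x = list(range(1, 21))
steady = [2.0 + 3.0 * xi + (-1) ** xi * 0.5 for xi in x]        # constant noise size
growing = [2.0 + 3.0 * xi + (-1) ** xi * 0.1 * xi for xi in x]  # noise grows with x

steady_ratio = residual_variance_ratio(x, steady)
growing_ratio = residual_variance_ratio(x, growing)
print(steady_ratio, growing_ratio)
```

A ratio close to 1 is consistent with a constant error variance; a ratio well above or below 1 suggests that the spread of the errors depends on the regressor.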

To create a good linear regression model, it's important to:

  • Impute missing values
  • Encode categorical features (linear regression only accepts numerical variables)
  • Compute the correlation matrix to identify highly correlated predictors
  • Decompose the data (optional)
  • Normalize the data (optional, but recommended)
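Two of these steps, spotting highly correlated predictors and normalizing, can be illustrated without a database. The sketch below uses only the Python standard library; the column names, the values, the 0.9 threshold, and the helper functions are all hypothetical and are not part of VerticaPy.

```python
from itertools import combinations
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def highly_correlated_pairs(columns, threshold=0.9):
    """Return (name_a, name_b, r) for every pair of columns with |r| >= threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, round(r, 3)))
    return pairs


def zscore(values):
    """Center a column on 0 and scale it to unit standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical numeric columns, made up for illustration.
data = {
    "hours_studied": [1.0, 2.0, 3.0, 4.0, 5.0],
    "minutes_studied": [60.0, 120.0, 180.0, 240.0, 300.0],  # same information as hours
    "days_absent": [3.0, 0.0, 4.0, 1.0, 2.0],
}

pairs = highly_correlated_pairs(data)
normalized = {name: zscore(col) for name, col in data.items()}
print(pairs)
```

When a pair is flagged, keeping only one of the two columns removes the redundancy that would otherwise make the coefficients unstable.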

Example without decomposition

Let's use the 'africa_education' dataset to compute a linear regression model of students' performance in school.

In [46]:
import verticapy as vp

# Load the CSV file into a vDataFrame.
africa = vp.read_csv("data/africa_education.csv")

# Build the two averaged score columns and give the other predictors readable names.
africa = africa.select(["(zralocp + zmalocp) / 2 AS student_score",
                        "(zraloct + zmaloct) / 2 AS teacher_score",
                        "XNUMYRS AS teacher_year_teaching",
                        "numstu AS number_students_school",
                        "PENGLISH AS english_at_home",
                        "PTRAVEL AS travel_distance",
                        "PTRAVEL2 AS means_of_travel",
                        "PMOTHER AS m_education",
                        "PFATHER AS f_education",
                        "PLIGHT AS source_of_lighting",
                        "PABSENT AS days_absent",
                        "PREPEAT AS repeated_grades",
                        "zpsit AS sitting_place",
                        "PAGE AS age",
                        "zpses AS socio_eco_statut",
                        "country_long AS country"])
display(africa)
Column                    Type
student_score             Float
teacher_score             Float
teacher_year_teaching     Numeric(7,3)
number_students_school    Integer
english_at_home           Varchar(32)
travel_distance           Varchar(22)
means_of_travel           Varchar(26)
m_education               Varchar(68)
f_education               Varchar(68)
source_of_lighting        Varchar(24)
days_absent               Integer
repeated_grades           Varchar(20)
sitting_place             Varchar(54)
age                       Integer
socio_eco_statut          Numeric(7,3)
country                   Varchar(24)
1681.138508424325[null]26.024ALL THE TIME>0.5-1KMCARELECTRIC0NEVERI have my own sitting place1215.0South Africa
2425.993367323877537.28957291176210.023SOMETIMES>4.5-5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Namibia
3534.329515370892537.28957291176210.023SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place137.0Namibia
4536.690743411639537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
5569.392927563969537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia
6542.037992351316537.28957291176210.023MOST OF THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1211.0Namibia
7573.771789981159537.28957291176210.023MOST OF THE TIME>1.5-2KMCARELECTRIC0NEVERI have my own sitting place1211.0Namibia
8589.279157441376537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia
9496.740841813343537.28957291176210.023SOMETIMES>2.5-3KMCARELECTRIC0NEVERI have my own sitting place1412.0Namibia
10535.805274812767549.402939414911.022ALL THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Mozambique
11420.509155247374624.1249022316911.025ALL THE TIME>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar
12504.298378016207624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar
13450.768711398358624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place144.0Zanzibar
14441.927245450751624.1249022316911.025ALL THE TIME>1-1.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar
15469.987298066451624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar
16445.259642792635624.1249022316911.025ALL THE TIME>4KM-4.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar
17551.071920864503624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar
18473.825234391116624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Zanzibar
19544.095820873929624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Zanzibar
20564.874824604798624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place148.0Zanzibar
21386.940445160128569.99368616128311.023SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place143.0Mozambique
22458.501728934408593.1227544839263.023SOMETIMES>2-2.5KMWALKCANDLE0NEVERI have my own sitting place145.0Mozambique
23460.360232184133569.99368616128311.023ALL THE TIME>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place157.0Mozambique
24457.035551231015593.1227544839263.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place117.0Mozambique
25504.282977573042593.1227544839263.023ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Mozambique
26475.872247772718569.99368616128311.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1610.0Mozambique
27503.899980617811584.8865381964643.07ALL THE TIME>0.5-1KMWALKCANDLE0NEVERI have my own sitting place152.0Mozambique
28480.484906064866584.8865381964643.07ALL THE TIME>2.5-3KMWALKCANDLE0NEVERI have my own sitting place153.0Mozambique
29516.744491629541641.87699522279820.025SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia
30394.65259093503641.87699522279820.025NEVER>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia
31506.128325450387641.87699522279820.025SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERI have my own sitting place117.0Namibia
32440.401203829463641.87699522279820.025SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place128.0Namibia
33453.010211603314641.87699522279820.025SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place118.0Namibia
34498.219068060275641.87699522279820.025SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place1111.0Namibia
35359.385615553614610.95681655660530.016ALL THE TIMEUP TO 0.5KMWALKGAS0NEVERI have my own sitting place14[null]Mozambique
36525.275821925396610.95681655660530.016ALL THE TIMEUP TO 0.5KMWALKGAS0NEVERI have my own sitting place17[null]Mozambique
37457.035551231015610.95681655660530.016ALL THE TIMEUP TO 0.5KMWALKGAS0NEVERI have my own sitting place12[null]Mozambique
38509.337990443557610.95681655660530.016ALL THE TIMEUP TO 0.5KMWALKGAS0NEVERI have my own sitting place12[null]Mozambique
39434.970625831997610.95681655660530.016SOMETIMESUP TO 0.5KMWALKGAS0NEVERI have my own sitting place17[null]Mozambique
40[null]641.4776489823612.016SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place174.0Mozambique
41428.988211440412656.2255055183548.023SOMETIMES>5KMWALKFIRE0NEVERI have my own sitting place122.0Namibia
42532.255684949772656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place142.0Namibia
43392.664714363554676.111735395768.020ALL THE TIME>3-3.5KMWALKCANDLE0NEVERNo place/share134.0Zambia
44415.95588691273656.2255055183548.023SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place134.0Namibia
45479.525184773189656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia
46410.328608434232656.2255055183548.023SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia
47425.993367323877656.2255055183548.023SOMETIMES>3-3.5KMWALKFIRE0NEVERI have my own sitting place124.0Namibia
48409.430742983857656.2255055183548.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place125.0Namibia
49460.605886281468638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
50560.188989035577638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
51378.486087509513638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place146.0Namibia
52516.831361986859638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia
53557.155342162198638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
54631.282997954796638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
55501.677328837489656.2255055183548.023SOMETIMES>2.5-3KMWALKCANDLE0NEVERI have my own sitting place186.0Namibia
56466.248533790025638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
57594.667991916549638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
58619.020904331767638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
59422.517903579997638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
60514.338009258817638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia
61512.766872215611638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia
62456.266205420931676.111735395768.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Zambia
63380.388263378723656.2255055183548.023SOMETIMES>5KMCARPARAFFIN/OIL0NEVERI have my own sitting place138.0Namibia
64421.344721387902676.111735395768.020SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERNo place/share118.0Zambia
65486.501284763764638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia
66453.357504553945638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place129.0Namibia
67498.219068060275638.0083836642316.024SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place119.0Namibia
68514.807687171462638.0083836642316.024NEVERUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia
69565.065579807459683.21736669488212.021MOST OF THE TIME>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place1213.0South Africa
70626.714865963592638.0083836642318.021ALL THE TIMEUP TO 0.5KMCARELECTRIC0NEVERI have my own sitting place1114.0South Africa
71457.134154158281683.21736669488212.021SOMETIMES>5KMBUS/TRUCK/VANELECTRIC0NEVERI have my own sitting place1314.0South Africa
72510.499756960558634.2228480160759.020MOST OF THE TIME>1-1.5KMWALKFIRE0NEVERI have my own sitting place162.0Uganda
73494.235377715545634.2228480160759.020SOMETIMES>1.5-2KMWALKPARAFFIN/OIL0NEVERI have my own sitting place172.0Uganda
74365.266644239282617.72252298591.013NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place132.0Zambia
75541.173490674172645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place123.0Namibia
76455.36049792936645.7158489763988.020SOMETIMES>5KMWALKPARAFFIN/OIL0NEVERNo place/share153.0Malawi
77422.517903579997645.7158489763988.020NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERNo place/share125.0Malawi
78398.31103066659651.6292602588881.021MOST OF THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Malawi
79395.120560563494639.8978152952391.022SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia
80503.716941803917645.7158489763986.049SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia
81427.462485792718623.1590462617388.018SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share136.0Zambia
82600.91272658553645.7158489763986.049SOMETIMES>2-2.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia
83565.065579807459645.7158489763986.049SOMETIMES>5KMCARELECTRIC0NEVERI have my own sitting place127.0Namibia
84540.204883338675645.7158489763986.049MOST OF THE TIME>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place117.0Namibia
85558.474690429794645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Namibia
86575.637662232951645.7158489763986.049NEVER>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Namibia
87439.680053114802645.7158489763988.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Malawi
88353.773643278958617.72252298591.013SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Zambia
89488.376137298698645.7158489763986.049SOMETIMES>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Namibia
90480.908033418792645.7158489763986.049MOST OF THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
91535.286927844308645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
92408.529715955022651.6292602588881.021SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place159.0Malawi
93385.53865385057623.1590462617388.018SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place129.0Zambia
94528.310827853733645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place1210.0Namibia
95548.67653911654645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia
96588.423257247566645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia
97583.034422772394645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia
98455.36049792936645.7158489763988.020NEVER>3.5-4KMWALKELECTRIC0NEVERNo place/share1410.0Malawi
99459.986311593889645.7158489763986.049SOMETIMES>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place1311.0Namibia
100594.667991916549645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia
Rows: 1-100 | Columns: 16

First, let's look for missing values.

In [2]:
africa.count_percent()
Out[2]:
Column                      count      percent
"number_students_school"    60890.0    100.0
"english_at_home"           60890.0    100.0
"travel_distance"           60890.0    100.0
"means_of_travel"           60890.0    100.0
"m_education"               60890.0    100.0
"source_of_lighting"        60890.0    100.0
"days_absent"               60890.0    100.0
"repeated_grades"           60890.0    100.0
"sitting_place"             60890.0    100.0
"age"                       60890.0    100.0
"country"                   60890.0    100.0
"socio_eco_statut"          60832.0    99.905
"student_score"             60809.0    99.867
"teacher_year_teaching"     60708.0    99.701
"f_education"               60599.0    99.522
"teacher_score"             52122.0    85.6
Rows: 1-16 | Columns: 3
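The two reported figures are simple to reproduce: the count is the number of non-missing values and the percent is that count divided by the total number of rows. The sketch below shows the arithmetic in plain Python, with None standing in for a missing value. The helper is hypothetical and only mimics the output; it is not VerticaPy's implementation.

```python
def count_percent(column):
    """Return (count, percent) of the non-missing entries of a column."""
    count = sum(value is not None for value in column)
    return count, round(100 * count / len(column), 3)


# Toy column (made up): one missing value out of five.
toy_result = count_percent([512.3, None, 498.0, 530.1, 471.9])
print(toy_result)

# Same arithmetic on the reported figures: 52122 of 60890 rows have a teacher_score.
teacher_score_percent = round(100 * 52122 / 60890, 3)
print(teacher_score_percent)
```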

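Only teacher_score has a sizable share of missing values (about 14%), so we'll simply drop the incomplete rows instead of imputing them, which avoids adding bias to the data.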

In [47]:
africa.dropna()
8988 elements were filtered.
Out[47]:
Column                    Type
student_score             Float
teacher_score             Float
teacher_year_teaching     Numeric(7,3)
number_students_school    Integer
english_at_home           Varchar(32)
travel_distance           Varchar(22)
means_of_travel           Varchar(26)
m_education               Varchar(68)
f_education               Varchar(68)
source_of_lighting        Varchar(24)
days_absent               Integer
repeated_grades           Varchar(20)
sitting_place             Varchar(54)
age                       Integer
socio_eco_statut          Numeric(7,3)
country                   Varchar(24)
1425.993367323877537.28957291176210.023SOMETIMES>4.5-5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Namibia
2534.329515370892537.28957291176210.023SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place137.0Namibia
3536.690743411639537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
4569.392927563969537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia
5542.037992351316537.28957291176210.023MOST OF THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1211.0Namibia
6573.771789981159537.28957291176210.023MOST OF THE TIME>1.5-2KMCARELECTRIC0NEVERI have my own sitting place1211.0Namibia
7589.279157441376537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia
8496.740841813343537.28957291176210.023SOMETIMES>2.5-3KMCARELECTRIC0NEVERI have my own sitting place1412.0Namibia
9535.805274812767549.402939414911.022ALL THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Mozambique
10420.509155247374624.1249022316911.025ALL THE TIME>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar
11504.298378016207624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar
12450.768711398358624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place144.0Zanzibar
13441.927245450751624.1249022316911.025ALL THE TIME>1-1.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar
14469.987298066451624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar
15445.259642792635624.1249022316911.025ALL THE TIME>4KM-4.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar
16551.071920864503624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar
17473.825234391116624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Zanzibar
18544.095820873929624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Zanzibar
19564.874824604798624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place148.0Zanzibar
20386.940445160128569.99368616128311.023SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place143.0Mozambique
21458.501728934408593.1227544839263.023SOMETIMES>2-2.5KMWALKCANDLE0NEVERI have my own sitting place145.0Mozambique
22460.360232184133569.99368616128311.023ALL THE TIME>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place157.0Mozambique
23457.035551231015593.1227544839263.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place117.0Mozambique
24504.282977573042593.1227544839263.023ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Mozambique
25475.872247772718569.99368616128311.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1610.0Mozambique
26503.899980617811584.8865381964643.07ALL THE TIME>0.5-1KMWALKCANDLE0NEVERI have my own sitting place152.0Mozambique
27480.484906064866584.8865381964643.07ALL THE TIME>2.5-3KMWALKCANDLE0NEVERI have my own sitting place153.0Mozambique
28516.744491629541641.87699522279820.025SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia
29394.65259093503641.87699522279820.025NEVER>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia
30506.128325450387641.87699522279820.025SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERI have my own sitting place117.0Namibia
31440.401203829463641.87699522279820.025SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place128.0Namibia
32453.010211603314641.87699522279820.025SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place118.0Namibia
33498.219068060275641.87699522279820.025SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place1111.0Namibia
34428.988211440412656.2255055183548.023SOMETIMES>5KMWALKFIRE0NEVERI have my own sitting place122.0Namibia
35532.255684949772656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place142.0Namibia
36392.664714363554676.111735395768.020ALL THE TIME>3-3.5KMWALKCANDLE0NEVERNo place/share134.0Zambia
37415.95588691273656.2255055183548.023SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place134.0Namibia
38479.525184773189656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia
39410.328608434232656.2255055183548.023SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia
40425.993367323877656.2255055183548.023SOMETIMES>3-3.5KMWALKFIRE0NEVERI have my own sitting place124.0Namibia
41409.430742983857656.2255055183548.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place125.0Namibia
42460.605886281468638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
43560.188989035577638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
44378.486087509513638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place146.0Namibia
45516.831361986859638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia
46557.155342162198638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
47631.282997954796638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
48501.677328837489656.2255055183548.023SOMETIMES>2.5-3KMWALKCANDLE0NEVERI have my own sitting place186.0Namibia
49466.248533790025638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
50594.667991916549638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
51619.020904331767638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
52422.517903579997638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia
53514.338009258817638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia
54512.766872215611638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia
55456.266205420931676.111735395768.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Zambia
56380.388263378723656.2255055183548.023SOMETIMES>5KMCARPARAFFIN/OIL0NEVERI have my own sitting place138.0Namibia
57421.344721387902676.111735395768.020SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERNo place/share118.0Zambia
58486.501284763764638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia
59453.357504553945638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place129.0Namibia
60498.219068060275638.0083836642316.024SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place119.0Namibia
61514.807687171462638.0083836642316.024NEVERUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia
62565.065579807459683.21736669488212.021MOST OF THE TIME>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place1213.0South Africa
63626.714865963592638.0083836642318.021ALL THE TIMEUP TO 0.5KMCARELECTRIC0NEVERI have my own sitting place1114.0South Africa
64457.134154158281683.21736669488212.021SOMETIMES>5KMBUS/TRUCK/VANELECTRIC0NEVERI have my own sitting place1314.0South Africa
65510.499756960558634.2228480160759.020MOST OF THE TIME>1-1.5KMWALKFIRE0NEVERI have my own sitting place162.0Uganda
66494.235377715545634.2228480160759.020SOMETIMES>1.5-2KMWALKPARAFFIN/OIL0NEVERI have my own sitting place172.0Uganda
67365.266644239282617.72252298591.013NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place132.0Zambia
68541.173490674172645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place123.0Namibia
69455.36049792936645.7158489763988.020SOMETIMES>5KMWALKPARAFFIN/OIL0NEVERNo place/share153.0Malawi
70422.517903579997645.7158489763988.020NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERNo place/share125.0Malawi
71398.31103066659651.6292602588881.021MOST OF THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Malawi
72395.120560563494639.8978152952391.022SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia
73503.716941803917645.7158489763986.049SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia
74427.462485792718623.1590462617388.018SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share136.0Zambia
75600.91272658553645.7158489763986.049SOMETIMES>2-2.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia
76565.065579807459645.7158489763986.049SOMETIMES>5KMCARELECTRIC0NEVERI have my own sitting place127.0Namibia
77540.204883338675645.7158489763986.049MOST OF THE TIME>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place117.0Namibia
78558.474690429794645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Namibia
79575.637662232951645.7158489763986.049NEVER>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Namibia
80439.680053114802645.7158489763988.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Malawi
81353.773643278958617.72252298591.013SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Zambia
82488.376137298698645.7158489763986.049SOMETIMES>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Namibia
83480.908033418792645.7158489763986.049MOST OF THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
84535.286927844308645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia
85408.529715955022651.6292602588881.021SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place159.0Malawi
86385.53865385057623.1590462617388.018SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place129.0Zambia
87528.310827853733645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place1210.0Namibia
88548.67653911654645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia
89588.423257247566645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia
90583.034422772394645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia
91455.36049792936645.7158489763988.020NEVER>3.5-4KMWALKELECTRIC0NEVERNo place/share1410.0Malawi
92459.986311593889645.7158489763986.049SOMETIMES>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place1311.0Namibia
93594.667991916549645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia
94473.481862461084645.7158489763986.049NEVERUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1112.0Namibia
95584.081078168864645.7158489763986.049MOST OF THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1213.0Namibia
96504.241138890166645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1113.0Namibia
97451.694405556509627.0468536316397.015SOMETIMESUP TO 0.5KMWALKFIRE0NEVERNo place/share171.0Mozambique
98522.558825103114627.0468536316397.015ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERNo place/share121.0Mozambique
99422.842681154036627.0468536316397.015NEVER>0.5-1KMWALKNO LIGHTING0NEVERNo place/share141.0Mozambique
100515.308862586361657.5676860573955.020SOMETIMES>1-1.5KMWALKPARAFFIN/OIL0NEVERNo place/share182.0Mozambique
Rows: 1-100 of 50921 | Columns: 16
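Dropping missing values this way is listwise deletion: a row is kept only if every one of its columns is populated. A minimal sketch in plain Python, using made-up rows and a hypothetical helper, with None standing in for a missing value:

```python
def drop_incomplete_rows(rows):
    """Keep only the rows in which no value is missing; also return how many were removed."""
    kept = [row for row in rows if all(value is not None for value in row.values())]
    return kept, len(rows) - len(kept)


# Made-up rows for illustration.
rows = [
    {"student_score": 510.0, "teacher_score": None, "age": 12},
    {"student_score": 430.0, "teacher_score": 540.0, "age": 14},
    {"student_score": None, "teacher_score": 640.0, "age": 17},
]

kept, n_filtered = drop_incomplete_rows(rows)
print(f"{n_filtered} elements were filtered.")
```

The trade-off is that a single missing column costs the whole row, so the approach suits datasets where, as here, most rows are complete.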

Linear regression only accepts numerical predictors, so we need to encode the categorical columns as dummy variables.
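To make the idea concrete, here is a small sketch of dummy encoding in plain Python. It creates one 0/1 indicator per category and leaves one category out as the reference level, so that the indicators are not perfectly collinear with the intercept. The function name, the choice of which category to leave out, and the naming scheme (column name, underscore, category with spaces replaced by underscores) are assumptions made for this illustration.

```python
def encode_dummies(rows, column, drop_last=True):
    """Add one 0/1 indicator per category of `column`, optionally leaving one out."""
    categories = sorted({row[column] for row in rows})
    if drop_last:
        # The omitted category is implied when all of its sibling dummies are 0.
        categories = categories[:-1]
    encoded = []
    for row in rows:
        new_row = dict(row)
        for category in categories:
            name = f"{column}_{category.replace(' ', '_')}"
            new_row[name] = int(row[column] == category)
        encoded.append(new_row)
    return encoded


# Made-up rows for illustration.
rows = [
    {"english_at_home": "ALL THE TIME"},
    {"english_at_home": "SOMETIMES"},
    {"english_at_home": "NEVER"},
]

encoded = encode_dummies(rows, "english_at_home")
for row in encoded:
    print(row)
```

With three categories, two indicators are enough: a row whose indicators are both 0 belongs to the omitted category.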

In [48]:
africa.one_hot_encode(max_cardinality = 20)
Out[48]:
Column                              Type
student_score                       Float
teacher_score                       Float
teacher_year_teaching               Numeric(7,3)
number_students_school              Integer
english_at_home                     Varchar(32)
travel_distance                     Varchar(22)
means_of_travel                     Varchar(26)
m_education                         Varchar(68)
f_education                         Varchar(68)
source_of_lighting                  Varchar(24)
days_absent                         Integer
repeated_grades                     Varchar(20)
sitting_place                       Varchar(54)
age                                 Integer
socio_eco_statut                    Numeric(7,3)
country                             Varchar(24)
english_at_home_ALL_THE_TIME        Bool
english_at_home_MOST_OF_THE_TIME    Bool
english_at_home_NEVER               Bool
travel_distance_>0.5-1KM            Bool
travel_distance_>1-1.5KM            Bool
travel_distance_>1.5-2KM            Bool
travel_distance_>2-2.5KM            Bool
travel_distance_>2.5-3KM            Bool
travel_distance_>3-3.5KM            Bool
...
socio_eco_statut_3.000              Bool
socio_eco_statut_4.000              Bool
socio_eco_statut_5.000              Bool
socio_eco_statut_6.000              Bool
socio_eco_statut_7.000              Bool
socio_eco_statut_8.000              Bool
socio_eco_statut_9.000              Bool
socio_eco_statut_10.000             Bool
socio_eco_statut_11.000             Bool
socio_eco_statut_12.000             Bool
socio_eco_statut_13.000             Bool
socio_eco_statut_14.000             Bool
country_Botswana                    Bool
country_Kenya                       Bool
country_Lesotho                     Bool
country_Malawi                      Bool
country_Mozambique                  Bool
country_Namibia                     Bool
country_Seychelles                  Bool
country_South_Africa                Bool
country_Swaziland                   Bool
country_Tanzania                    Bool
country_Uganda                      Bool
country_Zambia                      Bool
country_Zanzibar                    Bool
1425.993367323877537.28957291176210.023SOMETIMES>4.5-5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Namibia000000000...0010000000000000010000000
2534.329515370892537.28957291176210.023SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place137.0Namibia000100000...0000100000000000010000000
3536.690743411639537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia000000000...0000010000000000010000000
4569.392927563969537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia000000000...0000001000000000010000000
5542.037992351316537.28957291176210.023MOST OF THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1211.0Namibia010000000...0000000010000000010000000
6573.771789981159537.28957291176210.023MOST OF THE TIME>1.5-2KMCARELECTRIC0NEVERI have my own sitting place1211.0Namibia010001000...0000000010000000010000000
7589.279157441376537.28957291176210.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia000000000...0000000001000000010000000
8496.740841813343537.28957291176210.023SOMETIMES>2.5-3KMCARELECTRIC0NEVERI have my own sitting place1412.0Namibia000000010...0000000001000000010000000
9535.805274812767549.402939414911.022ALL THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Mozambique100010000...0000000100000000100000000
10420.509155247374624.1249022316911.025ALL THE TIME>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar100000000...1000000000000000000000001
11504.298378016207624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place133.0Zanzibar100100000...1000000000000000000000001
12450.768711398358624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place144.0Zanzibar100100000...0100000000000000000000001
13441.927245450751624.1249022316911.025ALL THE TIME>1-1.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar100010000...0010000000000000000000001
14469.987298066451624.1249022316911.025ALL THE TIME>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Zanzibar100100000...0010000000000000000000001
15445.259642792635624.1249022316911.025ALL THE TIME>4KM-4.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar100000000...0001000000000000000000001
16551.071920864503624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place136.0Zanzibar100000000...0001000000000000000000001
17473.825234391116624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Zanzibar100000000...0000100000000000000000001
18544.095820873929624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Zanzibar100000000...0000010000000000000000001
19564.874824604798624.1249022316911.025ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place148.0Zanzibar100000000...0000010000000000000000001
20386.940445160128569.99368616128311.023SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place143.0Mozambique000100000...1000000000000000100000000
21458.501728934408593.1227544839263.023SOMETIMES>2-2.5KMWALKCANDLE0NEVERI have my own sitting place145.0Mozambique000000100...0010000000000000100000000
22460.360232184133569.99368616128311.023ALL THE TIME>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place157.0Mozambique100001000...0000100000000000100000000
23457.035551231015593.1227544839263.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place117.0Mozambique000001000...0000100000000000100000000
24504.282977573042593.1227544839263.023ALL THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Mozambique100000000...0000010000000000100000000
25475.872247772718569.99368616128311.023SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1610.0Mozambique000000000...0000000100000000100000000
26503.899980617811584.8865381964643.07ALL THE TIME>0.5-1KMWALKCANDLE0NEVERI have my own sitting place152.0Mozambique100100000...0000000000000000100000000
27480.484906064866584.8865381964643.07ALL THE TIME>2.5-3KMWALKCANDLE0NEVERI have my own sitting place153.0Mozambique100000010...1000000000000000100000000
28516.744491629541641.87699522279820.025SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia000100000...0001000000000000010000000
29394.65259093503641.87699522279820.025NEVER>5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia001000000...0001000000000000010000000
30506.128325450387641.87699522279820.025SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERI have my own sitting place117.0Namibia000000000...0000100000000000010000000
31440.401203829463641.87699522279820.025SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place128.0Namibia000000000...0000010000000000010000000
32453.010211603314641.87699522279820.025SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place118.0Namibia000100000...0000010000000000010000000
33498.219068060275641.87699522279820.025SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place1111.0Namibia000010000...0000000010000000010000000
34428.988211440412656.2255055183548.023SOMETIMES>5KMWALKFIRE0NEVERI have my own sitting place122.0Namibia000000000...0000000000000000010000000
35532.255684949772656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place142.0Namibia000100000...0000000000000000010000000
36392.664714363554676.111735395768.020ALL THE TIME>3-3.5KMWALKCANDLE0NEVERNo place/share134.0Zambia100000001...0100000000000000000000010
37415.95588691273656.2255055183548.023SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place134.0Namibia000000000...0100000000000000010000000
38479.525184773189656.2255055183548.023SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia000100000...0100000000000000010000000
39410.328608434232656.2255055183548.023SOMETIMES>1-1.5KMWALKCANDLE0NEVERI have my own sitting place124.0Namibia000010000...0100000000000000010000000
40425.993367323877656.2255055183548.023SOMETIMES>3-3.5KMWALKFIRE0NEVERI have my own sitting place124.0Namibia000000001...0100000000000000010000000
41409.430742983857656.2255055183548.023SOMETIMES>1.5-2KMWALKELECTRIC0NEVERI have my own sitting place125.0Namibia000001000...0010000000000000010000000
42460.605886281468638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
43560.188989035577638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
44378.486087509513638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place146.0Namibia000000000...0001000000000000010000000
45516.831361986859638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia000000000...0001000000000000010000000
46557.155342162198638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
47631.282997954796638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
48501.677328837489656.2255055183548.023SOMETIMES>2.5-3KMWALKCANDLE0NEVERI have my own sitting place186.0Namibia000000010...0001000000000000010000000
49466.248533790025638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
50594.667991916549638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
51619.020904331767638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
52422.517903579997638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
53514.338009258817638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place116.0Namibia000000000...0001000000000000010000000
54512.766872215611638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia000000000...0000100000000000010000000
55456.266205420931676.111735395768.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Zambia000000000...0000100000000000000000010
56380.388263378723656.2255055183548.023SOMETIMES>5KMCARPARAFFIN/OIL0NEVERI have my own sitting place138.0Namibia000000000...0000010000000000010000000
57421.344721387902676.111735395768.020SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERNo place/share118.0Zambia000000000...0000010000000000000000010
58486.501284763764638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place139.0Namibia000000000...0000001000000000010000000
59453.357504553945638.0083836642316.024SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place129.0Namibia000000000...0000001000000000010000000
60498.219068060275638.0083836642316.024SOMETIMES>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place119.0Namibia000100000...0000001000000000010000000
61514.807687171462638.0083836642316.024NEVERUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia001000000...0000000100000000010000000
62565.065579807459683.21736669488212.021MOST OF THE TIME>0.5-1KMWALKELECTRIC0NEVERI have my own sitting place1213.0South Africa010100000...0000000000100000000100000
63626.714865963592638.0083836642318.021ALL THE TIMEUP TO 0.5KMCARELECTRIC0NEVERI have my own sitting place1114.0South Africa100000000...0000000000010000000100000
64457.134154158281683.21736669488212.021SOMETIMES>5KMBUS/TRUCK/VANELECTRIC0NEVERI have my own sitting place1314.0South Africa000000000...0000000000010000000100000
65510.499756960558634.2228480160759.020MOST OF THE TIME>1-1.5KMWALKFIRE0NEVERI have my own sitting place162.0Uganda010010000...0000000000000000000000100
66494.235377715545634.2228480160759.020SOMETIMES>1.5-2KMWALKPARAFFIN/OIL0NEVERI have my own sitting place172.0Uganda000001000...0000000000000000000000100
67365.266644239282617.72252298591.013NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place132.0Zambia001100000...0000000000000000000000010
68541.173490674172645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place123.0Namibia000000000...1000000000000000010000000
69455.36049792936645.7158489763988.020SOMETIMES>5KMWALKPARAFFIN/OIL0NEVERNo place/share153.0Malawi000000000...1000000000000001000000000
70422.517903579997645.7158489763988.020NEVER>0.5-1KMWALKPARAFFIN/OIL0NEVERNo place/share125.0Malawi001100000...0010000000000001000000000
71398.31103066659651.6292602588881.021MOST OF THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place145.0Malawi010000000...0010000000000001000000000
72395.120560563494639.8978152952391.022SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
73503.716941803917645.7158489763986.049SOMETIMES>5KMWALKCANDLE0NEVERI have my own sitting place126.0Namibia000000000...0001000000000000010000000
74427.462485792718623.1590462617388.018SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share136.0Zambia000000000...0001000000000000000000010
75600.91272658553645.7158489763986.049SOMETIMES>2-2.5KMWALKELECTRIC0NEVERI have my own sitting place127.0Namibia000000100...0000100000000000010000000
76565.065579807459645.7158489763986.049SOMETIMES>5KMCARELECTRIC0NEVERI have my own sitting place127.0Namibia000000000...0000100000000000010000000
77540.204883338675645.7158489763986.049MOST OF THE TIME>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place117.0Namibia010000010...0000100000000000010000000
78558.474690429794645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place127.0Namibia000000000...0000100000000000010000000
79575.637662232951645.7158489763986.049NEVER>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Namibia001100000...0000100000000000010000000
80439.680053114802645.7158489763988.020SOMETIMESUP TO 0.5KMWALKCANDLE0NEVERNo place/share117.0Malawi000000000...0000100000000001000000000
81353.773643278958617.72252298591.013SOMETIMES>0.5-1KMWALKCANDLE0NEVERI have my own sitting place137.0Zambia000100000...0000100000000000000000010
82488.376137298698645.7158489763986.049SOMETIMES>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place118.0Namibia000010000...0000010000000000010000000
83480.908033418792645.7158489763986.049MOST OF THE TIME>1-1.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia010010000...0000010000000000010000000
84535.286927844308645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place128.0Namibia000000000...0000010000000000010000000
85408.529715955022651.6292602588881.021SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place159.0Malawi000100000...0000001000000001000000000
86385.53865385057623.1590462617388.018SOMETIMES>0.5-1KMWALKPARAFFIN/OIL0NEVERI have my own sitting place129.0Zambia000100000...0000001000000000000000010
87528.310827853733645.7158489763986.049SOMETIMESUP TO 0.5KMWALKPARAFFIN/OIL0NEVERI have my own sitting place1210.0Namibia000000000...0000000100000000010000000
88548.67653911654645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1210.0Namibia000000000...0000000100000000010000000
89588.423257247566645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia000000000...0000000100000000010000000
90583.034422772394645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1310.0Namibia000000000...0000000100000000010000000
91455.36049792936645.7158489763988.020NEVER>3.5-4KMWALKELECTRIC0NEVERNo place/share1410.0Malawi001000000...0000000100000001000000000
92459.986311593889645.7158489763986.049SOMETIMES>2.5-3KMWALKELECTRIC0NEVERI have my own sitting place1311.0Namibia000000010...0000000010000000010000000
93594.667991916549645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1212.0Namibia000000000...0000000001000000010000000
94473.481862461084645.7158489763986.049NEVERUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1112.0Namibia001000000...0000000001000000010000000
95584.081078168864645.7158489763986.049MOST OF THE TIMEUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1213.0Namibia010000000...0000000000100000010000000
96504.241138890166645.7158489763986.049SOMETIMESUP TO 0.5KMWALKELECTRIC0NEVERI have my own sitting place1113.0Namibia000000000...0000000000100000010000000
97451.694405556509627.0468536316397.015SOMETIMESUP TO 0.5KMWALKFIRE0NEVERNo place/share171.0Mozambique000000000...0000000000000000100000000
98522.558825103114627.0468536316397.015ALL THE TIMEUP TO 0.5KMWALKPARAFFIN/OIL0NEVERNo place/share121.0Mozambique100000000...0000000000000000100000000
99422.842681154036627.0468536316397.015NEVER>0.5-1KMWALKNO LIGHTING0NEVERNo place/share141.0Mozambique001100000...0000000000000000100000000
100515.308862586361657.5676860573955.020SOMETIMES>1-1.5KMWALKPARAFFIN/OIL0NEVERNo place/share182.0Mozambique000010000...0000000000000000100000000
Rows: 1-100 of 50921 | Columns: 108

Linear regression can only handle numerical columns. The categorical columns are now represented by their dummy (0/1) columns, so we'll drop the original categorical columns.

In [49]:
africa.drop(columns = ["english_at_home",
                       "travel_distance",
                       "means_of_travel",
                       "m_education",
                       "f_education",
                       "source_of_lighting",
                       "repeated_grades",
                       "sitting_place",
                       "country"])
Out[49]:
student_score: Float
teacher_score: Float
teacher_year_teaching: Numeric(7,3)
number_students_school: Integer
days_absent: Integer
age: Integer
socio_eco_statut: Numeric(7,3)
english_at_home_ALL_THE_TIME: Bool
english_at_home_MOST_OF_THE_TIME: Bool
english_at_home_NEVER: Bool
travel_distance_>0.5-1KM: Bool
travel_distance_>1-1.5KM: Bool
travel_distance_>1.5-2KM: Bool
travel_distance_>2-2.5KM: Bool
travel_distance_>2.5-3KM: Bool
travel_distance_>3-3.5KM: Bool
travel_distance_>3.5-4KM: Bool
travel_distance_>4.5-5KM: Bool
travel_distance_>4KM-4.5KM: Bool
travel_distance_>5KM: Bool
means_of_travel_BICYCLE: Bool
means_of_travel_BUS_TRUCK_VAN: Bool
means_of_travel_CAR: Bool
means_of_travel_OTHER: Bool
means_of_travel_TRAIN: Bool
...
socio_eco_statut_3.000: Bool
socio_eco_statut_4.000: Bool
socio_eco_statut_5.000: Bool
socio_eco_statut_6.000: Bool
socio_eco_statut_7.000: Bool
socio_eco_statut_8.000: Bool
socio_eco_statut_9.000: Bool
socio_eco_statut_10.000: Bool
socio_eco_statut_11.000: Bool
socio_eco_statut_12.000: Bool
socio_eco_statut_13.000: Bool
socio_eco_statut_14.000: Bool
country_Botswana: Bool
country_Kenya: Bool
country_Lesotho: Bool
country_Malawi: Bool
country_Mozambique: Bool
country_Namibia: Bool
country_Seychelles: Bool
country_South_Africa: Bool
country_Swaziland: Bool
country_Tanzania: Bool
country_Uganda: Bool
country_Zambia: Bool
country_Zanzibar: Bool
1425.993367323877537.28957291176210.0230145.0000000000010000000...0010000000000000010000000
2534.329515370892537.28957291176210.0230137.0000100000000000000...0000100000000000010000000
3536.690743411639537.28957291176210.0230128.0000000000000000000...0000010000000000010000000
4569.392927563969537.28957291176210.0230139.0000000000000000000...0000001000000000010000000
5542.037992351316537.28957291176210.02301211.0010000000000000000...0000000010000000010000000
6573.771789981159537.28957291176210.02301211.0010001000000000100...0000000010000000010000000
7589.279157441376537.28957291176210.02301212.0000000000000000000...0000000001000000010000000
8496.740841813343537.28957291176210.02301412.0000000010000000100...0000000001000000010000000
9535.805274812767549.402939414911.02201310.0100010000000000000...0000000100000000100000000
10420.509155247374624.1249022316911.0250133.0100000000000100000...1000000000000000000000001
11504.298378016207624.1249022316911.0250133.0100100000000000000...1000000000000000000000001
12450.768711398358624.1249022316911.0250144.0100100000000000000...0100000000000000000000001
13441.927245450751624.1249022316911.0250145.0100010000000000000...0010000000000000000000001
14469.987298066451624.1249022316911.0250145.0100100000000000000...0010000000000000000000001
15445.259642792635624.1249022316911.0250136.0100000000001000000...0001000000000000000000001
16551.071920864503624.1249022316911.0250136.0100000000000000000...0001000000000000000000001
17473.825234391116624.1249022316911.0250127.0100000000000000000...0000100000000000000000001
18544.095820873929624.1249022316911.0250128.0100000000000000000...0000010000000000000000001
19564.874824604798624.1249022316911.0250148.0100000000000000000...0000010000000000000000001
20386.940445160128569.99368616128311.0230143.0000100000000000000...1000000000000000100000000
21458.501728934408593.1227544839263.0230145.0000000100000000000...0010000000000000100000000
22460.360232184133569.99368616128311.0230157.0100001000000000000...0000100000000000100000000
23457.035551231015593.1227544839263.0230117.0000001000000000000...0000100000000000100000000
24504.282977573042593.1227544839263.0230118.0100000000000000000...0000010000000000100000000
25475.872247772718569.99368616128311.02301610.0000000000000000000...0000000100000000100000000
26503.899980617811584.8865381964643.070152.0100100000000000000...0000000000000000100000000
27480.484906064866584.8865381964643.070153.0100000010000000000...1000000000000000100000000
28516.744491629541641.87699522279820.0250126.0000100000000000000...0001000000000000010000000
29394.65259093503641.87699522279820.0250126.0001000000000100000...0001000000000000010000000
30506.128325450387641.87699522279820.0250117.0000000000000000000...0000100000000000010000000
31440.401203829463641.87699522279820.0250128.0000000000000000000...0000010000000000010000000
32453.010211603314641.87699522279820.0250118.0000100000000000000...0000010000000000010000000
33498.219068060275641.87699522279820.02501111.0000010000000000000...0000000010000000010000000
34428.988211440412656.2255055183548.0230122.0000000000000100000...0000000000000000010000000
35532.255684949772656.2255055183548.0230142.0000100000000000000...0000000000000000010000000
36392.664714363554676.111735395768.0200134.0100000001000000000...0100000000000000000000010
37415.95588691273656.2255055183548.0230134.0000000000000100000...0100000000000000010000000
38479.525184773189656.2255055183548.0230124.0000100000000000000...0100000000000000010000000
39410.328608434232656.2255055183548.0230124.0000010000000000000...0100000000000000010000000
40425.993367323877656.2255055183548.0230124.0000000001000000000...0100000000000000010000000
41409.430742983857656.2255055183548.0230125.0000001000000000000...0010000000000000010000000
42460.605886281468638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
43560.188989035577638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
44378.486087509513638.0083836642316.0240146.0000000000000000000...0001000000000000010000000
45516.831361986859638.0083836642316.0240116.0000000000000000000...0001000000000000010000000
46557.155342162198638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
47631.282997954796638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
48501.677328837489656.2255055183548.0230186.0000000010000000000...0001000000000000010000000
49466.248533790025638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
50594.667991916549638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
51619.020904331767638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
52422.517903579997638.0083836642316.0240126.0000000000000000000...0001000000000000010000000
53514.338009258817638.0083836642316.0240116.0000000000000000000...0001000000000000010000000
54512.766872215611638.0083836642316.0240127.0000000000000000000...0000100000000000010000000
55456.266205420931676.111735395768.0200117.0000000000000000000...0000100000000000000000010
56380.388263378723656.2255055183548.0230138.0000000000000100100...0000010000000000010000000
57421.344721387902676.111735395768.0200118.0000000000000000000...0000010000000000000000010
58486.501284763764638.0083836642316.0240139.0000000000000000000...0000001000000000010000000
59453.357504553945638.0083836642316.0240129.0000000000000000000...0000001000000000010000000
60498.219068060275638.0083836642316.0240119.0000100000000000000...0000001000000000010000000
61514.807687171462638.0083836642316.02401210.0001000000000000000...0000000100000000010000000
62565.065579807459683.21736669488212.02101213.0010100000000000000...0000000000100000000100000
63626.714865963592638.0083836642318.02101114.0100000000000000100...0000000000010000000100000
64457.134154158281683.21736669488212.02101314.0000000000000101000...0000000000010000000100000
65510.499756960558634.2228480160759.0200162.0010010000000000000...0000000000000000000000100
66494.235377715545634.2228480160759.0200172.0000001000000000000...0000000000000000000000100
67365.266644239282617.72252298591.0130132.0001100000000000000...0000000000000000000000010
68541.173490674172645.7158489763986.0490123.0000000000000000000...1000000000000000010000000
69455.36049792936645.7158489763988.0200153.0000000000000100000...1000000000000001000000000
70422.517903579997645.7158489763988.0200125.0001100000000000000...0010000000000001000000000
71398.31103066659651.6292602588881.0210145.0010000000000000000...0010000000000001000000000
72395.120560563494639.8978152952391.0220126.0000000000000000000...0001000000000000010000000
73503.716941803917645.7158489763986.0490126.0000000000000100000...0001000000000000010000000
74427.462485792718623.1590462617388.0180136.0000000000000000000...0001000000000000000000010
75600.91272658553645.7158489763986.0490127.0000000100000000000...0000100000000000010000000
76565.065579807459645.7158489763986.0490127.0000000000000100100...0000100000000000010000000
77540.204883338675645.7158489763986.0490117.0010000010000000000...0000100000000000010000000
78558.474690429794645.7158489763986.0490127.0000000000000000000...0000100000000000010000000
79575.637662232951645.7158489763986.0490137.0001100000000000000...0000100000000000010000000
80439.680053114802645.7158489763988.0200117.0000000000000000000...0000100000000001000000000
81353.773643278958617.72252298591.0130137.0000100000000000000...0000100000000000000000010
82488.376137298698645.7158489763986.0490118.0000010000000000000...0000010000000000010000000
83480.908033418792645.7158489763986.0490128.0010010000000000000...0000010000000000010000000
84535.286927844308645.7158489763986.0490128.0000000000000000000...0000010000000000010000000
85408.529715955022651.6292602588881.0210159.0000100000000000000...0000001000000001000000000
86385.53865385057623.1590462617388.0180129.0000100000000000000...0000001000000000000000010
87528.310827853733645.7158489763986.04901210.0000000000000000000...0000000100000000010000000
88548.67653911654645.7158489763986.04901210.0000000000000000000...0000000100000000010000000
89588.423257247566645.7158489763986.04901310.0000000000000000000...0000000100000000010000000
90583.034422772394645.7158489763986.04901310.0000000000000000000...0000000100000000010000000
91455.36049792936645.7158489763988.02001410.0001000000100000000...0000000100000001000000000
92459.986311593889645.7158489763986.04901311.0000000010000000000...0000000010000000010000000
93594.667991916549645.7158489763986.04901212.0000000000000000000...0000000001000000010000000
94473.481862461084645.7158489763986.04901112.0001000000000000000...0000000001000000010000000
95584.081078168864645.7158489763986.04901213.0010000000000000000...0000000000100000010000000
96504.241138890166645.7158489763986.04901113.0000000000000000000...0000000000100000010000000
97451.694405556509627.0468536316397.0150171.0000000000000000000...0000000000000000100000000
98522.558825103114627.0468536316397.0150121.0100000000000000000...0000000000000000100000000
99422.842681154036627.0468536316397.0150141.0001100000000000000...0000000000000000100000000
100515.308862586361657.5676860573955.0200182.0000010000000000000...0000000000000000100000000
Rows: 1-100 of 50921 | Columns: 99

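Let's look at the correlation between the response column and the predictors. We'll keep the columns whose correlation with the response is strongest, roughly 20% or more in absolute value: the first 12 entries of the ranking, which are the response itself and its 11 most correlated predictors.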

In [50]:
%matplotlib inline
x = africa.corr(focus = "student_score")
africa = africa.select(columns = x["index"][0:12])
display(africa)
student_score: Float
socio_eco_statut: Numeric(7,3)
teacher_score: Float
source_of_lighting_ELECTRIC: Integer
socio_eco_statut_14.000: Integer
means_of_travel_CAR: Integer
repeated_grades_NEVER: Integer
english_at_home_NEVER: Integer
age: Integer
country_Tanzania: Integer
source_of_lighting_CANDLE: Integer
(unnamed): Integer
1425.9933673238775.0537.289572911762000101400
2534.3295153708927.0537.289572911762100101300
3536.6907434116398.0537.289572911762100101200
4569.3929275639699.0537.289572911762100101300
5542.03799235131611.0537.289572911762100101200
6573.77178998115911.0537.289572911762101101200
7589.27915744137612.0537.289572911762100101200
8496.74084181334312.0537.289572911762101101400
9535.80527481276710.0549.40293941491100101300
10420.5091552473743.0624.124902231691000101300
11504.2983780162073.0624.124902231691000101300
12450.7687113983584.0624.124902231691000101400
13441.9272454507515.0624.124902231691000101400
14469.9872980664515.0624.124902231691000101400
15445.2596427926356.0624.124902231691000101300
16551.0719208645036.0624.124902231691000101300
17473.8252343911167.0624.124902231691000101200
18544.0958208739298.0624.124902231691100101200
19564.8748246047988.0624.124902231691000101400
20386.9404451601283.0569.993686161283000101400
21458.5017289344085.0593.122754483926000101401
22460.3602321841337.0569.993686161283100101500
23457.0355512310157.0593.122754483926100101100
24504.2829775730428.0593.122754483926100101100
25475.87224777271810.0569.993686161283100101600
26503.8999806178112.0584.886538196464000101501
27480.4849060648663.0584.886538196464000101501
28516.7444916295416.0641.876995222798000101201
29394.652590935036.0641.876995222798000111200
30506.1283254503877.0641.876995222798000101101
31440.4012038294638.0641.876995222798000101200
32453.0102116033148.0641.876995222798000101100
33498.21906806027511.0641.876995222798000101101
34428.9882114404122.0656.225505518354000101200
35532.2556849497722.0656.225505518354000101401
36392.6647143635544.0676.11173539576000101301
37415.955886912734.0656.225505518354000101301
38479.5251847731894.0656.225505518354000101201
39410.3286084342324.0656.225505518354000101201
40425.9933673238774.0656.225505518354000101200
41409.4307429838575.0656.225505518354100101200
42460.6058862814686.0638.008383664231100101200
43560.1889890355776.0638.008383664231100101200
44378.4860875095136.0638.008383664231100101400
45516.8313619868596.0638.008383664231100101100
46557.1553421621986.0638.008383664231100101200
47631.2829979547966.0638.008383664231100101200
48501.6773288374896.0656.225505518354000101801
49466.2485337900256.0638.008383664231100101200
50594.6679919165496.0638.008383664231100101200
51619.0209043317676.0638.008383664231100101200
52422.5179035799976.0638.008383664231100101200
53514.3380092588176.0638.008383664231100101100
54512.7668722156117.0638.008383664231100101200
55456.2662054209317.0676.11173539576000101101
56380.3882633787238.0656.225505518354001101300
57421.3447213879028.0676.11173539576000101100
58486.5012847637649.0638.008383664231100101300
59453.3575045539459.0638.008383664231100101200
60498.2190680602759.0638.008383664231100101100
61514.80768717146210.0638.008383664231100111200
62565.06557980745913.0683.217366694882100101200
63626.71486596359214.0638.008383664231111101100
64457.13415415828114.0683.217366694882110101300
65510.4997569605582.0634.222848016075000101600
66494.2353777155452.0634.222848016075000101700
67365.2666442392822.0617.7225229859000111300
68541.1734906741723.0645.715848976398100101200
69455.360497929363.0645.715848976398000101500
70422.5179035799975.0645.715848976398000111200
71398.311030666595.0651.629260258888000101400
72395.1205605634946.0639.897815295239000101200
73503.7169418039176.0645.715848976398000101201
74427.4624857927186.0623.159046261738000101301
75600.912726585537.0645.715848976398100101200
76565.0655798074597.0645.715848976398101101200
77540.2048833386757.0645.715848976398100101100
78558.4746904297947.0645.715848976398000101200
79575.6376622329517.0645.715848976398000111301
80439.6800531148027.0645.715848976398000101101
81353.7736432789587.0617.7225229859000101301
82488.3761372986988.0645.715848976398100101100
83480.9080334187928.0645.715848976398100101200
84535.2869278443088.0645.715848976398100101200
85408.5297159550229.0651.629260258888000101500
86385.538653850579.0623.159046261738000101200
87528.31082785373310.0645.715848976398000101200
88548.6765391165410.0645.715848976398100101200
89588.42325724756610.0645.715848976398100101300
90583.03442277239410.0645.715848976398100101300
91455.3604979293610.0645.715848976398100111400
92459.98631159388911.0645.715848976398100101300
93594.66799191654912.0645.715848976398100101200
94473.48186246108412.0645.715848976398100111100
95584.08107816886413.0645.715848976398100101200
96504.24113889016613.0645.715848976398100101100
97451.6944055565091.0627.046853631639000101700
98522.5588251031141.0627.046853631639000101200
99422.8426811540361.0627.046853631639000111400
100515.3088625863612.0657.567686057395000101800
Rows: 1-100 | Columns: 12
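
To make the selection step concrete, here is a minimal sketch in plain Python of ranking predictors by the absolute value of their Pearson correlation with a response column. It does not use VerticaPy; the helper names `pearson` and `rank_by_correlation` and the toy data are hypothetical and only illustrate the idea behind `corr(focus = ...)` followed by a slice of the ranking.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def rank_by_correlation(columns, response):
    """Return (name, r) pairs for every predictor, strongest |r| first."""
    target = columns[response]
    scores = [(name, pearson(values, target))
              for name, values in columns.items() if name != response]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)

# Hypothetical toy data, not taken from the africa_education dataset.
toy = {
    "score":  [1.0, 2.0, 3.0, 4.0, 5.0],
    "status": [2.0, 4.0, 6.0, 8.0, 10.0],   # rises with the score
    "age":    [5.0, 4.0, 3.0, 2.0, 1.5],    # falls as the score rises
    "noise":  [1.0, -1.0, 1.0, -1.0, 1.0],  # unrelated to the score
}
ranking = rank_by_correlation(toy, "score")
top_two = [name for name, _ in ranking[:2]]  # keep the two strongest predictors
```

Sorting on the absolute value matters because a strongly negative predictor (like `age` in the toy data) is as informative for a linear model as a strongly positive one.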

Let's examine the correlation matrix to check whether the predictors are independent of each other.

In [51]:
africa.corr()
Out[51]:
 | "student_score" | "socio_eco_statut" | "teacher_score" | "source_of_lighting_ELECTRIC" | "socio_eco_statut_14.000" | "means_of_travel_CAR" | "repeated_grades_NEVER" | "english_at_home_NEVER" | "age" | "country_Tanzania" | "source_of_lighting_CANDLE"
"student_score" | 1.0 | 0.367578494077163 | 0.277189404656865 | 0.269276087940798 | 0.232857073694549 | 0.23232990042085 | 0.221863062645218 | -0.221448830352243 | -0.215694238695062 | 0.212223906375641 | -0.194832604630241
"socio_eco_statut" | 0.367578494077163 | 1.0 | 0.160875575871585 | 0.731835446489535 | 0.352795513353648 | 0.322932553036259 | 0.163681533395721 | -0.13742263856166 | -0.410125906734765 | -0.171700553969189 | -0.259509579779055
"teacher_score" | 0.277189404656865 | 0.160875575871585 | 1.0 | 0.0998907064067356 | 0.119655864639574 | 0.126735315875308 | 0.0166893126152209 | -0.0281249815391927 | -0.125535236775701 | 0.0198824285793986 | -0.086981719718143
"source_of_lighting_ELECTRIC" | 0.269276087940798 | 0.731835446489535 | 0.0998907064067356 | 1.0 | 0.225565600325586 | 0.239404256832631 | 0.128867996924722 | -0.0955727028568643 | -0.318779142277287 | -0.160647227902258 | -0.403666918392685
"socio_eco_statut_14.000" | 0.232857073694549 | 0.352795513353648 | 0.119655864639574 | 0.225565600325586 | 1.0 | 0.308426178567738 | 0.0843862366971435 | -0.0590325114887168 | -0.132499799644539 | -0.0492194225528473 | -0.0905938365134978
"means_of_travel_CAR" | 0.23232990042085 | 0.322932553036259 | 0.126735315875308 | 0.239404256832631 | 0.308426178567738 | 1.0 | 0.0872814865390086 | -0.061318549517247 | -0.149140611604839 | -0.0598279063632101 | -0.0970330131726927
"repeated_grades_NEVER" | 0.221863062645218 | 0.163681533395721 | 0.0166893126152209 | 0.128867996924722 | 0.0843862366971435 | 0.0872814865390086 | 1.0 | -0.071048098664226 | -0.286750817141117 | 0.109225703279925 | -0.102172122741238
"english_at_home_NEVER" | -0.221448830352243 | -0.13742263856166 | -0.0281249815391927 | -0.0955727028568643 | -0.0590325114887168 | -0.061318549517247 | -0.071048098664226 | 1.0 | 0.0230635800335197 | -0.083897953683188 | 0.0887804299200322
"age" | -0.215694238695062 | -0.410125906734765 | -0.125535236775701 | -0.318779142277287 | -0.132499799644539 | -0.149140611604839 | -0.286750817141117 | 0.0230635800335197 | 1.0 | 0.139253623535263 | 0.108089618791283
"country_Tanzania" | 0.212223906375641 | -0.171700553969189 | 0.0198824285793986 | -0.160647227902258 | -0.0492194225528473 | -0.0598279063632101 | 0.109225703279925 | -0.083897953683188 | 0.139253623535263 | 1.0 | -0.128880144361435
"source_of_lighting_CANDLE" | -0.194832604630241 | -0.259509579779055 | -0.086981719718143 | -0.403666918392685 | -0.0905938365134978 | -0.0970330131726927 | -0.102172122741238 | 0.0887804299200322 | 0.108089618791283 | -0.128880144361435 | 1.0
(unnamed) | 0.193205735139521 | 0.328249413593304 | 0.0862600953566919 | 0.187491605410671 | 0.235158998086233 | 0.164691705478487 | 0.0734769317031019 | -0.0630544127583362 | -0.122469827224598 | -0.0437059741299854 | -0.0728434882428964
Rows: 1-12 | Columns: 13
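
A correlation matrix like the one above can also be scanned programmatically for predictor pairs that are strongly correlated with each other. The sketch below is plain Python rather than VerticaPy; the helper `collinear_pairs` and the 0.7 threshold are assumptions chosen for illustration, and the small matrix reuses four of the coefficients above rounded to three decimals.

```python
def collinear_pairs(names, matrix, response, threshold=0.7):
    """List predictor pairs whose absolute correlation exceeds the threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            # Correlation with the response is desirable, so skip those pairs.
            if response in (names[i], names[j]):
                continue
            if abs(matrix[i][j]) > threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    return pairs

names = ["student_score", "socio_eco_statut",
         "teacher_score", "source_of_lighting_ELECTRIC"]
# Coefficients copied from the matrix above, rounded to three decimals.
matrix = [
    [1.000, 0.368, 0.277, 0.269],
    [0.368, 1.000, 0.161, 0.732],
    [0.277, 0.161, 1.000, 0.100],
    [0.269, 0.732, 0.100, 1.000],
]
flagged = collinear_pairs(names, matrix, response="student_score")
```

With this threshold, the only flagged pair is socioeconomic status and electric lighting. Which column of a flagged pair to drop is a judgment call; keeping the one more correlated with the response is a common choice.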

Some of these features are highly correlated with each other, such as socioeconomic status and having electric lighting (a correlation of about 0.73). We'll drop the lighting column to avoid unexpected results while computing the linear regression.

In [52]:
africa["source_of_lighting_ELECTRIC"].drop()
Out[52]:
student_score: Float
socio_eco_statut: Numeric(7,3)
teacher_score: Float
socio_eco_statut_14.000: Integer
means_of_travel_CAR: Integer
repeated_grades_NEVER: Integer
english_at_home_NEVER: Integer
age: Integer
country_Tanzania: Integer
source_of_lighting_CANDLE: Integer
(unnamed): Integer
1425.9933673238775.0537.28957291176200101400
2534.3295153708927.0537.28957291176200101300
3536.6907434116398.0537.28957291176200101200
4569.3929275639699.0537.28957291176200101300
5542.03799235131611.0537.28957291176200101200
6573.77178998115911.0537.28957291176201101200
7589.27915744137612.0537.28957291176200101200
8496.74084181334312.0537.28957291176201101400
9535.80527481276710.0549.4029394149100101300
10420.5091552473743.0624.12490223169100101300
11504.2983780162073.0624.12490223169100101300
12450.7687113983584.0624.12490223169100101400
13441.9272454507515.0624.12490223169100101400
14469.9872980664515.0624.12490223169100101400
15445.2596427926356.0624.12490223169100101300
16551.0719208645036.0624.12490223169100101300
17473.8252343911167.0624.12490223169100101200
18544.0958208739298.0624.12490223169100101200
19564.8748246047988.0624.12490223169100101400
20386.9404451601283.0569.99368616128300101400
21458.5017289344085.0593.12275448392600101401
22460.3602321841337.0569.99368616128300101500
23457.0355512310157.0593.12275448392600101100
24504.2829775730428.0593.12275448392600101100
25475.87224777271810.0569.99368616128300101600
26503.8999806178112.0584.88653819646400101501
27480.4849060648663.0584.88653819646400101501
28516.7444916295416.0641.87699522279800101201
29394.652590935036.0641.87699522279800111200
30506.1283254503877.0641.87699522279800101101
31440.4012038294638.0641.87699522279800101200
32453.0102116033148.0641.87699522279800101100
33498.21906806027511.0641.87699522279800101101
34428.9882114404122.0656.22550551835400101200
35532.2556849497722.0656.22550551835400101401
36392.6647143635544.0676.1117353957600101301
37415.955886912734.0656.22550551835400101301
38479.5251847731894.0656.22550551835400101201
39410.3286084342324.0656.22550551835400101201
40425.9933673238774.0656.22550551835400101200
41409.4307429838575.0656.22550551835400101200
42460.6058862814686.0638.00838366423100101200
43560.1889890355776.0638.00838366423100101200
44378.4860875095136.0638.00838366423100101400
45516.8313619868596.0638.00838366423100101100
46557.1553421621986.0638.00838366423100101200
47631.2829979547966.0638.00838366423100101200
48501.6773288374896.0656.22550551835400101801
49466.2485337900256.0638.00838366423100101200
50594.6679919165496.0638.00838366423100101200
51619.0209043317676.0638.00838366423100101200
52422.5179035799976.0638.00838366423100101200
53514.3380092588176.0638.00838366423100101100
54512.7668722156117.0638.00838366423100101200
55456.2662054209317.0676.1117353957600101101
56380.3882633787238.0656.22550551835401101300
57421.3447213879028.0676.1117353957600101100
58486.5012847637649.0638.00838366423100101300
59453.3575045539459.0638.00838366423100101200
60498.2190680602759.0638.00838366423100101100
61514.80768717146210.0638.00838366423100111200
62565.06557980745913.0683.21736669488200101200
63626.71486596359214.0638.00838366423111101100
64457.13415415828114.0683.21736669488210101300
65510.4997569605582.0634.22284801607500101600
66494.2353777155452.0634.22284801607500101700
67365.2666442392822.0617.722522985900111300
68541.1734906741723.0645.71584897639800101200
69455.360497929363.0645.71584897639800101500
70422.5179035799975.0645.71584897639800111200
71398.311030666595.0651.62926025888800101400
72395.1205605634946.0639.89781529523900101200
73503.7169418039176.0645.71584897639800101201
74427.4624857927186.0623.15904626173800101301
75600.912726585537.0645.71584897639800101200
76565.0655798074597.0645.71584897639801101200
77540.2048833386757.0645.71584897639800101100
78558.4746904297947.0645.71584897639800101200
79575.6376622329517.0645.71584897639800111301
80439.6800531148027.0645.71584897639800101101
81353.7736432789587.0617.722522985900101301
82488.3761372986988.0645.71584897639800101100
83480.9080334187928.0645.71584897639800101200
84535.2869278443088.0645.71584897639800101200
85408.5297159550229.0651.62926025888800101500
86385.538653850579.0623.15904626173800101200
87528.31082785373310.0645.71584897639800101200
88548.6765391165410.0645.71584897639800101200
89588.42325724756610.0645.71584897639800101300
90583.03442277239410.0645.71584897639800101300
91455.3604979293610.0645.71584897639800111400
92459.98631159388911.0645.71584897639800101300
93594.66799191654912.0645.71584897639800101200
94473.48186246108412.0645.71584897639800111100
95584.08107816886413.0645.71584897639800101200
96504.24113889016613.0645.71584897639800101100
97451.6944055565091.0627.04685363163900101700
98522.5588251031141.0627.04685363163900101200
99422.8426811540361.0627.04685363163900111400
100515.3088625863612.0657.56768605739500101800
Rows: 1-100 | Columns: 11
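
The column drop above relies on spotting strongly correlated predictor pairs in the correlation matrix. As a minimal sketch of that screening step, the plain-Python snippet below computes Pearson correlations and lists every pair whose absolute correlation reaches a threshold. It does not use the VerticaPy API: the helper names `pearson` and `correlated_pairs`, the 0.7 threshold, and the toy values are all illustrative assumptions, not data from 'africa_education'.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no defined correlation
    return cov / sqrt(var_x * var_y)


def correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| >= threshold."""
    pairs = []
    for (name_a, a), (name_b, b) in combinations(columns.items(), 2):
        r = pearson(a, b)
        if abs(r) >= threshold:
            pairs.append((name_a, name_b, r))
    return pairs


# Toy data (made up for illustration only).
toy = {
    "socio_eco_statut": [5, 7, 8, 9, 11, 12],
    "lighting_electric": [0, 0, 1, 1, 1, 1],
    "age": [14, 13, 12, 13, 12, 14],
}

pairs = correlated_pairs(toy, threshold=0.7)
for name_a, name_b, r in pairs:
    print(f"{name_a} ~ {name_b}: r = {r:.3f}")
```

Each reported pair is a candidate for pruning: keep one of the two columns and drop the other, as done with the electric lighting column above.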

Let's normalize the predictors (every column except the response, student_score) as part of preparing the data under the Gauss-Markov assumptions.
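
As a rough illustration of what this step does to a single column, here is a minimal sketch that assumes z-score scaling (subtract the mean, divide by the standard deviation). The `zscore` helper and the sample ages are hypothetical, and the population standard deviation is an assumption made here; the method and standard deviation convention used by `normalize` in your VerticaPy version may differ.

```python
from statistics import fmean, pstdev


def zscore(values):
    """Return (v - mean) / std for each value; a constant column maps to 0.0."""
    mu = fmean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


# Hypothetical sample of ages, for illustration only.
ages = [14, 13, 12, 13, 12, 12, 12, 14]
scaled_ages = zscore(ages)
print([round(z, 3) for z in scaled_ages])
```

After scaling, the column has mean 0 and standard deviation 1, so coefficients fitted on different predictors are expressed on a comparable scale.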

In [53]:
africa.normalize(columns = africa.get_columns(exclude_columns = ["student_score"]))
Out[53]:
row | student_score | socio_eco_statut | teacher_score | socio_eco_statut_14.000 | means_of_travel_CAR | repeated_grades_NEVER | english_at_home_NEVER | age | country_Tanzania | source_of_lighting_CANDLE | (unnamed)
type | Float | Float | Float | Float | Float | Float | Float | Float | Float | Float | Float
1 | 425.993367323877 | -0.6227716354899414 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
2 | 534.329515370892 | -0.036979420570280655 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
3 | 536.690743411639 | 0.2559166868895497 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
4 | 569.392927563969 | 0.5488127943493801 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
5 | 542.037992351316 | 1.134605009269041 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
6 | 573.771789981159 | 1.134605009269041 | -3.18132114374191 | -0.17522959993794446 | 4.236805643089843 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
7 | 589.279157441376 | 1.4275011167288714 | -3.18132114374191 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
8 | 496.740841813343 | 1.4275011167288714 | -3.18132114374191 | -0.17522959993794446 | 4.236805643089843 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
9 | 535.805274812767 | 0.8417089018092105 | -3.01529274646835 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
10 | 420.509155247374 | -1.2085638504096021 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
11 | 504.298378016207 | -1.2085638504096021 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
12 | 450.768711398358 | -0.9156677429497718 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
13 | 441.927245450751 | -0.6227716354899414 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
14 | 469.987298066451 | -0.6227716354899414 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
15 | 445.259642792635 | -0.32987552803011105 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
16 | 551.071920864503 | -0.32987552803011105 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | -0.521812307163167 |
17 | 473.825234391116 | -0.036979420570280655 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
18 | 544.095820873929 | 0.2559166868895497 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
19 | 564.874824604798 | 0.2559166868895497 | -1.99113751064845 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
20 | 386.940445160128 | -1.2085638504096021 | -2.7330715588177 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
21 | 458.501728934408 | -0.6227716354899414 | -2.41605959166216 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | 1.9163602467967908 |
22 | 460.360232184133 | -0.036979420570280655 | -2.7330715588177 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.9915654737489674 | -0.2858684831209496 | -0.521812307163167 |
23 | 457.035551231015 | -0.036979420570280655 | -2.41605959166216 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | -0.521812307163167 |
24 | 504.282977573042 | 0.2559166868895497 | -2.41605959166216 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | -0.521812307163167 |
25 | 475.872247772718 | 0.8417089018092105 | -2.7330715588177 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 1.5912626705988007 | -0.2858684831209496 | -0.521812307163167 |
26 | 503.899980617811 | -1.5014599578694325 | -2.5289469371784 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.9915654737489674 | -0.2858684831209496 | 1.9163602467967908 |
27 | 480.484906064866 | -1.2085638504096021 | -2.5289469371784 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.9915654737489674 | -0.2858684831209496 | 1.9163602467967908 |
28 | 516.744491629541 | -0.32987552803011105 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | 1.9163602467967908 |
29 | 394.65259093503 | -0.32987552803011105 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | 2.05125048598223 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
30 | 506.128325450387 | -0.036979420570280655 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | 1.9163602467967908 |
31 | 440.401203829463 | 0.2559166868895497 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
32 | 453.010211603314 | 0.2559166868895497 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | -0.521812307163167 |
33 | 498.219068060275 | 1.134605009269041 | -1.74782351971136 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | 1.9163602467967908 |
34 | 428.988211440412 | -1.5014599578694325 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
35 | 532.255684949772 | -1.5014599578694325 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | 1.9163602467967908 |
36 | 392.664714363554 | -0.9156677429497718 | -1.27859483415013 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | 1.9163602467967908 |
37 | 415.95588691273 | -0.9156677429497718 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.20782891995069913 | -0.2858684831209496 | 1.9163602467967908 |
38 | 479.525184773189 | -0.9156677429497718 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | 1.9163602467967908 |
39 | 410.328608434232 | -0.9156677429497718 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | 1.9163602467967908 |
40 | 425.993367323877 | -0.9156677429497718 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
41 | 409.430742983857 | -0.6227716354899414 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
42 | 460.605886281468 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
43 | 560.188989035577 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
44 | 378.486087509513 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 0.3918682768991341 | -0.2858684831209496 | -0.521812307163167 |
45 | 516.831361986859 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | -0.521812307163167 |
46 | 557.155342162198 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
47 | 631.282997954796 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
48 | 501.677328837489 | -0.32987552803011105 | -1.5511597626117 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | 2.7906570642984674 | -0.2858684831209496 | 1.9163602467967908 |
49 | 466.248533790025 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
50 | 594.667991916549 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
51 | 619.020904331767 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
52 | 422.517903579997 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324 | -0.2858684831209496 | -0.521812307163167 |
53 | 514.338009258817 | -0.32987552803011105 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -1.4072233136503656 | -0.2858684831209496 | -0.521812307163167 |
54 | 512.766872215611 | -0.036979420570280655 | -1.80084753880431 | -0.17522959993794446 | -0.23602224080487655 | 0.8041706293603288 | -0.48749792800559544 | -0.8075261168005324