Writing classes
Overview
Teaching: 30 min
Exercises: 20 min
Questions
How are classes written in Python?
What do methods look like?
What is a constructor?
How can a class customise how its instances are constructed?
Objectives
Write classes from scratch
Write methods for classes
Write custom __init__ methods
In the previous section, we’ve seen how objects can have different behaviour, provided by methods.
In this episode we will learn how to write our own classes.
Let’s assume that we want to write a linear regression class, which should behave similarly to sklearn.linear_model.LinearRegression. That is, the class should have fit and predict methods. Here is a possible implementation:
import numpy

class MyLinearRegression:

    def __init__(self):
        """Constructor"""
        self.coef_ = []

    def fit(self, X, y):
        """Fit the linear model"""
        Xa = numpy.array(X)
        Xt = Xa.transpose()
        self.coef_ = numpy.linalg.inv(Xt @ Xa) @ Xt @ y

    def predict(self, X):
        """Predict using the linear model"""
        return X @ self.coef_
First we define the class with the class keyword. The name of the class is MyLinearRegression. The class has one attribute, self.coef_, which holds the least-squares coefficients. Initially, we don’t know how many coefficients there will be, so we initialise self.coef_ to an empty list.
Next we implement two methods, fit and predict, which take the same arguments as the corresponding methods in sklearn.linear_model.LinearRegression. The fit method does not return anything; it computes the least-squares coefficients and stores them in self.coef_. The predict method takes data values and computes the “best” estimates for those values.
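The expression used in fit is the normal-equations solution of the least-squares problem, inv(Xt X) Xt y. As a rough sanity check, the sketch below compares it with numpy.linalg.lstsq on a small data set. The data values are made up for illustration and are not part of the lesson.

```python
import numpy

# Made-up data: one feature, no intercept, y = 2 * x exactly
X = numpy.array([[1.0], [2.0], [3.0]])
y = numpy.array([2.0, 4.0, 6.0])

# Normal equations, written the same way as in MyLinearRegression.fit
Xt = X.transpose()
coef_normal = numpy.linalg.inv(Xt @ X) @ Xt @ y

# Reference solution from NumPy's least-squares solver
coef_lstsq, residuals, rank, singular_values = numpy.linalg.lstsq(X, y, rcond=None)

print(coef_normal)
print(coef_lstsq)
print(numpy.allclose(coef_normal, coef_lstsq))
```

Both approaches should agree on well-conditioned data such as this; explicitly inverting a matrix can lose accuracy when the columns of X are nearly collinear.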
You may wonder what the __init__ method does. This is a special method that is called when an object is constructed (__init__ is called the constructor). In this case __init__ does not take any arguments besides self, but it can, like any other function.
All the methods are contained within the class definition and take self as their first argument. self refers to the object itself. Thus, self.coef_ is the coef_ attribute attached to a particular instance.
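To make the role of self and of constructor arguments more concrete, here is a minimal sketch using a hypothetical Counter class (the class and its names are invented for illustration and are unrelated to the regression code). It shows an __init__ that takes an argument, and two instances that each keep their own attribute value.

```python
class Counter:
    """Hypothetical example class, not part of the lesson's regression code."""

    def __init__(self, start=0):
        # self is the instance being constructed; start is an ordinary argument
        self.count = start

    def increment(self):
        # self.count belongs to the instance the method was called on
        self.count += 1


a = Counter()          # uses the default, start=0
b = Counter(start=10)  # passes an argument to __init__

a.increment()
a.increment()
b.increment()
print(a.count, b.count)

# a.increment() is shorthand for calling the function through the class
# and passing the instance explicitly as self
Counter.increment(a)
print(a.count)
```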
Pronouncing __init__
The method name __init__ is most often pronounced “dunder init”, where “dunder” is short for “double underscore”, since the name starts and ends with two underscores.
We’ll encounter more methods with “dunder” in the name in a later episode.
Other names than self
While it is possible to use any variable name for the first argument of a method, and Python will not complain, other programmers will. Since one aim when programming is to be as clear as possible to others who may read the program later, we strongly recommend following the convention of calling the first argument to methods self.
Naming classes
Another convention in Python is that class names start with a capital letter, and instead of underscores, initial letters of subsequent words are also capitalised. This makes it easier to distinguish classes from objects and other variables at a glance.
This is how you could use the class:
mymodel2 = MyLinearRegression()
mymodel2.fit(X=[[1,],[2,]], y=[1, 2])
ypred = mymodel2.predict(X=[[1.2,],[1.8,], [2.2,]])
Note that our new class is used in the same way as sklearn.linear_model.LinearRegression, which we used in the previous episode. We could use MyLinearRegression in place of sklearn.linear_model.LinearRegression in our scripts. The changes would be minimal because both classes mostly conform to the same application programming interface (API).
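One way to picture what a shared interface buys us is a function that only relies on fit and predict. The sketch below uses a hypothetical helper, fit_and_predict, which is not part of the lesson or of scikit-learn. The class is repeated from above so the example runs on its own, with one small change: predict converts X to an array first.

```python
import numpy


class MyLinearRegression:
    """Same class as above, except that predict converts X to an array."""

    def __init__(self):
        self.coef_ = []

    def fit(self, X, y):
        Xa = numpy.array(X)
        Xt = Xa.transpose()
        self.coef_ = numpy.linalg.inv(Xt @ Xa) @ Xt @ y

    def predict(self, X):
        return numpy.asarray(X) @ self.coef_


def fit_and_predict(model, X_train, y_train, X_new):
    """Hypothetical helper: works with any object that has fit and predict."""
    model.fit(X_train, y_train)
    return model.predict(X_new)


ypred = fit_and_predict(
    MyLinearRegression(),
    X_train=[[1,], [2,]],
    y_train=[1, 2],
    X_new=[[1.2,], [1.8,], [2.2,]],
)
print(ypred)
```

Assuming scikit-learn is installed, passing sklearn.linear_model.LinearRegression() as the model should work with the same helper. Keep in mind that the scikit-learn class fits an intercept by default, so the two models will not give identical predictions on every data set.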
Problem
Add a member intercept_ to class MyLinearRegression and initialise this member to zero.
Solution
import numpy

class MyLinearRegression:

    def __init__(self):
        """Constructor"""
        self.coef_ = []
        self.intercept_ = 0.0

    def fit(self, X, y):
        """Fit the linear model"""
        Xa = numpy.array(X)
        Xt = Xa.transpose()
        self.coef_ = numpy.linalg.inv(Xt @ Xa) @ Xt @ y

    def predict(self, X):
        """Predict using the linear model"""
        return X @ self.coef_
Using inheritance to simplify MyLinearRegression
Perhaps we like class sklearn.linear_model.LinearRegression but would like to change just one method, fit, because we think ours is best. Do we really need to rewrite the entire class from scratch?
How about we derive our class from sklearn.linear_model.LinearRegression?
import numpy
from sklearn.linear_model import LinearRegression

class MyLinearRegression2(LinearRegression):

    def fit(self, X, y):
        """Fit the linear model"""
        Xa = numpy.array(X)
        Xt = Xa.transpose()
        self.coef_ = numpy.linalg.inv(Xt @ Xa) @ Xt @ y
        self.intercept_ = 0.
Note the class statement, followed by the class name and, in parentheses, the parent class (LinearRegression in this case). If you invoke a method on an instance, the interpreter first looks for the implementation of the method in the instance’s class. If it cannot find the method there, it looks in the parent class (or the parent of the parent, if the parent is also a derived class).
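The lookup order described above can be seen with two small hypothetical classes, Parent and Child (invented for illustration). This is only a sketch of single inheritance; it says nothing about how scikit-learn organises its own classes.

```python
class Parent:
    def greet(self):
        return "hello from Parent"

    def describe(self):
        return "described by Parent"


class Child(Parent):
    # Only greet is redefined; describe is not mentioned here
    def greet(self):
        return "hello from Child"


c = Child()
print(c.greet())     # found in Child
print(c.describe())  # not in Child, so found in Parent

# The order in which classes are searched for a method
print([cls.__name__ for cls in Child.__mro__])
```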
That’s great news because we don’t have to implement predict as this method will be found in class LinearRegression. This can save a lot of coding. Shorter code generally means fewer bugs.
The constructor of the parent class (__init__) is called automatically unless it is overridden in the child class. If we override it in the child class, we typically also want to call the parent’s constructor. This is achieved with super(). How to use super() is detailed in the next episode. In our example, we could have
import numpy
from sklearn.linear_model import LinearRegression

class MyLinearRegression3(LinearRegression):

    def __init__(self):
        # call the parent's constructor
        super().__init__(fit_intercept=False)
        # assume no intercept
        self.intercept_ = 0.

    def fit(self, X, y):
        """Fit the linear model"""
        Xa = numpy.array(X)
        Xt = Xa.transpose()
        self.coef_ = numpy.linalg.inv(Xt @ Xa) @ Xt @ y
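The difference between inheriting a constructor, overriding it, and overriding it while still calling the parent’s version can be sketched without scikit-learn. The classes below (Base, KeepsConstructor, ForgetsParent, CallsParent) are hypothetical and exist only for this illustration.

```python
class Base:
    def __init__(self, verbose=True):
        self.verbose = verbose


class KeepsConstructor(Base):
    # No __init__ here, so Base.__init__ runs when an instance is created
    pass


class ForgetsParent(Base):
    def __init__(self):
        # Base.__init__ is NOT called, so self.verbose is never set
        self.extra = 1


class CallsParent(Base):
    def __init__(self):
        # Explicitly run the parent's constructor first
        super().__init__(verbose=False)
        self.extra = 1


print(hasattr(KeepsConstructor(), "verbose"))
print(hasattr(ForgetsParent(), "verbose"))
print(CallsParent().verbose)
```

In the second case the attribute set by the parent is missing, which is the kind of problem that calling super().__init__() is meant to avoid.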
Problem
Class sklearn.linear_model.LinearRegression has additional members and methods. One such method is score. Check that instances of MyLinearRegression3 can call score.
Solution
m = MyLinearRegression3()
m.fit(X=[[1,],[2,]], y=[1, 2])
m.score(X=[[1.2,],[1.8,], [2.2,]], y=[1.2, 1.8, 2.2])
Key Points
Classes in Python are blocks started with the class keyword
Method definitions look like functions, but must take a self argument
The __init__ method is called when instances are constructed
Spend less time coding and more time at the beach with class inheritance