Write Simple Functions, Then Use Them!
I hate it when students/colleagues copy-paste their code with small modifications instead of using functions or classes.
Often in our code, it may be necessary to do the same thing multiple times. Without functions, the code can become unreadable and the risk of making errors is high. Take as an example the calculation of daily water use for a residential house as a function of time. You would also like to consider how this demand changes throughout the day depending on who is living in the house, specifically the number and ages of the residents. The following example illustrates how this is commonly evaluated when you don’t think in advance about your code structure or the use of functions.
Note that in reality each line described below may be multiple lines in the ‘real’ file.
1 calculate water use for 1 adult resident
2 plot result
3 copy of line 1: modified to consider teenager
4 copy of line 1: modified to consider child
5 copy of line 2: plot redrawn with 3 lines
6 copy lines 1, 3, 4 to sum water use for several users
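As a rough illustration, the copy-paste outline above might look like this in Python. The variable names, base rates, and peak hours are all hypothetical placeholders chosen for this sketch, and the plotting steps are left out to keep it short. Only the repeated structure matters.

```python
# Hypothetical numbers throughout; only the repeated structure matters.
hours = range(24)

# Line 1: water use for 1 adult resident (litres per hour)
adult_use = []
for hour in hours:
    if 6 <= hour < 9 or 18 <= hour < 22:  # assumed morning and evening peaks
        adult_use.append(12.0 * 1.5)
    else:
        adult_use.append(12.0 * 0.5)

# Line 3: copy of the block above, modified for a teenager
teen_use = []
for hour in hours:
    if 6 <= hour < 9 or 18 <= hour < 22:
        teen_use.append(10.0 * 1.5)
    else:
        teen_use.append(10.0 * 0.5)

# Line 4: copy again, modified for a child
child_use = []
for hour in hours:
    if 6 <= hour < 9 or 18 <= hour < 22:
        child_use.append(6.0 * 1.5)
    else:
        child_use.append(6.0 * 0.5)

# Line 6: the copies combined by hand to sum several users
total_use = [a + t + c for a, t, c in zip(adult_use, teen_use, child_use)]
```

If the peak hours turn out to be wrong, the same condition has to be corrected in three separate places, and it is easy to miss one.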
Here are a few examples of how this style of code-writing can go wrong (numbers correspond to the line in the example):
The code calculating water use as a function of time of day (1) may be wrong. If so, the mistake is repeated in every copy of that code (3, 4, 6). Extracting that code into a separate function and calling it multiple times means a fix only has to be made in one place.
The code becomes too long. Because the water use calculation (1) and the plotting code (2) each span multiple lines in reality, every copy (3, 4, 5, 6) makes the code grow significantly. Using separate functions reduces this duplication.
Functions (with descriptive names!) can increase code readability. Suppose you have water use calculations for different types of water users inside your code (e.g., industrial, commercial, residential). Making new functions for each will reduce the complexity of your code because the logic behind them will be placed somewhere else.
Restructured with functions, the same analysis can be outlined as follows:
1 define function for water use with resident as input
2 define function to plot water use (uses function on line 1)
3 plot water use for one resident (uses line 2)
4 plot water use for multiple residents (uses line 2)
5 sum water use for several users (uses line 1)
It is clear that lines 1 and 2 are used repeatedly, not duplicated. If we had written out these steps in full, rather than outlining them with pseudocode, the total number of lines would be significantly smaller than in the copy-paste version, because the steps to calculate water use (1) are written only once. In addition, the instructions for formatting the figure can be included in the plotting function (2).
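A minimal sketch of the function-based outline, under stated assumptions: the names `BASE_USE`, `peak_factor`, `water_use`, and `total_water_use` are hypothetical, and so are all the numbers. The plotting function (2) is omitted to keep the sketch free of extra packages. It would call `water_use` in the same way that `total_water_use` does.

```python
# All values are hypothetical placeholders (litres per hour).
BASE_USE = {"adult": 12.0, "teenager": 10.0, "child": 6.0}


def peak_factor(hour):
    """Return a multiplier for the time of day (assumed morning/evening peaks)."""
    if 6 <= hour < 9 or 18 <= hour < 22:
        return 1.5
    return 0.5


def water_use(resident, hours=range(24)):
    """Return hourly water use (litres) for one resident type."""
    return [BASE_USE[resident] * peak_factor(hour) for hour in hours]


def total_water_use(residents, hours=range(24)):
    """Sum hourly water use over a list of resident types."""
    profiles = [water_use(resident, hours) for resident in residents]
    return [sum(values) for values in zip(*profiles)]


# The calculation is written once and reused for every resident.
household = ["adult", "adult", "teenager", "child"]
total = total_water_use(household)
print(f"Use at 07:00: {total[7]:.1f} L/h")  # prints "Use at 07:00: 60.0 L/h"
```

With this structure, correcting the peak hours means editing `peak_factor` only, and every resident type picks up the fix.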
This illustrates the concept of modularity: using functions to decompose the code into smaller pieces that minimize repetition and the chance of introducing errors. Modules and modular code are important general programming concepts, but also have particular significance for the Python language. This topic will be explained in a later workshop, so for now we encourage you to use functions and modularise your code. Simply take a few moments to 1) think about which part(s) of your analysis can be defined in a function, and 2) write those functions at the top of your notebook or script (after importing your packages), then use them below. Hence the name of this rule: write simple functions, then use them. It looks like this:
import Python packages
define your new functions
use your new functions
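The three-part layout might look like this in a small script. This is only a sketch: the function names `daily_total` and `summarise` and the hourly data are made up for illustration, and the standard-library `statistics` module stands in for whatever packages your own analysis needs.

```python
# 1. Import Python packages
import statistics


# 2. Define your new functions
def daily_total(hourly_use):
    """Return total daily water use (litres) from hourly values."""
    return sum(hourly_use)


def summarise(hourly_use):
    """Return the daily total and the mean hourly use."""
    return daily_total(hourly_use), statistics.mean(hourly_use)


# 3. Use your new functions
# Hypothetical 24-hour profile with morning and evening peaks (litres per hour)
hourly_use = [6.0] * 6 + [18.0] * 3 + [6.0] * 9 + [18.0] * 4 + [6.0] * 2
total, mean = summarise(hourly_use)
print(f"Total: {total:.1f} L, mean: {mean:.2f} L/h")
```

Keeping the definitions together near the top means a reader can find every function in one place before reaching the analysis that uses them.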
Would you like to learn more about this? In more general programming or computer science fields, modularity is a result of applying the concepts of decomposition and abstraction. See Guttag (2021).
The next version of this document will also discuss hardcoding, and why it is better to pass fixed values into your function as arguments to keep the functions simple and easy to use.