3.3. Beyond the Basics: Working with Files#
A lot of the work you’ll do in Python will have the following structure:
- Read data from a file
- Perform computations on the data
- Visualize the results and/or save the results to a file
So far, we have only learned about computations, so let’s learn a bit about how to manage files. Opening and saving files is usually done with the help of modules, which you will learn about in more detail in Notebooks 4 and 6. What we’ll discuss here is how to manage file paths.
3.3.1. File paths#
To learn how to use files we need to learn how file paths in computers work. If you are tech-savvy and know how file paths work you can skip this part.
File paths in computers work like a tree. They start at the root directory, which on Windows is often the C: drive, the hard drive that stores your Operating System. From the C: drive you can navigate into other directories. On Windows, the directory names in a path are separated by the \ character; most other Operating Systems, such as macOS and Linux, use the / delimiter instead.
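The tree structure and the two delimiters can be inspected with the pathlib module from the standard library. The sketch below is only an illustration: both paths are made-up examples (CurrentUser is a placeholder, not a real account), and the "pure" path classes never touch the disk, so the code behaves the same on any Operating System.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical example paths; CurrentUser is a placeholder username
win_path = PureWindowsPath(r'C:\Users\CurrentUser\Desktop')
posix_path = PurePosixPath('/home/currentuser/Desktop')

# .parts splits a path into the branches of the tree, starting at the root
print(win_path.parts)
print(posix_path.parts)

# .parent moves one level up the tree
print(win_path.parent)
print(posix_path.parent)
```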
If a file is in the folder Users, which is stored on the C: drive, the path to that folder is C:\Users. Paths like this, which start from the root directory, are called absolute paths. This particular path is valid on most computers that run Windows, but other Operating Systems have different folder setups. This is why it is useful to use relative paths. Relative paths do not start from the root directory. Instead, they start from the directory you are currently in. By default, Jupyter Notebooks are stored in C:\Users\CurrentUser (where CurrentUser is your Windows username). To move into a directory using a relative path, for example to the Desktop folder, you would write .\Desktop. To move back one directory using a relative path, you would type .. (two dots).
os.listdir() and os.listdir('./') both list all the entries in your current directory, while os.listdir('../') lists all the entries one level up, in the parent directory.
Note
We use / as the delimiter because it works on Windows, macOS and Linux, whereas \ won’t work on macOS.
import os
print(os.listdir())
print(os.listdir('./'))
print(os.listdir('../'))
['01.ipynb']
['01.ipynb']
['Exercises', 'In_a_Nutshell', 'Theory']
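A relative path is always interpreted from the current working directory. As a small sketch (the printed paths depend on where you run it), os.getcwd and os.path.abspath can be used to see which absolute path a relative path stands for:

```python
import os

# The directory that relative paths start from
current_dir = os.getcwd()
print(current_dir)

# os.path.abspath turns a relative path into an absolute one
here = os.path.abspath('./')
one_level_up = os.path.abspath('../')
print(here)
print(one_level_up)

# os.path.isabs tells absolute and relative paths apart
print(os.path.isabs('../'))   # False: this path is relative
print(os.path.isabs(here))    # True: this path is absolute
```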
Warning
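Keep in mind that, in Python, a file path you type out must be written as a string, i.e. inside quotes! Most file functions also accept pathlib.Path objects, such as the Path used in the next section.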
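Because paths are strings, a \ inside a Windows path can accidentally be read as the start of an escape character (for example \n for a new line). The following sketch, which uses a made-up folder name, shows the problem and two common ways around it:

```python
# Hypothetical Windows path; new_folder is a made-up name
plain = 'C:\new_folder'      # \n is read as a newline character
raw = r'C:\new_folder'       # raw string: backslashes are kept as typed
escaped = 'C:\\new_folder'   # \\ stands for one literal backslash

print(repr(plain))
print(repr(raw))
print(raw == escaped)   # True
print('\n' in plain)    # True: the path was silently changed
```

Using / as the delimiter avoids the issue altogether.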
3.3.2. pathlib and os modules#
These modules are very useful for managing and navigating file paths. The function os.path.expanduser('~'), from the os module, returns your home directory (your user folder), independent of your Operating System. Run the cell below to see it.
from pathlib import Path
import os
root_path = os.path.expanduser('~')
print(root_path)
C:\Users\mmendozalugo
The path shown above is the absolute path to your home directory (by default, this is also the directory in which your Jupyter Notebooks are stored).
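The pathlib module imported above offers a similar call, Path.home(), which returns the user's home directory as a Path object instead of a string. A minimal comparison, assuming a standard Python 3 installation where the home directory can be determined:

```python
import os
from pathlib import Path

# Two ways of asking for the home directory
home_from_os = os.path.expanduser('~')   # returns a string
home_from_pathlib = Path.home()          # returns a Path object

print(home_from_os)
print(home_from_pathlib)

# Both should point at the same directory
print(Path(home_from_os) == home_from_pathlib)
```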
This can come in handy when you write code that needs to create directories on the user’s computer to save data files and/or plots. As an example, the code below checks if a directory exists and, if it doesn’t, creates it.
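The function os.path.join is used to combine two strings into a path string with the appropriate delimiter for your Operating System.
The code checks whether a directory named plots exists in the directory stored in root_path; if not, it creates one. The exist_ok=True argument means that no error is raised when the directory already exists.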
print('Contents of current directory (before):')
print(os.listdir(root_path))
imdir = os.path.join(root_path,'plots')
print(f'\nimdir = {imdir}')
Path(imdir).mkdir(parents=True, exist_ok=True)
print('\nContents of current directory (after creating the new directory):')
print(os.listdir(root_path))
Contents of current directory (before):
['Exercises', 'In_a_Nutshell', 'Theory']
imdir = C:\Users\mmendozalugo\plots
Contents of current directory (after creating the new directory):
['Exercises', 'In_a_Nutshell', 'plots', 'Theory']
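The same check-and-create step can also be written with pathlib alone, where the / operator plays the role of os.path.join. The sketch below is an alternative, not the method used above; it works inside a temporary directory (an assumption made here so that nothing is left behind in your home folder).

```python
import tempfile
from pathlib import Path

# Work in a throwaway directory so the example cleans up after itself
with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)

    # The / operator joins path components with the right delimiter
    plot_dir = base / 'plots'
    existed_before = plot_dir.exists()
    print(f'Exists before: {existed_before}')

    # exist_ok=True: no error if the directory is already there
    plot_dir.mkdir(parents=True, exist_ok=True)
    plot_dir.mkdir(parents=True, exist_ok=True)  # safe to repeat

    exists_after = plot_dir.is_dir()
    print(f'Exists after: {exists_after}')
    contents = sorted(p.name for p in base.iterdir())
    print(contents)
```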
To delete the folder that was just created, we run the code below.
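try:
    # os.rmdir only removes empty directories
    os.rmdir(imdir)
    print(f'Directory {imdir} has been deleted.')
except FileNotFoundError:
    # Raised when the directory no longer exists
    print('You already deleted the folder. :)')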
Directory C:\Users\mmendozalugo\plots has been deleted.
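Note that os.rmdir only removes empty directories. As a hedged sketch (run in a temporary directory, with a made-up file name), the following shows the error raised for a non-empty directory and how shutil.rmtree, also from the standard library, removes a directory together with its contents:

```python
import os
import shutil
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    plot_dir = os.path.join(tmp, 'plots')
    os.mkdir(plot_dir)

    # Put a (hypothetical) file inside so the directory is not empty
    with open(os.path.join(plot_dir, 'figure.txt'), 'w') as f:
        f.write('placeholder')

    try:
        os.rmdir(plot_dir)
        rmdir_worked = True
    except OSError as error:
        # os.rmdir refuses to delete a directory that still has contents
        rmdir_worked = False
        print(f'os.rmdir failed: {type(error).__name__}')

    # shutil.rmtree deletes the directory and everything inside it
    shutil.rmtree(plot_dir)
    still_exists = os.path.exists(plot_dir)
    print(f'Still exists: {still_exists}')
```

Use shutil.rmtree with care: the deletion is permanent.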
Now you are, hopefully, a bit more used to working with file paths. For the next test, we are going to try to open a file. We can use some built-in Python functions to open a *.txt file and print its contents.
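A minimal sketch of that step, using the built-in open function: since no example file is provided here, the code first writes a small file called example.txt (a made-up name and content) into a temporary directory and then reads it back.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical file name, used only for this illustration
    file_path = os.path.join(tmp, 'example.txt')

    # 'w' opens the file for writing; the with block closes it afterwards
    with open(file_path, 'w', encoding='utf-8') as f:
        f.write('first line\n')
        f.write('second line\n')

    # 'r' opens the file for reading
    with open(file_path, 'r', encoding='utf-8') as f:
        contents = f.read()
    print(contents)

    # Reading line by line is often more convenient
    with open(file_path, 'r', encoding='utf-8') as f:
        lines = [line.strip() for line in f]
    print(lines)
```

To read one of your own files, replace file_path with the (relative or absolute) path to that file and keep only the reading part.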
3.3.3. Additional study material#
Official Python Documentation - https://docs.python.org/3/tutorial/inputoutput.html
Official Python Documentation - https://docs.python.org/3/reference/expressions.html
Official Python Documentation - https://docs.python.org/3/library/filesys.html
After this Notebook you should be able to:
- print a variable, formatting it in an appropriate manner
- know that escape characters exist
- know how to use lambda functions
- understand how file paths work
- create and delete new directories