2. Pandas
With the standard Python functions and the numpy library, you already have access to powerful tools to process data. However, you’ll find that organizing data with them can still be confusing and messy, so let us introduce you to pandas: a Python library specialized in data organization. Its functions are simple to use, and they achieve a lot. Furthermore, pandas is built on top of the numpy library and uses some of its functions and data structures, which makes pandas fast. The pandas library is often used in Data Science and Machine Learning to organize data that then serve as input to functions from other libraries. For example, you can store and organize the contents of an Excel file in pandas data structures, apply statistical analysis using SciPy, and then plot the result using matplotlib.
In this section, we’ll introduce you to the basic pandas data structures, the Series and DataFrame objects, and show how to store data in them. In pandas, a Series represents a list (a one-dimensional sequence of values, each with an index label), and a DataFrame represents a table (rows and named columns).
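To make the list/table analogy concrete, here is a minimal sketch that builds one Series and one DataFrame. The variable names, labels, and values (temperatures, students, and so on) are made up for illustration and are not part of any dataset used in this section.

```python
import pandas as pd

# A Series: a one-dimensional sequence of values, each paired with an index label.
# The labels and values below are illustrative only.
temperatures = pd.Series(
    [21.5, 19.0, 23.2],
    index=["Mon", "Tue", "Wed"],
    name="temperature",
)

# A DataFrame: a table. Here each dictionary key becomes a column name
# and each list becomes the values of that column.
students = pd.DataFrame(
    {
        "name": ["Ana", "Ben", "Chen"],
        "grade": [8.5, 7.0, 9.2],
    }
)

print(temperatures["Tue"])       # look up a value by its label
print(students.shape)            # (number of rows, number of columns)
print(students["grade"].mean())  # a single column of a DataFrame is itself a Series
```

Note that selecting one column of the DataFrame gives back a Series, so you can think of a table as a collection of Series that share the same index.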