
2. Pandas

With the standard Python functions and the numpy library, you already have access to powerful tools to process data. However, you’ll find that organizing data with them can still be confusing and messy… so let us introduce you to pandas: a Python library specialized in data organization. Its functions are simple to use, and they achieve a lot. Furthermore, pandas is built on top of the numpy library and uses some of its functions and data structures, which makes pandas fast. The pandas library is often used in Data Science and Machine Learning to organize data that then serve as input to functions from other libraries. For example, you can load and organize the contents of an Excel file using pandas data structures, apply statistical analysis using SciPy, and then plot the result using matplotlib.

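In this section, we’ll introduce you to the basic pandas data structures, the Series and DataFrame objects, and show how to store data in them. In pandas, a Series represents a list (a one-dimensional sequence of values, each with an index label), and a DataFrame represents a table (rows and columns, where each column can be accessed as a Series).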
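
To make the two structures concrete, here is a minimal sketch of a Series and a DataFrame. The variable names, labels, and values (`temperatures`, `weather`, `rain_mm`, the weekday labels) are made up for illustration and are not taken from any dataset used later in this section.

```python
import pandas as pd

# A Series: a one-dimensional sequence of values with an index label per value.
# The weekday labels and temperature values are hypothetical.
temperatures = pd.Series(
    [21.5, 19.0, 23.2],
    index=["Mon", "Tue", "Wed"],
    name="temperature",
)

# A DataFrame: a table with named columns and a shared row index.
weather = pd.DataFrame(
    {
        "temperature": [21.5, 19.0, 23.2],
        "rain_mm": [0.0, 4.2, 1.1],
    },
    index=["Mon", "Tue", "Wed"],
)

print(temperatures["Tue"])   # look up a single value by its label
print(weather["rain_mm"])    # selecting one column returns a Series
print(weather.loc["Wed"])    # selecting one row by label also returns a Series
```

Note that both objects carry labels (the index, and for the DataFrame also the column names), which is what lets you refer to data by name instead of by position.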