Series

import pandas as pd 
import numpy as np

2.1. Series

We start with the pandas Series because a DataFrame is made of Series: retrieving a single row or column from a DataFrame returns a Series. A Series is a one-dimensional labeled array, built on top of a numpy ndarray, that holds a sequence of values much like a list. We create one with the constructor pd.Series(), passing the list we want to convert. Unlike a numpy array created from the same list, which converts every element to one common type, a Series can keep elements of different types.

First we create a list containing elements of various types. Then we construct a Series and a numpy.array using our list. Finally we compare the type of each element in my_series and my_nparray.

my_list = ['begin', 2, 3/4, "end"]

my_series = pd.Series(data=my_list)
my_nparray = np.array(my_list)

for i in range(len(my_list)):
    print('----type of each element----')
    print(f'my_series element #{i} => {type(my_series[i])}')
    print(f'my_nparray element #{i} => {type(my_nparray[i])}\n')
----type of each element----
my_series element #0 => <class 'str'>
my_nparray element #0 => <class 'numpy.str_'>

----type of each element----
my_series element #1 => <class 'int'>
my_nparray element #1 => <class 'numpy.str_'>

----type of each element----
my_series element #2 => <class 'float'>
my_nparray element #2 => <class 'numpy.str_'>

----type of each element----
my_series element #3 => <class 'str'>
my_nparray element #3 => <class 'numpy.str_'>

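As expected, numpy converted all elements to a single type, in this case strings. As mentioned in Section 5.1, in Notebook 5, a numpy array holds data of one type only, so mixed input is coerced to a common type. The Series instead stores the elements as generic Python objects, which is why each one keeps its original type. Note also that a pandas Series is, by default, printed more elaborately: the index appears next to each value and the dtype is shown below.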

print(my_series)
print('-----------------')
print(my_nparray)
0    begin
1        2
2     0.75
3      end
dtype: object
-----------------
['begin' '2' '0.75' 'end']
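The dtype shown in the printout follows from how the values are stored. The following is a minimal sketch of that behavior; the variable names are illustrative and not part of the original notebook.

```python
import numpy as np
import pandas as pd

# A homogeneous list gets a specific numeric dtype.
ints = pd.Series([1, 2, 3])
print(ints.dtype)            # an integer dtype

# A mixed list falls back to the generic object dtype,
# so every element keeps its original Python type.
mixed = pd.Series(['begin', 2, 3/4, 'end'])
print(mixed.dtype)           # object

# By default numpy coerces the same list to one common type (strings) ...
coerced = np.array(['begin', 2, 3/4, 'end'])
print(coerced.dtype.kind)    # 'U', the code for unicode strings

# ... unless dtype=object is requested explicitly.
as_objects = np.array(['begin', 2, 3/4, 'end'], dtype=object)
print(type(as_objects[1]))   # <class 'int'>
```

In other words, the Series does not get a special mixed-type container; it simply stores references to ordinary Python objects, which is flexible but slower than a numeric dtype.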

The values of a Series can be accessed and sliced by position using the iloc indexer, which takes square brackets rather than parentheses:

my_series.iloc[1:]
1       2
2    0.75
3     end
dtype: object
my_series.iloc[[2,len(my_series)-1]]
2    0.75
3     end
dtype: object
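A few more positional selections may help here. This is a small sketch that rebuilds the same data under the illustrative name s; iloc follows the usual Python conventions for negative positions and slice steps.

```python
import pandas as pd

s = pd.Series(['begin', 2, 3/4, 'end'])

last = s.iloc[-1]          # negative positions count from the end
every_other = s.iloc[::2]  # slices accept a step, as with lists
picked = s.iloc[[0, -1]]   # a list of positions returns a new Series

print(last)                    # end
print(every_other.tolist())    # ['begin', 0.75]
print(picked.tolist())         # ['begin', 'end']
```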

2.1.1. Labeling Series

So far we have referred to values within a list or array by their integer position, which can be hard to read. With a pandas Series, you can give each entry a label and refer to values by that label. Labels let you access values in a more informative way, similar to dictionary keys, as described in Section 2.3, in Notebook 2.

We create a list of labels with the same length as the data list and pass it to the Series constructor through the index option. Each entry can then be reached in two ways: by position with iloc, or by its label.

my_index_labels = ["My first entry", "1","2","END"]
my_labeled_Series = pd.Series(data=my_list, index=my_index_labels)

print(my_labeled_Series.iloc[0] == my_labeled_Series["My first entry"])
True

Note that plain integer keys such as my_labeled_Series[0] are deprecated for positional access on a labeled Series: recent pandas versions emit a FutureWarning and recommend iloc instead, because in a future version integer keys will always be treated as labels.
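The difference between labels and positions is easiest to see when the labels are themselves integers. The sketch below uses hypothetical data with labels 10, 20, 30, and the explicit loc (label) and iloc (position) indexers.

```python
import pandas as pd

# Hypothetical data: integer labels that do not match the positions.
s = pd.Series(['x', 'y', 'z'], index=[10, 20, 30])

print(s.loc[10])               # label 10    -> x
print(s.iloc[0])               # position 0  -> x
print(s.loc[20:30].tolist())   # label slices include the end label -> ['y', 'z']
print(s.iloc[1:2].tolist())    # position slices exclude the end    -> ['y']
```

Using loc and iloc explicitly avoids having to guess whether an integer key is meant as a label or as a position.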

pandas creates the index labels automatically when we construct a Series from a dictionary: the keys become the labels and the values become the entries.

my_dictionary = {"a list": [420, 10],"a float": 380/3, 
                 "a list of strings": ["first word", "Second Word", "3rd w0rd"] }
my_Series = pd.Series(my_dictionary)
print(my_Series)
a list                                         [420, 10]
a float                                       126.666667
a list of strings    [first word, Second Word, 3rd w0rd]
dtype: object
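A dictionary can also be combined with the index option. The following sketch, using a hypothetical stock dictionary, shows the keys becoming labels and what happens when a requested label is not among the keys.

```python
import pandas as pd

# Hypothetical data, chosen only for illustration.
stock = {"apples": 3, "pears": 5}

# The dictionary keys become the index labels.
s = pd.Series(stock)
print(list(s.index))    # ['apples', 'pears']

# An explicit index selects and orders the keys; a label that is
# missing from the dictionary gets NaN as its value.
t = pd.Series(stock, index=["pears", "plums"])
print(t)
```

Because NaN is a floating-point value, the second Series is stored with a float dtype even though the dictionary held integers.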

We can access an element within the list labeled "a list of strings" by first selecting the list by its label and then indexing into it:

my_Series["a list of strings"][1]
'Second Word'
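The chained lookup can be split into its two steps to make clear what each pair of brackets does. This sketch rebuilds my_Series so that it runs on its own; the name words is introduced here for illustration.

```python
import pandas as pd

my_Series = pd.Series({"a list": [420, 10], "a float": 380/3,
                       "a list of strings": ["first word", "Second Word", "3rd w0rd"]})

# Step 1: label lookup on the Series returns the stored Python list.
words = my_Series["a list of strings"]
print(type(words))    # <class 'list'>

# Step 2: ordinary list indexing on that list.
print(words[1])       # Second Word

# The explicit label indexer .loc returns the same list.
print(my_Series.loc["a list of strings"][-1])    # 3rd w0rd
```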

Warning

When using pandas, it’s a good idea to avoid for loops and other iterative solutions where possible; pandas usually offers a built-in, vectorized operation that is faster than iterating through the elements.
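As a small illustration of this advice, the sketch below computes the same result with a Python-level loop and with a single vectorized expression. The prices data is hypothetical, and no timing is claimed here; the speed difference only becomes noticeable on larger data.

```python
import pandas as pd

# Hypothetical numeric data, used only for illustration.
prices = pd.Series([10.0, 12.5, 8.0], index=["a", "b", "c"])

# Iterative version: a Python-level loop over the values.
looped = pd.Series([p * 2 for p in prices], index=prices.index)

# Vectorized version: one expression applied to the whole Series.
vectorized = prices * 2

print(vectorized.equals(looped))      # True
print(prices.sum())                   # 30.5
print(prices[prices > 9].tolist())    # [10.0, 12.5]
```

Aggregations such as sum() and boolean filters such as prices > 9 follow the same pattern: the operation is stated once for the whole Series rather than once per element.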