2.2. Data Structures

In this section you will tackle a data management problem! In the first module you learned how to create variables, which is cool. But when you have a lot of variables, or you want to store and access them within one entity, you need a data structure.

There are plenty of them, and they differ in their use cases and complexity. Today we will tackle some of the standard Python built-in data structures. The most popular of those are list, dict and tuple.

2.2.1. list

First, the easiest and most popular data structure in Python: list (which is similar to a typical array you may have seen in another programming language).

You can create a list in the following ways:

  1. Creating an empty list, option 1

  2. Creating an empty list, option 2 - using the class constructor

  3. Creating a list from existing data - option 1

  4. Creating a list from existing data - option 2

#1
empty_list1 = []
print('Type of empty_list1 object', type(empty_list1))
print('Contents of empty_list1', empty_list1)
print('--------------------')

#2
empty_list2 = list()
print('Type of empty_list2 object', type(empty_list2))
print('Contents of empty_list2', empty_list2)
print('--------------------')

#3
my_var1 = 5
my_var2 = "hello"
my_var3 = 37.5

my_list = [my_var1, my_var2, my_var3]
print('Type of my_list object', type(my_list))
print('Contents of my_list', my_list)
print('--------------------')


#4
cool_rock = "sandstone" # remember that a string is a collection of characters

list_with_letters = list(cool_rock)

print('Type of list_with_letters object', type(list_with_letters))
print('Contents of list_with_letters', list_with_letters)
print('--------------------')
Type of empty_list1 object <class 'list'>
Contents of empty_list1 []
--------------------
Type of empty_list2 object <class 'list'>
Contents of empty_list2 []
--------------------
Type of my_list object <class 'list'>
Contents of my_list [5, 'hello', 37.5]
--------------------
Type of list_with_letters object <class 'list'>
Contents of list_with_letters ['s', 'a', 'n', 'd', 's', 't', 'o', 'n', 'e']
--------------------

As you can see, in all four cases we created a list; only the way we created it was slightly different:

  • the first and third examples use the bracket notation.

  • the second and fourth examples use the class constructor approach.

Both methods also apply to the other data structures.

Now, we have a list — what can we do with it?

Well… we can access and modify any element of an existing list. To access a list element, put the index of the element you want inside square brackets [] after the list name. Sounds easy, but keep in mind that Python uses zero-based indexing (as mentioned in Section 1.4 in Notebook 1).

Note

Zero-based indexing means that the first element has index 0 (not 1), the second element has index 1 (not 2) and the n-th element has index n - 1 (not n)!

The len() function returns the length of a sequence or collection (string, list, array, etc.). Since our list has 3 elements, we can access the 0th, 1st, and 2nd elements.

Once an element is accessed, it can be used like any other variable; the list only provides convenient storage. And since it is storage, we can easily alter and swap list elements.

print(len(my_list))
print('First element of my list:', my_list[0])
print('Last element of my list:', my_list[2])

summation = my_list[0] + my_list[2]
print(f'Sum of {my_list[0]} and {my_list[2]} is {summation}')


my_list[0] += 7
my_list[1] = "My new element"

print(my_list)
3
First element of my list: 5
Last element of my list: 37.5
Sum of 5 and 37.5 is 42.5
[12, 'My new element', 37.5]
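
Swapping two elements, as mentioned above, can be sketched in a single statement. This is only a minimal illustration; the list rocks and its contents are made up for this example and are not part of the notebook.

```python
# Hypothetical example list (not one of the notebook's variables)
rocks = ['granite', 'basalt', 'sandstone']

# Swap the first and the last element in one statement
rocks[0], rocks[2] = rocks[2], rocks[0]
print(rocks)

# An accessed element behaves like any other variable of its type
shouted = rocks[1].upper()
print(shouted)
```

The right-hand side is evaluated first, so no temporary variable is needed for the swap.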

We can only access data we have. Python will give us an error for the following:

my_list[10] = 199
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
Cell In[4], line 1
----> 1 my_list[10] = 199

IndexError: list assignment index out of range
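
If you are not sure whether an index exists, there are two common ways to avoid this error. The sketch below assumes a made-up list called values and uses a try/except block, which you may not have seen yet; treat it as a preview rather than the required approach.

```python
# Hypothetical list used only for this illustration
values = [12, 'My new element', 37.5]
index = 10

# Option 1: compare the index with the length before accessing
if index < len(values):
    print(values[index])
else:
    print(f'Index {index} is out of range for a list of length {len(values)}')

# Option 2: try the assignment and handle the IndexError if it occurs
try:
    values[index] = 199
    assigned = True
except IndexError as error:
    assigned = False
    print('Could not assign:', error)
```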

We can also add new elements to a list, or remove them! Adding is done with the append method, and removing an element uses the del keyword. We can also store a list inside a list (list inception!), which is useful for matrices, images, etc.

my_list.append("new addition to  my variable collection!")
print(my_list)

my_list.append(['another list', False, 1 + 2j])
print(my_list)

del my_list[2]
print(my_list)
[12, 'My new element', 37.5, 'new addition to  my variable collection!']
[12, 'My new element', 37.5, 'new addition to  my variable collection!', ['another list', False, (1+2j)]]
[12, 'My new element', 'new addition to  my variable collection!', ['another list', False, (1+2j)]]
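
To show why a list inside a list is handy for matrices, here is a small sketch. The name matrix and its values are made up for illustration.

```python
# A 2 x 3 "matrix" stored as a list of rows (values are made up)
matrix = [
    [1, 2, 3],
    [4, 5, 6],
]

# The first index selects the row, the second one the column
element = matrix[1][2]
print('Row 1, column 2:', element)

# A single entry can be modified in place
matrix[0][0] = 100
print(matrix)
```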

Lists also have other useful functionalities, as you can see from the official documentation. Since lists are objects, you can also apply some operators to them.

lst1 = [2, 4, False]
lst2 = ['second list', 0, 222]

lst1 = lst1 + lst2
print(lst1)

lst2 = lst2 * 4
print(lst2)

lst2[3] = 5050
print(lst2)
[2, 4, False, 'second list', 0, 222]
['second list', 0, 222, 'second list', 0, 222, 'second list', 0, 222, 'second list', 0, 222]
['second list', 0, 222, 5050, 0, 222, 'second list', 0, 222, 'second list', 0, 222]

As you can see, adding lists together concatenates them, and multiplying a list by an integer repeats its contents that many times (it performs addition several times, just like in real math…).
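
Two small consequences of these operators are sketched below, with made-up variable names: multiplication is a quick way to build a list of repeated values, and addition produces a new list without touching the originals.

```python
# Repeat a one-element list five times
zeros = [0] * 5
print(zeros)

a = [1, 2]
b = [3, 4]

# Concatenation builds a new list
c = a + b
print(c)

# The original lists are unchanged
print(a, b)
```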

Additionally, you can also use the in keyword to check the presence of a value inside a list.

print(lst1)

if 222 in lst1:
    print('We found 222 inside lst1')
else:
    print('Nope, nothing there....')
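
Once you know a value is present, the index method of a list tells you where it is. The following is a minimal sketch with a made-up list called samples.

```python
# Hypothetical list used only for this illustration
samples = ['sand', 'clay', 'silt']

# Check for presence first, then ask for the position
if 'clay' in samples:
    position = samples.index('clay')
    print('clay found at index', position)

# "in" simply evaluates to True or False
has_gravel = 'gravel' in samples
print('gravel present?', has_gravel)
```

Checking with in first matters here, because index raises a ValueError when the value is not in the list.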

2.2.2. tuple

If you understood how list works, then you already understand 95% of tuple. Tuples are just like lists, with some small differences.

1. To create a tuple you use parentheses (), commas, or the tuple class constructor.
2. You can change the content of a list, whereas tuples are immutable (just like strings).

#1
tupl1 = tuple() 
print('Type of tupl1', type(tupl1))
print('Content of tupl1', tupl1)
#2
tupl2 = () # option 2 with ()
print(type(tupl2), tupl2)
Type of tupl1 <class 'tuple'>
Content of tupl1 ()
<class 'tuple'> ()

You can create a non-empty tuple using parentheses or using just commas. Can we change an element of a tuple?

my_var1 = 26.5
my_var2 = 'Oil'
my_var3 = False

my_tuple = (my_var1, my_var2, my_var3, 'some additional stuff', 777)
print('my tuple', my_tuple)


comma_tuple = 2, 'hi!', 228
print('A comma made tuple', comma_tuple)

print('4th element of my_tuple:', my_tuple[3])
my_tuple[3] = 'will I change?'
my tuple (26.5, 'Oil', False, 'some additional stuff', 777)
A comma made tuple (2, 'hi!', 228)
4th element of my_tuple: some additional stuff
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[8], line 13
     10 print('A comma made tuple', comma_tuple)
     12 print('4th element of my_tuple:', my_tuple[3])
---> 13 my_tuple[3] = 'will I change?'

TypeError: 'tuple' object does not support item assignment

Since tuples are immutable, they have no append() method, nor any other method that alters the object.
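
If you do need a changed version of a tuple, you can build a new one instead. The sketch below shows two possible ways; the variable names are made up for illustration.

```python
original = (26.5, 'Oil', False)

# Option 1: concatenation creates a new, longer tuple
# (note the trailing comma that makes a one-element tuple)
extended = original + ('extra',)
print(extended)

# Option 2: convert to a list, edit it, and convert back
as_list = list(original)
as_list[1] = 'Water'
updated = tuple(as_list)
print(updated)

# The original tuple is untouched in both cases
print(original)
```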

You might think that tuple is a useless class. However, there are some reasons for it to exist:

1. Storing constants and objects which shouldn’t be changed.
2. Saving memory (a tuple uses less memory than a list to store the same data).

The .__sizeof__() method returns the size of an object in bytes.

my_name = 'Vasyan'
my_age = 27
is_student = True

a = (my_name, my_age, is_student)
b = [my_name, my_age, is_student]

print('size of a =', a.__sizeof__(), 'bytes') 
print('size of b =', b.__sizeof__(), 'bytes')
size of a = 48 bytes
size of b = 64 bytes
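
The same comparison can also be made with sys.getsizeof() from the standard library. This is a sketch assuming the standard CPython interpreter; the exact numbers depend on your Python version and platform, so none are shown here.

```python
import sys

items_tuple = ('Vasyan', 27, True)
items_list = ['Vasyan', 27, True]

# getsizeof reports the size of the container itself,
# not of the objects stored inside it
tuple_size = sys.getsizeof(items_tuple)
list_size = sys.getsizeof(items_list)

print('tuple:', tuple_size, 'bytes')
print('list: ', list_size, 'bytes')
```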

2.2.3. dict

After seeing lists and tuples, you may think:

“Wow, storing all my variables within another variable is cool and gnarly! But… sometimes it’s boring & inconvenient to access my data by using its position within a tuple/list. Is there a way that I can store my object within a data structure but access it via something meaningful, like a keyword…?”

Don’t worry if you had this exact same thought. Python had it as well!

Dictionaries are suited especially for that purpose: you give each element you want to store a nickname (i.e., a key) and use that key to access the value you want.

To create an empty dictionary we use {} or the class constructor dict():

empty_dict1 = {}
print('Type of empty_dict1', type(empty_dict1))
print('Content of it ->', empty_dict1)


empty_dict2 = dict()
print('Type of empty_dict2', type(empty_dict2))
print('Content of it ->', empty_dict2)
Type of empty_dict1 <class 'dict'>
Content of it -> {}
Type of empty_dict2 <class 'dict'>
Content of it -> {}

To create a non-empty dictionary we specify key: value pairs, separated by commas:

my_dict = {
    'name': 'Jarno',
    'color': 'red',
    'year': 2007,
    'is cool': True,
    6: 'it works',
    (2, 22): 'that is a strange key'
}

print('Content of my_dict>>>', my_dict)
Content of my_dict>>> {'name': 'Jarno', 'color': 'red', 'year': 2007, 'is cool': True, 6: 'it works', (2, 22): 'that is a strange key'}

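In the last example, you can see that only strings, numbers, or tuples were used as keys. Dictionaries can only use hashable objects as keys, which in practice means immutable data such as strings, numbers, or tuples of immutable items. A mutable object like a list does not work: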

mutable_key_dict = {
    5: 'lets try',
    True: 'I hope it will run perfectly',
    6.78: 'heh',
    ['No problemo', 'right?']: False  
}

print(mutable_key_dict)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[14], line 1
----> 1 mutable_key_dict = {
      2     5: 'lets try',
      3     True: 'I hope it will run perfectly',
      4     6.78: 'heh',
      5     ['No problemo', 'right?']: False  
      6 }
      8 print(mutable_key_dict)

TypeError: unhashable type: 'list'
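
If the data you want to use as a key happens to be in a list, one possible workaround is to convert it to a tuple first. The names and values below are made up for illustration.

```python
# A list cannot be used as a key directly...
coordinates = [52.0, 4.37]

# ...but a tuple with the same content can
stations = {}
stations[tuple(coordinates)] = 'my station'
print(stations[(52.0, 4.37)])

# hash() fails for a list, which is exactly why it is rejected as a key
try:
    hash(coordinates)
    list_is_hashable = True
except TypeError:
    list_is_hashable = False

print('Can a list be hashed?', list_is_hashable)
```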

Alright, now it is time to access the data we have managed to store inside my_dict using keys!

print('Some random content of my_dict', my_dict['name'], my_dict[(2, 22)])
Some random content of my_dict Jarno that is a strange key

Remember the mutable key dict? Let’s make it work by omitting the list item.

mutable_key_dict = {
    5: 'lets try',
    True: 'I hope it will run perfectly',
    6.78: 'heh'
}


print('Accessing weird dictionary...')
print(mutable_key_dict[True])
print(mutable_key_dict[5])
print(mutable_key_dict[6.78])
Accessing weird dictionary...
I hope it will run perfectly
lets try
heh

Let’s try to access something we have and something we don’t have:

print('My favorite year is', my_dict['year'])
print('My favorite song is', my_dict['song'])
My favorite year is 2007
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
Cell In[17], line 2
      1 print('My favorite year is', my_dict['year'])
----> 2 print('My favorite song is', my_dict['song'])

KeyError: 'song'
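
When a key might be missing, the get method of a dictionary is a gentler alternative to square brackets. Below is a minimal sketch with a made-up dictionary called car.

```python
# Hypothetical dictionary used only for this illustration
car = {'name': 'Jarno', 'color': 'red', 'year': 2007}

# get returns None instead of raising a KeyError
song = car.get('song')
print(song)

# A second argument sets the value to return when the key is missing
song_or_default = car.get('song', 'unknown')
print(song_or_default)

# Existing keys behave as usual
year = car.get('year')
print(year)
```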

Warning

In most cases it is best to use strings as keys. Other key types are less common and can make your code harder to read.

What’s next? Dictionaries are mutable, so let’s go ahead and add some additional data and delete old ones.

print('my_dict right now', my_dict)

my_dict['new_element'] = 'magenta'
my_dict['weight'] = 27.8
del my_dict['year']

print('my_dict after some operations', my_dict)
my_dict right now {'name': 'Jarno', 'color': 'red', 'year': 2007, 'is cool': True, 6: 'it works', (2, 22): 'that is a strange key'}
my_dict after some operations {'name': 'Jarno', 'color': 'red', 'is cool': True, 6: 'it works', (2, 22): 'that is a strange key', 'new_element': 'magenta', 'weight': 27.8}

You can also print all keys present in the dictionary using the .keys() method, or check whether a certain key exists in a dictionary, as shown below. More operations can be found in the official Python documentation.

print(my_dict.keys())
print("\nmy_dict has a ['name'] key:", 'name' in my_dict)
dict_keys(['name', 'color', 'is cool', 6, (2, 22), 'new_element', 'weight'])

my_dict has a ['name'] key: True
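
Besides .keys(), dictionaries also have .values() and .items() methods. The sketch below uses a made-up dictionary called inventory and a for loop to walk through all key and value pairs.

```python
# Hypothetical dictionary used only for this illustration
inventory = {'name': 'Jarno', 'color': 'red', 'weight': 27.8}

# items() yields (key, value) pairs in insertion order
for key, value in inventory.items():
    print(key, '->', value)

# values() gives only the stored values
values = list(inventory.values())
print(values)
```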

2.2.4. Real life example:

Analyzing satellite metadata

Metadata is a set of data that describes and gives information about other data. For Sentinel-1, the metadata of the satellite is acquired as an .xml file. Dictionaries often play an important role in organizing this metadata. One could write a function that reads the important information from this metadata and stores it in a dictionary. Some examples of keys for the metadata of Sentinel-1 are:

dict_keys(['azimuthSteeringRate', 'dataDcPolynomial', 'dcAzimuthtime', 'dcT0', 'rangePixelSpacing', 'azimuthPixelSpacing', 'azimuthFmRatePolynomial', 'azimuthFmRateTime', 'azimuthFmRateT0', 'radarFrequency', 'velocity', 'velocityTime', 'linesPerBurst', 'azimuthTimeInterval', 'rangeSamplingRate', 'slantRangeTime', 'samplesPerBurst', 'no_burst'])
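
Such a function could look roughly like the sketch below. Note the assumptions: the XML fragment, its values, and the function name read_metadata are all made up for illustration, and a real Sentinel-1 file is much larger and structured differently. Only the tag names are borrowed from the key list above.

```python
import xml.etree.ElementTree as ET

# Made-up XML fragment with placeholder values, for illustration only
xml_text = """<metadata>
    <radarFrequency>5405000000.0</radarFrequency>
    <linesPerBurst>1500</linesPerBurst>
    <samplesPerBurst>20000</samplesPerBurst>
</metadata>"""


def read_metadata(text):
    """Store the tag name and text of every child element in a dictionary."""
    root = ET.fromstring(text)
    result = {}
    for child in root:
        result[child.tag] = child.text
    return result


metadata = read_metadata(xml_text)
print(metadata.keys())

# The values are read as strings, so convert them before doing math
lines_per_burst = int(metadata['linesPerBurst'])
print('Lines per burst:', lines_per_burst)
```

The tag names become the keys, so each value can be looked up by a meaningful name instead of by its position in the file.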

The last important topic of this Notebook is slicing. Similar to how you can slice a string (shown in Section 1.4, in Notebook 1), this technique allows you to select a subset of data from a sequence (like a list or a tuple).

x = [1, 2, 3, 4, 5, 6, 7]
n = len(x) 

print('The first three elements of x:', x[0:3])
print(x[:3])
print('The last element is', x[6], 'or', x[n - 1], 'or', x[-1])
print(x[0:-4])
print(x[0:3:1])
The first three elements of x: [1, 2, 3]
[1, 2, 3]
The last element is 7 or 7 or 7
[1, 2, 3]
[1, 2, 3]

Let’s break it down

This code demonstrates how to select specific elements from a list in Python using slicing:

  1. The list x contains numbers from 1 to 7.

  2. x[0:3] selects the first three elements of x.

  3. x[:3] achieves the same result by omitting the starting index.

  4. x[6], x[n - 1], and x[-1] all access the last element of x.

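  5. x[0:-4] selects elements from the beginning up to, but not including, the fourth-to-last element.

  6. x[0:3:1] selects the first three elements again, this time with an explicit step size of 1.

Thus, the general slicing call is given by iterable[start:end:step], where the element at index end itself is not included.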
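
The same notation works on tuples and strings as well. Here is a small sketch with made-up variable names:

```python
# Slicing a tuple returns a tuple
letters = ('a', 'b', 'c', 'd', 'e', 'f')
middle = letters[1:4]        # indices 1, 2 and 3 (index 4 is excluded)
every_third = letters[::3]   # indices 0 and 3
print(middle, every_third)

# Slicing a string returns a string
word = 'sandstone'
first_four = word[:4]
last_five = word[-5:]
print(first_four, last_five)
```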

Here’s another example:

You can select every second element (here, all even numbers) using [::2], reverse the list using [::-1], or select a middle subset with, for example, [5:9].

numbers = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print('Selecting all even numbers', numbers[::2])
print('All odd numbers', numbers[1::2])
print('Normal order', numbers)
print('Reversed order', numbers[::-1])
print('Numbers from 5 to 8:', numbers[5:9])
Selecting all even numbers [0, 2, 4, 6, 8, 10]
All odd numbers [1, 3, 5, 7, 9]
Normal order [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Reversed order [10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
Numbers from 5 to 8: [5, 6, 7, 8]
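
Two more properties of slices are worth knowing; they are sketched below with made-up variable names. A slice of a list is a new list, so changing it does not affect the original, and a slice that runs past the end of the list does not raise an IndexError.

```python
digits = [0, 1, 2, 3, 4, 5]

# A slice is a new list: editing it leaves the original alone
subset = digits[1:4]
subset[0] = 99
print(subset)
print(digits)

# Slices are forgiving about bounds, unlike single-index access
tail = digits[3:100]
print(tail)
```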