import numpy as np

1.1. One-dimensional arrays

1.1.1. np.array(), np.asarray()

So, how do you create a numpy 1-Dimensional (1D) array? There are a few ways to do it…

  • Option 1 - from scratch with np.array() similar to a list.

arr1 = np.array([1,2,3])
print('arr1 = {}, its type is {}'.format(arr1,type(arr1)))
arr1 = [1 2 3], its type is <class 'numpy.ndarray'>
  • Option 2 - from an existing list with np.array().

Create the list first and check its type. Then create the array A_1 from the list L_1 and check its type.

L_1 = [1,2,3,5,7,11,13]
print('L_1 = {} and its type is {}\n'.format(L_1,type(L_1)))

A_1 = np.array(L_1)
print('A_1 = {} and its type is {}'.format(A_1, type(A_1)))
L_1 = [1, 2, 3, 5, 7, 11, 13] and its type is <class 'list'>

A_1 = [ 1  2  3  5  7 11 13] and its type is <class 'numpy.ndarray'>
  • Option 3 - from an existing list with np.asarray()

L_1 = [1,2,3,5,7,11,13]
print('L_1 = {} and its type is {}\n'.format(L_1,type(L_1)))

A_1 = np.asarray(L_1)
print('A_1 = {} and its type is {}'.format(A_1, type(A_1)))
L_1 = [1, 2, 3, 5, 7, 11, 13] and its type is <class 'list'>

A_1 = [ 1  2  3  5  7 11 13] and its type is <class 'numpy.ndarray'>

From the above examples, you can’t really tell the difference between np.array() and np.asarray(). Nonetheless, there is a very important one, similar to the = and copy conundrum discussed in Notebook 4. When generating an array from a list, both functions do essentially the same thing. However, when generating an array from another array, their differences stand out.

First, let’s check the ID of arr1

print('arr1 ID is {}'.format(id(arr1)))
arr1 ID is 3116429230896

Now, let’s make two new arrays from arr1, using both functions

arr_array = np.array(arr1)
arr_asarray = np.asarray(arr1)

print('arr_array = {} and its ID is {}\n'.format(arr_array,id(arr_array)))
print('arr_asarray = {} and its ID is {}'.format(arr_asarray, id(arr_asarray)))
arr_array = [1 2 3] and its ID is 2342009211568

arr_asarray = [1 2 3] and its ID is 3116429230896

Hmm… it seems that the ID of arr_asarray is the same as the original arr1, which means both names refer to the same object! Altering one will alter the other as well. Let’s try it out.

arr1[0] = 'hello'
print(arr_asarray)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[8], line 1
----> 1 arr1[0] = 'hello'
      2 print(arr_asarray)

ValueError: invalid literal for int() with base 10: 'hello'

Oops… it didn’t work. Why do you think it didn’t work?

Answer: …

Change the first element of arr1. Then print arr_array and arr_asarray to see if the first element changed.

arr1[0] = 100

print('arr1 = {}\n'.format(arr1))
print('arr_array = {}\n'.format(arr_array))
print('arr_asarray = {}'.format(arr_asarray))
arr1 = [100   2   3]

arr_array = [1 2 3]

arr_asarray = [100   2   3]

Yep, our theory was right: arr1 and arr_asarray are indeed the same (but arr_array is not!). Therefore, altering arr1[0] will alter arr_asarray[0] in the same way.

Final check that they are indeed the same

print(arr1 is arr_asarray)
True
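
The copy-versus-alias behaviour explored above can be condensed into a small sketch. The names original, copied, aliased, and converted are made up for illustration, and np.shares_memory() is a NumPy helper that reports whether two arrays overlap in memory. The last lines illustrate one further case, assuming the requested dtype differs from the input's: np.asarray() then has to build a new array.

```python
import numpy as np

original = np.array([1, 2, 3])

copied = np.array(original)      # np.array() copies by default
aliased = np.asarray(original)   # input is already an ndarray: returned as-is

original[0] = 100

print(copied)                               # [1 2 3]
print(aliased)                              # [100   2   3]
print(aliased is original)                  # True
print(np.shares_memory(copied, original))   # False

# Asking for a different dtype forces np.asarray() to create a new array.
converted = np.asarray(original, dtype=float)
print(converted is original)                # False
```

So np.asarray() is the cheaper choice when you only need to read the data, while np.array() is the safer one when you intend to modify the result independently.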

1.1.2. np.zeros()

If you already know the size of the array you will need, it is common to initialize it with zeros first using the np.zeros() function, as shown below.

Set a limit when printing the huge arrays we will generate.

np.set_printoptions(threshold=1) 

I know I will need an array with 100000 elements, so I create it full of zeros first. Then, I assign the values I need to each element. In this example, a for loop assigns a random integer between 0 and 9 (both included) to each element. Note the use of range(len(my_arr)): we often use this to make the for loop run over exactly as many indices as the array has elements.

my_arr = np.zeros(100000)
print('my_arr with a bunch of zeros \n{}\n'.format(my_arr))
print('#######################')

import random

for i in range(len(my_arr)): 
    my_arr[i] = random.randint(0,9)
    
print('\nmy_arr with random numbers \n{}'.format(my_arr))
my_arr with a bunch of zeros 
[0. 0. 0. ... 0. 0. 0.]

#######################

my_arr with random numbers 
[1. 0. 9. ... 7. 2. 4.]

Note that these arrays still have \(100000\) elements, but the np.set_printoptions(threshold=1) call above truncates the printed output, otherwise you would have to scroll a lot. :P
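
The loop is useful for practising indexing, but NumPy can also fill a whole array in one call. Here is a minimal sketch, assuming NumPy's random Generator API (np.random.default_rng) instead of the random module used above. The names rng and my_arr_vec and the seed 42 are illustrative choices, not part of the original example. Note that integers() excludes the upper bound, so 10 is passed to get values from 0 to 9.

```python
import numpy as np

rng = np.random.default_rng(42)   # seed chosen arbitrarily, for reproducibility

# One call replaces the loop: 100000 random integers in [0, 10), i.e. 0..9
my_arr_vec = rng.integers(0, 10, size=100000)

print(my_arr_vec.shape)                      # (100000,)
print(my_arr_vec.min(), my_arr_vec.max())    # smallest and largest value drawn
```

Unlike np.zeros(), which creates floats by default (hence the trailing dots in the printed output), this array holds integers.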

1.1.3. np.min(), np.max() and np.mean()

NumPy also provides various functions to help you process your data. You can, for instance, find the minimum value of an array, or its mean. Your task is to find the minimum, maximum, and mean values of an array.

Find the minimum, maximum and mean values of A_1 and print the results.

A_1_min = np.min(A_1)
A_1_max = np.max(A_1)
A_1_mean = np.mean(A_1)

print(f'The minimum value of A_1 is {A_1_min} \n')
print(f'The maximum value of A_1 is {A_1_max} \n')
print(f'The mean value of A_1 is {A_1_mean} \n')
The minimum value of A_1 is 1 

The maximum value of A_1 is 13 

The mean value of A_1 is 6.0 
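
The same reductions are also available as methods on the array itself, and NumPy can report where the extreme values sit. A short sketch that rebuilds A_1 so it runs on its own (np.argmin() and np.argmax() are not used elsewhere in this section and are shown only as a pointer):

```python
import numpy as np

A_1 = np.array([1, 2, 3, 5, 7, 11, 13])

# Method form of the same reductions
print(A_1.min(), A_1.max(), A_1.mean())   # 1 13 6.0

# Positions (indices) of the smallest and largest elements
print(np.argmin(A_1), np.argmax(A_1))     # 0 6
```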

1.1.4. np.arange()

Another useful function of the numpy module is np.arange(). First, let’s see in the documentation what it does.

It reads:

arange([start,] stop[, step,], dtype=None, *, like=None)

Return evenly spaced values within a given interval.


To make a number range you need to choose:
1) the starting point,
2) the endpoint,
3) the interval between each point

The reason it reads [start,] stop[, step,] with square brackets is that start and step can be omitted. If not specified, start = 0 and step = 1 by default.

Warning

Your endpoint is not included in the array. If you want to include the endpoint in the array, you have to specify the stop to be endpoint + step. This will be clearer in the following examples.

Omitting start and step:

arr = np.arange(5) 
print('arr =', arr)
arr = [0 1 2 3 4]

As mentioned, the endpoint (5) is omitted. If you would like to include it:

arr = np.arange(5 + 1)
print('arr =', arr)
arr = [0 1 2 3 4 5]

Now, specifying both start and step, with the endpoint excluded:

arr = np.arange(1, 2, 0.01)
print('arr =', arr)
arr = [1.   1.01 1.02 ... 1.97 1.98 1.99]

Including the endpoint:

arr = np.arange(1, 2 + 0.01, 0.01) 
print('arr =', arr)
arr = [1.   1.01 1.02 ... 1.98 1.99 2.  ]
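
With a non-integer step, floating-point rounding can make it hard to predict whether np.arange() includes a value close to stop, and the NumPy documentation suggests np.linspace() for such cases. Below is a minimal sketch of that alternative, where you give the number of points instead of the step (arr_lin is an illustrative name):

```python
import numpy as np

# 101 points from 1 to 2, both ends included (spacing 0.01)
arr_lin = np.linspace(1, 2, 101)

print(len(arr_lin))              # 101
print(arr_lin[0], arr_lin[-1])   # 1.0 2.0
```

Here the endpoint is included by construction, so no endpoint + step adjustment is needed.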

You can also generate a descending array by using a negative step:

arr = np.arange(10,0,-1)
print('arr =', arr)
arr = [10  9  8 ...  3  2  1]
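
The endpoint + step rule from the warning also holds for negative steps. A small sketch, using the illustrative names step and desc so that arr stays untouched:

```python
import numpy as np

step = -1
desc = np.arange(10, 0 + step, step)   # stop = endpoint + step = -1

print(desc[0], desc[-1])   # 10 0
print(len(desc))           # 11
```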

1.1.5. np.sort()

You can also sort an array in ascending order using np.sort():

sorted_arr = np.sort(arr)
print('sorted_arr =', sorted_arr)
sorted_arr = [ 1  2  3 ...  8  9 10]
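
Two details are worth knowing here: np.sort() returns a new array and leaves its input untouched, whereas the array method .sort() sorts in place. A brief sketch with made-up names (values, sorted_copy, descending), which also shows one common way to get descending order by reversing the result:

```python
import numpy as np

values = np.array([3, 1, 2])

sorted_copy = np.sort(values)       # returns a new, sorted array
print(values)                       # [3 1 2]  (unchanged)
print(sorted_copy)                  # [1 2 3]

descending = np.sort(values)[::-1]  # reverse the ascending result
print(descending)                   # [3 2 1]

values.sort()                       # method form sorts in place
print(values)                       # [1 2 3]
```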

1.1.6. np.sum()

As the name clearly states, np.sum() returns the sum of the elements of an array. Let’s try it out:

arr = np.array([1,2,3])
my_sum = np.sum(arr)
print(f'The sum of the array is {my_sum}')
The sum of the array is 6
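
As with the other reductions, the sum is also available as a method, and np.sum() accepts plain lists as well as arrays. A short sketch (numbers is an illustrative name):

```python
import numpy as np

numbers = np.arange(1, 101)        # integers 1..100

print(np.sum(numbers))             # 5050
print(numbers.sum())               # 5050, method form
print(np.sum([1, 2, 3]))           # 6, plain lists are accepted too
```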