import numpy as np
1.1. One-Dimensional arrays
1.1.1. np.array(), np.asarray()
So, how do you create a numpy 1-Dimensional (1D) array? There are a few ways to do it…
Option 1 - from scratch with np.array(), similar to a list.
arr1 = np.array([1,2,3])
print('arr1 = {}, its type is {}'.format(arr1,type(arr1)))
arr1 = [1 2 3], its type is <class 'numpy.ndarray'>
Option 2 - from an existing list with np.array().
Create the list first and check its type. Then create the array A_1 from the list L_1 and check its type.
L_1 = [1,2,3,5,7,11,13]
print('L_1 = {} and its type is {}\n'.format(L_1,type(L_1)))
A_1 = np.array(L_1)
print('A_1 = {} and its type is {}'.format(A_1, type(A_1)))
L_1 = [1, 2, 3, 5, 7, 11, 13] and its type is <class 'list'>
A_1 = [ 1 2 3 5 7 11 13] and its type is <class 'numpy.ndarray'>
Option 3 - from an existing list with np.asarray().
L_1 = [1,2,3,5,7,11,13]
print('L_1 = {} and its type is {}\n'.format(L_1,type(L_1)))
A_1 = np.asarray(L_1)
print('A_1 = {} and its type is {}'.format(A_1, type(A_1)))
L_1 = [1, 2, 3, 5, 7, 11, 13] and its type is <class 'list'>
A_1 = [ 1 2 3 5 7 11 13] and its type is <class 'numpy.ndarray'>
From the above examples, you can’t really tell the difference between np.array() and np.asarray(). Nonetheless, there is a very important one, similar to the = and copy conundrum discussed in Notebook 4. When generating an array from a list, both functions do pretty much the same thing. However, when generating an array from another array, their differences stand out.
First, let’s check the ID of arr1
print('arr1 ID is {}'.format(id(arr1)))
arr1 ID is 3116429230896
Now, let’s make two new arrays from arr1, using both functions
arr_array = np.array(arr1)
arr_asarray = np.asarray(arr1)
print('arr_array = {} and its ID is {}\n'.format(arr_array,id(arr_array)))
print('arr_asarray = {} and its ID is {}'.format(arr_asarray, id(arr_asarray)))
arr_array = [1 2 3] and its ID is 2342009211568
arr_asarray = [1 2 3] and its ID is 3116429230896
Hmm… it seems that the ID of arr_asarray is the same as that of the original arr1, which means both names refer to the same object! Altering one will alter the other one as well. Let’s try it out.
arr1[0] = 'hello'
print(arr_asarray)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
Cell In[8], line 1
----> 1 arr1[0] = 'hello'
2 print(arr_asarray)
ValueError: invalid literal for int() with base 10: 'hello'
Oops… it didn’t work. Why do you think it didn’t work?
Answer: …
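A hint for the question above, as a minimal sketch: every array stores a single element type, which can be inspected through its dtype attribute. The names int_arr and error_message are made up for this illustration and do not appear in the notebook.

```python
import numpy as np

# Hypothetical array for illustration; it mirrors arr1 from the text.
int_arr = np.array([1, 2, 3])

# All elements share one type; for this input it is an integer type.
print('dtype of int_arr:', int_arr.dtype)

# A float assigned to an integer array is truncated toward zero.
int_arr[0] = 7.9
print('after assigning 7.9:', int_arr)

# A string that cannot be converted to an integer raises a ValueError.
try:
    int_arr[0] = 'hello'
    error_message = None
except ValueError as err:
    error_message = str(err)
print('error:', error_message)
```

The failed assignment leaves the array unchanged, so int_arr[0] still holds the value from the previous step.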
Change the first element of arr1. Then print arr_array and arr_asarray to see if the first element changed.
arr1[0] = 100
print('arr1 = {}\n'.format(arr1))
print('arr_array = {}\n'.format(arr_array))
print('arr_asarray = {}'.format(arr_asarray))
arr1 = [100 2 3]
arr_array = [1 2 3]
arr_asarray = [100 2 3]
Yep, our theory was right: arr1 and arr_asarray are indeed the same (but arr_array is not!). Therefore, altering arr1[0] will alter arr_asarray[0] in the same way.
Final check that they are indeed the same
print(arr1 is arr_asarray)
True
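One caveat worth keeping in mind, shown here as a small sketch: np.asarray() only hands back the same object when no conversion is needed. If a different dtype is requested, a new array has to be built. The names base, same_obj and converted are hypothetical and chosen only for this illustration.

```python
import numpy as np

# Hypothetical names, used only for this illustration.
base = np.array([1, 2, 3])

same_obj = np.asarray(base)                # no conversion needed: same object
converted = np.asarray(base, dtype=float)  # dtype changes: a new array is built

print(same_obj is base)    # True
print(converted is base)   # False

base[0] = 100
print(same_obj)    # follows the change made through base
print(converted)   # keeps the old values, stored as floats
```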
1.1.2. np.zeros()
In case you already know the size of the array you will need, it is common to initialize it with zeros first using the np.zeros() function, as shown below.
Set a limit when printing the huge arrays we will generate.
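One way to set such a limit is np.set_printoptions(). The sketch below is an assumption about how this could be done, not necessarily the exact line used in this notebook; in particular, threshold=8 is an illustrative value.

```python
import numpy as np

# threshold=8 is an assumed value for illustration: arrays with more
# than 8 elements are printed in summarized form, with '...' in the middle.
np.set_printoptions(threshold=8)

short_text = str(np.arange(5))   # 5 elements: printed in full
long_text = str(np.arange(10))   # 10 elements: summarized
print(short_text)
print(long_text)
```

The setting only affects how arrays are displayed; the arrays themselves keep all their elements.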
I know I will need an array with 100000 elements, so I create it full of zeros first. Then, I assign the values I need to each element. In this example, I only wrote a for loop to assign random integers between 0 and 9 to it. Note the use of range(len(my_arr)): we use this often to make the for loop run over the same number of elements as some array.
my_arr = np.zeros(100000)
print('my_arr with a bunch of zeros \n{}\n'.format(my_arr))
print('#######################')
import random
for i in range(len(my_arr)):
    my_arr[i] = random.randint(0,9)
print('\nmy_arr with random numbers \n{}'.format(my_arr))
my_arr with a bunch of zeros
[0. 0. 0. ... 0. 0. 0.]
#######################
my_arr with random numbers
[1. 0. 9. ... 7. 2. 4.]
Note that these arrays still have \(100000\) elements; because of the print limit we set at the start, the output is truncated. Otherwise you would have to scroll a lot. :P
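The loop above is fine for learning, but filling a large array element by element is slow. As an alternative technique (not the one used in this notebook), NumPy's random generator can produce the whole array in one call. This is a minimal sketch; the names rng and rand_arr and the seed value are assumptions made for the example.

```python
import numpy as np

# Seed chosen arbitrarily so the example is repeatable.
rng = np.random.default_rng(seed=42)

# integers() excludes the upper bound by default, so high=10 gives 0..9.
rand_arr = rng.integers(low=0, high=10, size=100000)

print(rand_arr.shape)
print(rand_arr.min(), rand_arr.max())
```

Note that this array holds integers, whereas the np.zeros() version above holds floats.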
1.1.3. np.min(), np.max() and np.mean()
Numpy also provides various functions to help you process your data. You can, for instance, find the minimum value of an array, or its mean. Your task is to find the minimum, maximum, and mean values of an array.
Find the minimum, maximum and mean values of A_1 and print the results.
A_1_min = np.min(A_1)
A_1_max = np.max(A_1)
A_1_mean = np.mean(A_1)
print(f'The minimum value of A_1 is {A_1_min} \n')
print(f'The maximum value of A_1 is {A_1_max} \n')
print(f'The mean value of A_1 is {A_1_mean} \n')
The minimum value of A_1 is 1
The maximum value of A_1 is 13
The mean value of A_1 is 6.0
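The same quantities are also available as methods on the array itself, which some people find more readable. A short sketch, using a hypothetical array named values with the same numbers as A_1:

```python
import numpy as np

# Hypothetical array holding the same numbers as A_1.
values = np.array([1, 2, 3, 5, 7, 11, 13])

# Method forms of np.min(), np.max() and np.mean().
v_min = values.min()
v_max = values.max()
v_mean = values.mean()

print(v_min, v_max, v_mean)
```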
1.1.4. np.arange()
Another useful function of the numpy module is np.arange(). First, let’s see in the documentation what it does.
It reads: arange([start,] stop[, step,], dtype=None, *, like=None)
Return evenly spaced values within a given interval.
To make a number range you need to choose:
1) the starting point,
2) the endpoint,
3) the interval between each point
The reason why it reads [start,] stop[, step,] with square brackets is that start and step can be omitted. If not specified, start = 0 and step = 1 by default.
Warning
Your endpoint is not included in the array. If you want to include the endpoint in the array, you have to specify the stop to be endpoint + step. This will be clearer in the following examples.
With start and step omitted:
arr = np.arange(5)
print('arr =', arr)
arr = [0 1 2 3 4]
As mentioned, the endpoint (5) is excluded. If you would like to include it:
arr = np.arange(5 + 1)
print('arr =', arr)
arr = [0 1 2 3 4 5]
Now, without omitting start or step, and without the endpoint:
arr = np.arange(1, 2, 0.01)
print('arr =', arr)
arr = [1. 1.01 1.02 ... 1.97 1.98 1.99]
Including endpoint
arr = np.arange(1, 2 + 0.01, 0.01)
print('arr =', arr)
arr = [1. 1.01 1.02 ... 1.98 1.99 2. ]
You can also generate a descending array, by using negative steps
arr = np.arange(10,0,-1)
print('arr =', arr)
arr = [10 9 8 ... 3 2 1]
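To recap the endpoint rule with a step other than 1, here is a small sketch using integer values (start, endpoint and step are illustrative names and numbers, not taken from the notebook). With floating-point steps, rounding can occasionally change the number of elements, so checking the last element of the result is a sensible habit.

```python
import numpy as np

start, endpoint, step = 0, 9, 3   # illustrative values

without_end = np.arange(start, endpoint, step)
with_end = np.arange(start, endpoint + step, step)

print(without_end)   # [0 3 6]
print(with_end)      # [0 3 6 9]
```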
1.1.5. np.sort()
You can also sort an array in ascending order using np.sort():
sorted_arr = np.sort(arr)
print('sorted_arr =', sorted_arr)
sorted_arr = [ 1 2 3 ... 8 9 10]
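It may help to know that np.sort() returns a new array and leaves its input untouched, while the array's own sort() method reorders the array in place. The sketch below illustrates this with a hypothetical array named original; it also shows that reversing the sorted result gives descending order.

```python
import numpy as np

original = np.array([3, 1, 2])   # hypothetical data

ascending = np.sort(original)    # new sorted array; original is untouched
descending = ascending[::-1]     # reversing gives descending order

print(original)     # [3 1 2]
print(ascending)    # [1 2 3]
print(descending)   # [3 2 1]

original.sort()     # the method sorts in place instead
print(original)     # [1 2 3]
```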
1.1.6. np.sum()
As the name clearly states, np.sum() returns the sum of an array. Let’s try it out:
arr = np.array([1,2,3])
my_sum = np.sum(arr)
print(f'The sum of the array is {my_sum}')
The sum of the array is 6