# these lines need to be run only once per program
import numpy as np
import matplotlib as plt  # top-level package, used here only to read its version; pyplot is imported as plt further down
Numeric Python
Numpy/scipy/matplotlib
- Most Python scientists use the following libraries:
  - numpy: performant array library (vectors, matrices, tensors)
  - matplotlib: plotting library
  - scipy: all kinds of mathematical routines
- In the rest of the course, we’ll make some use of numpy and matplotlib
- They are included in scientific Python distributions like Anaconda Python
- Many additional libraries build on numpy and matplotlib: pandas, statsmodels, sklearn
Importing the libraries
It is standard to import numpy as np and matplotlib.pyplot as plt. We’ll follow this convention here.
print(f"Numpy version {np.__version__}")
print(f"Matplotlib version {plt.__version__}")
Numpy
What is Numpy
Numpy provides an array type (a python object) meant to efficiently store homogeneous, rectangular arrays (like \((a_{i})_{i\in [1,N]}\) or \((b_{i,j,k})_{i\in [1,N],j\in[1,J],k \in [1,K]}\))
By default it stores data in contiguous C-order (the last index varies fastest), but it also supports Fortran order and strided (non-contiguous) arrays.
Numpy has introduced well-thought-out conventions that have been reused by many other libraries (tensorflow, pytorch, jax), or even programming languages (julia)
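The memory layouts mentioned above can be inspected directly. The following is a minimal sketch, assuming 8-byte float64 elements; the variable names are only illustrative.

```python
import numpy as np

# 2x3 array of float64 (8 bytes per element), stored in C order by default
a = np.arange(6, dtype=np.float64).reshape(2, 3)
print(a.flags["C_CONTIGUOUS"], a.strides)   # strides are expressed in bytes

# same values, stored in Fortran order (the first index varies fastest)
f = np.asfortranarray(a)
print(f.flags["F_CONTIGUOUS"], f.strides)

# taking every other column gives a strided, non-contiguous view on a's data
s = a[:, ::2]
print(s.flags["C_CONTIGUOUS"], s.strides)
```

The strides give the number of bytes to skip in memory when an index increases by one, which is how the same buffer can be read in either order.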
Vector Creation
- Vectors and matrices are created with the np.array(...) function.
- Special vectors can be created with np.zeros, np.ones, np.linspace
# an array can be created from a list of numbers
np.array( [1.0, 2.0, 3.0] )
# or initialized by specifying the length of the array
np.zeros(5)
# 11 regularly spaced points between 0 and 1 (10 intervals)
np.linspace(0, 1, 11)
Matrix Creation
- A matrix is a 2-dimensional array and is created with np.array
- The np.matrix() class is no longer recommended: do not use it.
- There are functions to create specific matrices: np.eye, np.diag, …
# an array can be created from a list of (equal size) lists
np.array([
    [1.0, 2.0, 3.0],
    [4 , 5, 6]
])
# initialize a matrix of zeros with the dimensions as a tuple
A = np.zeros( (2, 3) )
A
# matrix dimensions are contained in the shape attribute
A.shape
Tensors
The construction generalizes to higher dimension arrays (a.k.a. tensors)
# an array can be created from a list of lists of lists
np.array([
    [
        [1.0, 2.0, 3.0],
        [4 , 5, 6]
    ],
    [
        [7.0, 8.0, 9.0],
        [10 , 11, 12]
    ]
])
# initialize a matrix of zeros with the dimensions as a tuple
A = np.zeros( (2, 3) )
A
# matrix dimensions are contained in the shape attribute
A.shape
Linear Algebra
Matrix-vector and matrix-matrix products are computed with the special operator @
A = np.array([[1.0, 2.0], [2,4]])
A
B = np.array([1.0, 2.0])
B
A@B
A@A
Note how multiplication reduces the total number of dimensions by 2: two 2-dimensional operands (4 dimensions in total) give a 2-dimensional result. It is a tensor reduction.
print(A.shape, A.shape, (A@A).shape)
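The dimension count can be checked on a few products. This is a small sketch restricted to one- and two-dimensional operands, reusing the values of A and B from above.

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 4.0]])   # shape (2, 2)
B = np.array([1.0, 2.0])                 # shape (2,)

print((A @ A).shape)   # 2 + 2 dimensions in, 2 out
print((A @ B).shape)   # 2 + 1 dimensions in, 1 out
print((B @ B).shape)   # 1 + 1 dimensions in, 0 out (a scalar)
print(A @ B)           # entries are 1*1 + 2*2 and 2*1 + 4*2
```

In each case one index of the left operand is summed against one index of the right operand, so two dimensions disappear from the total.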
Scalar types
Numpy arrays can contain data of several scalar types.
[True, False, True]
# vector of boolean
boolean_vector = np.array( [True, False, True] )
print(f"type of scalar '{boolean_vector.dtype}'")
boolean_vector
# vector of integers
int_vector = np.array([1, 2, 0])
print(f"type of scalar '{int_vector.dtype}'")
int_vector
By default, floating-point arrays contain float64 numbers (like Matlab). But GPUs typically process 16-bit or 32-bit numbers.
Can you create a 32-bit array?
# your code here
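One possible answer, given as a sketch rather than the only way to do it: the scalar type can be requested with the dtype argument, or obtained by converting an existing array. The variable names are illustrative.

```python
import numpy as np

# request the scalar type at creation time with the dtype argument
v32 = np.array([1.0, 2.0, 3.0], dtype=np.float32)
z32 = np.zeros(5, dtype=np.float32)

# or convert an existing float64 array (astype returns a new array)
v64 = np.array([1.0, 2.0, 3.0])
w32 = v64.astype(np.float32)

print(v64.dtype, v32.dtype, z32.dtype, w32.dtype)
print(v64.itemsize, v32.itemsize)   # bytes per element
```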
Subscripting Vectors
- Elements and subarrays can be retrieved using the same syntax as lists and strings.
- Remember that indexing starts at 0.
V = np.array([0., 1., 2., 3., 4.])
display(V[1]) # second element
V = np.array([0., 1., 2., 3., 4.])
display(V[1:3]) # second and third elements (index 3 is excluded)
Modifying Vector Content
- Elements and subvectors can be assigned new values, as long as they have the right dimensions.
V = np.array([1., 1., 2., 4., 5., 8., 13.])
V[3] = 3.0
V
V = np.array([1., 1., 2., 4., 5., 8., 13.])
# V[1:4] = [1,2,3,4] # this doesn't work
V[1:4] = [2,3,4] # this works
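To see what "the right dimensions" means in practice, here is a small sketch that catches the error raised by a mismatched assignment instead of commenting it out.

```python
import numpy as np

V = np.array([1., 1., 2., 4., 5., 8., 13.])

V[1:4] = [2, 3, 4]      # three values into three slots: works
V[4:] = 0.0             # a single scalar is repeated over the whole slice

try:
    V[1:4] = [1, 2, 3, 4]   # four values into three slots
except ValueError as err:
    print("assignment failed:", err)

print(V)
```

The failed assignment leaves V unchanged, so the final print shows only the effect of the first two assignments.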
Subscripting Matrices
- Indexing generalizes to matrices: there are two indices instead of one: M[i,j]
- One can extract a row or a column (a slice) with M[i,:] or M[:,i]
- A submatrix is defined with two intervals: M[i:j, k:l] or M[i:j, :], …
M = np.array([[1,2,3],[4,5,6],[7,8,9]])
M
M[0,1] # access element (1,2)
M[2,:] # third row
M[:,1] # second column # M[i,1] for any i
M[1:3, :] # lines from 1 (included) to 3 (excluded) ; all columns
Modifying matrix content
M = np.array([[1,2,3],[4,5,6],[7,8,9]])
M
M[0,0] = 0
M
M[1:3, 1:3] = np.array([[0,1],[1,0]]) # dimensions must match
M
Element-wise algebraic operations
- The following algebraic operations are defined on arrays: +, -, *, /, **.
- Comparison operators (<, <=, >, >=, ==) are defined and return boolean arrays.
- They operate element by element.
A = np.array([1,2,3,4])
B = np.array([4,3,2,1])
A+B
A*B # note the difference with A@B
A>B
At first, one might be surprised that the default multiplication operator is element-wise multiplication rather than matrix multiplication.
There are at least two good reasons:
- consistency: all operators can be broadcast with the exact same rules (like *, +, >)
- for many workflows, elementwise operations are more common than matrix multiplication
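The difference between the two products is easiest to see on a small matrix. A minimal sketch, with an identity matrix chosen only because it makes the two results easy to tell apart:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
I = np.eye(2, dtype=int)     # 2x2 identity matrix

print(A * I)   # element-wise: keeps only the diagonal of A
print(A @ I)   # matrix product: multiplying by the identity returns A
```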
Element-wise logical operations
- The following logical operations are defined element-wise on arrays: & (and), | (or), ~ (not)
A = np.array([False, False, True, True])
B = np.array([False, True, False, True])
~A
A | B
A & B
Vector indexing
- Arrays can be indexed by boolean arrays instead of ranges.
- Only elements corresponding to true are retrieved
x = np.linspace(0,1,6)
x
# indexes such that (x^2) > (x/2)
x**2 > (x/2)
cond = x**2 > (x/2)
x[ cond ]
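Several conditions can be combined with the element-wise logical operators before indexing. The sketch below uses illustrative thresholds; note that each comparison needs its own parentheses, because & binds more tightly than < and >.

```python
import numpy as np

x = np.array([0.0, 0.2, 0.4, 0.6, 0.8, 1.0])

# each comparison is parenthesized, then the boolean arrays are combined
inside = (x > 0.1) & (x < 0.7)
outside = ~inside

print(inside)
print(x[inside])     # elements strictly between 0.1 and 0.7
print(x[outside])    # all the others
```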
Going further: broadcasting rules
- The numpy library defines very consistent conventions (broadcasting rules) for combining arrays whose dimensions do not match.
- Ignore them for now…
M = np.eye(4)
M
M[2:4, 2:4] = 0.5 # float
M
M[:,:2] = np.array([[0.1, 0.2]]) # 1x2 array
M
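For readers who do want a first look, here is a minimal sketch of the rule at work: shapes are aligned from the right and dimensions of size 1 are stretched. The arrays are illustrative, and np.broadcast_shapes is assumed to be available (it was added in numpy 1.20).

```python
import numpy as np

col = np.array([[0.0], [10.0], [20.0]])   # shape (3, 1)
row = np.array([1.0, 2.0, 3.0, 4.0])      # shape (4,)

# (3, 1) and (4,) are combined into a (3, 4) result
table = col + row
print(table.shape)
print(table)

# report the resulting shape without computing anything
print(np.broadcast_shapes((3, 1), (4,)))
```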
Going Further
- Other useful functions (easy to google):
  - np.arange(): regularly spaced integers
  - np.where(): find elements in
  - …
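A short sketch of the two functions named above, with illustrative values. np.where is shown in its two common forms: with one argument it returns the positions where a condition holds, with three arguments it picks values from one of two sources.

```python
import numpy as np

n = np.arange(10)              # integers 0, 1, ..., 9 (the stop value is excluded)
evens = np.arange(0, 10, 2)    # start, stop, step

x = np.linspace(0, 1, 6)
idx = np.where(x**2 > x/2)[0]          # positions where the condition is true
clipped = np.where(x > 0.5, 0.5, x)    # 0.5 where the condition holds, x elsewhere

print(n, evens)
print(idx)
print(clipped)
```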
Matplotlib
Matplotlib
- matplotlib is …
  - object oriented api
  - optional Matlab-like syntax
- main function is plt.plot(x,y) where x and y are vectors (or iterables like lists)
  - lots of optional arguments
from matplotlib import pyplot as plt
Example
x = np.linspace(-1,1,6)
y = np.sin(x)/x # cardinal sine (sinc)
plt.plot(x,y,'o')
plt.plot(x,y)
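The example above works because the six-point grid does not contain 0; with a grid that does, np.sin(x)/x produces a nan at that point. One way around it, sketched here as an assumption about what the reader may want, is np.sinc, which numpy defines as sin(pi t)/(pi t) with the value 1 at t = 0.

```python
import numpy as np

x = np.linspace(-1, 1, 5)    # odd number of points: the grid contains 0
y = np.sinc(x / np.pi)       # rescale the argument to recover sin(x)/x

print(x)
print(y)                     # the middle value is 1, not nan
```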
Example (2)
x = np.linspace(-5,5,100)

fig = plt.figure() # keep a figure open to draw on it
for k in range(1,5):
    y = np.sin(x*k)/(x*k)
    plt.plot(x, y, label=f"$sinc({k} x)$") # label each line
plt.plot(x, x*0, color='black', linestyle='--')
plt.grid(True) # add a grid
plt.title("Looking for the right hat.")
plt.legend(loc="upper right")
Example (3)
x = np.linspace(-5,5,100)

plt.figure()
plt.subplot(2,2,1) # create a 2x2 subplot and draw in first quadrant
plt.plot(x,x)
plt.subplot(2,2,2) # create a 2x2 subplot and draw in second quadrant
plt.plot(x,-x)
plt.subplot(2,2,3) # create a 2x2 subplot and draw in third quadrant
plt.plot(x,-x)
plt.subplot(2,2,4) # create a 2x2 subplot and draw in fourth quadrant
plt.plot(x,x)

plt.tight_layout() # save some space
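The same figure can also be written with the object oriented api. This is a sketch of one possible equivalent, not the only way to write it; the title is made up for illustration, and the Agg backend is selected only so that the code also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")   # assumption: non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(-5, 5, 100)

# one Figure and a 2x2 array of Axes objects
fig, axes = plt.subplots(2, 2)
axes[0, 0].plot(x, x)
axes[0, 1].plot(x, -x)
axes[1, 0].plot(x, -x)
axes[1, 1].plot(x, x)

axes[0, 0].set_title("first quadrant")   # hypothetical title, for illustration
fig.tight_layout()                       # save some space
```

Each subplot is an explicit object here, so there is no hidden notion of a "current" subplot as with plt.subplot.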
Alternatives to matplotlib
- plotly (nice javascript graphs)
- bqplot (native integration with jupyter)
- altair
- excellent for dataviz/interactivity
- python wrapper to Vega-lite
- very efficient to visualize pandas data (i.e. a dataframe)