Skip to content
Snippets Groups Projects
Commit cd556732 authored by Repo Updater's avatar Repo Updater
Browse files

a12b9095 add dask and graphviz to env

parent 45a4a069
Branches
No related tags found
No related merge requests found
...@@ -18,6 +18,9 @@ dependencies: ...@@ -18,6 +18,9 @@ dependencies:
- gfortran_linux-64 - gfortran_linux-64
- numba - numba
- jax - jax
- dask
- graphviz
- python-graphviz
- mpich - mpich
- mpi4py - mpi4py
- hdf5=*=mpi* - hdf5=*=mpi*
......
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# NumPy and SciPy # NumPy and SciPy
**Python for HPC** **Python for HPC**
Max Planck Computing and Data Facility, Garching Max Planck Computing and Data Facility, Garching
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Outline ## Outline
* NumPy * NumPy
* SciPy examples * SciPy examples
* Input/output with NumPy arrays * Input/output with NumPy arrays
* Bonus: Image processing examples * Bonus: Image processing examples
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Performance Problem - Loops ## Performance Problem - Loops
* Looping in Python (and most interpreted languages) can be **very expensive** * Looping in Python (and most interpreted languages) can be **very expensive**
* Numerical codes perform mostly array computation * Numerical codes perform mostly array computation
* Example: Discrete Fourier Transform * Example: Discrete Fourier Transform
$$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{\frac{-2i\pi}{N}kn} $$ $$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{\frac{-2i\pi}{N}kn} $$
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import cmath import cmath
import random import random
def DFT_loop(x): def DFT_loop(x):
N = len(x) N = len(x)
X = [0 for n in range(N)] X = [0 for n in range(N)]
for k in range (N): for k in range (N):
for n in range(N): for n in range(N):
X[k] += x[n]*cmath.exp(-2j * cmath.pi * k * n / N) X[k] += x[n]*cmath.exp(-2j * cmath.pi * k * n / N)
return X return X
X = [random.random() for i in range(512)] X = [random.random() for i in range(512)]
F = DFT_loop(X) F = DFT_loop(X)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Solution - Numpy Arrays ### Solution - Numpy Arrays
<div class="row"> <div class="row">
<div class="col-md-6" markdown="1"> <div class="col-md-6" markdown="1">
* Possible solution: turn loop coefficients into arrays and then use linear algebra operations * Possible solution: turn loop coefficients into arrays and then use linear algebra operations
* <span style="color:red">Larger memory footprint</span>, but <span style="color:green">higher performance</span> * <span style="color:red">Larger memory footprint</span>, but <span style="color:green">higher performance</span>
* much faster than loops * much faster than loops
</div> </div>
<div class="col-md-6" markdown="1"> <div class="col-md-6" markdown="1">
<center> <center>
<img src="fig/numpy_fft.svg?1", style="width: 85%"> <img src="fig/numpy_fft.svg?1", style="width: 85%">
</center> </center>
</div> </div>
</div> </div>
$$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{\frac{-2i\pi}{N}kn} $$ $$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{\frac{-2i\pi}{N}kn} $$
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
def DFT_arrays(x): def DFT_arrays(x):
x = np.asarray(x, dtype=float) x = np.asarray(x, dtype=float)
N = x.shape[0] N = x.shape[0]
n = np.arange(N) n = np.arange(N)
k = n.reshape((N, 1)) k = n.reshape((N, 1))
M = np.exp(-2j * np.pi * k * n / N) M = np.exp(-2j * np.pi * k * n / N)
return np.dot(M, x) return np.dot(M, x)
X = np.random.random(512) X = np.random.random(512)
F = DFT_arrays(X) F = DFT_arrays(X)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### True Solution - External Call ### True Solution - External Call
* low-level external libraries will almost always be *much* better! * low-level external libraries will almost always be *much* better!
* calls highly optimized low level (e.g. C, Fortran) functions with OpenMP, cache-hit, vectorization, etc * calls highly optimized low level (e.g. C, Fortran) functions with OpenMP, cache-hit, vectorization, etc
* NumPy and SciPy already interface FFT (and many other) libraries * NumPy and SciPy already interface FFT (and many other) libraries
* How do their performance compare? * How do their performance compare?
$$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{kn(-2i\pi /N)} $$ $$ X_k = \sum_{n=0}^{N-1}x_n\cdot e^{kn(-2i\pi /N)} $$
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
X = np.random.random(512) X = np.random.random(512)
%timeit Fl = DFT_loop(X) %timeit Fl = DFT_loop(X)
%timeit Fa = DFT_arrays(X) %timeit Fa = DFT_arrays(X)
%timeit Fe = np.fft.fft(X) %timeit Fe = np.fft.fft(X)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Bread & Butter: NumPy & SciPy ## Bread & Butter: NumPy & SciPy
* NumPy: * NumPy:
* efficient handling of arrays operations via NumPy Arrays * efficient handling of arrays operations via NumPy Arrays
* executes loops in compiled C code (*much* faster!) * executes loops in compiled C code (*much* faster!)
* SciPy: * SciPy:
* more advanced mathematical and numerical routines * more advanced mathematical and numerical routines
* provides several flavours of (sparse) matrices computation * provides several flavours of (sparse) matrices computation
* Both make calls to BLAS and LAPACK routines * Both make calls to BLAS and LAPACK routines
* `numpy.linalg` and `scipy.linalg` * `numpy.linalg` and `scipy.linalg`
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## NumPy arrays ## NumPy arrays
* Main workhorse of NumPy $\to$ stores all kinds of data * Main workhorse of NumPy $\to$ stores all kinds of data
* plain C arrays with the plain values * plain C arrays with the plain values
* preserves the intuitive syntax from Python * preserves the intuitive syntax from Python
* may use CPU performance features such as prefetching, caching, vectorization * may use CPU performance features such as prefetching, caching, vectorization
* may use high-performance libraries such as MKL (Math Kernel Library by Intel), Lapack, BLAS under the hood * may use high-performance libraries such as MKL (Math Kernel Library by Intel), Lapack, BLAS under the hood
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
my_array = np.array([[1,2,3,4], [5,6,7,8]]) my_array = np.array([[1,2,3,4], [5,6,7,8]])
print(my_array) print(my_array)
print(type(my_array)) print(type(my_array))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Constructors ### NumPy Arrays - Constructors
<br> <br>
* often require "shape" tuple (the parenthesis are mandatory) * often require "shape" tuple (the parenthesis are mandatory)
* accepts multi-dimension * accepts multi-dimension
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
x = np.zeros((3,4)) x = np.zeros((3,4))
x = np.ones((2,3,4)) x = np.ones((2,3,4))
x = np.full((2,2),7.2) x = np.full((2,2),7.2)
# random arrays # random arrays
x = np.random.random((2,2)) # 2x2 matrix with random reals x = np.random.random((2,2)) # 2x2 matrix with random reals
x = np.random.randint(10,size=5) # 5 random integers between 0 and 9 x = np.random.randint(10,size=5) # 5 random integers between 0 and 9
# use uninitialized memory # use uninitialized memory
x = np.empty((3,2)) x = np.empty((3,2))
# start, stop, stride # start, stop, stride
x = np.arange(10,55,5) x = np.arange(10,55,5)
print("Discrete Range:" + str(x)) print("Discrete Range:" + str(x))
# first element, last element, total number of elements # first element, last element, total number of elements
x = np.linspace(0,1,9) x = np.linspace(0,1,9)
print("Continuous Range:" + str(x)) print("Continuous Range:" + str(x))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Datatypes & Compatibility ### NumPy Arrays - Datatypes & Compatibility
* datatypes map to native C types * datatypes map to native C types
* datatype can be specified via the `dtype=` parameter of constructor * datatype can be specified via the `dtype=` parameter of constructor
* operations automatically <span style="color:red">cast compatible types</span> * operations automatically <span style="color:red">cast compatible types</span>
* defaults (in most Linux/x86_64): `np.float64` and `np.int64` * defaults (in most Linux/x86_64): `np.float64` and `np.int64`
* https://docs.scipy.org/doc/numpy/user/basics.types.html * https://docs.scipy.org/doc/numpy/user/basics.types.html
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
x = np.array([1,2,3]) x = np.array([1,2,3])
y = np.array([1.,2.,3.]) y = np.array([1.,2.,3.])
print("x and y type: %s and %s" % (x.dtype, y.dtype)) print("x and y type: %s and %s" % (x.dtype, y.dtype))
r = x+y r = x+y
print("x+y type: %s" % (r.dtype)) print("x+y type: %s" % (r.dtype))
z = np.array([1+2j,2+3j,3+4j]) z = np.array([1+2j,2+3j,3+4j])
sz = z.astype(np.complex64) sz = z.astype(np.complex64)
print("z and sz type: %s and %s" % (z.dtype, sz.dtype)) print("z and sz type: %s and %s" % (z.dtype, sz.dtype))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Functions ### NumPy Arrays - Functions
Arrays have a large selection of predefined operations. There are too many to list here! Arrays have a large selection of predefined operations. There are too many to list here!
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
x = np.random.randint(10,size=8) x = np.random.randint(10,size=8)
print("* Unsorted array: ", x) print("* Unsorted array: ", x)
x.sort() x.sort()
print("* Sorted array: ", x) print("* Sorted array: ", x)
# compute some properties # compute some properties
y = np.random.random((10,3)) y = np.random.random((10,3))
print("* Max, min, mean, std, sum: ", y.max(), y.min(), y.mean(), y.std(), y.sum()) print("* Max, min, mean, std, sum: ", y.max(), y.min(), y.mean(), y.std(), y.sum())
# may also specify over which axis # may also specify over which axis
print("* Max of each column: ", y.max(axis=0)) print("* Max of each column: ", y.max(axis=0))
print("* Max of each row: ", y.max(axis=1)) print("* Max of each row: ", y.max(axis=1))
# comparing two arrays # comparing two arrays
z = y + 1e-16 z = y + 1e-16
print("* Equal: ", np.array_equal(y,z)) print("* Equal: ", np.array_equal(y,z))
print("* Close: ", np.allclose(y, z, 1e-8)) print("* Close: ", np.allclose(y, z, 1e-8))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Functions Performance ### NumPy Arrays - Functions Performance
* prefer to use functions if they are there: <span style="color:green">faster</span> and <span style="color:green">cleaner</span> * prefer to use functions if they are there: <span style="color:green">faster</span> and <span style="color:green">cleaner</span>
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# use numpy functions # use numpy functions
def mean_loop(a): def mean_loop(a):
result = 0 result = 0
for i in range(len(a)): for i in range(len(a)):
result += a[i] result += a[i]
return result/len(a) return result/len(a)
x = np.arange(2**20) x = np.arange(2**20)
%timeit mean_loop(x) %timeit mean_loop(x)
%timeit np.mean(x) %timeit np.mean(x)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Attributes ### NumPy Arrays - Attributes
* rather advanced * rather advanced
* useful if you are writing interfaces between Python and C/Fortran * useful if you are writing interfaces between Python and C/Fortran
* useful for <span style="color:green">asserting contiguity</span> of data * useful for <span style="color:green">asserting contiguity</span> of data
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
print(y.ndim) # number of dimensions print(y.ndim) # number of dimensions
print(y.size) # number of elements print(y.size) # number of elements
print(y.dtype) # information about the data type print(y.dtype) # information about the data type
print(y.flags) # information about memory layout print(y.flags) # information about memory layout
print(y.itemsize) # size of one array element in bytes print(y.itemsize) # size of one array element in bytes
print(y.nbytes) # total size of the array in bytes print(y.nbytes) # total size of the array in bytes
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Basic Slicing ### NumPy Arrays - Basic Slicing
* syntax identical to lists * syntax identical to lists
* does <span style="color:red">not guarantee contiguity</span> of data * does <span style="color:red">not guarantee contiguity</span> of data
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
x = np.arange(20) x = np.arange(20)
# basic slice syntax as for python lists is start:end:step # basic slice syntax as for python lists is start:end:step
# for multidimension [start1:end1:step1, start2:end2:step2, ...] # for multidimension [start1:end1:step1, start2:end2:step2, ...]
print("From 2 to 9, in steps 2:", x[2:9:2]) print("From 2 to 9, in steps 2:", x[2:9:2])
print("All except 3 last elements:", x[:-3]) print("All except 3 last elements:", x[:-3])
# revert array # revert array
print("Backwards:", x[::-1]) print("Backwards:", x[::-1])
# Reshape into multi-D arrays # Reshape into multi-D arrays
y = x.reshape((5,4)) y = x.reshape((5,4))
print("Shape of y:", y.shape) print("Shape of y:", y.shape)
print("Slice of y:", y[1:,:2]) print("Slice of y:", y[1:,:2])
# new dimensions can be added # new dimensions can be added
z = np.arange(3) z = np.arange(3)
print("z:") print("z:")
print(z[:,np.newaxis]) print(z[:,np.newaxis])
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Broadcasting ### NumPy Arrays - Broadcasting
* some dimension conversion is automatic, some are not! * some dimension conversion is automatic, some are not!
* careful with <span style="color:red">readability</span> of the code * careful with <span style="color:red">readability</span> of the code
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
<center> <center>
<img src="fig/numpy_broadcast.svg" style="width: 60%"> <img src="fig/numpy_broadcast.svg" style="width: 60%">
</center> </center>
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays - Broadcasting ### NumPy Arrays - Broadcasting
* careful with <span style="color:red">readability</span> of the code * careful with <span style="color:red">readability</span> of the code
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
X = np.arange(12).reshape(4,3) X = np.arange(12).reshape(4,3)
Y = np.arange(3)/10 Y = np.arange(3)/10
Z = np.arange(4)/10 Z = np.arange(4)/10
print("X is a 4x3 matrix:\n",X) print("X is a 4x3 matrix:\n",X)
print("\nX+Y is valid:\n",X+Y) print("\nX+Y is valid:\n",X+Y)
# X+Z is invalid! We need to add a new dimension: # X+Z is invalid! We need to add a new dimension:
print("\nX+Z[:,None] is valid:\n",X+Z[:,None]) print("\nX+Z[:,None] is valid:\n",X+Z[:,None])
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Array - Indexing ### NumPy Array - Advanced Indexing
* indexes can be: * the selection object (as in `x[obj]`) can be:
* a list (or an array!) of positions * a non-tuple sequence object (i.e., list or range) or ndarray of indices
* or a tuple with at least one non-tuple squence object or ndarray
* a boolean array $\to$ must have same dimensions * a boolean array $\to$ must have same dimensions
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
x = np.arange(10)*10 x = np.arange(10)*10
print(x) print(x)
# use integer list as indices # use integer list as indices
print("X with integer indices: ", x[[2,5,8]]) print("X with integer indices: ", x[[2,5,8]])
print("X with boolean indices: ", x[[False, False, True, False, False, True, False, False, True, False]]) print("X with boolean indices: ", x[[False, False, True, False, False, True, False, False, True, False]])
# create boolean array # create boolean array
y = x**2 - 750 y = x**2 - 750
index = (x < y) index = (x < y)
print(index) print(index)
print("X with boolean indices: ", x[index]) print("X with boolean indices: ", x[index])
print("X with negated indices: ", x[~index]) print("X with negated indices: ", x[~index])
print("X with oneliner: ", x[x<y]) print("X with oneliner: ", x[x<y])
# Set x to 100 for y below 10 - equivalent to: # Set x to 100 for y below 10 - equivalent to:
# for i in range(len(x)): # for i in range(len(x)):
# if y[i]<10: # if y[i]<10:
# x[i] = 100 # x[i] = 100
# This is actually a view! we discuss it next # Indexing can be used to assign values to indexed arrays
x[x < y] = 100 x[x < y] = 100
print("Oneliner x: ", x) print("Oneliner x: ", x)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays vs Views ### NumPy Arrays vs Views
* Using slicing on arrays usually returns a view on the original array * Basic slicing on arrays usually returns a view on the original array
* Data is **not** copied $\to$ changing the original array also changes the view * Data is **not** copied $\to$ changing the original array also changes the view
* Views may be <span style="color:red">non-contiguous data</span> * Views may be <span style="color:red">non-contiguous data</span>
* use `copy()` for copying the data over * use `copy()` for copying the data over
* Advanced indexing directly returns a copy of the indexed data
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
x = np.arange(15) x = np.arange(15)
print("Array x: ", x) print("Array x: ", x)
y = x[1:8:2] y = x[1:8:2]
print("View y: ", y) print("View y: ", y)
print(y.flags) print(y.flags)
x[1] = 20 x[1] = 20
print("View y after changing x: ", y) print("View y after changing x: ", y)
y[0] *= 10 y[0] *= 10
print("Original x after changing view y: ", x) print("Original x after changing view y: ", x)
c = y.copy() c = y.copy()
print("Copy of y: ", c) print("Copy of y: ", c)
print(c.flags) print(c.flags)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### NumPy Arrays vs Views: Performance ### NumPy Arrays vs Views: Performance
* strided data access may <span style="color:red">hinder performance</span>. * strided data access may <span style="color:red">hinder performance</span>.
* temporary array **may** be preferable, but: additional cost due to copy! * temporary array **may** be preferable, but: additional cost due to copy!
* careful with passing views as argument * careful with passing views as argument
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
def sum_test(c,x,y): def sum_test(c,x,y):
# Computes c*x^2 + y # Computes c*x^2 + y
if len(c) == len(x) == len(y): if len(c) == len(x) == len(y):
return c*x*x + y return c*x*x + y
else: else:
print("Wrong shape") print("Wrong shape")
return None return None
def sum_test_copy(c,x,y): def sum_test_copy(c,x,y):
copy = x.copy() copy = x.copy()
return sum_test(c,copy,y) return sum_test(c,copy,y)
x = np.random.random(1000000) x = np.random.random(1000000)
y = np.random.random(100000) y = np.random.random(100000)
c = np.random.random(100000) c = np.random.random(100000)
view = x[::10] view = x[::10]
copy = view.copy() copy = view.copy()
%timeit res = sum_test(c,view,y) %timeit res = sum_test(c,view,y)
%timeit res = sum_test(c,copy,y) %timeit res = sum_test(c,copy,y)
%timeit res = sum_test_copy(c,view,y) %timeit res = sum_test_copy(c,view,y)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Example 1: Midpoint Computation ### Example 1: Midpoint Computation
$$y_i = \dfrac{x_{i}+x_{i+1}}{2}, \qquad \forall i<N$$ $$y_i = \dfrac{x_{i}+x_{i+1}}{2}, \qquad \forall i<N$$
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def midpoints_list(x): # worst implementation def midpoints_list(x): # worst implementation
y = [] y = []
for i in range(len(x)-1): for i in range(len(x)-1):
y.append(0.5*(x[i] + x[i+1])) y.append(0.5*(x[i] + x[i+1]))
return np.array(y) return np.array(y)
def midpoints_loop(x): # loop forcing numpy array def midpoints_loop(x): # loop forcing numpy array
N = len(x) N = len(x)
y = np.zeros(N-1) y = np.zeros(N-1)
for i in range(N-1): for i in range(N-1):
y[i] = 0.5*(x[i] + x[i+1]) y[i] = 0.5*(x[i] + x[i+1])
return y return y
def midpoints_array(x): # best: no loops at all def midpoints_array(x): # best: no loops at all
return 0.5*(x[:-1] + x[1:]) return 0.5*(x[:-1] + x[1:])
x = np.linspace(0, 10, 10**6) x = np.linspace(0, 10, 10**6)
%timeit midpoints_list(x) %timeit midpoints_list(x)
%timeit midpoints_loop(x) %timeit midpoints_loop(x)
%timeit midpoints_array(x) %timeit midpoints_array(x)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Example 1: Midpoint Computation Explanation ### Example 1: Midpoint Computation Explanation
$$y_i = \dfrac{x_{i}+x_{i+1}}{2}, \qquad \forall i<N$$ $$y_i = \dfrac{x_{i}+x_{i+1}}{2}, \qquad \forall i<N$$
<br> <br>
* Loop ```for i in range(N-1):``` to left-hand side ```y[0:N-1]``` * Loop ```for i in range(N-1):``` to left-hand side ```y[0:N-1]```
* Translate indices on right-hand side to ranges for array * Translate indices on right-hand side to ranges for array
* ```x[i]``` to ```x[0:N-1]``` and ```x[i+1]``` to ```x[1:N]``` * ```x[i]``` to ```x[0:N-1]``` and ```x[i+1]``` to ```x[1:N]```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
<center> <center>
<img src="fig/numpy-indexing.svg", style="width: 50%"> <img src="fig/numpy-indexing.svg", style="width: 50%">
</center> </center>
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Example 2: In-Place Manipulations ### Example 2: In-Place Manipulations
* `x[1:] + x[:-1]` will create a temporary array * `x[1:] + x[:-1]` will create a temporary array
* we can avoid it, but does that pay off? * we can avoid it, but does that pay off?
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def temp_arrays(x): def temp_arrays(x):
return 0.5*(x[1:] + x[:-1]) return 0.5*(x[1:] + x[:-1])
def in_place(x): def in_place(x):
y = x[1:] y = x[1:]
y += x[:-1] y += x[:-1]
y *= 0.5 y *= 0.5
return y return y
x = np.random.rand(2**15) x = np.random.rand(2**15)
%timeit temp_arrays(x) %timeit temp_arrays(x)
%timeit in_place(x) %timeit in_place(x)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
* performance is problem- and size-dependent * performance is problem- and size-dependent
* $\to$ may be faster, but not necessarily! * $\to$ may be faster, but not necessarily!
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Example 3: Center of Mass ### Example 3: Center of Mass
* For positions $\vec{x}_i$ ($j$-th coordinate $x_{i,j}$), masses $m_i$: * For positions $\vec{x}_i$ ($j$-th coordinate $x_{i,j}$), masses $m_i$:
$$ \vec{x}_\mathsf{com} = \frac{1}{M} \sum_{i=0}^N m_i \vec{x}_i, \qquad\text{where}\qquad M=\sum_i m_i$$ $$ \vec{x}_\mathsf{com} = \frac{1}{M} \sum_{i=0}^N m_i \vec{x}_i, \qquad\text{where}\qquad M=\sum_i m_i$$
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def center_of_mass_loop(pos, mass): def center_of_mass_loop(pos, mass):
com = np.zeros(3) com = np.zeros(3)
for j in range(3): for j in range(3):
for i in range(pos.shape[0]): for i in range(pos.shape[0]):
com[j] += pos[i, j] * mass[i] com[j] += pos[i, j] * mass[i]
total_mass = 0.0 total_mass = 0.0
for i in range(pos.shape[0]): for i in range(pos.shape[0]):
total_mass += mass[i] total_mass += mass[i]
for j in range(3): for j in range(3):
com[j] /= total_mass com[j] /= total_mass
return com return com
def center_of_mass_arrays(pos, mass): def center_of_mass_arrays(pos, mass):
com = np.zeros(3) com = np.zeros(3)
for j in range(3): for j in range(3):
com[j] = np.sum(pos[:, j]*mass[:]) com[j] = np.sum(pos[:, j]*mass[:])
total_mass = np.sum(mass) total_mass = np.sum(mass)
com /= total_mass com /= total_mass
return com return com
def center_of_mass_oneline(pos, mass): def center_of_mass_oneline(pos, mass):
return np.sum(pos * mass[:, None], axis=0)/np.sum(mass) return np.sum(pos * mass[:, None], axis=0)/np.sum(mass)
# positions and masses of particles # positions and masses of particles
N = 10**5 N = 10**5
pos = np.random.rand(N, 3) pos = np.random.rand(N, 3)
mass = np.random.rand(N) mass = np.random.rand(N)
%timeit center_of_mass_loop(pos, mass) %timeit center_of_mass_loop(pos, mass)
%timeit center_of_mass_arrays(pos, mass) %timeit center_of_mass_arrays(pos, mass)
%timeit center_of_mass_oneline(pos, mass) %timeit center_of_mass_oneline(pos, mass)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## NumPy Summary ## NumPy Summary
* Takeaway messages: * Takeaway messages:
1. High level manipulations $\to$ in Python 1. High level manipulations $\to$ in Python
2. Intensive computation $\to$ delegate to underlying libraries 2. Intensive computation $\to$ delegate to underlying libraries
3. No dedicated library $\to$ use NumPy arrays and their functions 3. No dedicated library $\to$ use NumPy arrays and their functions
<br> <br>
* Tradeoff: high development productivity versus performance * Tradeoff: high development productivity versus performance
* NumPy code *can be* as fast as C or Fortran code, but often it is a factor of 2-4 slower * NumPy code *can be* as fast as C or Fortran code, but often it is a factor of 2-4 slower
$\to$ Implement hotspots in C/Fortran + Python Interface $\to$ Implement hotspots in C/Fortran + Python Interface
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# SciPy # SciPy
* SciPy is a mathematical toolbox built on NumPy, providing functions for * SciPy is a mathematical toolbox built on NumPy, providing functions for
* FFT * FFT
* integration * integration
* interpolation * interpolation
* linear algebra * linear algebra
* sparse matrix operations * sparse matrix operations
* optimization * optimization
* special functions (e.g. Bessel functions) * special functions (e.g. Bessel functions)
* ... and many more * ... and many more
* most of them delegate <span style="color:red">heavy work</span> to C/Fortran libraries * most of them delegate <span style="color:red">heavy work</span> to C/Fortran libraries
* [Documentation](https://docs.scipy.org/doc/scipy/reference/) * [Documentation](https://docs.scipy.org/doc/scipy/reference/)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy Example 1: FFT & Power Spectrum ## SciPy Example 1: FFT & Power Spectrum
* calculate and plot the power spectrum of a signal * calculate and plot the power spectrum of a signal
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# power spectrum (code taken from the SciPy tutorial) # power spectrum (code taken from the SciPy tutorial)
import numpy as np import numpy as np
from scipy.fftpack import fft from scipy.fftpack import fft
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
N = 600 # number of sample points N = 600 # number of sample points
T = 1.0 / 800.0 # sample spacing T = 1.0 / 800.0 # sample spacing
x = np.linspace(0.0, N*T, N) x = np.linspace(0.0, N*T, N)
y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x) y = np.sin(50.0 * 2.0*np.pi*x) + 0.5*np.sin(80.0 * 2.0*np.pi*x)
yf = fft(y) yf = fft(y)
xf = np.linspace(0.0, 1.0/(2.0*T), N//2) xf = np.linspace(0.0, 1.0/(2.0*T), N//2)
plt.plot(xf, 2.0/N * np.abs(yf[0:N//2])) plt.plot(xf, 2.0/N * np.abs(yf[0:N//2]))
plt.grid() plt.grid()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy Example 2: Integration ## SciPy Example 2: Integration
* Numerical integration of * Numerical integration of
* a given function object * a given function object
* a given sample of function values * a given sample of function values
* ODEs (ordinary differential equations) * ODEs (ordinary differential equations)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import scipy.integrate as integrate import scipy.integrate as integrate
result = integrate.quad(lambda x: x**2, 0, 2.5) # uses Fortran's QUADPACK result = integrate.quad(lambda x: x**2, 0, 2.5) # uses Fortran's QUADPACK
print(result[0], 2.5**3/3.) print(result[0], 2.5**3/3.)
x = np.linspace(0, 2.0, 10) x = np.linspace(0, 2.0, 10)
y = np.sin(x) y = np.sin(x)
result = integrate.simps(y, x) # Simpson rule for given sample result = integrate.simps(y, x) # Simpson rule for given sample
print(result, 1.-np.cos(2.0)) print(result, 1.-np.cos(2.0))
# simple derivative: exponential # simple derivative: exponential
def yprime(y, t, alpha): def yprime(y, t, alpha):
return np.exp(alpha*t)*alpha return np.exp(alpha*t)*alpha
ts = np.linspace(0, 2.0, 50) ts = np.linspace(0, 2.0, 50)
y0 = 1.0 y0 = 1.0
y = integrate.odeint(yprime, y0, ts, args=(2.0,)) # uses Fortran's ODEPACK y = integrate.odeint(yprime, y0, ts, args=(2.0,)) # uses Fortran's ODEPACK
print(y[-1], np.exp(2.0 * ts[-1])) # compare to analytical result print(y[-1], np.exp(2.0 * ts[-1])) # compare to analytical result
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy Example 3: Sparse Matrices ## SciPy Example 3: Sparse Matrices
* different sparse formats for different tasks: * different sparse formats for different tasks:
* CSR: compressed sparse row, for row operations/slicing * CSR: compressed sparse row, for row operations/slicing
* CSC: compressed sparse column, for column operations/slicing * CSC: compressed sparse column, for column operations/slicing
* LIL: list-of-lists, efficient for building the matrix, slow for most other operations * LIL: list-of-lists, efficient for building the matrix, slow for most other operations
* DIA: diagonal matrix, very useful for FD * DIA: diagonal matrix, very useful for FD
* `speye()`: sparse identity * `speye()`: sparse identity
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### SciPy Example 3: Sparse Matrices ### SciPy Example 3: Sparse Matrices
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from scipy import sparse from scipy import sparse
import numpy as np import numpy as np
def fill_matrix(M): def fill_matrix(M):
for i in range(300): for i in range(300):
rows = np.random.randint(0,N,300) rows = np.random.randint(0,N,300)
cols = np.random.randint(0,N,300) cols = np.random.randint(0,N,300)
val = np.random.random() val = np.random.random()
M[rows,cols] = val M[rows,cols] = val
return M return M
N = 10000 N = 10000
Mlil = sparse.lil_matrix((N,N)) Mlil = sparse.lil_matrix((N,N))
Mcsr = sparse.csr_matrix((N,N)) Mcsr = sparse.csr_matrix((N,N))
%time Mcsr = fill_matrix(Mcsr) %time Mcsr = fill_matrix(Mcsr)
%time Mlil = fill_matrix(Mlil) %time Mlil = fill_matrix(Mlil)
%time Mlil = fill_matrix(Mlil).tocsr() %time Mlil = fill_matrix(Mlil).tocsr()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy - Linear Algebra ## SciPy - Linear Algebra
* handles dense and sparse matrices * handles dense and sparse matrices
* delegates computation to lower level libraries * delegates computation to lower level libraries
* e.g. Anaconda Python comes with Intel MKL * e.g. Anaconda Python comes with Intel MKL
* SIMD instructions ("vectorization") * SIMD instructions ("vectorization")
* parallel processing ("threading") * parallel processing ("threading")
<br> <br>
* NumPy vs SciPy `linalg` module: * NumPy vs SciPy `linalg` module:
* NumPy: uses own implementation (slower) if Fortran Compiler unavailable * NumPy: uses own implementation (slower) if Fortran Compiler unavailable
* SciPy: requires Fortran Compiler and has more functions * SciPy: requires Fortran Compiler and has more functions
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy Example 4: Sparse vs Dense Linear Systems ## SciPy Example 4: Sparse vs Dense Linear Systems
* dense: `scipy.linalg`, sparse: `scipy.sparse.linalg` * dense: `scipy.linalg`, sparse: `scipy.sparse.linalg`
* sparse matrix operations can be much faster, depending on the density of matrix * sparse matrix operations can be much faster, depending on the density of matrix
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from scipy import sparse from scipy import sparse
from scipy import linalg from scipy import linalg
import scipy.sparse.linalg as splinalg import scipy.sparse.linalg as splinalg
import numpy as np import numpy as np
def compare_density(density, n=1000): def compare_density(density, n=1000):
b = np.random.random(n) b = np.random.random(n)
A1 = sparse.random(n, n, density=density, format='csr') A1 = sparse.random(n, n, density=density, format='csr')
A1 += sparse.eye(n,n, format='csr') A1 += sparse.eye(n,n, format='csr')
print("Density: ", density) print("Density: ", density)
print("Sparse: ") print("Sparse: ")
%timeit splinalg.spsolve(A1, b) %timeit splinalg.spsolve(A1, b)
A2 = A1.todense() A2 = A1.todense()
print("Dense: ") print("Dense: ")
%timeit linalg.solve(A2, b) %timeit linalg.solve(A2, b)
# Density=0.001: sparse is faster, Density=0.01: dense is faster # Density=0.001: sparse is faster, Density=0.01: dense is faster
compare_density(0.01) compare_density(0.01)
compare_density(0.001) compare_density(0.001)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### SciPy - Further Linear Algebra Functions ### SciPy - Further Linear Algebra Functions
* Matrix-matrix and matrix-vector products * Matrix-matrix and matrix-vector products
* careful with preservation of sparsity! * careful with preservation of sparsity!
* Solve linear systems * Solve linear systems
* Compute eigenvalues and singular values * Compute eigenvalues and singular values
* Compute pseudoinverse * Compute pseudoinverse
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
N = 100 N = 100
A = np.random.rand(N, N) A = np.random.rand(N, N)
x = np.random.rand(N) x = np.random.rand(N)
u, s, vh = linalg.svd(A) # singular value decomposition u, s, vh = linalg.svd(A) # singular value decomposition
values, vectors = linalg.eig(A) # eigenvalue problem values, vectors = linalg.eig(A) # eigenvalue problem
det = linalg.det(A) # determinant det = linalg.det(A) # determinant
P = linalg.pinv(A) # pseudoinverse (accepts rectangular matrices) P = linalg.pinv(A) # pseudoinverse (accepts rectangular matrices)
# Further sparse matrices creation # Further sparse matrices creation
A = sparse.random(N, N, density=0.01, format='csr') A = sparse.random(N, N, density=0.01, format='csr')
B = sparse.random(N, N, density=0.01, format='csr') B = sparse.random(N, N, density=0.01, format='csr')
C, D = sparse.diags(np.arange(4), 1), sparse.kron(B, A) C, D = sparse.diags(np.arange(4), 1), sparse.kron(B, A)
ev, evec = splinalg.eigs(A, 2) # 2 largest eigenvalues ev, evec = splinalg.eigs(A, 2) # 2 largest eigenvalues
u, s, vh = sparse.linalg.svds(A, 2) # 2 largest singular values u, s, vh = sparse.linalg.svds(A, 2) # 2 largest singular values
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## SciPy Example 5: Interpolation ## SciPy Example 5: Interpolation
* Interpolation of 1D and multi-D data (structured grid and unstructured points) * Interpolation of 1D and multi-D data (structured grid and unstructured points)
* Splines and other polynomials * Splines and other polynomials
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import scipy.interpolate as interpolate import scipy.interpolate as interpolate
x = np.linspace(0, 10.0, 12) x = np.linspace(0, 10.0, 12)
y = np.sin(x) y = np.sin(x)
y_int1 = interpolate.interp1d(x, y) y_int1 = interpolate.interp1d(x, y)
y_int2 = interpolate.PchipInterpolator(x, y) y_int2 = interpolate.PchipInterpolator(x, y)
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
plt.figure(figsize=(20,6)) plt.figure(figsize=(20,6))
plt.subplot(1,2,1) plt.subplot(1,2,1)
plt.plot(x, y) plt.plot(x, y)
x2 = np.linspace(0, 10.0, 50) x2 = np.linspace(0, 10.0, 50)
plt.plot(x2, y_int1(x2)) plt.plot(x2, y_int1(x2))
plt.subplot(1,2,2) plt.subplot(1,2,2)
plt.plot(x, y) plt.plot(x, y)
im = plt.plot(x2, y_int2(x2)) im = plt.plot(x2, y_int2(x2))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## NumPy/SciPy Summary ## NumPy/SciPy Summary
* NumPy: efficient handling of arrays * NumPy: efficient handling of arrays
* SciPy: more advanced mathematical and numerical routines; uses NumPy under the hood *plus* other C/F libraries * SciPy: more advanced mathematical and numerical routines; uses NumPy under the hood *plus* other C/F libraries
* Performance tips: * Performance tips:
* Work on full arrays (slicing, NumPy routines...) * Work on full arrays (slicing, NumPy routines...)
* identify hotspots and optimize them * identify hotspots and optimize them
* profile * profile
* code in Cython or C/Fortran + Python interface -- or try Numba * code in Cython or C/Fortran + Python interface -- or try Numba
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Futher reading ## Futher reading
* Harris, C.R., Millman, K.J., van der Walt, S.J. et al. *Array programming with NumPy.* **Nature** 585, 357–362 (2020). (https://doi.org/10.1038/s41586-020-2649-2) * Harris, C.R., Millman, K.J., van der Walt, S.J. et al. *Array programming with NumPy.* **Nature** 585, 357–362 (2020). (https://doi.org/10.1038/s41586-020-2649-2)
* Bressert, E. (2012). SciPy and NumPy (1st edition.). O'Reilly Media, Inc. (https://ebooks.mpdl.mpg.de/ebooks/Record/EB001944176) * Bressert, E. (2012). SciPy and NumPy (1st edition.). O'Reilly Media, Inc. (https://ebooks.mpdl.mpg.de/ebooks/Record/EB001944176)
* There are numerous books on the topic available: https://ebooks.mpdl.mpg.de/ebooks/Search/Results?type=AllFields&lookfor=numpy * There are numerous books on the topic available: https://ebooks.mpdl.mpg.de/ebooks/Search/Results?type=AllFields&lookfor=numpy
(MPG.eBooks work from any Max Planck IP address.) (MPG.eBooks work from any Max Planck IP address.)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Numpy Input and Output # Numpy Input and Output
## Reading and writing NumPy arrays ## Reading and writing NumPy arrays
Possibilities: Possibilities:
* NumPy's native IO facilities * NumPy's native IO facilities
* HDF5 using H5PY * HDF5 using H5PY
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Native NumPy Input/Output ## Native NumPy Input/Output
### Saving NumPy arrays to files ### Saving NumPy arrays to files
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import os import os
import numpy as np import numpy as np
x = np.linspace(0.0,100.0,101) x = np.linspace(0.0,100.0,101)
# save to text file (for debugging and small files only) # save to text file (for debugging and small files only)
np.savetxt("x.txt", x) np.savetxt("x.txt", x)
print("savetxt: ", os.path.getsize('x.txt'), "bytes") print("savetxt: ", os.path.getsize('x.txt'), "bytes")
# save to binary file, suffix ".npy" # save to binary file, suffix ".npy"
np.save("x", x) np.save("x", x)
print("save: ", os.path.getsize('x.npy'), "bytes") print("save: ", os.path.getsize('x.npy'), "bytes")
y = np.linspace(0.0,100.0,201) y = np.linspace(0.0,100.0,201)
# save several arrays to binary file, suffix ".npz", preserve variable names via kwargs # save several arrays to binary file, suffix ".npz", preserve variable names via kwargs
np.savez("xy", x=x, y=y) np.savez("xy", x=x, y=y)
print("savez: ", os.path.getsize('xy.npz'), "bytes") print("savez: ", os.path.getsize('xy.npz'), "bytes")
np.savez_compressed("xy_compressed", x=x, y=y) np.savez_compressed("xy_compressed", x=x, y=y)
print("savez_compressed: ", os.path.getsize('xy_compressed.npz'), "bytes") print("savez_compressed: ", os.path.getsize('xy_compressed.npz'), "bytes")
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Reading NumPy arrays from files ### Reading NumPy arrays from files
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
#x = np.loadtxt("x.txt") #x = np.loadtxt("x.txt")
x = np.load("x.npy") x = np.load("x.npy")
print("type of x:", type(x)) print("type of x:", type(x))
xy = np.load("xy.npz") xy = np.load("xy.npz")
print("type of xy:", type(xy)) print("type of xy:", type(xy))
# individual NumPy arrays in xy are accessible via a dict-style interface # individual NumPy arrays in xy are accessible via a dict-style interface
print("Keys of xy: ", tuple(xy.keys())) print("Keys of xy: ", tuple(xy.keys()))
print("x and xy['x'] are the same: ", np.allclose(x, xy["x"])) print("x and xy['x'] are the same: ", np.allclose(x, xy["x"]))
print("y and xy['y'] are the same: ", np.allclose(y, xy["y"])) print("y and xy['y'] are the same: ", np.allclose(y, xy["y"]))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Native NumPy Input/Output ## Native NumPy Input/Output
* Useful for * Useful for
* Quickly saving results (e.g. from interactive sessions) * Quickly saving results (e.g. from interactive sessions)
* Caching results from expensive calculations locally * Caching results from expensive calculations locally
* Data that is read only in the same way * Data that is read only in the same way
* Note * Note
* Binary portability (endianness) across systems is supported * Binary portability (endianness) across systems is supported
* Problem * Problem
* Not easily accessible from other programs * Not easily accessible from other programs
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## HDF5 using H5PY ## HDF5 using H5PY
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### HDF5 in a nutshell ### HDF5 in a nutshell
* HDF5 is a hierarchichal data format specification and library implementation * HDF5 is a hierarchichal data format specification and library implementation
* HDF5 contains data plus metadata, i.e. it is a self-describing format * HDF5 contains data plus metadata, i.e. it is a self-describing format
* standard and custom datatypes * standard and custom datatypes
* filters (e.g. checksums, compression, scale offset) * filters (e.g. checksums, compression, scale offset)
* parallel IO supported, platform independent * parallel IO supported, platform independent
* claimed to be future proof * claimed to be future proof
* recommended and widely used in HPC by simulation codes, analysis and visualization tools (e.g. VisIt, ParaView, IDL, Matlab) * recommended and widely used in HPC by simulation codes, analysis and visualization tools (e.g. VisIt, ParaView, IDL, Matlab)
(cf. https://support.hdfgroup.org/HDF5/doc/H5.intro.html) (cf. https://support.hdfgroup.org/HDF5/doc/H5.intro.html)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### HDF5 schematic ### HDF5 schematic
* tree structure with groups, datasets, and attributes, some analogy to a file system with directories and files * tree structure with groups, datasets, and attributes, some analogy to a file system with directories and files
![HDF5 schematic](fig/hdf5_structure4.jpg) ![HDF5 schematic](fig/hdf5_structure4.jpg)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## H5PY: HDF5 bindings for Python ## H5PY: HDF5 bindings for Python
* easy to use Pythonic high-level interface * easy to use Pythonic high-level interface
* dictionary syntax (dataset access by name) * dictionary syntax (dataset access by name)
* NumPy array syntax (indexing, slicing, etc.) * NumPy array syntax (indexing, slicing, etc.)
* metadata access: use `attrs` attribute * metadata access: use `attrs` attribute
* designed from the beginning to integrate well with NumPy * designed from the beginning to integrate well with NumPy
* low-level Cython interface to the HDF5 C API is available * low-level Cython interface to the HDF5 C API is available
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Write a NumPy array to HDF5 ### Write a NumPy array to HDF5
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
import h5py import h5py
M = np.random.rand(128,128) M = np.random.rand(128,128)
with h5py.File("data.hdf5", 'w') as fp: with h5py.File("data.hdf5", 'w') as fp:
# a HDF5 file always has a root group, "/" which we use implicitly here # a HDF5 file always has a root group, "/" which we use implicitly here
dset = fp.create_dataset("random_matrix", data=M) dset = fp.create_dataset("random_matrix", data=M)
# add metadata (attributes) describing the dataset via the 'attrs' proxy object # add metadata (attributes) describing the dataset via the 'attrs' proxy object
dset.attrs["whatis"] = "This is a randomly generated 128x128 matrix." dset.attrs["whatis"] = "This is a randomly generated 128x128 matrix."
# or even easier # or even easier
fp["random_matrix2"] = M fp["random_matrix2"] = M
fp["random_matrix2"].attrs["whatis"] = "This is another randomly generated 128x128 matrix." fp["random_matrix2"].attrs["whatis"] = "This is another randomly generated 128x128 matrix."
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
$\to$ use `with` to ensure file is closed and written! $\to$ use `with` to ensure file is closed and written!
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Access a HDF5 dataset ### Access a HDF5 dataset
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
import h5py import h5py
with h5py.File("data.hdf5", 'r') as fp: with h5py.File("data.hdf5", 'r') as fp:
print("Available keys:", tuple(fp.keys())) print("Available keys:", tuple(fp.keys()))
# retrieve the dataset # retrieve the dataset
dset = fp["random_matrix"] dset = fp["random_matrix"]
print(type(dset), dset.shape, dset.dtype) print(type(dset), dset.shape, dset.dtype)
# retrieve the metadata that comes with the dataset # retrieve the metadata that comes with the dataset
for key,val in dset.attrs.items(): for key,val in dset.attrs.items():
print("{} : {}".format(key, val)) print("{} : {}".format(key, val))
# dset can be used similarly to a NumPy array # dset can be used similarly to a NumPy array
elem = dset[0,0] # indexing, slicing elem = dset[0,0] # indexing, slicing
dsum = np.sum(dset) # not dset.sum()! dsum = np.sum(dset) # not dset.sum()!
# explicit conversion to NumPy array # explicit conversion to NumPy array
M = dset[:] M = dset[:]
print(type(M), M.shape, M.dtype) print(type(M), M.shape, M.dtype)
msum = M.sum() msum = M.sum()
print("Sums are the same:", dsum == msum) print("Sums are the same:", dsum == msum)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Parallel Access (Preview) ### Parallel Access (Preview)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from mpi4py import MPI from mpi4py import MPI
import h5py import h5py
rank = MPI.COMM_WORLD.rank # The process ID (integer 0-3 for 4-process run) rank = MPI.COMM_WORLD.rank # The process ID (integer 0-3 for 4-process run)
f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD) f = h5py.File('parallel_test.hdf5', 'w', driver='mpio', comm=MPI.COMM_WORLD)
dset = f.create_dataset('test', (4,), dtype='i') dset = f.create_dataset('test', (4,), dtype='i')
dset[rank] = rank dset[rank] = rank
f.close() f.close()
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Advanced usage of HDF5-IO on HPC systems ## Advanced usage of HDF5-IO on HPC systems
* checkpointing/resuming of long-running jobs * checkpointing/resuming of long-running jobs
* write current state of simulation to HDF5 file * write current state of simulation to HDF5 file
* resume simulation by reading back-in that state * resume simulation by reading back-in that state
* upcoming optional exercise: diffusion example * upcoming optional exercise: diffusion example
* large-scale parallel IO from multiple MPI processes * large-scale parallel IO from multiple MPI processes
* write few, large files from each process * write few, large files from each process
limit the number of files per directory (max. ~1000) limit the number of files per directory (max. ~1000)
* write to a single file using MPI-enabled `h5py` (MPCDF environment module `h5py-mpi`) * write to a single file using MPI-enabled `h5py` (MPCDF environment module `h5py-mpi`)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Summary on HDF5 IO of NumPy data ## Summary on HDF5 IO of NumPy data
* easy to use with `h5py` (alternative module would be `pytables`) * easy to use with `h5py` (alternative module would be `pytables`)
* hierarchical structure enables you to store complex data in a single file * hierarchical structure enables you to store complex data in a single file
* HDF5 files are future-proof and portable data files * HDF5 files are future-proof and portable data files
* data can be shared & used by other programs * data can be shared & used by other programs
$\to$ our recommendation for NumPy-related I/O! $\to$ our recommendation for NumPy-related I/O!
Further reading: Further reading:
* Collette, A. (2013). Python and HDF5 (1st edition.). O'Reilly Media, Inc. * Collette, A. (2013). Python and HDF5 (1st edition.). O'Reilly Media, Inc.
(https://ebooks.mpdl.mpg.de/ebooks/Record/EB001941719) (https://ebooks.mpdl.mpg.de/ebooks/Record/EB001941719)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Application: image processing # Application: image processing
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Example 1: Smooth Images ## Example 1: Smooth Images
![image filter](fig/image-filter-example.svg) ![image filter](fig/image-filter-example.svg)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Smoothing with image filters ## Smoothing with image filters
* Goal: subtract noise * Goal: subtract noise
* Use filtering approach * Use filtering approach
* For each pixel, at $i, j$, compute: * For each pixel, at $i, j$, compute:
$$\tilde{f}_{i,j} = \alpha f_{i,j} + \frac{(1 - \alpha)}{4} (f_{i-1,j} + f_{i+1,j} + f_{i,j-1} + f_{i,j+1})$$ $$\tilde{f}_{i,j} = \alpha f_{i,j} + \frac{(1 - \alpha)}{4} (f_{i-1,j} + f_{i+1,j} + f_{i,j-1} + f_{i,j+1})$$
* Mix pixel with average of surrounding pixels $\to$ *filter out high-frequency noise* * Mix pixel with average of surrounding pixels $\to$ *filter out high-frequency noise*
* Can also be rewritten using the discrete Laplace operator: * Can also be rewritten using the discrete Laplace operator:
$$\tilde{f}_{i,j} = f_{i,j} + \frac{(1 - \alpha)}{4} \Delta f_{i,j}$$ $$\tilde{f}_{i,j} = f_{i,j} + \frac{(1 - \alpha)}{4} \Delta f_{i,j}$$
* Strongest smoothing for $\alpha = 0$, no smoothing for $\alpha = 1$ * Strongest smoothing for $\alpha = 0$, no smoothing for $\alpha = 1$
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Image: alpha Centauri ### Image: alpha Centauri
* Image from DSS (for copyright see http://archive.stsci.edu/dss/acknowledging.html) * Image from DSS (for copyright see http://archive.stsci.edu/dss/acknowledging.html)
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import numpy as np import numpy as np
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
# convert to float between 0 and 1 # convert to float between 0 and 1
image = plt.imread("fig/dss_alpha_centauri.gif").astype(np.float64)/255. image = plt.imread("fig/dss_alpha_centauri.gif").astype(np.float64)/255.
plt.imshow(image, cmap='gray') plt.imshow(image, cmap='gray')
print("Shape: ", image.shape) print("Shape: ", image.shape)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Loops in python vs. numpy arrays ## Loops in python vs. numpy arrays
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def get_filtered_loop(image, alpha=0.25): def get_filtered_loop(image, alpha=0.25):
filtered = np.zeros_like(image) filtered = np.zeros_like(image)
N, M = image.shape N, M = image.shape
for i in range(1, N-1): for i in range(1, N-1):
for j in range(1, M-1): for j in range(1, M-1):
filtered[i, j] = 0.25*(1.0-alpha)*(image[i-1, j] + image[i+1, j] filtered[i, j] = 0.25*(1.0-alpha)*(image[i-1, j] + image[i+1, j]
+ image[i, j-1] + image[i, j+1]) \ + image[i, j-1] + image[i, j+1]) \
+ alpha*image[i, j] + alpha*image[i, j]
return filtered return filtered
%timeit get_filtered_loop(image) %timeit get_filtered_loop(image)
``` ```
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def get_filtered_array(image, alpha=0.25): def get_filtered_array(image, alpha=0.25):
filtered = np.zeros_like(image) filtered = np.zeros_like(image)
N, M = image.shape N, M = image.shape
filtered[1:N-1, 1:M-1] = 0.25*(1.0-alpha)*(image[0:N-2, 1:M-1] + image[2:N, 1:M-1] filtered[1:N-1, 1:M-1] = 0.25*(1.0-alpha)*(image[0:N-2, 1:M-1] + image[2:N, 1:M-1]
+ image[1:N-1, 0:M-2] + image[1:N-1, 2:M]) \ + image[1:N-1, 0:M-2] + image[1:N-1, 2:M]) \
+ alpha*image[1:N-1, 1:M-1] + alpha*image[1:N-1, 1:M-1]
return filtered return filtered
%timeit get_filtered_array(image) %timeit get_filtered_array(image)
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
$\to$ roughly factor 100-150 faster! $\to$ roughly factor 100-150 faster!
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Transformation step by step ## Transformation step by step
* Transform for loops to indices on left hand side (same range) * Transform for loops to indices on left hand side (same range)
* From: * From:
```python ```python
for i in range(1, N-1): for i in range(1, N-1):
for j in range(1, M-1): for j in range(1, M-1):
``` ```
To: To:
```python ```python
filtered[1:N-1, 1:M-1] filtered[1:N-1, 1:M-1]
``` ```
* Shift indices on right hand side: `i-1` $\to$ `0:N-2`, `i` $\to$ `1:N-1`, `i+1` $\to$ `2:N` * Shift indices on right hand side: `i-1` $\to$ `0:N-2`, `i` $\to$ `1:N-1`, `i+1` $\to$ `2:N`
* E.g.: `image[i-1, j]` $\to$ `image[0:N-2, 1:M-1]` * E.g.: `image[i-1, j]` $\to$ `image[0:N-2, 1:M-1]`
* Similar for `j` * Similar for `j`
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
filtered_loop = get_filtered_loop(image) filtered_loop = get_filtered_loop(image)
filtered_array = get_filtered_array(image) filtered_array = get_filtered_array(image)
f, axs = plt.subplots(2, 3, figsize=(14, 8)) f, axs = plt.subplots(2, 3, figsize=(14, 8))
axs[0, 0].imshow(image, cmap='gray') axs[0, 0].imshow(image, cmap='gray')
axs[0, 1].imshow(filtered_array, cmap='gray') axs[0, 1].imshow(filtered_array, cmap='gray')
axs[0, 2].imshow(filtered_array - filtered_loop, cmap='gray') axs[0, 2].imshow(filtered_array - filtered_loop, cmap='gray')
axs[1, 0].imshow(image[600:700, 600:700], cmap='gray') axs[1, 0].imshow(image[600:700, 600:700], cmap='gray')
axs[1, 1].imshow(filtered_array[600:700, 600:700], cmap='gray') axs[1, 1].imshow(filtered_array[600:700, 600:700], cmap='gray')
axs[1, 2].imshow((image-filtered_array)[600:700, 600:700], cmap='gray') axs[1, 2].imshow((image-filtered_array)[600:700, 600:700], cmap='gray')
print("Loop and array method yields the same: ", np.allclose(filtered_loop, filtered_array)) print("Loop and array method yields the same: ", np.allclose(filtered_loop, filtered_array))
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Example 2: Get edges ## Example 2: Get edges
* Use the following formula: * Use the following formula:
$$\tilde{f}_{i,j} = 8 f_{i,j} - (f_{i-1,j} + f_{i+1,j} + f_{i,j-1} + f_{i,j+1} + f_{i-1,j-1} + f_{i-1,j+1} + f_{i+1,j+1} + f_{i+1,j-1})$$ $$\tilde{f}_{i,j} = 8 f_{i,j} - (f_{i-1,j} + f_{i+1,j} + f_{i,j-1} + f_{i,j+1} + f_{i-1,j-1} + f_{i-1,j+1} + f_{i+1,j+1} + f_{i+1,j-1})$$
* Subtract mean of surrounding cells * Subtract mean of surrounding cells
* Also use diagonals * Also use diagonals
* Truncate results to a certain percentile * Truncate results to a certain percentile
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from scipy.misc import ascent from scipy.misc import ascent
im = plt.imshow(ascent(), cmap='gray') im = plt.imshow(ascent(), cmap='gray')
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Our implementation ### Our implementation
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def get_edges(image, percentiles=(25, 99)): def get_edges(image, percentiles=(25, 99)):
result = np.zeros_like(image) result = np.zeros_like(image)
N, M = image.shape N, M = image.shape
result[1:N-1, 1:M-1] = -1.0*(image[0:N-2, 1:M-1] + image[2:N, 1:M-1] result[1:N-1, 1:M-1] = -1.0*(image[0:N-2, 1:M-1] + image[2:N, 1:M-1]
+ image[1:N-1, 0:M-2] + image[1:N-1, 2:M] + image[1:N-1, 0:M-2] + image[1:N-1, 2:M]
+ image[0:N-2, 0:M-2] + image[0:N-2, 2:M] + image[0:N-2, 0:M-2] + image[0:N-2, 2:M]
+ image[2:N, 0:M-2] + image[2:N, 2:M]) + 8.*image[1:N-1, 1:M-1] + image[2:N, 0:M-2] + image[2:N, 2:M]) + 8.*image[1:N-1, 1:M-1]
# now truncate results # now truncate results
fmin, fmax = np.percentile(result, percentiles) fmin, fmax = np.percentile(result, percentiles)
result[result < fmin] = fmin result[result < fmin] = fmin
result[result > fmax] = fmax result[result > fmax] = fmax
return result return result
f, axs = plt.subplots(1, 2, figsize=(15, 5)) f, axs = plt.subplots(1, 2, figsize=(15, 5))
axs[0].imshow(ascent(), cmap='gray') axs[0].imshow(ascent(), cmap='gray')
axs[1].imshow(get_edges(ascent()), cmap='gray');None axs[1].imshow(get_edges(ascent()), cmap='gray');None
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Predefined filter ### Predefined filter
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from PIL import Image, ImageFilter from PIL import Image, ImageFilter
img = Image.fromarray(ascent().astype("int32"), mode="I").convert("L") img = Image.fromarray(ascent().astype("int32"), mode="I").convert("L")
filtered_img = img.filter(ImageFilter.FIND_EDGES) filtered_img = img.filter(ImageFilter.FIND_EDGES)
f, axs = plt.subplots(1, 2, figsize=(15, 5)) f, axs = plt.subplots(1, 2, figsize=(15, 5))
axs[0].imshow(ascent(), cmap='gray') axs[0].imshow(ascent(), cmap='gray')
axs[1].imshow(np.asarray(filtered_img), cmap='gray');None axs[1].imshow(np.asarray(filtered_img), cmap='gray');None
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
$\to$ results very similar! $\to$ results very similar!
......
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Frameworks for parallel computing # Frameworks for parallel computing
**Python for HPC course** **Python for HPC course**
Max Planck Computing and Data Facility, Garching Max Planck Computing and Data Facility, Garching
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Outline ## Outline
* Motivation * Motivation
* Overview on Parallel Frameworks * Overview on Parallel Frameworks
* Example: Dask * Example: Dask
* Concepts * Concepts
* Dask-MPI on a Slurm cluster * Dask-MPI on a Slurm cluster
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Why Python frameworks for parallel computing? ## Why Python frameworks for parallel computing?
* Scale from a single local computer to large parallel resources, (ideally) with a minimum of code modifications * Scale from a single local computer to large parallel resources, (ideally) with a minimum of code modifications
* Avoid the complexity of handling interprocess/internode communication explicitly ($\to$ MPI), better let the framework handle this! * Avoid the complexity of handling interprocess/internode communication explicitly ($\to$ MPI), better let the framework handle this!
* Use cases: Data parallel problems that can be decomposed into tasks, e.g. processing, reduction, analysis of large amounts of data, training of certain neural networks, etc. * Use cases: Data parallel problems that can be decomposed into tasks, e.g. processing, reduction, analysis of large amounts of data, training of certain neural networks, etc.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Comparison of parallel frameworks (selection) ## Overview on parallel frameworks (selection)
* [Apache Spark](https://spark.apache.org): designed for distributed big data analytics, features include SQL, distributed caching, multi-language bindings including Python * [Apache Spark](https://spark.apache.org): designed for distributed big data analytics, features include SQL, distributed caching, multi-language bindings including Python
* [Dask](https://www.dask.org): parallel distributed computing, based on a reimplementation of the NumPy API (similarly for Pandas and scikit-learn) in combination with a powerful task scheduler * [Dask](https://www.dask.org): parallel distributed computing, provides an implementation of the NumPy API (similarly for Pandas and scikit-learn) in combination with a powerful task scheduler, feels quite *pythonic*
* [Ray](https://www.ray.io): core library for distributed computing, plus growing ecosystem with specific libraries (often from AI) * [Ray](https://www.ray.io): core library for distributed computing, plus growing ecosystem with specific libraries (very often for AI applications)
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Cloud vs HPC environments ## Background: Cloud vs HPC environments
### How to run and scale Python codes using such parallel frameworks?
* Cloud * Cloud
* (software) design often centered around web services * (software) design often centered around web services
* scaling works typically via container orchestration systems (e.g. Kubernetes) * scaling works e.g. via container orchestration systems (e.g. Kubernetes)
* Python frameworks for parallel computing are often designed with Cloud environments in mind (as these are available to a broader audience in contrast to HPC systems) * Python frameworks for parallel computing are often designed with Cloud environments in mind (as these are available to a broader audience in contrast to 'real' HPC systems)
* HPC * HPC
* workloads managed via batch jobs * not per-se designed to run web services (e.g. interactive dashboards, dedicated scheduler services)
* non-interactive use preferred * workloads are submitted via batch jobs
* Practical challenge: How to get Python parallel frameworks to operate in concert with a batch scheduler? * non-interactive use highly preferred
**Practical challenge: Need get Python parallel frameworks to operate in concert with a HPC batch scheduler and infrastructure**
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Example: Dask ## Example: Dask
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Dask array ### Dask array
%% Cell type:code id: tags:
``` python
# create a chunked Dask array from a NumPy array
import numpy as np
import dask.array as da
data = np.arange(100_000).reshape(200, 500)
a = da.from_array(data, chunks=(100, 100))
a
```
%% Output
dask.array<array, shape=(200, 500), dtype=int64, chunksize=(100, 100), chunktype=numpy.ndarray>
%% Cell type:code id: tags:
``` python
# inspect information about the chunking
a.chunksize
```
%% Output
(100, 100)
%% Cell type:code id: tags:
``` python
# inspect information about a particular chunk of data
a.blocks[1, 3]
```
%% Output
dask.array<blocks, shape=(100, 100), dtype=int64, chunksize=(100, 100), chunktype=numpy.ndarray>
%% Cell type:code id: tags:
``` python
# indexing/slicing works just as with NumPy
a[:50, 200]
```
%% Output
dask.array<getitem, shape=(50,), dtype=int64, chunksize=(50,), chunktype=numpy.ndarray>
%% Cell type:code id: tags:
``` python
# Lazy evaluation (only a task graph is created), we need to call compute()
# explicitly to get the result from any Dask object!
a[:50, 200].compute()
```
%% Output
array([ 200, 700, 1200, 1700, 2200, 2700, 3200, 3700, 4200,
4700, 5200, 5700, 6200, 6700, 7200, 7700, 8200, 8700,
9200, 9700, 10200, 10700, 11200, 11700, 12200, 12700, 13200,
13700, 14200, 14700, 15200, 15700, 16200, 16700, 17200, 17700,
18200, 18700, 19200, 19700, 20200, 20700, 21200, 21700, 22200,
22700, 23200, 23700, 24200, 24700])
%% Cell type:code id: tags:
``` python
# Create a Dask object to compute the mean value over the distributed array
a_mean = a.mean()
a_mean
```
%% Output
dask.array<mean_agg-aggregate, shape=(), dtype=float64, chunksize=(), chunktype=numpy.ndarray>
%% Cell type:code id: tags:
``` python
# visualize the task graph necessary to perform the computation
a_mean.visualize()
```
%% Output
<IPython.core.display.Image object>
%% Cell type:code id: tags:
``` python
# perform the computation on the Dask array
a_mean.compute()
```
%% Output
49999.5
%% Cell type:code id: tags:
``` python
# perform the computation on the original NumPy array
data.mean()
```
%% Output
49999.5
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Dask futures ### Dask futures
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Dask-MPI ### Dask-MPI
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
### Case Study ### Case Study
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
``` ```
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Please register or to comment