diff --git a/environment.yml b/environment.yml index 8b4f5ef0ded5d01ea14372201b0b6868923c4a13..39d6a2796a15836757d42bd0c75095b956beca27 100644 --- a/environment.yml +++ b/environment.yml @@ -17,6 +17,7 @@ dependencies: - gxx_linux-64 - gfortran_linux-64 - numba + - jax - mpich - mpi4py - hdf5=*=mpi* diff --git a/examples/pybind11/setup.py b/examples/pybind11/setup.py index 402e322b22a05d463ba70b576654cdb7868711e7..fea2d2252e3a911af632d63874679a37ce9fbaa4 100644 --- a/examples/pybind11/setup.py +++ b/examples/pybind11/setup.py @@ -1,5 +1,4 @@ from setuptools import setup -# run `pip install --user pybind11` if not yet installed from pybind11.setup_helpers import Pybind11Extension, build_ext # extra compile/link args may be injected, e.g. here for optimization and openmp with gcc diff --git a/notebooks/1b--Introduction.ipynb b/notebooks/1b--Introduction.ipynb index eec05cd389617bb1550e5bb11ac8f29b2eadc2d8..daa5f7b01699a027c7bcd2790e9c3d73c33c2d6d 100644 --- a/notebooks/1b--Introduction.ipynb +++ b/notebooks/1b--Introduction.ipynb @@ -29,7 +29,7 @@ "source": [ "## Why Python in HPC?\n", "* Efficiency with respect to development effort: \n", - " $\\rightarrow$ Python helps you to get \"things done\" quickly and focus on science.\n", + " $\\rightarrow$ Python helps you to get \"things done\" quickly and focus on your science.\n", "* Efficiency with respect to hardware utilization: \n", " $\\rightarrow$ Python-based codes may perform well if implemented properly.\n", "* Using Python may achieve a good overall efficiency for small- to medium-scale codes\n", @@ -69,7 +69,8 @@ }, "source": [ "## Use cases often relevant to our audience\n", - "* prototype implementations of new methods that should run (much) faster\n", + "\n", + "* prototype implementations of new methods that should run (much) faster towards production\n", "* data analysis scripts that should run faster and/or operate in parallel\n", "* IO-handling of large numerical data\n", "\n", @@ -102,9 +103,9 @@ "* 
Introduction\n", "* Refresher of Python and the Python Ecosystem\n", "* Scientific Computing with Python: NumPy and SciPy\n", - "* Performance: Cython, Numba, C/Fortran interfacing\n", - "* Parallelism: multithreading, multiprocessing, mpi4py\n", - "* Software Engineering" + "* Performance: Cython, JIT (Numba, JAX), C/Fortran interfacing\n", + "* Parallelism: multithreading, multiprocessing, Dask, mpi4py\n", + "* Software Engineering, Packaging" ] }, { @@ -155,8 +156,8 @@ "* matplotlib\n", "* Cython\n", "* gcc, gfortran\n", - "* Numba\n", - "* (MPI, mpi4py)" + "* Numba, JAX\n", + "* (Dask, Dask-MPI, MPI, mpi4py)" ] }, { @@ -171,7 +172,7 @@ "\n", "* Use the link communicated by email to access a Jupyter service on the MPCDF HPC cloud\n", "* The course material *and* software are provided via an interactive JupyterLab interface\n", - "* Each instance provides up to 6 virtual CPU cores and up to 12 GB RAM (less guaranteed)\n", + "* Each instance provides *up to* 4 virtual CPU cores and *up to* 6 GB RAM\n", "* Please keep the following points in mind\n", " * Use the JupyterLab menu **File $\\to$ Shut down** to free resources when finished\n", " * A session is terminated after 12h\n", @@ -187,7 +188,7 @@ }, "source": [ "### Option 2: MPCDF Python infrastructure on the HPC systems\n", - "* Python (**3.8**) is provided via the Anaconda Python Distribution\n", + "* Python is provided via the Anaconda Python Distribution\n", "* software is accessible via environment modules, e.g. 
\n", " `module purge` \n", " `module load gcc/10 impi/2019.9` \n", @@ -212,7 +213,7 @@ "metadata": { "celltoolbar": "Slideshow", "kernelspec": { - "display_name": "Python 3", + "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, @@ -226,7 +227,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", - "version": "3.8.3" + "version": "3.9.7" } }, "nbformat": 4, diff --git a/notebooks/1c--Python_Refresher.ipynb b/notebooks/1c--Python_Refresher.ipynb index db7be75d8686aa03e6bd67034147eb5388f33d2f..9bc804acbc9fbd13cf4eb4963757279a8163fbb1 100644 --- a/notebooks/1c--Python_Refresher.ipynb +++ b/notebooks/1c--Python_Refresher.ipynb @@ -28,7 +28,7 @@ "### Python: History and Status\n", "* First version released in 1991 by G. van Rossum\n", "* Implementations: **cPython**, PyPy, Pyston, ...\n", - "* Language versions: 2.7 (legacy, ✞2020), now 3.6 - 3.9 \n", + "* Language versions: 2.7 (legacy, ✞2020), now 3.7 - 3.11 \n", " (to migrate legacy code, the packages `2to3`, `six`, `future` are helpful)" ] }, @@ -52,27 +52,7 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "### PEP8, style guide for Python code\n", - "* PEP8 is a style guide to write clean, readable, and maintainable Python code\n", - " * indentation using 4 *spaces*\n", - " * 79 characters maximum line width\n", - " * UTF8 source file encoding\n", - " * comments, docstrings\n", - " * naming conventions\n", - " * ...\n", - "* https://www.python.org/dev/peps/pep-0008/\n", - "* convert existing code into a PEP8 compliant format using the `autopep8` tool" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "subslide" + "slide_type": "skip" } }, "source": [ @@ -98,7 +78,7 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "subslide" + "slide_type": "skip" } }, "source": [ @@ -126,7 +106,7 @@ "* numbers: integer (*arbitrary* precision), float 
(double precision), complex `2 + 3j`\n", "* boolean: `True`, `False`\n", "* strings: `\"foo\"`\n", - "* collections\n", + "* collections (selection)\n", " * lists: `[1, \"bar\"]`\n", " * tuples: `(2, \"foobar\")`\n", " * sets: `{\"apple\", \"banana\"}`\n", @@ -144,7 +124,10 @@ }, "source": [ "### Variables, mutable vs. immutable objects\n", - "* a variable is a named reference to an object\n", + "* a variable is a named reference to an object, e.g.\n", + "```python\n", + "a = 5\n", + "```\n", "* immutable objects may not change once created: integer, float, boolean, string, tuple\n", "* mutable objects may change in place: list, set, dictionary, most user-defined classes\n", "* if unsure, check via the object ID (memory location)" @@ -152,7 +135,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 28, "metadata": { "slideshow": { "slide_type": "fragment" @@ -163,8 +146,8 @@ "name": "stdout", "output_type": "stream", "text": [ - "140668851030512\n", - "140668851031664\n" + "140644013146224\n", + "140644013148208\n" ] } ], @@ -193,7 +176,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 29, "metadata": { "slideshow": { "slide_type": "subslide" @@ -246,7 +229,7 @@ }, { "cell_type": "code", - "execution_count": 3, + "execution_count": 30, "metadata": { "slideshow": { "slide_type": "subslide" @@ -257,7 +240,6 @@ "name": "stdout", "output_type": "stream", "text": [ - "[1, 'apple', 3, 'bananas']\n", "False\n" ] } @@ -266,7 +248,6 @@ "my_list = [1, 'apple', 2, 'bananas']\n", "\n", "my_list[2] = 3\n", - "print(my_list)\n", "my_list.append(4) # in-place modification!\n", "my_list += [5, 'berries'] # append a list to another list\n", "my_list.reverse() # in-place modification!\n", @@ -289,7 +270,7 @@ }, { "cell_type": "code", - "execution_count": 4, + "execution_count": 31, "metadata": { "slideshow": { "slide_type": "-" @@ -334,7 +315,7 @@ }, { "cell_type": "code", - "execution_count": 5, + "execution_count": 32, "metadata": { 
"slideshow": { "slide_type": "-" @@ -351,7 +332,7 @@ "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", - "\u001b[0;32m<ipython-input-5-63d4842ba5d4>\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;31m# tuple object cannot be modified once instanciated\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mtup1\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", + "\u001b[0;32m/tmp/ipykernel_2358014/3076520060.py\u001b[0m in \u001b[0;36m<module>\u001b[0;34m\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0;31m# tuple object cannot be modified once instanciated\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mtup1\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;36m0\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'tuple' object does not support item assignment" ] } @@ -382,7 +363,7 @@ }, { "cell_type": "code", - "execution_count": 6, + "execution_count": 33, "metadata": { "slideshow": { "slide_type": "fragment" @@ -393,7 +374,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "{'banana', 11, 'apple', 47}\n", + "4\n", "True\n", "False\n", "{'banana', 11}\n" @@ -402,10 +383,9 @@ ], "source": [ "set1 = {\"apple\", \"banana\", \"apple\", 47, 11, 47}\n", - "print(set1) # elements are unique\n", + "print(len(set1)) # 4 --> elements are unique\n", "\n", "set2 = {\"banana\", 11}\n", - "\n", "print(set2.issubset(set1))\n", 
"set2.add(3.1415)\n", "print(set2.issubset(set1))\n", @@ -430,7 +410,7 @@ }, { "cell_type": "code", - "execution_count": 7, + "execution_count": 34, "metadata": { "slideshow": { "slide_type": "fragment" @@ -507,7 +487,7 @@ }, { "cell_type": "code", - "execution_count": 8, + "execution_count": 35, "metadata": { "slideshow": { "slide_type": "-" @@ -542,7 +522,7 @@ }, { "cell_type": "code", - "execution_count": 1, + "execution_count": 36, "metadata": { "slideshow": { "slide_type": "-" @@ -587,10 +567,10 @@ }, { "cell_type": "code", - "execution_count": 10, + "execution_count": 37, "metadata": { "slideshow": { - "slide_type": "subslide" + "slide_type": "skip" } }, "outputs": [ @@ -641,7 +621,7 @@ }, { "cell_type": "code", - "execution_count": 2, + "execution_count": 38, "metadata": { "slideshow": { "slide_type": "subslide" @@ -652,7 +632,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "lst = ['one', 'two', 'four']\n" + "l = ['one', 'two', 'four']\n" ] } ], @@ -668,11 +648,11 @@ "def modify_list__range(a):\n", " a[:] = a + [\"four\"]\n", " \n", - "lst = [\"one\"]\n", - "modify_list__method(lst)\n", - "modify_list__localvar(lst)\n", - "modify_list__range(lst)\n", - "print(f\"lst = {lst}\")" + "l = [\"one\"]\n", + "modify_list__method(l)\n", + "modify_list__localvar(l)\n", + "modify_list__range(l)\n", + "print(f\"l = {l}\")" ] }, { @@ -689,16 +669,16 @@ " * methods\n", " * class and instance variables\n", " * inheritance\n", - "* unintended side effects may be caused by mutable class variables (see example below)\n", + "* unintended side effects may be caused by mutable class variables (see example)\n", "* https://docs.python.org/3.9/tutorial/classes.html" ] }, { "cell_type": "code", - "execution_count": 12, + "execution_count": 39, "metadata": { "slideshow": { - "slide_type": "subslide" + "slide_type": "skip" } }, "outputs": [ @@ -741,15 +721,17 @@ } }, "source": [ - "### decorators\n", + "### Decorators\n", + "\n", "* decorators wrap existing functions and 
modify their behaviour\n", "* decorators take a function as argument and return a wrapped function\n", - "* use cases: profilers, loggers, just-in-time compilers, parallelizers, ..." + "* use cases: profilers, loggers, just-in-time compilers, parallelizers, ...\n", + "* https://peps.python.org/pep-0318/" ] }, { "cell_type": "code", - "execution_count": 13, + "execution_count": 40, "metadata": { "scrolled": true, "slideshow": { @@ -790,7 +772,8 @@ } }, "source": [ - "### functional programming\n", + "### Functional programming\n", + "\n", "* functional approach: transform data via functions that only take input and return output\n", "* Python basic concept: iterators representing streams of elements, e.g. `range()`\n", "* Python builtins: `map()`, `filter()`, `enumerate()`, `sorted()`, `zip()`\n", @@ -799,7 +782,7 @@ }, { "cell_type": "code", - "execution_count": 14, + "execution_count": 41, "metadata": { "slideshow": { "slide_type": "subslide" @@ -840,7 +823,7 @@ }, { "cell_type": "code", - "execution_count": 15, + "execution_count": 42, "metadata": { "scrolled": false, "slideshow": { @@ -879,49 +862,40 @@ }, { "cell_type": "code", - "execution_count": 16, + "execution_count": 43, "metadata": { "slideshow": { "slide_type": "subslide" } }, - "outputs": [], + "outputs": [ + { + "data": { + "text/plain": [ + "9455" + ] + }, + "execution_count": 43, + "metadata": {}, + "output_type": "execute_result" + } + ], "source": [ - "# naive plain function approach: build a list in memory explicitly\n", - "def first_n(n):\n", - " num = 0\n", + "# naive approach: build a full list in memory\n", + "def first_n_sq(n):\n", + " num = 1\n", " nums = []\n", - " while num < n:\n", - " nums.append(num)\n", + " while num <= n:\n", + " nums.append(num * num)\n", " num += 1\n", " return nums\n", "\n", - "sum_of_first_n = sum(first_n(1000000))" + "sum(first_n_sq(30))" ] }, { "cell_type": "code", - "execution_count": 17, - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - 
"outputs": [], - "source": [ - "# better: generator function **yield**ing individual elements\n", - "def first_n(n):\n", - " num = 0\n", - " while num < n:\n", - " yield num\n", - " num += 1\n", - "\n", - "sum_of_first_n = sum(first_n(1000000))" - ] - }, - { - "cell_type": "code", - "execution_count": 18, + "execution_count": 44, "metadata": { "slideshow": { "slide_type": "subslide" @@ -929,27 +903,30 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "333332833333500000\n" - ] + "data": { + "text/plain": [ + "9455" + ] + }, + "execution_count": 44, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "# now, let's assume we're suddenly interested in the sum of square numbers\n", - "def square(iterable):\n", - " for i in iterable:\n", - " yield i*i\n", + "# better: generator function **yield**ing individual elements\n", + "def first_n_sq_gen(n):\n", + " num = 1\n", + " while num <= n:\n", + " yield num * num\n", + " num += 1\n", "\n", - "# we can nest generators to implement pipelines\n", - "sum_of_first_n_squared = sum(square(first_n(1000000)))\n", - "print(sum_of_first_n_squared)" + "sum(first_n_sq_gen(30))" ] }, { "cell_type": "code", - "execution_count": 19, + "execution_count": 45, "metadata": { "slideshow": { "slide_type": "subslide" @@ -957,18 +934,20 @@ }, "outputs": [ { - "name": "stdout", - "output_type": "stream", - "text": [ - "333332833333500000\n" - ] + "data": { + "text/plain": [ + "9455" + ] + }, + "execution_count": 45, + "metadata": {}, + "output_type": "execute_result" } ], "source": [ - "# Pythonic solution using a generator expression\n", - "gen_expr = (i*i for i in range(1000000))\n", - "sum_of_first_n_squared = sum(gen_expr)\n", - "print(sum_of_first_n_squared)" + "# better: generator expression\n", + "first_n_sq_gen_expr = (i*i for i in range(1,31))\n", + "sum(first_n_sq_gen_expr)" ] }, { @@ -984,7 +963,7 @@ }, { "cell_type": "code", - "execution_count": 20, + "execution_count": 46, 
"metadata": { "slideshow": { "slide_type": "fragment" @@ -1001,7 +980,7 @@ }, { "cell_type": "code", - "execution_count": 21, + "execution_count": 47, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1031,7 +1010,7 @@ }, { "cell_type": "code", - "execution_count": 22, + "execution_count": 48, "metadata": { "slideshow": { "slide_type": "subslide" @@ -1055,7 +1034,7 @@ }, { "cell_type": "code", - "execution_count": 23, + "execution_count": 49, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1086,7 +1065,7 @@ "cell_type": "markdown", "metadata": { "slideshow": { - "slide_type": "slide" + "slide_type": "subslide" } }, "source": [ @@ -1096,12 +1075,13 @@ "* use `import` to get access to a module from your code\n", " * don't use `from module_name import *` $\\to$ clutters namespace\n", "* use `dir(module_name)` to see what's inside of an imported module\n", - "* more on modules later when we talk about packaging ..." + "* more on modules later when we talk about packaging ...\n", + "* https://docs.python.org/3/tutorial/modules.html" ] }, { "cell_type": "code", - "execution_count": 24, + "execution_count": 50, "metadata": { "slideshow": { "slide_type": "subslide" @@ -1114,7 +1094,7 @@ "2.718281828459045" ] }, - "execution_count": 24, + "execution_count": 50, "metadata": {}, "output_type": "execute_result" } @@ -1126,7 +1106,7 @@ }, { "cell_type": "code", - "execution_count": 25, + "execution_count": 51, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1139,7 +1119,7 @@ "2.718281828459045" ] }, - "execution_count": 25, + "execution_count": 51, "metadata": {}, "output_type": "execute_result" } @@ -1151,7 +1131,7 @@ }, { "cell_type": "code", - "execution_count": 26, + "execution_count": 52, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1164,7 +1144,7 @@ "1.0" ] }, - "execution_count": 26, + "execution_count": 52, "metadata": {}, "output_type": "execute_result" } @@ -1176,7 +1156,7 @@ }, { "cell_type": "code", - "execution_count": 27, + 
"execution_count": 53, "metadata": { "scrolled": true, "slideshow": { @@ -1188,7 +1168,7 @@ "name": "stdout", "output_type": "stream", "text": [ - "['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc']\n" + "['__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'acos', 'acosh', 'asin', 'asinh', 'atan', 'atan2', 'atanh', 'ceil', 'comb', 'copysign', 'cos', 'cosh', 'degrees', 'dist', 'e', 'erf', 'erfc', 'exp', 'expm1', 'fabs', 'factorial', 'floor', 'fmod', 'frexp', 'fsum', 'gamma', 'gcd', 'hypot', 'inf', 'isclose', 'isfinite', 'isinf', 'isnan', 'isqrt', 'lcm', 'ldexp', 'lgamma', 'log', 'log10', 'log1p', 'log2', 'modf', 'nan', 'nextafter', 'perm', 'pi', 'pow', 'prod', 'radians', 'remainder', 'sin', 'sinh', 'sqrt', 'tan', 'tanh', 'tau', 'trunc', 'ulp']\n" ] } ], @@ -1211,7 +1191,7 @@ }, { "cell_type": "code", - "execution_count": 28, + "execution_count": 54, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1259,7 +1239,8 @@ }, "source": [ "### Parameter files\n", - "* scientific codes are often steered using parameter files\n", + "\n", + "* scientific codes are often controled using parameter files\n", " * parameter files can be edited by hand\n", " * code does not need to be changed, reads in parameter file\n", "* recommended\n", @@ -1271,7 +1252,7 @@ }, { "cell_type": "code", - "execution_count": 29, + "execution_count": 55, "metadata": { "slideshow": { "slide_type": "subslide" @@ -1298,12 +1279,14 @@ }, "source": [ "#### YAML\n", - "* available via the 
`pyyaml` package" + "\n", + "* available via the `pyyaml` package\n", + "* https://pyyaml.org/wiki/PyYAMLDocumentation" ] }, { "cell_type": "code", - "execution_count": 30, + "execution_count": 56, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1320,7 +1303,7 @@ }, { "cell_type": "code", - "execution_count": 31, + "execution_count": 57, "metadata": { "slideshow": { "slide_type": "fragment" @@ -1341,8 +1324,6 @@ } }, "source": [ - "#### YAML\n", - "\n", "```yaml\n", "# par.yaml\n", "general:\n", @@ -1357,6 +1338,26 @@ "```" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### PEP8, style guide for Python code\n", + "* PEP8 is a style guide to write clean, readable, and maintainable Python code\n", + " * indentation using 4 *spaces*\n", + " * 79 characters maximum line width\n", + " * UTF8 source file encoding\n", + " * comments, docstrings\n", + " * naming conventions\n", + " * ...\n", + "* https://www.python.org/dev/peps/pep-0008/\n", + "* convert existing code into a PEP8 compliant format using the `autopep8` tool" + ] + }, { "cell_type": "markdown", "metadata": { @@ -1366,12 +1367,14 @@ }, "source": [ "### References and Further Reading\n", + "\n", "* [A Whirlwind Tour of Python](https://jakevdp.github.io/WhirlwindTourOfPython/) by Jake VanderPlas, free O'Reilly report\n", "* [Dive Into Python](https://diveintopython3.problemsolving.io) by Mark Pilgrim, free online book\n", "* [Python for Beginners](https://www.python.org/about/gettingstarted/)\n", "* [The Python Tutorial](https://docs.python.org/3/tutorial/index.html)\n", "* [Python Documentation](https://docs.python.org/3/index.html)\n", - "* [What’s New in Python](https://docs.python.org/3/whatsnew/index.html)" + "* [What’s New in Python](https://docs.python.org/3/whatsnew/index.html)\n", + "* [Python Enhancement Proposals (PEP)](https://peps.python.org/pep-0000/)" ] } ], diff --git a/notebooks/2c--Numba.ipynb 
b/notebooks/2c--JIT.ipynb similarity index 89% rename from notebooks/2c--Numba.ipynb rename to notebooks/2c--JIT.ipynb index 2ec96a616972ba6b8901cd10c2801e207c875bee..fe291c04fc5c121ab3229a14d58021576e1a03b3 100644 --- a/notebooks/2c--Numba.ipynb +++ b/notebooks/2c--JIT.ipynb @@ -8,12 +8,25 @@ } }, "source": [ - "# Numba\n", + "# Just-in-Time (JIT) compilation\n", "**Python for HPC course**\n", "\n", "Max Planck Computing and Data Facility, Garching" ] }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "* just-in-time compilation refers to the compilation of a program or parts of it at runtime\n", + "* languages such as `java` or `julia` natively make use of JITing\n", + "* for Python, JITing is available through packages like `Numba` or `jax`" + ] + }, { "cell_type": "markdown", "metadata": { @@ -415,7 +428,158 @@ } }, "source": [ - "### Numba application\n", + "## Jax\n", + "* jax is a library for numerical computing and machine learning with NumPy\n", + "* jax provides **just-in-time (JIT) compilation** and **automatic differentiation**\n", + "* convenient usage via a functional API\n", + "* built on the [XLA](https://www.tensorflow.org/xla) compiler, support for\n", + " * CPUs\n", + " * GPUs (NVIDIA, AMD)\n", + " * TPUs\n", + "* https://jax.readthedocs.io/en/latest/" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "## Jax vs. Numba\n", + "\n", + "* for the JIT interface, both provide a simple `jit` function\n", + "* both use LLVM as the compiler backend; jax with a layer of indirection via the XLA compiler\n", + "* jax provides automatic differentiation in addition\n", + "* jax supports CPU/GPU/TPU backends with exactly the same codebase (e.g., no CUDA knowledge necessary)\n", + "* jax will fail on code it doesn't know\n", + "* jax is general purpose but many concepts have a background in machine learning (e.g. 
batch parallelization)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### JIT compilation with jax" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "from jax import jit\n", + "import jax.numpy as np\n", + "\n", + "m = 1024\n", + "x = np.arange(m*m).reshape(m, m).astype(np.float32)\n", + "\n", + "@jit\n", + "def go_fast(a):\n", + " trace = 0.0\n", + " for i in range(a.shape[0]):\n", + " trace += np.tanh(a[i, i])\n", + " b = a + trace\n", + " return b\n", + "\n", + "go_fast(x) # compile the function first \n", + "\n", + "%timeit go_fast(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### Autodifferentiation with jax" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "from jax import grad\n", + "\n", + "def reduce_go_fast(a):\n", + " \"\"\"Reduction on go_fast to have scalar output.\"\"\"\n", + " return go_fast(a).sum()\n", + "\n", + "grad_reduce_go_fast = jit(grad(reduce_go_fast))\n", + "\n", + "jac = grad_reduce_go_fast(x)\n", + "\n", + "%timeit jac = grad_reduce_go_fast(x)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "* jax also provides multivariate differentiation via `jacfwd`, `jacrev`, `jvp`, `vjp`\n", + "* jax has support for vectorization and parallelization: `vmap`, `pmap`\n", + "* jax is still under active development but becoming increasingly popular:\n", + " * the [alphafold project](https://alphafold.com/) from DeepMind\n", + " * the [flax](https://flax.readthedocs.io/) neural net library\n", + " * ..." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "## Watch out for more (domain specific) JIT solutions!\n", + "\n", + "### NumExpr: Fast numerical expression evaluator for NumPy\n", + "\n", + "* decompose NumPy expressions into loop representation: memory saving, cache blocking, thread parallelization, SIMD\n", + "* https://github.com/pydata/numexpr\n", + "\n", + "### *pystencils*\n", + "\n", + "* sympy-based code generator for **stencil computations on NumPy arrays**\n", + "* due to knowledge about the structure of the stencil, the code can be highly optimized\n", + "* https://i10git.cs.fau.de/pycodegen/pystencils\n", + "\n", + "Optional exercise: implement the diffusion computations using these tools!" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### Numba/jax application\n", "* let's continue with the `Diffusion.ipynb` notebook" ] }, @@ -1254,19 +1418,6 @@ "# integer overflow\n", "count_numba(M)" ] - }, - { - "cell_type": "markdown", - "metadata": { - "slideshow": { - "slide_type": "subslide" - } - }, - "source": [ - "### Thank you!\n", - "\n", - "### Questions?" 
- ] } ], "metadata": { diff --git a/notebooks/2d--Diffusion.ipynb b/notebooks/2d--Diffusion.ipynb index 8b98c4b131820165f37a3afac8608fbc40badedd..37250966751c6622fb0de57bc89410ea340a68b5 100644 --- a/notebooks/2d--Diffusion.ipynb +++ b/notebooks/2d--Diffusion.ipynb @@ -5507,13 +5507,112 @@ } }, "source": [ - "## pystencils -- a high performance package for stencil computations\n", + "## Jax\n", + "* Jax is a library for just-in-time compilation (based on XLA) and autodifferentiation\n", + "* usage via the decorator `@jit`" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### NumPy in-place updates with jax\n", + "* jax **doesn't** support in-place updates\n", + "* **but** there is a functional replacement on arrays with the `.at` property: `x.at[idx].set(y)` returning a copy\n", + "* **and** if `x` is not reused (i.e. `x = x.at[idx].set(y)`) the compiled operation will happen in place" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "import jax.numpy as np\n", + "\n", + "def main_loop_jax(evolve_func, grid):\n", + " \"\"\"Main loop function, calling evolve_func on grid.\"\"\"\n", + " grid_tmp = np.empty_like(grid)\n", + " for i in range(1, n_iterations+1):\n", + " grid = evolve_func(grid, grid_tmp, n_points, dt, D)\n", + " return grid" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "outputs": [], + "source": [ + "from jax import jit\n", + "from functools import partial\n", + "\n", + "# just redefined as before but with jax.numpy\n", + "def laplacian_np_roll(grid):\n", + " \"\"\"Laplacian implementation based on the NumPy roll functionality.\"\"\"\n", + " return np.roll(grid, +1, 0) + np.roll(grid, -1, 0) \\\n", + " + np.roll(grid, +1, 1) + np.roll(grid, -1, 1) \\\n", + " - 4 * grid\n", "\n", - 
"* sympy-based code generator for stencil computations on NumPy arrays\n", - "* due to knowledge about the structure of the stencil, the code can be highly optimized and may outperform cython or Numba\n", - "* install via conda or pip, find more information at \n", - " https://i10git.cs.fau.de/pycodegen/pystencils\n", - "* optional exercise: implement the diffusion computation using pystencils" + "@jit\n", + "def evolve_np_roll_jax(grid, grid_tmp, n_points, dt, D):\n", + " \"\"\"Time step based on the NumPy-roll-Laplacian.\"\"\"\n", + " grid_tmp.at[:].set(grid + dt * D * laplacian_np_roll(grid))\n", + " return grid_tmp" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "grid = init()\n", + "solution_jax = main_loop_jax(evolve_np_roll_jax, grid)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "fragment" + } + }, + "outputs": [], + "source": [ + "%%timeit\n", + "grid = init()\n", + "solution_jax = main_loop_jax(evolve_np_roll_jax, grid)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "slideshow": { + "slide_type": "skip" + } + }, + "outputs": [], + "source": [ + "plot_grids(init(), solution_jax)" ] } ], diff --git a/notebooks/2e--Interfacing_with_C_and_F.ipynb b/notebooks/2e--Interfacing_with_C_and_F.ipynb index daaba1ad5109353931472b2773b2297e31e2b26a..83724b57408fe69c1164ded1b3fcde6100cd102f 100644 --- a/notebooks/2e--Interfacing_with_C_and_F.ipynb +++ b/notebooks/2e--Interfacing_with_C_and_F.ipynb @@ -434,9 +434,78 @@ "\n", "* lightweight header-only library to create interfaces to modern C++ code\n", "* highlights: STL, iterators, classes and inheritance, smart pointers, move semantics, NumPy $\\leftrightarrow$ Eigen, etc.\n", - "* cf. 
the simple NumPy example in `pybind11`\n", "* documentation: https://pybind11.readthedocs.io/en/latest/" ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### Easy integration with pyproject.toml (PEP518)\n", + "\n", + "```toml\n", + "[build-system]\n", + "requires = [\"setuptools\", \"pybind11[global]\"]\n", + "build-backend = \"setuptools.build_meta\"\n", + "\n", + "[project]\n", + "...\n", + "```\n", + "\n", + "* the `global` specifier ensures the availability of C++ headers at build-time\n", + " (see `Software_Engineering.ipynb` notebook for more information on the setup of python projects)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "subslide" + } + }, + "source": [ + "### Build with setuptools\n", + "\n", + "```python\n", + "from setuptools import setup\n", + "from pybind11.setup_helpers import Pybind11Extension, build_ext\n", + "\n", + "ext_modules = [\n", + " Pybind11Extension(\n", + " \"pybex\",\n", + " [\"src/pybex.cpp\",],\n", + " ),\n", + "]\n", + "\n", + "setup(\n", + " cmdclass={\"build_ext\": build_ext},\n", + " ext_modules=ext_modules\n", + ")\n", + "```\n", + "\n", + "* cf. 
the simple NumPy example in `pybind11` for more details\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "### Build-system support\n", + "\n", + "* direct integration with `setuptools`\n", + "* `CMake` for large projects with dependencies on the C++ side\n", + "* and some more, see: https://pybind11.readthedocs.io/en/stable/compiling.html\n", + "\n", + "* [scikit-build](https://scikit-build.readthedocs.io/en/latest/) for gluing setuptools with CMake\n" + ] } ], "metadata": { diff --git a/notebooks/4b--Parallel_Frameworks.ipynb b/notebooks/4b--Parallel_Frameworks.ipynb new file mode 100644 index 0000000000000000000000000000000000000000..37734adfea10f8389bf07355d1397f34e21c8061 --- /dev/null +++ b/notebooks/4b--Parallel_Frameworks.ipynb @@ -0,0 +1,133 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "slideshow": { + "slide_type": "slide" + } + }, + "source": [ + "# Frameworks for parallel computing\n", + "**Python for HPC course**\n", + "\n", + "Max Planck Computing and Data Facility, Garching" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Outline\n", + "\n", + "* Motivation\n", + "* Overview of Parallel Frameworks\n", + "* Example: Dask\n", + " * Concepts\n", + " * Dask-MPI on a Slurm cluster" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Why Python frameworks for parallel computing?\n", + "\n", + "* Scale from a single local computer to large parallel resources, (ideally) with a minimum of code modifications\n", + "* Avoid the complexity of handling interprocess/internode communication explicitly ($\\to$ MPI), better let the framework handle this!\n", + "* Use cases: Data parallel problems that can be decomposed into tasks, e.g. processing, reduction, analysis of large amounts of data, training of certain neural networks, etc." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Comparison of parallel frameworks (selection)\n", + "\n", + "* [Apache Spark](https://spark.apache.org): designed for distributed big data analytics, features include SQL, distributed caching, multi-language bindings including Python\n", + "* [Dask](https://www.dask.org): parallel distributed computing, based on a reimplementation of the NumPy API (similarly for Pandas and scikit-learn) in combination with a powerful task scheduler\n", + "* [Ray](https://www.ray.io): core library for distributed computing, plus growing ecosystem with specific libraries (often from AI)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cloud vs HPC environments\n", + "\n", + "* Cloud\n", + " * (software) design often centered around web services\n", + " * scaling works typically via container orchestration systems (e.g. Kubernetes)\n", + " * Python frameworks for parallel computing are often designed with Cloud environments in mind (as these are available to a broader audience in contrast to HPC systems)\n", + "\n", + "* HPC\n", + " * workloads managed via batch jobs\n", + " * non-interactive use preferred\n", + " * Practical challenge: How to get Python parallel frameworks to operate in concert with a batch scheduler?" 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Example: Dask" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Dask array" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Dask futures" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Dask-MPI" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Case Study" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.7" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}