bigmax_mnist_example_tutorial.ipynb 2.51 MB
Angelo Ziletti committed Feb 18, 2019
{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# (Convolutional) Neural network tutorial - BigMax workshop - Dresden, April 2019\n", "\n", "\n", "##### Authors: Angelo Ziletti, Andreas Leitherer, and Luca M. Ghiringhelli - Fritz Haber Institute of the Max Planck Society, Berlin\n", "\n", "In this tutorial, we briefly introduce the main ideas behind convolutional neural networks, build a neural network model with Keras, and explain the classification decision process using attentive response maps." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 0. Install packages needed" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We first install the packages needed for this tutorial, and then load the necessary Python libraries. This tutorial has been tested on Python 3.5."
] }, { "cell_type": "code", "execution_count": 1, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [
"Requirement already satisfied: tensorflow in /home/ziletti/.local/lib/python3.6/site-packages (1.12.0)\n",
"Requirement already satisfied: tensorboard<1.13.0,>=1.12.0 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (1.12.2)\n",
"Requirement already satisfied: keras-preprocessing>=1.0.5 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (1.0.9)\n",
"Requirement already satisfied: wheel>=0.26 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from tensorflow) (0.31.1)\n",
"Requirement already satisfied: termcolor>=1.1.0 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (1.1.0)\n",
"Requirement already satisfied: astor>=0.6.0 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (0.7.1)\n",
"Requirement already satisfied: numpy>=1.13.3 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from tensorflow) (1.15.1)\n",
"Requirement already satisfied: keras-applications>=1.0.6 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (1.0.7)\n",
"Requirement already satisfied: grpcio>=1.8.6 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (1.18.0)\n",
"Requirement already satisfied: absl-py>=0.1.6 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (0.7.0)\n",
"Requirement already satisfied: protobuf>=3.6.1 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (3.6.1)\n",
"Requirement already satisfied: gast>=0.2.0 in
/home/ziletti/.local/lib/python3.6/site-packages (from tensorflow) (0.2.2)\n",
"Requirement already satisfied: six>=1.10.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from tensorflow) (1.11.0)\n",
"Requirement already satisfied: markdown>=2.6.8 in /home/ziletti/.local/lib/python3.6/site-packages (from tensorboard<1.13.0,>=1.12.0->tensorflow) (3.0.1)\n",
"Requirement already satisfied: werkzeug>=0.11.10 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from tensorboard<1.13.0,>=1.12.0->tensorflow) (0.14.1)\n",
"Requirement already satisfied: h5py in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras-applications>=1.0.6->tensorflow) (2.8.0)\n",
"Requirement already satisfied: setuptools in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from protobuf>=3.6.1->tensorflow) (40.2.0)\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n",
"Requirement already satisfied: keras in /home/ziletti/.local/lib/python3.6/site-packages (2.2.4)\n",
"Requirement already satisfied: keras-preprocessing>=1.0.5 in /home/ziletti/.local/lib/python3.6/site-packages (from keras) (1.0.9)\n",
"Requirement already satisfied: keras-applications>=1.0.6 in /home/ziletti/.local/lib/python3.6/site-packages (from keras) (1.0.7)\n",
"Requirement already satisfied: six>=1.9.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras) (1.11.0)\n",
"Requirement already satisfied:
numpy>=1.9.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras) (1.15.1)\n",
"Requirement already satisfied: pyyaml in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras) (3.13)\n",
"Requirement already satisfied: scipy>=0.14 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras) (1.1.0)\n",
"Requirement already satisfied: h5py in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras) (2.8.0)\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n",
"Requirement already satisfied: matplotlib in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (2.2.3)\n",
"Requirement already satisfied: numpy>=1.7.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (1.15.1)\n",
"Requirement already satisfied: cycler>=0.10 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (0.10.0)\n",
"Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (2.2.0)\n",
"Requirement already satisfied: python-dateutil>=2.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (2.7.3)\n",
"Requirement already satisfied: pytz in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (2018.5)\n",
"Requirement already satisfied: six>=1.10 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (1.11.0)\n",
"Requirement already satisfied: kiwisolver>=1.0.1 in
/home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib) (1.0.1)\n",
"Requirement already satisfied: setuptools in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from kiwisolver>=1.0.1->matplotlib) (40.2.0)\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n",
"Requirement already satisfied: scipy in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (1.1.0)\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n",
"Requirement already satisfied: numpy in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (1.15.1)\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n",
"Collecting git+https://github.com/raghakot/keras-vis.git\n",
" Cloning https://github.com/raghakot/keras-vis.git to /tmp/pip-req-build-epbr_4s0\n",
"Requirement not upgraded as not directly required: keras in /home/ziletti/.local/lib/python3.6/site-packages (from keras-vis==0.4.1) (2.2.4)\n",
"Requirement not upgraded as not directly required: six in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras-vis==0.4.1) (1.11.0)\n",
"Requirement not upgraded as not directly required: scikit-image in
/home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras-vis==0.4.1) (0.14.0)\n",
"Requirement not upgraded as not directly required: matplotlib in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras-vis==0.4.1) (2.2.3)\n",
"Requirement not upgraded as not directly required: h5py in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras-vis==0.4.1) (2.8.0)\n",
"Requirement not upgraded as not directly required: keras-preprocessing>=1.0.5 in /home/ziletti/.local/lib/python3.6/site-packages (from keras->keras-vis==0.4.1) (1.0.9)\n",
"Requirement not upgraded as not directly required: scipy>=0.14 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras->keras-vis==0.4.1) (1.1.0)\n",
"Requirement not upgraded as not directly required: keras-applications>=1.0.6 in /home/ziletti/.local/lib/python3.6/site-packages (from keras->keras-vis==0.4.1) (1.0.7)\n",
"Requirement not upgraded as not directly required: numpy>=1.9.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras->keras-vis==0.4.1) (1.15.1)\n",
"Requirement not upgraded as not directly required: pyyaml in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from keras->keras-vis==0.4.1) (3.13)\n",
"Requirement not upgraded as not directly required: networkx>=1.8 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from scikit-image->keras-vis==0.4.1) (2.1)\n",
"Requirement not upgraded as not directly required: pillow>=4.3.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from scikit-image->keras-vis==0.4.1) (5.2.0)\n",
"Requirement not upgraded as not directly required: PyWavelets>=0.4.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from scikit-image->keras-vis==0.4.1) (1.0.0)\n",
"Requirement not upgraded as not directly required:
dask[array]>=0.9.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from scikit-image->keras-vis==0.4.1) (0.19.1)\n",
"Requirement not upgraded as not directly required: cloudpickle>=0.2.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from scikit-image->keras-vis==0.4.1) (0.5.5)\n",
"Requirement not upgraded as not directly required: cycler>=0.10 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib->keras-vis==0.4.1) (0.10.0)\n",
"Requirement not upgraded as not directly required: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib->keras-vis==0.4.1) (2.2.0)\n" ] }, { "name": "stdout", "output_type": "stream", "text": [
"Requirement not upgraded as not directly required: python-dateutil>=2.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib->keras-vis==0.4.1) (2.7.3)\n",
"Requirement not upgraded as not directly required: pytz in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib->keras-vis==0.4.1) (2018.5)\n",
"Requirement not upgraded as not directly required: kiwisolver>=1.0.1 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from matplotlib->keras-vis==0.4.1) (1.0.1)\n",
"Requirement not upgraded as not directly required: decorator>=4.1.0 in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from networkx>=1.8->scikit-image->keras-vis==0.4.1) (4.3.0)\n",
"Requirement not upgraded as not directly required: toolz>=0.7.3; extra == \"array\" in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from dask[array]>=0.9.0->scikit-image->keras-vis==0.4.1) (0.9.0)\n",
"Requirement not upgraded as not directly required: setuptools in /home/ziletti/anaconda2/envs/py36/lib/python3.6/site-packages (from
kiwisolver>=1.0.1->matplotlib->keras-vis==0.4.1) (40.2.0)\n",
"Building wheels for collected packages: keras-vis\n",
" Running setup.py bdist_wheel for keras-vis ... \u001b[?25ldone\n",
"\u001b[?25h Stored in directory: /tmp/pip-ephem-wheel-cache-3ikp3hx6/wheels/c5/ae/e7/b34d1cb48b1898f606a5cce08ebc9521fa0588f37f1e590d9f\n",
"Successfully built keras-vis\n",
"\u001b[31mtwisted 18.7.0 requires PyHamcrest>=1.9.0, which is not installed.\u001b[0m\n",
"Installing collected packages: keras-vis\n",
" Found existing installation: keras-vis 0.4.1\n",
" Uninstalling keras-vis-0.4.1:\n",
" Successfully uninstalled keras-vis-0.4.1\n",
"Successfully installed keras-vis-0.4.1\n",
"\u001b[33mYou are using pip version 10.0.1, however version 19.0.2 is available.\n",
"You should consider upgrading via the 'pip install --upgrade pip' command.\u001b[0m\n" ] } ], "source": [ "# packages to build convolutional neural networks (and not only)\n", "! pip install --user tensorflow\n", "! pip install --user keras\n", "\n", "# to visualize images\n", "! pip install matplotlib\n", "\n", "# to calculate convolution\n", "! pip install scipy\n", "! pip install numpy\n", "\n", "# package for neural network attention map visualization\n", "!
pip install git+https://github.com/raghakot/keras-vis.git -U" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Using TensorFlow backend.\n" ] } ], "source": [ "from __future__ import print_function\n", "%matplotlib inline\n", "\n", "import keras\n", "from keras.datasets import mnist\n", "from keras.models import Sequential\n", "from keras.layers import Dense, Dropout, Flatten\n", "from keras.layers import Conv2D, MaxPooling2D\n", "from keras import backend as K\n", "import matplotlib\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "from scipy import signal\n", "import scipy.misc\n", "import urllib.request" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. Introduction to Convolutional Neural Networks \n", "\n", "This introduction is mainly taken from Ref. [1], to which we refer the interested reader for more details." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convolutional networks are a specialized kind of neural network for processing data that has a known **grid-like topology**; they are networks that use convolution in place of general matrix multiplication in at least one of their layers.\n", "\n", "Examples of such data include time-series data (1-D grid with samples at regular time intervals) and image data (2-D grid of pixels). \n", "Convolutional networks have been tremendously successful in practical applications, especially in computer vision. \n", "\n", "The name \"convolutional neural network\" indicates that the network employs a mathematical operation called convolution. Convolution is a specialized kind of linear operation. \n", "\n", "\n", "A typical layer of a convolutional network consists of three stages:\n", "1.
**Convolution** stage: the layer performs several convolutions in parallel to produce a set of linear activations (see Sec. 3 for more details).\n", "\n", "2. **Detector** stage: each linear activation is run through a nonlinear activation function (e.g. the rectified linear activation function, sigmoid, or tanh).\n", "\n", "3. **Pooling** stage: a pooling function is used to modify (downsample) the output of the layer. A pooling function replaces the output of the network at a certain location with a summary statistic of the nearby outputs. For example, the max pooling operation reports the maximum output within a rectangular neighborhood. Other popular pooling functions include the average of a rectangular neighborhood, the $L^2$ norm of a rectangular neighborhood, or a weighted average based on the distance from the central pixel.\n", "\n", "#### Max pooling example\n", "\n", "![maxpool.jpg](maxpool.jpg)\n", "Figure from http://cs231n.github.io/convolutional-networks/\n", "\n", "\n", "#### Average pooling example\n", "\n", "![avg_pooling_example.png](avg_pooling_example.png)\n", "Figure from https://github.com/vdumoulin/conv_arithmetic\n", "\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2.
Motivation\n", "Why should one use convolutional neural networks instead of simple (fully connected) neural networks?\n", "\n", "Convolution leverages three important ideas that can help improve a machine learning system: \n", "- **sparse interactions**\n", "- **parameter sharing**\n", "- **equivariant representations** \n", "\n", "Moreover, convolution provides a means for working with inputs of variable size - while this is not possible with fully connected neural networks (also called multi-layer perceptrons).\n", "\n", "#### 2.1 Sparse interactions\n", "##### Fully connected NN\n", "It uses matrix multiplication by a matrix of parameters, with a separate parameter describing the interaction between each input unit and each output unit. This means that every output unit interacts with every input unit. This does not scale well to full images. For example, an image of 200x200x3 would lead to neurons that each have 200x200x3 = 120,000 weights. Moreover, we would almost certainly want to have several such neurons. Clearly, this full connectivity is wasteful, and the huge number of parameters would quickly lead to overfitting.\n", "##### CNN\n", "It achieves sparse interactions (sparse connectivity) by making the kernel smaller than the input. When processing an image, we can detect small, meaningful features such as edges with kernels that occupy only tens or hundreds of pixels (*see Sec. 3.3.2 for two concrete examples*). \n", "This means that we need to store fewer parameters, which both reduces the memory requirements of the model and improves its statistical efficiency. It also means that computing the output requires fewer operations. If there are $m$ inputs and $n$ outputs, then matrix multiplication requires $m \times n$ parameters, and the algorithms used in practice have $O(m \times n)$ runtime (per example). If we limit the number of connections each output may have to $k$, then the sparsely connected approach requires only $k \times n$ parameters and $O(k \times n)$ runtime. For many practical applications, $k$ is several orders of magnitude smaller than $m$.\n", "\n", "#### 2.2 Parameter sharing\n", "It refers to using the same parameter for more than one function in a model. \n", "\n", "##### Fully connected NN\n", "Each element of the weight matrix is used exactly once when computing the output of a layer.\n", "\n", "##### CNN\n", "Each member of the kernel is used at every position of the input. The parameter sharing used by the convolution operation means that rather than learning a separate set of parameters for every location, we learn only one set. This further reduces the storage requirements of the model to $k$ parameters. Recall that $k$ is usually several orders of magnitude smaller than $m$. Since $m$ and $n$ are usually roughly the same size, $k$ is practically insignificant compared to $m \times n$. Convolution is thus dramatically more efficient than dense matrix multiplication in terms of memory requirements and statistical efficiency. \n", "\n", "\n", "#### 2.3 Equivariant representations\n", "\n", "Parameter sharing causes the layer to have **equivariance to translation**. To say a function is equivariant means that if the input changes, the output changes in the same way.\n", "\n", "When processing time-series data, this means that convolution produces a sort of timeline that shows when different features appear in the input. If we move an event later in time in the input, the exact same representation of it will appear in the output, just later.
Similarly with images, convolution creates a 2-D map of where certain features appear in the input. If we move the object in the input, its representation will move the same amount in the output. This is useful when we know that some function of a small number of neighboring pixels is useful when applied to multiple input locations." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. The convolution operation\n", "\n", "### 3.1 Summary and intuition\n", "The convolutional layer's parameters consist of a set of learnable filters. Every filter is small spatially (along width and height), but extends through the full depth of the input volume. For example, a typical filter on a first layer of a ConvNet might have size 5x5x3 (i.e. 5 pixels width and height, and 3 because images have depth 3, the color channels). \n", "\n", "* During the forward pass, we slide (more precisely, convolve) each filter across the width and height of the input volume and compute dot products between the entries of the filter and the input at any position. Intuitively, a convolution can be thought of as a sliding (weighted) average. \n", "\n", "* As we slide the filter over the width and height of the input volume we will produce a 2-dimensional activation map that gives the responses of that filter at every spatial position. Intuitively, the network will learn filters that activate when they see some type of visual feature such as an edge of some orientation or a blotch of some color on the first layer, or eventually entire honeycomb or wheel-like patterns on higher layers of the network. \n", "\n", "* At this stage, we have an entire set of filters in each convolutional layer (e.g. 12 filters), and each of them produces a separate 2-dimensional activation map. We stack these activation maps along the depth dimension and produce the output volume.\n", "\n", "Below, you can see a representation of how the convolution operation is performed.\n", "![AnimationConvolution](padding_strides.gif \"convolution\")\n", "\n", "Animation from: https://github.com/vdumoulin/conv_arithmetic/blob/master/gif/padding_strides.gif" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.2 Mathematical formulation - from Ref. [1]\n", "\n", "#### Main idea\n", "Suppose we are tracking the location of a spaceship with a laser sensor. Our laser sensor provides a single output $x(t)$, the position of the spaceship at time $t$. Now suppose that our laser sensor is somewhat noisy. To obtain a less noisy estimate of the spaceship's position, we would like to average several measurements. Of course, more recent measurements are more relevant, so we will want this to be a weighted average that gives more weight to recent measurements. We can do this with a weighting function $w(a)$, where $a$ is the age of a measurement. \n", "If we apply such a weighted average operation at every moment, we obtain a new function $s$ providing a smoothed estimate of the position of the spaceship: \n", "\n", "$s(t) = \int x(a)w(t-a)da$\n", "\n", "This operation is called **convolution**.
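The weighted-average picture can be checked numerically; below is a minimal sketch (the noisy signal, the kernel weights, and the random seed are made up for illustration). Note that `np.convolve` flips the kernel internally, which matches the convolution definition above.

```python
import numpy as np

# noisy 1-D "sensor" reading x(t), sampled at integer times (illustrative data)
rng = np.random.default_rng(0)
x = np.sin(np.linspace(0, 4 * np.pi, 100)) + 0.3 * rng.standard_normal(100)

# weighting function w(a): more weight on recent measurements; weights sum to 1
w = np.array([0.5, 0.3, 0.2])

# discrete convolution s(t) = sum_a x(a) w(t - a)
s = np.convolve(x, w, mode='valid')

# the smoothed estimate has lower variance than the noisy input
print(len(s), s.var() < x.var())
```

With `mode='valid'` the output keeps only positions where the kernel fully overlaps the signal, so its length is `len(x) - len(w) + 1`.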
\n", "\n", "The convolution operation is typically denoted with an asterisk: \n", "\n", "$s(t) = (x \ast w)(t)$\n", "\n", "In convolutional network terminology, the first argument (in this example, the function $x$) to the convolution is often referred to as the **input**, and the second argument (in this example, the function $w$) as the **kernel**. The output is sometimes referred to as the **feature map**.\n", "\n", "#### Discrete version - 1D [optional]\n", "Let us assume that the time index $t$ can take on only integer values. If we now assume that $x$ and $w$ are defined only on integer $t$, we can define the discrete convolution: \n", "\n", "$s(t) = (x \ast w)(t) = \sum_{a=-\infty}^{+\infty} x(a)w(t-a)$\n", "\n", "#### Discrete version - 2D [optional]\n", "$S(i,j) = (I \ast K)(i,j) = \sum_{m}\sum_{n} I(m,n)K(i-m,j-n)$ \n", "\n", "Convolution is commutative, so we can write: \n", "\n", "$S(i,j) = (K \ast I)(i,j) = \sum_{m}\sum_{n} I(i-m,j-n)K(m,n)$ \n", "\n", "Usually the latter formula is more straightforward to implement in a machine learning library, because there is less variation in the range of valid values of $m$ and $n$. The commutative property of convolution arises because we have flipped the kernel relative to the input, in the sense that as $m$ increases, the index into the input increases, but the index into the kernel decreases. The only reason to flip the kernel is to obtain the commutative property. While the commutative property is useful for writing proofs, it is not usually an important property of a neural network implementation. \n", "\n", "Instead, many neural network libraries implement a related function called the cross-correlation, which is the same as convolution but without flipping the kernel:\n", "\n", "$S(i,j) = (I \ast K)(i,j) = \sum_{m}\sum_{n} I(i+m,j+n)K(m,n)$ \n", "\n", "Many machine learning libraries implement cross-correlation but call it *convolution*.
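The kernel flip can be seen directly with SciPy, which exposes both operations; here is a small sketch (the toy arrays are made up), using an asymmetric kernel so that convolution and cross-correlation give different results:

```python
import numpy as np
from scipy import signal

I = np.arange(25, dtype=float).reshape(5, 5)  # toy "image"
K = np.array([[0., 1.],
              [2., 3.]])                      # asymmetric kernel

conv = signal.convolve2d(I, K, mode='valid')   # convolution (kernel is flipped)
corr = signal.correlate2d(I, K, mode='valid')  # cross-correlation (no flip)

# cross-correlating with the doubly-flipped kernel recovers the convolution
same = np.allclose(conv, signal.correlate2d(I, np.flip(K), mode='valid'))
diff = not np.allclose(conv, corr)
print(same, diff)
```

For a kernel that is symmetric under a 180-degree flip, the two operations coincide, which is why the distinction rarely matters for learned kernels.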
In the context of machine learning, the learning algorithm will learn the appropriate values of the kernel in the appropriate place, so an algorithm based on convolution with kernel flipping will learn a kernel that is flipped relative to the kernel learned by an algorithm without the flipping. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### 3.3 Examples\n", "\n", "#### 3.3.1 Example: computing output value of a discrete convolution (from Ref. [3])\n", "We present below the calculation of the discrete convolution of an input with a 3x3 kernel $K_{\rm ex}$ (with no padding and stride 1): \n", "$K_{\rm ex} = \begin{pmatrix}\n", "0 & 1 & 2 \\ \n", "2 & 2 & 0 \\ \n", "0 & 1 & 2 \n", "\end{pmatrix}$ \n", "\n", "![SegmentLocal](output_discrete_convolution.png \"segment\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### 3.3.2 Example: convolution in practice on real images" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We now perform a convolution operation on real images. We use a photo of Max Planck, and a Berlin landscape." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# this can be skipped because the images are already saved on the server\n", "\n", "# retrieve image of Max Planck from wikipedia\n", "#print(\"Retrieving picture of Max Planck.
Saving image to './img_max_planck.jpg'.\")\n", "#urllib.request.urlretrieve(\"https://upload.wikimedia.org/wikipedia/commons/thumb/c/c7/Max_Planck_1933.jpg/220px-Max_Planck_1933.jpg\", \"./img_max_planck.jpg\")\n", "\n", "# retrieve a picture of Berlin\n", "#print(\"Retrieving picture of Berlin landscape. Saving image to './img_berlin_landscape.jpg'.\")\n", "#urllib.request.urlretrieve(\"http://vivalifestyleandtravel.com/images/cache/c-1509326560-44562570.jpg\", \"./img_berlin_landscape.jpg\")\n", "\n", "#print(\"Done.\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We define a function to display images in a single figure; it is not important for the purpose of this tutorial to understand this function's implementation." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "code_folding": [ 7 ], "collapsed": true }, "outputs": [], "source": [ "# function to display multiple images in a single figure\n", "def show_images(images, cols=1, titles=None, cmap='viridis', filename_out=None):\n", " \"\"\"Display a list of images in a single figure with matplotlib.\n", "\n", " Taken from https://stackoverflow.com/questions/11159436/multiple-figures-in-a-single-window\n", "\n", " Parameters:\n", "\n", " images: list of np.arrays\n", " Images to be plotted.
It must be compatible with plt.imshow.\n", "\n", " cols: int, optional, (default = 1)\n", " Number of columns in figure (number of rows is\n", " set to np.ceil(n_images/float(cols))).\n", "\n", " titles: list of strings\n", " List of titles corresponding to each image.\n", "\n", " \"\"\"\n", " plt.clf()\n", " assert ((titles is None) or (len(images) == len(titles)))\n", " n_images = len(images)\n", " if titles is None:\n", " titles = ['Image (%d)' % i for i in range(1, n_images + 1)]\n", " fig = plt.figure()\n", " for n, (image, title) in enumerate(zip(images, titles)):\n", " a = fig.add_subplot(cols, int(np.ceil(n_images / float(cols))), n + 1)\n", " plt.imshow(image, cmap=cmap)\n", " a.set_title(title, fontsize=40)\n", " a.axis('off') # clear x- and y-axes\n", " fig.set_size_inches(np.array(fig.get_size_inches()) * n_images)\n", " if filename_out is not None:\n", " plt.savefig(filename_out, dpi=100, format='png')" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# read jpg files as numpy arrays\n", "img_max_planck = plt.imread('./img_max_planck.jpg')[:, :, 0]\n", "img_berlin_landscape = plt.imread('./img_berlin_landscape.jpg')[:, :, 0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Type of Kernels\n", "\n", "In Sec.
3.3.1, we used a randomly chosen matrix to perform our convolution; it turns out that there are some \"special\" kernel matrices that perform specific (and useful) transformations when convolved with an image.\n", "Below, we present some examples of these kernels.\n", "\n", "Please visit this page for more details: https://en.wikipedia.org/wiki/Kernel_(image_processing)\n", "\n", "$K_{\\rm identity} = \\begin{pmatrix}\n", "0 & 0 & 0 \\\\ \n", "0 & 1 & 0 \\\\ \n", "0 & 0 & 0 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm boxblur} = \\dfrac{1}{9}\\begin{pmatrix}\n", "1 & 1 & 1 \\\\ \n", "1 & 1 & 1 \\\\ \n", "1 & 1 & 1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm gaussianblur3x3} = \\dfrac{1}{16}\\begin{pmatrix}\n", "1 & 2 & 1 \\\\ \n", "2 & 4 & 2 \\\\ \n", "1 & 2 & 1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm gaussianblur5x5} = \\dfrac{1}{256}\\begin{pmatrix}\n", "1 & 4 & 6 & 4 & 1 \\\\ \n", "4 & 16 & 24 & 16 & 4 \\\\ \n", "6 & 24 & 36 & 24 & 6 \\\\ \n", "4 & 16 & 24 & 16 & 4 \\\\ \n", "1 & 4 & 6 & 4 & 1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm vlines} = \\begin{pmatrix}\n", "-1 & 2 & -1 \\\\ \n", "-1 & 2 & -1 \\\\ \n", "-1 & 2 & -1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm hlines} = \\begin{pmatrix}\n", "-1 & -1 & -1 \\\\ \n", " 2 & 2 & 2 \\\\ \n", "-1 & -1 & -1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm edges} = \\begin{pmatrix}\n", "-1 & -1 & -1 \\\\ \n", "-1 & 8 & -1 \\\\ \n", "-1 & -1 & -1 \n", "\\end{pmatrix}$ \n", "\n", "$K_{\\rm emboss} = \\begin{pmatrix}\n", "-2 & -1 & 0 \\\\ \n", "-1 & 1 & 1 \\\\ \n", " 0 & 1 & 2 \n", "\\end{pmatrix}$ " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now we apply the convolution operation to both images (the photo of Max Planck and the Berlin landscape) using each of the kernels above.\n", "In particular, we use the SciPy function signal.convolve2d to perform the convolution. 
\n", "\n", "Please refer to the Scipy documentation for more details on this function: https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.convolve2d.html" ] }, { "cell_type": "code",  Angelo Ziletti committed Feb 19, 2019 518  "execution_count": 6,  Angelo Ziletti committed Mar 01, 2019 519 520 521  "metadata": { "collapsed": true },  Angelo Ziletti committed Feb 18, 2019 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566  "outputs": [], "source": [ "k_identity = np.array([[0., 0., 0.], \n", " [0., 1., 0.], \n", " [0., 0., 0.]])\n", "\n", "k_box_blur = 1./9. * np.array([[1., 1., 1.], \n", " [1., 1., 1.], \n", " [1., 1., 1.]])\n", "\n", "k_gauss_blur_3x3 = 1./16.* np.array([[0., 0., 0., 5., 0., 0., 0.],\n", " [0., 0., 18., 32., 18., 5., 0.],\n", " [0., 18., 64., 100., 64., 18., 0.],\n", " [5., 32., 100., 100., 100., 32., 5.],\n", " [0., 18., 64., 100., 64., 18., 0.],\n", " [0., 5., 18., 32., 18., 5., 0.],\n", " [0., 0., 0., 5., 0., 0., 0.]])\n", "\n", "k_gauss_blur_5x5 = 1./256.* np.array([[1., 4., 6., 4., 1.],\n", " [4., 16., 24., 16., 4.],\n", " [6., 24., 36., 24., 6.],\n", " [4., 16., 24., 16., 4.],\n", " [1., 4., 6., 4., 1.]])\n", "\n", "k_vlines = np.array([[-1., 2., -1.], \n", " [-1., 2., -1.], \n", " [-1., 2., -1.]])\n", "\n", "k_hlines = np.array([[-1., -1., -1.], \n", " [ 2., 2., 2.], \n", " [-1., -1., -1.]])\n", "\n", "k_edges = np.array([[-1., -1., -1.], \n", " [-1., 8., -1.], \n", " [-1., -1., -1.]])\n", "\n", "#the emboss kernel givens the illusion of depth by emphasizing the differences of pixels in a given direction\n", "# in this case, in a direction along a line from the top left to the bottom right.\n", "k_emboss = np.array([[-2., -1., 0.], \n", " [-1., 1., 1.], \n", " [ 0., 1., 2.]])\n" ] }, { "cell_type": "code",  Angelo Ziletti committed Feb 19, 2019 567  "execution_count": 7,  Angelo Ziletti committed 
Mar 01, 2019 568 569 570  "metadata": { "collapsed": true },  Angelo Ziletti committed Feb 18, 2019 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585  "outputs": [], "source": [ "kernels = [k_identity, k_box_blur, k_vlines, k_hlines, k_edges, k_emboss]\n", "titles = ['original', 'box blur', 'vertical lines', 'horizontal lines', 'edges', 'emboss']\n", "\n", "# now apply the convolution for each kernel above\n", "max_planck_feature_maps = []\n", "berlin_landscape_feature_maps = []\n", "for kernel in kernels:\n", " max_planck_feature_maps.append(signal.convolve2d(img_max_planck, kernel, boundary='symm', mode='same'))\n", " berlin_landscape_feature_maps.append(signal.convolve2d(img_berlin_landscape, kernel, boundary='symm', mode='same'))" ] }, { "cell_type": "code",  Angelo Ziletti committed Feb 19, 2019 586  "execution_count": 8,  Angelo Ziletti committed Feb 18, 2019 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616  "metadata": {}, "outputs": [ { "data": { "text/plain": [ "