In this tutorial, we briefly introduce the main ideas behind convolutional neural networks, build a neural network model with Keras, and explain the classification decision process using attentive response maps.
The name "convolutional neural network" indicates that the network employs a mathematical operation called convolution. Convolution is a specialized kind of linear operation.
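To make "linear operation" concrete, here is a minimal sketch that convolves a 1-D signal with a small kernel and checks that the result is linear in the input. It uses NumPy rather than Keras, which is an assumption made for brevity, and the arrays `x`, `y`, and `k` are made-up illustrative values.

```python
import numpy as np

# Illustrative values only; they are not taken from the tutorial.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([0.5, -1.0, 0.0, 2.0])
k = np.array([1.0, 0.0, -1.0])  # a simple difference kernel

# Discrete convolution: out[n] = sum_m x[m] * k[n - m].
# mode="valid" keeps only positions where the kernel fully overlaps the signal.
conv_x = np.convolve(x, k, mode="valid")
conv_y = np.convolve(y, k, mode="valid")

# Linearity: conv(a*x + b*y) equals a*conv(x) + b*conv(y).
a, b = 2.0, -3.0
lhs = np.convolve(a * x + b * y, k, mode="valid")
rhs = a * conv_x + b * conv_y
is_linear = np.allclose(lhs, rhs)

print(conv_x, is_linear)
```

Because the operation is linear, stacking convolutions alone would still give a linear map, which is why a nonlinearity is applied afterwards.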
A typical layer of a convolutional network consists of three stages:
Convolution stage: the layer performs several convolutions in parallel to produce a set of linear activations (see Sec. 3 for more details).
Detector stage: each linear activation is run through a nonlinear activation function (e.g. the rectified linear activation function, sigmoid, or tanh).
Pooling stage: a pooling function is used to modify (downsample) the output of the layer. A pooling function replaces the output of the network at a certain location with a summary statistic of the nearby outputs. For example, the max pooling operation reports the maximum output within a rectangular neighborhood. Other popular pooling functions include the average of a rectangular neighborhood, the $L^2$ norm of a rectangular neighborhood, or a weighted average based on the distance from the central pixel.
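The three stages can be sketched on a single-channel toy image as follows. This is a minimal NumPy illustration under stated assumptions, not the Keras model built in this tutorial: the helper names (`conv2d_valid`, `relu`, `pool2d`, `l2_norm`), the 5x5 image, and the edge-detecting kernel are all hypothetical, and the convolution uses stride 1, no padding, and no kernel flip (cross-correlation, as is common in deep learning libraries).

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Convolution stage: slide `kernel` over `image` (stride 1, no padding).

    The kernel is not flipped, so strictly speaking this is cross-correlation.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Detector stage: rectified linear activation, applied elementwise."""
    return np.maximum(x, 0.0)

def l2_norm(blocks, axis):
    """Summary statistic: L2 norm over the given axes."""
    return np.sqrt(np.sum(blocks ** 2, axis=axis))

def pool2d(x, size=2, reduce=np.max):
    """Pooling stage: non-overlapping size x size windows.

    Trailing rows/columns that do not fill a whole window are dropped.
    """
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    blocks = x[:h, :w].reshape(h // size, size, w // size, size)
    return reduce(blocks, axis=(1, 3))

# Toy image: a bright vertical stripe in columns 1 and 2.
image = np.tile([0, 1, 1, 0, 0], (5, 1)).astype(float)
# Kernel that responds positively to a dark-to-bright transition.
kernel = np.array([[-1.0, 1.0],
                   [-1.0, 1.0]])

linear = conv2d_valid(image, kernel)          # 4x4 linear activations
detected = relu(linear)                       # negative responses set to 0
pooled_max = pool2d(detected, reduce=np.max)  # 2x2 max pooling
pooled_avg = pool2d(detected, reduce=np.mean) # 2x2 average pooling
pooled_l2 = pool2d(detected, reduce=l2_norm)  # 2x2 L2-norm pooling

print(pooled_max)
```

Swapping the `reduce` argument is enough to switch between the max, average, and $L^2$-norm summaries; the distance-weighted average is omitted from this sketch.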