Renamed & re-organised + fixed code and added samples
leriomaggio committed Apr 12, 2017
1 parent ce29da9 commit 8a96b5a
Showing 10 changed files with 2,154 additions and 3,433 deletions.
@@ -10,7 +10,7 @@
}
},
"source": [
"# ConvNet HandsOn with Keras"
"# CNN HandsOn with Keras"
]
},
{
254 changes: 254 additions & 0 deletions 2.2.1 CNN HandsOn - MNIST & FC Nets.ipynb
@@ -0,0 +1,254 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"# Fully Connected Feed-Forward Network\n",
"\n",
"In this notebook we will play with Feed-Forward FC-NN (Fully Connected Neural Network) for a *classification task*: Image Classification on MNIST Dataset\n",
"\n",
"**RECALL**\n",
"\n",
"In the FC-NN, the output of each layer is computed using the activations from the previous one, as follows:\n",
"\n",
"$$h_{i} = \\sigma(W_i h_{i-1} + b_i)$$\n",
"\n",
"where ${h}_i$ is the activation vector from the $i$-th layer (or the input data for $i=0$), ${W}_i$ and ${b}_i$ are the weight matrix and the bias vector for the $i$-th layer, respectively. \n",
"<br><rb>\n",
"$\\sigma(\\cdot)$ is the activation function. In our example, we will use the *ReLU* activation function for the hidden layers and *softmax* for the last layer.\n",
"\n",
"To regularize the model, we will also insert a Dropout layer between consecutive hidden layers. \n",
"\n",
"Dropout works by “dropping out” some unit activations in a given layer, that is setting them to zero with a given probability.\n",
"\n",
"Our loss function will be the **categorical crossentropy**."
]
},
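{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"To make the recall above concrete, here is a tiny NumPy sketch of a single fully connected layer with a *ReLU* activation. The shapes and random values are purely illustrative:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def relu(x):\n",
"    return np.maximum(0, x)\n",
"\n",
"# toy example: a layer with 3 units applied to a 4-dimensional input\n",
"rng = np.random.RandomState(0)\n",
"h0 = rng.rand(4)        # input activations (h_0)\n",
"W1 = rng.randn(3, 4)    # weight matrix W_1\n",
"b1 = np.zeros(3)        # bias vector b_1\n",
"h1 = relu(W1.dot(h0) + b1)  # h_1 = sigma(W_1 h_0 + b_1)\n",
"print(h1)"
]
},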
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"## Model definition\n",
"Keras supports two different kind of models: the [Sequential](http://keras.io/models/#sequential) model and the [Graph](http://keras.io/models/#graph) model. The former is used to build linear stacks of layer (so each layer has one input and one output), and the latter supports any kind of connection graph.\n",
"\n",
"In our case we build a Sequential model with three [Dense](http://keras.io/layers/core/#dense) (aka fully connected) layers, with some [Dropout](http://keras.io/layers/core/#dropout). Notice that the output layer has the softmax activation function. \n",
"\n",
"The resulting model is actually a `function` of its own inputs implemented using the Keras backend. \n",
"\n",
"We apply the binary crossentropy loss and choose SGD as the optimizer. \n",
"\n",
"Please remind that Keras supports a variety of different [optimizers](http://keras.io/optimizers/) and [loss functions](http://keras.io/objectives/), which you may want to check out. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
}
],
"source": [
"from keras.models import Sequential\n",
"from keras.layers.core import Dense, Dropout\n",
"from keras.optimizers import SGD\n",
"\n",
"nb_classes = 10\n",
"\n",
"# FC@512+relu -> DropOut(0.2) -> FC@512+relu -> DropOut(0.2) -> FC@nb_classes+softmax\n",
"# ... your Code Here"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"# Decomment and Execute this cell to get the solution\n",
"# %load solutions/sol_221_1.py"
]
},
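{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"For reference, a minimal sketch of the requested architecture (reusing the imports and `nb_classes` defined above) could look like the following. This is just one possible solution, with the layer sizes and dropout rates taken from the comment in the exercise cell:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"model = Sequential()\n",
"model.add(Dense(512, activation='relu', input_shape=(784,)))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(512, activation='relu'))\n",
"model.add(Dropout(0.2))\n",
"model.add(Dense(nb_classes, activation='softmax'))\n",
"\n",
"# categorical crossentropy loss with plain SGD, as described above\n",
"model.compile(loss='categorical_crossentropy', optimizer=SGD(),\n",
"              metrics=['accuracy'])\n",
"model.summary()"
]
},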
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"## Data preparation (`keras.dataset`)\n",
"\n",
"We will train our model on the MNIST dataset, which consists of 60,000 28x28 grayscale images of the 10 digits, along with a test set of 10,000 images. \n",
"\n",
"![](imgs/mnist.png)\n",
"\n",
"Since this dataset is **provided** with Keras, we just ask the `keras.dataset` model for training and test data.\n",
"\n",
"We will:\n",
"\n",
"* download the data\n",
"* reshape data to be in vectorial form (original data are images)\n",
"* normalize between 0 and 1.\n",
"\n",
"The `binary_crossentropy` loss expects a **one-hot-vector** as input, therefore we apply the `to_categorical` function from `keras.utilis` to convert integer labels to **one-hot-vectors**."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"from keras.datasets import mnist\n",
"from keras.utils import np_utils\n",
"\n",
"(X_train, y_train), (X_test, y_test) = mnist.load_data()\n",
"X_train = X_train.reshape(60000, 784)\n",
"X_test = X_test.reshape(10000, 784)\n",
"X_train = X_train.astype(\"float32\")\n",
"X_test = X_test.astype(\"float32\")\n",
"X_train /= 255\n",
"X_test /= 255\n",
"\n",
"# convert class vectors to binary class matrices\n",
"Y_train = np_utils.to_categorical(y_train, 10)\n",
"Y_test = np_utils.to_categorical(y_test, 10)"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"## Training\n",
"Having defined and compiled the model, it can be trained using the `fit` function. We also specify a validation dataset to monitor validation loss and accuracy."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true,
"scrolled": true
},
"outputs": [],
"source": [
"network_history = model.fit(X_train, Y_train, batch_size=128, \n",
" epochs=100, verbose=1, validation_data=(X_test, Y_test))"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"### Plotting Network Performance Trend\n",
"The return value of the `fit` function is a `keras.callbacks.History` object which contains the entire history of training/validation loss and accuracy, for each epoch. We can therefore plot the behaviour of loss and accuracy during the training phase."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false,
"deletable": true,
"editable": true
},
"outputs": [],
"source": [
"import matplotlib.pyplot as plt\n",
"%matplotlib inline\n",
"\n",
"plt.figure()\n",
"plt.xlabel('Epochs')\n",
"plt.ylabel('Loss')\n",
"plt.plot(network_history.history['loss'])\n",
"plt.plot(network_history.history['val_loss'])\n",
"plt.legend(['Training', 'Validation'])\n",
"\n",
"plt.figure()\n",
"plt.xlabel('Epochs')\n",
"plt.ylabel('Accuracy')\n",
"plt.plot(network_history.history['acc'])\n",
"plt.plot(network_history.history['val_acc'])\n",
"plt.legend(['Training', 'Validation'], loc='lower right')"
]
},
{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"After `100` epochs, we get a `98.8%` validation accuracy. \n",
"\n",
"If you continue training, at some point the validation loss will start to increase: that is when the model starts to **overfit**. \n",
"\n",
"It is always necessary to monitor training and validation loss during the training of any kind of Neural Network, either to detect overfitting or to evaluate the behaviour of the model **(any clue on how to do it??)**"
]
},
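{
"cell_type": "markdown",
"metadata": {
"deletable": true,
"editable": true
},
"source": [
"One common approach, sketched below (not necessarily what the solution file does), is to use a Keras [callback](http://keras.io/callbacks/) such as `EarlyStopping`, which watches a monitored quantity (here the validation loss) and stops training once it no longer improves:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"from keras.callbacks import EarlyStopping\n",
"\n",
"# stop training if the validation loss has not improved for 5 epochs\n",
"early_stop = EarlyStopping(monitor='val_loss', patience=5, verbose=1)\n",
"\n",
"# network_history = model.fit(X_train, Y_train, batch_size=128,\n",
"#                             epochs=100, verbose=1,\n",
"#                             validation_data=(X_test, Y_test),\n",
"#                             callbacks=[early_stop])"
]
},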
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"# %load solutions/sol_221_2.py"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.2"
}
},
"nbformat": 4,
"nbformat_minor": 0
}