{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Practical Deep Learning"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"Constructing and training your own ConvNet from scratch can be a hard and time-consuming task.\n",
"\n",
"A common trick used in Deep Learning is to start from a **pre-trained** model and fine-tune it on the specific data it will be used for. "
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"## Famous Models with Keras\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"This notebook contains code and references for the following Keras models (gathered from [https://github.com/fchollet/deep-learning-models](https://github.com/fchollet/deep-learning-models)):\n",
"\n",
"- VGG16\n",
"- VGG19\n",
"- ResNet50\n",
"- Inception v3\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "skip"
}
},
"source": [
"## References\n",
"\n",
"- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) - please cite this paper if you use the VGG models in your work.\n",
"- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - please cite this paper if you use the ResNet model in your work.\n",
"- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567) - please cite this paper if you use the Inception v3 model in your work.\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"All architectures are compatible with both TensorFlow and Theano. Upon instantiation, the models are built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. \n",
"\n",
"For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will be built according to the TensorFlow dimension ordering convention, \"Height-Width-Depth\" (channels last). With `image_dim_ordering=th`, the Theano convention \"Depth-Height-Width\" (channels first) is used instead."
]
},
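{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"The two orderings differ only in where the channel axis sits. A minimal NumPy sketch of converting between them, assuming a random array stands in for a real batch of images (the variable names are illustrative and not part of Keras):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"# Hypothetical batch: 1 image, 3 channels, 224x224, in 'th' (channels-first) layout\n",
"x_th = np.random.rand(1, 3, 224, 224).astype('float32')\n",
"\n",
"# 'th' -> 'tf': move the channel axis to the end\n",
"x_tf = np.transpose(x_th, (0, 2, 3, 1))\n",
"print(x_tf.shape)  # (1, 224, 224, 3)\n",
"\n",
"# 'tf' -> 'th': move the channel axis back to position 1\n",
"x_back = np.transpose(x_tf, (0, 3, 1, 2))\n",
"assert np.array_equal(x_th, x_back)\n",
"```"
]
},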
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "subslide"
}
},
"source": [
"### Keras Configuration File"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "-"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"image_dim_ordering\": \"th\",\r\n",
" \"floatx\": \"float32\",\r\n",
" \"epsilon\": 1e-07,\r\n",
" \"backend\": \"theano\"\r\n",
"}"
]
}
],
"source": [
"!cat ~/.keras/keras.json"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"image_dim_ordering\": \"th\",\r\n",
" \"floatx\": \"float32\",\r\n",
" \"epsilon\": 1e-07,\r\n",
" \"backend\": \"tensorflow\"\r\n",
"}"
]
}
],
"source": [
"!sed -i 's/theano/tensorflow/g' ~/.keras/keras.json\n",
"!cat ~/.keras/keras.json"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using TensorFlow backend.\n"
]
}
],
"source": [
"import keras"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "fragment"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
]
}
],
"source": [
"import theano"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"{\r\n",
" \"image_dim_ordering\": \"th\",\r\n",
" \"backend\": \"theano\",\r\n",
" \"floatx\": \"float32\",\r\n",
" \"epsilon\": 1e-07\r\n",
"}"
]
}
],
"source": [
"!sed -i 's/tensorflow/theano/g' ~/.keras/keras.json\n",
"!cat ~/.keras/keras.json"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using Theano backend.\n",
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
]
}
],
"source": [
"import keras"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using Theano backend.\n",
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
]
}
],
"source": [
"# %load deep_learning_models/imagenet_utils.py\n",
"import numpy as np\n",
"import json\n",
"\n",
"from keras.utils.data_utils import get_file\n",
"from keras import backend as K\n",
"\n",
"CLASS_INDEX = None\n",
"CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'\n",
"\n",
"\n",
"def preprocess_input(x, dim_ordering='default'):\n",
"    if dim_ordering == 'default':\n",
"        dim_ordering = K.image_dim_ordering()\n",
"    assert dim_ordering in {'tf', 'th'}\n",
"\n",
"    if dim_ordering == 'th':\n",
"        # 'RGB'->'BGR' (flip first, so the means below hit the right channels)\n",
"        x = x[:, ::-1, :, :]\n",
"        # Zero-center by the ImageNet mean pixel, in BGR order\n",
"        x[:, 0, :, :] -= 103.939\n",
"        x[:, 1, :, :] -= 116.779\n",
"        x[:, 2, :, :] -= 123.68\n",
"    else:\n",
"        # 'RGB'->'BGR' (flip first, so the means below hit the right channels)\n",
"        x = x[:, :, :, ::-1]\n",
"        # Zero-center by the ImageNet mean pixel, in BGR order\n",
"        x[:, :, :, 0] -= 103.939\n",
"        x[:, :, :, 1] -= 116.779\n",
"        x[:, :, :, 2] -= 123.68\n",
"    return x\n",
"\n",
"\n",
"def decode_predictions(preds):\n",
"    global CLASS_INDEX\n",
"    assert len(preds.shape) == 2 and preds.shape[1] == 1000\n",
"    if CLASS_INDEX is None:\n",
"        fpath = get_file('imagenet_class_index.json',\n",
"                         CLASS_INDEX_PATH,\n",
"                         cache_subdir='models')\n",
"        with open(fpath) as f:\n",
"            CLASS_INDEX = json.load(f)\n",
"    # Keep only the top-1 class for each sample\n",
"    indices = np.argmax(preds, axis=-1)\n",
"    results = []\n",
"    for i in indices:\n",
"        results.append(CLASS_INDEX[str(i)])\n",
"    return results\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": true,
"slideshow": {
"slide_type": "skip"
}
},
"outputs": [],
"source": [
"IMAGENET_FOLDER = 'imgs/imagenet' #in the repo"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# VGG16"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [],
"source": [
"# %load deep_learning_models/vgg16.py\n",
"'''VGG16 model for Keras.\n",
"\n",
"# Reference:\n",
"\n",
"- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n",
"\n",
"'''\n",
"from __future__ import print_function\n",
"\n",
"import numpy as np\n",
"import warnings\n",
"\n",
"from keras.models import Model\n",
"from keras.layers import Flatten, Dense, Input\n",
"from keras.layers import Convolution2D, MaxPooling2D\n",
"from keras.preprocessing import image\n",
"from keras.utils.layer_utils import convert_all_kernels_in_model\n",
"from keras.utils.data_utils import get_file\n",
"from keras import backend as K\n",
"\n",
"TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'\n",
"TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'\n",
"TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'\n",
"TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'\n",
"\n",
"\n",
"def VGG16(include_top=True, weights='imagenet',\n",
"          input_tensor=None):\n",
"    '''Instantiate the VGG16 architecture,\n",
"    optionally loading weights pre-trained\n",
"    on ImageNet. Note that when using TensorFlow,\n",
"    for best performance you should set\n",
"    `image_dim_ordering=\"tf\"` in your Keras config\n",
"    at ~/.keras/keras.json.\n",
"\n",
"    The model and the weights are compatible with both\n",
"    TensorFlow and Theano. The dimension ordering\n",
"    convention used by the model is the one\n",
"    specified in your Keras config file.\n",
"\n",
"    # Arguments\n",
"        include_top: whether to include the 3 fully-connected\n",
"            layers at the top of the network.\n",
"        weights: one of `None` (random initialization)\n",
"            or \"imagenet\" (pre-training on ImageNet).\n",
"        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n",
"            to use as image input for the model.\n",
"\n",
"    # Returns\n",
"        A Keras model instance.\n",
"    '''\n",
"    if weights not in {'imagenet', None}:\n",
"        raise ValueError('The `weights` argument should be either '\n",
"                         '`None` (random initialization) or `imagenet` '\n",
"                         '(pre-training on ImageNet).')\n",
"    # Determine proper input shape\n",
"    if K.image_dim_ordering() == 'th':\n",
"        if include_top:\n",
"            input_shape = (3, 224, 224)\n",
"        else:\n",
"            input_shape = (3, None, None)\n",
"    else:\n",
"        if include_top:\n",
"            input_shape = (224, 224, 3)\n",
"        else:\n",
"            input_shape = (None, None, 3)\n",
"\n",
"    if input_tensor is None:\n",
"        img_input = Input(shape=input_shape)\n",
"    else:\n",
"        if not K.is_keras_tensor(input_tensor):\n",
"            img_input = Input(tensor=input_tensor)\n",
"        else:\n",
"            img_input = input_tensor\n",
"    # Block 1\n",
"    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)\n",
"    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)\n",
"    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)\n",
"\n",
"    # Block 2\n",
"    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)\n",
"    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)\n",
"    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)\n",
"\n",
"    # Block 3\n",
"    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)\n",
"    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)\n",
"    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)\n",
"    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)\n",
"\n",
"    # Block 4\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)\n",
"    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)\n",
"\n",
"    # Block 5\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)\n",
"    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)\n",
"    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)\n",
"\n",
"    if include_top:\n",
"        # Classification block\n",
"        x = Flatten(name='flatten')(x)\n",
"        x = Dense(4096, activation='relu', name='fc1')(x)\n",
"        x = Dense(4096, activation='relu', name='fc2')(x)\n",
"        x = Dense(1000, activation='softmax', name='predictions')(x)\n",
"\n",
"    # Create model\n",
"    model = Model(img_input, x)\n",
"\n",
"    # load weights\n",
"    if weights == 'imagenet':\n",
"        print('K.image_dim_ordering:', K.image_dim_ordering())\n",
"        if K.image_dim_ordering() == 'th':\n",
"            if include_top:\n",
"                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',\n",
"                                        TH_WEIGHTS_PATH,\n",
"                                        cache_subdir='models')\n",
"            else:\n",
"                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',\n",
"                                        TH_WEIGHTS_PATH_NO_TOP,\n",
"                                        cache_subdir='models')\n",
"            model.load_weights(weights_path)\n",
"            if K.backend() == 'tensorflow':\n",
"                warnings.warn('You are using the TensorFlow backend, yet you '\n",
"                              'are using the Theano '\n",
"                              'image dimension ordering convention '\n",
"                              '(`image_dim_ordering=\"th\"`). '\n",
"                              'For best performance, set '\n",
"                              '`image_dim_ordering=\"tf\"` in '\n",
"                              'your Keras config '\n",
"                              'at ~/.keras/keras.json.')\n",
"                convert_all_kernels_in_model(model)\n",
"        else:\n",
"            if include_top:\n",
"                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',\n",
"                                        TF_WEIGHTS_PATH,\n",
"                                        cache_subdir='models')\n",
"            else:\n",
"                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',\n",
"                                        TF_WEIGHTS_PATH_NO_TOP,\n",
"                                        cache_subdir='models')\n",
"            model.load_weights(weights_path)\n",
"            if K.backend() == 'theano':\n",
"                convert_all_kernels_in_model(model)\n",
"    return model"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false,
"slideshow": {
"slide_type": "subslide"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"K.image_dim_ordering: th\n",
"Input image shape: (1, 3, 224, 224)\n",
"Predicted: [['n07745940', 'strawberry']]\n"
]
}
],
"source": [
"import os\n",
"\n",
"model = VGG16(include_top=True, weights='imagenet')\n",
"\n",
"img_path = os.path.join(IMAGENET_FOLDER, 'strawberry_1157.jpeg')\n",
"img = image.load_img(img_path, target_size=(224, 224))\n",
"x = image.img_to_array(img)\n",
"x = np.expand_dims(x, axis=0)\n",
"x = preprocess_input(x)\n",
"print('Input image shape:', x.shape)\n",
"\n",
"preds = model.predict(x)\n",
"print('Predicted:', decode_predictions(preds))"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Fine Tuning of a Pre-Trained Model"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```python\n",
"import os\n",
"import h5py\n",
"\n",
"from keras import optimizers\n",
"from keras.models import Sequential\n",
"from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D\n",
"from keras.layers import Flatten, Dense, Dropout\n",
"\n",
"\n",
"def VGG16_FT(weights_path=None,\n",
"             img_width=224, img_height=224,\n",
"             f_type=None, n_labels=None):\n",
"\n",
"    \"\"\"Fine Tuning of a VGG16 based Net\"\"\"\n",
"\n",
"    # VGG16 up to the layer before the last.\n",
"    # The input shape is channels-first ('th' dimension ordering).\n",
"    model = Sequential()\n",
"    model.add(ZeroPadding2D((1, 1),\n",
"                            input_shape=(3,\n",
"                                         img_width, img_height)))\n",
"\n",
"    model.add(Convolution2D(64, 3, 3, activation='relu',\n",
"                            name='conv1_1'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(64, 3, 3, activation='relu',\n",
"                            name='conv1_2'))\n",
"    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
"\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(128, 3, 3, activation='relu',\n",
"                            name='conv2_1'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(128, 3, 3, activation='relu',\n",
"                            name='conv2_2'))\n",
"    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
"\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(256, 3, 3, activation='relu',\n",
"                            name='conv3_1'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(256, 3, 3, activation='relu',\n",
"                            name='conv3_2'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(256, 3, 3, activation='relu',\n",
"                            name='conv3_3'))\n",
"    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
"\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv4_1'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv4_2'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv4_3'))\n",
"    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
"\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv5_1'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv5_2'))\n",
"    model.add(ZeroPadding2D((1, 1)))\n",
"    model.add(Convolution2D(512, 3, 3, activation='relu',\n",
"                            name='conv5_3'))\n",
"    model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
"    model.add(Flatten())\n",
"\n",
"    # Plugging in the new layers\n",
"    model.add(Dense(768, activation='sigmoid'))\n",
"    model.add(Dropout(0.0))\n",
"    model.add(Dense(768, activation='sigmoid'))\n",
"    model.add(Dropout(0.0))\n",
"\n",
"    # Softmax output, to match the categorical cross-entropy loss\n",
"    last_layer = Dense(n_labels, activation='softmax')\n",
"    loss = 'categorical_crossentropy'\n",
"    optimizer = optimizers.Adam(lr=1e-4, epsilon=1e-08)\n",
"    batch_size = 128\n",
"\n",
"    assert os.path.exists(weights_path), 'Model weights not found (see \"weights_path\" variable in script).'\n",
"    # model.load_weights(weights_path)\n",
"    # Copy the stored weights layer by layer, stopping once the\n",
"    # layers built so far have all been filled.\n",
"    f = h5py.File(weights_path, 'r')\n",
"    for k in range(len(f.attrs['layer_names'])):\n",
"        g = f[f.attrs['layer_names'][k]]\n",
"        weights = [g[g.attrs['weight_names'][p]]\n",
"                   for p in range(len(g.attrs['weight_names']))]\n",
"        if k >= len(model.layers):\n",
"            break\n",
"        else:\n",
"            model.layers[k].set_weights(weights)\n",
"    f.close()\n",
"    print('Model loaded.')\n",
"\n",
"    model.add(last_layer)\n",
"\n",
"    # set the first 25 layers (up to the last conv block)\n",
"    # to non-trainable (weights will not be updated)\n",
"    for layer in model.layers[:25]:\n",
"        layer.trainable = False\n",
"\n",
"    # compile the model with the Adam optimizer\n",
"    # and a low learning rate.\n",
"    model.compile(loss=loss,\n",
"                  optimizer=optimizer,\n",
"                  metrics=['accuracy'])\n",
"    return model, batch_size\n",
"\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "slide"
}
},
"source": [
"# Hands On:\n",
"\n",
"### Try to do the same with other models "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%load deep_learning_models/vgg19.py"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"%load deep_learning_models/resnet50.py"
]
}
],
"metadata": {
"celltoolbar": "Slideshow",
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}