759
deep-learning/keras-tutorial/0. Preamble.ipynb
Normal file
|
@ -0,0 +1,759 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "-"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<div>\n",
|
||||
" <h1 style=\"text-align: center;\">Deep Learning with Keras</h1>\n",
|
||||
" <img style=\"text-align: left\" src=\"imgs/keras-logo-small.jpg\" width=\"10%\" />\n",
|
||||
"<div>\n",
|
||||
"\n",
|
||||
"<div>\n",
|
||||
" <h2 style=\"text-align: center;\">Tutorial @ EuroScipy 2016</h2>\n",
|
||||
" <img style=\"text-align: left\" src=\"imgs/euroscipy_2016_logo.png\" width=\"40%\" />\n",
|
||||
"</div> "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### Yam Peleg, Valerio Maggio"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "-"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Goal of this Tutorial\n",
|
||||
"\n",
|
||||
"- **Introduce** main features of Keras\n",
|
||||
"- **Learn** how simple and Pythonic Deep Learning is with Keras\n",
|
||||
"- **Understand** how easy it is to build basic and *advanced* DL models in Keras\n",
|
||||
" - **Examples and Hands-on Exercises** along the way."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Source\n",
|
||||
"\n",
|
||||
"https://github.com/leriomaggio/deep-learning-keras-euroscipy2016/"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"source": [
|
||||
"# (Tentative) Schedule \n",
|
||||
"\n",
|
||||
"## Attention: Spoilers Warning!\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"- **Setup** (`10 mins`)\n",
|
||||
"\n",
|
||||
"- **Part I**: **Introduction** (`~65 mins`)\n",
|
||||
"\n",
|
||||
" - Intro to ANN (`~20 mins`)\n",
|
||||
" - naive pure-Python implementation\n",
|
||||
" - fast forward, sgd, backprop\n",
|
||||
" \n",
|
||||
" - Intro to Theano (`15 mins`)\n",
|
||||
" - Model + SGD with Theano\n",
|
||||
" \n",
|
||||
" - Introduction to Keras (`30 mins`)\n",
|
||||
" - Overview and main features\n",
|
||||
" - Theano backend\n",
|
||||
" - Tensorflow backend\n",
|
||||
" - Multi-Layer Perceptron and Fully Connected\n",
|
||||
" - Examples with `keras.models.Sequential` and `Dense`\n",
|
||||
" - HandsOn: MLP with keras\n",
|
||||
" \n",
|
||||
"- **Coffee Break** (`30 mins`)\n",
|
||||
"\n",
|
||||
"- **Part II**: **Supervised Learning and Convolutional Neural Nets** (`~45 mins`)\n",
|
||||
" \n",
|
||||
" - Intro: Focus on Image Classification (`5 mins`)\n",
|
||||
"\n",
|
||||
" - Intro to CNN (`25 mins`)\n",
|
||||
" - meaning of convolutional filters\n",
|
||||
" - examples from ImageNet \n",
|
||||
" - Meaning of dimensions of Conv filters (through an example of a ConvNet) \n",
|
||||
" - Visualising ConvNets\n",
|
||||
" - HandsOn: ConvNet with keras \n",
|
||||
"\n",
|
||||
" - Advanced CNN (`10 mins`)\n",
|
||||
" - Dropout\n",
|
||||
" - MaxPooling\n",
|
||||
" - Batch Normalisation\n",
|
||||
" \n",
|
||||
" - Famous Models in Keras (likely moved somewhere else) (`10 mins`)\n",
|
||||
" (ref: https://github.com/fchollet/deep-learning-models)\n",
|
||||
" - VGG16\n",
|
||||
" - VGG19\n",
|
||||
" - ResNet50\n",
|
||||
" - Inception v3\n",
|
||||
" - HandsOn: Fine tuning a network on new dataset \n",
|
||||
" \n",
|
||||
"- **Part III**: **Unsupervised Learning** (`10 mins`)\n",
|
||||
"\n",
|
||||
" - AutoEncoders (`5 mins`)\n",
|
||||
" - word2vec & doc2vec (gensim) & `keras.datasets` (`5 mins`)\n",
|
||||
" - `Embedding`\n",
|
||||
" - word2vec and CNN\n",
|
||||
" - Exercises\n",
|
||||
"\n",
|
||||
"- **Part IV**: **Advanced Materials** (`20 mins`)\n",
|
||||
" - RNN and LSTM (`10 mins`)\n",
|
||||
" - RNN, LSTM, GRU \n",
|
||||
" - Example of RNN and LSTM with Text (`~10 mins`) -- *Tentative*\n",
|
||||
" - HandsOn: IMDB\n",
|
||||
"\n",
|
||||
"- **Wrap up and Conclusions** (`5 mins`)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Requirements"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This tutorial requires the following packages:\n",
|
||||
"\n",
|
||||
"- Python version 3.4+ \n",
|
||||
" - likely Python 2.7 would be fine, but *who knows*? :P\n",
|
||||
"- `numpy` version 1.10 or later: http://www.numpy.org/\n",
|
||||
"- `scipy` version 0.16 or later: http://www.scipy.org/\n",
|
||||
"- `matplotlib` version 1.4 or later: http://matplotlib.org/\n",
|
||||
"- `pandas` version 0.16 or later: http://pandas.pydata.org\n",
|
||||
"- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n",
|
||||
"- `keras` version 1.0 or later: http://keras.io\n",
|
||||
"- `theano` version 0.8 or later: http://deeplearning.net/software/theano/\n",
|
||||
"- `ipython`/`jupyter` version 4.0 or later, with notebook support\n",
|
||||
"\n",
|
||||
"(Optional but recommended):\n",
|
||||
"\n",
|
||||
"- `pyyaml`\n",
|
||||
"- `hdf5` and `h5py` (required if you use model saving/loading functions in keras)\n",
|
||||
"- **NVIDIA cuDNN** if you have NVIDIA GPUs on your machines.\n",
|
||||
" [https://developer.nvidia.com/rdp/cudnn-download]()\n",
|
||||
"\n",
|
||||
"The easiest way to get (most of) these is to use an all-in-one installer such as [Anaconda](http://www.continuum.io/downloads) from Continuum. These are available for multiple architectures."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Python Version"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"I'm currently running this tutorial with **Python 3** on **Anaconda**"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Python 3.5.2\r\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!python --version"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# How to set up your environment"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The quickest and simplest way to set up the environment is to use the [conda](https://store.continuum.io) environment manager. \n",
|
||||
"\n",
|
||||
"We provide in the materials a `deep-learning.yml` file that is complete and **ready to use** to set up your virtual environment with conda."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"scrolled": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"name: deep-learning\r\n",
|
||||
"channels:\r\n",
|
||||
"- conda-forge\r\n",
|
||||
"- defaults\r\n",
|
||||
"dependencies:\r\n",
|
||||
"- accelerate=2.3.0=np111py35_3\r\n",
|
||||
"- accelerate_cudalib=2.0=0\r\n",
|
||||
"- bokeh=0.12.1=py35_0\r\n",
|
||||
"- cffi=1.6.0=py35_0\r\n",
|
||||
"- backports.shutil_get_terminal_size=1.0.0=py35_0\r\n",
|
||||
"- blas=1.1=openblas\r\n",
|
||||
"- ca-certificates=2016.8.2=3\r\n",
|
||||
"- cairo=1.12.18=8\r\n",
|
||||
"- certifi=2016.8.2=py35_0\r\n",
|
||||
"- cycler=0.10.0=py35_0\r\n",
|
||||
"- cython=0.24.1=py35_0\r\n",
|
||||
"- decorator=4.0.10=py35_0\r\n",
|
||||
"- entrypoints=0.2.2=py35_0\r\n",
|
||||
"- fontconfig=2.11.1=3\r\n",
|
||||
"- freetype=2.6.3=1\r\n",
|
||||
"- gettext=0.19.7=1\r\n",
|
||||
"- glib=2.48.0=4\r\n",
|
||||
"- h5py=2.6.0=np111py35_6\r\n",
|
||||
"- harfbuzz=1.0.6=0\r\n",
|
||||
"- hdf5=1.8.17=2\r\n",
|
||||
"- icu=56.1=4\r\n",
|
||||
"- ipykernel=4.3.1=py35_1\r\n",
|
||||
"- ipython=5.1.0=py35_0\r\n",
|
||||
"- ipywidgets=5.2.2=py35_0\r\n",
|
||||
"- jinja2=2.8=py35_1\r\n",
|
||||
"- jpeg=9b=0\r\n",
|
||||
"- jsonschema=2.5.1=py35_0\r\n",
|
||||
"- jupyter_client=4.3.0=py35_0\r\n",
|
||||
"- jupyter_console=5.0.0=py35_0\r\n",
|
||||
"- jupyter_core=4.1.1=py35_1\r\n",
|
||||
"- libffi=3.2.1=2\r\n",
|
||||
"- libiconv=1.14=3\r\n",
|
||||
"- libpng=1.6.24=0\r\n",
|
||||
"- libsodium=1.0.10=0\r\n",
|
||||
"- libtiff=4.0.6=6\r\n",
|
||||
"- libxml2=2.9.4=0\r\n",
|
||||
"- markupsafe=0.23=py35_0\r\n",
|
||||
"- matplotlib=1.5.2=np111py35_6\r\n",
|
||||
"- mistune=0.7.3=py35_0\r\n",
|
||||
"- nbconvert=4.2.0=py35_0\r\n",
|
||||
"- nbformat=4.0.1=py35_0\r\n",
|
||||
"- ncurses=5.9=8\r\n",
|
||||
"- nose=1.3.7=py35_1\r\n",
|
||||
"- notebook=4.2.2=py35_0\r\n",
|
||||
"- numpy=1.11.1=py35_blas_openblas_201\r\n",
|
||||
"- openblas=0.2.18=4\r\n",
|
||||
"- openssl=1.0.2h=2\r\n",
|
||||
"- pandas=0.18.1=np111py35_1\r\n",
|
||||
"- pango=1.40.1=0\r\n",
|
||||
"- path.py=8.2.1=py35_0\r\n",
|
||||
"- pcre=8.38=1\r\n",
|
||||
"- pexpect=4.2.0=py35_1\r\n",
|
||||
"- pickleshare=0.7.3=py35_0\r\n",
|
||||
"- pip=8.1.2=py35_0\r\n",
|
||||
"- pixman=0.32.6=0\r\n",
|
||||
"- prompt_toolkit=1.0.6=py35_0\r\n",
|
||||
"- protobuf=3.0.0b3=py35_1\r\n",
|
||||
"- ptyprocess=0.5.1=py35_0\r\n",
|
||||
"- pygments=2.1.3=py35_1\r\n",
|
||||
"- pyparsing=2.1.7=py35_0\r\n",
|
||||
"- python=3.5.2=2\r\n",
|
||||
"- python-dateutil=2.5.3=py35_0\r\n",
|
||||
"- pytz=2016.6.1=py35_0\r\n",
|
||||
"- pyyaml=3.11=py35_0\r\n",
|
||||
"- pyzmq=15.4.0=py35_0\r\n",
|
||||
"- qt=4.8.7=0\r\n",
|
||||
"- qtconsole=4.2.1=py35_0\r\n",
|
||||
"- readline=6.2=0\r\n",
|
||||
"- requests=2.11.0=py35_0\r\n",
|
||||
"- scikit-learn=0.17.1=np111py35_blas_openblas_201\r\n",
|
||||
"- scipy=0.18.0=np111py35_blas_openblas_201\r\n",
|
||||
"- setuptools=25.1.6=py35_0\r\n",
|
||||
"- simplegeneric=0.8.1=py35_0\r\n",
|
||||
"- sip=4.18=py35_0\r\n",
|
||||
"- six=1.10.0=py35_0\r\n",
|
||||
"- sqlite=3.13.0=1\r\n",
|
||||
"- terminado=0.6=py35_0\r\n",
|
||||
"- tk=8.5.19=0\r\n",
|
||||
"- tornado=4.4.1=py35_1\r\n",
|
||||
"- traitlets=4.2.2=py35_0\r\n",
|
||||
"- wcwidth=0.1.7=py35_0\r\n",
|
||||
"- wheel=0.29.0=py35_0\r\n",
|
||||
"- widgetsnbextension=1.2.6=py35_3\r\n",
|
||||
"- xz=5.2.2=0\r\n",
|
||||
"- yaml=0.1.6=0\r\n",
|
||||
"- zeromq=4.1.5=0\r\n",
|
||||
"- zlib=1.2.8=3\r\n",
|
||||
"- cudatoolkit=7.5=0\r\n",
|
||||
"- ipython_genutils=0.1.0=py35_0\r\n",
|
||||
"- jupyter=1.0.0=py35_3\r\n",
|
||||
"- libgfortran=3.0.0=1\r\n",
|
||||
"- llvmlite=0.11.0=py35_0\r\n",
|
||||
"- mkl=11.3.3=0\r\n",
|
||||
"- mkl-service=1.1.2=py35_2\r\n",
|
||||
"- numba=0.26.0=np111py35_0\r\n",
|
||||
"- pycparser=2.14=py35_1\r\n",
|
||||
"- pyqt=4.11.4=py35_4\r\n",
|
||||
"- snakeviz=0.4.1=py35_0\r\n",
|
||||
"- pip:\r\n",
|
||||
" - backports.shutil-get-terminal-size==1.0.0\r\n",
|
||||
" - certifi==2016.8.2\r\n",
|
||||
" - cycler==0.10.0\r\n",
|
||||
" - cython==0.24.1\r\n",
|
||||
" - decorator==4.0.10\r\n",
|
||||
" - h5py==2.6.0\r\n",
|
||||
" - ipykernel==4.3.1\r\n",
|
||||
" - ipython==5.1.0\r\n",
|
||||
" - ipython-genutils==0.1.0\r\n",
|
||||
" - ipywidgets==5.2.2\r\n",
|
||||
" - jinja2==2.8\r\n",
|
||||
" - jsonschema==2.5.1\r\n",
|
||||
" - jupyter-client==4.3.0\r\n",
|
||||
" - jupyter-console==5.0.0\r\n",
|
||||
" - jupyter-core==4.1.1\r\n",
|
||||
" - keras==1.0.7\r\n",
|
||||
" - mako==1.0.4\r\n",
|
||||
" - markupsafe==0.23\r\n",
|
||||
" - matplotlib==1.5.2\r\n",
|
||||
" - mistune==0.7.3\r\n",
|
||||
" - nbconvert==4.2.0\r\n",
|
||||
" - nbformat==4.0.1\r\n",
|
||||
" - nose==1.3.7\r\n",
|
||||
" - notebook==4.2.2\r\n",
|
||||
" - numpy==1.11.1\r\n",
|
||||
" - pandas==0.18.1\r\n",
|
||||
" - path.py==8.2.1\r\n",
|
||||
" - pexpect==4.2.0\r\n",
|
||||
" - pickleshare==0.7.3\r\n",
|
||||
" - pip==8.1.2\r\n",
|
||||
" - prompt-toolkit==1.0.6\r\n",
|
||||
" - protobuf==3.0.0b2\r\n",
|
||||
" - ptyprocess==0.5.1\r\n",
|
||||
" - pygments==2.1.3\r\n",
|
||||
" - pygpu==0.2.1\r\n",
|
||||
" - pyparsing==2.1.7\r\n",
|
||||
" - python-dateutil==2.5.3\r\n",
|
||||
" - pytz==2016.6.1\r\n",
|
||||
" - pyyaml==3.11\r\n",
|
||||
" - pyzmq==15.4.0\r\n",
|
||||
" - qtconsole==4.2.1\r\n",
|
||||
" - requests==2.11.0\r\n",
|
||||
" - scikit-learn==0.17.1\r\n",
|
||||
" - scipy==0.18.0\r\n",
|
||||
" - setuptools==25.1.4\r\n",
|
||||
" - simplegeneric==0.8.1\r\n",
|
||||
" - six==1.10.0\r\n",
|
||||
" - tensorflow==0.10.0rc0\r\n",
|
||||
" - terminado==0.6\r\n",
|
||||
" - theano==0.8.2\r\n",
|
||||
" - tornado==4.4.1\r\n",
|
||||
" - traitlets==4.2.2\r\n",
|
||||
" - wcwidth==0.1.7\r\n",
|
||||
" - wheel==0.29.0\r\n",
|
||||
" - widgetsnbextension==1.2.6\r\n",
|
||||
"prefix: /home/valerio/anaconda3/envs/deep-learning\r\n",
|
||||
"\r\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!cat deep-learning.yml"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Recreate the Conda Environment"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### A. Create the Environment\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"conda env create -f deep-learning.yml # this file is for Linux channels.\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"If you're using a **Mac OSX**, we also provided in the repo the conda file \n",
|
||||
"that is compatible with `osx-channels`:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"conda env create -f deep-learning-osx.yml # this file is for OSX channels.\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"#### B. Activate the new `deep-learning` Environment\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"source activate deep-learning\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Optional Steps"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 1. Enabling Conda-Forge"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"It is strongly suggested to enable [**conda forge**](https://conda-forge.github.io/) in your Anaconda installation.\n",
|
||||
"\n",
|
||||
"**Conda-Forge** is a GitHub organisation containing repositories of conda recipes.\n",
|
||||
"\n",
|
||||
"To add `conda-forge` as an additional Anaconda channel, just type:\n",
|
||||
"\n",
|
||||
"```shell\n",
|
||||
"conda config --add channels conda-forge\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 2. Configure Theano\n",
|
||||
"\n",
|
||||
"1) Create the `theanorc` file:\n",
|
||||
"\n",
|
||||
"```shell\n",
|
||||
"touch $HOME/.theanorc\n",
|
||||
"```\n",
|
||||
"\n",
|
||||
"2) Copy the following content into the file:\n",
|
||||
"\n",
|
||||
"```\n",
|
||||
"[global]\n",
|
||||
"floatX = float32\n",
|
||||
"device = gpu # switch to cpu if no GPU is available on your machine\n",
|
||||
"\n",
|
||||
"[nvcc]\n",
|
||||
"fastmath = True\n",
|
||||
"\n",
|
||||
"[lib]\n",
|
||||
"cnmem=.90\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**More in the [Theano documentation](http://theano.readthedocs.io/en/latest/library/config.html)**"
|
||||
]
|
||||
},
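|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"As a quick sanity check (a minimal sketch, assuming Theano is already installed), you can verify that the settings from `.theanorc` were picked up:\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"import theano\n",
|
||||
"\n",
|
||||
"# these should mirror your .theanorc settings\n",
|
||||
"print(theano.config.device)   # e.g. 'gpu' or 'cpu'\n",
|
||||
"print(theano.config.floatX)   # e.g. 'float32'\n",
|
||||
"```"
|
||||
]
|
||||
},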
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### 3. Installing TensorFlow as a backend "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"```shell\n",
|
||||
"# Ubuntu/Linux 64-bit, GPU enabled, Python 3.5\n",
|
||||
"# Requires CUDA toolkit 7.5 and CuDNN v4. For other versions, see \"Install from sources\" below.\n",
|
||||
"export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl\n",
|
||||
"\n",
|
||||
"pip install --ignore-installed --upgrade $TF_BINARY_URL\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"**More in the [TensorFlow documentation](https://www.tensorflow.org/versions/r0.10/get_started/os_setup.html)**"
|
||||
]
|
||||
},
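|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"To actually switch Keras from Theano to the freshly installed TensorFlow backend, you can either edit the `backend` field in `~/.keras/keras.json` or set the `KERAS_BACKEND` environment variable. A minimal sketch of the latter:\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"import os\n",
|
||||
"\n",
|
||||
"# must be set *before* keras is imported for the first time\n",
|
||||
"os.environ['KERAS_BACKEND'] = 'tensorflow'\n",
|
||||
"import keras  # should report: Using TensorFlow backend.\n",
|
||||
"```"
|
||||
]
|
||||
},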
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Test if everything is up and running"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 1. Check import"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import numpy as np\n",
|
||||
"import scipy as sp\n",
|
||||
"import pandas as pd\n",
|
||||
"import matplotlib.pyplot as plt\n",
|
||||
"import sklearn"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n",
|
||||
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import keras"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## 2. Check Installed Versions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"numpy: 1.11.1\n",
|
||||
"scipy: 0.18.0\n",
|
||||
"matplotlib: 1.5.2\n",
|
||||
"iPython: 5.1.0\n",
|
||||
"scikit-learn: 0.17.1\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import numpy\n",
|
||||
"print('numpy:', numpy.__version__)\n",
|
||||
"\n",
|
||||
"import scipy\n",
|
||||
"print('scipy:', scipy.__version__)\n",
|
||||
"\n",
|
||||
"import matplotlib\n",
|
||||
"print('matplotlib:', matplotlib.__version__)\n",
|
||||
"\n",
|
||||
"import IPython\n",
|
||||
"print('iPython:', IPython.__version__)\n",
|
||||
"\n",
|
||||
"import sklearn\n",
|
||||
"print('scikit-learn:', sklearn.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"keras: 1.0.7\n",
|
||||
"Theano: 0.8.2\n",
|
||||
"Tensorflow: 0.10.0rc0\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import keras\n",
|
||||
"print('keras: ', keras.__version__)\n",
|
||||
"\n",
|
||||
"import theano\n",
|
||||
"print('Theano: ', theano.__version__)\n",
|
||||
"\n",
|
||||
"# optional\n",
|
||||
"import tensorflow as tf\n",
|
||||
"print('Tensorflow: ', tf.__version__)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<br>\n",
|
||||
" <h1 style=\"text-align: center;\">If everything worked up to here, you're ready to start!</h1>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Consulting Material"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"You have two options to go through the material presented in this tutorial:\n",
|
||||
"\n",
|
||||
"* Read (and execute) the material as **iPython/Jupyter** notebooks\n",
|
||||
"* (just) read the material as (HTML) slides"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In the first case, all you need to do is execute `ipython notebook` (or `jupyter notebook`), depending on the version of `iPython` installed on your machine\n",
|
||||
"\n",
|
||||
"(the `jupyter` command works if you have `iPython 4.0.x` installed)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In the second case, you may simply convert the provided notebooks to `HTML` slides and view them in your browser\n",
|
||||
"thanks to `nbconvert`.\n",
|
||||
"\n",
|
||||
"Thus, move to the folder where notebooks are stored and execute the following command:\n",
|
||||
"\n",
|
||||
" jupyter nbconvert --to slides ./*.ipynb --post serve\n",
|
||||
" \n",
|
||||
" \n",
|
||||
"(Please substitute `jupyter` with `ipython` in the previous command if you have `iPython 3.x` installed on your machine)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## In case..."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"...you want to do **both** (interactive and executable slides), I highly suggest installing the terrific `RISE` ipython notebook extension: [https://github.com/damianavila/RISE](https://github.com/damianavila/RISE)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
1018
deep-learning/keras-tutorial/1.2 Introduction - Theano.ipynb
Normal file
634
deep-learning/keras-tutorial/1.3 Introduction - Keras.ipynb
Normal file
|
@ -0,0 +1,634 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%matplotlib inline"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import numpy as np\n",
|
||||
"import pandas as pd\n",
|
||||
"import theano\n",
|
||||
"import theano.tensor as T\n",
|
||||
"import matplotlib.pyplot as plt\n",
|
||||
"import keras \n",
|
||||
"from sklearn.preprocessing import StandardScaler\n",
|
||||
"from sklearn.preprocessing import LabelEncoder \n",
|
||||
"from keras.utils import np_utils\n",
|
||||
"from sklearn.cross_validation import train_test_split\n",
|
||||
"from keras.callbacks import EarlyStopping, ModelCheckpoint\n",
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers import Dense, Activation"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"For this section we will use the Kaggle Otto Group challenge.\n",
|
||||
"If you want to follow along, get the data from Kaggle: https://www.kaggle.com/c/otto-group-product-classification-challenge/data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### About the data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The Otto Group is one of the world’s biggest e-commerce companies. A consistent analysis of the performance of products is crucial. However, due to diverse global infrastructure, many identical products get classified differently.\n",
|
||||
"For this competition, we have provided a dataset with 93 features for more than 200,000 products. The objective is to build a predictive model which is able to distinguish between our main product categories. \n",
|
||||
"Each row corresponds to a single product. There are a total of 93 numerical features, which represent counts of different events. All features have been obfuscated and will not be defined any further.\n",
|
||||
"\n",
|
||||
"https://www.kaggle.com/c/otto-group-product-classification-challenge/data"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def load_data(path, train=True):\n",
|
||||
" \"\"\"Load data from a CSV File\n",
|
||||
" \n",
|
||||
" Parameters\n",
|
||||
" ----------\n",
|
||||
" path: str\n",
|
||||
" The path to the CSV file\n",
|
||||
" \n",
|
||||
" train: bool (default True)\n",
|
||||
" Decide whether or not data are *training data*.\n",
|
||||
" If True, some random shuffling is applied.\n",
|
||||
" \n",
|
||||
" Return\n",
|
||||
" ------\n",
|
||||
" X: numpy.ndarray \n",
|
||||
" The data as a multi dimensional array of floats\n",
|
||||
" ids: numpy.ndarray\n",
|
||||
" A vector of ids (test data) or class labels (training data) for each sample\n",
|
||||
" \"\"\"\n",
|
||||
" df = pd.read_csv(path)\n",
|
||||
" X = df.values.copy()\n",
|
||||
" if train:\n",
|
||||
" np.random.shuffle(X) # https://youtu.be/uyUXoap67N8\n",
|
||||
" X, labels = X[:, 1:-1].astype(np.float32), X[:, -1]\n",
|
||||
" return X, labels\n",
|
||||
" else:\n",
|
||||
" X, ids = X[:, 1:].astype(np.float32), X[:, 0].astype(str)\n",
|
||||
" return X, ids"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def preprocess_data(X, scaler=None):\n",
|
||||
" \"\"\"Preprocess input data by standardising features \n",
|
||||
" by removing the mean and scaling to unit variance\"\"\"\n",
|
||||
" if not scaler:\n",
|
||||
" scaler = StandardScaler()\n",
|
||||
" scaler.fit(X)\n",
|
||||
" X = scaler.transform(X)\n",
|
||||
" return X, scaler\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def preprocess_labels(labels, encoder=None, categorical=True):\n",
|
||||
" \"\"\"Encode labels with values between 0 and `n_classes - 1`\"\"\"\n",
|
||||
" if not encoder:\n",
|
||||
" encoder = LabelEncoder()\n",
|
||||
" encoder.fit(labels)\n",
|
||||
" y = encoder.transform(labels).astype(np.int32)\n",
|
||||
" if categorical:\n",
|
||||
" y = np_utils.to_categorical(y)\n",
|
||||
" return y, encoder"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Loading data...\n",
|
||||
"[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 3.\n",
|
||||
" 2. 1. 0. 0. 0. 0. 0. 0. 0. 5. 3. 1. 1. 0.\n",
|
||||
" 0. 0. 0. 0. 1. 0. 0. 1. 0. 1. 0. 1. 0. 0.\n",
|
||||
" 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
|
||||
" 0. 0. 0. 0. 0. 0. 0. 3. 0. 0. 0. 0. 1. 1.\n",
|
||||
" 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.\n",
|
||||
" 0. 11. 1. 20. 0. 0. 0. 0. 0.]]\n",
|
||||
"(9L, 'classes')\n",
|
||||
"(93L, 'dims')\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(\"Loading data...\")\n",
|
||||
"X, labels = load_data('train.csv', train=True)\n",
|
||||
"X, scaler = preprocess_data(X)\n",
|
||||
"Y, encoder = preprocess_labels(labels)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"X_test, ids = load_data('test.csv', train=False)\n",
|
||||
"X_test, ids = X_test[:1000], ids[:1000]\n",
|
||||
"\n",
|
||||
"# Peek at the first raw test sample\n",
|
||||
"print(X_test[:1])\n",
|
||||
"\n",
|
||||
"X_test, _ = preprocess_data(X_test, scaler)\n",
|
||||
"\n",
|
||||
"nb_classes = Y.shape[1]\n",
|
||||
"print(nb_classes, 'classes')\n",
|
||||
"\n",
|
||||
"dims = X.shape[1]\n",
|
||||
"print(dims, 'dims')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now let's create and train a logistic regression model."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"---"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Keras\n",
|
||||
"\n",
|
||||
"## Deep Learning library for Theano and TensorFlow"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Keras is a minimalist, highly modular neural networks library, written in Python and capable of running on top of either TensorFlow or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research.\n",
|
||||
"ref: https://keras.io/"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Why this name, Keras?\n",
|
||||
"\n",
|
||||
"Keras (κέρας) means _horn_ in Greek. It is a reference to a literary image from ancient Greek and Latin literature, first found in the _Odyssey_, where dream spirits (_Oneiroi_, singular _Oneiros_) are divided between those who deceive men with false visions, who arrive to Earth through a gate of ivory, and those who announce a future that will come to pass, who arrive through a gate of horn. It's a play on the words κέρας (horn) / κραίνω (fulfill), and ἐλέφας (ivory) / ἐλεφαίρομαι (deceive).\n",
|
||||
"\n",
|
||||
"Keras was initially developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System).\n",
|
||||
"\n",
|
||||
">_\"Oneiroi are beyond our unravelling --who can be sure what tale they tell? Not all that men look for comes to pass. Two gates there are that give passage to fleeting Oneiroi; one is made of horn, one of ivory. The Oneiroi that pass through sawn ivory are deceitful, bearing a message that will not be fulfilled; those that come out through polished horn have truth behind them, to be accomplished for men who see them.\"_ Homer, Odyssey 19. 562 ff (Shewring translation)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Hands On - Keras Logistic Regression\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"(93L, 'dims')\n",
|
||||
"Building model...\n",
|
||||
"(9L, 'classes')\n",
|
||||
"Epoch 1/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 1.0574 \n",
|
||||
"Epoch 2/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.7730 \n",
|
||||
"Epoch 3/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.7297 \n",
|
||||
"Epoch 4/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.7080 \n",
|
||||
"Epoch 5/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6948 \n",
|
||||
"Epoch 6/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6854 \n",
|
||||
"Epoch 7/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6787 \n",
|
||||
"Epoch 8/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6734 \n",
|
||||
"Epoch 9/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6691 \n",
|
||||
"Epoch 10/10\n",
|
||||
"61878/61878 [==============================] - 1s - loss: 0.6657 \n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<keras.callbacks.History at 0x23d330f0>"
|
||||
]
|
||||
},
|
||||
"execution_count": 6,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"dims = X.shape[1]\n",
|
||||
"print(dims, 'dims')\n",
|
||||
"print(\"Building model...\")\n",
|
||||
"\n",
|
||||
"nb_classes = Y.shape[1]\n",
|
||||
"print(nb_classes, 'classes')\n",
|
||||
"\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Dense(nb_classes, input_shape=(dims,)))\n",
|
||||
"model.add(Activation('softmax'))\n",
|
||||
"model.compile(optimizer='sgd', loss='categorical_crossentropy')\n",
|
||||
"model.fit(X, Y)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Simplicity is pretty impressive, right? :)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Now let's understand what happened:\n",
|
||||
"<pre>The core data structure of Keras is a <b>model</b>, a way to organize layers. The main type of model is the <b>Sequential</b> model, a linear stack of layers.</pre>\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"What we did here is stack a Fully Connected (<b>Dense</b>) layer of trainable weights going from the input to the output, with an <b>Activation</b> layer on top."
|
||||
]
|
||||
},
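|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The `softmax` activation turns the raw outputs into a probability distribution over the $K$ classes,\n",
|
||||
"\n",
|
||||
"$$\\text{softmax}(z)_j = \\frac{e^{z_j}}{\\sum_{k=1}^{K} e^{z_k}},$$\n",
|
||||
"\n",
|
||||
"and the `categorical_crossentropy` loss being minimised compares these predicted probabilities $\\hat{y}_{ij}$ against the one-hot targets $y_{ij}$:\n",
|
||||
"\n",
|
||||
"$$L = -\\frac{1}{N}\\sum_{i=1}^{N} \\sum_{j=1}^{K} y_{ij} \\log \\hat{y}_{ij}$$"
|
||||
]
|
||||
},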
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### Dense"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"source": [
|
||||
"```python\n",
|
||||
"from keras.layers.core import Dense\n",
|
||||
"\n",
|
||||
"Dense(output_dim, init='glorot_uniform', activation='linear', \n",
|
||||
" weights=None, W_regularizer=None, b_regularizer=None,\n",
|
||||
" activity_regularizer=None, W_constraint=None, \n",
|
||||
" b_constraint=None, bias=True, input_dim=None)\n",
|
||||
"```"
|
||||
]
|
||||
},
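|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In the model above only the first and last of these arguments were used; a sketch of the equivalent, fully explicit call (keyword names as in the Keras 1.x signature shown above):\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# equivalent to: model.add(Dense(nb_classes, input_shape=(dims,)))\n",
|
||||
"Dense(output_dim=nb_classes, input_dim=dims,\n",
|
||||
"      init='glorot_uniform', activation='linear')\n",
|
||||
"```"
|
||||
]
|
||||
},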
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### Activation"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"source": [
|
||||
"```python\n",
|
||||
"from keras.layers.core import Activation\n",
|
||||
"\n",
|
||||
"Activation(activation)\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### Optimizer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"If you need to, you can further configure your optimizer. A core principle of Keras is to make things reasonably simple, while allowing the user to be fully in control when they need to (the ultimate control being the easy extensibility of the source code).\n",
|
||||
"Here we used <b>SGD</b> (stochastic gradient descent) as an optimization algorithm for our trainable weights. "
|
||||
]
|
||||
},
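|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"For instance, instead of passing the string `'sgd'` you can instantiate the optimizer yourself and tune its parameters. A minimal sketch (the learning rate and momentum values are illustrative, not tuned):\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"from keras.optimizers import SGD\n",
|
||||
"\n",
|
||||
"sgd = SGD(lr=0.01, momentum=0.9, nesterov=True)\n",
|
||||
"model.compile(optimizer=sgd, loss='categorical_crossentropy')\n",
|
||||
"```"
|
||||
]
|
||||
},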
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"\"Data Sciencing\" this example a little bit more\n",
|
||||
"====="
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"What we did here is nice; however, in the real world it is not usable because of overfitting.\n",
|
||||
"Let's try to address it with a held-out validation set and early stopping."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"##### Overfitting"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In overfitting, a statistical model describes random error or noise instead of the underlying relationship. Overfitting occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. \n",
|
||||
"\n",
|
||||
"A model that has been overfit has poor predictive performance, as it overreacts to minor fluctuations in the training data."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"source": [
|
||||
"\n",
|
||||
"<img src =\"imgs/overfitting.png\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<pre>To avoid overfitting, we will first split our data into a training set and a test set, and evaluate our model on the test set.\n",
|
||||
"Next: we will use two of Keras's callbacks, <b>EarlyStopping</b> and <b>ModelCheckpoint</b></pre>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 13,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Train on 19835 samples, validate on 3501 samples\n",
|
||||
"Epoch 1/20\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6391 - val_loss: 0.6680\n",
|
||||
"Epoch 2/20\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6386 - val_loss: 0.6689\n",
|
||||
"Epoch 3/20\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6384 - val_loss: 0.6695\n",
|
||||
"Epoch 4/20\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6381 - val_loss: 0.6702\n",
|
||||
"Epoch 5/20\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6378 - val_loss: 0.6709\n",
|
||||
"Epoch 6/20\n",
|
||||
"19328/19835 [============================>.] - ETA: 0s - loss: 0.6380Epoch 00005: early stopping\n",
|
||||
"19835/19835 [==============================] - 0s - loss: 0.6375 - val_loss: 0.6716\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<keras.callbacks.History at 0x1d7245f8>"
|
||||
]
|
||||
},
|
||||
"execution_count": 13,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"X, X_test, Y, Y_test = train_test_split(X, Y, test_size=0.15, random_state=42)\n",
|
||||
"\n",
|
||||
"fBestModel = 'best_model.h5' \n",
|
||||
"early_stop = EarlyStopping(monitor='val_loss', patience=4, verbose=1) \n",
|
||||
"best_model = ModelCheckpoint(fBestModel, verbose=0, save_best_only=True)\n",
|
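||||
"# note: when validation_data is given, Keras ignores validation_split\n",
|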
||||
"model.fit(X, Y, validation_data=(X_test, Y_test), nb_epoch=20, \n",
|
||||
" batch_size=128, verbose=True, validation_split=0.15, \n",
|
||||
" callbacks=[best_model, early_stop]) "
|
||||
]
|
||||
},
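|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"After training, the snapshot saved by `ModelCheckpoint` can be restored (a sketch; `best_model.h5` is the filename chosen above, and `h5py` must be installed):\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"# roll back to the best weights seen during training\n",
|
||||
"model.load_weights('best_model.h5')\n",
|
||||
"```"
|
||||
]
|
||||
},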
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Multi-Layer Perceptron and Fully Connected"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"So, how hard can it be to build a Multi-Layer Perceptron with Keras?\n",
|
||||
"It is basically the same: just add more layers!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = Sequential()\n",
|
||||
"model.add(Dense(100, input_shape=(dims,)))\n",
|
||||
"model.add(Dense(nb_classes))\n",
|
||||
"model.add(Activation('softmax'))\n",
|
||||
"model.compile(optimizer='sgd', loss='categorical_crossentropy')\n",
|
||||
"model.fit(X, Y)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Your Turn!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"## Hands On - Keras Fully Connected\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Take a couple of minutes and try to optimize the number of layers and the number of parameters in each layer to get the best results. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model = Sequential()\n",
|
||||
"model.add(Dense(100, input_shape=(dims,)))\n",
|
||||
"\n",
|
||||
"# ...\n",
|
||||
"# ...\n",
|
||||
"# Play with it! add as much layers as you want! try and get better results.\n",
|
||||
"\n",
|
||||
"model.add(Dense(nb_classes))\n",
|
||||
"model.add(Activation('softmax'))\n",
|
||||
"model.compile(optimizer='sgd', loss='categorical_crossentropy')\n",
|
||||
"model.fit(X, Y)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Building a question answering system, an image classification model, a Neural Turing Machine, a word2vec embedder or any other model is just as fast. The ideas behind deep learning are simple, so why should their implementation be painful?"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Theoretical Motivations for depth"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
">Much has been studied about the depth of neural nets. It has been proven mathematically [1] and empirically that convolutional neural networks benefit from depth! "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"[1] Cohen et al., *On the Expressive Power of Deep Learning: A Tensor Analysis*, 2015"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Theoretical Motivations for depth"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"One much-quoted theorem about neural networks states that:"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
">The universal approximation theorem states [1] that a feed-forward network with a single hidden layer containing a finite number of neurons (i.e., a multilayer perceptron) can approximate continuous functions on compact subsets of $\\mathbb{R}^n$, under mild assumptions on the activation function. The theorem thus states that simple neural networks can represent a wide variety of interesting functions when given appropriate parameters; however, it does not touch upon the algorithmic learnability of those parameters."
|
||||
]
|
||||
},
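|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"In symbols, one standard formulation: for every continuous $f$ on a compact set $K \\subset \\mathbb{R}^n$ and every $\\varepsilon > 0$, there exist $N$, $v_i, b_i \\in \\mathbb{R}$ and $w_i \\in \\mathbb{R}^n$ such that\n",
|
||||
"\n",
|
||||
"$$\\left| \\sum_{i=1}^{N} v_i \\, \\sigma(w_i^{T} x + b_i) - f(x) \\right| < \\varepsilon \\quad \\text{for all } x \\in K,$$\n",
|
||||
"\n",
|
||||
"where $\\sigma$ is a non-constant, bounded, continuous activation function."
|
||||
]
|
||||
},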
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"[1] Kurt Hornik, *Approximation Capabilities of Multilayer Feedforward Networks*, 1991"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
|
@ -0,0 +1,406 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"source": [
|
||||
"# A simple implementation of an ANN for MNIST\n",
|
||||
"\n",
|
||||
"This code was taken from: https://github.com/mnielsen/neural-networks-and-deep-learning\n",
|
||||
"\n",
|
||||
"This accompanies the online text http://neuralnetworksanddeeplearning.com/. The book is highly recommended. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n",
|
||||
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Import libraries\n",
|
||||
"import random\n",
|
||||
"import numpy as np\n",
|
||||
"import keras\n",
|
||||
"from keras.datasets import mnist"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 2,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# Set the full path to mnist.pkl.gz\n",
|
||||
"# Point this to the data folder inside the repository\n",
|
||||
"path_to_dataset = \"euroscipy2016_dl-tutorial/data/mnist.pkl.gz\""
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"!mkdir -p $HOME/.keras/datasets/euroscipy2016_dl-tutorial/data/"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Downloading data from https://s3.amazonaws.com/img-datasets/mnist.pkl.gz\n",
|
||||
"15286272/15296311 [============================>.] - ETA: 0s"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# Load the datasets\n",
|
||||
"(X_train, y_train), (X_test, y_test) = mnist.load_data(path_to_dataset)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"(60000, 28, 28) (60000,)\n",
|
||||
"(10000, 28, 28) (10000,)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(X_train.shape, y_train.shape)\n",
|
||||
"print(X_test.shape, y_test.shape)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"\"\"\"\n",
|
||||
"network.py\n",
|
||||
"~~~~~~~~~~\n",
|
||||
"A module to implement the stochastic gradient descent learning\n",
|
||||
"algorithm for a feedforward neural network. Gradients are calculated\n",
|
||||
"using backpropagation. Note that I have focused on making the code\n",
|
||||
"simple, easily readable, and easily modifiable. It is not optimized,\n",
|
||||
"and omits many desirable features.\n",
|
||||
"\"\"\"\n",
|
||||
"\n",
|
||||
"#### Libraries\n",
|
||||
"# Standard library\n",
|
||||
"import random\n",
|
||||
"\n",
|
||||
"# Third-party libraries\n",
|
||||
"import numpy as np\n",
|
||||
"\n",
|
||||
"class Network(object):\n",
|
||||
"\n",
|
||||
" def __init__(self, sizes):\n",
|
||||
" \"\"\"The list ``sizes`` contains the number of neurons in the\n",
|
||||
" respective layers of the network. For example, if the list\n",
|
||||
" was [2, 3, 1] then it would be a three-layer network, with the\n",
|
||||
" first layer containing 2 neurons, the second layer 3 neurons,\n",
|
||||
" and the third layer 1 neuron. The biases and weights for the\n",
|
||||
" network are initialized randomly, using a Gaussian\n",
|
||||
" distribution with mean 0, and variance 1. Note that the first\n",
|
||||
" layer is assumed to be an input layer, and by convention we\n",
|
||||
" won't set any biases for those neurons, since biases are only\n",
|
||||
" ever used in computing the outputs from later layers.\"\"\"\n",
|
||||
" self.num_layers = len(sizes)\n",
|
||||
" self.sizes = sizes\n",
|
||||
" self.biases = [np.random.randn(y, 1) for y in sizes[1:]]\n",
|
||||
" self.weights = [np.random.randn(y, x)\n",
|
||||
" for x, y in zip(sizes[:-1], sizes[1:])]\n",
|
||||
"\n",
|
||||
" def feedforward(self, a):\n",
|
||||
" \"\"\"Return the output of the network if ``a`` is input.\"\"\"\n",
|
||||
" for b, w in zip(self.biases, self.weights):\n",
|
||||
" a = sigmoid(np.dot(w, a)+b)\n",
|
||||
" return a\n",
|
||||
"\n",
|
||||
" def SGD(self, training_data, epochs, mini_batch_size, eta,\n",
|
||||
" test_data=None):\n",
|
||||
" \"\"\"Train the neural network using mini-batch stochastic\n",
|
||||
" gradient descent. The ``training_data`` is a list of tuples\n",
|
||||
" ``(x, y)`` representing the training inputs and the desired\n",
|
||||
" outputs. The other non-optional parameters are\n",
|
||||
" self-explanatory. If ``test_data`` is provided then the\n",
|
||||
" network will be evaluated against the test data after each\n",
|
||||
" epoch, and partial progress printed out. This is useful for\n",
|
||||
" tracking progress, but slows things down substantially.\"\"\"\n",
|
||||
" training_data = list(training_data)\n",
|
||||
" test_data = list(test_data)\n",
|
||||
" if test_data: n_test = len(test_data)\n",
|
||||
" n = len(training_data)\n",
|
||||
" for j in range(epochs):\n",
|
||||
" random.shuffle(training_data)\n",
|
||||
" mini_batches = [\n",
|
||||
" training_data[k:k+mini_batch_size]\n",
|
||||
" for k in range(0, n, mini_batch_size)]\n",
|
||||
" for mini_batch in mini_batches:\n",
|
||||
" self.update_mini_batch(mini_batch, eta)\n",
|
||||
" if test_data:\n",
|
||||
" print( \"Epoch {0}: {1} / {2}\".format(\n",
|
||||
" j, self.evaluate(test_data), n_test))\n",
|
||||
" else:\n",
|
||||
" print( \"Epoch {0} complete\".format(j))\n",
|
||||
"\n",
|
||||
" def update_mini_batch(self, mini_batch, eta):\n",
|
||||
" \"\"\"Update the network's weights and biases by applying\n",
|
||||
" gradient descent using backpropagation to a single mini batch.\n",
|
||||
" The ``mini_batch`` is a list of tuples ``(x, y)``, and ``eta``\n",
|
||||
" is the learning rate.\"\"\"\n",
|
||||
" nabla_b = [np.zeros(b.shape) for b in self.biases]\n",
|
||||
" nabla_w = [np.zeros(w.shape) for w in self.weights]\n",
|
||||
" for x, y in mini_batch:\n",
|
||||
" delta_nabla_b, delta_nabla_w = self.backprop(x, y)\n",
|
||||
" nabla_b = [nb+dnb for nb, dnb in zip(nabla_b, delta_nabla_b)]\n",
|
||||
" nabla_w = [nw+dnw for nw, dnw in zip(nabla_w, delta_nabla_w)]\n",
|
||||
" self.weights = [w-(eta/len(mini_batch))*nw\n",
|
||||
" for w, nw in zip(self.weights, nabla_w)]\n",
|
||||
" self.biases = [b-(eta/len(mini_batch))*nb\n",
|
||||
" for b, nb in zip(self.biases, nabla_b)]\n",
|
||||
"\n",
|
||||
" def backprop(self, x, y):\n",
|
||||
" \"\"\"Return a tuple ``(nabla_b, nabla_w)`` representing the\n",
|
||||
" gradient for the cost function C_x. ``nabla_b`` and\n",
|
||||
" ``nabla_w`` are layer-by-layer lists of numpy arrays, similar\n",
|
||||
" to ``self.biases`` and ``self.weights``.\"\"\"\n",
|
||||
" nabla_b = [np.zeros(b.shape) for b in self.biases]\n",
|
||||
" nabla_w = [np.zeros(w.shape) for w in self.weights]\n",
|
||||
" # feedforward\n",
|
||||
" activation = x\n",
|
||||
" activations = [x] # list to store all the activations, layer by layer\n",
|
||||
" zs = [] # list to store all the z vectors, layer by layer\n",
|
||||
" for b, w in zip(self.biases, self.weights):\n",
|
||||
" z = np.dot(w, activation)+b\n",
|
||||
" zs.append(z)\n",
|
||||
" activation = sigmoid(z)\n",
|
||||
" activations.append(activation)\n",
|
||||
" # backward pass\n",
|
||||
" delta = self.cost_derivative(activations[-1], y) * \\\n",
|
||||
" sigmoid_prime(zs[-1])\n",
|
||||
" nabla_b[-1] = delta\n",
|
||||
" nabla_w[-1] = np.dot(delta, activations[-2].transpose())\n",
|
||||
" # Note that the variable l in the loop below is used a little\n",
|
||||
" # differently to the notation in Chapter 2 of the book. Here,\n",
|
||||
" # l = 1 means the last layer of neurons, l = 2 is the\n",
|
||||
" # second-last layer, and so on. It's a renumbering of the\n",
|
||||
" # scheme in the book, used here to take advantage of the fact\n",
|
||||
" # that Python can use negative indices in lists.\n",
|
||||
" for l in range(2, self.num_layers):\n",
|
||||
" z = zs[-l]\n",
|
||||
" sp = sigmoid_prime(z)\n",
|
||||
" delta = np.dot(self.weights[-l+1].transpose(), delta) * sp\n",
|
||||
" nabla_b[-l] = delta\n",
|
||||
" nabla_w[-l] = np.dot(delta, activations[-l-1].transpose())\n",
|
||||
" return (nabla_b, nabla_w)\n",
|
||||
"\n",
|
||||
" def evaluate(self, test_data):\n",
|
||||
" \"\"\"Return the number of test inputs for which the neural\n",
|
||||
" network outputs the correct result. Note that the neural\n",
|
||||
" network's output is assumed to be the index of whichever\n",
|
||||
" neuron in the final layer has the highest activation.\"\"\"\n",
|
||||
" test_results = [(np.argmax(self.feedforward(x)), y)\n",
|
||||
" for (x, y) in test_data]\n",
|
||||
" return sum(int(x == y) for (x, y) in test_results)\n",
|
||||
"\n",
|
||||
" def cost_derivative(self, output_activations, y):\n",
|
||||
" \"\"\"Return the vector of partial derivatives \\partial C_x /\n",
|
||||
" \\partial a for the output activations.\"\"\"\n",
|
||||
" return (output_activations-y)\n",
|
||||
"\n",
|
||||
"#### Miscellaneous functions\n",
|
||||
"def sigmoid(z):\n",
|
||||
" \"\"\"The sigmoid function.\"\"\"\n",
|
||||
" return 1.0/(1.0+np.exp(-z))\n",
|
||||
"\n",
|
||||
"def sigmoid_prime(z):\n",
|
||||
" \"\"\"Derivative of the sigmoid function.\"\"\"\n",
|
||||
" return sigmoid(z)*(1-sigmoid(z))"
|
||||
]
|
||||
},
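|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"The two helper functions at the bottom implement the sigmoid and its derivative, which the backward pass relies on:\n",
|
||||
"\n",
|
||||
"$$\\sigma(z) = \\frac{1}{1 + e^{-z}}, \\qquad \\sigma'(z) = \\sigma(z)\\,\\bigl(1 - \\sigma(z)\\bigr)$$"
|
||||
]
|
||||
},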
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"def vectorized_result(j):\n",
|
||||
" \"\"\"Return a 10-dimensional unit vector with a 1.0 in the jth\n",
|
||||
" position and zeroes elsewhere. This is used to convert a digit\n",
|
||||
" (0...9) into a corresponding desired output from the neural\n",
|
||||
" network.\"\"\"\n",
|
||||
" e = np.zeros((10, 1))\n",
|
||||
" e[j] = 1.0\n",
|
||||
" return e"
|
||||
]
|
||||
},
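|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"A quick sanity check of the encoding (output shown flattened for readability):\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"print(vectorized_result(2).ravel())\n",
|
||||
"# [ 0.  0.  1.  0.  0.  0.  0.  0.  0.  0.]\n",
|
||||
"```"
|
||||
]
|
||||
},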
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"net = Network([784, 30, 10])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 9,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"training_inputs = [np.reshape(x, (784, 1)) for x in X_train.copy()]\n",
|
||||
"training_results = [vectorized_result(y) for y in y_train.copy()]\n",
|
||||
"training_data = zip(training_inputs, training_results)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 10,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"test_inputs = [np.reshape(x, (784, 1)) for x in X_test.copy()]\n",
|
||||
"test_data = zip(test_inputs, y_test.copy())"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 11,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Epoch 0: 1348 / 10000\n",
|
||||
"Epoch 1: 1939 / 10000\n",
|
||||
"Epoch 2: 2046 / 10000\n",
|
||||
"Epoch 3: 1422 / 10000\n",
|
||||
"Epoch 4: 1365 / 10000\n",
|
||||
"Epoch 5: 1351 / 10000\n",
|
||||
"Epoch 6: 1879 / 10000\n",
|
||||
"Epoch 7: 1806 / 10000\n",
|
||||
"Epoch 8: 1754 / 10000\n",
|
||||
"Epoch 9: 1974 / 10000\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net.SGD(training_data, 10, 10, 3.0, test_data=test_data)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 12,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Epoch 0: 3526 / 10000\n",
|
||||
"Epoch 1: 3062 / 10000\n",
|
||||
"Epoch 2: 2946 / 10000\n",
|
||||
"Epoch 3: 2462 / 10000\n",
|
||||
"Epoch 4: 3617 / 10000\n",
|
||||
"Epoch 5: 3773 / 10000\n",
|
||||
"Epoch 6: 3568 / 10000\n",
|
||||
"Epoch 7: 4459 / 10000\n",
|
||||
"Epoch 8: 3009 / 10000\n",
|
||||
"Epoch 9: 2660 / 10000\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"net = Network([784, 10, 10])\n",
|
||||
"\n",
|
||||
"training_inputs = [np.reshape(x, (784, 1)) for x in X_train.copy()]\n",
|
||||
"training_results = [vectorized_result(y) for y in y_train.copy()]\n",
|
||||
"training_data = zip(training_inputs, training_results)\n",
|
||||
"\n",
|
||||
"test_inputs = [np.reshape(x, (784, 1)) for x in X_test.copy()]\n",
|
||||
"test_data = zip(test_inputs, y_test.copy())\n",
|
||||
"\n",
|
||||
"net.SGD(training_data, 10, 10, 1.0, test_data=test_data)"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
|
@ -0,0 +1,906 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggiohttps://github.com/donnemartin/system-design-primer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Convolutional Neural Network"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "skip"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### References:\n",
|
||||
"\n",
|
||||
"Some of the images and the content I used came from this great couple of blog posts \\[1\\] [https://adeshpande3.github.io/adeshpande3.github.io/]() and \\[2\\] the terrific book, [\"Neural Networks and Deep Learning\"](http://neuralnetworksanddeeplearning.com/) by Michael Nielsen. (**Strongly recommend**) "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"A convolutional neural network (CNN, or ConvNet) is a type of **feed-forward** artificial neural network in which the connectivity pattern between its neurons is inspired by the organization of the animal visual cortex."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The networks consist of multiple layers of small neuron collections which process portions of the input image, called **receptive fields**. \n",
|
||||
"\n",
|
||||
"The outputs of these collections are then tiled so that their input regions overlap, to obtain a _better representation_ of the original image; this is repeated for every such layer."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## How does it look like?"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": true,
|
||||
"slideshow": {
|
||||
"slide_type": "-"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/convnets_cover.png\" width=\"70%\" />\n",
|
||||
"\n",
|
||||
"> source: https://flickrcode.files.wordpress.com/2014/10/conv-net2.png"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# The Problem Space \n",
|
||||
"\n",
|
||||
"## Image Classification"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Image classification is the task of taking an input image and outputting a class (a cat, dog, etc) or a probability of classes that best describes the image. \n",
|
||||
"\n",
|
||||
"For humans, this task of recognition is one of the first skills we learn from the moment we are born and is one that comes naturally and effortlessly as adults."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"These skills of being able to quickly recognize patterns, *generalize* from prior knowledge, and adapt to different image environments are ones that we do not share with machines."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Inputs and Outputs"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn1.png\" width=\"70%\" />\n",
|
||||
"\n",
|
||||
"source: [http://www.pawbuzz.com/wp-content/uploads/sites/551/2014/11/corgi-puppies-21.jpg]()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"When a computer sees an image (takes an image as input), it will see an array of pixel values. \n",
|
||||
"\n",
|
||||
"Depending on the resolution and size of the image, it will see a 32 x 32 x 3 array of numbers (The 3 refers to RGB values).\n",
|
||||
"\n",
|
||||
"let's say we have a color image in JPG form and its size is 480 x 480. The representative array will be 480 x 480 x 3. Each of these numbers is given a value from 0 to 255 which describes the pixel intensity at that point."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Goal"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"What we want the computer to do is to be able to differentiate between all the images it’s given and figure out the unique features that make a dog a dog or that make a cat a cat. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"When we look at a picture of a dog, we can classify it as such if the picture has identifiable features such as paws or 4 legs. \n",
|
||||
"\n",
|
||||
"In a similar way, the computer should be able to perform image classification by looking for *low level* features such as edges and curves, and then building up to more abstract concepts through a series of **convolutional layers**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Structure of a CNN"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"> A more detailed overview of what CNNs do would be that you take the image, pass it through a series of convolutional, nonlinear, pooling (downsampling), and fully connected layers, and get an output. As we said earlier, the output can be a single class or a probability of classes that best describes the image. \n",
|
||||
"\n",
|
||||
"source: [1]"
|
||||
]
|
||||
},
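{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make this pipeline concrete, here is a minimal sketch of that layer sequence in Keras (assuming the 1.x API used throughout this tutorial, with Theano dimension ordering); each piece is covered in detail below:\n",
"\n",
"```python\n",
"from keras.models import Sequential\n",
"from keras.layers import Convolution2D, MaxPooling2D, Activation, Flatten, Dense\n",
"\n",
"model = Sequential()\n",
"# convolutional layer: 32 learned 3x3 filters over a 3-channel 32x32 image\n",
"model.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(3, 32, 32)))\n",
"model.add(Activation('relu'))               # nonlinearity\n",
"model.add(MaxPooling2D(pool_size=(2, 2)))   # downsampling\n",
"model.add(Flatten())\n",
"model.add(Dense(10, activation='softmax'))  # fully connected -> class probabilities\n",
"```"
]
},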
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Convolutional Layer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The first layer in a CNN is always a **Convolutional Layer**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"\n",
|
||||
"<img src =\"imgs/conv.png\" width=\"50%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### Convolutional filters\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"A Convolutional Filter much like a **kernel** in image recognition is a small matrix useful for blurring, sharpening, embossing, edge detection, and more. \n",
|
||||
"\n",
|
||||
"This is accomplished by means of convolution between a kernel and an image.\n",
|
||||
"\n",
|
||||
"The main difference _here_ is that the conv matrices are **learned**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/keDyv.png\" width=\"90%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"As the filter is sliding, or **convolving**, around the input image, it is multiplying the values in the filter with the original pixel values of the image (aka computing **element wise multiplications**)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn2.png\" width=\"80%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"iNow, we repeat this process for every location on the input volume. (Next step would be moving the filter to the right by 1 unit, then right again by 1, and so on)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"After sliding the filter over all the locations, we are left with an array of numbers usually called an **activation map** or **feature map**."
|
||||
]
|
||||
},
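{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal NumPy sketch of this sliding multiply-and-sum (just the mechanism, not the Keras implementation):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def conv2d_valid(image, kernel):\n",
"    \"\"\"Slide `kernel` over `image` (stride 1, no padding) and sum\n",
"    the element-wise products at each location.\"\"\"\n",
"    kh, kw = kernel.shape\n",
"    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1\n",
"    fmap = np.zeros((oh, ow))\n",
"    for i in range(oh):\n",
"        for j in range(ow):\n",
"            fmap[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)\n",
"    return fmap\n",
"\n",
"image = np.arange(25, dtype=float).reshape(5, 5)\n",
"kernel = np.array([[1., 0.], [0., -1.]])  # a toy 2x2 filter\n",
"print(conv2d_valid(image, kernel).shape)  # (4, 4) activation map\n",
"```"
]
},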
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## High Level Perspective\n",
|
||||
"\n",
|
||||
"Let’s talk about briefly what this convolution is actually doing from a high level. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Each of these filters can be thought of as **feature identifiers** (e.g. *straight edges, simple colors, curves*)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn3.png\" width=\"70%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### Visualisation of the Receptive Field"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn4.png\" width=\"80%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn5.png\" width=\"80%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/cnn6.png\" width=\"80%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The value is much lower! This is because there wasn’t anything in the image section that responded to the curve detector filter. Remember, the output of this conv layer is an activation map. \n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Going Deeper Through the Network"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Now in a traditional **convolutional neural network** architecture, there are other layers that are interspersed between these conv layers.\n",
|
||||
"\n",
|
||||
"<img src=\"https://adeshpande3.github.io/assets/Table.png\"/>"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## ReLU (Rectified Linear Units) Layer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
" After each conv layer, it is convention to apply a *nonlinear layer* (or **activation layer**) immediately afterward.\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"The purpose of this layer is to introduce nonlinearity to a system that basically has just been computing linear operations during the conv layers (just element wise multiplications and summations)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"In the past, nonlinear functions like tanh and sigmoid were used, but researchers found out that **ReLU layers** work far better because the network is able to train a lot faster (because of the computational efficiency) without making a significant difference to the accuracy."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"It also helps to alleviate the **vanishing gradient problem**, which is the issue where the lower layers of the network train very slowly because the gradient decreases exponentially through the layers"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"(**very briefly**)\n",
|
||||
"\n",
|
||||
"Vanishing gradient problem depends on the choice of the activation function. \n",
|
||||
"\n",
|
||||
"Many common activation functions (e.g `sigmoid` or `tanh`) *squash* their input into a very small output range in a very non-linear fashion. \n",
|
||||
"\n",
|
||||
"For example, sigmoid maps the real number line onto a \"small\" range of [0, 1]."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"As a result, there are large regions of the input space which are mapped to an extremely small range. \n",
|
||||
"\n",
|
||||
"In these regions of the input space, even a large change in the input will produce a small change in the output - hence the **gradient is small**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### ReLu\n",
|
||||
"\n",
|
||||
"The **ReLu** function is defined as $f(x) = \\max(0, x),$ [2]\n",
|
||||
"\n",
|
||||
"A smooth approximation to the rectifier is the *analytic function*: $f(x) = \\ln(1 + e^x)$\n",
|
||||
"\n",
|
||||
"which is called the **softplus** function.\n",
|
||||
"\n",
|
||||
"The derivative of softplus is $f'(x) = e^x / (e^x + 1) = 1 / (1 + e^{-x})$, i.e. the **logistic function**.\n",
|
||||
"\n",
|
||||
"[2] [http://www.cs.toronto.edu/~fritz/absps/reluICML.pdf]() by G. E. Hinton "
|
||||
]
|
||||
},
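{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick numerical check of the gradient claims above (a sketch: the sigmoid gradient saturates for large $|z|$, while the ReLU gradient stays at 1 on the active side):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def sigmoid(z):\n",
"    return 1.0 / (1.0 + np.exp(-z))\n",
"\n",
"z = np.array([-10., -1., 0., 1., 10.])\n",
"print(sigmoid(z) * (1 - sigmoid(z)))   # sigmoid gradient: ~0 at the extremes\n",
"print(np.where(z > 0, 1.0, 0.0))       # ReLU gradient: constant 1 when active\n",
"print(np.log1p(np.exp(z)))             # softplus: a smooth version of max(0, z)\n",
"```"
]
},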
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Pooling Layers"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
" After some ReLU layers, it is customary to apply a **pooling layer** (aka *downsampling layer*)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"In this category, there are also several layer options, with **maxpooling** being the most popular. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Example of a MaxPooling filter"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/MaxPool.png\" width=\"80%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Other options for pooling layers are average pooling and L2-norm pooling. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The intuition behind this Pooling layer is that once we know that a specific feature is in the original input volume (there will be a high activation value), its exact location is not as important as its relative location to the other features. \n",
|
||||
"\n",
|
||||
"Therefore this layer drastically reduces the spatial dimension (the length and the width but not the depth) of the input volume.\n",
|
||||
"\n",
|
||||
"This serves two main purposes: reduce the amount of parameters; controlling overfitting. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"An intuitive explanation for the usefulness of pooling could be explained by an example: \n",
|
||||
"\n",
|
||||
"Lets assume that we have a filter that is used for detecting faces. The exact pixel location of the face is less relevant then the fact that there is a face \"somewhere at the top\""
|
||||
]
|
||||
},
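{
"cell_type": "markdown",
"metadata": {},
"source": [
"A NumPy sketch of what a 2x2 maxpooling filter with stride 2 computes (the mechanism only; in Keras this is the `MaxPooling2D` layer):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def max_pool_2x2(fmap):\n",
"    \"\"\"Downsample a feature map by keeping the max of each\n",
"    non-overlapping 2x2 window (stride 2).\"\"\"\n",
"    h, w = fmap.shape\n",
"    return fmap[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))\n",
"\n",
"fmap = np.array([[1., 3., 2., 4.],\n",
"                 [5., 6., 1., 0.],\n",
"                 [7., 2., 9., 8.],\n",
"                 [4., 1., 3., 5.]])\n",
"print(max_pool_2x2(fmap))  # [[6. 4.]\n",
"                           #  [7. 9.]]\n",
"```"
]
},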
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Dropout Layer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The **dropout layers** have the very specific function to *drop out* a random set of activations in that layers by setting them to zero in the forward pass. Simple as that. \n",
|
||||
"\n",
|
||||
"It allows to avoid *overfitting* but has to be used **only** at training time and **not** at test time. "
|
||||
]
|
||||
},
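{
"cell_type": "markdown",
"metadata": {},
"source": [
"A NumPy sketch of the mechanism, using the common 'inverted dropout' scaling so expected activations match between training and test time (an illustration, not Keras's internal code):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def dropout_forward(activations, p_drop=0.5, training=True):\n",
"    \"\"\"Zero out a random fraction `p_drop` of the activations at\n",
"    training time; at test time pass them through unchanged.\"\"\"\n",
"    if not training:\n",
"        return activations\n",
"    mask = np.random.rand(*activations.shape) >= p_drop\n",
"    return activations * mask / (1.0 - p_drop)\n",
"\n",
"a = np.ones((2, 4))\n",
"print(dropout_forward(a))                  # about half the units zeroed\n",
"print(dropout_forward(a, training=False))  # identity at test time\n",
"```"
]
},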
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Fully Connected Layer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The last layer, however, is an important one, namely the **Fully Connected Layer**."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Basically, a FC layer looks at what high level features most strongly correlate to a particular class and has particular weights so that when you compute the products between the weights and the previous layer, you get the correct probabilities for the different classes."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/ConvNet LeNet.png\" width=\"90%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# CNN in Keras"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"**Keras** supports:\n",
|
||||
"\n",
|
||||
"- 1D Convolutional Layers;\n",
|
||||
"- 2D Convolutional Layers;\n",
|
||||
"- 3D Convolutional Layers;\n",
|
||||
"\n",
|
||||
"The corresponding `keras` package is `keras.layers.convolutional`"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Convolution1D\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"from keras.layers.convolutional import Convolution1D\n",
|
||||
"Convolution1D(nb_filter, filter_length, init='uniform',\n",
|
||||
" activation='linear', weights=None,\n",
|
||||
" border_mode='valid', subsample_length=1,\n",
|
||||
" W_regularizer=None, b_regularizer=None,\n",
|
||||
" activity_regularizer=None, W_constraint=None,\n",
|
||||
" b_constraint=None, bias=True, input_dim=None,\n",
|
||||
" input_length=None)\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
">Convolution operator for filtering neighborhoods of **one-dimensional inputs**. When using this layer as the first layer in a model, either provide the keyword argument `input_dim` (int, e.g. 128 for sequences of 128-dimensional vectors), or `input_shape` (tuple of integers, e.g. (10, 128) for sequences of 10 vectors of 128-dimensional vectors)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Example\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"\n",
|
||||
"# apply a convolution 1d of length 3 to a sequence with 10 timesteps,\n",
|
||||
"# with 64 output filters\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Convolution1D(64, 3, border_mode='same', input_shape=(10, 32)))\n",
|
||||
"# now model.output_shape == (None, 10, 64)\n",
|
||||
"\n",
|
||||
"# add a new conv1d on top\n",
|
||||
"model.add(Convolution1D(32, 3, border_mode='same'))\n",
|
||||
"# now model.output_shape == (None, 10, 32)\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Convolution2D\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"from keras.layers.convolutional import Convolution2D\n",
|
||||
"Convolution2D(nb_filter, nb_row, nb_col, \n",
|
||||
" init='glorot_uniform',\n",
|
||||
" activation='linear', weights=None,\n",
|
||||
" border_mode='valid', subsample=(1, 1),\n",
|
||||
" dim_ordering='default', W_regularizer=None,\n",
|
||||
" b_regularizer=None, activity_regularizer=None,\n",
|
||||
" W_constraint=None, b_constraint=None, \n",
|
||||
" bias=True)\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Example\n",
|
||||
"\n",
|
||||
"```python\n",
|
||||
"\n",
|
||||
"# apply a 3x3 convolution with 64 output filters on a 256x256 image:\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Convolution2D(64, 3, 3, border_mode='same', \n",
|
||||
" input_shape=(3, 256, 256)))\n",
|
||||
"# now model.output_shape == (None, 64, 256, 256)\n",
|
||||
"\n",
|
||||
"# add a 3x3 convolution on top, with 32 output filters:\n",
|
||||
"model.add(Convolution2D(32, 3, 3, border_mode='same'))\n",
|
||||
"# now model.output_shape == (None, 32, 256, 256)\n",
|
||||
"\n",
|
||||
"```"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Dimensions of Conv filters in Keras"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"The complex structure of ConvNets *may* lead to a representation that is challenging to understand."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Of course, the dimensions vary according to the dimension of the Convolutional filters (e.g. 1D, 2D)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### Convolution1D\n",
|
||||
"\n",
|
||||
"**Input Shape**:\n",
|
||||
"\n",
|
||||
"**3D** tensor with shape: (`samples`, `steps`, `input_dim`).\n",
|
||||
"\n",
|
||||
"**Output Shape**:\n",
|
||||
"\n",
|
||||
"**3D** tensor with shape: (`samples`, `new_steps`, `nb_filter`)."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### Convolution2D\n",
|
||||
"\n",
|
||||
"**Input Shape**:\n",
|
||||
"\n",
|
||||
"**4D** tensor with shape: \n",
|
||||
"\n",
|
||||
"- (`samples`, `channels`, `rows`, `cols`) if `dim_ordering='th'`\n",
|
||||
"- (`samples`, `rows`, `cols`, `channels`) if `dim_ordering='tf'`\n",
|
||||
"\n",
|
||||
"**Output Shape**:\n",
|
||||
"\n",
|
||||
"**4D** tensor with shape:\n",
|
||||
"\n",
|
||||
"- (`samples`, `nb_filter`, `new_rows`, `new_cols`) \n",
|
||||
"if `dim_ordering='th'`\n",
|
||||
"- (`samples`, `new_rows`, `new_cols`, `nb_filter`) if `dim_ordering='tf'`"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"celltoolbar": "Slideshow",
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
|
@ -0,0 +1,716 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggiohttps://github.com/donnemartin/system-design-primer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Practical Deep Learning"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"Constructing and training your own ConvNet from scratch can be Hard and a long task.\n",
|
||||
"\n",
|
||||
"A common trick used in Deep Learning is to use a **pre-trained** model and finetune it to the specific data it will be used for. "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Famous Models with Keras\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"This notebook contains code and reference for the following Keras models (gathered from [https://github.com/fchollet/deep-learning-models]())\n",
|
||||
"\n",
|
||||
"- VGG16\n",
|
||||
"- VGG19\n",
|
||||
"- ResNet50\n",
|
||||
"- Inception v3\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "skip"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## References\n",
|
||||
"\n",
|
||||
"- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) - please cite this paper if you use the VGG models in your work.\n",
|
||||
"- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - please cite this paper if you use the ResNet model in your work.\n",
|
||||
"- [Rethinking the Inception Architecture for Computer Vision](http://arxiv.org/abs/1512.00567) - please cite this paper if you use the Inception v3 model in your work.\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"All architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. \n",
|
||||
"\n",
|
||||
"For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, \"Width-Height-Depth\"."
|
||||
]
|
||||
},
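{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ordering can also be inspected, or overridden for the current session, from code (Keras 1.x backend API):\n",
"\n",
"```python\n",
"from keras import backend as K\n",
"\n",
"print(K.image_dim_ordering())   # 'th' or 'tf', as read from ~/.keras/keras.json\n",
"K.set_image_dim_ordering('tf')  # override for this session only\n",
"print(K.image_dim_ordering())\n",
"```"
]
},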
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"### Keras Configuration File"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "-"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{\r\n",
|
||||
" \"image_dim_ordering\": \"th\",\r\n",
|
||||
" \"floatx\": \"float32\",\r\n",
|
||||
" \"epsilon\": 1e-07,\r\n",
|
||||
" \"backend\": \"theano\"\r\n",
|
||||
"}"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!cat ~/.keras/keras.json"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{\r\n",
|
||||
" \"image_dim_ordering\": \"th\",\r\n",
|
||||
" \"floatx\": \"float32\",\r\n",
|
||||
" \"epsilon\": 1e-07,\r\n",
|
||||
" \"backend\": \"tensorflow\"\r\n",
|
||||
"}"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!sed -i 's/theano/tensorflow/g' ~/.keras/keras.json\n",
|
||||
"!cat ~/.keras/keras.json"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using TensorFlow backend.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import keras"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import theano"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 8,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"{\r\n",
|
||||
" \"image_dim_ordering\": \"th\",\r\n",
|
||||
" \"backend\": \"theano\",\r\n",
|
||||
" \"floatx\": \"float32\",\r\n",
|
||||
" \"epsilon\": 1e-07\r\n",
|
||||
"}"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"!sed -i 's/tensorflow/theano/g' ~/.keras/keras.json\n",
|
||||
"!cat ~/.keras/keras.json"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n",
|
||||
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import keras"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "skip"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n",
|
||||
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# %load deep_learning_models/imagenet_utils.py\n",
|
||||
"import numpy as np\n",
|
||||
"import json\n",
|
||||
"\n",
|
||||
"from keras.utils.data_utils import get_file\n",
|
||||
"from keras import backend as K\n",
|
||||
"\n",
|
||||
"CLASS_INDEX = None\n",
|
||||
"CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def preprocess_input(x, dim_ordering='default'):\n",
|
||||
" if dim_ordering == 'default':\n",
|
||||
" dim_ordering = K.image_dim_ordering()\n",
|
||||
" assert dim_ordering in {'tf', 'th'}\n",
|
||||
"\n",
|
||||
" if dim_ordering == 'th':\n",
|
||||
" x[:, 0, :, :] -= 103.939\n",
|
||||
" x[:, 1, :, :] -= 116.779\n",
|
||||
" x[:, 2, :, :] -= 123.68\n",
|
||||
" # 'RGB'->'BGR'\n",
|
||||
" x = x[:, ::-1, :, :]\n",
|
||||
" else:\n",
|
||||
" x[:, :, :, 0] -= 103.939\n",
|
||||
" x[:, :, :, 1] -= 116.779\n",
|
||||
" x[:, :, :, 2] -= 123.68\n",
|
||||
" # 'RGB'->'BGR'\n",
|
||||
" x = x[:, :, :, ::-1]\n",
|
||||
" return x\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def decode_predictions(preds):\n",
|
||||
" global CLASS_INDEX\n",
|
||||
" assert len(preds.shape) == 2 and preds.shape[1] == 1000\n",
|
||||
" if CLASS_INDEX is None:\n",
|
||||
" fpath = get_file('imagenet_class_index.json',\n",
|
||||
" CLASS_INDEX_PATH,\n",
|
||||
" cache_subdir='models')\n",
|
||||
" CLASS_INDEX = json.load(open(fpath))\n",
|
||||
" indices = np.argmax(preds, axis=-1)\n",
|
||||
" results = []\n",
|
||||
" for i in indices:\n",
|
||||
" results.append(CLASS_INDEX[str(i)])\n",
|
||||
" return results\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": true,
|
||||
"slideshow": {
|
||||
"slide_type": "skip"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"IMAGENET_FOLDER = 'imgs/imagenet' #in the repo"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# VGG16"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# %load deep_learning_models/vgg16.py\n",
|
||||
"'''VGG16 model for Keras.\n",
|
||||
"\n",
|
||||
"# Reference:\n",
|
||||
"\n",
|
||||
"- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)\n",
|
||||
"\n",
|
||||
"'''\n",
|
||||
"from __future__ import print_function\n",
|
||||
"\n",
|
||||
"import numpy as np\n",
|
||||
"import warnings\n",
|
||||
"\n",
|
||||
"from keras.models import Model\n",
|
||||
"from keras.layers import Flatten, Dense, Input\n",
|
||||
"from keras.layers import Convolution2D, MaxPooling2D\n",
|
||||
"from keras.preprocessing import image\n",
|
||||
"from keras.utils.layer_utils import convert_all_kernels_in_model\n",
|
||||
"from keras.utils.data_utils import get_file\n",
|
||||
"from keras import backend as K\n",
|
||||
"\n",
|
||||
"TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'\n",
|
||||
"TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'\n",
|
||||
"TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'\n",
|
||||
"TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def VGG16(include_top=True, weights='imagenet',\n",
|
||||
" input_tensor=None):\n",
|
||||
" '''Instantiate the VGG16 architecture,\n",
|
||||
" optionally loading weights pre-trained\n",
|
||||
" on ImageNet. Note that when using TensorFlow,\n",
|
||||
" for best performance you should set\n",
|
||||
" `image_dim_ordering=\"tf\"` in your Keras config\n",
|
||||
" at ~/.keras/keras.json.\n",
|
||||
"\n",
|
||||
" The model and the weights are compatible with both\n",
|
||||
" TensorFlow and Theano. The dimension ordering\n",
|
||||
" convention used by the model is the one\n",
|
||||
" specified in your Keras config file.\n",
|
||||
"\n",
|
||||
" # Arguments\n",
|
||||
" include_top: whether to include the 3 fully-connected\n",
|
||||
" layers at the top of the network.\n",
|
||||
" weights: one of `None` (random initialization)\n",
|
||||
" or \"imagenet\" (pre-training on ImageNet).\n",
|
||||
" input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)\n",
|
||||
" to use as image input for the model.\n",
|
||||
"\n",
|
||||
" # Returns\n",
|
||||
" A Keras model instance.\n",
|
||||
" '''\n",
|
||||
" if weights not in {'imagenet', None}:\n",
|
||||
" raise ValueError('The `weights` argument should be either '\n",
|
||||
" '`None` (random initialization) or `imagenet` '\n",
|
||||
" '(pre-training on ImageNet).')\n",
|
||||
" # Determine proper input shape\n",
|
||||
" if K.image_dim_ordering() == 'th':\n",
|
||||
" if include_top:\n",
|
||||
" input_shape = (3, 224, 224)\n",
|
||||
" else:\n",
|
||||
" input_shape = (3, None, None)\n",
|
||||
" else:\n",
|
||||
" if include_top:\n",
|
||||
" input_shape = (224, 224, 3)\n",
|
||||
" else:\n",
|
||||
" input_shape = (None, None, 3)\n",
|
||||
"\n",
|
||||
" if input_tensor is None:\n",
|
||||
" img_input = Input(shape=input_shape)\n",
|
||||
" else:\n",
|
||||
" if not K.is_keras_tensor(input_tensor):\n",
|
||||
" img_input = Input(tensor=input_tensor)\n",
|
||||
" else:\n",
|
||||
" img_input = input_tensor\n",
|
||||
" # Block 1\n",
|
||||
" x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)\n",
|
||||
" x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)\n",
|
||||
" x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)\n",
|
||||
"\n",
|
||||
" # Block 2\n",
|
||||
" x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)\n",
|
||||
" x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)\n",
|
||||
" x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)\n",
|
||||
"\n",
|
||||
" # Block 3\n",
|
||||
" x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)\n",
|
||||
" x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)\n",
|
||||
" x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)\n",
|
||||
" x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)\n",
|
||||
"\n",
|
||||
" # Block 4\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)\n",
|
||||
" x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)\n",
|
||||
"\n",
|
||||
" # Block 5\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)\n",
|
||||
" x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)\n",
|
||||
" x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)\n",
|
||||
"\n",
|
||||
" if include_top:\n",
|
||||
" # Classification block\n",
|
||||
" x = Flatten(name='flatten')(x)\n",
|
||||
" x = Dense(4096, activation='relu', name='fc1')(x)\n",
|
||||
" x = Dense(4096, activation='relu', name='fc2')(x)\n",
|
||||
" x = Dense(1000, activation='softmax', name='predictions')(x)\n",
|
||||
"\n",
|
||||
" # Create model\n",
|
||||
" model = Model(img_input, x)\n",
|
||||
"\n",
|
||||
" # load weights\n",
|
||||
" if weights == 'imagenet':\n",
|
||||
" print('K.image_dim_ordering:', K.image_dim_ordering())\n",
|
||||
" if K.image_dim_ordering() == 'th':\n",
|
||||
" if include_top:\n",
|
||||
" weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',\n",
|
||||
" TH_WEIGHTS_PATH,\n",
|
||||
" cache_subdir='models')\n",
|
||||
" else:\n",
|
||||
" weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',\n",
|
||||
" TH_WEIGHTS_PATH_NO_TOP,\n",
|
||||
" cache_subdir='models')\n",
|
||||
" model.load_weights(weights_path)\n",
|
||||
" if K.backend() == 'tensorflow':\n",
|
||||
" warnings.warn('You are using the TensorFlow backend, yet you '\n",
|
||||
" 'are using the Theano '\n",
|
||||
" 'image dimension ordering convention '\n",
|
||||
" '(`image_dim_ordering=\"th\"`). '\n",
|
||||
" 'For best performance, set '\n",
|
||||
" '`image_dim_ordering=\"tf\"` in '\n",
|
||||
" 'your Keras config '\n",
|
||||
" 'at ~/.keras/keras.json.')\n",
|
||||
" convert_all_kernels_in_model(model)\n",
|
||||
" else:\n",
|
||||
" if include_top:\n",
|
||||
" weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',\n",
|
||||
" TF_WEIGHTS_PATH,\n",
|
||||
" cache_subdir='models')\n",
|
||||
" else:\n",
|
||||
" weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',\n",
|
||||
" TF_WEIGHTS_PATH_NO_TOP,\n",
|
||||
" cache_subdir='models')\n",
|
||||
" model.load_weights(weights_path)\n",
|
||||
" if K.backend() == 'theano':\n",
|
||||
" convert_all_kernels_in_model(model)\n",
|
||||
" return model"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"K.image_dim_ordering: th\n",
|
||||
"Input image shape: (1, 3, 224, 224)\n",
|
||||
"Predicted: [['n07745940', 'strawberry']]\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"\n",
|
||||
"model = VGG16(include_top=True, weights='imagenet')\n",
|
||||
"\n",
|
||||
"img_path = os.path.join(IMAGENET_FOLDER, 'strawberry_1157.jpeg')\n",
|
||||
"img = image.load_img(img_path, target_size=(224, 224))\n",
|
||||
"x = image.img_to_array(img)\n",
|
||||
"x = np.expand_dims(x, axis=0)\n",
|
||||
"x = preprocess_input(x)\n",
|
||||
"print('Input image shape:', x.shape)\n",
|
||||
"\n",
|
||||
"preds = model.predict(x)\n",
|
||||
"print('Predicted:', decode_predictions(preds))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Fine Tuning of a Pre-Trained Model"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"```python\n",
|
||||
"def VGG16_FT(weights_path = None, \n",
|
||||
" img_width = 224, img_height = 224, \n",
|
||||
" f_type = None, n_labels = None ):\n",
|
||||
" \n",
|
||||
" \"\"\"Fine Tuning of a VGG16 based Net\"\"\"\n",
|
||||
"\n",
|
||||
" # VGG16 Up to the layer before the last!\n",
|
||||
" model = Sequential()\n",
|
||||
" model.add(ZeroPadding2D((1, 1), \n",
|
||||
" input_shape=(3, \n",
|
||||
" img_width, img_height)))\n",
|
||||
"\n",
|
||||
" model.add(Convolution2D(64, 3, 3, activation='relu', \n",
|
||||
" name='conv1_1'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(64, 3, 3, activation='relu', \n",
|
||||
" name='conv1_2'))\n",
|
||||
" model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
|
||||
"\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(128, 3, 3, activation='relu', \n",
|
||||
" name='conv2_1'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(128, 3, 3, activation='relu', \n",
|
||||
" name='conv2_2'))\n",
|
||||
" model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
|
||||
"\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(256, 3, 3, activation='relu', \n",
|
||||
" name='conv3_1'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(256, 3, 3, activation='relu', \n",
|
||||
" name='conv3_2'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(256, 3, 3, activation='relu', \n",
|
||||
" name='conv3_3'))\n",
|
||||
" model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
|
||||
"\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv4_1'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv4_2'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv4_3'))\n",
|
||||
" model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
|
||||
"\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv5_1'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv5_2'))\n",
|
||||
" model.add(ZeroPadding2D((1, 1)))\n",
|
||||
" model.add(Convolution2D(512, 3, 3, activation='relu', \n",
|
||||
" name='conv5_3'))\n",
|
||||
" model.add(MaxPooling2D((2, 2), strides=(2, 2)))\n",
|
||||
" model.add(Flatten())\n",
|
||||
"\n",
|
||||
" # Plugging new Layers\n",
|
||||
" model.add(Dense(768, activation='sigmoid'))\n",
|
||||
" model.add(Dropout(0.0))\n",
|
||||
" model.add(Dense(768, activation='sigmoid'))\n",
|
||||
" model.add(Dropout(0.0))\n",
|
||||
" \n",
|
||||
" last_layer = Dense(n_labels, activation='sigmoid')\n",
|
||||
" loss = 'categorical_crossentropy'\n",
|
||||
" optimizer = optimizers.Adam(lr=1e-4, epsilon=1e-08)\n",
|
||||
" batch_size = 128\n",
|
||||
" \n",
|
||||
" assert os.path.exists(weights_path), 'Model weights not found (see \"weights_path\" variable in script).'\n",
|
||||
" #model.load_weights(weights_path)\n",
|
||||
" f = h5py.File(weights_path)\n",
|
||||
" for k in range(len(f.attrs['layer_names'])):\n",
|
||||
" g = f[f.attrs['layer_names'][k]]\n",
|
||||
" weights = [g[g.attrs['weight_names'][p]] \n",
|
||||
" for p in range(len(g.attrs['weight_names']))]\n",
|
||||
" if k >= len(model.layers):\n",
|
||||
" break\n",
|
||||
" else:\n",
|
||||
" model.layers[k].set_weights(weights)\n",
|
||||
" f.close()\n",
|
||||
" print('Model loaded.')\n",
|
||||
"\n",
|
||||
" model.add(last_layer)\n",
|
||||
"\n",
|
||||
" # set the first 25 layers (up to the last conv block)\n",
|
||||
" # to non-trainable (weights will not be updated)\n",
|
||||
" for layer in model.layers[:25]:\n",
|
||||
" layer.trainable = False\n",
|
||||
"\n",
|
||||
" # compile the model with a SGD/momentum optimizer\n",
|
||||
" # and a very slow learning rate.\n",
|
||||
" model.compile(loss=loss,\n",
|
||||
" optimizer=optimizer,\n",
|
||||
" metrics=['accuracy'])\n",
|
||||
" return model, batch_size\n",
|
||||
"\n",
|
||||
"```"
|
||||
]
|
||||
},
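{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of how the function above might be called; the weights file name and label count are hypothetical placeholders, not values fixed by the tutorial:\n",
"\n",
"```python\n",
"model, batch_size = VGG16_FT(weights_path='vgg16_weights.h5',  # hypothetical local weights file\n",
"                             img_width=224, img_height=224,\n",
"                             f_type=None, n_labels=10)          # n_labels: your number of classes\n",
"```"
]
},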
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Hands On:\n",
|
||||
"\n",
|
||||
"### Try to do the same with other models "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%load deep_learning_models/vgg19.py"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%load deep_learning_models/resnet50.py"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"celltoolbar": "Slideshow",
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
557
deep-learning/keras-tutorial/3.2 RNN and LSTM.ipynb
Normal file
|
@ -0,0 +1,557 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggiohttps://github.com/donnemartin/system-design-primer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Recurrent Neural networks\n",
|
||||
"====="
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### RNN "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"source": [
|
||||
"<img src =\"imgs/rnn.png\" width=\"20%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"A recurrent neural network (RNN) is a class of artificial neural network where connections between units form a directed cycle. This creates an internal state of the network which allows it to exhibit dynamic temporal behavior."
|
||||
]
|
||||
},
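{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the 'internal state' idea concrete, here is a minimal NumPy sketch of a vanilla RNN step (names and sizes are illustrative):\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"def rnn_step(x_t, h_prev, W_xh, W_hh, b_h):\n",
"    \"\"\"One recurrent step: the new hidden state mixes the current\n",
"    input with the previous hidden state (the network's memory).\"\"\"\n",
"    return np.tanh(np.dot(x_t, W_xh) + np.dot(h_prev, W_hh) + b_h)\n",
"\n",
"rng = np.random.RandomState(0)\n",
"W_xh, W_hh, b_h = rng.randn(3, 5), rng.randn(5, 5), np.zeros(5)\n",
"h = np.zeros(5)\n",
"for x_t in rng.randn(4, 3):  # a sequence of 4 inputs, each of dim 3\n",
"    h = rnn_step(x_t, h, W_xh, W_hh, b_h)\n",
"print(h.shape)  # (5,)\n",
"```"
]
},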
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
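"# Constructor signature of the Keras 1.x SimpleRNN layer, shown here for reference:\n",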
"keras.layers.recurrent.SimpleRNN(output_dim, \n",
|
||||
" init='glorot_uniform', inner_init='orthogonal', activation='tanh', \n",
|
||||
" W_regularizer=None, U_regularizer=None, b_regularizer=None, \n",
|
||||
" dropout_W=0.0, dropout_U=0.0)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Backprop Through time "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Contrary to feed-forward neural networks, the RNN is characterized by the ability of encoding longer past information, thus very suitable for sequential models. The BPTT extends the ordinary BP algorithm to suit the recurrent neural\n",
|
||||
"architecture."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"scrolled": true
|
||||
},
|
||||
"source": [
|
||||
"<img src =\"imgs/rnn2.png\" width=\"45%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"%matplotlib inline"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 1,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stderr",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Using Theano backend.\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"import numpy as np\n",
|
||||
"import pandas as pd\n",
|
||||
"import theano\n",
|
||||
"import theano.tensor as T\n",
|
||||
"import keras \n",
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers import Dense, Activation\n",
|
||||
"from keras.preprocessing import image\n",
|
||||
"from __future__ import print_function\n",
|
||||
"import numpy as np\n",
|
||||
"import matplotlib.pyplot as plt\n",
|
||||
"\n",
|
||||
"from keras.datasets import imdb\n",
|
||||
"from keras.datasets import mnist\n",
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers import Dense, Dropout, Activation, Flatten\n",
|
||||
"from keras.layers import Convolution2D, MaxPooling2D\n",
|
||||
"from keras.utils import np_utils\n",
|
||||
"from keras.preprocessing import sequence\n",
|
||||
"from keras.layers.embeddings import Embedding\n",
|
||||
"from keras.layers.recurrent import LSTM, GRU, SimpleRNN\n",
|
||||
"from sklearn.preprocessing import LabelEncoder\n",
|
||||
"from sklearn.preprocessing import StandardScaler\n",
|
||||
"from sklearn.cross_validation import train_test_split\n",
|
||||
"from keras.layers.core import Activation, TimeDistributedDense, RepeatVector\n",
|
||||
"from keras.callbacks import EarlyStopping, ModelCheckpoint"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### IMDB sentiment classification task"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"This is a dataset for binary sentiment classification containing substantially more data than previous benchmark datasets. \n",
|
||||
"\n",
|
||||
"IMDB provided a set of 25,000 highly polar movie reviews for training, and 25,000 for testing. \n",
|
||||
"\n",
|
||||
"There is additional unlabeled data for use as well. Raw text and already processed bag of words formats are provided. \n",
|
||||
"\n",
|
||||
"http://ai.stanford.edu/~amaas/data/sentiment/"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Data Preparation - IMDB"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Loading data...\n",
|
||||
"20000 train sequences\n",
|
||||
"5000 test sequences\n",
|
||||
"Example:\n",
|
||||
"[ [1, 20, 28, 716, 48, 495, 79, 27, 493, 8, 5067, 7, 50, 5, 4682, 13075, 10, 5, 852, 157, 11, 5, 1716, 3351, 10, 5, 500, 7308, 6, 33, 256, 41, 13610, 7, 17, 23, 48, 1537, 3504, 26, 269, 929, 18, 2, 7, 2, 4284, 8, 105, 5, 2, 182, 314, 38, 98, 103, 7, 36, 2184, 246, 360, 7, 19, 396, 17, 26, 269, 929, 18, 1769, 493, 6, 116, 7, 105, 5, 575, 182, 27, 5, 1002, 1085, 130, 62, 17, 24, 89, 17, 13, 381, 1421, 8, 5167, 7, 5, 2723, 38, 325, 7, 17, 23, 93, 9, 156, 252, 19, 235, 20, 28, 5, 104, 76, 7, 17, 169, 35, 14764, 17, 23, 1460, 7, 36, 2184, 934, 56, 2134, 6, 17, 891, 214, 11, 5, 1552, 6, 92, 6, 33, 256, 82, 7]]\n",
|
||||
"Pad sequences (samples x time)\n",
|
||||
"X_train shape: (20000L, 100L)\n",
|
||||
"X_test shape: (5000L, 100L)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"max_features = 20000\n",
|
||||
"maxlen = 100 # cut texts after this number of words (among top max_features most common words)\n",
|
||||
"batch_size = 32\n",
|
||||
"\n",
|
||||
"print(\"Loading data...\")\n",
|
||||
"(X_train, y_train), (X_test, y_test) = imdb.load_data(nb_words=max_features, test_split=0.2)\n",
|
||||
"print(len(X_train), 'train sequences')\n",
|
||||
"print(len(X_test), 'test sequences')\n",
|
||||
"\n",
|
||||
"print('Example:')\n",
|
||||
"print(X_train[:1])\n",
|
||||
"\n",
|
||||
"print(\"Pad sequences (samples x time)\")\n",
|
||||
"X_train = sequence.pad_sequences(X_train, maxlen=maxlen)\n",
|
||||
"X_test = sequence.pad_sequences(X_test, maxlen=maxlen)\n",
|
||||
"print('X_train shape:', X_train.shape)\n",
|
||||
"print('X_test shape:', X_test.shape)"
|
||||
]
|
||||
},
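{
"cell_type": "markdown",
"metadata": {},
"source": [
"`sequence.pad_sequences` does the heavy lifting above. A quick toy sketch of its behaviour: by default both padding and truncation happen at the *front* (`'pre'`) of each sequence."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"toy = [[1, 2, 3], [4, 5, 6, 7, 8, 9]]\n",
"# pad/truncate every sequence to length 5\n",
"print(sequence.pad_sequences(toy, maxlen=5))\n",
"# expected:\n",
"# [[0 0 1 2 3]\n",
"#  [5 6 7 8 9]]"
]
},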
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"#### Model building "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Build model...\n",
|
||||
"Train...\n",
|
||||
"Train on 20000 samples, validate on 5000 samples\n",
|
||||
"Epoch 1/1\n",
|
||||
"20000/20000 [==============================] - 174s - loss: 0.7213 - val_loss: 0.6179\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<keras.callbacks.History at 0x20519860>"
|
||||
]
|
||||
},
|
||||
"execution_count": 7,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print('Build model...')\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Embedding(max_features, 128, input_length=maxlen))\n",
|
||||
"model.add(SimpleRNN(128)) \n",
|
||||
"model.add(Dropout(0.5))\n",
|
||||
"model.add(Dense(1))\n",
|
||||
"model.add(Activation('sigmoid'))\n",
|
||||
"\n",
|
||||
"# try using different optimizers and different optimizer configs\n",
|
||||
"model.compile(loss='binary_crossentropy', optimizer='adam', class_mode=\"binary\")\n",
|
||||
"\n",
|
||||
"print(\"Train...\")\n",
|
||||
"model.fit(X_train, y_train, batch_size=batch_size, nb_epoch=1, validation_data=(X_test, y_test), show_accuracy=True)"
|
||||
]
|
||||
},
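{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see how well this first recurrent model generalises, evaluate it on the held-out reviews (the same Keras 1.x call used in the hands-on exercise below):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"score, acc = model.evaluate(X_test, y_test, batch_size=batch_size, show_accuracy=True)\n",
"print('Test score:', score)\n",
"print('Test accuracy:', acc)"
]
},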
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### LSTM "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"A LSTM network is an artificial neural network that contains LSTM blocks instead of, or in addition to, regular network units. A LSTM block may be described as a \"smart\" network unit that can remember a value for an arbitrary length of time. \n",
|
||||
"\n",
|
||||
"Unlike traditional RNNs, an Long short-term memory network is well-suited to learn from experience to classify, process and predict time series when there are very long time lags of unknown size between important events."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"scrolled": true
|
||||
},
|
||||
"source": [
|
||||
"<img src =\"imgs/gru.png\" width=\"60%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"keras.layers.recurrent.LSTM(output_dim, init='glorot_uniform', inner_init='orthogonal', \n",
|
||||
" forget_bias_init='one', activation='tanh', \n",
|
||||
" inner_activation='hard_sigmoid', \n",
|
||||
" W_regularizer=None, U_regularizer=None, b_regularizer=None, \n",
|
||||
" dropout_W=0.0, dropout_U=0.0)"
|
||||
]
|
||||
},
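{
"cell_type": "markdown",
"metadata": {},
"source": [
"In practice only a handful of these arguments are needed. A minimal sketch, reusing the IMDB variables defined above, of dropping an LSTM into the same architecture used for the `SimpleRNN`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"model = Sequential()\n",
"model.add(Embedding(max_features, 128, input_length=maxlen))\n",
"model.add(LSTM(128))  # drop-in replacement for SimpleRNN(128)\n",
"model.add(Dropout(0.5))\n",
"model.add(Dense(1))\n",
"model.add(Activation('sigmoid'))\n",
"model.compile(loss='binary_crossentropy', optimizer='adam', class_mode=\"binary\")"
]
},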
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### GRU "
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Gated recurrent units are a gating mechanism in recurrent neural networks. \n",
|
||||
"\n",
|
||||
"Much similar to the LSTMs, they have fewer parameters than LSTM, as they lack an output gate."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"keras.layers.recurrent.GRU(output_dim, init='glorot_uniform', inner_init='orthogonal', \n",
|
||||
" activation='tanh', inner_activation='hard_sigmoid', \n",
|
||||
" W_regularizer=None, U_regularizer=None, b_regularizer=None, \n",
|
||||
" dropout_W=0.0, dropout_U=0.0)"
|
||||
]
|
||||
},
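{
"cell_type": "markdown",
"metadata": {},
"source": [
"Usage is identical to the LSTM above: just swap the layer, e.g. `model.add(GRU(128))`. With one gate fewer, a GRU typically trains a bit faster per step for a comparable hidden size."
]
},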
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Your Turn! - Hands on Rnn"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"print('Build model...')\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Embedding(max_features, 128, input_length=maxlen))\n",
|
||||
"\n",
|
||||
"# Play with those! try and get better results!\n",
|
||||
"#model.add(SimpleRNN(128)) \n",
|
||||
"#model.add(GRU(128)) \n",
|
||||
"#model.add(LSTM(128)) \n",
|
||||
"\n",
|
||||
"model.add(Dropout(0.5))\n",
|
||||
"model.add(Dense(1))\n",
|
||||
"model.add(Activation('sigmoid'))\n",
|
||||
"\n",
|
||||
"# try using different optimizers and different optimizer configs\n",
|
||||
"model.compile(loss='binary_crossentropy', optimizer='adam', class_mode=\"binary\")\n",
|
||||
"\n",
|
||||
"print(\"Train...\")\n",
|
||||
"model.fit(X_train, y_train, batch_size=batch_size, \n",
|
||||
" nb_epoch=4, validation_data=(X_test, y_test), show_accuracy=True)\n",
|
||||
"score, acc = model.evaluate(X_test, y_test, batch_size=batch_size, show_accuracy=True)\n",
|
||||
"print('Test score:', score)\n",
|
||||
"print('Test accuracy:', acc)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Sentence Generation using RNN(LSTM)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": null,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Downloading data from https://s3.amazonaws.com/text-datasets/nietzsche.txt\n",
|
||||
"598016/600901 [============================>.] - ETA: 0s('corpus length:', 600901)\n",
|
||||
"('total chars:', 59)\n",
|
||||
"('nb sequences:', 200287)\n",
|
||||
"Vectorization...\n",
|
||||
"Build model...\n",
|
||||
"()\n",
|
||||
"--------------------------------------------------\n",
|
||||
"('Iteration', 1)\n",
|
||||
"Epoch 1/1\n",
|
||||
"200287/200287 [==============================] - 1367s - loss: 1.9977 \n",
|
||||
"()\n",
|
||||
"('----- diversity:', 0.2)\n",
|
||||
"----- Generating with seed: \"nd the frenzied\n",
|
||||
"speeches of the prophets\"\n",
|
||||
"nd the frenzied\n",
|
||||
"speeches of the prophets and the present and and the preases and the soul to the sense of the morals and the some the consequence of the most and one only the some of the proment and interent of the some devertal to the self-consertion of the some deverent of the some distiness and the sense of the some of the morality of the most proves and the some of the some in the seem of the self-conception of the sees of the sense()\n",
|
||||
"()\n",
|
||||
"('----- diversity:', 0.5)\n",
|
||||
"----- Generating with seed: \"nd the frenzied\n",
|
||||
"speeches of the prophets\"\n",
|
||||
"nd the frenzied\n",
|
||||
"speeches of the prophets of the preat weak to the master of man who onow in interervain of even which who with it is the isitaial conception of the some live the contented the one who exilfacied in the sees to raters, and the passe expecience the inte that the persented in the pass, in the experious of the soulity of the waith the morally distanding of the some of the most interman only and as a period of the sense and o()\n",
|
||||
"()\n",
|
||||
"('----- diversity:', 1.0)\n",
|
||||
"----- Generating with seed: \"nd the frenzied\n",
|
||||
"speeches of the prophets\"\n",
|
||||
"nd the frenzied\n",
|
||||
"speeches of the prophets of\n",
|
||||
"ar self now no ecerspoped ivent so not,\n",
|
||||
"that itsed undiswerbatarlials. what it is altrenively evok\n",
|
||||
"now be scotnew\n",
|
||||
"prigardiness intagualds, and coumond-grow to\n",
|
||||
"the respence you as penires never wand be\n",
|
||||
"natuented ost ablinice to love worts an who itnopeancew be than mrank againribl\n",
|
||||
"some something lines in the estlenbtupenies of korils divenowry apmains, curte, were,\n",
|
||||
"ind \"feulness. a will, natur()\n",
|
||||
"()\n",
|
||||
"('----- diversity:', 1.2)\n",
|
||||
"----- Generating with seed: \"nd the frenzied\n",
|
||||
"speeches of the prophets\"\n",
|
||||
"nd the frenzied\n",
|
||||
"speeches of the prophets, ind someaterting will stroour hast-fards and lofe beausold, in souby in ruarest, we withquus. \"the capinistin and it a mode what it be\n",
|
||||
"my oc, to th[se condectay\n",
|
||||
"of ymo fre\n",
|
||||
"dunt and so asexthersess renieved concecunaulies tound\"), from glubiakeitiouals kenty am feelitafouer deceanw or sumpind, and by afolod peall--phasoos of sole\n",
|
||||
"iy copprajakias\n",
|
||||
"in\n",
|
||||
"in adcyont-mean to prives apf-rigionall thust wi()\n",
|
||||
"()\n",
|
||||
"--------------------------------------------------\n",
|
||||
"('Iteration', 2)\n",
|
||||
"Epoch 1/1\n",
|
||||
" 40576/200287 [=====>........................] - ETA: 1064s - loss: 1.6878"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers import Dense, Activation, Dropout\n",
|
||||
"from keras.layers import LSTM\n",
|
||||
"from keras.optimizers import RMSprop\n",
|
||||
"from keras.utils.data_utils import get_file\n",
|
||||
"import numpy as np\n",
|
||||
"import random\n",
|
||||
"import sys\n",
|
||||
"\n",
|
||||
"path = get_file('nietzsche.txt', origin=\"https://s3.amazonaws.com/text-datasets/nietzsche.txt\")\n",
|
||||
"text = open(path).read().lower()\n",
|
||||
"print('corpus length:', len(text))\n",
|
||||
"\n",
|
||||
"chars = sorted(list(set(text)))\n",
|
||||
"print('total chars:', len(chars))\n",
|
||||
"char_indices = dict((c, i) for i, c in enumerate(chars))\n",
|
||||
"indices_char = dict((i, c) for i, c in enumerate(chars))\n",
|
||||
"\n",
|
||||
"# cut the text in semi-redundant sequences of maxlen characters\n",
|
||||
"maxlen = 40\n",
|
||||
"step = 3\n",
|
||||
"sentences = []\n",
|
||||
"next_chars = []\n",
|
||||
"for i in range(0, len(text) - maxlen, step):\n",
|
||||
" sentences.append(text[i: i + maxlen])\n",
|
||||
" next_chars.append(text[i + maxlen])\n",
|
||||
"print('nb sequences:', len(sentences))\n",
|
||||
"\n",
|
||||
"print('Vectorization...')\n",
|
||||
"X = np.zeros((len(sentences), maxlen, len(chars)), dtype=np.bool)\n",
|
||||
"y = np.zeros((len(sentences), len(chars)), dtype=np.bool)\n",
|
||||
"for i, sentence in enumerate(sentences):\n",
|
||||
" for t, char in enumerate(sentence):\n",
|
||||
" X[i, t, char_indices[char]] = 1\n",
|
||||
" y[i, char_indices[next_chars[i]]] = 1\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# build the model: a single LSTM\n",
|
||||
"print('Build model...')\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(LSTM(128, input_shape=(maxlen, len(chars))))\n",
|
||||
"model.add(Dense(len(chars)))\n",
|
||||
"model.add(Activation('softmax'))\n",
|
||||
"\n",
|
||||
"optimizer = RMSprop(lr=0.01)\n",
|
||||
"model.compile(loss='categorical_crossentropy', optimizer=optimizer)\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"def sample(preds, temperature=1.0):\n",
|
||||
" # helper function to sample an index from a probability array\n",
|
||||
" preds = np.asarray(preds).astype('float64')\n",
|
||||
" preds = np.log(preds) / temperature\n",
|
||||
" exp_preds = np.exp(preds)\n",
|
||||
" preds = exp_preds / np.sum(exp_preds)\n",
|
||||
" probas = np.random.multinomial(1, preds, 1)\n",
|
||||
" return np.argmax(probas)\n",
|
||||
"\n",
|
||||
"# train the model, output generated text after each iteration\n",
|
||||
"for iteration in range(1, 60):\n",
|
||||
" print()\n",
|
||||
" print('-' * 50)\n",
|
||||
" print('Iteration', iteration)\n",
|
||||
" model.fit(X, y, batch_size=128, nb_epoch=1)\n",
|
||||
"\n",
|
||||
" start_index = random.randint(0, len(text) - maxlen - 1)\n",
|
||||
"\n",
|
||||
" for diversity in [0.2, 0.5, 1.0, 1.2]:\n",
|
||||
" print()\n",
|
||||
" print('----- diversity:', diversity)\n",
|
||||
"\n",
|
||||
" generated = ''\n",
|
||||
" sentence = text[start_index: start_index + maxlen]\n",
|
||||
" generated += sentence\n",
|
||||
" print('----- Generating with seed: \"' + sentence + '\"')\n",
|
||||
" sys.stdout.write(generated)\n",
|
||||
"\n",
|
||||
" for i in range(400):\n",
|
||||
" x = np.zeros((1, maxlen, len(chars)))\n",
|
||||
" for t, char in enumerate(sentence):\n",
|
||||
" x[0, t, char_indices[char]] = 1.\n",
|
||||
"\n",
|
||||
" preds = model.predict(x, verbose=0)[0]\n",
|
||||
" next_index = sample(preds, diversity)\n",
|
||||
" next_char = indices_char[next_index]\n",
|
||||
"\n",
|
||||
" generated += next_char\n",
|
||||
" sentence = sentence[1:] + next_char\n",
|
||||
"\n",
|
||||
" sys.stdout.write(next_char)\n",
|
||||
" sys.stdout.flush()\n",
|
||||
" print()"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
|
@ -0,0 +1,964 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggiohttps://github.com/donnemartin/system-design-primer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"source": [
|
||||
"\n",
|
||||
"# RNN using LSTM \n",
|
||||
" \n",
|
||||
"\n",
|
||||
"\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/RNN-rolled.png\"/ width=\"80px\" height=\"80px\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/RNN-unrolled.png\"/ width=\"400px\" height=\"400px\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"<img src=\"imgs/LSTM3-chain.png\"/ width=\"60%\">"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"_source: http://colah.github.io/posts/2015-08-Understanding-LSTMs_"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 3,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from keras.optimizers import SGD\n",
|
||||
"from keras.preprocessing.text import one_hot, text_to_word_sequence, base_filter\n",
|
||||
"from keras.utils import np_utils\n",
|
||||
"from keras.models import Sequential\n",
|
||||
"from keras.layers.core import Dense, Dropout, Activation\n",
|
||||
"from keras.layers.embeddings import Embedding\n",
|
||||
"from keras.layers.recurrent import LSTM, GRU\n",
|
||||
"from keras.preprocessing import sequence"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"### Reading blog post from data directory"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 4,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import os\n",
|
||||
"import pickle\n",
|
||||
"import numpy as np"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 5,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"/home/valerio/deep-learning-keras-euroscipy2016/data\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"DATA_DIRECTORY = os.path.join(os.path.abspath(os.path.curdir), 'data')\n",
|
||||
"print(DATA_DIRECTORY)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 6,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"male_posts = []\n",
|
||||
"female_post = []"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 7,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"with open(os.path.join(DATA_DIRECTORY,\"male_blog_list.txt\"),\"rb\") as male_file:\n",
|
||||
" male_posts= pickle.load(male_file)\n",
|
||||
" \n",
|
||||
"with open(os.path.join(DATA_DIRECTORY,\"female_blog_list.txt\"),\"rb\") as female_file:\n",
|
||||
" female_posts = pickle.load(female_file)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 85,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"filtered_male_posts = list(filter(lambda p: len(p) > 0, male_posts))\n",
|
||||
"filtered_female_posts = list(filter(lambda p: len(p) > 0, female_posts))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 86,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# text processing - one hot builds index of the words\n",
|
||||
"male_one_hot = []\n",
|
||||
"female_one_hot = []\n",
|
||||
"n = 30000\n",
|
||||
"for post in filtered_male_posts:\n",
|
||||
" try:\n",
|
||||
" male_one_hot.append(one_hot(post, n, split=\" \", filters=base_filter(), lower=True))\n",
|
||||
" except:\n",
|
||||
" continue\n",
|
||||
"\n",
|
||||
"for post in filtered_female_posts:\n",
|
||||
" try:\n",
|
||||
" female_one_hot.append(one_hot(post,n,split=\" \",filters=base_filter(),lower=True))\n",
|
||||
" except:\n",
|
||||
" continue"
|
||||
]
|
||||
},
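{
"cell_type": "markdown",
"metadata": {},
"source": [
"Despite its name, `one_hot` returns a list of integer word indices (computed with a hashing trick), not one-hot vectors; with a finite index size `n`, distinct words can collide. The `try`/`except` simply skips the occasional post the tokenizer fails on."
]
},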
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 87,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# 0 for male, 1 for female\n",
|
||||
"concatenate_array_rnn = np.concatenate((np.zeros(len(male_one_hot)),\n",
|
||||
" np.ones(len(female_one_hot))))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 88,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from sklearn.cross_validation import train_test_split\n",
|
||||
"\n",
|
||||
"X_train_rnn, X_test_rnn, y_train_rnn, y_test_rnn = train_test_split(np.concatenate((female_one_hot,male_one_hot)),\n",
|
||||
" concatenate_array_rnn, \n",
|
||||
" test_size=0.2)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 89,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"X_train_rnn shape: (3873, 100) (3873,)\n",
|
||||
"X_test_rnn shape: (969, 100) (969,)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"maxlen = 100\n",
|
||||
"X_train_rnn = sequence.pad_sequences(X_train_rnn, maxlen=maxlen)\n",
|
||||
"X_test_rnn = sequence.pad_sequences(X_test_rnn, maxlen=maxlen)\n",
|
||||
"print('X_train_rnn shape:', X_train_rnn.shape, y_train_rnn.shape)\n",
|
||||
"print('X_test_rnn shape:', X_test_rnn.shape, y_test_rnn.shape)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 90,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"max_features = 30000\n",
|
||||
"dimension = 128\n",
|
||||
"output_dimension = 128\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Embedding(max_features, dimension))\n",
|
||||
"model.add(LSTM(output_dimension))\n",
|
||||
"model.add(Dropout(0.5))\n",
|
||||
"model.add(Dense(1))\n",
|
||||
"model.add(Activation('sigmoid'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 91,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model.compile(loss='mean_squared_error', optimizer='sgd', metrics=['accuracy'])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 92,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Train on 3873 samples, validate on 969 samples\n",
|
||||
"Epoch 1/4\n",
|
||||
"3873/3873 [==============================] - 3s - loss: 0.2487 - acc: 0.5378 - val_loss: 0.2506 - val_acc: 0.5191\n",
|
||||
"Epoch 2/4\n",
|
||||
"3873/3873 [==============================] - 3s - loss: 0.2486 - acc: 0.5401 - val_loss: 0.2508 - val_acc: 0.5191\n",
|
||||
"Epoch 3/4\n",
|
||||
"3873/3873 [==============================] - 3s - loss: 0.2484 - acc: 0.5417 - val_loss: 0.2496 - val_acc: 0.5191\n",
|
||||
"Epoch 4/4\n",
|
||||
"3873/3873 [==============================] - 3s - loss: 0.2484 - acc: 0.5399 - val_loss: 0.2502 - val_acc: 0.5191\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<keras.callbacks.History at 0x7fa1e96ac4e0>"
|
||||
]
|
||||
},
|
||||
"execution_count": 92,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"model.fit(X_train_rnn, y_train_rnn, batch_size=32,\n",
|
||||
" nb_epoch=4, validation_data=(X_test_rnn, y_test_rnn))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 93,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"969/969 [==============================] - 0s \n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"score, acc = model.evaluate(X_test_rnn, y_test_rnn, batch_size=32)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 94,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.250189056399 0.519091847357\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(score, acc)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Using TFIDF Vectorizer as an input instead of one hot encoder"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 95,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"from sklearn.feature_extraction.text import TfidfVectorizer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 96,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"vectorizer = TfidfVectorizer(decode_error='ignore', norm='l2', min_df=5)\n",
|
||||
"tfidf_male = vectorizer.fit_transform(filtered_male_posts)\n",
|
||||
"tfidf_female = vectorizer.fit_transform(filtered_female_posts)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 97,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"flattened_array_tfidf_male = tfidf_male.toarray()\n",
|
||||
"flattened_array_tfidf_female = tfidf_male.toarray()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 98,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"y_rnn = np.concatenate((np.zeros(len(flattened_array_tfidf_male)),\n",
|
||||
" np.ones(len(flattened_array_tfidf_female))))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 99,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"X_train_rnn, X_test_rnn, y_train_rnn, y_test_rnn = train_test_split(np.concatenate((flattened_array_tfidf_male, \n",
|
||||
" flattened_array_tfidf_female)),\n",
|
||||
" y_rnn,test_size=0.2)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 100,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"X_train_rnn shape: (4152, 100) (4152,)\n",
|
||||
"X_test_rnn shape: (1038, 100) (1038,)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"maxlen = 100\n",
|
||||
"X_train_rnn = sequence.pad_sequences(X_train_rnn, maxlen=maxlen)\n",
|
||||
"X_test_rnn = sequence.pad_sequences(X_test_rnn, maxlen=maxlen)\n",
|
||||
"print('X_train_rnn shape:', X_train_rnn.shape, y_train_rnn.shape)\n",
|
||||
"print('X_test_rnn shape:', X_test_rnn.shape, y_test_rnn.shape)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 101,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"max_features = 30000\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(Embedding(max_features, dimension))\n",
|
||||
"model.add(LSTM(output_dimension))\n",
|
||||
"model.add(Dropout(0.5))\n",
|
||||
"model.add(Dense(1))\n",
|
||||
"model.add(Activation('sigmoid'))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 102,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"model.compile(loss='mean_squared_error',optimizer='sgd', metrics=['accuracy'])"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 103,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Train on 4152 samples, validate on 1038 samples\n",
|
||||
"Epoch 1/4\n",
|
||||
"4152/4152 [==============================] - 3s - loss: 0.2502 - acc: 0.4988 - val_loss: 0.2503 - val_acc: 0.4865\n",
|
||||
"Epoch 2/4\n",
|
||||
"4152/4152 [==============================] - 3s - loss: 0.2507 - acc: 0.4843 - val_loss: 0.2500 - val_acc: 0.4865\n",
|
||||
"Epoch 3/4\n",
|
||||
"4152/4152 [==============================] - 3s - loss: 0.2504 - acc: 0.4952 - val_loss: 0.2501 - val_acc: 0.4865\n",
|
||||
"Epoch 4/4\n",
|
||||
"4152/4152 [==============================] - 3s - loss: 0.2506 - acc: 0.4913 - val_loss: 0.2500 - val_acc: 0.5135\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"data": {
|
||||
"text/plain": [
|
||||
"<keras.callbacks.History at 0x7fa1f466f278>"
|
||||
]
|
||||
},
|
||||
"execution_count": 103,
|
||||
"metadata": {},
|
||||
"output_type": "execute_result"
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"model.fit(X_train_rnn, y_train_rnn, \n",
|
||||
" batch_size=32, nb_epoch=4,\n",
|
||||
" validation_data=(X_test_rnn, y_test_rnn))"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 104,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"1038/1038 [==============================] - 0s \n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"score,acc = model.evaluate(X_test_rnn, y_test_rnn, \n",
|
||||
" batch_size=32)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 105,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"0.249981284572 0.513487476145\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"print(score, acc)"
|
||||
]
|
||||
},
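{
"cell_type": "markdown",
"metadata": {},
"source": [
"Chance-level accuracy here is not surprising: `pad_sequences` truncates each real-valued TF-IDF row to its last 100 columns, and the `Embedding` layer then interprets those values as integer word indices, which they are not. TF-IDF features suit a plain feed-forward network better. A minimal sketch of such a baseline (using the unpadded TF-IDF matrices from above):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"X = np.concatenate((flattened_array_tfidf_male, flattened_array_tfidf_female))\n",
"X_tr, X_te, y_tr, y_te = train_test_split(X, y_rnn, test_size=0.2)\n",
"\n",
"mlp = Sequential()\n",
"mlp.add(Dense(64, input_dim=X.shape[1], activation='relu'))\n",
"mlp.add(Dropout(0.5))\n",
"mlp.add(Dense(1, activation='sigmoid'))\n",
"mlp.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])\n",
"mlp.fit(X_tr, y_tr, batch_size=32, nb_epoch=4, validation_data=(X_te, y_te))"
]
},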
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"# Sentence Generation using LSTM"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 106,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# reading all the male text data into one string\n",
|
||||
"male_post = ' '.join(filtered_male_posts)\n",
|
||||
"\n",
|
||||
"#building character set for the male posts\n",
|
||||
"character_set_male = set(male_post)\n",
|
||||
"#building two indices - character index and index of character\n",
|
||||
"char_indices = dict((c, i) for i, c in enumerate(character_set_male))\n",
|
||||
"indices_char = dict((i, c) for i, c in enumerate(character_set_male))\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"# cut the text in semi-redundant sequences of maxlen characters\n",
|
||||
"maxlen = 20\n",
|
||||
"step = 1\n",
|
||||
"sentences = []\n",
|
||||
"next_chars = []\n",
|
||||
"for i in range(0, len(male_post) - maxlen, step):\n",
|
||||
" sentences.append(male_post[i : i + maxlen])\n",
|
||||
" next_chars.append(male_post[i + maxlen])\n"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 107,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"(2552476, 20, 152) (2552476, 152)\n",
|
||||
"(2552476, 20, 152) (2552476, 152)\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"#Vectorisation of input\n",
|
||||
"x_male = np.zeros((len(male_post), maxlen, len(character_set_male)), dtype=np.bool)\n",
|
||||
"y_male = np.zeros((len(male_post), len(character_set_male)), dtype=np.bool)\n",
|
||||
"\n",
|
||||
"print(x_male.shape, y_male.shape)\n",
|
||||
"\n",
|
||||
"for i, sentence in enumerate(sentences):\n",
|
||||
" for t, char in enumerate(sentence):\n",
|
||||
" x_male[i, t, char_indices[char]] = 1\n",
|
||||
" y_male[i, char_indices[next_chars[i]]] = 1\n",
|
||||
"\n",
|
||||
"print(x_male.shape, y_male.shape)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 109,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"Build model...\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"\n",
|
||||
"# build the model: a single LSTM\n",
|
||||
"print('Build model...')\n",
|
||||
"model = Sequential()\n",
|
||||
"model.add(LSTM(128, input_shape=(maxlen, len(character_set_male))))\n",
|
||||
"model.add(Dense(len(character_set_male)))\n",
|
||||
"model.add(Activation('softmax'))\n",
|
||||
"\n",
|
||||
"optimizer = RMSprop(lr=0.01)\n",
|
||||
"model.compile(loss='categorical_crossentropy', optimizer=optimizer)"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 74,
|
||||
"metadata": {
|
||||
"collapsed": false
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"auto_text_generating_male_model.compile(loss='mean_squared_error',optimizer='sgd')"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 110,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"import random, sys"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 111,
|
||||
"metadata": {
|
||||
"collapsed": true
|
||||
},
|
||||
"outputs": [],
|
||||
"source": [
|
||||
"# helper function to sample an index from a probability array\n",
|
||||
"def sample(a, diversity=0.75):\n",
|
||||
" if random.random() > diversity:\n",
|
||||
" return np.argmax(a)\n",
|
||||
" while 1:\n",
|
||||
" i = random.randint(0, len(a)-1)\n",
|
||||
" if a[i] > random.random():\n",
|
||||
" return i"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "code",
|
||||
"execution_count": 113,
|
||||
"metadata": {
|
||||
"collapsed": false,
|
||||
"scrolled": false
|
||||
},
|
||||
"outputs": [
|
||||
{
|
||||
"name": "stdout",
|
||||
"output_type": "stream",
|
||||
"text": [
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 1\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 226s - loss: 1.8022 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"p from the lack of \"\n",
|
||||
"sense of the search \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"p from the lack of \"\n",
|
||||
"through that possibl\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"p from the lack of \"\n",
|
||||
". This is a \" by p\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"p from the lack of \"\n",
|
||||
"d he latermal ta we \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 2\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 228s - loss: 1.7312 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"s Last Dance\" with t\"\n",
|
||||
" screening on the st\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"s Last Dance\" with t\"\n",
|
||||
"r song think of the \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"s Last Dance\" with t\"\n",
|
||||
". I'm akin computer \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"s Last Dance\" with t\"\n",
|
||||
"played that comment \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 3\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 229s - loss: 1.8693 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \", as maybe someone w\"\n",
|
||||
"the ssone the so the\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \", as maybe someone w\"\n",
|
||||
"the sasd nouts and t\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \", as maybe someone w\"\n",
|
||||
"p hin I had at f¿ to\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \", as maybe someone w\"\n",
|
||||
"oge rely bluy leanda\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 4\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 228s - loss: 1.9135 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"o the package :(. Ah\"\n",
|
||||
" suadedbe teacher th\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"o the package :(. Ah\"\n",
|
||||
"e a searingly the id\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"o the package :(. Ah\"\n",
|
||||
"propost the bure so \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"o the package :(. Ah\"\n",
|
||||
"ing.Lever fan. By in\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 5\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 229s - loss: 4.5892 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"ot as long as my fri\"\n",
|
||||
"atde getu th> QQ.“]\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"ot as long as my fri\"\n",
|
||||
"tQ t[we QaaefYhere Q\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"ot as long as my fri\"\n",
|
||||
"ew[”*ing”e[ t[w that\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"ot as long as my fri\"\n",
|
||||
" me]sQoonQ“]e” ti nw\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 6\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 229s - loss: 6.7174 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"use I'm pretty damn \"\n",
|
||||
"me g 'o a a a a\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"use I'm pretty damn \"\n",
|
||||
" a o theT a o a \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"use I'm pretty damn \"\n",
|
||||
" n . thot auupe to \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"use I'm pretty damn \"\n",
|
||||
" tomalek ho tt Ion i\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 7\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 227s - loss: 6.9138 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"ats all got along be\"\n",
|
||||
" thrtg t ia thv i c\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"ats all got along be\"\n",
|
||||
"th wtot.. t to gt? \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"ats all got along be\"\n",
|
||||
" ed dthwnn,is a ment\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"ats all got along be\"\n",
|
||||
" t incow . wmiyit\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 8\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 228s - loss: 11.0629 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \"oot of my sleeping b\"\n",
|
||||
"m g te>t e s t anab\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \"oot of my sleeping b\"\n",
|
||||
" dttoe s s“snge es s\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \"oot of my sleeping b\"\n",
|
||||
"tut hou wen a onap\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \"oot of my sleeping b\"\n",
|
||||
"evtyr tt e io on tok\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"--------------------------------------------------\n",
|
||||
"Iteration 9\n",
|
||||
"Epoch 1/1\n",
|
||||
"2552476/2552476 [==============================] - 228s - loss: 8.7874 \n",
|
||||
"\n",
|
||||
"----- diversity: 0.2\n",
|
||||
"----- Generating with seed: \" I’ve always looked \"\n",
|
||||
"ea e ton ann n ffee\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.4\n",
|
||||
"----- Generating with seed: \" I’ve always looked \"\n",
|
||||
"o tire n a anV sia a\n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.6\n",
|
||||
"----- Generating with seed: \" I’ve always looked \"\n",
|
||||
"r i jooe Vag o en \n",
|
||||
"\n",
|
||||
"\n",
|
||||
"----- diversity: 0.8\n",
|
||||
"----- Generating with seed: \" I’ve always looked \"\n",
|
||||
" ao at ge ena oro o\n",
|
||||
"\n"
|
||||
]
|
||||
}
|
||||
],
|
||||
"source": [
|
||||
"# train the model, output generated text after each iteration\n",
|
||||
"for iteration in range(1,10):\n",
|
||||
" print()\n",
|
||||
" print('-' * 50)\n",
|
||||
" print('Iteration', iteration)\n",
|
||||
" model.fit(x_male, y_male, batch_size=128, nb_epoch=1)\n",
|
||||
"\n",
|
||||
" start_index = random.randint(0, len(male_post) - maxlen - 1)\n",
|
||||
"\n",
|
||||
" for diversity in [0.2, 0.4, 0.6, 0.8]:\n",
|
||||
" print()\n",
|
||||
" print('----- diversity:', diversity)\n",
|
||||
"\n",
|
||||
" generated = ''\n",
|
||||
" sentence = male_post[start_index : start_index + maxlen]\n",
|
||||
" generated += sentence\n",
|
||||
" print('----- Generating with seed: \"' + sentence + '\"')\n",
|
||||
"\n",
|
||||
" for iteration in range(400):\n",
|
||||
" try:\n",
|
||||
" x = np.zeros((1, maxlen, len(character_set_male)))\n",
|
||||
" for t, char in enumerate(sentence):\n",
|
||||
" x[0, t, char_indices[char]] = 1.\n",
|
||||
"\n",
|
||||
" preds = model.predict(x, verbose=0)[0]\n",
|
||||
" next_index = sample(preds, diversity)\n",
|
||||
" next_char = indices_char[next_index]\n",
|
||||
"\n",
|
||||
" generated += next_char\n",
|
||||
" sentence = sentence[1:] + next_char\n",
|
||||
" except:\n",
|
||||
" continue\n",
|
||||
" \n",
|
||||
" print(sentence)\n",
|
||||
" print()"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"celltoolbar": "Slideshow",
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
120
deep-learning/keras-tutorial/4. Conclusions.ipynb
Normal file
|
@ -0,0 +1,120 @@
|
|||
{
|
||||
"cells": [
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {},
|
||||
"source": [
|
||||
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggiohttps://github.com/donnemartin/system-design-primer"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"# Conclusions"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"* Keras is a powerful and battery-included framework for Deep Learning in Python\n",
|
||||
"\n",
|
||||
"* Keras is **simple** to use..\n",
|
||||
"\n",
|
||||
"* ...but it is **not** for simple things!"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/keras_rank_1.jpg\" width=\"65%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "subslide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"<img src=\"imgs/keras_rank_2.jpg\" width=\"65%\" />"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "slide"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"## Some References for .."
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Cutting Edge\n",
|
||||
"\n",
|
||||
"* Fractal Net Implementation with Keras: https://github.com/snf/keras-fractalnet -\n",
|
||||
"* Please check out: [https://github.com/fchollet/keras-resources]()"
|
||||
]
|
||||
},
|
||||
{
|
||||
"cell_type": "markdown",
|
||||
"metadata": {
|
||||
"slideshow": {
|
||||
"slide_type": "fragment"
|
||||
}
|
||||
},
|
||||
"source": [
|
||||
"#### Hyper-Cool\n",
|
||||
"\n",
|
||||
"* Hyperas: https://github.com/maxpumperla/hyperas\n",
|
||||
" - A web dashboard for Keras Models"
|
||||
]
|
||||
}
|
||||
],
|
||||
"metadata": {
|
||||
"celltoolbar": "Slideshow",
|
||||
"kernelspec": {
|
||||
"display_name": "Python 3",
|
||||
"language": "python",
|
||||
"name": "python3"
|
||||
},
|
||||
"language_info": {
|
||||
"codemirror_mode": {
|
||||
"name": "ipython",
|
||||
"version": 3
|
||||
},
|
||||
"file_extension": ".py",
|
||||
"mimetype": "text/x-python",
|
||||
"name": "python",
|
||||
"nbconvert_exporter": "python",
|
||||
"pygments_lexer": "ipython3",
|
||||
"version": "3.4.3"
|
||||
}
|
||||
},
|
||||
"nbformat": 4,
|
||||
"nbformat_minor": 0
|
||||
}
|
21
deep-learning/keras-tutorial/LICENSE
Normal file
|
@ -0,0 +1,21 @@
|
|||
The MIT License (MIT)
|
||||
|
||||
Copyright (c) 2017 MPBA
|
||||
|
||||
Permission is hereby granted, free of charge, to any person obtaining a copy
|
||||
of this software and associated documentation files (the "Software"), to deal
|
||||
in the Software without restriction, including without limitation the rights
|
||||
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
||||
copies of the Software, and to permit persons to whom the Software is
|
||||
furnished to do so, subject to the following conditions:
|
||||
|
||||
The above copyright notice and this permission notice shall be included in all
|
||||
copies or substantial portions of the Software.
|
||||
|
||||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
||||
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
||||
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
||||
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
||||
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
||||
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
||||
SOFTWARE.
|
BIN
deep-learning/keras-tutorial/data/female_blog_list.txt
Normal file
1
deep-learning/keras-tutorial/data/intro_to_ann.csv
Normal file
BIN
deep-learning/keras-tutorial/data/male_blog_list.txt
Normal file
BIN
deep-learning/keras-tutorial/data/mnist.pkl.gz
Normal file
5331
deep-learning/keras-tutorial/data/rt-polarity.neg
Normal file
5331
deep-learning/keras-tutorial/data/rt-polarity.pos
Normal file
116
deep-learning/keras-tutorial/data_helpers.py
Normal file
|
@ -0,0 +1,116 @@
|
|||
import numpy as np
|
||||
import re
|
||||
import itertools
|
||||
from collections import Counter
|
||||
"""
|
||||
Original taken from https://github.com/dennybritz/cnn-text-classification-tf
|
||||
"""
|
||||
|
||||
def clean_str(string):
|
||||
"""
|
||||
Tokenization/string cleaning for all datasets except for SST.
|
||||
Original taken from https://github.com/yoonkim/CNN_sentence/blob/master/process_data.py
|
||||
"""
|
||||
string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string)
|
||||
string = re.sub(r"\'s", " \'s", string)
|
||||
string = re.sub(r"\'ve", " \'ve", string)
|
||||
string = re.sub(r"n\'t", " n\'t", string)
|
||||
string = re.sub(r"\'re", " \'re", string)
|
||||
string = re.sub(r"\'d", " \'d", string)
|
||||
string = re.sub(r"\'ll", " \'ll", string)
|
||||
string = re.sub(r",", " , ", string)
|
||||
string = re.sub(r"!", " ! ", string)
|
||||
string = re.sub(r"\(", " \( ", string)
|
||||
string = re.sub(r"\)", " \) ", string)
|
||||
string = re.sub(r"\?", " \? ", string)
|
||||
string = re.sub(r"\s{2,}", " ", string)
|
||||
return string.strip().lower()
|
||||
|
||||
|
||||
def load_data_and_labels():
|
||||
"""
|
||||
Loads MR polarity data from files, splits the data into words and generates labels.
|
||||
Returns split sentences and labels.
|
||||
"""
|
||||
# Load data from files
|
||||
positive_examples = list(open("./data/rt-polarity.pos", encoding='ISO-8859-1').readlines())
|
||||
positive_examples = [s.strip() for s in positive_examples]
|
||||
negative_examples = list(open("./data/rt-polarity.neg", encoding='ISO-8859-1').readlines())
|
||||
negative_examples = [s.strip() for s in negative_examples]
|
||||
# Split by words
|
||||
x_text = positive_examples + negative_examples
|
||||
x_text = [clean_str(sent) for sent in x_text]
|
||||
x_text = [s.split(" ") for s in x_text]
|
||||
# Generate labels
|
||||
positive_labels = [[0, 1] for _ in positive_examples]
|
||||
negative_labels = [[1, 0] for _ in negative_examples]
|
||||
y = np.concatenate([positive_labels, negative_labels], 0)
|
||||
return [x_text, y]
|
||||
|
||||
|
||||
def pad_sentences(sentences, padding_word="<PAD/>"):
|
||||
"""
|
||||
Pads all sentences to the same length. The length is defined by the longest sentence.
|
||||
Returns padded sentences.
|
||||
"""
|
||||
sequence_length = max(len(x) for x in sentences)
|
||||
padded_sentences = []
|
||||
for i in range(len(sentences)):
|
||||
sentence = sentences[i]
|
||||
num_padding = sequence_length - len(sentence)
|
||||
new_sentence = sentence + [padding_word] * num_padding
|
||||
padded_sentences.append(new_sentence)
|
||||
return padded_sentences
|
||||
|
||||
|
||||
def build_vocab(sentences):
|
||||
"""
|
||||
Builds a vocabulary mapping from word to index based on the sentences.
|
||||
Returns vocabulary mapping and inverse vocabulary mapping.
|
||||
"""
|
||||
# Build vocabulary
|
||||
word_counts = Counter(itertools.chain(*sentences))
|
||||
# Mapping from index to word
|
||||
vocabulary_inv = [x[0] for x in word_counts.most_common()]
|
||||
# Mapping from word to index
|
||||
vocabulary = {x: i for i, x in enumerate(vocabulary_inv)}
|
||||
return [vocabulary, vocabulary_inv]
|
||||
|
||||
|
||||
def build_input_data(sentences, labels, vocabulary):
|
||||
"""
|
||||
Maps sentencs and labels to vectors based on a vocabulary.
|
||||
"""
|
||||
x = np.array([[vocabulary[word] for word in sentence] for sentence in sentences])
|
||||
y = np.array(labels)
|
||||
return [x, y]
|
||||
|
||||
|
||||
def load_data():
|
||||
"""
|
||||
Loads and preprocessed data for the MR dataset.
|
||||
Returns input vectors, labels, vocabulary, and inverse vocabulary.
|
||||
"""
|
||||
# Load and preprocess data
|
||||
sentences, labels = load_data_and_labels()
|
||||
sentences_padded = pad_sentences(sentences)
|
||||
vocabulary, vocabulary_inv = build_vocab(sentences_padded)
|
||||
x, y = build_input_data(sentences_padded, labels, vocabulary)
|
||||
return [x, y, vocabulary, vocabulary_inv]
|
||||
|
||||
|
||||
def batch_iter(data, batch_size, num_epochs):
|
||||
"""
|
||||
Generates a batch iterator for a dataset.
|
||||
"""
|
||||
data = np.array(data)
|
||||
data_size = len(data)
|
||||
num_batches_per_epoch = int(len(data)/batch_size) + 1
|
||||
for epoch in range(num_epochs):
|
||||
# Shuffle the data at each epoch
|
||||
shuffle_indices = np.random.permutation(np.arange(data_size))
|
||||
shuffled_data = data[shuffle_indices]
|
||||
for batch_num in range(num_batches_per_epoch):
|
||||
start_index = batch_num * batch_size
|
||||
end_index = min((batch_num + 1) * batch_size, data_size)
|
||||
yield shuffled_data[start_index:end_index]
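

# Minimal usage sketch (assumes the rt-polarity files live in ./data):
#
#   x, y, vocabulary, vocabulary_inv = load_data()
#   for batch in batch_iter(list(zip(x, y)), batch_size=64, num_epochs=1):
#       x_batch, y_batch = zip(*batch)
#       ...  # feed the batch into a model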
|
142
deep-learning/keras-tutorial/deep-learning-osx.yml
Normal file
|
@ -0,0 +1,142 @@
|
|||
name: deep-learning
|
||||
channels:
|
||||
- conda-forge
|
||||
- defaults
|
||||
dependencies:
|
||||
- accelerate=2.3.0=np111py35_3
|
||||
- accelerate_cudalib=2.0=0
|
||||
- appnope=0.1.0=py35_0
|
||||
- bokeh=0.12.1=py35_0
|
||||
- cffi=1.6.0=py35_0
|
||||
- backports.shutil_get_terminal_size=1.0.0=py35_0
|
||||
- blas=1.1=openblas
|
||||
- ca-certificates=2016.8.2=3
|
||||
- certifi=2016.8.2=py35_0
|
||||
- cycler=0.10.0=py35_0
|
||||
- cython=0.24.1=py35_0
|
||||
- decorator=4.0.10=py35_0
|
||||
- entrypoints=0.2.2=py35_0
|
||||
- freetype=2.6.3=1
|
||||
- h5py=2.6.0=np111py35_6
|
||||
- hdf5=1.8.17=2
|
||||
- ipykernel=4.3.1=py35_1
|
||||
- ipython=5.1.0=py35_0
|
||||
- ipywidgets=5.2.2=py35_0
|
||||
- jinja2=2.8=py35_1
|
||||
- jsonschema=2.5.1=py35_0
|
||||
- jupyter_client=4.3.0=py35_0
|
||||
- jupyter_console=5.0.0=py35_0
|
||||
- jupyter_core=4.1.1=py35_1
|
||||
- libgfortran=3.0.0=0
|
||||
- libpng=1.6.24=0
|
||||
- libsodium=1.0.10=0
|
||||
- markupsafe=0.23=py35_0
|
||||
- matplotlib=1.5.2=np111py35_5
|
||||
- mistune=0.7.3=py35_0
|
||||
- nbconvert=4.2.0=py35_0
|
||||
- nbformat=4.0.1=py35_0
|
||||
- ncurses=5.9=8
|
||||
- nose=1.3.7=py35_1
|
||||
- notebook=4.2.2=py35_0
|
||||
- numpy=1.11.1=py35_blas_openblas_201
|
||||
- openblas=0.2.18=4
|
||||
- openssl=1.0.2h=2
|
||||
- pandas=0.18.1=np111py35_1
|
||||
- pexpect=4.2.0=py35_1
|
||||
- pickleshare=0.7.3=py35_0
|
||||
- pip=8.1.2=py35_0
|
||||
- prompt_toolkit=1.0.6=py35_0
|
||||
- ptyprocess=0.5.1=py35_0
|
||||
- pygments=2.1.3=py35_1
|
||||
- pyparsing=2.1.7=py35_0
|
||||
- python=3.5.2=2
|
||||
- python-dateutil=2.5.3=py35_0
|
||||
- pytz=2016.6.1=py35_0
|
||||
- pyyaml=3.11=py35_0
|
||||
- pyzmq=15.4.0=py35_0
|
||||
- qtconsole=4.2.1=py35_0
|
||||
- readline=6.2=0
|
||||
- requests=2.11.0=py35_0
|
||||
- scikit-learn=0.17.1=np111py35_blas_openblas_201
|
||||
- scipy=0.18.0=np111py35_blas_openblas_201
|
||||
- setuptools=25.1.6=py35_0
|
||||
- simplegeneric=0.8.1=py35_0
|
||||
- sip=4.18=py35_0
|
||||
- six=1.10.0=py35_0
|
||||
- sqlite=3.13.0=1
|
||||
- terminado=0.6=py35_0
|
||||
- tk=8.5.19=0
|
||||
- tornado=4.4.1=py35_1
|
||||
- traitlets=4.2.2=py35_0
|
||||
- wcwidth=0.1.7=py35_0
|
||||
- wheel=0.29.0=py35_0
|
||||
- widgetsnbextension=1.2.6=py35_3
|
||||
- xz=5.2.2=0
|
||||
- yaml=0.1.6=0
|
||||
- zeromq=4.1.5=0
|
||||
- zlib=1.2.8=3
|
||||
- cudatoolkit=7.5=0
|
||||
- ipython_genutils=0.1.0=py35_0
|
||||
- jupyter=1.0.0=py35_3
|
||||
- llvmlite=0.11.0=py35_0
|
||||
- mkl=11.3.3=0
|
||||
- mkl-service=1.1.2=py35_2
|
||||
- numba=0.26.0=np111py35_0
|
||||
- pycparser=2.14=py35_1
|
||||
- pyqt=4.11.4=py35_4
|
||||
- python.app=1.2=py35_4
|
||||
- qt=4.8.7=4
|
||||
- snakeviz=0.4.1=py35_0
|
||||
- pip:
|
||||
- backports.shutil-get-terminal-size==1.0.0
|
||||
- certifi==2016.8.2
|
||||
- cycler==0.10.0
|
||||
- cython==0.24.1
|
||||
- decorator==4.0.10
|
||||
- h5py==2.6.0
|
||||
- ipykernel==4.3.1
|
||||
- ipython==5.1.0
|
||||
- ipython-genutils==0.1.0
|
||||
- ipywidgets==5.2.2
|
||||
- jinja2==2.8
|
||||
- jsonschema==2.5.1
|
||||
- jupyter-client==4.3.0
|
||||
- jupyter-console==5.0.0
|
||||
- jupyter-core==4.1.1
|
||||
- keras==1.0.7
|
||||
- markupsafe==0.23
|
||||
- matplotlib==1.5.2
|
||||
- mistune==0.7.3
|
||||
- nbconvert==4.2.0
|
||||
- nbformat==4.0.1
|
||||
- nose==1.3.7
|
||||
- notebook==4.2.2
|
||||
- numpy==1.11.1
|
||||
- pandas==0.18.1
|
||||
- pexpect==4.2.0
|
||||
- pickleshare==0.7.3
|
||||
- pip==8.1.2
|
||||
- prompt-toolkit==1.0.6
|
||||
- ptyprocess==0.5.1
|
||||
- pygments==2.1.3
|
||||
- pyparsing==2.1.7
|
||||
- python-dateutil==2.5.3
|
||||
- pytz==2016.6.1
|
||||
- pyyaml==3.11
|
||||
- pyzmq==15.4.0
|
||||
- qtconsole==4.2.1
|
||||
- requests==2.11.0
|
||||
- scikit-learn==0.17.1
|
||||
- scipy==0.18.0
|
||||
- setuptools==25.1.6
|
||||
- simplegeneric==0.8.1
|
||||
- six==1.10.0
|
||||
- terminado==0.6
|
||||
- theano==0.8.2
|
||||
- tornado==4.4.1
|
||||
- traitlets==4.2.2
|
||||
- wcwidth==0.1.7
|
||||
- wheel==0.29.0
|
||||
- widgetsnbextension==1.2.6
|
||||
prefix: /Users/valerio/anaconda/envs/deep-learning
|
||||
|
159
deep-learning/keras-tutorial/deep-learning.yml
Normal file
|
@ -0,0 +1,159 @@
|
|||
name: deep-learning
channels:
- conda-forge
- defaults
dependencies:
- accelerate=2.3.0=np111py35_3
- accelerate_cudalib=2.0=0
- bokeh=0.12.1=py35_0
- cffi=1.6.0=py35_0
- backports.shutil_get_terminal_size=1.0.0=py35_0
- blas=1.1=openblas
- ca-certificates=2016.8.2=3
- cairo=1.12.18=8
- certifi=2016.8.2=py35_0
- cycler=0.10.0=py35_0
- cython=0.24.1=py35_0
- decorator=4.0.10=py35_0
- entrypoints=0.2.2=py35_0
- fontconfig=2.11.1=3
- freetype=2.6.3=1
- gettext=0.19.7=1
- glib=2.48.0=4
- h5py=2.6.0=np111py35_6
- harfbuzz=1.0.6=0
- hdf5=1.8.17=2
- icu=56.1=4
- ipykernel=4.3.1=py35_1
- ipython=5.1.0=py35_0
- ipywidgets=5.2.2=py35_0
- jinja2=2.8=py35_1
- jpeg=9b=0
- jsonschema=2.5.1=py35_0
- jupyter_client=4.3.0=py35_0
- jupyter_console=5.0.0=py35_0
- jupyter_core=4.1.1=py35_1
- libffi=3.2.1=2
- libiconv=1.14=3
- libpng=1.6.24=0
- libsodium=1.0.10=0
- libtiff=4.0.6=6
- libxml2=2.9.4=0
- markupsafe=0.23=py35_0
- matplotlib=1.5.2=np111py35_6
- mistune=0.7.3=py35_0
- nbconvert=4.2.0=py35_0
- nbformat=4.0.1=py35_0
- ncurses=5.9=8
- nose=1.3.7=py35_1
- notebook=4.2.2=py35_0
- numpy=1.11.1=py35_blas_openblas_201
- openblas=0.2.18=4
- openssl=1.0.2h=2
- pandas=0.18.1=np111py35_1
- pango=1.40.1=0
- path.py=8.2.1=py35_0
- pcre=8.38=1
- pexpect=4.2.0=py35_1
- pickleshare=0.7.3=py35_0
- pip=8.1.2=py35_0
- pixman=0.32.6=0
- prompt_toolkit=1.0.6=py35_0
- protobuf=3.0.0b3=py35_1
- ptyprocess=0.5.1=py35_0
- pygments=2.1.3=py35_1
- pyparsing=2.1.7=py35_0
- python=3.5.2=2
- python-dateutil=2.5.3=py35_0
- pytz=2016.6.1=py35_0
- pyyaml=3.11=py35_0
- pyzmq=15.4.0=py35_0
- qt=4.8.7=0
- qtconsole=4.2.1=py35_0
- readline=6.2=0
- requests=2.11.0=py35_0
- scikit-learn=0.17.1=np111py35_blas_openblas_201
- scipy=0.18.0=np111py35_blas_openblas_201
- setuptools=25.1.6=py35_0
- simplegeneric=0.8.1=py35_0
- sip=4.18=py35_0
- six=1.10.0=py35_0
- sqlite=3.13.0=1
- terminado=0.6=py35_0
- tk=8.5.19=0
- tornado=4.4.1=py35_1
- traitlets=4.2.2=py35_0
- wcwidth=0.1.7=py35_0
- wheel=0.29.0=py35_0
- widgetsnbextension=1.2.6=py35_3
- xz=5.2.2=0
- yaml=0.1.6=0
- zeromq=4.1.5=0
- zlib=1.2.8=3
- cudatoolkit=7.5=0
- ipython_genutils=0.1.0=py35_0
- jupyter=1.0.0=py35_3
- libgfortran=3.0.0=1
- llvmlite=0.11.0=py35_0
- mkl=11.3.3=0
- mkl-service=1.1.2=py35_2
- numba=0.26.0=np111py35_0
- pycparser=2.14=py35_1
- pyqt=4.11.4=py35_4
- snakeviz=0.4.1=py35_0
- pip:
  - backports.shutil-get-terminal-size==1.0.0
  - certifi==2016.8.2
  - cycler==0.10.0
  - cython==0.24.1
  - decorator==4.0.10
  - h5py==2.6.0
  - ipykernel==4.3.1
  - ipython==5.1.0
  - ipython-genutils==0.1.0
  - ipywidgets==5.2.2
  - jinja2==2.8
  - jsonschema==2.5.1
  - jupyter-client==4.3.0
  - jupyter-console==5.0.0
  - jupyter-core==4.1.1
  - keras==1.0.7
  - mako==1.0.4
  - markupsafe==0.23
  - matplotlib==1.5.2
  - mistune==0.7.3
  - nbconvert==4.2.0
  - nbformat==4.0.1
  - nose==1.3.7
  - notebook==4.2.2
  - numpy==1.11.1
  - pandas==0.18.1
  - path.py==8.2.1
  - pexpect==4.2.0
  - pickleshare==0.7.3
  - pip==8.1.2
  - prompt-toolkit==1.0.6
  - protobuf==3.0.0b2
  - ptyprocess==0.5.1
  - pygments==2.1.3
  - pyparsing==2.1.7
  - python-dateutil==2.5.3
  - pytz==2016.6.1
  - pyyaml==3.11
  - pyzmq==15.4.0
  - qtconsole==4.2.1
  - requests==2.11.0
  - scikit-learn==0.17.1
  - scipy==0.18.0
  - setuptools==25.1.4
  - simplegeneric==0.8.1
  - six==1.10.0
  - terminado==0.6
  - theano==0.8.2
  - tornado==4.4.1
  - traitlets==4.2.2
  - wcwidth==0.1.7
  - wheel==0.29.0
  - widgetsnbextension==1.2.6
prefix: /home/valerio/anaconda3/envs/deep-learning
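
Either environment file can be materialized with `conda env create` before running the notebooks. A minimal sketch (editorial, not part of the repository; assumes `conda` is installed and the file sits in the working directory):

```python
import subprocess

# Create the 'deep-learning' environment described by the exported file.
subprocess.check_call(['conda', 'env', 'create', '-f', 'deep-learning.yml'])
```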
21
deep-learning/keras-tutorial/deep_learning_models/LICENSE
Normal file
@ -0,0 +1,21 @@
MIT License

Copyright (c) 2016 François Chollet

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
89
deep-learning/keras-tutorial/deep_learning_models/README.md
Normal file
@ -0,0 +1,89 @@
# Trained image classification models for Keras

This repository contains code for the following Keras models:

- VGG16
- VGG19
- ResNet50

We plan on adding Inception v3 soon.

All architectures are compatible with both TensorFlow and Theano, and upon instantiation the models will be built according to the image dimension ordering set in your Keras configuration file at `~/.keras/keras.json`. For instance, if you have set `image_dim_ordering=tf`, then any model loaded from this repository will get built according to the TensorFlow dimension ordering convention, "Width-Height-Depth".
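
Before instantiating a model you can confirm which convention your configuration currently selects; a minimal check using the Keras 1.x backend accessor that these modules themselves rely on:

```python
from keras import backend as K

# Prints 'tf' (TensorFlow ordering) or 'th' (Theano ordering),
# as configured in ~/.keras/keras.json.
print(K.image_dim_ordering())
```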

Weights can be automatically loaded upon instantiation (`weights='imagenet'` argument in model constructor). Weights are automatically downloaded if necessary, and cached locally in `~/.keras/models/`.

**Note that using these models requires the latest version of Keras (from the GitHub repo, not PyPI).**

## Examples

### Classify images

```python
from resnet50 import ResNet50
from keras.preprocessing import image
from imagenet_utils import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds))
# print: [[u'n02504458', u'African_elephant']]
```

### Extract features from images

```python
from vgg16 import VGG16
from keras.preprocessing import image
from imagenet_utils import preprocess_input
import numpy as np

model = VGG16(weights='imagenet', include_top=False)

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)
```

### Extract features from an arbitrary intermediate layer

```python
from vgg19 import VGG19
from keras.preprocessing import image
from imagenet_utils import preprocess_input
from keras.models import Model
import numpy as np

base_model = VGG19(weights='imagenet')
model = Model(input=base_model.input, output=base_model.get_layer('block4_pool').output)

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

block4_pool_features = model.predict(x)
```
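
The name handed to `get_layer` must match one of the layer names defined by the architecture (`block4_pool` above). If in doubt, you can enumerate them first; a quick sketch reusing `base_model` from the example:

```python
# List every layer name available for feature extraction.
for layer in base_model.layers:
    print(layer.name)
```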

## References

- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) - please cite this paper if you use the VGG models in your work.
- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385) - please cite this paper if you use the ResNet model in your work.

Additionally, don't forget to [cite Keras](https://keras.io/getting-started/faq/#how-should-i-cite-keras) if you use these models.

## License

- All code in this repository is under the MIT license as specified by the LICENSE file.
- The ResNet50 weights are ported from the ones [released by Kaiming He](https://github.com/KaimingHe/deep-residual-networks) under the [MIT license](https://github.com/KaimingHe/deep-residual-networks/blob/master/LICENSE).
- The VGG16 and VGG19 weights are ported from the ones [released by VGG at Oxford](http://www.robots.ox.ac.uk/~vgg/research/very_deep/) under the [Creative Commons Attribution License](https://creativecommons.org/licenses/by/4.0/).
43
deep-learning/keras-tutorial/deep_learning_models/imagenet_utils.py
Normal file
@ -0,0 +1,43 @@
import numpy as np
import json

from keras.utils.data_utils import get_file
from keras import backend as K

CLASS_INDEX = None
CLASS_INDEX_PATH = 'https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json'


def preprocess_input(x, dim_ordering='default'):
    if dim_ordering == 'default':
        dim_ordering = K.image_dim_ordering()
    assert dim_ordering in {'tf', 'th'}

    if dim_ordering == 'th':
        # Zero-center by mean pixel
        x[:, 0, :, :] -= 103.939
        x[:, 1, :, :] -= 116.779
        x[:, 2, :, :] -= 123.68
        # 'RGB'->'BGR'
        x = x[:, ::-1, :, :]
    else:
        # Zero-center by mean pixel
        x[:, :, :, 0] -= 103.939
        x[:, :, :, 1] -= 116.779
        x[:, :, :, 2] -= 123.68
        # 'RGB'->'BGR'
        x = x[:, :, :, ::-1]
    return x


def decode_predictions(preds):
    global CLASS_INDEX
    assert len(preds.shape) == 2 and preds.shape[1] == 1000
    if CLASS_INDEX is None:
        fpath = get_file('imagenet_class_index.json',
                         CLASS_INDEX_PATH,
                         cache_subdir='models')
        CLASS_INDEX = json.load(open(fpath))
    indices = np.argmax(preds, axis=-1)
    results = []
    for i in indices:
        results.append(CLASS_INDEX[str(i)])
    return results
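
# Editorial note (not in the original file): decode_predictions maps each row of
# a (batch_size, 1000) softmax output to its [wordnet_id, class_name] pair; the
# README example above yields [[u'n02504458', u'African_elephant']] for ResNet50.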
246
deep-learning/keras-tutorial/deep_learning_models/resnet50.py
Normal file
@ -0,0 +1,246 @@
# -*- coding: utf-8 -*-
'''ResNet50 model for Keras.

# Reference:

- [Deep Residual Learning for Image Recognition](https://arxiv.org/abs/1512.03385)

Adapted from code contributed by BigMoyan.
'''
from __future__ import print_function

import numpy as np
import warnings

from keras.layers import merge, Input
from keras.layers import Dense, Activation, Flatten
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D, AveragePooling2D
from keras.layers import BatchNormalization
from keras.models import Model
from keras.preprocessing import image
import keras.backend as K
from keras.utils.layer_utils import convert_all_kernels_in_model
from keras.utils.data_utils import get_file
from imagenet_utils import decode_predictions, preprocess_input  # used by the demo in __main__


TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/resnet50_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/resnet50_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/resnet50_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5'


def identity_block(input_tensor, kernel_size, filters, stage, block):
    '''The identity_block is the block that has no conv layer at shortcut

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names
    '''
    nb_filter1, nb_filter2, nb_filter3 = filters
    if K.image_dim_ordering() == 'tf':
        bn_axis = 3
    else:
        bn_axis = 1
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, 1, 1, name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Convolution2D(nb_filter2, kernel_size, kernel_size,
                      border_mode='same', name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    x = merge([x, input_tensor], mode='sum')
    x = Activation('relu')(x)
    return x
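
# Editorial note (not in the original file): an identity block preserves the
# spatial size and channel count of its input, so the shortcut is a plain
# element-wise sum. For example:
#
#   y = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
#
# requires x to already carry 256 channels, matching nb_filter3.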


def conv_block(input_tensor, kernel_size, filters, stage, block, strides=(2, 2)):
    '''conv_block is the block that has a conv layer at shortcut

    # Arguments
        input_tensor: input tensor
        kernel_size: default 3, the kernel size of middle conv layer at main path
        filters: list of integers, the nb_filters of 3 conv layer at main path
        stage: integer, current stage label, used for generating layer names
        block: 'a','b'..., current block label, used for generating layer names

    Note that from stage 3, the first conv layer at main path is with subsample=(2,2)
    And the shortcut should have subsample=(2,2) as well
    '''
    nb_filter1, nb_filter2, nb_filter3 = filters
    if K.image_dim_ordering() == 'tf':
        bn_axis = 3
    else:
        bn_axis = 1
    conv_name_base = 'res' + str(stage) + block + '_branch'
    bn_name_base = 'bn' + str(stage) + block + '_branch'

    x = Convolution2D(nb_filter1, 1, 1, subsample=strides,
                      name=conv_name_base + '2a')(input_tensor)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2a')(x)
    x = Activation('relu')(x)

    x = Convolution2D(nb_filter2, kernel_size, kernel_size, border_mode='same',
                      name=conv_name_base + '2b')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2b')(x)
    x = Activation('relu')(x)

    x = Convolution2D(nb_filter3, 1, 1, name=conv_name_base + '2c')(x)
    x = BatchNormalization(axis=bn_axis, name=bn_name_base + '2c')(x)

    shortcut = Convolution2D(nb_filter3, 1, 1, subsample=strides,
                             name=conv_name_base + '1')(input_tensor)
    shortcut = BatchNormalization(axis=bn_axis, name=bn_name_base + '1')(shortcut)

    x = merge([x, shortcut], mode='sum')
    x = Activation('relu')(x)
    return x
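
# Editorial note (not in the original file): unlike identity_block, conv_block
# projects the shortcut through a strided 1x1 convolution, so it can change both
# the spatial resolution and the channel count. For example:
#
#   y = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
#
# halves the feature map's height and width (default strides=(2, 2)) and maps
# it to 512 channels.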


def ResNet50(include_top=True, weights='imagenet',
             input_tensor=None):
    '''Instantiate the ResNet50 architecture,
    optionally loading weights pre-trained
    on ImageNet. Note that when using TensorFlow,
    for best performance you should set
    `image_dim_ordering="tf"` in your Keras config
    at ~/.keras/keras.json.

    The model and the weights are compatible with both
    TensorFlow and Theano. The dimension ordering
    convention used by the model is the one
    specified in your Keras config file.

    # Arguments
        include_top: whether to include the 3 fully-connected
            layers at the top of the network.
        weights: one of `None` (random initialization)
            or "imagenet" (pre-training on ImageNet).
        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
            to use as image input for the model.

    # Returns
        A Keras model instance.
    '''
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    # Determine proper input shape
    if K.image_dim_ordering() == 'th':
        if include_top:
            input_shape = (3, 224, 224)
        else:
            input_shape = (3, None, None)
    else:
        if include_top:
            input_shape = (224, 224, 3)
        else:
            input_shape = (None, None, 3)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor)
        else:
            img_input = input_tensor
    if K.image_dim_ordering() == 'tf':
        bn_axis = 3
    else:
        bn_axis = 1

    x = ZeroPadding2D((3, 3))(img_input)
    x = Convolution2D(64, 7, 7, subsample=(2, 2), name='conv1')(x)
    x = BatchNormalization(axis=bn_axis, name='bn_conv1')(x)
    x = Activation('relu')(x)
    x = MaxPooling2D((3, 3), strides=(2, 2))(x)

    x = conv_block(x, 3, [64, 64, 256], stage=2, block='a', strides=(1, 1))
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='b')
    x = identity_block(x, 3, [64, 64, 256], stage=2, block='c')

    x = conv_block(x, 3, [128, 128, 512], stage=3, block='a')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='b')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='c')
    x = identity_block(x, 3, [128, 128, 512], stage=3, block='d')

    x = conv_block(x, 3, [256, 256, 1024], stage=4, block='a')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='b')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='c')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='d')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='e')
    x = identity_block(x, 3, [256, 256, 1024], stage=4, block='f')

    x = conv_block(x, 3, [512, 512, 2048], stage=5, block='a')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='b')
    x = identity_block(x, 3, [512, 512, 2048], stage=5, block='c')

    x = AveragePooling2D((7, 7), name='avg_pool')(x)

    if include_top:
        x = Flatten()(x)
        x = Dense(1000, activation='softmax', name='fc1000')(x)

    model = Model(img_input, x)

    # load weights
    if weights == 'imagenet':
        print('K.image_dim_ordering:', K.image_dim_ordering())
        if K.image_dim_ordering() == 'th':
            if include_top:
                weights_path = get_file('resnet50_weights_th_dim_ordering_th_kernels.h5',
                                        TH_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('resnet50_weights_th_dim_ordering_th_kernels_notop.h5',
                                        TH_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image dimension ordering convention '
                              '(`image_dim_ordering="th"`). '
                              'For best performance, set '
                              '`image_dim_ordering="tf"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
                convert_all_kernels_in_model(model)
        else:
            if include_top:
                weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels.h5',
                                        TF_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                        TF_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'theano':
                convert_all_kernels_in_model(model)
    return model


if __name__ == '__main__':
    model = ResNet50(include_top=True, weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
165
deep-learning/keras-tutorial/deep_learning_models/vgg16.py
Normal file
@ -0,0 +1,165 @@
# -*- coding: utf-8 -*-
'''VGG16 model for Keras.

# Reference:

- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)

'''
from __future__ import print_function

import numpy as np
import warnings

from keras.models import Model
from keras.layers import Flatten, Dense, Input
from keras.layers import Convolution2D, MaxPooling2D
from keras.preprocessing import image
from keras.utils.layer_utils import convert_all_kernels_in_model
from keras.utils.data_utils import get_file
from keras import backend as K
from imagenet_utils import decode_predictions, preprocess_input  # used by the demo in __main__


TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'


def VGG16(include_top=True, weights='imagenet',
          input_tensor=None):
    '''Instantiate the VGG16 architecture,
    optionally loading weights pre-trained
    on ImageNet. Note that when using TensorFlow,
    for best performance you should set
    `image_dim_ordering="tf"` in your Keras config
    at ~/.keras/keras.json.

    The model and the weights are compatible with both
    TensorFlow and Theano. The dimension ordering
    convention used by the model is the one
    specified in your Keras config file.

    # Arguments
        include_top: whether to include the 3 fully-connected
            layers at the top of the network.
        weights: one of `None` (random initialization)
            or "imagenet" (pre-training on ImageNet).
        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
            to use as image input for the model.

    # Returns
        A Keras model instance.
    '''
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    # Determine proper input shape
    if K.image_dim_ordering() == 'th':
        if include_top:
            input_shape = (3, 224, 224)
        else:
            input_shape = (3, None, None)
    else:
        if include_top:
            input_shape = (224, 224, 3)
        else:
            input_shape = (None, None, 3)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor)
        else:
            img_input = input_tensor
    # Block 1
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
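
    # Editorial note (not in the original file): five 2x2 max-poolings divide
    # each spatial dimension by 2**5 = 32, so a 224x224 input arrives here as a
    # 7x7x512 feature map, the shape flattened by the classification block below.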

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(1000, activation='softmax', name='predictions')(x)

    # Create model
    model = Model(img_input, x)

    # load weights
    if weights == 'imagenet':
        print('K.image_dim_ordering:', K.image_dim_ordering())
        if K.image_dim_ordering() == 'th':
            if include_top:
                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels.h5',
                                        TH_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg16_weights_th_dim_ordering_th_kernels_notop.h5',
                                        TH_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image dimension ordering convention '
                              '(`image_dim_ordering="th"`). '
                              'For best performance, set '
                              '`image_dim_ordering="tf"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
                convert_all_kernels_in_model(model)
        else:
            if include_top:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                        TF_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                        TF_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'theano':
                convert_all_kernels_in_model(model)
    return model


if __name__ == '__main__':
    model = VGG16(include_top=True, weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
167
deep-learning/keras-tutorial/deep_learning_models/vgg19.py
Normal file
@ -0,0 +1,167 @@
# -*- coding: utf-8 -*-
'''VGG19 model for Keras.

# Reference:

- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)

'''
from __future__ import print_function

import numpy as np
import warnings

from keras.models import Model
from keras.layers import Flatten, Dense, Input
from keras.layers import Convolution2D, MaxPooling2D
from keras.preprocessing import image
from keras.utils.layer_utils import convert_all_kernels_in_model
from keras.utils.data_utils import get_file
from keras import backend as K
from imagenet_utils import decode_predictions, preprocess_input  # used by the demo in __main__


TH_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_th_dim_ordering_th_kernels.h5'
TF_WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5'
TH_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_th_dim_ordering_th_kernels_notop.h5'
TF_WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5'


def VGG19(include_top=True, weights='imagenet',
          input_tensor=None):
    '''Instantiate the VGG19 architecture,
    optionally loading weights pre-trained
    on ImageNet. Note that when using TensorFlow,
    for best performance you should set
    `image_dim_ordering="tf"` in your Keras config
    at ~/.keras/keras.json.

    The model and the weights are compatible with both
    TensorFlow and Theano. The dimension ordering
    convention used by the model is the one
    specified in your Keras config file.

    # Arguments
        include_top: whether to include the 3 fully-connected
            layers at the top of the network.
        weights: one of `None` (random initialization)
            or "imagenet" (pre-training on ImageNet).
        input_tensor: optional Keras tensor (i.e. output of `layers.Input()`)
            to use as image input for the model.

    # Returns
        A Keras model instance.
    '''
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    # Determine proper input shape
    if K.image_dim_ordering() == 'th':
        if include_top:
            input_shape = (3, 224, 224)
        else:
            input_shape = (3, None, None)
    else:
        if include_top:
            input_shape = (224, 224, 3)
        else:
            input_shape = (None, None, 3)

    if input_tensor is None:
        img_input = Input(shape=input_shape)
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor)
        else:
            img_input = input_tensor
    # Block 1
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv1')(img_input)
    x = Convolution2D(64, 3, 3, activation='relu', border_mode='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv1')(x)
    x = Convolution2D(128, 3, 3, activation='relu', border_mode='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv1')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv2')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv3')(x)
    x = Convolution2D(256, 3, 3, activation='relu', border_mode='same', name='block3_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv3')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block4_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv1')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv2')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv3')(x)
    x = Convolution2D(512, 3, 3, activation='relu', border_mode='same', name='block5_conv4')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(1000, activation='softmax', name='predictions')(x)

    # Create model
    model = Model(img_input, x)

    # load weights
    if weights == 'imagenet':
        print('K.image_dim_ordering:', K.image_dim_ordering())
        if K.image_dim_ordering() == 'th':
            if include_top:
                weights_path = get_file('vgg19_weights_th_dim_ordering_th_kernels.h5',
                                        TH_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg19_weights_th_dim_ordering_th_kernels_notop.h5',
                                        TH_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image dimension ordering convention '
                              '(`image_dim_ordering="th"`). '
                              'For best performance, set '
                              '`image_dim_ordering="tf"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
                convert_all_kernels_in_model(model)
        else:
            if include_top:
                weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels.h5',
                                        TF_WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                        TF_WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'theano':
                convert_all_kernels_in_model(model)
    return model


if __name__ == '__main__':
    model = VGG19(include_top=True, weights='imagenet')

    img_path = 'cat.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
BIN
deep-learning/keras-tutorial/imgs/ConvNet LeNet.png
Normal file
After Width: | Height: | Size: 43 KiB |
BIN
deep-learning/keras-tutorial/imgs/LSTM3-chain.png
Normal file
After Width: | Height: | Size: 224 KiB |
BIN
deep-learning/keras-tutorial/imgs/MLP.png
Normal file
After Width: | Height: | Size: 72 KiB |
BIN
deep-learning/keras-tutorial/imgs/MaxPool.png
Normal file
After Width: | Height: | Size: 16 KiB |
BIN
deep-learning/keras-tutorial/imgs/Perceptron and MLP.png
Normal file
After Width: | Height: | Size: 76 KiB |
BIN
deep-learning/keras-tutorial/imgs/Perceptron.png
Normal file
After Width: | Height: | Size: 35 KiB |
BIN
deep-learning/keras-tutorial/imgs/RNN-rolled.png
Normal file
After Width: | Height: | Size: 21 KiB |
BIN
deep-learning/keras-tutorial/imgs/RNN-unrolled.png
Normal file
After Width: | Height: | Size: 92 KiB |
BIN
deep-learning/keras-tutorial/imgs/autoencoder.png
Normal file
After Width: | Height: | Size: 21 KiB |
BIN
deep-learning/keras-tutorial/imgs/backprop.png
Normal file
After Width: | Height: | Size: 153 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn1.png
Normal file
After Width: | Height: | Size: 166 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn2.png
Normal file
After Width: | Height: | Size: 580 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn3.png
Normal file
After Width: | Height: | Size: 46 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn4.png
Normal file
After Width: | Height: | Size: 66 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn5.png
Normal file
After Width: | Height: | Size: 111 KiB |
BIN
deep-learning/keras-tutorial/imgs/cnn6.png
Normal file
After Width: | Height: | Size: 102 KiB |
BIN
deep-learning/keras-tutorial/imgs/conv.png
Normal file
After Width: | Height: | Size: 185 KiB |
BIN
deep-learning/keras-tutorial/imgs/convnets_cover.png
Normal file
After Width: | Height: | Size: 136 KiB |
BIN
deep-learning/keras-tutorial/imgs/euroscipy_2016_logo.png
Normal file
After Width: | Height: | Size: 79 KiB |
BIN
deep-learning/keras-tutorial/imgs/gru.png
Normal file
After Width: | Height: | Size: 48 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/apricot_565.jpeg
Normal file
After Width: | Height: | Size: 200 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/apricot_696.jpeg
Normal file
After Width: | Height: | Size: 57 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/apricot_787.jpeg
Normal file
After Width: | Height: | Size: 148 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/strawberry_1157.jpeg
Normal file
After Width: | Height: | Size: 90 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/strawberry_1174.jpeg
Normal file
After Width: | Height: | Size: 73 KiB |
BIN
deep-learning/keras-tutorial/imgs/imagenet/strawberry_1189.jpeg
Normal file
After Width: | Height: | Size: 100 KiB |
BIN
deep-learning/keras-tutorial/imgs/keDyv.png
Normal file
After Width: | Height: | Size: 62 KiB |
BIN
deep-learning/keras-tutorial/imgs/keras-logo-small.jpg
Normal file
After Width: | Height: | Size: 30 KiB |
BIN
deep-learning/keras-tutorial/imgs/keras_rank_1.jpg
Normal file
After Width: | Height: | Size: 584 KiB |
BIN
deep-learning/keras-tutorial/imgs/keras_rank_2.jpg
Normal file
After Width: | Height: | Size: 149 KiB |
BIN
deep-learning/keras-tutorial/imgs/mlp_details.png
Normal file
After Width: | Height: | Size: 112 KiB |
BIN
deep-learning/keras-tutorial/imgs/overfitting.png
Normal file
After Width: | Height: | Size: 5.7 KiB |
BIN
deep-learning/keras-tutorial/imgs/rnn.png
Normal file
After Width: | Height: | Size: 11 KiB |
BIN
deep-learning/keras-tutorial/imgs/rnn2.png
Normal file
After Width: | Height: | Size: 13 KiB |
BIN
deep-learning/keras-tutorial/imgs/sprint.jpg
Normal file
After Width: | Height: | Size: 76 KiB |
66
deep-learning/keras-tutorial/outline.md
Normal file
@ -0,0 +1,66 @@
# Outline (Draft)

- Part I: Introduction

  - Intro to ANN
    - (naive pure-Python implementation from `pybrain`)
    - fast forward
    - sgd + backprop
  - Intro to Theano
    - Model + SGD with Theano (simple logreg)

  - Introduction to Keras
    - Overview and main features
    - Theano backend
    - Tensorflow backend
    - Same LogReg with Keras

- Part II: Supervised Learning + Keras Internals
  - Intro: Focus on Image Classification
  - Multi-Layer Perceptron and Fully Connected
    - Examples with `keras.models.Sequential` and `Dense`
    - HandsOn: MLP with keras

  - Intro to CNN
    - meaning of convolutional filters
    - examples from ImageNet

  - Meaning of dimensions of Conv filters (through an example of ConvNet)
    - HandsOn: ConvNet with keras

  - Advanced CNN
    - Dropout and MaxPooling
  - Famous ANN in Keras (likely moved somewhere else)
    - ref: https://github.com/fchollet/deep-learning-models
    - VGG16
    - VGG19
    - LeNet
    - Inception/GoogLeNet
    - ResNet
    *Implementation and examples
    - HandsOn: Fine tuning a network on new dataset

- Part III: Unsupervised Learning + Keras Internals
  - AutoEncoders
  - word2vec & doc2vec (gensim) + `keras.dataset` (i.e. `keras.dataset.imdb`)
  - HandsOn: _______

  *should we include embedding here?

- Part IV: Advanced Materials
  - RNN (LSTM)
    - RNN, LSTM, GRU
    - Meaning of dimensions of rnn (backprop through time, etc)
    - HandsOn: IMDB (?)

  - CNN-RNN
    - Time Distributed Convolution
  - Some of the recent advances in DL implemented in Keras
    - e.g. https://github.com/snf/keras-fractalnet - Fractal Net Implementation with Keras


Notes:

1) Please, add more details in Part IV (i.e. /Advanced Materials/)
2) As for Keras internals, I would consider this: https://github.com/wuaalb/keras_extensions/blob/master/keras_extensions/rbm.py
   This is just to show how easy it is to extend Keras (in this case, properly creating a new `Layer`).
4
deep-learning/keras-tutorial/solutions/sol_111.py
Normal file
@ -0,0 +1,4 @@
ann = ANN(2, 10, 1)
%timeit -n 1 -r 1 ann.train(zip(X,y), iterations=2)
plot_decision_boundary(ann)
plt.title("Our next model with 10 hidden units")
4
deep-learning/keras-tutorial/solutions/sol_112.py
Normal file
@ -0,0 +1,4 @@
ann = ANN(2, 10, 1)
%timeit -n 1 -r 1 ann.train(zip(X,y), iterations=100)
plot_decision_boundary(ann)
plt.title("Our model with 10 hidden units and 100 iterations")
57
deep-learning/keras-tutorial/w2v.py
Normal file
@ -0,0 +1,57 @@
from gensim.models import word2vec
from os.path import join, exists, split
import os
import numpy as np


def train_word2vec(sentence_matrix, vocabulary_inv,
                   num_features=300, min_word_count=1, context=10):
    """
    Trains, saves, loads Word2Vec model
    Returns initial weights for embedding layer.

    inputs:
    sentence_matrix # int matrix: num_sentences x max_sentence_len
    vocabulary_inv  # dict {str:int}
    num_features    # Word vector dimensionality
    min_word_count  # Minimum word count
    context         # Context window size
    """
    model_dir = 'word2vec_models'
    model_name = "{:d}features_{:d}minwords_{:d}context".format(num_features, min_word_count, context)
    model_name = join(model_dir, model_name)
    if exists(model_name):
        embedding_model = word2vec.Word2Vec.load(model_name)
        print('Loading existing Word2Vec model \'%s\'' % split(model_name)[-1])
    else:
        # Set values for various parameters
        num_workers = 2       # Number of threads to run in parallel
        downsampling = 1e-3   # Downsample setting for frequent words

        # Initialize and train the model
        print("Training Word2Vec model...")
        sentences = [[vocabulary_inv[w] for w in s] for s in sentence_matrix]
        embedding_model = word2vec.Word2Vec(sentences, workers=num_workers,
                                            size=num_features, min_count=min_word_count,
                                            window=context, sample=downsampling)

        # If we don't plan to train the model any further, calling
        # init_sims will make the model much more memory-efficient.
        embedding_model.init_sims(replace=True)

        # Saving the model for later use. You can load it later using Word2Vec.load()
        if not exists(model_dir):
            os.mkdir(model_dir)
        print('Saving Word2Vec model \'%s\'' % split(model_name)[-1])
        embedding_model.save(model_name)

    # add unknown words
    embedding_weights = [np.array([embedding_model[w] if w in embedding_model
                                   else np.random.uniform(-0.25, 0.25, embedding_model.vector_size)
                                   for w in vocabulary_inv])]
    return embedding_weights


if __name__ == '__main__':
    import data_helpers
    print("Loading data...")
    x, _, _, vocabulary_inv = data_helpers.load_data()
    w = train_word2vec(x, vocabulary_inv)
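
The returned `embedding_weights` list is shaped for seeding a Keras 1.x `Embedding` layer, which accepts its initial weights as a list of arrays. A minimal sketch of the hand-off (editorial, not part of the file; the output size matches the `num_features=300` default above):

```python
from keras.models import Sequential
from keras.layers import Embedding

# vocabulary_inv and embedding_weights as produced by train_word2vec() above.
model = Sequential()
model.add(Embedding(input_dim=len(vocabulary_inv), output_dim=300,
                    weights=embedding_weights))  # weights expects a list of arrays
```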