data-science-ipython-notebooks/deep-learning/keras-tutorial/0. Preamble.ipynb

760 lines
20 KiB
Python

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Credits: Forked from [deep-learning-keras-tensorflow](https://github.com/leriomaggio/deep-learning-keras-tensorflow) by Valerio Maggio"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"<div>\n",
" <h1 style=\"text-align: center;\">Deep Learning with Keras</h1>\n",
" <img style=\"text-align: left\" src=\"imgs/keras-logo-small.jpg\" width=\"10%\" />\n",
"<div>\n",
"\n",
"<div>\n",
" <h2 style=\"text-align: center;\">Tutorial @ EuroScipy 2016</h2>\n",
" <img style=\"text-align: left\" src=\"imgs/euroscipy_2016_logo.png\" width=\"40%\" />\n",
"</div> "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Yam Peleg, Valerio Maggio"
]
},
{
"cell_type": "markdown",
"metadata": {
"slideshow": {
"slide_type": "-"
}
},
"source": [
"# Goal of this Tutorial\n",
"\n",
"- **Introduce** main features of Keras\n",
"- **Learn** how simple and Pythonic is doing Deep Learning with Keras\n",
"- **Understand** how easy is to do basic and *advanced* DL models in Keras;\n",
" - **Examples and Hand-on Excerises** along the way."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Source\n",
"\n",
"https://github.com/leriomaggio/deep-learning-keras-euroscipy2016/"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# (Tentative) Schedule \n",
"\n",
"## Attention: Spoilers Warning!\n",
"\n",
"\n",
"- **Setup** (`10 mins`)\n",
"\n",
"- **Part I**: **Introduction** (`~65 mins`)\n",
"\n",
" - Intro to ANN (`~20 mins`)\n",
" - naive pure-Python implementation\n",
" - fast forward, sgd, backprop\n",
" \n",
" - Intro to Theano (`15 mins`)\n",
" - Model + SGD with Theano\n",
" \n",
" - Introduction to Keras (`30 mins`)\n",
" - Overview and main features\n",
" - Theano backend\n",
" - Tensorflow backend\n",
" - Multi-Layer Perceptron and Fully Connected\n",
" - Examples with `keras.models.Sequential` and `Dense`\n",
" - HandsOn: MLP with keras\n",
" \n",
"- **Coffe Break** (`30 mins`)\n",
"\n",
"- **Part II**: **Supervised Learning and Convolutional Neural Nets** (`~45 mins`)\n",
" \n",
" - Intro: Focus on Image Classification (`5 mins`)\n",
"\n",
" - Intro to CNN (`25 mins`)\n",
" - meaning of convolutional filters\n",
" - examples from ImageNet \n",
" - Meaning of dimensions of Conv filters (through an exmple of ConvNet) \n",
" - Visualising ConvNets\n",
" - HandsOn: ConvNet with keras \n",
"\n",
" - Advanced CNN (`10 mins`)\n",
" - Dropout\n",
" - MaxPooling\n",
" - Batch Normalisation\n",
" \n",
" - Famous Models in Keras (likely moved somewhere else) (`10 mins`)\n",
" (ref: https://github.com/fchollet/deep-learning-models)\n",
" - VGG16\n",
" - VGG19\n",
" - ResNet50\n",
" - Inception v3\n",
" - HandsOn: Fine tuning a network on new dataset \n",
" \n",
"- **Part III**: **Unsupervised Learning** (`10 mins`)\n",
"\n",
" - AutoEncoders (`5 mins`)\n",
" - word2vec & doc2vec (gensim) & `keras.datasets` (`5 mins`)\n",
" - `Embedding`\n",
" - word2vec and CNN\n",
" - Exercises\n",
"\n",
"- **Part IV**: **Advanced Materials** (`20 mins`)\n",
" - RNN and LSTM (`10 mins`)\n",
" - RNN, LSTM, GRU \n",
" - Example of RNN and LSTM with Text (`~10 mins`) -- *Tentative*\n",
" - HandsOn: IMDB\n",
"\n",
"- **Wrap up and Conclusions** (`5 mins`)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Requirements"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This tutorial requires the following packages:\n",
"\n",
"- Python version 3.4+ \n",
" - likely Python 2.7 would be fine, but *who knows*? :P\n",
"- `numpy` version 1.10 or later: http://www.numpy.org/\n",
"- `scipy` version 0.16 or later: http://www.scipy.org/\n",
"- `matplotlib` version 1.4 or later: http://matplotlib.org/\n",
"- `pandas` version 0.16 or later: http://pandas.pydata.org\n",
"- `scikit-learn` version 0.15 or later: http://scikit-learn.org\n",
"- `keras` version 1.0 or later: http://keras.io\n",
"- `theano` version 0.8 or later: http://deeplearning.net/software/theano/\n",
"- `ipython`/`jupyter` version 4.0 or later, with notebook support\n",
"\n",
"(Optional but recommended):\n",
"\n",
"- `pyyaml`\n",
"- `hdf5` and `h5py` (required if you use model saving/loading functions in keras)\n",
"- **NVIDIA cuDNN** if you have NVIDIA GPUs on your machines.\n",
" [https://developer.nvidia.com/rdp/cudnn-download]()\n",
"\n",
"The easiest way to get (most) these is to use an all-in-one installer such as [Anaconda](http://www.continuum.io/downloads) from Continuum. These are available for multiple architectures."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Python Version"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I'm currently running this tutorial with **Python 3** on **Anaconda**"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Python 3.5.2\r\n"
]
}
],
"source": [
"!python --version"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# How to set up your environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The quickest and simplest way to setup the environment is to use [conda](https://store.continuum.io) environment manager. \n",
"\n",
"We provide in the materials a `deep-learning.yml` that is complete and **ready to use** to set up your virtual environment with conda."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false,
"scrolled": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"name: deep-learning\r\n",
"channels:\r\n",
"- conda-forge\r\n",
"- defaults\r\n",
"dependencies:\r\n",
"- accelerate=2.3.0=np111py35_3\r\n",
"- accelerate_cudalib=2.0=0\r\n",
"- bokeh=0.12.1=py35_0\r\n",
"- cffi=1.6.0=py35_0\r\n",
"- backports.shutil_get_terminal_size=1.0.0=py35_0\r\n",
"- blas=1.1=openblas\r\n",
"- ca-certificates=2016.8.2=3\r\n",
"- cairo=1.12.18=8\r\n",
"- certifi=2016.8.2=py35_0\r\n",
"- cycler=0.10.0=py35_0\r\n",
"- cython=0.24.1=py35_0\r\n",
"- decorator=4.0.10=py35_0\r\n",
"- entrypoints=0.2.2=py35_0\r\n",
"- fontconfig=2.11.1=3\r\n",
"- freetype=2.6.3=1\r\n",
"- gettext=0.19.7=1\r\n",
"- glib=2.48.0=4\r\n",
"- h5py=2.6.0=np111py35_6\r\n",
"- harfbuzz=1.0.6=0\r\n",
"- hdf5=1.8.17=2\r\n",
"- icu=56.1=4\r\n",
"- ipykernel=4.3.1=py35_1\r\n",
"- ipython=5.1.0=py35_0\r\n",
"- ipywidgets=5.2.2=py35_0\r\n",
"- jinja2=2.8=py35_1\r\n",
"- jpeg=9b=0\r\n",
"- jsonschema=2.5.1=py35_0\r\n",
"- jupyter_client=4.3.0=py35_0\r\n",
"- jupyter_console=5.0.0=py35_0\r\n",
"- jupyter_core=4.1.1=py35_1\r\n",
"- libffi=3.2.1=2\r\n",
"- libiconv=1.14=3\r\n",
"- libpng=1.6.24=0\r\n",
"- libsodium=1.0.10=0\r\n",
"- libtiff=4.0.6=6\r\n",
"- libxml2=2.9.4=0\r\n",
"- markupsafe=0.23=py35_0\r\n",
"- matplotlib=1.5.2=np111py35_6\r\n",
"- mistune=0.7.3=py35_0\r\n",
"- nbconvert=4.2.0=py35_0\r\n",
"- nbformat=4.0.1=py35_0\r\n",
"- ncurses=5.9=8\r\n",
"- nose=1.3.7=py35_1\r\n",
"- notebook=4.2.2=py35_0\r\n",
"- numpy=1.11.1=py35_blas_openblas_201\r\n",
"- openblas=0.2.18=4\r\n",
"- openssl=1.0.2h=2\r\n",
"- pandas=0.18.1=np111py35_1\r\n",
"- pango=1.40.1=0\r\n",
"- path.py=8.2.1=py35_0\r\n",
"- pcre=8.38=1\r\n",
"- pexpect=4.2.0=py35_1\r\n",
"- pickleshare=0.7.3=py35_0\r\n",
"- pip=8.1.2=py35_0\r\n",
"- pixman=0.32.6=0\r\n",
"- prompt_toolkit=1.0.6=py35_0\r\n",
"- protobuf=3.0.0b3=py35_1\r\n",
"- ptyprocess=0.5.1=py35_0\r\n",
"- pygments=2.1.3=py35_1\r\n",
"- pyparsing=2.1.7=py35_0\r\n",
"- python=3.5.2=2\r\n",
"- python-dateutil=2.5.3=py35_0\r\n",
"- pytz=2016.6.1=py35_0\r\n",
"- pyyaml=3.11=py35_0\r\n",
"- pyzmq=15.4.0=py35_0\r\n",
"- qt=4.8.7=0\r\n",
"- qtconsole=4.2.1=py35_0\r\n",
"- readline=6.2=0\r\n",
"- requests=2.11.0=py35_0\r\n",
"- scikit-learn=0.17.1=np111py35_blas_openblas_201\r\n",
"- scipy=0.18.0=np111py35_blas_openblas_201\r\n",
"- setuptools=25.1.6=py35_0\r\n",
"- simplegeneric=0.8.1=py35_0\r\n",
"- sip=4.18=py35_0\r\n",
"- six=1.10.0=py35_0\r\n",
"- sqlite=3.13.0=1\r\n",
"- terminado=0.6=py35_0\r\n",
"- tk=8.5.19=0\r\n",
"- tornado=4.4.1=py35_1\r\n",
"- traitlets=4.2.2=py35_0\r\n",
"- wcwidth=0.1.7=py35_0\r\n",
"- wheel=0.29.0=py35_0\r\n",
"- widgetsnbextension=1.2.6=py35_3\r\n",
"- xz=5.2.2=0\r\n",
"- yaml=0.1.6=0\r\n",
"- zeromq=4.1.5=0\r\n",
"- zlib=1.2.8=3\r\n",
"- cudatoolkit=7.5=0\r\n",
"- ipython_genutils=0.1.0=py35_0\r\n",
"- jupyter=1.0.0=py35_3\r\n",
"- libgfortran=3.0.0=1\r\n",
"- llvmlite=0.11.0=py35_0\r\n",
"- mkl=11.3.3=0\r\n",
"- mkl-service=1.1.2=py35_2\r\n",
"- numba=0.26.0=np111py35_0\r\n",
"- pycparser=2.14=py35_1\r\n",
"- pyqt=4.11.4=py35_4\r\n",
"- snakeviz=0.4.1=py35_0\r\n",
"- pip:\r\n",
" - backports.shutil-get-terminal-size==1.0.0\r\n",
" - certifi==2016.8.2\r\n",
" - cycler==0.10.0\r\n",
" - cython==0.24.1\r\n",
" - decorator==4.0.10\r\n",
" - h5py==2.6.0\r\n",
" - ipykernel==4.3.1\r\n",
" - ipython==5.1.0\r\n",
" - ipython-genutils==0.1.0\r\n",
" - ipywidgets==5.2.2\r\n",
" - jinja2==2.8\r\n",
" - jsonschema==2.5.1\r\n",
" - jupyter-client==4.3.0\r\n",
" - jupyter-console==5.0.0\r\n",
" - jupyter-core==4.1.1\r\n",
" - keras==1.0.7\r\n",
" - mako==1.0.4\r\n",
" - markupsafe==0.23\r\n",
" - matplotlib==1.5.2\r\n",
" - mistune==0.7.3\r\n",
" - nbconvert==4.2.0\r\n",
" - nbformat==4.0.1\r\n",
" - nose==1.3.7\r\n",
" - notebook==4.2.2\r\n",
" - numpy==1.11.1\r\n",
" - pandas==0.18.1\r\n",
" - path.py==8.2.1\r\n",
" - pexpect==4.2.0\r\n",
" - pickleshare==0.7.3\r\n",
" - pip==8.1.2\r\n",
" - prompt-toolkit==1.0.6\r\n",
" - protobuf==3.0.0b2\r\n",
" - ptyprocess==0.5.1\r\n",
" - pygments==2.1.3\r\n",
" - pygpu==0.2.1\r\n",
" - pyparsing==2.1.7\r\n",
" - python-dateutil==2.5.3\r\n",
" - pytz==2016.6.1\r\n",
" - pyyaml==3.11\r\n",
" - pyzmq==15.4.0\r\n",
" - qtconsole==4.2.1\r\n",
" - requests==2.11.0\r\n",
" - scikit-learn==0.17.1\r\n",
" - scipy==0.18.0\r\n",
" - setuptools==25.1.4\r\n",
" - simplegeneric==0.8.1\r\n",
" - six==1.10.0\r\n",
" - tensorflow==0.10.0rc0\r\n",
" - terminado==0.6\r\n",
" - theano==0.8.2\r\n",
" - tornado==4.4.1\r\n",
" - traitlets==4.2.2\r\n",
" - wcwidth==0.1.7\r\n",
" - wheel==0.29.0\r\n",
" - widgetsnbextension==1.2.6\r\n",
"prefix: /home/valerio/anaconda3/envs/deep-learning\r\n",
"\r\n"
]
}
],
"source": [
"!cat deep-learning.yml"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Recreate the Conda Environment"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### A. Create the Environment\n",
"\n",
"```\n",
"conda env create -f deep-learning.yml # this file is for Linux channels.\n",
"```\n",
"\n",
"If you're using a **Mac OSX**, we also provided in the repo the conda file \n",
"that is compatible with `osx-channels`:\n",
"\n",
"```\n",
"conda env create -f deep-learning-osx.yml # this file is for OSX channels.\n",
"```\n",
"\n",
"#### B. Activate the new `deep-learning` Environment\n",
"\n",
"```\n",
"source activate deep-learning\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Optionals"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 1. Enabling Conda-Forge"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"It is strongly suggested to enable [**conda forge**](https://conda-forge.github.io/) in your Anaconda installation.\n",
"\n",
"**Conda-Forge** is a github organisation containing repositories of conda recipies.\n",
"\n",
"To add `conda-forge` as an additional anaconda channel it is just required to type:\n",
"\n",
"```shell\n",
"conda config --add channels conda-forge\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 2. Configure Theano\n",
"\n",
"1) Create the `theanorc` file:\n",
"\n",
"```shell\n",
"touch $HOME/.theanorc\n",
"```\n",
"\n",
"2) Copy the following content into the file:\n",
"\n",
"```\n",
"[global]\n",
"floatX = float32\n",
"device = gpu # switch to cpu if no GPU is available on your machine\n",
"\n",
"[nvcc]\n",
"fastmath = True\n",
"\n",
"[lib]\n",
"cnmem=.90\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**More on [theano documentation](http://theano.readthedocs.io/en/latest/library/config.html)**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 3. Installing Tensorflow as backend "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"```shell\n",
"# Ubuntu/Linux 64-bit, GPU enabled, Python 3.5\n",
"# Requires CUDA toolkit 7.5 and CuDNN v4. For other versions, see \"Install from sources\" below.\n",
"export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl\n",
"\n",
"pip install --ignore-installed --upgrade $TF_BINARY_URL\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**More on [tensorflow documentation](https://www.tensorflow.org/versions/r0.10/get_started/os_setup.html)**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Test if everything is up&running"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 1. Check import"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"import scipy as sp\n",
"import pandas as pd\n",
"import matplotlib.pyplot as plt\n",
"import sklearn"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Using Theano backend.\n",
"Using gpu device 0: GeForce GTX 760 (CNMeM is enabled with initial size: 90.0% of memory, cuDNN 4007)\n"
]
}
],
"source": [
"import keras"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 2. Check installeded Versions"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"numpy: 1.11.1\n",
"scipy: 0.18.0\n",
"matplotlib: 1.5.2\n",
"iPython: 5.1.0\n",
"scikit-learn: 0.17.1\n"
]
}
],
"source": [
"import numpy\n",
"print('numpy:', numpy.__version__)\n",
"\n",
"import scipy\n",
"print('scipy:', scipy.__version__)\n",
"\n",
"import matplotlib\n",
"print('matplotlib:', matplotlib.__version__)\n",
"\n",
"import IPython\n",
"print('iPython:', IPython.__version__)\n",
"\n",
"import sklearn\n",
"print('scikit-learn:', sklearn.__version__)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"keras: 1.0.7\n",
"Theano: 0.8.2\n",
"Tensorflow: 0.10.0rc0\n"
]
}
],
"source": [
"import keras\n",
"print('keras: ', keras.__version__)\n",
"\n",
"import theano\n",
"print('Theano: ', theano.__version__)\n",
"\n",
"# optional\n",
"import tensorflow as tf\n",
"print('Tensorflow: ', tf.__version__)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"<br>\n",
"<h1 style=\"text-align: center;\">If everything worked till down here, you're ready to start!</h1>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Consulting Material"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You have two options to go through the material presented in this tutorial:\n",
"\n",
"* Read (and execute) the material as **iPython/Jupyter** notebooks\n",
"* (just) read the material as (HTML) slides"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the first case, all you need to do is just execute `ipython notebook` (or `jupyter notebook`) depending on the version of `iPython` you have installed on your machine\n",
"\n",
"(`jupyter` command works in case you have `iPython 4.0.x` installed)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the second case, you may simply convert the provided notebooks in `HTML` slides and see them into your browser\n",
"thanks to `nbconvert`.\n",
"\n",
"Thus, move to the folder where notebooks are stored and execute the following command:\n",
"\n",
" jupyter nbconvert --to slides ./*.ipynb --post serve\n",
" \n",
" \n",
"(Please substitute `jupyter` with `ipython` in the previous command if you have `iPython 3.x` installed on your machine)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## In case..."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"..you wanna do **both** (interactive and executable slides), I highly suggest to install the terrific `RISE` ipython notebook extension: [https://github.com/damianavila/RISE](https://github.com/damianavila/RISE)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.4.3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}