mirror of
https://github.com/donnemartin/data-science-ipython-notebooks.git
synced 2024-03-22 13:30:56 +08:00
292 lines
8.4 KiB
Python
292 lines
8.4 KiB
Python
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"# Iteration"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Now that we have a few examples under our belt, let us take a look at what is happening a bit more closely.\n",
|
||
"\n",
|
||
"As we have identified earlier, TensorFlow allows us to create a graph of operations and variables. These variables are called **Tensors**, and represent data, whether that is a single number, a string, a matrix, or something else. **Tensors** are combined through operations, and this whole process is modelled in a graph.\n",
|
||
"\n",
|
||
"First, make sure you have your **tensorenv** virtual environment activated, Once it is activated type in **conda install jupyter** to install jupter books.\n",
|
||
"\n",
|
||
"Then, run **jupyter notebook** to launch a browser session of the Jupyter Notebook (previously called the IPython Notebook). (If your browser doesn’t open, open it and type **localhost:8888** into the browser’s address bar.)\n",
|
||
"\n",
|
||
"Click “New” and then “Python 3” under “Notebooks”. This will launch a new browser tab. Give the notebook a name by clicking “Untitled” at the top and give it a name (I used “Interactive TensorFlow”).\n",
|
||
"\n",
|
||
"> If you have never used a Jupyter Notebook (or IPython Notebook) before, take a look [at this site](http://opentechschool.github.io/python-data-intro/core/notebook.html) for a brief introduction.\n",
|
||
"\n",
|
||
"Next, as before, let’s create a basic TensorFlow program. One major change is the use of an **InteractiveSession**, which allows us to run variables without needing to constantly refer to the session object (less typing!). Code blocks below are broken into different cells. If you see a break in the code, you will need to run the previous cell first. Also, if you aren’t otherwise confident, ensure all of the code in a given block is type into a cell before you run it."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"metadata": {
|
||
"collapsed": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import tensorflow as tf\n",
|
||
"\n",
|
||
"session = tf.InteractiveSession()\n",
|
||
"\n",
|
||
"x = tf.constant(list(range(10)))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"In this section of code, we create an **InteractiveSession**, and then define a **constant** value, which is like a placeholder, but with a set value (that doesn’t change). In the next cell, we can evaluate this constant and print the result."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[0 1 2 3 4 5 6 7 8 9]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(x.eval())"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"metadata": {
|
||
"collapsed": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"session.close()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Closing sessions is quite important, and can be easy to forget. For that reason, we were using the **with** keyword in earlier tutorials to handle this. When the **with** block is finished executing, the session will be closed (this also happens if an error happens - the session is still closed).\n",
|
||
"\n",
|
||
"Now lets take a look at a larger example. In this example, we will take a very large matrix and compute on it, keeping track of when memory is used. First, let’s find out how much memory our Python session is currently using:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"3944091648 Kb\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import resource\n",
|
||
"print(\"{} Kb\".format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"On my system, this is using 78496 kilobytes, after running the above code as well. Now, create a new session, and define two matrices:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"metadata": {
|
||
"collapsed": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"session = tf.InteractiveSession()\n",
|
||
"\n",
|
||
"X = tf.constant(np.eye(10000))\n",
|
||
"Y = tf.constant(np.random.randn(10000, 300))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Let’s take a look at our memory usage again:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"3944091648 Kb\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"{} Kb\".format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"On my system, the memory usage jumped to 885,220 Kb - those matrices are large!\n",
|
||
"\n",
|
||
"Now, let’s multiply those matrices together using matmul:"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"metadata": {
|
||
"collapsed": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"Z = tf.matmul(X, Y)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"If we check our memory usage now, we find that no more memory has been used – no actual computation of Z has taken place. It is only when we evaluate the operation do we actually computer this. For an interactive session, you can just use **Z.eval()**, rather than run **session.run(Z)**. Note that you can’t always rely on .eval(), as this is a shortcut that uses the “default” session, not necessarily the one you want to use."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[-1.34447547, 0.21090638, 0.43680784, ..., 0.73037215,\n",
|
||
" -0.23213385, -1.12871138],\n",
|
||
" [ 0.64485571, -0.54936434, 0.35649891, ..., -0.12413736,\n",
|
||
" -0.58614079, -0.32230335],\n",
|
||
" [ 0.37195075, 0.20654139, -0.15417305, ..., 0.11346775,\n",
|
||
" 0.23759908, -1.27450529],\n",
|
||
" ..., \n",
|
||
" [ 0.63094722, -2.96401008, -1.58592691, ..., -1.57240119,\n",
|
||
" -0.34051408, 0.53935437],\n",
|
||
" [-1.97724198, -0.32331035, -0.98294343, ..., -1.56477455,\n",
|
||
" -0.02398526, -0.37687652],\n",
|
||
" [-0.70687284, 2.04289277, -0.57564784, ..., -0.40614639,\n",
|
||
" -0.12381717, -2.23741137]])"
|
||
]
|
||
},
|
||
"execution_count": 36,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"Z.eval()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Your computer will think for quite a while, because only now is it actually performing the action of multiplying those matrices. Checking the memory usage afterwards reveals that this computation has happened, as it now uses nearly 3Gb!"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"metadata": {
|
||
"collapsed": false
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"4016365568 Kb\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(\"{} Kb\".format(resource.getrusage(resource.RUSAGE_SELF).ru_maxrss))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {},
|
||
"source": [
|
||
"Don’t forget to close your session!"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"metadata": {
|
||
"collapsed": true
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"session.close()"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"kernelspec": {
|
||
"display_name": "Python 2",
|
||
"language": "python",
|
||
"name": "python2"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 2
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython2",
|
||
"version": "2.7.11"
|
||
},
|
||
"toc": {
|
||
"toc_cell": false,
|
||
"toc_number_sections": false,
|
||
"toc_threshold": "8",
|
||
"toc_window_display": false
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 0
|
||
}
|