{
 "metadata": {
  "name": "",
  "signature": "sha256:dcf024d8d7e8070e9c5c1ad5d846420549912ed82967bcd1d7ef1193a6eb6dd8"
 },
 "nbformat": 3,
 "nbformat_minor": 0,
 "worksheets": [
  {
   "cells": [
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "# Pandas\n",
      "\n",
      "* Series\n",
      "* DataFrame\n",
      "* Reindexing\n",
      "* Dropping Entries\n",
      "* Indexing, Selecting, Filtering\n",
      "* Arithmetic and Data Alignment\n",
      "* Function Application and Mapping\n",
      "* Sorting and Ranking\n",
      "* Axis Indices with Duplicate Values\n",
      "* Summarizing and Computing Descriptive Statistics"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "from pandas import Series, DataFrame\n",
      "import pandas as pd\n",
      "import numpy as np"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 1
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Series\n",
      "\n",
      "A Series is a one-dimensional array-like object containing an array of data and an associated array of data labels.  The data can be any NumPy data type and the labels are the Series' index."
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_1 = Series([1, 1, 2, -3, -5, 8, 13])\n",
      "ser_1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 2,
       "text": [
        "0     1\n",
        "1     1\n",
        "2     2\n",
        "3    -3\n",
        "4    -5\n",
        "5     8\n",
        "6    13\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 2
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Get the array representation of a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_1.values"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 3,
       "text": [
        "array([ 1,  1,  2, -3, -5,  8, 13])"
       ]
      }
     ],
     "prompt_number": 3
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Index objects are immutable and hold the axis labels and metadata such as names and axis names.\n",
      "\n",
      "Get the index of the Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_1.index"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 4,
       "text": [
        "Int64Index([0, 1, 2, 3, 4, 5, 6], dtype='int64')"
       ]
      }
     ],
     "prompt_number": 4
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a Series with a custom index:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2 = Series([1, 1, 2, -3, -5], index=['a', 'b', 'c', 'd', 'e'])\n",
      "ser_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 5,
       "text": [
        "a    1\n",
        "b    1\n",
        "c    2\n",
        "d   -3\n",
        "e   -5\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 5
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Get a value from a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[4] == ser_2['e']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 6,
       "text": [
        "True"
       ]
      }
     ],
     "prompt_number": 6
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Get a set of values from a Series by passing in a list:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[['c', 'a', 'b']]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 7,
       "text": [
        "c    2\n",
        "a    1\n",
        "b    1\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 7
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Get values great than 0:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[ser_2 > 0]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 8,
       "text": [
        "a    1\n",
        "b    1\n",
        "c    2\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 8
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Scalar multiply:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2 * 2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 9,
       "text": [
        "a     2\n",
        "b     2\n",
        "c     4\n",
        "d    -6\n",
        "e   -10\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 9
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply a numpy math function:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "import numpy as np\n",
      "np.exp(ser_2)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 10,
       "text": [
        "a    2.718282\n",
        "b    2.718282\n",
        "c    7.389056\n",
        "d    0.049787\n",
        "e    0.006738\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 10
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "A Series is like a fixed-length, ordered dict.  \n",
      "\n",
      "Create a series by passing in a dict:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "dict_1 = {'foo' : 100, 'bar' : 200, 'baz' : 300}\n",
      "ser_3 = Series(dict_1)\n",
      "ser_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 11,
       "text": [
        "bar    200\n",
        "baz    300\n",
        "foo    100\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 11
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Re-order a Series by passing in an index (indices not found are NaN):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "index = ['foo', 'bar', 'baz', 'qux']\n",
      "ser_4 = Series(dict_1, index=index)\n",
      "ser_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 12,
       "text": [
        "foo    100\n",
        "bar    200\n",
        "baz    300\n",
        "qux    NaN\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 12
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Check for NaN with the pandas method:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pd.isnull(ser_4)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 13,
       "text": [
        "foo    False\n",
        "bar    False\n",
        "baz    False\n",
        "qux     True\n",
        "dtype: bool"
       ]
      }
     ],
     "prompt_number": 13
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Check for NaN with the Series method:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.isnull()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 14,
       "text": [
        "foo    False\n",
        "bar    False\n",
        "baz    False\n",
        "qux     True\n",
        "dtype: bool"
       ]
      }
     ],
     "prompt_number": 14
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Series automatically aligns differently indexed data in arithmetic operations:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_3 + ser_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 15,
       "text": [
        "bar    400\n",
        "baz    600\n",
        "foo    200\n",
        "qux    NaN\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 15
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Name a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.name = 'foobarbazqux'"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 16
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Name a Series index:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.index.name = 'label'"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 17
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 18,
       "text": [
        "label\n",
        "foo      100\n",
        "bar      200\n",
        "baz      300\n",
        "qux      NaN\n",
        "Name: foobarbazqux, dtype: float64"
       ]
      }
     ],
     "prompt_number": 18
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Rename a Series' index in place:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.index = ['fo', 'br', 'bz', 'qx']\n",
      "ser_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 19,
       "text": [
        "fo    100\n",
        "br    200\n",
        "bz    300\n",
        "qx    NaN\n",
        "Name: foobarbazqux, dtype: float64"
       ]
      }
     ],
     "prompt_number": 19
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## DataFrame\n",
      "\n",
      "A DataFrame is a tabular data structure containing an ordered collection of columns.  Each column can have a different type.  DataFrames have both row and column indices and is analogous to a dict of Series.  Row and column operations are treated roughly symmetrically.  Columns returned when indexing a DataFrame are views of the underlying data, not a copy.  To obtain a copy, use the Series' copy method.\n",
      "\n",
      "Create a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "data_1 = {'state' : ['VA', 'VA', 'VA', 'MD', 'MD'],\n",
      "          'year' : [2012, 2013, 2014, 2014, 2015],\n",
      "          'pop' : [5.0, 5.1, 5.2, 4.0, 4.1]}\n",
      "df_1 = DataFrame(data_1)\n",
      "df_1"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>pop</th>\n",
        "      <th>state</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 5.0</td>\n",
        "      <td> VA</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 5.1</td>\n",
        "      <td> VA</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 5.2</td>\n",
        "      <td> VA</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 4.0</td>\n",
        "      <td> MD</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> MD</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 20,
       "text": [
        "   pop state  year\n",
        "0  5.0    VA  2012\n",
        "1  5.1    VA  2013\n",
        "2  5.2    VA  2014\n",
        "3  4.0    MD  2014\n",
        "4  4.1    MD  2015"
       ]
      }
     ],
     "prompt_number": 20
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a DataFrame specifying a sequence of columns:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_2 = DataFrame(data_1, columns=['year', 'state', 'pop'])\n",
      "df_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 21,
       "text": [
        "   year state  pop\n",
        "0  2012    VA  5.0\n",
        "1  2013    VA  5.1\n",
        "2  2014    VA  5.2\n",
        "3  2014    MD  4.0\n",
        "4  2015    MD  4.1"
       ]
      }
     ],
     "prompt_number": 21
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Like Series, columns that are not present in the data are NaN:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3 = DataFrame(data_1, columns=['year', 'state', 'pop', 'unempl'])\n",
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 22,
       "text": [
        "   year state  pop unempl\n",
        "0  2012    VA  5.0    NaN\n",
        "1  2013    VA  5.1    NaN\n",
        "2  2014    VA  5.2    NaN\n",
        "3  2014    MD  4.0    NaN\n",
        "4  2015    MD  4.1    NaN"
       ]
      }
     ],
     "prompt_number": 22
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Retrieve a column by key, returning a Series:\n"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3['state']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 23,
       "text": [
        "0    VA\n",
        "1    VA\n",
        "2    VA\n",
        "3    MD\n",
        "4    MD\n",
        "Name: state, dtype: object"
       ]
      }
     ],
     "prompt_number": 23
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Retrive a column by attribute, returning a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.year"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 24,
       "text": [
        "0    2012\n",
        "1    2013\n",
        "2    2014\n",
        "3    2014\n",
        "4    2015\n",
        "Name: year, dtype: int64"
       ]
      }
     ],
     "prompt_number": 24
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Retrieve a row by position:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.ix[0]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 25,
       "text": [
        "year      2012\n",
        "state       VA\n",
        "pop          5\n",
        "unempl     NaN\n",
        "Name: 0, dtype: object"
       ]
      }
     ],
     "prompt_number": 25
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Update a column by assignment:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3['unempl'] = np.arange(5)\n",
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> 0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 4</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 26,
       "text": [
        "   year state  pop  unempl\n",
        "0  2012    VA  5.0       0\n",
        "1  2013    VA  5.1       1\n",
        "2  2014    VA  5.2       2\n",
        "3  2014    MD  4.0       3\n",
        "4  2015    MD  4.1       4"
       ]
      }
     ],
     "prompt_number": 26
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Assign a Series to a column (note if assigning a list or array, the length must match the DataFrame, unlike a Series):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "unempl = Series([6.0, 6.0, 6.1], index=[2, 3, 4])\n",
      "df_3['unempl'] = unempl\n",
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 27,
       "text": [
        "   year state  pop  unempl\n",
        "0  2012    VA  5.0     NaN\n",
        "1  2013    VA  5.1     NaN\n",
        "2  2014    VA  5.2     6.0\n",
        "3  2014    MD  4.0     6.0\n",
        "4  2015    MD  4.1     6.1"
       ]
      }
     ],
     "prompt_number": 27
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Assign a new column that doesn't exist to create a new column:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3['state_dup'] = df_3['state']\n",
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>state_dup</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> VA</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> VA</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> VA</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> MD</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> MD</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 28,
       "text": [
        "   year state  pop  unempl state_dup\n",
        "0  2012    VA  5.0     NaN        VA\n",
        "1  2013    VA  5.1     NaN        VA\n",
        "2  2014    VA  5.2     6.0        VA\n",
        "3  2014    MD  4.0     6.0        MD\n",
        "4  2015    MD  4.1     6.1        MD"
       ]
      }
     ],
     "prompt_number": 28
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Delete a column:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "del df_3['state_dup']\n",
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 29,
       "text": [
        "   year state  pop  unempl\n",
        "0  2012    VA  5.0     NaN\n",
        "1  2013    VA  5.1     NaN\n",
        "2  2014    VA  5.2     6.0\n",
        "3  2014    MD  4.0     6.0\n",
        "4  2015    MD  4.1     6.1"
       ]
      }
     ],
     "prompt_number": 29
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a DataFrame from a nested dict of dicts (the keys in the inner dicts are unioned and sorted to form the index in the result, unless an explicit index is specified):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "pop = {'VA' : {2013 : 5.1, 2014 : 5.2},\n",
      "       'MD' : {2014 : 4.0, 2015 : 4.1}}\n",
      "df_4 = DataFrame(pop)\n",
      "df_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>MD</th>\n",
        "      <th>VA</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2013</th>\n",
        "      <td> NaN</td>\n",
        "      <td> 5.1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2014</th>\n",
        "      <td> 4.0</td>\n",
        "      <td> 5.2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2015</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 30,
       "text": [
        "       MD   VA\n",
        "2013  NaN  5.1\n",
        "2014  4.0  5.2\n",
        "2015  4.1  NaN"
       ]
      }
     ],
     "prompt_number": 30
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Transpose the DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_4.T"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>2013</th>\n",
        "      <th>2014</th>\n",
        "      <th>2015</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>MD</th>\n",
        "      <td> NaN</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 4.1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>VA</th>\n",
        "      <td> 5.1</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 31,
       "text": [
        "    2013  2014  2015\n",
        "MD   NaN   4.0   4.1\n",
        "VA   5.1   5.2   NaN"
       ]
      }
     ],
     "prompt_number": 31
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a DataFrame from a dict of Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "data_2 = {'VA' : df_4['VA'][1:],\n",
      "          'MD' : df_4['MD'][2:]}\n",
      "df_5 = DataFrame(data_2)\n",
      "df_5"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>MD</th>\n",
        "      <th>VA</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2014</th>\n",
        "      <td> NaN</td>\n",
        "      <td> 5.2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2015</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 32,
       "text": [
        "       MD   VA\n",
        "2014  NaN  5.2\n",
        "2015  4.1  NaN"
       ]
      }
     ],
     "prompt_number": 32
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Set the DataFrame index name:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_5.index.name = 'year'\n",
      "df_5"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>MD</th>\n",
        "      <th>VA</th>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>year</th>\n",
        "      <th></th>\n",
        "      <th></th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2014</th>\n",
        "      <td> NaN</td>\n",
        "      <td> 5.2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2015</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 33,
       "text": [
        "       MD   VA\n",
        "year          \n",
        "2014  NaN  5.2\n",
        "2015  4.1  NaN"
       ]
      }
     ],
     "prompt_number": 33
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Set the DataFrame columns name:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_5.columns.name = 'state'\n",
      "df_5"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th>state</th>\n",
        "      <th>MD</th>\n",
        "      <th>VA</th>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>year</th>\n",
        "      <th></th>\n",
        "      <th></th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2014</th>\n",
        "      <td> NaN</td>\n",
        "      <td> 5.2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2015</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 34,
       "text": [
        "state   MD   VA\n",
        "year           \n",
        "2014   NaN  5.2\n",
        "2015   4.1  NaN"
       ]
      }
     ],
     "prompt_number": 34
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Return the data contained in a DataFrame as a 2D ndarray:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_5.values"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 35,
       "text": [
        "array([[ nan,  5.2],\n",
        "       [ 4.1,  nan]])"
       ]
      }
     ],
     "prompt_number": 35
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "If the columns are different dtypes, the 2D ndarray's dtype will accomodate all of the columns:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.values"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 36,
       "text": [
        "array([[2012, 'VA', 5.0, nan],\n",
        "       [2013, 'VA', 5.1, nan],\n",
        "       [2014, 'VA', 5.2, 6.0],\n",
        "       [2014, 'MD', 4.0, 6.0],\n",
        "       [2015, 'MD', 4.1, 6.1]], dtype=object)"
       ]
      }
     ],
     "prompt_number": 36
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Reindexing"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Create a new object with the data conformed to a new index.  Any missing values are set to NaN."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 37,
       "text": [
        "   year state  pop  unempl\n",
        "0  2012    VA  5.0     NaN\n",
        "1  2013    VA  5.1     NaN\n",
        "2  2014    VA  5.2     6.0\n",
        "3  2014    MD  4.0     6.0\n",
        "4  2015    MD  4.1     6.1"
       ]
      }
     ],
     "prompt_number": 37
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Reindexing rows returns a new frame with the specified index:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.reindex(list(reversed(range(0, 6))))"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td>  NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2015</td>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2014</td>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2014</td>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 2013</td>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 2012</td>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 38,
       "text": [
        "   year state  pop  unempl\n",
        "5   NaN   NaN  NaN     NaN\n",
        "4  2015    MD  4.1     6.1\n",
        "3  2014    MD  4.0     6.0\n",
        "2  2014    VA  5.2     6.0\n",
        "1  2013    VA  5.1     NaN\n",
        "0  2012    VA  5.0     NaN"
       ]
      }
     ],
     "prompt_number": 38
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Missing values can be set to something other than NaN:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.reindex(range(6, 0), fill_value=0)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>year</th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 39,
       "text": [
        "Empty DataFrame\n",
        "Columns: [year, state, pop, unempl]\n",
        "Index: []"
       ]
      }
     ],
     "prompt_number": 39
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Interpolate ordered data like a time series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_5 = Series(['foo', 'bar', 'baz'], index=[0, 2, 4])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [],
     "prompt_number": 40
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_5.reindex(range(5), method='ffill')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 41,
       "text": [
        "0    foo\n",
        "1    foo\n",
        "2    bar\n",
        "3    bar\n",
        "4    baz\n",
        "dtype: object"
       ]
      }
     ],
     "prompt_number": 41
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_5.reindex(range(5), method='bfill')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 42,
       "text": [
        "0    foo\n",
        "1    bar\n",
        "2    bar\n",
        "3    baz\n",
        "4    baz\n",
        "dtype: object"
       ]
      }
     ],
     "prompt_number": 42
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Reindex columns:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.reindex(columns=['state', 'pop', 'unempl', 'year'])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 43,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015"
       ]
      }
     ],
     "prompt_number": 43
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Reindex rows and columns while filling rows:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_3.reindex(index=list(reversed(range(0, 6))),\n",
      "             fill_value=0,\n",
      "             columns=['state', 'pop', 'unempl', 'year'])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td>  0</td>\n",
        "      <td> 0.0</td>\n",
        "      <td> 0.0</td>\n",
        "      <td>    0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 44,
       "text": [
        "  state  pop  unempl  year\n",
        "5     0  0.0     0.0     0\n",
        "4    MD  4.1     6.1  2015\n",
        "3    MD  4.0     6.0  2014\n",
        "2    VA  5.2     6.0  2014\n",
        "1    VA  5.1     NaN  2013\n",
        "0    VA  5.0     NaN  2012"
       ]
      }
     ],
     "prompt_number": 44
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Reindex using ix:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6 = df_3.ix[range(0, 7), ['state', 'pop', 'unempl', 'year']]\n",
      "df_6"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 45,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 45
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Dropping Entries"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Drop rows from a Series or DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_7 = df_6.drop([0, 1])\n",
      "df_7"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 46,
       "text": [
        "  state  pop  unempl  year\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 46
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Drop columns from a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_7 = df_7.drop('unempl', axis=1)\n",
      "df_7"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 47,
       "text": [
        "  state  pop  year\n",
        "2    VA  5.2  2014\n",
        "3    MD  4.0  2014\n",
        "4    MD  4.1  2015\n",
        "5   NaN  NaN   NaN\n",
        "6   NaN  NaN   NaN"
       ]
      }
     ],
     "prompt_number": 47
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Indexing, Selecting, Filtering"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Series indexing is similar to NumPy array indexing with the added bonus of being able to use the Series' index values."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 48,
       "text": [
        "a    1\n",
        "b    1\n",
        "c    2\n",
        "d   -3\n",
        "e   -5\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 48
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a value from a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[0] == ser_2['a']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 49,
       "text": [
        "True"
       ]
      }
     ],
     "prompt_number": 49
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a slice from a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[1:4]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 50,
       "text": [
        "b    1\n",
        "c    2\n",
        "d   -3\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 50
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select specific values from a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[['b', 'c', 'd']]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 51,
       "text": [
        "b    1\n",
        "c    2\n",
        "d   -3\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 51
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select from a Series based on a filter:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2[ser_2 > 0]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 52,
       "text": [
        "a    1\n",
        "b    1\n",
        "c    2\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 52
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a slice from a Series with labels (note the end point is inclusive):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2['a':'b']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 53,
       "text": [
        "a    1\n",
        "b    1\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 53
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Assign to a Series slice (note the end point is inclusive):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_2['a':'b'] = 0\n",
      "ser_2"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 54,
       "text": [
        "a    0\n",
        "b    0\n",
        "c    2\n",
        "d   -3\n",
        "e   -5\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 54
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Pandas supports indexing into a DataFrame."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 55,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 55
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select specified columns from a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6[['pop', 'unempl']]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 56,
       "text": [
        "   pop  unempl\n",
        "0  5.0     NaN\n",
        "1  5.1     NaN\n",
        "2  5.2     6.0\n",
        "3  4.0     6.0\n",
        "4  4.1     6.1\n",
        "5  NaN     NaN\n",
        "6  NaN     NaN"
       ]
      }
     ],
     "prompt_number": 56
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a slice from a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6[:2]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td>NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td>NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 57,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013"
       ]
      }
     ],
     "prompt_number": 57
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select from a DataFrame based on a filter:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6[df_6['pop'] > 5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td>NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td>  6</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 58,
       "text": [
        "  state  pop  unempl  year\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2       6  2014"
       ]
      }
     ],
     "prompt_number": 58
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Perform a scalar comparison on a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6 > 5"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  True</td>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "      <td>  True</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "      <td> False</td>\n",
        "      <td>  True</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  True</td>\n",
        "      <td> False</td>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  True</td>\n",
        "      <td> False</td>\n",
        "      <td>  True</td>\n",
        "      <td>  True</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "      <td> False</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 59,
       "text": [
        "   state    pop unempl   year\n",
        "0   True  False  False   True\n",
        "1   True   True  False   True\n",
        "2   True   True   True   True\n",
        "3   True  False   True   True\n",
        "4   True  False   True   True\n",
        "5  False  False  False  False\n",
        "6  False  False  False  False"
       ]
      }
     ],
     "prompt_number": 59
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Perform a scalar comparison on a DataFrame, retain the values that pass the filter:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6[df_6 > 5]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  VA</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 60,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  NaN     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  NaN     6.0  2014\n",
        "4    MD  NaN     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 60
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a slice of rows from a DataFrame (note the end point is inclusive):"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.ix[2:3]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 61,
       "text": [
        "  state  pop  unempl  year\n",
        "2    VA  5.2       6  2014\n",
        "3    MD  4.0       6  2014"
       ]
      }
     ],
     "prompt_number": 61
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select a slice of rows from a specific column of a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.ix[0:2, 'pop']\n",
      "df_6"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 62,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 62
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select rows based on an arithmetic operation on a specific row:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.ix[df_6.unempl > 5.0]"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 63,
       "text": [
        "  state  pop  unempl  year\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015"
       ]
      }
     ],
     "prompt_number": 63
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Arithmetic and Data Alignment"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Adding Series objects results in the union of index pairs if the pairs are not the same, resulting in NaN for indices that do not overlap:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "np.random.seed(0)\n",
      "ser_6 = Series(np.random.randn(5),\n",
      "               index=['a', 'b', 'c', 'd', 'e'])\n",
      "ser_6"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 64,
       "text": [
        "a    1.764052\n",
        "b    0.400157\n",
        "c    0.978738\n",
        "d    2.240893\n",
        "e    1.867558\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 64
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "np.random.seed(1)\n",
      "ser_7 = Series(np.random.randn(5),\n",
      "               index=['a', 'c', 'e', 'f', 'g'])\n",
      "ser_7"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 65,
       "text": [
        "a    1.624345\n",
        "c   -0.611756\n",
        "e   -0.528172\n",
        "f   -1.072969\n",
        "g    0.865408\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 65
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_6 + ser_7"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 66,
       "text": [
        "a    3.388398\n",
        "b         NaN\n",
        "c    0.366982\n",
        "d         NaN\n",
        "e    1.339386\n",
        "f         NaN\n",
        "g         NaN\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 66
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Set a fill value instead of NaN for indices that do not overlap:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_6.add(ser_7, fill_value=0)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 67,
       "text": [
        "a    3.388398\n",
        "b    0.400157\n",
        "c    0.366982\n",
        "d    2.240893\n",
        "e    1.339386\n",
        "f   -1.072969\n",
        "g    0.865408\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 67
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Adding DataFrame objects results in the union of index pairs for rows and columns if the pairs are not the same, resulting in NaN for indices that do not overlap:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "np.random.seed(0)\n",
      "df_8 = DataFrame(np.random.rand(9).reshape((3, 3)),\n",
      "                 columns=['a', 'b', 'c'])\n",
      "df_8"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.548814</td>\n",
        "      <td> 0.715189</td>\n",
        "      <td> 0.602763</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.544883</td>\n",
        "      <td> 0.423655</td>\n",
        "      <td> 0.645894</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.437587</td>\n",
        "      <td> 0.891773</td>\n",
        "      <td> 0.963663</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 68,
       "text": [
        "          a         b         c\n",
        "0  0.548814  0.715189  0.602763\n",
        "1  0.544883  0.423655  0.645894\n",
        "2  0.437587  0.891773  0.963663"
       ]
      }
     ],
     "prompt_number": 68
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "np.random.seed(1)\n",
      "df_9 = DataFrame(np.random.rand(9).reshape((3, 3)),\n",
      "                 columns=['b', 'c', 'd'])\n",
      "df_9"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.417022</td>\n",
        "      <td> 0.720324</td>\n",
        "      <td> 0.000114</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.302333</td>\n",
        "      <td> 0.146756</td>\n",
        "      <td> 0.092339</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.186260</td>\n",
        "      <td> 0.345561</td>\n",
        "      <td> 0.396767</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 69,
       "text": [
        "          b         c         d\n",
        "0  0.417022  0.720324  0.000114\n",
        "1  0.302333  0.146756  0.092339\n",
        "2  0.186260  0.345561  0.396767"
       ]
      }
     ],
     "prompt_number": 69
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_8 + df_9"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>NaN</td>\n",
        "      <td> 1.132211</td>\n",
        "      <td> 1.323088</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>NaN</td>\n",
        "      <td> 0.725987</td>\n",
        "      <td> 0.792650</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>NaN</td>\n",
        "      <td> 1.078033</td>\n",
        "      <td> 1.309223</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 70,
       "text": [
        "    a         b         c   d\n",
        "0 NaN  1.132211  1.323088 NaN\n",
        "1 NaN  0.725987  0.792650 NaN\n",
        "2 NaN  1.078033  1.309223 NaN"
       ]
      }
     ],
     "prompt_number": 70
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Set a fill value instead of NaN for indices that do not overlap:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_10 = df_8.add(df_9, fill_value=0)\n",
      "df_10"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.548814</td>\n",
        "      <td> 1.132211</td>\n",
        "      <td> 1.323088</td>\n",
        "      <td> 0.000114</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.544883</td>\n",
        "      <td> 0.725987</td>\n",
        "      <td> 0.792650</td>\n",
        "      <td> 0.092339</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.437587</td>\n",
        "      <td> 1.078033</td>\n",
        "      <td> 1.309223</td>\n",
        "      <td> 0.396767</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 71,
       "text": [
        "          a         b         c         d\n",
        "0  0.548814  1.132211  1.323088  0.000114\n",
        "1  0.544883  0.725987  0.792650  0.092339\n",
        "2  0.437587  1.078033  1.309223  0.396767"
       ]
      }
     ],
     "prompt_number": 71
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Like NumPy, pandas supports arithmetic operations between DataFrames and Series.\n",
      "\n",
      "Match the index of the Series on the DataFrame's columns, broadcasting down the rows:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_8 = df_10.ix[0]\n",
      "df_11 = df_10 - ser_8\n",
      "df_11"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>-0.003930</td>\n",
        "      <td>-0.406224</td>\n",
        "      <td>-0.530438</td>\n",
        "      <td> 0.092224</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>-0.111226</td>\n",
        "      <td>-0.054178</td>\n",
        "      <td>-0.013864</td>\n",
        "      <td> 0.396653</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 72,
       "text": [
        "          a         b         c         d\n",
        "0  0.000000  0.000000  0.000000  0.000000\n",
        "1 -0.003930 -0.406224 -0.530438  0.092224\n",
        "2 -0.111226 -0.054178 -0.013864  0.396653"
       ]
      }
     ],
     "prompt_number": 72
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Match the index of the Series on the DataFrame's columns, broadcasting down the rows and union the indices that do not match:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_9 = Series(range(3), index=['a', 'd', 'e'])\n",
      "ser_9"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 73,
       "text": [
        "a    0\n",
        "d    1\n",
        "e    2\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 73
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_11 - ser_9"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "      <th>e</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.000000</td>\n",
        "      <td>NaN</td>\n",
        "      <td>NaN</td>\n",
        "      <td>-1.000000</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>-0.003930</td>\n",
        "      <td>NaN</td>\n",
        "      <td>NaN</td>\n",
        "      <td>-0.907776</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>-0.111226</td>\n",
        "      <td>NaN</td>\n",
        "      <td>NaN</td>\n",
        "      <td>-0.603347</td>\n",
        "      <td>NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 74,
       "text": [
        "          a   b   c         d   e\n",
        "0  0.000000 NaN NaN -1.000000 NaN\n",
        "1 -0.003930 NaN NaN -0.907776 NaN\n",
        "2 -0.111226 NaN NaN -0.603347 NaN"
       ]
      }
     ],
     "prompt_number": 74
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Broadcast over the columns and match the rows (axis=0) by using an arithmetic method:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_10"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.548814</td>\n",
        "      <td> 1.132211</td>\n",
        "      <td> 1.323088</td>\n",
        "      <td> 0.000114</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.544883</td>\n",
        "      <td> 0.725987</td>\n",
        "      <td> 0.792650</td>\n",
        "      <td> 0.092339</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.437587</td>\n",
        "      <td> 1.078033</td>\n",
        "      <td> 1.309223</td>\n",
        "      <td> 0.396767</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 75,
       "text": [
        "          a         b         c         d\n",
        "0  0.548814  1.132211  1.323088  0.000114\n",
        "1  0.544883  0.725987  0.792650  0.092339\n",
        "2  0.437587  1.078033  1.309223  0.396767"
       ]
      }
     ],
     "prompt_number": 75
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_10 = Series([100, 200, 300])\n",
      "ser_10"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 76,
       "text": [
        "0    100\n",
        "1    200\n",
        "2    300\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 76
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_10.sub(ser_10, axis=0)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> -99.451186</td>\n",
        "      <td> -98.867789</td>\n",
        "      <td> -98.676912</td>\n",
        "      <td> -99.999886</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>-199.455117</td>\n",
        "      <td>-199.274013</td>\n",
        "      <td>-199.207350</td>\n",
        "      <td>-199.907661</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>-299.562413</td>\n",
        "      <td>-298.921967</td>\n",
        "      <td>-298.690777</td>\n",
        "      <td>-299.603233</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 77,
       "text": [
        "            a           b           c           d\n",
        "0  -99.451186  -98.867789  -98.676912  -99.999886\n",
        "1 -199.455117 -199.274013 -199.207350 -199.907661\n",
        "2 -299.562413 -298.921967 -298.690777 -299.603233"
       ]
      }
     ],
     "prompt_number": 77
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Function Application and Mapping"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "NumPy ufuncs (element-wise array methods) operate on pandas objects:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_11 = np.abs(df_11)\n",
      "df_11"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.003930</td>\n",
        "      <td> 0.406224</td>\n",
        "      <td> 0.530438</td>\n",
        "      <td> 0.092224</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.111226</td>\n",
        "      <td> 0.054178</td>\n",
        "      <td> 0.013864</td>\n",
        "      <td> 0.396653</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 78,
       "text": [
        "          a         b         c         d\n",
        "0  0.000000  0.000000  0.000000  0.000000\n",
        "1  0.003930  0.406224  0.530438  0.092224\n",
        "2  0.111226  0.054178  0.013864  0.396653"
       ]
      }
     ],
     "prompt_number": 78
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply a function on 1D arrays to each column:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "func_1 = lambda x: x.max() - x.min()\n",
      "df_11.apply(func_1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 79,
       "text": [
        "a    0.111226\n",
        "b    0.406224\n",
        "c    0.530438\n",
        "d    0.396653\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 79
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply a function on 1D arrays to each row:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_11.apply(func_1, axis=1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 80,
       "text": [
        "0    0.000000\n",
        "1    0.526508\n",
        "2    0.382789\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 80
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply a function and return a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "func_2 = lambda x: Series([x.min(), x.max()], index=['min', 'max'])\n",
      "df_11.apply(func_2)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>min</th>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "      <td> 0.000000</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>max</th>\n",
        "      <td> 0.111226</td>\n",
        "      <td> 0.406224</td>\n",
        "      <td> 0.530438</td>\n",
        "      <td> 0.396653</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 81,
       "text": [
        "            a         b         c         d\n",
        "min  0.000000  0.000000  0.000000  0.000000\n",
        "max  0.111226  0.406224  0.530438  0.396653"
       ]
      }
     ],
     "prompt_number": 81
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply an element-wise Python function to a DataFrame:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "func_3 = lambda x: '%.2f' %x\n",
      "df_11.applymap(func_3)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>c</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 0.00</td>\n",
        "      <td> 0.00</td>\n",
        "      <td> 0.00</td>\n",
        "      <td> 0.00</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 0.00</td>\n",
        "      <td> 0.41</td>\n",
        "      <td> 0.53</td>\n",
        "      <td> 0.09</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 0.11</td>\n",
        "      <td> 0.05</td>\n",
        "      <td> 0.01</td>\n",
        "      <td> 0.40</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 82,
       "text": [
        "      a     b     c     d\n",
        "0  0.00  0.00  0.00  0.00\n",
        "1  0.00  0.41  0.53  0.09\n",
        "2  0.11  0.05  0.01  0.40"
       ]
      }
     ],
     "prompt_number": 82
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Apply an element-wise Python function to a Series:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_11['a'].map(func_3)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 83,
       "text": [
        "0    0.00\n",
        "1    0.00\n",
        "2    0.11\n",
        "Name: a, dtype: object"
       ]
      }
     ],
     "prompt_number": 83
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Sorting and Ranking"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 84,
       "text": [
        "fo    100\n",
        "br    200\n",
        "bz    300\n",
        "qx    NaN\n",
        "Name: foobarbazqux, dtype: float64"
       ]
      }
     ],
     "prompt_number": 84
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sort a Series by its index:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.sort_index()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 85,
       "text": [
        "br    200\n",
        "bz    300\n",
        "fo    100\n",
        "qx    NaN\n",
        "Name: foobarbazqux, dtype: float64"
       ]
      }
     ],
     "prompt_number": 85
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sort a Series by its values:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_4.order()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 86,
       "text": [
        "fo    100\n",
        "br    200\n",
        "bz    300\n",
        "qx    NaN\n",
        "Name: foobarbazqux, dtype: float64"
       ]
      }
     ],
     "prompt_number": 86
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_12 = DataFrame(np.arange(12).reshape((3, 4)),\n",
      "                  index=['three', 'one', 'two'],\n",
      "                  columns=['c', 'a', 'b', 'd'])\n",
      "df_12"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>c</th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>three</th>\n",
        "      <td> 0</td>\n",
        "      <td> 1</td>\n",
        "      <td>  2</td>\n",
        "      <td>  3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>one</th>\n",
        "      <td> 4</td>\n",
        "      <td> 5</td>\n",
        "      <td>  6</td>\n",
        "      <td>  7</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>two</th>\n",
        "      <td> 8</td>\n",
        "      <td> 9</td>\n",
        "      <td> 10</td>\n",
        "      <td> 11</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 87,
       "text": [
        "       c  a   b   d\n",
        "three  0  1   2   3\n",
        "one    4  5   6   7\n",
        "two    8  9  10  11"
       ]
      }
     ],
     "prompt_number": 87
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sort a DataFrame by its index:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_12.sort_index()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>c</th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>one</th>\n",
        "      <td> 4</td>\n",
        "      <td> 5</td>\n",
        "      <td>  6</td>\n",
        "      <td>  7</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>three</th>\n",
        "      <td> 0</td>\n",
        "      <td> 1</td>\n",
        "      <td>  2</td>\n",
        "      <td>  3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>two</th>\n",
        "      <td> 8</td>\n",
        "      <td> 9</td>\n",
        "      <td> 10</td>\n",
        "      <td> 11</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 88,
       "text": [
        "       c  a   b   d\n",
        "one    4  5   6   7\n",
        "three  0  1   2   3\n",
        "two    8  9  10  11"
       ]
      }
     ],
     "prompt_number": 88
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sort a DataFrame by columns in descending order:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_12.sort_index(axis=1, ascending=False)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>d</th>\n",
        "      <th>c</th>\n",
        "      <th>b</th>\n",
        "      <th>a</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>three</th>\n",
        "      <td>  3</td>\n",
        "      <td> 0</td>\n",
        "      <td>  2</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>one</th>\n",
        "      <td>  7</td>\n",
        "      <td> 4</td>\n",
        "      <td>  6</td>\n",
        "      <td> 5</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>two</th>\n",
        "      <td> 11</td>\n",
        "      <td> 8</td>\n",
        "      <td> 10</td>\n",
        "      <td> 9</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 89,
       "text": [
        "        d  c   b  a\n",
        "three   3  0   2  1\n",
        "one     7  4   6  5\n",
        "two    11  8  10  9"
       ]
      }
     ],
     "prompt_number": 89
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sort a DataFrame's values by column:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_12.sort_index(by=['d', 'c'])"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>c</th>\n",
        "      <th>a</th>\n",
        "      <th>b</th>\n",
        "      <th>d</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>three</th>\n",
        "      <td> 0</td>\n",
        "      <td> 1</td>\n",
        "      <td>  2</td>\n",
        "      <td>  3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>one</th>\n",
        "      <td> 4</td>\n",
        "      <td> 5</td>\n",
        "      <td>  6</td>\n",
        "      <td>  7</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>two</th>\n",
        "      <td> 8</td>\n",
        "      <td> 9</td>\n",
        "      <td> 10</td>\n",
        "      <td> 11</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 90,
       "text": [
        "       c  a   b   d\n",
        "three  0  1   2   3\n",
        "one    4  5   6   7\n",
        "two    8  9  10  11"
       ]
      }
     ],
     "prompt_number": 90
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Ranking is similar to numpy.argsort except that ties are broken by assigning each group the mean rank:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_11 = Series([7, -5, 7, 4, 2, 0, 4, 7])\n",
      "ser_11 = ser_11.order()\n",
      "ser_11"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 91,
       "text": [
        "1   -5\n",
        "5    0\n",
        "4    2\n",
        "3    4\n",
        "6    4\n",
        "0    7\n",
        "2    7\n",
        "7    7\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 91
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_11.rank()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 92,
       "text": [
        "1    1.0\n",
        "5    2.0\n",
        "4    3.0\n",
        "3    4.5\n",
        "6    4.5\n",
        "0    7.0\n",
        "2    7.0\n",
        "7    7.0\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 92
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Rank a Series according to when they appear in the data:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_11.rank(method='first')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 93,
       "text": [
        "1    1\n",
        "5    2\n",
        "4    3\n",
        "3    4\n",
        "6    5\n",
        "0    6\n",
        "2    7\n",
        "7    8\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 93
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Rank a Series in descending order, using the maximum rank for the group:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_11.rank(ascending=False, method='max')"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 94,
       "text": [
        "1    8\n",
        "5    7\n",
        "4    6\n",
        "3    5\n",
        "6    5\n",
        "0    3\n",
        "2    3\n",
        "7    3\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 94
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "DataFrames can rank over rows or columns."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_13 = DataFrame({'foo' : [7, -5, 7, 4, 2, 0, 4, 7],\n",
      "                   'bar' : [-5, 4, 2, 0, 4, 7, 7, 8],\n",
      "                   'baz' : [-1, 2, 3, 0, 5, 9, 9, 5]})\n",
      "df_13"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>bar</th>\n",
        "      <th>baz</th>\n",
        "      <th>foo</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>-5</td>\n",
        "      <td>-1</td>\n",
        "      <td> 7</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 4</td>\n",
        "      <td> 2</td>\n",
        "      <td>-5</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 2</td>\n",
        "      <td> 3</td>\n",
        "      <td> 7</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 0</td>\n",
        "      <td> 0</td>\n",
        "      <td> 4</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 4</td>\n",
        "      <td> 5</td>\n",
        "      <td> 2</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> 7</td>\n",
        "      <td> 9</td>\n",
        "      <td> 0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> 7</td>\n",
        "      <td> 9</td>\n",
        "      <td> 4</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>7</th>\n",
        "      <td> 8</td>\n",
        "      <td> 5</td>\n",
        "      <td> 7</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 95,
       "text": [
        "   bar  baz  foo\n",
        "0   -5   -1    7\n",
        "1    4    2   -5\n",
        "2    2    3    7\n",
        "3    0    0    4\n",
        "4    4    5    2\n",
        "5    7    9    0\n",
        "6    7    9    4\n",
        "7    8    5    7"
       ]
      }
     ],
     "prompt_number": 95
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Rank a DataFrame over rows:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_13.rank()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>bar</th>\n",
        "      <th>baz</th>\n",
        "      <th>foo</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 1.0</td>\n",
        "      <td> 1.0</td>\n",
        "      <td> 7.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 4.5</td>\n",
        "      <td> 3.0</td>\n",
        "      <td> 1.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 3.0</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 7.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 2.0</td>\n",
        "      <td> 2.0</td>\n",
        "      <td> 4.5</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 4.5</td>\n",
        "      <td> 5.5</td>\n",
        "      <td> 3.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> 6.5</td>\n",
        "      <td> 7.5</td>\n",
        "      <td> 2.0</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> 6.5</td>\n",
        "      <td> 7.5</td>\n",
        "      <td> 4.5</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>7</th>\n",
        "      <td> 8.0</td>\n",
        "      <td> 5.5</td>\n",
        "      <td> 7.0</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 96,
       "text": [
        "   bar  baz  foo\n",
        "0  1.0  1.0  7.0\n",
        "1  4.5  3.0  1.0\n",
        "2  3.0  4.0  7.0\n",
        "3  2.0  2.0  4.5\n",
        "4  4.5  5.5  3.0\n",
        "5  6.5  7.5  2.0\n",
        "6  6.5  7.5  4.5\n",
        "7  8.0  5.5  7.0"
       ]
      }
     ],
     "prompt_number": 96
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Rank a DataFrame over columns:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_13.rank(axis=1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>bar</th>\n",
        "      <th>baz</th>\n",
        "      <th>foo</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td> 1.0</td>\n",
        "      <td> 2.0</td>\n",
        "      <td> 3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td> 3.0</td>\n",
        "      <td> 2.0</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td> 1.0</td>\n",
        "      <td> 2.0</td>\n",
        "      <td> 3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td> 1.5</td>\n",
        "      <td> 1.5</td>\n",
        "      <td> 3</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td> 2.0</td>\n",
        "      <td> 3.0</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> 2.0</td>\n",
        "      <td> 3.0</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> 2.0</td>\n",
        "      <td> 3.0</td>\n",
        "      <td> 1</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>7</th>\n",
        "      <td> 3.0</td>\n",
        "      <td> 1.0</td>\n",
        "      <td> 2</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 97,
       "text": [
        "   bar  baz  foo\n",
        "0  1.0  2.0    3\n",
        "1  3.0  2.0    1\n",
        "2  1.0  2.0    3\n",
        "3  1.5  1.5    3\n",
        "4  2.0  3.0    1\n",
        "5  2.0  3.0    1\n",
        "6  2.0  3.0    1\n",
        "7  3.0  1.0    2"
       ]
      }
     ],
     "prompt_number": 97
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Axis Indexes with Duplicate Values"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Labels do not have to be unique in Pandas:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_12 = Series(range(5), index=['foo', 'foo', 'bar', 'bar', 'baz'])\n",
      "ser_12"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 98,
       "text": [
        "foo    0\n",
        "foo    1\n",
        "bar    2\n",
        "bar    3\n",
        "baz    4\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 98
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_12.index.is_unique"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 99,
       "text": [
        "False"
       ]
      }
     ],
     "prompt_number": 99
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select Series elements:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "ser_12['foo']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 100,
       "text": [
        "foo    0\n",
        "foo    1\n",
        "dtype: int64"
       ]
      }
     ],
     "prompt_number": 100
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Select DataFrame elements:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_14 = DataFrame(np.random.randn(5, 4),\n",
      "                  index=['foo', 'foo', 'bar', 'bar', 'baz'])\n",
      "df_14"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>0</th>\n",
        "      <th>1</th>\n",
        "      <th>2</th>\n",
        "      <th>3</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>foo</th>\n",
        "      <td>-2.363469</td>\n",
        "      <td> 1.135345</td>\n",
        "      <td>-1.017014</td>\n",
        "      <td> 0.637362</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>foo</th>\n",
        "      <td>-0.859907</td>\n",
        "      <td> 1.772608</td>\n",
        "      <td>-1.110363</td>\n",
        "      <td> 0.181214</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>bar</th>\n",
        "      <td> 0.564345</td>\n",
        "      <td>-0.566510</td>\n",
        "      <td> 0.729976</td>\n",
        "      <td> 0.372994</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>bar</th>\n",
        "      <td> 0.533811</td>\n",
        "      <td>-0.091973</td>\n",
        "      <td> 1.913820</td>\n",
        "      <td> 0.330797</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>baz</th>\n",
        "      <td> 1.141943</td>\n",
        "      <td>-1.129595</td>\n",
        "      <td>-0.850052</td>\n",
        "      <td> 0.960820</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 101,
       "text": [
        "            0         1         2         3\n",
        "foo -2.363469  1.135345 -1.017014  0.637362\n",
        "foo -0.859907  1.772608 -1.110363  0.181214\n",
        "bar  0.564345 -0.566510  0.729976  0.372994\n",
        "bar  0.533811 -0.091973  1.913820  0.330797\n",
        "baz  1.141943 -1.129595 -0.850052  0.960820"
       ]
      }
     ],
     "prompt_number": 101
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_14.ix['bar']"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>0</th>\n",
        "      <th>1</th>\n",
        "      <th>2</th>\n",
        "      <th>3</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>bar</th>\n",
        "      <td> 0.564345</td>\n",
        "      <td>-0.566510</td>\n",
        "      <td> 0.729976</td>\n",
        "      <td> 0.372994</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>bar</th>\n",
        "      <td> 0.533811</td>\n",
        "      <td>-0.091973</td>\n",
        "      <td> 1.913820</td>\n",
        "      <td> 0.330797</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 102,
       "text": [
        "            0         1         2         3\n",
        "bar  0.564345 -0.566510  0.729976  0.372994\n",
        "bar  0.533811 -0.091973  1.913820  0.330797"
       ]
      }
     ],
     "prompt_number": 102
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "## Summarizing and Computing Descriptive Statistics"
     ]
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Unlike NumPy arrays, Pandas descriptive statistics automatically exclude missing data.  NaN values are excluded unless the entire row or column is NA."
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "html": [
        "<div style=\"max-height:1000px;max-width:1500px;overflow:auto;\">\n",
        "<table border=\"1\" class=\"dataframe\">\n",
        "  <thead>\n",
        "    <tr style=\"text-align: right;\">\n",
        "      <th></th>\n",
        "      <th>state</th>\n",
        "      <th>pop</th>\n",
        "      <th>unempl</th>\n",
        "      <th>year</th>\n",
        "    </tr>\n",
        "  </thead>\n",
        "  <tbody>\n",
        "    <tr>\n",
        "      <th>0</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.0</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2012</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>1</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.1</td>\n",
        "      <td> NaN</td>\n",
        "      <td> 2013</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>2</th>\n",
        "      <td>  VA</td>\n",
        "      <td> 5.2</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>3</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.0</td>\n",
        "      <td> 6.0</td>\n",
        "      <td> 2014</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>4</th>\n",
        "      <td>  MD</td>\n",
        "      <td> 4.1</td>\n",
        "      <td> 6.1</td>\n",
        "      <td> 2015</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>5</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "    <tr>\n",
        "      <th>6</th>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td> NaN</td>\n",
        "      <td>  NaN</td>\n",
        "    </tr>\n",
        "  </tbody>\n",
        "</table>\n",
        "</div>"
       ],
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 103,
       "text": [
        "  state  pop  unempl  year\n",
        "0    VA  5.0     NaN  2012\n",
        "1    VA  5.1     NaN  2013\n",
        "2    VA  5.2     6.0  2014\n",
        "3    MD  4.0     6.0  2014\n",
        "4    MD  4.1     6.1  2015\n",
        "5   NaN  NaN     NaN   NaN\n",
        "6   NaN  NaN     NaN   NaN"
       ]
      }
     ],
     "prompt_number": 103
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.sum()"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 104,
       "text": [
        "pop          23.4\n",
        "unempl       18.1\n",
        "year      10068.0\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 104
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Sum over the rows:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.sum(axis=1)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 105,
       "text": [
        "0    2017.0\n",
        "1    2018.1\n",
        "2    2025.2\n",
        "3    2024.0\n",
        "4    2025.2\n",
        "5       NaN\n",
        "6       NaN\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 105
    },
    {
     "cell_type": "markdown",
     "metadata": {},
     "source": [
      "Account for NaNs:"
     ]
    },
    {
     "cell_type": "code",
     "collapsed": false,
     "input": [
      "df_6.sum(axis=1, skipna=False)"
     ],
     "language": "python",
     "metadata": {},
     "outputs": [
      {
       "metadata": {},
       "output_type": "pyout",
       "prompt_number": 106,
       "text": [
        "0       NaN\n",
        "1       NaN\n",
        "2    2025.2\n",
        "3    2024.0\n",
        "4    2025.2\n",
        "5       NaN\n",
        "6       NaN\n",
        "dtype: float64"
       ]
      }
     ],
     "prompt_number": 106
    }
   ],
   "metadata": {}
  }
 ]
}