data-science-ipython-notebooks/scipy/sampling.ipynb

1114 lines
100 KiB
Plaintext
Raw Normal View History

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Random Sampling\n",
"==========================\n",
"\n",
"Credits: Forked from [CompStats](https://github.com/AllenDowney/CompStats) by Allen Downey. License: [Creative Commons Attribution 4.0 International](http://creativecommons.org/licenses/by/4.0/)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from __future__ import print_function, division\n",
"\n",
"import numpy\n",
"import scipy.stats\n",
"\n",
"import matplotlib.pyplot as pyplot\n",
"\n",
"from IPython.html.widgets import interact, fixed\n",
"from IPython.html import widgets\n",
"\n",
"# seed the random number generator so we all get the same results\n",
"numpy.random.seed(18)\n",
"\n",
"# some nicer colors from http://colorbrewer2.org/\n",
"COLOR1 = '#7fc97f'\n",
"COLOR2 = '#beaed4'\n",
"COLOR3 = '#fdc086'\n",
"COLOR4 = '#ffff99'\n",
"COLOR5 = '#386cb0'\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Part One\n",
"========\n",
"\n",
"Suppose we want to estimate the average weight of men and women in the U.S.\n",
"\n",
"And we want to quantify the uncertainty of the estimate.\n",
"\n",
"One approach is to simulate many experiments and see how much the results vary from one experiment to the next.\n",
"\n",
"I'll start with the unrealistic assumption that we know the actual distribution of weights in the population. Then I'll show how to solve the problem without that assumption.\n",
"\n",
"Based on data from the [BRFSS](http://www.cdc.gov/brfss/), I found that the distribution of weight in kg for women in the U.S. is well modeled by a lognormal distribution with the following parameters:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(72.697645732966876, 16.944043048498038)"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"weight = scipy.stats.lognorm(0.23, 0, 70.8)\n",
"weight.mean(), weight.std()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's what that distribution looks like:"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZQAAAEPCAYAAABlZDIgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl4XPV97/H3bNp3S7I2G8nCeMGAF2xMIFhsLSUNkCYN\npaFsaUOeXtLetPeWpPe5T5x773ObtE/alKYlpE0TB3ohEFJqmrAFUELA2MY2XrBlS94tWbJs7Ysl\njWbuH+doNGdGM9YyozPL55XMo/P7zTkzX2N5vvNbzu8HIiIiIiIiIiIiIiIiIiIiIiIiIiIiInNy\nB9AENAOPRzjnCfP5vcAasy4L2A58CBwE/iro/BLgDeAI8DpQFPOoRUQkobiAFqAW8GAkhxUh59wJ\n/Nw8vg54P+i5HPOn26y/wSz/NfAX5vHjwDdiGbSIiCSe64FXg8pfMR/BvgvcG1RuAhaGnJMD7ARW\nTnFOhVkWERGbOeP42tXA6aDyGbPuUufUmMcujFZNB/A2RtcXGMmkwzzuIDwBiYiIDeKZUPzTPM8R\n4bpxYDVGgrkJaIjwHtN9HxERiSN3HF+7FVgUVF6E0QKJdk6NWResF/gZsA5oxGiVVADtQCVwbqo3\nr6+v9x89enSWoYuIpKWjwOWzvTieLZQPgKUYg/IZGGMlW0PO2Qo8YB5vBHowEkYpk7O3soHbMbq/\nJq550Dx+EHhpqjc/evQofr8/KR9f+9rXbI9B8dsfh+JPzkcyxw/Uz+VDP54tFC/wGPAaxnjI94FD\nwKPm809hzPC6E2M22CDwsPlcJbAFI+E5gaeBN83nvgE8D3weOAF8No5/BhERmaZ4JhSAV8xHsKdC\nyo9Ncd1+YG2E1+wCbptjXCIiEmPx7PKSWWpoaLA7hDlR/PZS/PZK9vjnInSGVSrxm32CIiIyDQ6H\nA+aQF9RCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGR\nmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBC\nERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmIh3QrkDaAKagccj\nnPOE+fxeYI1Ztwh4G/gIOAD8SdD5m4EzwB7zcUesgxYRkZlzx/G1XcB3gNuAVmAnsBU4FHTOncDl\nwFLgOuBJYCMwBnwZ+BDIA3YBr2MkJz/wt+ZDUkzPSA+9o71kujLJdmeT487B4/TYHZaITEM8E8oG\noAU4YZafA+7GmlDuAraYx9uBImAh0G4+AAbMa6oxEgqAI15By/zz+rwc7DrIrs5dnOw/Gfb8orxF\nbKraRH1hPQ6H/upFElU8E0o1cDqofAajFXKpc2qAjqC6WoyusO1BdV8CHgA+AP4c6IlJxDLvPjj3\nAW+deYsh71DEc04PnOaZI8+wKG8Rt9TcQl1B3TxGKCLTFc8xFP80zwv9yhl8XR7wE+BPMVoqYHSL\n1QGrgbPAt+YQo9jE5/fx2qnX+M8T/xk1mQQ7PXCaLU1beOXkK/j8vjhHKCIzFc8WSivG4PqERRgt\nkGjn1Jh1AB7gReAZ4KWgc84FHf8L8HKkADZv3hw4bmhooKGhYVqBS3x5fV7+4/h/sP/C/imfL88u\nx4+fYe8wA2MDYc9v79jO4Ngg9yy5B7cznr/CIqmtsbGRxsbGmL1ePDuk3cBh4FagDdgB3Ef4oPxj\n5s+NwLfNnw6MsZULGIPzwSoxWiaYz60Hfn+K9/f7/dNtJMl8GR0f5bnm5zjWd8xS73F6WFe2jrXl\naynPLg/Utw228faZt2nubQ57rfrCeu69/F4yXBlxj1skHZhjlLPOC/Ee4fwtjCThAr4P/BXwqPnc\nU+bP72BM/R0EHgZ2AzcCvwL2MdkF9lXgVeBHGN1dfuC4+XrBYy4TlFAS0H8c/w/2dO6x1OV6cvnc\nFZ+jKrcq4nUn+0/yQssLYS2WRXmLeGD5A5oJJhIDiZ5Q7KSEkmA+6vqIF1pesNSVZJZw/7L7Kckq\nueT13SPdPN30NF0jXZb61aWrubvubs0AE5mjuSYU3Skv86J3pJeXj1uHuxZkLeDzKz8/rWQCUJxZ\nzCMrH6Eip8JS/+H5D9nVuStmsYrI7CihSNz5/D5+euynXBy/GKhzOpx8pv4z5HpyZ/RaeZ48Hlr+\nEAuyFljqXzn5CmcGQud8iMh8UkKRuHv37LthNyzeVnMblbmVs3q9LHcW9y691zJuMu4f5/mW56ec\nFSYi80MJReKqf7SfX7b90lJXX1DPxoqNc3rd8uxy7q6721LXN9rHz078bE6vKyKzp4QicfXLtl/i\n9XkD5Rx3DvcsuQenY+6/eqsWrOL6iustdYe6D3G4+/CcX1tEZk4JReKm62IXuzt3W+purr6Z/Iz8\nmL3HbYtuC5tu/LOTP2NkfCRm7yEi06OEInHzduvbliVSijOLWVO2JsoVM+dyuPhk7SdxBM107Bvt\n4+0zb8f0fUTk0pRQJC7ah9rDlla5ufrmuCyVUplbGdb1tb1jO22DbTF/LxGJTAlF4uLN029ayguz\nF7Jqwaq4vV9DdQNFGUWBsh8/Lx9/WYtIiswjJRSJuTMDZ8LW3rql5paYDMRHkuHK4BO1n7DUnR06\ny4ELB+L2niJipYQiMbe9Y7ulvChvEVcUXRH3911atJSVJSstdW+decsyy0xE4kcJRWJqYGyAj7o+\nstTdVHXTvK2zdVvNbZaWUM9oDzvP7ZyX9xZJd0ooElO7zu2yjFuUZJZQX1g/b+9fklXCteXXWup+\n1fYrLnovRrhCRGJFCUViZtw3zgfnPrDUrV+4Pq5jJ1O5qeomMpyTe6QMe4d5t/3deY1BJB0poUjM\nNPU00T/WHyh7nB5Wl66e9zjyPHl8rPJjlrpt7dvoG+2b91hE0okSisTMjo4dlvI1pdeQ7c62JZbr\nK663rGTs9Xl57+x7tsQiki6UUCQm2ofaw1YU3lC+waZoINOVyaaqTZa6Dzo/0GrEInGkhCIxETp2\nUptfS3lOeYSz58fasrXkeybXDfP6vLzf/r6NEYmkNiUUmTOvzxt2A+GGhfa1Tia4nW5uqLzBUrej\nYwdD3iGbIhJJbUooMmfNPc2W3Rhz3DksK1pmY0ST1patJdc9OZYy6htle/v2KFeIyGwpocic7buw\nz1JeVbIKl9NlUzRWGa4Mrq8MXzgyOAGKSGwoocicDHuHOdJzxFJ3VelVNkUztfXl68lyZQXKF8cv\nsrNDd8+LxJoSiszJoe5DjPvHA+XizGJqcmtsjChcpiszbMvh9zve1xpfIjGmhCJzsu+8tbvr6gVX\nz9u6XTNx3cLrLHfPD44Nhu3XIiJzo4Qis9Y72suJ/hOWuqsWJFZ314RsdzZry9Za6ra1b8Pv99sU\nkUjqUUKRWQudKlyVW0VpdqlN0VzadRXXWbYKPjd8jqO9R22MSCS1KKHIrE3V3ZXIijOLw/ZLea9d\ny7GIxIoSiszK+eHzdAx3BMoOHFxZcqWNEU1P6N7zx/qO0T7UblM0IqlFCUVm5VD3IUu5rqCO/Iz8\nCGcnjpq8GhbnLbbUbWvfZlM0Iqkl3gnlDqAJaAYej3DOE+bze4E1Zt0i4G3gI+AA8CdB55cAbwBH\ngNeBophHLZfU1N1kKS8vXm5TJDMXeqPj/gv76R/tj3C2iExXPBOKC/gORlJZCdwHrAg5507gcmAp\n8AXgSbN+DPgycCWwEfgvwMQn1lcwEsoVwJtmWeZR32gfrYOtlrpkSijLipZRklkSKPv8vrDFLUVk\n5uKZUDYALcAJjATxHHB3yDl3AVvM4+0YrY2FQDvwoVk/ABwCqqe4ZgtwT+xDl2gOdx+2lKtzqynI\nKLApmplzOpxct/A6S92uzl260VFkjuKZUKqB00HlM0wmhWjnhN5mXYvRFTaxot9CYGI0uMMsyzxK\n5u6uCdeUXWO50XFgbIC
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d9ead90>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"xs = numpy.linspace(20, 160, 100)\n",
"ys = weight.pdf(xs)\n",
"pyplot.plot(xs, ys, linewidth=4, color=COLOR1)\n",
"pyplot.xlabel('weight (kg)')\n",
"pyplot.ylabel('PDF')\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`make_sample` draws a random sample from this distribution. The result is a NumPy array."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def make_sample(n=100):\n",
" sample = weight.rvs(n)\n",
" return sample"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an example with `n=100`. The mean and std of the sample are close to the mean and std of the population, but not exact."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(76.308293640077437, 19.995558735561865)"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sample = make_sample(n=100)\n",
"sample.mean(), sample.std()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We want to estimate the average weight in the population, so the \"sample statistic\" we'll use is the mean:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def sample_stat(sample):\n",
" return sample.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"One iteration of \"the experiment\" is to collect a sample of 100 women and compute their average weight.\n",
"\n",
"We can simulate running this experiment many times, and collect a list of sample statistics. The result is a NumPy array."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def compute_sample_statistics(n=100, iters=1000):\n",
" stats = [sample_stat(make_sample(n)) for i in range(iters)]\n",
" return numpy.array(stats)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The next line runs the simulation 1000 times and puts the results in\n",
"`sample_means`:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"sample_means = compute_sample_statistics(n=100, iters=1000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at the distribution of the sample means. This distribution shows how much the results vary from one experiment to the next.\n",
"\n",
"Remember that this distribution is not the same as the distribution of weight in the population. This is the distribution of results across repeated imaginary experiments."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEPCAYAAACp/QjLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAE+BJREFUeJzt3X2UXHV9x/H35AHLU7JJxRCSaBBJS5RKFGMqWqaUpuHY\nmtAWkBYbq8djT+Th1NpqtEc2teVgFUtthPZUwSCaNoqmwYcW5DCn2BqjyENCiCaRKBtDgprdhWpt\nCNs/fr9hb+Y3u5lN5t67M3m/zpkzd37zcL+/fbif+zS/C5IkSZIkSZIkSZIkSZIkSVLXmQPcCzwC\nbAGuju29QB/wQLxdlHnPSmA7sA1YXFShkqTinAqcE6dPAr4DnAVcC7yzyevnAw8Ck4G5wA5gQu5V\nSpISeS58nyAs7AGeBh4FZsXHlSavXwqsBQ4AuwjhsDDH+iRJIyhqzXwusADYGB9fBTwEfALoiW2n\nEXY31fUxHCaSpAIVEQ4nAZ8DriFsQdwMnE7Y5bQHuGGU9w7lXp0kKTEp58+fDNwB3A6sj237Ms9/\nHLgzTu8mHMSumx3bDnHGGWcM7dy5s/2VSlJ32wm8pNUX57nlUCHsNtoK3Jhpn5mZvhjYHKc3AG8E\njiNsWZwJbGr80J07dzI0NNS1t2uvvbb0Guyb/bN/3XcDzhjLAjzPLYfzgCuAhwmnrAK8F7icsEtp\nCHgMeHt8biuwLt4/A6zA3UqSVIo8w+FrNN8y+coo77ku3qSuNbVnGoMD/YXMa8rUHgb69xcyL3WX\nvI85aIyq1WrZJeSmm/sGrfdvcKCfRVeuP/wL22Dj6mVt+yx/f8eWZt83GO+G4v4zqSNVKpVCw8H/\nF0H4u2MMy3y/gSxJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJ\nShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgO\nkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqTEpLILkMaDqT3TGBzoL7sMadwwHCRgcKCfRVeu\nL2ReG1cvK2Q+0tFwt5IkKWE4SJISeYbDHOBe4BFgC3B1bJ8O3A18F7gL6Mm8ZyWwHdgGLM6xNknS\nKPIMhwPAnwIvBRYB7wDOAt5DCId5wD3xMcB84LJ4vwS4Kef6JEkjyHPh+wTwYJx+GngUmAW8AVgT\n29cA9aNzS4G1hFDZBewAFuZYnyRpBEWtmc8FFgDfAGYAe2P73vgY4DSgL/OePkKYSJIKVsSprCcB\ndwDXAE81PDcUbyNp+lxvb+9z09VqlWq1elQFSlK3qdVq1Gq1I35/3uEwmRAMnwLqJ5HvBU4l7Haa\nCeyL7bsJB7HrZse2RDYcJEmpxhXnVatWjen9ee5WqgCfALYCN2baNwDL4/RyhkNjA/BG4DjgdOBM\nYFOO9UmSRpDnlsN5wBXAw8ADsW0lcD2wDngr4cDzpfG5rbF9K/AMsILRdzlJknKSZzh8jZG3TC4c\nof26eJMklcjvEUiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiS\nEoaDJClhOEiSEkVcCU5SWSoTqVQqhc1uytQeBvr3FzY/5cdwkLrZ0EEWXbn+8K9rk42rlxU2L+XL\n3UqSpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6S\npIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElK5B0OtwB7gc2Ztl6gD3gg3i7KPLcS2A5s\nAxbnXJskaQR5h8OtwJKGtiHgI8CCePtKbJ8PXBbvlwA3FVCfJKmJvBe+9wH7m7RXmrQtBdYCB4Bd\nwA5gYW6VSZJGVNaa+VXAQ8AngJ7Ydhphd1NdHzCr4LokSZQTDjcDpwPnAHuAG0Z57VAhFUmSDjGp\nhHnuy0x/HLgzTu8G5mSemx3bEr29vc9NV6tVqtVqWwuUpE5Xq9Wo1WpH/P4ywmEmYYsB4GKGz2Ta\nAHyGcLB6FnAmsKnZB2TDQZKUalxxXrVq1Zjen3c4rAXOB54PPA5cC1QJu5SGgMeAt8fXbgXWxftn\ngBW4W0mSSpF3OFzepO2WUV5/XbxJkkrk9wgkSQnDQZKUMBwkSQnDQZKUMBwkSQnDQZKUMBwkSYlW\nwuGeFtskSV1itC/BHQ+cAJwCTM+0T8HRUiWpq40WDm8HriEMpX1/pv0pYHWeRUmSyjVaONwYb1cD\nHy2mHEnSeNDK2EofBV4DzG14/W15FCRJKl8r4XA78GLgQeBgpt1wkKQu1Uo4vBKYj8NnS9Ixo5VT\nWbcQLtAjSTpGtLLlcArhAjybgJ/HtiHgDXkVJUkqVyvh0Jt3EZKk8aWVcKjlXYQkaXxpJRyeZvhg\n9HHA5Ng2Ja+ipKk90xgc6C+7DOmY1Uo4nJSZnkA41rAon3KkYHCgn0VXri9sfhtXLytsXlInGOuo\nrM8C64ElOdQiSRonWtly+L3M9ATC9x5+lk85kqTxoJVw+B2Gjzk8A+wCluZVkCSpfK2Ew5vzLkKS\nNL60csxhDvAF4Ml4uwOYnWdRkqRytRIOtwIbCNd1OA24M7ZJkrpUK+FwCiEMDsTbJ4EX5FiTJKlk\nrYTDj4E3ARMJxyiuAH6UZ1GSpHK1Eg5/DFwKPAHsAS6JbZKkLtXK2Up/BfwRsD8+ng58GHhLXkVJ\nksrVypbDyxkOBoCfAK/IpxxJ0njQSjhUCFsLddMJxx8kSV2qld1KNwBfB9YRguIS4G/yLEqSVK5W\nwuE24H7gAsIwGhcTrgwnSepSrYQDwCPxJkk6Box1yG5J0jEg73C4BdgLbM60TQfuBr4L3AX0ZJ5b\nCWwHtgGLc65NkjSCvMPhVtILA72HEA7zgHviY4D5wGXxfglwUwH1SZKayHvhex+HfkcCwmVG18Tp\nNUD9+oxLgbWE8Zt2ATuAhTnXJ0lqoow18xmEXU3E+xlx+jSgL/O6PmBWgXVJkqKyd9sMMXyVuZGe\nlyQVrNVTWdtpL3AqYSC/mcC+2L6bcGGhutmxLdHb2/vcdLVapVqt5lCmJHWuWq1GrVY74veXEQ4b\ngOXAB+P9+kz7Z4CPEHYnnQlsavYB2XCQJKUaV5xXrVo1pvfnHQ5rgfOB5wOPA+8HricMxfFWwoHn\nS+Nrt8b2rcAzwArcrSRJpcg7HC4fof3CEdqvizdJnagykUqlUsispkztYaC/8WRItUsZu5Ukdauh\ngyy6cv3hX9cGG1cvO/yLdMTKPltJkjQOGQ6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpITh\nIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElK\nGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpMSksgtQ55jaM43Bgf6yy5BUAMNBLRsc6GfR\nlesLmdfG1csKmY+k5tytJElKGA6SpIThIElKGA6SpIThIElKlHm20i5gEDgIHAAWAtOBfwVeFJ+/\nFPDcSUkqWJlbDkNAFVhACAaA9wB3A/OAe+JjSVLByt6tVGl4/AZgTZxeA3iyuySVoOwth68C3wLe\nFttmAHvj9N74WJJUsDKPOZwH7AFOIexK2tbw/FC8JXp7e5+brlarVKvVXAqUpE5Vq9Wo1WpH/P4y\nw2FPvH8S+ALhuMNe4FTgCWAmsK/ZG7PhIElKNa44r1q1akzvL2u30gnAyXH6RGAxsBnYACyP7cuB\nYgbykSQdoqwthxmErYV6DZ8G7iIcf1gHvJXhU1klSQUrKxweA85p0v4T4MKCa5EkNSj7VFZJ0jhk\nOEiSEoaDJCnhleAkdabKRCqVxkEW8jFlag8D/fsLmdd4YThI6kxDB71sbY7crSRJShgOkqSE4SBJ\nShgOkqSE4SBJShgOkqS
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d915710>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"pyplot.hist(sample_means, color=COLOR5)\n",
"pyplot.xlabel('sample mean (n=100)')\n",
"pyplot.ylabel('count')\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The mean of the sample means is close to the actual population mean, which is nice, but not actually the important part."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"72.652052080657413"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sample_means.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The standard deviation of the sample means quantifies the variability from one experiment to the next, and reflects the precision of the estimate.\n",
"\n",
"This quantity is called the \"standard error\"."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"1.6355262477017491"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"std_err = sample_means.std()\n",
"std_err"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can also use the distribution of sample means to compute a \"90% confidence interval\", which contains 90% of the experimental results:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([ 69.92149384, 75.40866638])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"conf_int = numpy.percentile(sample_means, [5, 95])\n",
"conf_int"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following function takes an array of sample statistics and prints the SE and CI:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def summarize_sampling_distribution(sample_stats):\n",
" print('SE', sample_stats.std())\n",
" print('90% CI', numpy.percentile(sample_stats, [5, 95]))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And here's what that looks like:"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.6355262477\n",
"90% CI [ 69.92149384 75.40866638]\n"
]
}
],
"source": [
"summarize_sampling_distribution(sample_means)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'd like to see what happens as we vary the sample size, `n`. The following function takes `n`, runs 1000 simulated experiments, and summarizes the results."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def plot_sample_stats(n, xlim=None):\n",
" sample_stats = compute_sample_statistics(n, iters=1000)\n",
" summarize_sampling_distribution(sample_stats)\n",
" pyplot.hist(sample_stats, color=COLOR2)\n",
" pyplot.xlabel('sample statistic')\n",
" pyplot.xlim(xlim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's a test run with `n=100`:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.71202891175\n",
"90% CI [ 69.96057332 75.58582662]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEdZJREFUeJzt3X+Q1PV9x/HneqiNknh3MQFEIhY1IzGtWEPt6IwLQylO\nO6Jm/NWakqlNneCI0zRNpJ2Gu0nGmrYxpG10Wn+VRiXBqBTamPHndjQmYioIiCggRKEKUe5MnGkS\nxOsfn896X5e9u9293dvdzz0fM9/Z7372u9/vm2O/r/vc5/vd7xckSZIkSZIkSZIkSZIkSZLGpWnA\nY8BzwGZgSWzvAXYD6+N0XuY9S4FtwFZg/lgVKkmqzmTg9Dg/EXgBOBVYBnyuzPIzgQ3A4cB0YDtw\nWMOrlCQdYqTwfY0Q2ABvAc8DU+PzXJnlFwIrgQPALkLAzx51lZKkqlXTu54OzAJ+FJ9fAzwL3AZ0\nxrbjCEM3RbsZ/IUgSRpDlQb8ROC7wLWEnvzNwImE4ZtXga8N896B0RQoSarNhAqWORy4F7gTWB3b\n9mVevxVYG+f3EA7MFh0f295jxowZAzt27Ki6WEka53YAJ1W68Eg9+BxhCGYLsDzTPiUzfyGwKc6v\nAS4DjiD08E8G1h1S4Y4dDAwMtO20bNmyptcwHmu3/uZP1t/cCZhRabjDyD34s4ErgI2E0yEB/gq4\nnDA8MwDsBK6Kr20BVsXHt4HFOEQjSU0xUsA/Qfle/gPDvOf6OEmSmshz1GuQz+ebXULN2rl2sP5m\ns/72Uu5c9rEwEMeTJEkVyuVyUEVu24OXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgNa50dXWT\ny+UaPnV1dTf7nyp5HrzGl1wux6P3bW74duZedBp+xlVvngcvSQIMeElKlgEvSYky4CUpUQa8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJmtDsAqSurm76+/uaXYaUHANeTdff3zcmFwCDcBEw\nabxwiEaSEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9J\niTLgJSlRIwX8NOAx4DlgM7AktncDDwEvAg8CnZn3LAW2AVuB+fUsVpJUuZEC/gDw58DHgLOAq4FT\ngesIAX8K8Eh8DjATuDQ+LgBuqmAbkqQGGCl8XwM2xPm3gOeBqcD5wIrYvgK4IM4vBFYSfjHsArYD\ns+tXriSpUtX0rqcDs4CngEnA3ti+Nz4HOA7YnXnPbsIvBEnSGKv0jk4TgXuBa4Gfl7w2EKehlH2t\np6fn3fl8Pk8+n6+wFEkaHwqFAoVCoeb3VxLwhxPC/VvA6ti2F5hMGMKZAuyL7XsIB2aLjo9th8gG\nvCTpUKWd397e3qreP9IQTQ64DdgCLM+0rwEWxflFDAb/GuAy4AjgROBkYF1VFUmS6mKkHvzZwBXA\nRmB9bFsK3ACsAq4kHEy9JL62JbZvAd4GFjP88I0kqUFGCvgnGLqXP2+I9uvjJElqIs9Rl6REGfCS\nlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiZrQ7ALUurq6uunv72t2GW2po6OD\nXC7X8O10dnbR17e/4dtRezLgNaT+/j4evW9zw7cz96LTGr6NsXbw4EF/dmo6h2gkKVEGvCQlyoCX\npEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REVRLwtwN7gU2Zth5gN7A+Tudl\nXlsKbAO2AvPrUqUkqWqVBPwdwIKStgHgRmBWnB6I7TOBS+PjAuCmCrchSaqzSsL3caDcNWPLXQt1\nIbASOADsArYDs2stTpJUu9H0rq8BngVuAzpj23GEoZui3cDUUWxDklSjWgP+ZuBE4HTgVeBrwyw7\nUOM2JEmjUOsNP/Zl5m8F1sb5PcC0zGvHx7ZD9PT0vDufz+fJ5/M1liJJaSoUChQKhZrfX2vATyH0\n3AEuZPAMmzXA3YQDsFOBk4F15VaQDXhJ0qFKO7+9vb1Vvb+SgF8JnAscC7wCLAPyhOGZAWAncFVc\ndguwKj6+DSzGIRpJaopKAv7yMm23D7P89XGSJDWR56hLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnw\nkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVGVBPztwF5gU6atG3gIeBF4EOjMvLYU2AZsBebXp0xJ\nUrUqCfg7gAUlbdcRAv4U4JH4HGAmcGl8XADcVOE2JEl1Vkn4Pg70lbSdD6yI8yuAC+L8QmAlcADY\nBWwHZo+6SklS1WrtXU8iDNsQHyfF+eOA3ZnldgNTa9yGJGkU6jF8MhCn4V6XJI2xCTW+by8wGXgN\nmALsi+17gGmZ5Y6PbYfo6el5dz6fz5PP52ssRZLSVCgUKBQKNb+/1oBfAywCvhofV2fa7wZuJAzN\nnAysK7eCbMBLkg5V2vnt7e2t6v2VBPxK4FzgWOAV4EvADcAq4ErCwdRL4rJbYvsW4G1gMQ7RSFJT\nVBLwlw/RPm+I9uvjJElqIs9Rl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVG13tFJUgvo6Oggl8s1fDudnV309e1v+HZUXwa8\n1MYOHjzIo/dtbvh25l50WsO3ofpziEaSEmXAS1KiDHhJSpRj8G2oq6ub/v6+ZpchqcUZ8G2ov7/P\nA2uSRuQQjSQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElK1GgvF7wL+BlwEDgAzAa6ge8AJ8TXLwH6R7kdSVKVRtuDHwDywCxCuANcBzwE\nnAI8Ep9LksZYPYZociXPzwdWxPkVwAV12IYkqUr16ME/DPwY+ExsmwTsjfN743NJ0hgb7Rj82cCr\nwIcIwzJbS14fiNMhenp63p3P5/Pk8/lRliJJaSkUChQKhZrfP9qAfzU+/hS4nzAOvxeYDLwGTAH2\nlXtjNuAlSYcq7fz29vZW9f7RDNEcBbw/zh8NzAc2AWuARbF9EbB6FNuQJNVoND34SYRee3E9dwEP\nEsbjVwFXMniapCRpjI0m4HcCp5dp3w/MG8V6JUl14DdZJSlRBrwkJcqAl6REGfCSlCgDXpISZcBL\nUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJGu09WSWNAx0d\nHeRyuYZvp7Ozi76+/Q3fznhhwEsa0cGDB3n0vs0N387ci05r+DbGE4doJClRBrwkJcqAl6REGfCS\nlCgDXpISZcBLUqIMeElKlOfB11FXVzf9/X3NLkOSAAO+rvr7+/wyiKSW4RCNJCXKgJekRBnwkpQo\nA16SEmXAS1KiDHhJSpQBL0mJSv48+DfffJM9e/Y0uwxJGnONCvgFwHKgA7gV+GqDtjOiz352MY88\n/AgTj35/Q7fzxv7XG7p+SapWIwK+A/hnYB6wB3gaWAM834BtjeiXv/gVV33q88w557y6rXPD5nWc\nftrs97Tdetdy7r731rpto1HK1d5OrL+52r3+QqFAPp9vdhljphFj8LOB7cAu4ADwbWBhA7bTNBs2\nP93sEmrWzrWD9Tdbo+sv3ty7UdOcOXPI5XJ0dXU39N/RKhrRg58KvJJ5vhv47QZsR1JiGn1z73/7\n9jf59GVXj5vrOTUi4Ac
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d9eaf90>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_sample_stats(100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use `interact` to run `plot_sample_stats` with different values of `n`. Note: `xlim` sets the limits of the x-axis so the figure doesn't get rescaled as we vary `n`."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.71314776815\n",
"90% CI [ 69.99274896 75.65943185]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEaNJREFUeJzt3X+QXeVdx/H3ZRMoSSjZFZqEkHYxQCVEJVVjHWC4zcRM\nUIdAmPJD0WCxg9IhTKtTiDM2uxYpdQTRKoyWH5OWEpu2IQaddgg/LqXWQisJSQiBJJIxSWGhklhw\n1IZl/eN5Lnu4ubt7f+y99+yT92vmzj3nOefc++XZ5bNPnnPuuSBJkiRJkiRJkiRJkiRJknRUmgM8\nDjwHbAdWxvY+YD+wOT4uzByzCtgF7ASWtKtQSVJ9ZgLnxOVpwAvAWcBq4FNV9p8HbAEmA73AbuCY\nllcpSTrCWOH7CiGwAd4Engdmx/VClf2XAWuBw8BeQsAvbLpKSVLd6hld9wILgO/F9euBZ4F7gOmx\n7RTC1E3Zfob/IEiS2qjWgJ8GfB24gTCSvws4jTB98zJw2yjHDjVToCSpMZNq2Gcy8A3gfmBDbHs1\ns/1u4KG4fIBwYrbs1Nj2LnPnzh3as2dP3cVK0lFuD3B6rTuPNYIvEKZgdgB3ZNpnZZYvAbbF5Y3A\nFcCxhBH+GcDTR1S4Zw9DQ0O5f6xevbrjNaRS50So0TqtM+8PYG6t4Q5jj+DPBa4CthIuhwT4Y+BK\nwvTMEPAScG3ctgNYF5/fAq7DKRpJ6oixAv47VB/lf3OUY26JD0lSB3mN+iiKxWKnS6jJRKhzItQI\n1jnerLOzql3L3g5DcT5JklSjQqEAdeS2I3hJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJl\nwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAV+51d/dQKBRGfXR393S6TCl3vJukcq9QKPDY+u2j7rNo\n+Xz8nVLqvJukJAkw4CUpWQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBryR0dXX5aVepwqRO\nFyCNh8HBwZo+7SodTRzBS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUrUWAE/B3gceA7YDqyM7T3AJuBF4GFgeuaYVcAuYCewZDyLlSTVbqyAPwx8Ejgb+DDwCeAs4CZC\nwJ8JPBrXAeYBl8fnpcCdNbyHJKkFxgrfV4AtcflN4HlgNnARsCa2rwEujsvLgLWEPwx7gd3AwvEr\nV5JUq3pG173AAuApYAYwENsH4jrAKcD+zDH7CX8QJEltVuvtgqcB3wBuAN6o2DYUHyOpuq2vr++d\n5WKxSLFYrLEUSTo6lEolSqVSw8fXEvCTCeH+ZWBDbBsAZhKmcGYBr8b2A4QTs2WnxrYjZANeknSk\nysFvf39/XcePNUVTAO4BdgB3ZNo3Aivi8gqGg38jcAVwLHAacAbwdF0VSZLGxVgj+HOBq4CtwObY\ntgq4FVgHXEM4mXpZ3LYjtu8A3gKuY/TpG0lSi4wV8N9h5FH+4hHab4kPSVIHeY26JCXKgJekRBnw\nkpQoA16SEmXAS1KiDHhJSpQBr47r7u6hUCiM+JDUmFrvRSO1zKFDB3ls/fYRty9aPr+N1UjpcAQv\nSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKU\nKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky\n4CUpUQa8JCWqloC/FxgAtmXa+oD9wOb4uDCzbRWwC9gJLBmXKiVJdasl4O8Dlla0DQG3Awvi45ux\nfR5weXxeCtxZ43tIksZZLeH7JHCwSnuhStsyYC1wGNgL7AYWNlqcJKlxzYyurweeBe4Bpse2UwhT\nN2X7gdlNvIckqUGNBvxdwGnAOcDLwG2j7DvU4HtIkpowqcHjXs0s3w08FJcPAHMy206NbUfo6+t7\nZ7lYLFIsFhssRZLSVCqVKJVKDR/faMDPIozcAS5h+AqbjcADhBOws4EzgKervUA24CVJR6oc/Pb3\n99d1fC0Bvxa4ADgJ2AesBoqE6Zkh4CXg2rjvDmBdfH4LuA6naCSpI2oJ+CurtN07yv63xIeUK11d\nXRQK1S7+GjZ9ejcHD77epoqk1mp0ikaacAYHB3ls/fZR91m0fH6bqpFazw8hSVKiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAl\nKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9Jiaol4O8FBoBtmbYeYBPwIvAwMD2zbRWw\nC9gJLBmfMiVJ9aol4O8Dlla03UQI+DOBR+M6wDzg8vi8FLizxveQJI2zWsL3SeBgRdtFwJq4vAa4\nOC4vA9YCh4G9wG5gYdNVSpLq1ujoegZh2ob4PCMunwLsz+y3H5jd4HtIkpowHtMnQ/Ex2nZJUptN\navC4AWAm8AowC3g1th8A5mT2OzW2HaGvr++d5WKxSLFYbLAUSUpTqVSiVCo1fHyjAb8RWAF8Pj5v\nyLQ/ANxOmJo5A3i62gtkA16SdKTKwW9/f39dx9cS8GuBC4CTgH3AZ4BbgXXANYSTqZfFfXfE9h3A\nW8B1OEUjSR1RS8BfOUL74hHab4kPSVIHeY26JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8Gqp7u4eCoXCqA9JrdHo3SSlmhw6dJDH1m8fdZ9Fy+e3qRrp6OIIXpIS\nZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL2V0dXWNemO0\n7u6eTpco1cybjUkZg4ODo94czRujaSJxBC9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIM\neElKlAEvSYky4CUpUQa8JCWq2XvR7AV+DAwCh4GFQA/wVeADcftlwKEm30eSVKdmR/BDQBFYQAh3\ngJuATcCZwKNxXZLUZuMxRVOoWL8IWBOX1wAXj8N7SJLqNB4j+EeAHwAfj20zgIG4PBDXJUlt1uwc\n/LnAy8DJhGmZnRXbh+LjCH19fe8sF4tFisVik6VIUlpKpRKlUqnh45sN+Jfj82vAg4R5+AFgJvAK\nMAt4tdqB2YCXJB2pcvDb399f1/HNTNFMAU6Iy1OBJcA2YCOwIravADY08R6SpAY1M4KfQRi1l1/n\nK8DDhPn4dcA1DF8mKUlqs2YC/iXgnCrtrwOLm3hdSdI48JOskpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXqpDV1cXhUJh1Ed3d0+ny5SA5r6yTzrqDA4O8tj67aPus2j5/DZVI43OEbwkJcqAV8O6u3vGnK6Q\n1DlO0ahhhw4ddLpCyjFH8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCg/\nyaqqrr76d9n70t5OlyGpCQa8qtqwYQOfvHY1U4+fVnX7vh/u5Ylvl9pblKS6GPAa0TlnL+S9J5xY\nddu0qSe0uZqJo3zP+NFMn97NwYOvt6kiHa0MeGmcec945UWrTrIuBXYCu4AbW/QekqRRtCLgu4C/\nIYT8POBK4KwWvE/LlUqlTpdQk4lQ55btT3e6hKRMhJ85WGentSLgFwK7gb3AYeAfgGUteJ+Wmyg/\n9IlQ55bt3+90CUmZCD9zsM5Oa0XAzwb2Zdb3xzZJUhu14iTrUAteU23WdcwxfO4LNzJ50uSq29/4\n7zfaXJGkerXiSzM/DPQ
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d804810>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def sample_stat(sample):\n",
" return sample.mean()\n",
"\n",
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
"interact(plot_sample_stats, n=slider, xlim=fixed([55, 95]))\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This framework works with any other quantity we want to estimate. By changing `sample_stat`, you can compute the SE and CI for any sample statistic.\n",
"\n",
"As an exercise, fill in `sample_stat` below with any of these statistics:\n",
"\n",
"* Standard deviation of the sample.\n",
"* Coefficient of variation, which is the sample standard deviation divided by the sample standard mean.\n",
"* Min or Max\n",
"* Median (which is the 50th percentile)\n",
"* 10th or 90th percentile.\n",
"* Interquartile range (IQR), which is the difference between the 75th and 25th percentiles.\n",
"\n",
"NumPy array methods you might find useful include `std`, `min`, `max`, and `percentile`.\n",
"Depending on the results, you might want to adjust `xlim`."
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.67195986148\n",
"90% CI [ 69.82954731 75.32184298]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEfxJREFUeJzt3XuMnNV9h/FnWGMSoLF3ResLdmIXTIWhDZCG0pKKwTIO\nqBEmVNxSWjelERIo0PQCGClht4kIiQJNb6CWAHIhuHGAuEZRKAYzFblwSWIuxjjYDlaxi9cE7ARK\nW/Cy/eOc2RmPd3bHM7M7y/6ej/Rq3vfM+75z9sB85/i8N5AkSZIkSZIkSZIkSZIkSVKbvAd4HHgK\n2Ah8MZf3AGuBF4AHgelV2ywHNgObgCXjVlNJUksOza9TgMeAjwBfBq7K5VcDN+T5haQfhoOBecAW\n4KDxqqgkqXWHAk8Cx5F67TNy+cy8DKlXf3XVNg8Ap4xXBSVJw2uk130QqbfeDzwCPEcK+v78fj+V\n4J8NbK/adjtwZFtqKklq2pQG1nkHOAGYBvw7cHrN+4N5qmek9yRJ46CRsC/7OfBt4EOk3vxMYCcw\nC9iV19kBzK3aZk4u28dRRx01uHXr1mbqK0mRbQWObmbD0YZxjqByps17gTOA9cAaYFkuXwaszvNr\ngAuBqcB8YAHwxH613bqVwcFBp8FBrrvuuo7XYaJMtoVtYVuMPAFHNRP0MHrPfhawgvSjcBBwJ/Bw\nDvxVwCXANuD8vP7GXL4R2AtchsM4ktRxo4X9s8BJw5S/Biyus831eZIkTRCeA99hxWKx01WYMGyL\nCtuiwrZoj0KHPncwjz9JkhpUKBSgydy2Zy9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS+pId3dPRQKhX2m7u6eTldLDfKul5IaUigUWHffhn3KFp17\nPH6Xx493vZQkjciwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCw\nl6QARgv7ucAjwHPABuCKXN4LbAfW5+msqm2WA5uBTcCSNtZVktSkKaO8/zbwGeAp4HDgR8BaYBC4\nKU/VFgIX5NcjgYeAY4B32ldlSRNFV1dX+U6MAEyf3s3u3a91sEaqZ7Sw35kngDeA50khDsPfZnMp\nsJL0I7EN2AKcDDzWakUlTTwDAwP73PZ40bnHd7A2GsmBjNnPA06kEtyfBp4GbgOm57LZpOGdsu1U\nfhwkSR3SaNgfDtwDXEnq4d8CzAdOAF4GbhxhW59sIEkdNtowDsDBwL3AXcDqXLar6v2vAffn+R2k\ng7plc3LZfnp7e4fmi8UixWKxkfpKUhilUolSqdSWfY32eKsCsAJ4lXSgtmwWqUdPLv8w8AnSgdm7\nSeP05QO0R7N/797HEkrvMvUeS1g7Zu93e+y08ljC0Xr2pwIXA8+QTrEEuBa4iDSEMwi8CFya39sI\nrMqve4HLcBhHkjputLD/LsOP639nhG2uz5MkaYLwClpJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCmC0sJ8LPAI8B2wA\nrsjlPcBa4AXgQWB61TbLgc3AJmBJOysrSWrOaGH/NvAZ4DjgFOBy4FjgGlLYHwM8nJcBFgIX5Ncz\ngZsb+AxJ0hgbLYh3Ak/l+TeA54EjgbOBFbl8BXBOnl8KrCT9SGwDtgAnt6+6kqRmHEivex5wIvA4\nMAPoz+X9eRlgNrC9apvtpB8HSVIHTWlwvcOBe4Ergddr3hvMUz3Dvtfb2zs0XywWKRaLDVZFkmIo\nlUqUSqW27KuRsD+YFPR3AqtzWT8wkzTMMwvYlct3kA7qls3JZfupDntJ0v5qO8J9fX1N72u0YZwC\ncBuwEfhqVfkaYFmeX0blR2ANcCEwFZgPLACeaLp2kqS2GK1nfypwMfAMsD6XLQduAFYBl5AOxJ6f\n39uYyzcCe4HLGHmIR5I0DkYL++9Sv/e/uE759XmSJE0QngMvSQEY9pIUgGEvSQEY9pIUgGEvSQEY\n9pIUgGEvqW26urooFApDU3d3T6erpKzRe+NI0qgGBgZYd9+GoeVF5x7fwdqomj17SQrAsJekAAx7\nSQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrA\nsJekAAx7SQrAsJekAAx7SQrAsJekABoJ+9uBfuDZqrJeYDuwPk9nVb23HNgMbAKWtKWWkqSWNBL2\ndwBn1pQNAjcBJ+bpO7l8IXBBfj0TuLnBz5AkjaFGgvhRYPcw5YVhypYCK4G3gW3AFuDkZisnSWqP\nVnrdnwaeBm4Dpuey2aThnbLtwJEtfIYkqQ2mNLndLcBf5/nPAzcCl9RZd3C4wt7e3qH5YrFIsVhs\nsiqSNDmVSiVKpVJb9tVs2O+qmv8acH+e3wHMrXpvTi7bT3XYS5L2V9sR7uvra3pfzQ7jzKqa/ziV\nM3XWABcCU4H5wALgiaZrJ0lqi0Z69iuB04AjgJeA64AicAJpiOZF4NK87kZgVX7dC1xGnWEcSdL4\naSTsLxqm7PYR1r8+T5KkCcJz4CUpAMNe0rC6u3soFApDk97dmj0bR9Ikt2fPbtbdt2FoedG5x3ew\nNmqVPXtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCqCRsL8d6AeerSrrAdYCLwAPAtOr3lsObAY2AUvaU01JUisaCfs7gDNryq4hhf0xwMN5GWAhcEF+\nPRO4ucHPkCSNoUaC+FFgd03Z2cCKPL8COCfPLwVWAm8D24AtwMkt11KS1JJme90zSEM75NcZeX42\nsL1qve3AkU1+hiSpTaa0YR+DeRrp/f309vYOzReLRYrFYhuqIkmTR6lUolQqtWVfzYZ9PzAT2AnM\nAnbl8h3A3Kr15uSy/VSHvSRpf7Ud4b6+vqb31ewwzhpgWZ5fBqyuKr8QmArMBxYATzRdO0lSWzTS\ns18JnAYcAbwEfA64AVgFXEI6EHt+XndjLt8I7AUuY+QhHknSOGgk7C+qU764Tvn1eZIkTRCeAy9J\nARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS9pzHR1dVEoFIam7u6eTlcprEaeQStJTRkYGGDdfRuGlhed\ne3wHaxObPXtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QAWr2oahvwC2AAeBs4GegB\nvgF8IL9/PrCnxc+RJLWg1Z79IFAETiQFPcA1wFrgGODhvCxJ6qB2DOMUapbPBlbk+RXAOW34DElS\nC9rRs38I+CHwqVw2A+jP8/15WZLUQa2O2Z8KvAz8MmnoZlPN+4N5kiR1UKth/3J+fQX4Fmncvh+Y\nCewEZgG7htuwt7d3aL5YLFIsFlusiqRWdHf3sGfP7k5XQ1VKpRKlUqkt+2ol7A8FuoDXgcOAJUAf\nsAZYBnwpv64ebuPqsJfUeXv27PZ2xBNMbUe4r6+v6X21EvYzSL358n6+DjxIGr9fBVxC5dRLSVIH\ntRL2LwInDFP+GrC4hf1KktrMK2glKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDD\nXpICMOwlKQDDXpICmNL
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d775990>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def sample_stat(sample):\n",
" # TODO: replace the following line with another sample statistic\n",
" return sample.mean()\n",
"\n",
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
"interact(plot_sample_stats, n=slider, xlim=fixed([0, 100]))\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Part Two\n",
"========\n",
"\n",
"So far we have shown that if we know the actual distribution of the population, we can compute the sampling distribution for any sample statistic, and from that we can compute SE and CI.\n",
"\n",
"But in real life we don't know the actual distribution of the population. If we did, we wouldn't need to estimate it!\n",
"\n",
"In real life, we use the sample to build a model of the population distribution, then use the model to generate the sampling distribution. A simple and popular way to do that is \"resampling,\" which means we use the sample itself as a model of the population distribution and draw samples from it.\n",
"\n",
"Before we go on, I want to collect some of the code from Part One and organize it as a class. This class represents a framework for computing sampling distributions."
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class Resampler(object):\n",
" \"\"\"Represents a framework for computing sampling distributions.\"\"\"\n",
" \n",
" def __init__(self, sample, xlim=None):\n",
" \"\"\"Stores the actual sample.\"\"\"\n",
" self.sample = sample\n",
" self.n = len(sample)\n",
" self.xlim = xlim\n",
" \n",
" def resample(self):\n",
" \"\"\"Generates a new sample by choosing from the original\n",
" sample with replacement.\n",
" \"\"\"\n",
" new_sample = numpy.random.choice(self.sample, self.n, replace=True)\n",
" return new_sample\n",
" \n",
" def sample_stat(self, sample):\n",
" \"\"\"Computes a sample statistic using the original sample or a\n",
" simulated sample.\n",
" \"\"\"\n",
" return sample.mean()\n",
" \n",
" def compute_sample_statistics(self, iters=1000):\n",
" \"\"\"Simulates many experiments and collects the resulting sample\n",
" statistics.\n",
" \"\"\"\n",
" stats = [self.sample_stat(self.resample()) for i in range(iters)]\n",
" return numpy.array(stats)\n",
" \n",
" def plot_sample_stats(self):\n",
" \"\"\"Runs simulated experiments and summarizes the results.\n",
" \"\"\"\n",
" sample_stats = self.compute_sample_statistics()\n",
" summarize_sampling_distribution(sample_stats)\n",
" pyplot.hist(sample_stats, color=COLOR2)\n",
" pyplot.xlabel('sample statistic')\n",
" pyplot.xlim(self.xlim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The following function instantiates a `Resampler` and runs it."
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def plot_resampled_stats(n=100):\n",
" sample = weight.rvs(n)\n",
" resampler = Resampler(sample, xlim=[55, 95])\n",
" resampler.plot_sample_stats()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's a test run with `n=100`"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.72606450921\n",
"90% CI [ 71.35648645 76.82647135]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEoZJREFUeJzt3X+QVeV9x/H3dREVbWS3Gn5JshQ1DdJWTEPtqOOVIQy2\nHVGc+CO1pRObITUjTtqOSmcad6MlJlNt+kunjT+G+IOGJEixmVjxx01NU38kgoBIBCpToLKYCIlm\n2gZw+8fzXPd49+7ee3f37j378H7NnLnnPuece78+u3724Tn3ngOSJEmSJEmSJEmSJEmSJElHpeOB\n54CNwFbgC7G9A1gPvAo8DkzMHLMc2A5sAxaMWqWSpIZNiI/jgGeB84EvATfG9puA2+P6LMIfg2OB\nTmAHcMxoFSpJGpoJwAvAWYTR+aTYPjk+hzB6vylzzGPAuaNVoCSpTz2j62MIo/Ie4GngZUK498Tt\nPfSF/VRgT+bYPcC0EalUktSQcXXs8w5wNnAy8K/ARRXbe+MykMG2SZKapJ6AL/sJ8C3gI4RR+2Rg\nHzAF2B/32QtMzxxzWmx7j5kzZ/bu3LlzKPVK0tFsJ3B6vTvXmqI5hb5PyJwAfAzYAKwDlsT2JcDa\nuL4OuAoYD8wAzgCe71fhzp309vbmfrnllltaXkMqdY6FGq3TOvO+ADPrDXeoPYKfAqwk/CE4BngA\neDKG/GrgWmAXcEXcf2ts3wocBq7DKRpJaolaAb8ZOKdK+5vA/AGOWREXSVIL+Rn1QRSLxVaXUJex\nUOdYqBGsc6RZZ2sVWvS+vXE+SZJUp0KhAA3ktiN4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA17Ja2/voFAo1Fza2ztaXao0orwe\nvJJXKBR4as2WmvvNWzwbfy+VZ14PXpIEGPCSlCwDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXK\ngJekRBnwkpSoWgE/HXgaeBnYAiyL7V3AHmBDXC7OHLMc2A5sAxaMYK2SpAaMq7H9EPBZYCNwEvAD\nYD3QC9wZl6xZwJXxcRrwBHAm8M7IlSxJqketEfw+QrgDvA28QghuqH7Bm0XAKsIfhl3ADmDusKuU\nJDWskTn4TmAO8Gx8fj3wEnAvMDG2TSVM3ZTtoe8PgiRpFNUb8CcB3wBuIIzk7wZmAGcDrwN3DHKs\n11+VpBaoNQcPcCzwTeBBYG1s25/Zfg/waFzfSzgxW3ZabOunq6vr3fVisUixWKynXkk6apRKJUql\n0pCPr3Xh+AKwEvgx4WRr2RTCyJ3Y/lHgE4STqw8T5t3LJ1lPp/8o3ht+aNR4ww+lotEbftQawZ8H\nXANsInwcEuDPgKsJ0zO9wGvA0rhtK7A6Ph4GrsMpGklqiVoB/12qz9N/e5BjVsRFktRCfpNVkhJl\nwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAGvMau9vYNCoVBz\nkY5W9VwPXsqlgwcP1H0ZYOlo5AhekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpSoWgE/HXgaeBnYAiyL7R3AeuBV4HFg\nYuaY5cB2YBuwYCSLlSTVr1bAHwI+C5wFnAt8BvgwcDMh4M8EnozPAWYBV8bHhcBddbyHJKkJaoXv\nPmBjXH8beAWYBlwCrIztK4FL4/oiYBXhD8MuYAcwd+TKlSTVq5HRdScwB3gOmAT0xPae+BxgKrAn\nc8wewh8ESdIoq/em2ycB3wRuAN6q2NYbl4FU3dbV1fXuerFYpFgs1lmK1BxtbW0UCoWa+02c2M6B\nA2+OQkU62pVKJUql0pCPryfgjyWE+wPA2tjWA0wmTOFMAfbH9r2EE7Nlp8W2frIBL+XBkSNHeGrN\nlpr7zVs8exSqkfoPfru7uxs6vtYUTQG4F9gKfDnTvg5YEteX0Bf864CrgPHADOAM4PmGKpIkjYha\nI/jzgGuATcCG2LYcuB1YDVxLOJl6Rdy2NbZvBQ4D1zH49I0kqUlqBfx3GXiUP3+A9hVxkSS1kJ9R\nl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJ\nSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6RE\nGfCSlCgDXpISZcBLUqLqCfj7gB5gc6atC9gDbIjLxZlty4HtwDZgwYhUKUlqWD0Bfz+wsKKtF7gT\nmBOXb8f2WcCV8XEhcFed7yFJGmH1hO8zwIEq7YUqbYuAVcAhYBewA5g71OIkSUM3nNH19cBLwL3A\nxNg2lTB1U7YHmDaM95AkDdG4IR53N/D5uH4rcAdw7QD79lZr7Orqene9WCxSLBaHWIokpalUKlEq\nlYZ8/FADfn9m/R7g0bi+F5ie2XZabOsnG/CSpP4qB7/d3d0NHT/UKZopmfXL6PuEzTrgKmA8MAM4\nA3h+iO8hSRqGekbwq4ALgVOA3cAtQBE4mzD98hqwNO67FVgdHw8D1zHAFI0kqbnqCfirq7TdN8j+\nK+IiSWohP6MuSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQl\nyoCXpEQZ8JKUKANeudPe3kGhUKi5SBrcUO/oJDXNwYMHeGrNlpr7zVs8exSqkcYuR/CSlCgDXpIS\nZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJqifg7wN6\ngM2Ztg5gPfAq8DgwMbNtObAd2AYsGJkyJUmNqifg7wcWVrTdTAj4M4En43OAWcCV8XEhcFed7yFJ\nGmH1hO8zwIGKtkuAlXF9JXBpXF8ErAIOAbuAHcDcYVcpSWrYUEfXkwjTNsTHSXF9KrAns98eYNoQ\n30OSNAwjcUen3rgMtr2frq6ud9eLxSLFYnEESpGkdJRKJUql0pCPH2rA9wCTgX3AFGB/bN8LTM/s\nd1ps6ycb8JKk/ioHv93d3Q0dP9QpmnXAkri+BFibab8KGA/MAM4Anh/ie0iShqGeEfwq4ELgFGA3\n8DngdmA1cC3hZOoVcd+tsX0rcBi4jsGnbyRJTVJPwF89QPv8AdpXxEWS1EJ+Rl2SEmXAS1KiDHhJ\nSpQBL0mJMuAlKVEGvNSgtrY2CoVCzaW9vaPVpeooNxKXKpCOKkeOHOGpNVtq7jdv8exRqEYamCN4\nSUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpESNG+bxu4CfAkeAQ8BcoAP4GvDB\nuP0K4OAw30eS1KDhjuB7gSIwhxDuADcD64EzgSfjc0nSKBuJKZpCxfNLgJVxfSVw6Qi8hySpQSMx\ngn8C+D7wqdg2CeiJ6z3xuSRplA13Dv484HXgVMK0zLaK7b1xkSSNsuEG/Ovx8Q3gEcI8fA8wGdgH\nTAH2Vzuwq6vr3fVisUixWBxmKZKUllKpRKlUGvLxwwn4CUAb8BZwIrAA6AbWAUuAL8bHtdUOzga8\nJKm/ysFvd3d3Q8cPJ+AnEUbt5dd5CHicMB+/GriWvo9JSpJG2XAC/jXg7CrtbwLzh/G6kqQR4DdZ\nJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAGvUdPe3kGhUKi5SBoZw70WjVS3gwcP8NSa\nLTX3m7d49ihUI6XPEbwkJcqAl6REGfCSlCgDXmqStra2uk4qt7d3tLpUJcqTrFKTHDlyxJPKailH\n8JKUKANekhJlwEtSogx
"text/plain": [
"<matplotlib.figure.Figure at 0x7f2974281c50>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plot_resampled_stats(100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can use `plot_resampled_stats` in an interaction:"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.67407589545\n",
"90% CI [ 69.60129748 75.13161693]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEqdJREFUeJzt3X2MXNV9h/FnWGPA0ODdQvyerGsgjXFbTBqXClAGy7FM\nW2EwCi8pravQiJYIo7QV4EoNu0AJiQpJ30BteJEDwY2TGNc0KsXGTEqS8pLgV4yD7WLVdmFNgp2A\n1DZgtn+cM97r8axnZndm5+7x85Gu5s659878cjz57uHMnXtBkiRJkiRJkiRJkiRJkiTpmHQi8Byw\nAdgKfD62dwFrgFeAJ4HxmWOWAtuBbcD8EatUktSwcfFxDPAscAHwReCm2H4zcFdcn0n4Y3A80A3s\nAI4bqUIlSUMzDngBOJswOp8Q2yfG5xBG7zdnjnkCOG+kCpQkDahndH0cYVTeBzwNvEQI9764vY+B\nsJ8M7MkcuweY0pRKJUkNGVPHPu8B5wCnAv8GXFSxvT8ugznaNklSi9QT8GU/Bb4NfIQwap8IvA5M\nAvbFffYC0zLHTI1th5kxY0b/zp07h1KvJB3LdgJn1LtzrSma0xg4Q+Yk4OPAemA1sDi2LwZWxfXV\nwFXAWGA6cCbw/BEV7txJf39/7pdbb7217TWkUudoqNE6rTPvCzCj3nCH2iP4ScAywh+C44CHgadi\nyK8ArgV2AVfE/bfG9q3Au8D1OEUjSW1RK+A3A+dWaX8TmDfIMXfGRZLURp6jfhTFYrHdJdRlNNQ5\nGmoE62w262yvQpvetz/OJ0mS6lQoFKCB3HYEL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwGtU\n6ezsolAo1Fw6O7vaXarUdp4Hr1GlUCiwbuWWmvvNXTQLP2NKjefBS5IAA16SkmXAS1KiDHhJSpQB\nL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCS\nlCgDXpISVSvgpwFPAy8BW4Alsb0H2AOsj8vFmWOWAtuBbcD8JtYqSWrAmBrb3wE+C2wATgF+CKwB\n+oF74pI1E7gyPk4B1gJnAe81r2RJUj1qjeBfJ4Q7wNvAy4Tghuo3fl0ILCf8YdgF7ADmDLtKSVLD\nGpmD7wZmA8/G5zcAG4EHgPGxbTJh6qZsDwN/ECRJI6jegD8F+CZwI2Ekfx8wHTgHeA24+yjH9g+n\nQEnS0NSagwc4HvgW8AiwKrbty2y/H3g8ru8lfDFbNjW2HaGnp+fQerFYpFgs1lOvJB0zSqUSpVJp\nyMdXm0ev3L4M+Anhy9aySYSRO7H9o8AnCV+uPkqYdy9/yXoGR47i+/v7HdircYVCgXUrt9Tcb+6i\nWfgZU2oKhQLUzu1Dao3gzweuATYRTocE+HPgasL0TD/wKnBd3LYVWBEf3wWuxyka1aGzs4sDB/a3\nuwwpKXX/JWgyR/A6TCMjc0fwOlY1OoL3l6ySlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYmqFfDTgKeBl4AtwJLY\n3gWsAV4BngTGZ45ZCmwHtgHzm1msJKl+tQL+HeCzwNnAecBngA8DtxAC/izgqfgcYCZwZXxcANxb\nx3tIklqgVvi+DmyI628DLwNTgEuAZbF9GXBpXF8ILCf8YdgF7ADmNK9cSVK9GhlddwOzgeeACUBf\nbO+LzwEmA3syx+wh/EGQJI2wMXXudwrwLeBG4K2Kbf1xGUzVbT09PYfWi8UixWKxzlIk6dhQKpUo\nlUpDPr6egD+eEO4PA6tiWx8wkTCFMwnYF9v3Er6YLZsa246QDXhJ0pEqB7+9vb0NHV9riqYAPABs\nBb6caV8NLI7rixkI/tXAVcBYYDpwJvB8QxVJkpqi1gj+fOAaYBOwPrYtBe4CVgDXEr5MvSJu2xrb\ntwLvAtdz9OkbSVKL1Ar47zL4KH/eIO13xkWS1Eaeoy5JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpIS\nZcBLUqIMeElKlAGvJHV0dFAoFGounZ1d7S5Vapl6ryYpjSoHDx5k3cotNfebu2jWCFQjtYcjeElK\nlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ\n8JKUKANekhJlwEtSogx4SUqUAS9Jiaon4B8E+oDNmbYeYA+wPi4XZ7YtBbYD24D5TalSktSwegL+\nIWBBRVs/cA8wOy7/GttnAlfGxwXAvXW+hySpyeoJ32eA/VXaC1XaFgLLgXeAXcAOYM5Qi5MkDd1w\nRtc3ABuBB4DxsW0yYeqmbA8wZRjvIUkaojFDPO4+4La4fjtwN3DtIPv2V2vs6ek5tF4sFikWi0Ms\nRZLSVCqVKJVKQz5+qAG/L7N+P/B4XN8LTMtsmxrbjpANeEnSkSoHv729vQ0dP9QpmkmZ9csYOMNm\nNXAVMBaYDpwJPD/E95AkDUM9I/jlwMeA04DdwK1AETiHMP3yKnBd3HcrsCI+vgtczyBTNJKk1qon\n4K+u0vbgUfa/My6SpDbyHHVJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAV0t1dnZRKBRqLpKab6g3/JDqcuDAftat3FJzv7mLZo1ANdKxxRG8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUpUPQH/INAHbM60dQFrgFeAJ4HxmW1Lge3ANmB+c8qUJDWqnoB/CFhQ0XYLIeDPAp6KzwFmAlfG\nxwXAvXW+hySpyeoJ32eA/RVtlwDL4voy4NK4vhBYDrwD7AJ2AHOGXaUkqWFDHV1PIEzbEB8nxPXJ\nwJ7MfnuAKUN8D0nSMDTjln39cTna9iP09PQcWi8WixSLxSaUIknpKJVKlEqlIR8/1IDvAyYCrwOT\ngH2xfS8wLbPf1Nh2hGzAS5KOVDn47e3tbej4oU7RrAYWx/XFwKpM+1XAWGA6cCbw/BDfQ5I0DPWM\n4JcDHwNOA3YDnwPuAlYA1xK+TL0i7rs1tm8F3gWu5+jTN5KkFqkn4K8epH3eIO13xkWS1Eaeoy5J\niTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwOqZ1\ndHRQKBTqWjo7u9pdrtSQZtzRSRq1Dh48yLqVW+rad+6iWS2uRmouR/CSlCgDXpISZcBLUqIMeElK\nlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVHDvVzwLuBnwEHg\nHWAO0AV8Hfhg3H4FcGCY7yNJatBwR/D9QBGYTQh3gFuANcBZwFPxuSRphDVjiqZQ8fwSYFlcXwZc\n2oT3kCQ1qBkj+LXAD4BPx7YJQF9c74vPlZjOzq66bnMnqX2GOwd/PvAacDphWmZbxfb+uCgxBw7s\nr+tWd97mTmqf4Qb8a/HxDeAxwjx8HzAReB2YBOyrdmBPT8+h9WKxSLFYHGYpkpSWUqlEqVQa8vHD\nCfhxQAfwFnAyMB/oBVYDi4EvxMdV1Q7OBrwk6UiVg9/e3t6Gjh9OwE8gjNrLr/M14EnCfPwK4FoG\nTpOUJI2w4QT8q8A5VdrfBOYN43UlSU3gL1klKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnw\nkpQoA16qU0dHB4VCoeb
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d796cd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
"interact(plot_resampled_stats, n=slider, xlim=fixed([1, 15]))\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Exercise: write a new class called `StdResampler` that inherits from `Resampler` and overrides `sample_stat` so it computes the standard deviation of the resampled data."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class StdResampler(Resampler): \n",
" \"\"\"Computes the sampling distribution of the standard deviation.\"\"\"\n",
" \n",
" def sample_stat(self, sample):\n",
" \"\"\"Computes a sample statistic using the original sample or a\n",
" simulated sample.\n",
" \"\"\"\n",
" return sample.std()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Test your code using the cell below:"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.30056137605\n",
"90% CI [ 13.70615766 18.05008376]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEMFJREFUeJzt3X+wXOVdx/H3chNQCU3uHTS/22AIDikqQY1VdNhmYgwj\nQwAtPxSNigwzdAhatRBUcu/YQdopFH/BVPkxsZTYFNKYjNMOKWEdsBasDSSXEEhSMpKY3MSSW8BB\nG8L1j+fZ7GZzf2x2994N9/t+zZzZc549e85zH9jPPnnOL5AkSZIkSZIkSZIkSZIkSVKLzAaeBl4C\neoEVubwb2AtsydOlVZ9ZCewEdgBLxqqikqTGTQMuzPOTgFeA84FVwCcGWX8+8AIwEZgD7AJOG/Va\nSpKGNVIQHyCFN8DbwMvAzLxcGGT9ZcAa4AiwhxT2C5uupSSpKSfT654DLAC+mZdvAV4EHgKm5LIZ\npOGdsr1UfhwkSW1Sb9hPAh4HbiX18B8AziEN8ewH7hnmswPNVFCS1LwJdawzEXgCeBRYn8sOVr3/\nILAxz+8jHdQtm5XLjjN37tyB3bt3n3RlJSm43cC5jXxwpJ59gTRMsx24r6p8etX8lcC2PL8BuBY4\nndTznwc8f0Jtd+9mYGDAaWCAVatWtb0Op8pkW9gWtsXwEzC3kaCHkXv2FwPXA1tJp1gC3AFcRxrC\nGQBeA27K720H1ubXd4GbcRhHktpupLB/lsF7/18d5jN35UmSdIrwHPg2KxaL7a7CKcO2qLAtKmyL\n1hjsXPmxMJDHnyRJdSoUCtBgbtuzl6QADHtJCsCwl6QADPsW6ezsolAoUCgU6Ozsand1JOk4HqBt\nkUKhwOZ1vQAsuuoCxtvfJ6n9PEArSRqWYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSA\nYd+Ezs4uCoVC+YnvknTKMuyb0N9/mM3retm8rrfdVZGkYRn2khSAYS9JARj2khSAYS9JARj2khSA\nYS9JAYwU9rOBp4GXgF5gRS7vAjYBrwJPAlOqPrMS2AnsAJa0srKSpMaMFPZHgD8APgx8BPg4cD5w\nOynszwOeyssA84Fr8utS4P469iFJGmUjBfEB4IU8/zbwMjATuBxYnctXA1fk+WXAGtKPxB5gF7Cw\nddWVJDXiZHrdc4AFwHPAVKAvl/flZYAZwN6qz+wl/ThIktpoQp3rTQKeAG4F3qp5byBPQxn0ve7u\n7mPzxWKRYrFYZ1UkKYZSqUSpVGrJtuoJ+4mkoP8CsD6X9QHTSMM804GDuXwf6aBu2axcdoLqsJck\nnai2I9zT09PwtkYaxikADwHbgfuqyjcAy/P8cio/AhuAa4HTgXOAecDzDddOktQSI/XsLwauB7YC\nW3LZSuBuYC1wA+lA7NX5ve25fDvwLnAzww/xSJLGwEhh/yxD9/4XD1F+V54kSacIz4GXpAAMe0kK\nwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLBvQGdnF4VCod3VkKS6GfYN6O8/zOZ1ve2uhiTV\nzbCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCX\npAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpADq\nCfuHgT5gW1VZN7AX2JKnS6veWwnsBHYAS1pSS0lSU+oJ+0eApTVlA8C9wII8fTWXzweuya9Lgfvr\n3IckaRTVE8TPAIcHKS8MUrYMWAMcAfYAu4CFjVZOktQazfS6bwFeBB4CpuSyGaThnbK9wMwm9iFJ\naoFGw/4B4BzgQmA/cM8w6w40uA9JUotMaPBzB6vmHwQ25vl9wOyq92blshN0d3cfmy8WixSLxQar\nIknjU6lUolQqtWRbjYb9dFKPHuBKKmfqbAAeIx28nQnMA54fbAPVYS9JOlFtR7inp6fhbdUT9muA\nS4CzgdeBVUCRNIQzALwG3JTX3Q6sza/vAjfjMI4ktV09YX/dIGUPD7P+XXkKq6Ojg0KhwJQpnRw+\n/Ea7qyNJngM/Go4ePcrmdb309w92xqokjT3DXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpIC\nMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDD\nXpICMOwlKQDDXpICqCfsHwb6gG1VZV3AJuBV4ElgStV7K4GdwA5gSWuqKUlqRj1h/wiwtKbsdlLY\nnwc8lZcB5gPX5NelwP117kOSNIrqCeJngMM1ZZcDq/P8auCKPL8MWAMcAfYAu4CFTddSktSURnvd\nU0lDO+TXqXl+BrC3ar29wMwG9yFJapFWDLEM5Gm49yVJbTShwc/1AdOAA8B04GAu3wfMrlpvVi47\nQXd397H5YrFIsVhssCqSND6VSiVKpVJLttVo2G8AlgOfzq/rq8ofA+4lDd/MA54fbAPVYS9JOlFt\nR7inp6fhbdUT9muAS4CzgdeBO4G7gbXADaQDsVfndbfn8u3Au8DNOIwjSW1XT9hfN0T54iHK78qT\nJOkU4TnwkhSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2ders7KJQ\nKFAoFNpdFUk6aYZ9nfr7D7N5XS+b1/W2uyqSdNIMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAM\ne0kKwLCXpAAMe0kKwLAfRR0dHcdusdDZ2dXu6kgKbEK7KzCeHT169NjtFRZddUGbayMpMnv2khSA\nYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khRAs3e9\n3AO8CRwFjgALgS7gS8CH8vtXA/1N7keS1IRme/YDQBFYQAp6gNuBTcB5wFN5WZLURq0YxinULF8O\nrM7zq4ErWrAPSVITWtGz/zrwLeDGXDYV6MvzfXlZktRGzY7ZXwzsB36YNHSzo+b9gTydoLu7+9h8\nsVikWCw2WRVJGl9KpRKlUqkl22o27Pfn10PAV0jj9n3ANOAAMB04ONgHq8NeknSi2o5wT09Pw9tq\nZhjnh4Cz8vyZwBJgG7ABWJ7LlwPrm9iHJKkFmunZTyX15svb+SLwJGn8fi1wA5VTLyVJbdRM2L8G\nXDhI+RvA4ia2K0lqMa+glaQADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwHyMdHR0UCgUKhQKdnV3tro6k\nYJp5Bu2419nZRX//4ZZs6+jRo2xe1wvAoqsuaMk2Jale9uyH0d9/mM3reo+FtCS9Xxn2khSAYS9J\nARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYd8G3tte\n0lgbrfvZLwXuAzqAB4FPj9J+Wu7ZZ5/lnXfeGdV9eG97SWNtNHr2HcDfkAJ/PnAdcP4o7KflDh06\nxKJFi/jTlT3cdOPN7a5OOKVSqd1VOGXYFhW2RWuMRtgvBHYBe4AjwD8Cy0ZhPy333nvv8YGzJnP3\nn3yeX7tseburE45f6grbosK2aI3RCPuZwOtVy3tzmSSpTUZjzH5gFLY5Jk477TTefOt73PmZWzhw\ncH+7qyNJLVMYhW1+BOgmjdkDrATe4/iDtLuAuaOwb0kaz3YD57a7EmUTSBWaA5wOvMD75ACtJOnk\nXAq8QurBr2xzXSRJkiS
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d737b10>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"def plot_resampled_stats(n=100):\n",
" sample = weight.rvs(n)\n",
" resampler = StdResampler(sample, xlim=[0, 100])\n",
" resampler.plot_sample_stats()\n",
" \n",
"plot_resampled_stats()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"When your `StdResampler` is working, you should be able to interact with it:"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 1.29095098626\n",
"90% CI [ 15.13442137 19.27452588]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEfVJREFUeJzt3XuMnNV9h/FnWNsk4GDvitZ3YteYCkNbII1LSyomluMa\nNcWENlxSWrelERJpoOkFMFLDboIIiQKhN1AbLnIhuHHAcY2iUAxmKtIEyMUGlsXBdrCKXbymgBOo\n2sYs2z/OGe94r7Mzszvr/T0f6dW875n3fefsgfnO8XlvIEmSJEmSJEmSJEmSJEmSpAZ5F/AUsB3o\nAj6Xy9uALcCLwCPAzIpt1gI7gR3AynGrqSSpLsfl1ynAk8AHgC8A1+Tya4Gb8/xS0g/DVGAhsAs4\nZrwqKkmq33HAd4HTSL32Wbl8dl6G1Ku/tmKbh4Gzx6uCkqTBVdPrPobUW+8GHgeeJwV9d36/m77g\nnwvsrdh2LzCvITWVJNVsShXrvAOcAcwA/hX4YL/3e/M0lOHekySNg2rCvuzHwDeA95F687OB/cAc\n4EBeZx+woGKb+bnsCIsXL+7dvXt3LfWVpMh2AyfXsuFIwzgn0nemzbuBDwHbgM3Amly+BtiU5zcD\nlwDTgEXAEuDpAbXdvZve3l6n3l5uuOGGptdhoky2hW1hWww/AYtrCXoYuWc/B1hH+lE4BrgXeCwH\n/gbgcmAPcFFevyuXdwFvA1fiMI4kNd1IYf8ccNYg5a8DK4bY5qY8SZImCM+Bb7JisdjsKkwYtkUf\n26KPbdEYhSZ9bm8ef5IkValQKECNuW3PXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDfgy0trZRKBQoFAq0trY1uzqS5L1xxkKhUGDrxk4All94OpP5b5U0frw3jiRp\nWIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9\nJAUwUtgvAB4Hngc6gatyeTuwF9iWp/MqtlkL7AR2ACsbWFdJUo2mjPD+IeBTwHZgOvB9YAvQC9ya\np0pLgYvz6zzgUeAU4J3GVVmSNFoj9ez3k4Ie4C3gBVKIw+A30F8NrCf9SOwBdgHL6q6lJKkuoxmz\nXwicCTyZlz8JPAPcBczMZXNJwztle+n7cZAkNUm1YT8deAC4mtTDvwNYBJwBvALcMsy2PpNPkpps\npDF7gKnAg8B9wKZcdqDi/TuBh/L8PtJB3bL5uWyA9vb2w/PFYpFisVhNfSUpjFKpRKlUasi+Rnpw\nbQFYB7xGOlBbNofUoyeXvx/4GOnA7P2kcfryAdqTGdi794HjkjRK9TxwfKSe/TnAZcCzpFMsAa4H\nLiUN4fQCLwFX5Pe6gA359W3gShzGkaSmGynsv8Xg4/rfHGabm/IkSZogvIJWkgIw7CUpAMNekgIw\n7CUpAMNekgIw7CUpAMNekgIw7CUpAMNekgIw7CUpAMNekgIw7BuktbWNQqFQviudJE0ohn2DHDz4\nBls3dh6+tbEkTSSGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCG\nvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgAjhf0C4HHgeaATuCqXtwFbgBeBR4CZFdusBXYCO4CV\njaysJKk2I4X9IeBTwGnA2cAngFOB60hhfwrwWF4GWApcnF9XAbdX8RmSpDE2UhDvB7bn+beAF4B5\nwPnAuly+Drggz68G1pN+JPYAu4Bljavu0aelpeXws2lbW9uaXR1JQU0ZxboLgTOBp4BZQHcu787L\nAHOBJyu22Uv6cQirp6fn8HNpl194epNrIymqasN+OvAgcDXwZr/3evM0lEHfa29vPzxfLBYpFotV\nVkWSYiiVSpRKpYbsq5qwn0oK+nuBTbmsG5hNGuaZAxzI5ftIB3XL5ueyASrDXpI0UP+OcEdHR837\nGmnMvgDcBXQBt1WUbwbW5Pk19P0IbAYuAaYBi4AlwNM1106S1BAj9ezPAS4DngW25bK1wM3ABuBy\n0oHYi/J7Xbm8C3gbuJLhh3gkSeNgpLD/FkP3/lcMUX5TniRJE4TnwEtSAIa9JAVg2EtSAIa9JAVg\n2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtS\nAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVQTdjfDXQD\nz1WUtQN7gW15Oq/ivbXATmAHsLIhtZQk1aWasL8HWNWvrBe4FTgzT9/M5UuBi/PrKuD2Kj9DkjSG\nqgniJ4A3BikvDFK2GlgPHAL2ALuAZbVWTpLUGPX0uj8JPAPcBczMZXNJwztle4F5dXyGJKkBptS4\n3R3AZ/L8Z4FbgMuHWLd3sML29vbD88VikWKxWGNVJGlyKpVKlEqlhuyr1rA/UDF/J/BQnt8HLKh4\nb34uG6Ay7CVJA/XvCHd0dNS8r1qHceZUzH+EvjN1NgOXANOARcAS4OmaaydJaohqevbrgXOBE4GX\ngRuAInAGaYjmJeCKvG4XsCG/vg1cyRDDOJKk8VNN2F86SNndw6x/U54kSROE58BLUgCGvSQFYNhL\nUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCG\nvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGfR1aW9soFAoUCoVmV0WShmXY1+Hg\nwTfYurGTrRs7m10VSRqWYS9JARj2khSAYS9JAVQT9ncD3cBzFWVtwBbgReARYGbFe2uBncAOYGVj\nqilJqkc1YX8PsKpf2XWksD8FeCwvAywFLs6vq4Dbq/wMSdIYqiaInwDe6Fd2PrAuz68DLsjzq4H1\nwCFgD7ALWFZ3LSVJdam11z2LNLRDfp2V5+cCeyvW2wvMq/EzJp2WlpbD5+W3trY1uzqSApnSgH30\n5mm49wdob28/PF8sFikWiw2oysTW09Nz+Jz85Ree3uTaSJroSqUSpVKpIfuqNey7gdnAfmAOcCCX\n7wMWVKw3P5cNUBn2kqSB+neEOzo6at5XrcM4m4E1eX4NsKmi/BJgGrAIWAI8XXPtJEkNUU3Pfj1w\nLnAi8DLwaeBmYANwOelA7EV53a5c3gW8DVzJ8EM8kqRxUE3YXzpE+Yohym/KkyRpgvAceEkKwLCX\npAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAM\ne0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kK\nwLCXpACm1Ln9HuAnQA9wCFgGtAFfBd6b378IOFjn50iS6lBvz74XKAJnkoIe4DpgC3AK8FheliQ1\nUSOGcQr9ls8H1uX5dcAFDfgMSVIdGtGzfxT4HvDxXDYL6M7z3XlZktRE9Y7ZnwO8AvwMaehmR7/3\ne/MkSWqiesP+lfz6KvB10rh9NzAb2A/MAQ4MtmF7e/vh+WKxSLFYrLMqkjS5lEolSqVSQ/ZVT9gf\nB7QAbwLHAyuBDmAzsAb4fH7dNNjGlWEvSRqof0e4o6Oj5n3VE/azSL358n6+AjxCGr/fAFxO36mX\nkqQmqifsXwLOGKT8dWBFHfuVJDWYV9BKUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQF\nYNg3SUtLC4VCgUKhQGtrW7OrI2mSq/dGaKpRT08PWzd2ArD8wtObXBtJk509e0kKwLAfpdbWtsPD\nL5J0tDDsR+ngwTfYurHz8BCMJB0NDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsC
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d78d110>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
"interact(plot_resampled_stats, n=slider)\n",
"None"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Part Three\n",
"==========\n",
"\n",
"We can extend this framework to compute SE and CI for a difference in means.\n",
"\n",
"For example, men are heavier than women on average. Here's the women's distribution again (from BRFSS data):"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(72.697645732966876, 16.944043048498038)"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"female_weight = scipy.stats.lognorm(0.23, 0, 70.8)\n",
"female_weight.mean(), female_weight.std()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"And here's the men's distribution:"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"(89.063576984335782, 17.992335889366288)"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"male_weight = scipy.stats.lognorm(0.20, 0, 87.3)\n",
"male_weight.mean(), male_weight.std()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"I'll simulate a sample of 100 men and 100 women:"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"female_sample = female_weight.rvs(100)\n",
"male_sample = male_weight.rvs(100)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference in means should be about 17 kg, but will vary from one random sample to the next:"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"15.521337537414979"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"male_sample.mean() - female_sample.mean()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's the function that computes Cohen's $d$ again:"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def CohenEffectSize(group1, group2):\n",
" \"\"\"Compute Cohen's d.\n",
"\n",
" group1: Series or NumPy array\n",
" group2: Series or NumPy array\n",
"\n",
" returns: float\n",
" \"\"\"\n",
" diff = group1.mean() - group2.mean()\n",
"\n",
" n1, n2 = len(group1), len(group2)\n",
" var1 = group1.var()\n",
" var2 = group2.var()\n",
"\n",
" pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2)\n",
" d = diff / numpy.sqrt(pooled_var)\n",
" return d"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference in weight between men and women is about 1 standard deviation:"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"0.9271315108152719"
]
},
"execution_count": 31,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"CohenEffectSize(male_sample, female_sample)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can write a version of the `Resampler` that computes the sampling distribution of $d$."
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"class CohenResampler(Resampler):\n",
" def __init__(self, group1, group2, xlim=None):\n",
" self.group1 = group1\n",
" self.group2 = group2\n",
" self.xlim = xlim\n",
" \n",
" def resample(self):\n",
" group1 = numpy.random.choice(self.group1, len(self.group1), replace=True)\n",
" group2 = numpy.random.choice(self.group2, len(self.group2), replace=True)\n",
" return group1, group2\n",
" \n",
" def sample_stat(self, groups):\n",
" group1, group2 = groups\n",
" return CohenEffectSize(group1, group2)\n",
" \n",
" # NOTE: The following functions are the same as the ones in Resampler,\n",
" # so I could just inherit them, but I'm including them for readability\n",
" def compute_sample_statistics(self, iters=1000):\n",
" stats = [self.sample_stat(self.resample()) for i in range(iters)]\n",
" return numpy.array(stats)\n",
" \n",
" def plot_sample_stats(self):\n",
" sample_stats = self.compute_sample_statistics()\n",
" summarize_sampling_distribution(sample_stats)\n",
" pyplot.hist(sample_stats, color=COLOR2)\n",
" pyplot.xlabel('sample statistic')\n",
" pyplot.xlim(self.xlim)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can instantiate a `CohenResampler` and plot the sampling distribution."
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"collapsed": false
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"SE 0.160707033098\n",
"90% CI [ 0.6808076 1.1974013]\n"
]
},
{
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAEPCAYAAACneLThAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEkZJREFUeJzt3X2QnVVhx/HvZhNUpLCbpoa84VIIxUApwSHSYocLxUyw\nHYI48lbbtE0dZrDA2HaEOFOyW20ERzP0RRjLWyNIbBRMQ61IeLktvkBQEkIIkSQl02waNkh2FTot\nJmH7xzmXvdns7n3u+96z38/MM/e553k7R8nvnj3PvecBSZIkSZIkSZIkSZIkSZIkDfNO4GlgE7AV\n+HwsnwqsB14CHgE6io5ZBmwHtgELG1ZTSVJFjo6vk4GngA8CXwA+HctvAG6O6/MIHwhTgC5gBzCp\nURWVJFXuaOAZ4DRCL316LD8+vofQi7+h6JiHgXMaVUFJ0uGy9LInEXrnfcATwAuEgO+L2/sYCvyZ\nQG/Rsb3ArJrUVJJUtskZ9nkLOBM4DvgucP6w7YNxGc1Y2yRJdZQl5At+BnwbeD+h93488AowA9gX\n99kDzCk6ZnYsO8xJJ500uHPnzkrqK0kT2U7g5HIOKDVcM42hb868C/gQsBFYByyJ5UuAtXF9HXAF\ncBRwIjAX2HBELXfuZHBwMNll+fLlTa+DbbN9ti+9BTipnICH0j35GcAqwofBJOBe4LEY9GuApcAu\n4LK4/9ZYvhU4CFyDwzWS1DSlQv554KwRyvcDF45yzIq4SJKazO+w10Eul2t2Feom5baB7Wt1qbev\nEm1Nuu5gHF+SJGXU1tYGZea2PXlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5\nSUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeY0LnZ1TaWtra8rS2Tm12c2X6sYn\nQ2lcaGtr4/EHtzTl2hdcejr+96hW4JOhJEmHMeQlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5\nSUpYqZCfAzwBvABsAa6L5d1AL7AxLhcVHbMM2A5sAxbWsK6SpDJNLrH9APApYBNwDPBjYD0wCKyM\nS7F5wOXxdRbwKHAK8FbtqixJyqpUT/4VQsADvAG8SAhvGPmntYuB1YQPh13ADmBB1bWUJFWknDH5\nLmA+8FR8fy3wHHAX0BHLZhKGcQp6GfpQkCQ1WNaQPwb4JnA9oUd/O3AicCawF/jSGMc685MkNUmp\nMXmAKcADwH3A2li2r2j7ncBDcX0P4WZtwexYdoTu7u6313O5HLlcLkt9JWnCyOfz5PP5qs5RasrK\nNmAV8BrhBmzBDEIPnlh+NnAV4Ybr/YRx+MKN15M5sjfvVMM6jFMNS6VVMtVwqZ78ucDHgc2Er0oC\nfAa4kjBUMwi8DFwdt20F1sTXg8A1OFwjSU1TKuS/x8jj9t8Z45gVcZEkNZm/eJWkhBnykpQwQ16S\nEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5SUpYloeGSElrb28v\nzNPdUB0dnfT372/4dTWxGPKa8A4dOtSUB5ZccOnpDb+mJh6HayQpYYa8JCXMkJekhBnykpQwQ16S\nEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwkqF/BzgCeAFYAtwXSyf\nCqwHXgIeATqKjlkGbAe2AQtrWVlJUnlKhfwB4FPAacA5wCeB9wE3EkL+FOCx+B5gHnB5fF0E3Jbh\nGpKkOikVwK8Am+L6G8CLwCzgYmBVLF8FXBLXFwOrCR8Ou4AdwILaVVeSVI5yetldwHzgaWA60BfL\n++J7gJlAb9ExvYQPBUlSE2R9/N8xwAPA9cDrw7YNxmU0I27r7u5+ez2Xy5HL5TJWRfXS2TmVgYH+\nZldDUpTP58nn81WdI0vITyEE/L3A2ljWBxxPGM6ZAeyL5XsIN2sLZseyIxSHvMaHgYH+pjzrFHze\nqTSS4R3gnp6ess9RarimDbgL2ArcWlS+DlgS15cwFP7rgCuAo4ATgbnAhrJrJUmqiVI9+XOBjwOb\ngY2xbBlwM7AGWEq4wXpZ3LY1lm8FDgLXMPZQjiSpjkqF/PcYvbd/4SjlK+IiSWoyv8MuSQkz5CUp\nYYa8JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJm\nyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeUlKmCEvSQkz5CUpYYa8\nJCXMkJekhGUJ+buBPuD5orJuoBfYGJeLirYtA7YD24CFNamlJKkiWUL+HmDRsLJBYCUwPy7fieXz\ngMvj6yLgtozXkCTVQZYAfhLoH6G8bYSyxcBq4ACwC9gBLKi0cpKk6lTTy74WeA64C+iIZTMJwzgF\nvcCsKq4hSarC5AqPux3467j+WeBLwNJR9h0cqbC7u/vt9VwuRy6Xq7AqkpSmfD5PPp+v6hyVhvy+\novU7gYfi+h5gTtG22bHsCMUhL0k60vAOcE9PT9nnqHS4ZkbR+kcY+ubNOuAK4CjgRGAusKHCa0iS\nqpSlJ78aOA+YBuwGlgM54EzCUMzLwNVx363Amvh6ELiGUYZrJEn1lyXkrxyh7O4x9l8RF0lSk/kd\ndklKmCEvSQkz5CUpYYa8JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCX\npIQZ8pKUMENekhJmyEtSwip9xqvqqLNzKgMD/c2uhqQEGPLj0MBAP48/uKXh173g0tMbfk1J9eVw\njSQlzJCXpIQZ8pKUMENekhJmyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEZQn5u4E+4PmisqnA\neuAl4BGgo2jbMmA7sA1YWJtqSpIqkSXk7wEWDSu7kRDypwCPxfcA84DL4+si4LaM15Ak1UGWAH4S\nGD4l4sXAqri+Crgkri8GVgMHgF3ADmBB1bWUJFWk0l72dMIQDvF1elyfCfQW7dcLzKrwGpKkKtVi\nquHBuIy1/Qjd3d1vr+dyOXK5XA2qIknpyOfz5PP5qs5Racj3AccDrwAzgH2xfA8wp2i/2bHsCMUh\nL01E7e3ttLW1NeXaHR2d9Pfvb8q1ld3wDnBPT0/Z56g05NcBS4Bb4uvaovL7gZWEYZq5wIYKryEl\n7dChQ015OAz4gJiJJEvIrwbOA6YBu4GbgJuBNcBSwg3Wy+K+W2P5VuAgcA1jD+VIkuooS8hfOUr5\nhaOUr4iLJKnJ/A67JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ\n8pKUMENekhJmyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeUlKmCEv\nSQkz5CUpYYa8JCXMkJekhBnykpQwQ16SEja5yuN3AT8HDgEHgAXAVOCfgffG7ZcBA1VeR5JUgWp7\n8oNADphPCHiAG4H1wCnAY/G9JKkJajFc0zbs/cXAqri+CrikBteQJFWgFj35R4EfAZ+IZdOBvrje\nF99Lkpqg2jH5c4G9wK8Qhmi2Dds+GBdJUhNUG/J74+urwLcI4/J9wPHAK8AMYN9IB3Z3d7+9nsvl\nyOVyVVZFktKSz+fJ5/NVnaOakD8aaAdeB94NLAR6gHXAEuCW+Lp2pIOLQ16SdKThHeCenp6yz1FN\nyE8n9N4L5/ka8AhhfH4NsJShr1BKkpqgmpB/GThzhPL9wIVVnFeSVCP+4lWSEmbIS1LCDHlJSpgh\nL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJW7QRlyersnMrAQH+zqyFJVTHkRzEw0M/jD25pyrUv\nuPT0plxXUnoMeWkCam9vp61t+EPd6q+jo5P+/v0Nv+5EZshLE9ChQ4ea8peqf6U2njdeJSlhhrwk\nJcyQl6SEGfKSlDBDXpI
"text/plain": [
"<matplotlib.figure.Figure at 0x7f294d796fd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"resampler = CohenResampler(male_sample, female_sample)\n",
"resampler.plot_sample_stats()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This example demonstrates an advantage of the computational framework over mathematical analysis. Statistics like Cohen's $d$, which is the ratio of other statistics, are relatively difficult to analyze. But with a computational approach, all sample statistics are equally \"easy\".\n",
"\n",
"One note on vocabulary: what I am calling \"resampling\" here is a specific kind of resampling called \"bootstrapping\". Other techniques that are also considering resampling include permutation tests, which we'll see in the next section, and \"jackknife\" resampling. You can read more at <http://en.wikipedia.org/wiki/Resampling_(statistics)>."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.10"
}
},
"nbformat": 4,
"nbformat_minor": 0
}