mirror of
https://github.com/donnemartin/data-science-ipython-notebooks.git
synced 2024-03-22 13:30:56 +08:00
1114 lines
100 KiB
Plaintext
1114 lines
100 KiB
Plaintext
|
{
|
||
|
"cells": [
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Random Sampling\n",
|
||
|
"==========================\n",
|
||
|
"\n",
|
||
|
"Credits: Forked from [CompStats](https://github.com/AllenDowney/CompStats) by Allen Downey. License: [Creative Commons Attribution 4.0 International](http://creativecommons.org/licenses/by/4.0/)."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 1,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"from __future__ import print_function, division\n",
|
||
|
"\n",
|
||
|
"import numpy\n",
|
||
|
"import scipy.stats\n",
|
||
|
"\n",
|
||
|
"import matplotlib.pyplot as pyplot\n",
|
||
|
"\n",
|
||
|
"from IPython.html.widgets import interact, fixed\n",
|
||
|
"from IPython.html import widgets\n",
|
||
|
"\n",
|
||
|
"# seed the random number generator so we all get the same results\n",
|
||
|
"numpy.random.seed(18)\n",
|
||
|
"\n",
|
||
|
"# some nicer colors from http://colorbrewer2.org/\n",
|
||
|
"COLOR1 = '#7fc97f'\n",
|
||
|
"COLOR2 = '#beaed4'\n",
|
||
|
"COLOR3 = '#fdc086'\n",
|
||
|
"COLOR4 = '#ffff99'\n",
|
||
|
"COLOR5 = '#386cb0'\n",
|
||
|
"\n",
|
||
|
"%matplotlib inline"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Part One\n",
|
||
|
"========\n",
|
||
|
"\n",
|
||
|
"Suppose we want to estimate the average weight of men and women in the U.S.\n",
|
||
|
"\n",
|
||
|
"And we want to quantify the uncertainty of the estimate.\n",
|
||
|
"\n",
|
||
|
"One approach is to simulate many experiments and see how much the results vary from one experiment to the next.\n",
|
||
|
"\n",
|
||
|
"I'll start with the unrealistic assumption that we know the actual distribution of weights in the population. Then I'll show how to solve the problem without that assumption.\n",
|
||
|
"\n",
|
||
|
"Based on data from the [BRFSS](http://www.cdc.gov/brfss/), I found that the distribution of weight in kg for women in the U.S. is well modeled by a lognormal distribution with the following parameters:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 2,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"(72.697645732966876, 16.944043048498038)"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 2,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"weight = scipy.stats.lognorm(0.23, 0, 70.8)\n",
|
||
|
"weight.mean(), weight.std()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Here's what that distribution looks like:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 3,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAZQAAAEPCAYAAABlZDIgAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3Xl4XPV97/H3bNp3S7I2G8nCeMGAF2xMIFhsLSUNkCYN\npaFsaUOeXtLetPeWpPe5T5x773ObtE/alKYlpE0TB3ohEFJqmrAFUELA2MY2XrBlS94tWbJs7Ysl\njWbuH+doNGdGM9YyozPL55XMo/P7zTkzX2N5vvNbzu8HIiIiIiIiIiIiIiIiIiIiIiIiIiIiInNy\nB9AENAOPRzjnCfP5vcAasy4L2A58CBwE/iro/BLgDeAI8DpQFPOoRUQkobiAFqAW8GAkhxUh59wJ\n/Nw8vg54P+i5HPOn26y/wSz/NfAX5vHjwDdiGbSIiCSe64FXg8pfMR/BvgvcG1RuAhaGnJMD7ARW\nTnFOhVkWERGbOeP42tXA6aDyGbPuUufUmMcujFZNB/A2RtcXGMmkwzzuIDwBiYiIDeKZUPzTPM8R\n4bpxYDVGgrkJaIjwHtN9HxERiSN3HF+7FVgUVF6E0QKJdk6NWResF/gZsA5oxGiVVADtQCVwbqo3\nr6+v9x89enSWoYuIpKWjwOWzvTieLZQPgKUYg/IZGGMlW0PO2Qo8YB5vBHowEkYpk7O3soHbMbq/\nJq550Dx+EHhpqjc/evQofr8/KR9f+9rXbI9B8dsfh+JPzkcyxw/Uz+VDP54tFC/wGPAaxnjI94FD\nwKPm809hzPC6E2M22CDwsPlcJbAFI+E5gaeBN83nvgE8D3weOAF8No5/BhERmaZ4JhSAV8xHsKdC\nyo9Ncd1+YG2E1+wCbptjXCIiEmPx7PKSWWpoaLA7hDlR/PZS/PZK9vjnInSGVSrxm32CIiIyDQ6H\nA+aQF9RCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGR\nmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBC\nERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmFBCERGRmIh3QrkDaAKagccj\nnPOE+fxeYI1Ztwh4G/gIOAD8SdD5m4EzwB7zcUesgxYRkZlzx/G1XcB3gNuAVmAnsBU4FHTOncDl\nwFLgOuBJYCMwBnwZ+BDIA3YBr2MkJz/wt+ZDUkzPSA+9o71kujLJdmeT487B4/TYHZaITEM8E8oG\noAU4YZafA+7GmlDuAraYx9uBImAh0G4+AAbMa6oxEgqAI15By/zz+rwc7DrIrs5dnOw/Gfb8orxF\nbKraRH1hPQ6H/upFElU8E0o1cDqofAajFXKpc2qAjqC6WoyusO1BdV8CHgA+AP4c6IlJxDLvPjj3\nAW+deYsh71DEc04PnOaZI8+wKG8Rt9TcQl1B3TxGKCLTFc8xFP80zwv9yhl8XR7wE+BPMVoqYHSL\n1QGrgbPAt+YQo9jE5/fx2qnX+M8T/xk1mQQ7PXCaLU1beOXkK/j8vjhHKCIzFc8WSivG4PqERRgt\nkGjn1Jh1AB7gReAZ4KWgc84FHf8L8HKkADZv3hw4bmhooKGhYVqBS3x5fV7+4/h/sP/C/imfL88u\nx4+fYe8wA2MDYc9v79jO4Ngg9yy5B7cznr/CIqmtsbGRxsbGmL1ePDuk3cBh4FagDdgB3Ef4oPxj\n5s+NwLfNnw6MsZULGIPzwSoxWiaYz60Hfn+K9/f7/dNtJMl8GR0f5bnm5zjWd8xS73F6WFe2jrXl\naynPLg/Utw228faZt2nubQ57rfrCeu69/F4yXBlxj1skHZhjlLPOC/Ee4fwtjCThAr4P/BXwqPnc\nU+bP72BM/R0EHgZ2AzcCvwL2MdkF9lXgVeBHGN1dfuC4+XrBYy4TlFAS0H8c/w/2dO6x1OV6cvnc\nFZ+jKrcq4nUn+0/yQssLYS2WRXmLeGD5A5oJJhIDiZ5Q7KSEkmA+6vqIF1pesNSVZJZw/7L7Kckq\nueT13SPdPN30NF0jXZb61aWrubvubs0AE5mjuSYU3Skv86J3pJeXj1uHuxZkLeDzKz8/rWQCUJxZ\nzCMrH6Eip8JS/+H5D9nVuStmsYrI7CihSNz5/D5+euynXBy/GKhzOpx8pv4z5HpyZ/RaeZ48Hlr+\nEAuyFljqXzn5CmcGQud8iMh8UkKRuHv37LthNyzeVnMblbmVs3q9LHcW9y691zJuMu4f5/mW56ec\nFSYi80MJReKqf7SfX7b90lJXX1DPxoqNc3rd8uxy7q6721LXN9rHz078bE6vKyKzp4QicfXLtl/i\n9XkD5Rx3DvcsuQenY+6/eqsWrOL6iustdYe6D3G4+/CcX1tEZk4JReKm62IXuzt3W+purr6Z/Iz8\nmL3HbYtuC5tu/LOTP2NkfCRm7yEi06OEInHzduvbliVSijOLWVO2JsoVM+dyuPhk7SdxBM107Bvt\n4+0zb8f0fUTk0pRQJC7ah9rDlla5ufrmuCyVUplbGdb1tb1jO22DbTF/LxGJTAlF4uLN029ayguz\nF7Jqwaq4vV9DdQNFGUWBsh8/Lx9/WYtIiswjJRSJuTMDZ8LW3rql5paYDMRHkuHK4BO1n7DUnR06\ny4ELB+L2niJipYQiMbe9Y7ulvChvEVcUXRH3911atJSVJSstdW+decsyy0xE4kcJRWJqYGyAj7o+\nstTdVHXTvK2zdVvNbZaWUM9oDzvP7ZyX9xZJd0ooElO7zu2yjFuUZJZQX1g/b+9fklXCteXXWup+\n1fYrLnovRrhCRGJFCUViZtw3zgfnPrDUrV+4Pq5jJ1O5qeomMpyTe6QMe4d5t/3deY1BJB0poUjM\nNPU00T/WHyh7nB5Wl66e9zjyPHl8rPJjlrpt7dvoG+2b91hE0okSisTMjo4dlvI1pdeQ7c62JZbr\nK663rGTs9Xl57+x7tsQiki6UUCQm2ofaw1YU3lC+waZoINOVyaaqTZa6Dzo/0GrEInGkhCIxETp2\nUptfS3lOeYSz58fasrXkeybXDfP6vLzf/r6NEYmkNiUUmTOvzxt2A+GGhfa1Tia4nW5uqLzBUrej\nYwdD3iGbIhJJbUooMmfNPc2W3Rhz3DksK1pmY0ST1patJdc9OZYy6htle/v2KFeIyGwpocic7buw\nz1JeVbIKl9NlUzRWGa4Mrq8MXzgyOAGKSGwoocicDHuHOdJzxFJ3VelVNkUztfXl68lyZQXKF8cv\nsrNDd8+LxJoSiszJoe5DjPvHA+XizGJqcmtsjChcpiszbMvh9zve1xpfIjGmhCJzsu+8tbvr6gVX\nz9u6XTNx3cLrLHfPD44Nhu3XIiJzo4Qis9Y72suJ/hOWuqsWJFZ314RsdzZry9Za6ra1b8Pv99sU\nkUjqUUKRWQudKlyVW0VpdqlN0VzadRXXWbYKPjd8jqO9R22MSCS1KKHIrE3V3ZXIijOLw/ZLea9d\ny7GIxIoSiszK+eHzdAx3BMoOHFxZcqWNEU1P6N7zx/qO0T7UblM0IqlFCUVm5VD3IUu5rqCO/Iz8\nCGcnjpq8GhbnLbbUbWvfZlM0Iqkl3gnlDqAJaAYej3DOE+bze4E1Zt0i4G3gI+AA8CdB55cAbwBH\ngNeBophHLZfU1N1kKS8vXm5TJDMXeqPj/gv76R/tj3C2iExXPBOKC/gORlJZCdwHrAg5507gcmAp\n8AXgSbN+DPgycCWwEfgvwMQn1lcwEsoVwJtmWeZR32gfrYOtlrpkSijLipZRklkSKPv8vrDFLUVk\n5uKZUDYALcAJjATxHHB3yDl3AVvM4+0YrY2FQDvwoVk/ABwCqqe4ZgtwT+xDl2gOdx+2lKtzqynI\nKLApmplzOpxct/A6S92uzl260VFkjuKZUKqB00HlM0wmhWjnhN5mXYvRFTaxot9CYGI0uMMsyzxK\n5u6uCdeUXWO50XFgbIC
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d9ead90>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"xs = numpy.linspace(20, 160, 100)\n",
|
||
|
"ys = weight.pdf(xs)\n",
|
||
|
"pyplot.plot(xs, ys, linewidth=4, color=COLOR1)\n",
|
||
|
"pyplot.xlabel('weight (kg)')\n",
|
||
|
"pyplot.ylabel('PDF')\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"`make_sample` draws a random sample from this distribution. The result is a NumPy array."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 4,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def make_sample(n=100):\n",
|
||
|
" sample = weight.rvs(n)\n",
|
||
|
" return sample"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Here's an example with `n=100`. The mean and std of the sample are close to the mean and std of the population, but not exact."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 5,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"(76.308293640077437, 19.995558735561865)"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 5,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"sample = make_sample(n=100)\n",
|
||
|
"sample.mean(), sample.std()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"We want to estimate the average weight in the population, so the \"sample statistic\" we'll use is the mean:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 6,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def sample_stat(sample):\n",
|
||
|
" return sample.mean()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"One iteration of \"the experiment\" is to collect a sample of 100 women and compute their average weight.\n",
|
||
|
"\n",
|
||
|
"We can simulate running this experiment many times, and collect a list of sample statistics. The result is a NumPy array."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 7,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def compute_sample_statistics(n=100, iters=1000):\n",
|
||
|
" stats = [sample_stat(make_sample(n)) for i in range(iters)]\n",
|
||
|
" return numpy.array(stats)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The next line runs the simulation 1000 times and puts the results in\n",
|
||
|
"`sample_means`:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 8,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"sample_means = compute_sample_statistics(n=100, iters=1000)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Let's look at the distribution of the sample means. This distribution shows how much the results vary from one experiment to the next.\n",
|
||
|
"\n",
|
||
|
"Remember that this distribution is not the same as the distribution of weight in the population. This is the distribution of results across repeated imaginary experiments."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 9,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAYcAAAEPCAYAAACp/QjLAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAE+BJREFUeJzt3X2UXHV9x/H35AHLU7JJxRCSaBBJS5RKFGMqWqaUpuHY\nmtAWkBYbq8djT+Th1NpqtEc2teVgFUtthPZUwSCaNoqmwYcW5DCn2BqjyENCiCaRKBtDgprdhWpt\nCNs/fr9hb+Y3u5lN5t67M3m/zpkzd37zcL+/fbif+zS/C5IkSZIkSZIkSZIkSZIkSVLXmQPcCzwC\nbAGuju29QB/wQLxdlHnPSmA7sA1YXFShkqTinAqcE6dPAr4DnAVcC7yzyevnAw8Ck4G5wA5gQu5V\nSpISeS58nyAs7AGeBh4FZsXHlSavXwqsBQ4AuwjhsDDH+iRJIyhqzXwusADYGB9fBTwEfALoiW2n\nEXY31fUxHCaSpAIVEQ4nAZ8DriFsQdwMnE7Y5bQHuGGU9w7lXp0kKTEp58+fDNwB3A6sj237Ms9/\nHLgzTu8mHMSumx3bDnHGGWcM7dy5s/2VSlJ32wm8pNUX57nlUCHsNtoK3Jhpn5mZvhjYHKc3AG8E\njiNsWZwJbGr80J07dzI0NNS1t2uvvbb0Guyb/bN/3XcDzhjLAjzPLYfzgCuAhwmnrAK8F7icsEtp\nCHgMeHt8biuwLt4/A6zA3UqSVIo8w+FrNN8y+coo77ku3qSuNbVnGoMD/YXMa8rUHgb69xcyL3WX\nvI85aIyq1WrZJeSmm/sGrfdvcKCfRVeuP/wL22Dj6mVt+yx/f8eWZt83GO+G4v4zqSNVKpVCw8H/\nF0H4u2MMy3y/gSxJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJ\nShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgO\nkqSE4SBJShgOkqSE4SBJShgOkqSE4SBJShgOkqTEpLILkMaDqT3TGBzoL7sMadwwHCRgcKCfRVeu\nL2ReG1cvK2Q+0tFwt5IkKWE4SJISeYbDHOBe4BFgC3B1bJ8O3A18F7gL6Mm8ZyWwHdgGLM6xNknS\nKPIMhwPAnwIvBRYB7wDOAt5DCId5wD3xMcB84LJ4vwS4Kef6JEkjyHPh+wTwYJx+GngUmAW8AVgT\n29cA9aNzS4G1hFDZBewAFuZYnyRpBEWtmc8FFgDfAGYAe2P73vgY4DSgL/OePkKYSJIKVsSprCcB\ndwDXAE81PDcUbyNp+lxvb+9z09VqlWq1elQFSlK3qdVq1Gq1I35/3uEwmRAMnwLqJ5HvBU4l7Haa\nCeyL7bsJB7HrZse2RDYcJEmpxhXnVatWjen9ee5WqgCfALYCN2baNwDL4/RyhkNjA/BG4DjgdOBM\nYFOO9UmSRpDnlsN5wBXAw8ADsW0lcD2wDngr4cDzpfG5rbF9K/AMsILRdzlJknKSZzh8jZG3TC4c\nof26eJMklcjvEUiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiSEoaDJClhOEiS\nEoaDJClhOEiSEkVcCU5SWSoTqVQqhc1uytQeBvr3FzY/5cdwkLrZ0EEWXbn+8K9rk42rlxU2L+XL\n3UqSpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6S\npIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElK5B0OtwB7gc2Ztl6gD3gg3i7KPLcS2A5s\nAxbnXJskaQR5h8OtwJKGtiHgI8CCePtKbJ8PXBbvlwA3FVCfJKmJvBe+9wH7m7RXmrQtBdYCB4Bd\nwA5gYW6VSZJGVNaa+VXAQ8AngJ7Ydhphd1NdHzCr4LokSZQTDjcDpwPnAHuAG0Z57VAhFUmSDjGp\nhHnuy0x/HLgzTu8G5mSemx3bEr29vc9NV6tVqtVqWwuUpE5Xq9Wo1WpH/P4ywmEmYYsB4GKGz2Ta\nAHyGcLB6FnAmsKnZB2TDQZKUalxxXrVq1Zjen3c4rAXOB54PPA5cC1QJu5SGgMeAt8fXbgXWxftn\ngBW4W0mSSpF3OFzepO2WUV5/XbxJkkrk9wgkSQnDQZKUMBwkSQnDQZKUMBwkSQnDQZKUMBwkSYlW\nwuGeFtskSV1itC/BHQ+cAJwCTM+0T8HRUiWpq40WDm8HriEMpX1/pv0pYHWeRUmSyjVaONwYb1cD\nHy2mHEnSeNDK2EofBV4DzG14/W15FCRJKl8r4XA78GLgQeBgpt1wkKQu1Uo4vBKYj8NnS9Ixo5VT\nWbcQLtAjSTpGtLLlcArhAjybgJ/HtiHgDXkVJUkqVyvh0Jt3EZKk8aWVcKjlXYQkaXxpJRyeZvhg\n9HHA5Ng2Ja+ipKk90xgc6C+7DOmY1Uo4nJSZnkA41rAon3KkYHCgn0VXri9sfhtXLytsXlInGOuo\nrM8C64ElOdQiSRonWtly+L3M9ATC9x5+lk85kqTxoJVw+B2Gjzk8A+wCluZVkCSpfK2Ew5vzLkKS\nNL60csxhDvAF4Ml4uwOYnWdRkqRytRIOtwIbCNd1OA24M7ZJkrpUK+FwCiEMDsTbJ4EX5FiTJKlk\nrYTDj4E3ARMJxyiuAH6UZ1GSpHK1Eg5/DFwKPAHsAS6JbZKkLtXK2Up/BfwRsD8+ng58GHhLXkVJ\nksrVypbDyxkOBoCfAK/IpxxJ0njQSjhUCFsLddMJxx8kSV2qld1KNwBfB9YRguIS4G/yLEqSVK5W\nwuE24H7gAsIwGhcTrgwnSepSrYQDwCPxJkk6Box1yG5J0jEg73C4BdgLbM60TQfuBr4L3AX0ZJ5b\nCWwHtgGLc65NkjSCvMPhVtILA72HEA7zgHviY4D5wGXxfglwUwH1SZKayHvhex+HfkcCwmVG18Tp\nNUD9+oxLgbWE8Zt2ATuAhTnXJ0lqoow18xmEXU3E+xlx+jSgL/O6PmBWgXVJkqKyd9sMMXyVuZGe\nlyQVrNVTWdtpL3AqYSC/mcC+2L6bcGGhutmxLdHb2/vcdLVapVqt5lCmJHWuWq1GrVY74veXEQ4b\ngOXAB+P9+kz7Z4CPEHYnnQlsavYB2XCQJKUaV5xXrVo1pvfnHQ5rgfOB5wOPA+8HricMxfFWwoHn\nS+Nrt8b2rcAzwArcrSRJpcg7HC4fof3CEdqvizdJnagykUqlUsispkztYaC/8WRItUsZu5Ukdauh\ngyy6cv3hX9cGG1cvO/yLdMTKPltJkjQOGQ6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpITh\nIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpIThIElK\nGA6SpIThIElKGA6SpIThIElKGA6SpIThIElKGA6SpMSksgtQ55jaM43Bgf6yy5BUAMNBLRsc6GfR\nlesLmdfG1csKmY+k5tytJElKGA6SpIThIElKGA6SpIThIElKlHm20i5gEDgIHAAWAtOBfwVeFJ+/\nFPDcSUkqWJlbDkNAFVhACAaA9wB3A/OAe+JjSVLByt6tVGl4/AZgTZxeA3iyuySVoOwth68C3wLe\nFttmAHvj9N74WJJUsDKPOZwH7AFOIexK2tbw/FC8JXp7e5+brlarVKvVXAqUpE5Vq9Wo1WpH/P4y\nw2FPvH8S+ALhuMNe4FTgCWAmsK/ZG7PhIElKNa44r1q1akzvL2u30gnAyXH6RGAxsBnYACyP7cuB\nYgbykSQdoqwthxmErYV6DZ8G7iIcf1gHvJXhU1klSQUrKxweA85p0v4T4MKCa5EkNSj7VFZJ0jhk\nOEiSEoaDJCnhleAkdabKRCqVxkEW8jFlag8D/fsLmdd4YThI6kxDB71sbY7crSRJShgOkqSE4SBJ\nShgOkqSE4SBJShgOkqS
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d915710>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"pyplot.hist(sample_means, color=COLOR5)\n",
|
||
|
"pyplot.xlabel('sample mean (n=100)')\n",
|
||
|
"pyplot.ylabel('count')\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The mean of the sample means is close to the actual population mean, which is nice, but not actually the important part."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 10,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"72.652052080657413"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 10,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"sample_means.mean()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The standard deviation of the sample means quantifies the variability from one experiment to the next, and reflects the precision of the estimate.\n",
|
||
|
"\n",
|
||
|
"This quantity is called the \"standard error\"."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 11,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"1.6355262477017491"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 11,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"std_err = sample_means.std()\n",
|
||
|
"std_err"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"We can also use the distribution of sample means to compute a \"90% confidence interval\", which contains 90% of the experimental results:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 12,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"array([ 69.92149384, 75.40866638])"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 12,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"conf_int = numpy.percentile(sample_means, [5, 95])\n",
|
||
|
"conf_int"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The following function takes an array of sample statistics and prints the SE and CI:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 13,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def summarize_sampling_distribution(sample_stats):\n",
|
||
|
" print('SE', sample_stats.std())\n",
|
||
|
" print('90% CI', numpy.percentile(sample_stats, [5, 95]))"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"And here's what that looks like:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 14,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.6355262477\n",
|
||
|
"90% CI [ 69.92149384 75.40866638]\n"
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"summarize_sampling_distribution(sample_means)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Now we'd like to see what happens as we vary the sample size, `n`. The following function takes `n`, runs 1000 simulated experiments, and summarizes the results."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 15,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def plot_sample_stats(n, xlim=None):\n",
|
||
|
" sample_stats = compute_sample_statistics(n, iters=1000)\n",
|
||
|
" summarize_sampling_distribution(sample_stats)\n",
|
||
|
" pyplot.hist(sample_stats, color=COLOR2)\n",
|
||
|
" pyplot.xlabel('sample statistic')\n",
|
||
|
" pyplot.xlim(xlim)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Here's a test run with `n=100`:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 16,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.71202891175\n",
|
||
|
"90% CI [ 69.96057332 75.58582662]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEdZJREFUeJzt3X+Q1PV9x/HneqiNknh3MQFEIhY1IzGtWEPt6IwLQylO\nO6Jm/NWakqlNneCI0zRNpJ2Gu0nGmrYxpG10Wn+VRiXBqBTamPHndjQmYioIiCggRKEKUe5MnGkS\nxOsfn896X5e9u9293dvdzz0fM9/Z7372u9/vm2O/r/vc5/vd7xckSZIkSZIkSZIkSZIkSZLGpWnA\nY8BzwGZgSWzvAXYD6+N0XuY9S4FtwFZg/lgVKkmqzmTg9Dg/EXgBOBVYBnyuzPIzgQ3A4cB0YDtw\nWMOrlCQdYqTwfY0Q2ABvAc8DU+PzXJnlFwIrgQPALkLAzx51lZKkqlXTu54OzAJ+FJ9fAzwL3AZ0\nxrbjCEM3RbsZ/IUgSRpDlQb8ROC7wLWEnvzNwImE4ZtXga8N896B0RQoSarNhAqWORy4F7gTWB3b\n9mVevxVYG+f3EA7MFh0f295jxowZAzt27Ki6WEka53YAJ1W68Eg9+BxhCGYLsDzTPiUzfyGwKc6v\nAS4DjiD08E8G1h1S4Y4dDAwMtO20bNmyptcwHmu3/uZP1t/cCZhRabjDyD34s4ErgI2E0yEB/gq4\nnDA8MwDsBK6Kr20BVsXHt4HFOEQjSU0xUsA/Qfle/gPDvOf6OEmSmshz1GuQz+ebXULN2rl2sP5m\ns/72Uu5c9rEwEMeTJEkVyuVyUEVu24OXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgNa50dXWT\ny+UaPnV1dTf7nyp5HrzGl1wux6P3bW74duZedBp+xlVvngcvSQIMeElKlgEvSYky4CUpUQa8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJmtDsAqSurm76+/uaXYaUHANeTdff3zcmFwCDcBEw\nabxwiEaSEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9J\niTLgJSlRIwX8NOAx4DlgM7AktncDDwEvAg8CnZn3LAW2AVuB+fUsVpJUuZEC/gDw58DHgLOAq4FT\ngesIAX8K8Eh8DjATuDQ+LgBuqmAbkqQGGCl8XwM2xPm3gOeBqcD5wIrYvgK4IM4vBFYSfjHsArYD\ns+tXriSpUtX0rqcDs4CngEnA3ti+Nz4HOA7YnXnPbsIvBEnSGKv0jk4TgXuBa4Gfl7w2EKehlH2t\np6fn3fl8Pk8+n6+wFEkaHwqFAoVCoeb3VxLwhxPC/VvA6ti2F5hMGMKZAuyL7XsIB2aLjo9th8gG\nvCTpUKWd397e3qreP9IQTQ64DdgCLM+0rwEWxflFDAb/GuAy4AjgROBkYF1VFUmS6mKkHvzZwBXA\nRmB9bFsK3ACsAq4kHEy9JL62JbZvAd4GFjP88I0kqUFGCvgnGLqXP2+I9uvjJElqIs9Rl6REGfCS\nlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiZrQ7ALUurq6uunv72t2GW2po6OD\nXC7X8O10dnbR17e/4dtRezLgNaT+/j4evW9zw7cz96LTGr6NsXbw4EF/dmo6h2gkKVEGvCQlyoCX\npEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REVRLwtwN7gU2Zth5gN7A+Tudl\nXlsKbAO2AvPrUqUkqWqVBPwdwIKStgHgRmBWnB6I7TOBS+PjAuCmCrchSaqzSsL3caDcNWPLXQt1\nIbASOADsArYDs2stTpJUu9H0rq8BngVuAzpj23GEoZui3cDUUWxDklSjWgP+ZuBE4HTgVeBrwyw7\nUOM2JEmjUOsNP/Zl5m8F1sb5PcC0zGvHx7ZD9PT0vDufz+fJ5/M1liJJaSoUChQKhZrfX2vATyH0\n3AEuZPAMmzXA3YQDsFOBk4F15VaQDXhJ0qFKO7+9vb1Vvb+SgF8JnAscC7wCLAPyhOGZAWAncFVc\ndguwKj6+DSzGIRpJaopKAv7yMm23D7P89XGSJDWR56hLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnw\nkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVGVBPztwF5gU6atG3gIeBF4EOjMvLYU2AZsBebXp0xJ\nUrUqCfg7gAUlbdcRAv4U4JH4HGAmcGl8XADcVOE2JEl1Vkn4Pg70lbSdD6yI8yuAC+L8QmAlcADY\nBWwHZo+6SklS1WrtXU8iDNsQHyfF+eOA3ZnldgNTa9yGJGkU6jF8MhCn4V6XJI2xCTW+by8wGXgN\nmALsi+17gGmZ5Y6PbYfo6el5dz6fz5PP52ssRZLSVCgUKBQKNb+/1oBfAywCvhofV2fa7wZuJAzN\nnAysK7eCbMBLkg5V2vnt7e2t6v2VBPxK4FzgWOAV4EvADcAq4ErCwdRL4rJbYvsW4G1gMQ7RSFJT\nVBLwlw/RPm+I9uvjJElqIs9Rl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVG13tFJUgvo6Oggl8s1fDudnV309e1v+HZUXwa8\n1MYOHjzIo/dtbvh25l50WsO3ofpziEaSEmXAS1KiDHhJSpRj8G2oq6ub/v6+ZpchqcUZ8G2ov7/P\nA2uSRuQQjSQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElK1GgvF7wL+BlwEDgAzAa6ge8AJ8TXLwH6R7kdSVKVRtuDHwDywCxCuANcBzwE\nnAI8Ep9LksZYPYZociXPzwdWxPkVwAV12IYkqUr16ME/DPwY+ExsmwTsjfN743NJ0hgb7Rj82cCr\nwIcIwzJbS14fiNMhenp63p3P5/Pk8/lRliJJaSkUChQKhZrfP9qAfzU+/hS4nzAOvxeYDLwGTAH2\nlXtjNuAlSYcq7fz29vZW9f7RDNEcBbw/zh8NzAc2AWuARbF9EbB6FNuQJNVoND34SYRee3E9dwEP\nEsbjVwFXMniapCRpjI0m4HcCp5dp3w/MG8V6JUl14DdZJSlRBrwkJcqAl6REGfCSlCgDXpISZcBL\nUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJGu09WSWNAx0d\nHeRyuYZvp7Ozi76+/Q3fznhhwEsa0cGDB3n0vs0N387ci05r+DbGE4doJClRBrwkJcqAl6REGfCS\nlCgDXpISZcBLUqIMeElKlOfB11FXVzf9/X3NLkOSAAO+rvr7+/wyiKSW4RCNJCXKgJekRBnwkpQo\nA16SEmXAS1KiDHhJSpQBL0mJSv48+DfffJM9e/Y0uwxJGnONCvgFwHKgA7gV+GqDtjOiz352MY88\n/AgTj35/Q7fzxv7XG7p+SapWIwK+A/hnYB6wB3gaWAM834BtjeiXv/gVV33q88w557y6rXPD5nWc\nftrs97Tdetdy7r731rpto1HK1d5OrL+52r3+QqFAPp9vdhljphFj8LOB7cAu4ADwbWBhA7bTNBs2\nP93sEmrWzrWD9Tdbo+sv3ty7UdOcOXPI5XJ0dXU39N/RKhrRg58KvJJ5vhv47QZsR1JiGn1z73/7\n9jf59GVXj5vrOTUi4Ac
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d9eaf90>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"plot_sample_stats(100)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Now we can use `interact` to run `plot_sample_stats` with different values of `n`. Note: `xlim` sets the limits of the x-axis so the figure doesn't get rescaled as we vary `n`."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 17,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.71314776815\n",
|
||
|
"90% CI [ 69.99274896 75.65943185]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEaNJREFUeJzt3X+QXeVdx/H3ZRMoSSjZFZqEkHYxQCVEJVVjHWC4zcRM\nUIdAmPJD0WCxg9IhTKtTiDM2uxYpdQTRKoyWH5OWEpu2IQaddgg/LqXWQisJSQiBJJIxSWGhklhw\n1IZl/eN5Lnu4ubt7f+y99+yT92vmzj3nOefc++XZ5bNPnnPuuSBJkiRJkiRJkiRJkiRJknRUmgM8\nDjwHbAdWxvY+YD+wOT4uzByzCtgF7ASWtKtQSVJ9ZgLnxOVpwAvAWcBq4FNV9p8HbAEmA73AbuCY\nllcpSTrCWOH7CiGwAd4Engdmx/VClf2XAWuBw8BeQsAvbLpKSVLd6hld9wILgO/F9euBZ4F7gOmx\n7RTC1E3Zfob/IEiS2qjWgJ8GfB24gTCSvws4jTB98zJw2yjHDjVToCSpMZNq2Gcy8A3gfmBDbHs1\ns/1u4KG4fIBwYrbs1Nj2LnPnzh3as2dP3cVK0lFuD3B6rTuPNYIvEKZgdgB3ZNpnZZYvAbbF5Y3A\nFcCxhBH+GcDTR1S4Zw9DQ0O5f6xevbrjNaRS50So0TqtM+8PYG6t4Q5jj+DPBa4CthIuhwT4Y+BK\nwvTMEPAScG3ctgNYF5/fAq7DKRpJ6oixAv47VB/lf3OUY26JD0lSB3mN+iiKxWKnS6jJRKhzItQI\n1jnerLOzql3L3g5DcT5JklSjQqEAdeS2I3hJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJl\nwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAV+51d/dQKBRGfXR393S6TCl3vJukcq9QKPDY+u2j7rNo\n+Xz8nVLqvJukJAkw4CUpWQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBryR0dXX5aVepwqRO\nFyCNh8HBwZo+7SodTRzBS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUrUWAE/B3gceA7YDqyM7T3AJuBF4GFgeuaYVcAuYCewZDyLlSTVbqyAPwx8Ejgb+DDwCeAs4CZC\nwJ8JPBrXAeYBl8fnpcCdNbyHJKkFxgrfV4AtcflN4HlgNnARsCa2rwEujsvLgLWEPwx7gd3AwvEr\nV5JUq3pG173AAuApYAYwENsH4jrAKcD+zDH7CX8QJEltVuvtgqcB3wBuAN6o2DYUHyOpuq2vr++d\n5WKxSLFYrLEUSTo6lEolSqVSw8fXEvCTCeH+ZWBDbBsAZhKmcGYBr8b2A4QTs2WnxrYjZANeknSk\nysFvf39/XcePNUVTAO4BdgB3ZNo3Aivi8gqGg38jcAVwLHAacAbwdF0VSZLGxVgj+HOBq4CtwObY\ntgq4FVgHXEM4mXpZ3LYjtu8A3gKuY/TpG0lSi4wV8N9h5FH+4hHab4kPSVIHeY26JCXKgJekRBnw\nkpQoA16SEmXAS1KiDHhJSpQBr47r7u6hUCiM+JDUmFrvRSO1zKFDB3ls/fYRty9aPr+N1UjpcAQv\nSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKU\nKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky\n4CUpUQa8JCWqloC/FxgAtmXa+oD9wOb4uDCzbRWwC9gJLBmXKiVJdasl4O8Dlla0DQG3Awvi45ux\nfR5weXxeCtxZ43tIksZZLeH7JHCwSnuhStsyYC1wGNgL7AYWNlqcJKlxzYyurweeBe4Bpse2UwhT\nN2X7gdlNvIckqUGNBvxdwGnAOcDLwG2j7DvU4HtIkpowqcHjXs0s3w08FJcPAHMy206NbUfo6+t7\nZ7lYLFIsFhssRZLSVCqVKJVKDR/faMDPIozcAS5h+AqbjcADhBOws4EzgKervUA24CVJR6oc/Pb3\n99d1fC0Bvxa4ADgJ2AesBoqE6Zkh4CXg2rjvDmBdfH4LuA6naCSpI2oJ+CurtN07yv63xIeUK11d\nXRQK1S7+GjZ9ejcHD77epoqk1mp0ikaacAYHB3ls/fZR91m0fH6bqpFazw8hSVKiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAl\nKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9Jiaol4O8FBoBtmbYeYBPwIvAwMD2zbRWw\nC9gJLBmfMiVJ9aol4O8Dlla03UQI+DOBR+M6wDzg8vi8FLizxveQJI2zWsL3SeBgRdtFwJq4vAa4\nOC4vA9YCh4G9wG5gYdNVSpLq1ujoegZh2ob4PCMunwLsz+y3H5jd4HtIkpowHtMnQ/Ex2nZJUptN\navC4AWAm8AowC3g1th8A5mT2OzW2HaGvr++d5WKxSLFYbLAUSUpTqVSiVCo1fHyjAb8RWAF8Pj5v\nyLQ/ANxOmJo5A3i62gtkA16SdKTKwW9/f39dx9cS8GuBC4CTgH3AZ4BbgXXANYSTqZfFfXfE9h3A\nW8B1OEUjSR1RS8BfOUL74hHab4kPSVIHeY26JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8Gqp7u4eCoXCqA9JrdHo3SSlmhw6dJDH1m8fdZ9Fy+e3qRrp6OIIXpIS\nZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL2V0dXWNemO0\n7u6eTpco1cybjUkZg4ODo94czRujaSJxBC9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIM\neElKlAEvSYky4CUpUQa8JCWq2XvR7AV+DAwCh4GFQA/wVeADcftlwKEm30eSVKdmR/BDQBFYQAh3\ngJuATcCZwKNxXZLUZuMxRVOoWL8IWBOX1wAXj8N7SJLqNB4j+EeAHwAfj20zgIG4PBDXJUlt1uwc\n/LnAy8DJhGmZnRXbh+LjCH19fe8sF4tFisVik6VIUlpKpRKlUqnh45sN+Jfj82vAg4R5+AFgJvAK\nMAt4tdqB2YCXJB2pcvDb399f1/HNTNFMAU6Iy1OBJcA2YCOwIravADY08R6SpAY1M4KfQRi1l1/n\nK8DDhPn4dcA1DF8mKUlqs2YC/iXgnCrtrwOLm3hdSdI48JOskpQoA16SEmXAS1KiDHhJSpQBL0mJ\nMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXqpDV1cXhUJh1Ed3d0+ny5SA5r6yTzrqDA4O8tj67aPus2j5/DZVI43OEbwkJcqAV8O6u3vGnK6Q\n1DlO0ahhhw4ddLpCyjFH8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCg/\nyaqqrr76d9n70t5OlyGpCQa8qtqwYQOfvHY1U4+fVnX7vh/u5Ylvl9pblKS6GPAa0TlnL+S9J5xY\nddu0qSe0uZqJo3zP+NFMn97NwYOvt6kiHa0MeGmcec945UWrTrIuBXYCu4AbW/QekqRRtCLgu4C/\nIYT8POBK4KwWvE/LlUqlTpdQk4lQ55btT3e6hKRMhJ85WGentSLgFwK7gb3AYeAfgGUteJ+Wmyg/\n9IlQ55bt3+90CUmZCD9zsM5Oa0XAzwb2Zdb3xzZJUhu14iTrUAteU23WdcwxfO4LNzJ50uSq29/4\n7zfaXJGkerXiSzM/DPQ
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d804810>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"def sample_stat(sample):\n",
|
||
|
" return sample.mean()\n",
|
||
|
"\n",
|
||
|
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
|
||
|
"interact(plot_sample_stats, n=slider, xlim=fixed([55, 95]))\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"This framework works with any other quantity we want to estimate. By changing `sample_stat`, you can compute the SE and CI for any sample statistic.\n",
|
||
|
"\n",
|
||
|
"As an exercise, fill in `sample_stat` below with any of these statistics:\n",
|
||
|
"\n",
|
||
|
"* Standard deviation of the sample.\n",
|
||
|
"* Coefficient of variation, which is the sample standard deviation divided by the sample standard mean.\n",
|
||
|
"* Min or Max\n",
|
||
|
"* Median (which is the 50th percentile)\n",
|
||
|
"* 10th or 90th percentile.\n",
|
||
|
"* Interquartile range (IQR), which is the difference between the 75th and 25th percentiles.\n",
|
||
|
"\n",
|
||
|
"NumPy array methods you might find useful include `std`, `min`, `max`, and `percentile`.\n",
|
||
|
"Depending on the results, you might want to adjust `xlim`."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 18,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.67195986148\n",
|
||
|
"90% CI [ 69.82954731 75.32184298]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEfxJREFUeJzt3XuMnNV9h/FnWGMSoLF3ResLdmIXTIWhDZCG0pKKwTIO\nqBEmVNxSWjelERIo0PQCGClht4kIiQJNb6CWAHIhuHGAuEZRKAYzFblwSWIuxjjYDlaxi9cE7ARK\nW/Cy/eOc2RmPd3bHM7M7y/6ej/Rq3vfM+75z9sB85/i8N5AkSZIkSZIkSZIkSZIkSVKbvAd4HHgK\n2Ah8MZf3AGuBF4AHgelV2ywHNgObgCXjVlNJUksOza9TgMeAjwBfBq7K5VcDN+T5haQfhoOBecAW\n4KDxqqgkqXWHAk8Cx5F67TNy+cy8DKlXf3XVNg8Ap4xXBSVJw2uk130QqbfeDzwCPEcK+v78fj+V\n4J8NbK/adjtwZFtqKklq2pQG1nkHOAGYBvw7cHrN+4N5qmek9yRJ46CRsC/7OfBt4EOk3vxMYCcw\nC9iV19kBzK3aZk4u28dRRx01uHXr1mbqK0mRbQWObmbD0YZxjqByps17gTOA9cAaYFkuXwaszvNr\ngAuBqcB8YAHwxH613bqVwcFBp8FBrrvuuo7XYaJMtoVtYVuMPAFHNRP0MHrPfhawgvSjcBBwJ/Bw\nDvxVwCXANuD8vP7GXL4R2AtchsM4ktRxo4X9s8BJw5S/Biyus831eZIkTRCeA99hxWKx01WYMGyL\nCtuiwrZoj0KHPncwjz9JkhpUKBSgydy2Zy9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS+pId3dPRQKhX2m7u6eTldLDfKul5IaUigUWHffhn3KFp17\nPH6Xx493vZQkjciwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCw\nl6QARgv7ucAjwHPABuCKXN4LbAfW5+msqm2WA5uBTcCSNtZVktSkKaO8/zbwGeAp4HDgR8BaYBC4\nKU/VFgIX5NcjgYeAY4B32ldlSRNFV1dX+U6MAEyf3s3u3a91sEaqZ7Sw35kngDeA50khDsPfZnMp\nsJL0I7EN2AKcDDzWakUlTTwDAwP73PZ40bnHd7A2GsmBjNnPA06kEtyfBp4GbgOm57LZpOGdsu1U\nfhwkSR3SaNgfDtwDXEnq4d8CzAdOAF4GbhxhW59sIEkdNtowDsDBwL3AXcDqXLar6v2vAffn+R2k\ng7plc3LZfnp7e4fmi8UixWKxkfpKUhilUolSqdSWfY32eKsCsAJ4lXSgtmwWqUdPLv8w8AnSgdm7\nSeP05QO0R7N/797HEkrvMvUeS1g7Zu93e+y08ljC0Xr2pwIXA8+QTrEEuBa4iDSEMwi8CFya39sI\nrMqve4HLcBhHkjputLD/LsOP639nhG2uz5MkaYLwClpJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCmC0sJ8LPAI8B2wA\nrsjlPcBa4AXgQWB61TbLgc3AJmBJOysrSWrOaGH/NvAZ4DjgFOBy4FjgGlLYHwM8nJcBFgIX5Ncz\ngZsb+AxJ0hgbLYh3Ak/l+TeA54EjgbOBFbl8BXBOnl8KrCT9SGwDtgAnt6+6kqRmHEivex5wIvA4\nMAPoz+X9eRlgNrC9apvtpB8HSVIHTWlwvcOBe4Ergddr3hvMUz3Dvtfb2zs0XywWKRaLDVZFkmIo\nlUqUSqW27KuRsD+YFPR3AqtzWT8wkzTMMwvYlct3kA7qls3JZfupDntJ0v5qO8J9fX1N72u0YZwC\ncBuwEfhqVfkaYFmeX0blR2ANcCEwFZgPLACeaLp2kqS2GK1nfypwMfAMsD6XLQduAFYBl5AOxJ6f\n39uYyzcCe4HLGHmIR5I0DkYL++9Sv/e/uE759XmSJE0QngMvSQEY9pIUgGEvSQEY9pIUgGEvSQEY\n9pIUgGEvqW26urooFApDU3d3T6erpKzRe+NI0qgGBgZYd9+GoeVF5x7fwdqomj17SQrAsJekAAx7\nSQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrAsJekAAx7SQrA\nsJekAAx7SQrAsJekAAx7SQrAsJekABoJ+9uBfuDZqrJeYDuwPk9nVb23HNgMbAKWtKWWkqSWNBL2\ndwBn1pQNAjcBJ+bpO7l8IXBBfj0TuLnBz5AkjaFGgvhRYPcw5YVhypYCK4G3gW3AFuDkZisnSWqP\nVnrdnwaeBm4Dpuey2aThnbLtwJEtfIYkqQ2mNLndLcBf5/nPAzcCl9RZd3C4wt7e3qH5YrFIsVhs\nsiqSNDmVSiVKpVJb9tVs2O+qmv8acH+e3wHMrXpvTi7bT3XYS5L2V9sR7uvra3pfzQ7jzKqa/ziV\nM3XWABcCU4H5wALgiaZrJ0lqi0Z69iuB04AjgJeA64AicAJpiOZF4NK87kZgVX7dC1xGnWEcSdL4\naSTsLxqm7PYR1r8+T5KkCcJz4CUpAMNe0rC6u3soFApDk97dmj0bR9Ikt2fPbtbdt2FoedG5x3ew\nNmqVPXtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCqCRsL8d6AeerSrrAdYCLwAPAtOr3lsObAY2AUvaU01JUisaCfs7gDNryq4hhf0xwMN5GWAhcEF+\nPRO4ucHPkCSNoUaC+FFgd03Z2cCKPL8COCfPLwVWAm8D24AtwMkt11KS1JJme90zSEM75NcZeX42\nsL1qve3AkU1+hiSpTaa0YR+DeRrp/f309vYOzReLRYrFYhuqIkmTR6lUolQqtWVfzYZ9PzAT2AnM\nAnbl8h3A3Kr15uSy/VSHvSRpf7Ud4b6+vqb31ewwzhpgWZ5fBqyuKr8QmArMBxYATzRdO0lSWzTS\ns18JnAYcAbwEfA64AVgFXEI6EHt+XndjLt8I7AUuY+QhHknSOGgk7C+qU764Tvn1eZIkTRCeAy9J\nARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS9pzHR1dVEoFIam7u6eTlcprEaeQStJTRkYGGDdfRuGlhed\ne3wHaxObPXtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QAWr2oahvwC2AAeBs4GegB\nvgF8IL9/PrCnxc+RJLWg1Z79IFAETiQFPcA1wFrgGODhvCxJ6qB2DOMUapbPBlbk+RXAOW34DElS\nC9rRs38I+CHwqVw2A+jP8/15WZLUQa2O2Z8KvAz8MmnoZlPN+4N5kiR1UKth/3J+fQX4Fmncvh+Y\nCewEZgG7htuwt7d3aL5YLFIsFlusiqRWdHf3sGfP7k5XQ1VKpRKlUqkt+2ol7A8FuoDXgcOAJUAf\nsAZYBnwpv64ebuPqsJfUeXv27PZ2xBNMbUe4r6+v6X21EvYzSL358n6+DjxIGr9fBVxC5dRLSVIH\ntRL2LwInDFP+GrC4hf1KktrMK2glKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDD\nXpICMOwlKQDDXpICmNL
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d775990>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"def sample_stat(sample):\n",
|
||
|
" # TODO: replace the following line with another sample statistic\n",
|
||
|
" return sample.mean()\n",
|
||
|
"\n",
|
||
|
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
|
||
|
"interact(plot_sample_stats, n=slider, xlim=fixed([0, 100]))\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Part Two\n",
|
||
|
"========\n",
|
||
|
"\n",
|
||
|
"So far we have shown that if we know the actual distribution of the population, we can compute the sampling distribution for any sample statistic, and from that we can compute SE and CI.\n",
|
||
|
"\n",
|
||
|
"But in real life we don't know the actual distribution of the population. If we did, we wouldn't need to estimate it!\n",
|
||
|
"\n",
|
||
|
"In real life, we use the sample to build a model of the population distribution, then use the model to generate the sampling distribution. A simple and popular way to do that is \"resampling,\" which means we use the sample itself as a model of the population distribution and draw samples from it.\n",
|
||
|
"\n",
|
||
|
"Before we go on, I want to collect some of the code from Part One and organize it as a class. This class represents a framework for computing sampling distributions."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 19,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"class Resampler(object):\n",
|
||
|
" \"\"\"Represents a framework for computing sampling distributions.\"\"\"\n",
|
||
|
" \n",
|
||
|
" def __init__(self, sample, xlim=None):\n",
|
||
|
" \"\"\"Stores the actual sample.\"\"\"\n",
|
||
|
" self.sample = sample\n",
|
||
|
" self.n = len(sample)\n",
|
||
|
" self.xlim = xlim\n",
|
||
|
" \n",
|
||
|
" def resample(self):\n",
|
||
|
" \"\"\"Generates a new sample by choosing from the original\n",
|
||
|
" sample with replacement.\n",
|
||
|
" \"\"\"\n",
|
||
|
" new_sample = numpy.random.choice(self.sample, self.n, replace=True)\n",
|
||
|
" return new_sample\n",
|
||
|
" \n",
|
||
|
" def sample_stat(self, sample):\n",
|
||
|
" \"\"\"Computes a sample statistic using the original sample or a\n",
|
||
|
" simulated sample.\n",
|
||
|
" \"\"\"\n",
|
||
|
" return sample.mean()\n",
|
||
|
" \n",
|
||
|
" def compute_sample_statistics(self, iters=1000):\n",
|
||
|
" \"\"\"Simulates many experiments and collects the resulting sample\n",
|
||
|
" statistics.\n",
|
||
|
" \"\"\"\n",
|
||
|
" stats = [self.sample_stat(self.resample()) for i in range(iters)]\n",
|
||
|
" return numpy.array(stats)\n",
|
||
|
" \n",
|
||
|
" def plot_sample_stats(self):\n",
|
||
|
" \"\"\"Runs simulated experiments and summarizes the results.\n",
|
||
|
" \"\"\"\n",
|
||
|
" sample_stats = self.compute_sample_statistics()\n",
|
||
|
" summarize_sampling_distribution(sample_stats)\n",
|
||
|
" pyplot.hist(sample_stats, color=COLOR2)\n",
|
||
|
" pyplot.xlabel('sample statistic')\n",
|
||
|
" pyplot.xlim(self.xlim)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The following function instantiates a `Resampler` and runs it."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 20,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def plot_resampled_stats(n=100):\n",
|
||
|
" sample = weight.rvs(n)\n",
|
||
|
" resampler = Resampler(sample, xlim=[55, 95])\n",
|
||
|
" resampler.plot_sample_stats()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Here's a test run with `n=100`"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 21,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.72606450921\n",
|
||
|
"90% CI [ 71.35648645 76.82647135]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEoZJREFUeJzt3X+QVeV9x/H3dREVbWS3Gn5JshQ1DdJWTEPtqOOVIQy2\nHVGc+CO1pRObITUjTtqOSmcad6MlJlNt+kunjT+G+IOGJEixmVjxx01NU38kgoBIBCpToLKYCIlm\n2gZw+8fzXPd49+7ee3f37j378H7NnLnnPuece78+u3724Tn3ngOSJEmSJEmSJEmSJEmSJElHpeOB\n54CNwFbgC7G9A1gPvAo8DkzMHLMc2A5sAxaMWqWSpIZNiI/jgGeB84EvATfG9puA2+P6LMIfg2OB\nTmAHcMxoFSpJGpoJwAvAWYTR+aTYPjk+hzB6vylzzGPAuaNVoCSpTz2j62MIo/Ie4GngZUK498Tt\nPfSF/VRgT+bYPcC0EalUktSQcXXs8w5wNnAy8K/ARRXbe+MykMG2SZKapJ6AL/sJ8C3gI4RR+2Rg\nHzAF2B/32QtMzxxzWmx7j5kzZ/bu3LlzKPVK0tFsJ3B6vTvXmqI5hb5PyJwAfAzYAKwDlsT2JcDa\nuL4OuAoYD8wAzgCe71fhzp309vbmfrnllltaXkMqdY6FGq3TOvO+ADPrDXeoPYKfAqwk/CE4BngA\neDKG/GrgWmAXcEXcf2ts3wocBq7DKRpJaolaAb8ZOKdK+5vA/AGOWREXSVIL+Rn1QRSLxVaXUJex\nUOdYqBGsc6RZZ2sVWvS+vXE+SZJUp0KhAA3ktiN4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA17Ja2/voFAo1Fza2ztaXao0orwe\nvJJXKBR4as2WmvvNWzwbfy+VZ14PXpIEGPCSlCwDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXK\ngJekRBnwkpSoWgE/HXgaeBnYAiyL7V3AHmBDXC7OHLMc2A5sAxaMYK2SpAaMq7H9EPBZYCNwEvAD\nYD3QC9wZl6xZwJXxcRrwBHAm8M7IlSxJqketEfw+QrgDvA28QghuqH7Bm0XAKsIfhl3ADmDusKuU\nJDWskTn4TmAO8Gx8fj3wEnAvMDG2TSVM3ZTtoe8PgiRpFNUb8CcB3wBuIIzk7wZmAGcDrwN3DHKs\n11+VpBaoNQcPcCzwTeBBYG1s25/Zfg/waFzfSzgxW3ZabOunq6vr3fVisUixWKynXkk6apRKJUql\n0pCPr3Xh+AKwEvgx4WRr2RTCyJ3Y/lHgE4STqw8T5t3LJ1lPp/8o3ht+aNR4ww+lotEbftQawZ8H\nXANsInwcEuDPgKsJ0zO9wGvA0rhtK7A6Ph4GrsMpGklqiVoB/12qz9N/e5BjVsRFktRCfpNVkhJl\nwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAGvMau9vYNCoVBz\nkY5W9VwPXsqlgwcP1H0ZYOlo5AhekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgD\nXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpSoWgE/HXgaeBnYAiyL7R3AeuBV4HFg\nYuaY5cB2YBuwYCSLlSTVr1bAHwI+C5wFnAt8BvgwcDMh4M8EnozPAWYBV8bHhcBddbyHJKkJaoXv\nPmBjXH8beAWYBlwCrIztK4FL4/oiYBXhD8MuYAcwd+TKlSTVq5HRdScwB3gOmAT0xPae+BxgKrAn\nc8wewh8ESdIoq/em2ycB3wRuAN6q2NYbl4FU3dbV1fXuerFYpFgs1lmK1BxtbW0UCoWa+02c2M6B\nA2+OQkU62pVKJUql0pCPryfgjyWE+wPA2tjWA0wmTOFMAfbH9r2EE7Nlp8W2frIBL+XBkSNHeGrN\nlpr7zVs8exSqkfoPfru7uxs6vtYUTQG4F9gKfDnTvg5YEteX0Bf864CrgPHADOAM4PmGKpIkjYha\nI/jzgGuATcCG2LYcuB1YDVxLOJl6Rdy2NbZvBQ4D1zH49I0kqUlqBfx3GXiUP3+A9hVxkSS1kJ9R\nl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJ\nSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6RE\nGfCSlCgDXpISZcBLUqLqCfj7gB5gc6atC9gDbIjLxZlty4HtwDZgwYhUKUlqWD0Bfz+wsKKtF7gT\nmBOXb8f2WcCV8XEhcFed7yFJGmH1hO8zwIEq7YUqbYuAVcAhYBewA5g71OIkSUM3nNH19cBLwL3A\nxNg2lTB1U7YHmDaM95AkDdG4IR53N/D5uH4rcAdw7QD79lZr7Orqene9WCxSLBaHWIokpalUKlEq\nlYZ8/FADfn9m/R7g0bi+F5ie2XZabOsnG/CSpP4qB7/d3d0NHT/UKZopmfXL6PuEzTrgKmA8MAM4\nA3h+iO8hSRqGekbwq4ALgVOA3cAtQBE4mzD98hqwNO67FVgdHw8D1zHAFI0kqbnqCfirq7TdN8j+\nK+IiSWohP6MuSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQl\nyoCXpEQZ8JKUKANeudPe3kGhUKi5SBrcUO/oJDXNwYMHeGrNlpr7zVs8exSqkcYuR/CSlCgDXpIS\nZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJqifg7wN6\ngM2Ztg5gPfAq8DgwMbNtObAd2AYsGJkyJUmNqifg7wcWVrTdTAj4M4En43OAWcCV8XEhcFed7yFJ\nGmH1hO8zwIGKtkuAlXF9JXBpXF8ErAIOAbuAHcDcYVcpSWrYUEfXkwjTNsTHSXF9KrAns98eYNoQ\n30OSNAwjcUen3rgMtr2frq6ud9eLxSLFYnEESpGkdJRKJUql0pCPH2rA9wCTgX3AFGB/bN8LTM/s\nd1ps6ycb8JKk/ioHv93d3Q0dP9QpmnXAkri+BFibab8KGA/MAM4Anh/ie0iShqGeEfwq4ELgFGA3\n8DngdmA1cC3hZOoVcd+tsX0rcBi4jsGnbyRJTVJPwF89QPv8AdpXxEWS1EJ+Rl2SEmXAS1KiDHhJ\nSpQBL0mJMuAlKVEGvNSgtrY2CoVCzaW9vaPVpeooNxKXKpCOKkeOHOGpNVtq7jdv8exRqEYamCN4\nSUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJek\nRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpESNG+bxu4CfAkeAQ8BcoAP4GvDB\nuP0K4OAw30eS1KDhjuB7gSIwhxDuADcD64EzgSfjc0nSKBuJKZpCxfNLgJVxfSVw6Qi8hySpQSMx\ngn8C+D7wqdg2CeiJ6z3xuSRplA13Dv484HXgVMK0zLaK7b1xkSSNsuEG/Ovx8Q3gEcI8fA8wGdgH\nTAH2Vzuwq6vr3fVisUixWBxmKZKUllKpRKlUGvLxwwn4CUAb8BZwIrAA6AbWAUuAL8bHtdUOzga8\nJKm/ysFvd3d3Q8cPJ+AnEUbt5dd5CHicMB+/GriWvo9JSpJG2XAC/jXg7CrtbwLzh/G6kqQR4DdZ\nJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAGvUdPe3kGhUKi5SBoZw70WjVS3gwcP8NSa\nLTX3m7d49ihUI6XPEbwkJcqAl6REGfCSlCgDXmqStra2uk4qt7d3tLpUJcqTrFKTHDlyxJPKailH\n8JKUKANekhJlwEtSogx
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f2974281c50>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"plot_resampled_stats(100)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Now we can use `plot_resampled_stats` in an interaction:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 22,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.67407589545\n",
|
||
|
"90% CI [ 69.60129748 75.13161693]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXgAAAEPCAYAAABIut/fAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEqdJREFUeJzt3X2MXNV9h/FnWGPA0ODdQvyerGsgjXFbTBqXClAGy7FM\nW2EwCi8pravQiJYIo7QV4EoNu0AJiQpJ30BteJEDwY2TGNc0KsXGTEqS8pLgV4yD7WLVdmFNgp2A\n1DZgtn+cM97r8axnZndm5+7x85Gu5s659878cjz57uHMnXtBkiRJkiRJkiRJkiRJkiTpmHQi8Byw\nAdgKfD62dwFrgFeAJ4HxmWOWAtuBbcD8EatUktSwcfFxDPAscAHwReCm2H4zcFdcn0n4Y3A80A3s\nAI4bqUIlSUMzDngBOJswOp8Q2yfG5xBG7zdnjnkCOG+kCpQkDahndH0cYVTeBzwNvEQI9764vY+B\nsJ8M7MkcuweY0pRKJUkNGVPHPu8B5wCnAv8GXFSxvT8ugznaNklSi9QT8GU/Bb4NfIQwap8IvA5M\nAvbFffYC0zLHTI1th5kxY0b/zp07h1KvJB3LdgJn1LtzrSma0xg4Q+Yk4OPAemA1sDi2LwZWxfXV\nwFXAWGA6cCbw/BEV7txJf39/7pdbb7217TWkUudoqNE6rTPvCzCj3nCH2iP4ScAywh+C44CHgadi\nyK8ArgV2AVfE/bfG9q3Au8D1OEUjSW1RK+A3A+dWaX8TmDfIMXfGRZLURp6jfhTFYrHdJdRlNNQ5\nGmoE62w262yvQpvetz/OJ0mS6lQoFKCB3HYEL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwGtU\n6ezsolAo1Fw6O7vaXarUdp4Hr1GlUCiwbuWWmvvNXTQLP2NKjefBS5IAA16SkmXAS1KiDHhJSpQB\nL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqUAS9JiTLgJSlRBrwkJcqAl6REGfCS\nlCgDXpISVSvgpwFPAy8BW4Alsb0H2AOsj8vFmWOWAtuBbcD8JtYqSWrAmBrb3wE+C2wATgF+CKwB\n+oF74pI1E7gyPk4B1gJnAe81r2RJUj1qjeBfJ4Q7wNvAy4Tghuo3fl0ILCf8YdgF7ADmDLtKSVLD\nGpmD7wZmA8/G5zcAG4EHgPGxbTJh6qZsDwN/ECRJI6jegD8F+CZwI2Ekfx8wHTgHeA24+yjH9g+n\nQEnS0NSagwc4HvgW8AiwKrbty2y/H3g8ru8lfDFbNjW2HaGnp+fQerFYpFgs1lOvJB0zSqUSpVJp\nyMdXm0ev3L4M+Anhy9aySYSRO7H9o8AnCV+uPkqYdy9/yXoGR47i+/v7HdircYVCgXUrt9Tcb+6i\nWfgZU2oKhQLUzu1Dao3gzweuATYRTocE+HPgasL0TD/wKnBd3LYVWBEf3wWuxyka1aGzs4sDB/a3\nuwwpKXX/JWgyR/A6TCMjc0fwOlY1OoL3l6ySlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUqUAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYmqFfDTgKeBl4AtwJLY\n3gWsAV4BngTGZ45ZCmwHtgHzm1msJKl+tQL+HeCzwNnAecBngA8DtxAC/izgqfgcYCZwZXxcANxb\nx3tIklqgVvi+DmyI628DLwNTgEuAZbF9GXBpXF8ILCf8YdgF7ADmNK9cSVK9GhlddwOzgeeACUBf\nbO+LzwEmA3syx+wh/EGQJI2wMXXudwrwLeBG4K2Kbf1xGUzVbT09PYfWi8UixWKxzlIk6dhQKpUo\nlUpDPr6egD+eEO4PA6tiWx8wkTCFMwnYF9v3Er6YLZsa246QDXhJ0pEqB7+9vb0NHV9riqYAPABs\nBb6caV8NLI7rixkI/tXAVcBYYDpwJvB8QxVJkpqi1gj+fOAaYBOwPrYtBe4CVgDXEr5MvSJu2xrb\ntwLvAtdz9OkbSVKL1Ar47zL4KH/eIO13xkWS1Eaeoy5JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpIS\nZcBLUqIMeElKlAGvJHV0dFAoFGounZ1d7S5Vapl6ryYpjSoHDx5k3cotNfebu2jWCFQjtYcjeElK\nlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ\n8JKUKANekhJlwEtSogx4SUqUAS9Jiaon4B8E+oDNmbYeYA+wPi4XZ7YtBbYD24D5TalSktSwegL+\nIWBBRVs/cA8wOy7/GttnAlfGxwXAvXW+hySpyeoJ32eA/VXaC1XaFgLLgXeAXcAOYM5Qi5MkDd1w\nRtc3ABuBB4DxsW0yYeqmbA8wZRjvIUkaojFDPO4+4La4fjtwN3DtIPv2V2vs6ek5tF4sFikWi0Ms\nRZLSVCqVKJVKQz5+qAG/L7N+P/B4XN8LTMtsmxrbjpANeEnSkSoHv729vQ0dP9QpmkmZ9csYOMNm\nNXAVMBaYDpwJPD/E95AkDUM9I/jlwMeA04DdwK1AETiHMP3yKnBd3HcrsCI+vgtczyBTNJKk1qon\n4K+u0vbgUfa/My6SpDbyHHVJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAV0t1dnZRKBRqLpKab6g3/JDqcuDAftat3FJzv7mLZo1ANdKxxRG8JCXK\ngJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4\nSUpUPQH/INAHbM60dQFrgFeAJ4HxmW1Lge3ANmB+c8qUJDWqnoB/CFhQ0XYLIeDPAp6KzwFmAlfG\nxwXAvXW+hySpyeoJ32eA/RVtlwDL4voy4NK4vhBYDrwD7AJ2AHOGXaUkqWFDHV1PIEzbEB8nxPXJ\nwJ7MfnuAKUN8D0nSMDTjln39cTna9iP09PQcWi8WixSLxSaUIknpKJVKlEqlIR8/1IDvAyYCrwOT\ngH2xfS8wLbPf1Nh2hGzAS5KOVDn47e3tbej4oU7RrAYWx/XFwKpM+1XAWGA6cCbw/BDfQ5I0DPWM\n4JcDHwNOA3YDnwPuAlYA1xK+TL0i7rs1tm8F3gWu5+jTN5KkFqkn4K8epH3eIO13xkWS1Eaeoy5J\niTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnwOqZ1\ndHRQKBTqWjo7u9pdrtSQZtzRSRq1Dh48yLqVW+rad+6iWS2uRmouR/CSlCgDXpISZcBLUqIMeElK\nlAEvSYky4CUpUQa8JCXKgJekRBnwkpQoA16SEmXAS1KiDHhJSpQBL0mJMuAlKVHDvVzwLuBnwEHg\nHWAO0AV8Hfhg3H4FcGCY7yNJatBwR/D9QBGYTQh3gFuANcBZwFPxuSRphDVjiqZQ8fwSYFlcXwZc\n2oT3kCQ1qBkj+LXAD4BPx7YJQF9c74vPlZjOzq66bnMnqX2GOwd/PvAacDphWmZbxfb+uCgxBw7s\nr+tWd97mTmqf4Qb8a/HxDeAxwjx8HzAReB2YBOyrdmBPT8+h9WKxSLFYHGYpkpSWUqlEqVQa8vHD\nCfhxQAfwFnAyMB/oBVYDi4EvxMdV1Q7OBrwk6UiVg9/e3t6Gjh9OwE8gjNrLr/M14EnCfPwK4FoG\nTpOUJI2w4QT8q8A5VdrfBOYN43UlSU3gL1klKVEGvCQlyoCXpEQZ8JKUKANekhJlwEtSogx4SUqU\nAS9JiTLgJSlRBrwkJcqAl6REGfCSlCgDXpISZcBLUqIMeElKlAEvSYky4CUpUQa8JCXKgJekRBnw\nkpQoA16qU0dHB4VCoeb
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d796cd0>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
|
||
|
"interact(plot_resampled_stats, n=slider, xlim=fixed([1, 15]))\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Exercise: write a new class called `StdResampler` that inherits from `Resampler` and overrides `sample_stat` so it computes the standard deviation of the resampled data."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 23,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"class StdResampler(Resampler): \n",
|
||
|
" \"\"\"Computes the sampling distribution of the standard deviation.\"\"\"\n",
|
||
|
" \n",
|
||
|
" def sample_stat(self, sample):\n",
|
||
|
" \"\"\"Computes a sample statistic using the original sample or a\n",
|
||
|
" simulated sample.\n",
|
||
|
" \"\"\"\n",
|
||
|
" return sample.std()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Test your code using the cell below:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 24,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.30056137605\n",
|
||
|
"90% CI [ 13.70615766 18.05008376]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEMFJREFUeJzt3X+wXOVdx/H3chNQCU3uHTS/22AIDikqQY1VdNhmYgwj\nQwAtPxSNigwzdAhatRBUcu/YQdopFH/BVPkxsZTYFNKYjNMOKWEdsBasDSSXEEhSMpKY3MSSW8BB\nG8L1j+fZ7GZzf2x2994N9/t+zZzZc549e85zH9jPPnnOL5AkSZIkSZIkSZIkSZIkSVKLzAaeBl4C\neoEVubwb2AtsydOlVZ9ZCewEdgBLxqqikqTGTQMuzPOTgFeA84FVwCcGWX8+8AIwEZgD7AJOG/Va\nSpKGNVIQHyCFN8DbwMvAzLxcGGT9ZcAa4AiwhxT2C5uupSSpKSfT654DLAC+mZdvAV4EHgKm5LIZ\npOGdsr1UfhwkSW1Sb9hPAh4HbiX18B8AziEN8ewH7hnmswPNVFCS1LwJdawzEXgCeBRYn8sOVr3/\nILAxz+8jHdQtm5XLjjN37tyB3bt3n3RlJSm43cC5jXxwpJ59gTRMsx24r6p8etX8lcC2PL8BuBY4\nndTznwc8f0Jtd+9mYGDAaWCAVatWtb0Op8pkW9gWtsXwEzC3kaCHkXv2FwPXA1tJp1gC3AFcRxrC\nGQBeA27K720H1ubXd4GbcRhHktpupLB/lsF7/18d5jN35UmSdIrwHPg2KxaL7a7CKcO2qLAtKmyL\n1hjsXPmxMJDHnyRJdSoUCtBgbtuzl6QADHtJCsCwl6QADPsW6ezsolAoUCgU6Ozsand1JOk4HqBt\nkUKhwOZ1vQAsuuoCxtvfJ6n9PEArSRqWYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2\nkhSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSA\nYd+Ezs4uCoVC+YnvknTKMuyb0N9/mM3retm8rrfdVZGkYRn2khSAYS9JARj2khSAYS9JARj2khSA\nYS9JAYwU9rOBp4GXgF5gRS7vAjYBrwJPAlOqPrMS2AnsAJa0srKSpMaMFPZHgD8APgx8BPg4cD5w\nOynszwOeyssA84Fr8utS4P469iFJGmUjBfEB4IU8/zbwMjATuBxYnctXA1fk+WXAGtKPxB5gF7Cw\nddWVJDXiZHrdc4AFwHPAVKAvl/flZYAZwN6qz+wl/ThIktpoQp3rTQKeAG4F3qp5byBPQxn0ve7u\n7mPzxWKRYrFYZ1UkKYZSqUSpVGrJtuoJ+4mkoP8CsD6X9QHTSMM804GDuXwf6aBu2axcdoLqsJck\nnai2I9zT09PwtkYaxikADwHbgfuqyjcAy/P8cio/AhuAa4HTgXOAecDzDddOktQSI/XsLwauB7YC\nW3LZSuBuYC1wA+lA7NX5ve25fDvwLnAzww/xSJLGwEhh/yxD9/4XD1F+V54kSacIz4GXpAAMe0kK\nwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLBvQGdnF4VCod3VkKS6GfYN6O8/zOZ1ve2uhiTV\nzbCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCX\npAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpADq\nCfuHgT5gW1VZN7AX2JKnS6veWwnsBHYAS1pSS0lSU+oJ+0eApTVlA8C9wII8fTWXzweuya9Lgfvr\n3IckaRTVE8TPAIcHKS8MUrYMWAMcAfYAu4CFjVZOktQazfS6bwFeBB4CpuSyGaThnbK9wMwm9iFJ\naoFGw/4B4BzgQmA/cM8w6w40uA9JUotMaPBzB6vmHwQ25vl9wOyq92blshN0d3cfmy8WixSLxQar\nIknjU6lUolQqtWRbjYb9dFKPHuBKKmfqbAAeIx28nQnMA54fbAPVYS9JOlFtR7inp6fhbdUT9muA\nS4CzgdeBVUCRNIQzALwG3JTX3Q6sza/vAjfjMI4ktV09YX/dIGUPD7P+XXkKq6Ojg0KhwJQpnRw+\n/Ea7qyNJngM/Go4ePcrmdb309w92xqokjT3DXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpIC\nMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDD\nXpICMOwlKQDDXpICqCfsHwb6gG1VZV3AJuBV4ElgStV7K4GdwA5gSWuqKUlqRj1h/wiwtKbsdlLY\nnwc8lZcB5gPX5NelwP117kOSNIrqCeJngMM1ZZcDq/P8auCKPL8MWAMcAfYAu4CFTddSktSURnvd\nU0lDO+TXqXl+BrC3ar29wMwG9yFJapFWDLEM5Gm49yVJbTShwc/1AdOAA8B04GAu3wfMrlpvVi47\nQXd397H5YrFIsVhssCqSND6VSiVKpVJLttVo2G8AlgOfzq/rq8ofA+4lDd/MA54fbAPVYS9JOlFt\nR7inp6fhbdUT9muAS4CzgdeBO4G7gbXADaQDsVfndbfn8u3Au8DNOIwjSW1XT9hfN0T54iHK78qT\nJOkU4TnwkhSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2ders7KJQ\nKFAoFNpdFUk6aYZ9nfr7D7N5XS+b1/W2uyqSdNIMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAM\ne0kKwLCXpAAMe0kKwLAfRR0dHcdusdDZ2dXu6kgKbEK7KzCeHT169NjtFRZddUGbayMpMnv2khSA\nYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khRAs3e9\n3AO8CRwFjgALgS7gS8CH8vtXA/1N7keS1IRme/YDQBFYQAp6gNuBTcB5wFN5WZLURq0YxinULF8O\nrM7zq4ErWrAPSVITWtGz/zrwLeDGXDYV6MvzfXlZktRGzY7ZXwzsB36YNHSzo+b9gTydoLu7+9h8\nsVikWCw2WRVJGl9KpRKlUqkl22o27Pfn10PAV0jj9n3ANOAAMB04ONgHq8NeknSi2o5wT09Pw9tq\nZhjnh4Cz8vyZwBJgG7ABWJ7LlwPrm9iHJKkFmunZTyX15svb+SLwJGn8fi1wA5VTLyVJbdRM2L8G\nXDhI+RvA4ia2K0lqMa+glaQADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJ\nCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwHyMdHR0UCgUKhQKdnV3tro6k\nYJp5Bu2419nZRX//4ZZs6+jRo2xe1wvAoqsuaMk2Jale9uyH0d9/mM3reo+FtCS9Xxn2khSAYS9J\nARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYS9JARj2khSAYd8G3tte\n0lgbrfvZLwXuAzqAB4FPj9J+Wu7ZZ5/lnXfeGdV9eG97SWNtNHr2HcDfkAJ/PnAdcP4o7KflDh06\nxKJFi/jTlT3cdOPN7a5OOKVSqd1VOGXYFhW2RWuMRtgvBHYBe4AjwD8Cy0ZhPy333nvv8YGzJnP3\nn3yeX7tseburE45f6grbosK2aI3RCPuZwOtVy3tzmSSpTUZjzH5gFLY5Jk477TTefOt73PmZWzhw\ncH+7qyNJLVMYhW1+BOgmjdkDrATe4/iDtLuAuaOwb0kaz3YD57a7EmUTSBWaA5wOvMD75ACtJOnk\nXAq8QurBr2xzXSRJkiS
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d737b10>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"def plot_resampled_stats(n=100):\n",
|
||
|
" sample = weight.rvs(n)\n",
|
||
|
" resampler = StdResampler(sample, xlim=[0, 100])\n",
|
||
|
" resampler.plot_sample_stats()\n",
|
||
|
" \n",
|
||
|
"plot_resampled_stats()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"When your `StdResampler` is working, you should be able to interact with it:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 25,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 1.29095098626\n",
|
||
|
"90% CI [ 15.13442137 19.27452588]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXsAAAEPCAYAAACjjWTcAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEfVJREFUeJzt3XuMnNV9h/FnWNsk4GDvitZ3YteYCkNbII1LSyomluMa\nNcWENlxSWrelERJpoOkFMFLDboIIiQKhN1AbLnIhuHHAcY2iUAxmKtIEyMUGlsXBdrCKXbymgBOo\n2sYs2z/OGe94r7Mzszvr/T0f6dW875n3fefsgfnO8XlvIEmSJEmSJEmSJEmSJEmSpAZ5F/AUsB3o\nAj6Xy9uALcCLwCPAzIpt1gI7gR3AynGrqSSpLsfl1ynAk8AHgC8A1+Tya4Gb8/xS0g/DVGAhsAs4\nZrwqKkmq33HAd4HTSL32Wbl8dl6G1Ku/tmKbh4Gzx6uCkqTBVdPrPobUW+8GHgeeJwV9d36/m77g\nnwvsrdh2LzCvITWVJNVsShXrvAOcAcwA/hX4YL/3e/M0lOHekySNg2rCvuzHwDeA95F687OB/cAc\n4EBeZx+woGKb+bnsCIsXL+7dvXt3LfWVpMh2AyfXsuFIwzgn0nemzbuBDwHbgM3Amly+BtiU5zcD\nlwDTgEXAEuDpAbXdvZve3l6n3l5uuOGGptdhoky2hW1hWww/AYtrCXoYuWc/B1hH+lE4BrgXeCwH\n/gbgcmAPcFFevyuXdwFvA1fiMI4kNd1IYf8ccNYg5a8DK4bY5qY8SZImCM+Bb7JisdjsKkwYtkUf\n26KPbdEYhSZ9bm8ef5IkValQKECNuW3PXpICMOwlKQDDXpICMOwlKQDDXpICMOwlKQDDXpICMOwl\nKQDDXpICMOwlKQDDfgy0trZRKBQoFAq0trY1uzqS5L1xxkKhUGDrxk4All94OpP5b5U0frw3jiRp\nWIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9\nJAUwUtgvAB4Hngc6gatyeTuwF9iWp/MqtlkL7AR2ACsbWFdJUo2mjPD+IeBTwHZgOvB9YAvQC9ya\np0pLgYvz6zzgUeAU4J3GVVmSNFoj9ez3k4Ie4C3gBVKIw+A30F8NrCf9SOwBdgHL6q6lJKkuoxmz\nXwicCTyZlz8JPAPcBczMZXNJwztle+n7cZAkNUm1YT8deAC4mtTDvwNYBJwBvALcMsy2PpNPkpps\npDF7gKnAg8B9wKZcdqDi/TuBh/L8PtJB3bL5uWyA9vb2w/PFYpFisVhNfSUpjFKpRKlUasi+Rnpw\nbQFYB7xGOlBbNofUoyeXvx/4GOnA7P2kcfryAdqTGdi794HjkjRK9TxwfKSe/TnAZcCzpFMsAa4H\nLiUN4fQCLwFX5Pe6gA359W3gShzGkaSmGynsv8Xg4/rfHGabm/IkSZogvIJWkgIw7CUpAMNekgIw\n7CUpAMNekgIw7CUpAMNekgIw7CUpAMNekgIw7CUpAMNekgIw7BuktbWNQqFQviudJE0ohn2DHDz4\nBls3dh6+tbEkTSSGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCG\nvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgAjhf0C4HHgeaATuCqXtwFbgBeBR4CZFdusBXYCO4CV\njaysJKk2I4X9IeBTwGnA2cAngFOB60hhfwrwWF4GWApcnF9XAbdX8RmSpDE2UhDvB7bn+beAF4B5\nwPnAuly+Drggz68G1pN+JPYAu4Bljavu0aelpeXws2lbW9uaXR1JQU0ZxboLgTOBp4BZQHcu787L\nAHOBJyu22Uv6cQirp6fn8HNpl194epNrIymqasN+OvAgcDXwZr/3evM0lEHfa29vPzxfLBYpFotV\nVkWSYiiVSpRKpYbsq5qwn0oK+nuBTbmsG5hNGuaZAxzI5ftIB3XL5ueyASrDXpI0UP+OcEdHR837\nGmnMvgDcBXQBt1WUbwbW5Pk19P0IbAYuAaYBi4AlwNM1106S1BAj9ezPAS4DngW25bK1wM3ABuBy\n0oHYi/J7Xbm8C3gbuJLhh3gkSeNgpLD/FkP3/lcMUX5TniRJE4TnwEtSAIa9JAVg2EtSAIa9JAVg\n2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtS\nAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVg2EtSAIa9JAVQTdjfDXQD\nz1WUtQN7gW15Oq/ivbXATmAHsLIhtZQk1aWasL8HWNWvrBe4FTgzT9/M5UuBi/PrKuD2Kj9DkjSG\nqgniJ4A3BikvDFK2GlgPHAL2ALuAZbVWTpLUGPX0uj8JPAPcBczMZXNJwztle4F5dXyGJKkBptS4\n3R3AZ/L8Z4FbgMuHWLd3sML29vbD88VikWKxWGNVJGlyKpVKlEqlhuyr1rA/UDF/J/BQnt8HLKh4\nb34uG6Ay7CVJA/XvCHd0dNS8r1qHceZUzH+EvjN1NgOXANOARcAS4OmaaydJaohqevbrgXOBE4GX\ngRuAInAGaYjmJeCKvG4XsCG/vg1cyRDDOJKk8VNN2F86SNndw6x/U54kSROE58BLUgCGvSQFYNhL\nUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCG\nvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGfR1aW9soFAoUCoVmV0WShmXY1+Hg\nwTfYurGTrRs7m10VSRqWYS9JARj2khSAYS9JAVQT9ncD3cBzFWVtwBbgReARYGbFe2uBncAOYGVj\nqilJqkc1YX8PsKpf2XWksD8FeCwvAywFLs6vq4Dbq/wMSdIYqiaInwDe6Fd2PrAuz68DLsjzq4H1\nwCFgD7ALWFZ3LSVJdam11z2LNLRDfp2V5+cCeyvW2wvMq/EzJp2WlpbD5+W3trY1uzqSApnSgH30\n5mm49wdob28/PF8sFikWiw2oysTW09Nz+Jz85Ree3uTaSJroSqUSpVKpIfuqNey7gdnAfmAOcCCX\n7wMWVKw3P5cNUBn2kqSB+neEOzo6at5XrcM4m4E1eX4NsKmi/BJgGrAIWAI8XXPtJEkNUU3Pfj1w\nLnAi8DLwaeBmYANwOelA7EV53a5c3gW8DVzJ8EM8kqRxUE3YXzpE+Yohym/KkyRpgvAceEkKwLCX\npAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAM\ne0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kKwLCXpAAMe0kK\nwLCXpACm1Ln9HuAnQA9wCFgGtAFfBd6b378IOFjn50iS6lBvz74XKAJnkoIe4DpgC3AK8FheliQ1\nUSOGcQr9ls8H1uX5dcAFDfgMSVIdGtGzfxT4HvDxXDYL6M7z3XlZktRE9Y7ZnwO8AvwMaehmR7/3\ne/MkSWqiesP+lfz6KvB10rh9NzAb2A/MAQ4MtmF7e/vh+WKxSLFYrLMqkjS5lEolSqVSQ/ZVT9gf\nB7QAbwLHAyuBDmAzsAb4fH7dNNjGlWEvSRqof0e4o6Oj5n3VE/azSL358n6+AjxCGr/fAFxO36mX\nkqQmqifsXwLOGKT8dWBFHfuVJDWYV9BKUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQFYNhLUgCGvSQF\nYNg3SUtLC4VCgUKhQGtrW7OrI2mSq/dGaKpRT08PWzd2ArD8wtObXBtJk509e0kKwLAfpdbWtsPD\nL5J0tDDsR+ngwTfYurHz8BCMJB0NDHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QADHtJCsCwl6QA\nDHtJCsCwl6QADHtJCsC
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d78d110>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"slider = widgets.IntSliderWidget(min=10, max=1000, value=100)\n",
|
||
|
"interact(plot_resampled_stats, n=slider)\n",
|
||
|
"None"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Part Three\n",
|
||
|
"==========\n",
|
||
|
"\n",
|
||
|
"We can extend this framework to compute SE and CI for a difference in means.\n",
|
||
|
"\n",
|
||
|
"For example, men are heavier than women on average. Here's the women's distribution again (from BRFSS data):"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 26,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"(72.697645732966876, 16.944043048498038)"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 26,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"female_weight = scipy.stats.lognorm(0.23, 0, 70.8)\n",
|
||
|
"female_weight.mean(), female_weight.std()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"And here's the men's distribution:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 27,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"(89.063576984335782, 17.992335889366288)"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 27,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"male_weight = scipy.stats.lognorm(0.20, 0, 87.3)\n",
|
||
|
"male_weight.mean(), male_weight.std()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"I'll simulate a sample of 100 men and 100 women:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 28,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"female_sample = female_weight.rvs(100)\n",
|
||
|
"male_sample = male_weight.rvs(100)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The difference in means should be about 17 kg, but will vary from one random sample to the next:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 29,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"15.521337537414979"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 29,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"male_sample.mean() - female_sample.mean()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Here's the function that computes Cohen's $d$ again:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 30,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"def CohenEffectSize(group1, group2):\n",
|
||
|
" \"\"\"Compute Cohen's d.\n",
|
||
|
"\n",
|
||
|
" group1: Series or NumPy array\n",
|
||
|
" group2: Series or NumPy array\n",
|
||
|
"\n",
|
||
|
" returns: float\n",
|
||
|
" \"\"\"\n",
|
||
|
" diff = group1.mean() - group2.mean()\n",
|
||
|
"\n",
|
||
|
" n1, n2 = len(group1), len(group2)\n",
|
||
|
" var1 = group1.var()\n",
|
||
|
" var2 = group2.var()\n",
|
||
|
"\n",
|
||
|
" pooled_var = (n1 * var1 + n2 * var2) / (n1 + n2)\n",
|
||
|
" d = diff / numpy.sqrt(pooled_var)\n",
|
||
|
" return d"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"The difference in weight between men and women is about 1 standard deviation:"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 31,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"data": {
|
||
|
"text/plain": [
|
||
|
"0.9271315108152719"
|
||
|
]
|
||
|
},
|
||
|
"execution_count": 31,
|
||
|
"metadata": {},
|
||
|
"output_type": "execute_result"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"CohenEffectSize(male_sample, female_sample)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Now we can write a version of the `Resampler` that computes the sampling distribution of $d$."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 32,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [],
|
||
|
"source": [
|
||
|
"class CohenResampler(Resampler):\n",
|
||
|
" def __init__(self, group1, group2, xlim=None):\n",
|
||
|
" self.group1 = group1\n",
|
||
|
" self.group2 = group2\n",
|
||
|
" self.xlim = xlim\n",
|
||
|
" \n",
|
||
|
" def resample(self):\n",
|
||
|
" group1 = numpy.random.choice(self.group1, len(self.group1), replace=True)\n",
|
||
|
" group2 = numpy.random.choice(self.group2, len(self.group2), replace=True)\n",
|
||
|
" return group1, group2\n",
|
||
|
" \n",
|
||
|
" def sample_stat(self, groups):\n",
|
||
|
" group1, group2 = groups\n",
|
||
|
" return CohenEffectSize(group1, group2)\n",
|
||
|
" \n",
|
||
|
" # NOTE: The following functions are the same as the ones in Resampler,\n",
|
||
|
" # so I could just inherit them, but I'm including them for readability\n",
|
||
|
" def compute_sample_statistics(self, iters=1000):\n",
|
||
|
" stats = [self.sample_stat(self.resample()) for i in range(iters)]\n",
|
||
|
" return numpy.array(stats)\n",
|
||
|
" \n",
|
||
|
" def plot_sample_stats(self):\n",
|
||
|
" sample_stats = self.compute_sample_statistics()\n",
|
||
|
" summarize_sampling_distribution(sample_stats)\n",
|
||
|
" pyplot.hist(sample_stats, color=COLOR2)\n",
|
||
|
" pyplot.xlabel('sample statistic')\n",
|
||
|
" pyplot.xlim(self.xlim)"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"Now we can instantiate a `CohenResampler` and plot the sampling distribution."
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "code",
|
||
|
"execution_count": 33,
|
||
|
"metadata": {
|
||
|
"collapsed": false
|
||
|
},
|
||
|
"outputs": [
|
||
|
{
|
||
|
"name": "stdout",
|
||
|
"output_type": "stream",
|
||
|
"text": [
|
||
|
"SE 0.160707033098\n",
|
||
|
"90% CI [ 0.6808076 1.1974013]\n"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"data": {
|
||
|
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXkAAAEPCAYAAACneLThAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAEkZJREFUeJzt3X2QnVVhx/HvZhNUpLCbpoa84VIIxUApwSHSYocLxUyw\nHYI48lbbtE0dZrDA2HaEOFOyW20ERzP0RRjLWyNIbBRMQ61IeLktvkBQEkIIkSQl02waNkh2FTot\nJmH7xzmXvdns7n3u+96z38/MM/e553k7R8nvnj3PvecBSZIkSZIkSZIkSZIkSZIkDfNO4GlgE7AV\n+HwsnwqsB14CHgE6io5ZBmwHtgELG1ZTSVJFjo6vk4GngA8CXwA+HctvAG6O6/MIHwhTgC5gBzCp\nURWVJFXuaOAZ4DRCL316LD8+vofQi7+h6JiHgXMaVUFJ0uGy9LInEXrnfcATwAuEgO+L2/sYCvyZ\nQG/Rsb3ArJrUVJJUtskZ9nkLOBM4DvgucP6w7YNxGc1Y2yRJdZQl5At+BnwbeD+h93488AowA9gX\n99kDzCk6ZnYsO8xJJ500uHPnzkrqK0kT2U7g5HIOKDVcM42hb868C/gQsBFYByyJ5UuAtXF9HXAF\ncBRwIjAX2HBELXfuZHBwMNll+fLlTa+DbbN9ti+9BTipnICH0j35GcAqwofBJOBe4LEY9GuApcAu\n4LK4/9ZYvhU4CFyDwzWS1DSlQv554KwRyvcDF45yzIq4SJKazO+w10Eul2t2Feom5baB7Wt1qbev\nEm1Nuu5gHF+SJGXU1tYGZea2PXlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5\nSUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeY0LnZ1TaWtra8rS2Tm12c2X6sYn\nQ2lcaGtr4/EHtzTl2hdcejr+96hW4JOhJEmHMeQlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5\nSUpYqZCfAzwBvABsAa6L5d1AL7AxLhcVHbMM2A5sAxbWsK6SpDJNLrH9APApYBNwDPBjYD0wCKyM\nS7F5wOXxdRbwKHAK8FbtqixJyqpUT/4VQsADvAG8SAhvGPmntYuB1YQPh13ADmBB1bWUJFWknDH5\nLmA+8FR8fy3wHHAX0BHLZhKGcQp6GfpQkCQ1WNaQPwb4JnA9oUd/O3AicCawF/jSGMc685MkNUmp\nMXmAKcADwH3A2li2r2j7ncBDcX0P4WZtwexYdoTu7u6313O5HLlcLkt9JWnCyOfz5PP5qs5RasrK\nNmAV8BrhBmzBDEIPnlh+NnAV4Ybr/YRx+MKN15M5sjfvVMM6jFMNS6VVMtVwqZ78ucDHgc2Er0oC\nfAa4kjBUMwi8DFwdt20F1sTXg8A1OFwjSU1TKuS/x8jj9t8Z45gVcZEkNZm/eJWkhBnykpQwQ16S\nEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwgx5SUpYloeGSElrb28v\nzNPdUB0dnfT372/4dTWxGPKa8A4dOtSUB5ZccOnpDb+mJh6HayQpYYa8JCXMkJekhBnykpQwQ16S\nEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJmyEtSwkqF/BzgCeAFYAtwXSyf\nCqwHXgIeATqKjlkGbAe2AQtrWVlJUnlKhfwB4FPAacA5wCeB9wE3EkL+FOCx+B5gHnB5fF0E3Jbh\nGpKkOikVwK8Am+L6G8CLwCzgYmBVLF8FXBLXFwOrCR8Ou4AdwILaVVeSVI5yetldwHzgaWA60BfL\n++J7gJlAb9ExvYQPBUlSE2R9/N8xwAPA9cDrw7YNxmU0I27r7u5+ez2Xy5HL5TJWRfXS2TmVgYH+\nZldDUpTP58nn81WdI0vITyEE/L3A2ljWBxxPGM6ZAeyL5XsIN2sLZseyIxSHvMaHgYH+pjzrFHze\nqTSS4R3gnp6ess9RarimDbgL2ArcWlS+DlgS15cwFP7rgCuAo4ATgbnAhrJrJUmqiVI9+XOBjwOb\ngY2xbBlwM7AGWEq4wXpZ3LY1lm8FDgLXMPZQjiSpjkqF/PcYvbd/4SjlK+IiSWoyv8MuSQkz5CUp\nYYa8JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJm\nyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeUlKmCEvSQkz5CUpYYa8\nJCXMkJekhGUJ+buBPuD5orJuoBfYGJeLirYtA7YD24CFNamlJKkiWUL+HmDRsLJBYCUwPy7fieXz\ngMvj6yLgtozXkCTVQZYAfhLoH6G8bYSyxcBq4ACwC9gBLKi0cpKk6lTTy74WeA64C+iIZTMJwzgF\nvcCsKq4hSarC5AqPux3467j+WeBLwNJR9h0cqbC7u/vt9VwuRy6Xq7AqkpSmfD5PPp+v6hyVhvy+\novU7gYfi+h5gTtG22bHsCMUhL0k60vAOcE9PT9nnqHS4ZkbR+kcY+ubNOuAK4CjgRGAusKHCa0iS\nqpSlJ78aOA+YBuwGlgM54EzCUMzLwNVx363Amvh6ELiGUYZrJEn1lyXkrxyh7O4x9l8RF0lSk/kd\ndklKmCEvSQkz5CUpYYa8JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCX\npIQZ8pKUMENekhJmyEtSwip9xqvqqLNzKgMD/c2uhqQEGPLj0MBAP48/uKXh173g0tMbfk1J9eVw\njSQlzJCXpIQZ8pKUMENekhJmyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEZQn5u4E+4PmisqnA\neuAl4BGgo2jbMmA7sA1YWJtqSpIqkSXk7wEWDSu7kRDypwCPxfcA84DL4+si4LaM15Ak1UGWAH4S\nGD4l4sXAqri+Crgkri8GVgMHgF3ADmBB1bWUJFWk0l72dMIQDvF1elyfCfQW7dcLzKrwGpKkKtVi\nquHBuIy1/Qjd3d1vr+dyOXK5XA2qIknpyOfz5PP5qs5Racj3AccDrwAzgH2xfA8wp2i/2bHsCMUh\nL01E7e3ttLW1NeXaHR2d9Pfvb8q1ld3wDnBPT0/Z56g05NcBS4Bb4uvaovL7gZWEYZq5wIYKryEl\n7dChQ015OAz4gJiJJEvIrwbOA6YBu4GbgJuBNcBSwg3Wy+K+W2P5VuAgcA1jD+VIkuooS8hfOUr5\nhaOUr4iLJKnJ/A67JCXMkJekhBnykpQwQ16SEmbIS1LCDHlJSpghL0kJM+QlKWGGvCQlzJCXpIQZ\n8pKUMENekhJmyEtSwgx5SUqYIS9JCTPkJSlhhrwkJcyQl6SEGfKSlDBDXpISZshLUsIMeUlKmCEv\nSQkz5CUpYYa8JCXMkJekhBnykpQwQ16SEja5yuN3AT8HDgEHgAXAVOCfgffG7ZcBA1VeR5JUgWp7\n8oNADphPCHiAG4H1wCnAY/G9JKkJajFc0zbs/cXAqri+CrikBteQJFWgFj35R4EfAZ+IZdOBvrje\nF99Lkpqg2jH5c4G9wK8Qhmi2Dds+GBdJUhNUG/J74+urwLcI4/J9wPHAK8AMYN9IB3Z3d7+9nsvl\nyOVyVVZFktKSz+fJ5/NVnaOakD8aaAdeB94NLAR6gHXAEuCW+Lp2pIOLQ16SdKThHeCenp6yz1FN\nyE8n9N4L5/ka8AhhfH4NsJShr1BKkpqgmpB/GThzhPL9wIVVnFeSVCP+4lWSEmbIS1LCDHlJSpgh\nL0kJM+QlKWGGvCQlzJCXpIQZ8pKUMENekhJW7QRlyersnMrAQH+zqyFJVTHkRzEw0M/jD25pyrUv\nuPT0plxXUnoMeWkCam9vp61t+EPd6q+jo5P+/v0Nv+5EZshLE9ChQ4ea8peqf6U2njdeJSlhhrwk\nJcyQl6SEGfKSlDBDXpI
|
||
|
"text/plain": [
|
||
|
"<matplotlib.figure.Figure at 0x7f294d796fd0>"
|
||
|
]
|
||
|
},
|
||
|
"metadata": {},
|
||
|
"output_type": "display_data"
|
||
|
}
|
||
|
],
|
||
|
"source": [
|
||
|
"resampler = CohenResampler(male_sample, female_sample)\n",
|
||
|
"resampler.plot_sample_stats()"
|
||
|
]
|
||
|
},
|
||
|
{
|
||
|
"cell_type": "markdown",
|
||
|
"metadata": {},
|
||
|
"source": [
|
||
|
"This example demonstrates an advantage of the computational framework over mathematical analysis. Statistics like Cohen's $d$, which is the ratio of other statistics, are relatively difficult to analyze. But with a computational approach, all sample statistics are equally \"easy\".\n",
|
||
|
"\n",
|
||
|
"One note on vocabulary: what I am calling \"resampling\" here is a specific kind of resampling called \"bootstrapping\". Other techniques that are also considering resampling include permutation tests, which we'll see in the next section, and \"jackknife\" resampling. You can read more at <http://en.wikipedia.org/wiki/Resampling_(statistics)>."
|
||
|
]
|
||
|
}
|
||
|
],
|
||
|
"metadata": {
|
||
|
"kernelspec": {
|
||
|
"display_name": "Python 2",
|
||
|
"language": "python",
|
||
|
"name": "python2"
|
||
|
},
|
||
|
"language_info": {
|
||
|
"codemirror_mode": {
|
||
|
"name": "ipython",
|
||
|
"version": 2
|
||
|
},
|
||
|
"file_extension": ".py",
|
||
|
"mimetype": "text/x-python",
|
||
|
"name": "python",
|
||
|
"nbconvert_exporter": "python",
|
||
|
"pygments_lexer": "ipython2",
|
||
|
"version": "2.7.10"
|
||
|
}
|
||
|
},
|
||
|
"nbformat": 4,
|
||
|
"nbformat_minor": 0
|
||
|
}
|