data-science-ipython-notebooks/scikit-learn/scikit-learn-svm.ipynb


2015-04-19 08:26:01 +08:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# scikit-learn-svm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"* Support Vector Machine Classifier\n",
"* Support Vector Machine with Kernels Classifier"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"%matplotlib inline\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"import seaborn\n",
"from sklearn.linear_model import LinearRegression\n",
"from scipy import stats\n",
"import pylab as pl\n",
"\n",
"seaborn.set()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Support Vector Machine Classifier"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Support Vector Machines (SVMs) are a powerful family of supervised learning algorithms used for **classification** and **regression**. For classification, an SVM draws a boundary between groups of points and chooses the boundary that maximizes the margin, the distance from the boundary to the nearest points of each class. As the plot below shows, many different lines can separate the same two groups of points:"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x105f65490>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.datasets import make_blobs\n",
"X, y = make_blobs(n_samples=50, centers=2,\n",
"                  random_state=0, cluster_std=0.60)\n",
"\n",
"xfit = np.linspace(-1, 3.5)\n",
"plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='spring')\n",
"\n",
"# Draw three lines that could each separate the data,\n",
"# shading a band of half-width d around each line\n",
"for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:\n",
"    yfit = m * xfit + b\n",
"    plt.plot(xfit, yfit, '-k')\n",
"    plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none', color='#AAAAAA', alpha=0.4)\n",
"\n",
"plt.xlim(-1, 3.5);"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Fit the model:"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0, degree=3, gamma=0.0,\n",
" kernel='linear', max_iter=-1, probability=False, random_state=None,\n",
" shrinking=True, tol=0.001, verbose=False)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"from sklearn.svm import SVC\n",
"clf = SVC(kernel='linear')\n",
"clf.fit(X, y)"
]
},
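{
"cell_type": "markdown",
"metadata": {},
"source": [
"For a linear kernel, the fitted model exposes the normal vector `w` of the boundary in `coef_` and the offset `b` in `intercept_`, so the margin that the SVM maximizes can be computed as `2 / ||w||`. The cell below is a minimal sketch of that calculation. It rebuilds the same `make_blobs` data under new names (`X_m`, `y_m`, `clf_m`, chosen here for illustration) so that it does not overwrite variables used elsewhere in this notebook:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.datasets import make_blobs\n",
"from sklearn.svm import SVC\n",
"\n",
"# Same synthetic data as above, under illustrative names\n",
"X_m, y_m = make_blobs(n_samples=50, centers=2,\n",
"                      random_state=0, cluster_std=0.60)\n",
"clf_m = SVC(kernel='linear').fit(X_m, y_m)\n",
"\n",
"w = clf_m.coef_[0]        # normal vector of the boundary (linear kernel only)\n",
"b = clf_m.intercept_[0]   # offset of the boundary\n",
"\n",
"# Distance between the two margin lines w.x + b = -1 and w.x + b = +1\n",
"margin_width = 2.0 / np.linalg.norm(w)\n",
"print('w = {}, b = {:.3f}'.format(w, b))\n",
"print('margin width = {:.3f}'.format(margin_width))\n",
"\n",
"# Support vectors sit on or inside the margin, so |w.x + b| is at most\n",
"# about 1 for them (up to the solver tolerance)\n",
"print(np.abs(clf_m.decision_function(clf_m.support_vectors_)))"
]
},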
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Plot the boundary:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def plot_svc_decision_function(clf, ax=None):\n",
"    \"\"\"Plot the decision function for a 2D SVC\"\"\"\n",
"    if ax is None:\n",
"        ax = plt.gca()\n",
"    x = np.linspace(plt.xlim()[0], plt.xlim()[1], 30)\n",
"    y = np.linspace(plt.ylim()[0], plt.ylim()[1], 30)\n",
"    Y, X = np.meshgrid(y, x)\n",
"    P = np.zeros_like(X)\n",
"    for i, xi in enumerate(x):\n",
"        for j, yj in enumerate(y):\n",
"            # decision_function expects a 2D array of shape (n_samples, n_features)\n",
"            P[i, j] = np.ravel(clf.decision_function([[xi, yj]]))[0]\n",
"    # plot the decision boundary (level 0) and the margins (levels -1 and 1)\n",
"    ax.contour(X, Y, P, colors='k',\n",
"               levels=[-1, 0, 1], alpha=0.5,\n",
"               linestyles=['--', '-', '--'])"
]
},
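{
"cell_type": "markdown",
"metadata": {},
"source": [
"The nested loops in `plot_svc_decision_function` call `decision_function` once per grid point. As a possible alternative, the whole grid can be evaluated in a single call. The sketch below assumes a fitted two-feature classifier; the name `plot_svc_decision_function_vectorized` and the `n_grid` parameter are introduced here for illustration and are not part of the original notebook:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"def plot_svc_decision_function_vectorized(clf, ax=None, n_grid=30):\n",
"    \"\"\"Evaluate a 2D classifier's decision function on a grid in one call.\"\"\"\n",
"    if ax is None:\n",
"        ax = plt.gca()\n",
"    xlim = ax.get_xlim()\n",
"    ylim = ax.get_ylim()\n",
"    x = np.linspace(xlim[0], xlim[1], n_grid)\n",
"    y = np.linspace(ylim[0], ylim[1], n_grid)\n",
"    Y, X = np.meshgrid(y, x)\n",
"    # Stack the grid into an (n_grid * n_grid, 2) array of (x, y) points\n",
"    grid = np.column_stack([X.ravel(), Y.ravel()])\n",
"    P = np.asarray(clf.decision_function(grid)).reshape(X.shape)\n",
"    # boundary at level 0, margins at levels -1 and 1\n",
"    ax.contour(X, Y, P, colors='k',\n",
"               levels=[-1, 0, 1], alpha=0.5,\n",
"               linestyles=['--', '-', '--'])\n",
"    return ax"
]
},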
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the following plot, the dashed lines mark the edges of the margin. The points they touch are the *support vectors*, which are stored in the ``support_vectors_`` attribute of the classifier:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x10efed990>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='spring')\n",
"plot_svc_decision_function(clf)\n",
"# circle the support vectors; an explicit edge color keeps the hollow markers visible\n",
"plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],\n",
"            s=200, facecolors='none', edgecolors='k');"
]
},
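{
"cell_type": "markdown",
"metadata": {},
"source": [
"Besides `support_vectors_`, a fitted `SVC` also records which training rows became support vectors (`support_`) and how many there are per class (`n_support_`). The sketch below prints these and then refits on the support vectors alone, to illustrate that the remaining points do not influence the boundary. The names `X_s`, `y_s`, `clf_s` and `clf_sv` are illustrative, and the two sets of coefficients are only expected to agree up to the solver tolerance:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from sklearn.datasets import make_blobs\n",
"from sklearn.svm import SVC\n",
"\n",
"X_s, y_s = make_blobs(n_samples=50, centers=2,\n",
"                      random_state=0, cluster_std=0.60)\n",
"clf_s = SVC(kernel='linear').fit(X_s, y_s)\n",
"\n",
"print('indices of support vectors: {}'.format(clf_s.support_))\n",
"print('support vectors per class:  {}'.format(clf_s.n_support_))\n",
"print(clf_s.support_vectors_)\n",
"\n",
"# Refit using only the support vectors and compare the fitted lines\n",
"clf_sv = SVC(kernel='linear').fit(X_s[clf_s.support_], y_s[clf_s.support_])\n",
"print('all points:      w = {}, b = {}'.format(clf_s.coef_[0], clf_s.intercept_[0]))\n",
"print('support vectors: w = {}, b = {}'.format(clf_sv.coef_[0], clf_sv.intercept_[0]))"
]
},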
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use the ``interact`` function from the notebook widgets to explore how adding or removing points affects the support vectors and the fitted boundary:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x1102e8390>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from ipywidgets import interact\n",
"\n",
"def plot_svm(N=100):\n",
"    X, y = make_blobs(n_samples=200, centers=2,\n",
"                      random_state=0, cluster_std=0.60)\n",
"    X = X[:N]\n",
"    y = y[:N]\n",
"    clf = SVC(kernel='linear')\n",
"    clf.fit(X, y)\n",
"    plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='spring')\n",
"    plt.xlim(-1, 4)\n",
"    plt.ylim(-1, 6)\n",
"    plot_svc_decision_function(clf, plt.gca())\n",
"    plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],\n",
"                s=200, facecolors='none', edgecolors='k')\n",
"\n",
"# N is the only argument of plot_svm; a (min, max) tuple creates an integer slider\n",
"interact(plot_svm, N=(10, 200));"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Support Vector Machine with Kernels Classifier\n",
"\n",
"Kernels are useful when the decision boundary is not linear. A kernel measures the similarity between two points as if they had first been mapped by some transformation into a higher-dimensional feature space. SVMs rely on the *kernel trick*: they only need these pairwise kernel values, so the transformed features never have to be computed explicitly, which keeps kernel calculations efficient. In the example below, a linear boundary cannot separate the two groups of points:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x10fbeddd0>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"from sklearn.datasets import make_circles\n",
"X, y = make_circles(100, factor=.1, noise=.1)\n",
"\n",
"clf = SVC(kernel='linear').fit(X, y)\n",
"\n",
"plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='spring')\n",
"plot_svc_decision_function(clf);"
]
},
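{
"cell_type": "markdown",
"metadata": {},
"source": [
"A transformation that helps here is a **radial basis function** centered on the origin, which maps each point to a value that depends only on its distance from the center:"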
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x1104c6490>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"r = np.exp(-(X[:, 0] ** 2 + X[:, 1] ** 2))\n",
"\n",
"from mpl_toolkits import mplot3d\n",
"\n",
"def plot_3D(elev=30, azim=30):\n",
"    ax = plt.subplot(projection='3d')\n",
"    ax.scatter3D(X[:, 0], X[:, 1], r, c=y, s=50, cmap='spring')\n",
"    ax.view_init(elev=elev, azim=azim)\n",
"    ax.set_xlabel('x')\n",
"    ax.set_ylabel('y')\n",
"    ax.set_zlabel('r')\n",
"\n",
"# keyword names must match the arguments of plot_3D; tuples create sliders\n",
"interact(plot_3D, elev=(-90, 90), azim=(-180, 180));"
]
},
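{
"cell_type": "markdown",
"metadata": {},
"source": [
"The radial feature `r` computed above can also be used directly: appending it as a third column lets an ordinary linear SVM work on the lifted data. The sketch below compares the training accuracy of a linear `SVC` on the two original columns with one fitted on the three-column data. It regenerates the circles under new names (`X_c`, `y_c`, `r_c`), and `random_state=0` is an assumption added here for reproducibility; the original cell does not set it:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.datasets import make_circles\n",
"from sklearn.svm import SVC\n",
"\n",
"# random_state is an assumption, added only to make the sketch repeatable\n",
"X_c, y_c = make_circles(100, factor=.1, noise=.1, random_state=0)\n",
"r_c = np.exp(-(X_c[:, 0] ** 2 + X_c[:, 1] ** 2))\n",
"\n",
"# Linear SVM on the original two features\n",
"lin_2d = SVC(kernel='linear').fit(X_c, y_c)\n",
"\n",
"# Linear SVM on the lifted data (x, y, r)\n",
"X_3d = np.column_stack([X_c, r_c])\n",
"lin_3d = SVC(kernel='linear').fit(X_3d, y_c)\n",
"\n",
"print('2D linear training accuracy: {:.2f}'.format(lin_2d.score(X_c, y_c)))\n",
"print('3D linear training accuracy: {:.2f}'.format(lin_3d.score(X_3d, y_c)))"
]
},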
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In three dimensions, the two classes are clearly separated along the ``r`` axis. Run the SVM with the ``rbf`` kernel:"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {
"collapsed": false
},
"outputs": [
{
"data": {
"text/plain": [
"<matplotlib.figure.Figure at 0x110532b10>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"clf = SVC(kernel='rbf')\n",
"clf.fit(X, y)\n",
"\n",
"plt.scatter(X[:, 0], X[:, 1], c=y, s=50, cmap='spring')\n",
"plot_svc_decision_function(clf)\n",
"# circle the support vectors; an explicit edge color keeps the hollow markers visible\n",
"plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],\n",
"            s=200, facecolors='none', edgecolors='k');"
]
},
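{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see what the `rbf` kernel is doing, the decision function can be rebuilt by hand from the fitted attributes: it is a weighted sum of kernel values between each point and the support vectors, plus an offset. The sketch below assumes a binary problem and fixes `gamma=1.0` explicitly (an illustrative value, not the notebook's default) so that the kernel can be recomputed; the names `X_k`, `y_k` and `clf_k` are also illustrative. The hand-computed values are expected to agree with `decision_function` up to floating-point error:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import numpy as np\n",
"from sklearn.datasets import make_circles\n",
"from sklearn.metrics.pairwise import rbf_kernel\n",
"from sklearn.svm import SVC\n",
"\n",
"X_k, y_k = make_circles(100, factor=.1, noise=.1, random_state=0)\n",
"gamma = 1.0  # set explicitly so the kernel can be recomputed by hand\n",
"clf_k = SVC(kernel='rbf', gamma=gamma).fit(X_k, y_k)\n",
"\n",
"# K[i, j] = exp(-gamma * ||sv_i - x_j||^2)\n",
"K = rbf_kernel(clf_k.support_vectors_, X_k, gamma=gamma)\n",
"\n",
"# f(x_j) = sum_i dual_coef_i * K[i, j] + intercept\n",
"manual = np.dot(clf_k.dual_coef_[0], K) + clf_k.intercept_[0]\n",
"\n",
"difference = np.abs(manual - np.ravel(clf_k.decision_function(X_k)))\n",
"print('largest difference from decision_function: {}'.format(difference.max()))"
]
},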
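{
"cell_type": "markdown",
"metadata": {},
"source": [
"SVM additional notes:\n",
"* When using an SVM you need to choose suitable values for hyperparameters such as ``C`` (the penalty on margin violations) and ``gamma`` (the width of the RBF kernel). Model validation, for example a cross-validated search over candidate values, can help find good settings.\n",
"* Training a kernel ``SVC`` scales poorly with the number of samples, roughly between O(n^2) and O(n^3), so it becomes slow on large data sets. ``LinearSVC`` scales much better but only fits linear boundaries. For large data sets, try reducing or transforming the data first (for example with an explicit approximation of the RBF kernel) and then fit ``LinearSVC`` on the transformed features."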
]
}
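,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two notes above can be sketched in code. The first part runs a cross-validated grid search over `C` and `gamma`; the grid values are illustrative, not recommendations. The second part is one possible reading of the scalability note: it approximates the RBF kernel with scikit-learn's `Nystroem` feature map and fits `LinearSVC` on the result. The data set, the variable names, and settings such as `n_components=50` are assumptions made for this example:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"from sklearn.datasets import make_circles\n",
"from sklearn.kernel_approximation import Nystroem\n",
"from sklearn.model_selection import GridSearchCV\n",
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.svm import SVC, LinearSVC\n",
"\n",
"X_g, y_g = make_circles(200, factor=.1, noise=.1, random_state=0)\n",
"\n",
"# 1) Tune C and gamma with 5-fold cross-validation (illustrative grid)\n",
"param_grid = {'C': [0.1, 1, 10, 100], 'gamma': [0.01, 0.1, 1, 10]}\n",
"search = GridSearchCV(SVC(kernel='rbf'), param_grid, cv=5)\n",
"search.fit(X_g, y_g)\n",
"print('best parameters: {}'.format(search.best_params_))\n",
"print('best CV accuracy: {:.3f}'.format(search.best_score_))\n",
"\n",
"# 2) Approximate the RBF kernel with an explicit feature map,\n",
"#    then fit a linear model on the transformed features\n",
"approx_svm = make_pipeline(\n",
"    Nystroem(kernel='rbf', gamma=1.0, n_components=50, random_state=0),\n",
"    LinearSVC(),\n",
")\n",
"approx_svm.fit(X_g, y_g)\n",
"print('approximate-kernel training accuracy: {:.3f}'.format(approx_svm.score(X_g, y_g)))"
]
}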
],
"metadata": {
"kernelspec": {
"display_name": "Python 2",
"language": "python",
"name": "python2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 0
}