Added a feature engineering description, a description of the family size histogram, and a brief discussion of a potential feature related to the passenger's name.

pull/57/head
Donne Martin 2015-03-18 14:21:25 -04:00
parent 81660c59d1
commit 2662e2bb03
1 changed files with 13 additions and 3 deletions

@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
"signature": "sha256:c536f631f40b2ee6ad2ff384cb9076172d442d8b19019094b1af1a8657120e10"
"signature": "sha256:8faa925c9373212bcde3580896d60777b13a18934069bb0bec50503c01d983b0"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -2333,6 +2333,8 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Feature engineering involves creating new features or modifying existing features that might be advantageous to a machine learning algorithm.\n",
"\n",
"Define a new feature FamilySize that is the sum of Parch (number of parents or children on board) and SibSp (number of siblings or spouses):"
]
},
@@ -2547,14 +2549,22 @@
],
"prompt_number": 35
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Based on the histograms, it is not immediately obvious what impact FamilySize has on survival. The machine learning algorithms might still benefit from this feature.\n",
"\n",
"Additional features we might want to engineer could come from the Name column. For example, honorary or pedestrian titles might give clues and better predictive power for a male's survival."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 35
"outputs": []
}
],
"metadata": {}