Donne Martin
|
ad54e0ae70
|
Added Data Munging Summary section which contains all the data cleaning and transformation steps described in the notebook.
|
2015-03-20 11:27:06 -04:00 |
|
Donne Martin
|
387922662f
|
Replaced nested for loop that calculated the median age based on sex and passenger class with groupby + apply instead.
|
2015-03-20 11:21:27 -04:00 |
|
Donne Martin
|
c9ca38d211
|
Only attempt to fill missing ports of embarkation if there are missing values. Reworked the AgeFill process. Dropped SibSp and Parch columns as they are part of FamilySize.
|
2015-03-18 19:25:58 -04:00 |
|
Donne Martin
|
d58e6423b3
|
Updated Notebook TOC, dropped PassengerId as it won't be used in the machine learning algorithms.
|
2015-03-18 14:45:18 -04:00 |
|
Donne Martin
|
000fea0862
|
Added section Final Data Preparation for Machine Learning, which drops unused columns and converts the DataFrame to a numpy array.
|
2015-03-18 14:32:28 -04:00 |
|
Donne Martin
|
2662e2bb03
|
Added feature engineering description, a description of the family size histogram, and a brief discussion of a potential feature related to the passenger's name.
|
2015-03-18 14:21:25 -04:00 |
|
Donne Martin
|
81660c59d1
|
Reordered README sections.
|
2015-03-17 16:21:33 -04:00 |
|
Donne Martin
|
a93f599a9b
|
Cleaned up code, charts, and descriptions in various sections.
|
2015-03-17 16:16:42 -04:00 |
|
Donne Martin
|
7d4c5532a8
|
Rework the age analysis, adding more details and graphs.
|
2015-03-17 15:44:50 -04:00 |
|
Donne Martin
|
ce3ef575bd
|
Added additional plots to further explore the port of embarkation feature.
|
2015-03-17 14:53:22 -04:00 |
|
Donne Martin
|
011313d2e1
|
Added snippets of feature engineering: creating a new feature family size by combining number of parents and siblings.
|
2015-03-17 14:05:07 -04:00 |
|
Donne Martin
|
44eeaf447d
|
Cleaned up some sections, added plots of survival rate by Sex and Pclass.
|
2015-03-17 14:03:58 -04:00 |
|
Donne Martin
|
0c56902027
|
Add plots for features we will analyze in the exploratory data analysis section.
|
2015-03-17 08:51:30 -04:00 |
|
Donne Martin
|
10d63efb4a
|
Added Spark streaming snippets.
|
2015-03-16 16:01:51 -04:00 |
|
Donne Martin
|
8364d476b3
|
Updated variable descriptions section to be a markdown cell with a pre tag.
|
2015-03-15 08:36:23 -04:00 |
|
Donne Martin
|
8d72ba4cd4
|
Cleaned up various portions of the notebook.
|
2015-03-15 08:33:36 -04:00 |
|
Donne Martin
|
66a98b61d2
|
Added title and axes labels for Age charts.
|
2015-03-15 08:09:56 -04:00 |
|
Donne Martin
|
02b8c05fe9
|
Fixed range of embarked histogram, as it was not showing the NaN value. Added title and axes labels for passenger gender charts.
|
2015-03-15 06:32:55 -04:00 |
|
Donne Martin
|
2b8bf79cfa
|
Added title and axes labels for passenger gender charts.
|
2015-03-15 06:27:48 -04:00 |
|
Donne Martin
|
1b887f75ca
|
Added title and axes labels for passenger classes charts.
|
2015-03-15 06:24:07 -04:00 |
|
Donne Martin
|
e9d533232b
|
Added competition site URL. Fixed Description header.
|
2015-03-15 06:14:27 -04:00 |
|
Donne Martin
|
3b8eb8f823
|
Added snippets to analyze the Titanic passenger Age feature.
|
2015-03-15 06:11:01 -04:00 |
|
Donne Martin
|
b0f14105ae
|
Added snippets to analyze the Titanic Embarked feature.
|
2015-03-15 04:07:05 -04:00 |
|
Donne Martin
|
0394852ba5
|
Added snippets to analyze the Titanic Sex (Gender) feature.
|
2015-03-14 20:03:48 -04:00 |
|
Donne Martin
|
b2c7f4f850
|
Added snippets to analyze the Titanic Passenger Class feature.
|
2015-03-14 20:01:55 -04:00 |
|
Donne Martin
|
8babd3a1cf
|
Added Kaggle section to README.
|
2015-03-14 19:57:55 -04:00 |
|
Donne Martin
|
4ad409aa63
|
Added snippets to start exploring the Titanic data.
|
2015-03-14 19:56:28 -04:00 |
|
Donne Martin
|
bcfae90101
|
Added preliminary Kaggle Titanic survivor analysis containing the competition description, evaluation, data set, and snippet to read in the data to pandas.
|
2015-03-14 19:53:56 -04:00 |
|
Donne Martin
|
1fbbd20c68
|
Added Kaggle Titanic data files.
|
2015-03-14 19:49:07 -04:00 |
|
Donne Martin
|
8196b4bdfb
|
Updated repo description.
|
2015-03-14 09:22:01 -04:00 |
|
Donne Martin
|
ce605a6fdf
|
Added snippets for configuring Spark applications.
|
2015-03-13 08:25:50 -04:00 |
|
Donne Martin
|
53789e0e3e
|
Prefixed Spark commands with ! so they can be executed within IPython Notebook.
|
2015-03-13 08:09:01 -04:00 |
|
Donne Martin
|
87b017fd37
|
Prefixed HDFS commands with ! so they can be executed within IPython Notebook.
|
2015-03-13 08:07:17 -04:00 |
|
Donne Martin
|
8c251e43cd
|
Prefixed various misc commands with ! so they can be executed within IPython Notebook.
|
2015-03-13 08:05:56 -04:00 |
|
Donne Martin
|
8c4541ae33
|
Added git reset and pull commands.
|
2015-03-13 08:03:01 -04:00 |
|
Donne Martin
|
a9ea93b872
|
Prefixed Linux commands with ! so they can be executed within IPython Notebook.
|
2015-03-13 07:59:28 -04:00 |
|
Donne Martin
|
23d3866b8e
|
Prefixed AWS commands with ! so they can be executed within IPython Notebook.
|
2015-03-13 07:57:12 -04:00 |
|
Donne Martin
|
1c4e2157a6
|
Added snippets to demonstrate writing and running a Spark app.
|
2015-03-12 06:25:40 -04:00 |
|
Donne Martin
|
9fd62a73ae
|
Added sed command to delete matching lines in place. Added command to display all matching running processes with full formatting. Tweaked formatting of vim section regarding vimtutor and vim syntax coloring.
|
2015-03-11 20:30:23 -04:00 |
|
Donne Martin
|
31c4f3299a
|
Updated AWS index.
|
2015-03-10 17:00:44 -04:00 |
|
Donne Martin
|
5600ab0377
|
Added Lambda commands.
|
2015-03-10 17:00:08 -04:00 |
|
Donne Martin
|
cd84ffb2f0
|
Added Kinesis commands.
|
2015-03-09 16:10:54 -04:00 |
|
Donne Martin
|
1815c9a122
|
Added snippets to checkpoint RDDs in Spark.
|
2015-03-08 05:55:45 -04:00 |
|
Donne Martin
|
0481497848
|
Added snippets to cache RDDs in Spark.
|
2015-03-08 05:55:05 -04:00 |
|
Donne Martin
|
404676a1f7
|
Added discussion and snippet for working with partitions in Spark.
|
2015-03-07 09:07:18 -05:00 |
|
Donne Martin
|
bef3dfc9fc
|
Added discussion on viewing the Spark application UI.
|
2015-03-06 07:53:17 -05:00 |
|
Donne Martin
|
72cf3af7f1
|
Added snippets to run Spark on a cluster.
|
2015-03-05 07:26:54 -05:00 |
|
Donne Martin
|
a5a3da5b28
|
Added Spark pair RDDs snippets.
|
2015-03-04 08:28:07 -05:00 |
|
Donne Martin
|
e8b481f480
|
Added snippets for basic RDD operations.
|
2015-03-03 10:36:30 -05:00 |
|
Donne Martin
|
6c7e7b5239
|
Added Spark IPython Notebook, which currently contains snippets for starting the pyspark shell and viewing the Spark context.
|
2015-03-03 10:32:59 -05:00 |
|