Donne Martin
|
1d9797b5c5
|
Added references section.
|
2015-04-09 11:54:01 -04:00 |
|
Donne Martin
|
c4f4a8aae3
|
Added commands to configure a remote for a fork and to sync a fork. Deleted duplicate git pull origin master call.
|
2015-04-08 08:04:09 -04:00 |
|
Donne Martin
|
360379c72e
|
Added matplotlib kernel density estimation plots.
|
2015-04-07 15:23:53 -04:00 |
|
Donne Martin
|
2a672bf6b6
|
Added matplotlib IPython Notebook to README. Tweaked section ordering. Changed Notebook to Notebook(s).
|
2015-04-06 09:31:03 -04:00 |
|
Donne Martin
|
0069e10997
|
Added snippets for scatter plots, subplots.
|
2015-04-06 08:55:32 -04:00 |
|
Donne Martin
|
bf27e997e4
|
Added snippets for normalized plots.
|
2015-04-06 08:54:12 -04:00 |
|
Donne Martin
|
9496652892
|
Added snippets for bar plots, histograms, and using subplot2grid.
|
2015-04-06 08:52:43 -04:00 |
|
Donne Martin
|
21b19dd12f
|
Added matplotlib IPython Notebook. Contains code to clean the data that will be plotted in the notebook and to set global params.
|
2015-04-06 08:51:28 -04:00 |
|
Donne Martin
|
557b76f267
|
Updated linux section with list of commands.
|
2015-04-05 08:24:40 -04:00 |
|
Donne Martin
|
818cf705c4
|
Added unit test for sample mrjob mapper and reducer to parse logs on s3.
|
2015-04-05 08:14:53 -04:00 |
|
Donne Martin
|
8d1d56fc22
|
Revert README changes b41ba644be and 3d2d550852 regarding whitespace tweaks and moving the section images below the text headers. Images now appear before section text headers.
|
2015-04-04 09:51:50 -04:00 |
|
Donne Martin
|
b41ba644be
|
Tweaked whitespace.
|
2015-04-04 09:31:24 -04:00 |
|
Donne Martin
|
3d2d550852
|
Moved section images to below section text headers.
|
2015-04-04 09:27:38 -04:00 |
|
Donne Martin
|
6e2c1fd5d2
|
Added images for each section. Removed outdated References section--will update in the future.
|
2015-04-04 08:53:04 -04:00 |
|
Donne Martin
|
bac10f9f61
|
Added all images shown in README.
|
2015-04-04 08:50:37 -04:00 |
|
Donne Martin
|
21facbb91f
|
Add new repo cover image.
|
2015-04-04 08:36:56 -04:00 |
|
Donne Martin
|
a12fc148ad
|
Converted pandas and commands sections to use tables for legibility. Fixed a typo in datetime description.
|
2015-04-04 08:17:13 -04:00 |
|
Donne Martin
|
b23feee87d
|
Tweaked repo description, reordered spark and aws sections, added tables to python-core section.
|
2015-04-04 07:37:20 -04:00 |
|
Donne Martin
|
8063018571
|
Converted notebook links and descriptions to tables for readability.
|
2015-04-04 07:23:31 -04:00 |
|
Donne Martin
|
2cbff15b57
|
Added more whitespace to try to improve legibility.
|
2015-04-03 06:38:26 -04:00 |
|
Donne Martin
|
eb7bba9377
|
Added more detailed descriptions to each notebook in the categories kaggle, aws, and spark.
|
2015-04-03 06:34:32 -04:00 |
|
Donne Martin
|
1403cf4134
|
Added sample mrjob mapper and reducer to parse logs on s3 following the standard bucket logging format.
|
2015-04-03 06:06:46 -04:00 |
|
Donne Martin
|
d4ab154643
|
Transformed Embarked to dummy variables instead of integer representations. The latter implies ordering, which isn't the case with Embarked.
|
2015-04-02 23:29:33 -04:00 |
|
Donne Martin
|
011747c17a
|
Added Spark accumulators snippets.
|
2015-03-31 21:41:21 -04:00 |
|
Donne Martin
|
7195c5bc82
|
Added Spark broadcast variables snippets.
|
2015-03-30 19:01:07 -04:00 |
|
Donne Martin
|
b3fb4ae219
|
Added Spark streaming with states snippets.
|
2015-03-29 17:53:52 -04:00 |
|
Donne Martin
|
f316416d88
|
Fix genders_mapping being recalculated.
|
2015-03-28 13:47:21 -04:00 |
|
Donne Martin
|
89e04b3a89
|
Renamed data munging to data wrangling, fixed spacing between variables passed to confusion_matrix.
|
2015-03-27 06:36:23 -04:00 |
|
Donne Martin
|
53e0ae41c5
|
Reduced the confusion matrix image, which was too wide and forced a horizontal scroll bar on nbviewer.
|
2015-03-25 07:56:39 -04:00 |
|
Donne Martin
|
5e38505cd7
|
Added random forest classification report.
|
2015-03-23 07:24:12 -04:00 |
|
Donne Martin
|
4c7a3a52a1
|
Added confusion matrix and accuracy metrics to evaluate the model's performance.
|
2015-03-22 12:18:51 -04:00 |
|
Donne Martin
|
20ddcd2a01
|
Added random forest score on training data. Code cleanup.
|
2015-03-21 10:46:01 -04:00 |
|
Donne Martin
|
bffbb61bc3
|
Renamed df to df_train to be more explicit about the DataFrame's purpose.
|
2015-03-20 14:42:05 -04:00 |
|
Donne Martin
|
a6153e5020
|
Tweaked slicing indices to use single : instead of ::, which I find more readable. Tweaked Feature: Sex headers.
|
2015-03-20 12:54:20 -04:00 |
|
Donne Martin
|
5bc4d9bef4
|
Formatted intro section, added Titanic image.
|
2015-03-20 11:42:43 -04:00 |
|
Donne Martin
|
01d65fd232
|
Added Random Forest: Prepare for Kaggle Submission section.
|
2015-03-20 11:36:41 -04:00 |
|
Donne Martin
|
055cd52cd3
|
Added Random Forest Predicting section.
|
2015-03-20 11:35:25 -04:00 |
|
Donne Martin
|
3fcbc8364f
|
Added Random Forest training section.
|
2015-03-20 11:33:41 -04:00 |
|
Donne Martin
|
ad54e0ae70
|
Added Data Munging Summary section which contains all the data cleaning and transformation steps described in the notebook.
|
2015-03-20 11:27:06 -04:00 |
|
Donne Martin
|
387922662f
|
Replaced nested for loop that calculated the median age based on sex and passenger class with groupby + apply instead.
|
2015-03-20 11:21:27 -04:00 |
|
Donne Martin
|
c9ca38d211
|
Only attempt to fill missing ports of embarkation if there are missing values. Reworked the AgeFill process. Dropped SibSp and Parch columns as they are part of FamilySize.
|
2015-03-18 19:25:58 -04:00 |
|
Donne Martin
|
d58e6423b3
|
Updated Notebook TOC, dropped PassengerId as it won't be used in the machine learning algorithms.
|
2015-03-18 14:45:18 -04:00 |
|
Donne Martin
|
000fea0862
|
Added section Final Data Preparation for Machine Learning, which drops unused columns and converts the DataFrame to a numpy array.
|
2015-03-18 14:32:28 -04:00 |
|
Donne Martin
|
2662e2bb03
|
Added feature engineering description, a description on the family size histogram, and a brief discussion on a potential feature related to the passenger's name.
|
2015-03-18 14:21:25 -04:00 |
|
Donne Martin
|
81660c59d1
|
Reordered README sections.
|
2015-03-17 16:21:33 -04:00 |
|
Donne Martin
|
a93f599a9b
|
Cleaned up code, charts, and descriptions in various sections.
|
2015-03-17 16:16:42 -04:00 |
|
Donne Martin
|
7d4c5532a8
|
Rework the age analysis, adding more details and graphs.
|
2015-03-17 15:44:50 -04:00 |
|
Donne Martin
|
ce3ef575bd
|
Added additional plots to further explore the port of embarkation feature.
|
2015-03-17 14:53:22 -04:00 |
|
Donne Martin
|
011313d2e1
|
Added feature engineering snippets: creating a new feature, family size, by combining the number of parents and siblings.
|
2015-03-17 14:05:07 -04:00 |
|
Donne Martin
|
44eeaf447d
|
Cleaned up some sections, added plots of survival rate by Sex and Pclass.
|
2015-03-17 14:03:58 -04:00 |
|