Added more detailed descriptions to each notebook in the categories kaggle, aws, and spark.

Donne Martin 2015-04-03 06:34:32 -04:00
parent 1403cf4134
commit eb7bba9377


@@ -7,27 +7,27 @@ Continually updated IPython Data Science Notebooks geared towards processing big
IPython Notebooks used in [kaggle](https://www.kaggle.com/) competitions.
* [titanic](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/kaggle/titanic.ipynb): Predicts survival on the Titanic. Demonstrates data cleaning, exploratory data analysis, and machine learning.
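
The workflow that notebook walks through can be summarized in a few lines. The sketch below is not the notebook's actual code, just a minimal illustration with scikit-learn; the column names match Kaggle's Titanic `train.csv`, and the file path and model choice are assumptions:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Load Kaggle's Titanic training data (file path is an assumption).
df = pd.read_csv('train.csv')

# Data cleaning: fill missing ages, encode 'Sex' numerically.
df['Age'] = df['Age'].fillna(df['Age'].median())
df['Sex'] = df['Sex'].map({'male': 0, 'female': 1})

# Quick exploratory check: survival rate by passenger class.
print(df.groupby('Pclass')['Survived'].mean())

# Machine learning: fit a model and score it on held-out data.
features = ['Pclass', 'Sex', 'Age', 'Fare']
X_train, X_test, y_train, y_test = train_test_split(
    df[features], df['Survived'], test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print('accuracy:', clf.score(X_test, y_test))
```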
## aws
IPython Notebooks demonstrating Amazon Web Services functionality.
* [aws commands index](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb)
* [s3cmd](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3cmd): Interacts with S3 through the command line.
* [s3-parallel-put](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3-parallel-put): Uploads multiple files to S3 in parallel.
* [s3distcp](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3distcp): Combines smaller files into larger ones by taking in a pattern and a target file; S3DistCp can also transfer large volumes of data from S3 to your Hadoop cluster.
* [mrjob](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#mrjob): Supports MapReduce jobs in Python 2.5+ and runs them locally or on Hadoop clusters (a word-count sketch follows this list).
* [redshift](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#redshift): Acts as a fast data warehouse built on massively parallel processing (MPP) technology.
* [kinesis](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#kinesis): Streams data in real time with the ability to process thousands of data streams per second.
* [lambda](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#lambda): Runs code in response to events, automatically managing compute resources.
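
To make the mrjob entry above concrete, here is a minimal sketch of the library's canonical word-count job (assumes `pip install mrjob`; the class name is our own):

```python
from mrjob.job import MRJob


class MRWordCount(MRJob):
    """Counts how often each word appears in the input."""

    def mapper(self, _, line):
        # Emit (word, 1) for every word in the line.
        for word in line.split():
            yield word.lower(), 1

    def reducer(self, word, counts):
        # Sum the per-word counts from all mappers.
        yield word, sum(counts)


if __name__ == '__main__':
    MRWordCount.run()
```

Saved as `word_count.py`, it runs locally with `python word_count.py input.txt`, and the same job can target Amazon EMR by adding `-r emr`.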
## spark
IPython Notebooks demonstrating Spark and HDFS functionality.
* [spark](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/spark/spark.ipynb): Open-source in-memory cluster computing framework; up to 100 times faster than Hadoop MapReduce for certain applications, and well suited for machine learning algorithms (see the sketch after this list).
* [hdfs](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/spark/hdfs.ipynb): Reliably stores very large files across machines in a large cluster.
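
As a rough taste of the Spark API referenced above (a sketch, assuming a local PySpark installation; the input path and the `local[2]` master are assumptions):

```python
from pyspark import SparkContext

# Run locally with 2 worker threads; 'local[2]' is an assumption.
sc = SparkContext('local[2]', 'wordcount')

# Read lines of text; swap in an hdfs:// path to read from HDFS instead.
lines = sc.textFile('input.txt')

# Classic word count: split, pair each word with 1, sum by key.
counts = (lines.flatMap(lambda line: line.split())
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))

# Pull a small sample of results back to the driver.
for word, count in counts.take(10):
    print(word, count)

sc.stop()
```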
## python-core