Added anchors for each AWS command line topic. Updated README with AWS topics.

Donne Martin 2015-03-02 10:32:03 -05:00
parent 39db0b5057
commit 47211cb729
2 changed files with 21 additions and 9 deletions

README.md

@@ -1,7 +1,7 @@
![alt text](http://i2.wp.com/donnemartin.com/wp-content/uploads/2015/02/ipython_notebook_cover2-e1425213196820.png)
# ipython-data-notebooks
-IPython Notebooks geared towards Python data analysis (core Python, NumPy, pandas, matplotlib, SciPy, scikit-learn, aws, spark, command line).
+IPython Notebooks geared towards Python data analysis (core Python, NumPy, pandas, matplotlib, SciPy, scikit-learn, AWS, Spark, command line).
## python-core
@@ -26,6 +26,11 @@ IPython Notebooks demonstrating pandas functionality.
IPython Notebooks demonstrating Amazon Web Services functionality.
* [aws](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb)
+* [s3cmd](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3cmd)
+* [s3-parallel-put](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3-parallel-put)
+* [S3DistCp](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#s3distcp)
+* [mrjob](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#mrjob)
+* [Redshift](http://nbviewer.ipython.org/github/donnemartin/ipython-data-notebooks/blob/master/aws/aws.ipynb#redshift)
## spark

aws/aws.ipynb

@@ -1,7 +1,7 @@
{
"metadata": {
"name": "",
-"signature": "sha256:bcaf53e50215c57cb4a91ea44895e0e87bc885288dd093cddf3777133df410f1"
+"signature": "sha256:22b98dd51ff479ba7e10ae0a2f0c4e85b11642fe49dfacfa45c7c4137881b3b7"
},
"nbformat": 3,
"nbformat_minor": 0,
@@ -12,14 +12,21 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"# AWS Command Lines"
+"# AWS Command Lines\n",
+"\n",
+"* SSH to EC2\n",
+"* S3cmd\n",
+"* s3-parallel-put\n",
+"* S3DistCp\n",
+"* mrjob\n",
+"* Redshift"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
-"## Connect to EC2"
+"<h2 id=\"ssh-to-ec2\">SSH to EC2</h2>"
]
},
{
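The "SSH to EC2" cell above boils down to a single command; a minimal sketch follows, where the key file name, login user, and public DNS name are placeholders (the default user varies by AMI: `ec2-user` on Amazon Linux, `ubuntu` on Ubuntu images):

```shell
# Lock down the key pair downloaded when the instance was launched;
# ssh rejects private keys that are readable by other users.
chmod 400 ~/.ssh/my-key-pair.pem

# Connect using the instance's public DNS name (placeholder below).
ssh -i ~/.ssh/my-key-pair.pem ec2-user@ec2-203-0-113-25.compute-1.amazonaws.com
```

These commands require a running EC2 instance and a real key pair, so they are illustrative rather than directly runnable here.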
@@ -60,7 +67,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## S3cmd\n",
+"<h2 id=\"s3cmd\">S3cmd</h2>\n",
"\n",
"Before I discovered [S3cmd](http://s3tools.org/s3cmd), I had been using the [S3 console](http://aws.amazon.com/console/) to do basic operations and [boto](https://boto.readthedocs.org/en/latest/) to do more of the heavy lifting. However, sometimes I just want to hack away at a command line to do my work.\n",
"\n",
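The paragraph above motivates S3cmd for command-line S3 work; its everyday subcommands look like the following sketch, where the bucket and file names are placeholders and `s3cmd --configure` must be run once with your AWS keys:

```shell
s3cmd --configure                                   # one-time: store AWS access keys
s3cmd mb s3://my-example-bucket                     # make a bucket
s3cmd put local_file.txt s3://my-example-bucket/    # upload an object
s3cmd ls s3://my-example-bucket/                    # list bucket contents
s3cmd get s3://my-example-bucket/local_file.txt copy.txt   # download
s3cmd del s3://my-example-bucket/local_file.txt     # delete an object
```

Each command needs valid AWS credentials and network access, so the snippet is a reference sketch rather than a runnable test.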
@@ -167,7 +174,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## s3-parallel-put\n",
+"<h2 id=\"s3-parallel-put\">s3-parallel-put</h2>\n",
"\n",
"[s3-parallel-put](https://github.com/twpayne/s3-parallel-put.git) is a great tool for uploading multiple files to S3 in parallel."
]
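The parallel-upload idea above can be made concrete with a hedged sketch of an s3-parallel-put invocation; the bucket, prefix, and local path are placeholders, and the option names follow the project's README:

```shell
# Upload a directory tree to S3 using 8 worker processes.
# --put=stupid skips the per-key existence check, the fastest mode when
# the destination prefix is known to be empty.
# --dry-run prints what would be uploaded without transferring anything.
s3-parallel-put --bucket=my-example-bucket --prefix=data/ \
    --put=stupid --processes=8 --dry-run ./local-data
```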
@@ -263,7 +270,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## S3DistCp\n",
+"<h2 id=\"s3distcp\">S3DistCp</h2>\n",
"\n",
"[S3DistCp](http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/UsingEMR_s3distcp.html) is an extension of DistCp that is optimized to work with Amazon S3. S3DistCp is useful for combining smaller files and aggregating them together, taking in a pattern and a target file to combine smaller input files into larger ones. S3DistCp can also be used to transfer large volumes of data from S3 to your Hadoop cluster."
]
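The pattern-based aggregation described above corresponds to an invocation like the following sketch, run on an EMR cluster; the jar path matches older EMR AMIs, and the bucket, regex, and target size are placeholders:

```shell
# Aggregate small daily log files into ~128 MB files, grouped by the
# year-month captured from each file name by the --groupBy regex.
hadoop jar /home/hadoop/lib/emr-s3distcp-1.0.jar \
    --src s3://my-example-bucket/logs/ \
    --dest hdfs:///output/logs/ \
    --groupBy '.*([0-9]{4}-[0-9]{2}).*' \
    --targetSize 128
```

`--targetSize` is in mebibytes; files matching the same capture group are concatenated until the target size is reached.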
@@ -338,7 +345,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## mrjob"
+"<h2 id=\"mrjob\">mrjob</h2>"
]
},
{
@@ -379,7 +386,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
-"## Redshift"
+"<h2 id=\"redshift\">Redshift</h2>"
]
},
{