{
"metadata": {
"name": "",
"signature": "sha256:c535c4eb558966d14a178dfbd21918ecadc105f12b3c5f939ebf807d5afd1e6f"
},
"nbformat": 3,
"nbformat_minor": 0,
"worksheets": [
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Files\n",
"\n",
"* Open a File for Reading\n",
"* Read a File\n",
"* Write to a New File\n",
"* Read and Write UTF-8"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Open a File for Reading\n",
"\n",
"Open a file in read-only mode:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"path = 'type_util.py'\n",
"f = open(path)"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 2
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read a File\n",
"\n",
"Iterate over the open file object to read it one line at a time, as you would iterate over a list. `rstrip` removes the trailing EOL marker (along with any other trailing whitespace) from each line."
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"for line in f:\n",
"    print(line.rstrip())\n",
"f.close()  # release the file handle once reading is done"
],
"language": "python",
"metadata": {},
"outputs": [
{
"output_type": "stream",
"stream": "stdout",
"text": [
"class TypeUtil:\n",
"\n",
"    @classmethod\n",
"    def is_iterable(cls, obj):\n",
"        \"\"\"Determines if obj is iterable.\n",
"\n",
"        Useful when writing functions that can accept multiple types of\n",
"        input (list, tuple, ndarray, iterator). Pairs well with\n",
"        convert_to_list.\n",
"        \"\"\"\n",
"        try:\n",
"            iter(obj)\n",
"            return True\n",
"        except TypeError:\n",
"            return False\n",
"\n",
"    @classmethod\n",
"    def convert_to_list(cls, obj):\n",
"        \"\"\"Converts obj to a list if it is not a list and it is iterable,\n",
"        else returns the original obj.\n",
"        \"\"\"\n",
"        if not isinstance(obj, list) and cls.is_iterable(obj):\n",
"            obj = list(obj)\n",
"        return obj\n"
]
}
],
"prompt_number": 3
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Write to a New File\n",
"\n",
"Create a new file (overwriting any existing file with the same name), write text to it, then close the file:"
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"path = 'hello_world.txt'\n",
"f = open(path, 'w')\n",
"f.write('hello world!')\n",
"f.close()"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 4
},
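{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Read and Write UTF-8\n",
"\n",
"Read a UTF-8 encoded file line by line and append each line, followed by a newline, to a second file:"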
]
},
{
"cell_type": "code",
"collapsed": false,
"input": [
"import codecs\n",
"with codecs.open(\"hello_world_new.txt\", \"a\", \"utf-8\") as new_file:\n",
"    with codecs.open(\"hello_world.txt\", \"r\", \"utf-8\") as old_file:\n",
"        for line in old_file:\n",
"            new_file.write(line + '\\n')"
],
"language": "python",
"metadata": {},
"outputs": [],
"prompt_number": 5
}
],
"metadata": {}
}
]
}