interactive-coding-challenges/arrays-strings/unique_chars.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<small><i>This notebook was prepared by [Donne Martin](http://donnemartin.com). Source and license info is on [GitHub](https://bit.ly/code-notes).</i></small>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Problem: Implement an algorithm to determine if a string has all unique characters\n",
    "\n",
    "* [Clarifying Questions](#Clarifying-Questions)\n",
    "* [Test Cases](#Test-Cases)\n",
    "* [Algorithm 1: Sets and Length Comparison](#Algorithm-1:-Sets-and-Length-Comparison)\n",
    "* [Code: Sets and Length Comparison](#Code:-Sets-and-Length-Comparison)\n",
    "* [Algorithm 2: Hash Map Lookup](#Algorithm-2:-Hash-Map-Lookup)\n",
    "* [Code: Hash Map Lookup](#Code:-Hash-Map-Lookup)\n",
    "* [Algorithm 3: In-Place](#Algorithm-3:-In-Place)\n",
    "* [Code: In-Place](#Code:-In-Place)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Clarifying Questions\n",
    "* Is the string in ASCII (extended?) or Unicode?  \n",
    "    * ASCII extended, which is 256 characters\n",
    "* Can you use additional data structures?  \n",
    "    * Yes"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Test Cases\n",
    "\n",
    "* '' -> True\n",
    "* 'foo' -> False\n",
    "* 'bar' -> True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Algorithm 1: Sets and Length Comparison\n",
    "\n",
    "A set is an unordered collection of unique elements.  \n",
    "\n",
    "* If the length of the set(string) equals the length of the string\n",
    "    * Return True\n",
    "* Else\n",
    "    * Return False\n",
    "    \n",
    "Complexity:\n",
    "* Time: O(n)\n",
    "* Space: Additional O(m), where m is the number of unique characters in the set"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Code: Sets and Length Comparison"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def unique_chars(string):\n",
    "    return len(set(string)) == len(string)\n",
    "\n",
    "print(unique_chars(''))\n",
    "print(unique_chars('foo'))\n",
    "print(unique_chars('bar'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Algorithm 2: Hash Map Lookup\n",
    "\n",
    "We'll keep a hash map (set) to keep track of unique characters we encounter.  \n",
    "\n",
    "Steps:\n",
    "* Scan each character\n",
    "* For each character:\n",
    "    * If the character does not exist in a hash map, add the character to a hash map\n",
    "    * Else, return False\n",
    "* Return True\n",
    "\n",
    "Notes:\n",
    "* We could also use a dictionary, but it seems more logical to use a set as it does not contain duplicate elements\n",
    "* Since the characters are in ASCII, we could potentially use an array of size 128 (or 256 for extended ASCII)\n",
    "\n",
    "Complexity:\n",
    "* Time: O(n)\n",
    "* Space: Additional O(m), where m is the number of unique characters in the hash map"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Code: Hash Map Lookup"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def unique_chars_alt(string):\n",
    "    chars_set = set()\n",
    "    for char in string:\n",
    "        if char in chars_set:\n",
    "            return False\n",
    "        else:\n",
    "            chars_set.add(char)\n",
    "    return True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Algorithm 3: In-Place\n",
    "\n",
    "Since we cannot use additional data structures, this will eliminate the fast lookup O(1) time provided by our hash map.\n",
    "* Scan each character\n",
    "* For each character:\n",
    "    * Scan all [other] characters in the array\n",
    "        * Exluding the current character from the scan is rather tricky in Python and results in a non-Pythonic solution\n",
    "        * If there is a match, return False\n",
    "* Return True\n",
    "\n",
    "Algorithm Complexity:\n",
    "* Time: O(n^2)\n",
    "* Space: In-place"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Code: In-Place"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "def unique_chars_alt(string):\n",
    "    for char in string:\n",
    "        if string.count(char) > 1:\n",
    "            return False\n",
    "    return True"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}