{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook was prepared by [Donne Martin](https://github.com/donnemartin). Source and license info is on [GitHub](https://github.com/donnemartin/interactive-coding-challenges)."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Solution Notebook"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Problem: Given two strings, find the longest common subsequence.\n",
"\n",
"* [Constraints](#Constraints)\n",
"* [Test Cases](#Test-Cases)\n",
"* [Algorithm](#Algorithm)\n",
"* [Code](#Code)\n",
"* [Unit Test](#Unit-Test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Constraints\n",
"\n",
"* Can we assume the inputs are valid?\n",
"    * No\n",
"* Can we assume the strings are ASCII?\n",
"    * Yes\n",
"* Is this case sensitive?\n",
"    * Yes\n",
"* Is a subsequence a non-contiguous block of chars?\n",
"    * Yes\n",
"* Do we expect a string as a result?\n",
"    * Yes\n",
"* Can we assume this fits memory?\n",
"    * Yes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Test Cases\n",
"\n",
"* str0 or str1 is None -> Exception\n",
"* str0 or str1 is empty (length 0) -> ''\n",
"* General case\n",
"\n",
"str0 = 'ABCDEFGHIJ'\n",
"str1 = 'FOOBCDBCDE'\n",
"\n",
"result: 'BCDE'"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Algorithm\n",
"\n",
"We'll use bottom-up dynamic programming to build a table.\n",
"\n",
"<pre>\n",
"\n",
"The rows (i) represent str1.\n",
"The columns (j) represent str0.\n",
"Each cell T[i][j] holds the length of the best match between\n",
"the first i chars of str1 and the first j chars of str0.\n",
"\n",
"                        str0\n",
"  -------------------------------------------------\n",
"  |   |   | A | B | C | D | E | F | G | H | I | J |\n",
"  -------------------------------------------------\n",
"  |   | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |\n",
"  | F | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |\n",
"  | O | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |\n",
"s | O | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 | 1 | 1 | 1 |\n",
"t | B | 0 | 0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |\n",
"r | C | 0 | 0 | 1 | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 2 |\n",
"1 | D | 0 | 0 | 1 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |\n",
"  | B | 0 | 0 | 1 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |\n",
"  | C | 0 | 0 | 1 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |\n",
"  | D | 0 | 0 | 1 | 2 | 3 | 3 | 3 | 3 | 3 | 3 | 3 |\n",
"  | E | 0 | 0 | 1 | 2 | 3 | 4 | 4 | 4 | 4 | 4 | 4 |\n",
"  -------------------------------------------------\n",
"\n",
"if str0[j] != str1[i]:\n",
"    T[i][j] = max(\n",
"        T[i][j-1],\n",
"        T[i-1][j])\n",
"else:\n",
"    T[i][j] = T[i-1][j-1] + 1\n",
"</pre>\n",
"\n",
"Complexity:\n",
"* Time: O(m * n), where m is the length of str0 and n is the length of str1\n",
"* Space: O(m * n), where m is the length of str0 and n is the length of str1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Code"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"class StringCompare(object):\n",
"\n",
"    def longest_common_substr(self, str0, str1):\n",
"        if str0 is None or str1 is None:\n",
"            raise TypeError('str input cannot be None')\n",
"        # Rows (i) index str1 and columns (j) index str0.\n",
"        # Add one to number of rows and cols for the dp table's\n",
"        # first row of 0's and first col of 0's\n",
"        num_rows = len(str1) + 1\n",
"        num_cols = len(str0) + 1\n",
"        T = [[None] * num_cols for _ in range(num_rows)]\n",
"        for i in range(num_rows):\n",
"            for j in range(num_cols):\n",
"                if i == 0 or j == 0:\n",
"                    T[i][j] = 0\n",
"                elif str0[j-1] != str1[i-1]:\n",
"                    T[i][j] = max(T[i][j-1],\n",
"                                  T[i-1][j])\n",
"                else:\n",
"                    T[i][j] = T[i-1][j-1] + 1\n",
"        results = ''\n",
"        i = num_rows - 1\n",
"        j = num_cols - 1\n",
"        # Walk backwards from the bottom-right cell to recover the matched chars\n",
"        while T[i][j]:\n",
"            if T[i][j] == T[i][j-1]:\n",
"                j -= 1\n",
"            elif T[i][j] == T[i-1][j]:\n",
"                i -= 1\n",
"            elif T[i][j] == T[i-1][j-1] + 1:\n",
"                results += str1[i-1]\n",
"                i -= 1\n",
"                j -= 1\n",
"            else:\n",
"                raise Exception('Error constructing table')\n",
"        # Walking backwards results in a string in reverse order\n",
"        return results[::-1]"
]
} ,
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Unit Test"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Overwriting test_longest_common_substr.py\n"
]
}
],
"source": [
"%%writefile test_longest_common_substr.py\n",
"import unittest\n",
"\n",
"\n",
"class TestLongestCommonSubstr(unittest.TestCase):\n",
"\n",
"    def test_longest_common_substr(self):\n",
"        str_comp = StringCompare()\n",
"        self.assertRaises(TypeError, str_comp.longest_common_substr, None, None)\n",
"        self.assertEqual(str_comp.longest_common_substr('', ''), '')\n",
"        str0 = 'ABCDEFGHIJ'\n",
"        str1 = 'FOOBCDBCDE'\n",
"        expected = 'BCDE'\n",
"        self.assertEqual(str_comp.longest_common_substr(str0, str1), expected)\n",
"        print('Success: test_longest_common_substr')\n",
"\n",
"\n",
"def main():\n",
"    test = TestLongestCommonSubstr()\n",
"    test.test_longest_common_substr()\n",
"\n",
"\n",
"if __name__ == '__main__':\n",
"    main()"
]
} ,
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Success: test_longest_common_substr\n"
]
}
],
"source": [
"%run -i test_longest_common_substr.py"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.2"
}
},
"nbformat": 4,
"nbformat_minor": 1
}