Finished the red-black tree, but the script still has some bugs and needs polishing

master
mbinary 2018-07-13 23:42:46 +08:00
parent 3e84fa4a0b
commit 64c62a0ff8
6 changed files with 386 additions and 24 deletions

@ -1,13 +1,19 @@
# Algorithm notes
Notes, ideas, and code written while learning algorithms :smiley:
# Algorithms and data structures
>Notes and code for learning algorithms and data structures :smiley:
Currently reading `<<Introduction to Algorithms>>`
Some pictures and ideas are from `<<Introduction to Algorithms>>`
I use Python 3.6+ and C++ to implement them.
Since the Python scripts use f-strings, you need Python 3.6+ to run them.
> GitHub currently does not render LaTeX math formulas in documents,
so to read the notes below (md files), please head over to [my blog](https://mbinary.coding.me)
>>I am still learning new things, and this repo is updated continually.
Some scripts may have bugs or be unfinished.
# Index
# Notice
GitHub currently cannot render LaTeX math formulas.
So, if you want to view the Markdown notes that contain LaTeX math formulas, you can visit [my blog](https://mbinary.coding.me)
# index
* [.](.)
* [notes](./notes)
* [alg-general.md](./notes/alg-general.md)

@ -0,0 +1,313 @@
'''
#########################################################################
# File : redBlackTree.py
# Author: mbinary
# Mail: zhuheqin1@gmail.com
# Blog: https://mbinary.coding.me
# Github: https://github.com/mbinary
# Created Time: 2018-07-12 20:34
# Description:
#########################################################################
'''
from functools import total_ordering
from random import randint, shuffle

@total_ordering
class node:
    def __init__(self,val,left=None,right=None,isBlack=False):
        self.val =val
        self.left = left
        self.right = right
        self.isBlack = isBlack
    def __lt__(self,nd):
        return self.val < nd.val
    def __eq__(self,nd):
        # nodes compare by value; comparing with None is always False
        return nd is not None and self.val==nd.val
    def setChild(self,nd,isLeft = True):
        if isLeft: self.left = nd
        else: self.right = nd
    def getChild(self,isLeft):
        if isLeft: return self.left
        else: return self.right
    def __bool__(self):
        return self.val is not None
    def __str__(self):
        color = 'B' if self.isBlack else 'R'
        return f'{color}-{self.val}'
    def __repr__(self):
        return f'node({self.val},isBlack={self.isBlack})'
class redBlackTree:
    def __init__(self):
        self.root = None
    def getParent(self,val):
        '''return the parent of the node holding val; None for the root or a missing val'''
        if isinstance(val,node):val = val.val
        if self.root is None or self.root.val == val:return None
        nd = self.root
        while nd:
            if nd.val>val and nd.left is not None:
                if nd.left.val == val: return nd
                else: nd = nd.left
            elif nd.val<val and nd.right is not None:
                if nd.right.val == val: return nd
                else: nd = nd.right
            else:
                # val is not in the tree; without this branch the loop never ends
                return None
    def find(self,val):
        if isinstance(val,node):val = val.val
        nd = self.root
        while nd:
            if nd.val ==val:
                return nd
            elif nd.val>val:
                nd = nd.left
            else:
                nd = nd.right
    @staticmethod
    def checkBlack(nd):
        # nil leaves count as black
        return nd is None or nd.isBlack
    @staticmethod
    def setBlack(nd,isBlack):
        if nd is not None:
            if isBlack is None or isBlack:
                nd.isBlack = True
            else:nd.isBlack = False
    def rotate(self,prt,toLeft):
        '''rotate the subtree rooted at prt (left rotation if toLeft), re-link it to
        prt's parent or to self.root, and return the new subtree root'''
        grand = self.getParent(prt)  # look it up before the links change
        top = prt.getChild(not toLeft)
        prt.setChild(top.getChild(toLeft),not toLeft)
        top.setChild(prt,toLeft)
        if grand is None: self.root = top
        else: grand.setChild(top,grand.left is prt)
        return top
    def insert(self,val):
        if isinstance(val,node):val = val.val
        def _insert(root,nd):
            '''return parent'''
            while root:
                if root == nd:return None
                elif root>nd:
                    if root.left :
                        root=root.left
                    else:
                        root.left = nd
                        return root
                else:
                    if root.right:
                        root = root.right
                    else:
                        root.right = nd
                        return root
        # insert part
        nd = node(val)
        if self.root is None:
            self.root = nd
            self.setBlack(self.root,True)
            return
        parent = _insert(self.root,nd)
        if parent is None: return
        if not parent.isBlack: self.fixUpInsert(parent,nd)
    def fixUpInsert(self,parent,nd):
        ''' adjust color and level, there are two red nodes: the new one and its parent'''
        while not self.checkBlack(parent):
            grand = self.getParent(parent)
            isLeftPrt = grand.left is parent
            uncle = grand.getChild(not isLeftPrt)
            if not self.checkBlack(uncle):
                # case 1: new node's uncle is red
                self.setBlack(grand, False)
                self.setBlack(grand.left, True)
                self.setBlack(grand.right, True)
                nd = grand
                parent = self.getParent(nd)
            else:
                # case 2: new node's uncle is black(including nil leaf)
                isLeftNode = parent.left is nd
                if isLeftNode ^ isLeftPrt:
                    # case 2.1 the new node is inserted in left-right or right-left form
                    #      grand          grand
                    #   parent      or       parent
                    #      nd              nd
                    self.rotate(parent,isLeftPrt)
                    nd,parent = parent,nd
                # case 2.2 the new node is inserted in left-left or right-right form
                #       grand          grand
                #    parent      or       parent
                #  nd                        nd
                self.setBlack(grand, False)
                self.setBlack(parent, True)
                # rotate() also re-links parent to grand's old parent (or self.root),
                # which the original hand-written rotation left out
                self.rotate(grand,not isLeftPrt)
        self.setBlack(self.root,True)
    def sort(self,reverse = False):
        ''' return a generator of sorted data'''
        def inOrder(root):
            if root is None:return
            if reverse:
                yield from inOrder(root.right)
            else:
                yield from inOrder(root.left)
            yield root
            if reverse:
                yield from inOrder(root.left)
            else:
                yield from inOrder(root.right)
        yield from inOrder(self.root)
    def getSuccessor(self,val):
        if isinstance(val,node):val = val.val
        def _inOrder(root):
            if root is None:return
            if root.val>= val:yield from _inOrder(root.left)
            yield root
            yield from _inOrder(root.right)
        gen = _inOrder(self.root)
        for i in gen:
            if i.val==val:
                # the next node in order, or None if val is the maximum
                return next(gen,None)
    def delete(self,val):
        # delete node in a binary search tree
        if isinstance(val,node):val = val.val
        nd = self.find(val)
        if nd is None: return
        y = None
        if nd.left and nd.right:
            y= self.getSuccessor(val)
        else:
            y = nd
        py = self.getParent(y.val)
        x = y.left if y.left else y.right
        if py is None:
            self.root = x
        elif y is py.left:
            py.left = x
        else:
            py.right = x
        if y is not nd:
            nd.val = y.val
        # adjust colors and rotate
        if self.checkBlack(y): self.fixUpDel(py,x)
    def fixUpDel(self,prt,chd):
        ''' chd (possibly a nil leaf) carries an extra black; move it up the tree
        or remove it by recoloring and rotating'''
        while prt is not None and self.checkBlack(chd):
            isLeft = prt.left is chd
            brother = prt.getChild(not isLeft)
            if not self.checkBlack(brother):
                # case 1: brother is red, rotate so that the new brother is black
                self.setBlack(prt,False)
                self.setBlack(brother,True)
                self.rotate(prt,isLeft)
                brother = prt.getChild(not isLeft)
            if self.checkBlack(brother.left) and self.checkBlack(brother.right):
                # case 2: brother is black and both its kids are black
                self.setBlack(brother,False)
                chd = prt
                prt = self.getParent(chd)
            else:
                if self.checkBlack(brother.getChild(not isLeft)):
                    # case 3: only the nephew nearer to chd is red
                    self.setBlack(brother.getChild(isLeft),True)
                    self.setBlack(brother,False)
                    brother = self.rotate(brother,not isLeft)
                # case 4: the nephew farther from chd is red
                brother.isBlack = prt.isBlack
                self.setBlack(prt,True)
                self.setBlack(brother.getChild(not isLeft),True)
                self.rotate(prt,isLeft)
                break
        self.setBlack(chd,True)
    def display(self):
        def getHeight(nd):
            if nd is None:return 0
            return max(getHeight(nd.left),getHeight(nd.right)) +1
        def levelVisit(root):
            from collections import deque
            lst = deque([root])
            level = []
            # level lv of a complete tree has 2**lv slots; missing nodes show as None
            for lv in range(getHeight(root)):
                level.append([])
                for _ in range(2**lv):
                    nd = lst.popleft()
                    level[-1].append(str(nd))
                    if nd is not None:
                        lst.append(nd.left)
                        lst.append(nd.right)
                    else:
                        lst.append(None)
                        lst.append(None)
            return level
        lines = levelVisit(self.root)
        header = '-'*5+ 'level visit' + '-'*5
        return '\n'.join([header]+[' '.join(line) for line in lines])
    def __str__(self):
        return self.display()
def genNum(n =10):
    nums =[]
    for i in range(n):
        while 1:
            d = randint(0,100)
            if d not in nums:
                nums.append(d)
                break
    #nums = [3,4,2,0,1,6]
    return nums

def buildTree(n=10,nums=None,visitor=None):
    if nums is None: nums = genNum(n)
    rbtree = redBlackTree()
    print(f'build a red-black tree using {nums}')
    for i in nums:
        if visitor:
            visitor(rbtree)
        rbtree.insert(i)
    return rbtree

def testInsert():
    def visitor(t):
        print(t)
    nums = [66, 14, 7, 2, 52, 96, 63, 51, 16, 53]
    rbtree = buildTree(nums = nums,visitor = visitor)
    print('-'*5+ 'in-order visit' + '-'*5)
    for i,j in enumerate(rbtree.sort()):
        print(f'{i+1}: {j}')

def testSuc():
    rbtree = buildTree()
    for i in rbtree.sort():
        print(f'{i}\'s suc is {rbtree.getSuccessor(i)}')

def testDelete():
    #nums = [56, 89, 31, 29, 24, 8, 62, 96, 20, 75] #tuple
    nums = [66, 14, 7, 2, 52, 96, 63, 51, 16, 53]
    rbtree = buildTree(nums = nums)
    print(rbtree)
    shuffle(nums)
    for i in nums:
        print(f'deleting {i}')
        rbtree.delete(i)
        print(rbtree)

if __name__=='__main__':
    testInsert()
    #testDelete()
    #testSuc()

@ -1,7 +1,7 @@
---
title: 『Algorithms』general
date: 2018-07-04
categories: Algorithms and Data Structures
categories: Data Structures and Algorithms
tags: [algorithms]
keywords:
mathjax: true
@ -61,13 +61,19 @@ top:
<a id="markdown-4-算法设计" name="4-算法设计"></a>
# 4. Algorithm design
# 4. Algorithm design and analysis
<a id="markdown-41-分治divide-and-conquer" name="41-分治divide-and-conquer"></a>
## 4.1. Divide and conquer
Recursive in structure.
Steps: divide, conquer, combine
e.g. quicksort, merge sort
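As a minimal sketch of the divide, conquer, and combine steps, here is merge sort in Python. The name `merge_sort` is chosen for illustration and is not taken from the repository.

```python
def merge_sort(lst):
    """Return a new sorted list; the input list is left untouched."""
    if len(lst) <= 1:                  # base case: nothing left to divide
        return list(lst)
    mid = len(lst) // 2
    left = merge_sort(lst[:mid])       # divide + conquer the left half
    right = merge_sort(lst[mid:])      # divide + conquer the right half
    merged = []                        # combine: merge two sorted halves
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The merge step is linear, so the running time satisfies $T(n) = 2T(\frac{n}{2}) + \Theta(n)$.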
## Randomization
## Recursion
## Dynamic programming
## Greedy algorithms
## Amortized analysis
<a id="markdown-5-递归式" name="5-递归式"></a>
# 5. Recurrences
$T(n) = aT(\frac{n} {b})+f(n)$
@ -78,6 +84,7 @@ eg 快排,归并排序
### 5.1.1. Steps
* Guess the form of the solution
* Use mathematical induction to find the constants
<a id="markdown-512-例子" name="512-例子"></a>
### 5.1.2. Example
$T(n) = 2T(\frac{n} {2})+n$
@ -91,12 +98,14 @@ $
T(n) &\leqslant 2c\frac{n}{2}\log(\frac{n}{2}) + n = cn\log n - cn\log 2 + n \leqslant cn\log n \quad (\log=\log_2,\ c\geqslant 1) \\
\end{aligned}
$
<a id="markdown-513-放缩" name="513-放缩"></a>
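### 5.1.3. Tightening the guess
For $T(n) = 2T(\frac{n}{2}) + 1$,
guessing $T(n) = O (n)$ directly does not go through in the induction,
and there is no need to guess a looser bound such as $O (n^2)$.
Instead, strengthen the hypothesis by subtracting a lower-order term: $T(n)\leqslant cn-b$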
<a id="markdown-514-改变变量" name="514-改变变量"></a>
### 5.1.4. Changing variables
For $ T(n) = 2T(\sqrt{n})+\log n $
@ -176,9 +185,10 @@ for i in range(n):
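Initialization: the invariant holds for i=1
Maintenance: assume the invariant holds before the i-th iteration, and prove that it still holds after the i-th iteration,
Termination: when the loop ends, i=n+1, so each permutation is obtained with probability $\frac{1}{n!}$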
<a id="markdown-7-组合方程的近似算法" name="7-组合方程的近似算法"></a>
# 7. Approximate solutions of combinatorial equations
* Stirling 's approximation: $ n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$
* Stirling's approximation: $ n! \approx \sqrt{2\pi n}\left(\frac{n}{e}\right)^n$
* For $C_n^x=a$, we have $x=\frac{\ln^2 a}{n}$
* For $C_x^n=a$, we have $x=(a*n!)^{\frac{1}{n}}+\frac{n}{2}$
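To get a feel for how good Stirling's approximation is, the following sketch compares it with the exact factorial. The helper name `stirling` is an assumption made for this example.

```python
from math import e, factorial, pi, sqrt

def stirling(n):
    """Stirling's approximation: sqrt(2*pi*n) * (n/e)**n."""
    return sqrt(2 * pi * n) * (n / e) ** n

# The ratio approaches 1 from below as n grows.
for n in (1, 5, 10, 50):
    ratio = stirling(n) / factorial(n)
    print(f'n={n}: stirling/exact = {ratio:.5f}')
```

The relative error shrinks roughly like $\frac{1}{12n}$, so even for n=10 the approximation is already within one percent.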
@ -188,7 +198,7 @@ for i in range(n):
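## 8.1. Balls and bins
Identical balls are tossed at random into b bins. On average, how many balls must be tossed before every bin contains at least one ball?
Call a toss that lands in an empty bin a hit; the question asks for the expected number of tosses needed to get b hits
Suppose n tosses are made. Let stage i consist of the tosses after the (i-1)-th hit up to and including the i-th hit; then $p_i=\frac{b-i+1}{b}$
Suppose n tosses are made. Let stage i consist of the tosses after the (i-1)-th hit up to and including the i-th hit; then each toss in stage i is a hit with probability $p_i=\frac{b-i+1}{b}$
Let $n_i$ be the number of tosses in stage i; then $n=\sum_{i=1}^b n_i$
and $n_i$ follows a geometric distribution with $E(n_i)=\frac{b}{b-i+1}$,
so by linearity of expectation,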

@ -1,7 +1,7 @@
---
title: 『Data structures』Hash tables
date: 2018-07-08 23:25
categories: Algorithms and Data Structures
categories: Data Structures and Algorithms
tags: [data structures, hash tables]
keywords:
mathjax: true
@ -28,7 +28,6 @@ description:
<!-- /TOC -->
**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**
A hash table supports read, write, and update in $O(1)$ expected time.
It corresponds to `dict` in Python and `std::unordered_map` in C++
@ -115,9 +114,9 @@ $$
For a table with m slots, handling n insert, search, and delete operations, of which $O(m)$ are inserts, takes only $\Theta(n)$ expected time
<a id="markdown-2313-实现" name="2313-实现"></a>
#### 2.3.1.3. Implementation
Choose a sufficiently large prime p, and let $Z_p=\{0,1,\ldots,p-1\}, Z_p^*=\{1,\ldots,p-1\},$
Choose a sufficiently large prime p, and let $Z_p=\{0,1,\ldots,p-1\}, Z_p^{*}=\{1,\ldots,p-1\},$
Let $h_{a,b}(k) = ((ak+b)mod\ p) mod\ m$
Then $H_{p,m}=\{h_{a,b}|a\in Z_p^*,b\in Z_p\}$
Then $H_{p,m}=\{h_{a,b}|a\in Z_p^{*},b\in Z_p\}$
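The family $H_{p,m}$ can be sketched in a few lines of Python. The factory name `make_hash` and its optional `a`, `b` arguments are assumptions made for this example; the prime p is assumed to be larger than every key.

```python
from random import randrange

def make_hash(p, m, a=None, b=None):
    """Draw h_{a,b} from H_{p,m}: a in {1..p-1}, b in {0..p-1}.

    p is assumed to be a prime larger than every key; m is the table size.
    Passing a and b explicitly gives a fixed, reproducible function.
    """
    if a is None:
        a = randrange(1, p)   # a must be non-zero
    if b is None:
        b = randrange(0, p)
    def h(k):
        return ((a * k + b) % p) % m
    return h

# a fixed member of the family, useful for checking by hand
h = make_hash(p=101, m=10, a=3, b=7)
print(h(5), h(50))
```

Drawing a fresh pair (a, b) when the table is created means no fixed set of keys is bad for every hash function.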
<a id="markdown-24-开放寻址法" name="24-开放寻址法"></a>
## 2.4. Open addressing
All entries are stored in the hash table itself; there are no linked lists.
@ -141,9 +140,16 @@ $h(k,i) = (h_1(k)+i*h_2(k))mod\ m$
<a id="markdown-241-不成功查找的探查数的期望" name="241-不成功查找的探查数的期望"></a>
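### 2.4.1. Expected number of probes in an unsuccessful search
For an open-addressing hash table with $\alpha<1$, an unsuccessful search satisfies $E(\text{number of probes})\leqslant \frac{1}{1-\alpha}$
An unsuccessful search looks like this: every probe before the last finds an occupied slot holding a different key, and the last probe finds an empty slot.
For an open-addressing hash table with $\alpha<1$, consider an unsuccessful search: n of the m slots are filled, so m-n slots are empty.
An unsuccessful search keeps probing occupied slots (none of which holds the wanted key) until it reaches an empty slot. Treating the probes as independent, the number of probes is approximately geometrically distributed, i.e.
$$
p(\text{the search ends at this probe})=p(\text{the probe finds an empty slot})=\frac{m-n}{m}
$$
$$ E(\text{number of probes})=\frac{1}{p}\leqslant \frac{1}{1-\alpha}$$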
![](https://upload-images.jianshu.io/upload_images/7130568-8d659aa8fe7de1a9.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
<a id="markdown-2411-插入探查数的期望" name="2411-插入探查数的期望"></a>
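#### 2.4.1.1. Expected number of probes for an insertion
Therefore inserting a key also needs at most $\frac{1}{1-\alpha}$ probes on average, because an insertion probes occupied slots until it reaches an empty one, which is the same process as an unsuccessful search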
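The bound can be checked exactly for small tables. The sketch below assumes uniform hashing (every probe sequence is equally likely to be any permutation of the slots), so probes never revisit a slot; the function name `expected_probes` is chosen for this example.

```python
from fractions import Fraction

def expected_probes(m, n):
    """Exact expected probes of an unsuccessful search, n of m slots filled.

    Uses E[X] = sum over i >= 1 of P(X >= i), where at least i probes are
    needed exactly when the first i-1 probed slots are all occupied.
    Requires n < m.
    """
    total = Fraction(0)
    p_reach = Fraction(1)            # P(X >= 1) = 1
    for i in range(n + 1):
        total += p_reach
        # the next probe is needed only if this one hits an occupied slot
        p_reach *= Fraction(n - i, m - i)
    return total

m, n = 10, 5
alpha = Fraction(n, m)
print(expected_probes(m, n), '<=', 1 / (1 - alpha))
```

Because probing is without replacement, the exact value is slightly below the geometric estimate $\frac{1}{1-\alpha}$.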
@ -162,6 +168,8 @@ $$
$$
Code
**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**
```python
class item:
def __init__(self,key,val,nextItem=None):

@ -1,7 +1,7 @@
---
title: 『Algorithms』Sorting
date: 2018-7-6
categories: Algorithms and Data Structures
categories: Data Structures and Algorithms
tags: [algorithms, sorting]
keywords:
mathjax: true
@ -36,7 +36,6 @@ description:
<!-- /TOC -->
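**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**
Sorting is essentially the removal of inversions. Depending on whether elements are compared, sorting algorithms fall into the following two classes.
* Comparison sorts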
@ -411,7 +410,7 @@ if __name__ == '__main__':
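Given n elements, set up n buckets.
Map each element to a bucket index by a linear function of its value,
similar to a chained hash table.
Then run insertion sort inside each bucket,
Then run insertion sort inside each bucket ($O(n_i^2)$)
Finally, concatenate all the buckets.
The point is that n buckets give $\Theta(n)$ expected time when the input is spread evenly over its range, at the cost of $\Theta(n)$ extra space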
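The steps above can be sketched as follows. This is an illustrative version, not the repository's code; the name `bucket_sort` and the choice of mapping values linearly from `[min, max]` onto bucket indices are assumptions.

```python
def bucket_sort(lst):
    """Sort numbers with n buckets and insertion sort inside each bucket."""
    n = len(lst)
    if n == 0:
        return []
    lo, hi = min(lst), max(lst)
    if lo == hi:                      # all values equal: already sorted
        return list(lst)
    buckets = [[] for _ in range(n)]
    for x in lst:
        # linear map: lo -> bucket 0, hi -> bucket n-1 (monotone in x)
        idx = int((x - lo) / (hi - lo) * (n - 1))
        buckets[idx].append(x)
    result = []
    for bucket in buckets:
        # insertion sort, O(n_i^2) for a bucket holding n_i elements
        for i in range(1, len(bucket)):
            key = bucket[i]
            j = i - 1
            while j >= 0 and bucket[j] > key:
                bucket[j + 1] = bucket[j]
                j -= 1
            bucket[j + 1] = key
        result.extend(bucket)         # buckets are already in value order
    return result
```

When the values are spread evenly, each bucket holds a constant number of elements on average, so the insertion sorts together cost linear expected time. Heavily clustered input degrades toward the quadratic cost of a single insertion sort.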
@ -474,3 +473,4 @@ def select(lst,i):
return _select(0,len(lst)-1)
```
**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**

@ -1,7 +1,7 @@
---
title: 『Data structures』Trees
date: 2018-7-11 18:56
categories: Algorithms and Data Structures
categories: Data Structures and Algorithms
tags: [data structures, trees]
keywords:
mathjax: true
@ -26,6 +26,7 @@ description:
- [5.2.2. Zig-zig step](#522-zig-zig-step)
- [5.2.3. Zig-zag step](#523-zig-zag-step)
- [5.3. red-black Tree](#53-read-black-tree)
- [5.4. treap](#54-treap)
- [6. Summary](#6-总结)
- [7. Appendix: code](#7-附代码)
- [7.1. Binary tree (binaryTree)](#71-二叉树binarytree)
@ -35,7 +36,6 @@ description:
<!-- /TOC -->
**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**
<a id="markdown-1-概念" name="1-概念"></a>
# 1. Concepts
* Parent
@ -87,13 +87,13 @@ $$
Now let's see how `<<Introduction to Algorithms>>` (p162, Problem 12-4) derives it ( •̀ ω •́ )y
Define the generating function
$$B(x)=\sum_{n=0}^{\infty}b_n x^n$$
We now prove $B(x)=xB(x)^2+1\quad(\#)$
We now prove $B(x)=xB(x)^2+1$
It is easy to see that $$xB(x)^2=\sum_{i=1}^{\infty}\sum_{n=i}^{\infty}b_{i-1}b_{n-i}x^n$$
Compare the coefficients of $x^k$ in $B(x)$ and $xB(x)^2+1$; call them $b_k$ and $a_{k}$ respectively
For k=0, $a_k=1=b_k$
For k>0,
$$a_{k} = \sum_{i=1}^{k}b_{i-1}b_{k-i} = b_k$$
Therefore $B(x)=xB(x)^2+1\quad(\#)$
Therefore $B(x)=xB(x)^2+1$
Solving this gives
$$B(x)=\frac{1-\sqrt{1-4x} }{2x}$$
At the point x=0,
@ -121,7 +121,9 @@ $$
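* If A and B are good bracket sequences, so is their concatenation AB
* If A is a good bracket sequence, so is (A)
>Necessary and sufficient condition: a sequence is good <=> it has equally many left and right brackets and, reading from left to right, the number of right brackets seen never exceeds the number of left brackets
>Necessary and sufficient condition: a sequence is good $\Longleftrightarrow$ it has equally many left and right brackets and, reading from left to right, the number of right brackets seen never exceeds the number of left brackets
>Theorem: the number of good bracket sequences made of n left brackets and n right brackets is $c(n)=\frac{C_{2n}^{n}}{n+1}$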
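The theorem can be checked by brute force for small n. In this sketch the helper names `catalan`, `is_good`, and `count_good` are chosen for illustration; `is_good` implements the necessary and sufficient condition stated above.

```python
from itertools import product
from math import factorial

def catalan(n):
    """c(n) = C(2n, n) / (n + 1), computed with exact integer arithmetic."""
    return factorial(2 * n) // (factorial(n) ** 2 * (n + 1))

def is_good(seq):
    """True if no prefix has more ')' than '(' and the totals are equal."""
    depth = 0
    for ch in seq:
        depth += 1 if ch == '(' else -1
        if depth < 0:
            return False
    return depth == 0

def count_good(n):
    """Count good sequences among all 2**(2n) bracket strings of length 2n."""
    return sum(is_good(s) for s in product('()', repeat=2 * n))

for n in range(6):
    print(n, catalan(n), count_good(n))
```

The enumeration grows as $4^n$, so it is only meant as a sanity check for small n.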
@ -209,6 +211,27 @@ Aho-Corasick automation,是在字典树上添加匹配失败边(失配指针),
## 5.3. red-black Tree
It is also a balanced binary tree; I will write a separate post about red-black trees later.
<a id="markdown-54-treap" name="54-treap"></a>
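## 5.4. treap
[As mentioned earlier](#21-随机构造的二叉查找树), a randomly built binary search tree has expected height $h=O(\log n)$, and [Algorithms general](https://mbinary.coding.me/alg-genral.html) explains how to randomize (shuffle) a given sequence.
So, to get a balanced binary search tree, we can shuffle the given sequence and then build the binary search tree from it.
But what if the data is not all available at once, i.e. new items may be inserted later? It can be shown that a structure built to satisfy the conditions below is equivalent to one built with all the data available at once, that is, to a randomly built binary search tree.
![treap](https://upload-images.jianshu.io/upload_images/7130568-f8fd5006a58ce451.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
This structure is called a `treap`. Besides the key to be sorted, each node has a randomly generated `priority`; all priorities are distinct, and they play the role of the insertion order.
* Binary-search-tree ordering property: a parent's key is greater than its left child's key and less than its right child's key
* Min- (max-) heap ordering property: a parent's priority is less than (greater than) its children's priorities
Insertion: first insert as in a binary search tree, so that the new node becomes a leaf, then use rotations to `sift up` (heap terminology).
That is, order by key first, then by priority (rotations restore the priority order while keeping the key order)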
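The insertion procedure described above can be sketched as follows. This is a minimal illustration, not the repository's implementation: the names `TreapNode`, `rotate_left`, `rotate_right`, `insert`, and `in_order` are assumptions, a min-heap on `priority` is used, and `random.random()` is assumed to produce distinct priorities in practice.

```python
import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()   # random priority, min-heap order
        self.left = None
        self.right = None

def rotate_right(nd):
    """Lift nd.left above nd; key order is preserved."""
    top = nd.left
    nd.left = top.right
    top.right = nd
    return top

def rotate_left(nd):
    """Lift nd.right above nd; key order is preserved."""
    top = nd.right
    nd.right = top.left
    top.left = nd
    return top

def insert(root, key):
    """Insert key as a BST leaf, then rotate it up while it breaks heap order."""
    if root is None:
        return TreapNode(key)
    if key < root.key:
        root.left = insert(root.left, key)
        if root.left.priority < root.priority:
            root = rotate_right(root)
    elif key > root.key:
        root.right = insert(root.right, key)
        if root.right.priority < root.priority:
            root = rotate_left(root)
    return root                            # duplicate keys are ignored

def in_order(root):
    if root is not None:
        yield from in_order(root.left)
        yield root.key
        yield from in_order(root.right)
```

A short usage example: start from `root = None`, call `root = insert(root, k)` for each key, and `list(in_order(root))` yields the keys in sorted order regardless of the priorities drawn.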
<a id="markdown-6-总结" name="6-总结"></a>
# 6. Summary
There are many more interesting tree structures,
@ -218,6 +241,8 @@ Aho-Corasick automation,是在字典树上添加匹配失败边(失配指针),
<a id="markdown-7-附代码" name="7-附代码"></a>
# 7. Appendix: code
**[GitHub repository](https://github.com/mbinary/algorithm-and-data-structure.git)**
<a id="markdown-71-二叉树binarytree" name="71-二叉树binarytree"></a>
## 7.1. Binary tree (binaryTree)
```python