## Introduction

Incremental technique, Recursive technique

Divide-and-conquer Strategy

Iteration, Recursion

## Algorithm Analysis

CPU Basic (atomic) operations

- Initialization
- Arithmetic
- Comparison / Branching
- Memory Access

Cost analysis, Correctness analysis

Pseudocode

Binary Search Algorithm

1 | //end state: l>r, answer registered as ans |

1 | //end state: l==r, answer=l=r |

1 | //The smallest element no less than key |

1 | //The greatest element no greater than key |

Big-O Notation

Big-Ω Notation

Big-Θ Notation

## Sorting

Selection sort

Insertion sort

Bubble sort

Merge sort

Matser Theorem

Quick sort

Worst $O(n^2)$, Expected $O(n\log n)$

1 | function quickSort(arr, left, right) { |

$e_i$ and $e_j$ are compared IFF either one is the first among $e_i, e_{i+1},\cdots, e_j$ picked as a pivot.

Complexity Summary

Sort | Average Time Complexity | Space Complexity |
---|---|---|

Selection | $O(n^2)$ | $O(1)$ |

Insertion | $O(n^2)$ | $O(1)$ |

Bubble | $O(n^2)$ | $O(1)$ |

Heap | $O(n\log n)$ | $O(1)$ |

Merge | $O(n\log n)$ | Depends |

Quick | $O(n\log n)$ | $O(1)$ |

1 | void shell_sort(int arr[], int len) { |

Counting sort

Radix sort

Bucket sort

## Linked List

Operation | Time Complexity | Space Complexity |
---|---|---|

Insert | $O(n)$ | $O(1)$ |

Delete | $O(n)$ | $O(1)$ |

Find | $O(n)$ | $O(1)$ |

Update | $O(n)$ | $O(1)$ |

Double linked list

Circular linked list

## Stack & Queue

### Stack

First In Last Out

Applications: Delimiters balance problem, Evaluating arithmetic expressions, The runtime stack in memory

Prefix/Infix/Postfix expression

### Queue

First In First Out

Ring queue

All operations above takes $O(1)$ time complexity.

## String

### String matching

Brute Force $O(mn)$

Rabin-Karp Algorithm $O(mn)$

KMP $O(m+n)$

### FSA

- $Q$, a set of states
- $q_0\in Q$, the start state
- $A\subseteq Q$, the accepting states
- $\Sigma$, the input alphabet
- $\delta$, the transition function $Q\times \Sigma→Q$

## Tree

Internal nodes, leaf nodes, root

(proper) ancestor, descendant

Path, Depth, Level, Height

(full, complete) k-ary tree

preorder, inorder, postorder traversal

Character Encoding & Huffman Tree

Given a Huffman tree, it includes at least 2 nodes, assume node $u$ and node $v$ have the top-2 lowest frequencies, then

- node $u$ and $v$ have the same parent node
- $depth(u)$ and $depth(v) \ge depth(x)$, where node $x$ is any leaf node in the Huffman tree.

Huffman encoding is the optimum prefix code, i.e., the space cost is minimized.Proof

- A tree with $n$ nodes with $n-1$ edges

- For each non-root node v, it has one and only one edge point to itself.
- A tree with n nodes, thus the number of non-root nodes is n-1.
- Thus, this tree has n-1 edges.

- Let $T$ be a tree where every internal node has at least 2 child nodes. If $m$ is the number of leaf nodes, then the number of internal nodes is at most $m-1$.

- Suppose internal node $v$ has $x_v$ child nodes
- The average child nodes of each internal node is $x$
- It has $m$ leaf nodes, thus it has $m/x$ parent nodes at most, i.e., they are parent of leaf nodes.
- For $m/x$ internal nodes, it has at most $m/x^2$ parents.
- For $m/x^2$ internal nodes, it has at most $m/x^3$ parents.
- …
- The total number of internal nodes is $m/x + m/x^2 + … + 1$
- It is at most m-1.

- A complete binary tree with $n ≥ 2$ nodes has height $O(\log n)$

- Suppose the height is $h$.
- The number of nodes at each level:
- Level 0: $2^0 = 1$, Level 1: $2^1 = 2 $
- Level 2: $2^2 = 4$, Level 3: $2^3 = 8$
- …
- Level $h-1$: $2^(h-1)$, Level $h$: $x (x \ge 1) $
- Thus, $2^0 + 2^1 + … 2^{h-1} + x = n$
- $(1-2^{h-1})/(1-2) = n-x → 2^{h-1} < n $
- Thus, $h = O(\log n)$

## Advanced Binary Trees

### Heap

- Complete binary tree
- $O(n)$ space consumption
- $O(\log n)$ insertion time
- $O(\log n)$ delete-min time

Suppose that node $u$ of $T$ is stored at $A[i]$. Then, the left child of $u$ is stored at $A[2i]$, and the right child at $A[2i+1]$.

Suppose that node $u$ of $T$ is stored at $A[i]$. Then, the parent of $u$ is stored at $A[ \left \lfloor i/2 \right \rfloor ]$.

Root-fix operation: Build a heap with $O(n)$ time.

$O(\sum\limits_{i=0}^{h-1}(h-i)\times2^i)=O(2^h)=O(n)$

### Binary Search Tree

predecessor($\max\{a[i]\le q\}$), successor($\min\{a[i]\ge q\}$): $O(h)$

↙↘↘↘↘↘，↘↙↙↙↙↙

Insert: $O(h)$

Delete: $O(h)$

### BBST

For every internal node $u$ of $T$, the height of the left subtree of $u$ differs from that the right subtree of $u$ by at most 1.

Proof: A balanced binary tree with $n$ nodes has height $O(\log n)$.

To construct a BBST with greatest height using $n$ nodes, each internal node should has subtrees that differ 1 in height. Denote $f(h)$ as the least number of nodes of a BBST with height $h$, then

$f(1)=1, f(2)=2$

$f(h)=f(h-1)+f(h-2)+1$

$f(h)+1=f(h-1)+1+f(h-2)+1$

$f(h)=\frac{1}{\sqrt 5}[(\frac{1+\sqrt 5}{2})^{h+1}-(\frac{1-\sqrt 5}{2})^{h+1}]-1$

$n\ge \frac{1}{\sqrt 5}[(\frac{1+\sqrt 5}{2})^{h+1}-(\frac{1-\sqrt 5}{2})^{h+1}]-1$

$h\le c\log n, h=O(\log n)$

Insertion time analysis?

Deletion time analysis?

## Graph

BFS

BFS tree, SSSP

DFS

DFS tree, Cycle detection, topological sort

**Parenthesis Theorem**: all the following are true:

- If $u$ is a proper ancestor of $v$ in DFS-tree of $T$, then $I(u)$ contains $I(v)$.
- If $u$ is a proper descendant of $v$ in DFS-tree of T, then $I(u)$ is contained in $I(v)$ .
- Otherwise, $I(u)$ and $I(v)$ are disjoint.

**White Path Theorem**: let $u$ be a vertex in $G$. Consider the moment when $u$ is pushed into the stack in the DFS algorithm. Then a vertex $v$ becomes a proper descendant of $u$ in the DFS-forest IFF we can go from $u$ to $v$ by travelling only on white vertices.

**Cycle Theorem**: let $T$ be an arbitrary DFS-forest. $G$ contains a cycle if and only if there is a backward edge with respect to $T$.

- If: trivial
- only if: Suppose the cycle is $v_1 → 𝑣_2 → ⋯ → 𝑣_𝑘 → 𝑣_1$, let $𝑣_𝑖$ be the first to enter the stack. Then by white path theorem, all the other vertices in the cycle must be proper descendants of $𝑣_𝑖$ in the DFS-forest. This means the edge pointing to $𝑣_𝑖$ in the cycle is a backward edge.

Dijkstra: $O((|V|+|E|)\log |V|)$

Correctness proof:

Minimum Spanning Tree (Prim): $O(|V|\log |V|+|E|)$

Strongly Connected Components $O(|V|+|E|)$

obtain $G^R$

DFS on $G^R$, obtain $L^R$

- obtain $L$ by reversing $L^R$
- DFS on $G$ according to $L$
- obtain SCC from DFS-forest

Correctness proof:

Obtain $G^{SCC}$ by shrinking nodes

sink SCC: SCC without in-degree

- DFS on $G$ starting from vertices of sink SCC.

Let $S_1$ , $S_2$ be SCCs such that there is a path from $S_1$ to $S_2$ in $G^SCC$. In the ordering of $L$, the earliest vertex in $S_2$ must come before the earliest vertex in $S_1$