Graph & Tree

Lowest Common Ancestor of a Binary Search Tree (LC 235)

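Similar to BST find (binary search): if p and q are both smaller than the current node, go left; if both are larger, go right; otherwise the current node is the LCA. Base case: if not root, not p, or not q, backtrack.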
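A minimal iterative sketch of this idea, assuming a hypothetical `TreeNode` class with `val`, `left`, and `right` fields and assuming both p and q are present in the tree:

```python
class TreeNode:
    """Hypothetical node type used only for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def lowest_common_ancestor_bst(root, p, q):
    # Order the two target values so a single range check is enough.
    lo, hi = min(p.val, q.val), max(p.val, q.val)
    node = root
    while node:
        if hi < node.val:        # both targets are in the left subtree
            node = node.left
        elif lo > node.val:      # both targets are in the right subtree
            node = node.right
        else:                    # the targets split here (or one equals node)
            return node
    return None
```

Because each step discards one subtree, the walk takes O(h) time for a tree of height h.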

binary search tree

- tree form of binary search: every comparison has 2 possible outcomes
- a tree is easier to insert into than a sorted array

tree conditions:

- exactly 1 root
- no cycles
- every node except the root has exactly 1 parent

graph back/forward/cross edges

Forward edge: a non-tree edge (u, v) that connects a vertex u to a descendant v in the DFS tree. => 1 child, multiple parents
Cross edge: any other edge. It can go between vertices in the same depth-first tree or in different depth-first trees. => 1 child, multiple parents
Back edge: a child points to an ancestor => cycle
Since a graph can have these edges, when traversing it (DFS/BFS) we need to keep track of whether each node has been visited.

Prim's: min spanning tree

KeyNote: find the min edge leaving the current group (almost like union-find); keep track of nodes and edges.
Start at a node, find its min edge, then add the new node and its edges. Choose the min edge from the expanding set of neighbors, and repeat.
O(V log V + E log V) = O(E log V)
Map <parent, weight> + binary heap <weight, neighbor nodes>

interior node

non-leaf node

Validate Binary Search Tree (LC 98)

- narrowing window (recursion)
- inorder traversal (recursion/stack)
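A small sketch of the narrowing-window approach, assuming a hypothetical `TreeNode` class and assuming duplicate values are not allowed (strict inequalities):

```python
class TreeNode:
    """Hypothetical node type used only for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


def is_valid_bst(node, low=float("-inf"), high=float("inf")):
    # An empty subtree is always valid.
    if node is None:
        return True
    # The node must lie strictly inside the window inherited from its ancestors.
    if not (low < node.val < high):
        return False
    # Going left tightens the upper bound; going right tightens the lower bound.
    return (is_valid_bst(node.left, low, node.val)
            and is_valid_bst(node.right, node.val, high))
```

The window is what catches a node that is fine relative to its parent but violates a bound set by a more distant ancestor.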

insert in BST

Very similar to find: traverse the tree and compare until:
- the end is reached => insert the node there
- the element is found => don't do anything

Populating Next Right Pointers in Each Node

way 1: BFS (queue)
way 2: 3 pointers: parent, cur, and head
Way 1 pseudo code:
    if not root: return
    Q = [root]
    while Q:
        tempQ = []
        for i in range(len(Q)):
            if i < len(Q)-1:
                Q[i].next = Q[i+1]
            if Q[i].left:
                tempQ.append(Q[i].left)
            if Q[i].right:
                tempQ.append(Q[i].right)
        Q = tempQ

Insertion in the Binary Search Tree

- do the regular binary search
- if the node exists, skip
- if the node doesn't exist, add it
- return root

Union-find

DSU:
__init__: self.root = list(range(#)), self.rank = [1] * len
find: recurse while self.root[x] != x (only a root has itself as root)
union: find xroot and yroot, choose one of them as the root of both, and sum the ranks
code:
    class DSU(object):
        def __init__(self):
            # list() is required in Python 3, where range objects are immutable
            self.root = list(range(1001))
            self.rank = [1] * 1001

        def find(self, x):
            # path compression: point x directly at its root
            if self.root[x] != x:
                self.root[x] = self.find(self.root[x])
            return self.root[x]

        def union(self, x, y):
            xroot, yroot = self.find(x), self.find(y)
            if xroot == yroot:
                return False
            # the root of the larger set becomes the root of both
            root = xroot if self.rank[xroot] >= self.rank[yroot] else yroot
            self.root[xroot] = root
            self.root[yroot] = root
            self.rank[root] = self.rank[xroot] + self.rank[yroot]
            return True

A* Algorithms

Dijkstra's with potentials.
Dijkstra's grows circularly → sometimes inefficient.
→ add a heuristic (the potential) that estimates how far the next location is from the target
→ ExtractMin with (distance so far + potential).
Ex: with mapping, we can use the direct distance from the next point to the target as the potential.
https://www.youtube.com/watch?v=ySN5Wnu88nE
potential: a function 𝜋 mapping vertices to real numbers; edge lengths become ℓ𝜋(u, v) = ℓ(u, v) − 𝜋(u) + 𝜋(v)
- On each step, pick the vertex v minimizing dist[v] − 𝜋(s) + 𝜋(v)
- 𝜋(s) is the same for all v, so v minimizes dist[v] + 𝜋(v): the most promising vertex
- 𝜋(v) is an estimate of d(v, t)
- Pick the vertex v with the minimum current estimate of d(s, v) + d(v, t)
Thus the search is directed.
Worst case: 𝜋(v) = 0 for all v, which is the same as Dijkstra.

Find distance between two nodes of a Binary Tree

Dist(n1, n2) = Dist(root, n1) + Dist(root, n2) - 2*Dist(root, lca)
- 'n1' and 'n2' are the two given keys
- 'root' is the root of the given binary tree
- 'lca' is the lowest common ancestor of n1 and n2
- Dist(n1, n2) is the distance between n1 and n2

path compression: intuition

A find operation discovers the root not only for the queried node but for every node on the path to the root → why waste this information → attach all nodes on the path directly to the root.
Find(i):
    if i ≠ parent[i]:
        parent[i] ← Find(parent[i])
    return parent[i]
Combined with union by rank, this reduces the amortized cost of each operation to O(log* n).

AVL tree

Insert(k, R):
    insert k as in a regular BST
    N ← Find(k, R)
    Rebalance(N)
Rebalance(N):
    P ← N.Parent
    if N.Left.Height > N.Right.Height + 1:
        RebalanceRight(N)
    if N.Right.Height > N.Left.Height + 1:
        RebalanceLeft(N)
    AdjustHeight(N)
    if P ≠ null:
        Rebalance(P)
AdjustHeight(N):
    N.Height ← 1 + max(N.Left.Height, N.Right.Height)
RebalanceRight(N):
    M ← N.Left
    if M.Right.Height > M.Left.Height:
        RotateLeft(M)
    RotateRight(N)
    AdjustHeight on affected nodes
Delete(N):
    delete N as in a regular BST
    M ← Parent of node replacing N
    Rebalance(M)

Dijkstra's: single source, shortest path, no neg weight

KeyNote: Similar to Prim's: heap + map. O(V^2) without a heap, O(E log V) with a heap.
- The heap keeps track of all unvisited nodes and their distance to the source (starting with inf, except for the source)
- Pop the heap to get the unvisited vertex closest to the source (starting with the source itself)
- Explore its neighbors and update the source-neighbor distance (basically source-current + current-neighbor)
We are done when the heap is empty.
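A minimal sketch of the heap-based version using Python's `heapq`. It assumes the graph is a dict mapping each node to a list of `(neighbor, weight)` pairs, which is a representation chosen here for illustration. Since `heapq` has no decrease-key operation, this variant pushes a new entry on every improvement and skips stale entries when they are popped, instead of keeping every unvisited node in the heap from the start.

```python
import heapq


def dijkstra(graph, source):
    """Return a dict of shortest distances from source (non-negative weights)."""
    dist = {source: 0}
    heap = [(0, source)]                 # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                     # stale entry: u was already settled
        for v, w in graph.get(u, []):
            nd = d + w                   # source-current + current-neighbor
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

Nodes that never appear in the returned dict are unreachable from the source.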

Clone an undirected graph. Each node in the graph contains a label and a list of its neighbors. (LC 133)

map + dfs
dfs from a node to its neighbors:
- if a neighbor is not in the map: create its copy and add it to the map
- append copyNeighbor to copyNode's neighbor list

graph concepts: adjacency list adjacency matrix visited != explored BFS vs DFS

matrix: if M[i][j] == 1, there is an edge i->j
adj list: array of linked lists
dense graph: E = O(V²)
sparse graph: the adj list takes O(V+2E) = O(E+V) space (since both a->b and b->a are stored in an undirected graph)
2 ways to keep track of unexplored nodes: BFS uses a queue and DFS uses a stack

union-find tree pseudo-code

MakeSet(i):
    parent[i] ← i
    rank[i] ← 0
# Running time: O(1)
Find(i):
    while i ≠ parent[i]:
        i ← parent[i]
    return i
# Running time: O(tree height)
Union(i, j):
    i_id ← Find(i)
    j_id ← Find(j)
    if i_id = j_id:
        return
    if rank[i_id] > rank[j_id]:
        parent[j_id] ← i_id
    else:
        parent[i_id] ← j_id
        if rank[i_id] = rank[j_id]:
            rank[j_id] ← rank[j_id] + 1
# the height of the tree only increases when the 2 ranks are equal

splay tree pseudo code

Splay(N):
    Determine proper case
    Apply Zig-Zig, Zig-Zag, or Zig as appropriate
    if N.Parent ≠ null:
        Splay(N) # recursively bring the node to the root
STFind(k, R):
    N ← Find(k, R)
    Splay(N)
    return N
STInsert(k, R):
    # Insert, then splay
    Insert(k, R)
    STFind(k, R)
STSplit(R, x):
    N ← Find(x, R)
    Splay(N)
    split off appropriate subtree of N
STMerge(R1, R2):
    N ← Find(∞, R1)
    Splay(N)
    N.Right ← R2
# O(log n) amortized time per operation

Course Scheduling II (LC 210) There are a total of n courses you have to take, labeled from 0 to n - 1. Some courses may have prerequisites, for example to take course 0 you have to first take course 1, which is expressed as a pair: [0,1] Given the total number of courses and a list of prerequisite pairs, return the ordering of courses you should take to finish all courses. There may be multiple correct orders, you just need to return one of them. If it is impossible to finish all courses, return an empty array.

The input prerequisites is a graph represented by a list of edges, not an adjacency matrix. This problem is equivalent to finding a topological order in a directed graph. If a cycle exists, no topological ordering exists and it is therefore impossible to take all courses. Topological sort can also be done via BFS.
* visited array
* dfs returns false for cycle detection
* if not all(dfs(child) for child in children[node]): return False
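The BFS alternative mentioned above can be sketched as follows (this is usually called Kahn's algorithm). The function name `find_order` and its parameters are chosen here for illustration; each pair `[course, prereq]` is treated as an edge prereq → course.

```python
from collections import deque


def find_order(num_courses, prerequisites):
    children = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        children[prereq].append(course)
        in_degree[course] += 1

    # Start with every course that has no prerequisite.
    queue = deque(c for c in range(num_courses) if in_degree[c] == 0)
    order = []
    while queue:
        c = queue.popleft()
        order.append(c)
        for nxt in children[c]:
            in_degree[nxt] -= 1          # one prerequisite of nxt is now done
            if in_degree[nxt] == 0:
                queue.append(nxt)

    # Courses stuck with a positive in-degree are part of a cycle.
    return order if len(order) == num_courses else []
```

No explicit visited array is needed here: a node enters the queue exactly once, when its in-degree reaches zero.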

Lowest Common Ancestor of a Binary Tree

base case: p, q, None
recursion: left and right
    def lowestCommonAncestor(self, root, p, q):
        if root in (None, p, q):
            return root
        left, right = (self.lowestCommonAncestor(kid, p, q)
                       for kid in (root.left, root.right))
        return root if left and right else left or right

a tree is

either:
- empty
- a node with a key, a list of child trees (left and right for a binary tree), and (optionally) a parent
leaf vs interior (non-leaf) node

Kirchhoff's theorem: number of possible spanning trees in an undirected graph

- construct the adjacency matrix
- replace: each diagonal entry → degree of that node; every other '1' → '-1'
- find any cofactor (remove any 1 row and 1 column, then take the determinant)
- the cofactor = number of spanning trees
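A small sketch of this recipe, assuming the graph is given as a node count `n` and a list of undirected edges `(u, v)` with 0-based labels (an input format chosen here for illustration). It uses exact `Fraction` arithmetic with Gaussian elimination so the determinant is not affected by floating-point rounding.

```python
from fractions import Fraction


def count_spanning_trees(n, edges):
    # Build the matrix: degree on the diagonal, -1 for each edge.
    lap = [[Fraction(0)] * n for _ in range(n)]
    for u, v in edges:
        lap[u][u] += 1
        lap[v][v] += 1
        lap[u][v] -= 1
        lap[v][u] -= 1

    # Cofactor: drop the last row and the last column.
    m = [row[:-1] for row in lap[:-1]]
    size = n - 1

    # Determinant by Gaussian elimination.
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return 0                     # singular: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det                   # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return int(det)
```

For example, a triangle has 3 spanning trees, one for each edge that can be left out.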

serialize tree to string

def convert(p): return "^" + str(p.val) + "#" + convert(p.left) + convert(p.right) if p else "$"

BFS graph BFS(v)

Particularly useful for finding the shortest path on unweighted graphs.
    # Global/class scope variables
    n = number of nodes in the graph
    g = adjacency list representing unweighted graph

    # s = start node, e = end node, and 0 ≤ e,s < n
    function bfs(s, e):
        # Do a BFS starting at node s
        prev = solve(s)
        # Return reconstructed path from s -> e
        return reconstructPath(s, e, prev)

    function solve(s):
        q = queue data structure with enqueue and dequeue
        q.enqueue(s)
        visited = [false, ..., false] # size n
        visited[s] = true
        prev = [null, ..., null] # size n, parent array
        while !q.isEmpty():
            node = q.dequeue()
            neighbours = g.get(node)
            for(next : neighbours):
                if !visited[next]:
                    q.enqueue(next)
                    visited[next] = true
                    prev[next] = node
        return prev

    function reconstructPath(s, e, prev):
        # Reconstruct path going backwards from e
        path = []
        for(at = e; at != null; at = prev[at]):
            path.add(at)
        path.reverse()
        # If s and e are connected return the path
        if path[0] == s:
            return path
        return []
Summary:
- visited[] is global, initialized to zeros (not visited)
- Q = [v], visited[v] = 1
- while Q: cur = pop Q; for each neighbor of cur: if not visited, add it to Q and set its visited flag to 1
space complexity: O(n), since we only add nodes that haven't been visited to the queue
time complexity: O(E+V) for an adjacency list

Flip Game II (LC 294): You are playing the following Flip Game with your friend: Given a string that contains only these two characters: + and -, you and your friend take turns to flip two consecutive "++" into "--". The game ends when a person can no longer make a move and therefore the other person will be the winner. Write a function to determine if the starting player can guarantee a win. For example, given s = "++++", return true. The starting player can guarantee a win by flipping the middle "++" to become "+--+".

recursion with memoization:
- iterate through the string
- we can win if we can flip [i] and [i+1] and the opponent cannot win from the resulting state (leave the opponent with a state they cannot win; use recursion here)
    def canWin(self, s):
        """
        :type s: str
        :rtype: bool
        """
        memo = {}
        def can(s):
            if s not in memo:
                memo[s] = any(s[i:i+2] == '++' and not can(s[:i] + '--' + s[i+2:])
                              for i in range(len(s)))
            return memo[s]
        return can(s)

Kill Process: return a list of PIDs of processes that will be killed. When a process is killed, all its children processes will be killed (LC 582)

use hashmap to store the children/relation of each node then bfs or dfs (stack/queue) to fill the res array
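A minimal sketch of the hashmap-plus-BFS approach. It assumes the input comes as two parallel lists, `pid` and `ppid` (the parent of `pid[i]` is `ppid[i]`), plus the id `kill` of the process to terminate; these parameter names are assumptions for illustration.

```python
from collections import defaultdict, deque


def kill_process(pid, ppid, kill):
    # Map each parent to the list of its direct children.
    children = defaultdict(list)
    for child, parent in zip(pid, ppid):
        children[parent].append(child)

    # BFS from the killed process collects its whole subtree.
    killed = []
    queue = deque([kill])
    while queue:
        cur = queue.popleft()
        killed.append(cur)
        queue.extend(children[cur])
    return killed
```

Swapping `popleft()` for `pop()` turns the same loop into an iterative DFS; only the order of the result changes.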

BFT(G,n)

use inDeg outDeg min(out-in)

"Bottom-up" Recursion Solution

"Bottom-up" is another recursion solution. In each recursion level, we first call the function recursively for all the children nodes and then come up with the answer according to the returned values and the value of the root node itself. This process can be regarded as a kind of postorder traversal. Typically, a "bottom-up" recursion function bottom_up(root) looks like this:
    1. return specific value for null node
    2. left_ans = bottom_up(root.left) // call function recursively for left child
    3. right_ans = bottom_up(root.right) // call function recursively for right child
    4. return answers // answer <-- left_ans, right_ans, root.val
Ex: depth of a binary tree. If we know the maximum depth l of the subtree rooted at the left child and the maximum depth r of the subtree rooted at the right child, can we answer the question for the node itself? Yes: take the maximum of the two and add 1 to get the maximum depth of the subtree rooted at the selected node, that is, x = max(l, r) + 1. So for each node, we can get the answer after solving the problem for its children, and a "bottom-up" solution works. Here is the pseudocode for the recursion function maximum_depth(root):
    1. return 0 if root is null // return 0 for null node
    2. left_depth = maximum_depth(root.left)
    3. right_depth = maximum_depth(root.right)
    4. return max(left_depth, right_depth) + 1 // return depth of the subtree rooted at root

"Top-down" Recursion Solution

"Top-down" means that in each recursion level, we visit the node first to come up with some values, and pass these values to its children when calling the function recursively. So the "top-down" solution can be considered a kind of preorder traversal. To be specific, the recursion function top_down(root, params) works like this:
    1. return specific value for null node
    2. update the answer if needed // answer <-- params
    3. left_ans = top_down(root.left, left_params) // left_params <-- root.val, params
    4. right_ans = top_down(root.right, right_params) // right_params <-- root.val, params
    5. return the answer if needed // answer <-- left_ans, right_ans
Ex: depth of a binary tree. We know that the depth of the root node is 1. For each node, if we know its depth, we know the depth of its children. Therefore, if we pass the depth of the node as a parameter when calling the function recursively, every node knows its own depth. For leaf nodes, we can use the depth to update the final answer. Here is the pseudocode for the recursion function maximum_depth(root, depth):
    1. return if root is null
    2. if root is a leaf node:
    3.     answer = max(answer, depth) // update the answer if needed
    4. maximum_depth(root.left, depth + 1) // call the function recursively for left child
    5. maximum_depth(root.right, depth + 1) // call the function recursively for right child

shortest path tree

# find all shortest distances from S
BFS(G, S):
    for all u ∈ V:
        dist[u] ← ∞, prev[u] ← nil
    dist[S] ← 0
    Q ← {S} # queue containing just S
    while Q is not empty:
        u ← Dequeue(Q)
        for all (u, v) ∈ E:
            if dist[v] = ∞:
                Enqueue(Q, v)
                dist[v] ← dist[u] + 1, prev[v] ← u
ReconstructPath(S, u, prev):
    result ← empty
    while u ≠ S:
        result.append(u)
        u ← prev[u]
    return Reverse(result)
# O(|E| + |V|)

DFS graph

# find vertices that are reachable from v
Explore(v):
    visited(v) ← true
    for (v, w) ∈ E:
        if not visited(w):
            Explore(w)
# to reach all vertices: → DFS
DFS(G):
    # optional, mark everything as unvisited
    for all v ∈ V: mark v unvisited
    for v ∈ V:
        if not visited(v):
            Explore(v)

DFS previsit/postvisit

# modify explore to record extra information
Explore(v):
    visited(v) ← true
    previsit(v)
    for (v, w) ∈ E:
        if not visited(w):
            Explore(w)
    postvisit(v)
# useful application for pre/post: a clock (or global counter)
Initialize clock to 1.
previsit(v):
    pre(v) ← clock
    clock ← clock + 1
postvisit(v):
    post(v) ← clock
    clock ← clock + 1

BFS grid

- (convert the grid to an adj list/matrix.) This can be avoided by using direction vectors: [-1,0], [1,0], [0,-1], [0,1] (also [-1,-1], [-1,1], [1,1], [1,-1] if diagonal movement is allowed)
    dr = [-1, +1, 0, 0]
    dc = [0, 0, -1, +1]
    for i in range(4):
        rr = r + dr[i]
        cc = c + dc[i]
        if rr < 0 or cc < 0 or rr >= R or cc >= C:
            continue
if the path length is needed:
- use a temp queue and add a step count
- use 2 variables to store the number of nodes left in the current layer and the number of nodes in the next layer
- use parent[] and traverse back to find the length
storing an (x, y) pair requires either an array or an object wrapper → a lot of packing and unpacking → use 1 queue per dimension, and enqueue and dequeue them all at the same time to reconstruct the coordinates
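A compact sketch that puts the direction vectors to work. It assumes the grid is a list of equal-length strings where `'#'` marks a wall, and that `start` and `goal` are `(row, col)` tuples; all of these conventions are assumptions for illustration. In Python a tuple is cheap to enqueue, so this version uses a single queue and a distance dict rather than one queue per dimension.

```python
from collections import deque


def shortest_path_grid(grid, start, goal):
    R, C = len(grid), len(grid[0])
    dr = [-1, 1, 0, 0]
    dc = [0, 0, -1, 1]
    dist = {start: 0}                    # doubles as the visited set
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        if (r, c) == goal:
            return dist[(r, c)]
        for i in range(4):
            rr, cc = r + dr[i], c + dc[i]
            if rr < 0 or cc < 0 or rr >= R or cc >= C:
                continue                 # outside the grid
            if grid[rr][cc] == "#" or (rr, cc) in dist:
                continue                 # wall or already visited
            dist[(rr, cc)] = dist[(r, c)] + 1
            queue.append((rr, cc))
    return -1                            # goal is unreachable
```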

Given a binary tree, find its minimum depth. The minimum depth is the number of nodes along the shortest path from the root node down to the nearest leaf node.

* leaf node: not node.left and not node.right
* either handle null as a base case or check before going left or right
- dfs and update the result whenever a leaf is reached
- bfs: using a queue (most straightforward)
- recursion calling the same function

3 ways to represent a graph

- Adj Matrix - Edge list - Adj list ← most popular, since most graphs are sparse

bidirectional Dijkstra's algorithm

- Build GR (reverse graph)
- Start Dijkstra from s in G and from t in GR
- Alternate between Dijkstra steps in G and in GR
- Stop when some vertex v is processed both in G and in GR
- Compute the shortest path between s and t
notes:
- Speedup in practice depends on the graph
- Memory consumption is 2x to store G and GR

union-find merge concept

- Hang one of the trees under the root of the other one.
- To quickly find the height of a tree, we keep the height of each subtree in an array rank[1 . . . n]: rank[i] is the height of the subtree whose root is i.
- Hang the shorter one, since we would like to keep the trees shallow. Hanging a shorter tree under a taller one is called the union by rank heuristic.

distance between two nodes in BST

- If both keys are greater than the current node, we move to the right child of the current node.
- If both keys are smaller than the current node, we move to the left child of the current node.
- If one key is smaller and the other key is greater, the current node is the Lowest Common Ancestor (LCA) of the two nodes. We find the distances from the current node to the two keys and return the sum of the distances.

merge AVL tree:

- Merge: combines two binary search trees into a single one.
MergeWithRoot(R1, R2, T):
    T.Left ← R1
    T.Right ← R2
    R1.Parent ← T
    R2.Parent ← T
    return T
# Time O(1)
Merge(R1, R2):
    T ← Find(∞, R1)
    Delete(T)
    MergeWithRoot(R1, R2, T)
    return T
# Time O(h)
AVLTreeMergeWithRoot(R1, R2, T):
    if |R1.Height − R2.Height| ≤ 1:
        MergeWithRoot(R1, R2, T)
        T.Height ← max(R1.Height, R2.Height) + 1
        return T
    else if R1.Height > R2.Height:
        R′ ← AVLTreeMergeWithRoot(R1.Right, R2, T)
        R1.Right ← R′
        R′.Parent ← R1
        Rebalance(R1)
        return root
    else if R1.Height < R2.Height:
        <similar operations, but on the opposite side>
# Time O(|R1.Height − R2.Height| + 1) = O(log n)

splay tree

- O(log n) search for random elements
- If some items are queried more frequently than others, we can do better by putting frequent queries near the root → bring the queried node to the root
- applications:
    + Lexicographic search tree
    + Data compression
    + Encryption
    + Cache implementation, ex: LRU cache
    + Implementing link/cut trees and Euler trees
    + Computing range aggregates in O(log(N))*
    + Removing ranges of a tree in O(log(N))
    + (Logically) reversing a range of a tree in O(log(N))*
    + Merging two trees together (inserting a tree in between two existing elements) in O(log(N+M))
    + Querying faster than O(log(N)) for highly biased query distributions
* Common to most BSTs

union-find tree concept

- Represent each set as a rooted tree
- The ID of a set is the root of its tree
- Use an array parent[1 . . . n]: parent[i] is the parent of i, or i if it is the root
- The root has itself as parent (fyi, self-loops are considered to be back edges)

Single Source Shortest Path (SSSP) on a DAG Longest path on DAG

- SSSP: can be solved in O(V+E) time, since the nodes can be put in topological order → process them in that order, relaxing each outgoing edge of the current node
- The longest path can be found by multiplying all weights by -1, running the same procedure, then multiplying the result by -1 again

split AVL tree

- Split: breaks one binary search tree into two.
Split(R, x):
    if R = null:
        return (null, null)
    if x ≤ R.Key:
        (R1, R2) ← Split(R.Left, x)
        R3 ← MergeWithRoot(R2, R.Right, R)
        return (R1, R3)
    if x > R.Key:
        <similar operations, but on the opposite side>
# Time O(log n)

topological sort

- any DAG (no cycle) can be linearly ordered.
TopologicalSort(G):
    DFS(G)
    sort vertices by reverse post-order, aka finishing time
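A short sketch of the reverse post-order idea, assuming the graph is a dict mapping each vertex to a list of its successors and that the input really is a DAG (no cycle detection is attempted here).

```python
def topological_sort(graph):
    visited = set()
    postorder = []

    def explore(v):
        visited.add(v)
        for w in graph.get(v, []):
            if w not in visited:
                explore(w)
        postorder.append(v)              # v finishes after all its descendants

    for v in graph:
        if v not in visited:
            explore(v)
    return postorder[::-1]               # reverse finishing order
```

The recursion depth grows with the longest path, so for very deep graphs an explicit stack would be the safer choice in Python.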

DFS → components

- basically graph coloring
- dfs and assign a number to nodes (backtrack if visited)
    n = number of nodes in the graph
    g = adjacency list representing graph
    count = 0
    components = empty integer array # size n
    visited = [false, ..., false] # size n

    function findComponents():
        for (i = 0; i < n; i++):
            if !visited[i]:
                count++
                dfs(i)
        return (count, components)

    function dfs(at):
        visited[at] = true
        components[at] = count
        for (next : g[at]):
            if !visited[next]:
                dfs(next)

advantages of tree

- dynamic data structure => easy to add/remove - structure conveys info

Travelling Salesman problem (TSP)

- find a Hamiltonian cycle (a cycle that visits every node exactly once) of minimum cost
- compute the optimal solution for subpaths of length N using the solutions for length N-1
brute force: O(n!)
DP: O(n² 2^n)
ex: To compute the optimal solution for paths of length 3, we need to remember (store) two things from each of the length-2 cases:
1) The set of visited nodes in the subpath
2) The index of the last visited node in the path
    # Finds the minimum TSP tour cost.
    # m - 2D adjacency matrix representing graph
    # S - The start node (0 ≤ S < N)
    function tsp(m, S):
        N = m.size
        # Initialize memo table.
        # Fill table with null values or +∞
        memo = 2D table of size N by 2^N
        setup(m, memo, S, N)
        solve(m, memo, S, N)
        minCost = findMinCost(m, memo, S, N)
        tour = findOptimalTour(m, memo, S, N)
        return (minCost, tour)

Deletion in a BST

- leaf: remove
- 1 child: replace the node with its child
- 2 children: replace the node with the leftmost node of the right subtree (or the rightmost node of the left subtree)
it's easier if the node we try to delete is always the root → recursion: delete the appropriate root and return the root
Code:
    if not root:
        return root
    if root.val < key:
        root.right = self.deleteNode(root.right, key)
    elif root.val > key:
        root.left = self.deleteNode(root.left, key)
    else:
        if not root.left:
            return root.right
        if not root.right:
            return root.left
        # both left and right exist:
        # find the leftmost value of the right subtree (LMR)
        LMR, cur = None, root.right
        while cur:
            LMR, cur = cur.val, cur.left
        root.val = LMR
        root.right = self.deleteNode(root.right, LMR)
    return root

Populating Next Right Pointers (LC 116, 117)

- level order traversal, then set the next pointer for nodes in tempQueue
- 3 variables: cur, nextP, and dummy.
    + assign cur.left to nextP.next, step nextP
    + assign cur.right to nextP.next, step nextP
    + step cur
    + if not cur, go to the next level (dummy.next)
    class Solution:
        # @param root, a tree link node
        # @return nothing
        def connect(self, root):
            cur = root
            dummy = nextP = TreeLinkNode(0)
            while cur:
                nextP.next = cur.left
                nextP = nextP.next or nextP
                nextP.next = cur.right
                nextP = nextP.next or nextP
                cur = cur.next
                if not cur:
                    cur, nextP = dummy.next, dummy

graph connectivity

- modify DFS to also label the component
DFS(G):
    for all v ∈ V: mark v unvisited
    cc ← 1 # component counter
    for v ∈ V:
        if not visited(v):
            Explore(v) # labels every vertex it reaches with cc
            cc ← cc + 1

topological sort DFS

- multiple solutions (DFS or BFS)
- choose any unvisited node and dfs from it. The dfs does 2 things: checks for cycles (using -1, 0, 1 markings) and adds nodes to the result
- add to the result in reversed finishing order

top-sort

- an ordering of the nodes in a directed graph where, for each directed edge A→B, A appears before B
- the ordering is not unique
- only for DAGs. How to detect a cycle: Tarjan's SCC algorithm
- every tree has a topological ordering (cherry-pick the leaves until no node is left)
- O(V+E)
    # Assumption: graph is stored as adjacency list
    function topsort(graph):
        N = graph.numberOfNodes()
        V = [false,...,false] # Length N
        ordering = [0,...,0] # Length N
        i = N - 1 # Index for ordering array
        for(at = 0; at < N; at++):
            if V[at] == false:
                i = dfs(i, at, V, ordering, graph)
        return ordering

    # Execute Depth First Search (DFS)
    function dfs(i, at, V, ordering, graph):
        V[at] = true
        edges = graph.getEdgesOutFromNode(at)
        for edge in edges:
            if V[edge.to] == false:
                i = dfs(i, edge.to, V, ordering, graph)
        ordering[i] = at
        return i - 1

Articulation points

- a node whose removal increases the number of connected components
- The only time id(e.from) == lowlink(e.to) fails is when the starting node has 0 or 1 outgoing directed edges. This is because either the node is a singleton or the node is trapped in a cycle. (**)
    id = 0
    g = adjacency list with undirected edges
    n = size of the graph
    outEdgeCount = 0
    # In these arrays index i represents node i
    low = [0, 0, ... 0, 0] # Length n
    ids = [0, 0, ... 0, 0] # Length n
    visited = [false, ..., false] # Length n
    isArt = [false, ..., false] # Length n

    function findArtPoints():
        for (i = 0; i < n; i = i + 1):
            if (!visited[i]):
                outEdgeCount = 0 # Reset edge count
                dfs(i, i, -1)
                isArt[i] = (outEdgeCount > 1)
        return isArt
# the out edge count is there to eliminate the (**) condition

Re-balancing BST

- rotation:
RotateRight(X):
    P ← X.Parent
    Y ← X.Left
    B ← Y.Right
    Y.Parent ← P
    P.AppropriateChild ← Y
    X.Parent ← Y, Y.Right ← X
    B.Parent ← X, X.Left ← B
When inserting into an AVL tree, we need to re-balance along the insertion path, which takes at most O(log n).

BSTIterator (LC173)

- smallest number -> in-order traversal
- traversal -> stack
init: iteratively push left children
next: pop the stack, move right, then iteratively push left children
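A small sketch of the stack-based iterator, assuming a hypothetical `TreeNode` class; the method names `next` and `has_next` are chosen here for illustration.

```python
class TreeNode:
    """Hypothetical node type used only for this illustration."""
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right


class BSTIterator:
    def __init__(self, root):
        self.stack = []
        self._push_left(root)            # init: walk down the left spine

    def _push_left(self, node):
        while node:
            self.stack.append(node)
            node = node.left

    def has_next(self):
        return bool(self.stack)

    def next(self):
        node = self.stack.pop()          # smallest value not yet returned
        self._push_left(node.right)      # move right, then push left children
        return node.val
```

The stack never holds more than one root-to-leaf path, so the extra space is O(h), and each node is pushed and popped once, which makes `next` O(1) amortized.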

topological sort DFS

- step 1:
    . build the graph (map of adj lists: set())
    . identify the node with min (inDeg-outDeg): this needs a separate data structure
- step 2:
    . delete the node found with min difDeg, together with its edges (this can be done by decrementing the difDeg array)
    . place it in the output
Loop steps 1 & 2 until the graph is empty.

BST successor (LC285)

- the successor is the smallest value larger than p -> write to res whenever stepping left
- do the usual divide and conquer (binary search)
    res = None
    while root:
        if root.val > p.val:
            res = root.val
            root = root.left
        else:
            root = root.right
    return res

DFS graph DFS(v)

- used to count SCCs, check connectivity, or find bridges/articulation points
    n = num of nodes
    g = adj list graph
    visited = [false]*n

    function dfs(v):
        if visited[v]: return # backtrack
        visited[v] = true
        neighbors = g[v]
        for next in neighbors:
            dfs(next)
stack:
- loop: visit a node, explore 1 of its neighbors, backtrack (return) if all are visited
space complexity: worst case all nodes are on the stack: O(V)
time complexity: O(V+E)

Balanced Binary Tree (LC110)

- at every level, heightL and heightR differ by at most 1 → need a helper that returns (isBalanced and height). Since height >= 0, we can use a negative number to mark an unbalanced subtree.
Code:
    class Solution(object):
        def isBalanced(self, root):
            def check(root):
                if root is None:
                    return 0
                left = check(root.left)
                right = check(root.right)
                # -1 marks an unbalanced subtree
                if left == -1 or right == -1 or abs(left - right) > 1:
                    return -1
                return 1 + max(left, right)
            return check(root) != -1

representing graph: 1. adj matrix 2. adj list 3. edge list

1. adj matrix: matrix storing weights
    pros: space efficient for dense graphs, O(1) look-up, simplest representation
    cons: O(V²) space, O(V²) time to iterate
2. adj list: stores destination and weight
    pros: space efficient for sparse graphs, time efficient to iterate
    cons: less space efficient for denser graphs, weight look-up is O(E)
3. edge list: unordered list of edges (start, end, weight)

Invert Binary Tree (LC 226)

1. recursion 2. iterative (queue)

graph theory problems: 1. shortest path 2. connectivity 3. Neg cycle 4. strongly connected components 5. traveling salesman 6. bridge 7. articulation points 8. minimum spanning tree 9. network flow

1. shortest path: BFS (unweighted graph), Dijkstra's, Bellman-Ford, Floyd-Warshall, A*, and many more.
2. connectivity: is there a path A->B? Use union-find or any search algorithm (DFS/BFS)
3. Neg cycle: use Bellman-Ford or Floyd-Warshall (can be used in arbitrage)
4. Strongly connected components: SCCs, self-contained cycles within a digraph. Use Tarjan's or Kosaraju's
5. TSP: Held-Karp, branch and bound, or approximation algorithms (ex: ant colony optimization)
6. bridge / cut edge: any edge whose removal increases the number of connected components (used to find weak points, bottlenecks, or vulnerabilities in a graph)
7. articulation point / cut vertex: any node whose removal increases the number of connected components
8. MST: subset of edges that connects all vertices without any cycles and with the min possible total edge weight. Use Kruskal's, Prim's, or Boruvka's algorithms
9. max flow: source-sink flow. Use Ford-Fulkerson, Edmonds-Karp, or Dinic's

Subtree of Another Tree (LC572)

3 methods:
- serialize to string, and check for a substring
- dfs and check same
- Merkle hashing (https://leetcode.com/problems/subtree-of-another-tree/discuss/102741/Python-Straightforward-with-Explanation-(O(ST)-and-O(S+T)-approaches)
serialize:
    def convert(p):
        return "^" + str(p.val) + "#" + convert(p.left) + convert(p.right) if p else "$"
    return convert(t) in convert(s)
dfs:
    class Solution(object):
        def isSubtree(self, s, t):
            if self.isSame(s, t):
                return True
            if not s:
                return False
            return self.isSubtree(s.left, t) or self.isSubtree(s.right, t)

        def isSame(self, s, t):
            if not (s and t):
                return s is t
            return (s.val == t.val and self.isSame(s.left, t.left)
                    and self.isSame(s.right, t.right))

type of graph: 1. undirected/directed graph 2. weight/unweighted graph 3. trees 4. rooted tree 5. DAG 6. bipartite 7. complete graph

3. trees: an undirected graph with no cycle. It's a connected graph with N nodes and N-1 edges
4. rooted tree (directed): a tree with a designated root. Every edge points away from the root (arborescence or out-tree) or toward it (anti-arborescence or in-tree)
5. DAG: a directed graph with no cycle. All out-trees are DAGs, but not the reverse (used for topological sort)
6. bipartite: can be split into 2 groups (two-colorable, no odd-length cycle), used in network flow
7. complete graph: has a unique edge between every pair of nodes. Usually the worst-case graph.

strongly connected components

A directed graph can be partitioned into strongly connected components, where two vertices are connected (each reachable from the other) if and only if they are in the same component.
SCCs(G):
    run DFS(GR) # GR is the reverse graph
    for v ∈ V in reverse postorder:
        if not visited(v):
            Explore(v)
            mark visited vertices as new SCC
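A sketch of this procedure, assuming the reverse postorder is taken from a DFS of the reversed graph and the final exploration runs on the original graph. The graph is assumed to be a dict mapping each vertex to a list of its successors, a representation chosen here for illustration.

```python
def strongly_connected_components(graph):
    # Build the reverse graph GR.
    reverse = {v: [] for v in graph}
    for u, nbrs in graph.items():
        for v in nbrs:
            reverse.setdefault(v, []).append(u)

    visited = set()

    def explore(g, v, out):
        visited.add(v)
        for w in g.get(v, []):
            if w not in visited:
                explore(g, w, out)
        out.append(v)                    # record v when it finishes

    # DFS on GR to obtain the postorder.
    postorder = []
    for v in reverse:
        if v not in visited:
            explore(reverse, v, postorder)

    # Explore G in reverse postorder; each exploration yields one SCC.
    visited.clear()
    components = []
    for v in reversed(postorder):
        if v not in visited:
            component = []
            explore(graph, v, component)
            components.append(component)
    return components
```

The vertex with the largest postorder in GR lies in a sink component of G, so exploring from it in G cannot leak into another component.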

source and sink

A source is a vertex with no incoming edges. A sink is a vertex with no outgoing edges. Follow path as far as possible v1 → v2 → . . . → vn. Eventually either: -Cannot extend (found sink). -Repeat a vertex (have a cycle).

minimum spanning tree: Kruskal's

Add with union-find: repeatedly add the next lightest edge if this doesn't produce a cycle; use disjoint sets to check whether the current edge joins two vertices from different components.
- Algorithm: repeatedly add to X the next lightest edge e that doesn't produce a cycle
- At any point in time, the set X is a forest, that is, a collection of trees
- The next edge e connects two different trees, say T1 and T2
- The edge e is the lightest between T1 and V − T1, hence adding e is safe
Kruskal(G):
    for all u ∈ V:
        MakeSet(u)
    X ← empty set
    sort the edges E by weight
    for all {u, v} ∈ E in non-decreasing weight order:
        if Find(u) ≠ Find(v):
            add {u, v} to X
            Union(u, v)
    return X
O(|E| log |V|)
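A runnable sketch of Kruskal's algorithm with a small inline union-find (path compression plus union by rank). It assumes vertices are labeled 0..n-1 and edges are given as `(weight, u, v)` tuples, an input format chosen here so that sorting orders by weight.

```python
def kruskal(n, edges):
    parent = list(range(n))
    rank = [0] * n

    def find(i):
        if parent[i] != i:
            parent[i] = find(parent[i])  # path compression
        return parent[i]

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri == rj:
            return False                 # same component: edge would form a cycle
        if rank[ri] > rank[rj]:
            parent[rj] = ri
        else:
            parent[ri] = rj
            if rank[ri] == rank[rj]:
                rank[rj] += 1
        return True

    mst = []
    for w, u, v in sorted(edges):        # non-decreasing weight order
        if union(u, v):
            mst.append((w, u, v))
    return mst
```

If the graph is disconnected, the result is a minimum spanning forest with fewer than n-1 edges.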

minimum spanning tree: Prim's

Add with heap: repeatedly attach a new vertex to the current tree by a lightest edge; use a priority queue to quickly find the next lightest edge.
- X is always a subtree and grows by one edge at each iteration
- we add a lightest edge between a vertex of the tree and a vertex not in the tree
- very similar to Dijkstra's algorithm
Prim(G):
    for all u ∈ V:
        cost[u] ← ∞, parent[u] ← nil
    pick any initial vertex u0
    cost[u0] ← 0
    PrioQ ← MakeQueue(V) # priority is cost
    while PrioQ is not empty:
        v ← ExtractMin(PrioQ)
        for all {v, z} ∈ E:
            if z ∈ PrioQ and cost[z] > w(v, z):
                cost[z] ← w(v, z), parent[z] ← v
                ChangePriority(PrioQ, z, cost[z])
O(|E| log |V|)
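A sketch of Prim's algorithm using `heapq`. Since `heapq` has no ChangePriority operation, this variant pushes candidate edges and skips the ones whose endpoint is already in the tree, which is a substitution for the priority-update step in the pseudocode. The graph is assumed to be a dict mapping each vertex to a list of `(neighbor, weight)` pairs, with every undirected edge listed in both directions.

```python
import heapq


def prim(graph, start):
    in_tree = {start}
    heap = [(w, start, v) for v, w in graph[start]]   # (weight, from, to)
    heapq.heapify(heap)
    total, tree_edges = 0, []
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                     # stale edge: both ends already in the tree
        in_tree.add(v)
        total += w
        tree_edges.append((u, v, w))
        for z, wz in graph[v]:
            if z not in in_tree:
                heapq.heappush(heap, (wz, v, z))
    return total, tree_edges
```

The heap can hold up to E entries here, so the bound is O(E log E), which is the same as O(E log V) up to a constant factor.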

bidirectional Dijkstra's idea

Alternate Dijkstra's from s and from t.
Say the distance is 2r:
→ traditional Dijkstra's explores an area of π(2r)² = 4πr²
→ bidirectional Dijkstra's explores 2 * πr²
→ 2x speedup for road networks, but 1000x for social networks, since it reduces the problem size to sqrt(n) due to the six handshakes rule

BFS pseudo code

BFS(G, S):
    for all u ∈ V:
        dist[u] ← ∞
    dist[S] ← 0
    Q ← {S} # queue containing just S
    while Q is not empty:
        u ← Dequeue(Q)
        for all (u, v) ∈ E:
            if dist[v] = ∞:
                Enqueue(Q, v)
                dist[v] ← dist[u] + 1
# if the language has no ∞ → use the number of nodes as the upper bound

shortest path algorithms summary

BFS: O(V+E), large graphs, unweighted graphs (even APSP), cannot detect neg cycles, best SP on unweighted graphs
Dijkstra's: O((V+E) log V), large/medium graphs, ok for APSP, cannot detect neg cycles, best SP on weighted graphs
Bellman-Ford: O(VE), medium/small graphs, bad for APSP, can detect neg cycles, ok for SP on weighted graphs, bad for unweighted graphs
Floyd-Warshall: O(V³), small graphs, good for APSP, good for detecting neg cycles, bad for SP

Given a Binary Search Tree (BST) with the root node root, return the minimum difference between the values of any two different nodes (not 2 given nodes and doesn't need to be directly connected) in the tree (LC 783)

BST order => inorder traversal, remembering the previously visited node
self.prev = float('-inf')
self.res = float('inf')
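A minimal sketch of this idea, assuming a LeetCode-style `TreeNode` (redefined here so the block is self-contained) and a hypothetical function name `min_diff_in_bst`. It uses an iterative inorder walk and local variables instead of `self.prev` / `self.res`.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def min_diff_in_bst(root):
    """Smallest difference between any two node values in a BST."""
    prev = float('-inf')   # value of the previously visited node (inorder)
    res = float('inf')
    stack, node = [], root
    while stack or node:
        if node:
            stack.append(node)
            node = node.left
        else:
            node = stack.pop()
            # inorder visits values in ascending order, so the closest
            # pair is always two consecutive visits
            res = min(res, node.val - prev)
            prev = node.val
            node = node.right
    return res
```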

Floyd-Warshall: shortest path between all pairs, neg edges allowed

KeyNote: build a distance matrix, then relax through intermediate nodes: check whether a path through an intermediate node is shorter than the current path.
dp[k][i][j] = m[i][j] if k = 0
dp[k][i][j] = min(dp[k-1][i][j], dp[k-1][i][k] + dp[k-1][k][j]) otherwise
In practice, compute the solution for k in place: dp[i][j] = min(dp[i][j], dp[i][k] + dp[k][j])
- Reuse the best distance from i to j with values routing through nodes {0,1,...,k-1}
- Find the best distance from i to j through node k reusing best solutions from {0,1,...,k-1}
- "go from i to k" and then "go from k to j"
O(V^3), ideal for graphs with up to a few hundred nodes
https://www.youtube.com/watch?v=4OQeCuLYj-4
- distance matrix: distance from a vertex to itself = 0, fill in the given edges
- for k, i, j: if dis[i][j] > dis[i][k] + dis[k][j] => update and route through k

# Global/class scope variables
n = size of the adjacency matrix
dp = the memo table that will contain APSP soln
next = matrix used to reconstruct shortest paths

function floydWarshall(m):
  setup(m)
  # Execute FW all pairs shortest path algorithm.
  for(k := 0; k < n; k++):
    for(i := 0; i < n; i++):
      for(j := 0; j < n; j++):
        if(dp[i][k] + dp[k][j] < dp[i][j]):
          dp[i][j] = dp[i][k] + dp[k][j]
          next[i][j] = next[i][k]
  # Detect and propagate negative cycles.
  propagateNegativeCycles(dp, n)
  # Return APSP matrix
  return dp

function propagateNegativeCycles(dp, n):
  # Execute FW APSP algorithm a second time but
  # this time if the distance can be improved
  # set the optimal distance to be -∞.
  # Every edge (i, j) marked with -∞ is either
  # part of or reaches into a negative cycle.
  for(k := 0; k < n; k++):
    for(i := 0; i < n; i++):
      for(j := 0; j < n; j++):
        if(dp[i][k] + dp[k][j] < dp[i][j]):
          dp[i][j] = -∞
          next[i][j] = -1

# Reconstructs the shortest path between nodes
# 'start' and 'end'. You must run the
# floydWarshall solver before calling this method.
# Returns null if the path is affected by a negative cycle.
function reconstructPath(start, end):
  path = []
  # Check if there exists a path between
  # the start and the end node.
  if dp[start][end] == +∞: return path
  at := start
  # Reconstruct path from next matrix
  for(;at != end; at = next[at][end]):
    if at == -1: return null
    path.add(at)
  if next[at][end] == -1: return null
  path.add(end)
  return path
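The core triple loop can be sketched in Python like this. It is a simplified illustration only: the name `floyd_warshall` and the `(u, v, w)` edge-list input are assumptions, and it omits the `next` matrix and the negative-cycle propagation pass from the pseudocode above.

```python
INF = float('inf')

def floyd_warshall(n, edges):
    """All-pairs shortest distances for a directed graph without negative cycles.

    n     : number of vertices, labelled 0..n-1 (assumed)
    edges : list of (u, v, w) directed edges (assumed format)
    """
    dist = [[INF] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge
    for k in range(n):            # allow k as an intermediate node
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```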

Kruskal: min spanning tree

KeyNote: process all edges sorted by weight, using union-find: https://www.youtube.com/watch?v=JZBQLXgSGfs
- Sort all edges ascending by weight
- Group them using union, in order. This can be simplified by assigning a separate group to each node at the beginning
- If both nodes are in the same group (using find): skip the edge
- If they are in different groups: merge the groups
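The steps above can be sketched in Python as follows. The function name `kruskal_mst` and the `(weight, u, v)` edge format are assumptions for illustration; the union-find here uses path halving only, without union by rank, to keep the sketch short.

```python
def kruskal_mst(n, edges):
    """Return (total_weight, tree_edges) of a minimum spanning forest.

    n     : number of vertices, labelled 0..n-1 (assumed)
    edges : list of (weight, u, v) undirected edges (assumed format)
    """
    parent = list(range(n))  # every node starts in its own group

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    total, tree_edges = 0, []
    for w, u, v in sorted(edges):   # ascending by weight
        ru, rv = find(u), find(v)
        if ru != rv:                # different groups: edge is safe, merge
            parent[ru] = rv
            tree_edges.append((u, v, w))
            total += w
        # same group: adding the edge would create a cycle, so skip it
    return total, tree_edges
```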

Bellman-Ford: single source shortest path

KeyNote: relax every edge repeatedly, O(VE)
Use for stock arbitrage, since Dijkstra's cannot deal with negative edge weights
1. Set every entry in D to +∞
2. Set D[S] = 0
3. Pass 1: relax each edge V-1 times (if going through this edge makes the current distance smaller, take it)
4. Pass 2: cycle detection (if an edge can still be relaxed, its target is affected by a negative cycle => set its distance to -∞)

for (i = 0; i < V-1; i = i + 1):
  for edge in graph.edges:
    // Relax edge (update D with shorter path)
    if (D[edge.from] + edge.cost < D[edge.to])
      D[edge.to] = D[edge.from] + edge.cost
// Repeat to find nodes caught in a negative cycle
for (i = 0; i < V-1; i = i + 1):
  for edge in graph.edges:
    if (D[edge.from] + edge.cost < D[edge.to])
      D[edge.to] = -∞
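A Python sketch of the two passes, assuming vertices are labelled 0..n-1 and edges are given as `(u, v, w)` tuples; the function name `bellman_ford` is chosen for illustration.

```python
def bellman_ford(n, edges, source):
    """Single-source shortest distances; -inf marks nodes hit by a negative cycle.

    n     : number of vertices, labelled 0..n-1 (assumed)
    edges : list of (u, v, w) directed edges (assumed format)
    """
    INF = float('inf')
    dist = [INF] * n
    dist[source] = 0
    # Pass 1: relax every edge V-1 times
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # Pass 2: anything that can still be improved is affected by a negative cycle
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = -INF
    return dist
```

Unreachable nodes stay at `inf` because `inf + w < inf` is never true, and `-inf` spreads to everything reachable from a negative cycle during the second pass.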

post-order traversal iteratively (DFS)

KeyNote: root last, reverse result (visit a node's left subtree, then its right subtree, then the node itself)
stack, r = [], []
- while stack or root:
  - if root: append to r and to stack, then move right
  - else: pop and go to node.left
- reverse r

# postorder
class Solution(object):
    def postorderTraversal(self, root):
        stack, r = [], []
        while stack or root:
            if root:
                r.append(root.val)
                stack.append(root)
                root = root.right
            else:
                root = stack.pop().left
        return r[::-1]

inorder traversal iteratively (DFS)

KeyNote: root middle (visit the node's left subtree, then the node itself, then its right subtree)
stack, r = [], []
- while stack or root:
  - if root: append to stack, then move left
  - else: pop, append to r, then go right

Code:
# inorder
class Solution(object):
    def inorderTraversal(self, root):
        stack, r = [], []
        while stack or root:
            if root:
                stack.append(root)
                root = root.left
            else:
                root = stack.pop()
                r.append(root.val)
                root = root.right
        return r

level-order traversal (BFS)

KeyNote: use a nextLevel array

Code:
class Solution(object):
    def levelOrder(self, root):
        if not root:
            return []
        Q = [root]
        res = []
        while Q:
            res.append([node.val for node in Q])
            tempQ = []  # nodes of the next level
            for node in Q:
                if node.left:
                    tempQ.append(node.left)
                if node.right:
                    tempQ.append(node.right)
            Q = tempQ
        return res

bridges algorithm

Keynote: find low-link values. When reaching a visited node, take its id and, while backtracking, write this value to all nodes on the path. The low-link value is propagated throughout the cycle.
Bridges and articulation points are important in graph theory because they often hint at weak points, bottlenecks or vulnerabilities in a graph.

id = 0
g = adjacency list with undirected edges
n = size of the graph
# In these arrays index i represents node i
ids = [0, 0, ... 0, 0]  # Length n
low = [0, 0, ... 0, 0]  # Length n
visited = [false, ..., false]  # Length n

function findBridges():
  bridges = []
  # Finds all bridges in the graph across various connected components.
  for (i = 0; i < n; i = i + 1):
    if (!visited[i]):
      dfs(i, -1, bridges)
  return bridges

# Perform Depth First Search (DFS) to find bridges.
# at = current node, parent = previous node. The
# bridges list is always of even length and indexes
# (2*i, 2*i+1) form a bridge. For example, nodes at
# indexes (0, 1) are a bridge, (2, 3) is another etc...
function dfs(at, parent, bridges):
  visited[at] = true
  id = id + 1
  low[at] = ids[at] = id
  # For each edge from node 'at' to node 'to'
  for (to : g[at]):
    if to == parent: continue
    if (!visited[to]):
      dfs(to, at, bridges)
      low[at] = min(low[at], low[to])
      if (ids[at] < low[to]):
        bridges.add(at)
        bridges.add(to)
    else:
      low[at] = min(low[at], ids[to])

preorder traversal iteratively (DFS)

Keynote: root first (visit a node, then its left subtree, then its right subtree)
stack, r = [], []
- while stack or root:
  - if root: append to r and to stack, then move left
  - else: pop and go to node.right

Code:
# preorder
class Solution(object):
    def preorderTraversal(self, root):
        stack, r = [], []
        while stack or root:
            if root:
                stack.append(root)
                r.append(root.val)
                root = root.left
            else:
                root = stack.pop().right
        return r

minimum spanning tree algorithms

Kruskal's algorithm: repeatedly add the next lightest edge if this doesn't produce a cycle. (union-find) Prim's algorithm: repeatedly attach a new vertex to the current tree by a lightest edge.

Contraction Hierarchies Algorithms

Node ordering:
- Nodes can be ordered by some "importance". Ex: long-distance trips go through highways. Less important roads merge into more important roads → hierarchy of roads.
- Importance first increases then decreases along any shortest path.
- Minimize the number of added shortcuts
- Criteria: edge difference, number of contracted neighbors, shortcut cover, node level

Shortest paths with preprocessing:
- Preprocess the graph
- Find distance and shortest path in the preprocessed graph
- Reconstruct the shortest path in the initial graph

Witness search:
- we want to check whether there is a witness path from u to w bypassing v with length at most ℓ(u, v) + ℓ(v, w); if there is, there is no need to add a shortcut from u to w
- For each predecessor ui of v, run Dijkstra from ui ignoring v
- Stop Dijkstra when the distance from the source becomes too big & limit the number of hops
- for bidirectional search, don't stop when the middle point is found; stop when the extracted node is further than the target.

ComputeDistance(s, t, ...):
  estimate ← +∞
  Fill dist, distR with +∞ for each node
  dist[s] ← 0, distR[t] ← 0
  proc ← empty, procR ← empty
  while there are nodes to process:
    v ← ExtractMin(dist)
    if dist[v] ≤ estimate:
      Process(v, ...)
    if v in procR and dist[v] + distR[v] < estimate:
      estimate ← dist[v] + distR[v]
    vR ← ExtractMin(distR)
    Repeat symmetrically for vR
  return estimate

Algorithm:
- Keep all nodes in a priority queue ordered by importance
- On each iteration, extract the least important node
- Recompute its importance
- If it's still minimal (compare with the top of the priority queue), contract the node
- Otherwise, put it back into the priority queue with the new priority

A* Path-finding algorithm

Node:
- G: movement cost (from source to current)
- H: heuristic (from current to goal; can use the Manhattan method: vertical + horizontal distances)
- F: G+H
- parent
List:
- open: nodes that need to be checked (the next ring, like tempQueue in BFS on a graph)
- closed: nodes that have been checked
Algorithm:
- calculate the heuristic cost for all nodes on the board (can do here or in the loop)
- ending condition: get to the end node
- every move:
  + update the path cost (similar to Dijkstra's: cost = min(current cost, cost via its neighbor))
  + calculate F = G+H => update the priority queue/binary heap
  + pop the queue and add to the closed list
  + if the end node is a neighbor of current, stop, then trace back the path using node.parent
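A small grid-based sketch of A*, under several assumptions not in the notes: the board is a list of strings where '#' is a wall, moves are 4-directional with cost 1, and the function name `a_star` is made up. It also stops when the goal is popped from the open heap rather than when it is first seen as a neighbor, which is a common variant.

```python
import heapq

def a_star(grid, start, goal):
    """Return the list of cells on a shortest path, or None if unreachable."""
    def h(cell):  # Manhattan heuristic: vertical + horizontal distance
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]   # entries are (F, G, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    while open_heap:
        _, g, cur = heapq.heappop(open_heap)
        if cur == goal:
            path = []
            while cur is not None:       # trace back through parents
                path.append(cur)
                cur = parent[cur]
            return path[::-1]
        if cur in closed:
            continue                     # stale heap entry
        closed.add(cur)
        r, c = cur
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != '#':
                ng = g + 1
                if ng < g_cost.get((nr, nc), float('inf')):
                    g_cost[(nr, nc)] = ng
                    parent[(nr, nc)] = cur
                    heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None
```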

reconstruct a BST from inorder and pre/post (LC 449) Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment. Design an algorithm to serialize and deserialize a binary search tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary search tree can be serialized to a string and this string can be deserialized to the original tree structure.

Note 1:
- we can easily reconstruct a binary tree from post+in or pre+in
- the inorder of a BST is an ascending sorted array, so we could get it by sorting the post-order or pre-order. However, since we traverse the tree anyway, why not collect both in one pass.
This method works both for a BST and for a binary tree without repeated values (since it uses ino.index, elements cannot be repeated)

class Codec:
    def serialize(self, root):
        """Encodes a tree to a single string.
        :type root: TreeNode
        :rtype: str
        """
        def traverse(node):
            if node:
                traverse(node.left)
                ino.append(str(node.val))
                traverse(node.right)
                posto.append(str(node.val))
        posto = []
        ino = []
        traverse(root)
        # post-order values followed by in-order values, space-separated
        return ' '.join(posto + ino)

    def deserialize(self, data):
        """Decodes your encoded data to tree.
        :type data: str
        :rtype: TreeNode
        """
        def buildTree(posto, ino):
            if not posto:
                return None
            rootVal = posto.pop()  # the last post-order value is the root
            index = ino.index(rootVal)
            root = TreeNode(int(rootVal))
            root.left = buildTree(posto[:index], ino[:index])
            root.right = buildTree(posto[index:], ino[index + 1:])
            return root
        vals = data.split()
        half = len(vals) // 2  # first half is post-order, second half is in-order
        return buildTree(vals[:half], vals[half:])

time complexity DFS, BFS

O(V+E)
*double BFS is slower when, for example, there is only 1 path A -> B, A has very few neighbors, but B has a lot.

DFS (analysis):
- Setting/getting a vertex/edge label takes O(1) time
- Each vertex is labeled twice: once as UNEXPLORED, once as VISITED
- Each edge is labeled twice: once as UNEXPLORED, once as DISCOVERY or BACK
- Method incidentEdges is called once for each vertex
- DFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
- Recall that Σv deg(v) = 2m

BFS (analysis):
- Setting/getting a vertex/edge label takes O(1) time
- Each vertex is labeled twice: once as UNEXPLORED, once as VISITED
- Each edge is labeled twice: once as UNEXPLORED, once as DISCOVERY or CROSS
- Each vertex is inserted once into a sequence Li
- Method incidentEdges is called once for each vertex
- BFS runs in O(n + m) time provided the graph is represented by the adjacency list structure
- Recall that Σv deg(v) = 2m

reconstruct a binary tree from pre-order alone, with null markers (LC 297) Serialization is the process of converting a data structure or object into a sequence of bits so that it can be stored in a file or memory buffer, or transmitted across a network connection link to be reconstructed later in the same or another computer environment. Design an algorithm to serialize and deserialize a binary tree. There is no restriction on how your serialization/deserialization algorithm should work. You just need to ensure that a binary tree can be serialized to a string and this string can be deserialized to the original tree structure.

a tree can be reconstructed from pre-order (preferred) or post-order alone if the null nodes are marked with a special character
- Use iter() and next()

class Codec:
    def serialize(self, root):
        def doit(node):
            if node:
                vals.append(str(node.val))
                doit(node.left)
                doit(node.right)
            else:
                vals.append('#')  # marker for a null child
        vals = []
        doit(root)
        return ' '.join(vals)

    def deserialize(self, data):
        def doit():
            val = next(vals)
            if val == '#':
                return None
            node = TreeNode(int(val))
            node.left = doit()
            node.right = doit()
            return node
        vals = iter(data.split())
        return doit()

tree real life example

family tree, decision tree, expression tree, file system, hierarchy, abstract syntax tree, binary search tree

tree vs graph

Graphs can have:
- many roots
- loops (cycles)
- either directed or undirected edges
Tree:
- an undirected graph that is connected and acyclic
- n vertices → (n − 1) edges
- Any connected undirected graph G(V, E) with |E| = |V| − 1 is a tree
- An undirected graph is a tree iff there is a unique path between any pair of its vertices
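The "connected and |E| = |V| − 1" characterization translates directly into a check. This is a sketch with an assumed function name `is_tree`, assumed vertex labels 0..n-1, and an undirected edge list as input.

```python
def is_tree(n, edges):
    """True if the undirected graph on vertices 0..n-1 is a tree."""
    if n == 0 or len(edges) != n - 1:
        return False  # a tree on n vertices has exactly n - 1 edges
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    # With n - 1 edges, the graph is a tree exactly when it is connected
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n
```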

Lowest Common Ancestor of a Binary Tree (LC 236)

if root in [None, p, q]:
    return root
L, R = (self.lowestCommonAncestor(kid, p, q) for kid in (root.left, root.right))
return root if L and R else L or R

col order traversal

index -1 when going left, index +1 when going right
cols = collections.defaultdict(list)
Q = [(root, 0)]
for node, i in Q:
    if node:
        cols[i].append(node.val)
        Q += (node.left, i-1), (node.right, i+1)
return [cols[i] for i in sorted(cols)]

pre/in/post order traversal recursive (DFS)

inorder:
  dfs(node.left)
  add node
  dfs(node.right)
# either check for null before going left or right, or handle null as the base case
ex: in-order: return self.inorderTraversal(root.left) + [root.val] + self.inorderTraversal(root.right) if root else []

Trie

insert and search cost O(k), where k is the length of the key
space requirement: O(alphabet_size*k*n), where n is the number of keys in the trie
good for prefix search
TrieNode:
- map {child char: child node}
- bool endOfWord
insert:
- cur = root
- if char not in cur.children, create a new node
- traverse down
- when we reach the node for the last char, set its endOfWord to True

import collections

class TrieNode:
    def __init__(self):
        # missing children are created on first access
        self.children = collections.defaultdict(TrieNode)
        self.is_word = False

class Trie(object):
    def __init__(self):
        """Initialize your data structure here."""
        self.root = TrieNode()

    def insert(self, word):
        """Inserts a word into the trie.
        :type word: str
        :rtype: void
        """
        cur = self.root
        for ch in word:
            cur = cur.children[ch]
        cur.is_word = True

    def search(self, word):
        """Returns if the word is in the trie.
        :type word: str
        :rtype: bool
        """
        cur = self.root
        for ch in word:
            if ch not in cur.children:
                return False
            cur = cur.children[ch]
        return cur.is_word

    def startsWith(self, prefix):
        """Returns if there is any word in the trie that starts with the given prefix.
        :type prefix: str
        :rtype: bool
        """
        cur = self.root
        for ch in prefix:
            if ch not in cur.children:
                return False
            cur = cur.children[ch]
        return True

Tarjan's algorithms

keynote: same low-link value → same strongly connected component (SCC)

function dfs(at):
  stack.push(at)
  onStack[at] = true
  ids[at] = low[at] = id++
  # Visit all neighbours & min low-link on callback
  for(to : g[at]):
    if(ids[to] == UNVISITED): dfs(to)
    if(onStack[to]): low[at] = min(low[at], low[to])
  # After having visited all the neighbours of 'at'
  # if we're at the start of a SCC empty the seen
  # stack until we're back to the start of the SCC.
  if(ids[at] == low[at]):
    for(node = stack.pop();;node = stack.pop()):
      onStack[node] = false
      low[node] = ids[at]
      if(node == at): break
    sccCount++

fastest route algorithms: Bellman-Ford Arbitrage

relax with DP
ex: Arbitrage → maximize the product of all conversion rates along the path → maximize ∏ r_i → maximize ∑ log(r_i) → minimize ∑ (−log r_i) → shortest path algorithm. However, in this case the weights can be negative → Bellman-Ford

BellmanFord(G, S): {no negative weight cycles in G}
  for all u ∈ V:
    dist[u] ← ∞
    prev[u] ← nil
  dist[S] ← 0
  repeat |V| − 1 times:
    for all (u, v) ∈ E:
      Relax(u, v)
O(|V||E|)

Detect Infinite Arbitrage:
- Do |V| iterations of Bellman-Ford, save all nodes relaxed on the V-th iteration as set A
- Put all nodes from A in queue Q
- Do breadth-first search with queue Q and find all nodes reachable from A
→ All those nodes and only those can have infinite arbitrage

Reconstruct Infinite Arbitrage:
- During breadth-first search, remember the parent of each visited node
- Reconstruct the path to u from some node w relaxed on iteration V
- Go back from w to find the negative cycle from which w is reachable
- Use this negative cycle to achieve infinite arbitrage from S to u
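The −log transformation can be sketched as a simple yes/no arbitrage check. Assumptions for illustration: `rates[i][j]` is how many units of currency j one unit of currency i buys, the function name `has_arbitrage` is made up, and a small epsilon guards against floating-point noise. Starting every distance at 0 acts like a virtual source connected to all currencies, so any negative cycle is found regardless of where it sits.

```python
import math

def has_arbitrage(rates):
    """True if some cycle of conversions multiplies to more than 1."""
    n = len(rates)
    eps = 1e-12  # tolerance for floating-point rounding (assumed value)
    # Edge weight -log(rate): a product > 1 becomes a negative-weight cycle
    edges = [(i, j, -math.log(rates[i][j]))
             for i in range(n) for j in range(n)
             if i != j and rates[i][j] > 0]
    dist = [0.0] * n
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v] - eps:
                dist[v] = dist[u] + w
    # If any edge can still be relaxed, there is a negative cycle
    return any(dist[u] + w < dist[v] - eps for u, v, w in edges)
```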

fastest route algorithms: Dijkstra's

relax with heap
Dijkstra's single source, shortest path:
- keep a set R of vertices for which dist is already set correctly (known region)
- add S to R
- on each iteration take a vertex outside of R with the minimal dist-value, add it to R, and relax all its outgoing edges

Dijkstra(G, S):
  for all u ∈ V:
    dist[u] ← ∞, prev[u] ← nil
  dist[S] ← 0
  H ← MakeQueue(V)  {dist-values as keys}
  while H is not empty:
    u ← ExtractMin(H)
    for all (u, v) ∈ E:
      if dist[v] > dist[u] + w(u, v):
        dist[v] ← dist[u] + w(u, v)
        prev[v] ← u
        ChangePriority(H, v, dist[v])
O((|V| + |E|) log |V|)
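A Python sketch of the same algorithm, assuming non-negative weights and a dict adjacency format of `(neighbor, weight)` pairs; the function name `dijkstra` is chosen for illustration. Since `heapq` has no ChangePriority, the sketch pushes a new entry on every improvement and ignores outdated ones when they are popped.

```python
import heapq

def dijkstra(adj, source):
    """Return (dist, prev) dicts for shortest paths from source.

    adj: dict mapping node -> list of (neighbor, weight), weights >= 0 (assumed).
    """
    dist = {u: float('inf') for u in adj}
    prev = {u: None for u in adj}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # outdated entry: a shorter path to u was already found
        for v, w in adj[u]:
            if dist[v] > d + w:       # relax edge (u, v)
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev
```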

remove a node in a BST

replace it with the smallest node of its right subtree or the largest node of its left subtree
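One way to sketch this in Python, using the smallest node of the right subtree (the inorder successor). The `TreeNode` class, `delete_node`, and the `inorder` helper are defined here for illustration; nodes with zero or one child are simply replaced by their only subtree.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def delete_node(root, key):
    """Delete key from the BST rooted at root and return the new root."""
    if root is None:
        return None
    if key < root.val:
        root.left = delete_node(root.left, key)
    elif key > root.val:
        root.right = delete_node(root.right, key)
    else:
        if root.left is None:       # zero or one child: splice the node out
            return root.right
        if root.right is None:
            return root.left
        succ = root.right           # smallest node of the right subtree
        while succ.left:
            succ = succ.left
        root.val = succ.val         # copy its value up, then delete it below
        root.right = delete_node(root.right, succ.val)
    return root

def inorder(node):
    """Helper: sorted list of values, handy for checking the result."""
    return inorder(node.left) + [node.val] + inorder(node.right) if node else []
```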

why union?

to merge 2 sets and have a unique ID for each set
→ if we use a linked list, we can use the last node as the ID, since the last node is reachable from every node, but finding it takes O(n)
→ the answer is to use a tree whose pointers are reversed to point toward the root, with the root as the ID. This reduces the path length.
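The "tree with pointers toward the root" idea can be sketched as a small class. The name `DisjointSet` and the use of both path compression and union by rank are illustrative choices; elements are assumed to be integers 0..n-1.

```python
class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))  # each element is its own root at first
        self.rank = [0] * n           # upper bound on tree height

    def find(self, x):
        """Return the root (set ID) of x, flattening the path on the way."""
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def union(self, a, b):
        """Merge the sets of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra          # hang the shorter tree under the taller
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```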

algorithm to find element in BST

traverse the tree & compare until:
- the end is reached
- the element is found
Don't change the root pointer if you don't want to lose the tree
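A short sketch of the search, with a `TreeNode` class and function name `bst_find` assumed for illustration. It walks with a separate `node` variable so that `root` keeps pointing at the whole tree.

```python
class TreeNode:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def bst_find(root, target):
    """Return the node holding target, or None if it is not in the BST."""
    node = root  # separate cursor: root itself is never reassigned
    while node is not None and node.val != target:
        # smaller values live in the left subtree, larger in the right
        node = node.left if target < node.val else node.right
    return node
```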

