Data Structures: Graph Representations, Breadth-First Search, Depth-First Search, Dijkstra's Algorithm, Heaps, Upheap, and Downheap
What are the components of a graph?
A graph consists of a set of vertices (V) and a set of edges (E) such that each edge in E is a connection between a pair of vertices in V.
Complete graph
A graph containing all possible edges
Weighted graph
A graph in which each edge has an associated weight.
Dense graph
A graph with many edges
Simple path
A path in which all vertices along the path are distinct
Path
A sequence of vertices v1, v2, ..., vn in which each consecutive pair (vi, vi+1) is joined by an edge. Its length is n - 1.
Connected graph
An undirected graph is connected if there is at least one path between every pair of vertices
What search method is utilized to find unweighted shortest paths?
BFS. Note that this method gives the shortest path from a given start node to ALL other reachable vertices
Limitations of Dijkstra's Algorithm
Doesn't work with negative edge weights, because it assumes that once a vertex is marked "known", no better route to it will be found
Range of number of edges in a graph
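0 ≤ |E| ≤ |V|² - |V|. The maximum applies to directed graphs and comes from the fact that vertices (typically) cannot have edges to themselves (no self-loops). An undirected graph has at most half as many edges: (|V|² - |V|)/2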
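As a quick sanity check of the bound above, the sketch below counts every possible edge by brute force. The function names are hypothetical, and it assumes a graph with no self-loops and no repeated edges.

```python
from itertools import combinations, permutations

def max_directed_edges(n):
    # Every ordered pair (a, b) with a != b is a possible directed edge.
    return sum(1 for _ in permutations(range(n), 2))

def max_undirected_edges(n):
    # Every unordered pair {a, b} with a != b is a possible undirected edge.
    return sum(1 for _ in combinations(range(n), 2))

n = 5
print(max_directed_edges(n), n * n - n)           # both counts should agree
print(max_undirected_edges(n), (n * n - n) // 2)  # both counts should agree
```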
Dijkstra's Algorithm pseudocode (with min heap)
1) Initialize a min heap, an array of distances from the source vertex, and an array of previous vertices
2) Set each dist in the distances array to infinity and each previous to undefined, then set the dist of the source vertex to 0. Also, add all vertices to the min heap, keyed by dist
3) While the min heap is not empty, remove the minimum element from the min heap (call it u)
4) For every neighbor v of u that is still in the min heap, compute the candidate distance dist[u] + weight(u, v)
5) If that candidate is less than the current distance reported for v in the distances array, set dist[v] to the candidate and set the previous of v to be u. Finally, decrease v's key in the min heap to the new distance (v stays in the heap until it is removed as the minimum)
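The steps above can be sketched in Python roughly as follows. This is one possible version, not a definitive implementation: Python's `heapq` has no decrease-key operation, so the sketch swaps in the common "lazy deletion" variant, which pushes a new entry on every improvement and skips stale entries when they are popped. The `graph` layout (a dict mapping each vertex to a list of `(neighbor, weight)` pairs) and the function name are assumptions.

```python
import heapq
import math

def dijkstra(graph, source):
    """graph: dict mapping vertex -> list of (neighbor, weight) pairs (assumed layout)."""
    dist = {v: math.inf for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0
    heap = [(0, source)]  # entries are (distance, vertex)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale entry: a shorter route to u was already processed
        for v, w in graph[u]:
            candidate = d + w
            if candidate < dist[v]:
                dist[v] = candidate
                prev[v] = u
                # Lazy substitute for decrease-key: push a fresh entry.
                heapq.heappush(heap, (candidate, v))
    return dist, prev

graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
dist, prev = dijkstra(graph, "A")
print(dist)
```

With lazy deletion the heap can hold up to |E| entries, so the bound stays O((|V| + |E|) log |V|) because log |E| is at most 2 log |V|.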
Dijkstra's Algorithm pseudocode (without min heap, using a linear scan over the vertices instead)
1) Initialize all vertices: set their distance to INF, set their known value to F, and set their prev to NONE (nullptr)
2) Set the distance of the starting node to 0
3) While there is an unknown vertex in the graph:
4) Find the unknown vertex v with the smallest distance from the starting node
5) Set v to be known
6) Loop through all vertices adjacent to v
7) If a neighbor is not known and dist(v) + weight of the edge from v to that neighbor is smaller than the distance previously stored for the neighbor, set the neighbor's distance to that sum
8) Whenever a neighbor's distance is updated this way, also set that neighbor's prev to v
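A minimal sketch of the scan-based version, assuming the graph is a dict that maps every vertex to a list of `(neighbor, weight)` pairs. The names `dijkstra_scan`, `known`, and `roads` are illustrative, not taken from the cards.

```python
import math

def dijkstra_scan(graph, source):
    """O(|V|^2) Dijkstra: pick the closest unknown vertex by linear scan."""
    dist = {v: math.inf for v in graph}
    prev = {v: None for v in graph}
    known = {v: False for v in graph}
    dist[source] = 0
    for _ in range(len(graph)):
        # Linear scan for the unknown vertex with the smallest distance.
        v = None
        for u in graph:
            if not known[u] and (v is None or dist[u] < dist[v]):
                v = u
        if v is None or dist[v] == math.inf:
            break  # every remaining vertex is unreachable
        known[v] = True
        for neighbor, weight in graph[v]:
            if not known[neighbor] and dist[v] + weight < dist[neighbor]:
                dist[neighbor] = dist[v] + weight
                prev[neighbor] = v
    return dist, prev

roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
dist, prev = dijkstra_scan(roads, "A")
print(dist, prev)
```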
Pseudocode for unweighted shortest path algorithm (BFS)
1) Instantiate a queue and initialize all vertices in the graph (not known, no previous)
2) Mark the starting vertex as known and enqueue it
3) Loop until the queue is empty
4) Dequeue one vertex from the queue
5) Loop through each of the vertices adjacent to the vertex that was just dequeued
6) If a neighbor has not been visited (is not known), then mark it as known, set its previous to be the just dequeued vertex, and enqueue it. Marking the neighbor as known at enqueue time keeps a later vertex from overwriting its previous with a longer route
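A short sketch of this algorithm, assuming the graph is a dict mapping each vertex to a list of its neighbors. Membership in `dist` plays the role of the "known" flag, and `path_to` is a hypothetical helper that follows the previous pointers back to the start.

```python
from collections import deque

def bfs_shortest_paths(adj, start):
    """Return (dist, prev) for every vertex reachable from start."""
    dist = {start: 0}     # a vertex is "known" once it has an entry here
    prev = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:  # first discovery is along a shortest path
                dist[v] = dist[u] + 1
                prev[v] = u
                queue.append(v)
    return dist, prev

def path_to(prev, target):
    """Rebuild the path from the start vertex to target using prev links."""
    path = []
    while target is not None:
        path.append(target)
        target = prev[target]
    return path[::-1]

adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
dist, prev = bfs_shortest_paths(adj, 1)
print(dist[5], path_to(prev, 5))
```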
How are elements inserted into a heap?
1) Item is placed into the size + 1 slot (the first free slot at the end of the array)
2) Upheap is called on that element
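The two steps might look like this for a min heap stored in a Python list. The sketch assumes 1-indexed slots with slot 0 left unused, so the parent of slot s is slot s // 2. The function names are illustrative.

```python
def upheap(heap, s):
    """Swap the item in slot s with its parent until the min-heap property holds."""
    while s > 1 and heap[s] < heap[s // 2]:
        heap[s], heap[s // 2] = heap[s // 2], heap[s]
        s //= 2

def insert(heap, item):
    """Place item in slot size + 1, then restore the heap property."""
    heap.append(item)
    upheap(heap, len(heap) - 1)

heap = [None]  # slot 0 is a placeholder so the root lives in slot 1
for x in [5, 3, 8, 1]:
    insert(heap, x)
print(heap[1:])  # heap contents in slot order
```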
DFS pseudocode algorithm
1) Using a wrapper function, mark all vertices as not visited (initialize the graph), then call the private recursive helper function with the starting vertex
2) Set the current node to visited (in the helper function)
3) Loop through each neighbor of the current node
4) If a neighbor has not been visited, call the DFS helper function on that node
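A minimal recursive sketch of these steps, assuming the graph is a dict that maps each vertex to a list of its neighbors. The `order` list is an addition for illustration so the visit order can be inspected.

```python
def dfs(adj, start):
    """Wrapper: mark every vertex unvisited, then start the recursion."""
    visited = {v: False for v in adj}
    order = []  # records the order in which vertices are visited
    _dfs_helper(adj, start, visited, order)
    return order

def _dfs_helper(adj, u, visited, order):
    visited[u] = True
    order.append(u)
    for v in adj[u]:
        if not visited[v]:
            _dfs_helper(adj, v, visited, order)

adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
print(dfs(adj, "A"))
```

On very deep graphs the recursion can exceed Python's default recursion limit; an explicit stack is a common alternative.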
Two common methods of representing a graph
1) With an adjacency matrix 2) With an adjacency list
Heap
A binary tree with 2 invariants:
1) Shape property - Tree must be complete (all levels are full except possibly the last, and if the last is not full, then all elements are to the left)
2) Heap property - each parent is at least as important as its children
Free tree
A connected, undirected graph with no simple cycles
Directed acyclic graph
A directed graph with no cycles
Undirected graph
A graph whose edges are not directed. These edges are often thought to be "bidirectional" in that a traversal between nodes can be made in either direction
Directed graph
A graph with edges directed from one vertex to another. These edges are "one way" in the sense that you need two directed edges between nodes to travel in both directions
Sparse graph
A graph with relatively few edges
Breadth First Search (BFS)
A vertex V is first visited, then all of its neighbors are visited, then all of their unvisited neighbors, and so on. Vertices are therefore visited in order of their distance (in edges) from V. This can be implemented with a queue. Useful for finding the shortest path from a starting node to all other nodes (in an unweighted graph)
Adjacency matrix
A |V| × |V| array where each row and column represents a vertex in V (in the same order that they appear in V). If any two vertices are connected by an edge, that corresponding position in the adjacency matrix is a 1 (or T). Otherwise, the value is a 0 (or F).
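As an illustration, a matrix like the one described could be built as follows. The function name, the `directed` flag, and the 0/1 encoding are choices made for this sketch.

```python
def build_adjacency_matrix(vertices, edges, directed=False):
    """Return a |V| x |V| list of lists with 1 where an edge exists, else 0."""
    index = {v: i for i, v in enumerate(vertices)}  # vertex -> row/column number
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for a, b in edges:
        matrix[index[a]][index[b]] = 1
        if not directed:
            matrix[index[b]][index[a]] = 1  # undirected edges are symmetric
    return matrix

matrix = build_adjacency_matrix(["A", "B", "C"], [("A", "B"), ("B", "C")])
for row in matrix:
    print(row)
```

Checking whether an edge exists is a single index operation, `matrix[i][j] == 1`.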
Adjacency list
An array of size |V| of linked lists. Each linked list contains all vertices that are adjacent to the vertex at the given index.
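A rough sketch of the same idea in Python, with two stated assumptions: vertices are numbered 0 to |V| - 1, and plain Python lists stand in for the linked lists. The function names are hypothetical.

```python
def build_adjacency_list(num_vertices, edges, directed=False):
    """Return a list where entry i holds the vertices adjacent to vertex i."""
    adj = [[] for _ in range(num_vertices)]
    for a, b in edges:
        adj[a].append(b)
        if not directed:
            adj[b].append(a)
    return adj

def has_edge(adj, a, b):
    # Scans the neighbors of a, so the cost grows with a's number of neighbors.
    return b in adj[a]

adj = build_adjacency_list(4, [(0, 1), (0, 2), (2, 3)])
print(adj)
print(has_edge(adj, 2, 3), has_edge(adj, 1, 3))
```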
Greedy algorithm
An algorithm that always makes the choice that looks best at the current moment (doesn't think ahead)
Upheap
Function that can be called on an element in a heap. Typically called on last element directly after insertion. That element is swapped with its parent until it is in the correct spot based on the heap invariant
Downheap
Function that can be called on an element of a heap. Typically called on the new root directly after the previous root has been removed from the heap. The element is swapped with the smaller of its two children (in a min heap). This is repeated recursively until the heap invariant is restored.
Max heap
Heap where each parent has a value greater than or equal to the values of its children
Min heap
Heap where each parent has a value less than or equal to the values of its children
Time complexity of Dijkstra's Algorithm with typical implementation (without min heap)
If using a typical implementation (adjacency matrix/list), then this algorithm is O(|V|²): we must process each vertex (|V| iterations), each iteration scans for the unknown vertex with the smallest distance (O(|V|) per scan), and the distance updates cost O(|E|) in total across the whole run. This gives O(|V|² + |E|), but since the maximum value of |E| is less than |V|², it becomes O(|V|²)
Time complexity of Dijkstra's Algorithm with min heap implementation
If we put unprocessed vertices into a min heap based on current best distance, we can reduce time complexity of Dijkstra's Algorithm to O((|V| + |E|)log|V|). This comes from the fact that extracting a vertex from a heap and inserting into a heap is an O(log|V|) operation.
Dijkstra's Shortest Path Algorithm
In a weighted graph: Use the unweighted shortest path algorithm, but at each stage, when you expand the neighbors of a node, always pick the unknown vertex with the smallest known distance from the original starting vertex. Then, declare that shortest distance vertex as known.
Time complexity of deletion from a heap
O(log n), since a complete binary tree with n elements has height O(log n) and downheap performs at most one swap per level
Time complexity of insertion into a heap
O(log n), since a complete binary tree with n elements has height O(log n) and upheap performs at most one swap per level
How are elements removed from a heap?
Only the element at the root of the heap is removed, which will either be the min or the max depending on what type of heap is being analyzed.
1) Save min value (if needed)
2) Move element at size position to root
3) Downheap on root
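A sketch of removal for a min heap stored in a 1-indexed Python list (slot 0 unused, children of slot s in slots 2s and 2s + 1). The function names are illustrative, and downheap is written as a loop here rather than recursively.

```python
def downheap(heap, s):
    """Push the item in slot s down until neither child is smaller."""
    size = len(heap) - 1  # slot 0 is unused
    while 2 * s <= size:
        child = 2 * s  # left child
        if child + 1 <= size and heap[child + 1] < heap[child]:
            child += 1  # right child is the smaller of the two
        if heap[s] <= heap[child]:
            break  # heap property restored
        heap[s], heap[child] = heap[child], heap[s]
        s = child

def remove_min(heap):
    """Remove and return the root of a non-empty min heap."""
    if len(heap) <= 1:
        raise IndexError("remove_min from an empty heap")
    smallest = heap[1]      # 1) save the min value
    last = heap.pop()       # take the element at the size position
    if len(heap) > 1:
        heap[1] = last      # 2) move it to the root
        downheap(heap, 1)   # 3) downheap on the root
    return smallest

heap = [None, 1, 3, 8, 5]
print(remove_min(heap), heap[1:])
```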
Space/time complexities for adjacency list graph implementation
Space: O(|V| + |E|). Time to look up whether two nodes are connected: O(|V|) in the worst case, since the list of one node's neighbors must be scanned
Space/time complexities for adjacency matrix graph implementation
Space: O(|V|²). Time to find edge from v1 to v2: O(1). Time to list all vertices adjacent to given vertex: O(|V|)
Length of a path
The number of edges the path contains
What is true about all heaps of size n?
They all have the same shape (due to the shape invariant)
Adjacent vertices (neighbors)
Two vertices are considered to be adjacent if they are joined by an edge. These vertices are also called neighbors
Depth First Search (DFS)
Whenever a vertex V is visited during this search, DFS will recursively visit all of V's unvisited neighbors. This will go all the way down a path until the path cannot go any further. Useful for finding any path (not shortest) between two vertices or if you want all the paths.
How are heaps typically implemented?
With an array whose slots are numbered starting at 1 (the root is in slot 1):
Left child of slot s is in slot 2s
Right child of slot s is in slot 2s + 1
Parent of element in slot s is in slot floor(s/2)
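The index arithmetic can be sketched directly. The helper names are hypothetical, and the list keeps slot 0 unused so that the root sits in slot 1. `is_min_heap` uses `parent` to check the min-heap property for every non-root slot.

```python
def left(s):
    return 2 * s

def right(s):
    return 2 * s + 1

def parent(s):
    return s // 2  # integer division is floor(s / 2) for positive s

def is_min_heap(heap):
    """Check that every non-root slot holds a value >= its parent's value."""
    size = len(heap) - 1  # slot 0 is unused
    return all(heap[parent(s)] <= heap[s] for s in range(2, size + 1))

print(left(3), right(3), parent(7))
print(is_min_heap([None, 1, 3, 8, 5]), is_min_heap([None, 4, 2, 9]))
```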