CS Data Structures
directed graph
A finite set of vertices together with a finite set of edges, where each edge has a direction (drawn as an arrow). Each edge is associated with two vertices, called its source and target. The order of the two connected vertices matters.
undirected graph
A finite set of vertices together with a finite set of edges that have no direction. The order of the two connected vertices does not matter.
empty graph
A graph with no vertices and no edges
What is the time complexity of a tree structure?
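O(log n) for search, insertion, and deletion, provided the tree is balanced; an unbalanced tree can degrade to O(n)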
Time complexity of Dijkstra's Algorithm
O(E log V) with a binary heap, where V is the number of vertices and E is the number of edges
Time complexity of Floyd Warshall Algorithm
O(V^3), where V is the number of vertices
Time complexity of Bellman Ford Algorithm
O(VE), where V is the number of vertices and E is the number of edges
Single-source shortest path problem
The problem of finding, from one source vertex to every other vertex in a graph, a path such that the sum of the weights of its constituent edges is minimized.
binary tree
A tree in which each node has at most two children (left and right)
heap (rules)
A complete binary tree in which the element contained by each node is greater than or equal to the elements of that node's children (a max-heap). In a min-heap the comparison is reversed: each node's element is less than or equal to its children's elements.
empty tree
a tree without any node
subtree
any node in a tree and its descendants
What data structure is used to implement a heap?
An array. (A heap is in turn commonly used to implement a priority queue.)
Types of Linear Data Structures
arrays, lists, linked lists, queues, stacks
full binary tree
Every leaf has the same depth and every nonleaf has two children. (Some texts call this a perfect binary tree.)
complete binary tree
Every level except the deepest is completely filled, and the nodes on the deepest level are as far left as possible.
simple graph
A graph that has no loops and no multiple edges.
Edge List Representation
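1. A directed graph with n vertices can be represented by n different linked lists (often called adjacency lists). 2. List number i provides the connections for vertex i. 3. For each entry j in list number i, there is an edge from i to j. PROS: Space is proportional to the number of vertices plus edges; iterating over the neighbors of a vertex is efficient. CONS: Testing whether a particular edge exists needs a linear scan of one list in the worst case, which is slower than an adjacency matrix lookup.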
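A minimal sketch of the list-per-vertex idea, assuming vertices are numbered 0 to n - 1 and using Python lists in place of linked lists. The names build_adjacency_lists and has_edge are hypothetical, chosen only for illustration.

```python
def build_adjacency_lists(n, edges):
    """Return n lists; list i holds the targets of the edges leaving vertex i."""
    lists = [[] for _ in range(n)]
    for source, target in edges:
        lists[source].append(target)  # directed edge: source -> target
    return lists


def has_edge(lists, i, j):
    # Linear scan of list i: worst case proportional to the out-degree of i.
    return j in lists[i]


graph = build_adjacency_lists(4, [(0, 1), (0, 2), (2, 3)])
print(graph)                  # [[1, 2], [], [3], []]
print(has_edge(graph, 0, 2))  # True
print(has_edge(graph, 1, 0))  # False
```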
Adjacency Matrix Representation
1. An adjacency matrix is a square grid of true/false values that represent the edges of a graph. 2. If the graph contains n vertices, then the grid contains n rows and n columns. 3. For two vertex numbers i and j, the component at row i and column j is true if there is an edge from vertex i to vertex j; otherwise, the component is false. PROS: Testing or changing a single edge takes constant time. CONS: Listing the neighbors of a vertex takes O(n) time, and the grid uses O(n^2) space, much of it wasted on false entries when the graph is sparse.
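The grid described above can be sketched as a list of lists of booleans. This is an illustrative sketch, not a reference implementation; build_adjacency_matrix and neighbors are hypothetical helper names.

```python
def build_adjacency_matrix(n, edges):
    """Return an n-by-n grid where matrix[i][j] is True if edge i -> j exists."""
    matrix = [[False] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = True
    return matrix


def neighbors(matrix, i):
    # Scans the whole row, so this costs O(n) even if vertex i has few edges.
    return [j for j, connected in enumerate(matrix[i]) if connected]


matrix = build_adjacency_matrix(3, [(0, 1), (0, 2), (2, 0)])
print(matrix[0][1])          # True  (constant-time edge test)
print(matrix[1][0])          # False
print(neighbors(matrix, 0))  # [1, 2]
```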
Node Representation of a Tree
1. Each node of a binary tree can be stored as an object of a binary tree node class. The class contains private instance variables that are references to other nodes in the tree. 2. An entire tree is represented as a reference to the root node.
Edge Set Representation
1. To represent a graph with n vertices, we can declare an array of n sets of integers. 2. A set such as connections[i] contains the vertex numbers of all the vertices to which vertex i is connected. PROS: Testing whether an edge exists takes O(log n) with a tree-based set, or constant expected time with a hash-based set; iterating over neighbors is efficient. CONS: More overhead per edge than a plain list or a matrix entry.
Depth first search
1. Uses a stack to keep track of vertices that still need to be visited (push, visit, mark): work forward along one path, then backtrack to the most recent vertex that still has unvisited neighbors. 2. Equivalently, recursively visit an unvisited neighbor until reaching a dead end, then backtrack.
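A short sketch of the stack-based version, assuming the graph is given as a dict that maps each vertex to a list of its neighbors (that input format and the function name dfs are assumptions made for this example).

```python
def dfs(graph, start):
    """Return the vertices in the order an iterative depth-first search visits them."""
    visited = set()
    order = []
    stack = [start]
    while stack:
        v = stack.pop()
        if v in visited:
            continue  # already handled via another path
        visited.add(v)
        order.append(v)
        # Push in reverse so the first-listed neighbor is explored first.
        for w in reversed(graph[v]):
            if w not in visited:
                stack.append(w)
    return order


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(dfs(graph, 0))  # [0, 1, 3, 2]
```

The search runs down 0, 1, 3 before backtracking to pick up 2.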
Two Tree
A binary tree that is either empty or in which every non-leaf has two children
Floyd Warshall Algorithm
A graph analysis algorithm for finding shortest paths in a weighted, directed graph. A single execution of the algorithm will find the shortest paths between all pairs of vertices.
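As a rough sketch of the all-pairs computation, assuming vertices are numbered 0 to n - 1, edges are (source, target, weight) triples, and there are no negative cycles. The function name floyd_warshall is chosen for this example.

```python
import math


def floyd_warshall(n, edges):
    """Return dist where dist[i][j] is the shortest path weight from i to j."""
    dist = [[math.inf] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the cheapest parallel edge
    # Allow vertex k as an intermediate stop, one k at a time.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


dist = floyd_warshall(4, [(0, 1, 4), (0, 2, 1), (2, 1, 2), (1, 3, 5)])
print(dist[0][1])  # 3   (0 -> 2 -> 1)
print(dist[0][3])  # 8   (0 -> 2 -> 1 -> 3)
print(dist[3][0])  # inf (no path)
```

The three nested loops over the vertices are where the cubic running time comes from.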
weighted graph
A graph in which edges or vertices have a numerical weight assigned.
Prim's Algorithm
A minimum-spanning-tree algorithm that continuously increases the size of a tree starting with a single vertex until it spans all the vertices.
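One possible sketch of growing the tree from a single vertex, using Python's heapq as the priority queue. It assumes a connected, undirected graph given as a dict from vertex to a list of (neighbor, weight) pairs; prim_mst_weight is a hypothetical name, and the function returns only the total weight.

```python
import heapq


def prim_mst_weight(graph, start):
    """Return the total weight of a minimum spanning tree of a connected graph."""
    visited = set()
    total = 0
    frontier = [(0, start)]  # (weight of edge into the tree, vertex)
    while frontier and len(visited) < len(graph):
        weight, v = heapq.heappop(frontier)
        if v in visited:
            continue  # a cheaper edge already pulled v into the tree
        visited.add(v)
        total += weight
        for w, cost in graph[v]:
            if w not in visited:
                heapq.heappush(frontier, (cost, w))
    return total


graph = {
    "a": [("b", 1), ("c", 3)],
    "b": [("a", 1), ("c", 2)],
    "c": [("a", 3), ("b", 2), ("d", 4)],
    "d": [("c", 4)],
}
print(prim_mst_weight(graph, "a"))  # 7  (edges a-b, b-c, c-d)
```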
Kruskal's Algorithm
A minimum-spanning-tree algorithm that repeatedly adds the edge of least possible weight that connects any two trees in the forest.
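A sketch of the edge-by-edge approach, assuming vertices numbered 0 to n - 1 and edges given as (weight, u, v) triples. The union-find structure used to track which tree each vertex belongs to is a common companion technique and an assumption of this example, not something the card specifies.

```python
def kruskal(n, edges):
    """Return the edges of a minimum spanning forest as (weight, u, v) triples."""
    parent = list(range(n))  # union-find: each vertex starts as its own tree

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    chosen = []
    for weight, u, v in sorted(edges):  # cheapest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # edge joins two different trees
            parent[root_u] = root_v
            chosen.append((weight, u, v))
    return chosen


tree = kruskal(4, [(3, 0, 2), (1, 0, 1), (4, 2, 3), (2, 1, 2)])
print(tree)                       # [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
print(sum(w for w, _, _ in tree)) # 7
```

The edge (3, 0, 2) is skipped because vertices 0 and 2 are already in the same tree when it is considered.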
graph
A non-linear data structure consisting of nodes and links between nodes.
path
A sequence of vertices (p_0, p_1, ..., p_m) such that each adjacent pair of vertices p_i and p_(i+1) is connected by an edge.
cycle
A path that starts and ends at the same vertex, with no repeated vertices or edges other than the starting and ending vertex.
Minimum-cost spanning tree
A spanning tree whose total weight is less than or equal to the weight of every other spanning tree. It may not be unique.
Spanning tree
A subgraph which is a tree and connects all the vertices together.
B tree
A tree data structure that keeps data sorted and allows searches, insertions, and deletions in logarithmic time. Unlike self-balancing binary search trees, it is optimized for systems that read and write large blocks of data.
Root Tree
A tree with only one node
Bellman Ford Algorithm
Algorithm that computes single-source shortest paths in a weighted digraph. It is used primarily for graphs with negative weights. The algorithm can detect negative cycles and report their existence, but it cannot produce a correct "shortest path" if a negative cycle is reachable from the source.
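A minimal sketch of the relaxation passes and the negative-cycle check, assuming vertices numbered 0 to n - 1 and edges given as (source, target, weight) triples. Raising ValueError to report a negative cycle is a design choice made for this example.

```python
import math


def bellman_ford(n, edges, source):
    """Return shortest distances from source, or raise if a negative cycle is reachable."""
    dist = [math.inf] * n
    dist[source] = 0
    # A shortest path uses at most n - 1 edges, so n - 1 passes suffice.
    for _ in range(n - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # distances settled early
    # If any edge can still be relaxed, a negative cycle is reachable.
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative cycle reachable from the source")
    return dist


print(bellman_ford(4, [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)], 0))
# [0, 2, 5, 4]
```

Each pass scans every edge, and there are at most V - 1 passes, which gives the O(VE) bound.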
loop
An edge that connects a vertex to itself.
B Tree rules
B-tree nodes have many more than two children. A B-tree node may contain more than just a single element. The set formulation of the B-tree rules: Every B-tree depends on a positive constant integer called MINIMUM, which is used to determine how many elements are held in a single node. Rule 1: The root can have as few as one element (or even no elements if it also has no children); every other node has at least MINIMUM elements. Rule 2: The maximum number of elements in a node is twice the value of MINIMUM. Rule 3: The elements of each B-tree node are stored in a partially filled array, sorted from the smallest element (at index 0) to the largest element (at the final used position of the array). Rule 4: The number of subtrees below a nonleaf node is always one more than the number of elements in the node (subtree 0, subtree 1, ...). Rule 5: For any nonleaf node, an element at index i is greater than all the elements in subtree number i of the node and less than all the elements in subtree number i + 1 of the node. Rule 6: Every leaf in a B-tree has the same depth, which ensures that a B-tree avoids the problem of an unbalanced tree.
Heap Complexities (Insertion, Deletion, Search)
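Insertion: best O(1), worst O(log n). Deletion of the root: worst O(log n). Search for an arbitrary element: worst O(n), because a heap is only partially ordered. Inserting n elements one at a time costs O(n log n) in the worst case.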
2-3 Tree Complexity (Insertion, Deletion, Search)
Best: O(log n), O(log n), O(log n). Worst: O(log n), O(log n), O(log n).
AVL Tree Complexities (Insertion, Deletion, Search)
Best: O(log n), O(log n), O(log n). Worst: O(log n), O(log n), O(log n).
BST Complexities (Insertion, Deletion, Search)
Best: O(log(n)), O(log(n)), O(log(n)). Worst: O(n), O(n), O(n).
self-balancing BST
A binary search tree that automatically keeps its height (the number of levels of nodes beneath the root) as small as possible at all times.
binary search tree BST (rules)
A binary tree in which, for every node n: every element in n's left subtree is less than or equal to the element in node n, and every element in n's right subtree is greater than the element in node n.
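The ordering rule can be illustrated with a small sketch. The Node class and the insert, contains, and height helpers are hypothetical names for this example; duplicates go to the left subtree, matching the "less than or equal" wording of the rule. The second tree shows how inserting keys in sorted order produces a tree that is a single line.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


def insert(root, key):
    """Insert key and return the (possibly new) root; keys <= a node go left."""
    if root is None:
        return Node(key)
    if key <= root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root


def contains(root, key):
    while root is not None:
        if key == root.key:
            return True
        root = root.left if key < root.key else root.right
    return False


def height(root):
    # Height counted in edges; an empty tree has height -1.
    return -1 if root is None else 1 + max(height(root.left), height(root.right))


balanced = None
for k in [4, 2, 6, 1, 3, 5, 7]:
    balanced = insert(balanced, k)

degenerate = None
for k in [1, 2, 3, 4, 5, 6, 7]:  # ascending input
    degenerate = insert(degenerate, k)

print(contains(balanced, 5), contains(balanced, 8))  # True False
print(height(balanced), height(degenerate))          # 2 6
```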
How to select right data structure
Consider the structure's interface of operations and its memory and run-time efficiency
edge
Each link of a graph that connects two vertices
vertex
Each node of a graph
Priority Queue (rules)
Elements are placed in the queue and later taken out, but each element has an associated number called its priority. When elements leave a priority queue, the highest-priority element always leaves first.
Potential Problem of a BST
Inserting an ascending or descending sequence of numbers into a BST produces a degenerate tree that is a single line of n nodes, so operations degrade to O(n).
In-Order Traversal
Starting at the root: left subtree, node, right subtree
Post-Order Traversal
Starting at the root: left subtree, right subtree, node
Pre-Order Traversal
Starting at the root: node, left subtree, right subtree
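The three traversal orders differ only in when the current node is processed relative to its left and right subtrees. A compact sketch, assuming a hypothetical Node class with data, left, and right fields:

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data = data
        self.left = left
        self.right = right


def pre_order(node):
    # Node first, then left subtree, then right subtree.
    if node is None:
        return []
    return [node.data] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    # Left subtree, then node, then right subtree.
    if node is None:
        return []
    return in_order(node.left) + [node.data] + in_order(node.right)


def post_order(node):
    # Left subtree, then right subtree, then node last.
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.data]


#       2
#      / \
#     1   3
tree = Node(2, Node(1), Node(3))
print(pre_order(tree))   # [2, 1, 3]
print(in_order(tree))    # [1, 2, 3]
print(post_order(tree))  # [1, 3, 2]
```

On a binary search tree, the in-order traversal yields the elements in sorted order.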
Dijkstra's Algorithm
An algorithm that solves the single-source shortest path problem for a graph with non-negative edge weights, producing a shortest-path tree.
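A sketch of the algorithm with a binary heap (Python's heapq), assuming the graph is a dict from vertex to a list of (neighbor, weight) pairs with non-negative weights. Stale heap entries are skipped rather than updated in place, which is one common way to work around the lack of a decrease-key operation.

```python
import heapq
import math


def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every vertex."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue  # stale entry: a shorter path to v was already found
        for w, cost in graph[v]:
            new_d = d + cost
            if new_d < dist[w]:
                dist[w] = new_d
                heapq.heappush(heap, (new_d, w))
    return dist


graph = {
    "a": [("b", 4), ("c", 1)],
    "b": [("d", 5)],
    "c": [("b", 2)],
    "d": [],
}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1, 'd': 8}
```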
All-pairs shortest path problem
The problem of finding the shortest paths between every pair of vertices in the graph.
Time complexity of Prim's Algorithm
Using an adjacency matrix: O(V^2). Using a binary heap with adjacency lists: O(V log V + E log V) = O(E log V), where V is the number of vertices and E is the number of edges
Time complexity of Kruskal's Algorithm
O(E log E), where E is the number of edges. The cost is dominated by sorting the edges with an O(E log E) comparison sort such as merge sort.
Array representation of tree
For a complete binary tree: the data from the root always appears in the [0] component of the array. Suppose the data for a non-root node appears in component [i] of the array. Then the data for its parent is always at location [(i - 1)/2] (using integer division). Suppose the data for a node appears in component [i] of the array. Then its children (if they exist) always have their data at these locations: left child at component [2i + 1], right child at component [2i + 2].
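The index arithmetic above is what lets a heap live in a plain array. A small sketch, assuming a max-heap stored in a Python list; the helper names parent, left, right, heap_insert, and is_max_heap are made up for this example.

```python
def parent(i):
    return (i - 1) // 2  # integer division


def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def heap_insert(heap, value):
    """Append value, then swap it upward until its parent is at least as large."""
    heap.append(value)
    i = len(heap) - 1
    while i > 0 and heap[parent(i)] < heap[i]:
        p = parent(i)
        heap[i], heap[p] = heap[p], heap[i]
        i = p


def is_max_heap(heap):
    # Every non-root element must be no larger than its parent.
    return all(heap[parent(i)] >= heap[i] for i in range(1, len(heap)))


heap = []
for value in [3, 9, 5, 10]:
    heap_insert(heap, value)

print(heap)               # [10, 9, 5, 3]
print(is_max_heap(heap))  # True
print(left(0), right(0), parent(3))  # 1 2 1
```

Each insertion climbs at most one level per swap, so it takes O(log n) swaps in the worst case.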
AVL tree (rules)
The heights of the two child subtrees of any node differ by at most one; therefore, it is also said to be height-balanced. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations. The balance factor of a node is the height of its right subtree minus the height of its left subtree and a node with a balance factor 1, 0, or -1 is considered balanced.
Red-Black tree (rules)
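A self-balancing BST in which each node is colored red or black. The root is black, a red node has no red child, and every path from a node down to a null leaf passes through the same number of black nodes. The leaf nodes are not relevant and do not contain data; a null child pointer can encode the fact that this child is a leaf. Like BSTs, RB-trees allow efficient in-order traversal of elements.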
Bubble Sort
Time complexity: worst O(n^2), average O(n^2), best O(n) (with an early exit when a pass makes no swaps). Space complexity: O(1) auxiliary
Insertion Sort
Time complexity: worst O(n^2), average O(n^2), best O(n). Space complexity: O(1) auxiliary
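A brief sketch of insertion sort that shows where these bounds come from: on already-sorted input the inner loop never runs (linear time), while on reversed input every element is shifted past all earlier ones (quadratic time). The function sorts in place, so it needs only constant extra space.

```python
def insertion_sort(items):
    """Sort the list in place and return it."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift larger elements one slot right to open a gap for current.
        while j >= 0 and items[j] > current:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


print(insertion_sort([5, 2, 4, 6, 1, 3]))  # [1, 2, 3, 4, 5, 6]
```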
Selection Sort
Time complexity: worst O(n^2), average O(n^2), best O(n^2). Space complexity: O(1) auxiliary
Quicksort
Time complexity: worst O(n^2), average O(n log n), best O(n log n). Space complexity: O(log n) auxiliary
Heapsort
Time complexity: worst O(n log n), average O(n log n), best O(n log n). Space complexity: O(1) auxiliary
Merge Sort
Time complexity: worst O(n log n), average O(n log n), best O(n log n). Space complexity: O(n) auxiliary
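A sketch of top-down merge sort. The list is halved about log n times and each level does linear work to merge, which gives O(n log n) in every case; the merged list is the source of the O(n) auxiliary space. This version returns a new list rather than sorting in place, a choice made for this example.

```python
def merge_sort(items):
    """Return a new sorted list containing the elements of items."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal elements in original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([38, 27, 43, 3, 9, 82, 10]))  # [3, 9, 10, 27, 38, 43, 82]
```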
2-3 Tree
A type of B-tree in which every node with children (internal node) has either two children and one data element (a 2-node) or three children and two data elements (a 3-node). Leaf nodes have no children and one or two data elements.
Breadth first search
Uses a queue to keep track of vertices that still need to be visited (visit, mark, enqueue): process all the neighbors of one vertex, then take the vertex at the head of the queue next.
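A short sketch of the queue-based search, assuming the graph is a dict from vertex to a list of neighbors (an input format chosen for this example). Vertices are marked when they are enqueued so that none enters the queue twice.

```python
from collections import deque


def bfs(graph, start):
    """Return the vertices in the order a breadth-first search visits them."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()  # take the vertex at the head of the queue
        order.append(v)
        for w in graph[v]:
            if w not in visited:
                visited.add(w)  # mark on enqueue to avoid duplicates
                queue.append(w)
    return order


graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
print(bfs(graph, 0))  # [0, 1, 2, 3]
```

Both neighbors of vertex 0 are visited before vertex 3, which is two edges away.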
height
Length of the longest downward path from a node to a leaf; the height of a tree is the height of its root
child
node on next level that branches from a parent node
parent
node on the previous level that is above a child node in a subtree
leaf
A node with no children
depth
Number of steps (edges) on the path from the current node up to the root node
root
top node of a tree
Types of Non-Linear Data Structures
trees, graphs, hash tables
traversal
way to visit every member in the structure