Algorithms & Data Structures, Data Structures and Algorithms Midterm, Data Structures and Algorithms FINAL, Data Structure And Algorithms, Data Structures and Algorithms, Algorithms & Data Structures, Dictionary of Algori
Complete Graph
A simple undirected graph in which every pair of distinct vertices is connected by a unique edge, in other words, every vertex is directly connected to every other vertex in the graph
in-place sort
A sort algorithm in which the sorted items occupy the same storage as the original ones. These algorithms may use o(n) additional memory for bookkeeping, but at most a constant number of items are kept in auxiliary memory at any time.
weak-heap sort
A sort algorithm that builds a weak-heap, then repeatedly extracts the maximum item. In the worst case, the run time is O(n log n + 0.1 n).
treesort (1)
A sort algorithm that first builds a binary search tree of the keys, then accesses the keys with an in-order traversal.
restricted universe sort
A sort algorithm that operates on the basis that the keys are members of a restricted set of values. They may not require comparisons of keys to perform the sorting.
Maximum Heap Tree
A tree in which every parent is greater in value than both its children, which means that the root of the tree is the greatest value in the tree.
Minimum Heap Tree
A tree in which every parent is lesser in value than both its children, which means that the root of the tree is the least value in the tree.
Binary Search Tree
A tree in which nodes are inserted systematically in natural order, with the final property of each left child being less than or equal to its parent, and each right child being greater than its parent. (Does not preserve the order in which nodes were added.)
Direct-access table: advantage, disadvantage
Advantage: quick search, quick insert and delete Disadvantage: lots of wasted memory, keys must be unique, keys should be dense
Binary Search Tree: advantage, disadvantage
Advantage: quick search, quick insert and delete Disadvantage: slower than hash table
hash table delete
If the hash table uses chaining, use the chaining data structure delete. If it uses open addressing, mark the item "deleted" for future reuse.
We recursively do an inorder traversal on the ____, visit the root node, and finally do a recursive _____ of the right subtree
Inorder Traversal
Linked-List: Sorted Linked List
Insertion/removal likely to be faster than using an array, but still O(N) - much more difficult to implement than a sorted array - Efficient sorting mechanism - Is still O(N^2), but only 2N copies vs O(N^2) --> however it requires twice as much memory (array+linked-list)
Kruskal's algorithm
Keep adding the smallest remaining edge that does not create a cycle.
cycle
A path that starts and ends at the same vertex and includes at least one edge.
bintree
A regular decomposition k-d tree for region data.
Hamiltonian path
A simple path through a graph that includes every vertex in the graph.
optimal solution
A solution to an optimization problem which minimizes (or maximizes) the objective function.
quick search
A string matching algorithm that compares characters from the end of the search string to its beginning. When a character doesn't match, the next character in the text beyond the search string determines where the next possible match begins.
bozo sort
A terribly inefficient sort algorithm that randomly swaps items until they are in order.
graph concentration
Contracting a graph by removing a subset of the vertices.
principle of optimality
In some optimization problems, components of a globally optimal solution are themselves globally optimal.
Depth
In tree data structure, expressed as the number of steps from the root of the tree to the farthest outlying node in the tree. Height is also used to mean the same thing.
Internal Path Length
In tree processing, this is the sum of all the path lengths from the root to the internal nodes in a tree. (The corresponding sum over the external nodes is the external path length.)
0-based indexing
Indexing (an array) beginning with 0.
array merging
Joining two arrays into one.
Child
Node that is directly beneath another node in the tree.
Internal node
Node that isn't root and has at least one child.
Leaf
Node with no children
Tree
Non-linear hierarchical structure ADT. Other than root node, each node has exactly one parent.
When working with a single linked list which nodes should we know?
Only The Header
Prime number Tables
Reduce the chance of collision.
UB-tree
Refers to either universal B-tree or unlimited branching tree.
What Is A Binary Search Tree?
Same as binary tree, but data is sorted. Right Is Higher, and Left Is Less
What Is A Double Linked List?
Same as single linked list, but has a previous pointer together with a next pointer.
Multi set
Same as a set, but an element can appear more than once.
xor
"Exclusive OR" or "not equal to" function: 0 XOR 0 = 0, 0 XOR 1 = 1, 1 XOR 0 = 1, 1 XOR 1 = 0.
Graphs
"Uber" data structure. Shows connections between objects. Can be displayed as either a matrix or linked list representation.
Applications of set
- Lottery - Raffle - Lucky draw - Spot-check
reflexive
A binary relation R for which a R a for all a.
free tree
A connected, acyclic, undirected graph.
hyperedge
A connection between any number of vertices of a hypergraph.
What Is A Tree?
A data structure composed of nodes.
Zipfian distribution
A distribution of probabilities of occurrence that follows Zipf's law.
deterministic finite state machine
A finite state machine with at most one transition for each symbol and state.
binary function
A function with two arguments.
dynamic hashing
A hash table that grows to handle more items. The associated hash function must change as the table grows. Some schemes may shrink the table to save space when items are deleted.
P-complete
A language L is P-hard under NC many-one reducibility if L' ≤_m^NC L for every L' ∈ P. A language L is P-complete under NC reducibility if L ∈ P and L is P-hard.
Aho-Corasick
A multiple string matching algorithm that constructs a finite state machine from a pattern (list of keywords), then uses the machine to locate all occurrences of the keywords in a body of text.
square matrix
A n × n matrix, i.e., one whose size is the same in both dimensions.
Leaf
A node with no children.
Bucket
A second resolution to collisions, the bucket alters the structure of the hashtable so that each location can represent more than one value. Resolving collisions by using buckets that are linked chains is called separate chaining.
Quick Select
A selection algorithm to find the kth smallest element in an unordered list. Quickselect uses the same overall approach as quicksort, choosing one element as a pivot and partitioning the data in two based on the pivot, accordingly as less than or greater than the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side - the side with the element it is searching for. This reduces the average complexity from O(n log n) to O(n). Partition algorithm:
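The partition step and the one-sided search described above can be sketched as follows. This is a minimal Python sketch, not a definitive implementation: the Lomuto partition scheme, the random pivot, and the names `partition` and `quickselect` are assumptions chosen for illustration, since the card does not say which variant it means.

```python
import random


def partition(items, lo, hi, pivot_index):
    """Lomuto partition of items[lo..hi]; returns the pivot's final index."""
    pivot = items[pivot_index]
    # Park the pivot at the end while the rest is partitioned.
    items[pivot_index], items[hi] = items[hi], items[pivot_index]
    store = lo
    for i in range(lo, hi):
        if items[i] < pivot:
            items[store], items[i] = items[i], items[store]
            store += 1
    # Move the pivot into its final sorted position.
    items[store], items[hi] = items[hi], items[store]
    return store


def quickselect(items, k):
    """Return the k-th smallest element (k is 0-based) of a non-empty sequence."""
    if not 0 <= k < len(items):
        raise IndexError("k out of range")
    items = list(items)  # work on a copy so the caller's data is untouched
    lo, hi = 0, len(items) - 1
    while True:
        if lo == hi:
            return items[lo]
        p = partition(items, lo, hi, random.randint(lo, hi))
        if k == p:
            return items[p]
        # Continue only in the side that contains position k.
        if k < p:
            hi = p - 1
        else:
            lo = p + 1
```

Because only one side is searched after each partition, the expected work shrinks geometrically, which is where the average O(n) bound comes from.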
chain
A set with a total order.
don't care
A special symbol which matches any other symbol of a given alphabet.
blind sort
A specialized sort algorithm that first builds a blind trie, then traverses the tree left to right.
full array
A term from combinatorial chemistry referring to trying all possible combinations of blocks. This is not a data structure, and is here to reduce confusion.
balanced tree
A tree where no leaf is much farther away from the root than any other leaf. Different balancing schemes allow different definitions of "much farther" and different amounts of work to keep them balanced.
ordered tree
A tree where the children of every node are ordered, that is, there is a first child, second child, third child, etc.
cutting plane
A valid inequality for an integer polyhedron that separates the polyhedron from a given point outside it.
right-threaded tree
A variant of a threaded tree in which only the right thread, i.e. link to the successor, of each node is maintained.
What is the complexity of a doubly linked list?
Access: O(n) Search: O(n) Insertion: O(1) Deletion: O(1)
Temporal locality
Accessing the same item repeatedly over a short period of time.
shared memory
All processors have the same global image of (and access to) all of the memory.
acyclic
An acyclic graph is a graph with no cycles.
array
An assemblage of items that are randomly accessible by integers, the index.
external index
An auxiliary data structure added to a main data structure to improve operations, such as a search on a secondary key.
bridge
An edge of a connected graph whose removal would make the graph unconnected.
matched edge
An edge which is in a matching.
Internal Node
An existing node in a tree, either the root or any one of the children in the tree.
start state
An initial state or condition of a finite state machine or Turing machine. Informally, how the memory is initially set.
rectangular matrix
An n × m matrix, or, one whose size may not be the same in both dimensions.
Vertex
An object in a graph.
Node
An object linked to other objects, representing some entity in that data structure.
Iterators
An object that knows how to "walk" over a collection of things. Encapsulates everything it needs to know about what it's iterating over. Should all have similar interfaces. Can read data, move, know when to stop.
complete graph
An undirected graph with an edge between every pair of vertices.
Iterative Refinement
Analysis pattern. Most problems can be solved using a brute-force approach. Find such a solution and improve upon it.
Extraction:
Breaking keys into parts and using the parts that uniquely identify the item. Examples: 379452 → 394; 121267 → 112.
CSP
C. A. R. Hoare's algebraic theory to formalize the notion of concurrent computation. An acronym for Communicating Sequential Processes.
Solving Divide-and-Conquer Recurrences
Case 1: Too many leaves. Case 2: Equal work per level. Case 3: Too expensive a root
Circular queue - bounded
Dequeue: O(1). Enqueue: O(1).
Spatial locality
Accessing items that are close together in memory.
phonetic coding
Code a string based on how it is pronounced.
and
Conjunction: 0 AND 0 = 0, 0 AND 1 = 0, 1 AND 0 = 0, 1 AND 1 = 1.
Collatz problem
Consider the function: if N is odd, 3 × N + 1; else N/2. Does beginning with any positive integer and repeatedly applying the function always yield 1?
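To experiment with the question, one can count how many applications of the function are needed to reach 1. The sketch below is illustrative only: the name `collatz_steps` and the `max_steps` safety cap are assumptions, and running it for some inputs proves nothing about the general problem.

```python
def collatz_steps(n, max_steps=10_000):
    """Count applications of the Collatz function needed to reach 1 from n.

    Returns None if 1 is not reached within max_steps (an arbitrary cap).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    steps = 0
    while n != 1:
        if steps >= max_steps:
            return None
        # Odd: 3n + 1. Even: n / 2.
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps
```

For example, starting from 6 the sequence is 6, 3, 10, 5, 16, 8, 4, 2, 1, so `collatz_steps(6)` counts 8 steps.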
or
Disjunction: 0 OR 0 = 0, 0 OR 1 = 1, 1 OR 0 = 1, 1 OR 1 = 1.
Analysis of indexed list
Doubly linked vs. array-based. get: O(n) vs. O(1); set: O(n) vs. O(1); removeFirst: O(1) vs. O(n); removeLast: O(1) vs. O(1); addFirst: O(1) vs. O(n); addLast: O(1) vs. O(1) or O(n); remove: O(n) vs. O(n); add: O(n) vs. O(n)
heap property
Each node in a tree has a key which is more extreme (greater or less) than or equal to the key of its parent.
union of automata
Find an automaton that accepts everything which the automata accept individually.
Selection sort
Find smallest, put at beginning Best: O(n^2) Avg: O(n^2) Worst: O(n^2)
Queues
First in, first out. O(1)
Steiner ratio
For a given variant of the Steiner tree problem, the maximum possible ratio of the length of a minimum spanning tree of a set of terminals to the length of an optimal Steiner tree of the same set of terminals. Usually written ρ (rho).
Red Black Tree: height
Height: O(lg n)
Maxheap
Here, the object in a node is greater than or equal to its descendant objects.
Minheap
Here, the relation is less than or equal to.
binary insertion sort
Insertion sort in which the proper location for the next item is found with a binary search.
Indirect Sorting
Involves the use of smart pointers: objects which contain pointers.
Perfect hash function
Maps each element to a unique location.
What is the worst case time complexity for: Insert, lookup, and delete, for hash tables?
O(n) in the worst case, when many keys collide into the same slot; O(1) on average.
What Is Best, Average, Worst, and Space Complexity Of Heap Sort?
O(nlogn), O(nlogn), O(nlogn), O(1)
What Is The Best, Average, Worst, and Space Complexity Of Merge Sort?
O(nlogn), O(nlogn), O(nlogn), O(n)
Balanced n-ary tree
Perfect tree or tree that would be perfect if deepest level is removed.
n queens
Place n chess queens on an n × n board such that no queen can attack another. Efficiently find all possible placements.
L R N
Postorder traversal (Reverse Polish)
DoubleIterator
Provide an iterator for a doubly linked list which supports remove option. It is an inner class because it needs access to all data fields.
What Are The 4 Methods Of A Stack And What Do They Do?
Push, Pop, Peek, isEmpty
Dutch national flag
Rearrange elements in an array into three groups: bottom, middle, and top.
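One common way to do this is a single pass with three indexes. The sketch below assumes the three groups are "less than", "equal to", and "greater than" a given pivot value; the function name and that grouping rule are illustrative assumptions, not part of the card.

```python
def dutch_flag_partition(items, pivot):
    """Rearrange items in place into < pivot, == pivot, > pivot regions."""
    low, mid, high = 0, 0, len(items) - 1
    while mid <= high:
        if items[mid] < pivot:
            # Belongs in the bottom group: swap it down and advance both.
            items[low], items[mid] = items[mid], items[low]
            low += 1
            mid += 1
        elif items[mid] > pivot:
            # Belongs in the top group: swap it up; the swapped-in item
            # is unexamined, so mid stays where it is.
            items[mid], items[high] = items[high], items[mid]
            high -= 1
        else:
            mid += 1
    return items
```

Each element is examined at most once, so the pass is O(n) with O(1) extra space.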
Topological Sorting
Receives a DAG as input, outputs an ordering of the vertices. Repeatedly selects a node with no incoming edges, then follows its outgoing edges.
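The "select a node with no incoming edges" idea is usually attributed to Kahn's algorithm. A minimal sketch, assuming the graph is given as a dictionary mapping each vertex to a list of its successors (the representation and the name `topological_order` are assumptions):

```python
from collections import deque


def topological_order(graph):
    """Return one topological ordering of a DAG given as {vertex: [successors]}.

    Raises ValueError if the graph contains a cycle.
    """
    # Count incoming edges for every vertex, including ones that only
    # appear as successors.
    indegree = {v: 0 for v in graph}
    for successors in graph.values():
        for w in successors:
            indegree[w] = indegree.get(w, 0) + 1

    ready = deque(v for v, d in indegree.items() if d == 0)
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        # "Remove" v's outgoing edges; successors may become ready.
        for w in graph.get(v, ()):
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)

    if len(order) != len(indegree):
        raise ValueError("graph has a cycle")
    return order
```

A graph may have several valid orderings; this sketch returns whichever one the queue order produces.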
sift up
Restoring the heap property by swapping a node with its parent, and repeating the process on the parent until the root is reached or the heap property is satisfied.
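As a sketch of sift up, assume a max-heap stored in a Python list where the parent of index i is at (i - 1) // 2. Both the array layout and the helper `heap_push` are assumptions added for illustration.

```python
def sift_up(heap, index):
    """Move heap[index] up until its parent is at least as large (max-heap)."""
    while index > 0:
        parent = (index - 1) // 2
        if heap[parent] >= heap[index]:
            break  # heap property holds; stop early
        heap[parent], heap[index] = heap[index], heap[parent]
        index = parent


def heap_push(heap, item):
    """Append item at the end of the list, then sift it up into place."""
    heap.append(item)
    sift_up(heap, len(heap) - 1)
```

The loop climbs at most one level per iteration, so an insertion costs O(log n).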
What is the minimum element in a binary heap?
Root node
Single Rotation
Rotation preserves order. Inner children become the child of the node which was replaced.
binary search
Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the interval to the lower half. Otherwise narrow it to the upper half. Repeatedly check until the value is found or the interval is empty.
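The interval-halving procedure can be sketched as below. The name `binary_search` and the convention of returning -1 when the key is absent are assumptions made for this example.

```python
def binary_search(items, key):
    """Return an index of key in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1  # interval covering the whole array
    while lo <= hi:             # stop when the interval is empty
        mid = (lo + hi) // 2
        if items[mid] == key:
            return mid
        if key < items[mid]:
            hi = mid - 1        # narrow to the lower half
        else:
            lo = mid + 1        # narrow to the upper half
    return -1
```

Python's standard library offers the same idea through the `bisect` module, which returns insertion points rather than a found/not-found answer.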
linear search
Search an array or list by checking items one at a time.
dichotomic search
Search by selecting between two distinct alternatives (dichotomies) at each step.
threaded binary tree
See threaded tree.
Vertex-coloring
Seeks to assign a label (or color) to each vertex of a graph such that no edge links any two vertices of the same color.
Insertion sort
Side-by-side comparison Best: O(n) Avg: O(n^2) Worst: O(n^2)
greedy heuristic
Solve an optimization problem by finding locally optimal solutions.
bubble sort
Sort by comparing each adjacent pair of items in a list in turn, swapping the items if necessary, and repeating the pass through the list until no swaps are done.
insertion sort
Sort by repeatedly taking the next item and inserting it into the final data structure in its proper order with respect to items already inserted. Run time is O(n^2) because of moves.
What Is Big Omega Notation?
An asymptotic lower bound on growth rate. Often loosely described as the best-case scenario of the code written.
omicron
The Greek letter written as "o" (see little-o notation) or "O" (see big-O notation).
Root
The base level node in a tree; the node that has no parent.
prefix
The beginning characters of a string. More formally a string v∈Σ* is a prefix of a string u∈Σ* if u=vu' for some string u'∈Σ*.
disjunction
The boolean or function.
centroid
The center of gravity or center of mass of an object.
state
The condition of a finite state machine or Turing machine at a certain time. Informally, the content of memory.
Edge
The connection in a graph between two vertices.
single program multiple data
The dominant style of parallel programming, where all processors use the same program, though each has its own data.
dual
The dual of a planar graph, G, is a graph with a vertex for each region in G and an edge between vertices for each pair of adjacent regions. The new edge crosses the edge in G which is the boundary between the adjacent regions.
best-case cost
The minimum cost to process an input sequence.
outdegree and indegree
The outdegree of a vertex is the number of edges pointing from it. The indegree of a vertex is the number of edges pointing to it.
tree editing problem
The problem of finding an edit script of minimum cost which transforms a given tree into another given tree.
planarization
The process of transforming a graph into a planar graph. More formally, the transformation involves either removing edges (planarization by edge removal), or replacing pairs of nonincident edges by 4-stars (planarization by adding crossing vertices). In both cases, the aim of planarization is to minimize the number of edge removals or replacements.
Double Hashing
The process of using two hash functions to determine where to store the data.
q sort
The quicksort implementation in many libraries. It is generally a combination of quicksort, for large partitions, and insertion sort, for small partitions.
skd-tree
The spatial k-d tree is a spatial access method where successive levels are split along different dimensions. Objects are indexed by their centroid, and the minimum bounding box of objects in a node are stored in the node.
string matching with mismatches
The special case of string matching with errors where mismatches are the only type of error allowed.
candidate consistency testing
The stage of two-dimensional matching where a candidate occurrence of the pattern is checked against the "witness" table.
Sub-tree rooted at a node
The tree consisting of the node itself plus all its descendants.
depoissonization
To interpret a solution to a poisson model in the original model after poissonization.
poissonization
To replace a deterministic input by poisson process, changing the model into a poisson model. A solution to the poisson model must be interpreted in the original model.
siblings
Vertices of a tree that have the same parent.
Depth-first search
Visits the child vertices before visiting the sibling vertices. Usually implemented with a stack.
connected vertices
We say that one vertex is connected to another if there exists a path that contains both of them.
strongly connected vertices
We say that two vertices v and w are strongly connected if they are mutually reachable: there is a directed path from v to w and a directed path from w to v.
preorder traversal
When we visit the root before we visit the root's subtrees. We then visit all the nodes in the root's left subtree before we visit the nodes in the right subtree. Example of depth-first traversal.
Red-black Tree
Worst height: 2 log n
How Does Search Work In A Linked List?
You access the first Node Then Go Through Each Node.
What is a complete binary tree?
a binary tree in which every level of the tree is fully filled, except for perhaps the last level.
What is a full binary tree?
a binary tree in which every node has either zero or two children. No nodes have one child
list
a collection of data items arranged in a certain linear order
Java stream: definition
a sequence of data
Priority queue
Allows high-priority data to jump the queue: each item is dequeued ahead of all lower-priority items, while items of the same priority are still FIFO. With an array, adding an element requires shifting; with a linked structure, it requires reassigning links. Both are O(n).
Greedy Algorithm
An algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage. Hopefully finds the global optimum. An example would be Kruskal's algorithm.
If a graph's edges are unordered [ (u,v) == (v,u)], then the vertices u and v are connected by
an undirected edge (u,v).
Θ-notation
asymptotically tight bound
priority queue
collection of data items from a totally ordered universe
Implement A Tree As A List Of Lists. What Are The Functions We Need To Create This?
def BinaryTree(r), def insertLeft(root, newBranch), def insertRight(root, newBranch), def getRootVal(root), def setRootVal(root, newVal), def getLeftChild(root), def getRightChild(root)
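One possible way to fill in those signatures is sketched below. The card gives only the function names, so the bodies are an assumption: each tree is represented as the three-element list `[value, left_subtree, right_subtree]`, an empty list stands for a missing child, and the setter is spelled `setRootVal` here.

```python
def BinaryTree(r):
    """Create a tree with root value r and two empty subtrees."""
    return [r, [], []]


def insertLeft(root, newBranch):
    """Insert newBranch as the left child; any old left child moves below it."""
    old = root.pop(1)
    if len(old) > 1:
        root.insert(1, [newBranch, old, []])
    else:
        root.insert(1, [newBranch, [], []])
    return root


def insertRight(root, newBranch):
    """Insert newBranch as the right child; any old right child moves below it."""
    old = root.pop(2)
    if len(old) > 1:
        root.insert(2, [newBranch, [], old])
    else:
        root.insert(2, [newBranch, [], []])
    return root


def getRootVal(root):
    return root[0]


def setRootVal(root, newVal):
    root[0] = newVal


def getLeftChild(root):
    return root[1]


def getRightChild(root):
    return root[2]
```

Note that inserting does not replace an existing child: the old subtree is pushed one level down, on the same side, beneath the new node.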
Priority Queue
dequeue is O(1) but enqueue is O(N) --> slow insertion, so priority queue is often implemented using a heap to improve insertion time
Generalization Relation
'is-a' relation i.e. when a class extends another class
relaxed balance
When rebalancing a search tree is independent of updating the tree.
Postfix Expression
When the operator is after the expression
Prefix Expression
When the operator is before the expression.
Cycles
When there are at least two unique paths which connect vertices A and B, forming a loop or loops
k-d-B-tree
A data structure which splits multidimensional spaces like an adaptive k-d tree, but balances the resulting tree like a B-tree.
simple path
A directed path with no repeated vertices
Markov chain
A finite state machine with probabilities for each transition, that is, a probability that the next state is s_j given that the current state is s_i.
forest
A forest is a disjoint set of trees.
multiprefix
A generalization of scan in which the partial sums are grouped by keys.
Eulerian graph
A graph that has an Euler cycle.
k-ary Huffman coding
A minimal variable-length coding based on the frequency of each character. Similar to a Huffman coding, but joins k trees into a k-ary tree at each step, and uses k symbols for each level.
subset
A set S1 is a subset of another set S2 if every element in S1 is in S2. S2 may have exactly the same elements as S1.
occurrence
A string v occurs in a string u if v is a substring of u.
rooted tree
A tree in which one node is designated as the root.
Marlena
A wonderful wife. Every man should have such an incredible wife. We were married in 1976, too, and life's only gotten much better.
Greedy Algorithms
Algorithm design patterns. Compute a solution in stages, making choices that are locally optimal at each step; these choices are never undone.
Sorting
Algorithm design patterns. Uncover some structure by sorting the input.
ANSI
American National Standards Institute.
bit vector
An array of bits.
What connects two nodes to show that there is a relationship between them?
An edge
What is interpolation search?
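A variant of binary search for sorted numeric keys with a wide, roughly uniform range of values (for example, phone numbers). Instead of always probing the middle, it estimates the probe position from the value of the key.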
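A minimal sketch of interpolation search follows, assuming a sorted list of integer keys. The probe position is estimated by linear interpolation between the values at the two ends of the current interval; the function name and the -1 "not found" convention are assumptions.

```python
def interpolation_search(items, key):
    """Return an index of key in a sorted list of integers, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    # The key can only be present while it lies within the interval's values.
    while lo <= hi and items[lo] <= key <= items[hi]:
        if items[lo] == items[hi]:
            # All values in the interval are equal; avoid dividing by zero.
            return lo if items[lo] == key else -1
        # Estimate where key should sit if values were evenly spread.
        pos = lo + (key - items[lo]) * (hi - lo) // (items[hi] - items[lo])
        if items[pos] == key:
            return pos
        if items[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```

When keys are close to uniformly distributed the expected number of probes is O(log log n); on badly skewed data it can degrade to O(n), which is why binary search remains the safer default.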
geometric optimization problem
An optimization problem induced by a collection of geometric objects.
What are the ways you can implement a queue?
Array and Linked List
UML
Describe classes, attributes, fields, static relationships, operations and constraints
graph isomorphism
Determine if two graphs are isomorphic.
Complete Graph
Has an edge between every pair of distinct vertices.
Doubly Linked List: memory
Memory: O(3n) (singly linked list: O(2n))
Bucket Sort
O(n+m) where m is the # of buckets.
N L R
Preorder traversal (Polish)
fathoming
Pruning a search tree.
mean
The (arithmetic) mean of some values is the sum of all values divided by the number of values.
worst-case cost
The highest possible use of resources of an algorithm, which occurs for the most pessimistic, or worst possible, input.
suffix automaton
The smallest automaton accepting all suffixes of a string. The states form a directed acyclic word graph or DAWG.
Breadth-first search
Use a queue to search tree
ArraySet - add
Uses a contains check to see if the element is in the set; if false, adds it to the set.
Level-order traversal (breadth-first traversal)
We visit every node on a level (from left to right) before going to the next level down.
Unary Operator
When an operator has one operand, e.g., -5
complete graph
a graph with every pair of its vertices connected by an edge
data structure
a particular scheme organizing related data items.
What is Post-Order traversal and how do you implement it?
Post-Order traversal visits the root node after its child nodes
What is Pre-Order traversal and how do you implement it?
Pre-order traversal visits the root node then the children.
Bloom Filters
Probabilistic hash table. No means no. Yes means maybe. Multiple (different) hash functions. Can't resize table. Also can't remove elements.
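The "no means no, yes means maybe" behaviour can be seen in a small sketch. Everything here is an illustrative assumption: the class name, the default sizes, and the choice to derive several hash functions by salting SHA-256 with a seed (real implementations usually use faster non-cryptographic hashes and a packed bit array).

```python
import hashlib


class BloomFilter:
    """A fixed-size Bloom filter over string-convertible items."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive num_hashes different positions by salting the hash input.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        """False means definitely absent; True means possibly present."""
        return all(self.bits[pos] for pos in self._positions(item))
```

Bits set by different items overlap, so clearing an item's bits could erase evidence of other items. That is why elements cannot be removed from this basic form.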
postorder traversal
Process all nodes of a tree by recursively processing all subtrees, then finally processing the root.
O(n)
happens for each element
O(lg n)
happens for up to the height of a balanced tree
L N R
in-order traversal
The more items a table can hold, the () likely a collision will happen.
less
What does the tail node of a singly linked list reference?
null
if (u,v) is the last edge of the simple path from the root to vertex v, u is the _ of v
parent
Quicksort
partitioning Best: O(n log n) (or O(n) three-way) Avg: O(n log n) Worst: O(n^2)
What does the node constructor look like in a double linked list?
struct Node { int data; struct Node *next; struct Node *prev; };
Hash collision
two (or more) keys hash to same slot
Replace the lowest bit that is 1 with 0
x & (x - 1)
Compute x modulo a power of 2 (y)
x & (y - 1)
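Both bit tricks can be checked with a few lines of Python. The function names are hypothetical, and the second trick is only valid when y is a power of two, which the sketch checks explicitly.

```python
def clear_lowest_set_bit(x):
    """Replace the lowest 1 bit of x with 0."""
    # Subtracting 1 flips the lowest 1 bit and every 0 below it,
    # so the AND keeps only the bits above it.
    return x & (x - 1)


def mod_power_of_two(x, y):
    """Compute x modulo y, where y must be a power of two."""
    if y <= 0 or (y & (y - 1)) != 0:
        raise ValueError("y must be a power of two")
    # y - 1 is a mask of the low bits that make up the remainder.
    return x & (y - 1)
```

For instance, 0b10110 becomes 0b10100 under the first trick, and 77 modulo 64 is 13 under the second. The power-of-two check itself reuses the first trick: a power of two has exactly one set bit, so clearing it leaves zero.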
Depth First Search
Runs in time equal to the size of the graph, can determine if a graph has a cycle.
When To Use Interpolation Search?
Say we have longer numbers in our array, like 1166. Binary search would be inefficient, so we would use interpolation search.
Geometric series
.
In-Order Traversal
1. Process left child. 2. Process self. 3. Process right child.
Pre-Order Traversal
1. Process self. 2. Process left child. 3. Process right child.
Bk tree
A binomial tree of order (height) k.
blocking flow
A flow function in which any directed path from the source to the sink contains a saturated edge.
move-to-front heuristic
A heuristic that moves the target of a search to the head of a list so it is found faster next time.
buddy system
A memory allocation strategy which recursively divides allocatable blocks of memory into pairs of adjacent equal-sized blocks called "buddies."
simple path
A path in which all vertices are distinct
Priority Queue
A queue that organizes objects according to their priorities.
Rooted Binary Tree
A rooted binary tree has a root node and every node has at most two children.
inversion list
A set of non-overlapping numeric ranges, stored in an array in increasing order. Items in odd indexes begin ranges, and items in even indexes are the first number after the ends.
SBB tree
A symmetric binary B-tree. Now known as a red-black tree.
proper coloring
A vertex coloring or edge coloring of a graph in which no two adjacent vertices or edges have the same color.
Arrays
Access - O(1) Processing - O(n)
Perfect n-ary tree
All leaf nodes have the same depth and all other nodes have exactly n children. A perfect binary tree with h levels has 2^h - 1 nodes.
exhaustive search
An algorithm that finds a solution by trying every possibility.
Direct-access table: definition
An element key k is stored in slot k.
vertex
An item in a graph. Sometimes referred to as a node.
set packing
An optimization problem to find the largest number of mutually disjoint subsets that cover a given set of sets.
set
An unordered collection (possibly empty) of distinct items called elements of the set.
bag
An unordered collection of values that may have duplicates.
Graph Modeling
Analysis pattern. Describe the problem using a graph and solve it using an existing algorithm.
Heuristics
Any approach to problem solving, learning, or discovery that employs a practical method not guaranteed to be optimal or perfect, but sufficient for the immediate goals. Where finding an optimal solution is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution. Heuristics can be mental shortcuts that ease the cognitive load of making a decision. Examples of this method include using a rule of thumb, an educated guess, an intuitive judgment, stereotyping, profiling, or common sense
Load Factor
Approximately how full the table is (number of entries divided by number of slots). Typical thresholds are 0.7-0.8.
Binary Search Tree
Avg height: O(log n) Worst height: O(n)
If each node in the tree has a maximum of two children we say that the tree is a?
Binary tree
ArrayStack (bounded)
Bottom of stack is index 0. An int variable is needed to keep track of the top of the stack. All methods are O(1): push, pop, peek, size and clear.
Communicating Sequential Processes
C. A. R. Hoare's algebraic theory to formalize the notion of concurrent computation. Commonly known as CSP.
parallel prefix computation
Calculate an associative function, f, on all prefixes of an n-element array, that is, s[0], f(s[0], s[1]), f(s[0], f(s[1], s[2])), ..., f(s[0], f(s[1], ... f(s[n-2], s[n-1])...)), using Θ(n) processors in Θ(log n) time. The algorithm is: for j := 0 to lg(n)-1 do for i := 2^j to n-1 parallel-do s[i] := f(s[i-2^j], s[i]), where lg is the logarithm base 2, and parallel-do does the innermost computations in parallel.
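The doubling loop can be simulated sequentially to see why about lg(n) rounds are enough. In the sketch below each round reads from a snapshot of the previous round, which stands in for the simultaneous reads of the parallel-do. The function name and the use of Python lists are assumptions, and the code makes no attempt at real parallelism.

```python
import operator


def parallel_prefix(values, f=operator.add):
    """Sequentially simulate the log-step scan.

    After the last round, s[i] holds f applied across values[0..i].
    f must be associative; it does not need to be commutative.
    """
    s = list(values)
    n = len(s)
    rounds = (n - 1).bit_length() if n > 0 else 0  # ceil(lg n) rounds
    for j in range(rounds):
        step = 2 ** j
        previous = list(s)  # every update in a round reads the old values
        for i in range(step, n):
            s[i] = f(previous[i - step], previous[i])
    return s
```

After round j, each s[i] combines the 2^(j+1) elements ending at position i (or all of s[0..i] if there are fewer), so the covered span doubles every round.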
bitonic sort
Compare, and swap if necessary, pairs of elements in parallel. Subsets are sorted then merged.
Merge sort
Comparisons: O(N) Copies: O(N) Remarks: additional memory needed, but very efficient, can be done recursively O(NlogN)
Hash Table
Constant access time (on average).
NYSIIS
Convert a name to a phonetic coding of up to six characters.
Dynamic resizing
Creating a new, larger hash table, then rehashing all the entries from the old buckets into the new one.
bucketing method
Data organization methods that decompose the space from which spatial data is drawn into regions called buckets. Some conditions for the choice of region boundaries include the number of objects that they contain or on their spatial layout (e.g. minimizing overlap or coverage).
Union Find
Data structure used to make sure a cycle is not created in a MST.
What would the Perfect Hash Function be?
Each Key maps to an unique Hash Index.
min-heap property
Each node in a tree has a key which is greater than or equal to the key of its parent.
What are the three characteristics of a tree?
Each tree has a root node, the root node has zero or more child nodes, and each child node has zero or more children
Weighting:
Emphasizing some parts of the key over another.
Binary Search
Fast, O(log2 N), but the array must be sorted. Drawback: insertion takes longer, because the O(log N) search for the position is followed by shifting items to keep the array sorted.
optimal triangulation problem
Find the triangulation with the greatest overall minimum angle. There is an incremental algorithm that takes O(n log n) time.
Breadth-First Traversal
Follows a path that explores an entire level before moving to the next level.
Undirected Graph
Given any path connecting vertices A and B, you can travel from A to B or B to A
completely connected graph
See either connected graph or complete graph.
codeword
Sequence of bits of a code corresponding to a symbol.
parallel computation thesis
Sequential space is a polynomial of parallel time.
accepting state
If a finite state machine finishes an input string and is in an accepting state, the string is accepted or considered to be valid.
Connected Component
In an undirected graph, a connected component is a maximal set of vertices such that there is a path between every pair of vertices (the example shows 3 connected components).
meld
Joining several data structures with particular properties into one large data structure having those properties. For instance, some priority queue implementations support the operation of joining two priority queues into a larger one.
List
Linear collection where elements can be added/removed anywhere in the list
Probing
Locating an open location in the hash table.
Lazy Deletion
Marking a spot as deleted in a hash table rather than actually deleting it.
Inversions
Min: 0. Max: n(n-1)/2. Swapping an adjacent out-of-order pair removes exactly 1 inversion.
HashMap complexity of basic operations:
O(1)
Treemap complexity of basic operations:
O(logN)
depth
Of a node, the distance from the node to the root of the tree.
Growth Function
Shows relationship between size of problem and time it takes to solve the problem.
fixed-grid method
Space decomposition into rectangular cells by overlaying a grid on it. If the cells are congruent (i.e., of the same width, height, etc.), then the grid is said to be uniform.
Sparse/Dense
Sparse if it has few edges. Dense if it has many edges.
perfect shuffle
Split a list of elements (or deck of cards) exactly in half then precisely interleave the two halves.
Prim's Algorithm
Start with one vertex and grow the tree along the minimum-weight edge leaving the vertices already in it. So out of all reachable edges that don't cause cycles, take the smallest.
Little-Oh
T(n) = o(f(n)) if T(n) = O(f(n)) and T(n) != Ω(f(n))
Big-Oh
T(n) = O(f(n)) if there are positive constants c and n0 such that T(n) <= c * f(n) for all n >= n0
Divide-and-Conquer Recurrences
T(n) = aT(n/b) + f(n)
simple uniform hashing
The assumption or goal that items are equally likely to hash to any value.
What are the two parts to recursion?
The base case, and recursive method.
conjunction
The boolean and function.
Bounded
The collection has maximum size or capacity e.g. array Should have isEmpty() Should have size()
strongly NP-hard
The complexity class of decision problems which are still NP-hard even when all numbers in the input are bounded by some polynomial in the length of the input.
lowest common ancestor
The deepest node in a tree that is an ancestor of two given leaves.
head
The first item of a list.
gamma function
The gamma function of n, written Γ(n), is $\int_0^\infty e^{-x} x^{n-1}\,dx$. Recursively Γ(n+1) = nΓ(n). For non-negative integers Γ(n+1) = n!.
asymptotic time complexity
The limiting behavior of the execution time of an algorithm when the size of the problem goes to infinity. This is usually denoted in big-O notation.
array index
The location of an item in an array.
optimal value
The minimum (or maximum) value of the objective function over the feasible region of an optimization problem.
negation
The negation of 0 is 1; the negation of 1 is 0.
Chromatic Number
The smallest number of colors needed for a proper vertex coloring of a graph, that is, a coloring in which no two adjacent vertices share a color.
Children
The term used in trees to indicate a node that extends from another node, such as left child and right child in a binary tree.
competitive ratio
The worst case of the ratio between the cost incurred by an on-line algorithm and the cost of an optimal off-line algorithm on the same input.
topological sort
To arrange items when some pairs of items have no comparison, that is, according to a partial order.
symmetry breaking
To differentiate parts of a structure, such as a graph, which locally look the same to all vertices. Usually implemented with randomization.
tripartition
To partition array elements into three groups.
rotation
To switch children and parents among two or three adjacent nodes to restore balance to a tree.
Parallel
Two edges are parallel if they connect the same pair of vertices.
2 Ways to Improve Disjoint Sets
Union By Rank - make smaller tree point to larger tree. Path Compression - Updating parent pointer directly to root.
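A compact sketch of both improvements together, assuming elements are the integers 0..n-1. The class and method names are illustrative, not taken from any particular library.

```python
class DisjointSets:
    """Union/find with union by rank and path compression."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.rank = [0] * n           # upper bound on the height of each tree

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        # Path compression: point every node on the path directly at the root.
        while self.parent[x] != root:
            next_x = self.parent[x]
            self.parent[x] = root
            x = next_x
        return root

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # already in the same set
        # Union by rank: the root of the shallower tree points to the deeper one.
        if self.rank[ra] < self.rank[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        return True
```

For example, after `union(0, 1)` and `union(3, 4)` on five elements, `find(0) == find(1)` holds while `find(1) == find(3)` does not.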
Set
Unordered collection of distinct elements of a particular type in which an element can appear at most once
When should you use linked list over array?
You want to insert into the middle of a list, or you need constant-time insertions and deletions
Create The Node Class Of A Linked List.
class Node { int value; Node next; }
The vertices u and v of the undirected edge(u,v) are the _ of the edge
endpoints
A graph is undirected if
every edge in it is undirected.
Ordered arrays
fast binary search O(logN) slow insertion and deletion O(N)
Unordered Arrays
fast insertion O(1) slow linear search and deletion O(N)
Algorithm
has input, produces output, definite, finite, operates on the data it is given
What are the names of the first and last node in linked list?
head and tail
We add special nodes to a doubly linked list. What are they, and where do they go?
A header at the beginning and a trailer at the end
What are the attributes for a binary heap?
heapList and currentSize
What Is Heap Sort?
Heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like that algorithm, it divides its input into a sorted and an unsorted region, and it iteratively shrinks the unsorted region by extracting the largest element and moving that to the sorted region. The improvement consists of the use of a heap data structure rather than a linear-time search to find the maximum.
The vertices u and v of the undirected edge(u,v) are _ to the edge
incident
Chaining
Make each slot the head of a linked list
Linear Search
slow --> N/2 comparisons O(N)
Base case
the basic fundamental component of the list. Also terminating condition.
tree root
top level vertex
for the directed edge (u,v), u is the _ and v is the _
u is the tail, v is the head.
Divide and Conquer
works by recursively breaking down a problem into two or more sub problems until the problems become simple enough to be solved directly. An example would be mergesort.
Is each leaf in a tree unique?
yes
block
(1) A number of items which are handled together for efficiency. (2) A sequence of don't care symbols.
Combinations of binary tree traversal sequences that uniquely identify a tree
1. Inorder and preorder. 2. Inorder and postorder. 3. inorder and level-order.
What do you call a tree that has nodes with up to three children?
3-ary (ternary) tree
rapid sort
A 2-pass sort algorithm that is efficient when the range of keys is approximately equal to the number of items and only keys are sorted. The first pass counts the occurrences of each key in an auxiliary array. The second pass goes over the auxiliary array writing the counted number of keys to the destination.
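The two passes above might look like the following, assuming integer keys in a known range [min_key, max_key]. The function name and parameters are hypothetical.

```python
def rapid_sort(keys, min_key, max_key):
    """Sort integer keys in [min_key, max_key] by counting occurrences."""
    counts = [0] * (max_key - min_key + 1)
    # Pass 1: count the occurrences of each key in the auxiliary array.
    for key in keys:
        counts[key - min_key] += 1
    # Pass 2: walk the auxiliary array and write each key out count times.
    result = []
    for offset, count in enumerate(counts):
        result.extend([min_key + offset] * count)
    return result


print(rapid_sort([3, 1, 2, 3, 0], 0, 3))  # [0, 1, 2, 3, 3]
```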
probabilistic Turing machine
A Turing machine in which some transitions are random choices among finitely many alternatives.
antisymmetric
A binary relation R for which a R b and b R a implies a = b.
complete binary tree
A binary tree in which every level (depth), except possibly the deepest, is completely filled. At depth n, the height of the tree, all nodes must be as far left as possible.
associative array
A collection of items that are randomly accessible by a key, often a string.
Heap
A complete binary tree whose nodes contain comparable objects and are organized as follows. Each node contains an object that is no smaller (or larger) than the objects in its descendants.
spanning tree
A connected, acyclic subgraph containing all the vertices of a graph.
Graph
A data structure in programming which consists of a set of vertices (nodes) and edges (connections).
dynamization transformation
A data structuring technique that can make a static data structure dynamic. In so doing, the performance of the dynamic structure will exhibit certain space-time tradeoffs.
calendar queue
A fast priority queue implementation having N buckets each with width w, or covering w time. An item with priority p more than current goes in bucket (p/w)%N. Choose N and w to have few items in each bucket. Keep items sorted within buckets. Double or halve N and change w if the number of items grows or shrinks a lot.
worst-case minimum access
A figure of merit for a family of searches, the "best" is the search that takes the minimum accesses in the worst case.
index file
A file which stores keys and an index into another file. The index file may have additional structure, e.g., be a B-tree.
finite state transducer
A finite state machine specifically with a read-only input and a write-only output. The input and output cannot be reread or changed.
Moore machine
A finite state machine that produces an output for each state.
Mealy machine
A finite state machine which produces an output for each transition.
Weighted Graph
A graph which places "costs" on the edges for traveling their path
scan
A parallel operation in which each element in an array or linked list receives the sum of all previous elements.
ancestor
A parent of a node in a tree, the parent of the parent, etc.
first-in, first-out
A policy that items are processed in order of arrival. A queue implements this.
DFS forest
A rooted forest formed by depth-first search.
knight's tour
A series of moves of a chess knight that visits all squares on the board exactly once.
heapsort
A sort algorithm that builds a heap, then repeatedly extracts the maximum item. Run time is O(n log n).
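A sketch of heapsort under the usual array layout, where the children of index i sit at 2i+1 and 2i+2. It sorts a copy rather than the caller's list, which is a choice made for this example only.

```python
def heapsort(items):
    """Build a max-heap, then repeatedly move the maximum to the end."""
    data = list(items)
    n = len(data)

    def sift_down(root, end):
        # Push data[root] down until it is no smaller than its children.
        while True:
            child = 2 * root + 1
            if child >= end:
                return
            if child + 1 < end and data[child + 1] > data[child]:
                child += 1  # pick the larger child
            if data[root] >= data[child]:
                return
            data[root], data[child] = data[child], data[root]
            root = child

    for start in range(n // 2 - 1, -1, -1):  # build the heap bottom-up
        sift_down(start, n)
    for end in range(n - 1, 0, -1):          # extract the maximum n-1 times
        data[0], data[end] = data[end], data[0]
        sift_down(0, end)
    return data
```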
bounded stack
A stack limited to a fixed number of items.
Zhu-Takaoka
A string matching algorithm that is a variant of the Boyer-Moore algorithm. It uses two consecutive text characters to compute the bad character shift. It is faster when the alphabet or pattern is small, but the skip table grows quickly, slowing the pre-processing phase.
subgraph
A subgraph is a subset of a graph's edges (and associated vertices) that constitutes a graph.
octree
A tree to index three dimensions. Each node has either eight children or no children.
Bellman-Ford Algorithm
Algorithm which computes shortest paths from a single vertex to all other vertices in a weighted digraph. It is slower than Dijkstra's algorithm, but is able to handle edge weights with negative values. Works by initially setting the distance to the source to zero and to all other vertices to infinity, and then relaxing every edge repeatedly, up to |V| - 1 times, until no distance improves. Has a time complexity of O(EV)
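A minimal sketch of Bellman-Ford, assuming vertices are numbered 0..num_vertices-1 and the graph is given as a list of (u, v, weight) edges. The extra final pass that reports a negative-weight cycle is a common addition, not something stated in the card.

```python
def bellman_ford(num_vertices, edges, source):
    """Return the list of shortest distances from source to every vertex."""
    dist = [float("inf")] * num_vertices
    dist[source] = 0
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:  # relax edge (u, v)
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # no distance improved, so we can stop early
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle reachable from the source")
    return dist


edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, edges, 0))  # [0, 2, 5, 4]
```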
descendants of v
All vertices for which v is an ancestor in a tree
Caverphone
An algorithm to code English names phonetically.
Find
An algorithm to select the kth smallest element of an array and partition the array around it. First, partition around the value of the kth element. If the split is not at element k, move the upper or lower boundary and partition again.
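A sketch in the spirit of the description above, assuming k is a 0-based position and that working on a copy is acceptable. The function name `find_kth` is hypothetical.

```python
def find_kth(items, k):
    """Return the item that would sit at index k if items were sorted."""
    data = list(items)
    lo, hi = 0, len(data) - 1
    if not 0 <= k <= hi:
        raise IndexError("k is out of range")
    while lo < hi:
        pivot = data[k]  # partition around the value of the kth element
        i, j = lo, hi
        while i <= j:
            while data[i] < pivot:
                i += 1
            while data[j] > pivot:
                j -= 1
            if i <= j:
                data[i], data[j] = data[j], data[i]
                i += 1
                j -= 1
        # If the split is not at k, move one boundary and partition again.
        if j < k:
            lo = i
        if k < i:
            hi = j
    return data[k]


print(find_kth([7, 2, 9, 4, 1], 2))  # 4
```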
vertex coloring
An assignment of colors (or any distinct marks) to the vertices of a graph. Strictly speaking, a coloring is proper if no two adjacent vertices have the same color.
free edge
An edge which is not in a matching.
pile
An ordered deque, that is, items may only be added to or removed from the head or the tail. An item is added to the head if it is smaller than the current head. An item is added to the tail if it is greater than the current tail. Items are never inserted into the middle, rather, an additional pile may be created.
binary tree
An ordered tree in which every vertex has no more than two children, with each child designated as a left or right child. Potentially empty.
Clustering
Collisions that are resolved with linear probing cause groups of consecutive locations in the hash table to be occupied. Each group is called a cluster and the phenomenon is known as primary clustering.
k-way merge
Combine k sorted data streams into a single sorted stream.
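One way to sketch a k-way merge is with a min-heap holding the current front item of each stream. The generator name `k_way_merge` is illustrative; Python's standard library offers `heapq.merge` for the same job.

```python
import heapq


def k_way_merge(streams):
    """Yield the items of k sorted iterables as one sorted stream."""
    iterators = [iter(s) for s in streams]
    heap = []
    for index, it in enumerate(iterators):
        for value in it:  # take only the first item of each stream, if any
            heap.append((value, index))
            break
    heapq.heapify(heap)
    while heap:
        value, index = heapq.heappop(heap)
        yield value
        for nxt in iterators[index]:  # refill from the stream just consumed
            heapq.heappush(heap, (nxt, index))
            break


print(list(k_way_merge([[1, 4, 7], [2, 5], [], [3, 6, 8]])))
# [1, 2, 3, 4, 5, 6, 7, 8]
```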
Insertion Sort
Comparisons: O(N^2) --> max is N*(N-1)/2, average is N*(N-1)/4 Shifts (copies): O(N^2) --> a shift is not as time-consuming as a swap --> average: N*(N-1)/4 Big O: O(N^2) --> about half the time of bubble sort. On data that is already sorted it runs in O(N) --> an efficient choice for arrays that are only slightly out of order
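To make the counts above concrete, here is a sketch of insertion sort that also tallies comparisons and shifts. Returning the counters alongside the sorted copy is an addition made purely for illustration.

```python
def insertion_sort(items):
    """Return (sorted_copy, comparisons, shifts)."""
    data = list(items)
    comparisons = shifts = 0
    for i in range(1, len(data)):
        current = data[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if data[j] <= current:
                break
            data[j + 1] = data[j]  # shift the larger item one place right
            shifts += 1
            j -= 1
        data[j + 1] = current
    return data, comparisons, shifts


print(insertion_sort([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 4, 0)
print(insertion_sort([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10, 10)
```

With N = 5, the reversed input hits the maximum of N*(N-1)/2 = 10 comparisons, while the sorted input needs only N-1 = 4.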
List Insertion Sort
Comparisons: O(N^2) Copies: O(N) Big O: O(N^2)
BBP algorithm
Compute the nth hexadecimal digit of π efficiently, without having to compute preceding digits.
tree contraction
Contracting a tree by removing some of the nodes.
Balanced Binary Tree
For each node in the tree, the difference in the height of its left and right subtrees is at most one.
arborescence
Informally, a directed tree.
Single Source Shortest Path
Input: Graph and starting vertex. Output: shortest path to all points. Unweighted: BFS Weighted: Dijkstra's Method
What are the two key operations of a min_heap?
Insert and Extract_min
sparsity
Instances of the longest common subsequence problem in which the number of matches is small compared to the product of the lengths of the input strings.
Order of tree
Max number of children per node
Disjoint Sets
Never allowed to break apart sets. Also known as Union/Find Algorithm. Each node has a parent pointer which points to a representative for each set
Size
Number of nodes within tree
Order of functions
O(1), O(n), O(n log n), O(n^2), O(2^n)
TreeMap complexity for iterating over associated values:
O(N)
What Is The Space Complexity Of A Singly Linked List?
O(n)
PLOP-hashing
Piecewise linear order-preserving (PLOP) hashing is a spatial access method which splits space into a nonperiodic grid. Each spatial dimension is divided by nodes of a binary tree. Objects are stored in the grid cell of their centroid.
Adjacency matrix: query for adjacency
Query for adjacency: O(1)
Adjacency list: query for adjacency
Query for adjacency: O(|V|)
Replica
Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.
first child-next sibling representation
Representation used for ordered trees with a potentially varying amount of children per parent node.
rebalance
Restore balance to a tree.
Stacks - Simple Applications
Reversing a word Delimiter matching
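Both applications can be sketched with a plain Python list used as a stack (`append` to push, `pop` to pop). The function names and the set of delimiters checked are assumptions made for this example.

```python
def reverse_word(word):
    """Push every character, then pop them all: they come out reversed."""
    stack = list(word)
    reversed_chars = []
    while stack:
        reversed_chars.append(stack.pop())
    return "".join(reversed_chars)


def delimiters_match(text):
    """Check that (), [] and {} are properly nested and closed."""
    pairs = {")": "(", "]": "[", "}": "{"}
    stack = []
    for ch in text:
        if ch in pairs.values():
            stack.append(ch)  # opening delimiter: remember it
        elif ch in pairs:
            # closing delimiter: it must match the most recent opener
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers mean something was never closed


print(reverse_word("stack"))         # kcats
print(delimiters_match("a{b[c]d}"))  # True
print(delimiters_match("a{b(c]d}"))  # False
```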
jump search
Search a sorted array by checking every jth item until the right area is found, then doing a linear search. The optimum for n items is when j=√ n.
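A sketch of jump search with a step of about √n, assuming a sorted list and returning -1 when the target is absent. The name `jump_search` and the -1 convention are choices made for this example.

```python
import math


def jump_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is absent."""
    n = len(sorted_items)
    if n == 0:
        return -1
    step = max(1, math.isqrt(n))  # block size j, roughly the square root of n
    block_start = 0
    # Jump ahead while the last item of the current block is still too small.
    while block_start + step < n and sorted_items[block_start + step - 1] < target:
        block_start += step
    # Linear search inside the block that may contain the target.
    for i in range(block_start, min(block_start + step, n)):
        if sorted_items[i] == target:
            return i
    return -1


evens = list(range(0, 20, 2))  # [0, 2, 4, ..., 18]
print(jump_search(evens, 18))  # 9
print(jump_search(evens, 5))   # -1
```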
Bond Sequential Search
Search an array or list for two keys at once by using the bitwise or of the keys as the search key. When a possible match is found, compare each key against the item.
Level of tree
Set of all the nodes at a given depth
prefix code
Set of words such that no word of the set is a prefix of another word in the set. A prefix code may be represented by a coding tree.
iteration
Solve a problem by repeatedly working on successive parts of the problem.
Heapify (bubble down)
Swap a node with one of its children, calling bubble_down on the node again until it dominates its children. Each time, place a node that dominates the others as the parent node.
chromatic index
The minimum number of colors needed to color the edges of a graph.
minimum bounding box
The smallest rectangle completely enclosing a set of points.
transitive reduction
The transitive reduction of a directed graph G is the directed graph G' with the smallest number of edges such that for every path between vertices in G, G' has a path between those vertices.
General Tree
When each node can have an arbitrary number of children
What are the methods in a binary search tree class?
constructor() root, and size attributes length(), __len__, put, _put, get(self, key), _get(key, currentNode), getItem(key), contains(key), delete(key), spliceOut(), findSuccessor(), findMin(), remove(currentNode)
Mergesort
split into sub-arrays Best: O(n log n) Avg: O(n log n) Worst: O(n log n)
greatest common divisor
(1) The greatest integer which is a divisor of given positive integers. For instance, GCD(30, 42) = 6. (2) An algorithm to find the same.
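For sense (2), one well-known algorithm is Euclid's, sketched here for non-negative integers. Python's standard library also provides `math.gcd`.

```python
def gcd(a, b):
    """Euclid's algorithm: replace (a, b) with (b, a mod b) until b is 0."""
    while b:
        a, b = b, a % b
    return a


print(gcd(30, 42))  # 6
```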
tail
(1) The last item of a list. (2) All but the first item of a list; the list following the head.
supersource
A vertex of a directed graph from which all other vertices are reachable.
universe
All potential elements of a set.
primitive algorithm
An algorithm in which all the computable steps are basic operations.
An ordered list of nodes that are connected by edges.
Path
What Is Merge Sort?
Uses divide and conquer. If we have an array of 8, we divide it into 2 arrays of 4, then 4 of 2, then 8 of 1, solve each individually, and merge back together into 4 arrays of 2, then 2 arrays of 4, then 1 array of 8. The time complexity is O(n log n); the space complexity is O(n).
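A short top-down sketch of merge sort that returns a new sorted list. The function name and the choice to return a copy rather than sort in place are illustrative.

```python
def merge_sort(items):
    """Split in half, sort each half recursively, then merge the halves."""
    if len(items) <= 1:
        return list(items)  # base case: nothing left to split
    mid = len(items) // 2
    left, right = merge_sort(items[:mid]), merge_sort(items[mid:])
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:  # <= keeps equal items in their original order
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([8, 3, 5, 1, 7, 2, 6, 4]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```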
How Do you create a two-dimensional array with javascript?
var items = [ [1, 2], [3, 4], [5, 6] ];
D-tree
(1) A BB(α) tree where the weights are the number of searches down that path. (2) A height-balanced binary tree that divides regions along their boundaries to serve as a spatial access method.
bipartite matching
(1) A perfect matching between vertices of a bipartite graph, that is, a subgraph that pairs every vertex with exactly one other vertex. (2) The problem of finding such a matching.
null tree
(1) A tree which is empty. (2) A tree whose leaf nodes all have a null value.
order
(1) The height of a tree. (2) The number of children of the root of a binomial tree. (3) The maximum number of children of nodes in a B-tree. (4) The number of data streams, usually denoted ω, in a multiway merge.
domain
(1) The inputs for which a function or relation is defined. For instance, 0 is not in the domain of reciprocal (1/x). (2) The possible values of a variable.
stupid sort
(1) The original name of gnome sort. (2) An alternate name for bogosort.
best case
(1) The situation or input for which an algorithm or data structure takes the least time or resources. (2) Having to do with this situation or input.
worst case
(1) The situation or input that forces an algorithm or data structure to take the most time or resources. (2) Having to do with this situation or input.
What are the 4 steps to inserting a node into the tail of a singly linked list?
1. Create A New Node. 2. Assign Its Next Reference to None. 3. Set the next reference of the tail to point to this new node. 4. Then update the tail reference itself to this new node.
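The four steps can be sketched as follows, assuming the list keeps both a head and a tail reference. The empty-list branch is an extra case the steps above do not mention, and all names are illustrative.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.next = None  # step 2: a new tail node has no successor


class SinglyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def append(self, value):
        node = Node(value)         # step 1: create a new node
        if self.tail is None:
            self.head = node       # empty list: the new node is also the head
        else:
            self.tail.next = node  # step 3: old tail points to the new node
        self.tail = node           # step 4: update the tail reference

    def to_list(self):
        values, current = [], self.head
        while current is not None:
            values.append(current.value)
            current = current.next
        return values
```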
Red Black Trees
1. Every node is Red or Black 2. The root is Black 3. If a node is red, its children must be black 4. Every path from a node to a NULL pointer must contain the same number of black nodes
Red Black Tree: properties
1. Every node is either red or black 2. The root is black 3. Every leaf (NIL) is black 4. If a node is red, then both its children are black 5. All simple paths from node to child leaves contain the same # of black nodes
Post-Order Traversal
1. Process left child. 2. Process right child. 3. Process self.
In-order traversal
1. Traverse the left subtree. 2. Visit the root. 3. Traverse the right subtree
Pre-order traversal (depth first traversal)
1. Visit the root. 2. Traverse the left subtree. 3. Traverse the right subtree.
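The three depth-first orders differ only in where the node itself is processed, as this sketch shows. The `TreeNode` class and the list-returning functions are assumptions made for illustration.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node):
    if node is None:
        return []
    return [node.value] + pre_order(node.left) + pre_order(node.right)


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def post_order(node):
    if node is None:
        return []
    return post_order(node.left) + post_order(node.right) + [node.value]


root = TreeNode(2, TreeNode(1), TreeNode(3))
print(pre_order(root))   # [2, 1, 3]
print(in_order(root))    # [1, 2, 3]
print(post_order(root))  # [1, 3, 2]
```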
Important Sorting Assumptions
1. Sorting array of integers 2. Length of array is n 3. Sorting least to greatest 4. Can access array element in constant time 5. Compare ints in array only with '<' 6. Focus on # of comparisons
Multithread
2 or more tasks executing concurrently within 1 program
# of elements in a binary tree
At most 2^(# of rows) - 1, reached when every row is completely filled
ternary search tree
A 3-way tree where every node's left subtree has keys less than the node's key, every middle subtree has keys equal to the node's key, and every right subtree has keys greater than the node's key. If the key is a multikey (string, array, list, etc.), the middle subtree organizes by the next subkey (character, array or list item, etc.)
B*-tree
A B-tree in which nodes are kept 2/3 full by redistributing keys to fill two child nodes, then splitting them into three nodes.
2-3-4 tree
A B-tree of order 4, that is, internal nodes have two, three, or four children.
What Is A Hash Table?
A data structure that stores key-value pairs and uses a hash of the key to locate each value.
universal Turing machine
A Turing machine that is capable of simulating any other Turing machine by encoding the latter.
nondeterministic Turing machine
A Turing machine which has more than one next state for some combinations of contents of the current cell and current state. An input is accepted if any move sequence leads to acceptance.
Bit Array
A bit array is a mapping from some domain (almost always a range of integers) to values in the set {0, 1}. The values can be interpreted as dark/light, absent/present, locked/unlocked, valid/invalid, et cetera. The point is that there are only two possible values, so they can be stored in one bit. As with other arrays, the access to a single bit can be managed by applying an index to the array. Assuming its size (or length) to be n bits, the array can be used to specify a subset of the domain (e.g. {0, 1, 2, ..., n−1}), where a 1-bit indicates the presence and a 0-bit the absence of a number in the set. This set data structure uses about n/w words of space, where w is the number of bits in each machine word. Whether the least significant bit (of the word) or the most significant bit indicates the smallest-index number is largely irrelevant, but the former tends to be preferred (on little-endian machines).
range sort
A bucket sort where the function to determine the bucket is based on the range of possible keys.
descendant
A child of a node in a tree, any of the children of the children, etc.
What Is The Advantage Of A Hash Table?
A collection of items which are stored in such a way as to make it easy to find them later.
What Is A Singly Linked List?
A collection of nodes that collectively form a linear sequence.
Say we have a modulo hash function and compute 44%11 and 77%11. What would happen?
A collision
edge
A connection between two vertices of a graph. In a weighted graph, each edge has a number, called a "weight." In a directed graph, an edge goes from one vertex, the source, to another, the target, and hence makes a connection in only one direction.
asymptotic bound
A curve representing the limit of a function. That is, the distance between a function and the curve tends to zero. The function may or may not intersect the bounding curve.
external memory data structure
A data structure that is efficient even when accessing most of the data is very slow, such as, on a disk.
decidable problem
A decision problem that can be solved by an algorithm that halts on all inputs in a finite number of steps. The associated language is called a decidable language.
multiway decision
A decision which has more than two results. For instance, testing if a < b yields two results, but some languages allow a test to return a < b, a=b, or a > b in one operation.
Degenerate Binary Tree
A degenerate (or pathological) tree is where each parent node has only one associated child node. This means that performance-wise, the tree will behave like a linked list data structure.
pseudo-random number generator
A deterministic algorithm to generate a sequence of numbers with little or no discernible pattern in the numbers, except for broad statistical properties.
directed acyclic graph
A directed acyclic graph (or DAG) is a digraph with no directed cycles.
st-digraph
A directed acyclic graph with two specially marked nodes, the source s and the sink t.
pipelined divide and conquer
A divide and conquer paradigm in which partial results from recursive calls can be used before the calls complete. The technique is often useful for reducing the depth of an algorithm.
linear hashing
A dynamic hashing table that grows one slot at a time. It has a family of hash functions, $h_i$, where the range of $h_{i+1}$ is twice the range of $h_i$. Slots below a pointer, p, have been split. That is, key k is in slot $h_i(k)$ if $h_i(k) \ge p$. Otherwise it is in slot $h_{i+1}(k)$. To maintain the load factor, slot p can be split (rehashed with $h_{i+1}$) and p incremented. When p reaches the end, the ranges are doubled (i is incremented), and p starts over.
pattern
A finite number of strings that are searched for in texts.
Select
A four-part algorithm to select the kth smallest element of an array. 1) Consider the array as groups of 5 elements; sort and find the median of each group. 2) Use Select recursively to find x, the median of the medians. 3) Next partition the array around x. 4) Let i be the number of elements in the low side of the partition. If k ≤ i, use Select recursively to find the kth element of the low side. Otherwise Select the (k-i)th element of the high side.
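The four parts can be sketched as below, assuming k is 1-based. To cope with duplicate keys this version splits into low, equal, and high groups instead of a two-sided partition, and it builds new lists rather than partitioning in place; both are simplifications made for this example.

```python
def select(items, k):
    """Return the kth smallest item (k is 1-based) via median of medians."""
    if len(items) <= 5:
        return sorted(items)[k - 1]
    # Part 1: groups of 5, take the median of each group.
    groups = [items[i:i + 5] for i in range(0, len(items), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]
    # Part 2: recursively find x, the median of the medians.
    x = select(medians, (len(medians) + 1) // 2)
    # Part 3: partition around x.
    low = [v for v in items if v < x]
    equal = [v for v in items if v == x]
    high = [v for v in items if v > x]
    # Part 4: recurse into the side that holds the kth smallest item.
    if k <= len(low):
        return select(low, k)
    if k <= len(low) + len(equal):
        return x
    return select(high, k - len(low) - len(equal))
```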
lower bound
A function or growth rate below which solving a problem is impossible.
0-ary function
A function that takes no arguments.
Acyclic
A graph that has no cycles.
connected components
A graph that is not connected consists of a set of connected components, which are maximal connected subgraphs.
Directed Graph
A graph where an edge has a direction associated with it, for example, a plane flight that takes off in one location and arrives in another. The return flight would be considered a separate edge.
Connected Graph
A graph where there exists a simple path from any vertex in the graph to any other vertex in the graph, even if it takes several "hops" to get there.
height-balanced binary search tree
A height-balanced tree which is also a binary search tree. It supports membership, insert, and delete operations in time logarithmic in the number of nodes in the tree.
frequency count heuristic
A heuristic that keeps the elements of a list ordered by number of times each element is the target of a search.
ordered linked list
A linked list whose items are kept in some order.
linked list
A list implemented by each item having a link to the next item.
K-dominant match
A match [i,j] having rank k and such that for any other pair [i',j'] of rank k either i'>i and j'≤ j or i' ≤ i and j'>j.
perfect matching
A matching, or subset of edges without common vertices, of a connected graph that touches all vertices exactly once. A graph with an odd number of vertices is allowed one unmatched vertex.
uniform matrix
A matrix having the same number of items in each row.
sparse matrix
A matrix that has relatively few non-zero (or "interesting") entries. It may be represented in much less than n × m space.
lower triangular matrix
A matrix that is only defined at (i,j) when i ≥ j.
strictly lower triangular matrix
A matrix that is only defined at (i,j) when i > j.
upper triangular matrix
A matrix that is only defined at (i,j) when i ≤ j.
strictly upper triangular matrix
A matrix that is only defined at (i,j) when i < j.
quadratic probing
A method of open addressing for a hash table in which a collision is resolved by putting the item in the next empty place given by a probe sequence. The space between places in the sequence increases quadratically.
double hashing
A method of open addressing for a hash table in which a collision is resolved by searching the table for an empty place at intervals given by a different hash function, thus minimizing clustering.
recursion tree
A method to analyze the complexity of an algorithm by diagramming the recursive function calls.
minimum spanning tree
A minimum-weight tree in a weighted graph which contains all of the graph's vertices.
random access machine
A model of computation whose memory consists of an unbounded sequence of registers, each of which may hold an integer. In this model, arithmetic operations are allowed to compute the address of a memory register.
concurrent flow
A multi-commodity flow in which the same fraction of the demand of each commodity is satisfied.
topological order
A numbering of the vertices of a directed acyclic graph such that every edge from a vertex numbered i to a vertex numbered j satisfies i<j.
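One way to produce such a numbering is Kahn's algorithm, sketched here for vertices 0..num_vertices-1 and a list of (u, v) edges. The position of a vertex in the returned list serves as its number; the function name is illustrative.

```python
from collections import deque


def topological_order(num_vertices, edges):
    """Return the vertices in an order where every edge goes forward."""
    successors = [[] for _ in range(num_vertices)]
    in_degree = [0] * num_vertices
    for u, v in edges:
        successors[u].append(v)
        in_degree[v] += 1
    # Vertices with no incoming edges can be numbered first.
    ready = deque(v for v in range(num_vertices) if in_degree[v] == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in successors[u]:
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)
    if len(order) != num_vertices:
        raise ValueError("graph has a directed cycle, so no such order exists")
    return order
```

For the edges (0, 1), (0, 2), (1, 3), (2, 3), every valid result places 0 first and 3 last.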
linear hash
A numeric function that maintains the order of input keys while changing their spacing.
Weakly connected
A directed graph is weakly connected if there is a path between every pair of vertices when the edges are treated as undirected.
Peek
A process used in stack and queue processing where a copy of the top or front value is acquired, without removing that item.
Push
A process used in stack and queue processing where a new value is inserted onto the top of the stack OR into the back of the queue (Enqueue).
Linear Data Structure
A programming data structure that occupies contiguous memory, such as an array of values.
Recursion
A programming technique in which a method breaks down a problem into one or more simpler problems of the same nature and then calls itself to solve the simpler problem(s). All recursive definitions must have a non-recursive part to terminate the recursion. Running time depends on the number and size of the recursive calls; a recursion that makes one call per element, for example, is O(n).
diagonalization
A proof technique for showing that a given language does not belong to a given complexity class, used in many separation theorems.
CTL
A propositional, branching-time temporal logic for which formulas can be checked in linear time. An acronym for Computation Tree Logic.
bounded queue
A queue limited to a fixed number of items.
Las Vegas algorithm
A randomized algorithm that always produces correct results, with the only variation from one run to another being its running time.
easy split, hard merge
A recursive algorithm, especially a sort algorithm, where dividing (splitting) into smaller problems is quick or simple and combining (merging) the solutions is time consuming or complex.
top-down radix sort
A recursive bucket sort where elements are distributed based on succeeding pieces (characters) of the key.
uniform circuit family
A sequence of circuits, one for each input length n, that can be efficiently generated by a Turing machine.
kth order Fibonacci numbers
A sequence of numbers where each number is the sum of the k preceding numbers. The usual Fibonacci numbers occur when k=2.
directed path
A sequence of vertices (v1,v2,v3,...) such that v1->v2, v2->v3,v3->... for a directed graph.
parallel random-access machine
A shared memory model of computation, where typically the processors all execute the same instruction synchronously, and access to any memory location occurs in unit time.
simple cycle
A simple cycle is a cycle with no repeated edges or vertices (except the requisite repetition of the first and last vertices).
local optimum
A solution to a problem that is better than all other solutions that are slightly different, but worse than the global optimum.
regular decomposition
A space decomposition method that partitions the underlying space by recursively halving it across the various dimensions instead of permitting the partitioning lines to vary.
Maximum Spanning Tree
A spanning tree of a weighted graph having maximum weight. It can be computed by negating the edges and running either Prim's or Kruskal's algorithms.
end-of-string
A special character indicating the end of a string.
Boyer-Moore-Horspool
A string matching algorithm that compares characters from the end of the pattern to its beginning. When characters don't match, searching jumps to the next matching position in the pattern.
Boyer-Moore
A string matching algorithm that compares characters from the end of the pattern to its beginning. When characters don't match, searching jumps to the next possible match: the farthest of a table like that used in the Knuth-Morris-Pratt algorithm and the next matching position in the pattern.
Karp-Rabin
A string matching algorithm that compares strings' hash values, rather than the strings themselves. For efficiency, the hash value of the next position in the text is easily computed from the hash value of the current position.
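A sketch of the rolling-hash idea, assuming a polynomial hash with an arbitrary base and modulus (256 and 1,000,003 here are illustrative values, not prescribed ones). A direct string comparison confirms each hash hit, since different strings can share a hash value.

```python
def karp_rabin(pattern, text, base=256, modulus=1_000_003):
    """Return the start index of every occurrence of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    high = pow(base, m - 1, modulus)  # weight of the window's leading character
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % modulus
        t_hash = (t_hash * base + ord(text[i])) % modulus
    matches = []
    for start in range(n - m + 1):
        if p_hash == t_hash and text[start:start + m] == pattern:
            matches.append(start)
        if start + m < n:
            # Roll the hash: drop text[start], append text[start + m].
            t_hash = ((t_hash - ord(text[start]) * high) * base
                      + ord(text[start + m])) % modulus
    return matches


print(karp_rabin("aba", "ababa"))  # [0, 2]
```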
optimal mismatch
A string matching algorithm that compares the rarest character first. When a character doesn't match, the next character in the text beyond the search string determines where the next possible match begins.
Knuth-Morris-Pratt algorithm
A string matching algorithm that turns the search string into a finite state machine, then runs the machine with the string to be searched as the input string. Execution time is O(m+n), where m is the length of the search string, and n is the length of the string to be searched.
Smith algorithm
A string matching algorithm which computes the shift value for both the rightmost character of the window and the character preceding it, then uses the maximum of the two values.
jelly-fish
A theoretical data structure for n items. It starts with a balanced binary search tree of about √(n) nodes. The leaf nodes lead to "tentacles" or linked lists, each of about √(n) nodes.
Theta
A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = Θ (g(n)) means it is within a constant multiple of g(n). The equation is read, "f of n is theta g of n".
omega
A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = ω (g(n)) means g(n) becomes insignificant relative to f(n) as n goes to infinity.
primitive recursive
A total function that can be written using only nested conditional (if-then-else) statements and fixed iteration (for) loops.
work-preserving
A translation of an algorithm from one model of computation to another in which the work is the same in both models, to within a constant factor.
Head
A typical object variable identifier name used to reference, or point to, the first object in a linked list. The number one rule for processing linked lists is, "Never let go of the head of the list!", otherwise all of the list is lost in memory. The number two rule when managing linked lists is, "Always connect before you disconnect!".
Array length
A value that represents the number of elements contained in an array. Often there is a process associated with an array that provides this value, such as list.length, or len(list).
jump list
A variant of doubly linked list with items in sorted order and having two levels of additional links that span geometrically increasing distances. For a list with n items, the next level is a link from item i, $1 \le i \le n - n^{1/3}$, to item $i + i^{1/3}$. At the top level, items $1^3, 2^3, 3^3, \ldots, (n^{1/3})^3$ have backward links, that is, there is a link from item $i^3$, $1 < i \le (n^{1/3})^3$, to item $(i-1)^3$. Search, insert, and delete are $O(n^{1/3})$ worst case.
Map
ADT that maps keys to values. Should have add(), reassign(), remove() and lookup().
Hash table
ADT with each location called a cell or bucket
Adjacency list: add vertex/edge, delete vertex/edge
Add vertex: O(1) Add edge: O(1) Delete vertex: O(|E|) Delete edge: O(|E|)
Adjacency matrix: add vertex/edge, delete vertex/edge
Add vertex: O(|V|^2) Add edge: O(1) Delete vertex: O(|V|^2) Delete edge: O(1)
AVL Trees
Adelson-Velskii & Landis: Any pair of sibling nodes have a height difference of at most 1. On insertion, at most one rotation (single or double) is needed to restore balance. On removal, multiple rotations may be necessary.
weight/cost matrix
Adjacency matrix of a weighted graph.
ArrayLists: advantages, disadvantages
Advantage: advantages of an array, plus does not run out of space Disadvantage: inserting can be slower than an array
Graph: advantage, disadvantage
Advantage: best models real-world situations Disadvantage: can be slow and complex
Divide-and-conquer
Algorithm design patterns. Divide the problem into two or more smaller independent subproblems and solve the original problem using solutions to the subproblems.
Invariants
Algorithm design patterns. Identify an invariant and use it to rule out potential solutions that are suboptimal/dominated by other solutions.
Recursion
Algorithm design patterns. If the structure of the input is defined in a recursive manner, design a recursive algorithm that follows the input definition.
proper descendants
All vertices for which v is an ancestor in a tree, excluding v itself.
Ancestors of a vertex v in a tree
All vertices on the simple path from the root to v
Stack
An abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the last element that was added. LIFO - Last In First Out
priority queue
An abstract data type to efficiently support finding the item with the highest priority across a series of operations. The basic operations are: insert, find-minimum (or maximum), and delete-minimum (or maximum). Some implementations also efficiently support joining two priority queues (meld), deleting an arbitrary item, and increasing the priority of an item (decrease-key).
algorithm FGK
An adaptive Huffman coding scheme. Coding is never much worse than twice optimal.
branch and bound
An algorithmic technique to find the optimal solution by keeping the best solution found so far. If a partial solution cannot improve on the best, it is abandoned.
recursion
An algorithmic technique where a function, in order to accomplish a task, calls itself with some part of the task.
competitive analysis
An analysis in which the performance of an on-line algorithm is compared to the best that could have been achieved if all the inputs had been known in advance.
2D Array
An array of arrays, characterized by rows and columns, arranged in a grid format, but still stored in contiguous, or side-by-side, memory, accessed using two index values.
Multiqueue
An array of queues with different sizes. All operations are O(1), but larger queues are O(m) (depending on the number of priorities).
Row Major
An array where the two index values for any element are the row first, then the column.
dynamic array
An array whose size may change over time. Items are not only added or removed, but memory used changes, too.
histogram sort
An efficient 3-pass refinement of a bucket sort algorithm. The first pass counts the number of items for each bucket in an auxiliary array, and then makes a running total so each auxiliary entry is the number of preceding items. The second pass puts each item in its proper bucket according to the auxiliary entry for the key of that item. The last pass sorts each bucket.
Bresenham's algorithm
An efficient algorithm to render a line with pixels. The long dimension is incremented for each pixel, and the fractional slope is accumulated.
hash heap
An efficient implementation of a priority queue. The linear hash function monotonically maps keys to buckets, and each bucket is a heap.
transitive closure
An extension or superset of a binary relation such that whenever (a,b) and (b,c) are in the extension, (a,c) is also in the extension.
diminishing increment sort
An in-place sort algorithm that repeatedly reorders different, small subsets of the input until the entire array is ordered. On each pass it handles i sets of n/i items, where n is the total number of items. Each set is every ith item, e.g. set 1 is item 1, 1+i, 1+2i, etc., set 2 is item 2, 2+i, etc. On each succeeding pass, the increment or gap, i, is reduced until it is 1 for the last pass.
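The passes described above can be sketched as a gapped insertion sort. This is a minimal illustration, assuming the simple gap sequence n/2, n/4, ..., 1; the function name `shell_sort` and that gap choice are assumptions made here, and other gap sequences are common.

```python
def shell_sort(items):
    """Sort a list in place using diminishing increments (gaps)."""
    gap = len(items) // 2
    while gap > 0:
        # Insertion-sort each set of items that are `gap` positions apart.
        for i in range(gap, len(items)):
            current = items[i]
            j = i
            while j >= gap and items[j - gap] > current:
                items[j] = items[j - gap]
                j -= gap
            items[j] = current
        gap //= 2  # reduce the increment; the last pass uses gap 1
    return items

print(shell_sort([9, 8, 3, 7, 5, 6, 4, 1]))
```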
relaxation
An optimization problem with an enlarged feasible region (and extended objective function) compared with an original optimization problem. Typically, the relaxation is considerably easier to solve than the original.
hybrid algorithm
Any algorithm composed of simpler algorithms.
comparison sort
Any sort algorithm using comparisons between keys, and nothing else about the keys, to arrange items in a predetermined order. An alternative is a restricted universe sort such as counting sort or bucket sort.
Quick Sort
Average: O(N log N). Worst case: O(N^2), for example when the array is already sorted or inversely sorted and the first or last element is used as the pivot. Improvements: median-of-three partitioning (use the median of the first, last, and middle elements as the pivot, or sort those three directly); an insertion sort procedure for small partitions; a hybrid scheme that runs quicksort, O(N log N), until the array is almost sorted and then uses insertion sort, which is then O(N).
BST Priority Queue: insert, max, extract max, increase value
BST-insert: O(h) BST-maximum: O(h) BST-extract-max: O(h) BST-increase-value: O(h)
2-3 Tree
Balanced tree data structure with logN complexities on searching, inserting, and deleting in both the worst and average cases. In this data structure, every node with children has either two children and one data element, or three children and two data elements. Leaf nodes will contain only one or two data elements.
Retrieval
Based on size of bucket, not size of hash table. Can be reduced by increasing constraints to increase no. of buckets.
brute force string search with mismatches
Beginning with the (leftmost) position in a string and trying each position in turn, find the number of characters for which the pattern and the substring beginning at that position don't match (the Hamming distance). Return the first position with k or fewer mismatches.
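A direct transcription of this definition might look like the following sketch. The function name `search_with_mismatches` and the convention of returning -1 when no position qualifies are assumptions for illustration.

```python
def search_with_mismatches(text, pattern, k):
    """Return the first position where pattern matches text with at most k mismatches."""
    m = len(pattern)
    for pos in range(len(text) - m + 1):
        # Hamming distance between the pattern and the substring at pos.
        mismatches = sum(1 for a, b in zip(text[pos:pos + m], pattern) if a != b)
        if mismatches <= k:
            return pos
    return -1  # assumed convention: no position has k or fewer mismatches

print(search_with_mismatches("abcdefg", "cxe", 1))
```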
Level-Order Traversal
Begins at the root and visits nodes one level at a time. Within a level, it visits nodes from left to right. An example of breadth-first traversal.
Quadratic Probing
On the nth probe after a collision, checks the slot n² positions from the original hash position. Causes secondary clustering. Not guaranteed to find an open slot unless the table is at least half empty.
Circular queue - unbounded
Circular queue. Dequeue: O(1). Enqueue: O(n) (worst case, when the underlying array must be resized).
multiway merge
Combine more than two sorted data streams into a single sorted stream.
Folding:
Combining parts of the key using operations like + and bitwise operations such as exclusive or. Example: key 123456789 is split into 123, 456, and 789; 123 + 456 + 789 = 1368, and the leading 1 is discarded, giving 368.
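The folding example can be reproduced with a short sketch. The function name `fold_hash` and the `chunk_digits` parameter are hypothetical, and only the additive variant is shown, assuming a non-negative integer key.

```python
def fold_hash(key: int, chunk_digits: int = 3) -> int:
    """Fold a non-negative integer key by summing fixed-width groups of digits."""
    digits = str(key)
    total = 0
    for start in range(0, len(digits), chunk_digits):
        total += int(digits[start:start + chunk_digits])
    # Keep only the low-order digits, discarding any carry out of the chunk width.
    return total % (10 ** chunk_digits)

print(fold_hash(123456789))  # 123 + 456 + 789 = 1368, carry discarded
```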
Rabin-Karp
Compute the hash code of each substring of the text whose length is the length of the pattern s, using a hash function with the property that the hash code of a string is an additive function of each individual character. This lets the hash of a sliding window of characters be updated cheaply as the window moves; when a window's hash matches the pattern's hash, compare the characters to confirm the match.
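The sliding-window idea can be sketched as follows. This version assumes a polynomial rolling hash, which is one common choice; the function name `rabin_karp` and the `base` and `mod` values are illustrative assumptions, not prescribed by the definition.

```python
def rabin_karp(text: str, pattern: str, base: int = 256, mod: int = 1_000_000_007) -> int:
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the window's leading character
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    for start in range(n - m + 1):
        # Compare characters only when the hashes agree, to rule out collisions.
        if w_hash == p_hash and text[start:start + m] == pattern:
            return start
        if start + m < n:
            # Slide the window: drop text[start], append text[start + m].
            w_hash = ((w_hash - ord(text[start]) * high) * base + ord(text[start + m])) % mod
    return -1

print(rabin_karp("abracadabra", "cad"))
```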
Double-ended linked-list
Contains an additional reference to the last link. This makes it possible to insert a new link directly at the end of the list in O(1), without iterating along the entire list. Suitable for situations such as implementing a queue.
Unordered Linked List
Data structure whose operations are not efficiently supported. Is unordered. Has a worst case cost of search and insertion at N, an average case cost of insertion at N, and an average case cost of searching at N/2.
one-dimensional
Dealing with or restricted to a line. An organization where location can be completely described with exactly one axis.
two-dimensional
Dealing with or restricted to a plane. An organization where location can be completely described with exactly two orthogonal axes.
three-dimensional
Dealing with or restricted to a space in which location can be completely described with exactly three orthogonal axes.
subgraph isomorphism
Decide if there is a subgraph of one graph which is isomorphic to another graph.
What Are The Methods We Need To Create A Deque And What Do They Do?
Deque() creates a new deque that is empty. addFront(item) adds a new item to the front of the deque and returns nothing. addRear(item) adds a new item to the rear of the deque and returns nothing. removeFront() removes the front item and returns it. removeRear() removes the rear item and returns it. isEmpty() returns a boolean. size() returns the number of items.
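A minimal sketch of these methods, backed by a Python list, is given here. The choice of a list with the front at the end is an assumption for illustration; it makes addRear and removeRear O(n), and the standard library's collections.deque would avoid that cost.

```python
class Deque:
    """List-backed deque; the rear is index 0 and the front is the last index."""

    def __init__(self):
        self._items = []

    def addFront(self, item):
        self._items.append(item)

    def addRear(self, item):
        self._items.insert(0, item)

    def removeFront(self):
        return self._items.pop()

    def removeRear(self):
        return self._items.pop(0)

    def isEmpty(self):
        return not self._items

    def size(self):
        return len(self._items)

d = Deque()
d.addFront("a")
d.addRear("b")
print(d.size(), d.removeFront(), d.removeRear(), d.isEmpty())
```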
bin packing problem
Determine how to put the most objects in the least number of fixed space bins. More formally, find a partition and assignment of a set of objects such that a constraint is satisfied or an objective function is minimized (or maximized). There are many variants, such as, 3D, 2D, linear, pack by volume, pack by weight, minimize volume, maximize value, fixed shape objects, etc.
What is the goal of Hashing?
Do better than O(log N) time complexity for lookup, insert, and remove operations. The aim is to achieve O(1).
Indexed List
Elements are referenced by numerical position or index in list. Often no relation among elements.
Ordered List
Elements are sorted by some inherent characteristic of the elements, e.g. names in alphabetical order, scores in ascending order. Has an add(T element) method. It uses a Comparable interface, specifically a compareTo method. For elements that don't have an inherent order, we can introduce an order, e.g. Employee, Car.
Collision
Entering into a space already in use.
dual linear program
Every linear program has a corresponding linear program called the dual. It is max_y { b · y | A^T y ≤ c and y ≥ 0 }. For any solution x to the original linear program and any solution y to the dual we have c · x ≥ (A^T y)^T x = y^T (A x) ≥ y · b. For optimal x and y, equality holds. For a problem formulated as an integer linear program, a solution to the dual of a relaxation of the program can serve as witness.
Breadth-First Search
Explores the oldest unexplored vertices first. Places discovered vertices in a queue. In an undirected graph: Assigns a direction to each edge, from the discoverer to the discovered, and the discoverer is denoted to be the parent.
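The queue discipline and the parent assignment can be sketched as below. The adjacency-list dictionary format and the name `bfs_parents` are assumptions for illustration.

```python
from collections import deque

def bfs_parents(graph, start):
    """Return {vertex: parent} for every vertex reachable from start."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        u = queue.popleft()        # oldest discovered vertex is explored first
        for v in graph[u]:
            if v not in parent:    # the first discoverer becomes the parent
                parent[v] = u
                queue.append(v)
    return parent

graph = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
print(bfs_parents(graph, "A"))
```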
queue
FIFO list in which elements are added from one end of the structure and deleted from the other end.
traveling salesman
Find a path through a weighted graph that starts and ends at the same vertex, includes every other vertex exactly once, and minimizes the total cost of edges.
British Museum technique
Find a solution by checking all possibilities one by one, beginning with the smallest. This is a conceptual, not a practical, technique where the number of possibilities are enormous.
backtracking
Find a solution by trying one of several choices. If the choice proves incorrect, computation backtracks or restarts at the point of choice and tries another choice. It is often convenient to maintain choice points and alternate choices using recursion.
vehicle routing problem
Find an optimal route of one or more vehicles through a graph.
prune and search
Find an optimal value by eliminating a constant fraction of remaining objects at each step. Eliminated objects are guaranteed not to affect the optimal value. A logarithmic number of steps reduces the number of objects to a constant, and a brute force approach can then solve it.
select kth element
Find the kth smallest element of a set. Two approaches are a modified distribution sort or select and partition.
cutting theorem
For any set H of n hyperplanes in R^k, and any parameter r, 1 ≤ r ≤ n, there always exists a (1/r)-cutting of size O(r^k). In two dimensions, a (1/r)-cutting of size s is a partition of the plane into s disjoint triangles, some of which are unbounded, such that no triangle in the partition intersects more than n/r lines in H. In R^k, triangles are replaced by simplices. Such a cutting can be computed in O(n r^(k-1)) time.
inverse suffix array
For each position in a string, the inverse suffix array has its index in the string's suffix array.
Given a binary tree of integers, print it in level order. The output will contain space between the numbers in the same level, and new line between different levels.
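For example, a tree with 1 at the root, 2 and 3 on the second level, and 4, 5, and 6 on the third level prints as three lines: "1", then "2 3", then "4 5 6".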
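One way to produce this output is a breadth-first traversal that drains the queue one level at a time. The `Node` class and the function name `level_order_lines` are hypothetical helpers for this sketch.

```python
from collections import deque

class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def level_order_lines(root):
    """Return one space-separated string per level of the tree."""
    lines = []
    queue = deque([root] if root is not None else [])
    while queue:
        level = []
        for _ in range(len(queue)):  # exactly the nodes on the current level
            node = queue.popleft()
            level.append(str(node.value))
            if node.left is not None:
                queue.append(node.left)
            if node.right is not None:
                queue.append(node.right)
        lines.append(" ".join(level))
    return lines

tree = Node(1, Node(2, Node(4), Node(5)), Node(3, None, Node(6)))
print("\n".join(level_order_lines(tree)))
```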
graph (formal)
G = (V,E). V: finite, nonempty set of vertices. E: set of pairs of V, called edges.
Relaxation
Getting from A->C more cheaply by using B as an intermediary.
Directed Graph
A graph in which each edge connecting vertices A and B can be traveled in only one direction.
Extraction
Hash function that uses part of the data given to it to locate the correct bucket
When would you want to use hash table over binary search tree?
A hash table is faster, with O(1) average lookup against the binary search tree's O(log n). But when you need the keys in sorted order, a binary search tree is better.
Idea of probing:
If you have a collision, search somewhere else on the table.
bucket array
Implementation of a dictionary by an array indexed by the keys of the items in the dictionary.
implies
Implication: 0 → 0 = 1, 0 → 1 = 1, 1 → 0 = 0, 1 → 1 = 1.
Floyd-Warshall
Determines shortest paths between all pairs of vertices by considering, in turn, every vertex as a possible intermediate stop on each path.
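A compact sketch of the algorithm, assuming the graph is given as a weight matrix with `math.inf` for missing edges and 0 on the diagonal (an input format chosen here for illustration):

```python
import math

def floyd_warshall(weights):
    """Return the matrix of shortest-path distances between all pairs of vertices."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is left unchanged
    for k in range(n):                  # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = math.inf
print(floyd_warshall([[0, 3, INF], [INF, 0, 1], [2, INF, 0]]))
```

The inner comparison is the same relaxation step described elsewhere in these notes: getting from i to j more cheaply by going through k.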
Key
Information in items that is used to determine where the item goes into the table.
ArrayLists: insert
Insert: often O(1), sometimes more
What is a perfect binary tree?
Is both full and complete
halting problem
Is there an algorithm to determine whether any arbitrary program halts? Turing proved the answer is, no. Since many questions can be recast to this problem, we can conclude that some programs are absolutely impossible, although heuristic or partial solutions are possible.
How Do We Get The Root Node Of The Binary Search Tree?
It is the first key inserted
LinkedStack
Keep reference of top of stack and integer count of number of nodes in stack. Pushing element into stack means adding to front of list. Same with pop.
Tries: definition
Key-value storage; a kind of tree Key -not- stored in node, value stored in node Node variables : Boolean isNode, String value, array Edges
Map iteration
keySet: returns the keys in the map. entrySet: returns the key-to-value mappings. values: returns the values in the map.
stack
LIFO list in which insertions/deletions are only done at one end.
Dynamic Memory
Memory that is allocated as needed, and NOT contiguous (side-by-side), specifically during the implementation of a linked list style data structure, which also includes binary trees and graphs.
Red Black Tree: memory
Memory: O(n)
Adjacency list: memory
Memory: O(|V|+|E|)
Adjacency matrix: memory
Memory: O(|V|^2)
Lower Bound on the complexity of pairwise comparisons
No comparison-based sorting algorithm can use fewer than ~N log N compares in the worst case.
Unbounded
No limit on the size of the collection, e.g. ArrayList. Should have isEmpty(), size(), isFull(), and capacity().
Heap Sort
Non-stable, in place sort which has an order of growth of NlogN. Requires only one spot of extra space. Works like an improved version of selection sort. It divides its input into a sorted and unsorted region, and iteratively shrinks the unsorted region by extracting the smallest element and moving it into the sorted region. It will make use of a heap structure instead of a linear time search to find the minimum.
ShellSort
Non-stable, in place sort with an order of growth which is undetermined, though usually given at being N-to-the 6/5. Needs only one spot of extra space. Works as an extension of insertion sort. It gains speed by allowing exchanges of entries which are far apart, producing partially sorted arrays which are eventually sorted quickly at the end with an insertion sort. The idea is to rearrange the array so that every h-th entry yields a sorted sequence. The array is h-sorted.
What Is The Best, Average, Worst, and Space Complexity Of Selection Sort?
O(n^2), O(n^2), O(n^2), O(1)
What Is The Best, Average, Worst, and Space Complexity Of Quick Sort?
O(n log n), O(n log n), O(n^2), O(log n)
Collection
Object that represents a group of objects of some fixed type. Each object within the collection is known as an element. Examples include: Stack, Queue, List, Tree, Graph. Can be unordered, linear, or non-linear.
parent
Of a node: the tree node conceptually above or closer to the root than the node and which has a link to the node. See the figure at tree.
Asymptotic Complexity
Order of the algorithm. Different implementations of the same algorithms may differ in efficiency. Determined by dominant term.
Stack
Organizes its entries according to the order in which they were added. Last In - First Out
Two Way algorithm
Partition ("factor") the pattern, x, into left, xl, and right, xr, parts in such a way to optimize searching. Compare xr left to right then, if it matches, compare xl right to left.
graph partition
Partition the vertices while keeping the cost of spanning edges low.
What Is The Information Stored In The Node?
Payload
Load factor
Percentage occupancy of the table at which the table will be resized. e.g. when the table is 75% full, resize.
deterministic
Permitting at most one next move at any step in a computation.
Burrows-Wheeler transform
Permute a string. Repeated substrings lead to repeated characters in the permuted string, which is easier to compress. Knowing which character was last in the original string, the original can be reconstructed from the rearranged string.
quicksort
Pick an element from the array (the pivot), partition the remaining elements into those greater than and less than this pivot, and recursively sort the partitions. There are many variants of the basic scheme above: to select the pivot, to partition the array, to stop the recursion or switch to another algorithm for small partitions, etc.
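The basic scheme can be illustrated with a short, non-in-place sketch. Choosing the middle element as the pivot and building new lists are simplifications assumed here for readability; practical variants partition the array in place.

```python
def quicksort(items):
    """Return a sorted copy of items using a three-way split around a pivot."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]  # assumed pivot choice: the middle element
    less = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    greater = [x for x in items if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([3, 6, 1, 8, 2, 9, 2]))
```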
multikey Quicksort
Pick an element from the array (the pivot). Consider the first character (key) of the string (multikey). Partition the remaining elements into three sets: those whose corresponding character is less than, equal to, and greater than the pivot's character. Recursively sort the "less than" and "greater than" partitions on the same character. Recursively sort the "equal to" partition by the next character (key).
8 queens
Place eight chess queens on an 8 × 8 board such that no queen can attack another. Efficiently find all possible placements.
fully dynamic graph problem
Problem where the update operations include unrestricted insertions and deletions of edges.
Recursion
Programming technique in which a method calls itself. Efficiency: calling a method involves some overhead; working with the local memory stack could improve efficiency; running out of stack space (stack overflow) is a danger; most recursive problems can be solved using iteration (for, while, etc.).
TreeMap underlying Structure:
RBT
Fisher-Yates shuffle
Randomly permute N elements by exchanging each element ei with a random element from i to N. It consumes Θ(N log N) bits and runs in linear time.
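A sketch of the exchange loop, using 0-based indices. The `rng` parameter is an assumption added so a seeded `random.Random` can be passed for repeatable runs.

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Shuffle a list in place and return it."""
    n = len(items)
    for i in range(n - 1):
        j = rng.randrange(i, n)  # random index from i to n-1 inclusive
        items[i], items[j] = items[j], items[i]
    return items

print(fisher_yates_shuffle(list(range(10))))
```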
heapify
Rearrange a heap to maintain the heap property, that is, the key of the root node is more extreme (greater or less) than or equal to the keys of its children. If the root node's key is not more extreme, swap it with the most extreme child key, then recursively heapify that child's subtree. The child subtrees must be heaps to start.
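For an array-backed max-heap, this step might be sketched as follows. The array layout (children of index i at 2i+1 and 2i+2) and the choice of a max-heap are assumptions for illustration.

```python
def heapify(heap, root=0, size=None):
    """Sift heap[root] down, assuming both child subtrees are already max-heaps."""
    if size is None:
        size = len(heap)
    largest = root
    left, right = 2 * root + 1, 2 * root + 2
    if left < size and heap[left] > heap[largest]:
        largest = left
    if right < size and heap[right] > heap[largest]:
        largest = right
    if largest != root:
        # Swap with the most extreme child, then repair that child's subtree.
        heap[root], heap[largest] = heap[largest], heap[root]
        heapify(heap, largest, size)

h = [1, 9, 8, 4, 5, 6, 7]
heapify(h)
print(h)
```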
random search
Repeatedly check items in an array at random until successful.
Combinations
Repetition is Allowed: such as coins in your pocket (5,5,5,10,10) No Repetition: such as lottery numbers (2,14,15,27,30,33) https://www.mathsisfun.com/combinatorics/combinations-permutations.html
Fibonaccian search
Search a sorted array by narrowing possible locations to progressively smaller intervals. Begin with two Fibonacci numbers, p (F(n)) and q (F(n+1)), such that p < n ≤ q, where n is the size of the array. The first step checks location p. The size of the next interval is p, if the key is less than the item at that location, or q-p (F(n-1)) if it is greater.
Direct-access table: search
Search: O(1)
Hash tables: search, insert, delete
Search: O(1) to O(n) Insert: O(1) to O(n) Delete: O(1) to O(n) (O(1) on average, O(n) in the worst case when many keys collide)
Sharding
Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards.
Parameter Passing
Small, no modification: pass by value. Large, no modification: pass by const reference. Modified: pass by pointer.
divide and conquer
Solve a problem, either directly because solving that instance is easy (typically, because the instance is small) or by dividing it into two or more smaller instances. Each of these smaller instances is recursively solved, and the solutions are combined to produce a solution for the original instance.
Munkres' assignment algorithm
Solve the assignment problem in polynomial time by marking and unmarking entries and covering and uncovering rows and columns.
Insertion Sort
Stable, in place sort with an order of growth which is between N and N-squared, needs only one spot of extra space and is dependent on the order of the items. Works by scanning over the list, then inserting the current item to the front of the list where it would fit sequentially. All the items to the left of the list will be sorted, but may not be in their final place as the larger items are continuously pushed back to make room for smaller items if necessary.
What Are The Three Linear Structures That Are Similar To Arrays, But Differ In How They Add And Remove Items?
Stacks, queues, and deques
LSD
String/character sort from right to left
dining philosophers
Suppose a number of philosophers surround a dining table. Adjacent philosophers share one fork. They spend time thinking or trying to eat. A philosopher must have both the fork on the left and the fork on the right to eat. Clearly adjacent philosophers cannot eat at the same time. The problem is to find an algorithm for taking forks that prevents deadlock, starvation, etc.
Big Omega
T(n) = Ω(f(n)) if ∃ positive constants c and n₀ such that T(n) ≥ c · f(n) for all n ≥ n₀
Hash Function
Takes in data, does some calculations, and determines where the data goes in the table. It is an O(1) operation to store and search. Determined by hashCode().
balance
The (weight) balance of a tree is the number of leaves of the left subtree of a tree, denoted |Tl|, divided by the total number of leaves of the tree. Formally, ρ(T) = |Tl|/|T|.
Table Size(TS)
The Array's Length
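What Is Big Theta Notation?
A tight asymptotic bound: f(n) = Θ(g(n)) means f(n) is within a constant multiple of g(n), both above and below, for large n. It is sometimes loosely described as the average-case scenario of the code written, but it is a bound on growth rate, not a statement about typical inputs.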
NP
The complexity class of decision problems for which answers can be checked by an algorithm whose run time is polynomial in the size of the input. Note that this doesn't require or imply that an answer can be found quickly, only that any claimed solution can be verified quickly. "NP" is the class that a Nondeterministic Turing machine accepts in Polynomial time.
root
The distinguished initial or fundamental item of a tree. The only item which has no parent. See the figure at tree.
worst-case execution time
The execution time of an algorithm in the worst case.
randomized complexity
The expected running time of the best possible randomized algorithm over the worst input.
least common multiple
The least integer which is a multiple of given integers. For instance, LCM(6, 10) = 30.
Length
The length of a path or a cycle is its number of edges.
depth of a vertex v
The length of the simple path from the root to v
probe sequence
The list of locations which a method for open addressing produces as alternatives in case of a collision.
height
The maximum distance of any node from the root. If a tree has only one node (the root), the height is zero. The height of an empty tree is not defined.
diameter
The maximum of the distances between all possible pairs of vertices of a graph.
chromatic number
The minimum number of colors needed to color the vertices of a graph such that no two adjacent vertices have the same color.
kth shortest path
The problem of finding the kth shortest path from one vertex in a graph to another vertex. Variants may require that paths are edge- or vertex-disjoint, that is sharing no edges or vertices. "Shortest" may be least number of edges, least total weight, etc.
maximum-flow problem
The problem of finding the maximum flow between any two vertices of a directed graph.
alphabet
The set of all possible symbols in an application. For instance, input characters used by a finite state machine, letters making up strings in a language, or symbols in a pattern element. In some cases, an alphabet may be infinite.
simplex
The simplest N-dimensional polytope. The generalization of a triangle (2D) or tetrahedron (3D).
Euclidean distance
The straight line distance between two points. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is √((x1 - x2)² + (y1 - y2)²).
uniform circuit complexity
The study of complexity classes defined by uniform circuit families.
midrange
The sum of the minimum and maximum values, divided by two.
clustering
The tendency for entries in a hash table using open addressing to be stored together, even when the table has ample empty space to spread them out.
secondary clustering
The tendency for some collision resolution schemes to create long runs of filled slots away from a key's hash position, e.g., along the probe sequence.
median
The value which has an equal number of values greater and less than it. For an even number of values, it is the mean of the two middle values.
mode
The value which occurs most often. If no value is repeated, there is no mode. If more than one value occurs with the same greatest frequency, each value is a mode.
What is Bubble Sort?
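To bubble sort, compare the first element with the second and swap them if they are out of order. Then compare the 2nd and 3rd, then the 3rd and 4th, and so on to the end of the array; after this pass the largest element is in its final position. Each later pass stops one position earlier, so with four elements the second pass only compares up to the 3rd.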
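The passes can be sketched as below. The early exit when a pass makes no swaps is a common optimization added here as an assumption; it is not part of the description.

```python
def bubble_sort(items):
    """Sort a list in place by repeatedly swapping adjacent out-of-order pairs."""
    for end in range(len(items) - 1, 0, -1):
        swapped = False
        for i in range(end):  # each pass stops one position earlier
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        if not swapped:       # no swaps means the list is already sorted
            break
    return items

print(bubble_sort([5, 1, 4, 2, 8]))
```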
What is the process of going through all the nodes in a linked list?
Traversing
edge crossing
Two different edges cross in a graph drawing if their geometric representations intersect. The number of crossings in a graph drawing is the number of pairs of edges which cross.
fractional solution
Typically, a solution to a relaxation of an optimization problem.
Heap Sort
Unstable, O(n log n), Ω(n log n): Make a heap, take everything out.
MSD Radix Sort
Used to sort an array of strings based on their first character. Is done recursively and can sort strings which are of different lengths. This algorithm will be slower than its counterpart if used for sets of strings which all have the same length. Has a time complexity of 2W(N+R).
Separate Chaining
Uses a linked list to handle collisions at a specific point.
Insertion & Quick Sort
Using both algorithms together is more efficient, since quicksort's O(n log n) behavior pays off only on large arrays; insertion sort handles the small partitions.
cellular automaton
Usually a two-dimensional organization of simple finite state machines whose next state depends on their own state and the states of their eight closest neighbors. In general the machines may be arranged in meshes of higher or lower dimension, have larger neighborhoods, or be arbitrarily complex processors.
What Are The Two Parts Of The Node Class In A Linked List?
Value and Pointer
treesort (2)
Variants of heapsort.
What is the difference when using memoization in recursive calls, for example with Fibonacci?
We compute each value once, store it, and return the stored value on later calls, rather than computing it from scratch.
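The contrast can be seen in a short sketch. Here `functools.lru_cache` from the standard library stands in for a hand-written cache; the function names are illustrative.

```python
from functools import lru_cache

def fib_naive(n):
    """Recomputes the same subproblems many times: exponential time."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)

@lru_cache(maxsize=None)
def fib_memo(n):
    """Each value is computed once and then served from the cache: linear time."""
    return n if n < 2 else fib_memo(n - 1) + fib_memo(n - 2)

print(fib_naive(10), fib_memo(50))
```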
What Is Selection Sort?
We first find the minimum value in the array and swap it into the first position, then repeat on the remainder. The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that occupy the rest of the list.
reachable
We say that a vertex w is reachable from a vertex v if there exists a directed path from v to w.
Memoization
What happens when a sub problem's solution is found during the process of Dynamic Programming. The solution is stored for future use, so that it may be reused for larger problems which contain this same subproblem. This helps to decrease run time.
Infix Expression
When a binary operator is between expressions
Binary Search
Search a sorted array by repeatedly discarding the half that cannot contain the value you are searching for, which rids you of a large portion of the array and increases search efficiency. Look at the middle value each time and decide on which side of it the value must lie, then continue in that half.
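An iterative sketch of the halving step, assuming an ascending sorted list and the convention of returning -1 when the value is absent:

```python
def binary_search(sorted_items, target):
    """Return an index of target in sorted_items, or -1 if it is not present."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11], 7))
```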
O-notation
asymptotic upper bound
Algorithm Analysis
how long it takes a computer to do something
window
substring of the text that is aligned with the pattern.
Hash Function
takes a search key and produces the integer index of an element in the hash table. This array element is where you would either store or look for the value associated with a search key.
Hash function
takes an object and tells you where to put it.
purely functional language
A language that does not allow any destructive operation---one which overwrites data---such as the assignment operation. Purely functional languages are free of side effects, i.e., invoking a function has no effect other than computing the value returned by the function.
1D Array
A linear collection of data items in a program, all of the same type, such as an array of integers or an array of strings, stored in contiguous memory, and easily accessed using a process called indexing.
Heap Binary Tree: definition
A binary tree with two additional constraints: Shape - complete tree Heap property - max/min heap
Non-Linear Data Structure
A data structure that does not occupy contiguous memory, such as a linked list, graph, or tree.
Fibonacci heap
A heap made of a forest of trees. The amortized cost of the operations create, insert a value, decrease a value, find minimum, and merge or join (meld) two heaps, is a constant Θ(1). The delete operation takes O(log n).
Sparse Graph
A graph in which the number of edges is close to the minimal number of edges. Sparse graphs can be disconnected.
dense graph
A graph in which the number of edges is close to the possible number of edges.
sparse graph
A graph in which the number of edges is much less than the possible number of edges.
connected graph
A graph is connected if there is a path from every vertex to every other vertex.
connected graph
A graph such that for all vertices u and v, there exists a path from u to v.
planar graph
A graph that can be drawn in the plane with no crossing edges.
directed graph
A graph whose edges are ordered pairs of vertices. That is, each edge can be followed from one vertex to another vertex.
undirected graph
A graph whose edges are unordered pairs of vertices. That is, each edge connects two vertices.
digraph
A graph whose every edge is directed
hypergraph
A graph whose hyperedges connect two or more vertices.
subgraph
A graph whose vertices and edges are subsets of another graph.
sparse graph
A graph with few edges relative to the number of vertices
recursively enumerable language
A language accepted by a Turing machine.
Topological Sort
A linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.
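One way to compute such an ordering is Kahn's algorithm, sketched below. The adjacency-list dictionary format, the name `topological_sort`, and raising `ValueError` on a cycle are assumptions made for this illustration.

```python
from collections import deque

def topological_sort(graph):
    """Return the vertices ordered so that u precedes v for every edge u -> v."""
    indegree = {u: 0 for u in graph}
    for successors in graph.values():
        for v in successors:
            indegree[v] = indegree.get(v, 0) + 1
    ready = deque(u for u, d in indegree.items() if d == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in graph.get(u, ()):
            indegree[v] -= 1
            if indegree[v] == 0:  # all predecessors of v have been placed
                ready.append(v)
    if len(order) != len(indegree):
        raise ValueError("graph has a cycle; no topological order exists")
    return order

print(topological_sort({"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}))
```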
quad trie
A tree in which each node is split according to some subset of the key, typically a character.
order-preserving Huffman coding
A variable-length character coding based on the frequency of each character. The algorithm is similar to Huffman coding, but the trees are kept in the same order as the characters. Two adjacent trees with the least combined frequency are joined as subtrees of a new root. As with Huffman coding, that new tree is assigned the sum of the subtrees' frequencies. Repeat until all characters are in one tree.
Fibonacci tree
A variant of a binary tree where a tree of order n (n > 1) has a left subtree of order n-1 and a right subtree of order n-2. An order 0 Fibonacci tree has no nodes, and an order 1 tree has 1 node.
parental
A vertex with at least one child
leaf
A vertex with no children
Venn diagram
A visual depiction of membership in sets according to binary properties, using overlapping ovals to divide the plane into regions. Regions inside an oval have the property the oval represents, while regions outside it do not have the property. Regions are shaded to show combinations of properties (or sets) of interest, or elements are placed in regions corresponding to their properties (or membership).
Data Structure
A way of organizing data in a computer so that it can be used efficiently, such as an array, linked list, stack, queue, or binary tree.
binary tree representation of trees
A way to represent a multiway tree as a binary tree. The leftmost child, c, of a node, n, in the multiway tree is the left child, c', of the corresponding node, n', in the binary tree. The immediately right sibling of c is the right child of c'.
flow network
A weighted, directed graph with two specially marked nodes, the source s and the sink t, and a capacity function that maps edges to positive real numbers, u: E → R+.
Array: access, search, insert, delete
Access: O(1) Search: O(n) Insert: O(n) Delete: O(n)
Doubly Linked List: access, search, insert, delete
Access: O(n) Search: O(n) Insert: O(1) Delete: O(1)
What Is The Complexity Of A Queue?
Access: O(n) Search: O(n) Insert: O(1) Delete: O(1)
What Is Big O Complexity Of Stack
Access: O(n) Search: O(n) Insertion: O(1) Deletion: O(1)
What Is The Complexity Of A Single Linked List?
Access: O(n) Search: O(n) Insertion: O(1) Deletion: O(1)
What Is The Worst Code Complexity Of A Singly Linked List?
Access: O(n) Search: O(n) Insertion: O(1) Deletion: O(1)
What is The Average Code Complexity Of A Singly Linked List?
Access: O(n) Search: O(n) Insertion: O(1) Deletion: O(1)
Notion of Partitioning
Action of dividing into groups depending on key value. The underlying mechanism of quicksort. Runs in O(N), typically with fewer swaps than comparisons.
Red Black Tree: advantage, disadvantage
Advantage: quick insert, delete, and search Disadvantage: complex implementation
Doubly Linked List: advantage, disadvantage
Advantage: quick insert, quick delete Disadvantage: slow search
greedy algorithm
An algorithm that always takes the best immediate, or local, solution while finding an answer. Greedy algorithms find the overall, or globally, optimal solution for some optimization problems, but may find less-than-optimal solutions for some instances of other problems.
brute force
An algorithm that inefficiently solves a problem, often by trying every one of a wide range of possible solutions.
external memory algorithm
An algorithm that is efficient when accessing most of the data is very slow, such as, on disk.
on-line algorithm
An algorithm that must process each input in turn, without detailed knowledge of future inputs.
heuristic
An algorithm that usually, but not always, works or that gives nearly the right answer.
metaphone
An algorithm to code English words phonetically by reducing them to 16 consonant sounds. A better variant is double metaphone.
soundex
An algorithm to code surnames phonetically by reducing them to the first letter and up to three digits, where each digit is one of six consonant sounds. This reduces matching problems from different spellings.
Euclid's algorithm
An algorithm to compute the greatest common divisor of two positive integers. It is Euclid(a,b){if (b=0) then return a; else return Euclid(b, a mod b);}. The run time complexity is O((log a)(log b)) bit operations.
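A minimal Python sketch of the recursion on this card. The function name `euclid` follows the card; the `b == 0` test is the equality check that the card writes as `b=0`.

```python
def euclid(a, b):
    """Return the greatest common divisor of two non-negative integers."""
    # Base case: gcd(a, 0) is a.
    if b == 0:
        return a
    # Recursive step: gcd(a, b) equals gcd(b, a mod b).
    return euclid(b, a % b)


print(euclid(48, 18))  # 6
```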
Viterbi algorithm
An algorithm to compute the optimal (most likely) state sequence in a hidden Markov model given a sequence of observed outputs.
fast fourier transform
An algorithm to convert a set of uniformly spaced points from the time domain to the frequency domain.
brute force string search
An algorithm to find a string within another string or body of text by trying each position one at a time. There are many far faster string matching algorithms.
sieve of Eratosthenes
An algorithm to find all prime numbers up to a certain N. Begin with an (unmarked) array of integers from 2 to N. The first unmarked integer, 2, is the first prime. Mark every multiple of this prime. Repeatedly take the next unmarked integer as the next prime and mark every multiple of the prime.
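The marking procedure can be sketched as follows. The function name `sieve_of_eratosthenes` and the boolean `marked` list are illustrative choices, not part of the card.

```python
def sieve_of_eratosthenes(n):
    """Return all primes up to and including n."""
    if n < 2:
        return []
    # marked[i] is True once i is known to be a multiple of a smaller prime.
    marked = [False] * (n + 1)
    primes = []
    for candidate in range(2, n + 1):
        if not marked[candidate]:
            # The next unmarked integer is the next prime.
            primes.append(candidate)
            # Mark every larger multiple of this prime.
            for multiple in range(2 * candidate, n + 1, candidate):
                marked[multiple] = True
    return primes


print(sieve_of_eratosthenes(30))  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```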
Baum Welch algorithm
An algorithm to find hidden Markov model parameters A, B, and Π with the maximum likelihood of generating the given symbol sequence in the observation vector.
Zeller's congruence
An algorithm to find the day of the week for any date.
Doomsday rule
An algorithm to find the day of the week for any date. It is simple enough to memorize and do mentally.
extended Euclid's algorithm
An algorithm to find the greatest common divisor, g, of two positive integers, a and b, and coefficients, h and j, such that g = ha + jb.
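One common recursive formulation, shown as a sketch. The function name `extended_euclid` is hypothetical; the returned coefficients follow the card's letters g, h, and j.

```python
def extended_euclid(a, b):
    """Return (g, h, j) with g = gcd(a, b) and g == h*a + j*b."""
    if b == 0:
        # gcd(a, 0) = a = 1*a + 0*b
        return a, 1, 0
    g, h1, j1 = extended_euclid(b, a % b)
    # g = h1*b + j1*(a % b) and a % b = a - (a // b)*b,
    # so g = j1*a + (h1 - (a // b)*j1)*b.
    return g, j1, h1 - (a // b) * j1


g, h, j = extended_euclid(240, 46)
print(g, h * 240 + j * 46)  # 2 2
```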
shadow merge
An algorithm to merge two heaps by concatenating the smaller heap to the larger, then reordering just the concatenated nodes and their parents to restore the heap property.
approximation algorithm
An algorithm to solve an optimization problem that runs in polynomial time in the length of the input and outputs a solution that is guaranteed to be close to the optimal solution. "Close" has some well-defined sense called the performance guarantee.
Ragged Array
An array where the number of columns in each row may be different.
sorted array
An array whose items are kept sorted, often so searching is faster.
What Are The Methods Used For A Queue?
The constructor creates and returns a new, empty queue. Enqueue(item) adds a new item to the rear of the queue and returns nothing. Dequeue() removes the front item from the queue and returns it. isEmpty() returns a boolean. size() returns the number of items in the queue.
list contraction
Contracting a list by removing some of the items.
select and partition
Given an array A of n elements and a positive integer k ≤ n, find the kth smallest element of A and partition the array such that A[1], ..., A[k-1] ≤ A[k] ≤ A[k+1], ..., A[n].
towers of Hanoi
Given three posts (towers) and n disks of decreasing sizes, move the disks from one post to another one at a time without putting a larger disk on a smaller one. The minimum is 2^n - 1 moves. The "ancient legend" was invented by De Parville in 1884.
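A sketch of the standard recursive solution, which uses the minimum of 2^n - 1 moves. The function name `hanoi` and the post labels "A", "B", "C" are illustrative.

```python
def hanoi(n, source, target, spare, moves):
    """Append to `moves` the (from, to) steps that move n disks from source to target."""
    if n == 0:
        return
    # Move the n-1 smaller disks out of the way, onto the spare post.
    hanoi(n - 1, source, spare, target, moves)
    # Move the largest remaining disk directly.
    moves.append((source, target))
    # Move the n-1 smaller disks on top of it.
    hanoi(n - 1, spare, target, source, moves)


moves = []
hanoi(3, "A", "C", "B", moves)
print(len(moves))  # 7
```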
Topological Sort
Linear ordering of the vertices of a directed graph such that for every directed edge "uv" which connects "u" to "v" (u points to v), u comes before v. This ordering is possible if and only if the graph has no directed cycles; that is, it must be a DAG (directed acyclic graph).
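One way to compute such an ordering is Kahn's algorithm, sketched below. The card does not name a method, so this is a swapped-in technique. The graph is assumed to be a dict mapping each vertex to a list of its successors; `topological_sort` is a hypothetical name.

```python
from collections import deque


def topological_sort(graph):
    """Return a topological order of `graph`, or raise ValueError on a cycle."""
    # Count incoming edges for every vertex.
    in_degree = {v: 0 for v in graph}
    for successors in graph.values():
        for v in successors:
            in_degree[v] = in_degree.get(v, 0) + 1
    # Vertices with no incoming edges can be placed first.
    ready = deque(v for v, d in in_degree.items() if d == 0)
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        # Removing u frees any successor whose last incoming edge came from u.
        for v in graph.get(u, []):
            in_degree[v] -= 1
            if in_degree[v] == 0:
                ready.append(v)
    if len(order) != len(in_degree):
        raise ValueError("graph has a directed cycle; no topological order exists")
    return order
```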
secant search
Search a sorted array by estimating the next position to check based on the values at the two previous positions checked.
transpose sequential search
Search an array or list by checking items one at a time. If the value is found, swap it with its predecessor so it is found faster next time.
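A short sketch of this search on a Python list. The name `transpose_search` and the convention of returning the item's new index (or -1 when absent) are assumptions made for illustration.

```python
def transpose_search(items, target):
    """Linear search; on a hit, swap the item one position toward the front."""
    for i, value in enumerate(items):
        if value == target:
            if i > 0:
                # Swap with the predecessor so the next search finds it sooner.
                items[i - 1], items[i] = items[i], items[i - 1]
                return i - 1
            return i
    return -1


data = [1, 2, 3, 4]
print(transpose_search(data, 3), data)  # 1 [1, 3, 2, 4]
```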
Trie
Search tree, but with a child position for each character in the alphabet. Think spell-checking.
Binary Search Tree: search, insert, delete
Search: O(h) / balanced, O(lg n) Insert: O(h) / balanced, O(lg n) Delete: O(h) / balanced, O(lg n)
What Is The Worst Case Time Complexity Of A Hash Table?
Search: O(n) Insert: O(n) Delete: O(n) Access: O(n)
first come, first served
See first-in, first-out.
MODIFIND
Select the kth smallest element of an array and partition the array around it. Partition around the value of the kth element. If the partition boundary is not at k, repeat in the partition that includes k.
triangle inequality
The property that a complete weighted graph satisfies weight(u,v) ≤ weight(u,w) + weight(w,v) for all vertices u, v, w. Informally, the graph has no short cuts.
union
The union of two sets is a set having all members in either set.
ArraySet
public ArraySet(int maxSize) {contents = (T[]) (new Object[maxSize]); currentSize = 0; this.maxSize = maxSize; }
Zipf's law
The probability of occurrence of words or other items starts high and tapers off. Thus, a few occur very often while many others occur rarely.
graph drawing
The problem of representing a graph in a plane "neatly," for instance with a minimum number of edge crossings.
Pre-Order Traversal
The process of systematically visiting every node in a tree once, starting with the root node, proceeding to the left along the tree and accessing the node when the "left" side of the node is encountered.
Binary Tree Traversal
The process of systematically visiting every node in a tree once. The three most common traversals are: pre-order, in-order, and post-order.
Doubly linkedlist
Can be traversed in reverse order. Has 2 references in each node. Increased storage overhead. Insertion of a node in the middle of the list requires modifications to both the previous node and the next node.
What methods for implementing a double linked list?
Traverse, Insert, Remove, and isEmpty
homeomorphic
Two graphs are homeomorphic if they can be made isomorphic by inserting new vertices of degree 2 into edges.
isomorphic
Two graphs are isomorphic if there is a one-to-one correspondence between their vertices and there is an edge between two vertices of one graph if and only if there is an edge between the two corresponding vertices in the other graph.
Sequential Search
Use to search a chain of linked nodes. When used with array, objects must have a defined .equals method.
Chaining
Using a linear linked structure or an array to model buckets/cells. The bucket would contain an element and a pointer to the next element in the same bucket. If it's an array, use an overflow area to store the data. Treats the table as a table of collections rather than a table of elements.
random sampling
Using a randomly selected sample of the data to help solve a problem on the whole data.
Full and Complete Binary Trees
When a binary tree of height h has all of its leaves at level h and every nonleaf parent has exactly two children, the tree is said to be full.
ArrayStack (unbounded)
When the array reaches capacity, create a new, larger array, copy the items over, and update the reference. When pushing at full capacity, call extendCapacity() first. push is O(n) in the worst case because of extending capacity.
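A sketch of the grow-and-copy idea. Python lists already resize themselves, so a fixed-length list stands in for a raw array here. The class layout, the `_extend_capacity` helper, and the doubling policy are assumptions; with doubling, a push that triggers a copy costs O(n), but pushes are amortized O(1).

```python
class ArrayStack:
    """Unbounded stack backed by a fixed-length list that is replaced when full."""

    def __init__(self, capacity=2):
        self._data = [None] * max(1, capacity)
        self._size = 0

    def _extend_capacity(self):
        # Allocate a list twice as large and copy the existing items over.
        bigger = [None] * (2 * len(self._data))
        for i in range(self._size):
            bigger[i] = self._data[i]
        self._data = bigger

    def push(self, item):
        if self._size == len(self._data):
            self._extend_capacity()
        self._data[self._size] = item
        self._size += 1

    def pop(self):
        if self._size == 0:
            raise IndexError("pop from empty stack")
        self._size -= 1
        item = self._data[self._size]
        self._data[self._size] = None  # drop the reference
        return item

    def size(self):
        return self._size
```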
Binary Tree
When each node has at most two children.
Collisions:
When the Hash Function returns the same index for different keys.
Directed Graph
When the edges have direction the graph is directed. Also called a digraph. Graphs without directed edges are undirected.
polynomial time
When the execution time of a computation, m(n), is no more than a polynomial function of the problem size, n. More formally m(n) = O(n^k) where k is a constant.
Doubly Linked Chain
When the nodes can reference the previous node as well as the next node in a chain.
static
When the problem domain does not change.
Circular Array
When the queue reaches the end of the array, we can add entries to the queue at the beginning of the array where there are empty slots.
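The wrap-around can be sketched with modular index arithmetic. The class name `CircularQueue`, the fixed capacity, and the choice to raise an error when full are assumptions for illustration.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue stored in a circular array."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._front = 0  # index of the front item
        self._size = 0   # number of stored items

    def enqueue(self, item):
        if self._size == len(self._data):
            raise OverflowError("queue is full")
        # The rear slot wraps to the start of the array when it passes the end.
        rear = (self._front + self._size) % len(self._data)
        self._data[rear] = item
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        item = self._data[self._front]
        self._data[self._front] = None
        self._front = (self._front + 1) % len(self._data)
        self._size -= 1
        return item

    def is_empty(self):
        return self._size == 0
```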
Why should you create a preorder function outside the binary tree class?
You very rarely just want to traverse a tree.
Subgraph
a portion of a graph that is itself a graph.
A path from vertex u to vertex v
a sequence of adjacent vertices that starts with u and ends with v
If we created a binary tree class what would the methods and attributes be?
class BinaryTree(object): constructor(rootObj): insertLeft(newNode): insertRight(newNode): getRightChild getLeftChild getRootVal Attributes are key, leftChild, and rightChild
What Are The Methods We Should Use In Creating A Doubly Linked List
class DoublyLinkedList(object): def __init__(self, value): self.value = value; self.next_node = None; self.prev_node = None
What does the node class in a binary tree have?
class Node{ int value; Node left; Node right; }
What should be in your vertex class for implementing a graph as a adjacency list.
class Vertex: constructor(key) id = key connectedTo = {} addNeighbor(nbr, weight=0) __str__ getConnections getId
Generic Class
class which stores, operates on and manages objects whose type is not specified until the class is instantiated
What should be in your graph class?
constructor() creates a new empty graph. addVertex(vert) adds an instance of Vertex to the graph. addEdge(fromVert, toVert) addEdge(fromVert, toVert, weight) getVertex(vertKey), getVertices(), in
What are the operations we need to create a binary heap?
constructor() creates a new empty binary heap. insert(k) adds a new item to the heap. findMin() returns the item with the minimum key value. delMin() returns the item with the minimum key value, removing it. isEmpty(), size(). buildHeap(list) builds a new heap from a list of keys.
What are the methods in a tree node for a binary search tree?
constructor(key, val, left, right, parent) sets all the values hasRightChild() hasLeftChild() isLeftChild() isRightChild() isRoot() isLeaf() hasAnyChildren() hasBothChildren() replaceNodeData(key,value,lc,rc)
What are the methods and parameters for the node class of a singly linked list?
constructor(value) self.value = value self.nextnode = None
Interface
contains a skeleton for public operations - No real code for each method - List of all public methods that specifies the ADT
Load Factor
#items(n) / table size
Post-order traversal
1. Traverse the left subtree. 2. Traverse the right subtree. 3. Visit the root
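The three steps translate directly into a recursive function. The `Node` class and the `visit` callback below are hypothetical helpers for the example.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def post_order(node, visit):
    """Visit the left subtree, then the right subtree, then the node itself."""
    if node is None:
        return
    post_order(node.left, visit)
    post_order(node.right, visit)
    visit(node.value)


root = Node(1, Node(2, Node(4), Node(5)), Node(3))
visited = []
post_order(root, visited.append)
print(visited)  # [4, 5, 2, 3, 1]
```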
transitive
A binary relation R for which a R b and b R c implies a R c.
balanced binary search tree
A binary search tree that is balanced.
Graph
A collection of distinct vertices and distinct edges
forest
A collection of one or more trees.
tree
A connected, acyclic graph
Dictionary: definition
A data structure that maps keys to values.
spatial access method
A data structure to search for lines, polygons, etc.
Connected Graph
A graph that has a path between every pair of distinct vertices.
three-way merge sort
A k-way merge sort which uses three input and three output streams.
linear quadtree
A quadtree implemented as a single array of nodes.
superset
A set S1 is a superset of another set S2 if every element in S2 is in S1. S1 may have elements which are not in S2.
procedure
A subroutine which does not return a value.
tree
A tree is an acyclic connected graph.
supersink
A vertex of a directed graph which is reachable from all other vertices.
subtree of T rooted at v
All descendants of a vertex v
dictionary
An abstract data type storing items, or values. A value is accessed by an associated key. Basic operations are new, insert, find and delete.
saturated edge
An edge in a flow network which has the maximum possible flow.
breadth-first search
Any search algorithm that considers neighbors of a vertex, that is, outgoing edges of the vertex's predecessor in the search, before any outgoing edges of the vertex. Extremes are searched last. This is typically implemented with a queue.
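A minimal sketch of the queue-based implementation mentioned above, assuming the graph is a dict that maps each vertex to a list of neighbors. The name `breadth_first_order` is illustrative.

```python
from collections import deque


def breadth_first_order(graph, start):
    """Return the vertices reachable from `start` in breadth-first order."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        u = queue.popleft()
        order.append(u)
        # All neighbors of u are queued before any of their own neighbors.
        for v in graph.get(u, []):
            if v not in visited:
                visited.add(v)
                queue.append(v)
    return order


graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(breadth_first_order(graph, "a"))  # ['a', 'b', 'c', 'd']
```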
external sort
Any sort algorithm that uses external memory, such as tape or disk, during the sort. Since most common sort algorithms assume high-speed random access to all intermediate memory, they are unsuitable if the values to be sorted don't fit in main memory.
distribution sort
Any sort algorithm where items are distributed from the input to multiple intermediate structures, which are then gathered and placed on the output.
internal sort
Any sort algorithm which uses exclusively main memory during the sort. This assumes high-speed random access to all memory.
subsequence
Any string that can be obtained by deleting zero or more symbols from a given string.
Aggregate Data Types
Any type of data that can be referenced as a single entity, and yet consists of more than one piece of data, like strings, arrays, classes, and other complex structures.
Splay Tree
Any valid BST. Amortized O(log n) access: m operations take O(m log n) time in total when m is large. Any node that is inserted, removed, or accessed gets splayed to the root.
sort
Arrange items in a predetermined order. There are dozens of algorithms, the choice of which depends on factors such as the number of items relative to working memory, knowledge of the orderliness of the items or the range of the keys, the cost of comparing keys vs. the cost of moving items, etc. Most algorithms can be implemented as an in-place sort, and many can be implemented so they are stable, too.
Heap-sort: definition
Array size doesn't change, but heap size does Take off bottom, reshuffle, repeat Less efficient than max-heapify because it sorts from the top instead of the bottom
Circular queue
Array that conceptually wraps around itself. No real front or end. Keep track of front and rear
What Are The Different Implementations Of A Stack
Array, and LinkedList
Dijkstra's Method
Calculates the shortest paths from a single source to all vertices using a priority queue, or a heap. Checks the "frontier" based on cost. The distance to any node is known once it has been "visited".
axiomatic semantics
Defining the behavior of an abstract data type with axioms.
model checking
Efficiently deciding whether a temporal logic formula is satisfied in a finite state machine model.
Unordered List
Elements are added in the position relative to other elements on the list. e.g. add element after y, before y, front or rear.
Depth-First Search
Explore the newest unexplored vertices first. Place discovered vertices in a stack (or use recursion). Partitions edges into two classes: tree edges and back edges. Tree edges discover new vertices; back edges lead to ancestors.
External Sorting
External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead they must reside in the slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are combined into a single larger file. Mergesort is typically preferred.
Trie
Has only part of a key for comparison at each node.
Weighted graph
Has values on its edges.
1-based indexing
Indexing (an array) beginning with 1.
orthogonal lists
Lists that share items, but are structurally independent.
Node
Location of element (in a tree)
search
Look for a value or item in a data structure. There are dozens of algorithms, data structures, and approaches.
What Makes A Hash Table Powerful?
Look up of associated values in constant time.
Standard data structure for solving complex bit manipulation
Lookup table
What is linear search?
Looping through each item in an array one by one.
Height of tree
Number of levels in a tree
nondeterministic
Permitting more than one choice of next move at some step in a computation.
Preemption
Preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such a change is known as a context switch.
We visit the root node first, then recursively do a ____ of the left subtree, followed by a recursive ____ of the right subtree
Preorder traversal
Prim's
Same overall algorithm as Dijkstra's except that it only considers lowest cost of single edge. Continually builds onto a tree with the cheapest cost edges.
automaton
See finite state machine or cellular automaton.
What Is a Hashset?
A collection in which elements must be unique. Unlike a linked list, it is backed by a hash table and keeps no positional order.
implication
The boolean implies function.
transition
The change from one state to another in a finite state machine. Analogously, an edge in a directed graph.
ZPP
The class of languages for which a membership computation by a probabilistic Turing machine halts in polynomial time with no false acceptances or rejections, but randomly some "I don't know" answers. "ZPP" means "Zero error Probability in Polynomial" time.
BPP
The class of languages for which a membership computation by a probabilistic Turing machine halts in polynomial time with the right answer (accept or reject) at least 2/3 of the time. "BPP" means "Bounded error Probability in Polynomial" time.
polynomial hierarchy
The classes of languages accepted by k-alternating Turing machines, over all k ≥ 0 and with initial state existential or universal. The bottom level (k=0) is the class P. The next level (k=1) comprises NP and co-NP.
local alignment
The detection of local similarities among two or more strings.
difference
The difference of set A minus set B is a set having all the members which are in A, but not in B.
Manhattan distance
The distance between two points measured along axes at right angles. In a plane with p1 at (x1, y1) and p2 at (x2, y2), it is |x1 - x2| + |y1 - y2|.
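The formula as a small Python function. Generalizing from the plane to points with any number of coordinates is an assumption beyond the card; `manhattan_distance` is a hypothetical name.

```python
def manhattan_distance(p1, p2):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(p1, p2))


print(manhattan_distance((1, 2), (4, 6)))  # 7
```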
offset
The distance from the beginning of a string to the end of a segment in that string.
capacity
The maximum flow that may be sent through an edge or a vertex.
relative performance guarantee
The maximum ratio by which the result of a ρ-approximation algorithm may depart from the optimal solution.
time/space complexity
The maximum time or space required by a Turing machine on any input of length n.
What are methods to implement a hash table?
get(K key), getSize(), add(), remove(), isEmpty()
What do you call a node with no children?
leaf node
What attributes should we have?
n the size, capacity, and A.
How do you look up a value within the hash table?
return Table[Hash(key)];
What are the three parts of a tree data structure?
root, branches, and leaves
Each node is what compared to its children in a min_heap
smaller
height of a tree
the length of the longest simple path from the root to a leaf
Collision
when two elements map to the same cell/bucket
bipartite graph
A bipartite graph is a graph whose vertices we can divide into two sets such that all edges connect a vertex in one set with a vertex in the other set.
Tries
A collection of nodes, each of which can hold a key and a value- often the values will be null. The nodes will have a value attached to the last character of the string upon insertion, which apparently makes searching very easy. Very useful for searching keys.
Cycle
A cycle is a path (with at least one edge) whose first and last vertices are the same. A simple cycle is a cycle with no repeated edges or vertices (except the requisite repetition of the first and last vertices).
strongly connected digraph
A digraph is strongly connected if there is a directed path from every vertex to every other vertex.
directed cycle
A directed cycle is a directed path (with at least one edge) whose first and last vertices are the same.
directed path
A directed path in a digraph is a sequence of vertices in which there is a (directed) edge pointing from each vertex in the sequence to its successor in the sequence.
Depth First Search
A method which is used to traverse through a graph. Works by creating a stack of nodes to visit, which consists of all the nodes around your current position. You move to the next location and add the nodes surrounding it to the stack, making sure not to add any nodes you may have already visited. You repeat this pattern until you either reach the destination or a dead end. At a dead end, you would backtrack to the last node which still has unvisited neighbors. Time complexity of O(|V| + |E|).
spanning forest
A spanning forest of a graph is the union of the spanning trees of its connected components.
spanning tree
A spanning tree of a connected graph is a subgraph that contains all of that graph's vertices and is a single tree.
Counting Sort (Key Indexed sort)
An integer sorting algorithm which counts the number of objects that have each distinct key value, and then uses arithmetic on those counts to determine the position of each key value in the output array. It cannot handle large keys efficiently, and is often used as a subroutine for other sorting algorithms such as radix sort. Has a time complexity of O(N + k), where k is the range of key values.
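A sketch of the count-then-place procedure for non-negative integer keys. The function name `counting_sort` and the `max_key` parameter are assumptions; the caller is expected to know the largest key.

```python
def counting_sort(keys, max_key):
    """Sort integers in the range 0..max_key in O(N + k) time, k = max_key + 1."""
    # Pass 1: count how many times each key value occurs.
    counts = [0] * (max_key + 1)
    for k in keys:
        counts[k] += 1
    # Turn counts into starting positions in the output array.
    start = 0
    for k in range(max_key + 1):
        counts[k], start = start, start + counts[k]
    # Pass 2: place each key at its next free position.
    output = [None] * len(keys)
    for k in keys:
        output[counts[k]] = k
        counts[k] += 1
    return output


print(counting_sort([3, 1, 2, 3, 0], 3))  # [0, 1, 2, 3, 3]
```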
Concrete Examples
Analysis pattern. Manually solve concrete instances of the problem and then build a general solution
Case Analysis
Analysis pattern. Split the input/execution into a number of cases and solve each case in isolation
Dynamic Programming
Break down a problem into smaller and smaller subproblems. At their lowest levels, the subproblems are solved and their answers stored in memory. These saved answers are used again with other larger (sub)problems which may call for a recomputation of the same information for their own answer. Reusing the stored answers allows for optimization by combining the answers of previously solved subproblems.
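A small illustration of storing and reusing subproblem answers, using Fibonacci numbers as an assumed example problem. The `memo` dict plays the role of the answers stored in memory.

```python
def fib(n, memo=None):
    """Return the n-th Fibonacci number, caching each subproblem's answer."""
    if memo is None:
        memo = {}
    if n < 2:
        return n
    if n not in memo:
        # Each subproblem is solved once; later requests reuse the stored value.
        memo[n] = fib(n - 1, memo) + fib(n - 2, memo)
    return memo[n]


print(fib(30))  # 832040
```

Without the stored answers, the same recursion would recompute the same subproblems an exponential number of times.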
Dijkstra's Algorithm
Finds the shortest path with no negative weights given a source vertex. Tries to find the distance to all other vertices in the graph. It produces a shortest paths tree by initializing the distance of all other nodes to infinity, and then relaxes these distances step by step by iteratively adding vertices that don't already exist in the tree and have the lowest cost of distance to travel to them. Time complexity O(|E|+|V|log|V|)
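A sketch using Python's `heapq` module as the priority queue. With a binary heap like this one, the running time is O((|V| + |E|) log |V|); the O(|E| + |V| log |V|) bound on the card is the one achieved with a Fibonacci heap. The graph format (dict of vertex to list of `(neighbor, weight)` pairs) and the function name are assumptions.

```python
import heapq


def dijkstra(graph, source):
    """Return a dict of shortest distances from source; weights must be non-negative."""
    dist = {source: 0}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale entry; u was already finalized with a smaller distance
        done.add(u)
        for v, w in graph.get(u, []):
            nd = d + w
            # Relax the edge (u, v) if it gives a shorter path to v.
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


graph = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)], "b": [("c", 3)], "c": []}
print(dijkstra(graph, "s"))  # {'s': 0, 'a': 1, 'b': 3, 'c': 6}
```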
Shellsort
Insertion sort over a gap Best: O(n log n) Avg: depends on gap sequence Worst: O(n^2)
LSD Radix Sort
Stable sort which sorts fixed length strings. Uses an auxiliary array, and therefore is not in place. Goes through to the last character of a string (its least significant digit), and takes its value. All strings given are then organized based on the value of their least significant digit. Following this, the algorithm proceeds to the next least significant digit, repeating the process until it has gone through the length of the strings. Best used for sorting things with fixed string lengths, like Social Security numbers or License Plates. Has a time complexity of O(n*k) where n is the number of keys and k is the average length of those keys.
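A sketch of the pass-per-character procedure, using key-indexed counting for each pass. It assumes every string has exactly `width` characters and that all character codes are below `radix` (256 here); both parameters and the function name are illustrative.

```python
def lsd_radix_sort(strings, width, radix=256):
    """Sort fixed-length strings by processing characters right to left."""
    items = list(strings)
    for pos in range(width - 1, -1, -1):
        # Count occurrences of each character at this position.
        count = [0] * (radix + 1)
        for s in items:
            count[ord(s[pos]) + 1] += 1
        # Prefix sums turn counts into starting indices.
        for r in range(radix):
            count[r + 1] += count[r]
        # Distribute into the auxiliary array; equal characters keep their order.
        aux = [None] * len(items)
        for s in items:
            c = ord(s[pos])
            aux[count[c]] = s
            count[c] += 1
        items = aux
    return items


print(lsd_radix_sort(["cab", "abc", "bca", "aaa"], 3))  # ['aaa', 'abc', 'bca', 'cab']
```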
MSD
String/character sort from left to right, must keep appending from previous letters to keep order
Binary Search Tree
Will have a best case height of lgN. This is also its expected height. In the worst case, it will have a height of N, and thus become similar to a linked list. Works by inserting nodes of lesser values to the left of a node, and inserting greater values to the right of the node, traversing down the tree until we reach a blank spot to insert. Has a worst case cost of N to search and insert a node. The average case of searching will be 1.39lgN compares.
Red Black Tree
Worst case height of 2log(n+1). The nodes are either red or black. The root is black. If a node is red, its children MUST BE BLACK. Every path from a node to a leaf must contain the same number of black nodes. New insertions will always be red and always left leaning. Insertions must satisfy the conditions that red nodes have black children and that they have the same number of black nodes in all paths. Time complexity on its operations are O(logN).
Black height
# of black nodes, including nil, on the path from given node to a leaf, not inclusive; any node with height h has black-height >= h/2
Kraft's inequality
$\sum_{i=1}^{N} 2^{-c(i)} \le 1$, where N is the number of leaves in a binary tree and c(i) is the depth of leaf i.
Aggregation Relation
'has-a' relation, weaker, doesn't require all parts
Composition Relation
'owns-a' relation, stronger, requires all parts
Dependency Relation
'uses' relation
busy beaver
(1) A Turing machine with a small number of states that halts when started with a blank tape, but writes a huge number of non-blanks or takes a huge number of steps. (2) The problem of finding the maximum number of non-blanks written or steps taken for any Turing machines with a given number of states and symbols.
n-ary function
(1) A function with exactly n arguments. (2) A function which takes any number of arguments, or a variable number of arguments.
certificate
(1) Extra information so the correctness of an answer to a decision problem can be quickly checked. (2) For any graph property P and graph G, a certificate for G is a graph G' such that G has property P if and only if G' has the property.
left rotation
(1) In a binary search tree, pushing a node N down and to the left to balance the tree. N's right child replaces N, and the right child's left child becomes N's right child. (2) In an array, moving all items to the next lower location. The first item is moved to the last location, which is now vacant. (3) In a list, removing the head and inserting it at the tail.
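Sense (1) can be sketched as a pointer update on tree nodes. The `TreeNode` class and the convention that `rotate_left` returns the new subtree root (which the caller must reattach to the parent) are assumptions for the example.

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def rotate_left(n):
    """Rotate the subtree rooted at n to the left and return its new root."""
    r = n.right
    if r is None:
        return n  # nothing to rotate
    # The right child's left subtree becomes n's right subtree.
    n.right = r.left
    # n is pushed down and to the left; its right child takes its place.
    r.left = n
    return r


root = TreeNode(1, TreeNode(0), TreeNode(3, TreeNode(2), TreeNode(4)))
root = rotate_left(root)
print(root.value, root.left.value, root.left.right.value)  # 3 1 2
```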
Tree Buckets
+--WC = O(logN) +--no wasting space +--dynamically sized -- more complicated than what's needed. --> insert with dups= O(1) --> W/o dups = O(N)
Chained bucket:
+--easy to implement +-- buckets can't overfill +-- buckets won't waste time. +-- buckets are dynamically sized.
Linked-List: Implementation of stacks/queues
- Does not require the knowledge of the maxSize - Only allocate the right amount of memory for stack items - Linked-list provides more flexibility than arrays if one cannot predict maxSize (speed/complexity is similar) Queues: Easy to manipulate using double-ended feature
Array Bucket
-- a bucket of arrays. -Fixed in size. -A size of about 3 usually works well.
Probe Hashing:
-> Hash it, and if it leads to a collision, use a separate equation to determine the step size and use that step size to find a new site.
If we have UW-Madison student ID's, and we wanted the ideal hash functions, how would we do it, and why would there be a problem
-> We'd simply count each one as an index -> Hash table would be huge.
Collision Hashing using Buckets
-Each element can store more than one item. -throw collisions into a bucket. -buckets aren't sorted.
HashCode Method:
-method of OBJECT class -Returns an int -default hash code is BAD-- computed from Object's memory address. --> must override
Combinations of binary tree traversal sequences that do not uniquely identify a tree
1. Postorder and preorder. 2. Preorder and level-order. 3. Postorder and level-order.
pigeonhole sort
A 2-pass sort algorithm that is efficient when the range of keys is approximately equal to the number of items. The first pass allocates an array of buckets, one bucket for each possible key value, then moves each item to its key's bucket. The second pass goes over the bucket array moving each item to the next place in the destination.
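A sketch of the two passes for items with integer keys. The `key`, `min_key`, and `max_key` parameters and the function name are assumptions; the caller is expected to know the key range in advance.

```python
def pigeonhole_sort(items, key, min_key, max_key):
    """Sort items whose integer keys lie in the range min_key..max_key."""
    # Pass 1: one bucket per possible key value; move each item to its bucket.
    holes = [[] for _ in range(max_key - min_key + 1)]
    for item in items:
        holes[key(item) - min_key].append(item)
    # Pass 2: walk the buckets in key order and copy items to the output.
    result = []
    for hole in holes:
        result.extend(hole)
    return result


print(pigeonhole_sort([5, 3, 4, 3, 7], key=lambda x: x, min_key=3, max_key=7))
# [3, 3, 4, 5, 7]
```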
B+-tree
A B-tree in which keys are stored in the leaves.
2-3 tree
A B-tree of order 3, that is, internal nodes have two or three children.
Queue
A FIFO (First In First Out) data structure, where the first element added will be the first to be removed, and where a new element is added to the back, much like a waiting line.
What Is A Queue?
A First In First Out Data Structure.
The Mapping Between An Item And The Slot Where That Item Belongs In The Hash Table Is Called What?
A Hash Function
l-reduction
A Karp reduction that preserves approximation properties of optimization problems.
What Is A Stack
A LIFO Or FILO Data Structure.
oracle Turing machine
A Turing machine with an extra oracle tape and three extra states q?, qy, qn. When the machine enters q?, control goes to state qy if the oracle tape content is in the oracle set; otherwise control goes to state qn.
symmetric
A binary relation R for which a R b implies b R a.
threaded tree
A binary search tree in which each node uses an otherwise-empty left child link to refer to the node's in-order predecessor and an empty right child link to refer to its in-order successor.
randomized binary search tree
A binary search tree in which nodes have a randomly assigned priority. Updates keep priorities in heap order instead of keeping balance information and doing rebalance operations.
treap
A binary search tree in which nodes have another key, called the priority. Operations also keep the nodes heap ordered with regard to the priority.
splay tree
A binary search tree in which operations that access nodes restructure the tree.
full binary tree
A binary tree in which each node has exactly zero or two children.
chaining
A class of collision resolution schemes in which linked lists handle collisions in a hash table. The two main subclasses are separate chaining, where lists are outside the table, and coalesced chaining, where the lists are within the table.
polytope
A closed, bounded N-dimensional figure whose faces are hyperplanes. Informally, a multidimensional solid with flat sides. A generalization of polyhedron.
list
A collection of items accessible one after another beginning at the head and ending at the tail.
direct chaining
A collision resolution scheme in which the hash table is an array of links to lists. Each list holds all the items with the same hash value.
triconnected graph
A connected graph such that deleting any two vertices (and incident edges) results in a graph that is still connected.
biconnected graph
A connected graph that is not broken into disconnected pieces by deleting any single vertex (and incident edges).
maximally connected component
A connected subgraph of a graph to which no further vertex can be added while keeping the subgraph connected.
zipper
A data structure equivalent to a binary tree that is "opened" so that some node is accessible. It consists of a pair: the current node, along with information to reconstruct the tree. Reconstruction information is called the path or context. A move-to-left-child operation returns the left subtree, along with a new path, which has (i) a Left value, (ii) the current node, (iii) the right subtree, and (iv) any previous path. A similar operation moves to the right child. A move-up operation returns a tree rebuilt from the path information and the current node, along with the previous path.
passive data structure
A data structure that is only changed by external threads or processes, in contrast to an active data structure.
recursive data structure
A data structure that is partially composed of smaller or simpler instances of the same data structure. For instance, a tree is composed of smaller trees (subtrees) and leaf nodes, and a list may have other lists as elements.
persistent data structure
A data structure that preserves its old versions, that is, previous versions may be queried in addition to the latest version.
active data structure
A data structure with an associated thread or process that performs internal operations to give the external behavior of another, usually more general, data structure.
Recursive step
A definition which uses the concept that is being defined
deterministic finite tree automaton
A deterministic finite state machine that accepts finitary trees rather than just strings. The tree nodes are marked with the letters of the alphabet of the automaton, and the transition function encodes the next states for each branch of the tree. The acceptance condition is modified accordingly.
deterministic tree automaton
A deterministic finite state machine that accepts infinite trees rather than just strings. The tree nodes are marked with the letters of the alphabet of the automaton, and the transition function encodes the next states for each branch of the tree. The expressive power of such automata varies depending on the acceptance conditions of the trees.
Mnemonic
A device such as a pattern of letters, ideas, or associations that assists in remembering something
cuckoo hashing
A dictionary implemented with two hash tables, T1 and T2, and two different hash functions, h1 and h2. Each key, k, is either in T1[h1(k)] or T2[h2(k)]. A new key, k, is stored in T1[h1(k)]. If that location is already occupied by another key, l, the other key is moved to T2[h2(l)]. Keys are moved back and forth until a key moves to an empty location or a limit is reached. If the limit is reached, new hash functions are chosen, and the tables are rehashed. For tables that are a bit less than half full and with carefully chosen universal hashing functions, performance is good. A key is deleted by removing it from a table.
compact DAWG
A directed acyclic word graph (DAWG) representing the suffixes of a given string in which each edge is labeled with the longest possible string. The strings along a path from the root to a node are the substring which the node represents.
Full Binary Tree
A full binary tree (sometimes referred to as a proper or plane binary tree) is a tree in which every node in the tree has either 0 or 2 children.
confluently persistent data structure
A fully persistent data structure that allows meld or merge operations to combine two different versions.
objective function
A function associated with an optimization problem which determines how good a solution is, for instance, the total cost of edges in a solution to a traveling salesman problem.
planar straight-line graph
A graph that can be embedded in the plane without crossings in which every edge in the graph is a straight line segment. It is sometimes referred to as planar subdivision or map.
Undirected Graph
A graph that contains edges between vertices with no specific direction associated with any edge.
forest
A graph that has no cycles but is not necessarily connected.
labeled graph
A graph which has labels associated with each edge or each vertex.
DAG (Directed Acyclic Graph)
A graph which is directed and contains no cycles
perfect k-ary tree
A k-ary tree with all leaf nodes at same depth. All internal nodes have degree k.
nonbalanced merge sort
A k-way merge sort in which the number of input and output data streams is different for any particular pass. Typically P input streams are merged and distributed to T output streams on one pass followed by a merge of the T inputs and distribution to P outputs.
Fibonacci number
A member of the sequence of numbers such that each number is the sum of the preceding two. The first seven numbers are 1, 1, 2, 3, 5, 8, and 13. F(n) ≈ round(Φⁿ/√5), where Φ = (1+√5)/2.
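As a quick check of the closed form above, here is a minimal sketch that computes F(n) iteratively and compares it with rounding Φⁿ/√5. The function names `fib` and `fib_closed_form` are illustrative, and the floating-point version is only expected to agree for moderate n.

```python
import math

def fib(n):
    """Return F(n) with F(1) = F(2) = 1, computed iteratively."""
    a, b = 0, 1  # F(0), F(1)
    for _ in range(n):
        a, b = b, a + b
    return a

def fib_closed_form(n):
    """Approximate F(n) by rounding phi**n / sqrt(5) (floating point)."""
    phi = (1 + math.sqrt(5)) / 2
    return round(phi ** n / math.sqrt(5))

print([fib(i) for i in range(1, 8)])              # [1, 1, 2, 3, 5, 8, 13]
print([fib_closed_form(i) for i in range(1, 8)])  # same values
```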
polyphase merge sort
A merge sort algorithm that reduces the number of intermediate files needed by reusing emptied files.
padding argument
A method for transferring results about one complexity bound to another complexity bound, by padding extra dummy characters onto the inputs of the machines involved.
finite state machine
A model of computation consisting of a set of states, a start state, an input alphabet, and a transition function that maps input symbols and current states to a next state. Computation begins in the start state with an input string. It changes to new states depending on the transition function. There are many variants, for instance, machines having actions (outputs) associated with transitions (Mealy machine) or states (Moore machine), multiple start states, transitions conditioned on no input symbol (a null) or more than one transition for a given symbol and state (nondeterministic finite state machine), one or more states designated as accepting states (recognizer), etc.
multiprocessor model
A model of parallel computation based on a set of communicating sequential processors.
child
A node of a tree referred to by a parent node. See the figure at tree. Every node, except the root, is the child of some parent.
internal node
A node of a tree that has one or more child nodes, equivalently, one that is not a leaf.
interior node / nonleaf
A node with children. Also a parent.
Parent Node
A node, including the root, which has one or more child nodes connected to it.
Full Tree Traversal
A non-executable, visual approach to help determine the pre-order, in-order, or post-order traversal of a tree.
polyphase merge
A nonbalanced k-way merge which reduces the number of output files needed by reusing the emptied input file or device as one of the output devices. This is most efficient if the number of output runs in each output file is different.
alternating Turing machine
A nondeterministic Turing machine having universal states, from which the machine accepts only if all possible moves out of that state lead to acceptance.
nondeterministic finite tree automaton
A nondeterministic finite state machine that accepts finitary trees rather than just strings. The tree nodes are marked with the letters of the alphabet of the automaton, and the transition function encodes the next states for each branch of the tree. The acceptance condition is modified accordingly.
Kripke structure
A nondeterministic finite state machine whose states are labeled with boolean variables, which are the evaluations of expressions in that state. It may be extended with fairness constraints.
horizontal visibility map
A partition of the plane into regions by drawing a horizontal straight line through each vertex p of a planar straight-line graph until it intersects an edge e of the graph or extends to infinity. The edge e is said to be horizontally visible from p.
vertical visibility map
A partition of the plane into regions by drawing a vertical straight line through each vertex p of a planar straight-line graph until it intersects an edge e of the graph or extends to infinity. The edge e is said to be vertically visible from p.
Set Partition
A partitioning of elements of some universal set into a collection of disjointed subsets. Thus, each element must be in exactly one subset.
Path
A path in a graph is a sequence of vertices connected by edges. A simple path is one with no repeated vertices.
walk
A path in which edges may be repeated.
partially persistent data structure
A persistent data structure that allows updates to the latest version only.
visibility map
A planar subdivision that encodes the visibility information, that is, which points are mutually visible.
EXCELL
A point access method using a dynamic multidimensional array.
two-level grid file
A point access method which is two levels of grid files. The first level addresses second level grid files.
twin grid file
A point access method which is two simultaneous grid files. Points are shuffled between the primary and secondary file to minimize the total size.
star-shaped polygon
A polygon P in which there exists an interior point p such that all the boundary points of P are visible from p.
Horner's rule
A polynomial A(x) = a0 + a1x + a2x² + a3x³ + ... may be written as A(x) = a0 + x(a1 + x(a2 + x(a3 + ...))).
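The nested form can be evaluated with one multiplication and one addition per coefficient. A minimal sketch, assuming the coefficients are given lowest degree first (the name `horner` is illustrative):

```python
def horner(coeffs, x):
    """Evaluate a0 + a1*x + a2*x**2 + ... where coeffs = [a0, a1, a2, ...]."""
    result = 0
    # Work from the innermost parenthesis (highest coefficient) outward.
    for a in reversed(coeffs):
        result = result * x + a
    return result

print(horner([1, 2, 3], 2))  # 1 + 2*2 + 3*4 = 17
```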
locality-sensitive hashing
A probabilistic algorithm to quickly find points in a high dimensional space near a query point. Preprocessing: put every point in multiple hash tables. Each table has its own locality-sensitive hash function and uses buckets (or chaining) since many collisions are expected. The hash functions come from a family of functions. Finding: look up the query point in each hash table, and compute the distance from the query point of every point in the bucket.
intractable
A problem for which no algorithm can exist which computes all instances of it in polynomial time.
skip list
A randomized variant of an ordered linked list with additional, parallel lists. Parallel lists at higher levels skip geometrically more items. Searching begins at the highest level, to quickly get to the right part of the list, then uses progressively lower level lists. A new item is added by randomly selecting a level, then inserting it in order in the lists for that and all lower levels. With enough levels, searching is O(log n).
permutation
A rearrangement of elements, where none are lost, added, or changed. The Fisher-Yates shuffle randomly permutes elements.
orthogonally convex rectilinear polygon
A rectilinear polygon P in which every horizontal or vertical segment connecting two points in P lies totally within P.
adjacency-list representation
A representation of a directed graph with n vertices using an array of n lists of vertices. List i contains vertex j if there is an edge from vertex i to vertex j. A weighted graph may be represented with a list of vertex/weight pairs. An undirected graph may be represented by having vertex j in the list for vertex i and vertex i in the list for vertex j.
boundary-based representation
A representation of a region that is based on its boundary.
interior-based representation
A representation of a region that is based on its interior (i.e., the cells that compose it).
pushdown automaton
A restricted Turing machine where the tape acts as a pushdown store (or stack, where only the latest element can be read), with an extra one-way read-only input tape.
Rolling hash function
A rolling hash (also known as a rolling checksum) is a hash function where the input is hashed in a window that moves through the input. A few hash functions allow a rolling hash to be computed very quickly: the new hash value is rapidly calculated given only the old hash value, the old value removed from the window, and the new value added to the window, similar to the way a moving average function can be computed much more quickly than other low-pass filters. One of the main applications is the Rabin-Karp string search algorithm, which uses a rolling hash.
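To illustrate the window update, here is a minimal sketch of Rabin-Karp search with a polynomial rolling hash. The `base` and `mod` values are arbitrary assumptions chosen for the example, not values prescribed by the algorithm, and every hash match is confirmed by a direct comparison to guard against collisions.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the character leaving the window
    p_hash = t_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        t_hash = (t_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        # Compare the strings only when the hashes agree.
        if p_hash == t_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            t_hash = (t_hash - ord(text[i]) * high) % mod
            t_hash = (t_hash * base + ord(text[i + m])) % mod
    return -1

print(rabin_karp("abracadabra", "cad"))  # 4
```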
ordered tree
A rooted tree in which all children of each vertex are ordered. (Usually left to right)
inclusion-exclusion principle
A rule that allows one to compute the probability of exactly r occurrences of events A1, A2, ... , An.
separate chaining
A scheme in which each position in the hash table has a list to handle collisions. Each position may be just a link to the list (direct chaining) or may be an item and a link, essentially, the head of a list. In the latter, one item is in the table, and other colliding items are in the list.
edit script
A sequence of viable edit operations on a string or tree.
linked list
A sequence of zero or more nodes containing some data and pointers to other nodes of the list.
proper subset
A set S2 is a proper subset of another set S1 if every element in S2 is in S1 and S1 has some elements which are not in S2.
fully polynomial approximation scheme
A set of algorithms {Aε | ε > 0}, where each Aε is a (1+ε)-approximation algorithm bounded by a polynomial in the length of the input and 1/ε.
superimposed code
A set of bit vectors such that no vector is a subset of a bitwise or of a small number of others.
Abstract Data Type (ADT)
A set of data and set of operations that can be performed on the data.
address-calculation sort
A sort algorithm which uses knowledge of the domain of the items to calculate the position of each item in the sorted array.
adaptive sort
A sorting algorithm that can take advantage of existing order in the input, reducing its requirements for computational resources as a function of the disorder in the input.
Minimum Weight Spanning Trees (MST)
A spanning tree whose weight is no larger than the weight of any other spanning tree of the graph. Common simplifying assumptions: the graph is connected, the edge weights are not necessarily distances, the edge weights may be zero or negative, and the edge weights are all different. Can be constructed using a greedy algorithm such as Prim's or Kruskal's. Generally used in network design.
blind trie
A specialized Patricia tree whose internal nodes store only an integer, k, which is the length of the common prefix of the strings in the children. Equivalently, the strings first differ in the (k+1)st character.
existential state
A state in a nondeterministic Turing machine from which the machine accepts if any move leads to acceptance.
universal state
A state in an alternating Turing machine from which the machine accepts only if all possible moves lead to acceptance.
antichain
A subset of mutually incomparable elements in a poset.
multi suffix tree
A suffix tree extended to multiple strings by concatenating the strings.
tree traversal
A technique for processing the nodes of a tree in some order.
Hashing
A technique that determines an array index using only an entry's search key. The array itself is called a hash table.
stooge sort
A terribly inefficient sort algorithm that swaps the top and bottom items if needed, then (recursively) sorts the bottom two-thirds, then the top two-thirds, then the bottom two-thirds again.
Master theorem
A theorem giving a solution in asymptotic terms for recurrence relations of the form T(n) = aT(n/b) + f(n) where a ≥ 1 and b > 1 are constants and n/b means either ⌊n/b⌋ or ⌈n/b⌉.
little-o notation
A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = o(g(n)) means f(n) becomes insignificant relative to g(n) as n approaches infinity. The notation is read, "f of n is little oh of g of n".
polynomial-time reduction
A transformation of one problem into another which is computable in polynomial time.
Complete Tree
A tree in which there are no missing nodes when looking at each level of the tree. The lowest level may not be completely full, but it must have no gaps between its nodes. All other levels are full.
Euclidean Steiner tree
A tree of minimum Euclidean distance connecting a set of points, called terminals, in the plane. This tree may include points other than the terminals, which are called Steiner points.
introspective sort
A variant of quicksort which switches to heapsort for pathological inputs, that is, when execution time is becoming quadratic.
cactus stack
A variant of stack in which one other cactus stack may be attached to the top. An attached stack is called a "branch." When a branch becomes empty, it is removed. Pop is not allowed if there is a branch. A branch is only accessible through the original reference; it is not accessible through the stack.
free vertex
A vertex not on a matched edge in a matching, or, one which has not been matched.
collision resolution scheme
A way of handling collisions, that is, when two or more items should be kept in the same location, especially in a hash table. The general ways are keeping subsequent items within the table and computing possible locations (open addressing), keeping lists for items that collide (chaining), or keeping one special overflow area.
Heap Binary Tree: access, search, insert, delete
Access (root item): O(1). Search: O(n). Insert: O(lg n); best case: sorted array. Delete: O(lg n).
Minimum Spanning Tree
An acyclic, connected subgraph that contains all vertices and has minimum total edge weight. Can be found with either Prim's or Kruskal's method.
Priority Queue: advantage, disadvantage
Advantage: cheap way to sort priorities, sometimes you want to do things first Disadvantage: worse at inserting and searching than BST
Heap Binary Tree: advantage, disadvantage
Advantage: fast access, quick insert and delete, memory-efficient if full. Disadvantage: slow search.
Tries: advantage, disadvantage, memory
Advantage: faster search than a hash table, no collisions, no hash function needed, quick insert and delete Disadvantage: can take up more space than a hash table Memory: A LOT - need empty memory for every possibility
Stack: advantage, disadvantage
Advantage: quick access Disadvantage: inefficient with an array
Heap: advantage, disadvantage
Advantage: quick insert, quick delete, access to largest item Disadvantage: slow access to all other items
Binary Tree: advantage, disadvantage
Advantage: quick search, delete, insert Disadvantage: complex deletion
Proper ancestors of a vertex v in a tree
All vertices on the simple path from the root to v, but excluding v itself.
Bellman-Ford
Allowed to reconsider the costs of reaching vertices. Can detect negative-cost cycles. Handles graphs with negative edge weights by relaxing all edges V-1 times, where V is the number of vertices.
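A minimal sketch of the relaxation loop, assuming vertices are numbered 0 to V-1 and edges are given as hypothetical `(u, v, weight)` tuples. One extra pass after the V-1 rounds is used here to detect a negative cycle reachable from the source.

```python
def bellman_ford(num_vertices, edges, source):
    """Return shortest distances from source, or None if a negative
    cycle is reachable from source. edges is a list of (u, v, weight)."""
    INF = float("inf")
    dist = [INF] * num_vertices
    dist[source] = 0
    # Relax every edge V-1 times.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] != INF and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:  # early exit: nothing left to improve
            break
    # Any edge that can still be relaxed lies on or after a negative cycle.
    for u, v, w in edges:
        if dist[u] != INF and dist[u] + w < dist[v]:
            return None
    return dist

edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, edges, 0))  # [0, 2, 5, 4]
```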
Doubly Linked List
Allows traversing the list forward and backward. There are now two references within each link instead of one. Benefits: traverse backward; more flexibility in removing and inserting objects; deletion from the end of the list is O(1). Drawback: need to manage multiple pointers.
lexicographical order
Alphabetical or "dictionary" order.
process algebra
An algebraic theory to formalize the notion of concurrent computation, best exemplified in CSP and CCS.
Kruskal's algorithm
An algorithm for computing a minimum spanning tree. It maintains a set of partial minimum spanning trees, and repeatedly adds the shortest edge in the graph whose vertices are in different partial minimum spanning trees.
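A minimal sketch of this procedure, assuming edges are given as hypothetical `(weight, u, v)` tuples over vertices 0 to n-1. The partial trees are tracked with a simple union-find structure, which is one common choice rather than the only one.

```python
def kruskal(num_vertices, edges):
    """Return the list of (weight, u, v) edges in a minimum spanning forest."""
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the representative, halving paths as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):  # shortest edges first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:       # endpoints lie in different partial trees
            parent[root_u] = root_v
            tree.append((w, u, v))
    return tree

edges = [(1, 0, 1), (3, 0, 2), (2, 1, 2), (4, 2, 3)]
print(kruskal(4, edges))  # [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
```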
Floyd-Warshall Algorithm
An algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles). A single execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of vertices, though it does not return details of the paths themselves.
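A minimal sketch, assuming the graph is given as an n×n matrix where `weights[i][j]` is the edge weight, infinity if there is no edge, and 0 on the diagonal. As the definition notes, it returns only path lengths, not the paths.

```python
def floyd_warshall(weights):
    """Return the matrix of shortest-path lengths between all pairs."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):          # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist

INF = float("inf")
graph = [
    [0,   3,   INF],
    [INF, 0,   -2],
    [1,   INF, 0],
]
print(floyd_warshall(graph))  # [[0, 3, 1], [-1, 0, -2], [1, 4, 0]]
```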
compound algorithm
An algorithm in which one or more of the computable steps is a call to execute another algorithm.
Hash Table
An array that stores a collection of items.
boolean expression
An expression consisting solely of boolean variables and values and boolean operations, such as and, or, not, implies, etc.
performance guarantee
An expression of the most the result of an approximation algorithm may depart from the optimal solution.
Tail Reference
An external reference to the last node in the chain. Makes it so you don't have to traverse chain to find last node if you only have reference to head of chain.
rescalable
An optimization for which given any instance of the problem and integer λ >0, there is an easily computed second instance that is the same except that the objective function for the second instance is (element-wise) λ times the objective function of the first instance. For such problems, the best one can hope for is a relative performance guarantee, not an absolute performance guarantee.
memoization
An optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
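A minimal sketch of the idea using a dictionary as the cache. The decorator name `memoize` and the `binomial` example are illustrative; Python's standard library offers `functools.lru_cache` for the same purpose.

```python
def memoize(func):
    """Wrap func so that results are cached by their positional arguments."""
    cache = {}

    def wrapper(*args):
        if args not in cache:
            cache[args] = func(*args)  # compute once
        return cache[args]             # reuse afterwards

    return wrapper

@memoize
def binomial(n, k):
    """Number of ways to choose k of n items (Pascal's rule)."""
    if k == 0 or k == n:
        return 1
    return binomial(n - 1, k - 1) + binomial(n - 1, k)

print(binomial(30, 15))  # 155117520
```

Without the cache, the same subproblems would be recomputed many times over; with it, each (n, k) pair is evaluated once.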
total order
An order defined for all pairs of items of a set. For instance, ≤ (less than or equal to) is a total order on integers, that is, for any two integers, one of them is less than or equal to the other.
partial order
An order defined for some, but not necessarily all, pairs of items. For instance, the sets {a, b} and {a, c, d} are subsets of {a, b, c, d}, but neither is a subset of the other. So "subset of" is a partial order on sets.
Binary Search
An ordered array of data that efficiently supports searching. The worst and average case of a search using this structure is O(lg N). The worst case of an insertion is N moves, and the average case is N/2.
connected graph
An undirected graph that has a path between every pair of vertices.
4 Rules of Recursion
Base Cases: You must always have some base cases, which can be solved without recursion. Making Progress: For the cases that are to be solved recursively, the recursive call must make progress toward a base case. Design Rule: Assume that all recursive calls work. Compound Interest Rule: Never duplicate work by solving the same instance of a problem in separate recursive calls.
All Pairs Shortest Path
Can be solved using Floyd-Warshall.
Transitive Closure
Can one get from node a to node d in one or more hops? A binary relation tells you only that node a is connected to node b, and that node b is connected to node c, etc. After the transitive closure is constructed one may determine that node d is reachable from node a. (use Floyd-Warshall Algorithm)
Linear Probing
Checks each spot in order to find available location, causes primary clustering.
What does binary search do?
Checks the middle item and, if it is not the target, continues the search in the half of the array that could contain it. Data must be sorted.
combination
Choose m of n elements, where m ≤ n.
Bubble Sort
Comparisons: O(N^2). Swaps: O(N^2) (fewer than comparisons). Big O: O(N^2).
Enhanced Insertion sort
Comparisons: O(N log N). Shifts: O(N^2). Big O: O(N^2), but faster than bubble sort.
quantum computation
Computation based on quantum mechanical effects, such as superposition and entanglement, in addition to classical digital manipulations.
Prim-Jarnik algorithm
Compute a minimum spanning tree by beginning with any vertex as the current tree. At each step add a least edge between any vertex not in the tree and any vertex in the tree. Continue until all vertices have been added.
Boruvka's algorithm
Compute a minimum spanning tree.
binary GCD
Compute the greatest common divisor of two integers, u and v, expressed in binary. The run time complexity is O((log₂ uv)²) bit operations.
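A minimal sketch of one common formulation (often attributed to Stein), assuming non-negative integer inputs. It uses only shifts, subtraction, and parity tests.

```python
def binary_gcd(u, v):
    """Greatest common divisor of non-negative integers u and v."""
    if u == 0:
        return v
    if v == 0:
        return u
    # Factor out the powers of two shared by u and v.
    shift = 0
    while (u | v) & 1 == 0:
        u >>= 1
        v >>= 1
        shift += 1
    # Make u odd; remaining factors of two in u are not common.
    while u & 1 == 0:
        u >>= 1
    while v != 0:
        while v & 1 == 0:  # drop factors of two from v
            v >>= 1
        if u > v:
            u, v = v, u    # keep u <= v
        v -= u             # odd minus odd is even
    return u << shift      # restore the shared powers of two

print(binary_gcd(48, 18))  # 6
```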
repeated squaring
Compute the nth power of an expression in Θ(log n) steps by repeatedly squaring an intermediate result and multiplying an accumulating value by the intermediate result when appropriate.
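A minimal sketch for non-negative integer exponents. The bits of n decide when the running square is multiplied into the accumulator; the name `power` is illustrative.

```python
def power(base, n):
    """Return base**n for an integer n >= 0 using O(log n) multiplications."""
    result = 1       # accumulating value
    square = base    # base**(2**i) after i iterations
    while n > 0:
        if n & 1:            # this bit of n is set
            result *= square
        square *= square     # repeatedly square the intermediate result
        n >>= 1
    return result

print(power(3, 13))  # 1594323
```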
Ratcliff/Obershelp pattern recognition
Compute the similarity of two strings as the number of matching characters divided by the total number of characters in the two strings. Matching characters are those in the longest common subsequence plus, recursively, matching characters in the unmatched region on either side of the longest common subsequence.
CORDIC
Compute trigonometric functions by iterative complex rotations. The advantage is that all computations can be done with addition, subtraction, and binary shifts.
Abstract Data Types
Consists of 2 parts: 1. Data it contains 2. Operations that can be performed on it
What are the methods we need to create a hash table? And what are the attributes?
Methods: constructor, put(key, val), get(key), del, len, in. Attributes: self.size, self.slots, self.data.
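A minimal sketch of such a class, assuming linear probing for collisions and a fixed table size of 11 (both are choices made for this example). Deletion is omitted because with open addressing it needs extra care, such as tombstone markers.

```python
class HashTable:
    def __init__(self, size=11):
        self.size = size
        self.slots = [None] * size  # keys
        self.data = [None] * size   # values, parallel to slots

    def _find(self, key):
        """Return the index holding key, or the first empty slot on its
        probe sequence, or None if the table is full and key is absent."""
        index = hash(key) % self.size
        for _ in range(self.size):
            if self.slots[index] is None or self.slots[index] == key:
                return index
            index = (index + 1) % self.size  # linear probing
        return None

    def put(self, key, val):
        index = self._find(key)
        if index is None:
            raise OverflowError("hash table is full")
        self.slots[index] = key
        self.data[index] = val

    def get(self, key):
        index = self._find(key)
        if index is None or self.slots[index] is None:
            return None
        return self.data[index]

    def __len__(self):
        return sum(slot is not None for slot in self.slots)

    def __contains__(self, key):
        index = self._find(key)
        return index is not None and self.slots[index] is not None

table = HashTable()
table.put(54, "cat")
table.put(65, "dog")   # 65 % 11 == 54 % 11, so this key is probed onward
print(table.get(65))   # dog
```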
build-heap
Convert an array into a heap by executing heapify progressively closer to the root. For an array of n nodes, this takes O(n) time under the comparison model.
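A minimal sketch for a max-heap stored in a Python list, assuming the usual layout where the children of index i are at 2i+1 and 2i+2. Heapify starts at the last internal node and moves toward the root.

```python
def heapify(a, i, n):
    """Sift a[i] down until the subtree rooted at i is a max-heap."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]
        i = largest

def build_heap(a):
    """Rearrange list a in place into a max-heap and return it."""
    n = len(a)
    # Leaves are already heaps; start at the last node that has a child.
    for i in range(n // 2 - 1, -1, -1):
        heapify(a, i, n)
    return a

heap = build_heap([3, 9, 2, 1, 4, 5])
print(heap[0])  # 9, the maximum is at the root
```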
Negative Edge Costs
Dijkstra's cannot solve. Requires Bellman-Ford.
linear probing sort
Distribute each of n elements to one of m locations in an array (m>n) based on an interpolation of the element's key. In case of collisions, put the element in the next empty location. The array has extra space at the end for overflow. The second pass packs the elements back into an array of size n.
Double Checked Locking
Double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by first testing the locking criterion (the "lock hint") without actually acquiring the lock. Only if the locking criterion check indicates that locking is required does the actual locking logic proceed. (Often used in Singletons, and has issues in C++).
What Is The Difference Between A Singly Linked List, and A Doubly Linked List?
A doubly linked list node keeps a reference to both the node before and the node after it; a singly linked list node keeps only a reference to the next node.
max-heap property
Each node in a tree has a key which is less than or equal to the key of its parent.
Compressing:
Ensuring the hash code is a valid index for the table size.
formal verification
Establishing properties of hardware or software designs using logic, rather than (just) testing or informal arguments. This involves formal specification of the requirement, formal modeling of the implementation, and precise rules of inference to prove, say, that the implementation satisfies the specification.
Perfect Hash Function
Example: Search Keys range from 555-0000 to 555-9999, so the hash function will produce indices from 0-9999. If the array has 10,000 elements, each telephone number will correspond to one unique element of the hashTable.
Rehashing
Expanding the table: double table size, find closest prime number. Rehash each element for the new table size.
Chinese postman problem
Find a minimum length closed walk that traverses each edge at least once. Finding an optimal solution in a graph with both directed and undirected edges is NP-complete.
Euclidean traveling salesman problem
Find a path of minimum Euclidean distance between points in a plane which includes each point exactly once and returns to its starting point.
bottleneck traveling salesman
Find a tour where no edge is more costly than some (bottleneck) amount.
all simple paths
Find all simple paths from a starting vertex (source) to a destination vertex (sink) in a directed graph. In an undirected graph, find all simple paths between two vertices.
array search
Find an element in an array. Various algorithms exist which require more or less structure in the array elements or implementation.
cutting stock problem
Find the best arrangement of shapes on rectangles to minimize waste or the number of rectangles. This is a two-dimensional variant of the bin packing problem. It is NP-complete.
clique problem
Find the largest clique in an undirected graph.
critical path problem
Find the longest path from any source to any sinks in a directed acyclic graph which has weights, or numeric values, on vertices.
path system problem
For a path system P=(X,R,S,T), where S ⊆ X, T ⊆ X, and R ⊆ X × X × X, the problem of whether there is an admissible vertex in S. A vertex x is admissible if and only if x ∈ T, or there exist admissible y, z ∈ X such that (x,y,z) ∈ R.
Stirling's approximation
For large values of n, n! ≈ (n/e)ⁿ √(2πn).
Arithmetic progressions
For p < -1, this sum always converges to a constant.
What is binary search?
For this to work, the data has to be sorted. We look at the middle element and compare it with the target value. If the target is greater or less than the middle element, we discard the half that cannot contain it and repeat. After every iteration the number of remaining elements is cut in half (8 -> 4 -> 2 -> 1), so the complexity is O(log n).
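A minimal iterative sketch, assuming `items` is a list sorted in ascending order. It returns the index of the target, or -1 when the target is absent (the -1 convention is a choice made for this example).

```python
def binary_search(items, target):
    """Return an index of target in the sorted list items, or -1."""
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            low = mid + 1    # target can only be in the upper half
        else:
            high = mid - 1   # target can only be in the lower half
    return -1

print(binary_search([1, 3, 5, 7, 9, 11, 13, 15], 11))  # 5
```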
bisector
For two elements ei and ej, the locus of points equidistant from ei and ej. That is, {p | d(p,ei) = d(p,ej)}, where d is some distance metric.
minimax
Generate all nodes in a game tree. Score each leaf node with its utility value. Score each minimizing node with the smallest of its children's scores, and maximizing node with the largest of its children's scores.
Johnson-Trotter
Generate permutations by transposing one pair of elements at a time.
packing
Given a finite collection of subsets of a finite ground set, to find an optimal subcollection that are pairwise disjoint.
covering
Given a finite collection of subsets of a finite ground set, to find an optimal subcollection whose union covers the ground set.
slope selection
Given a set of n points in a plane and an integer k ≤ C(n, 2) (n choose 2), find the line between pairs of points which has the kth smallest slope.
circuit value problem
Given an encoding ᾱ of a boolean circuit α, inputs x1, ... , xn and a designated output y, the problem of deciding if output y of α is true on input x1, ... , xn.
knapsack problem
Given items of different values and volumes, find the most valuable set of items that fit in a knapsack of fixed volume.
fractional knapsack problem
Given materials of different values per unit volume and maximum amounts, find the most valuable mix of materials which fit in a knapsack of fixed volume. Since we may take pieces (fractions) of materials, a greedy algorithm finds the optimum. Take as much as possible of the material that is most valuable per unit volume. If there is still room, take as much as possible of the next most valuable material. Continue until the knapsack is full.
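A minimal sketch of the greedy strategy described above, assuming each material is given as a hypothetical `(value_per_unit_volume, max_amount)` pair.

```python
def fractional_knapsack(materials, capacity):
    """Return the maximum total value that fits in the knapsack.
    materials is a list of (value_per_unit_volume, max_amount) pairs."""
    total = 0.0
    # Consider the most valuable material per unit volume first.
    for value_per_unit, amount in sorted(materials, reverse=True):
        if capacity <= 0:
            break
        take = min(amount, capacity)  # as much as possible of this material
        total += take * value_per_unit
        capacity -= take
    return total

# Three materials worth 6, 5, and 4 per unit, with 10, 20, and 30 units available.
print(fractional_knapsack([(6, 10), (5, 20), (4, 30)], 50))  # 240.0
```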
oscillating merge sort
Given n tape drives, one input and n-1 work drives, distribute a portion of the input to n-2 tapes, then merge them onto the final tape reading the n-2 backward. Repeat until n-2 (backward) merged runs have been created, at which time they are merged. Continue building up powers of n-2 batches until done.
unbounded knapsack problem
Given types of items of different values and volumes, find the most valuable set of items that fit in a knapsack of fixed volume. The number of items of each type is unbounded. This is an NP-hard combinatorial optimization problem.
Bipartite Graph
A graph whose vertices can be colored using only two colors so that no edge joins two vertices of the same color.
To Solve Any Problem What Should You Have At The Top Of Your Mind?
Hash Tables
Collision handling:
How collisions are handled so that each slot in the hash table stores only one item.
Complete Binary Tree
In a complete binary tree every level, except possibly the last, is completely filled, and all nodes in the last level are as far left as possible.
pointer jumping
In a linked structure, replacing a pointer with the pointer it points to. Used for various algorithms on lists and trees.
LCFS hashing
In case of collision, move the existing item to another position, dictated by the open addressing scheme used.
Robin Hood hashing
In case of collision, the item with the longer probe sequence stays in the position. The other item is moved. This tends to equalize the length of probe sequences.
Trie
In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are not necessarily associated with every node. Rather, values tend only to be associated with leaves, and with some inner nodes that correspond to keys of interest. For the space-optimized presentation of prefix tree, see compact prefix tree.
One-Sided Binary Search
In the absence of an upper bound, we can repeatedly test larger intervals (A[1], A[2], A[4], A[8], A[16], etc) until we find an upper bound, the transition point, p, in at most 2⌈log p⌉ comparisons. One-sided binary search is most useful whenever we are looking for a key that lies close to our current position.
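A possible sketch of the doubling idea on a sorted Python list: probe indices 1, 2, 4, 8, ... until an upper bound is found, then binary search inside the last interval. The function name `one_sided_search` is an assumption for the example, and 0-based indexing is used instead of the 1-based indexing in the card.

```python
from bisect import bisect_left


def one_sided_search(a, key):
    """Return the index of the first element >= key in sorted list a
    (len(a) if there is none), probing indices 1, 2, 4, 8, ... first."""
    if not a or a[0] >= key:
        return 0
    # Double the probe index until a[bound] >= key or we run off the end.
    bound = 1
    while bound < len(a) and a[bound] < key:
        bound *= 2
    # a[bound // 2] < key is known, so the transition point lies in
    # the range bound // 2 + 1 .. min(bound, len(a)).
    lo = bound // 2 + 1
    hi = min(bound, len(a))
    return bisect_left(a, key, lo, hi)
```

The doubling phase and the final binary search each take about log p steps when the answer sits at index p, which is where the 2 log p bound comes from.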
What is In-Order traversal and how do you implement it?
In-order traversal means to "visit" the left branch, then the current node, and then the right branch
ArrayQueue
Index 0 is the front. Elements are enqueued at the rear, and rear is incremented. Enqueue is O(1).
B-Trees
Items are stored in leaves. The root is either a leaf, or it will have between two and M children. All non-leaf nodes will have between M/2 and M children. All leaves will be at the same depth and store between L/2 and L data values where we are free to choose L. Useful for data storage, searching a database/sorted files. Time complexity of logN.
Stable Sorting Algorithm
Items with the same key are sorted based on their relative position in the original permutation
Bradford's law
Journals in a field can be divided into three parts, each with about one-third of all articles: 1) a core of a few journals, 2) a second zone, with more journals, and 3) a third zone, with the bulk of journals. The number of journals is 1:n:n².
Stack: definition
Last in, first out.
In a binary search tree, the keys on the left are what relative to the parent, and the keys on the right are what relative to the parent?
Left Is Less, Right Is More
The number of edges on the path from the root node to n.
Level
Shell sort
Like insertion sort, but compares items separated by larger increments. Challenge: picking the right increment sequence; the most popular is h = 3*h + 1. Average is O(N(logN)^2); however, the worst-case performance is not significantly worse than the average performance.
Queue
Linear collection that has openings on both ends. FIFO. Elements removed the same order they were added in. Elements added at the rear and removed from front. Should have enqueue(), dequeue()
Stack
Linear collection whose elements are added and removed from one end. LIFO. Should have push(), pop() and peek()
Kruskal's Algorithm
MST Builder/Greedy Algorithm which works by taking edges in order of their weight values, continuously adding edges to the tree if their addition doesn't create a cycle. Is generally slower than the other prominent Greedy Algorithm due to its need to check whether or not an edge is part of a cycle at each phase. Time complexity ElogE
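One common way to do the cycle check is a union-find (disjoint set) structure: an edge creates a cycle exactly when both endpoints are already in the same component. The sketch below assumes vertices numbered 0..n-1 and edges given as `(weight, u, v)` tuples; both conventions, and the name `kruskal`, are choices made for this example.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, mst_edges) for an undirected graph.

    edges: iterable of (weight, u, v) with vertices 0..num_vertices-1.
    """
    parent = list(range(num_vertices))

    def find(x):
        # Follow parent links to the component root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    mst = []
    for weight, u, v in sorted(edges):  # edges in order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # different components: no cycle
            parent[root_u] = root_v     # join the two trees
            total += weight
            mst.append((u, v, weight))
    return total, mst
```

With union-find the cycle check is nearly constant time per edge, so sorting the edges is the dominant E log E cost.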
Prim's Algorithm
MST builder/Greedy Algorithm which works by taking a starting vertex and then successively adding the neighbor vertices which have the lowest cost of addition and don't create cycles upon their addition. Time complexity ElogE
Dijkstra's algorithm
Marking nodes as they are added to tree, update reachable unmarked nodes with weight from beginning. Take smallest total weight. Repeat.
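A minimal sketch of that loop, using a binary heap to pick the unmarked node with the smallest total weight. The adjacency-dict format and the name `dijkstra` are assumptions for the example, and edge weights are assumed to be nonnegative.

```python
import heapq
import math


def dijkstra(graph, source):
    """graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
    Every vertex must appear as a key. Weights must be nonnegative."""
    dist = {v: math.inf for v in graph}
    dist[source] = 0
    marked = set()            # nodes already added to the tree
    heap = [(0, source)]      # (total weight from source, vertex)
    while heap:
        d, v = heapq.heappop(heap)  # smallest total weight first
        if v in marked:
            continue          # stale heap entry
        marked.add(v)
        for u, w in graph[v]:
            # Update reachable unmarked nodes with their weight from the start.
            if u not in marked and d + w < dist[u]:
                dist[u] = d + w
                heapq.heappush(heap, (dist[u], u))
    return dist
```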
Heap Binary Tree: max-heapify, build-max-heap, heap-sort
Max-heapify: O(lg n) Build-max-heap: O(n) Heap-sort: O(n lg n)
Binary Search Tree: max, min, successor, predecessor
Max: O(h) Min: O(h) Successor: O(h) Predecessor: O(h)
Red Black Tree: max, min, successor, predecessor
Max: O(lg n) Min: O(lg n) Successor: O(lg n) Predecessor: O(lg n)
symmetric set difference
Members which are in either set, but not in both. That is, for sets A and B, it is (A - B) ∪ (B - A).
Static Memory
Memory allocated to an array, which cannot grow or shrink once declared.
Contiguous Memory
Memory that is "side-by-side" in a computer, typical of an array structure.
Array: memory
Memory: O(n)
Binary Search Tree: memory
Memory: O(n)
Direct-access table: memory
Memory: O(n)
Heap Binary Tree: memory
Memory: O(n)
optimal merge
Merge n sorted sequences of different lengths into one output while minimizing reads. Only two sequences can be merged at once. At each step, the two shortest sequences are merged.
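The "merge the two shortest" rule can be illustrated by tracking only the sequence lengths in a min-heap. This sketch assumes the cost of one merge is the combined length of the two inputs; the name `optimal_merge_cost` is made up for the example.

```python
import heapq


def optimal_merge_cost(lengths):
    """Total number of items read when always merging the two shortest
    sequences, given only the sequence lengths."""
    heap = list(lengths)
    heapq.heapify(heap)
    cost = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)   # shortest sequence
        b = heapq.heappop(heap)   # second shortest
        cost += a + b             # merging reads every item of both
        heapq.heappush(heap, a + b)
    return cost
```

For lengths 2, 3, and 4, the sketch merges 2 and 3 first (cost 5), then 5 and 4 (cost 9), for a total of 14.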
simple merge
Merge n sorted streams into one output stream. All the stream heads are compared, and the head with the least key is removed and written to the output. This is repeated until all streams are empty.
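A direct sketch of this description, comparing all stream heads on every step with a linear scan. The generator name `simple_merge` and the `_EMPTY` sentinel are illustrative choices, not part of any library.

```python
_EMPTY = object()  # sentinel marking an exhausted stream


def simple_merge(streams):
    """Yield the items of several sorted iterables in sorted order."""
    iterators = [iter(s) for s in streams]
    heads = [next(it, _EMPTY) for it in iterators]
    while True:
        best = None
        # Compare all stream heads and remember the one with the least key.
        for i, head in enumerate(heads):
            if head is not _EMPTY and (best is None or head < heads[best]):
                best = i
        if best is None:
            return  # every stream is empty
        yield heads[best]
        heads[best] = next(iterators[best], _EMPTY)
```

Each output item costs one scan over the n heads; replacing the scan with a heap would reduce that to log n per item.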
MST
Minimum spanning tree. The least-weight set of edges that connects all nodes. No cycles.
nand
Negated conjunction: 0 NAND 0 = 1, 0 NAND 1 = 1, 1 NAND 0 = 1, 1 NAND 1 = 0.
nor
Negated disjunction: 0 NOR 0 = 1, 0 NOR 1 = 0, 1 NOR 0 = 0, 1 NOR 1 = 0.
not
Negation: NOT 0 = 1, NOT 1 = 0. Also known as complement.
Root
Node at base of tree so has no parent.
Radix Sort
Non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts.
3-Way Quick Sort
Non-stable, in place sort with an order of growth between N and NlogN. Needs lgN of extra space. Is probabilistic and dependent on the distribution of input keys.
Quick Sort
Non-stable, in place sort with an order of growth of NlogN. Needs lgN of extra space. It has a probabilistic guarantee. Works by making use of a divide and conquer method. The array is divided into two parts, and then the parts are sorted independently. An arbitrary value is chosen as the partitioning item. Afterwards, all items which are larger than this value go to the right of it, and all items which are less than this value go to the left of it. We arbitrarily choose a[lo] as the partitioning item. Then we scan from the left end of the array one by one until we find an entry that is greater than a[lo]. At the same time, we scan from the right end of the array toward the left to find an entry that is less than or equal to a[lo]. Once we find these two values, we swap them.
Selection Sort
Non-stable, in place sort. Has an N-squared order of growth, needs only one spot of extra space. Works by searching the entire array for the smallest item, then exchanging it with the first item in the array. Repeats this process down the entire array until it is sorted.
What Does The Nodes Last Pointer Refer To In A Single Linked List?
Null
Load Factor (LF)
Number of items/Table size. For instance, a load factor of 1 means 100% of the table slots are used.
Height of tree
Number of nodes from longest path from root to leaf
Depth of a node
Number of nodes on the path from root to the node
What is the big O of interpolation search?
O(n) in the worst case; O(log log n) on average when the keys are uniformly distributed.
Linked structure
Object reference variables to link one object to another. An object reference can be seen as a pointer. Each object is known as a node.
edit operation
On a string, the operation of deletion, insertion, or substitution performed on a single symbol. On a tree, the deletion of a node v followed by the reassignment of all children of v to the node of which v was formerly a child, the insertion of a new node followed by the reassignment of some arcs departing from the parent of the new node, or the substitution of the label of one of the nodes with another label. Each edit operation may have an associated nonnegative real number representing its cost.
The number of edges in a tree
One less than the number of vertices
canonical complexity class
One of the classes defined by logarithmic, polynomial, and exponential bounds on time and space, for deterministic and nondeterministic machines. These classify most of the important computational problems.
LinkedNode
Operations of a linkedNode: T getElement() setElement(T element) LinearNode<T> getNext() setNext(LinearNode<T> node)
Queue
Organizes entries according to the order in which they were added. First in - First out.
strip packing
Pack a set of rectangles into a strip of width 1 to minimize the height used. Rectangles may not overlap or be rotated. Without loss of generality, the height of rectangles is at most 1. This is NP-hard.
B-Trees
Popular in disk storage. Keys are in nodes. Data is in the leaves.
We recursively do a _____ of the left subtree and the right subtree followed by a visit to the root node.
Postorder Traversal
What are the three ways of tree traversal?
Preorder, Inorder, and PostOrder
Quadratic Probing:
Probe sequence is (h(k) + i^2) mod table size for i = 0, 1, 2, ... Reduces clustering by spreading items more widely across the table.
partially dynamic graph problem
Problem where the update operations include either edge insertions (incremental) or deletions (decremental).
level-order traversal
Process all nodes of a tree by depth: first the root, then the children of the root, etc. Equivalent to a breadth-first search from the root.
preorder traversal
Process all nodes of a tree by processing the root, then recursively processing all subtrees.
in-order traversal
Process all nodes of a tree by recursively processing the left subtree, then processing the root, and finally the right subtree.
gnome sort
Put items in order by comparing the current item with the previous item. If they are in order, move to the next item (or stop if the end is reached). If they are out of order, swap them and move to the previous item. If there is no previous item, move to the next item.
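The rule above translates almost line for line into code. A minimal sketch, with `gnome_sort` as an assumed name, that sorts a copy of the input:

```python
def gnome_sort(items):
    """Return a sorted copy of items using gnome sort."""
    a = list(items)
    i = 1
    while i < len(a):
        if i == 0 or a[i - 1] <= a[i]:
            # No previous item, or the pair is in order: move forward.
            i += 1
        else:
            # Out of order: swap and step back to recheck.
            a[i - 1], a[i] = a[i], a[i - 1]
            i -= 1
    return a
```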
reservoir sampling
Randomly select k items from a stream of items of unknown length. Save the first k items in an array of size k. For each item j, j > k, choose a random integer M from 1 to j (inclusive). If M ≤ k, replace item M of the array with item j.
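The steps above can be sketched as below. The function name `reservoir_sample` and the optional `rng` parameter (any object with a `randint` method, such as a `random.Random` instance) are assumptions for the example.

```python
import random


def reservoir_sample(stream, k, rng=random):
    """Return up to k items chosen uniformly at random from an iterable
    of unknown length."""
    reservoir = []
    for j, item in enumerate(stream, start=1):
        if j <= k:
            reservoir.append(item)     # save the first k items
        else:
            m = rng.randint(1, j)      # random integer from 1 to j, inclusive
            if m <= k:
                reservoir[m - 1] = item  # replace item M of the array
    return reservoir
```

Passing a seeded generator, for example `reservoir_sample(range(100), 5, random.Random(0))`, makes a run repeatable.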
What Is Recursion?
Recursion in computer science is a method where the solution to a problem depends on solutions to smaller instances of the same problem (as opposed to iteration).
Association Relation
Relation between 2 separate classes.
Realization Relation
Relationship between a class and the interface it implements.
CCS
Robin Milner's algebraic theory to formalize the notion of concurrent computation. An acronym for Calculus of Communicating Systems.
Calculus of Communicating Systems
Robin Milner's algebraic theory to formalize the notion of concurrent computation. Commonly known as CCS.
memoization
Save (memoize) a computed answer for possible later reuse, rather than recomputing the answer.
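As a small illustration, here is a Fibonacci function memoized with the standard library's `functools.lru_cache`; the choice of Fibonacci as the example is ours, not the card's.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n):
    """n-th Fibonacci number; each distinct n is computed only once."""
    if n < 2:
        return n
    # Both recursive calls hit the cache after their first evaluation.
    return fib(n - 1) + fib(n - 2)
```

Without the cache the recursion takes exponential time; with it, `fib(50)` needs only about 50 distinct evaluations.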
What Is Quick Sort?
Say we have the array 6-8-2-4-3-8-1. The first thing we do is choose a pivot; we choose 4, so we will end up with a left part and a right part. Then we scan from the left: the first element is 6, and since 6 > 4 we stop. Next we scan from the right: the last element is 1, and since 1 < 4 we stop. We swap the two, giving 1-8-2-4-3-8-6. We keep scanning and swapping until the scans meet. Once finished, we repeat the process on each smaller part until the whole array is sorted.
Compute XOR of every bit in an integer
Similar to addition, XOR is associative and commutative, so we need to XOR every bit together. First, XOR the top half with the bottom half. Then, XOR the top quarter of the bottom half with the bottom quarter of the bottom half... x ^= x >> 32; x ^= x >> 16; x ^= x >> 8; x ^= x >> 4; x ^= x >> 2; x ^= x >> 1; x = x & 1
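The same folding steps in Python, as a sketch. Python integers are unbounded, so the input is first masked to 64 bits; the name `parity64` is an assumption for the example.

```python
def parity64(x):
    """XOR of all 64 bits of x: 1 if an odd number of bits are set, else 0."""
    x &= 0xFFFFFFFFFFFFFFFF          # treat x as a 64-bit unsigned value
    for shift in (32, 16, 8, 4, 2, 1):
        x ^= x >> shift              # fold the top half onto the bottom half
    return x & 1                     # the lowest bit now holds the result
```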
dynamic programming
Solve an optimization problem by caching subproblem solutions (memoization) rather than recomputing them.
DAG shortest paths
Solve the single-source shortest-path problem in a weighted directed acyclic graph by 1) doing a topological sort on the vertices by edge so vertices with no incoming edges are first and vertices with only incoming edges are last, 2) assign an infinite distance to every vertex (dist(v)=∞) and a zero distance to the source, and 3) for each vertex v in sorted order, for each outgoing edge e(v,u), if dist(v) + weight(e) < dist(u), set dist(u)=dist(v) + weight(e) and the predecessor of u to v.
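A sketch of the three steps. It assumes the graph is a dict mapping every vertex to a list of `(neighbor, weight)` pairs, and it uses Kahn's algorithm for the topological sort, which is one of several valid choices; the name `dag_shortest_paths` is made up for the example.

```python
import math
from collections import deque


def dag_shortest_paths(graph, source):
    """Return (dist, pred) for a weighted DAG.

    graph: dict vertex -> list of (neighbor, weight); every vertex is a key.
    """
    # Step 1: topological sort (Kahn's algorithm).
    indegree = {v: 0 for v in graph}
    for v in graph:
        for u, _ in graph[v]:
            indegree[u] += 1
    queue = deque(v for v in graph if indegree[v] == 0)
    order = []
    while queue:
        v = queue.popleft()
        order.append(v)
        for u, _ in graph[v]:
            indegree[u] -= 1
            if indegree[u] == 0:
                queue.append(u)

    # Step 2: infinite distance everywhere, zero at the source.
    dist = {v: math.inf for v in graph}
    pred = {v: None for v in graph}
    dist[source] = 0

    # Step 3: relax every outgoing edge, visiting vertices in sorted order.
    for v in order:
        for u, w in graph[v]:
            if dist[v] + w < dist[u]:
                dist[u] = dist[v] + w
                pred[u] = v
    return dist, pred
```

Because each vertex and edge is handled a constant number of times, the sketch runs in O(V + E), and negative edge weights are fine since the graph has no cycles.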
extremal
Some of the entries of the auxiliary array used in a string matching algorithm. An entry is d-extremal if it is the deepest entry on its diagonal to be given value d.
Topological sort
Some things have to come before others. Ex: getting dressed, course prereqs. Not necessarily unique.
Mergesort
Stable sort which is not in place. It has an order of growth of NlogN and requires N amount of extra space. Works by dividing an array in half continuously into smaller and smaller arrays. At the lowest level, these arrays are sorted and then merged together after sorting in the reverse order they were divided apart in.
Tree Sort
Stable, O(n log n), Ω(n log n) : Put everything in the tree, traverse in-order.
Merge Sort
Stable, O(n log n), Ω(n log n): Use recursion to split arrays in half repeatedly. An array with size 1 is already sorted.
Bubble Sort
Stable, O(n^2), Ω(n) : Compares neighboring elements to see if sorted. Stops when there's nothing left to sort.
Insertion Sort
Stable, O(n^2), Ω(n) : Swapping elements one at a time starting at the beginning.
Linear Probing:
Step size is 1. Find the index, and keep incrementing by one until you find a free space.
How do you insert a value within the hash table?
Table[Hash(key)]=data;
Kruskal's
Takes edges in sorted order by cost, creates many trees which join into one large tree.
sparsification
Technique for designing dynamic graph algorithms, which when applicable transform a time bound of T(n,m) onto O(T(n,n)), where m is the number of edges and n is the number of vertices of the given graph.
B-tree
That tree where you have like 5 keys in a node and 6 offshoots
Hamming Weight
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used (also called the population count, popcount or sideways sum). Algorithm: - Count the number of pairs, then quads, then octs, etc, adding and shifting. v = v - ((v>>1) & 0x55555555); v = (v & 0x33333333) + ((v>>2) & 0x33333333); int count = ((v + (v>>4) & 0xF0F0F0F) * 0x1010101) >> 24;
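The same pair/quad/byte counting, transcribed to Python as a sketch. Since Python integers do not wrap at 32 bits, the input and the final multiplication are masked explicitly; the name `popcount32` is an assumption for the example.

```python
def popcount32(v):
    """Number of 1 bits in the low 32 bits of v."""
    v &= 0xFFFFFFFF
    v = v - ((v >> 1) & 0x55555555)                  # counts per 2-bit pair
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)   # counts per 4-bit group
    v = (v + (v >> 4)) & 0x0F0F0F0F                  # counts per byte
    # Multiplying sums the four byte counts into the top byte.
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24
```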
What Is Big O Notations?
The study of the worst-case scenario of the code we wrote: an upper bound on how running time or memory grows with input size.
NIST
The United States National Institute of Standards and Technology.
global optimum
The best possible solution to a problem.
lucky sort
The best possible sort algorithm: it is so lucky that the input is already sorted, and it need do nothing!
Minimum Product Spanning Tree
The cost of a tree is the product of all the edge weights in the tree, instead of the sum of the weights. Since log(a*b) = log(a) + log(b), the minimum spanning tree on a graph whose edge weights are replaced with their logarithms gives the minimum product spanning tree on the original graph.
relational structure
The counterpart in formal logic of a data structure or class instance in the object-oriented sense. Examples are strings, directed graphs, and undirected graphs. Sets of relational structures generalize the notion of languages as sets of strings.
degree
The degree of a vertex is the number of edges incident on it.
Path
A sequence of vertices in which each consecutive pair is connected by an edge, or link.
suffix
The end characters of a string. More formally a string v is a suffix of a string u if u=u'v for some string u'.
distributional complexity
The expected running time of the best possible deterministic algorithm over the worst possible probability distribution on the inputs.
factorial
The factorial of an integer n ≥ 0, written n!, is n × (n-1) × ... × 2 × 1. In particular, 0! = 1.
Shell sort
The first diminishing increment sort. On each pass i sets of n/i items are sorted, typically with insertion sort. On each succeeding pass, i is reduced until it is 1 for the last pass. A good series of i values is important to efficiency.
Lm distance
The generalized distance between two points. In a plane with point p1 at (x1, y1) and p2 at (x2, y2), it is (|x1 - x2|^m + |y1 - y2|^m)^(1/m).
intersection
The intersection of two sets is a set having those members which are in both sets.
complexity
The intrinsic minimum amount of resources, for instance, memory, time, messages, etc., needed to solve a problem or execute an algorithm.
asymptotic space complexity
The limiting behavior of the use of memory space of an algorithm when the size of the problem goes to infinity. This is usually denoted in big-O notation.
moderately exponential
The measure of computation, m(n) (usually execution time or memory space), is more than any polynomial n^k, but less than any exponential c^n where c > 1. Formally, m(n) is of moderately exponential growth if ∀ k > 0 m(n)=Ω(n^k) and ∀ ε > 0 m(n)=o((1+ε)^n).
Lotka's law
The number of authors making n contributions is about 1/n^a of those making one contribution, where a is often nearly 2.
in-degree
The number of edges coming into a vertex in a directed graph.
Degree of a Vertex
The number of edges incident on the vertex, with loops counted twice
Length of a path
The number of edges that composes it.
Post-Order Traversal
The process of systematically visiting every node in a tree once, starting at the root and proceeding left down the tree, accessing the first node encountered at its "right" side, proceeding likewise along the tree, accessing each node as encountered at its "right" side.
What is a balanced binary tree?
The same number of nodes in the left and right subtrees of the root.
candidate verification
The stage of two-dimensional matching where candidate occurrences of the pattern are actually tested.
next state
The state immediately following the current state, defined by the transition function of a finite state machine and the input.
circuit complexity
The study of the size, depth, and other attributes of circuits that decide specified languages or compute specified functions.
average-case cost
The sum of costs of an algorithm over all possible inputs divided by the number of possible inputs.
primary clustering
The tendency for some collision resolution schemes to create long runs of filled slots near the hash function position of keys.
amortized cost
The theoretical speed of a given set of operations. It is O(f(n)) when the execution time of the worst case of all sequences of n operations never exceeds O(n*f(n)).
work
The total number of operations taken by a computation.
length of a path
The total number of vertices in the vertex sequence defining the path - 1.
signature
The types or domains, and order, in some representations, of inputs to and outputs from a function.
target
The vertex which an edge of a directed graph enters.
Articulation Vertex
A vertex whose removal disconnects the graph; the weakest point in a graph
Strongly Connected Graphs
A directed graph in which there is a directed path from every vertex to every other vertex.
square root
This describes a "long hand" or manual method of calculating or extracting square roots. Calculation of a square root by hand is a little like long-hand division.
Depth-First Traversal
This kind of traversal fully explores one subtree before exploring another. Traversal follows a path that descends the levels of a tree as deeply as possible until it reaches a leaf.
coarsening
To alter a problem, typically restricting it to a less complex feasible region or objective function, so that the resulting problem can be efficiently solved, usually by dynamic programming.
external merge
To combine multiple sorted data streams into a single sorted stream using external storage.
visible
Two points p and q are visible if the straight line segment between them does not intersect any other object, edge, etc.
prisoner's dilemma
Two prisoners are questioned separately about a crime they committed. Each may give evidence against the other or may say nothing. If both say nothing, they get a minor reprimand and go free because of lack of evidence. If one gives evidence and the other says nothing, the first goes free and the second is severely punished. If both give evidence, both are severely punished. The overall (globally) best strategy is for both to say nothing. However not knowing (or trusting) what the other will do, each prisoner's (locally) best strategy is to give evidence, which is the worst possible outcome. In general, a situation where local optimization leads to the worst possible outcome globally.
Double Rotation
Two single rotations at different locations, either right-left or left-right. The first rotation is deeper than the second.
adjacent
Two vertices of a graph are adjacent if there is an edge between them. Two edges of a graph are adjacent if they connect the same vertex.
Quick Sort
Unstable, O(n log n) for a good pivot, O(n^2) for a bad pivot, Ω(n log n): Uses partitioning, which is O(n). Pick the median of the 1st, middle, and last elements as the pivot. Random selection is also good, but expensive. The algorithm can be slow because of many function calls.
Selection Sort
Unstable, O(n^2), Ω(n^2) : Iterates through every element to ensure the list is sorted.
hashbelt
Use a short list or array of hash tables to implement a hash table with aging to expire items. To expire items, add a new table at the head of the list and drop the oldest table, along with its contents. To find an item, search all the tables.
Depth-first search
Use a stack or recursion to search tree
Linear Probing
Used to solve collisions.
Open addressing
Uses probes to find an open location to store data.
star encoding
Using a fixed dictionary, encode words in text with strings having many repeated characters, typically an asterisk or "star" (*).
Cupif-Giannini tree traversal
Visit every leaf of a perfect binary tree with maximum dispersion (see note). For a tree of height n, use an n-bit "count" integer. The least significant bit of count indicates whether to go to the left or right child from the root. Each more significant bit indicates whether to go left or right. Regular binary counting generates the list of 2^n paths to the leaves.
Inorder Traversal
Visits the root of a binary tree between visiting the nodes in the root's subtrees. Order: Visit all the nodes in the root's left subtree. Visit the root. Visit all the nodes in the right subtree.
clustering free
When a collision resolution scheme spreads out entries in a hash table.
Activation Record/Frame
When a method is called, the program's run-time environment creates an object called the activation record/frame
Program Counter
When a program executes, a special location called the program counter references the current instruction.
Hash Index
When a search key maps or hashes to the index I. I is the hash index.
Balanced binary trees.
When each node in a binary tree has two subtrees whose heights are exactly the same.
dynamic
When the problem domain may change, e.g., there may be insertions and deletions.
Exception handling
When there's an error, the program makes an error object and passes it off to the runtime system, which looks for a method in the call stack to handle it.
collision
When two or more items should be kept in the same location, especially in hash tables, that is, when two or more different keys hash to the same value.
How do you delete a value within the hash table?
You just set Table[hash(Key)] = null
Stacks and Queues
Abstract data structures. Complexity: O(1)
Set methods
add() remove() removeAll() contains() equals() isSubset() pick() removeRandom() iterator()
If a graph's edges are unordered [ (u,v) == (v,u)], then the vertices u and v are
adjacent
Ω-notation
asymptotic lower bound
DAG
directed acyclic graph
If a graph's edges are ordered [ (u,v) != (v,u)], then the edge (u,v) is _ from _ to _
directed from u to v
Circular queue - dequeue
front= (front+1)%queue.length
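A compact sketch of a fixed-capacity circular queue built around that modular update. The class name `CircularQueue` and the choice to track `front` plus a count (rather than a separate rear index) are assumptions for the example.

```python
class CircularQueue:
    """Fixed-capacity FIFO queue stored in a circular array."""

    def __init__(self, capacity):
        self._items = [None] * capacity
        self._front = 0   # index of the oldest element
        self._count = 0   # number of stored elements

    def enqueue(self, item):
        if self._count == len(self._items):
            raise OverflowError("queue is full")
        # The rear slot wraps around the end of the array.
        rear = (self._front + self._count) % len(self._items)
        self._items[rear] = item
        self._count += 1

    def dequeue(self):
        if self._count == 0:
            raise IndexError("queue is empty")
        item = self._items[self._front]
        self._items[self._front] = None
        # front = (front + 1) % queue.length, as on the card.
        self._front = (self._front + 1) % len(self._items)
        self._count -= 1
        return item
```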
What does a balanced tree mean?
Not terribly imbalanced; it's balanced enough to ensure O(log n)
Node height
number of edges on the longest downward path between node and a leaf
Node depth
number of edges on the path between the root and the node
Height of binary tree
number of edges on the longest downward path between the root and a leaf log(n) - complete binary tree
Binary Search tree complexity
O(log₂ n) time complexity for add, find, and remove when the tree is balanced; O(n) in the worst case.
Recursive Algorithms
solve a problem by solving smaller internal instances of a problem -- work towards a base case.
What does the node class store in a singly linked list?
value, and reference
What kind of Collection is Hashing?
value-orientated.
Binary Search Tree: property
value[left[x]] <= value[x], value[right[x]] >= value[x]
Isolate the lowest bit that is 1 in x
x & ~(x - 1)
Right propagate the rightmost set bit in x
x | ((x & ~(x - 1)) - 1)
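Both bit tricks can be checked quickly in Python. In this sketch the isolated bit is parenthesized before subtracting 1, because in C-like precedence subtraction binds tighter than `&`. The function names are illustrative, and x is assumed to be a positive integer.

```python
def lowest_set_bit(x):
    """Isolate the lowest bit that is 1 in x (x > 0)."""
    return x & ~(x - 1)


def right_propagate(x):
    """Set every bit below the rightmost set bit of x (x > 0)."""
    lowest = x & ~(x - 1)      # the rightmost set bit on its own
    return x | (lowest - 1)    # lowest - 1 is a mask of all bits below it
```

For x = 0b01010000, the first function gives 0b00010000 and the second gives 0b01011111.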
Typical runtime of a recursive function with multiple branches
O( branches^depth )
Binary Operator
When an operator has two operands ie: a + b
minimal perfect hashing
A perfect hashing function that maps each different key to a distinct integer and has the same number of possible integers as keys.
edge connectivity
(1) The smallest number of edges whose deletion will cause a connected graph to not be connected. (2) For a pair of vertices s and t in a graph, the smallest number of edges whose deletion will separate s from t.
ideal random shuffle
A permutation algorithm, or shuffle, that has exactly the same chance of producing any permutation.
subadditive ergodic theorem
If a stationary and ergodic process satisfies the subadditive inequality, it grows almost surely linearly in time.
vertex cover
A set of vertices in an undirected graph where every edge connects at least one vertex. The vertex cover problem is to find a minimum size set and is NP-complete.
poset
A set the elements of which are subject to a partial order.
disjoint set
A set whose members do not overlap, are not duplicated, etc.
digital tree
A tree for storing strings in which nodes are organized by substrings common to two or more strings.
trie
A tree for storing strings in which there is one node for every common prefix. The strings are stored in extra leaf nodes.
What Is A Binary Tree?
A tree in which each node has no more than two children, so a node can have 0, 1, or 2 children. The tree has only one root node.
Levenshtein distance
(1) The smallest number of insertions, deletions, and substitutions required to change one string or tree into another. (2) A Θ(m × n) algorithm to compute the distance between strings, where m and n are the lengths of the strings.
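For strings, the Θ(m × n) computation is usually done with dynamic programming over prefixes. A minimal sketch that keeps only two rows of the table, with `levenshtein` as an assumed name:

```python
def levenshtein(s, t):
    """Smallest number of insertions, deletions, and substitutions
    needed to change string s into string t."""
    # prev[j] is the distance between the processed prefix of s and t[:j].
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, start=1):
        curr = [i]  # turning s[:i] into the empty string takes i deletions
        for j, ct in enumerate(t, start=1):
            cost = 0 if cs == ct else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]
```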
Full Tree
A tree in which every level of the tree is completely full, with no missing nodes.
vertex connectivity
(1) The smallest number of vertices whose deletion causes a connected graph to not be connected. (2) For a pair of vertices s and t in a graph, the smallest number of vertices whose deletion will separate s from t.
segment
(1) The substring of a pattern delimited by two don't cares or one don't care and beginning or end of the pattern. (2) A substring.
layered graph
A connected graph where "layers" L_0 ... L_k partition the vertices. Each edge, which has a nonnegative integral weight, connects only vertices in successive layers. The width is the greatest number of vertices in any layer, i.e., max_{i=0..k} |L_i|.
extreme point
A corner point of a polyhedron. More formally, a point which cannot be expressed as a convex combination of other points in the polyhedron.
search tree property
When the key of every node of a binary tree is larger than the key of its left child and smaller than its right child.
witness
(1) a structure providing an easily verified bound on the optimal value of an optimization problem. Typically used in the analysis of an approximation algorithm to prove the performance guarantee. (2) a mismatch of two symbols of string y at a distance of d is a "witness" to the fact that in no subject y could occur twice at a distance of exactly d positions (equivalently, that d cannot be a period of y).
What Is A Stack?
A LIFO, last-in first-out data structure.
What Is A Linked List?
A linear data structure with a node class whose nodes are linked with pointers.
sublinear time algorithm
An algorithm whose execution time, f(n), grows slower than the size of the problem, n, but only gives an approximate or probably correct answer.
BANG file
A balanced and nested grid (BANG) file is a point access method which divides space into a nonperiodic grid. Each spatial dimension is divided by a linear hash. Cells may intersect, and points may be distributed between them.
Circular Linked Chain
When the last node references the first node so no node contains null in its next field.
AVL tree
A balanced binary search tree where the height of the two subtrees (children) of a node differs by at most one. Look-up, insertion, and deletion are O(log n), where n is the number of nodes in the tree.
balanced two-way merge sort
A balanced k-way merge sort that sorts a data stream using repeated merges. It distributes the input into two streams by repeatedly reading a block of input that fits in memory, a run, sorting it, then writing it to the next stream. It then repeatedly merges the two streams and puts each merged run into one of two output streams until there is a single sorted output.
point access method
A data structure and associated algorithms primarily to search for points defined in multidimensional space.
deque
A data structure in which items may be added to or deleted from the head or the tail.
Binary Tree
A data structure that consists of nodes, with one root node at the base of the tree, and two nodes (left child and right child) extending from the root, and from each child node.
coding tree
A full binary tree that represents a coding, such as produced by Huffman coding. Each leaf is an encoded symbol. The path from the root to a leaf is its codeword.
partial recursive function
A function computed by a Turing machine that need not halt for all inputs.
monotonically decreasing
A function from a partially ordered domain to a partially ordered range such that x ≤ y implies f(x) ≥ f(y).
monotonically increasing
A function from a partially ordered domain to a partially ordered range such that x ≤ y implies f(x) ≤ f(y).
strictly decreasing
A function from a partially ordered domain to a partially ordered range such that x < y implies f(x) > f(y).
B-tree
A balanced search tree in which every node has between m/2 and m children, where m>1 is a fixed integer. m is the order. The root may have as few as 2 children. This is a good structure if much of the tree is in slow memory (disk), since the height, and hence the number of accesses, can be kept small, say one or two, by picking a large m.
reduced basis
A basis for a lattice that is nearly orthogonal.
BDD
A binary lattice data structure that succinctly represents a truth table by collapsing redundant nodes and eliminating unnecessary nodes.
matrix multiplication
A binary operation that takes a pair of matrices. The number of columns in the first matrix must be equal to the number of rows in the second matrix.
irreflexive
A binary relation R for which there is no element a such that a R a.
scapegoat tree
A binary search tree that needs no balance information. Search time is logarithmic, and the amortized cost of update is logarithmic.
discrete interval encoding tree
A binary search tree that stores consecutive values as intervals.
BSP-tree
A binary space partitioning (BSP) tree is a binary tree for multidimensional points where successive levels are split by arbitrary hyperplanes.
What is the difference between a tree, and a binary tree?
A binary tree has up to two children.
BD-tree
A binary tree that organizes multidimensional points by splitting off regular subintervals.
binary search tree
A binary tree where every node's left subtree has keys less than the node's key, and every right subtree has keys greater than the node's key.
balanced binary tree
A binary tree where no leaf is more than a certain amount farther from the root than any other. After inserting or deleting a node, the tree may be rebalanced with "rotations."
stack tree
A binary tree where no node, except the root, has more than one non-leaf child.
BB(alpha) tree
A binary tree where the balance of every subtree, ρ(T'), is bounded by α ≤ ρ(T') ≤ 1-α.
perfect binary tree
A binary tree with all leaf nodes at the same depth. All internal nodes have degree 2.
extended binary tree
A binary tree with special nodes replacing every null subtree. Every regular node has two children, and every special node has no children.
strictly increasing
A function from a partially ordered domain to a partially ordered range such that x < y implies f(x) < f(y).
binary search tree
A binary tree with the property that for all parent nodes, the left subtree contains only values less than the parent, and the right subtree contains only values greater than the parent.
Abstract Data Types
A class considered without regard to its implementation, e.g., stacks, queues, or lists.
transition function
A function of the current state and input giving the next state of a finite state machine or Turing machine.
linear congruential generator
A class of algorithms that are pseudo-random number generators. The next number is generated from the current one by r_{n+1} = (A × r_n + B) mod M, where A and M are relatively prime numbers.
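The recurrence can be sketched as a small Python generator. This is an illustrative sketch only: the function name `lcg` and the toy parameters below are assumptions chosen so the arithmetic is easy to check by hand, not constants recommended for real use.

```python
from itertools import islice

def lcg(seed, a, b, m):
    """Yield r_1, r_2, ... where each value is (a * previous + b) mod m."""
    r = seed
    while True:
        r = (a * r + b) % m
        yield r

# Toy parameters (a = 5 and m = 16 are relatively prime).
sample = list(islice(lcg(seed=7, a=5, b=3, m=16), 3))
print(sample)  # [6, 1, 8]
```

Each output depends only on the previous one, so the same seed always reproduces the same sequence.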
Schorr-Waite graph marking algorithm
A class of algorithms to mark all reachable nodes in a directed graph by reversing pointers on the way down, then restoring them upon leaving. It uses only a few bits of extra space per node and a few work pointers.
Ackermann's function
A function of two parameters whose value grows very fast.
inverse Ackermann function
A function of two parameters whose value grows very, very slowly.
space-constructible function
A function s(n) that gives the actual space used by some Turing machine on all inputs of length n.
time-constructible function
A function t(n) that gives the actual running time of some Turing machine on all inputs of length n.
constant function
A function that always gives the same value.
computable
A function that can be computed by an algorithm --- equivalently, by a Turing machine.
uncomputable function
A function that cannot be computed by any algorithm --- equivalently, not by any Turing machine.
hash function
A function that maps keys to integers, usually to get an even distribution on a smaller set of values.
predicate
A function that returns true or false. Conceptually it tests for a condition.
Hash Function
A function that takes in the key to compute a specific Hash Index.
commutative
A function where f(A, B) = f(B, A).
associative
A function where f(A, f(B, C)) = f(f(A, B), C).
total function
A function which is defined for all inputs of the right type, that is, for all of a domain.
partial function
A function which is not defined for some inputs of the right type, that is, for some of a domain. For instance, division is a partial function since division by 0 is undefined (on the Reals).
unary function
A function which takes one argument.
boolean function
A function whose range is {0, 1}. It can be understood to evaluate the truth or falsity of each element of its domain.
trinary function
A function with three arguments.
semidefinite programming
A generalization of a linear program in which any subset of the variables may be constrained to form a semidefinite matrix.
GBD-tree
A generalized BD-tree, hence the name, which stores spatially extended objects as a hierarchy of minimum bounding boxes. It is a balanced multiway tree which serves as a spatial access method.
orthogonal drawing
A graph drawing in which each edge is represented by a polyline, each segment of which is parallel to a coordinate axis.
straight-line drawing
A graph drawing in which each edge is represented by a straight line segment.
grid drawing
A graph drawing in which each vertex is represented by a point with integer coordinates.
weighted graph
A graph having a weight, or number, associated with each edge. Some algorithms require all weights to be nonnegative, integral, positive, etc.
multigraph
A graph whose edges are unordered pairs of vertices, and the same pair of vertices can be connected by multiple edges.
acyclic graph
A graph with no cycles.
acyclic graph
A graph with no path that starts and ends at the same vertex.
weighted graph
A graph with numbers assigned to its edges.
dense graph
A graph with very few possible edges missing
formal methods
A group of analytical approaches having mathematically precise foundation which can serve as a framework or adjunct for human engineering and design skills and experience.
complete tree
A tree in which every level, except possibly the deepest, is entirely filled. At depth n, the height of the tree, all nodes are as far left as possible.
perfect hashing
A hash function that maps each different key to a distinct integer. Usually all possible keys must be known beforehand. A hash table that uses a perfect hash has no collisions.
Pearson's hash
A hash function that uses an auxiliary array, but no shift or exclusive-or (xor) operations.
multiplication method
A hash function that uses the first p bits of the key times an irrational number.
linear probing
A hash table in which a collision is resolved by putting the item in the next empty place in the array following the occupied place. Even with a moderate load factor, primary clustering tends to slow retrieval.
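A minimal sketch of linear probing, assuming a fixed-capacity table with no deletion or resizing. The class name `LinearProbingTable` and its methods are hypothetical names used only for illustration.

```python
class LinearProbingTable:
    def __init__(self, capacity=8):
        # Each slot is either None (empty) or a (key, value) pair.
        self.slots = [None] * capacity

    def _probe(self, key):
        """Yield slot indices from the home slot onward, wrapping around once."""
        start = hash(key) % len(self.slots)
        for step in range(len(self.slots)):
            yield (start + step) % len(self.slots)

    def put(self, key, value):
        """Store the pair in the first empty (or matching) slot; return its index."""
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return i
        raise RuntimeError("table is full")

    def get(self, key):
        for i in self._probe(key):
            if self.slots[i] is None:
                break  # an empty slot ends the probe sequence
            if self.slots[i][0] == key:
                return self.slots[i][1]
        raise KeyError(key)

table = LinearProbingTable(capacity=8)
placed = [table.put(k, str(k)) for k in (1, 9, 17)]  # all three keys hash to slot 1
```

The three colliding keys land in adjacent slots, which is the start of the primary clustering the definition mentions.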
extendible hashing
A hash table in which the hash function is the last few bits of the key and the table refers to buckets. Table entries with the same final bits may use the same bucket. If a bucket overflows, it splits, and if only one entry referred to it, the table doubles in size. If a bucket is emptied by deletion, entries using it are changed to refer to an adjoining bucket, and the table may be halved.
Miller-Rabin
A heuristic test for prime numbers. It repeatedly checks if the number being tested, n, is pseudoprime to a randomly chosen base, a, and there are only trivial square roots of 1, modulo n. In other words, n is surely composite if a^(n-1) ≠ 1 (mod n), where 0 < a < n. Some composites may be incorrectly judged to be prime.
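A sketch of the test in Python, under the assumption that a quick trial division by a few small primes is acceptable as a first step. The function name `is_probable_prime` and the default of 20 rounds are illustrative choices, not part of the definition.

```python
import random

def is_probable_prime(n, rounds=20):
    """Return False if n is surely composite, True if n is probably prime."""
    if n < 2:
        return False
    for p in (2, 3, 5, 7):
        if n % p == 0:
            return n == p
    # Write n - 1 as 2^s * d with d odd.
    d, s = n - 1, 0
    while d % 2 == 0:
        d //= 2
        s += 1
    for _ in range(rounds):
        a = random.randrange(2, n - 1)  # random base
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(s - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            # Repeated squaring never reached n - 1, so either a^(n-1) != 1 (mod n)
            # or a nontrivial square root of 1 exists: n is composite.
            return False
    return True
```

A `True` result is probabilistic; more rounds reduce the chance that a composite slips through.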
move-to-root heuristic
A heuristic that moves the target of a search to the root of the search tree so it is found faster next time.
postman's sort
A highly engineered variant of top-down radix sort where attributes of the key are described so the algorithm can allocate buckets and distribute efficiently.
balanced merge sort
A k-way merge sort in which the number of input and output data streams is the same. See balanced k-way merge sort.
Array: advantage, disadvantage
Advantage: quick insert, quick access if index is known Disadvantage: slow search, slow delete, fixed size
binomial heap
A heap made of a forest of binomial trees with the heap property numbered k=0, 1, 2, ..., n, each containing either 0 or 2^k nodes. Each tree is formed by linking two of its predecessors, by joining one at the root of the other. The operations of insert a value, decrease a value, delete a value, and merge or join (meld) two queues take O(log n) time. The find minimum operation is a constant Θ(1).
shadow heap
A heap, implemented in an array, adjacent to an unordered table. The shadow is the table nodes and all their (recursive) parents, by array index, in the heap.
order-preserving minimal perfect hashing
A minimal perfect hashing function for keys in S such that if k1, k2 ∈ S and k1 > k2, then f(k1) > f(k2).
s-t cut
A partitioning of the vertices of a flow network into S and T such that the source is in S and the sink is in T.
open addressing
A class of collision resolution schemes in which all items are stored within the hash table. In case of collision, other positions are computed, giving a probe sequence, and checked until an empty position is found. Some ways of computing possible new positions are less efficient because of clustering. Typically items never move once put in place, but in Robin Hood hashing, LCFS hashing, and other techniques, previously placed items may move.
queue
A collection of items in which only the earliest added item may be accessed. Basic operations are add (to the tail) or enqueue and delete (from the head) or dequeue. Delete returns the item removed. Also known as "first-in, first-out" or FIFO.
stack
A collection of items in which only the most recently added item may be removed. The latest added item is at the top. Basic operations are push and pop. Often top and isEmpty are available, too. Also known as "last-in, first-out" or LIFO.
suffix tree
A compact representation of a trie corresponding to the suffixes of a given string where all nodes with one child are merged with their parents.
Patricia tree
A compact representation of a trie in which any node that is an only child is merged with its parent.
binary heap
A complete binary tree where every node has a key more extreme (greater or less) than or equal to the key of its parent.
heap
A complete tree where every node has a key more extreme (greater or less) than or equal to the key of its parent. Usually understood to be a binary heap.
k-ary heap
A complete tree where every node has a key more extreme (greater or less) than the key of its parent. Each node has k or fewer children.
linear
(1) Any function which is a constant times the argument plus a constant: f(x) = c_1 x + c_0. (2) In complexity theory, the measure of computation, m(n) (usually execution time or memory space), is bounded by a linear function of the problem size, n. More formally m(n) = O(n).
singularity analysis
A complex asymptotic technique for determining the asymptotics of certain algebraic functions.
Lempel-Ziv-Welch
A compression algorithm that codes strings of characters with codes of a fixed number of bits. Every new string in the input is added to a table until it is full. The codes of existing strings are output instead of the strings.
algorithm
A computable set of steps to achieve a desired result.
reduction
A computable transformation of one problem into another.
cycle
A path of positive length that starts and ends at the same vertex and does not traverse the same edge more than once.
relation
A computation which takes some inputs and yields an output. Any particular input may yield different outputs at different times. Formally, a mapping from each element in the domain to one or more elements in the range.
Cycle (graph)
A path that begins and ends at the same vertex
optimization problem
A computational problem in which the object is to find the best of all possible solutions. More formally, find a solution in the feasible region which has the minimum (or maximum) value of the objective function.
simple path
A path that repeats no vertex.
exponential
(1) Any function which is the sum of constants times other constants to the power of the argument: f(x) = Σ_{i=0}^{k} c_i b_i^{x^{p_i}}. (2) In complexity theory, the measure of computation, m(n) (usually execution time or memory space), is bounded by an exponential function of the problem size, n. More formally if there exists k > 1 such that m(n) = Θ(k^n) and there exists c such that m(n) = O(c^n).
polylogarithmic
(1) Any function which is the sum of constants times powers of a logarithm of the argument: f(x) = Σ_{i=0}^{k} c_i log^{p_i} x. (2) In complexity theory, the measure of computation, m(n) (usually execution time or memory space), is bounded by a polylogarithmic function of the problem size, n. More formally m(n) = O(log^k n).
polynomial
(1) Any function which is the sum of constants times powers of the argument: f(x) = Σ_{i=0}^{k} c_i x^{p_i}. (2) In complexity theory, the measure of computation, m(n) (usually execution time or memory space), is bounded by a polynomial function of the problem size, n. More formally m(n) = O(n^k).
solvable
A computational problem that can be solved by a Turing machine. The problem may have a nonbinary output.
unsolvable problem
A computational problem that cannot be solved by a Turing machine. The associated function is called an uncomputable function.
Hamiltonian cycle
A path through a graph that starts and ends at the same vertex and includes every other vertex exactly once.
Euler cycle
A path through a graph which starts and ends at the same vertex and includes every edge exactly once.
augmenting path
A path with alternating free and matched edges that begins and ends with free vertices. Used to augment (improve or increase) a matching or flow.
alternating path
A path with alternating free and matched edges.
nondeterministic algorithm
A conceptual algorithm with more than one allowed step at certain times and which always takes the right or best step. It is not random, as in randomized algorithm, or indeterminate. Rather it has the supercomputational characteristic of choosing the optimal behavior.
BV-tree
A conceptual idea which generalizes B-trees to multiple dimensions. BV-trees are not balanced, and searching may require backtracking.
depth-first search
(1) Any search algorithm that considers outgoing edges (children) of a vertex before any of the vertex's siblings, that is, outgoing edges of the vertex's predecessor in the search. Extremes are searched first. This is easily implemented with recursion. (2) An algorithm that marks all vertices in a directed graph in the order they are discovered and finished, partitioning the graph into a forest.
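A short recursive sketch of sense (1), assuming the graph is given as a dictionary mapping each vertex to a list of its successors. The function name `dfs_order` and the sample graph are made up for illustration.

```python
def dfs_order(graph, start):
    """Return vertices in the order a recursive depth-first search discovers them."""
    order, seen = [], set()

    def visit(v):
        seen.add(v)
        order.append(v)
        for w in graph.get(v, []):
            if w not in seen:
                visit(w)  # go deeper before looking at v's remaining neighbors

    visit(start)
    return order

graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(dfs_order(graph, "a"))  # ['a', 'b', 'd', 'c']
```

Vertex "d" is reached through "b" before the search backs up to "c", which is the "children before siblings" behavior described above.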
k-dimensional
(1) Dealing with or restricted to a space where location can be completely described with exactly k orthogonal axes. (2) Dealing with a space of any number of dimensions.
Perfect Binary Tree
A perfect binary tree is a binary tree in which all interior nodes have two children and all leaves have the same depth or same level.
right rotation
(1) In a binary search tree, pushing a node N down and to the right to balance the tree. N's left child replaces N, and the left child's right child becomes N's left child. (2) In an array, moving all items to the next higher location. The last item is moved to the first location, which is now vacant. (3) In a list, removing the tail and inserting it at the head.
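Senses (1) and (2) can be sketched as follows. The `Node` class and both function names are hypothetical helpers introduced for this example, and the tree version assumes N has a left child.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right

def rotate_right(n):
    """Push n down and to the right; return the new root of this subtree."""
    pivot = n.left          # n's left child replaces n
    n.left = pivot.right    # the left child's right child becomes n's left child
    pivot.right = n
    return pivot

def rotate_list_right(items):
    """Move every item one place higher; the last item wraps to the front."""
    return items[-1:] + items[:-1]

root = rotate_right(Node(5, Node(3, Node(2), Node(4)), Node(7)))
print(root.key, root.left.key, root.right.key, root.right.left.key)  # 3 2 5 4
print(rotate_list_right([1, 2, 3, 4]))  # [4, 1, 2, 3]
```

The in-order sequence of keys (2, 3, 4, 5, 7) is unchanged by the tree rotation, so the search-tree property is preserved.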
boolean
(1) In computer science, entities having just two values: 1 or 0, true or false, on or off, etc. along with the operations and, or, and not. (2) In mathematics, entities from an algebra equivalent to intersection, union, and complement over subsets of a given set.
complement
(1) Of a boolean, 0 if 1, or 1 if 0. See not. (2) Of a set A, a set having all the members which are in the universe, but not in A.
degree
(1) Of a vertex, the number of edges connected to it. (2) Of a graph, the maximum degree of any vertex. (3) Of a tree node, the number of child nodes it has.
uniform hashing
A conceptual method of open addressing for a hash table. A collision is resolved by putting the item in the next empty place given by a probe sequence which is independent of sequences for all other keys.
k-connected graph
A connected graph such that deleting any k-1 vertices (and incident edges) results in a graph that is still connected.
A hash function that maps each item into a unique slot is referred to as what?
A perfect hash function.
sim
(1) Proportional to. (2) Asymptotically equal to. A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) ~ g(n) means it grows at the same rate as g(n). More formally, it means lim_{x → ∞} f(x)/g(x) = 1.
function
(1) A computation which takes some arguments or inputs and yields an output. Any particular input yields the same output every time. More formally, a mapping from each element in the domain to an element in the range. (2) A subroutine which returns a value.
tree
(1) A data structure accessed beginning at the root node. Each node is either a leaf or an internal node. An internal node has one or more child nodes and is called the parent of its child nodes. All children of the same node are siblings. Contrary to a physical tree, the root is usually depicted at the top of the structure, and the leaves are depicted at the bottom. (2) A connected, undirected, acyclic graph. It is rooted and ordered unless otherwise specified.
recursive
(1) A data structure that is partially composed of other instances of the data structure. For instance, a tree is composed of smaller trees (subtrees) and leaf nodes, and a list may have other lists as elements. (2) An algorithm in which functions might call themselves. For instance, quicksort or heapify.
directed acyclic word graph
(1) A directed acyclic graph representing the suffixes of a given string in which each edge is labeled with a character. The characters along a path from the root to a node are the substring which the node represents. (2) A finite state machine that recognizes a set of words.
partition
(1) A division of a set into nonempty disjoint sets that completely cover the set. (2) To rearrange the elements of an array into two (or more) groups, typically, such that elements in the first group are less than a value and elements in the second group are greater.
Christofides algorithm
(1) A heuristic algorithm to find a near-optimal solution to the traveling salesman problem. Step 1: find a minimum spanning tree T. Step 2: find a minimum-weight perfect matching M among vertices with odd degree in T. Step 3: combine the edges of M and T to make a multigraph G. Step 4: find an Euler cycle in G and turn it into a tour by skipping vertices already seen. (2) An algorithm to find the chromatic number of a graph.
metaheuristic
(1) A high-level algorithmic framework or approach that can be specialized to solve optimization problems. (2) A high-level strategy that guides other heuristics in a search for feasible solutions.
cyclic redundancy check
(1) A method to detect and correct errors by adding bits derived from a block or string of bits to the block. (2) An algorithm to compute bits characteristic of a block based on the algebra of polynomials over the integers, modulo 2. (3) The characteristic bits of a block.
Steiner point
(1) A point that is not part of the input set of points, for instance, a point computed to construct a Steiner tree. (2) A point with a particular geometric relation to a triangle.
optimal
(1) A solution to an optimization problem which has the minimum (or maximum) value of the objective function. (2) The time, space, resource, etc. complexity of an algorithm which matches the best known lower bound of a problem.
P-tree
(1) A spatial access method that defines hyperplanes, in addition to the orthogonal dimensions, which node boundaries may parallel. Space is split by hierarchically nested polytopes (multidimensional boxes with nonrectangular sides). The R-tree is a special case that has no additional hyperplanes. (2) A spatial access method that splits space by hierarchically nested polytopes. The R-tree is a special case in which all polytopes are boxes.
R-tree
(1) A spatial access method that splits space with hierarchically nested, and possibly overlapping, boxes. The tree is height-balanced. (2) A recursion tree.
matching
(1) A subgraph in which every vertex has a degree at most one. In other words, no two edges share a common vertex. (2) The problem of finding such a subgraph.
proper
(1) A subunit of a unit is not equal to the unit itself. For instance, a proper substring is not the whole string, a proper subset is not the whole set, a proper subgraph is not the whole graph, etc. (2) According to a rule, as in proper coloring.
node
(1) A unit of reference in a data structure. Also called a vertex in graphs and trees. (2) A collection of information which must be kept at a single memory location.
source
(1) A vertex of a directed graph with no incoming edges. More formally, a vertex with in-degree 0. (2) The vertex from which an edge of a directed graph leaves.
level
(1) All the nodes of a tree with the same depth. (2) Of a node, the depth.
circuit
(1) An acyclic network of inputs, logic gates, and outputs. Contrasted with a Turing machine, it has no memory. (2) A cycle in a graph.
logarithmic
(1) Any function that is a constant times the logarithm of the argument: f(x)=c log x. (2) In complexity theory, when the measure of computation, m(n) (usually execution time or memory space), is bounded by a logarithmic function of the problem size, n. More formally m(n) = O(log n). (3) Sometimes imprecisely used to mean polylogarithmic.
run time
(1) The amount of time needed to execute an algorithm. (2) The time when a compiled program is executing, versus compile time.
Union-Find
(Disjoint-set data structure) keeps track of a set of elements partitioned into a number of disjoint (nonoverlapping) subsets. It supports two useful operations: find and union. Find: Determine which subset a particular element is in. Find typically returns an item from this set that serves as its "representative"; by comparing the result of two Find operations, one can determine whether two elements are in the same subset. Union: Join two subsets into a single subset. In order to define these operations more precisely, some way of representing the sets is needed. One common approach is to select a fixed element of each set, called its representative, to represent the set as a whole. Then, Find(x) returns the representative of the set that x belongs to, and Union takes two set representatives as its arguments.
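A minimal sketch of the two operations, using a parent-pointer forest with path compression and union by size. The class name `DisjointSet` is an assumption for illustration, and unlike the description above this `union` accepts any two elements and looks up their representatives itself.

```python
class DisjointSet:
    def __init__(self, items):
        self.parent = {x: x for x in items}  # every item starts as its own representative
        self.size = {x: 1 for x in items}

    def find(self, x):
        """Return the representative of the set containing x."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:  # path compression: point nodes at the root
            nxt = self.parent[x]
            self.parent[x] = root
            x = nxt
        return root

    def union(self, a, b):
        """Merge the sets containing a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra  # attach the smaller set under the larger
        self.size[ra] += self.size[rb]
        return True

ds = DisjointSet("abcde")
ds.union("a", "b")
ds.union("c", "d")
same_ab = ds.find("a") == ds.find("b")  # True
same_ac = ds.find("a") == ds.find("c")  # False
```

Comparing the results of two `find` calls is how membership in the same subset is tested.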
Prim's Algorithm
(Minimum Spanning Trees, O(m + n log n), where m is the number of edges and n is the number of vertices) Starting from a vertex, grow the rest of the tree one edge at a time until all vertices are included. Greedily select the best local option from all available choices without regard to the global structure.
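A sketch of the greedy growth step using Python's `heapq` module. Note the assumption: a binary heap gives O(m log n) rather than the O(m + n log n) bound quoted above, which relies on a Fibonacci heap. The function name `prim_mst` and the adjacency format are illustrative, and the graph is assumed to be connected and undirected.

```python
import heapq

def prim_mst(adj, start):
    """adj maps vertex -> list of (weight, neighbor). Returns (total_weight, edges)."""
    in_tree = {start}
    frontier = [(w, start, v) for w, v in adj[start]]
    heapq.heapify(frontier)
    total, edges = 0, []
    while frontier and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(frontier)  # lightest edge leaving the tree so far
        if v in in_tree:
            continue  # stale entry: both endpoints are already in the tree
        in_tree.add(v)
        total += w
        edges.append((u, v, w))
        for w2, x in adj[v]:
            if x not in in_tree:
                heapq.heappush(frontier, (w2, v, x))
    return total, edges

adj = {
    "a": [(1, "b"), (3, "c")],
    "b": [(1, "a"), (2, "c")],
    "c": [(3, "a"), (2, "b"), (4, "d")],
    "d": [(4, "c")],
}
print(prim_mst(adj, "a"))  # (7, [('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 4)])
```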
Kruskal's Algorithm
(Minimum Spanning Trees, O(m log m) with a union-find, which is fast for sparse graphs) Builds up connected components of vertices by repeatedly considering the lightest remaining edge and testing whether its two endpoints lie within the same connected component. If not, insert the edge and merge the two components into one.
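The edge-by-edge procedure can be sketched as below, with a small inline union-find so the example stands alone. The function name `kruskal_mst` and the `(weight, u, v)` edge format are assumptions made for this illustration.

```python
def kruskal_mst(vertices, edges):
    """edges is a list of (weight, u, v). Returns the chosen edges, lightest first."""
    parent = {v: v for v in vertices}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps the trees shallow
            x = parent[x]
        return x

    chosen = []
    for w, u, v in sorted(edges):  # consider the lightest remaining edge first
        ru, rv = find(u), find(v)
        if ru != rv:               # endpoints lie in different components
            parent[ru] = rv        # merge the two components
            chosen.append((w, u, v))
    return chosen

edges = [(3, "a", "c"), (1, "a", "b"), (4, "c", "d"), (2, "b", "c")]
mst = kruskal_mst("abcd", edges)
print(mst)  # [(1, 'a', 'b'), (2, 'b', 'c'), (4, 'c', 'd')]
```

The edge of weight 3 is skipped because "a" and "c" are already connected through "b" when it is considered.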
k-coloring
1) The assignment of k colors (or any distinct marks) to the vertices of a graph. 2) The assignment of k colors to the edges of a graph. A coloring is a proper coloring if no two adjacent vertices or edges have the same color.
What are the 4 steps to inserting a node into a linked list?
1. Create a new node. 2. Set its element to the new element. 3. Set its next link to refer to the current head. 4. Set the list's head to point to the new node.
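The four steps map directly onto code. This is a minimal sketch of insertion at the head of a singly linked list; the class and method names (`Node`, `LinkedList`, `push_front`, `to_list`) are hypothetical.

```python
class Node:
    def __init__(self):
        self.element = None
        self.next = None

class LinkedList:
    def __init__(self):
        self.head = None

    def push_front(self, element):
        node = Node()            # 1. create a new node
        node.element = element   # 2. set its element to the new element
        node.next = self.head    # 3. set its next link to the current head
        self.head = node         # 4. set the list's head to the new node

    def to_list(self):
        """Collect the elements from head to tail, for inspection."""
        out, node = [], self.head
        while node is not None:
            out.append(node.element)
            node = node.next
        return out

lst = LinkedList()
for value in (1, 2, 3):
    lst.push_front(value)
print(lst.to_list())  # [3, 2, 1]
```

Step 3 must come before step 4; otherwise the reference to the old head is lost.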
Steps to resizing:
1. Roughly double the table size, rounding to a nearby prime number. 2. Rehash the items from the old table into the new table.
Good Hash Function qualities:
1. Must be deterministic: a key must always generate the same hash index (excluding rehashing). 2. Must achieve uniformity: keys should be distributed evenly across the hash table. 3. Must be fast and easy to compute: use only the parts of the key that distinguish the items from each other. 4. Must minimize collisions.
Constructing a heap in linear time
1. Place the data into the heap's array blindly. It will have the correct shape, but the dominance order will be incorrect. 2. Starting from the last (nth) position, walk backwards through the array until we encounter an internal node with children. 3. Perform bubble down on that node and on every earlier position, back to the root. Explanation: heapify() takes time proportional to the height of the heaps it is merging. Most of these heaps are extremely small. In a full binary tree on n nodes, there are n/2 nodes that are leaves, n/4 nodes of height 1, n/8 nodes of height 2, and so on. In general, there are at most n/2^(h+1) nodes of height h, so the cost of building the heap is at most the sum over all heights h of h · n/2^(h+1), which is <= 2n.
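A sketch of the bottom-up construction for a min-heap stored in a 0-indexed Python list. The helper names `bubble_down` and `build_min_heap` are assumptions, and the loop starts directly at the last internal node (index n // 2 - 1) rather than walking back over the leaves.

```python
def bubble_down(a, i, n):
    """Sink a[i] until neither child (at 2i+1, 2i+2) is smaller than it."""
    while True:
        smallest = i
        for child in (2 * i + 1, 2 * i + 2):
            if child < n and a[child] < a[smallest]:
                smallest = child
        if smallest == i:
            return
        a[i], a[smallest] = a[smallest], a[i]
        i = smallest

def build_min_heap(a):
    """Rearrange list a in place into a min-heap and return it."""
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):  # last internal node back to the root
        bubble_down(a, i, n)
    return a

data = build_min_heap([9, 4, 7, 1, 8, 2])
```

After the call, every parent `data[(i - 1) // 2]` is no larger than its child `data[i]`, so `data[0]` holds the minimum.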
How does Hashing work?
1. You have a key for the item. 2. The hash function turns the item's key into a hash index. 3. The hash index is used as a position in the data array, which is where the item is found.
counting sort
A 2-pass sort algorithm that is efficient when the number of distinct keys is small compared to the number of items. The first pass counts the occurrences of each key in an auxiliary array, and then makes a running total so each auxiliary entry is the number of preceding keys. The second pass puts each item in its final place according to the auxiliary entry for that key.
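The two passes can be sketched as follows, assuming every key is an integer in range(k). The function name `counting_sort` and its `key` parameter are illustrative choices.

```python
def counting_sort(items, k, key=lambda x: x):
    """Stable sort of items whose key(item) is an integer in range(k)."""
    counts = [0] * k
    for item in items:                 # pass 1: count occurrences of each key
        counts[key(item)] += 1
    total = 0
    for i in range(k):                 # running total: number of items with smaller keys
        counts[i], total = total, total + counts[i]
    out = [None] * len(items)
    for item in items:                 # pass 2: put each item in its final place
        out[counts[key(item)]] = item
        counts[key(item)] += 1
    return out

print(counting_sort([3, 1, 2, 1, 0], k=4))  # [0, 1, 1, 2, 3]
```

Because pass 2 scans the input in order, items with equal keys keep their relative order.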
Fibonacci Heap
A data structure that is a collection of trees satisfying the minimum-heap property, that is, the key of a child is always greater than or equal to the key of the parent. This implies that the minimum key is always at the root of one of the trees. The trees do not have a prescribed shape and in the extreme case the heap can have every element in a separate tree. This flexibility allows some operations to be executed in a "lazy" manner, postponing the work for later operations. For example, merging heaps is done simply by concatenating the two lists of trees, and operation decrease key sometimes cuts a node from its parent and forms a new tree. For the Fibonacci heap, the find-minimum operation takes constant (O(1)) amortized time. The insert and decrease key operations also work in constant amortized time. Deleting an element (most often used in the special case of deleting the minimum element) works in O(log n) amortized time, where n is the size of the heap. This means that starting from an empty data structure, any sequence of a insert and decrease key operations and b delete operations would take O(a + b log n) worst case time, where n is the maximum heap size. In a binary or binomial heap such a sequence of operations would take O((a + b) log n) time. A Fibonacci heap is thus better than a binary or binomial heap when b is smaller than a by a non-constant factor. It is also possible to merge two Fibonacci heaps in constant amortized time, improving on the logarithmic merge time of a binomial heap, and improving on binary heaps which cannot handle merges efficiently. Using Fibonacci heaps for priority queues improves the asymptotic running time of important algorithms, such as Dijkstra's algorithm for computing the shortest path between two nodes in a graph, compared to the same algorithm using other slower priority queue data structures.
2-left hashing
A dictionary implemented with two hash tables of equal size, T1 and T2, and two different hash functions, h1 and h2. A new key is put in table 2 only if there are fewer (colliding) keys at T2[h2(key)] than at T1[h1(key)], otherwise it is put in table 1. With n keys and two tables of size n/2, the most collisions is 0.69... log_2 ln n + O(1) with high probability.
hash table
A dictionary in which keys are mapped to array positions by hash functions. Having the keys of more than one item map to the same position is called a collision. There are many collision resolution schemes, but they may be divided into open addressing, chaining, and keeping one special overflow area. Perfect hashing avoids collisions, but may be time-consuming to create.
If all the edges in a graph are one-way, what kind of graph is it?
A directed graph
Strongly connected
A directed graph in which there is a directed path from each vertex to every other vertex.
strongly connected graph
A directed graph that has a path from each vertex to every other vertex.
weighted, directed graph
A directed graph that has a weight, or numeric value, associated with each edge.
directed acyclic graph
A directed graph with no path that starts and ends at the same vertex.
shuffle sort
A distribution sort algorithm that begins by removing the first 1/8 of the n items, sorting them (recursively), and putting them in an array. This creates n/8 buckets to which the remaining 7/8 of the items are distributed. Each bucket is then sorted, and the buckets are concatenated.
bucket sort
A distribution sort where input elements are initially distributed to several buckets based on an interpolation of the element's key. Each bucket is sorted if necessary, and the buckets' contents are concatenated.
UnShuffle sort
A distribution sort with two phases. In the first phase, the inputs are distributed among doubly-ended queues keeping the items in each queue ordered and creating a new queue when there is no place on an existing queue. The second phase is an ideal merge in which the item to be removed is determined by keeping the queues in a priority queue.
What Is Deque?
A double-ended queue, in which new items can be added at either the front or the rear, and existing items can be removed from either end.
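Python's standard library provides this structure as `collections.deque`. A brief usage sketch (the variable names are arbitrary):

```python
from collections import deque

d = deque([2, 3])
d.appendleft(1)      # add at the front
d.append(4)          # add at the rear
front = d.popleft()  # remove from the front
back = d.pop()       # remove from the rear
print(front, back, list(d))  # 1 4 [2, 3]
```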
spiral storage
A dynamic hashing table that grows a few slots at a time. It uses a hash function, h, with a range of [0,1). For a key, k, an intermediate value, x = ⌈S - h(k)⌉ + h(k), is computed to find the final slot, ⌊d^x⌋, where d > 1 is called the growth factor. To increase the number of slots, increase S to S' and rehash any keys from ⌊d^S⌋ to ⌊d^S'⌋ - 1.
discrete p-center
A facility location problem in which the supply points must be a subset of demand points.
nondeterministic finite state machine
A finite state machine whose transition function maps inputs symbols and states to a (possibly empty) set of next states. The transition function also may map the null symbol (no input symbol needed) and states to next states.
recognizer
A finite state machine with one or more states designated as accepting states. An input string is accepted if there is a path from the start state to an accepting state.
unlimited branching tree
A forest of ordered trees used to contain ordered lists. The root of each tree is unique. An ordered list is represented by a traversal from the root (first element of the list) to a leaf (last list element). Lists with common prefixes share nodes. The last node in common has one child for each list. This allows rapid searches for subset inclusion of sequences.
model of computation
A formal, abstract definition of a computer. Using a model one can more easily analyze the intrinsic execution time or memory space of an algorithm while ignoring many implementation issues. There are many models of computation which differ in computing power (that is, some models can perform computations impossible for other models) and the cost of various operations.
two-way merge sort
A k-way merge sort that sorts a data stream using repeated merges. It distributes the input into two streams by repeatedly reading a block of input that fits in memory, a run, sorting it, then writing it to the next stream. It merges runs from the two streams into an output stream. It then repeatedly distributes the runs in the output stream to the two streams and merges them until there is a single sorted output.
NC many-one reducibility
A language L is NC many-one reducible or NC reducible to L', written L ≤_m^{NC} L', if there is a function f in FNC such that x ∈ L if and only if f(x) ∈ L'.
decidable language
A language for which membership can be decided by an algorithm that halts on all inputs in a finite number of steps --- equivalently, can be recognized by a Turing machine that halts for all inputs.
undecidable language
A language for which the membership cannot be decided by an algorithm --- equivalently, cannot be recognized by a Turing machine that halts for all inputs.
NP-complete language
A language in NP such that every language in NP can be reduced to it in polynomial time.
Linked List
A linear data structure, much like an array, that consists of nodes, where each node contains data as well as a link to the next node, but that does not use contiguous memory.
integer linear program
A linear program with additional constraints that all of the variables must take on integer values. Solving such problems is NP-hard.
mixed integer linear program
A linear program with additional constraints that some of the variables must take on integer values. Solving such problems is NP-hard.
string
A list of characters, usually implemented as an array. Informally a word, phrase, sentence, etc. Since text processing is so common, a special type with substring operations is often available.
text
A list of characters, usually thought of as a list of words separated by spaces.
path
A list of vertices of a graph where each vertex has an edge from it to the next vertex.
self-organizing list
A list that reorders the elements based on some self-organizing heuristic to improve average access time.
sorted list
A list whose items are kept sorted.
unsorted list
A list whose order of items, if any, is not known.
temporal logic
A logic with a notion of time included. The formulas can express facts about past, present, and future states. The formulas are interpreted over Kripke structures, which can model computation; hence temporal logic is very useful in formal verification.
space ordering method
A mapping from a discrete k-dimensional space into a linear order.
ragged matrix
A matrix having irregular numbers of items in each row.
biconnected component
A maximal subset of edges of a connected graph such that the corresponding induced subgraph cannot be disconnected by deleting any vertex.
multi-commodity flow
A maximum-flow problem involving multiple commodities, in which each commodity has an associated demand and source-sink pairs.
Smith-Waterman algorithm
A means of searching protein databases to find those with the best alignment.
potential function
A measure of a data structure whose change after an operation corresponds to the time cost of the operation.
Jaro-Winkler
A measure of similarity between two strings. The Jaro measure is the weighted sum of percentage of matched characters from each file and transposed characters. Winkler increased this measure for matching initial characters, then rescaled it by a piecewise function, whose intervals and weights depend on the type of string (first name, last name, street, etc.).
flow
A measure of the maximum weight along paths in a weighted, directed graph.
Hausdorff distance
A measure of the resemblance of two (fixed) sets of geometric points P and Q, defined as H(P,Q) = max{ max_{a∈P} min_{b∈Q} d(a,b), max_{a∈Q} min_{b∈P} d(a,b) }, where d(·,·) is the distance metric, usually the Euclidean distance.
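The definition above can be evaluated directly for small point sets. This is a brute-force sketch, assuming points are coordinate tuples and d is the Euclidean distance; the function name `hausdorff` is illustrative only.

```python
import math

def hausdorff(P, Q):
    """Brute-force Hausdorff distance between two non-empty point sets."""
    def directed(A, B):
        # Largest distance from a point of A to its nearest point in B.
        return max(min(math.dist(a, b) for b in B) for a in A)
    return max(directed(P, Q), directed(Q, P))

P = [(0, 0), (1, 0)]
Q = [(0, 0), (4, 0)]
print(hausdorff(P, Q))  # the point (4, 0) is 3 away from its nearest point in P
```

This takes O(|P|·|Q|) distance evaluations, which is fine for illustration but not for large sets.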
k-way merge sort
A merge sort that sorts a data stream using repeated merges. It distributes the input into k streams by repeatedly reading a block of input that fits in memory, called a run, sorting it, then writing it to the next stream. It merges runs from the k streams into an output stream. It then repeatedly distributes the runs in the output stream to the k streams and merges them until there is a single sorted output.
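An in-memory sketch of the idea, assuming the "streams" are plain Python lists rather than files or tapes. The names `k_way_merge_sort`, `k`, and `run_size` are illustrative, and k must be at least 2 for the loop to make progress.

```python
import heapq

def k_way_merge_sort(data, k=3, run_size=4):
    """Sort data by forming sorted runs, then merging up to k runs per pass."""
    # Phase 1: read blocks that "fit in memory" and sort each into a run.
    runs = [sorted(data[i:i + run_size]) for i in range(0, len(data), run_size)]
    # Phase 2: repeatedly merge groups of up to k runs until one run remains.
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + k])) for i in range(0, len(runs), k)]
    return runs[0] if runs else []

print(k_way_merge_sort([9, 3, 7, 1, 8, 2, 6, 4, 5, 0], k=2, run_size=3))
```

A real external sort would read and write the runs on disk; `heapq.merge` stands in for the merge step here.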
balanced k-way merge sort
A merge sort that sorts a data stream using repeated merges. It distributes the input into k streams by repeatedly reading a block of input that fits in memory, called a run, sorting it, then writing it to the next stream. It then repeatedly merges the k streams and puts each merged run into one of j output streams until there is a single sorted output.
Karnaugh map
A method for minimizing a boolean expression, usually aided by a rectangular map of the value of the expression for all possible input values. Input values are arranged in a Gray code. Maximal rectangular groups that cover the inputs where the expression is true give a minimum implementation.
Rice's method
A method of complex asymptotics that can handle certain alternating sums arising in the analysis of algorithms.
Huffman coding
A minimal variable-length character coding based on the frequency of each character. First, each character becomes a one-node binary tree, with the character as the only node. The character's frequency is the tree's frequency. Two trees with the least frequencies are joined as the subtrees of a new root that is assigned the sum of their frequencies. Repeat until all characters are in one tree. One code bit represents each level. Thus more frequent characters are near the root and are coded with few bits, and rare characters are far from the root and are coded with many bits.
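The tree-joining procedure described above can be sketched with a binary heap. This is a minimal illustration, not a full codec: the function name `huffman_codes` and the tie-breaking counter are assumptions added so that heap entries never compare trees directly.

```python
import heapq
from collections import Counter

def huffman_codes(text):
    """Return a dict mapping each character of text to its bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    # Heap entries are (frequency, tie_breaker, tree).
    # A tree is either a single character or a (left, right) pair.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        # Join the two least frequent trees under a new root.
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1
    codes = {}

    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix or "0"  # a one-symbol input still gets one bit

    walk(heap[0][2], "")
    return codes

print(huffman_codes("aaaabbc"))
```

The frequent character ends up near the root with a short code, and the rare ones get longer codes.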
arithmetic coding
A minimal variable-length message coding based on the frequency of each character. The message is represented by a fraction which is the repeated offset-plus-product reduction of the range (offset) and probability (product) of each character.
rectilinear Steiner tree
A minimum-length rectilinear tree connecting a set of points, called terminals, in the plane. This tree may include points other than the terminals, which are called Steiner points.
Steiner tree
A minimum-weight tree connecting a designated set of vertices, called terminals, in an undirected, weighted graph or points in a space. The tree may include non-terminals, which are called Steiner vertices or Steiner points.
state machine
A model of computation consisting of a (possibly infinite) set of states, a set of start states, an input alphabet, and a transition function which maps input symbols and current states to a next state. Usually understood to be a finite state machine.
Turing machine
A model of computation consisting of a finite state machine controller, a read-write head, and an unbounded sequential tape. Depending on the current state and symbol read on the tape, the machine can change its state and move the head to the left or right. Unless otherwise specified, a Turing machine is deterministic.
alternation
A model of computation proposed by A. K. Chandra, L. Stockmeyer, and D. Kozen, which has two kinds of states, AND and OR. The definition of accepting computation is adjusted accordingly.
cell probe model
A model of computation where the cost of a computation is measured by the total number of memory accesses to a random access memory with log n bits cell size. All other computations are not counted and are considered to be free.
pointer machine
A model of computation whose memory consists of an unbounded collection of registers, or records, connected by pointers. Each register may contain an arbitrary amount of additional information. No arithmetic is allowed to compute the address of a register. The only way to access a register is by following pointers.
work-depth model
A model of parallel computation in which one keeps track of the total work and depth of a computation without worrying about how it maps onto a machine.
integer multi-commodity flow
A multi-commodity flow in which the flow of each commodity through each edge must take an integer value. The term is also used to capture the multi-commodity flow problem in which each demand is routed along a single path.
k-d tree
A multidimensional search tree for points in k dimensional space. Levels of the tree are split along successive dimensions at the points.
radix sort
A multiple pass distribution sort algorithm that distributes each item to a bucket according to part of the item's key beginning with the least significant part of the key. After each pass, items are collected from the buckets, keeping the items in order, then redistributed according to the next most significant part of the key.
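A least-significant-digit sketch of the passes described above, assuming the keys are non-negative integers and each "part of the key" is one digit in the chosen base. The name `radix_sort` and the `base` parameter are illustrative.

```python
def radix_sort(items, base=10):
    """LSD radix sort for non-negative integers; returns a new list."""
    items = list(items)
    if not items:
        return items
    largest = max(items)
    place = 1  # 1 for the least significant digit, then base, base**2, ...
    while place <= largest:
        buckets = [[] for _ in range(base)]
        for x in items:
            buckets[(x // place) % base].append(x)
        # Collect buckets in order; appending keeps each pass stable.
        items = [x for bucket in buckets for x in bucket]
        place *= base
    return items

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
```

Stability of each pass is what lets later, more significant digits preserve the order set by earlier ones.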
Commentz-Walter
A multiple string matching algorithm that compares from the end of the pattern, like Boyer-Moore, using a finite state machine, like Aho-Corasick.
adaptive Huffman coding
A near-minimal variable-length character coding that changes based on the frequency of characters processed. As characters are processed, frequencies are updated and codes are changed (or, the coding tree is modified).
red-black tree
A nearly-balanced tree that uses an extra bit per node to maintain balance. No leaf is more than twice as far from the root as any other.
Leaf
A node in a tree data structure that has no children, and is at the end of a branch in a tree.
sibling
A node in a tree that has the same parent as another node is its sibling.
leaf
A node in a tree without any children. See the figure at tree.
nondeterministic tree automaton
A nondeterministic finite state machine that accepts infinite trees rather than just strings. The tree nodes are marked with the letters of the alphabet of the automaton, and the transition function encodes the next states for each branch of the tree. The expressive power of such automata varies depending on the acceptance conditions of the trees.
cut
A nonempty, proper subset of vertices of a graph.
work-efficient
A parallel algorithm which asymptotically requires at most a constant factor more work than the best known sequential algorithm or the optimal work.
concurrent read, concurrent write
A parallel memory model in which multiple processors can read simultaneously from a single memory location, and multiple processors can write simultaneously to a single memory location.
concurrent read, exclusive write
A parallel memory model in which multiple processors can read simultaneously from a single memory location, but only one processor can write to any one memory location at one time.
exclusive read, exclusive write
A parallel memory model in which only one processor can read from any one memory location at one time, and only one processor can write to any one memory location at one time.
exclusive read, concurrent write
A parallel memory model in which only one processor can read from any one memory location at one time, but multiple processors can write simultaneously to a single memory location.
Permutations
A permutation is an ordered combination. Repetition allowed: for example, the code of a combination lock, which could be "333". No repetition: for example, the first three finishers in a running race, since no one can be both first and second. https://www.mathsisfun.com/combinatorics/combinations-permutations.html
derangement
A permutation of elements, where no element is in its original position.
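A brute-force sketch for checking and counting derangements, assuming a permutation is given as a sequence where `perm[i]` is the element placed at position i and the elements are 0..n-1. The function names are illustrative.

```python
from itertools import permutations

def is_derangement(perm):
    """True if no element of perm sits at its original index."""
    return all(value != index for index, value in enumerate(perm))

def count_derangements(n):
    """Count derangements of n elements by enumeration (small n only)."""
    return sum(1 for p in permutations(range(n)) if is_derangement(p))

print(is_derangement([1, 2, 0]))  # every element moved
print(count_derangements(4))
```

Enumeration costs n! steps, so this is only meant to make the definition concrete.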
fully persistent data structure
A persistent data structure that allows updates to all versions, previous and latest.
buddy tree
A point access method which splits multidimensional space in different dimensions at each level. It also keeps a minimum bounding box of points accessible by each node.
grid file
A point access method which splits space into a nonperiodic grid. Each spatial dimension is divided by a linear hash. Small sets of points are referred to by one or more cells of the grid.
hB-tree
A point access method which uses k-d trees to organize the space, but partitions as excluded intervals, like the BANG file. Searching is like in a k-d-B-tree.
lattice
A point lattice generated by taking integer linear combinations of a set of basis vectors.
Steiner vertex
A point that is not part of the input set of points, for instance, a point computed to construct a Steiner tree.
last-in, first-out
A policy that the most recently arrived item is processed first. A stack implements this.
integer polyhedron
A polyhedron, all of whose extreme points are integer valued.
optimal polyphase merge
A polyphase merge which seeks to minimize the number of merge passes by allocating output runs of each pass to the various output files. Since polyphase merging must have a different number of runs in each file to be efficient, one seeks the optimal way of selecting how many runs go into each output file. A series of kth order Fibonacci numbers is one way to select the number of runs.
pattern element
A positive (negative) pattern element is a "partial wild card" presented as a subset of the alphabet Σ, with the symbols in the subset specifying which symbols of Σ are matched (mismatched) by the pattern element.
External Node
A potential node in a tree, where currently either the left or right child pointer of a node is pointing to null, but potentially could reference another node.
oracle set
A predetermined set of tapes used by an oracle Turing machine to make decisions otherwise not feasible.
binary priority queue
A priority queue implemented with a binary tree having the following restrictions: The key of a node is greater than keys of its children, i.e., it has the heap property. If the right subtree is not empty, the left subtree is not empty. If there are both left and right children, the left child's key is greater than the right child's key.
leftist tree
A priority queue implemented with a variant of a binary tree. Every node has a count which is the distance to the nearest leaf. In addition to the heap property, leftist trees are kept so the right descendant of each node has the shorter distance to a leaf.
pagoda
A priority queue implemented with a variant of a binary tree. The root points to its children, as in a binary tree. Every other node points back to its parent and down to its leftmost (if it is a right child) or rightmost (if it is a left child) descendant leaf. The basic operation is merge or meld, which maintains the heap property. An element is inserted by merging it as a singleton. The root is removed by merging its right and left children. Merging is bottom-up, merging the leftmost edge of one with the rightmost edge of the other.
monotone priority queue
A priority queue in which a key being inserted is never higher in priority than a previously deleted node.
What Is A Priority Queue?
A priority queue is different from a "normal" queue, because instead of being a "first-in-first-out" data structure, values come out in order by priority. A priority queue might be used, for example, to handle the jobs sent to the Computer Science Department's printer: Jobs sent by the department chair should be printed first, then jobs sent by professors, then those sent by graduate students, and finally those sent by undergraduates.
double-ended priority queue
A priority queue which simultaneously keeps track of the maximum and minimum keys, and supports operations efficiently on either extreme (minimum or maximum).
Bloom filter
A probabilistic algorithm to quickly test membership in a large set using multiple hash functions into a single array of bits.
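A minimal sketch of the idea, assuming the k hash functions are derived from SHA-256 with different seeds and that one byte per bit is acceptable for clarity. The class name, method names, and default sizes are all illustrative.

```python
import hashlib

class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive num_hashes array positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("apple")
print(bf.might_contain("apple"))
```

Added items are always reported as present; items never added are usually, but not always, reported as absent.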
randomized rounding
A probabilistic method to convert a solution of a relaxed problem into an approximate solution to the original problem.
linear program
A problem expressible in the following form. Given an m × n real matrix A, m-vector b and n-vector c, determine min_x {c·x | Ax ≥ b and x ≥ 0}, where x ranges over all n-vectors and the inequalities are interpreted component-wise, i.e., x ≥ 0 means that the entries of x are nonnegative.
totally undecidable problem
A problem that cannot be solved by a Turing machine.
undecidable problem
A problem that cannot be solved for all cases by any algorithm whatsoever---equivalently, whose associated language cannot be recognized by a Turing machine that halts for all inputs.
tractable
A problem that has an algorithm which solves all instances of it in polynomial time.
decision problem
A problem with a "yes" or "no" answer. Equivalently, a function whose range is two values, such as {0,1}.
Heapify
A process in Minimum Heap Trees where the new node is switched up until min heap state is achieved.
Pop
A process used in stack and queue processing where a copy of the top or front value is acquired, and then removed from the stack or queue (Dequeue).
interactive proof system
A protocol in which one or more provers try to convince another party, called the verifier, that the prover(s) possess certain true knowledge, such as the membership of a string x in a given language, often with the goal of revealing no further details about this knowledge. The prover(s) and verifier are formally defined as probabilistic Turing machines with special "interaction tapes" for exchanging messages.
Deutsch-Jozsa algorithm
A quantum algorithm to determine whether a function is constant or balanced, that is, returns 1 for half the domain and 0 for the other half. For a function taking n input qubits, first, do Hadamards on n 0's, forming all possible inputs, and a single 1, which will be the answer qubit. Next, run the function once; this exclusive or's the result with the answer qubit. Finally, do Hadamards on the n inputs again, and measure the answer qubit. If it is 0, the function is constant, otherwise the function is balanced.
Double ended queue
A queue that allows you to add, remove, or retrieve entries at both sides (front and back) of the queue.
Monte Carlo algorithm
A randomized algorithm that may produce incorrect results, but with bounded error probability.
hard split, easy merge
A recursive algorithm, especially a sort algorithm, where dividing (splitting) into smaller problems is time consuming or complex and combining (merging) the solutions is quick or trivial.
Turing reduction
A reduction computable by an oracle Turing machine that halts for all inputs.
Cook reduction
A reduction computed by a deterministic polynomial time oracle Turing machine.
Karp reduction
A reduction given by a polynomial time computable transformation function.
many-one reduction
A reduction that maps an instance of one problem into an equivalent instance of another problem.
link
A reference, pointer, or access handle to another part of the data structure. Often, a memory address.
binary relation
A relation between exactly two items at a time, such as "greater than" (>), "not equal to" (≠), "proper subset of" (⊂), or "is connected to" (has an edge to) for vertices of a graph.
weak-heap
A relaxed heap satisfying the following three conditions: (1) every key in the right subtree of a node is greater than the key stored in the node itself, (2) the root has no left child, and (3) leaves are only found on the last two levels of the tree.
adjacency-matrix representation
A representation of a directed graph with n vertices using an n × n matrix, where the entry at (i,j) is 1 if there is an edge from vertex i to vertex j; otherwise the entry is 0. A weighted graph may be represented using the weight as the entry. An undirected graph may be represented using the same entry in both (i,j) and (j,i) or using an upper triangular matrix.
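A small sketch of building such a matrix, assuming vertices are numbered 0..n-1 and edges are given as (i, j) pairs. The function name and the `directed` flag are illustrative.

```python
def adjacency_matrix(n, edges, directed=True):
    """Build an n x n 0/1 matrix with a 1 at (i, j) for each edge i -> j."""
    matrix = [[0] * n for _ in range(n)]
    for i, j in edges:
        matrix[i][j] = 1
        if not directed:
            matrix[j][i] = 1  # undirected: mirror the entry
    return matrix

for row in adjacency_matrix(3, [(0, 1), (1, 2)]):
    print(row)
```

For a weighted graph, the same layout could store the edge weight in place of the 1.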
coalesced chaining
A scheme in which linked lists within the hash table handle collisions. An item that collides is put in the next empty place in the array and added to the end of a list embedded in the array items. Any open addressing method to compute possible new positions may be used to find the "next" empty place.
universal hashing
A scheme that chooses randomly from a set of hash functions.
(a,b)-tree
A search tree with the restrictions that all leaves are at the same depth and all internal nodes have between a and b children, where a and b are integers such that 2 ≤ a ≤ (b+1)/2. The root may have as few as 2 children.
decomposable searching problem
A searching problem for which one can decompose the searched domain into subsets, find answers from these subsets, and combine these answers to form the solution to the original problem. More formally, a problem with a query Q is decomposable if there exists an efficiently computable associative and commutative binary operator @ satisfying the condition: Q(x, A ∪ B) = @(Q(x,A),Q(x,B)).
Self-loop
A self-loop is an edge that connects a vertex to itself.
polynomial approximation scheme
A set of algorithms {Aε| ε > 0}, where each Aε is a (1+ε)-approximation algorithm and the execution time is bounded by a polynomial in the length of the input. The execution time may depend on the choice of ε. Sometimes referred to more precisely as polynomial-time approximation scheme.
abstract data type
A set of data values and associated operations that are precisely specified independent of any particular implementation.
graph
A set of items connected by edges. Each item is called a vertex or node. Formally, a graph is a set of vertices and a binary relation between vertices, adjacency.
Tree
A set of nodes connected by edges that indicate the relationship among the nodes. The nodes are arranged in levels that indicate the nodes' hierarchy. At the top level is a single node called the root. The nodes at each successive level of a tree are the children of the nodes at the previous level. A node that has children is a parent. Nodes that share the same parent are siblings. They are also descendants of the nodes above them.
set cover
A set of sets whose union has all members of the union of all sets. The set cover problem is to find a minimum size set.
language
A set of strings over some fixed alphabet. A characterization of inputs which may or may not be solved by algorithms.
maximal independent set
A set of vertices in a graph such that for any pair of vertices, there is no edge between them and such that no more vertices can be added and it still be an independent set.
independent set
A set of vertices in a graph such that for any pair of vertices, there is no edge between them.
clique
A set of vertices in an undirected graph in which there is an edge between every pair of vertices. In other words, a subgraph that is complete.
selection sort
A sort algorithm that repeatedly searches remaining items to find the least one and moves it to its final location. The run time is Θ(n²), where n is the number of elements. The number of swaps is O(n).
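A short sketch that also counts swaps, to make the O(n) swap bound visible. The function name and the returned swap count are illustrative additions.

```python
def selection_sort(items):
    """Return (sorted copy of items, number of swaps performed)."""
    a = list(items)
    swaps = 0
    for i in range(len(a) - 1):
        least = i
        # Search the remaining items for the least one.
        for j in range(i + 1, len(a)):
            if a[j] < a[least]:
                least = j
        if least != i:
            a[i], a[least] = a[least], a[i]  # move it to its final location
            swaps += 1
    return a, swaps

print(selection_sort([5, 2, 4, 1, 3]))
```

The inner search always scans the whole remainder, which is why the comparison count stays quadratic even on sorted input.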
merge sort
A sort algorithm that splits the items to be sorted into two groups, recursively sorts each group, and merges them into a final, sorted sequence. Run time is Θ(n log n).
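A minimal top-down sketch of the split, recurse, and merge steps, assuming the input is a Python list and a new sorted list is returned. The function name is illustrative.

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge the two sorted halves; <= keeps equal items in original order.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged

print(merge_sort([38, 27, 43, 3, 9, 82, 10]))
```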
strand sort
A sort algorithm that works well if many items are in order. First, begin a sublist by moving the first item from the original list to the sublist. For each subsequent item in the original list, if it is greater than the last item of the sublist, remove it from the original list and append it to the sublist. Merge the sublist into a final, sorted list. Repeatedly extract and merge sublists until all items are sorted. Handle two or fewer items as special cases.
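The extract-and-merge loop above can be sketched as follows. This version uses Python lists and the standard library's `heapq.merge` for the merge step, and it does not special-case short inputs since the loop handles them; the function name is illustrative.

```python
import heapq

def strand_sort(items):
    """Return a new sorted list by repeatedly extracting increasing strands."""
    remaining = list(items)
    result = []
    while remaining:
        sublist = [remaining[0]]  # begin a strand with the first item
        leftover = []
        for x in remaining[1:]:
            if x > sublist[-1]:
                sublist.append(x)   # extends the increasing strand
            else:
                leftover.append(x)  # stays in the original list
        result = list(heapq.merge(result, sublist))
        remaining = leftover
    return result

print(strand_sort([3, 1, 4, 1, 5, 9, 2, 6]))
```

Already sorted input yields a single strand, which is why the algorithm does well when many items are in order.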
extended k-d tree
A spatial access method where successive levels are split along different dimensions into nonoverlapping cells. Objects are indexed in all cells they intersect.
cell tree
A spatial access method where successive levels are split by arbitrary hyperplanes. Concave objects are decomposed into convex pieces. Each convex piece is indexed in every cell which it overlaps.
R-file
A spatial access method which divides space into a hierarchy of nested boxes. Objects are indexed in the lowest cell which completely contains them.
multilayer grid file
A spatial access method which is two or more simultaneous grid files. Objects reside in the first grid file where it doesn't have to be split across hyperplanes.
R*-tree
A spatial access method which splits space in hierarchically nested, possibly overlapping, boxes. The tree is height-balanced. It is similar to the R-tree, but reinserts entries upon overflow, rather than splitting.
R+-tree
A spatial access method which splits space with hierarchically nested boxes. Objects are indexed in each box which intersects them. The tree is height-balanced.
tail recursion
A special form of recursion where the last operation of a function is a recursive call. The recursion may be optimized away by executing the call in the current stack frame and returning its result rather than creating a new stack frame.
collective recursion
A special form of tail recursion, where the results are produced during the recursive calls and nothing is returned. The recursion may be optimized away by executing the call in the current stack frame, rather than creating a new stack frame, or by deallocating the entire recursion stack at once rather than a little at each return.
What is a priority queue?
A special type of queue where the highest priority items are at the front of the queue and the lowest priority items at the back.
best-first search
A state-space search algorithm that considers the estimated best partial solution next. This is typically implemented with a priority queue.
deterministic finite automata string search
A string matching algorithm which builds a deterministic finite state machine to recognize the search string. The machine is then run at each location in turn. If the machine accepts, that is a match.
Shift-Or
A string matching algorithm which keeps an array of bits, R, showing if prefixes of the pattern don't match at the current place. Before searching, mismatch arrays are computed for each character in the alphabet and saved in an array, S. For the next position, with the character c, R = shift(R) or S[c]. If the last bit of R is 0, the pattern matches.
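A sketch of the bit-parallel update R = shift(R) or S[c], assuming Python integers serve as the bit arrays and a dictionary stands in for the per-character array S. The function name and the returned list of match start positions are illustrative.

```python
def shift_or_search(text, pattern):
    """Return start indices of all occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []
    all_ones = (1 << m) - 1
    # S[c]: bit i is 0 exactly when pattern[i] == c (0 means "matches").
    S = {}
    for i, c in enumerate(pattern):
        S[c] = S.get(c, all_ones) & ~(1 << i)
    R = all_ones  # bit i is 0 when pattern[:i+1] matches ending here
    matches = []
    for pos, c in enumerate(text):
        R = ((R << 1) | S.get(c, all_ones)) & all_ones
        if R & (1 << (m - 1)) == 0:  # last bit 0: whole pattern matched
            matches.append(pos - m + 1)
    return matches

print(shift_or_search("abracadabra", "abra"))
```

With fixed-width machine words the pattern length is limited to the word size; Python's unbounded integers hide that limit here.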
substring
A string v is a substring of a string u if u=u′ vu″ for some prefix u′ and suffix u″.
border
A string v which is both a prefix and a suffix of another string u. String v is the border of u if it is the longest proper border of u.
incompressible string
A string whose Kolmogorov complexity equals its length, so that it has no shorter encodings.
strongly connected component
A strongly connected subgraph, S, of a directed graph, D, such that no vertex of D can be added to S and it still be strongly connected. Informally, a maximal subgraph in which every vertex is reachable from every other vertex.
simulated annealing
A technique to find a good solution to an optimization problem by trying random variations of the current solution. A worse variation is accepted as the new solution with a probability that decreases as the computation proceeds. The slower the cooling schedule, or rate of decrease, the more likely the algorithm is to find an optimal or near-optimal solution.
taco sort
A terribly inefficient sort algorithm that repeatedly changes a random item by a random amount until a sorted permutation occurs. For an array of n elements of k bits each, the expected run time is n × 2^(nk).
bogosort
A terribly inefficient sort algorithm that repeatedly generates a random permutation of the items until the items are in order.
separation theorem
A theorem showing that two complexity classes are distinct. Most separation theorems have been proved by diagonalization.
adversary
A theoretical agent that uses information about the past moves of an on-line algorithm to choose inputs that force the worst-case cost of the algorithm.
big-O notation
A theoretical measure of the execution of an algorithm, usually the time or memory needed, given the problem size n, which is usually the number of items. Informally, saying some equation f(n) = O(g(n)) means it is less than some constant multiple of g(n). The notation is read, "f of n is big oh of g of n".
adaptive k-d tree
A tree for multidimensional points where successive levels may be split along different dimensions.
Tree Topology
A tree is a widely used abstract data type (ADT), or a data structure implementing this ADT, that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes.
quadtree
A tree where each node is split along all d dimensions, leading to 2^d children.
search tree
A tree where every subtree of a node has keys less than any other subtree of the node to its right. The keys in a node are conceptually between subtrees and are greater than any keys in subtrees to its left and less than any keys in subtrees to its right.
height-balanced tree
A tree whose subtrees differ in height by no more than one and the subtrees are height-balanced, too. An empty tree is height-balanced.
finitary tree
A tree with a finite number of children at every node.
multiway tree
A tree with any number of children for each node.
binary tree
A tree with at most two children for each node.
k-ary tree
A tree with no more than k children for each node.
compact trie
A trie in which nonbranching subtrees leading to leaf nodes are cut off.
digital search tree
A trie which stores the strings in internal nodes, so there is no need for extra leaf nodes to store the strings.
matrix
A two-dimensional array. By convention, the first index is the row, and the second index is the column.
Heap
A type of priority queue. Stores data which is order-able. O(1) access to highest priority item.
Array Index
A value that indicates the position in the array of a particular value. The last element in a zero-indexed array would be the length of the array, minus 1.
Shannon-Fano coding
A variable-length coding based on the frequency of occurrence of each character. Divide the characters into two sets with the frequency of each set as close to half as possible, and assign the sets either 0 or 1 coding. Repeatedly divide the sets until each character has a unique coding.
elastic-bucket trie
A variant of a bucket trie in which each leaf node for n strings is a bucket allocated to hold exactly n strings.
hidden Markov model
A variant of a finite state machine having a set of states, Q, an output alphabet, O, transition probabilities, A, output probabilities, B, and initial state probabilities, Π. The current state is not observable. Instead, each state produces an output with a certain probability (B). Usually the states, Q, and outputs, O, are understood, so an HMM is said to be a triple, (A, B, Π).
2-choice hashing
A variant of a hash table in which keys are added by hashing with two hash functions. The key is put in the array position with the fewer (colliding) keys. Some collision resolution scheme is needed, unless keys are kept in buckets. The average-case cost of a successful search is O(2 + (m-1)/n), where m is the number of keys and n is the size of the array. The most collisions is log_2 ln n + Θ(m/n) with high probability.
doubly linked list
A variant of a linked list in which each item has a link to the previous item as well as the next. This allows easily accessing list items backward as well as forward and deleting any item in constant time.
circular list
A variant of a linked list in which the nominal tail is linked to the head. The entire list may be accessed starting at any item and following links until one comes to the starting item again.
bucket trie
A variant of a trie in which leaf nodes are buckets which hold up to k strings. Usually implies fixed sized buckets.
bidirectional bubble sort
A variant of bubble sort that compares each adjacent pair of items in a list in turn, swapping them if necessary, and alternately passes through the list from the beginning to the end then from the end to the beginning. It stops when a pass does no swaps.
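A sketch of the alternating passes, assuming a Python list as the input and a sorted copy as the result. The function name is illustrative, and shrinking the scanned range after each pass is a common refinement not stated in the definition.

```python
def bidirectional_bubble_sort(items):
    """Return a sorted copy using alternating forward and backward passes."""
    a = list(items)
    lo, hi = 0, len(a) - 1
    swapped = True
    while swapped and lo < hi:
        swapped = False
        for i in range(lo, hi):          # forward pass: largest item sinks to hi
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        hi -= 1
        if not swapped:
            break                        # a pass with no swaps: done
        swapped = False
        for i in range(hi, lo, -1):      # backward pass: smallest item rises to lo
            if a[i - 1] > a[i]:
                a[i - 1], a[i] = a[i], a[i - 1]
                swapped = True
        lo += 1
    return a

print(bidirectional_bubble_sort([5, 1, 4, 2, 8, 0]))
```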
divide and marriage before conquest
A variant of divide and conquer in which subproblems created in the "divide" step are merged before the "conquer" step.
smoothsort
A variant of heapsort that takes advantage of a partially ordered table. Performance is O(n) when input is sorted and O(n log n) performance for worst case.
adaptive heap sort
A variant of heapsort that uses a randomized binary search tree (RBST) to structure the input according to any preexisting order. The RBST is used to select candidates that are put into the heap so the heap doesn't need to keep track of all elements.
balanced quicksort
A variant of quicksort which attempts to choose a pivot likely to represent the middle of the values to be sorted.
bingo sort
A variant of selection sort that orders items by first finding the least value, then repeatedly moving all items with that value to their final location and finding the least value for the next pass. This is more efficient than selection sort if there are many duplicate values.
sink
A vertex of a directed graph with no outgoing edges. More formally, a vertex with out-degree 0.
matched vertex
A vertex on a matched edge in a matching, or, one which has been matched.
reachable
A vertex v is reachable from another vertex u if there is a path of any length from u to v.
cut vertex
A vertex whose deletion along with incident edges results in a graph with more components than the original graph.
Q: For every pair of vertices u, v in a tree, there exists:
A: Exactly one simple path from u to v.
What Is Big O Complexity Of Arrays
Access: O(1) Search: O(n) Insertion: O(n) Deletion: O(n)
Arrays
Advantages - Easy to access a particular item - Insertion is fast in an unordered array - Searching is fast in ordered array - Data locality in memory Disadvantages: - Some algorithms require large number of shifting/moves - The size of an array cannot be changed once created - If the size is too large... waste of memory - If the size is too small, it cannot be expanded dynamically
Linked List
Advantages: - Take as much memory as needed --> Number of links can be expanded dynamically - No need to move items around Disadvantages: - Conceptually less intuitive than arrays - The objects/items of the link can be located anywhere in memory (loss of data locality) Big O: - Insertion/deletion at beginning of list is O(1) - Insertion at end of double-ended linked list is O(1) - Finding, deleting or inserting next to a specific link is O(N) - Number of comparisons is O(N) like for arrays, but items do not need to be shifted/moved --> expected to be faster than using arrays
Difference between a binary tree, and a binary search tree?
A binary search tree adds an ordering property that a plain binary tree lacks: for every node n, all left descendants <= n < all right descendants.
Binary Search Tree
All members of the left subtree of any node are less than that node. All members of the right subtree of any node are greater than that node.
Vitter's algorithm
An adaptive Huffman coding scheme. Typically this produces codings the same length as or shorter than static Huffman coding. In the worst case, this uses one more bit per codeword.
Dijkstra's Algorithm
An algorithm for finding the shortest paths between nodes in a weighted graph. For a given source node in the graph, the algorithm finds the shortest path between that node and every other. It can also be used for finding the shortest path from a single node to a single destination node by stopping the algorithm once the shortest path to the destination node has been determined. Its time complexity is O(E + V log V) when the priority queue is a Fibonacci heap, where E is the number of edges and V is the number of vertices.
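A minimal sketch of the idea, assuming the graph is a dictionary mapping each node to a list of (neighbor, weight) pairs with nonnegative weights. The name `dijkstra` and the input layout are illustrative choices, and this version uses Python's binary heap (`heapq`), so it runs in O((V + E) log V) rather than the Fibonacci-heap bound.

```python
import heapq

def dijkstra(graph, source):
    """Return a dict of shortest distances from source to every reachable node."""
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```

To stop early for a single destination, one could return as soon as that node is popped from the heap.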
Counting Sort
An algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct key value, and using arithmetic on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum and minimum key values, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. However, it is often used as a subroutine in another sorting algorithm, radix sort, that can handle larger keys more efficiently.[1][2][3] Because counting sort uses key values as indexes into an array, it is not a comparison sort, and the Ω(n log n) lower bound for comparison sorting does not apply to it.
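The counting and position arithmetic can be sketched as follows. This is an illustrative version, not a reference implementation; the function name `counting_sort` and the optional `key` parameter are assumptions made for the example. Keys are assumed to be integers.

```python
def counting_sort(items, key=lambda x: x):
    """Stable sort of items by small integer keys, without comparing items."""
    if not items:
        return []
    lo = min(key(x) for x in items)
    hi = max(key(x) for x in items)
    counts = [0] * (hi - lo + 1)
    for x in items:
        counts[key(x) - lo] += 1
    # Turn counts into starting positions (exclusive prefix sums).
    total = 0
    for i, c in enumerate(counts):
        counts[i] = total
        total += c
    out = [None] * len(items)
    for x in items:  # left-to-right pass keeps equal keys in input order
        k = key(x) - lo
        out[counts[k]] = x
        counts[k] += 1
    return out
```

The `counts` array has one slot per possible key value, which is why the method only suits inputs whose key range is not much larger than the number of items.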
double metaphone
An algorithm to code English words (and foreign words often heard in the United States) phonetically by reducing them to a combination of 12 consonant sounds. It returns two codes if a word has two plausible pronunciations, such as a foreign word. This reduces matching problems from wrong spelling.
select mode
An algorithm to find the mode, or most frequently occurring value, in a group of elements.
Dijkstra's algorithm
An algorithm to find the shortest paths from a single source vertex to all other vertices in a weighted, directed graph. All weights must be nonnegative.
shadow merge insert
An algorithm to insert a new node into a shadow heap. The new node is placed in the unordered table. When the table grows beyond some threshold size, table nodes and all their parents are reordered so everything is a heap.
hsadelta
An algorithm to produce a sequence of insert and copy commands (an edit script) which creates a new version file from a reference file. To begin, hash every block of the reference file and store every hash in a hash value array. Build a suffix array and three other data structures for quick access. Beginning at the first location in the version file, hash a block and look for the longest match in the reference file. Upon a match, encode an insert back to the previous match and a copy of the match. If no match, look at the next location. At the end, encode an insert for remaining unmatched characters.
Johnson's algorithm
An algorithm to solve the all pairs shortest path problem in a sparse weighted, directed graph. First, it adds a new node with zero weight edges from it to all other nodes, and runs the Bellman-Ford algorithm to check for negative weight cycles and find h(v), the least weight of a path from the new node to node v. Next it reweights the edges using the nodes' h(v) values. Finally for each node, it runs Dijkstra's algorithm and stores the computed least weight to other nodes, reweighted using the nodes' h(v) values, as the final weight. The time complexity is O(V²log V + VE).
Floyd-Warshall algorithm
An algorithm to solve the all pairs shortest path problem in a weighted, directed graph by multiplying an adjacency-matrix representation of the graph multiple times. The edges may have negative weights, but no negative weight cycles. The time complexity is Θ (V³).
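A short sketch in the usual triple-loop dynamic programming form, which is how the algorithm is commonly written in practice. It assumes the input is a square matrix where `weights[i][j]` is the edge weight from i to j, `float("inf")` marks a missing edge, and the diagonal is 0; these conventions and the name `floyd_warshall` are assumptions for illustration.

```python
INF = float("inf")

def floyd_warshall(weights):
    """Return the matrix of shortest path weights between all pairs of vertices."""
    n = len(weights)
    dist = [row[:] for row in weights]  # copy so the input is not modified
    for k in range(n):  # allow vertex k as an intermediate stop
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

The three nested loops over V vertices give the Θ(V³) running time directly.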
stable
An algorithm where the relative order upon input of items with equal keys is always preserved in the output. Usually a sort algorithm.
off-line algorithm
An algorithm which is given the entire sequence of inputs in advance.
deterministic algorithm
An algorithm whose behavior can be completely predicted from the input.
oblivious algorithm
An algorithm whose behavior, by design, is independent of some property that influences a typical algorithm for the same problem.
interpolation-sequential search
An approximate location is interpolated from the first and last items of a sorted array, then a linear search finds the actual location.
rho-approximation algorithm
An approximation algorithm guaranteed to find a solution at most (or at least, as appropriate) ρ times the optimum. The ratio ρ is the performance ratio or relative performance guarantee of the algorithm.
absolute performance guarantee
An approximation algorithm will return a solution at most a bounded amount more (or less, as appropriate) than the optimum.
bucket
An area of storage where items with a common property are stored. Typically tree data structures and sort algorithms use many buckets, one for each group of items. Usually buckets are kept on disk.
suffix array
An array of all starting positions of suffixes of a string arranged in lexicographical order. This allows a binary search or fast substring search.
ordered array
An array whose items have some order. Usually, it means a sorted array, but may mean not fully ordered, for example, all values less than the median are in the first half.
edge coloring
An assignment of colors (or any distinct marks) to the edges of a graph. A coloring is a proper coloring if no two adjacent edges have the same color.
flow function
An assignment of flow values to the edges of a flow network that satisfies flow conservation, skew symmetry, and capacity constraints.
self-loop
An edge of a graph which starts and ends at the same vertex.
Bellman-Ford algorithm
An efficient algorithm to solve the single-source shortest-path problem. Weights may be negative. The algorithm initializes the distance to the source vertex to 0 and all other vertices to ∞. It then does V-1 passes (V is the number of vertices) over all edges relaxing, or updating, the distance to the destination of each edge. Finally it checks each edge again to detect negative weight cycles, in which case it returns false. The time complexity is O(VE), where E is the number of edges.
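The passes described above can be sketched like this, assuming vertices are numbered 0..V-1 and edges are (u, v, weight) triples. The function name and the choice to return a (flag, distances) pair are illustrative assumptions.

```python
def bellman_ford(num_vertices, edges, source):
    """Return (ok, dist); ok is False if a negative weight cycle is reachable."""
    dist = [float("inf")] * num_vertices
    dist[source] = 0
    for _ in range(num_vertices - 1):  # V-1 relaxation passes
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    for u, v, w in edges:  # one more pass: any improvement means a negative cycle
        if dist[u] + w < dist[v]:
            return False, dist
    return True, dist
```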
van Emde-Boas priority queue
An efficient implementation of priority queues where insert, delete, get minimum, get maximum, etc. take O(log log N) time, where N is the total possible number of keys. Depending on the circumstance, the implementation is null (if the queue is empty), an integer (if the queue has one integer), a bit vector of size N (if N is small), or a special data structure: an array of priority queues, called the bottom queues, and one more priority queue of array indexes of the bottom queues.
American flag sort
An efficient, in-place variant of radix sort that distributes items into hundreds of buckets. The first step counts the number of items in each bucket, and the second step computes where each bucket will start in the array. The last step cyclically permutes items to their proper bucket. Since the buckets are in order in the array, there is no collection step. The name comes by analogy with the Dutch national flag problem in the last step: efficiently partition the array into many "stripes". Using some efficiency techniques, it is twice as fast as quicksort for large sets of strings.
D-adjacent
An entry reachable for a (d-1)-extremal entry through a unit vertical, horizontal, or diagonal-mismatch step.
Recurrence Relation
An equation that is defined in terms of itself. Any polynomial or exponential can be represented by a recurrence.
tree automaton
An extension of a finite state machine that operates on n-ary constructors. Where a finite state automaton reaches a new state with a single state and character, a tree automaton takes n states and constructors. Tree automata may be top-down (starting from the root) or bottom-up (starting from the leaves), and deterministic or nondeterministic.
optimal polyphase merge sort
An external sort algorithm that uses optimal polyphase merges.
oracle tape
An extra tape used by an oracle Turing machine to make decisions otherwise not feasible.
circular queue
An implementation of a bounded queue using an array.
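One possible sketch, assuming a fixed-capacity Python list as the backing array and modular arithmetic to wrap the head and tail positions. The class and method names are hypothetical.

```python
class CircularQueue:
    """Bounded FIFO queue stored in a fixed-size array that wraps around."""

    def __init__(self, capacity):
        self._data = [None] * capacity
        self._head = 0   # index of the oldest item
        self._size = 0   # number of items currently stored

    def __len__(self):
        return self._size

    def enqueue(self, item):
        if self._size == len(self._data):
            raise OverflowError("queue is full")
        tail = (self._head + self._size) % len(self._data)
        self._data[tail] = item
        self._size += 1

    def dequeue(self):
        if self._size == 0:
            raise IndexError("queue is empty")
        item = self._data[self._head]
        self._data[self._head] = None  # drop the reference
        self._head = (self._head + 1) % len(self._data)
        self._size -= 1
        return item
```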
Hash tables: memory
An implementation of a dictionary. Memory: O(n)
Selection Sort
An in-place comparison sort algorithm, O(n^2). The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that occupy the rest of the list. Initially, the sorted sublist is empty and the unsorted sublist is the entire input list. The algorithm proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging (swapping) it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right
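A brief sketch of the procedure for ascending order, sorting a Python list in place. The name `selection_sort` is chosen for illustration.

```python
def selection_sort(a):
    """Sort list a in place in ascending order and return it."""
    n = len(a)
    for i in range(n - 1):           # a[:i] is the sorted sublist
        smallest = i
        for j in range(i + 1, n):    # find the smallest unsorted element
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
    return a
```

Each outer iteration performs at most one swap, which is why the swap count is O(n) even though comparisons are O(n^2).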
comb sort
An in-place sort algorithm that repeatedly reorders different pairs of items. On each pass swap pairs of items separated by the increment or gap, if needed, and reduce the gap (divide it by about 1.3). The gap starts at about 3/4 of the number of items. Continue until the gap is 1 and a pass had no swaps.
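A small sketch of the gap-shrinking loop. It assumes a shrink factor of 1.3 applied before the first pass, so the first gap is roughly 3/4 of the number of items; the function name is illustrative.

```python
def comb_sort(a):
    """Sort list a in place in ascending order and return it."""
    gap = len(a)
    swapped = True
    while gap > 1 or swapped:
        gap = max(1, int(gap / 1.3))   # shrink the gap, never below 1
        swapped = False
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                swapped = True
    return a
```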
J sort
An in-place sort algorithm that uses strand sort to sort fewer than about 40 items and shuffle sort to sort more.
JSort
An in-place sort that partially orders the array twice with build-heap, once moving lesser items earlier and once in reverse moving greater items later, then uses insertion sort on the nearly-ordered array.
Inverted Index
An index data structure storing a mapping from content, such as words or numbers, to its locations in a database file, or in a document or a set of documents (named in contrast to a Forward Index, which maps from documents to content). The purpose of an inverted index is to allow fast full text searches, at a cost of increased processing when a document is added to the database.
inverted index
An index into a set of texts of the words in the texts. The index is accessed by some search method. Each index entry gives the word and a list of texts, possibly with locations within the text, where the word occurs.
forward index
An index into a set of texts. This is usually created as the first step to making an inverted index.
Chinese remainder theorem
An integer n can be solved uniquely mod LCM(A(i)), given modulii (n mod A(i)), A(i) > 0 for i=1..k, k > 0. In other words, given the remainders an integer gets when it's divided by an arbitrary set of divisors, you can uniquely determine the integer's remainder when it is divided by the least common multiple of those divisors.
probabilistically checkable proof
An interactive proof system in which provers follow a fixed strategy, that is, one not affected by any messages from the verifier. The prover's strategy for a given instance x of a decision problem can be represented by a finite oracle language Bx, which constitutes a proof of the correct answer for x.
Internal Sorting
An internal sort is any data sorting process that takes place entirely within the main memory of a computer. This is possible whenever the data to be sorted is small enough to all be held in the main memory. For sorting larger datasets, it may be necessary to hold only a chunk of data in memory at a time, since it won't all fit. The rest of the data is normally held on some larger, but slower medium, like a hard-disk. Any reading or writing of data to and from this slower media can slow the sortation process considerably.
block addressing index
An inverted index that includes the block, or general location, within texts, in addition to the text in which the word appears.
full inverted index
An inverted index that includes the exact location within texts, in addition to the text in which the word appears.
inverted file index
An inverted index that only indicates the text in which a word appears, not where the word appears within the text.
blossom
An odd length cycle which appears during a matching algorithm on general graphs.
binomial tree
An ordered tree of order k ≥ 0, that is B_k, whose root has k children where the ith child is a binomial tree of order k-i.
Gray code
An ordering of 2^n binary numbers such that only one bit changes from one entry to the next. Gray codes for 4 or more bits are not unique, even allowing for permutation or inversion of bits.
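As an illustration, the reflected binary Gray code is one such ordering (not the only one). A minimal sketch, with `gray_code` as a hypothetical helper name:

```python
def gray_code(n_bits):
    """Return the reflected binary Gray code sequence for n_bits bits."""
    return [i ^ (i >> 1) for i in range(1 << n_bits)]
```

For two bits this yields 0, 1, 3, 2, that is 00, 01, 11, 10 in binary.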
data structure
An organization of information, usually in memory, for better algorithm efficiency, such as queue, stack, linked list, heap, dictionary, and tree, or conceptual unity, such as the name and address of a person. It may include redundant information, such as length of the list or number of nodes in a subtree.
bipartite graph
An undirected graph where vertices can be partitioned into two sets such that no edge connects vertices in the same set.
set
An unordered collection of values where each value occurs at most once. A group of elements with three properties: (1) all elements belong to a universe, (2) either each element is a member of the set or it is not, and (3) the elements are unordered.
Reduction
Analysis pattern. Use a well-known solution to some other problem as a subroutine.
randomized algorithm
Any algorithm that makes random (or pseudorandom) choices.
probabilistic algorithm
Any algorithm that works for all practical purposes but has a theoretical chance of being wrong.
feasible solution
Any element of the feasible region of an optimization problem.
Subtree
Any node and its descendants form a subtree of the original tree. A subtree of a node is a tree rooted at a child of that node.
complexity class
Any of a set of computational problems with the same bounds (Θ(n)) on time and space, for deterministic and nondeterministic machines.
Java Stack/Program Stack
At the time the program is called, the activation record is pushed onto a stack called the Java Stack or the Program Stack
merge
Combine two or more sorted sequences of data into a single sorted sequence.
Selection Sort
Comparisons: O(N^2) Swaps: O(N) Big O: O(N^2) --> faster than bubble sort
A min heap is a _____ binary tree?
Complete
Heap Priority Queue: insert, max, extract max, increase value
Heap-insert: O(lg n) Heap-maximum: O(1) Heap-extract-max: O(lg n) Heap-increase-value: O(lg n)
What methods should we have in a dynamic array, and what should they do?
A constructor that sets n = 0 and capacity = 1, and sets A by calling self.make_array. len returns the number of items; getItem takes a parameter k and returns the item at index k. append takes an ele parameter and adds it at the end. A resize method takes new_cap as a parameter and makes the array larger. make_array takes a new_cap parameter and returns a new array of that capacity.
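A rough sketch of such a class. The attribute names n, capacity, and A follow the card; the doubling growth policy, the leading underscore on `_resize`, and the use of a plain Python list as the backing store are assumptions made for the example.

```python
class DynamicArray:
    """Growable array that doubles its capacity when full."""

    def __init__(self):
        self.n = 0                     # number of stored items
        self.capacity = 1              # size of the backing array
        self.A = self.make_array(self.capacity)

    def __len__(self):
        return self.n

    def __getitem__(self, k):
        if not 0 <= k < self.n:
            raise IndexError("index out of bounds")
        return self.A[k]

    def append(self, ele):
        if self.n == self.capacity:    # backing array is full
            self._resize(2 * self.capacity)
        self.A[self.n] = ele
        self.n += 1

    def _resize(self, new_cap):
        B = self.make_array(new_cap)
        for k in range(self.n):        # copy existing items over
            B[k] = self.A[k]
        self.A = B
        self.capacity = new_cap

    def make_array(self, new_cap):
        return [None] * new_cap
```

Doubling keeps append at amortized O(1), since each resize copies n items but is followed by about n cheap appends.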
rectilinear
Distance, paths, lines, etc. which are always parallel to axes at right angles. For example, a path along the streets of Salt Lake City or the moves of a rook in chess.
polychotomy
Division into many distinct classifications.
LCS
Either longest common subsequence or longest common substring.
longest common substring
Find the longest substring of two or more strings.
tournament
Find the maximum of a set of n elements in log n "rounds" (passes) by "playing" (comparing) pairs of elements and advancing the winner (greater) of each pair to the next round. It takes n-1 comparisons, like linear search, but may be parallelized, extended to also find the second greatest element, etc.
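A sequential sketch of the rounds, counting comparisons to show the n-1 total. The function name and the (winner, comparisons) return value are illustrative; an element without a partner in a round simply advances.

```python
def tournament_max(items):
    """Return (maximum, number of comparisons) using pairwise rounds."""
    if not items:
        raise ValueError("need at least one element")
    current = list(items)
    comparisons = 0
    while len(current) > 1:
        winners = []
        for i in range(0, len(current) - 1, 2):
            comparisons += 1
            winners.append(max(current[i], current[i + 1]))
        if len(current) % 2:           # odd one out gets a bye
            winners.append(current[-1])
        current = winners
    return current[0], comparisons
```

The pairs within one round are independent of each other, which is what makes the method easy to parallelize.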
nearest neighbor
Find the point (rectangle, line, etc.) that is closest to another point.
single-destination shortest-path problem
Find the shortest path from each vertex in a weighted, directed graph to a specific destination vertex.
single-source shortest-path problem
Find the shortest paths from a specific source vertex to every other vertex in a weighted, directed graph. Dijkstra's algorithm solves this if all weights are nonnegative. The Bellman-Ford algorithm handles any weights.
shortest common supersequence
Find the shortest string that contains two or more strings as subsequences.
shortest common superstring
Find the shortest string that contains two or more strings as substrings.
all pairs shortest path
Find the weight (or length) of the shortest paths between all pairs of vertices in a weighted, directed graph.
Graph: definition
Finite set of vertices connected by edges, directed or not.
What are the two reasons to use recursion?
First, recursion is a technique in which a function makes one or more calls to itself. Second, a data structure may use smaller instances of the exact same type of data structure in its representation.
rank
For a given match, this is the number of matches in a longest chain terminating with that match, inclusive.
Stirling's formula
For large values of n, (n/e)^n √(2πn) < n! < (n/e)^n (1 + 1/(12n-1)) √(2πn).
linear product
For two vectors X and Y, and with respect to two suitable operations ⊗ and ⊕, is a vector Z = Z_0 Z_1 ... Z_{m+n} where Z_k = ⊕_{i+j=k} X_i ⊗ Y_j (k = 0, ..., m+n).
Ford-Fulkerson method
Given a flow function and its corresponding residual graph (a maximum-flow problem), select a path from the source to the sink along which the flow can be increased and increase the flow. Repeat until there are no such paths.
Malhotra-Kumar-Maheshwari blocking flow
Given a flow function and its corresponding residual graph (a maximum-flow problem), select a vertex with the least throughput and greedily push the maximum flow from it to the sink. This is repeated until all vertices are deleted.
matrix-chain multiplication problem
Given a sequence of matrices such that any matrix may be multiplied by the previous matrix, find the best association such that the result is obtained with the minimum number of arithmetic operations. One may use dynamic programming to find the best association.
capacitated facility location
Given a set of demand points, a distance function, and a parameter p, find a set of p supply objects (points, lines, segments, etc.) which minimizes some distance objective function such that no supply supports too many demand points.
facility location
Given a set of demand points, a distance function, and a parameter p, find a set of p supply objects (points, lines, segments, etc.) which minimizes some distance objective function. The function may be the maximum distance between any demand point and the nearest supply, so no demand point is too far from a supply, or the sum of distances to the nearest supply.
Post's correspondence problem
Given a set of pairs of strings, find a sequence of pairs such that the concatenation of all first members of the pairs is the same string as the concatenation of all second members. This is an undecidable problem.
HashMap underlying structure:
HashTable with chained buckets
Hashing
Hashing is the process of computing the location within the hash table where an element will be (or is) kept.
average case
Having to do with the mathematical average of all cases.
huge sparse array
Let N be the number of items to store and R be the size of the range of key values; R >> N. Allocate, but don't initialize, two arrays: an item array I, where |I|≥N, and a location array L, where |L|=R. Initialize a variable, next, the number of items, to zero (with 0-based indexing). To insert an item, put it in the next place in the item array and save where to find it in the location array. I[next] = item; L[item.key] = next; next++; To look up an item by key, get the index from the location array. If the index is invalid or refers to the wrong item, the item is not found. index = L[key]; if (index < 0 OR index >= next) return NOTFOUND; if (I[index].key != key) return NOTFOUND; return I[index]; Inserting N items takes Θ(N) total time, assuming allocation takes constant time. Retrieving an item by key (or responding "not found") takes constant time. Listing all items takes Θ(N) time using I.
ideal merge
Merge n sorted streams into one output stream. To begin the streams are sorted by the value of the head element of each. Then the head of the first stream, which is the least since the streams were sorted, is removed and written to the output. That stream is inserted back into the list of streams according to its new head. Taking the head of the first stream and reinserting that stream is repeated until all elements have been processed. Using linear search to insert a stream into the list, the execution time is Θ(M N) where M is the total number of elements. Keeping the streams in a heap, the execution time is Θ(M log N).
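The heap-based variant can be sketched as below, assuming each stream is an already sorted Python iterable. The name `ideal_merge` and the sentinel `_END` are introduced only for this example; the standard library's `heapq.merge` offers the same behavior lazily.

```python
import heapq

_END = object()  # sentinel marking an exhausted stream

def ideal_merge(streams):
    """Merge sorted iterables into one sorted list using a heap of stream heads."""
    iterators = [iter(s) for s in streams]
    heap = []
    for idx, it in enumerate(iterators):
        head = next(it, _END)
        if head is not _END:
            heap.append((head, idx))   # idx breaks ties between equal heads
    heapq.heapify(heap)
    out = []
    while heap:
        value, idx = heapq.heappop(heap)      # least head among all streams
        out.append(value)
        head = next(iterators[idx], _END)     # reinsert the stream by its new head
        if head is not _END:
            heapq.heappush(heap, (head, idx))
    return out
```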
What does the min heap variation of a binary heap do, and what does the max heap variation do?
In a min heap, the smallest key is at the front (the root). In a max heap, the largest key is at the front.
LinkedQueue
More flexible than an array, as the size can grow and shrink at runtime. No need to check whether the queue is full. Extra storage overhead for the links. All operations are O(1).
O(1)
O(1) = constant time: the cost does not depend on the size of the input.
Rehashing Complexity:
O(N)-- costly. Carefully select initial TS to avoid re-hashing.
Complexity for iterating over associated values:
O(T.S + N) --> worst case.
What Is The Best, Average, Worst, and Space Complexity Of Bubble Sort?
O(n), O(n^2), O(n^2), and O(1)
Benford's law
On a wide variety of statistical data, the first digit is d with the probability log10 ( 1 + 1/d ).
simulation theorem
One kind of computation can be simulated by another kind within stated complexity bounds. Most known containment or equality relationships between complexity classes were proved this way.
partially decidable problem
One whose associated language is a recursively enumerable language. Equivalently, there exists an algorithm that halts and outputs 1 for every instance having a "yes" answer, but for instances having a "no" answer is allowed either not to halt or to halt and output 0.
What Are The Three Methods Of A Queue?
A queue has enqueue, which adds an item; dequeue, which removes an item; and isEmpty.
What Is A Queue?
Queue is a FIFO data structure
external quicksort
Read the M/2 first and last elements into a buffer (the buffer acts like the pivot in quicksort), and sort them. Read the next element from the beginning or end to balance writing. If the next element is less than the least of the buffer, write it to available space at the beginning. If greater than the greatest, write it to the end. Otherwise write the greatest or least of the buffer, and put the next element in the buffer. Keep the maximum lower and minimum upper keys written to avoid resorting middle elements that are in order. When done, write the buffer. Recursively sort the smaller partition, and loop to sort the remaining partition.
interpolation search
Search a sorted array by estimating the next position to check based on a linear interpolation of the search key and the values at the ends of the search interval.
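A compact sketch, assuming a sorted list of integers (so the interpolated position can be computed with integer arithmetic). The function name and the -1 "not found" convention are assumptions for illustration.

```python
def interpolation_search(a, key):
    """Return an index of key in sorted list a, or -1 if it is absent."""
    lo, hi = 0, len(a) - 1
    while lo <= hi and a[lo] <= key <= a[hi]:
        if a[hi] == a[lo]:
            return lo  # all values in the interval equal key
        # Estimate the position by linear interpolation between the ends.
        pos = lo + (key - a[lo]) * (hi - lo) // (a[hi] - a[lo])
        if a[pos] == key:
            return pos
        if a[pos] < key:
            lo = pos + 1
        else:
            hi = pos - 1
    return -1
```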
Red Black Tree: search, insert, delete
Search: O(lg n) Insert: O(lg n) Delete: O(lg n)
What Is The Average Case Time Complexity Of A Hash Table?
Search:O(1) Insertion:O(1) Deletion:O(1) Access:O(n)
string matching with errors
Searching for approximate (e.g., up to a predefined number of symbol mismatches, insertions, and deletions) occurrences of a pattern string in a text string. Preprocessing, e.g., building an index, may or may not be allowed.
random number generator
See pseudo-random number generator.
randomized search tree
See randomized binary search tree.
admissible vertex
See the explanation at path system problem.
graph coloring
See vertex coloring, edge coloring, or k-coloring.
shortest path
The problem of finding the shortest path in a graph from one vertex to another. "Shortest" may be least number of edges, least total weight, etc.
What Are The Methods We Should Have If We Created A Stack Class?
Stack() creates a new stack that is empty. It needs no parameters and returns an empty stack. push(item) adds item to top of stack and returns nothing. pop() removes item and returns it. peek() returns the top item no params. isEmpty() returns a boolean. size() returns number of items on the stack and it needs no parameters
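The interface listed above might look like this when backed by a Python list, with the end of the list acting as the top of the stack. This is one possible sketch, not the only way to implement it.

```python
class Stack:
    """LIFO stack; the end of the underlying list is the top."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()    # raises IndexError if the stack is empty

    def peek(self):
        return self._items[-1]      # top item, left in place

    def isEmpty(self):
        return len(self._items) == 0

    def size(self):
        return len(self._items)
```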
distributive partitioning sort
Step 1: find the median key. Step 2: distribute the n items into n/2 buckets linearly covering the interval from the minimum to the median and n/2 buckets linearly covering the interval from the median to the maximum. Step 3: compact the buckets, removing empty buckets. Recursively start again at step 1 for any bucket with multiple items. Linked lists are used to avoid moving items until a final phase and to avoid bucket overflows.
Palindrome
String that reads the same forwards as backwards
RP
The class of languages for which membership can be determined in polynomial time by a probabilistic Turing machine with no false acceptances and less than half false rejections. "RP" means "Randomized Polynomial" time.
NP-complete
The complexity class of decision problems for which answers can be checked for correctness, given a certificate, by an algorithm whose run time is polynomial in the size of the input (that is, it is NP) and no other NP problem is more than a polynomial factor harder. Informally, a problem is NP-complete if answers can be verified quickly, and a quick algorithm to solve this problem can be used to solve all other NP problems quickly.
NP-hard
The complexity class of decision problems that are intrinsically harder than those that can be solved by a nondeterministic Turing machine in polynomial time. When a decision version of a combinatorial optimization problem is proved to belong to the class of NP-complete problems, then the optimization version is NP-hard.
P
The complexity class of languages that can be accepted by a deterministic Turing machine in polynomial time.
Cook's theorem
The language SAT of satisfiable Boolean formulas is NP-complete.
Linear Linked Chain
The last node would contain null.
Kolmogorov complexity
The minimum number of bits into which a string can be compressed without losing information. This is defined with respect to a fixed, but universal decompression scheme, given by a universal Turing machine.
Hamming distance
The number of bits which differ between two binary strings. More formally, the distance between two strings A and B is ∑ | Ai - Bi |.
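A short sketch, assuming the two binary strings are given as equal-length Python strings of '0' and '1' characters; `hamming_distance` is an illustrative name.

```python
def hamming_distance(a, b):
    """Count the positions at which equal-length strings a and b differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(x != y for x, y in zip(a, b))
```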
out-degree
The number of edges going out of a vertex in a directed graph.
load factor
The number of elements in a hash table divided by the number of slots. Usually written α (alpha).
quadtree complexity theorem
The number of nodes in a quadtree region representation for a simple polygon (i.e. with nonintersecting edges and without holes) is O(p+q) for a 2^q × 2^q image with perimeter p measured in pixel widths. In most cases, q is negligible, and thus, the number of nodes is proportional to the perimeter. It also holds for three-dimensional data where the perimeter is replaced by surface area, and in general for d-dimensions where instead of perimeter we have the size of the (d-1)-dimensional interfaces between the d-dimensional objects.
key
The part of a group of data by which it is sorted, indexed, cross referenced, etc.
recursion termination
The point when conditions are met and a recursive algorithm ceases calling itself and begins to return values.
range
The possible results of a function or relation. For instance, the range of cosine is [-1,+1].
element uniqueness
The problem of determining if there are duplicates in a set of numbers.
assignment problem
The problem of finding a maximum (or minimum) weight matching in a weighted, bipartite graph.
longest common subsequence
The problem of finding a maximum length (or maximum weight) subsequence of two or more strings.
forest editing problem
The problem of finding an edit script of minimum cost which transforms a given forest into another given forest.
string editing problem
The problem of finding an edit script of minimum cost which transforms a given string into another given string.
string matching
The problem of finding occurrence(s) of a pattern string within another string or body of text. There are many different algorithms for efficient searching.
Byzantine generals
The problem of reaching a consensus among distributed units if some of them give misleading answers. To be memorable, the problem is couched in terms of generals deciding on a common plan of attack. Some traitorous generals may lie about whether they will support a particular plan and what other generals told them. Exchanging only messages, what decision making algorithm should the generals use to reach a consensus? What percentage of liars can the algorithm tolerate and still correctly determine a consensus?
In-Order Traversal
The process of systematically visiting every node in a tree once, starting at the root and proceeding left down the tree, accessing the first node encountered at its "center", proceeding likewise along the tree, accessing each node as encountered at the "center".
flow conservation
The property that no vertex, except the source and sink, of a flow network creates or stores flow. More formally, the incoming flow is the same as the outgoing flow, or, the net flow is 0.
skew symmetry
The property that the flow is the same amount, but reversed direction, starting from either vertex of every edge of a flow network. More formally, for an edge e=(v,w), f(v,w) = -f(w,v), where f(a,b) is the flow from a to b.
capacity constraint
The property that the flow on every edge of a flow network is no more than the edge's capacity. More formally, for all edges e, f(e) ≤ u(e), where f(e) is the flow on e and u(e) is its capacity.
efficiency
The resources an algorithm used to find an answer. It is usually measured in terms of the theoretical computations, such as comparisons or data moves, the memory used, the number of messages passed, the number of disk accesses, etc.
feasible region
The set of all possible solutions of an optimization problem.
connected components
The set of maximally connected components of an undirected graph.
terminal
The set of points in a plane or vertices in a graph defining endpoints of a Steiner tree.
polyhedron
The set of solutions to a finite system of linear inequalities on real-valued variables. Equivalently, the intersection of a finite number of linear half-spaces in R^n.
Minimum Spanning Trees
The smallest connected graph in terms of edge weight, minimizing the total length over all possible spanning trees. However, there can be more than one minimum spanning tree in a graph. All spanning trees of an unweighted graph are minimum spanning trees.
minimum cut
The smallest set of edges in an undirected graph which separate two distinct vertices. That is, every path between them includes some member of the set.
feedback edge set
The smallest set of edges whose deletion results in an acyclic graph.
minimum vertex cut
The smallest set of vertices in an undirected graph which separate two distinct vertices. That is, every path between them passes through some member of the cut.
feedback vertex set
The smallest set of vertices whose deletion results in an acyclic graph.
recurrence relation
The specification of a sequence of values in terms of earlier values in the sequence and base values.
subtree
The tree which is a child of a node.
cube root
This describes a "long hand" or manual method of calculating or extracting cube roots. Calculation of a cube root by hand is similar to long-hand division or manual square root.
topology tree
Tree that describes a balanced decomposition of another tree, according to its topology. More precisely, given a restricted multilevel partition for a tree T, a topology tree satisfies the following: (1) A topology tree node at level l represents a vertex cluster at level l in the restricted multilevel partition, and (2) A node at level l ≥ 1 has at most two children, representing the vertex clusters at level l-1 whose union gives the vertex cluster the node represents.
Type erasure
Type erasure is any technique in which a single type can be used to represent a wide variety of types that share a common interface. In C++, the term type erasure is strongly associated with the particular technique that uses templates in the interface and dynamic polymorphism in the implementation. 1. A union is the simplest form of type erasure. - It is bounded, and all participating types have to be mentioned at the point of declaration. 2. A void pointer is a low-level form of type erasure. Functionality is provided by pointers to functions that operate on void* after casting it back to the appropriate type. - It is unbounded, but type unsafe. 3. Virtual functions offer a type-safe form of type erasure. The underlying void and function pointers are generated by the compiler. - It is unbounded, but intrusive. - Has reference semantics. 4. A template-based form of type erasure provides a natural C++ interface. The implementation is built on top of dynamic polymorphism. - It is unbounded and unintrusive. - Has value semantics.
Breadth-first search
Visits all of a vertex's neighbors before visiting their neighbors, so the graph is explored level by level. Often used to find the shortest path from one vertex to another in an unweighted graph. Usually implemented with a queue.
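A minimal sketch of the queue-based search, returning a path with the fewest edges. The adjacency-dict input and the name `bfs_shortest_path` are assumptions for illustration.

```python
from collections import deque

def bfs_shortest_path(graph, source, target):
    """Return a shortest path (fewest edges) from source to target, or None."""
    parent = {source: None}  # doubles as the visited set
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if u == target:
            # Walk the parent links back to the source, then reverse.
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in graph[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return None
```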
Postorder Traversal
Visits the root of a binary tree after visiting the nodes in the root's subtrees. Order: Visit all the nodes in the root's left subtree. Visit all the nodes in the root's right subtree. Visit the root.
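The order above can be sketched recursively. The `Node` class and the `postorder` function are hypothetical names chosen for this example.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right

def postorder(node):
    """Return the values of the tree rooted at `node` in postorder."""
    if node is None:
        return []
    # Left subtree, then right subtree, then the root itself.
    return postorder(node.left) + postorder(node.right) + [node.value]

tree = Node(1, Node(2, Node(4), Node(5)), Node(3))
print(postorder(tree))  # [4, 5, 2, 3, 1]
```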
What is the advantage of a doubly linked list over a singly linked list?
We can have a traversal that goes both ways.
How Does Bubble Sort Work?
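We pass through the array repeatedly, comparing each element with the one at index + 1 and swapping them if they are out of order. We keep looping through the whole array until a pass makes no swaps.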
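A minimal sketch of that idea, assuming ascending order and returning a sorted copy rather than sorting in place:

```python
def bubble_sort(items):
    """Return a new list with the items in ascending order."""
    data = list(items)
    for end in range(len(data) - 1, 0, -1):
        swapped = False
        for i in range(end):
            # Compare each element with its right-hand neighbor.
            if data[i] > data[i + 1]:
                data[i], data[i + 1] = data[i + 1], data[i]
                swapped = True
        if not swapped:
            break  # a pass with no swaps means the list is sorted
    return data
```

After each pass the largest remaining item has reached its final position, which is why the inner loop can stop one index earlier each time.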
Why do we use prime numbers for table size?
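Indices are computed with mod table size, and a prime table size spreads keys over the most distinct slots, because keys that share a common factor with the table size pile up in only a few slots. (2*ts+1)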
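A small illustration of the effect, assuming the simplest hash `key % table_size` and keys that are all multiples of 4. The helper `slots_used` is a hypothetical name.

```python
def slots_used(keys, table_size):
    """Count how many distinct slots the keys land in under key % table_size."""
    return len({key % table_size for key in keys})

keys = range(0, 400, 4)      # 100 keys, all multiples of 4
print(slots_used(keys, 8))   # 2: only slots 0 and 4 are ever hit
print(slots_used(keys, 7))   # 7: a prime size uses every slot
```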
Anagram
a word, phrase, or name formed by rearranging the letters of another, such as cinema, formed from iceman.
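One way to test whether two strings are anagrams is to compare their letter counts. This sketch assumes case and non-alphanumeric characters should be ignored; `is_anagram` is an illustrative name.

```python
from collections import Counter

def letter_counts(text):
    """Count letters and digits, ignoring case, spaces, and punctuation."""
    return Counter(ch.lower() for ch in text if ch.isalnum())

def is_anagram(a, b):
    return letter_counts(a) == letter_counts(b)

print(is_anagram("cinema", "iceman"))  # True
```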
Ordered List - add
add searches through the list to locate where the new node should be added. It then uses the addBefore method of DoubleList to add it.
if (u,v) is the last edge of the simple path from the root to vertex v, v is the _ of u
child
Multiset methods
count() sum()
Euclidean Algorithm GCD
def gcd(a, b):
    while a:
        b, a = a, b % a
    return b
Indexed List - operations
get returns the element in the node returned by findNode. set locates the ith node using findNode, changes the element stored in the node and then returns the old element.
Hashtable<K,V> & HashMap<K,V> class
java.util classes that implement the Map<K,V> interface. K: type parameter for the key. V: type parameter for the associated value. Operations: lookup, insert, delete. The constructor lets you set the initial capacity and load factor. Handles collisions with chained buckets. Only HashMap allows null for keys and values; Hashtable does not.
information theoretic bound
lower bounds on the execution of a computation based on the rate at which information can be accumulated.
Normal recursion to solve fibonacci is extremely slow. What should we use to fix it?
memoization
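A minimal sketch using the standard library's `functools.lru_cache` as the memo table. The indexing convention `fib(0) = 0`, `fib(1) = 1` is an assumption.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) = 0 and fib(1) = 1."""
    if n < 2:
        return n
    # Each fib(k) is computed once; repeated calls are served from the cache.
    return fib(n - 1) + fib(n - 2)

print(fib(50))  # 12586269025
```

With the cache, each value is computed once, so the number of calls grows linearly in n instead of exponentially.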
What is an example of a base case in recursion?
For n!, the base case is n == 0, where the result is 1.
Average Lower bound for adjacent swaps
n(n-1)/4, which is Ω(n^2)
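The bound comes from inversions: an adjacent swap removes at most one inversion, and a permutation of n distinct items has n(n-1)/4 inversions on average. The brute-force check below is only a sketch for small n; the function names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

def count_inversions(seq):
    """Count pairs (i, j) with i < j and seq[i] > seq[j]."""
    return sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )

def average_inversions(n):
    """Average inversion count over all n! permutations of 0..n-1."""
    perms = list(permutations(range(n)))
    return Fraction(sum(count_inversions(p) for p in perms), len(perms))

print(average_inversions(4))  # 3, which equals 4*3/4
```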
Circular queue - enqueue
rear = (rear+1)%queue.length;
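The same wrap-around update, placed in context as a Python sketch. The class layout, the `size` counter, and the starting value `rear = -1` are assumptions made for this example.

```python
class CircularQueue:
    def __init__(self, capacity):
        self.queue = [None] * capacity
        self.front = 0   # index of the oldest item
        self.rear = -1   # index of the most recently enqueued item
        self.size = 0

    def enqueue(self, item):
        if self.size == len(self.queue):
            raise OverflowError("queue is full")
        # Advance rear, wrapping to index 0 after the last slot.
        self.rear = (self.rear + 1) % len(self.queue)
        self.queue[self.rear] = item
        self.size += 1

    def dequeue(self):
        if self.size == 0:
            raise IndexError("queue is empty")
        item = self.queue[self.front]
        self.queue[self.front] = None
        self.front = (self.front + 1) % len(self.queue)
        self.size -= 1
        return item
```

The modulo lets the rear index reuse slots freed by earlier dequeues instead of running off the end of the array.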