cs exam 2 (bruh)
rbt case 2
Z.uncle is red: recolor parent, grandparent, and uncle
Suppose the numbers 7,5,1,8,3,6,0,9,4,2 are inserted in that order into an initially empty binary search tree. What is the in-order traversal sequence of the tree?
0,1,2,3,4,5,6,7,8,9
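A minimal sketch of why the answer is the sorted sequence: insert the keys into an unbalanced BST, then read them back with an in-order traversal (the Node class and function names here are just illustrative helpers).

```python
# Minimal BST sketch: insert keys, then traverse in-order.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def insert(root, key):
    # Standard BST insert: smaller keys go left, larger go right.
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    else:
        root.right = insert(root.right, key)
    return root

def inorder(root):
    # left subtree -> root -> right subtree yields keys in sorted order.
    if root is None:
        return []
    return inorder(root.left) + [root.key] + inorder(root.right)

root = None
for k in [7, 5, 1, 8, 3, 6, 0, 9, 4, 2]:
    root = insert(root, k)

print(inorder(root))  # -> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Any insertion order of these ten keys gives the same in-order output, since in-order traversal of a BST is always sorted.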
Linked Lists Advantages
1) Flexible for dynamic sets (constantly inserting and deleting) 2) Insertion, deletion, retrieval allowed anywhere 3) Insert and delete are O(1) when the cursor already points to the location, because linked lists don't rely on physical adjacency (search is still O(n))
The preorder traversal sequence of a binary search tree is 30, 20, 10, 15, 25, 23, 39, 35, 42. Which one of the following is the postorder traversal sequence?
15, 10, 23, 25, 20, 35, 42, 39, 30
The following numbers are inserted into an empty binary search tree in the given order: 10,1,3,5,15,12,16. What is the height of the binary search tree?
3
Example of phone numbers for good vs bad hash
A bad hash function takes the first three digits (the area code) because it will not uniformly distribute the keys; too many numbers share the same area code. A good hash function takes the last three digits.
The main problem of hash tables is
COLLISION
Delete operation in a queue is called
Dequeue
A good hash function has three key requirements:
1) Deterministic: equal keys produce the same hash value 2) Efficient to compute 3) Uniformly distributes the keys: each table position equally likely for each key
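A minimal sketch of the phone-number example, assuming 10-digit numbers stored as strings and a hypothetical 1000-slot table. Both functions are deterministic and efficient; only the second distributes keys well.

```python
# Bad vs good hash for phone numbers (table size is an assumption).

TABLE_SIZE = 1000

def bad_hash(phone):
    # First three digits = area code: many keys land in the same slot.
    return int(phone[:3]) % TABLE_SIZE

def good_hash(phone):
    # Last three digits vary nearly uniformly across subscribers.
    return int(phone[-3:]) % TABLE_SIZE

numbers = ["5550001234", "5550009876", "5550004321"]
print([bad_hash(n) for n in numbers])   # all three collide in slot 555
print([good_hash(n) for n in numbers])  # spread across the table
```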
Direct access table vs hash table (Time efficiency)
Direct access tables are FAST, always O(1) operations. Hash tables are slower: O(1) in the best case and O(n) in the worst case. Hash tables' space complexity is always O(n).
Direct access table vs hash table (Accessing elements)
Direct access tables directly access elements using the key, faster Hash tables indirectly access elements with a hash function, slower
Direct access table vs hash table (Space efficiency)
Direct access tables take up a LOT of empty space Hash tables minimize the amount of empty space, so less space used
Insert operation in a queue is called
Enqueue
To balance itself, an AVL tree may perform the following four kinds of rotations
Left rotation, right rotation, left-right rotation, right-left rotation
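A minimal sketch of a single left rotation, assuming a hypothetical Node class with left/right pointers (heights and balance-factor bookkeeping omitted for brevity).

```python
# Left rotation: x's right child y becomes the new subtree root.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def rotate_left(x):
    # y's left subtree is re-attached as x's right subtree,
    # then x becomes y's left child.
    y = x.right
    x.right = y.left
    y.left = x
    return y

# Right-leaning chain 1 -> 2 -> 3 becomes balanced with 2 at the root.
root = Node(1, right=Node(2, right=Node(3)))
root = rotate_left(root)
print(root.key, root.left.key, root.right.key)  # -> 2 1 3
```

A right rotation is the mirror image; the left-right and right-left cases apply one rotation to the child and then the opposite rotation to the node itself.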
Adjacency matrix vs adjacency list
Matrix faster, list slower BUT matrix bigger, list smaller. Matrices use O(V^2) space regardless of how many edges exist, so use adjacency lists for large sparse graphs; lists are space efficient with O(V+E) memory. Using an adjacency matrix for a very large sparse graph is BAD; a matrix is only reasonable when the graph is small or dense.
Time complexity of close addressing (insert, search, delete)
O(1) average, O(n) worst-case
Best case time complexity for hashing
O(1) for best case
Search, insert, delete in BST is
O(h)
Search time complexity is _______ for a balanced BST
O(logn)
The height of a red-black tree is always
O(logn) b/c it's balanced. Whenever the red-black tree gets too unbalanced, it must rotate.
Time complexity of close addressing (space)
O(n) average, O(n) worst-case
Singly linked list
each node (list element) has a pointer to its successor (the next list element)
Strict binary tree AKA full binary tree
each node has 0 or 2 children
Three types of open addressing:
linear probing, quadratic probing, double hashing
Linear probing
linearly probe (i) for the next slot
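A minimal sketch of linear probing on a small fixed-size table, assuming integer keys, the identity hash (key mod table size), and no deletions; all names are hypothetical.

```python
# Linear probing: on a collision, try the next slot, then the next, ...

SIZE = 7
table = [None] * SIZE

def insert(key):
    # Probe h(key), h(key)+1, h(key)+2, ... until an empty slot is found.
    for i in range(SIZE):
        slot = (key + i) % SIZE
        if table[slot] is None:
            table[slot] = key
            return slot
    raise RuntimeError("table full")

def search(key):
    for i in range(SIZE):
        slot = (key + i) % SIZE
        if table[slot] is None:
            return None          # hit an empty slot: key is absent
        if table[slot] == key:
            return slot
    return None

insert(10)         # 10 % 7 = 3 -> slot 3
insert(17)         # also hashes to 3 -> probes forward to slot 4
print(search(17))  # -> 4
```

The two keys landing side by side illustrate primary clustering: consecutive occupied slots grow into runs that make later probes longer.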
Linked Lists (cache)
not cache friendly
Time complexity of doubly linked list (search, insert, delete):
search takes O(n), insert/delete is O(1)
Breadth-first search Each vertex keeps a status value:
visited, waiting, not visited. All nodes start in the not-visited state.
Collision
when two keys hash to the same slot
space function time complexity for hashing
will always be O(n)
Solution for collision
chaining
Close addressing
chaining: deal with collisions by creating an additional data structure. This is less space efficient than open addressing, but usually faster.
Universal hashing
choose the hash function at random, independently of the keys that are going to be stored. No matter which keys the adversary chooses, this randomization yields good performance on average.
Arrays (space, memory, size)
contiguous memory allocation, may take up more space (lots of empty slots), not very flexible because the size is fixed
We avoid cycles through
cycle detection through depth-first search
Nodes that have no children store a value of
-1 in the appropriate column
Red-black trees are binary search trees that satisfy the following 4 properties:
1) Every node is either red or black (this attribute takes up 1 storage bit) 2) The root node and the leaves are black 3) If a node is red, then both its children are black (no two consecutive red nodes on a path) 4) For each node, all simple paths from the node to descendant leaves contain the same number of black nodes
Arrays (time)
FAST! Can easily access an element using an index
Linked Lists (Insert/delete)
FAST! Can easily insert/delete something in the middle because the data is scattered around
Two-dimensional array holds
First column holds the root of the left subtree Second column holds the root of the right subtree
Balanced BSTs have height
H = logn
Insert operation in a stack is called
Push
Time complexity of red-black trees
Search, insert, delete are O(logn) because red-black trees are balanced
Time complexity of AVL Tree operations
Search, insert, delete are all O(logn)
Delete in binary search tree If number of children = 1
delete the node and connect its parent directly to its child
Linked Lists (space, memory, size)
discontinuous memory allocation, may take up less space, very dynamic and flexible when uncertain about size
What is the worst case time complexity for search, insert, and delete operations in a general binary search tree?
O(n) for all
Worst case time complexity for hashing
O(n) for worst case
Tree sort algorithm Worst case
O(n^2) if not balanced
Building the n-node binary search tree is
O(nlogn) on average; O(n^2) if the tree becomes unbalanced
Tree sort algorithm
O(nlogn)
Tree sort algorithm Best case
O(nlogn) if balanced
We are given a set of n distinct elements and an unlabeled binary tree with n nodes. In how many ways can we populate the tree with the given set so that it becomes a binary search tree?
Only 1 way
Delete operation in a stack is called
Pop
BFS vs DFS
BFS: Breadth-first search finds the shortest distance (also called the shortest path) to each reachable vertex in a graph G(V,E) from a given source vertex. Important for shortest path. Queue, FIFO. Time complexity is O(V+E). DFS: Depth-first search does not find the shortest path, but provides valuable info about the structure of a graph. Important for detecting cycles. Stack, LIFO. Time complexity is also O(V+E). Useful for: edge classification, cycle detection, topological sort, strongly connected components. There is no difference in time or memory between the two; the big difference is the order of visits.
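A minimal sketch contrasting the two traversals on one adjacency-list graph; the graph and its node labels are made up for illustration.

```python
# BFS uses a FIFO queue; DFS uses a LIFO stack. Same graph, different order.

from collections import deque

graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D"],
    "D": [],
}

def bfs(start):
    # Visit all neighbors before any non-neighbors.
    order, visited = [], {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nbr in graph[node]:
            if nbr not in visited:
                visited.add(nbr)
                queue.append(nbr)
    return order

def dfs(start):
    # Follow one path as far as it leads before backtracking.
    order, visited = [], set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in visited:
            visited.add(node)
            order.append(node)
            stack.extend(reversed(graph[node]))
    return order

print(bfs("A"))  # -> ['A', 'B', 'C', 'D']
print(dfs("A"))  # -> ['A', 'B', 'D', 'C']
```

Both functions touch every vertex and edge once, so both run in O(V+E); only the visit order differs.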
Unbalanced BSTs have height
H = n
AVL Trees are also
self-balanced binary search trees The difference between heights of left and right subtrees cannot be more than one for all nodes
Path
sequence of nodes such that each pair of nodes connected by an edge
Cycle
simple path in which the first and last nodes are connected. Cycles are bad in graphs because traversals can get stuck in infinite loops.
Arrays (Insert/delete)
slower, if you insert/delete something in the middle, you have to move everything down/up
Linked Lists (time)
slower, you have to traverse up to O(n) elements to reach an element
The performance of hash tables depends on
the hash function used
Variable root is
the index of the root of the tree
One-dimensional array holds
the information field of the node
Disadvantages of Direct Access Table
they require a LOT of storage, with LOTS of empty space! So, hash tables are used instead because they use much less space
Double hashing
use another hash function to look for the next slot
An AVL tree guarantees _________ for all operations by
O(logn) for all operations by maintaining an extra height attribute in each node
Time for an inorder traversal to visit every node once is
O(n)
Search time complexity is _______ for an unbalanced BST
O(n)
Time complexity to traverse the tree is
O(n)
Delete in binary search tree If number of children = 0
just delete it
Stack policy is
last-in, first-out (LIFO)
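A minimal sketch of LIFO behavior, using a plain Python list as the stack.

```python
# LIFO: the element popped is the one most recently pushed.
stack = []
stack.append(1)     # push
stack.append(2)     # push
stack.append(3)     # push
print(stack.pop())  # -> 3, the most recently inserted element
print(stack.pop())  # -> 2
```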
Minimum height binary search tree
a BST with the least possible height for its number of nodes
Adjacency list
more space efficient than a matrix: O(V+E) memory instead of O(V^2). Create an array (node list) containing each node as an element, then create a linked list for each element that contains all its neighbors (neighbor list); each entry in the linked list is a neighbor node.
Red-black trees are self-balancing, which means
no such path is more than twice as long as any other, so that the tree is approximately balanced
Complete binary tree
all levels except possibly the final level are full, and the final level's nodes are filled in from the left
Doubly linked list
a linked list in which each node has pointers to both its successor and its predecessor (forward and backward pointers). Contains: head node, tail node, cursor, count
Disadvantage of array-based implementation of binary search trees
notice that this array has a lot of empty spaces, so it uses more space than linked lists for binary trees
Chaining
place all elements that hash to the same slot into the same linked list
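A minimal sketch of chaining, assuming integer keys and the identity hash; each slot holds a Python list standing in for the linked-list chain.

```python
# Chaining: every key that hashes to a slot is appended to that slot's chain.

SIZE = 5
table = [[] for _ in range(SIZE)]

def insert(key):
    table[key % SIZE].append(key)

def search(key):
    # Only this slot's chain is scanned, not the whole table.
    return key in table[key % SIZE]

insert(7)
insert(12)        # 12 % 5 == 7 % 5 == 2: collision, same chain
print(table[2])   # -> [7, 12]
print(search(12)) # -> True
```

With a uniform hash, each chain averages n/m elements, which is why operations are O(1) on average but O(n) worst case.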
Disadvantage of Linear probing
primary clustering, consecutive elements form groups and the average search time increases
Quadratic probing
quadratically probe (i^2) for the next slot
Linked Lists (RAM)
random access (jumping directly to the i-th element) is NOT allowed
Arrays (RAM)
random access (jumping directly to the i-th element) is allowed
Linked lists drawbacks
1) Not cache friendly 2) Random access is not allowed 3) Takes up more memory/space (extra pointer per node)
In a red-black tree, insert and delete cause violations to the red-black properties Use two tools to rebalance:
1) recoloring and 2) rotation
You must know these three things to implement a linked list:
1) Total number of nodes 2) Head node 3) Tail node
AVL Tree vs Red Black Tree
BRING UP: balance, rotations, insertion/deletion, search. AVL trees are more balanced than red-black trees, but may cause more rotations during insertion and deletion. If insertions and deletions are frequent, red-black trees should be used; if insertions and deletions are less frequent and search is more frequent, then an AVL tree should be used.
Queue
Element deleted is the one that has been in the set for the longest time
Stack
Element deleted is the one most recently inserted
In delete operation of BST, we need an inorder successor (or predecessor) of a node when the node to be deleted has both left and right child as non-empty. Which of the following is true about the inorder successor in delete operation?
Inorder successor is always either a leaf node or a node with empty left child
Which of the following traversal outputs the data in sorted order in a BST?
Inorder traversal
Time complexity of Direct Access Table
Time complexity of direct access tables is the best b/c all operations are O(1). Best data structure for time.
rbt case 1
Z is the root: change the color of Z from red to black
rbt case 4
Z.uncle is black (line): rotate grandparent opposite of Z, then RECOLOR grandparent and parent
rbt case 3
Z.uncle is black (triangle), do NOT recolor: rotate parent opposite of Z
Hash function
a function that converts a big number or string (value or description) to a small integer (key) that can be used as an index in a hash table
Graph
a many-to-many data structure in which elements can be related to an arbitrary number of other elements. Nodes (vertices) connected by edges. Two nodes are adjacent if they share an edge (the edge is incident on them). Graph G = (V,E) where V is the set of vertices and E is the set of edges
Why the need for universal hashing? (worst case behavior)
a malicious person may choose the keys that hash all to the same slot, causing search time to be O(n)
Simple path
a path in which no vertices (thus no edges) are repeated
double-ended queue
a queue that allows insertion and deletion at both ends
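A minimal sketch using Python's collections.deque, which supports O(1) insertion and deletion at both ends.

```python
# Double-ended queue: both the front and the back are usable.

from collections import deque

d = deque([2, 3])
d.appendleft(1)     # insert at the front
d.append(4)         # insert at the back
print(list(d))      # -> [1, 2, 3, 4]
print(d.popleft())  # -> 1, delete from the front
print(d.pop())      # -> 4, delete from the back
```

Restricting yourself to append/popleft gives a plain FIFO queue; append/pop gives a LIFO stack.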
Walk
a sequence of vertices where each adjacent pair is connected by edges
Direct Access Table
a very big array that uses a lot of storage (lots of empty space) and uses keys to index (directly access) the data. Direct access tables map records to their corresponding keys using arrays, so keys are directly used as indexes, which is SUPER FAST!
Trail
a walk in which no edges are repeated
shortest path of a red-black tree
all black nodes
Perfect binary tree
all levels, including the final level, are full (THIS IS THE GOAL!) This is the most balanced binary tree
Full binary search tree
all non-leaf nodes have a degree of 2. H = logn, so the time to locate a node is O(logn)
Dense graph
almost all pairs of nodes are connected by edges (the number of edges is close to the maximum possible)
longest path of a red-black tree
alternating red and black
Arrays (cache)
better cache locality due to continuous memory allocation
Open Addressing
deal with collisions without an additional data structure. Much more space efficient than close addressing! All elements occupy the hash table itself; no lists or elements are stored outside the table. Avoids pointers; the slots to be examined are computed. Insert a key into the hash table by probing until an empty slot is found. The sequence of positions probed depends on the key being inserted; the hash function is extended to determine which slots to probe.
Deque
double-ended queue
Stacks and Queues are
dynamic sets in which the element removed from the set by the DELETE operation is prespecified
Weighted graph
edges are weighted if they have a value (numbers)
Directed graph
edges have a direction (arrows)
Undirected graph
edges have no direction, no arrows
Degenerate binary search tree
every nonterminal node has one child. H = n, so the time to locate a node is O(n)
Delete in binary search tree If number of children = 2, delete the node
find the successor by finding the minimum value in the right subtree (or the predecessor: the maximum of the left subtree). The inorder successor is always either a leaf node or a node with an empty left child
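A minimal sketch of locating the inorder successor: walk left from the right child until there is no left child. The Node class and the tiny example tree are hypothetical.

```python
# Inorder successor of a node with two children = min of its right subtree.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def min_node(root):
    # Keep going left; the leftmost node holds the minimum key.
    while root.left is not None:
        root = root.left
    return root

# Tree:      30
#           /  \
#          20   40
#              /
#             35
node = Node(30, Node(20), Node(40, Node(35)))
succ = min_node(node.right)  # successor of 30
print(succ.key)              # -> 35, a node with an empty left child
```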
Breadth-first search is a _________________ structure
first in, first out (it uses a queue). Enqueue node N, visit it, and mark it visited; then enqueue N's neighbors and mark them as waiting
Queue policy is
first-in, first-out (FIFO)
The purpose of AVL Trees
fix the worst-case scenario, unbalanced BST
Depth-first search
follows a specific path as far as it leads, then begins another path when the first is exhausted
Directed acyclic graph (DAG)
graph with directed edges and no cycles
Neighbor
if two nodes have an edge between them, they are neighbors
Adjacency matrix
uses an N x N two-dimensional array of boolean values (0 or 1). For an unweighted graph, you will have boolean values 0 or 1; for a weighted graph, you replace the boolean values with the weights. If the graph is directed, the weight only goes in that direction; if it's undirected, the weight goes in both directions
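A minimal sketch of the same undirected, unweighted 4-node graph stored both ways; the edge set is made up for illustration. The matrix costs O(V^2) space, the list O(V+E).

```python
# Same graph, two representations. Edges: 0-1, 0-2, 1-3.

matrix = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
]

adj_list = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}

# Both answer "is 1 adjacent to 3?": the matrix in O(1),
# the list in O(degree) time.
print(matrix[1][3] == 1)  # -> True
print(3 in adj_list[1])   # -> True
```

Note the matrix is symmetric because the graph is undirected; a weighted graph would store weights in place of the 1s.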
Hash Tables
uses a hash function to map keys into a hash table; a more space-efficient, indirect version of direct access tables that supports fancier keys while using less space
Breadth-first search
visit all neighbors before visiting non-neighbors
Binary tree traversal
visit every node exactly once
Traversal
visit every node in the graph exactly once
Postorder traversal
visit the left subtree → right subtree → root Postorder = visit the root last
Inorder traversal
visit the left subtree → root → right subtree (MOST USED!) If a binary tree is traversed in-order, the output will produce sorted key values in ascending order (lowest to highest)
Preorder traversal
visit the root → left subtree → right subtree Preorder = visit the root first
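A minimal sketch of all three traversal orders on one small BST (keys 1, 2, 3 with 2 at the root); the Node class is a hypothetical helper.

```python
# Preorder: root first. Inorder: root in the middle (sorted for a BST).
# Postorder: root last.

class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def preorder(n):
    return [] if n is None else [n.key] + preorder(n.left) + preorder(n.right)

def inorder(n):
    return [] if n is None else inorder(n.left) + [n.key] + inorder(n.right)

def postorder(n):
    return [] if n is None else postorder(n.left) + postorder(n.right) + [n.key]

root = Node(2, Node(1), Node(3))
print(preorder(root))   # -> [2, 1, 3]
print(inorder(root))    # -> [1, 2, 3]
print(postorder(root))  # -> [1, 3, 2]
```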