Midterm 2

2-3 Trees

2-3 Trees automatically maintain balance as they grow. They allow nodes to contain multiple values. 2-Nodes contain 1 value and have two children: left and right. 3-Nodes contain 2 values and have three children: left, middle, and right.

2-3 Tree Search

2-Nodes: if less than the value, go left; if greater than, go right. 3-Nodes (for values a, b): if less than a, go left; if between a and b, go middle; if greater than b, go right.

Directed Acyclic Graphs (DAG)

A DAG is a directed graph with no cycles. There can still be multiple paths from one node to another.

Minimum Spanning Trees

A Minimum Spanning Tree (MST) is a spanning tree that contains all of the vertices of a graph with the minimum possible total edge weight. Brute force is not feasible, as the number of possible spanning trees of a complete graph is N^(N-2). Two main algorithms solve MST. Kruskal's Algorithm: starts at the lowest-weighted edge and keeps adding the next-lowest edge that does not create a cycle. Uses a Union-Find set to merge the subtrees. Prim-Jarnik Algorithm: starts with an arbitrary vertex and adds the minimal edge that the ENTIRE tree can currently access. The algorithm is greedy, growing the tree along the cheapest available edge. Both are O(|E| log |V|), where E is the set of edges and V the set of vertices.
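Kruskal's approach can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: it assumes vertices are numbered 0..n-1 and edges are given as (weight, u, v) tuples, and the names kruskal and find are made up for this example.

```python
def kruskal(num_vertices, edges):
    """Return (total_weight, mst_edges) for an undirected weighted graph.

    Assumption: edges is a list of (weight, u, v) tuples and vertices
    are numbered 0 .. num_vertices - 1.
    """
    parent = list(range(num_vertices))  # every vertex starts as its own set

    def find(x):
        # Follow parent links to the representative, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total = 0
    mst = []
    for weight, u, v in sorted(edges):  # lowest-weight edge first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:            # edge joins two different subtrees
            parent[root_u] = root_v     # union: merge the subtrees
            mst.append((u, v, weight))
            total += weight
    return total, mst


edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (7, 1, 3), (3, 2, 3)]
total, mst = kruskal(4, edges)
print(total, mst)
```

Edges whose endpoints already share a representative are skipped, which is how the sketch avoids cycles.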

Bit Vectors

A bit vector can store finite, countable sets using bits. Also known as a bit array, bit set, or bit map. Each object in the universe is represented as a single bit in the string of bits. If x ∈ A, then the bit is 1, otherwise 0. Set operations then become bitwise AND and OR operations on machine words, so a whole set operation can take a single instruction (very fast!). Intersection: a & b. Union: a | b. Complement: ~a. Exclusion: a & ~b.
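The four operations can be tried out with plain Python integers as the bit string. This is only a sketch: the 8-element universe and the helper names to_bits and to_set are assumptions for illustration. Because Python integers are unbounded, the complement is masked to keep it inside the universe.

```python
UNIVERSE = ["a", "b", "c", "d", "e", "f", "g", "h"]  # hypothetical universe
INDEX = {item: i for i, item in enumerate(UNIVERSE)}
MASK = (1 << len(UNIVERSE)) - 1  # one bit per element of the universe


def to_bits(items):
    """Encode a collection of items as an integer bit vector."""
    bits = 0
    for item in items:
        bits |= 1 << INDEX[item]
    return bits


def to_set(bits):
    """Decode a bit vector back into a Python set."""
    return {item for item, i in INDEX.items() if (bits >> i) & 1}


a = to_bits({"a", "b", "c"})
b = to_bits({"b", "c", "d"})

intersection = a & b
union = a | b
complement = ~a & MASK  # mask because ~a is negative in Python
exclusion = a & ~b

print(to_set(intersection), to_set(union), to_set(complement), to_set(exclusion))
```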

Complete Graphs

A complete graph is a graph in which all pairs of vertices are adjacent: every vertex is connected to every other vertex. The number of edges in a complete undirected graph with n vertices is n × (n - 1) / 2. A non-complete graph has fewer edges than this.

Connected and Unconnected Graphs

A connected graph has a path from every vertex to every other vertex. An unconnected graph has at least one vertex that cannot reach another vertex. A connected component is a maximal connected subgraph of a given graph. In a connected graph, this is the entire graph.

Cycles

A cycle is a path that starts and ends at the same vertex: a sequence of adjacent vertices that loops back to where it began. An acyclic graph contains no cycles.

Hash Functions

A hash function can be anything, but it's best to find one that spreads the data evenly over the array and avoids collisions when possible. Random functions and modulus (h(k) = k mod N, where N is a prime table size and k is the key value) are popular choices.

Hashing

A hash function takes a key object as an argument, and returns a numeric index: hash(key) -> index. Given a specific key, this allows the index of an object to be computed directly, which can give dictionaries O(1) access.

Heap ADT

A heap is a binary tree in which the value of each node is smaller than (min-heap) or larger than (max-heap) the values of both of its children. Every subtree is a heap. Nodes are added in breadth-first order, so the resulting tree is always complete and balanced.

Internal Node

A node with at least one child.

Paths

A path is a sequence of consecutive vertices that are adjacent. Can represent a physical path (like streets), a logical connection, or anything like that. In a simple path, all vertices (and therefore all edges) of the path are distinct. The length of the path is the number of edges.

Red-Black Trees

A red-black tree is essentially a 2-3 Tree implemented using only 2-Nodes. All 3-Nodes are converted to a chain of two 2-Nodes, with the child node marked red, indicating that it belongs to a 3-Node with its parent. The red node is essentially part of the parent. A 4-Node occurs when there are two red nodes in a chain! All added nodes are red. Splits are handled using rotations. When a black node has two red direct children (after balancing rotations), you can "split" by recoloring the red nodes to black, and recoloring the parent node red.

Binary Search Tree

A special type of binary tree that sorts nodes by value. For each node: All nodes on left branch are less than it. All nodes on right branch are greater than it.

Shortest Path Attributes

A sub-path of the shortest path is also a shortest path. Any subset of an optimal solution is optimal. There is a tree of shortest paths. Multiple solutions from the start vertex to all the other vertices. Might be multiple equally-optimal solutions.

Tree Rotations

A tree rotation can be used to rearrange a tree in a way that preserves BST logic, and O(log n) time complexity for search and insert.
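A small sketch of left and right rotations, assuming a bare Node class with value, left, and right fields (these names are illustrative only). The in-order sequence is the same before and after, which is what "preserves BST logic" means here.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def rotate_right(node):
    """Promote node.left and return it as the new root of this subtree."""
    pivot = node.left
    node.left = pivot.right   # pivot's right subtree holds values between pivot and node
    pivot.right = node
    return pivot


def rotate_left(node):
    """Mirror image: promote node.right."""
    pivot = node.right
    node.right = pivot.left
    pivot.left = node
    return pivot


def in_order(node):
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


chain = Node(3, left=Node(2, left=Node(1)))  # left-leaning chain 3 -> 2 -> 1
root = rotate_right(chain)
print(root.value, in_order(root))
```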

Set Union

A union combines two sets into a new one: A ∪ B = { x | x ∈ A or x ∈ B }

Node

A unit of a linked list or a binary tree. Nodes usually contain data. May point to successor, predecessor, or other branch.

Weighted Graphs

A weighted graph places a value, or weight, on each edge. The values can be anything and are defined by the ADT using the graph. The weights are abstract. Often used to find minimum path.

AVL Tree

AVL Trees are height-balanced binary search trees. They keep track of the height of each subtree and reorder data as needed. Aggressively balances nodes to ensure O(log n) search. Adding nodes requires more work than in a normal BST. Each subtree uses a height property (maximum height of left and right subtree + 1). Leaves have a height of zero. If the heights of the right and left branches differ by at most 1, the AVL tree is balanced. If not, it will be balanced using rotations.

Arborescence

An arborescence is a directed graph where, if there is a path from one node to another, it is unique. So there's at most one path from a to b. ALL trees are acyclic graphs (which may be directed), but not all DAGs are trees, since some DAGs allow multiple paths to the same node. A Rooted Tree selects an arbitrary vertex as the root. A forest is a collection of trees.

Leaf

An external node, without children

Sets

An unordered collection of objects

BST Imbalances

BSTs become imbalanced in two distinct ways: Left-left imbalance or Left-right imbalance (plus their mirror images, right-right and right-left).

Graphs

Basis for ALL of computer science. Graphs are made up of Vertices and Edges. Formal Definition: A graph G = ( V, E ) is defined by a pair of two sets: a finite set V of items called vertices, and a finite set E of ordered pairs of vertices called edges. A vertex is a single point (or a node), and an edge is represented as a pair of vertices (a branch).

Adding a Node to a Min-Heap

Begins at the next available position for a leaf. The item must be "up-heaped": the entry is moved up depending on its value until the correct position is found. As this is done, nodes are swapped: parent and child change position. Since a heap always has a height of O(log n), upheap runs in O(log n) time.

Multiset

Closely related to a set but permits duplicate elements.

Union-Find - Union

Combines two sets into a single set by pointing one representative at the other representative. Union by Weight means the tree with fewer nodes is made a subtree of the larger tree.

Uses of Post-order Traversal

Computing space used for child nodes, calculating folder space, expression evaluation (as an alternative to the stack algorithm)

Union-Find - Makeset

Creates a set (within the Union-Find instance) that contains a single element. Also known as a singleton. Essentially, the same as an "add" for the ADT.

Set Difference

Difference removes all items found in one set from another. Typically, it is written either A - B or A \ B: A - B = { x | x ∈ A and x ∉ B }

Dijkstra's Algorithm

Dijkstra's Shortest Path First Algorithm computes the distance of all vertices from a given start vertex s. Works on directed and undirected graphs. The graph must be connected, and all edges must have non-negative weight. It is a greedy algorithm: it always takes the best immediate (local) solution.

Max-heap

Each node's parent has a heavier value. Stores larger items at the top of a tree.

Min-heap

Each of a node's descendants has a heavier value. Stores smaller items at the top of the tree.

Dijkstra's Algorithm Logic

Each vertex has its own entry in the distance table, containing the best known weight from the start vertex as well as the best path. Initially sets all weights to infinity, except the start vertex, which is 0. Analyzes vertices, starting with the smallest distance, until all are known. Vertices are treated as being in a priority queue.
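The logic above might look something like this in Python, using the standard heapq module as the priority queue. This is a sketch under assumptions: the graph is a dict mapping each vertex to a list of (neighbor, weight) pairs, and the names dijkstra, dist, and prev are illustrative.

```python
import heapq


def dijkstra(graph, start):
    """Return (dist, prev) for a graph given as {vertex: [(neighbor, weight), ...]}.

    Assumption: every vertex appears as a key and all weights are non-negative.
    """
    dist = {v: float("inf") for v in graph}  # all distances start at infinity
    prev = {v: None for v in graph}          # previous vertex on the best path
    dist[start] = 0
    queue = [(0, start)]                     # priority queue ordered by distance
    known = set()
    while queue:
        d, u = heapq.heappop(queue)          # unknown vertex with smallest distance
        if u in known:
            continue                         # stale queue entry, already finalized
        known.add(u)
        for v, w in graph[u]:
            candidate = d + w
            if candidate < dist[v]:          # found a better path to v through u
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(queue, (candidate, v))
    return dist, prev


graph = {
    "s": [("a", 4), ("b", 1)],
    "a": [("c", 1)],
    "b": [("a", 2), ("c", 5)],
    "c": [],
}
dist, prev = dijkstra(graph, "s")
print(dist, prev)
```

Following prev backwards from any vertex recovers the best path to it from the start vertex.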

Directed and Undirected Graphs

Edges sometimes have distinct source and destination (like one-way streets). A directed graph (digraph) is a graph where every edge has a source and target vertex. This is the basis of data structures like trees and linked lists. In an undirected graph, if there's an (a, b), then there's also a (b, a). Not so for digraphs.

Downheap Algorithm

Every node in a heap has at most two children. To preserve the heap order while moving a node down: Min-heap - swap with the smallest child. Max-heap - swap with the largest child.

Time & Space-Complexity of Heap Sort

Heap Sort allows sorting any array in O(n log n) (like Merge Sort and Quick Sort). Unlike Merge Sort, there is no storage overhead: Heap Sort can sort with auxiliary storage space of O(1), while Merge Sort requires O(n). Heap Sort is NOT online, as it uses the entire array, and it is NOT stable.

Heap Sort

Heap sort uses a max heap to sort an array. This takes advantage of the fact that a heap is a natural priority queue, and that heaps always add/remove at the right-most leaf. Phase 1: Converts the existing array to a max-heap. Phase 2: Empties the heap by removing all the nodes. Each removed item is added to the end of the array. The "heap" and the sorted part of the array share the same memory at the same time: the sorted array is stored in the empty space after the end of the heap. Building the heap can be done top-down or bottom-up.
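The two phases can be sketched in place on a Python list. This version assumes the bottom-up way of building the heap, and the names sift_down and heap_sort are illustrative.

```python
def sift_down(a, i, end):
    """Down-heap a[i] within a[0:end], treating that slice as a max-heap."""
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        largest = i
        if left < end and a[left] > a[largest]:
            largest = left
        if right < end and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]  # swap with the largest child
        i = largest


def heap_sort(a):
    n = len(a)
    # Phase 1: turn the array into a max-heap, bottom-up.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(a, i, n)
    # Phase 2: repeatedly move the max into the space just past the heap.
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(a, 0, end)
    return a


print(heap_sort([5, 2, 9, 1, 5, 6]))
```

Only a few index variables are used besides the array itself, which is where the O(1) auxiliary space comes from.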

Deleting a Node from Min-heap

Heaps must maintain completeness, so the right-most leaf is needed to replace the deleted node. Deletion: 1. Remove the node. 2. Replace it with the right-most leaf. 3. Down-heap (move down) to the correct location. Down-heap runs in O(log n) time.

BST Search Logic

If current node is null, return "not found"
If current node = S, return current node
If S < current node, recurse on left branch
If S > current node, recurse on right branch

Height of a Heap

At depth i there are at most 2^i nodes (exactly 2^i on every level except possibly the last). A heap therefore always has a height of O(log n).

Dangers of BSTs

In a BST, internal nodes NEVER change. This can affect structure of the tree. One path can end up with significantly more of the data, if the data is not truly random. If data is fully sorted, time complexity deteriorates to O(n), becoming functionally a linked list.

Red-Black Tree Attributes

In a stable tree, if a node is red, then both children are black. The root is always considered black. Null pointers are considered black. Black-height of a node is the number of black nodes on any path to a null. We don't count red, since they're part of a 3-Node. Typically, the root isn't counted. Every path from any node to a null contains the same number of black nodes. Red-Black Tree is conceptually identical to a 2-3 Tree.

Binary In-order Traversal

In an in-order traversal, a node is visited after its left branch, and before its right branch.
function inOrder
    if left isn't null then left.inOrder()
    this.visit()
    if right isn't null then right.inOrder()
end function
Uses: visiting the values of a binary search tree in sorted order.

Breadth-first Traversal

In breadth-first traversal, nodes are visited by their level in the tree. So all nodes at level 1 will be visited before level 2, and so on.
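One common way to get level-by-level order is a FIFO queue. The sketch below assumes a simple Node class with a children list (both names are made up for this example).

```python
from collections import deque


class Node:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []


def breadth_first(root):
    """Return node values level by level, left to right."""
    order = []
    queue = deque([root])
    while queue:
        node = queue.popleft()        # oldest node first, so levels come out in order
        order.append(node.value)
        queue.extend(node.children)   # children wait behind the rest of this level
    return order


tree = Node("A", [Node("B", [Node("D"), Node("E")]), Node("C", [Node("F")])])
print(breadth_first(tree))
```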

Depth-First Traversal

In depth-first traversal, the algorithm travels down the tree. Using recursion: Root recurses into its children. Each child recurses into each of its children, and so on.

Depth-first: Post-order Traversal

In post-order traversal, a node is visited after its descendants. Each child is visited before its parent.

Binary Post-order Traversal

In post-order, a node is visited after its left branch and after its right branch.
function postOrder
    if left isn't null then left.postOrder()
    if right isn't null then right.postOrder()
    this.visit()
end function

Depth-first: Preorder Traversal

In preorder traversal, a node is visited before its descendants. Each child is visited after its parent.

Binary Pre-order Traversal

In preorder traversal, the node is visited before the right or left child is visited.
function preOrder
    this.visit()
    if left isn't null then left.preOrder()
    if right isn't null then right.preOrder()
end function
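For comparison, here are the three binary traversals side by side as runnable Python. This is a sketch that uses free functions taking a visit callback rather than methods on the node; the Node class and the collect helper are assumptions for the example.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node, visit):
    if node is None:
        return
    visit(node.value)              # node first
    pre_order(node.left, visit)
    pre_order(node.right, visit)


def in_order(node, visit):
    if node is None:
        return
    in_order(node.left, visit)
    visit(node.value)              # node between its branches
    in_order(node.right, visit)


def post_order(node, visit):
    if node is None:
        return
    post_order(node.left, visit)
    post_order(node.right, visit)
    visit(node.value)              # node last


def collect(traversal, root):
    """Run a traversal and return the visited values as a list."""
    out = []
    traversal(root, out.append)
    return out


root = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
print(collect(pre_order, root))
print(collect(in_order, root))
print(collect(post_order, root))
```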

BST Insert

Inserting into a binary search tree is handled like a search, but the value is added where the search ends if it's not found. In a balanced tree, this is O(log n).
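A minimal sketch of search and insert on an unbalanced BST. The Node class and function names are illustrative, and ignoring duplicates is an assumption made for this example.

```python
class Node:
    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def search(node, value):
    """Return the node holding value, or None if it is not in the tree."""
    if node is None:
        return None                      # ran off the tree: not found
    if value == node.value:
        return node
    if value < node.value:
        return search(node.left, value)
    return search(node.right, value)


def insert(node, value):
    """Insert value and return the (possibly new) root of this subtree."""
    if node is None:
        return Node(value)               # the search ended here, so add here
    if value < node.value:
        node.left = insert(node.left, value)
    elif value > node.value:
        node.right = insert(node.right, value)
    return node                          # equal values are ignored (assumption)


root = None
for v in [8, 3, 10, 1, 6]:
    root = insert(root, v)
print(search(root, 6) is not None, search(root, 7) is not None)
```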

2-3 Tree Insert

Instead of creating a new left or right leaf, values are merged into the current leaf. A 2-Node will become a 3-Node. A 3-Node will become a 4-Node, a temporary structure. A 4-Node has 3 values and 4 children. It will be split into other nodes. The height of the tree does not change (right away).

Linked Lists

Linear, access is O(n). Nodes can only have one predecessor and/or one successor node.

Branch

Links between nodes. Often unidirectional.

Branching-factor

Max number of branches any node can have. Can be 2 or more.

Priority Queue

Modification of queue ADT that follows first-in-least-out logic.

Problems with Hash Functions

Most apps will have key ranges that are too large for 1:1 mapping between hashing and keys. There is no perfect hashing function for any real world application. Collisions!

Binary Tree

Most commonly used tree. Nodes in a binary tree have at most two successors: a left node and a right node.

Tree

Non-linear, abstract model of a hierarchical structure. Nodes can have multiple successors, but only one predecessor.

Depth of a Node

Number of ancestors of the node, counting up to and including the root.

Open Hashing

Open Hashing solves many of the problems with closed hashing, by storing a linked list or tree in each array element. Also known as bucket hashing. If using a balanced tree for each bucket, for example, it's O(1) at best, and O(log n) at worst. Will not fill up array, and can grow indefinitely. Far faster than closed hashing. No clustering. Objects CAN be deleted.
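A rough sketch of bucket hashing. Python lists stand in for the linked list in each array element, and the class and method names are made up for illustration.

```python
class ChainedHashTable:
    def __init__(self, size=11):
        self.buckets = [[] for _ in range(size)]  # one bucket per array element

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # key already present: overwrite
                return
        bucket.append((key, value))        # collision or empty bucket: just append

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]              # removal only touches this bucket
                return True
        return False


table = ChainedHashTable()
table.put(1, "a")
table.put(12, "b")    # 12 % 11 == 1, so this collides with key 1
print(table.get(1), table.get(12))
```

Colliding keys simply share a bucket, so the array never fills up and deleting one key leaves the others reachable.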

Ancestor Node

Predecessor, parent, grandparent, etc.

Set Membership

Set notation uses special symbols to denote if an object is a member of a set. If V contains vegetables: potato ∈ V bacon ∉ V

Graph Traversal

Since graphs can have cycles built in, it's important to add a visited property to each vertex. This prevents infinite loops.
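A sketch of a depth-first graph traversal that guards against cycles. Here the "visited" information is kept in a separate set rather than as a property on each vertex, which is an assumption made for brevity; the adjacency-list dict is also illustrative.

```python
def depth_first(graph, start):
    """Return vertices in depth-first order for {vertex: [neighbors, ...]}."""
    visited = set()
    order = []

    def visit(vertex):
        visited.add(vertex)              # mark before recursing to break cycles
        order.append(vertex)
        for neighbor in graph[vertex]:
            if neighbor not in visited:
                visit(neighbor)

    visit(start)
    return order


# a, b, and c form a cycle; without the visited check this would never finish.
graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
print(depth_first(graph, "a"))
```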

Heaps and Arrays

Since heaps are complete, balanced, binary trees, their structure lends itself well to being stored in an array. A node's parent and children can be computed mathematically. Heap ADTs track the index of the end of the heap. All new items are added to the end before upheap. The last item is swapped in for the deleted item before down-heap.
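An array-backed min-heap might be sketched like this, assuming the usual 0-based layout where the node at index i has its parent at (i - 1) // 2 and children at 2i + 1 and 2i + 2. The class and method names are illustrative.

```python
class MinHeap:
    """Array-backed min-heap; index 0 is the root."""

    def __init__(self):
        self.items = []   # the end of the list is the end of the heap

    def insert(self, value):
        self.items.append(value)                  # add at the next free leaf
        i = len(self.items) - 1
        while i > 0:                              # up-heap
            parent = (i - 1) // 2
            if self.items[parent] <= self.items[i]:
                break
            self.items[parent], self.items[i] = self.items[i], self.items[parent]
            i = parent

    def remove_min(self):
        if not self.items:
            raise IndexError("heap is empty")
        smallest = self.items[0]
        last = self.items.pop()                   # right-most leaf
        if self.items:
            self.items[0] = last                  # it replaces the removed root
            self._down_heap(0)
        return smallest

    def _down_heap(self, i):
        n = len(self.items)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < n and self.items[left] < self.items[smallest]:
                smallest = left
            if right < n and self.items[right] < self.items[smallest]:
                smallest = right
            if smallest == i:
                return
            self.items[i], self.items[smallest] = self.items[smallest], self.items[i]
            i = smallest


heap = MinHeap()
for v in [5, 3, 8, 1]:
    heap.insert(v)
print(heap.items[0])
```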

BST Search Time Complexity

Since the tree divides the problem progressively in two, search in a balanced tree is O(log n). If unbalanced (like a linked-list-type chain), search shifts to O(n).

Spanning Trees

Spanning Trees include every vertex in a graph (with no cycles), so that every vertex in the graph is connected. Given a complete graph of N vertices, there are N^(N-2) possible spanning trees. With only 6 vertices, there are 6^4 = 1,296 possible trees. At 12 vertices, 12^10 is about 61.9 billion. Scales terribly.

Applications of Binary Trees

Storing arithmetic expressions Decision processes Searching Sorting

Descendant Node

Successors, child, grandchildren, etc.

Set Complement

The complement of a set is all elements in the universe that are not in the set: A' = { x | x ∉ A }

Union-Find - Find

The find operation starts with a node and locates its representative. It travels up the tree until it finds a node without a link. After the answer is found, all nodes along the path are pointed to the representative. This means the tree optimizes by becoming flatter and faster with each find.

Set Intersection

The intersection of two sets contains only elements found in both sets: A ∩ B = { x | x ∈ A and x ∈ B }

First-in-Least-Out

The least element is the one that is removed. If two items have the same "rank", they can be queued as normal. Object's key can be used to determine "rank", but any field will do. Meaning of least is defined within the ADT. Doesn't necessarily mean "less than", can be any way of ranking items.

Splitting 4-Nodes in 2-3 Tree

The middle item ascends, the left and right values of the node become the children, and each gets two of the node's 4 children, becoming 2-Nodes. As the middle item ascends, the parent node will change from a 2-Node to a 3-Node, or from a 3-Node to a 4-Node. It continues to bubble up in this manner (possibly up to the root), which takes O(log n). Six types of splits:
Parent is a 2-Node: 1. Node is left child of the parent. 2. Node is right child of the parent.
Parent is a 3-Node: 3. Node is left child of the parent. 4. Node is middle child of the parent. 5. Node is right child of the parent.
Or: 6. Node is the root.
A 2-3 tree changes in height ONLY when the root is split, and it splits BALANCED on the left and right side.

Size of the tree

Total number of nodes in a tree.

Union-Find Operations

Union-Find contains three operations that handle sets. All three modify the structure of the tree, which automatically optimizes itself toward nearly O(1) amortized time per operation.
void makeSet(object value)
void union(object a, object b)
object find(object a)

Union-Find Data Structure

Union-Find data structure maintains a list of nodes partitioned into disjoint subsets. e.g. { {a} {b} {c} {d} {e} } is a full partition of {a,b,c,d,e} Sets are stored as a tree in which the children link to the parents (branching upwards). A parent can have multiple children. Every node in the tree is part of the same set. The root is called the representative of the set. It is not special. Any node can find the representative by following branches upwards. If any two nodes have the same representative, then they are in the same set.
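The structure could be sketched as below. In this version a representative points to itself instead of having no link, and union goes by weight with find flattening the path; those choices and all names are assumptions for illustration.

```python
class UnionFind:
    def __init__(self):
        self.parent = {}   # child -> parent link (branching upwards)
        self.size = {}     # representative -> number of nodes in its tree

    def make_set(self, x):
        """Add x as a singleton set."""
        if x not in self.parent:
            self.parent[x] = x   # a representative is its own parent here
            self.size[x] = 1

    def find(self, x):
        """Return the representative of x, flattening the path on the way."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:      # point every node on the path at root
            next_up = self.parent[x]
            self.parent[x] = root
            x = next_up
        return root

    def union(self, a, b):
        """Merge the sets containing a and b (union by weight)."""
        root_a, root_b = self.find(a), self.find(b)
        if root_a == root_b:
            return
        if self.size[root_a] < self.size[root_b]:
            root_a, root_b = root_b, root_a
        self.parent[root_b] = root_a       # smaller tree goes under the larger
        self.size[root_a] += self.size[root_b]


uf = UnionFind()
for item in "abcde":
    uf.make_set(item)
uf.union("a", "b")
uf.union("c", "d")
uf.union("b", "d")
print(uf.find("a") == uf.find("c"), uf.find("e") == uf.find("a"))
```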

Time-Complexity of Priority Queue

Unsorted Array: enqueue - O(n) in the worst case, when the array must be resized; dequeue - O(n) for search and moving
Sorted Array: enqueue - O(n) to find position to insert and move the rest; dequeue - O(n)
Unsorted Linked List: enqueue - O(1); dequeue - O(n) to find & remove node
Sorted Linked List: enqueue - O(n) to find position and insert; dequeue - O(1) to remove head/tail
Heap: enqueue - O(log n); dequeue - O(log n)

Tree Traversal

Visiting the nodes of a tree in a systematic way. Since trees can be divided into smaller and smaller subtrees, recursion is appropriate for traversal.

AVL Tree Insertion

When a node is inserted, only nodes on the path from insertion point to the root have the possibility of becoming imbalanced. After insert, algorithm starts balancing at the lowest node, then recurses up to the root, rotating as needed.

BST Delete

When node is deleted, tree needs to be re-linked to preserve ordering. Case 1: Node is a leaf. Node can simply be deleted. Case 2: Node has a single child. Child node is promoted, so subtree moves into space created by deleted node. Case 3: Node has two children. Find mathematically next value to promote. Either find the maximum node of the left (smaller) branch, or the minimum node of the right (larger) branch. Either one should fit mathematically and keep a valid BST.

Graphs - Adjacent and Incident

When two vertices share an edge (x, y), they are said to be adjacent. They're connected. The edge (x, y) is said to be incident on vertices x and y. It is the connection.

Closed Hashing

With closed hashing, we use an existing array and search for an empty position. We use the hash value as a starting point, and search from there. If the array element is occupied, search down and look for an empty slot. If the search wraps completely around without finding one, the array is full. One problem with closed hashing is the tendency to cluster. A cluster is a group of contiguous, filled array elements with no open slots. As clusters grow, the hash will eventually degrade to O(n). Also, items cannot simply be deleted from a closed hash, since emptying a slot would cut off the search for items stored past it. Despite this, items can often be found at around O(1).
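A small sketch of the search-for-an-empty-slot idea, assuming linear probing (stepping one slot at a time) and a fixed-size array. The class and method names are made up, and deletion is deliberately left out.

```python
class LinearProbingTable:
    def __init__(self, size=7):
        self.slots = [None] * size   # fixed array; None marks an empty slot

    def _probe(self, key):
        """Yield slot indexes starting at the hash value, wrapping around once."""
        start = hash(key) % len(self.slots)
        for offset in range(len(self.slots)):
            yield (start + offset) % len(self.slots)

    def put(self, key, value):
        for i in self._probe(key):
            if self.slots[i] is None or self.slots[i][0] == key:
                self.slots[i] = (key, value)
                return
        raise RuntimeError("table is full")   # wrapped all the way around

    def get(self, key, default=None):
        for i in self._probe(key):
            if self.slots[i] is None:
                return default                # hit an empty slot: key is absent
            if self.slots[i][0] == key:
                return self.slots[i][1]
        return default


table = LinearProbingTable()
table.put(1, "a")
table.put(8, "b")     # 8 % 7 == 1: slot 1 is taken, lands in slot 2
table.put(15, "c")    # 15 % 7 == 1: lands in slot 3, growing the cluster
print(table.slots)
```

The three colliding keys end up in adjacent slots, a small example of the clustering described above.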

A tree consists of:

nodes with a parent-child relationship to zero or more nodes

