Mid 1 prob q
A node's level
The length of the path to that node, plus 1 The maximum level of a node in a tree is the tree's height An empty tree has height 0, and a tree of height 1 consists of a single node, which is both the tree's root and a leaf The level of a node must be between 1 and the tree's height
Finding Asymptotic Complexity(single loop)
We are interested in time complexity (based on the assignments and comparisons in a program) Consider this loop:
for (i = sum = 0; i < n; i++)
    sum = sum + a[i];
Initialization: two assignments executed once (sum = 0 and i = sum) Iteration: i++ executed n times In the loop body: sum = sum + a[i] executed n times So two assignments are executed once and two are executed n times, giving 2 + 2n assignments in the loop's execution, and 2 + 2n = O(n)
How to find N and c for a "big O" problem Consider the functions f(n) = 2n^2 + 3n + 1 and g(n) = n^2
Candidates are obtained by solving f(n) <= c·g(n) with different N Substituting for f(n) and g(n): 2n^2 + 3n + 1 <= c·n^2, or 2 + 3/n + 1/n^2 <= c Start with N = 1 and substitute to obtain c (for n >= N, N a positive integer) Possible candidates for c and N:
N = 1: c >= 6
N = 2: c >= 3 3/4
N = 3: c >= 3 1/9
N = 4: c >= 2 13/16
N = 5: c >= 2 16/25
... as N -> infinity, c -> 2
Problem can arise with dynamic memory allocations (memory leak)
Occurs when the same pointer is used in consecutive allocations without the first allocation being freed Ex: p = new int; p = new int; Since the second allocation occurs without deleting the first, the memory from the first allocation becomes inaccessible This leak can accumulate until no memory is available for further allocation To avoid this, memory needs to be deallocated when no longer in use (use delete, and set the pointer to NULL when done)
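A minimal sketch of the fix (reallocateSafely is an illustrative name, not from the notes): free the first block before reusing the pointer, and null the pointer when done.

```cpp
#include <cassert>

// Sketch: free the first allocation before the pointer is reused.
// reallocateSafely is an illustrative name.
int reallocateSafely() {
    int *p = new int(1);   // first allocation
    delete p;              // release it before the pointer is reused
    p = new int(2);        // second allocation: no block is leaked
    int result = *p;
    delete p;              // release the second block as well
    p = nullptr;           // null the pointer so it cannot dangle
    return result;
}
```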
Recursive Definitions
Often used to define infinite sets, because an exhaustive enumeration of such a set is impossible, so some other means to define it is needed Two parts to a recursive definition: Anchor/ground/base case - establishes the basis for all other elements of the set Inductive clause - establishes rules for the creation of new elements in the set Using this, we can define the set of natural numbers as follows: 0 ∈ N (anchor); if n ∈ N, then (n + 1) ∈ N (inductive clause); there are no other objects in the set N Recursive definitions are used in two ways: to define new elements in the set in question, and to demonstrate that a particular item belongs in a set
Ω Notation
The opposite of big-O (and vice versa) Let f(n) and g(n) be functions, where n is a positive integer f(n) = Ω(g(n)) when g(n) = O(f(n)) (read "f of n is omega of g of n") Since g is a lower bound for f, after a certain n, f will never go below g (ignoring multiplicative constants)
Dynamic binding in c++
Method declared "virtual", allowing the method to be selected based on the value the pointer has At run time
Static binding in c++
Method is chosen based on the pointer's type Allows the sending of messages to different objects without having to know any of the details of the receiver It is the receiving object's responsibility to determine how to handle the message At compile time
Priority Queues
The normal FIFO operation of a queue may need to be overridden due to priorities associated with elements of the queue that affect the order of processing Elements are then removed based on priority and position The difficulty in implementing such a structure is accommodating the priorities while still maintaining efficient enqueuing and dequeuing Elements arrive randomly, so their order of arrival reflects no specific priority
protected
Protected class members and functions can be used inside its class and by friend functions and classes. Protected members and functions cannot be accessed from other classes directly.
Generic entities provided by STL
Provides generic entities: Containers, iterators, algorithms, function objects, and ready-made set of common classes for C++ (containers, associative arrays, etc.)
Public
Public class members and functions can be used from outside of a class by any function or other classes. You can access them directly by using dot operator (.) or arrow operator(->)(pointers)
Singly Linked Lists Searching
The purpose is to scan the linked list to find target data No modification is made to the list (done using one temporary pointer) Traverse the list until temp->info == target or temp == NULL (the end of the list was reached and the search fails)
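A sketch of the traversal (Node and search are illustrative names; info and next match the notes):

```cpp
#include <cassert>

// Minimal singly linked node; Node and search are illustrative names.
struct Node {
    int info;
    Node *next;
};

// Traverse with one temporary pointer until the target is found or the
// end of the list is reached (temp == nullptr: the search fails).
bool search(Node *head, int target) {
    for (Node *temp = head; temp != nullptr; temp = temp->next)
        if (temp->info == target)
            return true;
    return false;
}
```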
Deletion by Merging
The first approach, deletion by merging, works by making one tree out of the node's two subtrees and attaching it to the node's parent This is accomplished by recalling that the value of every node in the right subtree is greater than the value of every node in the left subtree So the rightmost node of the left subtree holds the largest value in that subtree, and it will become the parent of the right subtree To find it, we start at the root of the left subtree and follow right links until we encounter a node with an empty right pointer That node's right pointer is then set to the right subtree, and the root of the left subtree is promoted to replace the deleted node The tree that results from merging may have a very different structure from the original tree In some cases it can be taller, even skewed; occasionally it can be shorter This does not mean the algorithm is inefficient, but we do need to find a way to maintain some type of balance in the tree
Binary tree
Tree where each node has at most two children, designated the left child and the right child Either child can be empty An important attribute of binary trees is the number of leaves, which is useful in assessing the efficiency of algorithms
The Eight Queens Problem
Try to place eight queens on a chessboard in such a way that no two queens attack each other To solve this, place one queen at a time, making sure the queens do not check each other If at any point a queen cannot be successfully placed, the algorithm backtracks to the placement of the previous queen That queen is then moved and the next queen is tried again If no successful arrangement is found, the algorithm backtracks further, adjusting the previous queen's predecessor, etc. PS Code:
putQueen(row)
    for every position col on the same row
        if position col is available
            place the next queen in position col;
            if (row < 8)
                putQueen(row+1);
            else success;
            remove the queen from position col;
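The pseudocode can be sketched in C++ as a backtracking solver that counts arrangements; the board size n and the cols array are illustrative additions so the same routine handles smaller boards:

```cpp
#include <cassert>

// Backtracking n-queens solver, one queen per row as in the pseudocode.
// cols[r] records the column of the queen placed in row r.
// Returns the number of complete arrangements found.
int putQueen(int row, int n, int *cols) {
    if (row == n) return 1;                        // success: all queens placed
    int count = 0;
    for (int col = 0; col < n; ++col) {
        bool available = true;
        for (int r = 0; r < row; ++r)              // check queens already placed
            if (cols[r] == col                     // same column
                || r - cols[r] == row - col        // same "\" diagonal
                || r + cols[r] == row + col) {     // same "/" diagonal
                available = false;
                break;
            }
        if (available) {
            cols[row] = col;                       // place the next queen
            count += putQueen(row + 1, n, cols);   // try the following row
        }                                          // backtrack: try next col
    }
    return count;
}
```

The 8x8 board has the well-known 92 solutions.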
Using pointers to access dynamically created locations
Two operators that handle dynamic memory:
new data_type; - allocates memory and returns the address of the allocated memory, which can be assigned to a pointer
delete pointerName; (like free in C) - releases the dynamically allocated memory pointed at
Queue Operations
Typically, the following methods are implemented: clear( ): clears the queue isEmpty( ): determines if the queue is empty enqueue(el): adds the data item el to the end of the queue dequeue( ): removes the element from the front of the queue firstEl( ): returns the value of the first element of the queue without removing it
Class
User defined data-type which has data members and member functions. When a class is defined, no memory is allocated but when it is instantiated(object is created) memory is allocated.
Limits for big-O, big-Ω, and Θ
Using lim(n->∞)(f(n)/g(n)):
If the limit is ∞, f(n) = Ω(g(n))
If the limit is a constant c > 0, f(n) = Θ(g(n))
If the limit is 0, f(n) = O(g(n))
Pointers and Destructors
When a local object goes out of scope, the memory associated with it is released, but if one of the object's members is a pointer, only the pointer's memory is released, leaving the object pointed at inaccessible To avoid this memory leak, objects that contain pointers need to have destructors written for them
Binet's formula
fib(n) = (φ^n - φ̂^n)/√5, where φ = (1 + √5)/2 and φ̂ = (1 - √5)/2 Since |φ̂| < 1, the φ̂^n term quickly becomes negligible, so fib(n) ≈ φ^n/√5, rounded to the nearest integer to get the Fibonacci number The formula has a straightforward implementation in code, using ceil(x - 0.5) to round the result to the nearest integer Code:
unsigned long deMoivreFib(unsigned long n) {
    return ceil(exp(n*log(1.6180339897) - log(2.2360679775)) - 0.5);
}
Queue
First-in, first-out (FIFO) structure Involves both ends, with additions restricted to one end (rear) and deletions to the other (front) An item added to the queue must migrate from rear to front before it can be removed Items are removed in the order they are added
binary search tree
or ordered binary tree - values stored in the left subtree of a given node are less than the value stored in that node, and values stored in the right subtree of a given node are greater than the value stored in that node The values stored are considered unique; attempts to store duplicate values can be treated as an error The meanings of the expressions "less than" and "greater than" will depend on the types of values stored
Vector syntax
#include <vector> //vector library vector<dataType> vectorName; //Declaring vector vector<dataType> vecName(size); //Providing size vecName.push_back(value); //Adding item to vec vecName[index]; //Access index of a vector vecName[index]=value; //Change value at index
ADT Examples
Array, List, Map, Queue, Set, Stack, Table, Tree, and Vector are ADTs
Container
A data structure that is typically designed to hold objects of the same type. Ex. list,array,vector,etc. Implemented as class templates whose methods specify operations on the data in the structures as well as the structures themselves. The data stored in containers can be of any type and must supply some basic methods and operations
Object
A data structure, combined with the operations pertinent to that structure. Most object-oriented languages define objects through the use of a class ClassName ObjectName; or ClassName *objectname=new ClassName; Ex. Queue *q = new Queue;
Pointers
A variable whose value is the address of another variable in memory The name is a user-defined name preceded by an asterisk (*) in the declaration to show the variable is a pointer Ex: int *ptr; Two important attributes: value (what it stores) and address (where it is) A pointer's type is the type of variable it points to Ex: int i = 4; int *p = &i; so p's type is int*
Complexity
An algorithm's complexity is a function describing the efficiency of the algorithm in terms of the amount of data the algorithm must process There are two main complexity measures of efficiency: Time and Space complexity For both, the importance is on the algorithm's asymptotic complexity When n (number of input items) goes to infinity, what happens to the algorithm's performance? (limit of n->inf)
Polymorphism
Ability to create objects of different types that respond to method calls having the same name They differ in that they respond according to type-specific behavior Means that a call to a member function will cause a different function to be executed depending on the type of object that invokes the function. Decision made with binding
Multiple inheritance
Allows subclasses to inherit from more than one superclass Can create problems if superclasses have a common ancestor in inheritance hierarchy
Pointer arithmetic
An offset can be added to the base address of the array (a): a+1, a+2, etc., and the result dereferenced Ex: *(a+1) == a[1], *(a+2) == a[2], etc. As long as the value of a is not changed, this alternate approach can be used to access the array's elements
Iterator
Any object that, pointing to an element in a container, can iterate through those elements using a set of operators (with at least the increment (++) and dereference (*) operators). A pointer can point to elements in an array, and can iterate through them using the increment operator (++). Other kinds possible too. Ex. Each container type has a specific iterator type designed to iterate through its elements Reduce the complexity and execution time
Backtracking
Approach to problem solving that uses a systematic search among possible pathways to a solution As each path is examined, if it is determined the pathway isn't viable, it is discarded and the algorithm returns to the prior branch so that a different path can be explored Must be able to return to the previous position, and ensure that all pathways are examined
Use amortized complexity analysis to consider a dynamic array application where the size of the array is doubled each time it fills up.
Array reallocation may be required, so a worst-case insertion may be O(n) Since the remaining insertions are done in constant time, a sequence of n insertions can always be completed in O(n) time So the amortized time per operation is O(n)/n = O(1)
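A sketch of the doubling scheme (DynArray is an illustrative name); the copies counter records the total reallocation work, which stays O(n) over n pushes, so each push is amortized O(1):

```cpp
#include <cassert>

// Dynamic array that doubles its capacity when full; copies counts the
// total element-copy work across all pushes. DynArray is illustrative.
struct DynArray {
    int *data = new int[1];
    int size = 0, capacity = 1;
    long copies = 0;                 // total copy work so far

    void push(int v) {
        if (size == capacity) {      // full: reallocate at double the size
            int *bigger = new int[capacity * 2];
            for (int i = 0; i < size; ++i) {
                bigger[i] = data[i]; // copy existing elements over
                ++copies;
            }
            delete[] data;
            data = bigger;
            capacity *= 2;
        }
        data[size++] = v;            // the insertion itself is O(1)
    }
    ~DynArray() { delete[] data; }
};
```

Doubling from capacity 1, the reallocations copy 1 + 2 + 4 + ... + 512 = 1023 elements over 1000 pushes, i.e. about one copy per push on average.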
Static declaration
Arrays in C++ are declared before they can be used, which means the size of the array must be determined before it is used Ex: int a[size]; This is wasteful if the declared array is too large, or a limitation if it's too small Ex: you need a 5-element array but the declared array has a size of 3
Data structure Examples
Arrays, linked lists(or just lists), records(also called tuples/structs), objects
Excessive Recursion
As the number of function calls increases, a program suffers some performance decrease Also, the amount of stack space required increases dramatically with the amount of recursion that occurs This leads to program crashes if the stack runs out of memory More frequently, increased execution time leads to poor program performance Example with Fibonacci numbers:
fib(n) = n                      if n < 2
fib(n) = fib(n-2) + fib(n-1)    otherwise
This tells us any Fibonacci number after the first two (0, 1) is defined as the sum of the two previous numbers Code:
unsigned long Fib(unsigned long n) {
    if (n < 2)
        return n;
    else
        return Fib(n-2) + Fib(n-1);
}
The amount of calculation necessary to generate successive terms becomes excessive, because every calculation has to rely on the base case to compute the values; no intermediate values are remembered This exponential growth makes the algorithm unsuitable for anything but small values of n There are acceptable iterative algorithms that can be used far more effectively
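One such iterative algorithm, sketched here (fibIter is an illustrative name), remembers the two previous terms and computes each term once, in O(n) time and O(1) space:

```cpp
#include <cassert>

// Iterative Fibonacci: keeps the last two terms instead of recomputing
// every subproblem, so no exponential blowup. fibIter is illustrative.
unsigned long fibIter(unsigned long n) {
    if (n < 2) return n;
    unsigned long prev = 0, curr = 1;    // fib(0), fib(1)
    for (unsigned long i = 2; i <= n; ++i) {
        unsigned long next = prev + curr;
        prev = curr;
        curr = next;
    }
    return curr;
}
```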
Best, Average, and Worst Cases
Best - the algorithm takes the fewest number of steps Worst - the algorithm takes the maximum number of steps Average - falls between the extremes: weight the number of steps that occur for a given input by the probability p of that input occurring, and sum over all inputs:
∑_i p(input_i) · steps(input_i)
In probability theory, this equation defines the expected value (it assumes the probabilities can be determined and their distribution is known) p is a probability distribution, so it satisfies two constraints: the function p is never negative, and the probabilities sum to 1 Consider sequentially searching an unordered array to find a target value: Best case - the target value is found in the first cell Worst case - the target value is found in the last cell, or not at all (the entire array is searched either way) Average case - consider the probability of finding the target: assuming a uniform distribution of n values, the probability of finding the target in any one location is 1/n (so the target in the 1st location has p = 1/n, in the 2nd p = 1/n, etc.) The number of steps required to reach each location equals the location itself, so the sum becomes:
(1/n)(1 + 2 + ... + n) = (n + 1)/2
If the probabilities differ, the analysis is much more involved
Ω and Θ Notations compared to Big O
Big-O only gives upper bound of function Consider O(n^3): n^2=O(n^3) but O(n^2) is a more meaningful upper bound Need lower and same rate bounds Big-Ω-lower bound (function that grows more slowly than f(n)) Θ-tight bound (function that grows same rate as f(n))
Breadth-First Traversal
Breadth-first traversal proceeds level-by-level from top-down or bottom-up visiting each level's nodes left-to-right or right-to-left This can be easily implemented using a queue If we consider a top-down, left-to-right breadth-first traversal, we start by placing the root node in the queue We then remove the node at the front of the queue, and after visiting it, we place its children (if any) in the queue This is repeated until the queue is empty
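The steps above can be sketched with a queue (TreeNode and breadthFirst are illustrative names):

```cpp
#include <cassert>
#include <queue>
#include <vector>

// Minimal binary tree node; TreeNode and breadthFirst are illustrative.
struct TreeNode {
    int info;
    TreeNode *left;
    TreeNode *right;
};

// Top-down, left-to-right breadth-first traversal using a queue.
std::vector<int> breadthFirst(TreeNode *root) {
    std::vector<int> visited;
    std::queue<TreeNode*> q;
    if (root != nullptr)
        q.push(root);                  // start with the root in the queue
    while (!q.empty()) {
        TreeNode *node = q.front();
        q.pop();                       // remove the node at the front
        visited.push_back(node->info); // "visit" it
        if (node->left != nullptr)     // its children (if any) go to the rear
            q.push(node->left);
        if (node->right != nullptr)
            q.push(node->right);
    }
    return visited;
}
```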
Tail Recursion
The characteristic of tail recursion is that the single recursive call occurs at the end of the function No other statements follow the recursive call, and there are no other recursive calls prior to the call at the end of the function Ex:
void tail(int i) {
    if (i > 0) {
        cout << i << ' ';
        tail(i-1);
    }
}
Tail recursion is essentially a loop; it can be replaced by an iterative algorithm that accomplishes the same task In most cases, languages supporting loops should use that construct rather than recursion Ex: iterative form of the function:
void iterativeEquivalentOfTail(int i) {
    for ( ; i > 0; i--)
        cout << i << ' ';
}
Linked lists
Collection of independent memory locations (nodes) that store data and links to other nodes Moving between nodes is accomplished by following links, which are the addresses of nodes Many ways to implement linked lists, but the most common utilizes pointers(provide flexibility)
Θ Notation
Combines upper and lower bounds to get tight bound (intersection of O(f(n)) and Omega(f(n))) All Θ(f(n)) are O(f(n)), but not other way around Let f(n) and g(n) be functions, where n is pos int f(n)=Θ(g(n)) when g(n)=O(f(n)) and g(n)=Ω(f(n)). (read "f of n is theta of g of n.")
Possible Problems for complexity
Consider 2 algorithms: one requires 10^8·n time (O(n)), the other 10n^2 (O(n^2)) On big-O alone, the 2nd algorithm is rejected (grows too fast) But the 1st is faster only if n > 10^7, and n is often much smaller, so the 2nd algorithm would then be faster than the 1st Other factors should enter the analysis: "double-O" notation has been used in such cases: f = OO(g(n)) when f = O(g(n)) but the constant c is too large So 10^8·n = OO(n)
Example of complexity(f(n)=n^2+100n+log10(n)+1000)
Consider f(n) = n^2 + 100n + log10(n) + 1000 As n increases, the most significant term goes: 1000 -> 100n -> n^2 (at large n, only n^2 is significant) So n^2 + 100n + log10(n) + 1000 = O(n^2) (read "big-oh of n squared")
Tree
Consists of 2 components (nodes and arcs/edges) Drawn with the root at the top; trees "grow" down Defined recursively as follows: (1) a tree with no nodes or arcs is an empty tree; (2) if t1 ... tk is a set of disjoint trees, the tree whose root has the roots of t1 ... tk as its children is a tree; (3) only structures generated by rules 1 and 2 are trees If the elements of a list are stored in a tree organized in a predetermined fashion, the number of elements that must be examined during a search can be substantially reduced
Vector
Container in STL where the elements are stored contiguously in memory and the entire structure is treated like a dynamic array Exhibit low memory utilization, good locality of reference, and good data cache utilization (Like dynamic arrays) Allow random access, so elements can be referenced using indices
Public, protected, or private in Subclass headers
Control amount of access, and level of modifications A subclass with public inheritance preserves the access classes of superclass A subclass with protected inheritance treats public and protected members of superclass as protected(private members remain private) Subclasses with private inheritance treat all members of superclass as private
path's length
Every node in tree must be accessible from root through a unique sequence of arcs(path) Number of arcs in path
Algorithms
Data structures are implemented using algorithms Some algorithms are more efficient than others, and the more efficient ones are preferred Metrics are used to compare them (complexity)
Amortized Complexity
Data structures can be manipulated by sequences of operations. So operations early in the sequence can impact performance later in the sequence. To determine overall performance, accumulate performance for each sequence to determine result(Can give inaccurate results) More useful approach: consider entire sequence of operations of program. Worst-case bound can be determined irrespective of the inputs by looking at all of the operations While some operations may be costly, they do not occur frequently enough to bias entire program. Because less costly operations will outnumber costly ones in the long run ("paying back" program over number of iterations) Useful because rather than assuming, it guarantees worst-case performance
Doubly Linked List Deletion
Deletion from the end: retrieve the data member from the node, then set tail to the node's predecessor Delete the node, and set next of the new last node to NULL Special cases: the deleted node is the only node in the list (set head/tail to NULL); the list is empty (report back to the function caller)
Deletion in binary trees
Deletion is another operation essential to maintaining a binary search tree This can be a complex operation depending on the placement of the node to be deleted in the tree The more children a node has, the more complex the deletion process Implies three cases that need to be handled: The node is a leaf; this is the easiest case, because all that needs to be done is to set the parent link to null and delete the node The node has one child; also easy, as we set the parent's pointer to the node to point to the node's child The third case, and most difficult to handle, is when the node has two children, as there is no one-step process; we'll consider two options Deletion by Merging Deletion by Copying
Depth-First Traversal
Depth-first traversal proceeds by following left- (or right-) hand branches as far as possible The algorithm then backtracks to the most recent fork and takes the right- (or left-) hand branch to the next node It then follows branches to the left (or right) again as far as possible This process continues until all nodes have been visited We are interested in three activities: traversing to the left, traversing to the right, and visiting a node These activities are labeled L, R, and V, for ease of representation Generally, we follow the convention of traversing from left to right: VLR - known as preorder traversal LVR - known as inorder traversal LRV - known as postorder traversal While the code is simple, the power lies in the recursion supported by the run-time stack, which places a heavy burden on the system To gain more insight into the behavior of these algorithms, let's consider the inorder routine In this traversal, if the tree is nonempty, we traverse the left subtree of the node, then visit the node, then traverse the right subtree Because of the order of the recursion in the code, the V and R steps are held pending until the L step completes So the inorder and postorder nonrecursive algorithms have to be developed separately Fortunately, creating a postorder algorithm can be accomplished easily by noting that an LRV traversal is simply a reversed VRL traversal
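The three recursive orders can be sketched as follows (TNode and the function names are illustrative):

```cpp
#include <cassert>
#include <vector>

// Minimal binary tree node; TNode and the function names are illustrative.
struct TNode {
    int info;
    TNode *left;
    TNode *right;
};

// VLR: visit, then left subtree, then right subtree (preorder).
void preorder(TNode *n, std::vector<int> &out) {
    if (n == nullptr) return;
    out.push_back(n->info);
    preorder(n->left, out);
    preorder(n->right, out);
}

// LVR: left subtree, then visit, then right subtree (inorder).
void inorder(TNode *n, std::vector<int> &out) {
    if (n == nullptr) return;
    inorder(n->left, out);
    out.push_back(n->info);
    inorder(n->right, out);
}

// LRV: left subtree, then right subtree, then visit (postorder).
void postorder(TNode *n, std::vector<int> &out) {
    if (n == nullptr) return;
    postorder(n->left, out);
    postorder(n->right, out);
    out.push_back(n->info);
}
```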
Space complexity
Describes the amount of memory (space) an algorithm takes in terms of the amount of input
Time complexity
Describes the amount of time an algorithm takes in terms of the amount of input
Principle of information-hiding
Details of implementation of objects can be hidden from other objects to prevent side effects from occurring
Doubly Linked List
Each element keeps information on how to locate the next and the previous elements. No direct random access. Utilizes the pointer prev along with next and info.
Nontail Recursion
Ex:
void reverse() {
    char ch;
    cin.get(ch);
    if (ch != '\n') {
        reverse();
        cout.put(ch);
    }
}
The function displays a line of input in reverse order Function reverse() uses recursion to repeatedly call itself for each character in the input line Assuming the input "ABC", the first time reverse() is called an activation record is created to store the local variable ch and the return address of the call in main() get() reads in the character "A" from the line and compares it with the end-of-line character Since they aren't equal, the function calls itself, creating a new activation record The process continues until the end-of-line character is read, at which point the stack holds one activation record per character read The current call terminates, popping the last activation record off the stack and resuming the previous call This outputs the current value of ch, which is contained in the current activation record: the value "C" The current call then ends, causing a repeat of the pop-and-return action, and once again the output is executed, displaying the character "B" Finally, the original call to reverse() is reached, which outputs the character "A" Control is returned to main(), and the string "CBA" will have been displayed Non-recursive version of the same algorithm:
void simpleIterativeReverse() {
    char stack[80];
    register int top = 0;
    cin.getline(stack, 80);
    for (top = strlen(stack)-1; top >= 0; cout.put(stack[top--]));
}
If these functions weren't available, we'd have to make the processing more explicit:
void iterativeReverse() {
    char stack[80];
    register int top = 0;
    cin.get(stack[top]);
    while (stack[top] != '\n')
        cin.get(stack[++top]);
    for (top -= 2; top >= 0; cout.put(stack[top--]));
}
When implementing nontail recursion iteratively, the stack must be explicitly implemented and handled Second, the clarity of the algorithm, and often its brevity, are sacrificed as a consequence of the conversion
Nested Recursion
Example:
h(n) = 0             if n = 0
h(n) = n             if n > 4
h(n) = h(2 + h(n))   if 0 < n <= 4
h(n) has direct solutions for n = 0 and n > 4, but for n = 1, 2, 3, and 4 the function determines a value based on a recursive call that requires evaluating itself Ackermann function example:
A(m, n) = n + 1                    if m = 0
A(m, n) = A(m - 1, 1)              if m > 0, n = 0
A(m, n) = A(m - 1, A(m, n - 1))    otherwise
Properties of Ω and Θ Notations
The first four properties of big-O are also true for Ω and Θ Replacing O with Ω and "largest" with "smallest" in the fifth theorem for big-O keeps it true f(n) = Ω(g(n)) when lim(n->∞)(g(n)/f(n)) is a constant f(n) = Θ(g(n)) when lim(n->∞)(f(n)/g(n)) is a constant ≠ 0
Searching a Binary Search Tree
For each node, compare the value to the target value; if they match, the search is done If the target is smaller, we branch to the left subtree; if larger, we branch to the right If at any point we cannot proceed further, the search has failed and the target isn't in the tree In the example tree, finding (or not finding) the values 26 - 30 requires the maximum of four comparisons; all other values require fewer than four This also demonstrates why a value should occur only once in a tree; allowing duplicates requires additional searches: if there is a duplicate, we must either locate the first occurrence and ignore the others, or we must locate each duplicate, which involves searching until we can guarantee that no path contains another instance of the value Such a search will always terminate at a leaf node The number of comparisons performed during the search determines the complexity of the search This in turn depends on the number of nodes encountered on the path from the root to the target node So the complexity is the length of the path plus 1, and is influenced by the shape of the tree and the location of the target Searching in a binary tree is quite efficient, even if the tree isn't balanced However, this only holds for randomly created trees; those that are highly unbalanced or elongated and resemble linear linked lists approach sequential search times
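An iterative sketch of this search (BSTNode and bstSearch are illustrative names):

```cpp
#include <cassert>

// Minimal BST node; BSTNode and bstSearch are illustrative names.
struct BSTNode {
    int key;
    BSTNode *left;
    BSTNode *right;
};

// Compare at each node: equal means found; smaller targets branch left,
// larger ones branch right; running off the tree means the search fails.
bool bstSearch(BSTNode *root, int target) {
    while (root != nullptr) {
        if (target == root->key)
            return true;
        root = (target < root->key) ? root->left : root->right;
    }
    return false;
}
```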
Different ways to describe Big-O notation
Given f(n) = O(g(n)): f is big-O of g f will never go above g (ignoring multiplicative constants, beyond some n) f is bounded from above by g f grows no faster than g
"Big-O" notation
Formal method for expressing asymptotic upper bounds (the growth of a function is bounded from above) Let f(n) and g(n) be functions, where n is a positive integer f(n) = O(g(n)) if there exist a real number c and a positive integer N satisfying 0 <= f(n) <= c·g(n) for all n >= N (read "f of n is big-oh of g of n") Ex: n^2 + n, 4n^2 - n·log(n) + 12, and n^2/5 - 100n are all O(n^2)
Indirect Recursion
Function may be called by a function that it calls, forming a chain: f()→ g() → f() Chains of calls may be of arbitrary length: f() → f1() → f2() → ∙∙∙ → fn() → f() Also possible that a given function may be a part of multiple chains, based on different calls
Pointers and Functions
A function's value is the result it returns, and its address is the memory location of the function So a pointer can be used to point to a function in order to access it Given temp(), temp (the address) is a pointer to the function and *temp (the value) is the function itself Using this we can implement functionals
Functionals
Functions that take functions as arguments
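A minimal sketch of a functional using a function pointer parameter (sumOf and the helper functions are illustrative names):

```cpp
#include <cassert>

// Illustrative argument functions for the functional below.
int square(int x) { return x * x; }
int twice(int x)  { return 2 * x; }

// A functional: sumOf takes a pointer to a function and applies it to
// every integer in [lo, hi], summing the results.
int sumOf(int (*f)(int), int lo, int hi) {
    int total = 0;
    for (int i = lo; i <= hi; ++i)
        total += f(i);          // call through the function pointer
    return total;
}
```

Passing a different function changes the behavior without changing sumOf itself.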
NP-complete
It is generally believed that P ≠ NP, but this is still a famous open problem (believed because NP-complete problems exist) A problem is reducible if every instance of it can be transformed into instances of another problem using a process referred to as a reduction algorithm If this transformation can be done efficiently (in polynomial time), then efficient solutions of the 2nd problem can be transformed into efficient solutions of the original problem A problem is NP-complete when it is in NP and every other problem in NP is reducible to it in polynomial time (in this sense, all NP-complete problems are equally hard) If one NP-complete problem can be solved with a deterministic algorithm, all NP-complete problems can be solved the same way If any problem in NP is intractable, so are all NP-complete problems The reducibility process uses a known NP-complete problem to show that another problem is NP-complete But there has to be at least one problem proven NP-complete by means other than reducibility to make reduction possible
Algorithms in STL
Generic functions that perform operations where each is implemented to require a certain level of iterator (work on any container with interface by iterators) Algorithms are in addition to the methods provided by containers, but some algorithms are implemented as member functions for efficiency Operate through iterators directly on the values while never affecting the size or storage allocation of the container
NP-Completeness
Given an input, a deterministic algorithm has only one way to decide what step to perform at any point Problems that can be solved by a deterministic algorithm in polynomial time belong to class P (tractable) If a problem can be solved in polynomial time by a nondeterministic algorithm, it is of class NP Such problems are tractable only if a nondeterministic algorithm is used P ⊆ NP (deterministic algorithms are nondeterministic algorithms that don't use nondeterministic decisions)
Pointers and Reference Variables
Given the declarations: int n = 5; int *p = &n; int &r = n; A change to the value of n via any of the three will be reflected in the other two Ex: n = 7, *p = 7, or r = 7 (all have the same result: the value of n is 7) So we can dereference a pointer, or use a reference directly, to access the original object's value
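A minimal sketch of the three-way aliasing (demo is an illustrative name):

```cpp
#include <cassert>

// n, p, and r name the same storage: a change through any one of them
// is visible through the other two. demo is an illustrative name.
int demo() {
    int n = 5;
    int *p = &n;    // pointer: dereference (*p) to reach n
    int &r = n;     // reference: an alias for n itself
    *p = 7;         // now n == 7 and r == 7
    r = 9;          // now n == 9 and *p == 9
    return n;
}
```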
Singly Linked lists
If node contains pointer to another node, any number of nodes can be strung together(need only a single variable to access the sequence) Each node contains data and link to the next node Last node in the list has a null pointer ( \ ) Nodes have two data members: info - stores the node's info content (value) next - points to the next node in the list First node = head and last node = tail
Advantages of Encapsulation
Implementation errors are confined to the methods of a class in which they occur, making them easier to detect and correct Allows principle of information-hiding
Queues in the Standard Template Library
Implemented as an adapted deque (or list) Dequeuing is implemented by front() and pop() Enqueuing is implemented by push()
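A short sketch using std::queue, whose default underlying container is a deque (frontAfterOps is an illustrative name):

```cpp
#include <cassert>
#include <queue>

// std::queue adapts a deque by default: push() enqueues at the rear,
// front() inspects the first element, and pop() removes it.
int frontAfterOps() {
    std::queue<int> q;
    q.push(1);          // enqueue 1, 2, 3
    q.push(2);
    q.push(3);
    q.pop();            // dequeue: removes 1 (first in, first out)
    return q.front();   // the front is now 2
}
```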
Reference variable
Implemented as constant pointers that allow modification of values of arguments Can also be returned from a function(Pointers have to be dereferenced first)
Implementing Binary Trees
Implementing binary trees as arrays has drawbacks: we need to keep track of the locations of each node, and these have to be located sequentially Deletions are also awkward, requiring tags to mark empty cells, or moving elements around, which requires updating values Consequently, while arrays are convenient, we'll usually use a linked implementation In a linked implementation, the node is defined by a class and consists of an information data member and two pointer data members The node is manipulated by methods defined in another class that represents the tree
Pointers and Arrays
In array notation, we access the elements of a by subscripting: a[0], a[1], ... etc. Can also dereference the pointer to achieve the same results: *a is equivalent to a[0] (other elements can be accessed using pointer arithmetic) A name of an array is nothing more than a label for the beginning of the array in memory, so it is a pointer An array can also be declared dynamically as long as the size is known when the declaration is executed
Doubly Linked Lists Insertion
Insertion of a new node at the end: a new node is created and its info is initialized Its next is set to null and its prev is set to tail (linking the former end) Tail is set to point to this new node The next of the previous node is set to point to the new node Special case when the inserted node is the only node: there is no previous node, so head and tail both point to the new node (in the last step, head is set to point to it)
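The steps above can be sketched as follows (DList, DNode, and addToEnd are illustrative names, not a full list class):

```cpp
#include <cassert>

// Minimal doubly linked node; DNode is an illustrative name.
struct DNode {
    int info;
    DNode *prev;
    DNode *next;
};

// Sketch of tail insertion following the steps in the notes.
struct DList {
    DNode *head = nullptr, *tail = nullptr;

    void addToEnd(int v) {
        DNode *node = new DNode{v, tail, nullptr}; // prev links the former end
        if (tail != nullptr)
            tail->next = node;   // old last node points to the new one
        else
            head = node;         // special case: the list was empty
        tail = node;             // tail now points to the new node
    }
    ~DList() {
        while (head != nullptr) {
            DNode *n = head->next;
            delete head;
            head = n;
        }
    }
};
```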
Properties of Big-O Notation
For functions f, g, h of n: if f=O(g) and g=O(h), then f=O(h) If f=O(h) and g=O(h), then f+g=O(h) The function a·n^k = O(n^k) (for a>0) n^k = O(n^(k+j)) (for any j>0); together these mean any kth-degree polynomial is O(n^k) f=O(g) holds if lim(n->∞) f(n)/g(n) is a constant log_a(n)=O(log_b(n)) (for any a, b>1) This means that (with few exceptions) the base of the logarithm doesn't matter; instead one base is used and written log_a(n)=O(lg(n)) (for positive a≠1, where lg(n)=log_2(n))
Leaves of the tree
Leaves of the tree(terminal nodes) are at the bottom of the tree
Limitations of linked lists, stacks, and queues
Linked lists are linear in form and cannot reflect hierarchically organized data Stacks and queues are one-dimensional structures and have limited expressiveness
How to delete specific node based on info?
Locate the specific node, then link around it by linking the previous node to the following node. To do this keep track of the previous node and keep track of the node containing the target value (requires two pointers)
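A sketch of the two-pointer deletion (names Node and deleteNode are illustrative): prev trails tmp so that, once the target is found, prev can be linked around it.

```cpp
struct Node {
    int info;
    Node* next;
    Node(int i, Node* n = nullptr) : info(i), next(n) {}
};

// Delete the first node whose info equals el; returns the (possibly
// new) head of the list.
Node* deleteNode(Node* head, int el) {
    if (head == nullptr) return nullptr;
    if (head->info == el) {            // special case: target is head
        Node* tmp = head->next;
        delete head;
        return tmp;
    }
    Node* prev = head;                 // tracks the previous node
    Node* tmp  = head->next;           // tracks the candidate node
    while (tmp != nullptr && tmp->info != el) {
        prev = tmp;                    // prev trails tmp
        tmp  = tmp->next;
    }
    if (tmp != nullptr) {              // found: link around it
        prev->next = tmp->next;
        delete tmp;
    }
    return head;
}
```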
Disadvantage of single-linked lists
The longer the list, the longer the chain of next pointers that must be followed to reach a given node. This reduces flexibility and is prone to errors. An alternative is the doubly linked list
Destructor
Member function automatically called when its associated object is deleted. It then frees the resources that the object may have acquired during its lifetime. It can specify special processing to occur, such as the deletion of pointer-linked memory objects
constructor
Member function of a class that initializes objects of the class. A constructor is automatically called when an object is created
Pointers and Copy Constructors
Member function(constructor) which initializes an object using another object of the same class. Copies not only the pointer, but the object the pointer points to
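A minimal deep-copy sketch (the class name Holder is illustrative): the copy constructor duplicates the pointed-to int rather than sharing it, so each object owns its own storage.

```cpp
class Holder {
public:
    int* p;
    Holder(int v) : p(new int(v)) {}
    // Deep copy: allocates new storage and copies the pointed-to value,
    // not just the pointer itself.
    Holder(const Holder& other) : p(new int(*other.p)) {}
    ~Holder() { delete p; }
    // (copy assignment omitted to keep the sketch short)
};
```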
PS Code: MorrisInorder()
MorrisInorder()
    while not finished
        if node has no left descendant
            visit it;
            go to the right;
        else
            make this node the right child of the rightmost node in its left descendant;
            go to this left descendant;
Diamond problem
Multiple inheritance Subclass inherits multiple copies of the same member(Possible errors like a compiler error) Ex: Two classes B and C inherit from A, and D inherits from B and C The problem occurs in the example if D uses a method(Not overridden) that is defined in A Solution: classes B and C inherit class A as "virtual"
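A sketch of the solution (class names follow the example in the notes; the method name is illustrative): with virtual inheritance, D contains a single A subobject, so calling A's method through D is unambiguous.

```cpp
#include <string>

class A {
public:
    std::string name() { return "A"; }
};

// B and C inherit A as "virtual", so D gets one shared copy of A
// instead of two; without "virtual", d.name() would be ambiguous.
class B : virtual public A {};
class C : virtual public A {};
class D : public B, public C {};
```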
Finding Asymptotic Complexity(nested loops)
Nested loops can grow complexity by a factor of n Consider this nested loop: for (i=0; i < n; i++) { for (j = 1, sum = a[0]; j <= i; j++) sum += a[j]; cout << "sum for subarray 0 through " << i << " is " << sum << endl; } Outer loop: i initialized one time, i++ executed n times In loop: cout executed n times (doesn't count towards complexity) Inner loop header: j set to a value n times, sum set to a value n times <- 1+3n assignments so far j++ executed i times In loop: sum += a[j]; executed i times <- 2i per outer iteration The inner loop executes i times for 1<=i<=(n-1), so it contributes ∑_{i=1}^{n-1} 2i = 2(1+2+...+(n-1)) = n(n-1) assignments Total number of assignments: 1+3n+n(n-1) = O(1)+O(n)+O(n^2) = O(n^2) Not all loops increase complexity Additional complexity arises when the number of iterations changes during execution (the case in more powerful searching and sorting algorithms)
private
Private class members and functions can be used only inside of class and by friend functions and classes.
tractable
A problem that can be solved by a polynomial-time algorithm Decision problems can be defined equivalently as the set of inputs for which the problem returns yes A nondeterministic algorithm can solve a decision problem if some path in the algorithm's decision tree leads to a "yes" answer (otherwise the answer is "no") If the number of steps along the decision tree path to the affirmative answer is O(n^k) (for n = the size of the problem), then the algorithm is considered polynomial
Tree Traversal
Process of visiting each node in a tree data structure exactly one time This definition only specifies that each node is visited, but does not indicate the order of the process Hence, there are numerous possible traversals; in a tree of n nodes there are n! traversals Two especially useful traversals are depth-first traversals and breadth-first traversals
complete binary tree
Root is at level 1, its children at level 2, etc. If each node at any given level (except the last) had two children, there would be 2^0 nodes at level 1, 2^1 nodes at level 2, and in general 2^i nodes at level i + 1 In a complete binary tree, all nonterminal nodes have both children, and all leaves are on the same level Because leaves can occur throughout a general binary tree (at any level except level 1), there is no general formula to calculate its number of nodes A complete binary tree of i + 1 levels has 2^i leaves and 2^i − 1 nonterminal nodes, for a total of 2^(i+1) − 1 nodes
Stacks in the Standard Template Library
STL implements the stack as a container adaptor Not a new container, just adaptation of existing one to make the stack behave in a specific way deque is default container, but lists and vectors can also be used stack<int> stack1; // deque by default stack<int,vector<int>> stack2; // vector stack<int,list<int>> stack3; // list pop() does not return a value; pop() must be combined with top()
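A short sketch of the top()/pop() pairing (popTop is an illustrative name), here with a vector as the underlying container:

```cpp
#include <stack>
#include <vector>

// pop() returns void, so the top element is read with top() first
// and then removed with pop().
int popTop() {
    std::stack<int, std::vector<int>> s;  // vector as the container
    s.push(10);
    s.push(20);
    int t = s.top();     // 20, read before removal
    s.pop();             // removes 20, does not return it
    return t + s.top();  // 20 + 10
}
```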
Insertion in binary trees
Searching a binary tree does not modify the tree Traversals may temporarily modify the tree, but it is usually left in its original form when the traversal is done Operations like insertions, deletions, modifying values, merging trees, and balancing trees do alter the tree structure In order to insert a new node in a binary tree, we have to be at a node with a vacant left or right child This is performed in the same way as searching: Compare the value of the node to be inserted to the current node If the value to be inserted is smaller, follow the left subtree; if it is larger, follow the right subtree If the branch we are to follow is empty, we stop the search and insert the new node as that child
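The search-then-insert procedure can be sketched as follows (the names BSTNode and insert are illustrative):

```cpp
struct BSTNode {
    int key;
    BSTNode *left, *right;
    BSTNode(int k) : key(k), left(nullptr), right(nullptr) {}
};

// Descend as in a search; when the branch to follow is empty, stop
// and attach the new node there.
BSTNode* insert(BSTNode* root, int key) {
    if (root == nullptr)
        return new BSTNode(key);              // vacant child found
    if (key < root->key)
        root->left = insert(root->left, key); // smaller: left subtree
    else
        root->right = insert(root->right, key); // larger: right subtree
    return root;
}
```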
Vector in STL
Sequence container representing an array that can change in size Useful for storing lists whose length is unknown before setup but where removal is rare Adding new elements is easy unless the size reaches capacity; to avoid repeated reallocation, resize the vector first
Function objects in STL
Set of classes that overload the function operator (operator()) Maintain state info in functions that are passed to other functions Regular function pointers can also be used as function objects
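A small functor sketch (the names GreaterThan and countAbove are illustrative): the threshold is carried as state inside the object, something a plain function pointer cannot do.

```cpp
#include <algorithm>
#include <vector>

// Function object: overloads operator() and keeps state (the bound).
class GreaterThan {
    int bound;   // state carried between calls
public:
    GreaterThan(int b) : bound(b) {}
    bool operator()(int x) const { return x > bound; }
};

// The functor is passed to an STL algorithm like any callable.
long countAbove(const std::vector<int>& v, int bound) {
    return std::count_if(v.begin(), v.end(), GreaterThan(bound));
}
```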
Self-Organizing Lists
Several ways to speed up a list search are based on dynamically reorganizing the list as it is used: Move-to-front: when found, the target is moved to the front Transpose: when found, the target is swapped with its predecessor in the list Count: the list is ordered by frequency of access Ordering: the list is ordered by the natural ordering of the data For the first three, new info is placed in a node at the end, and the goal is to place the most frequently looked-for items near the beginning of the list In ordering, placement maintains the order of the elements, using properties of the data itself to organize the list Count can be viewed as a form of ordering, but it orders by frequency of access, stored as separate info The first three use insertion at head and at tail for comparison; move-to-front and count perform better than the others
Limitations of arrays
Size of array must be known at compile time The elements of the array are the same distance apart in memory, requiring potentially extensive shifting when inserting a new element. Linked lists can be used instead
Activation record or stack frame
State of a function by a set of information, stored on the runtime stack Contains the following information: Values of function's parameters, addresses of reference variables (including arrays) Copies of local variables Return address of the calling function Dynamic link to calling function's activation record Function's return value if it is not void Every time a function is called, its activation record is created and placed on the runtime stack Runtime stack always contains the current state of the function Consider f1()called from main(). It calls f2(), which calls f3() Once f3() completes, its record is popped, and f2() can resume and access information in its record If f3() calls another func., new func's activation record is pushed onto stack as f3()is suspended When a function calls itself recursively, it pushes a new activation record of itself on the stack, suspending calling instance of function, and allows the new activation to carry on the process Recursive call creates a series of activation records for different instances of the same function
Circular list
Structure where the nodes form a ring Implementation requires only one permanent pointer (usually tail)
Overriding
Subclasses can add, delete, or modify methods and data members in their own definitions; redefining an inherited method in a subclass is overriding
Using "big O" notation on 3n^2+4n-2
Substitute f(n) and g(n) into f(n)<=cg(n) ,(g(n)=n^2): 3n^2+4n-2<=cn^2 (for n>=N) Divide by n^2: 3+4/n-2/n^2<=c Choose N so c can be found, then solve for c: N=1 3+4-2<=c so c>=5 Set c to 6 (>=5) in f(n)<=cg(n): 3n^2+4n-2<=6n^2 (for n>=1) So the function is O(n^2)
Sparse Tables
Tables are the data structure of choice in many applications due to ease of implementation, use, and access But the size of the table can lead to difficulties, e.g. if the table is mostly unoccupied (sparse) Using two tables speeds processing for various lists; while much smaller than the original single table, it is still wasteful, and inflexible if conditions change A more useful and open-ended solution uses two arrays of linked lists
Inheritance
Technique of reusing existing class definitions to derive new classes. The new classes(called derived/sub/child classes) can inherit attributes and behavior of pre-existing classes(called base/super/parent classes) This relationship of classes through inheritance forms a hierarchy Information hiding can be extended through this hierarchy by the access the superclass allows the subclass(es) Subclasses can override in their own definitions
Types of iterators in STL
The STL implements five types of iterators: Input iterators - read a sequence of values Output iterators - write a sequence of values Forward iterators - can be read, written to, or moved forward Bidirectional iterators - behave like forward iterators but can also move backwards Random access iterators - can move freely in any direction, any number of positions at one time Most limited -> least limited: input/output -> forward -> bidirectional -> random access
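A brief sketch of the difference in capability (function names are illustrative): vector iterators are random access, so `it + 2` is legal; list iterators are only bidirectional, so they must be stepped with ++.

```cpp
#include <list>
#include <vector>

// Random access: jump an arbitrary distance in one operation.
int thirdOfVector(const std::vector<int>& v) {
    return *(v.begin() + 2);
}

// Bidirectional: no "it + 2"; advance one position at a time.
int thirdOfList(const std::list<int>& l) {
    std::list<int>::const_iterator it = l.begin();
    ++it;
    ++it;
    return *it;
}
```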
Data encapsulation
The combination of data members and methods in a class. Binds the data structure and its operations together in the class
Binding
The method called depends on the time at which the decision is made about the call in polymorphism Can occur in different ways: Static - determines the function call at compile time Dynamic - delays the decision until run time
Stackless Depth-First Traversal: Threaded Trees
The previous algorithms were all characterized by the use of a stack, either implicitly through the system, or explicitly in code A more efficient implementation can be achieved if the stack is incorporated into the design of the tree itself This is done by using threads, pointers to the predecessor and successor of a node based on an inorder traversal Trees using threads are called threaded trees To implement threads, four pointers would be needed for each node, but this can be reduced by overloading the existing pointers The left pointer can be used to point to the left child or the predecessor, and the right pointer can point to the right child or successor Only a single variable is needed for this; no stack is required However, the memory savings will be highly dependent on the implementation We can also use threads to support preorder and postorder traversal In preorder, the existing threads can be used to determine the appropriate successors Postorder requires somewhat more work, but is only slightly more complicated to accomplish
Stackless Depth-First Traversal: Tree Transformation
These algorithms rely on making temporary changes in the tree structure during traversal, and restoring the structure when done The algorithm is based on the observation that inorder traversal is very simple for trees that have no left children Since no left subtree has to be considered, the LVR traversal reduces to VR Morris's algorithm utilizes this observation by modifying the tree so that the node being processed has no left child This allows the node to be visited and then the right subtree can be investigated Since this changes the tree's structure, the traversal can only be done once, and information must be kept to restore the original tree Preorder and postorder traversals can be implemented in a similar fashion The preorder traversal requires moving the visit() operation from the inner else to the inner if Postorder requires additional restructuring of the tree
Stack Operations
These operations are: clear( ): clears the stack isEmpty( ): determines if the stack is empty push(el): pushes the data item el onto the top of the stack pop( ): removes the top element from the stack topEl( ): returns the value of the top element of the stack without removing it
Deletion by Copying
We locate the node's predecessor by searching for the rightmost node in the left subtree The key of this node replaces the key of the node to be deleted We then recall the two simple cases of deletion: if the rightmost node was a leaf, we delete it; if it has one child, we set the parent's pointer to the node to point to the node's child This way, we delete a key k1 by overwriting it by a key k2 and then deleting the node holding k2 This algorithm avoids the height increase problem of merging, but problems can still result Since the algorithm always deletes the immediate predecessor of the key being replaced, the left subtree can shrink while the right subtree is unchanged, making the algorithm asymmetric Eventually the tree becomes unbalanced to the right, and the right subtree is bushier and larger than the left
probability theory
Weight the number of steps that occur for a given input by the probability p of that input occurring, and sum this over all inputs: ∑_i p(input_i)·steps(input_i) In probability theory, this equation defines the expected value (assuming the probabilities can be determined and their distribution is known) p is a probability distribution, so it satisfies two constraints: the function p is never negative, and the sum of all probabilities is 1 Consider sequentially searching an unordered array to find a target value: the probability of finding the target in any one location is 1/n (assuming a uniform distribution of the n values), so the target being in the 1st location has p=1/n, in the 2nd p=1/n, etc. The number of steps required to reach each location equals the location itself, so the sum becomes: (1/n)(1+2+...+n) = (n+1)/2 If the probabilities differ, the analysis is much more involved
Problem can arise with delete; (dangling reference problem)
When an object is deleted without modifying the value of the pointer, the pointer still points to the memory location of the deallocated memory Attempting to dereference the pointer will cause an error To avoid this, after deleting the object, the pointer should be set to a known address or NULL(0) Ex: delete p; p = NULL; or p = 0;
Problem with "big-O" notation
While c and N exist, the definition does not tell how to calculate them, or what to do if multiple candidates exist (they often do) Choose N so that one term dominates the expression Ex: consider the function f: f(n)=2n^2+3n+1 and g: g(n)=n^2 Clearly 2n^2+3n+1=O(n^2), i.e. f(n)=O(g(n)) Only two terms need to be considered: 2n^2 and 3n (the last term is constant) When n>1.5, 2n^2 dominates the expression (so N>=2 works, with c>=3.75) The main point of "big-O" (f(n)<=cg(n)) is that the choices of c and N must satisfy the inequality, and the choice of c depends on the choice of N (and vice versa)
iterator operations
begin(),end(),next(),prev(),etc.
Class Syntax
class Classname {
access_specifier:        // public, private, or protected
    Data members;        // variables to be used
    Member functions();  // methods to access the data members
};                       // ends with ";" after "}"
Complexity table
constant (O(1)) logarithmic (O(log(n))) linear (O(n)) O(nlg(n)) Quadratic (O(n^2)) Cubic (O(n^3)) Exponential (O(2^n)) Longer time down the table and from left to right
Deleting dynamic arrays syntax
delete[ ] a; a=NULL; The [ ] indicates that an array is to be deleted, and a is the pointer to that array
Syntax to declare an array dynamically
int *a; a = new int[size]; Each element in the array declared dynamically can now be accessed by the pointer pointing to the array by pointer arithmetic Ex: *a=25 1st element of a set to 25 *(a+3)=25; 4th element of a set to 25 cout<<*(a+3)<<endl; Outputs the 4th element of a(25)
Problems with Pointers and Reference Variables (const)
int *const - declares a constant pointer to an integer (the memory address is constant) Ex: int *const ptr1=&x; const int * - declares a pointer to a constant integer (the value pointed to is constant) Ex: const int *ptr1=&x; const int * const - declares a constant pointer to a constant integer (both the memory address and the value pointed to are constant) Ex: const int * const ptr1=&x;
Using pointers to access dynamically created locations Syntax
int *p; p=new int; //p initialized then used in function delete p; p=NULL;
Basic pointer syntax
int *ptr; - declares ptr as a pointer to int (uninitialized; it does not yet point to anything valid) ptr=&i; - initialization of ptr; ptr now holds the memory address of i *ptr=20; - the value at the address ptr points to (i) is changed to 20 j=2* *ptr; - j now equals 2 multiplied by the value stored at the address ptr points to cout<<ptr<<endl; - outputs ptr, i.e. the address ptr points to (endl is the same as \n in C) cout<<*ptr<<endl; - outputs *ptr, i.e. the value at the address ptr points to
Stack
last-in first-out structure Ex. Stack of trays in cafeteria. Trays are removed from top and placed back on top. So very first tray in the pile is the last one to be removed. Can only remove items that are available(top) and can't add more if there is no room. Can define stack in terms of operations that change it or report its status
Recursive definition of the factorial function
n! = 1 if n = 0; n! = n·(n−1)! if n > 0 So 3! = 3 ∙ 2! = 3 ∙ 2 ∙ 1! = 3 ∙ 2 ∙ 1 ∙ 0! = 3 ∙ 2 ∙ 1 ∙ 1 = 6 This is cumbersome and computationally inefficient For factorials, we can instead use n! = ∏_{i=1}^{n} i Code: unsigned int factorial (unsigned int n){ if (n == 0) return 1; else return n * factorial (n - 1); }
Preferred algorithms
n^k=O((1+ε)^n) for positive k and ε (so all polynomials are bounded from above by all exponentials) An algorithm that runs in polynomial time will eventually be preferable to one that runs in exponential time (log n)^ε=O(n^k) for positive k and ε (so any power of a logarithm grows slower than any polynomial) An algorithm that runs in logarithmic time will eventually be preferable to one that runs in polynomial (or exponential) time So slower-growing complexity functions are preferred
Self-referential objects
next points to node of same type being defined
deterministic alg
an algorithm that doesn't use nondeterministic decisions; its behavior is the same on every run with the same input
nondeterministic alg
uses operation to "guess" what to do next when a decision is made. (Can exhibit different behaviors on different runs) There are several ways this can happen Ex.concurrent alg. may experience race conditions
Anatomy of a Recursive Call
x^n = 1 if n = 0; x^n = x·x^(n−1) if n > 0 Code: double power (double x, unsigned int n) { if (n == 0) return 1.0; else return x * power(x, n-1); } Using the definition, x^4 would be calculated as follows: x^4 = x ∙ x^3 = x ∙ (x ∙ x^2) = x ∙ (x ∙ (x ∙ x^1)) = x ∙ (x ∙ (x ∙ (x ∙ x^0))) = x ∙ (x ∙ (x ∙ (x ∙ 1))) = x ∙ (x ∙ (x ∙ x)) = x ∙ x ∙ x ∙ x The deepest call produces the result of x^0, 1, and returns this value to the previous call, which had been suspended; that call resumes to calculate x ∙ 1, producing x Each succeeding return then takes the previous result and uses it in turn to produce the final result The sequence of recursive calls and returns looks like: call 1 x^4 = x ∙ x^3 = x ∙ x ∙ x ∙ x call 2 x^3 = x ∙ x^2 = x ∙ x ∙ x call 3 x^2 = x ∙ x^1 = x ∙ x call 4 x^1 = x ∙ x^0 = x ∙ 1 call 5 x^0 = 1 Non-recursive code: double nonRecPower(double x, unsigned int n) { double result = 1; for ( ; n > 0; --n) result *= x; return result; } Recursive code is more intuitive, closer to the specification, and simpler to code