Thinkful Data Structures and Algorithms
Top 10 algorithms for linked lists?
1. Insertion of a node in Linked List (On the basis of some constraints) 2. Delete a given node in Linked List (under given constraints) 3. Compare two strings represented as linked lists 4. Add Two Numbers Represented By Linked Lists 5. Merge A Linked List Into Another Linked List At Alternate Positions 6. Reverse A List In Groups Of Given Size 7. Union And Intersection Of 2 Linked Lists 8. Detect And Remove Loop In A Linked List 9. Merge Sort For Linked Lists 10. Select A Random Node from A Singly Linked List
Number Theory top 10
1. Modular Exponentiation 2. Modular multiplicative inverse 3. Primality Test | Set 2 (Fermat Method) 4. Euler's Totient Function 5. Sieve of Eratosthenes 6. Convex Hull 7. Basic and Extended Euclidean algorithms 8. Segmented Sieve 9. Chinese remainder theorem 10. Lucas Theorem
String / Array top 10
1. Reverse an array without affecting special characters 2. All Possible Palindromic Partitions 3. Count triplets with sum smaller than a given value 4. Convert array into Zig-Zag fashion 5. Generate all possible sorted arrays from alternate elements of two given sorted arrays 6. Pythagorean Triplet in an array 7. Length of the largest subarray with contiguous elements 8. Find the smallest positive integer value that cannot be represented as sum of any subset of a given array 9. Smallest subarray with sum greater than a given value 10. Stock Buy Sell to Maximize Profit
Important Sorting Assumptions
1. Sorting array of integers 2. Length of array is n 3. Sorting least to greatest 4. Can access array element in constant time 5. Compare ints in array only with '<' 6. Focus on # of comparisons
Bit Array
A bit array is a mapping from some domain (almost always a range of integers) to values in the set {0, 1}. The values can be interpreted as dark/light, absent/present, locked/unlocked, valid/invalid, et cetera. The point is that there are only two possible values, so they can be stored in one bit. As with other arrays, the access to a single bit can be managed by applying an index to the array. Assuming its size (or length) to be n bits, the array can be used to specify a subset of the domain (e.g. {0, 1, 2, ..., n−1}), where a 1-bit indicates the presence and a 0-bit the absence of a number in the set. This set data structure uses about n/w words of space, where w is the number of bits in each machine word. Whether the least significant bit (of the word) or the most significant bit indicates the smallest-index number is largely irrelevant, but the former tends to be preferred (on little-endian machines).
Fibonacci Heap
A data structure that is a collection of trees satisfying the minimum-heap property, that is, the key of a child is always greater than or equal to the key of the parent. This implies that the minimum key is always at the root of one of the trees. The trees do not have a prescribed shape and in the extreme case the heap can have every element in a separate tree. This flexibility allows some operations to be executed in a "lazy" manner, postponing the work for later operations. For example, merging heaps is done simply by concatenating the two lists of trees, and operation decrease key sometimes cuts a node from its parent and forms a new tree. For the Fibonacci heap, the find-minimum operation takes constant (O(1)) amortized time. The insert and decrease key operations also work in constant amortized time. Deleting an element (most often used in the special case of deleting the minimum element) works in O(log n) amortized time, where n is the size of the heap. This means that starting from an empty data structure, any sequence of a insert and decrease key operations and b delete operations would take O(a + b log n) worst case time, where n is the maximum heap size. In a binary or binomial heap such a sequence of operations would take O((a + b) log n) time. A Fibonacci heap is thus better than a binary or binomial heap when b is smaller than a by a non-constant factor. It is also possible to merge two Fibonacci heaps in constant amortized time, improving on the logarithmic merge time of a binomial heap, and improving on binary heaps which cannot handle merges efficiently. Using Fibonacci heaps for priority queues improves the asymptotic running time of important algorithms, such as Dijkstra's algorithm for computing the shortest path between two nodes in a graph, compared to the same algorithm using other slower priority queue data structures.
Pop
A process used in stack and queue processing where a copy of the top or front value is acquired, and then removed from the stack or queue (Dequeue).
Rolling hash function
A rolling hash (also known as a rolling checksum) is a hash function where the input is hashed in a window that moves through the input. A few hash functions allow a rolling hash to be computed very quickly: the new hash value is calculated from only the old hash value, the old value removed from the window, and the new value added to the window, similar to the way a moving average can be computed much more quickly than other low-pass filters. One of the main applications is the Rabin-Karp string search algorithm, which compares the rolling hash of each window of the text with the hash of the pattern.
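As a minimal sketch of the idea, the following Rabin-Karp search uses a polynomial rolling hash. The function name `rabin_karp` and the `base` and `mod` defaults are illustrative assumptions, not values taken from the card.

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the start indices where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Weight of the character that leaves the window on each roll.
    high = pow(base, m - 1, mod)
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    matches = []
    for i in range(n - m + 1):
        # Compare the strings only when the hashes agree, to rule out collisions.
        if w_hash == p_hash and text[i:i + m] == pattern:
            matches.append(i)
        if i < n - m:
            # Roll the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base + ord(text[i + m])) % mod
    return matches
```

Each roll costs O(1), so the hashes for all windows are produced in O(n) total; the string comparison only runs on hash matches.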
Quick Select
A selection algorithm to find the kth smallest element in an unordered list. Quickselect uses the same overall approach as quicksort: it chooses one element as a pivot and partitions the data in two, according to whether each element is less than or greater than the pivot. However, instead of recursing into both sides, as in quicksort, quickselect only recurses into one side: the side containing the element it is searching for. This reduces the average complexity from O(n log n) to O(n); the worst case remains O(n^2).
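A possible sketch of quickselect, written iteratively with a Lomuto-style partition and a random pivot. Both choices are assumptions made for illustration; the card does not say which partition scheme it had in mind. Here k is 1-based.

```python
import random


def quickselect(items, k):
    """Return the k-th smallest element of items (k is 1-based)."""
    if not 1 <= k <= len(items):
        raise ValueError("k out of range")
    data = list(items)  # work on a copy so the caller's list is untouched
    lo, hi = 0, len(data) - 1
    target = k - 1
    while True:
        if lo == hi:
            return data[lo]
        # Lomuto partition around a randomly chosen pivot.
        p = random.randint(lo, hi)
        data[p], data[hi] = data[hi], data[p]
        pivot = data[hi]
        store = lo
        for i in range(lo, hi):
            if data[i] < pivot:
                data[i], data[store] = data[store], data[i]
                store += 1
        data[store], data[hi] = data[hi], data[store]
        # Continue only in the side that contains the target position.
        if target == store:
            return data[store]
        if target < store:
            hi = store - 1
        else:
            lo = store + 1
```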
4. Euler's Totient Function Euler's Totient function Φ(n) for an input n is count of numbers in {1, 2, 3, ..., n} that are relatively prime to n, i.e., the numbers whose GCD (Greatest Common Divisor) with n is 1.
A simple solution is to iterate through all numbers from 1 to n-1 and count the numbers whose gcd with n is 1.
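The simple method can be sketched in Python as follows. The name `phi_naive` is an assumption; the loop runs up to n itself (matching the set {1, ..., n} in the definition) so that the result for n = 1 is 1.

```python
from math import gcd


def phi_naive(n):
    """Count the integers k in 1..n with gcd(k, n) == 1."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

For example, `phi_naive(10)` counts 1, 3, 7 and 9. The method takes O(n) gcd computations, so it is only practical for small n.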
Array Index
A value that indicates the position in the array of a particular value. The index of the last element in a zero-indexed array is the length of the array minus 1.
Array length
A value that represents the number of elements contained in an array. Often there is a process associated with an array that provides this value, such as list.length, or len(list).
ArrayLists: advantages, disadvantages
Advantage: advantages of an array, plus does not run out of space Disadvantage: inserting can be slower than an array
Graph: advantage, disadvantage
Advantage: best models real-world situations Disadvantage: can be slow and complex
Stack: advantage, disadvantage
Advantage: quick access Disadvantage: inefficient with an array
Array: advantage, disadvantage
Advantage: quick insert, quick access if index is known Disadvantage: slow search, slow delete, fixed size
Doubly Linked List: advantage, disadvantage
Advantage: quick insert, quick delete Disadvantage: slow search
Invariants
Algorithm design patterns. Identify an invariant and use it to rule out potential solutions that are suboptimal/dominated by other solutions.
Recursion
Algorithm design patterns. If the structure of the input is defined in a recursive manner, design a recursive algorithm that follows the input definition.
Sorting
Algorithm design patterns. Uncover some structure by sorting the input.
What is Exponential time O(2^n) complexity?
Algorithms with exponential time complexity (O(2^n)) have running times that grow rapidly with any increase in input size. For an input of size 2, an exponential time algorithm will take 2^2 = 4 time. With an input of size 10, the same algorithm will take 2^10 = 1024 time, and with an input of size 100, it will take 2^100 ≈ 1.26765060022823 × 10^30 time. Yikes!
What is Linear time O(n) complexity?
Algorithms with linear time complexity (O(n)) have running times that are directly proportional to the size (n) of the input. Some examples of linear complexity algorithms are summing the elements in an array and finding the minimum or maximum value in an array.
Floyd-Warshall Algorithm
An algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles). A single execution of the algorithm will find the lengths (summed weights) of the shortest paths between all pairs of vertices, though it does not return details of the paths themselves.
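A minimal sketch of the all-pairs computation, assuming vertices are numbered 0..n-1 and the graph is given as a list of directed `(u, v, weight)` tuples (an input format chosen here for illustration).

```python
INF = float("inf")


def floyd_warshall(n, edges):
    """Return the matrix of shortest path lengths between all vertex pairs."""
    dist = [[INF] * n for _ in range(n)]
    for v in range(n):
        dist[v][v] = 0
    for u, v, w in edges:
        dist[u][v] = min(dist[u][v], w)  # keep the lightest parallel edge
    # Allow each vertex k in turn to act as an intermediate stop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist
```

The three nested loops give O(V^3) time and the matrix takes O(V^2) space. Unreachable pairs keep the value `INF`.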
Dijkstra's Algorithm
An algorithm for finding the shortest paths between nodes in a weighted graph. For a given source node in the graph, the algorithm finds the shortest path between that node and every other. It can also be used for finding the shortest path from a single node to a single destination node by stopping the algorithm once the shortest path to the destination node has been determined. Implemented with a Fibonacci heap as the priority queue, its time complexity is O(E + V log V), where E is the number of edges and V is the number of vertices.
Counting Sort
An algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct key value, and using arithmetic on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum and minimum key values, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. However, it is often used as a subroutine in another sorting algorithm, radix sort, that can handle larger keys more efficiently.[1][2][3] Because counting sort uses key values as indexes into an array, it is not a comparison sort, and the Ω(n log n) lower bound for comparison sorting does not apply to it.
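A short sketch of counting sort on plain integer keys. Offsetting by the minimum key (so negative keys work) is a choice made here for illustration, and the function name is an assumption.

```python
def counting_sort(keys):
    """Return a sorted copy of a list of integers without comparing keys."""
    if not keys:
        return []
    lo, hi = min(keys), max(keys)
    counts = [0] * (hi - lo + 1)
    for k in keys:
        counts[k - lo] += 1
    # Turn the counts into the starting output position of each key value.
    total = 0
    for i, c in enumerate(counts):
        counts[i] = total
        total += c
    out = [None] * len(keys)
    for k in keys:  # scanning in input order keeps the sort stable
        out[counts[k - lo]] = k
        counts[k - lo] += 1
    return out
```

The running time is O(n + r), where r is the difference between the maximum and minimum key.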
What is Polynomial time O(n^k) complexity?
An algorithm that has a running time that would be some input size n raised to some constant power k. The easiest way to understand polynomial time complexity is with nested loops. An algorithm that requires 2 levels of looping over an input would be O(n^2) while one requiring 3 levels of looping would be O(n^3). In both cases, we have polynomial time complexity.
2D Array
An array of arrays, characterized by rows and columns, arranged in a grid format, but still stored in contiguous (side-by-side) memory, and accessed using two index values.
hash Table:
An array that stores a collection of items. A data structure that is used to store key/value pairs. It uses a hash function to compute an index into an array in which an element will be inserted or searched.
Row Major
An array where the two index values for any element are the row first, then the column.
Internal Sorting
An internal sort is any data sorting process that takes place entirely within the main memory of a computer. This is possible whenever the data to be sorted is small enough to all be held in the main memory. For sorting larger datasets, it may be necessary to hold only a chunk of data in memory at a time, since it won't all fit. The rest of the data is normally held on some larger, but slower medium, like a hard-disk. Any reading or writing of data to and from this slower media can slow the sortation process considerably.
memoization
An optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again.
Binary Search
An ordered array of data which has efficiently supported operations. The worst and average case of a search using this structure is lgN. The Worst case of an insertion is N, and the average case of an insertion is N/2.
set
An unordered collection (possibly empty) of distinct items called elements of the set.
Concrete Examples
Analysis pattern. Manually solve concrete instances of the problem and then build a general solution
Iterative Refinement
Analysis pattern. Most problems can be solved using a brute-force approach. Find such a solution and improve upon it.
Case Analysis
Analysis pattern. Split the input/execution into a number of cases and solve each case in isolation
Reduction
Analysis pattern. Use a well-known solution to some other problem as a subroutine.
Heuristics
Any approach to problem solving, learning, or discovery that employs a practical method not guaranteed to be optimal or perfect, but sufficient for the immediate goals. Where finding an optimal solution is impossible or impractical, heuristic methods can be used to speed up the process of finding a satisfactory solution. Heuristics can be mental shortcuts that ease the cognitive load of making a decision. Examples of this method include using a rule of thumb, an educated guess, an intuitive judgment, stereotyping, profiling, or common sense
7. Basic and Extended Euclidean algorithms GCD of two numbers is the largest number that divides both of them. A simple way to find GCD is to factorize both numbers and multiply common factors.
Basic Euclidean Algorithm for GCD The algorithm is based on the following facts. If we subtract the smaller number from the larger (reducing the larger number), the GCD doesn't change. So if we keep subtracting the smaller number from the larger one, we end up with the GCD. If, instead of subtracting, we divide the larger number by the smaller one and keep the remainder, the algorithm stops when the remainder is 0. Extended Euclidean Algorithm: The extended Euclidean algorithm also finds integer coefficients x and y such that: ax + by = gcd(a, b)
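One way to sketch the extended algorithm is the iterative form below, which carries the coefficients along with the remainders. The function name is an assumption, and non-negative inputs are assumed.

```python
def extended_gcd(a, b):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder is a combination of a and b; update all three in step.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y
```

For a = 35 and b = 15 this returns g = 5 with coefficients satisfying 35x + 15y = 5.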
Dynamic Programming
Break down a problem into smaller and smaller subproblems. At their lowest levels, the subproblems are solved and their answers stored in memory. These saved answers are used again with other larger (sub)problems which may call for a recomputation of the same information for their own answer. Reusing the stored answers allows for optimization by combining the answers of previously solved subproblems.
3. Bubble Sort
Bubble Sort is the simplest sorting algorithm that works by repeatedly swapping the adjacent elements if they are in wrong order.
Transitive Closure
Can one get from node a to node d in one or more hops? A binary relation tells you only that node a is connected to node b, and that node b is connected to node c, etc. After the transitive closure is constructed one may determine that node d is reachable from node a. (use Floyd-Warshall Algorithm)
External Sorting
External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead they must reside in the slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted subfiles are combined into a single larger file. Mergesort is typically preferred.
Trie
In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are not necessarily associated with every node. Rather, values tend only to be associated with leaves, and with some inner nodes that correspond to keys of interest. For the space-optimized presentation of prefix tree, see compact prefix tree.
Doubly Linked List: memory
Memory: O(3n) (LL: O(2n))
6. Pythagorean Triplet in an array
Method 1 (Naive) A simple solution is to run three nested loops that pick three array elements and check whether they form a Pythagorean triplet. Method 2 (Use Sorting) We can solve this in O(n^2) time by sorting the array first. 1) Square every element of the input array. This step takes O(n) time. 2) Sort the squared array in increasing order. This step takes O(n log n) time. 3) To find a triplet (a, b, c) such that a = b + c, do the following. Fix 'a' as the last element of the sorted array. Now search for a pair (b, c) in the subarray between the first element and 'a'. A pair (b, c) with a given sum can be found in O(n) time in a sorted array by moving two pointers inward from both ends. If no pair is found for the current 'a', move 'a' one position back and repeat the search.
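Method 2 might look like this in Python. The function name is an assumption, and the input is assumed to hold positive integers.

```python
def has_pythagorean_triplet(arr):
    """Return True if some x, y, z in arr satisfy x*x + y*y == z*z."""
    squares = sorted(x * x for x in arr)
    # Fix the largest square, then look for two smaller squares that sum to it.
    for a in range(len(squares) - 1, 1, -1):
        lo, hi = 0, a - 1
        while lo < hi:
            s = squares[lo] + squares[hi]
            if s == squares[a]:
                return True
            if s < squares[a]:
                lo += 1
            else:
                hi -= 1
    return False
```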
10. Reverse alternate levels of a perfect binary tree
Method 1 (Simple) A simple solution is to do following steps. 1) Access nodes level by level. 2) If current level is odd, then store nodes of this level in an array. 3) Reverse the array and store elements back in tree. Method 2 (Using Two Traversals) Another is to do two inorder traversals. Following are steps to be followed. 1) Traverse the given tree in inorder fashion and store all odd level nodes in an auxiliary array. For the above example given tree, contents of array become {h, i, b, j, k, l, m, c, n, o} 2) Reverse the array. The array now becomes {o, n, c, m, l, k, j, b, i, h} 3) Traverse the tree again inorder fashion. While traversing the tree, one by one take elements from array and store elements from array to every odd level traversed node. For the above example, we traverse 'h' first in above array and replace 'h' with 'o'. Then we traverse 'i' and replace it with n.
2. All Possible Palindromic Partitions
Note that this problem is different from the Palindrome Partitioning Problem, where the task is to find the partitioning with the minimum number of cuts in the input string. Here we need to print all possible partitions. The idea is to go through every substring starting from the first character and check whether it is a palindrome. If it is, add the substring to the solution and recur for the remaining part.
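The recursive idea can be sketched as a small backtracking function. The name is an assumption, and the sketch returns the partitions as lists instead of printing them.

```python
def palindromic_partitions(s):
    """Return every way to split s into palindromic substrings."""
    results = []
    current = []

    def recurse(start):
        if start == len(s):
            results.append(list(current))
            return
        for end in range(start + 1, len(s) + 1):
            piece = s[start:end]
            if piece == piece[::-1]:  # piece is a palindrome
                current.append(piece)
                recurse(end)          # partition the remaining part
                current.pop()         # backtrack

    recurse(0)
    return results
```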
Insertion Sort
Stable, in place sort with an order of growth which is between N and N-squared, needs only one spot of extra space and is dependent on the order of the items. Works by scanning over the list, then inserting the current item to the front of the list where it would fit sequentially. All the items to the left of the list will be sorted, but may not be in their final place as the larger items are continuously pushed back to make room for smaller items if necessary.
What is Stack and where it can be used?
A stack is a linear data structure which follows the order LIFO (Last In First Out) or FILO (First In Last Out) for accessing elements. Basic operations of a stack are: Push, Pop, Peek. Applications of a stack: infix to postfix conversion, evaluation of postfix expressions, reversing a string, implementing two stacks in an array, checking for balanced parentheses in an expression.
Hamming Weight
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used (also called the population count, popcount or sideways sum). Algorithm (for a 32-bit unsigned v): count the set bits in pairs, then quads, then octets, adding and shifting. v = v - ((v >> 1) & 0x55555555); v = (v & 0x33333333) + ((v >> 2) & 0x33333333); int count = (((v + (v >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
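The same bit trick can be tried out in Python. Because Python integers are unbounded, this sketch masks to 32 bits explicitly; the function name is an assumption.

```python
def popcount32(v):
    """Count the set bits of v, treated as a 32-bit unsigned integer."""
    v &= 0xFFFFFFFF
    v = v - ((v >> 1) & 0x55555555)                 # counts per 2-bit pair
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333)  # counts per 4-bit group
    v = (v + (v >> 4)) & 0x0F0F0F0F                 # counts per byte
    # Multiplying sums the four byte counts into the top byte.
    return ((v * 0x01010101) & 0xFFFFFFFF) >> 24
```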
7. Remove nodes on root to leaf paths of length < K
The idea here is to use post order traversal of the tree. Before removing a node we need to check that all the children of that node in the shorter path are already removed. There are 2 cases: i) This node becomes a leaf node in which case it needs to be deleted. ii) This node has other child on a path with path length >= k. In that case it needs not to be deleted.
5. Merge A Linked List Into Another Linked List At Alternate Positions
The idea is to run a loop while there are available positions in the first list and insert nodes of the second list by changing pointers.
1. Find Minimum Depth of a Binary Tree
The idea is to traverse the given binary tree. For every node, check if it is a leaf node. If yes, return 1. If it is not a leaf and the left subtree is NULL, recur for the right subtree; if the right subtree is NULL, recur for the left subtree. If both subtrees are not NULL, take the minimum of the two depths. In each recursive case, add 1 for the current node. A better solution is to do a level order traversal and return the depth of the first leaf node encountered.
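A sketch of the level order approach, using a hypothetical minimal `Node` class (the card does not define a tree type).

```python
from collections import deque


class Node:
    """Hypothetical binary tree node used only for this illustration."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def min_depth(root):
    """Return the number of nodes on the shortest root-to-leaf path."""
    if root is None:
        return 0
    queue = deque([(root, 1)])
    while queue:
        node, depth = queue.popleft()
        # The first leaf reached in level order is the shallowest one.
        if node.left is None and node.right is None:
            return depth
        if node.left is not None:
            queue.append((node.left, depth + 1))
        if node.right is not None:
            queue.append((node.right, depth + 1))
```

Unlike the recursive version, the traversal stops as soon as the first leaf is found, so it need not visit the whole tree.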
8. Segmented Sieve Given a number n, print all primes smaller than n. For example, if the given number is 10, output 2, 3, 5, 7.
The idea of the segmented sieve is to divide the range [0..n-1] into segments and compute the primes in each segment one by one. The algorithm first uses the Simple Sieve to find primes smaller than or equal to √n. Steps: 1) Use the Simple Sieve to find all primes up to the square root of n and store them in an array prime[]. 2) We need all primes in the range [0..n-1]. Divide this range into segments such that the size of every segment is at most √n. 3) For every segment [low..high]: create an array mark[high-low+1] (this needs only O(x) space, where x is the number of elements in the segment), then iterate through all primes found in step 1 and, for every prime, mark its multiples in [low..high].
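The steps can be sketched as follows. The function names are assumptions, and the sketch returns the primes smaller than n as a list instead of printing them.

```python
from math import isqrt


def simple_sieve(limit):
    """Return all primes <= limit using the plain sieve."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(limit) + 1):
        if is_prime[p]:
            for m in range(p * p, limit + 1, p):
                is_prime[m] = False
    return [i for i, flag in enumerate(is_prime) if flag]


def segmented_sieve(n):
    """Return all primes smaller than n, sieving one segment at a time."""
    if n <= 2:
        return []
    limit = isqrt(n - 1)            # sqrt of the largest candidate
    base = simple_sieve(limit)      # step 1: primes up to the square root
    primes = list(base)
    size = max(limit, 1)            # step 2: segment size of about sqrt(n)
    low = limit + 1
    while low < n:                  # step 3: process each segment
        high = min(low + size - 1, n - 1)
        mark = [True] * (high - low + 1)
        for p in base:
            # First multiple of p inside the segment (never below p*p).
            start = max(p * p, ((low + p - 1) // p) * p)
            for m in range(start, high + 1, p):
                mark[m - low] = False
        primes.extend(low + i for i, flag in enumerate(mark) if flag)
        low = high + 1
    return primes
```

Only the base primes and one segment are held in memory at a time, which is the point of the technique.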
5. Sieve of Eratosthenes Given a number n, print all primes smaller than or equal to n. It is also given that n is a small number. For example, if n is 10, the output should be "2, 3, 5, 7". If n is 20, the output should be "2, 3, 5, 7, 11, 13, 17, 19".
The sieve of Eratosthenes is one of the most efficient ways to find all primes smaller than n when n is smaller than 10 million or so (Ref Wiki). Following is the algorithm to find all the prime numbers less than or equal to a given integer n by Eratosthenes' method: 1. Create a list of consecutive integers from 2 to n: (2, 3, 4, ..., n). 2. Initially, let p equal 2, the first prime number. 3. Starting from p, count up in increments of p and mark each of these numbers greater than p itself in the list. These numbers will be 2p, 3p, 4p, etc.; note that some of them may have already been marked. 4. Find the first number greater than p in the list that is not marked. If there is no such number, stop. Otherwise, let p now equal this number (which is the next prime), and repeat from step 3.
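A direct Python sketch of the method described above; the function name is an assumption, and the primes are returned as a list.

```python
def sieve_of_eratosthenes(n):
    """Return all primes less than or equal to n."""
    if n < 2:
        return []
    marked = [False] * (n + 1)
    primes = []
    for p in range(2, n + 1):
        if not marked[p]:          # p is the next unmarked number, so it is prime
            primes.append(p)
            # Mark 2p, 3p, 4p, ... as composite.
            for multiple in range(2 * p, n + 1, p):
                marked[multiple] = True
    return primes
```

A common refinement, not used here to stay close to the description, is to start marking at p*p, since smaller multiples have already been marked by smaller primes.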
What is the purpose of the JavaScript array slice method?
The slice() method returns the selected elements in an array, as a new array object. The slice() method selects the elements starting at the given start argument, and ends at, but does not include, the given end argument. The original array will not be changed.
What is time complexity in regards to algorithm performance?
Time complexity refers to the number of operations an algorithm requires to complete.
Type erasure
Type erasure is any technique in which a single type can be used to represent a wide variety of types that share a common interface. In the C++ lands, the term type-erasure is strongly associated with the particular technique that uses templates in the interface and dynamic polymorphism in the implementation. 1. A union is the simplest form of type erasure. - It is bounded, and all participating types have to be mentioned at the point of declaration. 2. A void pointer is a low-level form of type erasure. Functionality is provided by pointers to functions that operate on void* after casting it back to the appropriate type. - It is unbounded, but type unsafe. 3. Virtual functions offer a type safe form of type erasure. The underlying void and function pointers are generated by the compiler. - It is unbounded, but intrusive. - Has reference semantics. 4. A template based form of type erasure provides a natural C++ interface. The implementation is built on top of dynamic polymorphism. - It is unbounded and unintrusive. - Has value semantics.
10. Lucas Theorem Given three numbers n, r and p, compute value of nCr mod p.
Using Lucas Theorem for nCr % p: Lucas' theorem states that nCr mod p, for a prime p, can be computed by multiplying the values C(n_i, r_i) mod p, where n_i and r_i are the digits at the same position in the base-p representations of n and r respectively. The idea is to compute C(n_i, r_i) one digit pair at a time. Each of these values can be computed with a DP-based solution. Since the digits are in base p, we never need more than O(p) space, and the time complexity of each individual computation is bounded by O(p^2). A simple solution is to first compute nCr, then compute nCr % p. This works fine when the value of nCr is small. But nCr % p is generally needed for large values of n, when nCr cannot fit in a variable and causes overflow, so computing nCr and then applying the modulo operator is not a good idea: for example, the straightforward methods can overflow for n = 50 and r = 40. The DP solution computes nCr using the formula C(n, r) = C(n-1, r-1) + C(n-1, r), with C(n, 0) = C(n, n) = 1.
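A sketch of the digit-by-digit computation, assuming p is prime. Both function names are assumptions. The helper applies the Pascal recurrence on a single row, which keeps the space at O(p).

```python
def binom_mod_small(n, r, p):
    """Return C(n, r) mod p for 0 <= n < p using Pascal's rule on one row."""
    if r > n:
        return 0
    row = [0] * (r + 1)
    row[0] = 1
    for i in range(1, n + 1):
        # Update right to left so row[j - 1] still holds the previous row.
        for j in range(min(i, r), 0, -1):
            row[j] = (row[j] + row[j - 1]) % p
    return row[r]


def lucas(n, r, p):
    """Return C(n, r) mod p for a prime p using Lucas' theorem."""
    result = 1
    while n > 0 or r > 0:
        ni, ri = n % p, r % p   # current base-p digits of n and r
        if ri > ni:
            return 0            # C(ni, ri) is 0, so the whole product is 0
        result = result * binom_mod_small(ni, ri, p) % p
        n //= p
        r //= p
    return result
```

For instance, C(12, 6) = 924 and 924 mod 5 = 4; in base 5 the digits are 12 = (2, 2) and 6 = (1, 1), giving C(2, 1) * C(2, 1) = 4.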
2. Delete a given node in Linked List (under given constraints)
We explicitly handle the case when the node to be deleted is the first node: we copy the data of the next node to the head and delete the next node. The cases when the deleted node is not the head node can be handled normally by finding the previous node and changing its next pointer.
Binary Search Tree
Will have a best-case height of lgN. This is also its expected height. In the worst case, it will have a height of N, and thus become similar to a linked list. Works by inserting nodes of lesser values to the left of a node, and inserting greater values to the right of the node, traversing down the tree until we reach a blank spot to insert. Has a worst-case cost of N to search and insert a node. The average case of searching will be 1.39lgN compares.
Quicksort
partitioning Best: O(n log n) (or O(n) three-way) Avg: O(n log n) Worst: O(n^2)
4. Convert array into Zig-Zag fashion
A Simple Solution is to first sort the array. After sorting, exclude the first element, swap the remaining elements in pairs. (i.e. keep arr[0] as it is, swap arr[1] and arr[2], swap arr[3] and arr[4], and so on). Time complexity is O(nlogn) since we need to sort the array first. We can convert in O(n) time using an Efficient Approach. The idea is to use modified one pass of bubble sort. Maintain a flag for representing which order(i.e. < or >) currently we need. If the current two elements are not in that order then swap those elements otherwise not.
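The efficient approach can be sketched as a single pass with a flag. The function name is an assumption; the target pattern is a < b > c < d > e and so on.

```python
def zigzag(arr):
    """Rearrange arr in place into the order a < b > c < d > ... and return it."""
    want_less = True  # True when arr[i] < arr[i + 1] is expected
    for i in range(len(arr) - 1):
        if want_less:
            if arr[i] > arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
        else:
            if arr[i] < arr[i + 1]:
                arr[i], arr[i + 1] = arr[i + 1], arr[i]
        want_less = not want_less  # the expected relation alternates
    return arr
```

A swap at position i never breaks the relation already established at i - 1, which is why one pass is enough.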
9. 0-1 Knapsack Problem
# A Dynamic Programming based Python program for the 0-1 Knapsack problem
# Returns the maximum value that can be put in a knapsack of capacity W
def knapSack(W, wt, val, n):
    K = [[0 for _ in range(W + 1)] for _ in range(n + 1)]
    # Build table K[][] in bottom-up manner
    for i in range(n + 1):
        for w in range(W + 1):
            if i == 0 or w == 0:
                K[i][w] = 0
            elif wt[i - 1] <= w:
                K[i][w] = max(val[i - 1] + K[i - 1][w - wt[i - 1]], K[i - 1][w])
            else:
                K[i][w] = K[i - 1][w]
    return K[n][W]
Array Bucket
-A bucket of arrays. -Fixed in size. -A size of about 3 usually works well.
Load Factor
#items(n) / table size
HashCode Method:
-method of OBJECT class -Returns an int -default hash code is BAD-- computed from Object's memory address. --> must override
Binary Search Tree
Avg height: O(log n) Worst height: O(n)
4 Rules of Recursion
Base Cases: You must always have some base cases, which can be solved without recursion. Making Progress: For the cases that are to be solved recursively, the recursive call must make progress to a base case. Design Rule: Assume that all recursive calls work. Compound Interest Rule: Never duplicate work by solving the same instance of a problem in separate recursive calls.
Extraction:
Breaking keys into parts and using the parts that uniquely identify with the item. 379452 = 394 121267 = 112
Linear Probing
Checks each spot in order to find available location, causes primary clustering.
Can doubly linked be implemented using a single pointer variable in every node?
Doubly linked list can be implemented using a single pointer. See XOR Linked List - A Memory Efficient Doubly Linked List
Collision
Entering into a space already in use.
Trie
Has only part of a key for comparison at each node.
ArrayLists: insert
Insert: often O(1), sometimes more
TreeMap complexity for iterating over associated values:
O(N)
Rehashing Complexity:
O(N)-- costly. Carefully select initial TS to avoid re-hashing.
Bucket Sort
O(n+m) where m is the # of buckets.
Insertion sort
Side-by-side comparison Best: O(n) Avg: O(n^2) Worst: O(n^2)
If we have UW-Madison student ID's, and we wanted the ideal hash functions, how would we do it, and why would there be a problem
-> We'd simply count each one as an index -> Hash table would be huge.
What are the 2 parts of recursion?
1. Base case (i.e., when to stop) 2. General, or recursive case (i.e., function calling itself)
What is a Data Structure?
A data structure is a way of organizing the data so that the data can be used efficiently
Heap
A type of priority queue. Stores data which is order-able. O(1) access to highest priority item.
Node
An object linked to other objects, representing some entity in that data structure.
Weighting:
Emphasizing some parts of the key over another.
Compressing:
Ensuring the hash code is a valid index for the table size.
Selection sort
Find smallest, put at beginning Best: O(n^2) Avg: O(n^2) Worst: O(n^2)
Arithmetic progressions
For p < -1, the sum of i^p over i = 1, 2, 3, ... always converges to a constant.
stack
LIFO list in which insertions/deletions are only done at one end.
Dynamic Memory
Memory that is allocated as needed, and NOT contiguous (side-by-side), specifically during the implementation of a linked list style data structure, which also includes binary trees and graphs.
Array: memory
Memory: O(n)
5. Merge Sort
MergeSort(arr[], l, r) If r > l 1. Find the middle point to divide the array into two halves: middle m = (l+r)/2 2. Call mergeSort for first half: Call mergeSort(arr, l, m) 3. Call mergeSort for second half: Call mergeSort(arr, m+1, r) 4. Merge the two halves sorted in step 2 and 3: Call merge(arr, l, m, r)
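The pseudocode translates to Python roughly as follows. The `merge` helper is not spelled out in the card, so its body here is an assumption about the usual implementation.

```python
def merge(arr, l, m, r):
    """Merge the sorted runs arr[l..m] and arr[m+1..r] in place."""
    left = arr[l:m + 1]
    right = arr[m + 1:r + 1]
    i = j = 0
    k = l
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:   # <= keeps equal elements in order (stable)
            arr[k] = left[i]
            i += 1
        else:
            arr[k] = right[j]
            j += 1
        k += 1
    # Copy whatever remains in either run.
    while i < len(left):
        arr[k] = left[i]
        i += 1
        k += 1
    while j < len(right):
        arr[k] = right[j]
        j += 1
        k += 1


def merge_sort(arr, l, r):
    """Sort arr[l..r] in place."""
    if r > l:
        m = (l + r) // 2
        merge_sort(arr, l, m)
        merge_sort(arr, m + 1, r)
        merge(arr, l, m, r)
```

A typical call is `merge_sort(data, 0, len(data) - 1)`.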
Heap Sort
Non-stable, in place sort which has an order of growth of NlogN. Requires only one spot of extra space. Works like an improved version of selection sort. It divides its input into a sorted and unsorted region, and iteratively shrinks the unsorted region by extracting the smallest element and moving it into the sorted region. It will make use of a heap structure instead of a linear time search to find the minimum.
Load Factor (LF)
Number of items / table size. For instance, a load factor of 1 means there are as many items as table slots.
HashMap complexity of basic operations:
O(1)
Linear Probing:
Step size is 1. Find the index, and keep incrementing by one until you find a free space.
Little-Oh
T(n) = o(f(n)) if T(n) = O(f(n)) and T(n) != Ω(f(n))
6. Convex Hull
The idea of Jarvis's Algorithm 1) Initialize p as leftmost point. 2) Do following while we don't come back to the first (or leftmost) point. .....a) The next point q is the point such that the triplet (p, q, r) is counterclockwise for any other point r. .....b) next[p] = q (Store q as next of p in the output convex hull). .....c) p = q (Set p as q for next iteration).
Quick Sort
Unstable, O(n log n) for a good pivot,O(n^2) for a bad pivot Ω(n log n) : Uses partitioning O(n), Pick a median of 1st, middle, and last element for pivot. Random selection is also good, but expensive. Algorithm can be slow because of many function calls.
How do you delete a value within the hash table?
You just set Table[hash(Key)] = null
Binary Tree
A data structure that consists of nodes, with one root node at the base of the tree, in which every node (the root and each child) has at most two children: a left child and a right child.
Dictionary: definition
A data structure that maps keys to values.
Mnemonic
A device such as a pattern of letters, ideas, or associations that assists in remembering something
Hash Function
A function that takes in the key to compute a specific Hash Index.
cycle
A path of positive length that starts and ends at the same vertex and does not traverse the same edge more than once.
Insertion Sort
Stable, O(n^2), Ω(n) : Swapping elements one at a time starting at the beginning.
Memoization
What happens when a sub problem's solution is found during the process of Dynamic Programming. The solution is stored for future use, so that it may be reused for larger problems which contain this same subproblem. This helps to decrease run time.
Greedy Algorithm
An algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage, in the hope of finding the global optimum. An example would be Kruskal's algorithm.
priority queue
collection of data items from a totally ordered universe
The more items a table can hold, the () likely a collision will happen.
less
Post-Order Traversal
1. Process left child. 2. Process right child. 3. Process self.
Recurrence Relation
An equation that is defined in terms of itself. Any polynomial or exponential can be represented by a recurrence.
What happens if the start parameter is not defined in a slice method?
If omitted, it acts like "0"
Open addressing
Uses probes to find an open location to store data.
Why do we use prime numbers for table size?
We mod often, and prime numbers give us the most unique numbers. (2*ts+1)
Euclidean Algorithm GCD
def gcd(a, b):
    while a:
        b, a = a, b % a
    return b
L N R
in-order traversal
What kind of Collection is Hashing?
value-orientated.
Graphs
"Uber" data structure. Shows connections between objects. Can be displayed as either a matrix or linked list representation.
Prim's Algorithm
(Minimum Spanning Trees, O(m + nlogn), where m is number of edges and n is the number of vertices) Starting from a vertex, grow the rest of the tree one edge at a time until all vertices are included. Greedily select the best local option from all available choices without regard to the global structure.
Tree Buckets
+--WC = O(logN) +--no wasting space +--dynamically sized -- more complicated than what's needed. --> insert with dups= O(1) --> W/o dups = O(N)
Chained bucket:
+--easy to implement +-- buckets can't overfill +-- buckets won't waste time. +-- buckets are dynamically sized.
Probe Hashing:
-> Hash it, and if it leads to a collision, use a separate equation to determine the step size and use that step size to find a new site.
Collision Hashing using Buckets
-Each element can store more than one item. -Throw collisions into a bucket. -Buckets aren't sorted.
10. Bridges in a Graph
1) For every edge (u, v), do the following: a) Remove (u, v) from the graph. b) See if the graph remains connected (we can use either BFS or DFS). c) Add (u, v) back to the graph. The time complexity of this method is O(E*(V+E)).
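A small Python sketch of this brute-force method. It assumes an undirected, initially connected graph with vertices numbered 0..n-1 and given as an edge list; the function names are illustrative.

```python
def is_connected(n, edges):
    """DFS from vertex 0; True if every one of the n vertices is reached."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    stack = [0]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == n


def find_bridges(n, edges):
    """Remove each edge in turn and test connectivity: O(E * (V + E))."""
    bridges = []
    for i, edge in enumerate(edges):
        remaining = edges[:i] + edges[i + 1:]  # graph without this edge
        if not is_connected(n, remaining):
            bridges.append(edge)
    return bridges
```

For the triangle 0-1-2 with an extra edge 2-3, only (2, 3) is reported as a bridge.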
Steps to resizing:
1. Double table size to nearest prime number 2. Re-hash items from old table into the new table.
Pre-Order Traversal
1. Process self. 2. Process left child. 3. Process right child.
Queue
A FIFO (First In First Out) data structure, where the first element added will be the first to be removed, and where a new element is added to the back, much like a waiting line.
1D Array
A linear collection of data items in a program, all of the same type, such as an array of integers or an array of strings, stored in contiguous memory, and easily accessed using a process called indexing.
Linked List
A linear data structure, much like an array, that consists of nodes, where each node contains data as well as a link to the next node, but that does not use contiguous memory.
Parent Node
A node, including the root, which has one or more child nodes connected to it.
What is a linked list head?
A reference to the first node in a linked list.
Data Structure
A way of organizing data in a computer so that it can be used efficiently, such as an array, linked list, stack, queue, or binary tree.
Array: access, search, insert, delete
Access: O(1) Search: O(n) Insert: O(n) Delete: O(n)
Doubly Linked List: access, search, insert, delete
Access: O(n) Search: O(n) Insert: O(1) Delete: O(1)
Greedy Algorithms
Algorithm design pattern. Compute a solution in stages, making choices that are locally optimal at each step; these choices are never undone.
Divide-and-conquer
Algorithm design patterns. Divide the problem into two or more smaller independent subproblems and solve the original problem using solutions to the subproblems.
Stack
An abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the last element that was added. LIFO - Last In First Out
Aggregate Data Types
Any type of data that can be referenced as a single entity, and yet consists of more than one piece of data, like strings, arrays, classes, and other complex structures.
Load Factor
Approximately how full the table is; typically kept around 0.7-0.8.
Unordered Linked List
Data structure whose operations are not supported efficiently. Is unordered. Has a worst case cost of search and insertion at N, an average case cost of insertion at N, and an average case cost of searching at N/2.
2. Maximum Path Sum in a Binary Tree
For each node there can be four ways that the max path goes through the node: 1. Node only 2. Max path through Left Child + Node 3. Max path through Right Child + Node 4. Max path through Left Child + Node + Max path through Right Child. The idea is to keep track of the four paths and pick the max one in the end. An important thing to note is that the root of every subtree needs to return the maximum path sum such that at most one child of the root is involved. This is needed for the parent function call. This sum is commonly stored in a variable such as 'max_single' and returned by the recursive function.
Lucas's theorem?
For non-negative integers n and r and a prime p, the following congruence relation holds: \binom{n}{r} \equiv \prod_{i=0}^{k} \binom{n_i}{r_i} \pmod{p}, where n = n_k p^k + n_{k-1} p^{k-1} + \cdots + n_1 p + n_0 and r = r_k p^k + r_{k-1} p^{k-1} + \cdots + r_1 p + r_0
HashMap underlying structure:
HashTable with chained buckets
What is an algorithm's time complexity?
How the time an algorithm takes to finish its work grows with the size of its input.
Collision handling:
How you handle the collisions so each element in the hash table stores only one item.
How to check if a given Binary Tree is BST or not?
If the inorder traversal of a binary tree is sorted, then the binary tree is a BST. The idea is to simply do an inorder traversal and, while traversing, keep track of the previous key value. If the current key value is greater, then continue, else return false.
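One way to sketch this check in Python, using an iterative inorder traversal. The Node class and the choice to reject duplicate keys are assumptions made for this example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def is_bst(root):
    """Inorder traversal that compares each key with the previously visited key."""
    prev = None
    stack = []
    node = root
    while stack or node:
        while node:                 # walk down to the leftmost unvisited node
            stack.append(node)
            node = node.left
        node = stack.pop()
        if prev is not None and node.key <= prev:
            return False            # keys must be strictly increasing
        prev = node.key
        node = node.right
    return True
```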
What happens if the end parameter is not defined in a slice method?
If omitted, all elements from the start position and to the end of the array will be selected.
Key
Information in items that is used to determine where the item goes into the table.
Indirect Sorting
Involves the use of smart pointers: objects which contain pointers.
Selection Sort
Non-stable, in place sort. Has an N-squared order of growth, needs only one spot of extra space. Works by searching the entire array for the smallest item, then exchanging it with the first item in the array. Repeats this process down the entire array until it is sorted.
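What is the worst case time complexity for: Insert, lookup, and delete, for hash tables?
O(n) in the worst case, when every key collides into the same slot. O(1) is the average case with a good hash function.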
Treemap complexity of basic operations:
O(logN)
B-Trees
Popular in disk storage. Keys are in nodes. Data is in the leaves.
L R N
Postorder traversal (Reverse Polish)
Preemption
Preemption is the act of temporarily interrupting a task being carried out by a computer system, without requiring its cooperation, and with the intention of resuming the task at a later time. Such a change is known as a context switch.
N L R
Preorder traversal (Polish)
Topological Sorting
Receives a DAG as input, outputs the ordering of vertices. Selects a node with no incoming edges, reads its outgoing edges.
Prime number Tables
Reduce the chance of collision.
Combinations
Repetition is Allowed: such as coins in your pocket (5,5,5,10,10) No Repetition: such as lottery numbers (2,14,15,27,30,33) https://www.mathsisfun.com/combinatorics/combinations-permutations.html
Mergesort
Stable sort which is not in place. It has an order of growth of NlogN and requires N amount of extra space. Works by dividing an array in half continuously into smaller and smaller arrays. At the lowest level, these arrays are sorted and then merged together after sorting in the reverse order they were divided apart in.
Tree Sort
Stable, O(n log n), Ω(n log n) : Put everything in the tree, traverse in-order.
Merge Sort
Stable, O(n log n), Ω(n log n): Use recursion to split arrays in half repeatedly. An array with size 1 is already sorted.
Bubble Sort
Stable, O(n^2), Ω(n) : Compares neighboring elements to see if sorted. Stops when there's nothing left to sort.
Big-Oh
T(n) = O(f(n)) if there are positive constants c and n0 such that T(n) <= c * f(n) for all n >= n0
Big Omega
T(n) = Ω(f(n)) if there are positive constants c and n0 such that T(n) >= c * f(n) for all n >= n0
5. Generate all possible sorted arrays from alternate elements of two given sorted arrays
The idea is to use recursion. In the recursive function, a flag is passed to indicate whether the current element in the output should be taken from 'A' or 'B'.
How is an Array different from Linked List?
The size of an array is fixed, while linked lists are dynamic in size. Inserting or deleting an element in an array is expensive, whereas both insertion and deletion can easily be done in linked lists. Random access is not allowed in linked lists. Extra memory space for a pointer is required with each element of a linked list. Arrays have better cache locality, which can make a pretty big difference in performance.
What is the "base case" of a recursive algorithm?
The solution to the "simplest" possible problem. The base case does not require calling itself or any recursive case.
What is the purpose of big O notation?
To classify the time complexity of different algorithms, which will allow us to quantify how much better or worse a particular algorithm is in terms of solving some problem.
Heap Sort
Unstable, O(n log n), Ω(n log n): Make a heap, take everything out.
Selection Sort
Unstable, O(n^2), Ω(n^2) : Iterates through every element to ensure the list is sorted.
Separate Chaining
Uses a linked list to handle collisions at a specific point.
Collisions:
When the Hash Function returns the same index for different keys.
Exception handling
When there's an error, the program makes an error object and passes it off to the runtime system, which looks for a method in the call stack to handle it.
Why array based representation for Binary Heap?
array based representation is space efficient. If the parent node is stored at index I, the left child can be calculated by 2 * I + 1 and right child by 2 * I + 2 (assuming the indexing starts at 0).
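The index arithmetic above can be written down directly. A minimal sketch, assuming 0-based indexing; the parent formula (i - 1) // 2 is the inverse of the two child formulas and is added here for completeness.

```python
def left(i):
    return 2 * i + 1


def right(i):
    return 2 * i + 2


def parent(i):
    return (i - 1) // 2


# A min-heap stored level by level in a plain list:
#         1
#       /   \
#      3     2
#     / \
#    7   4
heap = [1, 3, 2, 7, 4]
children_of_3 = (heap[left(1)], heap[right(1)])  # the values 7 and 4
```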
Algorithm Analysis
how long it takes a computer to do something
HashTable<K,V> & HashMap<K,V> class
Both are in java.util and implement the Map<K,V> interface. K is the type parameter for the key and V is the type parameter for the associated value. Operations: lookup, insert, delete. The constructor lets you set the initial capacity and load factor. Collisions are handled with chained buckets. Only HashMap allows null keys and values.
Average Lower bound for adjacent swaps
n(n-1)/4 = Ω(n^2)
How do you look up a value within the hash table?
return Table[Hash(key)];
Recursive Algorithms
solve a problem by solving smaller internal instances of a problem -- work towards a base case.
Hash function
takes an object and tells you where to put it.
Idea of probing:
If you have a collision, search somewhere else on the table.
Stack: definition
Last in, first out.
Contiguous Memory
Memory that is "side-by-side" in a computer, typical of an array structure.
Insertion & Quick Sort
Using both algorithms together is more efficient: quicksort's O(n log n) advantage only pays off on large arrays, so small subarrays are handed to insertion sort, which has less overhead.
Explain how Prim's Minimum Spanning Tree algorithm works?
1) Create a set mstSet that keeps track of vertices already included in the MST. 2) Assign a key value to all vertices in the input graph. Initialize all key values as INFINITE. Assign key value 0 to the first vertex so that it is picked first. 3) While mstSet doesn't include all vertices: a) Pick a vertex u which is not in mstSet and has the minimum key value. b) Include u in mstSet. c) Update the key values of all adjacent vertices of u. To update the key values, iterate through all adjacent vertices. For every adjacent vertex v, if the weight of edge u-v is less than the previous key value of v, update the key value to the weight of u-v.
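The three steps can be sketched in Python like this. The sketch assumes a connected, undirected graph given as an adjacency matrix in which 0 means "no edge"; the names in_mst, key and parent are illustrative, and the linear scan for the minimum key gives O(V^2) time rather than the heap-based bound.

```python
import math


def prim_mst(graph):
    """Return MST edges as (parent, vertex, weight) tuples."""
    n = len(graph)
    key = [math.inf] * n      # cheapest known edge weight into each vertex
    parent = [None] * n       # MST neighbour that provides that edge
    in_mst = [False] * n      # plays the role of mstSet
    key[0] = 0                # so that vertex 0 is picked first
    for _ in range(n):
        # a) pick the vertex outside the MST with the minimum key value
        u = min((v for v in range(n) if not in_mst[v]), key=lambda v: key[v])
        # b) include it in the MST
        in_mst[u] = True
        # c) update the key values of its neighbours
        for v in range(n):
            w = graph[u][v]
            if w and not in_mst[v] and w < key[v]:
                key[v] = w
                parent[v] = u
    return [(parent[v], v, graph[parent[v]][v]) for v in range(1, n)]
```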
What is a Linked List and What are its types?
A linked list is a linear data structure (like arrays) where each element is a separate object. Each element (that is, each node) of a list comprises two items: the data and a reference to the next node. Types of Linked List: Singly Linked List: every node stores the address or reference of the next node in the list, and the last node has its next address or reference as NULL. For example 1->2->3->4->NULL. Doubly Linked List: there are two references associated with each node; one points to the next node and one to the previous node. E.g. NULL<-1<->2<->3->NULL. Circular Linked List: a linked list where all nodes are connected to form a circle. There is no NULL at the end. A circular linked list can be a singly circular linked list or a doubly circular linked list. E.g. 1->2->3->1 [the next pointer of the last node points to the first].
Ragged Array
An array where the number of columns in each row may be different.
Breadth-First Search
Explores the oldest unexplored vertices first. Places discovered vertices in a queue. In an undirected graph: Assigns a direction to each edge, from the discoverer to the discovered, and the discoverer is denoted to be the parent.
Permutations
A permutation is an ordered combination. Repetition is Allowed: such as the lock above. It could be "333". No Repetition: for example the first three people in a running race. You can't be first and second. https://www.mathsisfun.com/combinatorics/combinations-permutations.html
Peek
A process used in stack and queue processing where a copy of the top or front value is acquired, without removing that item.
9. Chinese remainder theorem We are given two arrays num[0..k-1] and rem[0..k-1]. In num[0..k-1], every pair is coprime (gcd for every pair is 1). We need to find minimum positive number x such that: x % num[0] = rem[0], x % num[1] = rem[1], ....................... x % num[k-1] = rem[k-1]
A Naive Approach to find x is to start with 1 and one by one increment it and check if dividing it with given elements in num[] produces corresponding remainders in rem[]. Once we find such a x, we return it.
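The naive search can be sketched in a few lines of Python. It assumes the inputs are valid (pairwise coprime moduli and each rem[i] smaller than num[i]); otherwise the loop would never terminate. The function name find_min_x is illustrative.

```python
def find_min_x(num, rem):
    """Try x = 1, 2, 3, ... until every congruence x % num[i] == rem[i] holds."""
    x = 1
    while True:
        if all(x % n == r for n, r in zip(num, rem)):
            return x
        x += 1
```

For num = [3, 4, 5] and rem = [2, 3, 1] the search stops at x = 11.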
Stable Sorting Algorithm
Items with the same key are sorted based on their relative position in the original permutation
Radix Sort
Non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. Two classifications of radix sorts are least significant digit (LSD) radix sorts and most significant digit (MSD) radix sorts.
3-Way Quick Sort
Non-stable, in place sort with an order of growth between N and NlogN. Needs lgN of extra space. Is probabilistic and dependent on the distribution of input keys.
Bloom Filters
Probabilistic hash table. No means no. Yes means maybe. Multiple (different) hash functions. Can't resize table. Also can't remove elements.
Sharding
Sharding is a type of database partitioning that separates very large databases into smaller, faster, more easily managed parts called data shards.
Palindrome
String that reads the same forwards as backwards
Heapify (bubble down)
Swap a node with one of its children, calling bubble_down on the node again until it dominates its children. Each time, place a node that dominates the others as the parent node.
Divide-and-Conquer Recurrences
T(n) = aT(n/b) + f(n)
How do you insert a value within the hash table?
Table[Hash(key)]=data;
Table Size(TS)
The Array's Length
3. Primality Test | Set 2 (Fermat Method)
A higher value of k makes a correct result for composite inputs more probable. For prime inputs, the result is always correct. 1) Repeat the following k times: a) Pick a randomly in the range [2, n - 2]. b) If a^(n-1) ≢ 1 (mod n), then return false. 2) Return true [probably prime].
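A short Python sketch of the test. The handling of n <= 4 as special cases and the default k = 5 are assumptions added here; Python's built-in three-argument pow performs the modular exponentiation.

```python
import random


def is_probably_prime(n, k=5):
    """Fermat test: False means definitely composite, True means probably prime."""
    if n <= 1 or n == 4:
        return False
    if n <= 3:
        return True
    for _ in range(k):
        a = random.randint(2, n - 2)   # random base in [2, n - 2]
        if pow(a, n - 1, n) != 1:      # a^(n-1) is not 1 (mod n)
            return False
    return True
```

Composite numbers can occasionally pass (Carmichael numbers such as 561 pass for every base coprime to them), which is why the result is only "probably prime".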
Euler's Totient Function properties
1) For a prime number p, Φ(p) is p-1. For example, Φ(5) is 4, Φ(7) is 6 and Φ(13) is 12. This holds because the gcd of p with every number from 1 to p-1 is 1 when p is prime. 2) For two numbers a and b, if gcd(a, b) is 1, then Φ(ab) = Φ(a) * Φ(b). For example, Φ(5) is 4 and Φ(6) is 2, so Φ(30) must be 8 as 5 and 6 are relatively prime. 3) For any two distinct prime numbers p and q, Φ(pq) = (p-1)*(q-1). This property is used in the RSA algorithm. 4) If p is a prime number, then Φ(p^k) = p^k - p^(k-1). This can be proved using Euler's product formula. 5) The sum of the values of the totient function over all divisors of n is equal to n (a result due to Gauss).
9. Count number of bits to be flipped to convert A to B
1. Calculate XOR of A and B. a_xor_b = A ^ B 2. Count the set bits in the above calculated XOR result. countSetBits(a_xor_b)
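In Python the two steps collapse into one expression. A minimal sketch for non-negative integers; the name count_flips is illustrative and counting '1' characters in bin() stands in for countSetBits.

```python
def count_flips(a, b):
    """Number of bit positions in which a and b differ."""
    a_xor_b = a ^ b                   # step 1: differing bits become 1
    return bin(a_xor_b).count("1")    # step 2: count the set bits
```

For example, 10 (01010) and 20 (10100) differ in four bit positions.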
Red Black Trees
1. Every node is Red or Black 2. The root is Black 3. If a node is red, its children must be black 4. Every path from a node to a NULL pointer must contain the same number of black nodes
Good Hash Function qualities:
1. Must be deterministic: -> Key must ALWAYS generate the same Hash Index (excluding rehashing). 2. Must achieve uniformity -> Keys should be distributed evenly across hash table. 3. FAST/EASY to compute -> only use parts of the key that DISTINGUISH THE ITEMS FROM EACH OTHER 4. Minimize collisions:
In-Order Traversal
1. Process left child. 2. Process self. 3. Process right child.
How does Hashing work?
1. you have a key for the item. 2. the item's key gets churned within the hash function to form the Hash index. 3. The hash index can be applied to the data array, and so, the specific data is found.
10. Find Next Sparse Number
A Simple Solution is to do the following: 1) Write a utility function isSparse(x) that takes a number and returns true if x is sparse, else false. This function can be easily written by traversing the bits of the input number. 2) Start from x and do the following: while(1) { if (isSparse(x)) return x; else x++ } An Efficient Solution can solve this problem without checking all numbers one by one. Below are the steps. 1) Find the binary representation of the given number and store it in a boolean array. 2) Initialize the last_finalized bit position as 0. 3) Start traversing the binary representation from the least significant bit. a) If we get two adjacent 1's such that the next (or third) bit is not 1, then (i) Make all bits after this 1 to the last finalized bit (including the last finalized) 0. (ii) Update the last finalized bit as the next bit.
9. Find Kth Smallest/Largest Element In Unsorted Array
A Simple Solution is to sort the given array using an O(n log n) sorting algorithm like Merge Sort, Heap Sort, etc. and return the element at index k-1 in the sorted array. Time complexity of this solution is O(n log n). Using a Max-Heap: we can also use a Max-Heap to find the k'th smallest element. 1) Build a Max-Heap MH of the first k elements (arr[0] to arr[k-1]) of the given array. O(k) 2) For each element after the k'th element (arr[k] to arr[n-1]), compare it with the root of MH. a) If the element is less than the root, then make it the root and call heapify for MH. b) Else ignore it. Step 2 is O((n-k) * log k). 3) Finally, the root of MH is the k'th smallest element. Time complexity of this solution is O(k + (n-k) * log k).
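The max-heap method could look like this in Python. Since the standard heapq module only provides a min-heap, this sketch stores negated values to simulate a max-heap; that trick, and the assumption 1 <= k <= len(arr), are choices made here.

```python
import heapq


def kth_smallest(arr, k):
    """k'th smallest element, keeping a max-heap of the k smallest seen so far."""
    heap = [-x for x in arr[:k]]      # negate values: min-heap acts as max-heap
    heapq.heapify(heap)               # step 1: O(k)
    for x in arr[k:]:                 # step 2: O((n - k) * log k)
        if x < -heap[0]:              # smaller than the current root
            heapq.heapreplace(heap, -x)
    return -heap[0]                   # step 3: the root is the k'th smallest
```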
1. Maximum Subarray XOR
A Simple Solution is to use two loops to find the XOR of all subarrays and return the maximum. An Efficient Solution uses a Trie: 1) Create an empty Trie. Every node of the Trie contains two children, for the 0 and 1 values of a bit. 2) Initialize pre_xor = 0 and insert it into the Trie. 3) Initialize result = minus infinity. 4) Traverse the given array and do the following for every array element arr[i]: a) pre_xor = pre_xor ^ arr[i]; pre_xor now contains the XOR of elements from arr[0] to arr[i]. b) Query the maximum XOR value ending with arr[i] from the Trie. c) Update result if the value obtained in step 4.b is more than the current value of result.
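A compact Python sketch of the Trie approach. It assumes non-negative integers that fit in 32 bits and represents each Trie node as a dict keyed by bit value; inserting each prefix XOR after querying it is spelled out here because later queries depend on it. All names are illustrative.

```python
INT_BITS = 32  # assumption: inputs are non-negative and fit in 32 bits


def trie_insert(root, value):
    node = root
    for i in range(INT_BITS - 1, -1, -1):
        bit = (value >> i) & 1
        node = node.setdefault(bit, {})


def trie_query(root, value):
    """Largest value ^ p over all prefix XORs p already stored in the Trie."""
    node = root
    best = 0
    for i in range(INT_BITS - 1, -1, -1):
        bit = (value >> i) & 1
        if (1 - bit) in node:         # the opposite bit sets bit i of the XOR
            best |= 1 << i
            node = node[1 - bit]
        else:
            node = node[bit]
    return best


def max_subarray_xor(arr):
    root = {}
    trie_insert(root, 0)              # step 2: the empty prefix
    pre_xor = 0
    result = float("-inf")            # step 3
    for x in arr:
        pre_xor ^= x                                       # step 4a
        result = max(result, trie_query(root, pre_xor))    # steps 4b, 4c
        trie_insert(root, pre_xor)    # make this prefix available to later queries
    return result
```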
What is recursion?
A method of solving problems that involves a function calling itself.
Set Partition
A partitioning of elements of some universal set into a collection of disjointed subsets. Thus, each element must be in exactly one subset.
Push
A process used in stack and queue processing where a new value is inserted onto the top of the stack OR into the back of the queue (Enqueue).
Linear Data Structure
A programming data structure that occupies contiguous memory, such as an array of values.
linked list
A sequence of zero or more nodes containing some data and pointers to other nodes of the list.
Head
A typical object variable identifier name used to reference, or point to, the first object in a linked list. The number one rule for processing linked lists is, 'Never let go of the head of the list!', otherwise all of the list is lost in memory. The number two rule when managing linked lists is, 'Always connect before you disconnect!'.
What are the two parts of a linked list node?
A value and a pointer to the next node in the sequence.
Selection Sort
An in-place comparison sort algorithm, O(n^2). The algorithm divides the input list into two parts: the sublist of items already sorted, which is built up from left to right at the front (left) of the list, and the sublist of items remaining to be sorted that occupy the rest of the list. Initially, the sorted sublist is empty and the unsorted sublist is the entire input list. The algorithm proceeds by finding the smallest (or largest, depending on sorting order) element in the unsorted sublist, exchanging (swapping) it with the leftmost unsorted element (putting it in sorted order), and moving the sublist boundaries one element to the right
Inverted Index
An index data structure storing a mapping from content, such as words or numbers, to its locations in a database file, or in a document or a set of documents (named in contrast to a Forward Index, which maps from documents to content). The purpose of an inverted index is to allow fast full text searches, at a cost of increased processing when a document is added to the database.
Counting Sort (Key Indexed sort)
An integer sorting algorithm which counts the number of objects that have each distinct key value, and then uses arithmetic on those counts to determine the position of each key value in the output array. It cannot handle large keys efficiently, and is often used as a subroutine for other sorting algorithms such as radix sort. Has a time complexity of O(N + k), where k is the range of key values.
Iterators
An object that knows how to "walk" over a collection of things. Encapsulates everything it needs to know about what it's iterating over. Should all have similar interfaces. Can read data, move, know when to stop.
Solving Divide-and-Conquer Recurrences
Case 1: Too many leaves. Case 2: Equal work per level. Case 3: Too expensive a root
Quadratic Probing
Checks the square of the nth time it has to check, causes secondary clustering. Not guaranteed to find an open table spot unless table is 1/2 empty.
Folding:
Combining parts of the key using operations like + and bitwise operations such as exclusive or. Key: 123456789 is split into 123, 456 and 789; 123 + 456 + 789 = 1368, and the leading 1 is discarded, giving 368.
Rabin-Karp
Compute hash codes of each substring whose length is the length of s, such as a function with the property that the hash code of a string is an additive function of each individual character. Get the hash code of a sliding window of characters and compare if the hash matches.
Abstract Data Types
Consists of 2 parts: 1. Data it contains 2. Operations that can be performed on it
What is Constant Time complexity? O(1)
Constant time complexity is the "holy grail". No matter the size of your input, the algorithm will take the same amount of time to complete.
What are two common ways to search through trees?
Depth-First search (DFS) and Breadth-First search(BFS)
What is the goal of Hashing?
Do faster than O(LogN) time complexity for: lookup, insert, and remove operations. To achieve O(1)
Double Checked Locking
Double-checked locking is a software design pattern used to reduce the overhead of acquiring a lock by first testing the locking criterion (the "lock hint") without actually acquiring the lock. Only if the locking criterion check indicates that locking is required does the actual locking logic proceed. (Often used in Singletons, and has issues in C++).
What would the Perfect Hash Function be?
Each Key maps to an unique Hash Index.
Rehashing
Expanding the table: double table size, find closest prime number. Rehash each element for the new table size.
Depth-First Search
Explores the newest unexplored vertices first. Places discovered vertices in a stack (or uses recursion). Partitions edges into two classes: tree edges and back edges. Tree edges discover new vertices; back edges point to ancestors.
queue
FIFO list in which elements are added from one end of the structure and deleted from the other end.
Queues
First in, first out. O(1)
Relaxation
Getting from A->C more cheaply by using B as an intermediary.
1. Insertion of a node in Linked List (On the basis of some constraints)
1) If the linked list is empty, then make the node the head and return it. 2) If the value of the node to be inserted is smaller than the value of the head node, then insert the node at the start and make it the head. 3) In a loop, find the appropriate node after which the input node (say 9) is to be inserted. To find the appropriate node, start from the head and keep moving until you reach a node GN (say 10) whose value is greater than the input node. The node just before GN is the appropriate node (say 7). 4) Insert the node (9) after the appropriate node (7) found in step 3.
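These steps amount to insertion into a sorted singly linked list. A minimal Python sketch, assuming a bare Node class with data and next fields; the names are illustrative.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None


def sorted_insert(head, new_node):
    """Insert new_node into an ascending list and return the (possibly new) head."""
    # Steps 1 and 2: empty list, or the new value belongs before the head.
    if head is None or new_node.data < head.data:
        new_node.next = head
        return new_node
    # Step 3: stop at the node just before the first greater value.
    current = head
    while current.next is not None and current.next.data <= new_node.data:
        current = current.next
    # Step 4: connect the new node before disconnecting the old link.
    new_node.next = current.next
    current.next = new_node
    return head
```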
Fermat's little theorem?
If n is a prime number, then for every a with 1 <= a < n, a^(n-1) ≡ 1 (mod n), or equivalently a^(n-1) % n = 1. Example: since 5 is prime, 2^4 ≡ 1 (mod 5) [or 2^4 % 5 = 1], 3^4 ≡ 1 (mod 5) and 4^4 ≡ 1 (mod 5). Since 7 is prime, 2^6 ≡ 1 (mod 7), 3^6 ≡ 1 (mod 7), 4^6 ≡ 1 (mod 7), 5^6 ≡ 1 (mod 7) and 6^6 ≡ 1 (mod 7).
10. Boolean Parenthesization Problem
If we draw the recursion tree of the recursive solution, we can observe that it has many overlapping subproblems. Like other dynamic programming problems, it can be solved by filling a table in a bottom-up manner.
One-Sided Binary Search
In the absence of an upper bound, we can repeatedly test larger intervals (A[1], A[2], A[4], A[8], A[16], etc.) until we find an upper bound, the transition point, p, in at most 2⌈log p⌉ comparisons. One-sided binary search is most useful whenever we are looking for a key that lies close to our current position.
2. Search an element in a sorted and rotated array
Input: arr[] = {3, 4, 5, 1, 2}, element to search = 1. 1) Find the pivot point and divide the array into two sub-arrays (pivot = 2, the index of 5). 2) Now call binary search for one of the two sub-arrays: (a) if the element is greater than or equal to the 0th element, search in the left array; (b) else search in the right array (1 goes to the else branch, as 1 < the 0th element, 3). 3) If the element is found in the selected sub-array, return its index; else return -1.
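A possible Python rendering of the pivot-then-binary-search idea. It assumes distinct elements; find_pivot and the inclusive-bounds helper binary_search are names invented for this sketch, and the standard bisect module does the actual binary search.

```python
from bisect import bisect_left


def find_pivot(arr):
    """Index of the largest element in a sorted, rotated array of distinct values."""
    lo, hi = 0, len(arr) - 1
    if arr[lo] <= arr[hi]:        # not rotated: the maximum is the last element
        return hi
    while lo < hi:
        mid = (lo + hi) // 2
        if arr[mid] > arr[hi]:    # the minimum lies to the right of mid
            lo = mid + 1
        else:
            hi = mid
    return lo - 1                 # lo is the minimum; the pivot sits just before it


def binary_search(arr, lo, hi, key):
    """Binary search in the sorted slice arr[lo..hi] (inclusive); -1 if absent."""
    i = bisect_left(arr, key, lo, hi + 1)
    return i if i <= hi and arr[i] == key else -1


def search_rotated(arr, key):
    if not arr:
        return -1
    pivot = find_pivot(arr)
    if key >= arr[0]:
        return binary_search(arr, 0, pivot, key)            # left sub-array
    return binary_search(arr, pivot + 1, len(arr) - 1, key)  # right sub-array
```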
Shellsort
Insertion sort over a gap Best: O(n log n) Avg: depends on gap sequence Worst: O(n^2)
Standard data structure for solving complex bit manipulation
Lookup table
Lazy Deletion
Marking a spot as deleted in a hash table rather than actually deleting it.
Static Memory
Memory allocated to an array, which cannot grow or shrink once declared.
Inversions
Min: 0 Max: n(n-1)/2 Swapping removes 1 inversion
Quick Sort
Non-stable, in-place sort with an order of growth of N log N. Needs lg N of extra space. It has a probabilistic guarantee. Works by making use of a divide and conquer method. The array is divided into two parts, and then the parts are sorted independently. An arbitrary value is chosen as the partition. Afterwards, all items which are larger than this value go to the right of it, and all items which are less than this value go to the left of it. We arbitrarily choose a[lo] as the partitioning item. Then we scan from the left end of the array one by one until we find an entry that is greater than a[lo]. At the same time, we scan from the right end of the array toward the left to find an entry that is less than or equal to a[lo]. Once we find these two values, we swap them.
ShellSort
Non-stable, in place sort with an order of growth which is undetermined, though usually given at being N-to-the 6/5. Needs only one spot of extra space. Works as an extension of insertion sort. It gains speed by allowing exchanges of entries which are far apart, producing partially sorted arrays which are eventually sorted quickly at the end with an insertion sort. The idea is to rearrange the array so that every h-th entry yields a sorted sequence. The array is h-sorted.
Typical runtime of a recursive function with multiple branches
O( branches^depth )
Complexity for iterating over associated values:
O(T.S + N) --> worst case.
Quadratic Probing:
Probe Sequence is (Hk+1)^2. Minimizes clustering better at distinguishing items across table.
Replica
Replication in computing involves sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.
Compute XOR of every bit in an integer
Similar to addition, XOR is associative and commutative, so we need to XOR every bit together. First, XOR the top half with the bottom half. Then, XOR the top quarter of the bottom half with the bottom quarter of the bottom half, and so on: x ^= x >> 32; x ^= x >> 16; x ^= x >> 8; x ^= x >> 4; x ^= x >> 2; x ^= x >> 1; x = x & 1
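The folding sequence above, written as a Python function. This sketch assumes x is a non-negative integer of at most 64 bits; the name parity is illustrative.

```python
def parity(x):
    """XOR of all bits of a 64-bit value: 1 if the number of set bits is odd."""
    x ^= x >> 32   # fold the top half onto the bottom half
    x ^= x >> 16
    x ^= x >> 8
    x ^= x >> 4
    x ^= x >> 2
    x ^= x >> 1
    return x & 1   # the lowest bit now holds the XOR of every bit
```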
6. Longest Path In Matrix
The idea is simple: we calculate the longest path beginning with every cell. Once we have computed the longest path for all cells, we return the maximum of all longest paths. One important observation in this approach is that there are many overlapping subproblems. Therefore this problem can be optimally solved using Dynamic Programming, with a lookup table dp[][] used to check whether a problem is already solved or not.
6. Print Nodes in Top View of Binary Tree
The idea is to do something similar to Vertical Order Traversal. Like Vertical Order Traversal, we need to group nodes of the same horizontal distance together. We do a level order traversal so that the topmost node at a horizontal distance is visited before any other node of the same horizontal distance below it. Hashing is used to check whether a node at a given horizontal distance has been seen or not.
7. Length of the largest subarray with contiguous elements
The important thing to note in question is, it is given that all elements are distinct. If all elements are distinct, then a subarray has contiguous elements if and only if the difference between maximum and minimum elements in subarray is equal to the difference between last and first indexes of subarray. So the idea is to keep track of minimum and maximum element in every subarray.
Modular Exponentiation
When computing x^n directly, overflow may occur for large values of n or x. Therefore, power is generally evaluated under modulo of a large number. Below is the fundamental modular property that is used for efficiently computing power under modular arithmetic: (a mod p) (b mod p) ≡ (ab) mod p, or equivalently ( (a mod p) (b mod p) ) mod p = (ab) mod p. For example, with a = 50, b = 100, p = 13: 50 mod 13 = 11, 100 mod 13 = 9, and 11*9 ≡ 5000 (mod 13), that is, 11*9 mod 13 = 5000 mod 13 = 8.
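The property above is what makes square-and-multiply safe from overflow. A minimal Python sketch, assuming n >= 0 and p >= 1; the name power_mod is illustrative, and Python's built-in pow(x, n, p) does the same job.

```python
def power_mod(x, n, p):
    """Compute (x ** n) % p by repeated squaring, reducing mod p at every step."""
    result = 1 % p        # handles p == 1, where every result is 0
    x %= p
    while n > 0:
        if n & 1:                     # current lowest bit of the exponent is set
            result = (result * x) % p
        x = (x * x) % p               # square the base for the next bit
        n >>= 1
    return result
```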
Articulation Vertex
The weakest point in a graph
2. Modular multiplicative inverse
There are three methods to find the multiplicative inverse of a modulo m. 1) Naive method, O(m): try all numbers x from 1 to m-1 and check whether (a*x) % m is 1. 2) Extended Euclidean algorithm, O(log m) [works when a and m are coprime]. 3) Fermat's little theorem, O(log m) [works when m is prime]. Applications: computing the modular multiplicative inverse is an essential step in the RSA public-key encryption method.
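As an illustration of the second method, here is a sketch based on the extended Euclidean algorithm. The names `extended_gcd` and `mod_inverse` are hypothetical, and returning `None` when no inverse exists is a design choice made for this example.

```python
def extended_gcd(a: int, b: int):
    """Return (g, x, y) such that a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x1, y1 = extended_gcd(b, a % b)
    # Back-substitute: b*x1 + (a % b)*y1 == g
    return g, y1, x1 - (a // b) * y1


def mod_inverse(a: int, m: int):
    """Return x in [0, m) with (a * x) % m == 1, or None if a and m are not coprime."""
    g, x, _ = extended_gcd(a % m, m)
    if g != 1:
        return None
    return x % m
```

When m is prime, the third method gives the same answer as `pow(a, m - 2, m)`.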
Red-black Tree
Worst height: 2 log n
Explain a full binary tree?
a binary tree T is full if each node is either a leaf or possesses exactly two child nodes
list
a collection of data items arranged in a certain linear order
data structure
a particular scheme organizing related data items.
Anagram
a word, phrase, or name formed by rearranging the letters of another, such as cinema, formed from iceman.
10. Stock Buy Sell to Maximize Profit
If we are allowed to buy and sell only once, then we can use the "maximum difference between two elements" algorithm. Here we are allowed to buy and sell multiple times. Following is the algorithm for this problem. 1. Find the local minimum and store it as the starting index. If none exists, return. 2. Find the local maximum and store it as the ending index. If we reach the end, set the end as the ending index. 3. Update the solution (increment the count of buy-sell pairs). 4. Repeat the above steps if the end is not reached.
Chaining
Make each slot the head of a linked list
5. Bottom View Binary Tree
Method 1 - Using Queue. The following are steps to print the Bottom View of a Binary Tree. 1. We put tree nodes in a queue for the level order traversal. 2. Start with horizontal distance (hd) 0 for the root node, and keep adding the left child to the queue with horizontal distance hd-1 and the right child with hd+1. 3. Also, use a TreeMap, which stores key-value pairs sorted on key. 4. Every time we encounter a new or an existing horizontal distance, put the node data into the map with the horizontal distance as key. The first time it will add to the map; the next time it will replace the value. This makes sure that the bottommost element for that horizontal distance is present in the map, which is the element you see when looking at the tree from beneath. Method 2 - Using HashMap. This method is contributed by Ekta Goel. Approach: Create a map where the key is the horizontal distance and the value is a pair (a, b), where a is the value of the node and b is the height of the node. Perform a pre-order traversal of the tree. If the current node at horizontal distance h is the first we've seen, insert it in the map. Otherwise, compare the node with the existing one in the map, and if the height of the new node is greater, update the map.
3. Edit Distance?
def editDistDP(str1, str2, m, n):
    # Create a table to store results of subproblems
    dp = [[0 for _ in range(n + 1)] for _ in range(m + 1)]
    # Fill dp[][] in bottom-up manner
    for i in range(m + 1):
        for j in range(n + 1):
            # If first string is empty, the only option is to
            # insert all characters of the second string
            if i == 0:
                dp[i][j] = j    # Min. operations = j
            # If second string is empty, the only option is to
            # remove all characters of the first string
            elif j == 0:
                dp[i][j] = i    # Min. operations = i
            # If last characters are the same, ignore the last char
            # and solve for the remaining strings
            elif str1[i - 1] == str2[j - 1]:
                dp[i][j] = dp[i - 1][j - 1]
            # If last characters are different, consider all
            # possibilities and find the minimum
            else:
                dp[i][j] = 1 + min(dp[i][j - 1],      # Insert
                                   dp[i - 1][j],      # Remove
                                   dp[i - 1][j - 1])  # Replace
    return dp[m][n]
What is the "recursive case" of a recursive algorithm?
When we use the same algorithm to solve a simpler version of the problem.
Algorithm
has input, produces output, definite, finite, operates on the data it is given
What is a perfect binary tree?
A binary tree in which all interior nodes have two children and all leaves have the same depth or same level.
Explain a complete binary tree?
is a binary tree in which every level, except possibly the last, is completely filled, and all nodes are as far left as possible
Mergesort
split into sub-arrays Best: O(n log n) Avg: O(n log n) Worst: O(n log n)
Hash collision
two (or more) keys hash to same slot
Divide and Conquer
works by recursively breaking down a problem into two or more sub problems until the problems become simple enough to be solved directly. An example would be mergesort.
Replace the lowest bit that is 1 with 0
x & (x - 1)
Compute x modulo a power of 2 (y)
x & (y - 1)
Right propagate the rightmost set bit in x
x | (x - 1) (equivalently x | ((x & ~(x - 1)) - 1); the inner parentheses are required because - binds tighter than &)
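The three bit tricks on the cards above can be checked with a short Python sketch. The function names are hypothetical, the inputs are assumed to be positive integers, and `right_propagate` uses the simplified form x | (x - 1).

```python
def clear_lowest_set_bit(x: int) -> int:
    """Replace the lowest 1 bit of x with 0."""
    return x & (x - 1)


def mod_power_of_two(x: int, y: int) -> int:
    """Compute x % y, assuming y is a power of 2."""
    return x & (y - 1)


def right_propagate(x: int) -> int:
    """Set every bit below the rightmost 1 bit of x (x assumed > 0)."""
    return x | (x - 1)
```

For instance, `right_propagate(0b01010000)` yields `0b01011111`, and `mod_power_of_two(77, 64)` yields 13.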
Geometric series
.
8. Interpolation Search
// The idea of the formula is to return a higher value of pos
// when the element to be searched is closer to arr[hi], and
// a smaller value when it is closer to arr[lo]
pos = lo + [ (x - arr[lo]) * (hi - lo) / (arr[hi] - arr[lo]) ]
arr[] ==> Array where elements need to be searched
x ==> Element to be searched
lo ==> Starting index in arr[]
hi ==> Ending index in arr[]
Step 1: In a loop, calculate the value of "pos" using the probe position formula.
Step 2: If it is a match, return the index of the item, and exit.
Step 3: If the item is less than arr[pos], calculate the probe position of the left sub-array. Otherwise calculate the same in the right sub-array.
Step 4: Repeat until a match is found or the sub-array reduces to zero.
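The probe formula and steps above might look like this in Python. This is a sketch under the assumption that `arr` is sorted in increasing order and holds integers; the guard against `arr[hi] == arr[lo]` is an addition to avoid dividing by zero.

```python
def interpolation_search(arr, x):
    """Return an index of x in sorted list arr, or -1 if x is absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi and arr[lo] <= x <= arr[hi]:
        if arr[hi] == arr[lo]:
            # All remaining values are equal; avoid division by zero.
            return lo if arr[lo] == x else -1
        # Probe position: closer to hi when x is closer to arr[hi].
        pos = lo + (x - arr[lo]) * (hi - lo) // (arr[hi] - arr[lo])
        if arr[pos] == x:
            return pos
        if arr[pos] < x:
            lo = pos + 1   # continue in the right sub-array
        else:
            hi = pos - 1   # continue in the left sub-array
    return -1
```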
Non-Linear Data Structure
A data structure whose elements are not arranged in a single sequence, such as a graph or tree. (A linked list is linear even though its nodes need not occupy contiguous memory.)
Topological Sort
A linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering.
What is Logarithmic time O(log(n)) complexity?
Logarithmic time complexity is the next best thing after constant time. While logarithmic time complexity algorithms do take longer with larger inputs, running time increases slowly. If myLogRunTimeAlgo takes 1 second to complete with an input of size 10, when we increase our input by 10x to 100, the running time only grows to 2 seconds. When we increase the input size to 1000, the time only grows to 3 seconds. It is also characteristic of logarithmic algorithms that they cut the problem size in half each round through.
TreeMap underlying Structure:
RBT
Parameter Passing
Small, no modification: pass by value. Large, no modification: pass by const reference. Modified: pass by pointer.
What is space complexity in regards to algorithm performance?
Space complexity refers to the amount of memory that an algorithm requires to complete, expressed as a function of the input size.
Double Hashing
The process of using two hash functions to determine where to store the data.
3. Compare two strings represented as linked lists
def compare(list1, list2):
    # Traverse both lists. Stop when either end of a linked
    # list is reached or the current characters don't match
    while list1 and list2 and list1.c == list2.c:
        list1 = list1.next
        list2 = list2.next
    # If both lists are not empty, compare the mismatching
    # characters
    if list1 and list2:
        return 1 if list1.c > list2.c else -1
    # If either of the two lists has reached its end
    if list1 and not list2:
        return 1
    if list2 and not list1:
        return -1
    return 0
10. Select A Random Node from A Singly Linked List
(1) Initialize result as first node result = head->key (2) Initialize n = 2 (3) Now one by one consider all nodes from 2nd node onward. (3.a) Generate a random number from 0 to n-1. Let the generated random number is j. (3.b) If j is equal to 0 (we could choose other fixed number between 0 to n-1), then replace result with current node. (3.c) n = n+1 (3.d) current = current->next
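The steps above are reservoir sampling with a reservoir of size one. A sketch in Python, where the `Node` class and the function name `random_node_key` are assumptions for illustration:

```python
import random


class Node:
    """Hypothetical singly linked list node."""
    def __init__(self, key, next=None):
        self.key = key
        self.next = next


def random_node_key(head, rng=random):
    """Return the key of a uniformly random node in one pass, or None for an empty list."""
    if head is None:
        return None
    result = head.key          # step (1)
    current = head.next
    n = 2                      # step (2)
    while current is not None:
        # Steps (3.a)-(3.b): replace the result with probability 1/n.
        if rng.randrange(n) == 0:
            result = current.key
        n += 1                 # step (3.c)
        current = current.next # step (3.d)
    return result
```

Each node ends up selected with probability 1/(list length), without knowing the length in advance.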
7. Quick Sort
/* low --> Starting index, high --> Ending index */
quickSort(arr[], low, high) {
    if (low < high) {
        /* pi is partitioning index, arr[pi] is now at right place */
        pi = partition(arr, low, high);
        quickSort(arr, low, pi - 1);  // Before pi
        quickSort(arr, pi + 1, high); // After pi
    }
}
/* This function takes the last element as pivot, places the pivot element at its correct position in the sorted array, and places all smaller elements to the left of the pivot and all greater elements to the right of the pivot */
partition(arr[], low, high) {
    // pivot (Element to be placed at right position)
    pivot = arr[high];
    i = (low - 1)  // Index of smaller element
    for (j = low; j <= high - 1; j++) {
        // If current element is smaller than or equal to pivot
        if (arr[j] <= pivot) {
            i++;  // increment index of smaller element
            swap arr[i] and arr[j]
        }
    }
    swap arr[i + 1] and arr[high]
    return (i + 1)
}
4. Insertion Sort
// Sort an arr[] of size n insertionSort(arr, n) Loop from i = 1 to n-1. ......a) Pick element arr[i] and insert it into sorted sequence arr[0...i-1] <like inserting a card in a deck>
Sorting And Searching
1. Binary Search 2. Search an element in a sorted and rotated array 3. Bubble Sort 4. Insertion Sort 5. Merge Sort 6. Heap Sort (Binary Heap) 7. Quick Sort 8. Interpolation Search 9. Find Kth Smallest/Largest Element In Unsorted Array 10. Given a sorted array and a number x, find the pair in array whose sum is closest to x
Top 10 graph problems?
1. Breadth First Search (BFS) 2. Depth First Search (DFS) 3. Shortest Path from source to all vertices **Dijkstra** 4. Shortest Path from every vertex to every other vertex **Floyd Warshall** 5. To detect cycle in a Graph **Union Find** 6. Minimum Spanning tree **Prim** 7. Minimum Spanning tree **Kruskal** 8. Topological Sort 9. Boggle (Find all possible words in a board of characters) 10. Bridges in a Graph
6. Heap Sort (Binary Heap)
1. Build a max heap from the input data. 2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap followed by reducing the size of heap by 1. Finally, heapify the root of tree. 3. Repeat above steps while size of heap is greater than 1.
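The three steps above can be sketched in Python on a list used as an implicit binary heap. The helper name `heapify` follows the card; the iterative sift-down is one possible way to write it.

```python
def heapify(arr, size, root):
    """Sift arr[root] down so the subtree rooted there is a max heap."""
    while True:
        largest = root
        left, right = 2 * root + 1, 2 * root + 2
        if left < size and arr[left] > arr[largest]:
            largest = left
        if right < size and arr[right] > arr[largest]:
            largest = right
        if largest == root:
            return
        arr[root], arr[largest] = arr[largest], arr[root]
        root = largest


def heap_sort(arr):
    """Sort arr in place in increasing order and return it."""
    n = len(arr)
    # Step 1: build a max heap from the input data.
    for i in range(n // 2 - 1, -1, -1):
        heapify(arr, n, i)
    # Steps 2-3: move the root to the end, shrink the heap, re-heapify.
    for end in range(n - 1, 0, -1):
        arr[0], arr[end] = arr[end], arr[0]
        heapify(arr, end, 0)
    return arr
```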
10 top Tree / Binary Search questions?
1. Find Minimum Depth of a Binary Tree 2. Maximum Path Sum in a Binary Tree 3. Check if a given array can represent Preorder Traversal of Binary Search Tree 4. Check whether a binary tree is a full binary tree or not 5. Bottom View Binary Tree 6. Print Nodes in Top View of Binary Tree 7. Remove nodes on root to leaf paths of length < K 8. Lowest Common Ancestor in a Binary Search Tree 9. Check if a binary tree is subtree of another binary tree 10. Reverse alternate levels of a perfect binary tree
10 most common problems Dynamic Programming?
1. Longest Common Subsequence 2. Longest Increasing Subsequence 3. Edit Distance 4. Minimum Partition 5. Ways to Cover a Distance 6. Longest Path In Matrix 7. Subset Sum Problem 8. Optimal Strategy for a Game 9. 0-1 Knapsack Problem 10. Boolean Parenthesization Problem
BIT Manipulation top 10
1. Maximum Subarray XOR 2. Magic Number 3. Sum of bit differences among all pairs 4. Swap All Odds And Even Bits 5. Find the element that appears once 6. Binary representation of a given number 7. Count total set bits in all numbers from 1 to n 8. Rotate bits of a number 9. Count number of bits to be flipped to convert A to B 10. Find Next Sparse Number
Explain Minimum Spanning tree **Kruskal** ?
1. Sort all the edges in non-decreasing order of their weight. 2. Pick the smallest edge. Check if it forms a cycle with the spanning tree formed so far. If cycle is not formed, include this edge. Else, discard it. 3. Repeat step#2 until there are (V-1) edges in the spanning tree.
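A compact sketch of these steps, assuming edges are given as `(weight, u, v)` tuples over vertices `0..V-1`. The cycle check uses a simple union-find structure, which is a common choice rather than something the card prescribes.

```python
def kruskal(num_vertices, edges):
    """Return the list of (weight, u, v) edges in a minimum spanning tree (or forest)."""
    parent = list(range(num_vertices))

    def find(v):
        # Follow parent links to the set representative, compressing the path.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    mst = []
    for weight, u, v in sorted(edges):      # step 1: non-decreasing weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # step 2: edge does not form a cycle
            parent[root_u] = root_v
            mst.append((weight, u, v))
            if len(mst) == num_vertices - 1:  # step 3: V-1 edges collected
                break
    return mst
```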
How to implement a queue using stack?
A queue can be implemented using two stacks. Let queue to be implemented be q and stacks used to implement q be stack1 and stack2. q can be implemented in two ways: Method 1 (By making enQueue operation costly) Method 2 (By making deQueue operation costly) See Implement Queue using Stacks
How to implement a stack using queue?
A stack can be implemented using two queues. Let stack to be implemented be 's' and queues used to implement be 'q1' and 'q2'. Stack 's' can be implemented in two ways: Method 1 (By making push operation costly) Method 2 (By making pop operation costly) See Implement Stack using Queues
6. Reverse A List In Groups Of Given Size
Algorithm: reverse(head, k) 1) Reverse the first sub-list of size k. While reversing, keep track of the next node and the previous node. Let the pointer to the next node be next and the pointer to the previous node be prev. 2) head->next = reverse(next, k) /* Recursively call for the rest of the list and link the two sub-lists */ 3) return prev /* prev becomes the new head of the list */
3. Sum of bit differences among all pairs
An Efficient Solution can solve this problem in O(n) time using the fact that all numbers are represented using 32 bits (or some fixed number of bits). The idea is to count differences at individual bit positions. We traverse from 0 to 31 and count numbers with i'th bit set. Let this count be 'count'. There would be "n-count" numbers with i'th bit not set. So count of differences at i'th bit would be "count * (n-count) * 2".
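The counting idea can be written as a short function. This is a sketch assuming non-negative integers that fit in `bits` bits; the name `sum_bit_differences` is hypothetical.

```python
def sum_bit_differences(nums, bits=32):
    """Sum of differing bit positions over all ordered pairs (a, b) from nums."""
    n = len(nums)
    total = 0
    for i in range(bits):
        # Count how many numbers have the i'th bit set.
        count = sum((x >> i) & 1 for x in nums)
        # Each (set, unset) pair differs at bit i, counted in both orders.
        total += count * (n - count) * 2
    return total
```

For `[1, 3, 5]`, bits 1 and 2 are each set in exactly one number, contributing 4 apiece.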
What is a Queue, how it is different from stack and how is it implemented?
A queue is a linear structure that accesses elements in First In First Out (FIFO) order. The basic operations on a queue are: Enqueue, Dequeue, Front, Rear. The difference between stacks and queues is in removing. In a stack we remove the most recently added item; in a queue, we remove the least recently added item. Both queues and stacks can be implemented using arrays and linked lists.
Which data structures are used for BFS and DFS of a graph?
Queue is used for BFS Stack is used for DFS. DFS can also be implemented using recursion (Note that recursion also uses function call stack).
1. Reverse an array without affecting special characters
Simple Solution: 1) Create a temporary character array, say temp[]. 2) Copy alphabetic characters from the given array to temp[]. 3) Reverse temp[] using the standard string reversal algorithm. 4) Now traverse the input string and temp in a single loop. Wherever there is an alphabetic character in the input string, replace it with the current character of temp[]. Efficient Solution: 1) Let the input string be 'str[]' and the length of the string be 'n'. 2) l = 0, r = n-1. 3) While l is smaller than r, do the following: a) If str[l] is not an alphabetic character, do l++. b) Else if str[r] is not an alphabetic character, do r--. c) Else swap str[l] and str[r], then do l++ and r--.
Shortest Path from every vertex to every other vertex **Floyd Warshall**?
We initialize the solution matrix to be the same as the input graph matrix as a first step. Then we update the solution matrix by considering every vertex as an intermediate vertex. The idea is to pick the vertices one by one and update all shortest paths that include the picked vertex as an intermediate vertex. When we pick vertex number k as an intermediate vertex, we have already considered vertices {0, 1, 2, .. k-1} as intermediate vertices. For every pair (i, j) of source and destination vertices respectively, there are two possible cases. 1) k is not an intermediate vertex in the shortest path from i to j. We keep the value of dist[i][j] as it is. 2) k is an intermediate vertex in the shortest path from i to j. We update the value of dist[i][j] to dist[i][k] + dist[k][j] if dist[i][j] > dist[i][k] + dist[k][j].
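The update rule can be sketched as a triple loop over an adjacency matrix. The `INF` sentinel for missing edges and the function name `floyd_warshall` are assumptions for this example.

```python
INF = float("inf")  # assumption: INF marks "no direct edge"


def floyd_warshall(graph):
    """Return the matrix of shortest distances between all pairs of vertices."""
    n = len(graph)
    dist = [row[:] for row in graph]  # start from the input matrix
    for k in range(n):                # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                through_k = dist[i][k] + dist[k][j]
                if through_k < dist[i][j]:
                    dist[i][j] = through_k
    return dist
```

The input matrix is copied rather than modified, and the running time is O(V^3).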
7. Count total set bits in all numbers from 1 to n
An efficient method: if we look at the bit at distance i from the rightmost side, it gets inverted after every 2^i numbers in the vertical sequence. For example, for n = 5: 0 = 0000, 1 = 0001, 2 = 0010, 3 = 0011, 4 = 0100, 5 = 0101. Observe that the rightmost bit (i = 0) flips after every 2^0 = 1 numbers, and the 3rd rightmost bit (i = 2) flips after every 2^2 = 4 numbers. So we can count set bits column by column, since the bit at the i'th rightmost position flips after every 2^i numbers.
4. Check whether a binary tree is a full binary tree or not
To check whether a binary tree is a full binary tree we need to test the following cases: 1) If a binary tree node is NULL, then it is a full binary tree. 2) If a binary tree node has empty left and right sub-trees, then it is a full binary tree by definition. 3) If a binary tree node has both left and right sub-trees, then it may be part of a full binary tree. In this case, recursively check that the left and right sub-trees are also full binary trees themselves. 4) In all other combinations of right and left sub-trees, the binary tree is not a full binary tree.
8. Rotate bits of a number
Let n is stored using 8 bits. Left rotation of n = 11100101 by 3 makes n = 00101111 (Left shifted by 3 and first 3 bits are put back in last ). If n is stored using 16 bits or 32 bits then left rotation of n (000...11100101) becomes 00..0011100101000. Right rotation of n = 11100101 by 3 makes n = 10111100 (Right shifted by 3 and last 3 bits are put back in first ) if n is stored using 8 bits. If n is stored using 16 bits or 32 bits then right rotation of n (000...11100101) by 3 becomes 101000..0011100.
What are linear and non linear data Structures?
Linear: A data structure is said to be linear if its elements form a sequence or a linear list. Examples: Array. Linked List, Stacks and Queues Non-Linear: A data structure is said to be non-linear if traversal of nodes is nonlinear in nature. Example: Graph and Trees.
Explain Depth First Search (DFS)?
Pick a starting node and push all its adjacent nodes into a stack. Pop a node from stack to select the next node to visit and push all its adjacent nodes into a stack. Repeat this process until the stack is empty. However, ensure that the nodes that are visited are marked. This will prevent you from visiting the same node more than once. If you do not mark the nodes that are visited and you visit the same node more than once, you may end up in an infinite loop.
1. Binary Search
We basically ignore half of the elements just after one comparison. Compare x with the middle element. If x matches with middle element, we return the mid index. Else If x is greater than the mid element, then x can only lie in right half subarray after the mid element. So we recur for right half. Else (x is smaller) recur for the left half.
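A minimal sketch of the halving idea, written iteratively instead of recursively and assuming `arr` is sorted in increasing order:

```python
def binary_search(arr, x):
    """Return an index of x in sorted list arr, or -1 if x is absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == x:
            return mid
        if arr[mid] < x:
            lo = mid + 1   # x can only lie in the right half
        else:
            hi = mid - 1   # x can only lie in the left half
    return -1
```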
7. Subset Sum Problem
We can solve the problem in pseudo-polynomial time using dynamic programming. We create a boolean 2D table subset[][] and fill it in a bottom-up manner. The value of subset[i][j] will be true if there is a subset of set[0..j-1] with sum equal to i, otherwise false. Finally, we return subset[sum][n].
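The table described above can be filled as follows. This sketch assumes non-negative integer inputs and keeps the card's indexing, where the first index is the sum and the second is the number of elements considered.

```python
def is_subset_sum(values, target):
    """Return True if some subset of values sums exactly to target."""
    n = len(values)
    # subset[i][j] is True if a subset of values[0..j-1] has sum i.
    subset = [[False] * (n + 1) for _ in range(target + 1)]
    for j in range(n + 1):
        subset[0][j] = True  # the empty subset always sums to 0
    for i in range(1, target + 1):
        for j in range(1, n + 1):
            # Either skip values[j-1] ...
            subset[i][j] = subset[i][j - 1]
            # ... or include it when it fits.
            if i >= values[j - 1]:
                subset[i][j] = subset[i][j] or subset[i - values[j - 1]][j - 1]
    return subset[target][n]
```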
8. Detect And Remove Loop In A Linked List
We know that Floyd's cycle detection algorithm terminates when the fast and slow pointers meet at a common point. We also know that this common point is one of the loop nodes. We store the address of this node in a pointer variable, say ptr2. Then we start from the head of the linked list and check the nodes one by one to see whether they are reachable from ptr2. When we find a node that is reachable, we know that this node is the starting node of the loop, and we can get a pointer to the node before it.
Which Data Structure Should be used for implementing LRU cache?
We use two data structures to implement an LRU cache. 1) A queue, implemented using a doubly linked list. The maximum size of the queue is equal to the total number of frames available (cache size). The most recently used pages are near the front end and the least recently used pages are near the rear end. 2) A hash with page number as key and the address of the corresponding queue node as value.
5. Ways to Cover a Distance
class GFG {
    // Function returns count of ways to cover 'dist' using steps of 1, 2, or 3
    static int printCountDP(int dist) {
        // Allocate at least 3 slots so the base cases below stay in bounds
        int[] count = new int[Math.max(dist + 1, 3)];
        // Initialize base values. There is one way to
        // cover 0 and 1 distances and two ways to
        // cover 2 distance
        count[0] = 1;
        count[1] = 1;
        count[2] = 2;
        // Fill the count array in bottom-up manner
        for (int i = 3; i <= dist; i++)
            count[i] = count[i - 1] + count[i - 2] + count[i - 3];
        return count[dist];
    }

    // driver program
    public static void main(String[] args) {
        int dist = 4;
        System.out.println(printCountDP(dist));
    }
}
4. Add Two Numbers Represented By Linked Lists
1) Calculate sizes of given two linked lists. 2) If sizes are same, then calculate sum using recursion. Hold all nodes in recursion call stack till the rightmost node, calculate sum of rightmost nodes and forward carry to left side. 3) If size is not same, then follow below steps: ....a) Calculate difference of sizes of two linked lists. Let the difference be diff ....b) Move diff nodes ahead in the bigger linked list. Now use step 2 to calculate sum of smaller list and right sub-list (of same size) of larger list. Also, store the carry of this sum. ....c) Calculate sum of the carry (calculated in previous step) with the remaining left sub-list of larger list. Nodes of this sum are added at the beginning of sum list obtained previous step.
10. Given a sorted array and a number x, find the pair in array whose sum is closest to x
1) Initialize a variable diff as infinite (diff stores the absolute difference between the pair sum and x). We need to find the minimum diff. 2) Initialize two index variables l and r in the given sorted array. (a) Initialize the first to the leftmost index: l = 0 (b) Initialize the second to the rightmost index: r = n-1 3) Loop while l < r. (a) If abs(arr[l] + arr[r] - x) < diff, then update diff and result (b) Then, if arr[l] + arr[r] < x, do l++ (c) Else do r--
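A sketch of the two-pointer loop, assuming `arr` is sorted in increasing order. Note that the pointers move on every iteration, whether or not the best pair was just updated. The function name `closest_pair` is hypothetical.

```python
def closest_pair(arr, x):
    """Return the pair (a, b) from sorted arr whose sum is closest to x, or None."""
    l, r = 0, len(arr) - 1
    diff = float("inf")
    best = None
    while l < r:
        pair_sum = arr[l] + arr[r]
        if abs(pair_sum - x) < diff:
            diff = abs(pair_sum - x)
            best = (arr[l], arr[r])
        if pair_sum < x:
            l += 1   # need a larger sum
        else:
            r -= 1   # need a smaller (or equal) sum
    return best
```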
Longest Common Subsequence?
1) Optimal Substructure: Let the input sequences be X[0..m-1] and Y[0..n-1] of lengths m and n respectively, and let L(X[0..m-1], Y[0..n-1]) be the length of the LCS of the two sequences X and Y. Following is the recursive definition of L(X[0..m-1], Y[0..n-1]). If the last characters of both sequences match (X[m-1] == Y[n-1]), then L(X[0..m-1], Y[0..n-1]) = 1 + L(X[0..m-2], Y[0..n-2]). If the last characters of both sequences do not match (X[m-1] != Y[n-1]), then L(X[0..m-1], Y[0..n-1]) = MAX( L(X[0..m-2], Y[0..n-1]), L(X[0..m-1], Y[0..n-2]) ).
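The recurrence above translates directly into a bottom-up table. In this sketch, `table[i][j]` holds L for the prefixes X[0..i-1] and Y[0..j-1]; the function name `lcs_length` is hypothetical.

```python
def lcs_length(x, y):
    """Return the length of the longest common subsequence of x and y."""
    m, n = len(x), len(y)
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                # Last characters match: extend the LCS of both shorter prefixes.
                table[i][j] = 1 + table[i - 1][j - 1]
            else:
                # Otherwise drop the last character of one sequence or the other.
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return table[m][n]
```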
3. Check if a given array can represent Preorder Traversal of Binary Search Tree
A Simple Solution is to do the following for every node pre[i], starting from the first one. 1) Find the first greater value on the right side of the current node. Let the index of this node be j. Return true if the following conditions hold, else return false: (i) All values after the above found greater value are greater than the current node. (ii) Recursive calls for the subarrays pre[i+1..j-1] and pre[j+1..n-1] also return true. The time complexity of the above solution is O(n^2). An Efficient Solution can solve this problem in O(n) time. The idea is to use a stack. This problem is similar to the Next (or closest) Greater Element problem. Here we find the next greater element, and if we find a smaller element after finding the next greater, we return false. 1) Create an empty stack. 2) Initialize root as INT_MIN. 3) Do the following for every element pre[i]: a) If pre[i] is smaller than the current root, return false. b) Keep removing elements from the stack while pre[i] is greater than the stack top. Make the last removed item the new root (to be compared next). At this point, pre[i] is greater than the removed root (that is why, if we see a smaller element in step a), we return false). c) Push pre[i] to the stack (all elements in the stack are in decreasing order).
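The stack-based O(n) check can be sketched as below, using negative infinity in place of INT_MIN. The function name `can_represent_bst` is hypothetical.

```python
def can_represent_bst(pre):
    """Return True if pre could be the preorder traversal of some BST."""
    stack = []
    root = float("-inf")  # stands in for INT_MIN
    for value in pre:
        # Step a: a value below the current root would sit in a
        # left subtree we have already left.
        if value < root:
            return False
        # Step b: pop smaller values; the last one popped becomes the new root.
        while stack and stack[-1] < value:
            root = stack.pop()
        # Step c: the stack stays in decreasing order.
        stack.append(value)
    return True
```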
3. Count triplets with sum smaller than a given value
A Simple Solution is to run three loops to consider all triplets one by one. For every triplet, compare the sums and increment the count if the triplet sum is smaller than the given sum. Better solution: 1) Sort the input array in increasing order. 2) Initialize result as 0. 3) Run a loop from i = 0 to n-2. An iteration of this loop finds all triplets with arr[i] as the first element. a) Initialize the other two elements as the corner elements of subarray arr[i+1..n-1], i.e., j = i+1 and k = n-1. b) Move j and k toward each other until they meet, i.e., while (j < k): (i) If (arr[i] + arr[j] + arr[k] >= sum), then do k--. (ii) Else, for the current i and j there can be (k - j) possible third elements that satisfy the constraint, so do ans += (k - j) followed by j++.
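The sort-and-two-pointers version can be sketched like this. The function name `count_triplets` is hypothetical; the input is copied before sorting so the caller's list is left untouched.

```python
def count_triplets(arr, target_sum):
    """Count triplets (i < j < k after sorting) with sum smaller than target_sum."""
    values = sorted(arr)
    n = len(values)
    ans = 0
    for i in range(n - 2):
        j, k = i + 1, n - 1
        while j < k:
            if values[i] + values[j] + values[k] >= target_sum:
                k -= 1
            else:
                # Every third element between j+1 and k also works with i and j.
                ans += k - j
                j += 1
    return ans
```

Sorting costs O(n log n) and the nested loop O(n^2), compared with O(n^3) for the three-loop approach.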
8. Find the smallest positive integer value that cannot be represented as sum of any subset of a given array
A Simple Solution is to start from value 1 and check all values one by one to see whether they can be formed as a sum of values in the given array. This solution is very inefficient, as it reduces to the subset sum problem, which is a well-known NP-complete problem. When the array is sorted in non-decreasing order, we can solve this problem in O(n) time using a simple loop. Let the input array be arr[0..n-1]. We initialize the result as 1 (the smallest possible outcome) and traverse the given array. Let the smallest value that cannot be represented by elements at indexes 0 to (i-1) be 'res'. There are two possibilities when we consider the element at index i: 1) We decide that 'res' is the final result: if arr[i] is greater than 'res', then we have found the gap, which is 'res', because the elements after arr[i] are also going to be greater than 'res'. 2) The value of 'res' is incremented after considering arr[i]: the value of 'res' is incremented by arr[i] (why? If elements from 0 to (i-1) can represent 1 to 'res-1', then elements from 0 to i can represent 1 to 'res + arr[i] - 1' by adding 'arr[i]' to all subsets that represent 1 to 'res-1').
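A sketch of the loop, assuming positive integers. The call to `sorted` is an addition so that unsorted input also works, at the cost of O(n log n) instead of O(n).

```python
def smallest_unrepresentable(arr):
    """Return the smallest positive integer that is not a subset sum of arr."""
    res = 1  # every value in 1..res-1 is representable so far
    for value in sorted(arr):
        if value > res:
            break          # gap found: res cannot be formed
        res += value       # now 1..res+value-1 are representable
    return res
```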
Explain a binary heap?
Complete Binary Tree where items are stored in a special order such that value in a parent node is greater(or smaller) than the values in its two children nodes. The former is called as max heap and the latter is called min heap. The heap can be represented by binary tree or array.
Explain Breadth First Search (BFS)?
First move horizontally and visit all the nodes of the current layer, then move to the next layer. To avoid processing a node more than once, we use a boolean visited array.
8. Lowest Common Ancestor in a Binary Search Tree
If we are given a BST where every node has a parent pointer, then the LCA can be easily determined by traversing up using the parent pointers and printing the first intersecting node. Without parent pointers, we can solve this problem using BST properties. We recursively traverse the BST from the root. The main idea of the solution is that, while traversing from top to bottom, the first node n we encounter with value between n1 and n2, i.e., n1 < n < n2, or equal to one of n1 or n2, is the LCA of n1 and n2 (assuming that n1 < n2). So we recursively traverse the BST: if the node's value is greater than both n1 and n2, then the LCA lies on the left side of the node; if it is smaller than both n1 and n2, then the LCA lies on the right side. Otherwise the current node is the LCA (assuming that both n1 and n2 are present in the BST). The time complexity of the above solution is O(h), where h is the height of the tree. The above solution also requires O(h) extra space in the function call stack for the recursive calls. We can avoid the extra space by using an iterative solution.
Topological Sorting vs Depth First Traversal (DFS): ?
In DFS, we print a vertex and then recursively call DFS for its adjacent vertices. In topological sorting, we need to print a vertex before its adjacent vertices.
Explain Topological Sort?
In DFS, we start from a vertex, we first print it and then recursively call DFS for its adjacent vertices. In topological sorting, we use a temporary stack. We don't print the vertex immediately, we first recursively call topological sorting for all its adjacent vertices, then push it to a stack. Finally, print contents of stack. Note that a vertex is pushed to stack only when all of its adjacent vertices (and their adjacent vertices and so on) are already in stack.
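The stack-based approach described above can be sketched as follows, assuming the graph is a DAG given as a dict from vertex to list of adjacent vertices. The function name `topological_sort` and that representation are assumptions for illustration.

```python
def topological_sort(num_vertices, adj):
    """Return one topological ordering of vertices 0..num_vertices-1 of a DAG."""
    visited = [False] * num_vertices
    stack = []

    def visit(v):
        visited[v] = True
        for w in adj.get(v, []):
            if not visited[w]:
                visit(w)
        # Push v only after all vertices reachable from it are on the stack.
        stack.append(v)

    for v in range(num_vertices):
        if not visited[v]:
            visit(v)
    # Printing the stack from top to bottom gives the ordering.
    return stack[::-1]
```

A DAG can have several valid orderings; this sketch returns whichever one the visiting order produces.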
What are Infix, prefix, Postfix notations?
Infix notation: X + Y. Operators are written in between their operands. This is the usual way we write expressions, for example A * ( B + C ) / D. Postfix notation (also known as "Reverse Polish notation"): X Y +. Operators are written after their operands. The infix expression given above is equivalent to A B C + * D /. Prefix notation (also known as "Polish notation"): + X Y. Operators are written before their operands. The expressions given above are equivalent to / * A + B C D.
What are the various operations that can be performed on different Data Structures?
Insertion − Add a new data item in the given collection of data items. Deletion − Delete an existing data item from the given collection of data items. Traversal − Access each data item exactly once so that it can be processed. Searching − Find out the location of the data item if it exists in the given collection of data items. Sorting − Arranging the data items in some order i.e. in ascending or descending order in case of numerical data and in dictionary order in case of alphanumeric data.
Interpolation Search vs Binary Search
Interpolation Search is an improvement over Binary Search for instances where the values in a sorted array are uniformly distributed. Binary Search always goes to the middle element to check. Interpolation Search, on the other hand, may go to different locations according to the value of the key being searched. For example, if the value of the key is closer to the last element, Interpolation Search is likely to start searching toward the end.
7. Union And Intersection Of 2 Linked Lists
Intersection (list1, list2) Initialize result list as NULL. Traverse list1 and look for its each element in list2, if the element is present in list2, then add the element to result. Union (list1, list2): Initialize result list as NULL. Traverse list1 and add all of its elements to the result. Traverse list2. If an element of list2 is already present in result then do not insert it to result, otherwise insert. This method assumes that there are no duplicates in the given lists.
9. Merge Sort For Linked Lists
MergeSort(headRef) 1) If head is NULL or there is only one element in the Linked List then return. 2) Else divide the linked list into two halves. FrontBackSplit(head, &a, &b); /* a and b are two halves */ 3) Sort the two halves a and b. MergeSort(a); MergeSort(b); 4) Merge the sorted a and b (using SortedMerge() discussed here) and update the head pointer using headRef. *headRef = SortedMerge(a, b);
6. Binary representation of a given number
Method 1: Iterative. For any number, we can check whether its i'th bit is 0 (OFF) or 1 (ON) by bitwise ANDing it with 2^i (2 raised to the power i). 1) Take a number NUM and check whether its 0th bit is ON or OFF: bit = 2^0 (0th bit); if (NUM & bit) != 0, the 0th bit is ON, else the 0th bit is OFF. 2) Similarly, to check whether the 5th bit is ON or OFF: bit = 2^5 (5th bit); if (NUM & bit) != 0, the 5th bit is ON, else the 5th bit is OFF. Method 2: Recursive. Following is a recursive method to print the binary representation of NUM. Step 1) If NUM > 1: a) push NUM on the stack, b) recursively call the function with NUM / 2. Step 2) a) Pop NUM from the stack, divide it by 2 and print its remainder.
5. Find the element that appears once Given an array where every element occurs three times, except one element which occurs only once. Find the element that occurs once. Expected time complexity is O(n) and O(1) extra space.
The idea is to use bitwise operators for a solution that is O(n) time and uses O(1) extra space. The solution is not as easy as other XOR-based solutions, because all elements appear an odd number of times here. Run a loop over all elements in the array. At the end of every iteration, maintain the following two values. ones: the bits that have appeared for the 1st time or 4th time or 7th time, etc. twos: the bits that have appeared for the 2nd time or 5th time or 8th time, etc. Finally, we return the value of 'ones'.
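One common way to maintain `ones` and `twos` is sketched below. It assumes non-negative integers, and the function name `single_number` is hypothetical. Bits that reach a count of three are cleared from both accumulators.

```python
def single_number(nums):
    """Return the element appearing once when every other element appears three times."""
    ones = 0  # bits seen 1, 4, 7, ... times
    twos = 0  # bits seen 2, 5, 8, ... times
    for x in nums:
        # Bits already in 'ones' that appear again move into 'twos'.
        twos |= ones & x
        # Toggle the bits of x in 'ones'.
        ones ^= x
        # Bits present in both have now appeared three times: clear them.
        not_threes = ~(ones & twos)
        ones &= not_threes
        twos &= not_threes
    return ones
```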
4. Swap All Odds And Even Bits
The solution assumes that the input number is stored using 32 bits, with bit positions counted from 1 at the least significant bit. Let the input number be x. 1) Get all even bits of x by doing a bitwise AND of x with 0xAAAAAAAA. The number 0xAAAAAAAA is a 32-bit number with all even bits set to 1 and all odd bits set to 0. 2) Get all odd bits of x by doing a bitwise AND of x with 0x55555555. The number 0x55555555 is a 32-bit number with all odd bits set to 1 and all even bits set to 0. 3) Right shift all even bits by 1. 4) Left shift all odd bits by 1. 5) Combine the new even and odd bits and return.
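The five steps fit in a few lines of Python. The final mask to 32 bits is an addition needed because Python integers do not overflow; the function name `swap_odd_even_bits` is hypothetical.

```python
def swap_odd_even_bits(x: int) -> int:
    """Swap each pair of adjacent bits in a 32-bit value."""
    even_bits = x & 0xAAAAAAAA   # step 1
    odd_bits = x & 0x55555555    # step 2
    even_bits >>= 1              # step 3
    odd_bits <<= 1               # step 4
    # Step 5: combine, keeping the result within 32 bits.
    return (even_bits | odd_bits) & 0xFFFFFFFF
```

For example, 23 is 00010111 in binary, and swapping adjacent bits gives 00101011, which is 43.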
9. Check if a binary tree is subtree of another binary tree
The idea is based on the fact that inorder and preorder/postorder traversals together uniquely identify a binary tree (assuming distinct keys). Tree S is a subtree of T if both the inorder and preorder traversals of S are substrings of the inorder and preorder traversals of T respectively. Following are the detailed steps. 1) Find the inorder and preorder traversals of T and store them in two auxiliary arrays inT[] and preT[]. 2) Find the inorder and preorder traversals of S and store them in two auxiliary arrays inS[] and preS[]. 3) If inS[] is a subarray of inT[] and preS[] is a subarray of preT[], then S is a subtree of T. Else not. We can also use postorder traversal in place of preorder in the above algorithm. Note that plain traversals can report a false match when S appears inside T with extra descendants attached; recording a special marker for every NULL child in the traversals avoids this.
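A rough Python sketch of the traversal-based check. It goes one step beyond the plain steps by recording `None` as a marker for every missing child, so that a tree with extra descendants is not mistaken for a match; this assumes no node key is itself `None`. The `Node` class and all function names are invented for the example.

```python
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right


def inorder(node, out):
    """Append the inorder traversal to out, with None marking a missing child."""
    if node is None:
        out.append(None)
        return out
    inorder(node.left, out)
    out.append(node.key)
    inorder(node.right, out)
    return out


def preorder(node, out):
    """Append the preorder traversal to out, with None marking a missing child."""
    if node is None:
        out.append(None)
        return out
    out.append(node.key)
    preorder(node.left, out)
    preorder(node.right, out)
    return out


def is_sublist(small, big):
    """Return True if small occurs as a contiguous run inside big."""
    n = len(small)
    return any(big[i:i + n] == small for i in range(len(big) - n + 1))


def is_subtree(t, s):
    """Return True if tree s is a subtree of tree t."""
    return (is_sublist(inorder(s, []), inorder(t, []))
            and is_sublist(preorder(s, []), preorder(t, [])))
```

The naive sublist search keeps the sketch short; a linear-time string matcher such as KMP could replace `is_sublist` if the trees are large.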
Explain Boggle (Find all possible words in a board of characters)
The idea is to consider every character as a starting character and find all words starting with it. All words starting from a character can be found using Depth First Traversal. We do depth first traversal starting from every cell. We keep track of visited cells to make sure that a cell is considered only once in a word.
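A small Python sketch of the depth-first search described above. It assumes a non-empty rectangular board, that moves may go to any of the 8 neighbouring cells (the usual Boggle rule, not stated in the text), and that the dictionary is a plain collection of strings; `find_words` is a hypothetical name.

```python
def find_words(board, dictionary):
    """Return the set of dictionary words that can be traced on the board."""
    rows, cols = len(board), len(board[0])
    words = set(dictionary)
    max_len = max((len(w) for w in words), default=0)
    visited = [[False] * cols for _ in range(rows)]
    found = set()

    def dfs(r, c, prefix):
        visited[r][c] = True               # a cell is used at most once per word
        prefix += board[r][c]
        if prefix in words:
            found.add(prefix)
        if len(prefix) < max_len:          # no word is longer than max_len
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    nr, nc = r + dr, c + dc
                    if 0 <= nr < rows and 0 <= nc < cols and not visited[nr][nc]:
                        dfs(nr, nc, prefix)
        visited[r][c] = False              # backtrack so other paths can reuse the cell

    for r in range(rows):                  # every cell is tried as a starting character
        for c in range(cols):
            dfs(r, c, "")
    return found
```

For large dictionaries, storing the words in a trie and pruning prefixes that match no word would cut the search considerably; the set lookup here is only the simplest option.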
4. Minimum Partition
The task is to divide the set into two subsets S1 and S2 so that the difference between their sums is as small as possible. Let sum be the sum of all the elements. Define dp[i][j] = 1 if some subset of the first i elements has a sum equal to j, and 0 otherwise, where i ranges from 0 to n (i = 0 is the empty prefix) and j ranges from 0 to sum, so the table has size (n+1) x (sum+1). dp[i][j] will be 1 if either 1) the sum j is achieved including the i-th item, or 2) the sum j is achieved excluding the i-th item. To find the minimum difference, we have to find Min{sum - 2*j : dp[n][j] == 1}, where j varies from 0 to sum/2. The idea is that the sum of S1 is j and it should be as close as possible to sum/2, i.e., 2*j should be closest to sum.
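The table can be filled as in the following Python sketch, which assumes the elements are non-negative integers. The name `min_partition_difference` is illustrative only.

```python
def min_partition_difference(arr):
    """Return the smallest possible |sum(S1) - sum(S2)| over all two-way splits."""
    total = sum(arr)
    n = len(arr)
    # dp[i][j] is True if some subset of the first i elements sums to j.
    dp = [[False] * (total + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dp[i][0] = True                      # the empty subset always sums to 0
    for i in range(1, n + 1):
        for j in range(1, total + 1):
            dp[i][j] = dp[i - 1][j]          # sum j reached excluding the i-th item
            if arr[i - 1] <= j:              # sum j reached including the i-th item
                dp[i][j] = dp[i][j] or dp[i - 1][j - arr[i - 1]]
    # The largest reachable j not exceeding total / 2 gives the smallest difference.
    for j in range(total // 2, -1, -1):
        if dp[n][j]:
            return total - 2 * j


if __name__ == "__main__":
    print(min_partition_difference([1, 6, 11, 5]))
```

The final loop always returns, because j = 0 is reachable with the empty subset.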
2. Longest Increasing Subsequence?
There are three common approaches: plain recursion, a brute-force dynamic programming version that runs in O(n^2), and an O(n log n) version.
Brute-force dynamic programming version, O(n^2):

def lis(arr):
    n = len(arr)

    # lengths[i] is the length of the longest increasing subsequence
    # that ends at arr[i]; every element alone has length 1.
    lengths = [1] * n

    # Compute the values bottom-up.
    for i in range(1, n):
        for j in range(0, i):
            if arr[i] > arr[j] and lengths[i] < lengths[j] + 1:
                lengths[i] = lengths[j] + 1

    # The answer is the maximum over all end positions (0 for an empty array).
    maximum = 0
    for i in range(n):
        maximum = max(maximum, lengths[i])

    return maximum
# end of lis function
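For comparison, the O(n log n) version mentioned above is often written with binary search over a list of "smallest tail values". The sketch below is one such variant, assuming a strictly increasing subsequence is wanted; the name `lis_length` is made up here.

```python
from bisect import bisect_left


def lis_length(arr):
    """Return the length of the longest strictly increasing subsequence."""
    # tails[k] is the smallest value that can end an increasing
    # subsequence of length k + 1 among the elements seen so far.
    tails = []
    for x in arr:
        pos = bisect_left(tails, x)    # first tail that is >= x
        if pos == len(tails):
            tails.append(x)            # x extends the longest subsequence found so far
        else:
            tails[pos] = x             # x is a smaller tail for length pos + 1
    return len(tails)


if __name__ == "__main__":
    print(lis_length([10, 22, 9, 33, 21, 50, 41, 60]))
```

Note that `tails` is not itself a valid subsequence in general; only its length is meaningful.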
How To detect cycle in a Graph **Union Find**?
The Union-Find algorithm can be used to check whether an undirected graph contains a cycle. Process the edges one by one: if both endpoints of an edge already belong to the same set, the edge closes a cycle; otherwise, merge (union) the two sets. This method assumes that the graph doesn't contain any self-loops.
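A short Python sketch of cycle detection with Union-Find. It assumes vertices are numbered 0..num_vertices-1 and that the edge list has no self-loops or repeated edges (a repeated edge would be reported as a cycle); `has_cycle` is a hypothetical name.

```python
def has_cycle(num_vertices, edges):
    """Return True if the undirected graph given by edges contains a cycle."""
    parent = list(range(num_vertices))    # every vertex starts in its own set

    def find(v):
        # Follow parent links to the set representative, shortening the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            return True                   # u and v were already connected
        parent[root_u] = root_v           # union the two sets
    return False


if __name__ == "__main__":
    print(has_cycle(3, [(0, 1), (1, 2), (0, 2)]))
```

Union by rank is left out to keep the sketch short; adding it alongside the path shortening in `find` would make each operation nearly constant time.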
Pseudo code for Dijkstra algorithm?
 1  function Dijkstra(Graph, source):
 2
 3      create vertex set Q
 4
 5      for each vertex v in Graph:             // Initialization
 6          dist[v] ← INFINITY                  // Unknown distance from source to v
 7          prev[v] ← UNDEFINED                 // Previous node in optimal path from source
 8          add v to Q                          // All nodes initially in Q (unvisited nodes)
 9
10      dist[source] ← 0                        // Distance from source to source
11
12      while Q is not empty:
13          u ← vertex in Q with min dist[u]    // Node with the least distance
14                                              // will be selected first
15          remove u from Q
16
17          for each neighbor v of u:           // where v is still in Q.
18              alt ← dist[u] + length(u, v)
19              if alt < dist[v]:               // A shorter path to v has been found
20                  dist[v] ← alt
21                  prev[v] ← u
22
23      return dist[], prev[]

If we are only interested in a shortest path between vertices source and target, we can terminate the search after line 15 if u = target. Now we can read the shortest path from source to target by reverse iteration:

1  S ← empty sequence
2  u ← target
3  while prev[u] is defined:                    // Construct the shortest path with a stack S
4      insert u at the beginning of S           // Push the vertex onto the stack
5      u ← prev[u]                              // Traverse from target to source
6  insert u at the beginning of S               // Push the source onto the stack
Explain Shortest Path from source to all vertices **Dijkstra** algorithm?
Dijkstra's algorithm finds the shortest paths between nodes in a graph, which may represent, for example, road networks. It assumes that edge weights are non-negative. Mark all nodes unvisited.
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in the shortest path tree, i.e., whose minimum distance from the source is calculated and finalized. Initially, this set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as INFINITE. Assign distance value 0 to the source vertex so that it is picked first.
3) While sptSet doesn't include all vertices:
a) Pick a vertex u which is not in sptSet and has the minimum distance value.
b) Include u in sptSet.
c) Update the distance values of all adjacent vertices of u. To update the distance values, iterate through all adjacent vertices. For every adjacent vertex v, if the sum of the distance value of u (from the source) and the weight of edge u-v is less than the distance value of v, then update the distance value of v.
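The steps above can be sketched in Python with a binary heap standing in for "pick the vertex with the minimum distance value". The sketch assumes non-negative edge weights and a graph given as a dict mapping every vertex to a list of (neighbor, weight) pairs; that representation and the names `dijkstra` and `shortest_path` are choices made for this example.

```python
import heapq


def dijkstra(graph, source):
    """Return (dist, prev) for shortest paths from source to every vertex."""
    dist = {v: float("inf") for v in graph}
    prev = {v: None for v in graph}
    dist[source] = 0
    spt_set = set()                       # vertices whose distance is finalized
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)        # closest vertex not yet finalized
        if u in spt_set:
            continue                      # stale heap entry, u was already finalized
        spt_set.add(u)
        for v, weight in graph[u]:
            alt = d + weight
            if alt < dist[v]:             # a shorter path to v has been found
                dist[v] = alt
                prev[v] = u
                heapq.heappush(heap, (alt, v))
    return dist, prev


def shortest_path(prev, source, target):
    """Rebuild the path source -> target from prev; empty list if unreachable."""
    path = []
    u = target
    while u is not None:
        path.append(u)
        u = prev[u]
    path.reverse()
    return path if path[0] == source else []


if __name__ == "__main__":
    g = {"A": [("B", 4), ("C", 1)], "B": [("D", 1)],
         "C": [("B", 2), ("D", 5)], "D": []}
    dist, prev = dijkstra(g, "A")
    print(dist, shortest_path(prev, "A", "D"))
```

Instead of decreasing a key in the queue, this variant pushes a new entry and skips stale ones when they are popped, which is a common workaround for `heapq` having no decrease-key operation.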
8. Optimal Strategy for a Game
// F(i, j) is the maximum value the first player can collect from arr[i..j]:
// F(i, j) = max(arr[i] + min(F(i+2, j), F(i+1, j-1)),
//               arr[j] + min(F(i+1, j-1), F(i, j-2)))
// The method signature is missing from the original card; the name below is assumed.
static int optimalStrategyOfGame(int[] arr, int n)
{
    // Create a table to store solutions of subproblems
    int table[][] = new int[n][n];
    int gap, i, j, x, y, z;

    // Fill table using above recursive formula.
    // Note that the table is filled in diagonal
    // fashion (similar to http://goo.gl/PQqoS),
    // from diagonal elements to table[0][n-1]
    // which is the result.
    for (gap = 0; gap < n; ++gap) {
        for (i = 0, j = gap; j < n; ++i, ++j) {
            // Here x is value of F(i+2, j),
            // y is F(i+1, j-1) and z is
            // F(i, j-2) in above recursive formula
            x = ((i + 2) <= j) ? table[i + 2][j] : 0;
            y = ((i + 1) <= (j - 1)) ? table[i + 1][j - 1] : 0;
            z = (i <= (j - 2)) ? table[i][j - 2] : 0;

            table[i][j] = Math.max(arr[i] + Math.min(x, y),
                                   arr[j] + Math.min(y, z));
        }
    }

    return table[0][n - 1];
}