Grokking algorithms book
Longest common substring
- Axes: one word along each axis - Each cell contains the length of the longest common substring ending at those two characters
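The grid can be sketched in Python like this (the function name is mine; cell[i][j] holds the length of the longest common substring ending at a[i-1] and b[j-1]):

```python
def longest_common_substring(a, b):
    # cell[i][j] = length of the longest common substring ending
    # at a[i-1] and b[j-1]; one extra row/column of zeros for the borders
    cell = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the common substring ending at the previous characters
                cell[i][j] = cell[i - 1][j - 1] + 1
                best = max(best, cell[i][j])
    return best
```

The answer is the largest value anywhere in the grid, not the bottom-right cell (unlike the longest common *subsequence* grid).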
Examples of dynamic programming
- Biologists use the longest common subsequence to find similarities in DNA strands. They can use this to tell how similar two animals or two diseases are. The longest common subsequence is being used to find a cure for multiple sclerosis.
- Have you ever used diff (like git diff)? Diff tells you the differences between two files, and it uses dynamic programming to do so.
- String similarity: Levenshtein distance measures how similar two strings are, and it uses dynamic programming. Levenshtein distance is used for everything from spell-check to figuring out whether a user is uploading copyrighted data.
Bubble sort
- Compare adjacent pairs, swap if out of order, move down the list, and repeat until everything is in order
- Scan the array left to right multiple times: compare the element at index 0 with index 1, swap if needed, then index 1 with 2, and so on
- After one pass, the biggest element has bubbled up to the rightmost position
- After n-1 passes the array is sorted
- Two nested for loops: O(n^2)
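The passes above can be sketched in Python (function name is mine):

```python
def bubble_sort(arr):
    arr = list(arr)  # sort a copy, leave the input alone
    n = len(arr)
    for i in range(n - 1):              # after n-1 passes the array is sorted
        for j in range(n - 1 - i):      # the last i elements have already bubbled up
            if arr[j] > arr[j + 1]:     # compare the pair, swap if out of order
                arr[j], arr[j + 1] = arr[j + 1], arr[j]
    return arr
```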
Merge sort
- Divide array into 2 equal halves
- Sort each half separately, each in its own memory
- Then merge them in sorted order (Left and Right): take the smallest unpicked element in L, the smallest unpicked in R, and put the smaller of the two into the original array
- Keep doing this as long as there are elements in either array
- O(n log n) time; space complexity O(n): need extra memory to store the sorted halves before merging (elements move back and forth between the original and a temp array)
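A sketch of the divide-and-merge steps in Python (function name is mine; this version returns a new list rather than merging back into the original array):

```python
def merge_sort(arr):
    # Base case: arrays with 0 or 1 element are already sorted
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left = merge_sort(arr[:mid])    # sort each half separately
    right = merge_sort(arr[mid:])
    # Merge: repeatedly take the smaller of the two smallest unpicked elements
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # one side may still have leftovers
    merged.extend(right[j:])
    return merged
```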
Selection Sort (sort an array for smallest to largest)
- Find the smallest number and move it to the first spot (swap: evict the element there and put it where the smallest one was)
- Checking each item in the list takes O(n) time
- One variant: create an extra array, repeatedly take the minimum from the input array and append it
- In each pass, find the MIN of the remaining (smaller) unsorted part and swap it with the element at the next index
- Two nested for loops: O(n x n) = O(n^2)
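A sketch of the extra-array variant in Python (function names are mine):

```python
def find_smallest(arr):
    # index of the smallest element - checking each item takes O(n)
    smallest_index = 0
    for i in range(1, len(arr)):
        if arr[i] < arr[smallest_index]:
            smallest_index = i
    return smallest_index

def selection_sort(arr):
    arr = list(arr)   # work on a copy
    result = []
    while arr:
        # repeatedly pull out the smallest remaining element
        result.append(arr.pop(find_smallest(arr)))
    return result
```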
Sets (Union, intersection & difference)
- Set union, |, means "combine both sets"
- Set intersection, &, means "find the items that show up in both sets"
- Set difference, -, means "subtract the items in one set from the items in the other set"
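The three operators in Python (example sets are made up):

```python
fruits = {"avocado", "tomato", "banana"}
vegetables = {"beets", "carrots", "tomato"}

both = fruits | vegetables          # union: combine both sets
common = fruits & vegetables        # intersection: items in both sets
only_fruits = fruits - vegetables   # difference: subtract vegetables' items
```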
Insertion Sort
- Take the first 2 elements, compare, swap if needed; then compare with the next element, and so on down the line until sorted
- One variant: take values sequentially and insert each into a new array
- In-place: divide the array into a sorted and an unsorted part and repeatedly insert an element from the unsorted part into the sorted part
- The element at index 0 (just one element) is sorted by itself, so it can be the starting point; now pick index 1 and insert it, shifting as needed
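The in-place variant sketched in Python (function name is mine):

```python
def insertion_sort(arr):
    arr = list(arr)  # sort a copy
    # arr[:1] is the sorted part to start; grow it one element at a time
    for i in range(1, len(arr)):
        key = arr[i]
        j = i - 1
        while j >= 0 and arr[j] > key:  # shift larger elements right
            arr[j + 1] = arr[j]
            j -= 1
        arr[j + 1] = key                # insert into its sorted position
    return arr
```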
Examples of problems solved by graphs
- Write a checkers AI that calculates the fewest moves to victory - Write a spell checker (fewest edits from your misspelling to a real word—for example, READED -> READER is one edit) - Find doctor closest to you in your network
Optical Character Recognition (OCR)
1) Go through lots of images of numbers and extract features of those numbers
2) When you get a new image, extract its features and see what its nearest neighbors are
OCR algorithms measure lines, points, and curves
Quicksort
1) Pick an element from the array (the pivot)
2) Find the elements smaller than the pivot and the elements larger than the pivot; partition into 2 sub-arrays
3) Call quicksort recursively on the 2 sub-arrays and combine the results
Average O(n log n), worst O(n^2). Space complexity: O(log n)
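The three steps above, sketched in Python (this version allocates new lists for the partitions rather than partitioning in place):

```python
def quicksort(arr):
    if len(arr) < 2:
        return arr  # base case: arrays with 0 or 1 element are already sorted
    pivot = arr[0]
    less = [x for x in arr[1:] if x <= pivot]     # sub-array of smaller elements
    greater = [x for x in arr[1:] if x > pivot]   # sub-array of larger elements
    # recurse on both sub-arrays and combine the results
    return quicksort(less) + [pivot] + quicksort(greater)
```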
How to add up all numbers in an array and return total
1. Figure out the base case
2. Move closer to an empty array with every recursive call (reduce the problem size each time)

def sum(list):
    if list == []:
        return 0
    return list[0] + sum(list[1:])
Stack helps keep track of calls, but there's a cost: saving all that info can take up a lot of memory. Each function call takes up some memory, and when the stack is too tall, your computer is saving info for too many function calls. How to solve?
1. Rewrite your code to use loop instead 2. Tail recursion (only supported by some languages)
List number of possible subsets
2^n subsets
Amortized time
Amortized time describes a worst case that happens only once in a while; once it happens, it won't happen again for so long that the cost is "amortized". E.g. for an ArrayList, as we insert, we double the capacity whenever the size hits a power of 2 - so array sizes go 1, 2, 4, 8, 16, 32, ... Working backward from a final size X, the total copy work is X + X/2 + X/4 + X/8 + ... + 1, which is approximately 2X - so X insertions take O(2X) time in total, and the amortized time per insertion is O(1)
Explain recursion to 5 year old
Analogy: Imagine you are entering a cinema. You sit down in a row but want to know which row you're in. Being a lazy computer scientist in an already full theatre of other computer scientists, you ask the person in front. Trouble is, they didn't check either, so being lazy they ask the person in front of them. This continues until the front row is reached. That person knows they are in row 1 (they're there with a computer-scientist friend, but they aren't a computer scientist themselves, so they start counting at 1, not 0). The answer is passed back to the person who asked, who, being a computer scientist, adds 1 to it. This carries on until you find out that the row in front of you is 31, which must mean you are in row 32.
HyperLogLog
Approximates the number of unique elements in a set. Just like bloom filters, it won't give you an exact answer, but it comes very close and uses only a fraction of the memory a task like this would otherwise take. e.g. Google wants to count the number of unique searches performed by its users. Amazon wants to count the number of unique items that users looked at today
Greedy algorithms
At each step you pick the locally optimal solution, e.g. the knapsack problem (pick the largest box that will fit in the remaining space). E.g. BFS, Dijkstra's algorithm
Binary Search Tree
BSTs are faster for insertions and deletions than sorted arrays, but you don't get random access.
Base case and recursive case for binary search?
Base case is an array with 1 item. In the recursive case, split the array in half, throw away one half, and call binary search on the other half
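The base and recursive cases sketched in Python (this recursive variant and its name are mine; it returns the item's index, or None when the search space is empty):

```python
def binary_search_rec(arr, item, low=0, high=None):
    if high is None:
        high = len(arr) - 1
    if low > high:
        return None                 # empty search space: item isn't there
    mid = (low + high) // 2
    if arr[mid] == item:
        return mid                  # base case: found it
    if arr[mid] > item:
        # throw away the upper half, recurse on the lower half
        return binary_search_rec(arr, item, low, mid - 1)
    # throw away the lower half, recurse on the upper half
    return binary_search_rec(arr, item, mid + 1, high)
```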
Why is quick sort O(n log n)?
Best case: if all elements are equal, quicksort will on average just traverse the array once, i.e. O(n). Worst case: the pivot is repeatedly the biggest element in the array, so the recursion doesn't divide the array in half and recurse on each half; it just shrinks the subarray by one element (O(n) levels). This degenerates to O(n^2) runtime. Expected case: usually neither the best nor the worst case happens. The height of the call stack is O(log n), and each level takes O(n) time, so the algorithm takes O(n) * O(log n) = O(n log n) time.
What if one user more conservative in ratings?
Can use cosine similarity. It doesn't measure the distance between two vectors; instead, it compares the angles of the two vectors, which makes it better at dealing with cases like this. Or could use normalization (take the average rating of each person, use it to scale their ratings, and then compare everyone's ratings on the same scale)
Simple Search
Check each element in turn: O(n)
MapReduce
Distributed algorithm - run queries about data across multiple machines. Map function: takes an array and applies the same function to each item in the array (map automatically spreads the work out across all of the, say, 100 machines). Reduce function: boils a whole list of items down to one item - transforms the array into 1 item
NP-complete problems
E.g. the traveling-salesperson problem (find the shortest path that connects several points) & the set-covering problem.
- Algorithm runs quickly with a handful of items but really slows down with more items
- "All combinations of X" usually points to an NP-complete problem
- You have to calculate "every possible version" of X because you can't break it down into smaller sub-problems
- If the problem involves a sequence (such as a sequence of cities, like traveling salesperson) and is hard to solve, it might be NP-complete
- If the problem involves a set and is hard to solve, it might be NP-complete
- Can you restate the problem as the set-covering problem or the traveling-salesperson problem?
Implementing graph
Express relationships with a hash table (allows you to map a key to a value), e.g. map each node to its neighbours
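A minimal sketch of a graph as a hash table in Python (the names are made up for illustration):

```python
# each key is a node, each value is the list of its neighbours
graph = {
    "you": ["alice", "bob", "claire"],
    "bob": ["anuj", "peggy"],
    "alice": ["peggy"],
    "claire": ["thom", "jonny"],
    "anuj": [], "peggy": [], "thom": [], "jonny": [],
}
```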
Breadth-first search (BFS)
Find the shortest distance between two things, e.g. 1) Is there a path from node A to node B? 2) What is the shortest path from node A to node B? - Traverse first-degree connections before second-degree connections
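A sketch of the "is there a path?" question in Python (graph is a dict of neighbour lists; the function name is mine). A FIFO queue guarantees first-degree connections are checked before second-degree ones:

```python
from collections import deque

def path_exists(graph, start, target):
    queue = deque([start])   # FIFO: first-degree neighbours come out first
    searched = set()         # avoid re-checking nodes (and infinite loops)
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        if node not in searched:
            searched.add(node)
            queue.extend(graph.get(node, []))
    return False
```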
Dijkstra's algorithm (assign # or weight to each segment)
Finds the path with the smallest total weight
1. Find the "cheapest" node (reachable in the least amount of time)
2. Check whether there's a cheaper path to the neighbors of this node; if so, update their costs
3. Repeat until you've done this for every node in the graph
4. Calculate the final path
- Can't use Dijkstra's algorithm if you have negative-weight edges
O(A * B) when to multiply runtimes
for (int a : arrA) {
    for (int b : arrB) {
        print(a + "," + b);
    }
}
- Do B chunks of work for each element in A
secure hash algorithm (SHA) function
Generates a hash, which is just a short string. The hash function for hash tables went from string to array index, whereas SHA goes from string to string. Can use SHA to tell whether two files are the same, which is useful when you have very large files. E.g. storing/checking passwords. SHA is also locality insensitive: changing even one character of the input produces a completely different hash.
ArrayList or dynamically resizing array
Gives benefits of array while offering flexibility in size. When array hits capacity, ArrayList class will create a new array with double the capacity and copy all the elements over to the new array.
Hash table
Hash function & array together. Maps keys to values. Great for creating a mapping from one thing to another and for looking something up, e.g. a phone book. Hash tables help with DNS resolution, preventing duplicate entries (check the hash table for dupes), and caching
When you have a large set and want to see whether new item belongs to that set?
A hash table would need to be huge, so use a Bloom filter instead - a probabilistic data structure. Ask your bloom filter whether you've crawled this URL before.
- False positives are possible: Google might say "You've already crawled this site" even though you haven't
- False negatives aren't possible: if the bloom filter says "You haven't crawled this site," then you definitely haven't
- Takes up very little space
Hash table performance (Search, Insert, Delete)
Hash tables are as fast as arrays at searching (getting a value at an index), and as fast as linked lists at inserts and deletes - the best of both worlds! But in the worst case (lots of collisions), hash tables are slow at all of those
Inverted index (for search engines)
Hash that maps words to places where they appear
SHA (Secure Hash Algorithm)
In cryptography, SHA-1 is a cryptographic hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest - typically rendered as a hexadecimal number, 40 digits long.
Linked lists
Items can be anywhere in memory; each item stores the address of the next item in the list, so a bunch of random memory addresses are linked together. To insert or delete, you just change what the previous & next elements point to. Sequential access only
Hash function
Maps strings to numbers.
- Needs to be consistent (always maps the same name to the same index)
- Should map different words to different numbers (different strings to different indexes)
- Knows how big your array is and only returns valid indexes
Graphs
Models set of connections. Made up of nodes and edges. Node can be directly connected to many other nodes e.g. neighbours
Build a spam filter (Figures out the probability that something is likely to be spam based on probability of that word showing up in spam email)
Naive Bayes classifier
Implementation of Dijkstra's algorithm
Need 3 hash tables (graph, costs & parents); the graph's values can themselves be hash tables to represent edge weights. Also need an array (or set) to keep track of all the nodes you've already processed. The cost of a node is how long it takes to get to that node from the start
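A sketch following that structure in Python (graph is a dict of dicts holding edge weights; the example graph and function name are mine):

```python
def dijkstra(graph, start, finish):
    infinity = float("inf")
    # costs: cheapest known cost to reach each node from the start
    costs = {node: infinity for node in graph}
    costs[start] = 0
    parents = {}          # parents: cheapest known way back toward the start
    processed = set()     # nodes we've already processed

    def find_lowest_cost_node():
        lowest, lowest_node = infinity, None
        for node, cost in costs.items():
            if cost < lowest and node not in processed:
                lowest, lowest_node = cost, node
        return lowest_node

    node = find_lowest_cost_node()
    while node is not None:
        for neighbor, weight in graph[node].items():
            new_cost = costs[node] + weight
            if new_cost < costs[neighbor]:   # found a cheaper path: update
                costs[neighbor] = new_cost
                parents[neighbor] = node
        processed.add(node)
        node = find_lowest_cost_node()
    return costs[finish], parents
```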
Load factor
No. of items in hash table / total no. of slots. Resize when your load factor > 0.7
Recursive function to calculate factorial of number
Notice that each call to fact has its own copy of x. You can't access a different function's copy of x.
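The function in question, sketched in Python:

```python
def fact(x):
    if x == 1:              # base case
        return 1
    return x * fact(x - 1)  # recursive case: each call has its own copy of x
```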
When you see a problem where the number of elements in the problem space gets halved each time
O(log n) runtime, e.g. binary search & lookup in a balanced binary search tree (half the nodes on each side, so you cut the problem space in half each time)
Best time for sorting algorithm
O(n log n). You can't sort an array faster than O(n log n) with a general comparison sort, unless you use a parallel algorithm - there's a parallel version of quicksort that can sort an array in O(n) time
Run time for Breadth First Search
O(number of edges + number of nodes), i.e. O(V+E) (V for number of vertices, E for number of edges)
Issue with dynamic programming
Only works when each subproblem is discrete—when it doesn't depend on other subproblems.
Binary search (looking for companies in phone book)
Only works when your list is in sorted order The binary_search function takes a sorted array and an item. If the item is in the array, the function returns its position. You'll keep track of what part of the array you have to search through. O(log n)
K-Nearest Neighbours classifier
Plot the user, look at its 3 nearest neighbors, and classify based on that - used for building recommendation systems
Regression
Predicting a response (like a number). Take average of ratings
How to calculate similarity?
Pythagorean theorem (Euclidean distance) across the different dimensions/features
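A one-line sketch in Python (function name is mine; a and b are equal-length sequences of feature values):

```python
def distance(a, b):
    # Pythagorean theorem generalized to any number of dimensions:
    # square-root of the sum of squared differences per feature
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```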
Merge sort vs Quick Sort
Quicksort has a smaller constant than merge sort, so if they're both O(n log n), quicksort is faster. Quicksort is also faster in practice because it hits the average case far more often than the worst case if you pick the pivot randomly. The performance of quicksort depends heavily on the pivot you choose.
Array
Random access
Arrays vs Linked lists
- Reading: Arrays O(1); Linked lists O(n)
- Insertion: Arrays O(n); Linked lists O(1)
- Deletion: Arrays O(n); Linked lists O(1)
Divide & Conquer algorithms
Recursive algorithms 1. Figure out the base case. This should be the simplest possible case. Often empty array or array with 1 element 2. Divide or decrease your problem until it becomes the base case.
Travelling salesperson algorithm
Salesperson has to go to 5 cities and wants to calculate MIN distance. Must look at every possible order in which he could travel to cities. O(n!)
Locality-sensitive hashing
This is where Simhash comes in. If you make a small change to a string, Simhash generates a hash that's only a little different. This lets you compare hashes and see how similar two strings are. E.g. Google uses Simhash to detect duplicates while crawling the web & to detect plagiarism
Dynamic programming
Solve subproblems and then build up to solving the big problem - solve by building a grid and working toward the optimal solution. Useful when you're trying to optimize something given a constraint. (In the grid-based knapsack version, you can't steal fractions of items.)
Tree
Special type of graph, where no edges ever point back.
Big O notation
Tells you how fast an algorithm is - not in seconds, but by letting you compare the number of operations, i.e. how fast the algorithm grows. Written O(n), where n is the number of operations. Usually describes the worst-case scenario
Implementation of Hash Table
Use an array of linked lists + a hash code function. To insert a key + value:
1) Compute the key's hash code
2) Map the hash code to an index in the array, e.g. hash(key) % array_length (2 different hash codes can map to the same index)
3) At this index there is a linked list of keys and values; store the key and value there
Must use a linked list because of collisions: you could have 2 different keys with the same hash code, or 2 different hash codes that map to the same index.
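A minimal sketch of that design in Python (class and method names are mine; plain lists stand in for the linked lists):

```python
class HashTable:
    def __init__(self, size=8):
        # each slot holds a chain of [key, value] pairs
        # (a Python list standing in for a linked list)
        self.slots = [[] for _ in range(size)]

    def _index(self, key):
        # step 1 + 2: compute the hash code, map it to an array index
        return hash(key) % len(self.slots)

    def put(self, key, value):
        slot = self.slots[self._index(key)]
        for pair in slot:
            if pair[0] == key:      # key already present: overwrite its value
                pair[1] = value
                return
        slot.append([key, value])   # new key (possibly a collision): extend the chain

    def get(self, key):
        # walk the chain at this index looking for the key
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        return None
```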
Approximation algorithm
When calculating exact solution takes too much time, approximation algorithm will work. Approximation algorithms are judged by - How fast they are - How close they are to the optimal solution
Recursion
When a function calls itself. Often makes the solution clearer but gives no performance benefit. Every recursive function has 2 parts: 1) base case (when the function doesn't call itself again) 2) recursive case (when the function calls itself)
O(n) of Recursive function
When you have a recursive function that makes multiple calls, the runtime will often look like O(branches^depth), where branches is the number of times each recursive call branches. Two branches per call gives O(2^n)
Stack
When you insert an item, it gets added to the top of the list. When you read an item, you only read the topmost item, and it's taken off the list. Push (add new item to top) Pop (remove topmost item & read)
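In Python, a plain list works as a stack (example values are made up):

```python
stack = []
stack.append("first")    # push: add a new item to the top
stack.append("second")
top = stack.pop()        # pop: remove the topmost item and read it
```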
How memory works?
When you want to store an item in memory, you ask the computer for some space and it gives you an address where you can store the item. If you want to store multiple items, use an array or a list
Call stack
Your computer uses a stack internally. Every time you make a function call, computer saves the values for all variables for that call in memory. Computer uses stack for memory. - When you return from function call, box on top of stack gets popped off. - when you call a function from another function, the calling function is paused in a partially completed state. All the values of the variables for that function are still stored in memory
Binary Search python code
def binary_search(list, item):
    low = 0
    high = len(list) - 1
    while low <= high:
        mid = (low + high) // 2
        guess = list[mid]
        if guess == item:
            return mid
        if guess > item:
            high = mid - 1
        else:
            low = mid + 1
    return None
Find MAX number of list
def find_max(list):
    if len(list) == 2:
        return list[0] if list[0] > list[1] else list[1]
    sub_max = find_max(list[1:])
    return list[0] if list[0] > sub_max else sub_max
Recursive function to count number of items in a list
def count(list):
    if list == []:
        return 0
    return 1 + count(list[1:])
O(A+B) when to add runtimes
for (int a : arrA) {
    print(a);
}
for (int b : arrB) {
    print(b);
}
- Do A chunks of work, then B chunks of work
Linear programming
Used to maximize something given some constraints
How to pick right features for KNN?
• Directly correlate to movies you're trying to recommend • Don't have a bias (e.g. only rate comedy movies, that doesn't tell you whether they like action movies)
To avoid collisions
• Low load factor • Good hash function