Cracking the Coding Interview | Oct 2018 | v1.1
What is a perfect binary tree?
- A binary tree that is both full and complete. - It's rare in interviews and real life, as a perfect tree must have exactly 2^k-1 nodes, where k is the number of levels.
What is a min-heap?
- A complete binary tree where each node is smaller than its children. - The root is the minimum element.
What is the common strategy to solve a linked list problem?
- A number of linked list problems rely on recursion. - If you're having trouble solving a linked list problem, trying a recursive approach might help.
What are the characteristics of a tree?
- A tree has a root node. - Each node has zero or more children. - A tree cannot contain cycles; it is a connected graph without cycles. - The nodes may or may not be in a particular order. - A node is called a leaf node if it has no children.
When do I add/multiply the runtimes?
- Add: do this, then, when you're all done, do that. - Multiply: do this for each time you do that.
What are the 2 common ways to represent a graph?
- Adjacency List: Every vertex stores a list of adjacent vertices. - Adjacency Matrix is an NxN matrix where N is the number of nodes.
What are the places where queues are often useful?
- Breadth-first search. - Implementing a cache.
What are the 2 common ways to search a graph?
- Depth-first search: start at a node and explore each branch completely before moving on to the next branch. - Breadth-first search: start at a node and explore each neighbor before going on to any of their children.
2.3 Delete Middle Node: Implement an algorithm to delete a node in the middle of a singly linked list, given only access to that node.
- Is it exactly the middle node, or any node other than the first and last? - What does it mean to have access only to that node? We are not given the head of the linked list, only the node to delete. - Copy the data of the next node into the node to be deleted, then delete the next node. This does not work if the node to be deleted is the last node.
Walk through the use of the runner technique for this example: Rearrange this a1 > a2 > ... > an > b1 > b2 > ... > bn to a1 > b1 > a2 > b2 > ... > an > bn.
- Have a pointer p1 that moves 2 elements for every one move that p2 makes. - When p1 reaches the end, p2 is at the midpoint. Then move p1 back to the beginning and start weaving: on each step, move the element that p2 points to so that it comes right after p1.
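The copy-and-unlink idea above can be sketched as follows. This is a minimal illustration, not the book's code: `ListNode` and `deleteMiddleNode` are hypothetical names, and the function simply reports failure for the last node, where the trick cannot work.

```go
package main

// ListNode is a hypothetical singly linked list node used for illustration.
type ListNode struct {
	Val  int
	Next *ListNode
}

// deleteMiddleNode removes n from its list given only n itself.
// It copies the successor's value into n and then unlinks the successor.
// It returns false when n is nil or the last node, where this cannot work.
func deleteMiddleNode(n *ListNode) bool {
	if n == nil || n.Next == nil {
		return false
	}
	n.Val = n.Next.Val
	n.Next = n.Next.Next
	return true
}
```

Note that the node object that actually leaves the list is the successor, so any outside pointer to the successor becomes stale.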
2.2 Return Kth to Last: Implement an algorithm to find the kth to last element of a singly linked list.
- Is the linked list size known? If it is, the kth to last element is the (length - k)th element. 1. Recursive: recurse to the end of the list, then increment a counter each time a call returns. When the counter reaches k, we have found the element. O(n) time. O(n) space due to the recursive calls. 2. Iterative: have two pointers k places apart and move them at the same pace. When the faster one reaches the end, the slower one is at the (length - k)th element. O(n) time. O(1) space.
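A minimal sketch of the two-pointer approach (option 2), assuming k = 1 means the last element. `ListNode` and `kthToLast` are hypothetical names chosen for this example.

```go
package main

// ListNode is a hypothetical singly linked list node used for illustration.
type ListNode struct {
	Val  int
	Next *ListNode
}

// kthToLast returns the kth to last node, where k = 1 is the last node.
// It returns nil when k is not positive or the list has fewer than k nodes.
func kthToLast(head *ListNode, k int) *ListNode {
	if k <= 0 {
		return nil
	}
	// Move the fast pointer k nodes ahead.
	fast := head
	for i := 0; i < k; i++ {
		if fast == nil {
			return nil
		}
		fast = fast.Next
	}
	// Advance both at the same pace; when fast runs off the end,
	// slow is k nodes from the end.
	slow := head
	for fast != nil {
		fast = fast.Next
		slow = slow.Next
	}
	return slow
}
```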
2.6 Palindrome: Implement a function to check if a linked list is a palindrome.
- Is the length known? 1. Reverse the linked list, then compare it to the original. Only the first half needs to be compared. 2. Push the first half onto a stack and compare it with the rest. If the length is known, this is straightforward; remember to skip the middle element for odd lengths. If the length is not known, use the runner technique: when the fast pointer hits the end, the slow one is in the middle. Then compare the rest against the stack. 3. Recursive approach: compare the head and the tail, then move inward on each call.
What are common questions to ask about strings?
- Is the string ASCII or Unicode? - Are there spaces? - Are there special characters? - Are there digits? - Is there a difference between uppercase and lowercase?
1.1 Is Unique: Implement an algorithm to determine if a string has all unique characters.
- Is the string ASCII or Unicode? We assume it's ASCII since it's simpler. - Can return false immediately if the length of the string is larger than 128. 1. We can have an array of booleans. Iterate through the string character by character and check if we have already seen each one. 2. Can also use a hash table instead of an array. O(n) time where n is the length of the string. It can also be argued to be O(1) since the loop never goes beyond 128 characters. O(m) space where m is the number of unique characters.
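A minimal sketch of the boolean-array approach, under the same ASCII assumption as the notes. The name `isUniqueASCII` is hypothetical, and treating non-ASCII bytes as a failed check is a choice made only for this example.

```go
package main

// isUniqueASCII reports whether every character in s appears at most once.
// Assumption: s is 7-bit ASCII. A byte outside that range violates the
// assumption, and this sketch simply returns false for it.
func isUniqueASCII(s string) bool {
	// More than 128 characters guarantees a repeat (pigeonhole).
	if len(s) > 128 {
		return false
	}
	var seen [128]bool
	for i := 0; i < len(s); i++ {
		c := s[i]
		if c >= 128 || seen[c] {
			return false
		}
		seen[c] = true
	}
	return true
}
```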
1.8 Zero Matrix: Write an algorithm such that if an element in a MxN matrix is 0, its entire row and column are set to 0.
- It might look easy: just iterate through the matrix and every time we see a cell with value zero, set its row and column to 0. But we run into a problem: the newly written zeros are later read as if they were original zeros, so the entire matrix may end up set to zeros. - Instead, iterate through the entire matrix first and keep two arrays that mark which rows and which columns contain a zero. Then nullify those rows and columns.
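The two-pass idea can be sketched like this, assuming a rectangular matrix stored as a slice of equal-length rows. `zeroMatrix` is a hypothetical name.

```go
package main

// zeroMatrix sets the entire row and column of every original zero to 0.
// Assumption: m is rectangular (all rows have the same length).
func zeroMatrix(m [][]int) {
	if len(m) == 0 {
		return
	}
	rows := make([]bool, len(m))
	cols := make([]bool, len(m[0]))
	// First pass: record which rows and columns contain a zero.
	for i, row := range m {
		for j, v := range row {
			if v == 0 {
				rows[i] = true
				cols[j] = true
			}
		}
	}
	// Second pass: nullify the marked rows and columns.
	for i, row := range m {
		for j := range row {
			if rows[i] || cols[j] {
				m[i][j] = 0
			}
		}
	}
}
```

This uses O(M + N) extra space for the two marker arrays and O(MN) time.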
Given a list of millions of documents, how would you find all documents that contain a list of words? The words can appear in any order, but they must be complete words. That is, "book" does not match "bookkeeper."
- Pretend we have a dozen documents: how would we implement findWords? - One way could be to preprocess each document and create a hash table, where a word maps to a list of documents that contain that word. - What if there are millions of documents? Maybe divide up documents across many machines. - What if we can't fit the full hash table on one machine? Maybe divide by keyword such that a given machine contains the full document list for a given word. Or divide by document such that a machine contains the keyword mapping for only a subset of documents. - How do all these machines talk to each other? How do we know which machine contains which piece of data? - One possible solution is to divide up the words alphabetically and assign them in machine order. Basically iterate through the keywords alphabetically and store as many as possible on one machine, then go to the next one in order. For a given list of words, we can sort it, send each word to the machine with the associated range, then merge the results from all machines. - Advantage: the lookup table is small and simple. - Disadvantage: if new documents or words are added, a shift will be expensive.
What are the 2 common types of balanced trees?
- Red-black trees. - AVL trees.
1.5 One Away: There are 3 types of edits that can be performed on a string: insert a character, remove one, replace one. Given 2 strings, write a function to check if they are one edit (or zero edits) away. For example, - pale, ple => true - pales, pale => true - pale, bale => true - pale, bae => false
- Replacement means two strings are different only in one place. - Insertion means if you compared the strings, they would be identical, except for a shift at some point in the strings. - Removal is the inverse of insertion. 1. Keep two pointers, one index into each string. If the strings have the same length, they could be one replacement away; if the lengths differ by one, they could be one insertion or removal away; otherwise return false. O(n) time where n is the length of the shorter string, O(1) space.
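A minimal sketch of the two-index approach, assuming the strings are compared byte by byte (ASCII input). `oneAway` is a hypothetical name.

```go
package main

// oneAway reports whether a and b are zero or one edit apart, where an
// edit is an insertion, removal, or replacement of a single character.
// Assumption: strings are compared byte by byte (ASCII input).
func oneAway(a, b string) bool {
	// Make a the shorter (or equal-length) string.
	if len(a) > len(b) {
		a, b = b, a
	}
	if len(b)-len(a) > 1 {
		return false
	}
	foundDiff := false
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		if a[i] != b[j] {
			if foundDiff {
				return false
			}
			foundDiff = true
			// Replacement: move both indexes.
			// Insertion or removal: only the longer string's index moves.
			if len(a) == len(b) {
				i++
			}
		} else {
			i++
		}
		j++
	}
	return true
}
```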
4.1 Route Between Nodes: Given a directed graph, find out whether there is a route between two nodes.
- Simple graph traversal, DFS and BFS. - Start with one of the two nodes and do the search. - Remember to mark visited nodes to avoid cycles.
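A minimal BFS sketch for this problem, assuming the directed graph is stored as an adjacency list keyed by integer node IDs. The representation and the name `hasRoute` are assumptions for illustration.

```go
package main

// hasRoute reports whether end is reachable from start in a directed graph.
// Assumption: graph maps each node ID to the IDs it has edges to.
func hasRoute(graph map[int][]int, start, end int) bool {
	if start == end {
		return true
	}
	// Mark visited nodes so cycles do not cause an infinite loop.
	visited := map[int]bool{start: true}
	queue := []int{start}
	for len(queue) > 0 {
		node := queue[0]
		queue = queue[1:]
		for _, next := range graph[node] {
			if next == end {
				return true
			}
			if !visited[next] {
				visited[next] = true
				queue = append(queue, next)
			}
		}
	}
	return false
}
```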
What are common questions to ask about linked list?
- Singly or doubly? - Length? Even? Odd? - Sorted or unsorted?
2.8 Loop Detection: Given a circular linked list, return the node at the beginning of the loop.
- Start at the head. - Keep a hash table of the nodes we have already visited. - If we see a node that is already in the table, then there is a cycle, and that node is the beginning of the loop.
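A minimal sketch of the hash-table approach, keyed by node pointer rather than by value so that duplicate values do not cause false positives. `ListNode` and `loopStart` are hypothetical names.

```go
package main

// ListNode is a hypothetical singly linked list node used for illustration.
type ListNode struct {
	Val  int
	Next *ListNode
}

// loopStart returns the first node that is visited twice when walking the
// list from head, which is the node at the beginning of the loop.
// It returns nil when the list has no loop.
func loopStart(head *ListNode) *ListNode {
	seen := make(map[*ListNode]bool)
	for n := head; n != nil; n = n.Next {
		if seen[n] {
			return n
		}
		seen[n] = true
	}
	return nil
}
```

This takes O(n) time and O(n) space for the set of visited nodes.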
1.2 Check Permutation: Given two strings, write a method to decide if one is a permutation of the other.
- Strings of different lengths cannot be permutations of each other. 1. If two strings are permutations, they have the same characters but in different orders. We can sort them first, then compare the results. 2. Use a hash table to count the number of appearances of each character in one string. Iterate through the other string to decrease the counts. If all are 0 in the end, return true. O(m + n) time where m, n are the sizes of the two strings. O(m) space because of the hash table.
1.4 Palindrome Permutation: Given a string, write a function to check if it's a permutation of a palindrome. For example, the string "tact coa" is a permutation of the palindrome "taco cat".
- A string is a permutation of a palindrome if every character has an even count, except that at most one character can have an odd count, which is the one in the middle. 1. Use a hash table to count the number of appearances of each character. Iterate through the hash table and ensure that no more than one character has an odd count. O(n) time, O(m) space where n is the length of the string and m <= n is the number of distinct characters.
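A minimal sketch of the counting approach (option 2), using a single map that is incremented for one string and decremented for the other. `isPermutation` is a hypothetical name, and the comparison is case-sensitive by assumption.

```go
package main

// isPermutation reports whether b is a rearrangement of the characters of a.
// Assumption: the comparison is case-sensitive and whitespace counts.
func isPermutation(a, b string) bool {
	if len(a) != len(b) {
		return false
	}
	counts := make(map[rune]int)
	for _, r := range a {
		counts[r]++
	}
	for _, r := range b {
		counts[r]--
		// A negative count means b has more of r than a does.
		if counts[r] < 0 {
			return false
		}
	}
	return true
}
```

Because the lengths are equal, no count can stay positive unless another count went negative, so the early exit is sufficient.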
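A minimal sketch of the counting approach. It assumes, based on the "tact coa" example, that non-letters are ignored and that case does not matter; `isPalindromePermutation` is a hypothetical name.

```go
package main

import "unicode"

// isPalindromePermutation reports whether the letters of s can be
// rearranged into a palindrome.
// Assumption: non-letters (such as spaces) are ignored and case is folded.
func isPalindromePermutation(s string) bool {
	counts := make(map[rune]int)
	for _, r := range s {
		if unicode.IsLetter(r) {
			counts[unicode.ToLower(r)]++
		}
	}
	// At most one character may have an odd count (the middle one).
	odd := 0
	for _, c := range counts {
		if c%2 == 1 {
			odd++
		}
	}
	return odd <= 1
}
```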
4.2 Minimal Tree: Given a sorted (increasing order) array with unique integer elements, create a binary search tree with minimal height.
- To create a tree with minimal height, we need to match the number of nodes in the left subtree to the number in the right subtree as much as possible. - Since the array is sorted, we can make the middle element the root; the left half goes in the left subtree and the right half goes in the right subtree. 1. Recursive approach: - Insert the middle element as the root. - Recursively build the left subtree from the left half. - Recursively build the right subtree from the right half.
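The recursive approach can be sketched as follows. `TreeNode`, `minimalBST`, and the `height` helper are hypothetical names introduced for this example.

```go
package main

// TreeNode is a hypothetical binary tree node used for illustration.
type TreeNode struct {
	Val         int
	Left, Right *TreeNode
}

// minimalBST builds a minimal-height binary search tree from a slice
// sorted in increasing order with unique elements.
func minimalBST(sorted []int) *TreeNode {
	if len(sorted) == 0 {
		return nil
	}
	mid := len(sorted) / 2
	return &TreeNode{
		Val:   sorted[mid],
		Left:  minimalBST(sorted[:mid]),
		Right: minimalBST(sorted[mid+1:]),
	}
}

// height returns the number of levels in the tree (0 for an empty tree).
func height(t *TreeNode) int {
	if t == nil {
		return 0
	}
	l, r := height(t.Left), height(t.Right)
	if l > r {
		return l + 1
	}
	return r + 1
}
```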
Sum up: What are the common strategies to solve arrays, strings related questions?
- Use a hash table to keep the count. - Use 2 pointers at the index.
What is a bidirectional search?
- Used to find the shortest path between a source and a destination node. - Run 2 simultaneous breadth-first searches, one from each node. Once they collide, we have found a path.
What are the common ways of partitioning and their drawbacks?
- Vertical partitioning: partition by feature. Say in a social media app, we can have different tables for each component: profiles, messages and so on. Drawback: one table can get very big and need to be repartitioned. - Key-based/hash-based partitioning. Drawback: adding additional servers means reallocating all the data, which is very expensive. - Directory-based partitioning by maintaining a lookup table, e.g. a distributed file system (GFS). Drawback: the lookup table can be a single point of failure; constantly accessing this table affects performance.
What is in-order traversal?
- Visit the left branch. - The current node. - The right branch.
2.7 Intersection: Given two singly linked list, determine if two lists intersect.
- What does an "intersection" mean? - Are they the same length? 1. If two lists intersect, they share every node from the intersection onward, so they have the same last node. Ideally we would traverse backward, but you can't traverse a singly linked list backward. So if they're the same length, just traverse both forward in step and see where they collide. If not, skip the extra nodes at the head of the longer list (the difference in lengths), then do the same.
3.1 Three in One: Describe how you could use a single array to implement three stacks.
- What does it mean? - Fixed size or dynamic size? 1. Fixed division: divide the array into three equal parts and allow each stack to grow within its limited space. 2. Flexible division: once one stack grows beyond its capacity, extend its allowed space and shift the other elements over as needed.
1.6 String Compression: Implement a method to perform basic string compression using the count of repeated characters. For example, "aabcccccaaa" would become "a2b1c5a3".
- What if the compressed string is not smaller than the original string? Return the original one. - Is there an uppercase or lowercase difference? - Seems easy but it would be very inefficient if we use string concatenation. Remember that repeated string concatenation takes O(n^2) time overall because it needs to copy the string over at every step. 1. We need to use a strings.Builder. Iterate through the string, keep a counter for consecutive characters and append using that Builder object. O(n) time, O(m) space where n is the length of the string and m is the length of the compressed string.
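A minimal sketch using strings.Builder, assuming ASCII input that is processed byte by byte and a case-sensitive comparison. `compress` is a hypothetical name.

```go
package main

import (
	"strconv"
	"strings"
)

// compress replaces each run of repeated characters with the character
// followed by the run length. It returns s unchanged when the compressed
// form would not be shorter.
// Assumption: s is ASCII and the comparison is case-sensitive.
func compress(s string) string {
	if len(s) == 0 {
		return s
	}
	var sb strings.Builder
	count := 1
	for i := 1; i <= len(s); i++ {
		// Still inside the current run: keep counting.
		if i < len(s) && s[i] == s[i-1] {
			count++
			continue
		}
		// The run ended at i-1: emit the character and its count.
		sb.WriteByte(s[i-1])
		sb.WriteString(strconv.Itoa(count))
		count = 1
	}
	if sb.Len() >= len(s) {
		return s
	}
	return sb.String()
}
```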
2.5 Sum Lists: You have 2 numbers represented by a linked list, where each node contains a single digit. The digits are stored in reverse order. Add two numbers and return the sum as a linked list. For example, if the input is (7 -> 1 -> 6) + (5 -> 9 -> 2), we calculate 617 + 295 = 912, then output (2 -> 1 -> 9).
- What if the lengths are not the same? Pad the shorter list with zero then do the same. 1. Because they are in reverse order, we can add one node by one node and remember the carry.
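A minimal sketch of the digit-by-digit addition with a carry. Instead of physically padding the shorter list, it treats a missing node as a zero digit, which has the same effect. `ListNode` and `sumLists` are hypothetical names.

```go
package main

// ListNode is a hypothetical singly linked list node holding one digit.
type ListNode struct {
	Val  int
	Next *ListNode
}

// sumLists adds two numbers whose digits are stored in reverse order
// (ones digit first) and returns the sum in the same representation.
func sumLists(a, b *ListNode) *ListNode {
	dummy := &ListNode{} // placeholder node before the real head
	tail := dummy
	carry := 0
	for a != nil || b != nil || carry != 0 {
		sum := carry
		if a != nil {
			sum += a.Val
			a = a.Next
		}
		if b != nil {
			sum += b.Val
			b = b.Next
		}
		tail.Next = &ListNode{Val: sum % 10}
		tail = tail.Next
		carry = sum / 10
	}
	return dummy.Next
}
```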
1.3 URLify: Write a method to replace all spaces in a string with '%20'.
- Will there be spaces at the end of the string? What do I do with them? - Go through the array character by character. If it's a non-space character, copy it over. If it's a space, write '%20' in its place.
What are the core operations of a queue?
- add(item): add an item to the end of the list. - remove(): remove the first item in the list. - peek(): return the item at the front of the queue. - isEmpty(): return true iff the queue is empty.
What are the core operations of a stack?
- pop(): remove the top item from the stack - push(item): add an item to the top of the stack. - peek(): return the top of the stack. - isEmpty(): return true iff the stack is empty.
2.4 Partition: Partition a linked list around a value x, where all nodes less than x come before all nodes greater than or equal to x.
1. Basically create 2 linked lists: one contains all nodes that are less than x and one contains all nodes that are greater than or equal to x. Merge these two.
What are the two key operations on a min-heap?
1. Insert. - Insert the element at the bottom, at the rightmost spot, to maintain the complete tree property. - Fix the tree by swapping the new element with its parent until we find the appropriate spot. In other words, we bubble up the minimum element. This takes O(log n) time, where n is the number of nodes in the heap. 2. Extract the minimum element. - The minimum element is always at the top, so the tricky part is removing it. - Swap it with the bottommost, rightmost element. - Remove that last element, which is now the old minimum. - Bubble the new root down, swapping it with its smaller child until the heap property is restored. This also takes O(log n) time.
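A minimal sketch of the two-list approach, reusing the existing nodes and keeping their relative order. `ListNode` and `partition` are hypothetical names.

```go
package main

// ListNode is a hypothetical singly linked list node used for illustration.
type ListNode struct {
	Val  int
	Next *ListNode
}

// partition rearranges the list so that all nodes with values less than x
// come before all nodes with values greater than or equal to x.
// It returns the new head.
func partition(head *ListNode, x int) *ListNode {
	// Placeholder heads make appending uniform.
	lessHead, geHead := &ListNode{}, &ListNode{}
	less, ge := lessHead, geHead
	for n := head; n != nil; n = n.Next {
		if n.Val < x {
			less.Next = n
			less = n
		} else {
			ge.Next = n
			ge = n
		}
	}
	ge.Next = nil             // terminate the second list
	less.Next = geHead.Next   // merge: first list, then second list
	return lessHead.Next
}
```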
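The two operations can be sketched on a slice-backed heap, where the children of index i sit at 2*i+1 and 2*i+2. The `MinHeap` type and its method names are assumptions for illustration; Go's standard library also provides container/heap for production use.

```go
package main

// MinHeap is a hypothetical slice-backed min-heap of ints.
type MinHeap struct {
	items []int
}

// Insert appends v at the next free spot and bubbles it up.
func (h *MinHeap) Insert(v int) {
	h.items = append(h.items, v)
	i := len(h.items) - 1
	for i > 0 {
		parent := (i - 1) / 2
		if h.items[parent] <= h.items[i] {
			break
		}
		h.items[parent], h.items[i] = h.items[i], h.items[parent]
		i = parent
	}
}

// ExtractMin removes and returns the smallest element.
// The second result is false when the heap is empty.
func (h *MinHeap) ExtractMin() (int, bool) {
	if len(h.items) == 0 {
		return 0, false
	}
	minVal := h.items[0]
	last := len(h.items) - 1
	// Move the last element to the root, shrink, then bubble down.
	h.items[0] = h.items[last]
	h.items = h.items[:last]
	i := 0
	for {
		left, right, smallest := 2*i+1, 2*i+2, i
		if left < len(h.items) && h.items[left] < h.items[smallest] {
			smallest = left
		}
		if right < len(h.items) && h.items[right] < h.items[smallest] {
			smallest = right
		}
		if smallest == i {
			break
		}
		h.items[i], h.items[smallest] = h.items[smallest], h.items[i]
		i = smallest
	}
	return minVal, true
}
```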
2.1 Remove Dups: Write code to remove duplicates from an unsorted linked list.
1. Iterate through the linked list and add each element to a hash table if we haven't seen it. If we have, delete it. - O(n) time where n is the number of elements in the linked list. - O(m) space where m <= n because of the allocated hash table. 2. Have two pointers. The current one iterates through the linked list. The runner one checks all subsequent nodes for duplicates. O(1) space because there is no allocation. O(n^2) time because we have to check every node against every other node.
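A minimal sketch of the hash-table approach (option 1), keeping the first occurrence of each value. `ListNode` and `removeDups` are hypothetical names.

```go
package main

// ListNode is a hypothetical singly linked list node used for illustration.
type ListNode struct {
	Val  int
	Next *ListNode
}

// removeDups unlinks every node whose value has already appeared earlier
// in the list. The head is never removed, so it does not need to return one.
func removeDups(head *ListNode) {
	seen := make(map[int]bool)
	var prev *ListNode
	for n := head; n != nil; n = n.Next {
		if seen[n.Val] {
			// Duplicate: skip over n. prev is non-nil here because the
			// first node can never be a duplicate.
			prev.Next = n.Next
		} else {
			seen[n.Val] = true
			prev = n
		}
	}
}
```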
What is a balanced tree?
A balanced tree means something more like "not terribly imbalanced". It's balanced enough to ensure O(log n) times for insert and find.
What is a complete binary tree?
A binary tree in which every level of the tree is fully filled, except perhaps the last level. To the extent that the last level is filled, it is filled from left to right.
What is a binary search tree?
A binary tree in which every node n fits a specific ordering property: all left descendants <= n <= all right descendants.
What is a full binary tree?
A binary tree in which every node has either zero or two children.
What is a binary tree?
A binary tree is a tree in which each node has up to 2 children.
What is a trie?
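A funny data structure: a variant of an n-ary tree in which characters are stored at each node, so each path down the tree may represent a word. It is sometimes called a prefix tree.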
What is MapReduce?
A programming model used to process large amounts of data. It requires you to write 2 functions: - Map takes in some data and emits key-value pairs. - Reduce takes a key and a set of associated values, reduces them, and emits a new key-value pair.
What is asynchronous processing & queue?
A synchronous operation blocks until it completes, while an asynchronous operation does not block and only initiates the operation. This allows us to do other things while the message is in transit. (Think of how we multitask: we switch between tasks, but do not do several at the same time, i.e. not in parallel.)
What is a load balancer?
A system that distributes the load evenly across all servers.
What is throughput?
The actual amount of data that is transferred in a unit of time.
What is database denormalization?
Adding redundant information to a database to speed up reads.
What are the common questions about trees and graphs?
Ask for clarification on which type of tree.
What is an example of O(log n) runtime?
Binary search.
What to do with read-heavy?
Cache
What is the one problem with joins in a relational database?
Can be very slow and expensive when the systems get bigger.
What are the common use cases of BFS?
Find the shortest path between two nodes.
Describe the runtime of insertion into a dynamically resizing array, for example a slice in Go.
Most of the time, the runtime is O(1) because inserting an element is just adding it to the end of the array. However, when there is no space left, the array doubles its capacity: a bigger block is allocated and everything is copied over, which takes O(n) time. Overall, X insertions take O(2X) time, so the amortized time for each insertion is O(1).
What is availability?
Function of the percentage of time the system is operational.
What is reliability?
Function of the probability that the system is operational for a certain unit of time.
What is horizontal scaling?
Increasing the number of nodes, say adding more servers.
What is vertical scaling?
Increasing the resource of a specific node, say adding more computing power or memory.
What is the amortized time?
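It describes the idea that the worst case happens only once in a while. When it does, it is costly, but it won't happen again for a long time, so the cost is spread out (amortized) over all the operations. That's the trade-off.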
What is the most important fact about encodings?
It does not make sense to have a string without knowing what encoding it uses. You can no longer stick your head in the sand and pretend that "plain" text is ASCII.
What is the runner technique?
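Iterate through the linked list with two pointers simultaneously, with one ahead of the other.
3.2 Stack Min: Design an algorithm to return the minimum element of a stack.
Keep a minimum value as a member of the stack and update it every time we push a new item. Note that a single minimum value is not enough once the minimum is popped, so keep the minimum as of each element, for example alongside each item or in a second stack.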
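One way to make the minimum survive pops is to record, for each pushed item, the minimum of the stack at that point. The sketch below does this with a parallel slice; this is a variation on the single-member idea in the note, and `MinStack` and its method names are hypothetical.

```go
package main

// MinStack is a hypothetical stack of ints that can report its minimum
// in O(1) time. mins[i] holds the minimum of vals[0..i].
type MinStack struct {
	vals []int
	mins []int
}

// Push adds v and records the minimum of the stack including v.
func (s *MinStack) Push(v int) {
	m := v
	if n := len(s.mins); n > 0 && s.mins[n-1] < v {
		m = s.mins[n-1]
	}
	s.vals = append(s.vals, v)
	s.mins = append(s.mins, m)
}

// Pop removes and returns the top item; ok is false when the stack is empty.
func (s *MinStack) Pop() (v int, ok bool) {
	if len(s.vals) == 0 {
		return 0, false
	}
	last := len(s.vals) - 1
	v = s.vals[last]
	s.vals = s.vals[:last]
	s.mins = s.mins[:last]
	return v, true
}

// Min returns the current minimum; ok is false when the stack is empty.
func (s *MinStack) Min() (v int, ok bool) {
	if len(s.mins) == 0 {
		return 0, false
	}
	return s.mins[len(s.mins)-1], true
}
```

Push, Pop, and Min all run in O(1) time, at the cost of O(n) extra space for the recorded minimums.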
3.3 Stack of Plates: Implement a dynamic stack basically.
Look into the Go slice implementation. Basically it grows the capacity (doubling it for small slices) every time there is no room left. It's a tradeoff.
There are some examples after this that include pictures that cannot be parsed here
Look into the notes for more information.
What is bandwidth?
Maximum amount of data that can be transferred in a unit of time.
1.9 String Rotation: Assume you have a method isSubstring which checks if one word is a substring of another. Given two strings, s1 and s2, write code to check if s2 is a rotation of s1 using one call to isSubstring.
Nah
If we have n calls, does it ALWAYS mean that it takes O(n) space? Give a counterexample.
No. Here is an example: define a variable, which takes 1 unit of space, then loop N times, calling a function that adds to it. The calls do not exist on the stack at the same time and no call allocates any more space, because each just adds to that one variable. Therefore, it is O(1) space complexity in this case.
How much space do recursive algorithms take in general?
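O(depth), because only the calls on the current path exist on the stack at any one time. The runtime, by contrast, is often O(branches^depth).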
What to do with write-heavy?
Queue up the writes. Think twice about the potential failure here.
What is the one case where stacks are often useful?
Recursive algorithms: Push data onto a stack as you recurse. Remove them as you backtrack.
Walk through a step-by-step design of TinyURL
Scope the problem: - Will people be able to specify their own short URL? - Will it be auto-generated? - Will you need to keep track of stats on the clicks? - Should the URLs live forever or do they have a timeout? List major use cases: - Shorten a URL to a TinyURL - Analytics for a URL - Retrieving the URL - User accounts and link management. Make reasonable assumptions: - Is it reasonable to assume that there will be 1 million new URLs a day? - Is it okay for the data to be stale by a max of 10 minutes? Draw the major components and walk through the flow. Identify key issues: - What will be the bottlenecks or major challenges? - Frequently accessed URLs? Redesign for key issues: - Cache
What is cache?
A simple in-memory key-value data store.
3.4 Queue via Stacks: Implement a queue using two stacks.
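Since a queue is first in, first out while a stack is last in, first out, we can use the second stack to reverse the order: push new items onto the first stack, and when removing, pop from the second stack, refilling it from the first stack only when it is empty.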
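A minimal sketch of the two-stack queue, using slices as stacks. Items are moved to the second stack only when it is empty, so each item is moved at most once. `StackQueue` and its method names are hypothetical.

```go
package main

// StackQueue is a hypothetical FIFO queue built from two stacks.
// New items are pushed onto in; items are removed from out.
type StackQueue struct {
	in  []int
	out []int
}

// Add puts v at the end of the queue.
func (q *StackQueue) Add(v int) {
	q.in = append(q.in, v)
}

// shift moves everything from in to out, reversing the order,
// but only when out is empty so the existing order is preserved.
func (q *StackQueue) shift() {
	if len(q.out) > 0 {
		return
	}
	for len(q.in) > 0 {
		last := len(q.in) - 1
		q.out = append(q.out, q.in[last])
		q.in = q.in[:last]
	}
}

// Remove takes the oldest item off the queue; ok is false when it is empty.
func (q *StackQueue) Remove() (v int, ok bool) {
	q.shift()
	if len(q.out) == 0 {
		return 0, false
	}
	last := len(q.out) - 1
	v = q.out[last]
	q.out = q.out[:last]
	return v, true
}
```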
3.5 Sort Stack: Sort a stack such that the smallest items are on the top.
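Use a second, temporary stack: pop each item from the original stack and insert it into the temporary stack in sorted order, moving larger items back to the original stack as needed. Then push everything back in the order you want. O(n^2) time, O(n) space.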
What is database partitioning/sharding?
Splitting data across multiple machines while ensuring you have a way of figuring out which data is on which machine.
9.2 Social Network: Walk through the step of designing a data structure for a very large social network like Facebook and an algorithm to show the shortest path between two people.
Step 1: Simplify the problem by forgetting about the millions of users. - Use bidirectional BFS: basically run 2 BFS, one from the source and one from the destination, and stop at the collision. - In a simple case where the two people are two degrees apart, we only need to go through 2k nodes, where k is the number of friends each person has. With one-way BFS, we need to go through the k friends and then each of their k friends, which is k + k*k nodes in total. Step 2: Handle the millions of users. - Keep data on multiple machines instead of one. We need a new algorithm to search across all machines. - One simple algorithm could be: for each friend, go to the corresponding machine, then repeat. Step 3: Optimization. - Jumping from one machine to another can be expensive. Instead of jumping with each friend, we can try to batch these jumps. - Rather than randomly dividing people across machines, we can divide by country, city, state,... What can the follow-up questions be? - What to do if one or multiple servers fail? How to take advantage of caching? - Do you search until the end of the graph? When to stop? - In the case above, we assume everyone has the same number of friends, but that's not the case in the real world. How do we take advantage of that?
4.10 Check Subtree: T1 and T2 are two very large binary trees, with T1 much bigger than T2. Create an algorithm to determine if T2 is a subtree of T1. A tree T2 is a subtree of T1 if there exists a node n in T1 such that the subtree of n is identical to T2. That is, if you cut off the tree at node n, the two trees would be identical.
TODO
4.11 Random Node: You are implementing a binary tree class from scratch which, in addition to insert, find, and delete, has a method getRandomNode() which returns a random node from the tree. All nodes should be equally likely to be chosen. Design and implement an algorithm for getRandomNode, and explain how you would implement the rest of the methods.
TODO
4.12 Paths with Sum: You are given a binary tree in which each node contains an integer value (which might be positive or negative). Design an algorithm to count the number of paths that sum to a given value. The path does not need to start or end at the root or a leaf, but it must go downwards (traveling only from parent nodes to child nodes).
TODO
4.3 List of Depths: Given a binary tree, create a linked list of all nodes at each depth.
TODO
4.4 Check Balanced: Implement a function to check if a binary tree is balanced. For the purposes of this question, a balanced tree is defined to be a tree such that the heights of the two subtrees of any node never differ by more than one.
TODO
4.5 Validate BST: Implement a function to check if a binary tree is a binary search tree.
TODO
4.6 Successor: Write an algorithm to find the "next" node (i.e., in-order successor) of a given node in a binary search tree. You may assume that each node has a link to its parent.
TODO
4.7 Build Order: You are given a list of projects and a list of dependencies (which is a list of pairs of projects, where the second project is dependent on the first project). All of a project's dependencies must be built before the project is. Find a build order that will allow the projects to be built. If there is no valid build order, return an error. EXAMPLE Input: - projects: a, b, c, d, e, f - dependencies: (a, d), (f, b), (b, d), (f, a), (d, c) Output: f, e, a, b, d, c
TODO
4.8 First Common Ancestor: Design an algorithm and write code to find the first common ancestor of two nodes in a binary tree. Avoid storing additional nodes in a data structure. NOTE: This is not necessarily a binary search tree.
TODO
4.9 BST Sequences: A binary search tree was created by traversing through an array from left to right and inserting each element. Given a binary search tree with distinct elements, print all possible arrays that could have led to this tree.
TODO
CHAPTER 5 BIT MANIPULATION
TODO
What is the definition of big O?
Big O describes an upper bound on the runtime. The academic and industry usages are similar, though industry tends to mean the tightest upper bound.
What is space complexity?
The memory or space it takes to run.
What is latency?
The time it takes for data to go from one end to the other.
What is the difference between academia and industry's meaning of big O?
There is not much difference. Both refer to an upper bound. In academia, there are 3 different definitions: - big O: upper bound - big omega: lower bound - big theta: tight bound (both big O and big omega). In industry, big O is used closer to what academics call big theta, which is basically the "tight" upper bound.
3.6 Animal Shelter
Too long man
What is the alternative way to solve a recursive problem?
Try to solve it in the iterative way or vice versa.
What is post-order traversal?
Visit the current node after its child nodes.
What is pre-order traversal?
Visit the current node before its child nodes.
What are the common use cases of DFS?
Want to visit every node in the graph.
9.1 Stock Data: Let's say we have a simple service that returns end-of-day stock information. Assume that you already have the data and can store it in any form you want. How would you distribute the information to clients? Name 3 different proposals. You're responsible for development, rollout, monitoring, maintenance.
Aspects we should consider: - Ease of use for the client and the developer. - Scalability. Proposal 1: Keep the data in simple text files and let clients download them through FTP. Pros: - Simple to maintain, in the sense that files can be easily viewed and backed up. Cons: - Requires complex parsing to do any query. - Additional data will break the clients' parsing mechanism. Proposal 2: Use a standard SQL database and let the clients plug directly into it. Pros: - Traditional way to query data, so there is less of a learning curve. - Existing support features: backup, rollback, replication,... - Relatively easy to integrate with existing applications. Cons: - Heavier than we need. - Might be difficult for humans to read, given that they need to be somewhat familiar with SQL. Proposal 3: XML, JSON,... Pros: - Easy to distribute, add to, and maintain. - Readable. - Widely supported. Cons: - (not clear what the book says)
Does stack space in recursive calls count?
Yes. It makes sense because the stack space is just adding up with the recursive call and this definitely takes up more space.
List the common big O rate of increase in worst to best order.
x!, 2^x, x^2, x log(x), x, log(x)