yahoo interview
Design a game with deck of cards.
...
singleton
...
Create maximum unique palindromes from a given string
Already given
How to resolve collisions in a hash map
Chaining
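A minimal sketch of chaining, assuming a fixed number of buckets and no resizing. The class name ChainedHashMap and its methods are made up for illustration; keys that hash to the same bucket simply share that bucket's list.

```python
class ChainedHashMap(object):
    """Toy hash map that resolves collisions by chaining (hypothetical name)."""

    def __init__(self, num_buckets=8):
        # Each bucket is a list of (key, value) pairs that hashed to the same slot.
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))  # colliding keys share the chain

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default
```

With num_buckets=1 every key collides, which makes the chain easy to inspect.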
DNS resolution
Each web server (and indeed any host connected to the internet) has a unique IP address. Translating a textual host name into its IP address (in this case, 207.142.131.248) is a process known as DNS resolution or DNS lookup; here DNS stands for Domain Name System.
What is the difference between hashtable and hashmap?
1) Hashtable is synchronized whereas HashMap is not. 2) The iterator in HashMap is fail-fast, while the enumerator for Hashtable isn't: if you change the map while iterating, you'll know (a ConcurrentModificationException is thrown). 3) HashMap permits one null key and null values, while Hashtable permits neither. Source: javarevisited.blogspot.com
Arraylist vs linkedlist
LinkedList is fast for adding and deleting elements at either end, but slow to access a specific element by index. ArrayList is fast for accessing a specific element and for appending at the end (amortized constant time), but slow to insert or delete at the front or in the middle, because later elements must be shifted. ArrayList is essentially a resizable array. LinkedList is implemented as a doubly linked list.
pthread
POSIX Threads, usually referred to as pthreads, is an execution model that exists independently from a language, as well as a parallel execution model. It allows a program to control multiple different flows of work that overlap in time.
What is singleton, and pros and cons of it?
Positives: lazy instantiation, static initialization. Negatives: they deviate from the Single Responsibility Principle; singleton classes cannot be subclassed.
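A minimal sketch of a lazily instantiated singleton in Python. The class name Config and its settings attribute are hypothetical, and this version is not thread-safe.

```python
class Config(object):
    """Hypothetical singleton: every call to Config() returns the same object."""
    _instance = None

    def __new__(cls):
        # Lazy instantiation: the instance is created on first use only.
        if cls._instance is None:
            cls._instance = super(Config, cls).__new__(cls)
            cls._instance.settings = {}
        return cls._instance
```

Because all callers share one object, a change made through one reference is visible through every other, which is also why singletons tend to behave like global state.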
Pseudo code that makes debugging localized and easier
RDD.transformation() RDD.action() RDD.transformation() RDD.action()
Concise pseudo code that makes debugging localized and easier
RDD.transformation().transformation().action()
If you have a lot of data and want to sort it what would you do
1) Split the file into chunks of MAX_MEM size. 2) Sort each chunk in memory and store it as a separate file. 3) Open all chunks as streams of elements. 4) Merge all streams by selecting the lowest element at each step. 5) Delete the chunks.
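The steps above can be sketched with the standard library. This is only an illustration under some assumptions: the function name external_sort is made up, max_chunk (a line count) stands in for MAX_MEM, the input lines are newline-terminated strings, and the merged result is collected into a list purely to keep the sketch short (a real version would stream it to an output file).

```python
import heapq
import os
import tempfile


def external_sort(lines, max_chunk=3):
    """Sort newline-terminated strings, holding at most max_chunk in memory while chunking."""
    chunk_paths = []
    chunk = []

    def flush():
        # Sort the in-memory chunk and store it as a separate temporary file.
        chunk.sort()
        fd, path = tempfile.mkstemp(text=True)
        with os.fdopen(fd, "w") as f:
            f.writelines(chunk)
        chunk_paths.append(path)
        del chunk[:]

    for line in lines:
        chunk.append(line)
        if len(chunk) >= max_chunk:
            flush()
    if chunk:
        flush()

    # Open all chunks as streams and merge by taking the lowest element each step.
    files = [open(p) for p in chunk_paths]
    try:
        result = list(heapq.merge(*files))
    finally:
        for f in files:
            f.close()
        for p in chunk_paths:
            os.remove(p)  # delete the chunks
    return result
```

heapq.merge reads its inputs lazily, so only one pending line per chunk file is held during the merge itself.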
Concurrency
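Concurrency is the decomposition of a program into units that can be executed out of order or in partial order without affecting the final outcome. This allows for parallel execution of the concurrent units, which can significantly improve overall speed of the execution in multi-processor and multi-core systems.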
Thread vs process
Threads are used for small tasks, whereas processes are used for more 'heavyweight' tasks - basically the execution of applications. Another difference between a thread and a process is that threads within the same process share the same address space, whereas different processes do not.
Check whether a number is palindrome
Yes
Detect cycle in a linkedlist
Yes
Find the kth largest element in an array
Yes
Finding an element in a rotated sorted array
Yes
Given a binary search tree and a node in it, find the in-order successor of that node in the BST. Note: If the given node has no in-order successor in the tree, return...
Yes
Longest palindromic substring
Yes
Merge k sorted arrays
Yes
Merge k sorted lists into one
Yes
Print all the leaves of a binary tree
Yes
Remove duplicates from an array
Yes
Reverse an input string
Yes
Reverse string
Yes
Two numbers are each represented by a linked list, from most significant digit to least significant. Add them together and generate a new linked list.
Yes
Wildcard matching
Yes
anagrams
Yes
design LFU cache
Yes
finding the median of an unsorted array
Yes
fizz buzz
Yes
postfix evaluation
Yes
queues with stacks
Yes
reverse linked list
Yes
reverse words in string
Yes
trees - max sum in path
Yes
Permute a list of numbers, unique permutations
Yes
Reverse characters of each word in a sentence
Yes, reverse words in a string III
Using an array of singular animal names, create an RDD that contains a list of animal names singular, and another RDD that contains a one-dimensional list of animal names both singular and plural
animals = ['cat', 'dog', 'bird', 'rat']
animalsRDD = sc.parallelize(animals, 4)
singularAndPluralAnimalsRDD = animalsRDD.flatMap(lambda x: (x, x + 's'))
Using an array of singular animal names, create an RDD that contains a list of animal names singular, and another pair RDD that contains a two-dimensional list of animal names both singular and plural
animals = ['cat', 'dog', 'bird', 'rat']
animalsRDD = sc.parallelize(animals, 4)
singularAndPluralAnimalsRDD = animalsRDD.map(lambda x: (x, x + 's'))
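To see the difference in shape between the map and flatMap answers without a Spark context, plain Python can mimic the two results. This is only an emulation with ordinary lists, not the Spark API; the variable names pairs and flat are illustrative.

```python
from itertools import chain

animals = ['cat', 'dog', 'bird', 'rat']

# Shape produced by map: one (singular, plural) tuple per input element.
pairs = [(x, x + 's') for x in animals]

# Shape produced by flatMap: the same tuples flattened into one list.
flat = list(chain.from_iterable(pairs))
```

Here pairs is two-dimensional (a list of tuples), while flat is a one-dimensional list that interleaves singular and plural names.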
Create a range of numbers from 1 .. 10000 using Python
data = xrange(1, 10001)
Define a Python function to return a given value - 1, then use the function to transform xrangeRDD into subRDD using the function
def sub(value):
    return value - 1
subRDD = xrangeRDD.map(sub)
In a reusable fashion, create a new RDD from xrangeRDD that contains only elements less than 10
def ten(value):
    return value < 10
filteredRDD = xrangeRDD.filter(ten)
In a non reusable fashion, create a new RDD from xrangeRDD that contains only elements less than 10
filteredRDD = xrangeRDD.filter(lambda x: x < 10)
Find lowest unique minimum
http://stackoverflow.com/questions/11944737/finding-the-minimum-unique-number-in-an-array
To count all log file items with the same timestamp and output the result sorted according to time
http://stackoverflow.com/questions/7347054/sort-logfile-by-timestamp-on-linux-command-line
Reverse a BST into a Binary tree and linked list
http://www.geeksforgeeks.org/convert-bst-to-a-binary-tree/
Find a target value in a binary tree recursively and iteratively
http://www.geeksforgeeks.org/find-closest-element-binary-search-tree/
Find path in a directed graph
http://www.geeksforgeeks.org/find-if-there-is-a-path-between-two-vertices-in-a-given-graph/
How can you find the greatest possible product of multiplying three numbers in a list
http://www.geeksforgeeks.org/find-maximum-product-of-a-triplet-in-array/
Find first non-repeated character
http://www.geeksforgeeks.org/given-a-string-find-its-first-non-repeating-character/
Convert a BST into Doubly Linked List
http://www.geeksforgeeks.org/in-place-convert-a-given-binary-tree-to-doubly-linked-list/
Given a line, print pascals triangle
http://www.geeksforgeeks.org/pascal-triangle/
Reverse a stack
http://www.geeksforgeeks.org/reverse-a-stack-using-recursion/
Write an algorithm to calculate x to the power of n in less than O(n) time
http://www.geeksforgeeks.org/write-a-c-program-to-calculate-powxn/
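One common way to beat O(n) is exponentiation by squaring, which needs O(log n) multiplications. A minimal sketch, assuming n is an integer; the function name power is illustrative and is not taken from the linked page.

```python
def power(x, n):
    """Compute x**n for an integer n using O(log n) multiplications."""
    if n < 0:
        return 1.0 / power(x, -n)
    result = 1
    while n > 0:
        if n & 1:      # lowest bit set: this power of x contributes
            result *= x
        x *= x         # square the base
        n >>= 1        # move to the next bit of the exponent
    return result
```

The loop walks the binary digits of n, so the number of iterations is the bit length of n rather than n itself.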
Binary Tree Question: Compare two binary tree and return if they are the same or not.
http://www.geeksforgeeks.org/write-c-code-to-determine-if-two-trees-are-identical/
design concurrent load balancer,
http://www.javaperformancetuning.com/tips/loadbalance.shtml
Find the longest common substring of two strings
https://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Longest_common_substring#Python_2
Print the items in wordsRDD by partition, including the index, joining items in each partition by a comma
itemsByPartRDD = wordsRDD.mapPartitionsWithIndex(lambda index, iterator: [(index, ','.join(iterator))])
print itemsByPartRDD.collect()
Print the items in wordsRDD by partition, joining items in each partition by a comma
itemsRDD = wordsRDD.mapPartitions(lambda iterator: [','.join(iterator)])
print itemsRDD.collect()
Show the current location of filteredRDD (memory, disk, etc.)
print filteredRDD.getStorageLevel()
Print all elements (in all partitions) of subRDD
print subRDD.collect()
Print the global count of elements across all partitions of subRDD
print subRDD.count()
Print the count of each element in xrangeRDD (countByValue returns a dictionary, not an RDD)
print xrangeRDD.countByValue()
Print the first element of xrangeRDD
print xrangeRDD.first()
Print the sum of all elements in xrangeRDD using Python's add function
from operator import add
print xrangeRDD.reduce(add)
Print the sum of all elements in xrangeRDD using an inline function
print xrangeRDD.reduce(lambda a,b: a + b)
Print the first three elements of xrangeRDD
print xrangeRDD.take(3)
Print the smallest three elements in xrangeRDD
print xrangeRDD.takeOrdered(3)
Print ten random elements in xrangeRDD, duplicates prohibited
print xrangeRDD.takeSample(withReplacement=False, num=10)
Print ten random elements in xrangeRDD, duplicates are OK
print xrangeRDD.takeSample(withReplacement=True, num=10)
Print eight random elements in xrangeRDD, duplicates OK, deterministically (multiple invocations produce same result)
print xrangeRDD.takeSample(withReplacement=True, num=8, seed=500)
Print the largest three elements in xrangeRDD
print xrangeRDD.top(3)
Check if the string is balanced, e.g. {{}} should return true
valid parentheses
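A minimal stack-based sketch of the balanced-brackets check. The function name is_balanced is illustrative, and the sketch assumes that characters other than brackets are simply ignored.

```python
def is_balanced(s):
    """Return True if every bracket in s is closed in the right order."""
    pairs = {')': '(', ']': '[', '}': '{'}
    openers = set(pairs.values())
    stack = []
    for ch in s:
        if ch in openers:
            stack.append(ch)
        elif ch in pairs:
            # A closer must match the most recently opened bracket.
            if not stack or stack.pop() != pairs[ch]:
                return False
    return not stack  # leftover openers mean the string is unbalanced
```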
Parse a string with spaces
word split, Yes
Place wordsRDD in memory to improve response times involving this RDD
wordsRDD.cache()
Remove wordsRDD from memory
wordsRDD.unpersist()
Create an RDD with 8 partitions given the variable data
xrangeRDD = sc.parallelize(data,8)
Show the number of partitions for xrangeRDD
xrangeRDD.getNumPartitions()
Set the name of xrangeRDD to MyRDD
xrangeRDD.setName('MyRDD')
Print the lineage (set of transformations) of xrangeRDD
print xrangeRDD.toDebugString()