Data Structures

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

What is a Permutation?

A permutation, also called an "arrangement number" or "order", is a rearrangement of the elements of an ordered list S into a one-to-one correspondence with S itself. Below are the permutations of string ABC. ABC ACB BAC BCA CBA CAB Or for example the first three people in a running race: you can't be first and second. Number of combinations: n * (n-1) * (n -2) * ... * 1 = n! When repetition is allowed we have permutations with repetitions. Number of combinations: n * n * n ... (r times) = n^r

What is a Polynomial Rolling Hash?

An ideal hash function for strings should obviously depend both on the multiset of the symbols present in the key and on the order of the symbols. The most common family of such hash functions treats the symbols of a string as coefficients of a polynomial with an integer variable p and computes its value modulo an integer constant M: The Rabin-Karp string search algorithm is often explained using a very simple rolling hash function that only uses multiplications and additions - polynomial rolling hash: H(s0, s1, ..., sk) = s0 * pk-1 + s1 * pk-2 + ... + sk * p0 where p is a constant, and (s1, ... , sk) are the input characters.

What is Bellman-Ford's Algorithm?

The Bellman-Ford algorithm is an algorithm that computes shortest paths from a single source vertex to all of the other vertices in a weighted digraph. It is slower than Dijkstra's algorithm for the same problem, but more versatile, as it is capable of handling graphs in which some of the edge weights are negative numbers.

What is a Fisher-Yates shuffle?

The Fisher-Yates shuffle is an algorithm for generating a random permutation of a finite sequence—in plain terms, the algorithm shuffles the sequence. The algorithm effectively puts all the elements into a hat; it continually determines the next element by randomly drawing an element from the hat until no elements remain. The algorithm produces an unbiased permutation: every permutation is equally likely. The modern version of the algorithm is efficient: it takes time proportional to the number of items being shuffled and shuffles them in place.

What is the Longest Common Subsequence Problem?

The longest common subsequence (LCS) problem is the problem of finding the longest subsequence common to all sequences in a set of sequences (often just two sequences). It differs from the longest common substring problem: unlike substrings, subsequences are not required to occupy consecutive positions within the original sequences. Example: LCS for input Sequences ABCDGH and AEDFHR is ADH of length 3. LCS for input Sequences AGGTAB and GXTXAYB is GTAB of length 4.

What is the Maximum subarray problem?

The maximum subarray problem is the task of finding the contiguous subarray within a one-dimensional array, a[1...n], of numbers which has the largest sum. Example The list usually contains both positive and negative numbers along with 0. For example, for the array of values −2, 1, −3, 4, −1, 2, 1, −5, 4 the contiguous subarray with the largest sum is 4, −1, 2, 1, with sum 6.

What is a Radian?

The radian (symbol rad) is the unit for measuring angles, and is the standard unit of angular measure used in many areas of mathematics. The length of an arc of a unit circle is numerically equal to the measurement in radians of the angle that it subtends; one radian is just under 57.3 degrees. An arc of a circle with the same length as the radius of that circle subtends an angle of 1 radian. The circumference subtends an angle of 2π radians.

What is the Shortest Common Subsequence?

The shortest common supersequence (SCS) of two sequences X and Y is the shortest sequence which has X and Y as subsequences. In other words assume we're given two strings str1 and str2, find the shortest string that has both str1 and str2 as subsequences. This is a problem closely related to the longest common subsequence problem. Example Input: str1 = "geek", str2 = "eke" Output: "geeke" Input: str1 = "AGGTAB", str2 = "GXTXAYB" Output: "AGXGTXAYB"

What is Liu Hui's π Algorithm?

Multiply one side of a hexagon by the radius (of its circumcircle), then multiply this by three, to yield the area of a dodecagon; if we cut a hexagon into a dodecagon, multiply its side by its radius, then again multiply by six, we get the area of a 24-gon; the finer we cut, the smaller the loss with respect to the area of circle, thus with further cut after cut, the area of the resulting polygon will coincide and become one with the circle; there will be no loss

What is a Power Set?

Power set of a set S is the set of all of the subsets of S, including the empty set and S itself. Power set of set S is denoted as P(S).

What is a Primality Test?

A prime number (or a prime) is a natural number greater than 1 that cannot be formed by multiplying two smaller natural numbers. A natural number greater than 1 that is not prime is called a composite number. A natural number greater than 1 that is not prime is called a composite number. A primality test is an algorithm for determining whether an input number is prime. Among other fields of mathematics, it is used for cryptography. Unlike integer factorization, primality tests do not generally give prime factors, only stating whether the input number is prime or not. Factorization is thought to be a computationally difficult problem, whereas primality testing is comparatively easy (its running time is polynomial in the size of the input).

What is the Unique Paths Problem?

A robot is located at the top-left corner of a m x n grid (marked 'Start' in the diagram below). The robot can only move either down or right at any point in time. The robot is trying to reach the bottom-right corner of the grid (marked 'Finish' in the diagram below). How many possible unique paths are there?

What is a Bubble Sort?

Bubble sort, sometimes referred to as sinking sort, is a simple sorting algorithm that repeatedly steps through the list to be sorted, compares each pair of adjacent items and swaps them if they are in the wrong order (ascending or descending arrangement). The pass through the list is repeated until no swaps are needed, which indicates that the list is sort

What is a Depth-First Search?

Depth-first search (DFS) is an algorithm for traversing or searching tree or graph data structures. One starts at the root (selecting some arbitrary node as the root in the case of a graph) and explores as far as possible along each branch before backtracking.

What is the Combination Sum Problem?

Given a set of candidate numbers (candidates) (without duplicates) and a target number (target), find all unique combinations in candidates where the candidate numbers sums to target. The same repeated number may be chosen from candidates unlimited number of times. Note: All numbers (including target) will be positive integers. The solution set must not contain duplicate combinations.

What is a Hamiltonian path?

Hamiltonian path (or traceable path) is a path in an undirected or directed graph that visits each vertex exactly once. A Hamiltonian cycle (or Hamiltonian circuit) is a Hamiltonian path that is a cycle. Determining whether such paths and cycles exist in graphs is the Hamiltonian path problem.

What is the Knapsack Problem?

Knapsack Problem The knapsack problem or rucksack problem is a problem in combinatorial optimization: Given a set of items, each with a weight and a value, determine the number of each item to include in a collection so that the total weight is less than or equal to a given limit and the total value is as large as possible. It derives its name from the problem faced by someone who is constrained by a fixed-size knapsack and must fill it with the most valuable items. Example of a one-dimensional (constraint) knapsack problem: which boxes should be chosen to maximize the amount of money while still keeping the overall weight under or equal to 15 kg?

What is a Combination with repetitions?

Repetition is Allowed: such as coins in your pocket (5,5,5,10,10) Or let us say there are five flavours of ice cream: banana, chocolate, lemon, strawberry and vanilla. We can have three scoops. How many variations will there be? Let's use letters for the flavours: {b, c, l, s, v}. Example selections include: {c, c, c} (3 scoops of chocolate) {b, l, v} (one each of banana, lemon and vanilla) {b, v, v} (one of banana, two of vanilla) Number of combinations Formula Where n is the number of things to choose from, and we choose r of them. Repetition allowed, order doesn't matter.

What is a Selection Sort?

Selection sort is a sorting algorithm, specifically an in-place comparison sort. It has O(n2) time complexity, making it inefficient on large lists, and generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and it has performance advantages over more complicated algorithms in certain situations, particularly where auxiliary memory is limited.

What is the Longest Common Substring Problem?

The longest common substring problem is to find the longest string (or strings) that is a substring (or are substrings) of two or more strings.

What is the Longest Increasing Subsequence?

The longest increasing subsequence problem is to find a subsequence of a given sequence in which the subsequence's elements are in sorted order, lowest to highest, and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous, or unique. Example In the first 16 terms of the binary Van der Corput sequence 0, 8, 4, 12, 2, 10, 6, 14, 1, 9, 5, 13, 3, 11, 7, 15 a longest increasing subsequence is 0, 2, 6, 9, 11, 15. This subsequence has length six; the input sequence has no seven-member increasing subsequences. The longest increasing subsequence in this example is not unique: for instance, 0, 4, 6, 9, 11, 15 or 0, 2, 6, 9, 13, 15 or 0, 4, 6, 9, 13, 15 are other increasing subsequences of equal length in the same input sequence.

What is the Square Matrix In-Place Rotation problem?

You are given an n x n 2D matrix (representing an image). Rotate the matrix by 90 degrees (clockwise). Note You have to rotate the image in-place, which means you have to modify the input 2D matrix directly. DO NOT allocate another 2D matrix and do the rotation.

What is a Linked List?

n computer science, a linked list is a linear collection of data elements, in which linear order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a group of nodes which together represent a sequence. Under the simplest form, each node is composed of data and a reference (in other words, a link) to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing efficient insertion or removal from arbitrary element references. A drawback of linked lists is that access time is linear (and difficult to pipeline). Faster access, such as random access, is not feasible. Arrays have better cache locality as compared to linked lists.

What is the Hamming Distance?

the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols are different. In other words, it measures the minimum number of substitutions required to change one string into the other, or the minimum number of errors that could have transformed one string into the other. In a more general context, the Hamming distance is one of several string metrics for measuring the edit distance between two sequences.

Fenwick Tree / Binary Indexed Tree

A Fenwick tree or binary indexed tree is a data structure that can efficiently update elements and calculate prefix sums in a table of numbers. When compared with a flat array of numbers, the Fenwick tree achieves a much better balance between two operations: element update and prefix sum calculation. In a flat array of n numbers, you can either store the elements, or the prefix sums. In the first case, computing prefix sums requires linear time; in the second case, updating the array elements requires linear time (in both cases, the other operation can be performed in constant time). Fenwick trees allow both operations to be performed in O(log n) time. This is achieved by representing the numbers as a tree, where the value of each node is the sum of the numbers in that subtree. The tree structure allows operations to be performed using only O(log n) node accesses.

What is a Bloom Filter?

A bloom filter is a space-efficient probabilistic data structure designed to test whether an element is present in a set. It is designed to be blazingly fast and use minimal memory at the cost of potential false positives. False positive matches are possible, but false negatives are not - in other words, a query returns either "possibly in set" or "definitely not in set". Bloom proposed the technique for applications where the amount of source data would require an impractically large amount of memory if "conventional" error-free hashing techniques were applied.

What is a Complex Number?

A complex number is a number that can be expressed in the form a + b * i, where a and b are real numbers, and i is a solution of the equation x^2 = −1. Because no real number satisfies this equation, i is called an imaginary number. For the complex number a + b * i, a is called the real part, and b is called the imaginary part. A Complex Number is a combination of a Real Number and an Imaginary Number: Geometrically, complex numbers extend the concept of the one-dimensional number line to the two-dimensional complex plane by using the horizontal axis for the real part and the vertical axis for the imaginary part. The complex number a + b * i can be identified with the point (a, b) in the complex plane. A complex number whose real part is zero is said to be purely imaginary; the points for these numbers lie on the vertical axis of the complex plane. A complex number whose imaginary part is zero can be viewed as a real number; its point lies on the horizontal axis of the complex plane. A complex number can be visually represented as a pair of numbers (a, b) forming a vector on a diagram called an Argand diagram, representing the complex plane. Re is the real axis, Im is the imaginary axis, and i satisfies i^2 = −1. The radian (symbol rad) is the unit for measuring angles, and is the standard unit of angular measure used in many areas of mathematics. The length of an arc of a unit circle is numerically equal to the measurement in radians of the angle that it subtends; one radian is just under 57.3 degrees. An arc of a circle with the same length as the radius of that circle subtends an angle of 1 radian. The circumference subtends an angle of 2π radians.

What is a Strongly Connected Component?

A directed graph is called strongly connected if there is a path in each direction between each pair of vertices of the graph. In a directed graph G that may not itself be strongly connected, a pair of vertices u and v are said to be strongly connected to each other if there is a path in each direction between them.

What is the Knight's Tour problem?

A knight's tour is a sequence of moves of a knight on a chessboard such that the knight visits every square only once. If the knight ends on a square that is one knight's move from the beginning square (so that it could tour the board again immediately, following the same path), the tour is closed, otherwise it is open. The knight's tour problem is the mathematical problem of finding a knight's tour. Creating a program to find a knight's tour is a common problem given to computer science students. Variations of the knight's tour problem involve chessboards of different sizes than the usual 8×8, as well as irregular (non-rectangular) boards. The knight's tour problem is an instance of the more general Hamiltonian path problem in graph theory. The problem of finding a closed knight's tour is similarly an instance of the Hamiltonian cycle problem.

What is a Red-Black Tree?

A red-black tree is a kind of self-balancing binary search tree in computer science. Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions. Balance is preserved by painting each node of the tree with one of two colors in a way that satisfies certain properties, which collectively constrain how unbalanced the tree can become in the worst case. When the tree is modified, the new tree is subsequently rearranged and repainted to restore the coloring properties. The properties are designed in such a way that this rearranging and recoloring can be performed efficiently. The balancing of the tree is not perfect, but it is good enough to allow it to guarantee searching in O(log n) time, where n is the total number of elements in the tree. The insertion and deletion operations, along with the tree rearrangement and recoloring, are also performed in O(log n) time.

What are Articulation Points (or Cut Vertices)?

A vertex in an undirected connected graph is an articulation point (or cut vertex) if removing it (and edges through it) disconnects the graph. Articulation points represent vulnerabilities in a connected network - single points whose failure would split the network into 2 or more disconnected components. They are useful for designing reliable networks. For a disconnected undirected graph, an articulation point is a vertex removing which increases number of connected components.

What is a Breath-First Search?

Breadth-first search (BFS) is an algorithm for traversing or searching tree or graph data structures. It starts at the tree root (or some arbitrary node of a graph, sometimes referred to as a 'search key') and explores the neighbor nodes first, before moving to the next level neighbors.

What is Dijkstra's Algorithm?

Dijkstra's algorithm is an algorithm for finding the shortest paths between nodes in a graph, which may represent, for example, road networks. The algorithm exists in many variants; Dijkstra's original variant found the shortest path between two nodes, but a more common variant fixes a single node as the "source" node and finds shortest paths from the source to all other nodes in the graph, producing a shortest-path tree.

What is a Disjoint Set?

Disjoint-set data structure (also called a union-find data structure or merge-find set) is a data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It provides near-constant-time operations (bounded by the inverse Ackermann function) to add new sets, to merge existing sets, and to determine whether elements are in the same set. In addition to many other uses (see the Applications section), disjoint-sets play a key role in Kruskal's algorithm for finding the minimum spanning tree of a graph.

What is the Rain Terraces (Trapping Rain Water) Problem?

Given an array of non-negative integers representing terraces in an elevation map where the width of each bar is 1, compute how much water it is able to trap after raining. Input: arr[] = [0, 1, 0, 2, 1, 0, 1, 3, 2, 1, 2, 1] Output: 6 Structure is like below: | | || | _|_||_|||||| Trap "1 unit" between first 1 and 2, "4 units" between first 2 and 3 and "1 unit" between second last 1 and last 2.

What is the Jump Game problem?

Given an array of non-negative integers, you are initially positioned at the first index of the array. Each element in the array represents your maximum jump length at that position. Determine if you are able to reach the last index. Example #1 Input: [2,3,1,1,4] Output: true Explanation: Jump 1 step from index 0 to 1, then 3 steps to the last index. Example #2 Input: [3,2,1,0,4] Output: false Explanation: You will always arrive at index 3 no matter what. Its maximum jump length is 0, which makes it impossible to reach the last index.

What is the Regular Expression Matching Problem?

Given an input string s and a pattern p, implement regular expression matching with support for . and *. . Matches any single character. * Matches zero or more of the preceding element. The matching should cover the entire input string (not partial). Note s could be empty and contains only lowercase letters a-z. p could be empty and contains only lowercase letters a-z, and characters like . or *.

What is an Insertion Sort?

Insertion sort is a simple sorting algorithm that builds the final sorted array (or list) one item at a time. It is much less efficient on large lists than more advanced algorithms such as quicksort, heapsort, or merge sort.

What is a Heap Sort?

Heapsort is a comparison-based sorting algorithm. Heapsort can be thought of as an improved selection sort: like that algorithm, it divides its input into a sorted and an unsorted region, and it iteratively shrinks the unsorted region by extracting the largest element and moving that to the sorted region. The improvement consists of the use of a heap data structure rather than a linear-time search to find the maximum.

What is a Least Common Multiple?

In arithmetic and number theory, the least common multiple, lowest common multiple, or smallest common multiple of two integers a and b, usually denoted by LCM(a, b), is the smallest positive integer that is divisible by both a and b. Since division of integers by zero is undefined, this definition has meaning only if a and b are both different from zero. However, some authors define lcm(a,0) as 0 for all a, which is the result of taking the lcm to be the least upper bound in the lattice of divisibility.

What is Prim's Algorithm?

In computer science, Prim's algorithm is a greedy algorithm that finds a minimum spanning tree for a weighted undirected graph. The algorithm operates by building this tree one vertex at a time, from an arbitrary starting vertex, at each step adding the cheapest possible connection from the tree to another vertex.

What is a Doubly Linked List?

In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked records called nodes. Each node contains two fields, called links, that are references to the previous and to the next node in the sequence of nodes. The beginning and ending nodes' previous and next links, respectively, point to some kind of terminator, typically a sentinel node or null, to facilitate traversal of the list. If there is only one sentinel node, then the list is circularly linked via the sentinel node. It can be conceptualized as two singly linked lists formed from the same data items, but in opposite sequential orders. The two node links allow traversal of the list in either direction. While adding or removing a node in a doubly linked list requires changing more links than the same operations on a singly linked list, the operations are simpler and potentially more efficient (for nodes other than first nodes) because there is no need to keep track of the previous node during traversal or no need to traverse the list to find the previous node, so that its link can be modified.

What is a Graph?

In computer science, a graph is an abstract data type that is meant to implement the undirected graph and directed graph concepts from mathematics, specifically the field of graph theory A graph data structure consists of a finite (and possibly mutable) set of vertices or nodes or points, together with a set of unordered pairs of these vertices for an undirected graph or a set of ordered pairs for a directed graph. These pairs are known as edges, arcs, or lines for an undirected graph and as arrows, directed edges, directed arcs, or directed lines for a directed graph. The vertices may be part of the graph structure, or may be external entities represented by integer indices or references.

What is a Heap?

In computer science, a heap is a specialized tree-based data structure that satisfies the heap property described below. In a min heap, if P is a parent node of C, then the key (the value) of P is less than or equal to the key of C. In a max heap, the key of P is greater than or equal to the key of C The node at the "top" of the heap with no parents is called the root node.

What is a Priority Queue?

In computer science, a priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority. If two elements have the same priority, they are served according to their order in the queue. While priority queues are often implemented with heaps, they are conceptually distinct from heaps. A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods such as an unordered array.

What is a Queue?

In computer science, a queue is a particular kind of abstract data type or collection in which the entities in the collection are kept in order and the principle (or only) operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front terminal position, known as dequeue. This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once a new element is added, all elements that were added before have to be removed before the new element can be removed. Often a peek or front operation is also entered, returning the value of the front element without dequeuing it. A queue is an example of a linear data structure, or more abstractly a sequential collection.

What is a Segment Tree?

In computer science, a segment tree also known as a statistic tree is a tree data structure used for storing information about intervals, or segments. It allows querying which of the stored segments contain a given point. It is, in principle, a static structure; that is, it's a structure that cannot be modified once it's built. A similar data structure is the interval tree. A segment tree is a binary tree. The root of the tree represents the whole array. The two children of the root represent the first and second halves of the array. Similarly, the children of each node corresponds to the two halves of the array corresponding to the node. We build the tree bottom up, with the value of each node being the "minimum" (or any other function) of its children's values. This will take O(n log n) time. The number of operations done is the height of the tree, which is O(log n). To do range queries, each node splits the query into two parts, one sub-query for each child. If a query contains the whole subarray of a node, we can use the precomputed value at the node. Using this optimisation, we can prove that only O(log n) minimum operations are done.

What is a Stack?

In computer science, a stack is an abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the most recently added element that was not yet removed. The order in which elements come off a stack gives rise to its alternative name, LIFO (last in, first out). Additionally, a peek operation may give access to the top without modifying the stack. The name "stack" for this type of structure comes from the analogy to a set of physical items stacked on top of each other, which makes it easy to take an item off the top of the stack, while getting to an item deeper in the stack may require taking off multiple other items first.

What is a Tree?

In computer science, a tree is a widely used abstract data type (ADT) — or data structure implementing this ADT—that simulates a hierarchical tree structure, with a root value and subtrees of children with a parent node, represented as a set of linked nodes. A tree data structure can be defined recursively (locally) as a collection of nodes (starting at a root node), where each node is a data structure consisting of a value, together with a list of references to nodes (the "children"), with the constraints that no reference is duplicated, and none points to the root.

What is a Trie?

In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is a kind of search tree—an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are not necessarily associated with every node. Rather, values tend only to be associated with leaves, and with some inner nodes that correspond to keys of interest. For the space-optimized presentation of prefix tree, see compact prefix tree.

What is an AVL tree?

In computer science, an AVL tree (named after inventors Adelson-Velsky and Landis) is a self-balancing binary search tree. It was the first such data structure to be invented. In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.

What is a Binary Search Tree

In computer science, binary search trees (BST), sometimes called ordered or sorted binary trees, are a particular type of container: data structures that store "items" (such as numbers, names etc.) in memory. They allow fast lookup, addition and removal of items, and can be used to implement either dynamic sets of items, or lookup tables that allow finding an item by its key (e.g., finding the phone number of a person by name). Binary search trees keep their keys in sorted order, so that lookup and other operations can use the principle of binary search: when looking for a key in a tree (or a place to insert a new key), they traverse the tree from root to leaf, making comparisons to keys stored in the nodes of the tree and deciding, on the basis of the comparison, to continue searching in the left or right subtrees. On average, this means that each comparison allows the operations to skip about half of the tree, so that each lookup, insertion or deletion takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an (unsorted) array, but slower than the corresponding operations on hash tables.

What is a Binary Search?

In computer science, binary search, also known as half-interval search, logarithmic search, or binary chop, is a search algorithm that finds the position of a target value within a sorted array. Binary search compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful. If the search ends with the remaining half being empty, the target is not in the array.

What is a Counting Sort?

In computer science, counting sort is an algorithm for sorting a collection of objects according to keys that are small integers; that is, it is an integer sorting algorithm. It operates by counting the number of objects that have each distinct key value, and using arithmetic on those counts to determine the positions of each key value in the output sequence. Its running time is linear in the number of items and the difference between the maximum and minimum key values, so it is only suitable for direct use in situations where the variation in keys is not significantly greater than the number of items. However, it is often used as a subroutine in another sorting algorithm, radix sort, that can handle larger keys more efficiently. Because counting sort uses key values as indexes into an array, it is not a comparison sort, and the Ω(n log n) lower bound for comparison sorting does not apply to it. Bucket sort may be used for many of the same tasks as counting sort, with a similar time analysis; however, compared to counting sort, bucket sort requires linked lists, dynamic arrays or a large amount of preallocated memory to hold the sets of items within each bucket, whereas counting sort instead stores a single number (the count of items) per bucket. Counting sorting works best when the range of numbers for each array element is very small.

What is a Linear Search?

In computer science, linear search or sequential search is a method for finding a target value within a list. It sequentially checks each element of the list for the target value until a match is found or until all the elements have been searched. Linear search runs in at worst linear time and makes at most n comparisons, where n is the length of the list.

What is a Merge Sort?

In computer science, merge sort (also commonly spelled mergesort) is an efficient, general-purpose, comparison-based sorting algorithm. Most implementations produce a stable sort, which means that the implementation preserves the input order of equal elements in the sorted output. Mergesort is a divide and conquer algorithm that was invented by John von Neumann in 1945. An example of merge sort. First divide the list into the smallest unit (1 element), then compare each element with the adjacent list to sort and merge the two adjacent lists. Finally all the elements are sorted and merged.

What is a Radix Sort?

In computer science, radix sort is a non-comparative integer sorting algorithm that sorts data with integer keys by grouping keys by the individual digits which share the same significant position and value. A positional notation is required, but because integers can represent strings of characters (e.g., names or dates) and specially formatted floating point numbers, radix sort is not limited to integers.

What is the Floyd-Warshall Algorithm?

In computer science, the Floyd-Warshall algorithm is an algorithm for finding shortest paths in a weighted graph with positive or negative edge weights (but with no negative cycles). A single execution of the algorithm will find the lengths (summed weights) of shortest paths between all pairs of vertices. Although it does not return details of the paths themselves, it is possible to reconstruct the paths with simple modifications to the algorithm.

What is a Hash Table?

In computing, a hash table (hash map) is a data structure which implements an associative array abstract data type, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions must be accommodated in some way.

What are Bridges in Graph?

In graph theory, a bridge, isthmus, cut-edge, or cut arc is an edge of a graph whose deletion increases its number of connected components. Equivalently, an edge is a bridge if and only if it is not contained in any cycle. A graph is said to be bridgeless or isthmus-free if it contains no bridges.

How do you Detect Cycle in Graphs?

In graph theory, a cycle is a path of edges and vertices wherein a vertex is reachable from itself. There are several different types of cycles, principally a closed walk and a simple cycle. A closed walk consists of a sequence of vertices starting and ending at the same vertex, with each two consecutive vertices in the sequence adjacent to each other in the graph. In a directed graph, each edge must be traversed by the walk consistently with its direction: the edge must be oriented from the earlier of two consecutive vertices to the later of the two vertices in the sequence. The choice of starting vertex is not important: traversing the same cyclic sequence of edges from different starting vertices produces the same closed walk. A simple cycle may be defined either as a closed walk with no repetitions of vertices and edges allowed, other than the repetition of the starting and ending vertex, or as the set of edges in such a walk. The two definitions are equivalent in directed graphs, where simple cycles are also called directed cycles: the cyclic sequence of vertices and edges in a walk is completely determined by the set of edges that it uses. In undirected graphs the set of edges of a cycle can be traversed by a walk in either of two directions, giving two possible directed cycles for every undirected cycle. A circuit can be a closed walk allowing repetitions of vertices but not edges; however, it can also be a simple cycle, so explicit definition is recommended when it is used.

What is a Eularian path?

In graph theory, an Eulerian trail (or Eulerian path) is a trail in a finite graph which visits every edge exactly once. Similarly, an Eulerian circuit or Eulerian cycle is an Eulerian trail which starts and ends on the same vertex. Euler proved that a necessary condition for the existence of Eulerian circuits is that all vertices in the graph have an even degree, and stated that connected graphs with all vertices of even degree have an Eulerian circuit.

What is Pascals Triangle?

In mathematics, Pascal's triangle is a triangular array of the binomial coefficients. The rows of Pascal's triangle are conventionally enumerated starting with row n = 0 at the top (the 0th row). The entries in each row are numbered from the left beginning with k = 0 and are usually staggered relative to the numbers in the adjacent rows. The triangle may be constructed in the following manner: In row 0 (the topmost row), there is a unique nonzero entry 1. Each entry of each subsequent row is constructed by adding the number above and to the left with the number above and to the right, treating blank entries as 0. For example, the initial number in the first (or any other) row is 1 (the sum of 0 and 1), whereas the numbers 1 and 3 in the third row are added to produce the number 4 in the fourth row.

What is a Euclidean Algorithm?

In mathematics, the Euclidean algorithm, or Euclid's algorithm, is an efficient method for computing the greatest common divisor (GCD) of two numbers, the largest number that divides both of them without leaving a remainder. The Euclidean algorithm is based on the principle that the greatest common divisor of two numbers does not change if the larger number is replaced by its difference with the smaller number. For example, 21 is the GCD of 252 and 105 (as 252 = 21 × 12 and 105 = 21 × 5), and the same number 21 is also the GCD of 105 and 252 − 105 = 147. Since this replacement reduces the larger of the two numbers, repeating this process gives successively smaller pairs of numbers until the two numbers become equal. When that occurs, they are the GCD of the original two numbers. By reversing the steps, the GCD can be expressed as a sum of the two original numbers each multiplied by a positive or negative integer, e.g., 21 = 5 × 105 + (−2) × 252. The fact that the GCD can always be expressed in this way is known as Bézout's identity.

What is a Fibonacci Number?

In mathematics, the Fibonacci numbers are the numbers in the following integer sequence, called the Fibonacci sequence, and characterized by the fact that every number after the first two is the sum of the two preceding ones.

What is a Factorial?

In mathematics, the factorial of a non-negative integer n, denoted by n!, is the product of all positive integers less than or equal to n. For example:

What is an Integer Partition?

In number theory and combinatorics, a partition of a positive integer n, also called an integer partition, is a way of writing n as a sum of positive integers. Two sums that differ only in the order of their summands are considered the same partition. For example, 4 can be partitioned in five distinct ways: 4 3 + 1 2 + 2 2 + 1 + 1 1 + 1 + 1 + 1 The order-dependent composition 1 + 3 is the same partition as 3 + 1, while the two distinct compositions 1 + 2 + 1 and 1 + 1 + 2 represent the same partition 2 + 1 + 1. Young diagrams associated to the partitions of the positive integers 1 through 8. They are arranged so that images under the reflection about the main diagonal of the square are conjugate partitions.

What is a Cartesian Product?

In set theory a Cartesian product is a mathematical operation that returns a set (or product set or simply product) from multiple sets. That is, for sets A and B, the Cartesian product A × B is the set of all ordered pairs (a, b) where a ∈ A and b ∈ B.

What is Topological Sorting?

In the field of computer science, a topological sort or topological ordering of a directed graph is a linear ordering of its vertices such that for every directed edge uv from vertex u to vertex v, u comes before v in the ordering. For instance, the vertices of the graph may represent tasks to be performed, and the edges may represent constraints that one task must be performed before another; in this application, a topological ordering is just a valid sequence for the tasks. A topological ordering is possible if and only if the graph has no directed cycles, that is, if it is a directed acyclic graph (DAG). Any DAG has at least one topological ordering, and algorithms are known for constructing a topological ordering of any DAG in linear time.

What is a Interpolation Search?

Interpolation search is an algorithm for searching for a key in an array that has been ordered by numerical values assigned to the keys (key values). For example we have a sorted array of n uniformly distributed values arr[], and we need to write a function to search for a particular element x in the array. Linear Search finds the element in O(n) time, Jump Search takes O(√ n) time and Binary Search take O(Log n) time. The Interpolation Search is an improvement over Binary Search for instances, where the values in a sorted array are uniformly distributed. Binary Search always goes to the middle element to check. On the other hand, interpolation search may go to different locations according to the value of the key being searched. For example, if the value of the key is closer to the last element, interpolation search is likely to start search toward the end side.

What is Kruskal's Algorithm?

Kruskal's algorithm is a minimum-spanning-tree algorithm which finds an edge of the least possible weight that connects any two trees in the forest. It is a greedy algorithm in graph theory as it finds a minimum spanning tree for a connected weighted graph adding increasing cost arcs at each step. This means it finds a subset of the edges that forms a tree that includes every vertex, where the total weight of all the edges in the tree is minimized. If the graph is not connected, then it finds a minimum spanning forest (a minimum spanning tree for each connected component).

What is a Jump Search?

Like Binary Search, Jump Search (or Block Search) is a searching algorithm for sorted arrays. The basic idea is to check fewer elements (than linear search) by jumping ahead by fixed steps or skipping some elements in place of searching all elements. For example, suppose we have an array arr[] of size n and block (to be jumped) of size m. Then we search at the indexes arr[0], arr[m], arr[2 * m], ..., arr[k * m] and so on. Once we find the interval arr[k * m] < x < arr[(k+1) * m], we perform a linear search operation from the index k * m to find the element x. What is the optimal block size to be skipped? In the worst case, we have to do n/m jumps and if the last checked value is greater than the element to be searched for, we perform m - 1 comparisons more for linear search. Therefore the total number of comparisons in the worst case will be ((n/m) + m - 1). The value of the function ((n/m) + m - 1) will be minimum when m = √n. Therefore, the best step size is m = √n.

What is a Quicksort?

Quicksort is a divide and conquer algorithm. Quicksort first divides a large array into two smaller sub-arrays: the low elements and the high elements. Quicksort can then recursively sort the sub-arrays The steps are: Pick an element, called a pivot, from the array. Partitioning: reorder the array so that all elements with values less than the pivot come before the pivot, while all elements with values greater than the pivot come after it (equal values can go either way). After this partitioning, the pivot is in its final position. This is called the partition operation. Recursively apply the above steps to the sub-array of elements with smaller values and separately to the sub-array of elements with greater values.

What is a Shellsort?

Shellsort, also known as Shell sort or Shell's method, is an in-place comparison sort. It can be seen as either a generalization of sorting by exchange (bubble sort) or sorting by insertion (insertion sort). The method starts by sorting pairs of elements far apart from each other, then progressively reducing the gap between elements to be compared. Starting with far apart elements, it can move some out-of-place elements into position faster than a simple nearest neighbor exchange.

What is a Fourier Transformation?

The Fourier Transform (FT) decomposes a function of time (a signal) into the frequencies that make it up, in a way similar to how a musical chord can be expressed as the frequencies (or pitches) of its constituent notes. The Discrete Fourier Transform (DFT) converts a finite sequence of equally-spaced samples of a function into a same-length sequence of equally-spaced samples of the discrete-time Fourier transform (DTFT), which is a complex-valued function of frequency. The interval at which the DTFT is sampled is the reciprocal of the duration of the input sequence. An inverse DFT is a Fourier series, using the DTFT samples as coefficients of complex sinusoids at the corresponding DTFT frequencies. It has the same sample-values as the original input sequence. The DFT is therefore said to be a frequency domain representation of the original input sequence. If the original sequence spans all the non-zero values of a function, its DTFT is continuous (and periodic), and the DFT provides discrete samples of one cycle. If the original sequence is one cycle of a periodic function, the DFT provides all the non-zero values of one DTFT cycle.

What is the KMP algorithm?

The Knuth-Morris-Pratt string searching algorithm (or KMP algorithm) searches for occurrences of a "word" W within a main "text string" T by employing the observation that when a mismatch occurs, the word itself embodies sufficient information to determine where the next match could begin, thus bypassing re-examination of previously matched characters.

What is the Levenshtein Distance?

The Levenshtein distance is a string metric for measuring the difference between two sequences. Informally, the Levenshtein distance between two words is the minimum number of single-character edits (insertions, deletions or substitutions) required to change one word into the other.

What is the Rabin Karp Algorithm?

The Rabin-Karp algorithm seeks to speed up the testing of equality of the pattern to the substrings in the text by using a hash function. A hash function is a function which converts every string into a numeric value, called its hash value; for example, we might have hash('hello') = 5. The algorithm exploits the fact that if two strings are equal, their hash values are also equal. Thus, string matching is reduced (almost) to computing the hash value of the search pattern and then looking for substrings of the input string with that hash value. However, there are two problems with this approach. First, because there are so many different strings and so few hash values, some differing strings will have the same hash value. If the hash values match, the pattern and the substring may not match; consequently, the potential match of search pattern and the substring must be confirmed by comparing them; that comparison can take a long time for long substrings. Luckily, a good hash function on reasonable strings usually does not have many collisions, so the expected search time will be acceptable.

What is the Sieve of Eratosthenes

The Sieve of Eratosthenes is an algorithm for finding all prime numbers up to some limit n. It is attributed to Eratosthenes of Cyrene, an ancient Greek mathematician. Create a boolean array of n + 1 positions (to represent the numbers 0 through n) Set positions 0 and 1 to false, and the rest to true Start at position p = 2 (the first prime number) Mark as false all the multiples of p (that is, positions 2 * p, 3 * p, 4 * p... until you reach the end of the array) Find the first position greater than p that is true in the array. If there is no such position, stop. Otherwise, let p equal this new number (which is the next prime), and repeat from step 4

What is the Tower of Hanoi problem?

The Tower of Hanoi (also called the Tower of Brahma or Lucas' Tower and sometimes pluralized) is a mathematical game or puzzle. It consists of three rods and a number of disks of different sizes, which can slide onto any rod. The puzzle starts with the disks in a neat stack in ascending order of size on one rod, the smallest at the top, thus making a conical shape. The objective of the puzzle is to move the entire stack to another rod, obeying the following simple rules: Only one disk can be moved at a time. Each move consists of taking the upper disk from one of the stacks and placing it on top of another stack or on an empty rod. No disk may be placed on top of a smaller disk.

What is the Z Algorithm?

The Z-algorithm finds occurrences of a "word" W within a main "text string" T in linear time O(|W| + |T|). Given a string S of length n, the algorithm produces an array, Z where Z[i] represents the longest substring starting from S[i] which is also a prefix of S. Finding Z for the string obtained by concatenating the word, W with a nonce character, say $ followed by the text, T, helps with pattern matching, for if there is some index i such that Z[i] equals the pattern length, then the pattern must be present at that point. While the Z array can be computed with two nested loops in O(|W| * |T|) time, the following strategy shows how to obtain it in linear time, based on the idea that as we iterate over the letters in the string (index i from 1 to n - 1), we maintain an interval [L, R] which is the interval with maximum R such that 1 ≤ L ≤ i ≤ R and S[L...R] is a prefix that is also a substring (if no such interval exists, just let L = R = - 1). For i = 1, we can simply compute L and R by comparing S[0...] to S[1...].

What is the N-Queens Problem?

The eight queens puzzle is the problem of placing eight chess queens on an 8×8 chessboard so that no two queens threaten each other. Thus, a solution requires that no two queens share the same row, column, or diagonal. The eight queens puzzle is an example of the more general n queens problem of placing n non-attacking queens on an n×n chessboard, for which solutions exist for all natural numbers n with the exception of n=2 and n=3.

What is the Traveling Salesmen Problem?

The travelling salesman problem (TSP) asks the following question: "Given a list of cities and the distances between each pair of cities, what is the shortest possible route that visits each city and returns to the origin city?" TSP can be modelled as an undirected weighted graph, such that cities are the graph's vertices, paths are the graph's edges, and a path's distance is the edge's weight. It is a minimization problem starting and finishing at a specified vertex after having visited each other vertex exactly once. Often, the model is a complete graph (i.e. each pair of vertices is connected by an edge). If no path exists between two cities, adding an arbitrarily long edge will complete the graph without affecting the optimal tour.

What is the Recursive Staircase Problem?

There are n stairs, a person standing at the bottom wants to reach the top. The person can climb either 1 or 2 stairs at a time. Count the number of ways, the person can reach the top.

What is a Combination without repetitions?

This is how lotteries work. The numbers are drawn one at a time, and if we have the lucky numbers (no matter what order) we win! No Repetition: such as lottery numbers (2,14,15,27,30,33) Number of combinations Formula where n is the number of things to choose from, and we choose r of them, no repetition, order doesn't matter. It is often called "n choose r" (such as "16 choose 3"). And is also known as the Binomial Coefficient.

What is the difference between a Permutation and Combination?

When the order doesn't matter, it is a Combination. "My fruit salad is a combination of apples, grapes and bananas" We don't care what order the fruits are in, they could also be "bananas, grapes and apples" or "grapes, apples and bananas", its the same fruit salad. When the order does matter it is a Permutation. "The combination to the safe is 472". We do care about the order. 724 won't work, nor will 247. It has to be exactly 4-7-2.


Ensembles d'études connexes

Chapter 10 Bony Thorax, Sternum, Ribs

View Set