Chapter 1--Introduction to Data Structures & Algorithms
Bag (ADT)
Items are not ordered. Duplicate items are allowed.
Set (ADT)
Items are not ordered. Duplicate items are not allowed.
Priority queue (ADT)
Items are ordered based on items' priority. Duplicate items are allowed.
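The three ADTs above can be tried out with Python's standard library. This is only a sketch: using `Counter` as a bag and `heapq` as a priority queue is one common choice, not the only one, and the item names are made up for illustration.

```python
from collections import Counter
import heapq

# Bag: unordered, duplicates allowed (Counter records how many of each item).
bag = Counter()
for item in ["pen", "ink", "pen"]:
    bag[item] += 1

# Set: unordered, duplicates are silently ignored.
unique_items = set(["pen", "ink", "pen"])

# Priority queue: items are removed by priority (smallest number first here),
# and several items may share the same priority.
tasks = []
heapq.heappush(tasks, (2, "write report"))
heapq.heappush(tasks, (1, "fix outage"))
heapq.heappush(tasks, (2, "review code"))
first_priority, first_task = heapq.heappop(tasks)
```

Here the bag holds "pen" twice, the set holds it once, and the priority queue hands back the priority-1 task first.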
T/F: An algorithm with a polynomial runtime is considered efficient.
T. An efficient algorithm is generally one whose runtime increases no more than polynomially with respect to the input size. In contrast, an algorithm with an exponential runtime is not efficient.
A list node's data can store a record with multiple subitems. True/False
True
NP-complete problems
a set of problems for which no known efficient algorithm exists.
Ex: ADT- A list is a common ADT for holding ordered data, having operations like
append a data item, remove a data item, search whether a data item exists, and print the list.
selection of data structures used in a program depends on...
both the type of data being stored and the operations the program may need to perform on that data
Computational Problem
A computational problem specifies an input, a question to be answered about the input that **can be answered using a computer,** and the desired output.
abstract data type (ADT) is a
data type described by predefined user operations, such as "insert data at rear," without indicating how each operation is implemented.
algorithm
describes a sequence of steps to solve a computational problem or perform a calculation.
Measuring runtime and memory usage allows
different algorithms to be compared.
t/f A programmer must know the underlying implementation of the list ADT in order to use a list.
f. A programmer need not have knowledge of the underlying implementation to use a list ADT.
ADTs are only supported in standard libraries.
f. Many third-party libraries, which are not built in to the programming language standard, implement ADTs.
While common operations include inserting, removing, and searching for data, the algorithms to implement those operations are typically
specific to each data structure.
Some algorithms utilize data structures to __________ during the algorithm execution. Ex: An algorithm that determines a list of the top five salespersons may use an array to store salespersons sorted by their total sales.
store and organize data
Using abstract data types enables programmers or algorithm designers to focus on _______, thus improving ________
-higher-level operations and algorithms. -programmer efficiency.
inserting an item at the beginning of a 999-item linked list requires how many items to be shifted?
0. No shifting of other items is required, which is an advantage of using linked lists.
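A minimal sketch of why prepending to a linked list shifts nothing. The `Node` and `LinkedList` classes and their method names are hypothetical, written only for this illustration.

```python
class Node:
    def __init__(self, data, next_node=None):
        self.data = data
        self.next = next_node


class LinkedList:
    def __init__(self):
        self.head = None

    def prepend(self, data):
        # The new node points at the old head; no existing node is moved.
        self.head = Node(data, self.head)

    def to_list(self):
        # Walk the links from head to tail and collect the data.
        items = []
        node = self.head
        while node is not None:
            items.append(node.data)
            node = node.next
        return items


numbers = LinkedList()
for value in (3, 2, 1):
    numbers.prepend(value)
```

Each `prepend` does the same small amount of work whether the list holds 3 items or 999.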
hash table
A hash table is a data structure that stores unordered items by mapping (or hashing) each item to a location in an array.
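The mapping step can be sketched as follows. This assumes collisions are handled by keeping a small list (a "bucket") at each array location, which is one common strategy and is not specified by the definition above; the class and method names are hypothetical.

```python
class HashTable:
    def __init__(self, bucket_count=8):
        # One list per array location; colliding items share a bucket.
        self.buckets = [[] for _ in range(bucket_count)]

    def _index(self, key):
        # Hash the key, then wrap it into the array's index range.
        return hash(key) % len(self.buckets)

    def insert(self, key):
        bucket = self.buckets[self._index(key)]
        if key not in bucket:
            bucket.append(key)

    def contains(self, key):
        return key in self.buckets[self._index(key)]


table = HashTable()
table.insert("apple")
table.insert("plum")
```

A lookup only inspects the one bucket the key hashes to, rather than every stored item.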
linked list
A linked list is a data structure that stores an ordered list of items in nodes, where each node stores data and has a pointer to the next node.
graph
A graph is a data structure for representing connections among items, and consists of vertices connected by edges. A vertex represents an item in a graph. An edge represents a connection between two vertices in a graph.
heap
A max-heap is a tree that maintains the simple property that a node's key is greater than or equal to the node's children's keys. A min-heap is a tree that maintains the simple property that a node's key is less than or equal to the node's children's keys.
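The max-heap property can be checked in a few lines. This sketch assumes the tree is stored in an array where the children of index i sit at indices 2i + 1 and 2i + 2, a common layout that the definition above does not require; `is_max_heap` is a hypothetical helper name.

```python
def is_max_heap(keys):
    """Return True if every node's key is >= each of its children's keys."""
    for child in range(1, len(keys)):
        parent = (child - 1) // 2
        if keys[parent] < keys[child]:
            return False
    return True


valid = is_max_heap([9, 7, 8, 3, 5])
invalid = is_max_heap([9, 7, 10])
```

Flipping the comparison to `>` would give the corresponding min-heap check.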
Given a sorted list of a company's employee records and an employee's first and last name, what is a specific employee's phone number?
Binary search. If employee records are stored in sorted order, sorted by their name, binary search can be used to efficiently search for a specific employee's record. The found employee record can then be accessed to determine their phone number.
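A possible sketch of that lookup. The record layout, a list of `((last, first), phone)` tuples sorted by name, and the sample data are assumptions made for illustration.

```python
def find_phone(records, name):
    """Binary search over records sorted by name; returns the phone or None."""
    low, high = 0, len(records) - 1
    while low <= high:
        mid = (low + high) // 2
        mid_name, phone = records[mid]
        if mid_name == name:
            return phone
        if mid_name < name:
            low = mid + 1   # target sorts after the middle record
        else:
            high = mid - 1  # target sorts before the middle record
    return None


records = [
    (("Adams", "Ann"), "555-0101"),
    (("Lee", "Bo"), "555-0102"),
    (("Zhou", "Cy"), "555-0103"),
]
phone = find_phone(records, ("Lee", "Bo"))
```

Each loop iteration discards half of the remaining records, which is why the list must already be sorted.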
Ex: The space complexity of an algorithm that duplicates a list of numbers is
S(N) = 2N + k, where k is a constant representing memory used for things like the loop counter and list pointers.
What is the problem output?
The computational problem's question can be phrased as: How many times does the user-specified word appear on the list? The answer to that problem is an integer value for the frequency of the specified word.
Which can be used as the problem input?
The input must include a list of all words and the specific word for which determining the frequency is desired.
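The word-frequency problem described in the two cards above could be solved along these lines. The function name and sample words are made up; the point is that the inputs are the word list plus the target word, and the output is an integer.

```python
def word_frequency(words, target):
    """Count how many times target appears in the list of words."""
    count = 0
    for word in words:
        if word == target:
            count += 1
    return count


frequency = word_frequency(["to", "be", "or", "not", "to", "be"], "be")
```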
How is algorithm efficiency measured?
Algorithm efficiency is most commonly measured by the algorithm's runtime, and an efficient algorithm is one whose runtime increases no more than polynomially with respect to the input size.
An algorithm's best case
is the scenario where the algorithm does the minimum possible number of operations.
List (ADT)
Items are ordered based on how items are added. Duplicate items are allowed.
An algorithm's runtime complexity is a function, T(N), that represents
the number of constant time operations performed by the algorithm on an input of size N.
T/F: Two different algorithms that produce the same result have the same computational complexity.
F: Two different algorithms can produce the same result in various ways and may have different computational complexities.
Items stored in an array can be accessed using a positional index. T/F
T. Array elements are stored in sequential locations, so each element can be easily accessed using an index.
An efficient algorithm to solve an NP-complete problem may exist. T/F
T. Whether an efficient algorithm exists for NP-complete problems is an open research question: none has been found, but none has been proven impossible. However, the current consensus is that such an algorithm is unlikely.
NP-complete problems have the following characteristics:
-No efficient algorithm has been found to solve an NP-complete problem. -No one has proven that an efficient algorithm to solve an NP-complete problem is impossible. -If an efficient algorithm exists for one NP-complete problem, then all NP-complete problems can be solved efficiently.
Abstraction simplifies ________. ADTs allow programmers to focus on choosing which ______ best match a program's needs.
- programming. - ADTs
LinearSearch(numbers, numbersSize, key) {
   i = 0
   while (i < numbersSize) {
      if (numbers[i] == key)
         return i
      i = i + 1
   }
   return -1 // not found
}
// numbers: 54 79 26 91 29 33
// key = 26: neither best nor worst case
// key = 54: best case
// key = 82: worst case
-The search for 26 is neither the best nor the worst case.
-Searching for 54 only requires one comparison and is the best case: The key is found at the start of the array. No other search could perform fewer operations.
-Searching for 82 compares against all array items and is the worst case: The number is not found in the array. No other search could perform more operations.
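To see the three cases concretely, here is a Python sketch of the same linear search that also returns how many comparisons were made. The comparison counter is an addition for illustration and is not part of the original pseudocode.

```python
def linear_search(numbers, key):
    """Return (index, comparisons); index is -1 when key is not found."""
    comparisons = 0
    for i, value in enumerate(numbers):
        comparisons += 1
        if value == key:
            return i, comparisons
    return -1, comparisons


numbers = [54, 79, 26, 91, 29, 33]
best = linear_search(numbers, 54)     # key at the start
middle = linear_search(numbers, 26)   # key somewhere inside
worst = linear_search(numbers, 82)    # key absent
```

The best case uses 1 comparison, the search for 26 uses 3, and the worst case uses all 6.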
Inserting an item at the end of a 999-item array OR linked list requires how many items to be shifted?
0. Appending an item just places the new item at the end of the array or linked list. No shifting of existing items is necessary.
Inserting an item at the beginning of a 999-item array requires how many items to be shifted?
999. Inserting at the beginning requires making room for the new item, so every current item must be shifted once.
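The shifting can be counted directly. This sketch treats a Python list as a plain array and moves each item by hand; `insert_at_front` is a hypothetical helper, and in practice `list.insert(0, x)` does the same shifting internally.

```python
def insert_at_front(items, new_item):
    """Insert new_item at index 0 and return how many items were shifted."""
    items.append(None)  # grow the array by one slot
    shifts = 0
    # Move each existing item one position to the right, back to front.
    for i in range(len(items) - 1, 0, -1):
        items[i] = items[i - 1]
        shifts += 1
    items[0] = new_item
    return shifts


array = list(range(1, 1000))  # 999 items: 1 through 999
shift_count = insert_at_front(array, 0)
```

For the 999-item array the loop runs 999 times, one shift per existing item.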
Binary Tree
A binary tree is a data structure in which each node stores data and has up to two children, known as a left child and a right child.
Record
A record is the data structure that stores subitems, often called fields, with a name associated with each subitem.
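In Python, a record with named fields is often written as a dataclass. The field names below are assumptions chosen to match the employee examples elsewhere in these cards.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    # Each field is a named subitem of the record.
    first_name: str
    last_name: str
    phone: str


record = EmployeeRecord(first_name="Ann", last_name="Adams", phone="555-0101")
```

Fields are then read by name, for example `record.phone`.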
array
An array is a data structure that stores an ordered list of items, where each item is directly accessible by a positional index.
Example computational problems and common algorithms: Search engines--Given a product ID and a sorted array of all in-stock products, is the product in stock and what is the product's price?
Binary search: The binary search algorithm is an efficient algorithm for searching a list. The list's elements must be sorted and directly accessible (such as an array).
computational problems and common algorithms: Navigation---Given a user's current location and desired location, what is the fastest route to walk to the destination?
Dijkstra's shortest path: Dijkstra's shortest path algorithm determines the shortest path from a start vertex to each vertex in a graph. The possible routes between two locations can be represented using a graph, where vertices represent specific locations and connecting edges specify the time required to walk between those two locations.
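A compact sketch of Dijkstra's algorithm on a small walking map. The place names and walking times are invented, the graph is assumed to be a dictionary of neighbor-to-minutes dictionaries with non-negative times, and a `heapq`-based priority queue is one common way to pick the next closest vertex.

```python
import heapq


def dijkstra(graph, start):
    """Return the shortest total time from start to every vertex."""
    distances = {vertex: float("inf") for vertex in graph}
    distances[start] = 0
    queue = [(0, start)]
    while queue:
        dist, vertex = heapq.heappop(queue)
        if dist > distances[vertex]:
            continue  # stale entry; a shorter route was already found
        for neighbor, minutes in graph[vertex].items():
            candidate = dist + minutes
            if candidate < distances[neighbor]:
                distances[neighbor] = candidate
                heapq.heappush(queue, (candidate, neighbor))
    return distances


walk = {
    "home": {"park": 4, "cafe": 10},
    "park": {"home": 4, "cafe": 3},
    "cafe": {"home": 10, "park": 3, "office": 5},
    "office": {"cafe": 5},
}
times = dijkstra(walk, "home")
```

Walking home to cafe directly takes 10 minutes, but going through the park takes 7, so the office is reached in 12.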
T/F: A linked list stores items in an unspecified order.
F. A linked list stores an ordered list of items. The links in each node define the order in which items are stored.
An algorithm to solve a computational problem must be written using a programming language. T/F
False. An algorithm can be described in English, pseudocode, a programming language, hardware, etc., as long as the algorithm precisely describes the steps of a computational procedure.
computational problems and common algorithms: DNA analysis-- CP: Given two DNA sequences from different individuals, what is the longest shared sequence of nucleotides? What is the common algorithm?
Longest common substring problem: A longest common substring algorithm determines the longest common substring that exists in two input strings. DNA sequences can be represented using strings consisting of the letters A, C, G, and T to represent the four different nucleotides.
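One standard way to compute a longest common substring is dynamic programming over the lengths of common suffixes. This is a sketch of that approach, not necessarily the algorithm the course has in mind, and the DNA strings are made-up examples.

```python
def longest_common_substring(a, b):
    """Return one longest substring that appears in both a and b."""
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


shared = longest_common_substring("GATTACA", "TTACGG")
```

The runtime grows with the product of the two string lengths, which is polynomial in the input size.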
ADTs in standard libraries
Most programming languages provide standard libraries that implement common abstract data types.
Why must input data size remain a variable?
Otherwise, the overwhelming majority of algorithms would have a best case of N=0, since no input data would be processed. In both theory and practice, saying "the best case is when the algorithm doesn't process any data" is not useful. Complexity analysis always treats the input data size as a variable.
An efficient algorithm exists for all computational problems. T/F
F. Many computational problems exist for which an efficient algorithm is unknown. Such problems are often encountered in real applications.
A node in a binary tree can have zero, one, or two children. True/False
True
GetEvens(list, listSize) {
   i = 0
   evensList = Create new, empty list
   while (i < listSize) {
      if (list[i] % 2 == 0)
         Add list[i] to evensList
      i = i + 1
   }
   return evensList
}
a. What is the maximum possible size of the returned list?
b. What is the minimum possible size of the returned list?
c. What is the worst case auxiliary space complexity of GetEvens if N is the list's size and k is a constant?
d. What is the best case auxiliary space complexity of GetEvens if N is the list's size and k is a constant?
a. listSize. If all items in the input list are even, then the returned list has the same size as the input list.
b. 0. If the input list doesn't have any even numbers, then the returned list is empty.
c. S(N) = N + k. In the worst case, all items from the input list are added to the returned list. As the input size N increases, the output size increases to match.
d. S(N) = k. In the best case, the input list contains only odd numbers and the output size is 0, whether the list has 10 odd numbers or 1000 odd numbers. A constant output size of zero corresponds to an auxiliary space complexity of k.
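A runnable Python version of GetEvens makes the best and worst cases easy to check. The sample lists are invented for illustration.

```python
def get_evens(numbers):
    """Return a new list holding only the even values from numbers."""
    evens = []
    for value in numbers:
        if value % 2 == 0:
            evens.append(value)
    return evens


all_even = get_evens([2, 4, 6, 8])   # worst case: output as large as input
all_odd = get_evens([1, 3, 5, 7])    # best case: output is empty
```

The auxiliary space is the size of `evens` plus a few loop variables, which matches N + k in the worst case and k in the best case.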
Do two student essays share a common phrase consisting of a sequence of more than 100 letters?
Longest common substring. If the longest common substring of the two essays is more than 100 letters long, then the two essays share a common phrase consisting of more than 100 letters.
Computational complexity is
the amount of resources used by the algorithm.
A list ADT is commonly implemented using
arrays or linked list data structures.
The underlying data structure for a list ADT is the same for all programming languages. t/f
f. The underlying data structures used to implement ADTs may vary between implementations. Some programming languages allow programmers to select a specific implementation. Ex: Java's list ADTs can be implemented using arrays or linked lists.
A list ADT's underlying data structure has no impact on the program's execution. t/f
f. Different underlying data structures require different algorithms to perform the same list ADT operation, and those algorithms have different runtimes. Ex: For prepending an item to a list, a linked list-based implementation is more efficient than an array-based implementation.
Runtime and memory usage are the only two resources making up computational complexity.
f. Although runtime and memory usage are the most common, computational complexity can include other factors, such as network communication.
t/f: The linear search algorithm's best case scenario is when N = 0.
f: The variable value N cannot be replaced with a constant of 0 to describe a best case scenario. A best case scenario must describe the contents of the data being processed.
ADTs support abstraction by
hiding the underlying implementation details and providing a well-defined set of operations for using the ADT.
Because an algorithm's runtime may vary significantly based on the input data, a common approach is to
identify best and worst case scenarios.
How can knowing a problem is NP-complete help?
Instead of trying to find an efficient algorithm to solve the problem, a programmer can focus on finding an algorithm that efficiently finds a good, but non-optimal, solution.
space complexity
is a function, S(N), that represents the number of fixed-size memory units used by the algorithm for an input of size N.
An algorithm's worst case
is the scenario where the algorithm does the maximum possible number of operations.
the _____ ADT is the better match for the program's requirements.
list. The list ADT provides an abstraction that satisfies each program requirement, including iterating in reverse order, which the queue ADT doesn't provide.
An algorithm's computational complexity includes
runtime and memory usage.
Given the airports at which an airline operates and distances between those airports, what is the shortest total flight distance between two airports?
shortest path algorithm--Many computational problems use graphs to represent connections between different items, such as airports, computers, intersections, train stations, etc. If the connection between airports represents flight distance, then the shortest path found will be the path with the shortest flight distance.
Knowledge of an ADT's underlying implementation is needed to analyze or improve the runtime efficiency. t/f
t. Different implementations of an ADT may have different runtimes for various operations. Ex: Accessing the ith list element is more efficient in an array-based implementation than in a linked list implementation.
Algorithm efficiency is typically measured by
the algorithm's computational complexity.
An algorithm's auxiliary space complexity is
the space complexity not including the input data. Ex: An algorithm to find the maximum number in a list will have a space complexity of S(N) = N + k, but an auxiliary space complexity of S(N) = k, where k is a constant.
Abstraction means
to have a user interact with an item at a high level, with lower-level internal details hidden from the user.
Complexity analysis is used
to identify and avoid using algorithms with long runtimes or high memory usage.
Python, C++, and Java all provide built-in support for a deque ADT. t/f
true
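In Python, the deque ADT is available as `collections.deque`. A brief sketch with made-up values, showing that items can be added and removed at both ends:

```python
from collections import deque

line = deque([2, 3])
line.appendleft(1)      # add at the front
line.append(4)          # add at the back
front = line.popleft()  # remove from the front
back = line.pop()       # remove from the back
```

After these operations the deque is back to holding 2 and 3, with 1 removed from the front and 4 from the back.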
data structure
A way of organizing, storing, and performing operations on data.