CASWE
By default, the re module's sub function replaces only the first occurrence of a pattern with the replacement text you specify. The statement re.sub(r'\t', ', ', '1\t2\t3\t4') returns '1, 2\t3\t4'
False
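A minimal sketch of why the statement above is false: with no count argument, re.sub replaces every match. The variable names are illustrative only.

```python
import re

text = '1\t2\t3\t4'

# With no count argument, re.sub replaces every match of the pattern.
all_replaced = re.sub(r'\t', ', ', text)

# Passing count=1 limits the substitution to the first match only.
first_only = re.sub(r'\t', ', ', text, count=1)

print(all_replaced)      # 1, 2, 3, 4
print(repr(first_only))  # '1, 2\t3\t4'
```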
The following code creates a Series of student grades from a list of integers:
    import pandas as pd
    grades = pd.Series([87, 100, 94])
By default, a Series has integer indexes numbered sequentially from 1, and the Series argument may be a tuple, a dictionary, an array, another Series or a single value.
False
The following code formats the float value 17.489 rounded to the hundredths position: '{:.3f}'.format(17.489)
False
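A short sketch comparing the two precisions: .3f keeps three digits after the decimal point (thousandths), while .2f rounds to hundredths. The variable names are illustrative only.

```python
value = 17.489

# .3f formats to three digits after the decimal point (thousandths).
three_places = '{:.3f}'.format(value)

# .2f rounds to two digits after the decimal point (hundredths).
two_places = '{:.2f}'.format(value)

print(three_places)  # 17.489
print(two_places)    # 17.49
```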
The following code replaces tab characters with commas:
    values = '1\t2\t3\t4\t5'
    values.replace('\t', ', ')
False
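As a rough illustration, the sketch below shows what replace produces for two different replacement strings, and that the original string is left unchanged. The variable names are illustrative only.

```python
values = '1\t2\t3\t4\t5'

# Replacement text ', ' inserts a comma followed by a space.
comma_space = values.replace('\t', ', ')

# Replacement text ',' inserts a comma only.
comma_only = values.replace('\t', ',')

print(comma_space)  # 1, 2, 3, 4, 5
print(comma_only)   # 1,2,3,4,5

# replace returns a new string; values itself still contains tabs.
print('\t' in values)  # True
```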
If the contents of a file should not be modified, open the file for ____, another example of the principle of least privilege. This prevents the program from accidentally modifying the file.
reading only
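A minimal sketch of the protection that read-only mode gives, assuming a temporary file created just for the demonstration (the file name and records are illustrative):

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'accounts.txt')  # illustrative file name

    # Create the file first so there is something to read.
    with open(path, mode='w') as accounts:
        accounts.write('100 Jones 24.98\n')

    write_blocked = False
    # mode='r' opens the file for reading only.
    with open(path, mode='r') as accounts:
        first_record = accounts.readline()
        try:
            accounts.write('200 Doe 345.67\n')
        except io.UnsupportedOperation:
            # Writing to a file opened for reading is rejected.
            write_blocked = True

print(first_record.rstrip(), write_blocked)  # 100 Jones 24.98 True
```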
Which of the following would be accepted by a program strictly looking for four integers of data in CSV format? 100, 85 77,9 100,85,,77,9 '100,85,77,9'=' 100,85, 77,9
100,85, 77,9
Assuming the following DataFrame grades:
       Wally  Eva  Sam  Katie  Bob
Test1     87  100   94    100   83
Test2     96   87   77     81   65
Test3     70   90   90     82   85
Which of the following statements is false?
(a) One benefit of pandas is that you can quickly and conveniently look at your data in many different ways, including selecting portions of the data.
(b) The following expression selects the 'Eva' column and returns it as a Series: grades['Eva']
(c) If a DataFrame's column-name strings are valid Python identifiers, you can use them as attributes. The following code selects the 'Sam' column using the Sam attribute: grades.Sam
(d) All of the above statements are true.
All of the above statements are true.
Preparing data for analysis is called ____ or ____.
Both (a) and (b).
Consider the following code:
In [1]: s1 = 'happy'
In [2]: s2 = 'birthday'
In [3]: s1 += ' ' + s2
In [4]: s1
Out[4]: 'happy birthday'
In [5]: symbol = '>'
In [6]: symbol *= 5
In [7]: symbol
Out[7]: '>>>>>'
Which snippet(s) in this interactive session appear to modify existing strings, but actually create new string objects?
Both snippets [3] and [6]
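One way to check this, sketched below, is to keep a second reference to the original string before the augmented assignment. The names original and before are illustrative only.

```python
s1 = 'happy'
original = s1            # second reference to the same string object
s1 += ' ' + 'birthday'   # builds a new string and rebinds s1 to it

symbol = '>'
before = symbol
symbol *= 5              # also builds a new string

# The original objects are untouched because strings are immutable.
print(original, '|', s1)    # happy | happy birthday
print(before, '|', symbol)  # > | >>>>>
print(s1 is original, symbol is before)  # False False
```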
Which of the following statements is false? DataFrames have a describe method that calculates basic descriptive statistics for the data and returns them as a two-dimensional array. In a DataFrame, the statistics are calculated by column. Method describe nicely demonstrates the power of array-oriented programming with a functional-style call; pandas handles internally all the details of calculating these statistics for each column. You can control the precision of floating-point values and other default settings with pandas' set_option function.
DataFrames have a describe method that calculates basic descriptive statistics for the data and returns them as a two-dimensional array.
Which of the following statements is false?
(a) String method split with no arguments tokenizes a string by breaking it into substrings at each whitespace character, then returns a list of tokens.
(b) To tokenize a string at a custom delimiter (such as each comma-and-space pair), specify the delimiter string (such as ', ') that split uses to tokenize the string, as in:
    letters = 'A, B, C, D'
    letters.split(', ')
(c) If you provide an integer as split's second argument, it specifies the maximum number of splits. The last token is the remainder of the string after the maximum number of splits. Assuming the string in Part (b), the code letters.split(', ', 1) returns ['A', 'B', 'C, D']
(d) There is also an rsplit method that performs the same task as split but processes the maximum number of splits from the end of the string toward the beginning.
If you provide an integer as split's second argument, it specifies the maximum number of splits. The last token is the remainder of the string after the maximum number of splits. Assuming the string in Part (b), the code letters.split(', ', 1) returns ['A', 'B', 'C, D']
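The sketch below shows what split and rsplit actually return for different maximum-split values, using the letters string from the question. The result variable names are illustrative only.

```python
letters = 'A, B, C, D'

no_limit = letters.split(', ')       # split at every delimiter
one_split = letters.split(', ', 1)   # at most one split, from the left
two_splits = letters.split(', ', 2)  # at most two splits, from the left
from_right = letters.rsplit(', ', 2) # at most two splits, from the right

print(no_limit)    # ['A', 'B', 'C', 'D']
print(one_split)   # ['A', 'B, C, D']
print(two_splits)  # ['A', 'B', 'C, D']
print(from_right)  # ['A, B', 'C', 'D']
```

With a maximum of 1, the result has two tokens, so the list shown in the false statement corresponds to a maximum of 2.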
Which of the following statements are false? JSON (JavaScript Object Notation) is a data-interchange format readable only by computers and used to represent objects (such as dictionaries, lists and more) as collections of name-value pairs. Many libraries you'll use to interact with cloud-based services such as Twitter, IBM Watson and others communicate with your applications via JSON objects. JSON can represent objects of custom classes. JSON has become the preferred data format for transmitting objects across platforms. This is especially true for invoking cloud-based web services, which are functions and methods that you call over the Internet.
JSON (JavaScript Object Notation) is a data-interchange format readable only by computers and used to represent objects (such as dictionaries, lists and more) as collections of name-value pairs.
Which of the following statements is false? NumPy arrays use only zero-based integer indexes. Like arrays, Series use only zero-based integer indexes. Series may have missing data, and many Series operations ignore missing data by default. All of the above statements are true.
Like arrays, Series use only zero-based integer indexes.
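A small sketch, assuming pandas is installed, showing that a Series gets zero-based integer indexes by default but also accepts custom ones. The student names used as index labels are illustrative.

```python
import pandas as pd

# Default index: integers numbered sequentially from 0.
grades = pd.Series([87, 100, 94])
print(list(grades.index))  # [0, 1, 2]

# Custom index: any labels supplied through the index keyword argument.
named = pd.Series([87, 100, 94], index=['Wally', 'Eva', 'Sam'])
print(named['Eva'])  # 100
```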
Which of the following statements is false? Pandas is the most popular library for dealing with mixed data types, customized indexing, missing data, and data that needs to be manipulated into forms appropriate for the databases and data analysis packages. Data science presents unique demands for which more customized data structures are required. NumPy's array is optimized for heterogeneous numeric data that's accessed via integer indices. Pandas provides two key collections—Series for one-dimensional collections and DataFrames for two-dimensional collections.
NumPy's array is optimized for heterogeneous numeric data that's accessed via integer indices.
Which of the following statements is false? As in lists and arrays, the first character in a text file and byte in a binary file is located at position 0, so in a file of n characters or bytes, the highest position number is n - 1. Python views text files and binary files (for images, videos and more) as sequences of bytes. For each file you open, Python creates a file object that you'll use to interact with the file. All of the above statements are true.
Python views text files and binary files (for images, videos and more) as sequences of bytes.
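To illustrate the difference between characters and bytes, the sketch below writes a short string to a temporary file and reads it back in text mode and in binary mode. The file name and sample text are illustrative.

```python
import os
import tempfile

text = 'caf\u00e9'  # four characters; the last one needs two bytes in UTF-8

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'sample.txt')  # illustrative file name

    with open(path, mode='w', encoding='utf-8') as f:
        f.write(text)

    # Text mode yields a str, which is a sequence of characters.
    with open(path, mode='r', encoding='utf-8') as f:
        characters = f.read()

    # Binary mode yields a bytes object, which is a sequence of bytes.
    with open(path, mode='rb') as f:
        raw_bytes = f.read()

print(len(characters), len(raw_bytes))  # 4 5
```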
Consider this text from Shakespeare: soliloquy = 'To be or not to be, that is the question' Which of the following statements is false? String method index searches for a substring within a string and returns the first index at which the substring is found; otherwise, a ValueError occurs. The following code returns 3: soliloquy.index('be') String method rindex performs the same operation as index, but searches from the end of the string and returns the last index at which the substring is found; otherwise, a ValueError occurs. The following code returns 3: soliloquy.rindex('be') String methods find and rfind perform the same tasks as index and rindex but, if the substring is not found, return -1 rather than causing a ValueError. All of the above statements are true.
String method rindex performs the same operation as index, but searches from the end of the string and returns the last index at which the substring is found; otherwise, a ValueError occurs. The following code returns 3: soliloquy.rindex('be')
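The sketch below runs the four search methods on the soliloquy string. The substring 'xyz' is an arbitrary example of text that does not occur in it.

```python
soliloquy = 'To be or not to be, that is the question'

first = soliloquy.index('be')   # first occurrence
last = soliloquy.rindex('be')   # last occurrence
print(first, last)  # 3 16

# find and rfind return -1 when the substring is absent.
print(soliloquy.find('xyz'), soliloquy.rfind('xyz'))  # -1 -1

# index raises ValueError in the same situation.
try:
    soliloquy.index('xyz')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```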
Which of the following statements is false? The d presentation type in the following f-string formats strings as integer values: f'{10:d}' The integer presentation types b, o and x or X format integers using the binary, octal or hexadecimal number systems, respectively. The c presentation type in the following f-string formats an integer character code as the corresponding character: f'{65:c} {97:c}' If you do not specify a presentation type, as in the second placeholder below, non-string values like the integer 7 are converted to strings: f'{"hello":s} {7}'
The d presentation type in the following f-string formats strings as integer values: f'{10:d}'
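A quick sketch of the presentation types mentioned above; the d type formats integer values, not strings. The values 10 and 255 are arbitrary examples.

```python
decimal = f'{10:d}'            # integer shown in base 10
binary = f'{10:b}'             # base 2
octal = f'{10:o}'              # base 8
hex_lower = f'{255:x}'         # base 16, lowercase digits
hex_upper = f'{255:X}'         # base 16, uppercase digits
chars = f'{65:c} {97:c}'       # character codes to characters
mixed = f'{"hello":s} {7}'     # no presentation type for the integer 7

print(decimal, binary, octal, hex_lower, hex_upper)  # 10 1010 12 ff FF
print(chars)  # A a
print(mixed)  # hello 7
```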
The following IPython session loads and displays the CSV file accounts.csv:
In [1]: import pandas as pd
In [2]: df = pd.read_csv('accounts.csv',
   ...:                  names=['account', 'name', 'balance'])
   ...:
In [3]: df
Out[3]:
   account   name  balance
0      100  Jones    24.98
1      200    Doe   345.67
2      300  White     0.00
3      400  Stone   -42.16
4      500   Rich   224.62
The names keyword argument specifies the DataFrame's column names. If you do not supply the names keyword argument, read_csv assumes that the CSV file's first row is a comma-delimited list of column names.
True
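A sketch of the effect of the names argument, assuming pandas is installed. An in-memory io.StringIO stands in for accounts.csv so the example runs without a file on disk, and only two of the records are used.

```python
import io

import pandas as pd

csv_text = '100,Jones,24.98\n200,Doe,345.67\n'

# With names, every row in the file is treated as data.
with_names = pd.read_csv(io.StringIO(csv_text),
                         names=['account', 'name', 'balance'])

# Without names, the first row is consumed as the column names.
without_names = pd.read_csv(io.StringIO(csv_text))

print(list(with_names.columns), len(with_names))
print(list(without_names.columns), len(without_names))
```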
Which of the following statements about DataFrames is false? The index can be a slice. In the following slice containing labels, the range specified includes the high index ('Test3'): grades.loc['Test1':'Test3'] When using slices containing integer indices with iloc, the range you specify excludes the high index (2): grades.iloc[0:2] To select specific rows, use a tuple rather than slice notation with loc or iloc. All of the above statements are true.
To select specific rows, use a tuple rather than slice notation with loc or iloc.
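The sketch below, assuming pandas is installed, rebuilds the grades DataFrame used elsewhere in this set and contrasts label slices, integer slices and list-based selection of specific rows.

```python
import pandas as pd

grades = pd.DataFrame(
    {'Wally': [87, 96, 70], 'Eva': [100, 87, 90], 'Sam': [94, 77, 90],
     'Katie': [100, 81, 82], 'Bob': [83, 65, 85]},
    index=['Test1', 'Test2', 'Test3'])

# Label slices with loc include the high index.
by_label = grades.loc['Test1':'Test3']

# Integer slices with iloc exclude the high index.
by_position = grades.iloc[0:2]

# Specific rows are selected with a list of labels or positions.
specific_labels = grades.loc[['Test1', 'Test3']]
specific_positions = grades.iloc[[0, 2]]

print(len(by_label), len(by_position))  # 3 2
print(list(specific_labels.index))      # ['Test1', 'Test3']
```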
Assuming the following DataFrame grades:
       Wally  Eva  Sam  Katie  Bob
Test1     87  100   94    100   83
Test2     96   87   77     81   65
Test3     70   90   90     82   85
To see the average of all the students' grades on each test, call mean on the T attribute: grades.T.mean()
True
Assuming the following DataFrame grades:
       Wally  Eva  Sam  Katie  Bob
Test1     87  100   94    100   83
Test2     96   87   77     81   65
Test3     70   90   90     82   85
Rather than getting the summary statistics by student, you can get them by test. Simply call describe on grades.T, as in: grades.T.describe()
True
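A sketch of the transposed calculations, assuming pandas is installed and using the grades values from the table above. After transposing, each test becomes a column, so column-wise statistics are per test.

```python
import pandas as pd

grades = pd.DataFrame(
    {'Wally': [87, 96, 70], 'Eva': [100, 87, 90], 'Sam': [94, 77, 90],
     'Katie': [100, 81, 82], 'Bob': [83, 65, 85]},
    index=['Test1', 'Test2', 'Test3'])

# Average of all students' grades on each test.
per_test_mean = grades.T.mean()
print(per_test_mean)

# Summary statistics by test rather than by student.
per_test_summary = grades.T.describe()
print(list(per_test_summary.columns))  # ['Test1', 'Test2', 'Test3']
```

The per-test averages work out to 92.8, 81.2 and 83.4 for Test1, Test2 and Test3 respectively.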
Based on the string sentence = '\t \n This is a test string. \t\t \n' The following code snippets first use method lstrip to remove only leading whitespace from sentence: sentence.lstrip() then use method rstrip to remove only trailing whitespace: sentence.rstrip()
True
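For reference, a short sketch of what each stripping method returns for this string. The strip call is added for comparison and is not part of the question.

```python
sentence = '\t \n This is a test string. \t\t \n'

left = sentence.lstrip()   # removes leading whitespace only
right = sentence.rstrip()  # removes trailing whitespace only
both = sentence.strip()    # removes both (shown for comparison)

print(repr(left))   # 'This is a test string. \t\t \n'
print(repr(right))  # '\t \n This is a test string.'
print(repr(both))   # 'This is a test string.'
```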
Consider a Series of hardware-related strings: hardware = pd.Series(['Hammer', 'Saw', 'Wrench']) The following code calls string method contains on each element to determine whether the value of each element contains a lowercase 'a': hardware.str.contains('a') and returns a Series containing bool values indicating the contains method's result for each element.
True
Consider a Series of hardware-related strings: hardware = pd.Series(['Hammer', 'Saw', 'Wrench']) The following code uses the Series str attribute to invoke string method upper on every Series element, producing a new Series containing the uppercase versions of each element in hardware: hardware.str.upper()
True
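A sketch of the two str-attribute calls described in the preceding cards, assuming pandas is installed. The tolist calls are only there to make the results easy to print.

```python
import pandas as pd

hardware = pd.Series(['Hammer', 'Saw', 'Wrench'])

# contains is applied to each element and yields a Series of bool values.
has_a = hardware.str.contains('a')
print(has_a.tolist())  # [True, True, False]

# upper is applied to each element and yields a new Series of strings.
shouting = hardware.str.upper()
print(shouting.tolist())  # ['HAMMER', 'SAW', 'WRENCH']

# The original Series is unchanged.
print(hardware.tolist())  # ['Hammer', 'Saw', 'Wrench']
```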
The following code creates an accounts.txt file and writes five client records to the file. Generally, records in text files are stored one per line, so we end each record with a newline character:
    with open('accounts.txt', mode='w') as accounts:
        accounts.write('100 Jones 24.98\n')
        accounts.write('200 Doe 345.67\n')
        accounts.write('300 White 0.00\n')
        accounts.write('400 Stone -42.16\n')
        accounts.write('500 Rich 224.62\n')
You can also write to a file with print (which automatically outputs a \n), as in print('100 Jones 24.98', file=accounts)
True
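A runnable sketch of both ways of writing a record, using a temporary directory so no file is left behind. Only two of the five records are written to keep it short.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, 'accounts.txt')

    with open(path, mode='w') as accounts:
        # write needs an explicit newline at the end of each record.
        accounts.write('100 Jones 24.98\n')
        # print appends the newline automatically.
        print('200 Doe 345.67', file=accounts)

    with open(path, mode='r') as accounts:
        records = accounts.readlines()

print(records)  # ['100 Jones 24.98\n', '200 Doe 345.67\n']
```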
When you call describe on a DataFrame containing both numeric and non-numeric columns, describe calculates its statistics only for the numeric columns.
True
In the following interactive session that compares the strings 'Orange' and 'orange':
In [1]: 'Orange' == 'orange'
Out[1]: False
In [2]: 'Orange' != 'orange'
Out[2]: ???
In [3]: 'Orange' < 'orange'
Out[3]: True
In [4]: 'Orange' <= 'orange'
Out[4]: True
In [5]: 'Orange' > 'orange'
Out[5]: False
In [6]: 'Orange' >= 'orange'
Out[6]: ???
The outputs of snippets [2] and [6] (marked as ???) respectively are:
True, False
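The comparisons can be checked directly, as sketched below. Strings are compared by the integer codes of their characters, and uppercase 'O' has a lower code than lowercase 'o'.

```python
not_equal = 'Orange' != 'orange'
greater_or_equal = 'Orange' >= 'orange'
print(not_equal, greater_or_equal)  # True False

# The first characters differ, so their codes decide the ordering.
print(ord('O'), ord('o'))  # 79 111
```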
Which of the following statements is false? Variables, lists, tuples, dictionaries, sets, arrays, pandas Series and pandas DataFrames offer long-term data storage. Computers store files on secondary storage devices, including solid-state drives, hard disks and more. Some popular text file formats are plain text, JSON (JavaScript Object Notation) and CSV (comma-separated values). All of the above statements are true.
Variables, lists, tuples, dictionaries, sets, arrays, pandas Series and pandas DataFrames offer long-term data storage.
The json module's ____ function reads the entire JSON contents of its file object argument and converts the JSON into a Python object. This is known as ____ the data.
load, deserializing
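A minimal sketch of deserializing with json.load. An in-memory io.StringIO stands in for an open file object, and the account record is illustrative data.

```python
import io
import json

json_text = '{"accounts": [{"account": 100, "name": "Jones", "balance": 24.98}]}'

# json.load reads all the JSON from a file-like object and builds Python objects.
with io.StringIO(json_text) as source:
    data = json.load(source)

print(type(data).__name__)          # dict
print(data['accounts'][0]['name'])  # Jones
```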