Midterm Fall 2021
You can create a ______________ file in your repository's root directory to tell Git which files and directories to ignore when you make a commit.
.gitignore
Given the following code, what values are printed? list2 = range(1,16,3) for i in list2: print(i)
1 4 7 10 13
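A minimal sketch of the rule behind this answer: `range(start, stop, step)` includes `start` and excludes `stop`. The variable `values` is added here only to make the result easy to inspect.

```python
# range(start, stop, step): start is included, stop is excluded.
list2 = range(1, 16, 3)

for i in list2:
    print(i)

# Materialize the range as a list to inspect all of its values at once.
values = list(list2)

# The same rule gives the length of any range: 5, 10, 15, 20 (25 is excluded).
list1 = range(5, 25, 5)
length = len(list1)
```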
falsy
A value treated as False in a Boolean context (for example, in an if test). A characteristic of empty lists, strings, and tuples in Python.
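As an illustration, `bool()` shows how Python treats a value in a Boolean context. The names `empty_values`, `flags`, and `non_empty` are chosen here for the example.

```python
# Empty containers, zero, and None are all falsy.
empty_values = [[], "", (), {}, set(), 0, None]
flags = [bool(v) for v in empty_values]

# A container with at least one element is truthy,
# even when the element itself is falsy.
non_empty = bool([0])

if not []:
    print("an empty list is falsy")
```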
What is the name of the Anaconda GUI that comes with Anaconda Individual Edition that allows you to launch applications and manage packages?
Anaconda Navigator
Given the following code, what will be printed? students = ("John", "Sue", "Bill", "Joan", "Andrew", "Colette") print(students[4])
Andrew
Given the following code, how many lines will be printed? x = 15 if x > 0: print("x is greater than 0") if x > 5: print("x is greater than 5") if x > 10: print("x is greater than 10") if x > 15: print("x is greater than 15")
3
Given the following code, what is the length of list1? list1 = range(5,25,5)
4
Which feature of version control is most directly responsible for allowing continuous integration (CI) testing?
Automating processes
Which of the following were debugging recommendations provided by the instructor?
Breathe, add print() statements to your code, explain your code out loud
Given the following code, what will the value of car be? cars = {"Dodge" : "Charger", "Kia" : "Sorento", "Chevy" : "Camaro", "Lincoln" : "Navigator"} car = cars["Chevy"]
Camaro
Pandas has which three main objects?
DataFrame, Series, Index
Which feature of version control allows teams of developers/testers working at remote locations to communicate during a project?
Enhancing Collaboration
NumPy ndarrays are multidimensional containers for heterogeneous data (True/False)
False
The df.tail() command will display the last 10 rows of a DataFrame. (True/False)
False
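The statement is false because `df.tail()` returns the last 5 rows by default; a different count must be passed explicitly. A small sketch, using a made-up 20-row DataFrame:

```python
import pandas as pd

# Illustrative DataFrame with 20 rows (values 0..19).
df = pd.DataFrame({"n": range(20)})

last_default = df.tail()    # default n=5
last_ten = df.tail(10)      # ask for 10 rows explicitly
```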
Using plt.scatter should be preferred over using plt.plot when plotting large datasets because it requires less processing. (True/False)
False
Version control does not allow programmers at remote locations to work on the same files at the same time. (True/False)
False
What type of plot is often used as a first step in "understanding a dataset"?
Histogram
Which Integrated Development Environment (IDE) is lightweight and comes with Python?
IDLE
Select all the applications installed by default and ready to Launch on the Anaconda Navigator dashboard
Powershell Prompt, Jupyter Notebook, Jupyter Lab, Spyder
What is the name of the IDE (Integrated Development Environment) you were asked to Launch as one of the steps in the Getting started with Anaconda instructions?
Spyder
Which full-function Integrated Development Environment (IDE) is available as part of the Anaconda package?
Spyder
With version control, if one programmer overwrites another programmer's code, can the original code be recovered? (True/False)
True
Matplotlib
a tool for visualization in Python; built on NumPy arrays and designed to work with the broader SciPy stack. Conceived by John Hunter in 2002, with version 0.1 released in 2003. Plays well with many operating systems and graphics backends
What is one of Matplotlib's most important features?
ability to play well with many operating systems and graphics backends
git add
add a file as it looks now to your next commit (stage)
Which assert statement would check to see if the multiply_nums(x, y) function works correctly?
assert multiply_nums(8, 2) == 16
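The card does not show the body of `multiply_nums`, so the implementation below is an assumption made only so the assert can run.

```python
def multiply_nums(x, y):
    """Hypothetical implementation; the original function body is not shown."""
    return x * y

# Passes silently when the condition is true.
assert multiply_nums(8, 2) == 16
```

If the function returned anything other than 16, the assert would raise an AssertionError and stop the program.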
2 types of testing
unit testing (tests individual module functions), integration testing (tests modules' interaction with each other)
What class is subclassed when writing unittest testcases?
unittest.TestCase
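A minimal sketch of subclassing `unittest.TestCase`. The function `multiply_nums` and the class name `TestMultiplyNums` are hypothetical; the tests are run through a loader and runner here so the result can be inspected.

```python
import unittest


def multiply_nums(x, y):
    """Hypothetical function under test."""
    return x * y


class TestMultiplyNums(unittest.TestCase):
    # Each method whose name starts with "test" is collected as a test case.
    def test_positive_numbers(self):
        self.assertEqual(multiply_nums(8, 2), 16)

    def test_zero(self):
        self.assertEqual(multiply_nums(0, 5), 0)


# Load and run the test case programmatically.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestMultiplyNums)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In a standalone test script, calling `unittest.main()` is the more common way to run the tests.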
When using Matplotlib subplots, the grid indexing runs from ...
upper left to bottom right
Unittest
used to define test cases to validate the behavior of your code
S2 = pd.Series([2, 4, 6], index=['a', 'b', 'c'])
creates a Series with a user-defined (label-based) index
If a Pandas Series has a label-based index (user-defined), the data can be accessed using...?
both the label and the integer position
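A short sketch of the two access styles, assuming the Series from the earlier card. Using `.loc` for labels and `.iloc` for positions keeps the intent explicit.

```python
import pandas as pd

s2 = pd.Series([2, 4, 6], index=["a", "b", "c"])

by_label = s2.loc["b"]      # label-based access
by_position = s2.iloc[1]    # integer-position access (0-based)
```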
Given the following code, what will the output of the print statement be? colors = {"blue", "green", "red", "yellow", "orange"} print(type(colors))
<class 'set'>
git commit -m "[descriptive msg]"
commit your staged content as a new commit snapshot
What command allows you to join a sequence of arrays along an existing axis?
concatenate
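As an illustration, `np.concatenate` joins arrays along an axis that already exists in the inputs. The arrays `a` and `b` are made up for the example.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# axis=0 stacks rows: the column counts must match.
rows = np.concatenate([a, b], axis=0)

# axis=1 stacks columns: b.T has shape (2, 1), so the row counts match.
cols = np.concatenate([a, b.T], axis=1)
```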
Matplotlib figures
consists of (among other items) title, xlabel, ylabel, legend, xticks, yticks, linestyle, color - all can be customized
If you want to eliminate one (or more) of the columns from your DataFrame, you can do so using what command?
drop()
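A minimal sketch with a made-up three-column DataFrame. Note that `drop()` returns a new DataFrame by default and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})

# Remove two columns; the result is a new DataFrame.
trimmed = df.drop(columns=["b", "c"])
```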
When creating a NumPy array, to explicitly set the data type of the resulting array you use the ________ keyword.
dtype
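For example, passing `dtype` at creation time overrides the type NumPy would otherwise infer. The array contents are illustrative.

```python
import numpy as np

# Without dtype, NumPy infers an integer type from the values.
inferred = np.array([1, 2, 3])

# With dtype, the same values are stored as 32-bit floats.
floats = np.array([1, 2, 3], dtype=np.float32)
```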
ndarray
generic multidimensional container for homogeneous data
Series
homogeneous one-dimensional array, the basic Pandas object
NumPy arrays are most often compared to what Python data type?
lists
What debugging strategy was used in the lab that allowed you to send messages to the console and/or a file?
logging
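A sketch of sending the same messages to the console and to a file with the standard `logging` module. The lab's actual configuration is not shown, so the logger name, format string, and temporary file path are assumptions.

```python
import logging
import os
import tempfile

# Hypothetical log file location in a temporary directory.
log_path = os.path.join(tempfile.mkdtemp(), "debug.log")

logger = logging.getLogger("midterm_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep messages out of the root logger

formatter = logging.Formatter("%(levelname)s:%(message)s")
console_handler = logging.StreamHandler()        # console (stderr)
file_handler = logging.FileHandler(log_path)     # file
for handler in (console_handler, file_handler):
    handler.setFormatter(formatter)
    logger.addHandler(handler)

logger.debug("x = %d", 15)
logger.warning("x is larger than expected")

# Read the file back to confirm what was written.
with open(log_path) as f:
    logged = f.read().splitlines()
```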
The ______________ branch in Git is the first branch and is often designated as the Production branch.
main/master
git merge
merge the specified branch's history into the current one
Which NumPy function was used in the lab to initialize all the entries in an array to a specified value (not 1 or 0)?
np.full
What command would be used to create a square array with 1's on the main diagonal, and all other elements set to 0's?
np.identity
The command to create an array with 5 values evenly spaced between 0 and 25 is:
np.linspace(0,25,5)
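The three array-creation functions named in the cards above can be compared side by side. The shapes and the fill value 7 are chosen only for illustration.

```python
import numpy as np

# Every entry set to a specified value.
filled = np.full((2, 3), 7)

# Square array with 1's on the main diagonal and 0's elsewhere.
ident = np.identity(3)

# 5 evenly spaced values from 0 to 25, both endpoints included.
spaced = np.linspace(0, 25, 5)
```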
ndim
number of dimensions
A Pandas DataFrame can be created with a CSV file using what command?
pd.read_csv()
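A self-contained sketch: `pd.read_csv()` accepts a file path or a file-like object, so an in-memory string stands in for a CSV file here. The column names and values are made up.

```python
import io

import pandas as pd

# Illustrative CSV content; a real call would usually pass a file path.
csv_text = "name,score\nJohn,90\nSue,85\n"

df = pd.read_csv(io.StringIO(csv_text))
```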
Matplotlib plt.plot vs plt.scatter
plt.plot: creates plots, including scatterplots; points are clones of each other, so processing is done once, giving higher performance. plt.scatter: more powerful; can render a different size and/or color for each point
What is the primary difference between plt.scatter and plt.plot when it comes to plotting individual points?
plt.scatter plots can control the properties of each individual point
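A sketch of the difference, using random illustrative data. The `Agg` backend is an assumption so the example runs without a display; it is not needed in Jupyter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(50)
y = rng.random(50)
sizes = 200 * rng.random(50)   # one size per point
colors = rng.random(50)        # one color value per point

fig, (ax_plot, ax_scatter) = plt.subplots(1, 2)

# plt.plot style: every marker shares the same appearance.
lines = ax_plot.plot(x, y, "o", color="black")

# plt.scatter style: size and color are set per point.
points = ax_scatter.scatter(x, y, s=sizes, c=colors, alpha=0.5)
```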
Matplotlib subplot
plt.subplot(rows,cols,index) - single subplot within a grid, indexed from upper left to bottom right
pdb
Python debugger, included with Python and run from the command line
What command allows you to change the dimensions of an array, for example, change a 12-element array to a 2D 3x4 array?
reshape
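The 12-element example from the question can be sketched as follows; `flat` and `grid` are names chosen for the illustration.

```python
import numpy as np

flat = np.arange(12)          # 0, 1, ..., 11
grid = flat.reshape(3, 4)     # 3 rows, 4 columns; total size must stay 12
```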
git clone [url]
retrieve an entire repository from a hosted location via URL
The plt.subplot() command takes 3 integer arguments. What are the 3 arguments (in the correct order)?
rows, columns, index
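A minimal sketch of a 2x3 grid built with `plt.subplot(rows, cols, index)`. The index starts at 1 in the upper left and increases row by row. The `Agg` backend is an assumption so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt

fig = plt.figure()
axes = []
for index in range(1, 7):
    # Arguments in order: rows, columns, index.
    ax = plt.subplot(2, 3, index)
    ax.set_title(f"index {index}")
    axes.append(ax)
```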
git config --global user.email "[email-addr]"
set the email address that identifies the user for all git commands
git config --global user.name "[user name]"
set the name that identifies the user for all git commands
git status
show modified files in working directory, staged for your next commit
What command was required to render the plots visible when using Jupyter?
show()
ndarray itemsize
size (in bytes) of each array element
ndarray: shape
size of each dimension
What type of Matplotlib plot allows you to compare different views of data side by side?
subplots
git checkout
switch to another branch and check it out into your working directory
Asserts
tests a condition and raises an AssertionError if false
Series objects can be created with a label-based index by specifying the labels for the index using ...
the index= parameter
ndarray nbytes
total size (in bytes) of the array
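The ndarray attributes covered in these cards (ndim, shape, itemsize, nbytes) can be checked together on one array. The 3x4 array of 64-bit integers is chosen only for illustration.

```python
import numpy as np

arr = np.zeros((3, 4), dtype=np.int64)

dims = arr.ndim           # number of dimensions
shape = arr.shape         # size of each dimension
item_bytes = arr.itemsize # bytes per element (8 for int64)
total_bytes = arr.nbytes  # itemsize * number of elements
```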
git push
transmit the local branch commits to the remote repository branch
In NumPy arrays, elements are stored in contiguous blocks of memory for faster operations.
True
Individual DataFrame rows can be printed using the index value with the .iloc command (True/False)
True
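A sketch of row selection, with made-up data. `.iloc` selects by integer position, while `.loc` selects by index label; the two agree here because position 1 carries the label "s2".

```python
import pandas as pd

# Illustrative data with a label-based index.
df = pd.DataFrame(
    {"name": ["John", "Sue", "Bill"], "score": [90, 85, 77]},
    index=["s1", "s2", "s3"],
)

row_by_position = df.iloc[1]    # second row, 0-based position
row_by_label = df.loc["s2"]     # same row, selected by label
print(row_by_position)
```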
Basic debugging strategies
Using print("x") statements, output via Python's logging module, interactive debugging tools (e.g., pdb)
git init
initialize an existing directory as a Git repository
What are the two types of testing discussed in this lesson?
integration testing, unit testing
Which statement about Pandas DataFrames and Series is True?
A DataFrame is a collection of Series while a Series is an array/list.
If an Assert condition fails, the program will stop and generate what type of error?
AssertionError
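To illustrate, a failing assert raises AssertionError, which stops the program unless it is caught. The function `check_positive` is hypothetical, and the error is caught here only so the message can be inspected.

```python
def check_positive(x):
    """Hypothetical helper that asserts its argument is positive."""
    assert x > 0, "x must be positive"
    return x


try:
    check_positive(-1)
except AssertionError as err:
    # Without this try/except, the program would stop here.
    message = str(err)
    print("AssertionError:", message)
```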
What do Version Control Systems do for developers?
Concurrent Development ~ working on same code at the same time; Automation ~ improves productivity and quality (continuous integration); Team Collaboration ~ multiple remote sites in different time zones; Preserving History ~ track all changes (when, why, who); High Availability/Disaster Recovery ~ replicas of code
What shortcut key combination can be used to run cell in Jupyter Notebook?
Ctrl + Enter
In the lab we used the Pandas pd.date_range() method to create...?
DatetimeIndex
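A short sketch of `pd.date_range()`. The start date and period count are made up; the lab's actual arguments are not shown.

```python
import pandas as pd

# Five consecutive daily timestamps starting on an illustrative date.
dates = pd.date_range("2021-10-01", periods=5, freq="D")

# The result can serve directly as the index of a Series.
series = pd.Series(range(5), index=dates)
```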
GitHub ________ was the tool used in the Learning GitHub tutorial to interact with GitHub from the desktop.
Desktop
NumPy
Numerical Python - the most widely used and important foundation package for numerical computing in Python
Pandas
Python Data Analysis Library; the name may derive from "panel data"
PEP 8
Python Enhancement Proposal 8, the style guide for Python code (of note: lines limited to 79 characters, 4 spaces per indentation level, with spaces preferred over tabs)
How does well-written (PEP8) code help the efficiency of the software development process?
Readability, Collaboration, Bug Fixes
Histograms use ____ to group a range of values, then count the number of values in each range for plotting.
bins
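The binning step can be sketched with `np.histogram`, which computes the counts without drawing anything. The data values are made up; `plt.hist` accepts the same `bins` argument when plotting.

```python
import numpy as np

values = np.array([1, 2, 2, 3, 5, 6, 8, 9, 9, 10])

# Three equal-width bins over [1, 10]: [1, 4), [4, 7), [7, 10].
# The last bin includes its right edge.
counts, edges = np.histogram(values, bins=3, range=(1, 10))
```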
What Matplotlib object is thought of as the single container that contains all the other objects in a plot?
figure
A Pandas Series is a ______________________ array of indexed data.
one-dimensional
DataFrame
usually heterogeneous two-dimensional container of Series objects; the most common pandas object; basically a dictionary of Series objects
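A minimal sketch of the "dictionary of Series" view: each column of a DataFrame is a Series, and the columns share one index. The names and ages are illustrative.

```python
import pandas as pd

names = pd.Series(["John", "Sue"], index=["s1", "s2"])
ages = pd.Series([20, 22], index=["s1", "s2"])

# Dictionary keys become column names; the shared index becomes the row index.
df = pd.DataFrame({"name": names, "age": ages})

# Selecting one column returns a Series.
age_column = df["age"]
```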