QMB EXAM 1
not equal to
!=
What are three ways we have explored a data set in the course videos
- .head() - .describe() - .info()
Input the code required to complete the graph: qtrs = [1,2,3,4] series1 = [60,120,150,152] series2 = [10,15,26,109] series3 = [7,56,115,117] labels = ['series1', 'series2', 'series3'] plt. ? (qtrs, series1, series2, series3) plt. ? (loc='upper left') plt. ? ('sales over 4 qtrs')
- stackplot - legend - title
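A minimal sketch of the completed graph code, assuming matplotlib is installed. Two things here are additions for illustration and not part of the card: the Agg backend line (so the script runs without a display) and the labels= argument (the card defines labels but does not show where it is passed).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, not part of the card
import matplotlib.pyplot as plt

qtrs = [1, 2, 3, 4]
series1 = [60, 120, 150, 152]
series2 = [10, 15, 26, 109]
series3 = [7, 56, 115, 117]
labels = ['series1', 'series2', 'series3']

# stackplot draws one filled layer per series, stacked on top of each other
layers = plt.stackplot(qtrs, series1, series2, series3, labels=labels)
plt.legend(loc='upper left')   # the legend picks up the labels passed above
plt.title('sales over 4 qtrs')
```

In a notebook, the backend line can be dropped and the chart displays inline.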
Missing values can be imputed (replaced) with other values. If my data set has 1000 rows and 200 missing values for the category age, what can I impute for age?
- the most common value - the median - the mean
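The three imputation options can be sketched with pandas as follows. The toy data frame and the variable names are hypothetical, chosen only to keep the arithmetic easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two of the six ages are missing
df = pd.DataFrame({'age': [22, 35, np.nan, 35, 58, np.nan]})

mean_age = df['age'].mean()       # missing values are skipped by default
median_age = df['age'].median()
mode_age = df['age'].mode()[0]    # the most common value

# fillna returns a new Series; the original column is left unchanged
age_mean_filled = df['age'].fillna(mean_age)
age_median_filled = df['age'].fillna(median_age)
age_mode_filled = df['age'].fillna(mode_age)
```

Which of the three is appropriate depends on the data; the median is less sensitive to extreme values than the mean.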
If we were to fix and remove the outlier we could do one of two things
- use some code to replace the outlier with the mean of the values - use some code to replace the outlier with the median of the values
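One possible way to replace an outlier with the median is sketched below. The prices, the threshold of 1000, and the choice to compute the median from the non-outlier values are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical prices; 9999 plays the role of the outlier
prices = pd.Series([10.0, 12.0, 11.0, 9999.0, 13.0])

# Assumption: anything above 1000 counts as an outlier (threshold picked by hand)
is_outlier = prices > 1000

# Median of the remaining values, then overwrite only the flagged entries
median_price = prices[~is_outlier].median()
cleaned = prices.mask(is_outlier, median_price)
```

Swapping .median() for .mean() gives the mean-based version of the same fix.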
what is the output of the following code data=pd.Series([.25,.5,.75,1]) data[1]
.5
What will be the output of this code? for i in range(1, 4): print(i, end=' ')
1 2 3
what will be the output if x=1? x = 1 if x==1: print(1+1)
2
First create a simple data set using the code below: import pandas as pd data = [['A1', 2, 4, 8], ['A2', 3, 7, 17], ['A3', 1, None, 7], ['A4', 989, 186, 3698], ['A5', 0, 0, None]] df = pd.DataFrame(data, columns=['ID', 'Value 1', 'Value 2', 'Value 3']) df If you ran: df = df.dropna() df how many rows would remain in the data frame?
3
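The dropna question can be checked by running it. This sketch uses the card's data and stores the result in a separate variable (cleaned, a name introduced here) so the original frame stays available for comparison.

```python
import pandas as pd

data = [['A1', 2, 4, 8],
        ['A2', 3, 7, 17],
        ['A3', 1, None, 7],
        ['A4', 989, 186, 3698],
        ['A5', 0, 0, None]]
df = pd.DataFrame(data, columns=['ID', 'Value 1', 'Value 2', 'Value 3'])

# With default arguments, dropna() drops every row that has at least one missing value
cleaned = df.dropna()
remaining_ids = cleaned['ID'].tolist()
print(len(cleaned), remaining_ids)
```

Rows A3 and A5 each contain a None, so they are the two rows removed.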
if i type the below, what will i get? x=2 y=3 x=3 z=4 print (x+y)
6
What will be the value of total after running this code? total = 0 sales = [12, 14,12,18,17,21,19,18,132,93,3,8,7,76,23,65,192,9,1,3] for N in sales: total = total + N print(total)
743
what will the following code output? def addmeup(x=1, y=1, z=1): addme = x+ y + z return addme result = addmeup(2,5) print(result)
8
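A short sketch of how the default arguments fill in here. The function is the one from the card; the extra calls and result names are added for illustration.

```python
def addmeup(x=1, y=1, z=1):
    addme = x + y + z
    return addme

result_two_args = addmeup(2, 5)   # x=2, y=5, z falls back to its default of 1
result_no_args = addmeup()        # all three defaults are used
result_keyword = addmeup(z=10)    # x and y keep their defaults
print(result_two_args, result_no_args, result_keyword)
```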
less than
<
less than or equal to
<=
what does the following code do: arr = np.array([1,2,3,4,5]) arr[arr>3]
returns array([4, 5]), the elements greater than 3
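Boolean masking happens in two steps, which this sketch separates. The names mask and selected are introduced here for illustration.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])

# Step 1: the comparison produces a boolean array of the same shape
mask = arr > 3

# Step 2: indexing with that boolean array keeps only the True positions
selected = arr[mask]
print(mask, selected)
```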
when working with a pandas data frame, what is one advantage Seaborn has over using native matplotlib to visualize two of the columns.
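Seaborn can take the data frame and column names directly (for example through its data, x, and y arguments), while native matplotlib usually requires pulling the columns out and formatting the data first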
equal to
==
greater than
>
greater than or equal to
>=
What is a function in python
A block of code that performs a specific task and can be reused
What is the default argument in a python function
An argument that takes a default value if no value is provided during the function call
How do you pass arguments to a function in python
By specifying the arguments inside the parentheses when calling the function
How do you count the number of iterations in a for loop
Initialize a counter variable outside the loop and increment it inside the loop
What is the correct way to sum values in a list using a for loop
Initialize a total variable before the loop and add each value to it inside the loop
What does *args do in a python function
It allows the function to accept an arbitrary number of positional arguments
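A minimal sketch of *args in use. The function name total_sales is hypothetical.

```python
def total_sales(*args):
    # args is a tuple holding every positional argument that was passed in
    total = 0
    for amount in args:
        total = total + amount
    return total

three_values = total_sales(12, 14, 12)
no_values = total_sales()          # an empty call is allowed; args is ()
print(three_values, no_values)
```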
What does the break statement do in a loop
It immediately exits the loop, skipping any remaining iterations
what does np.transpose() function do to a 2D array
It swaps the rows and columns of the array, as if the array were flipped on its side
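A small sketch of the row and column swap, reusing the 2D array that appears in a later card. The name np_t is introduced here.

```python
import numpy as np

np_a = np.array([[6.1, 5.8, 5.97],
                 [2.5, 3.19, 2.26]])   # 2 rows, 3 columns

np_t = np.transpose(np_a)              # 3 rows, 2 columns

# The element at row 0, column 1 moves to row 1, column 0
print(np_a.shape, np_t.shape, np_t[1, 0])
```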
what is the output of the function? def myname(): print('joel davis') print('hello class') myname()
joel davis followed by hello class on the next line
Which function in numpy allows you to compute the sum while ignoring NaN values
np.nansum
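The difference between np.sum and np.nansum shows up as soon as one value is missing. The sample values below are hypothetical.

```python
import numpy as np

values = np.array([1.0, np.nan, 2.5])

plain_total = np.sum(values)      # a single NaN makes the whole sum NaN
safe_total = np.nansum(values)    # NaN entries are treated as zero
print(plain_total, safe_total)
```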
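How do you generate an array of five random integers between zero and 10 in numpy
np.random.randint(0, 10, size=5) (the upper bound is exclusive, so this draws from 0 through 9; use 11 as the upper bound to include 10)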
What is the key difference between a numpy array and a pandas series
Series has an explicitly defined index, while numpy array has an implicitly defined integer index
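A brief sketch of the index difference, using the values from the other cards in this set.

```python
import numpy as np
import pandas as pd

arr = np.array([0.25, 0.5, 0.75, 1.0])
data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

by_position = arr[1]      # the array only knows positions 0, 1, 2, 3
by_label = data['b']      # the Series can be addressed by the label we chose
print(by_position, by_label, list(data.index))
```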
How do you create a numpy array filled with zeros of size 3 x 3?
np.zeros((3,3))
If you have a 2D array: np_a = np.array([[6.1,5.8,5.97], [2.5,3.19,2.26]]) how would you access the element 5.8?
np_a[0,1]
How do you create a pandas data frame from a dictionary
pd.DataFrame({'name': ['john'], 'age':[25]})
How do you create a pandas Series from a list of values [.25,.5,.75,1]
pd.Series([.25,.5,.75,1])
What is the main advantage of using functions in python
They allow code to be reused and organized into manageable sections
What is the purpose of a for loop in python
To iterate over a sequence, such as a list or a string
what does the following code return: arr = np.array([10,20,30,40,50]) arr[1:4]
[20,30,40]
what happens when you perform 2*np.array([2,3,10])
[4,6,20]
what it the result of the operation np.array([1,2,3]) + np.array([4,5,6])?
[5,7,9]
What is the result of slicing a pandas Series with data['a':'c'], where data is defined as pd.Series([.25,.5,.75,1], index=['a', 'b', 'c', 'd'])?
a .25 b .5 c .75 dtype:float64
What is the result of the following fancy indexing operation on the Series data? data = pd.Series([.25,.5,.75,1], index = ['a', 'b', 'c', 'd']) data[['a', 'c']]
a series containing the values a: .25 and c: .75
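Label slicing and fancy indexing on a Series can be compared side by side. One detail worth noting is that a label slice includes its end label, unlike a positional slice. The names sliced and picked are introduced here.

```python
import pandas as pd

data = pd.Series([0.25, 0.5, 0.75, 1.0], index=['a', 'b', 'c', 'd'])

sliced = data['a':'c']       # label slice: 'c' is included
picked = data[['a', 'c']]    # fancy indexing: only the listed labels
print(sliced)
print(picked)
```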
What is a nested if statement
an if statement within an if statement
Write a python program that: - checks the stock levels of apples, bananas, and oranges - if the stock of any item is below 10, prints a message to re-order the item - if all items are sufficiently stocked, prints "inventory is sufficient"
apples = 5 bananas = 15 oranges = 8 if apples < 10: print('reorder apples') if bananas < 10: print ('reorder bananas') if oranges < 10: print('reorder oranges') if apples >= 10 and bananas >= 10 and oranges >= 10: print('inventory is sufficient')
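The same inventory check can also be written with a dictionary and a loop. This is an alternative sketch, not the answer expected by the card; the names stock and messages are hypothetical.

```python
# Same stock levels as the card, stored in one dictionary
stock = {'apples': 5, 'bananas': 15, 'oranges': 8}

messages = []
for item, quantity in stock.items():
    if quantity < 10:
        messages.append('reorder ' + item)

# No reorder messages means every item had at least 10 units
if not messages:
    messages.append('inventory is sufficient')

for message in messages:
    print(message)
```

Adding a fourth item only requires a new dictionary entry, with no extra if statement.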
given the array arr=np.array([1,2,3,4,5,6]), how would you reshape it into a 2x3 array?
arr.reshape((2,3))
What portion of the code controls the bins and the bin size in the output
bins=binsep
How do you create a pandas Series from the dictionary {'a': .25, 'b': .5, 'c': .75}?
pd.Series({'a': .25, 'b': .5, 'c': .75})
Sometimes when working with missing values, you find that a value is not actually missing: someone has been helpful and entered a placeholder value like "this is missing". When that happens, what could you do?
replace "this is missing" with a missing value (np.nan)
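Swapping a placeholder for a real missing value might look like the sketch below. The column name and the city values are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column where someone typed a placeholder instead of leaving a blank
df = pd.DataFrame({'city': ['Tampa', 'this is missing', 'Miami']})

missing_before = df['city'].isna().sum()   # pandas does not see the placeholder as missing

df['city'] = df['city'].replace('this is missing', np.nan)

missing_after = df['city'].isna().sum()
print(missing_before, missing_after)
```

Once the placeholder is a real NaN, tools such as dropna() and fillna() treat it like any other missing value.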
what does np.linspace(0,10,15) do?
creates an array of 15 evenly spaced numbers between 0 and 10 (both endpoints included)
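The third argument of np.linspace is the number of points, which a quick check confirms. The name points is introduced here.

```python
import numpy as np

points = np.linspace(0, 10, 15)

# 15 points means 14 equal gaps between 0 and 10
print(len(points), points[0], points[-1])
```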
How do you access the value associated with the key 'b' in the following pandas Series? data = pd.Series([.25,.5,.75,1], index = ['a', 'b', 'c', 'd'])
data['b']
Which of the following lines of code correctly adds a new index 'e' with the value 1.25 to the existing pandas Series? data = pd.Series([.25,.5,.75,1], index = ['a', 'b', 'c', 'd'])
data['e']=1.25
If price has an outlier value that is really, really extreme, what should we do with it
delete these rows
What would produce a table with the first seven rows from the data frame
df.head(7)
What will be the output of this code? for N in "string": print(N)
each character of the string on a new line
What is the difference between else and elif in python
elif checks additional conditions, else handles the default case
matplotlib is built on top of seaborn
false
true or false: all operators produce an int output
false
What will be the output of the code when Y=6? y = 6 if y>0: if y>5: print('higher') if y<=5: print('lower')
higher
company is a list containing 4 strings, sales is a list with 4 integers. Which of these codes snippets would create a bar chart
import matplotlib.pyplot as plt plt.bar(company, sales, color='grey')
how do you import numpy in python with an alias
import numpy as np
What package have we been using to import data and what is the abbreviation we have been using
import pandas as pd
What is the output of the following code when Y=0? if y>0: print('positive number') elif y==0: print('its a zero') else: print('negative number')
its a zero
What is the primary difference between the loc and iloc attributes in pandas
loc uses explicit index labels, while iloc uses implicit integer indices
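The loc and iloc difference is easiest to see when the labels are integers that do not match the positions. The Series below is a hypothetical example built for that purpose.

```python
import pandas as pd

# Integer labels 1, 3, 5 deliberately differ from positions 0, 1, 2
data = pd.Series(['a', 'b', 'c'], index=[1, 3, 5])

by_label = data.loc[1]               # the row whose label is 1
by_position = data.iloc[1]           # the row at position 1 (the second row)

label_slice = data.loc[1:3]          # labels 1 through 3, end label included
position_slice = data.iloc[1:3]      # positions 1 and 2, end position excluded
print(by_label, by_position)
```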
What is the purpose of the code below %matplotlib inline
make the plots show up inline
Which function is used to concatenate two arrays along a specific axis
np.concatenate
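A small sketch of what the axis argument changes. The arrays are hypothetical.

```python
import numpy as np

a = np.array([[1, 2],
              [3, 4]])
b = np.array([[5, 6]])      # one extra row
c = np.array([[7],
              [8]])         # one extra column

rows_added = np.concatenate([a, b], axis=0)   # stack along rows
cols_added = np.concatenate([a, c], axis=1)   # stack along columns
print(rows_added.shape, cols_added.shape)
```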
What is the purpose of the break statement in the following code? x = 1 maxc = 30 while x>0: print(x) x=x+1 if x >= maxc: break print(x)
stop the loop when x reaches 30
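The role of break can be seen by tracing the loop. This sketch assumes the if block sits inside the while loop, and it collects the values in a list (printed, a name introduced here) instead of printing each one.

```python
x = 1
maxc = 30
printed = []   # stands in for the print(x) calls inside the loop

while x > 0:            # always true here, so only break can end the loop
    printed.append(x)
    x = x + 1
    if x >= maxc:
        break           # leave the loop as soon as x reaches 30

print(printed[-1], x)
```

Without the break, the condition x > 0 never becomes false and the loop would run forever.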
How do you calculate the average of a list of numbers using a loop
sum the numbers in the loop and divide the total by the count of numbers
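A short sketch that combines the counter and running-total patterns to compute an average. The list is a hypothetical shortened version of the sales data.

```python
sales = [12, 14, 12, 18]   # hypothetical short list

total = 0   # running sum, initialized before the loop
count = 0   # number of iterations, initialized before the loop

for n in sales:
    total = total + n
    count = count + 1

average = total / count
print(total, count, average)
```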
What will happen when you run the following code? x = 1 while x<=5: print(x) x = x + 1 print(x)
the numbers 1-6 will be printed, with 6 printed last
what is the output of the following code data = pd.Series([.25,.5,.75,1], index = ['a', 'b', 'c', 'd']) data['a':'c']
the values associated with a, b, and c
When using matplotlib, if the color is the only part of the format string, you can use any matplotlib colors spec (eg. full names like "red") or hex strings.
true
heatmaps can be used to quickly understand correlated variables in a data set
true
when creating a chart using Seaborn, it is possible to make formatting changes to the chart using matplotlib code
true
What error occurs if you try to multiply two numpy arrays of different shapes, like np.array([2,3]) * np.array([2,3,10])
ValueError
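The shape mismatch can be caught and inspected as sketched below. Multiplying by a single number is included for contrast, since that case works. The variable names are introduced here.

```python
import numpy as np

a = np.array([2, 3])
b = np.array([2, 3, 10])

try:
    product = a * b                # shapes (2,) and (3,) cannot be matched up
    error_name = None
except ValueError as exc:
    error_name = type(exc).__name__

scaled = 2 * b                     # a single number applies to every element
print(error_name, scaled)
```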
