Python Numpy and Pandas

Numpy: Given two arrays, how would you stack them horizontally and vertically?

To stack horizontally, use hstack(), concatenate() with axis=1, or column_stack(). To stack vertically, use vstack() or concatenate() with axis=0. row_stack() is an alias for vstack() and is deprecated in NumPy 2.0.
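
A minimal sketch of the stacking calls named above, using two small 2D arrays (the names a and b are made up for illustration):

```python
import numpy as np

a = np.arange(6).reshape(2, 3)       # [[0, 1, 2], [3, 4, 5]]
b = a * 10                           # [[0, 10, 20], [30, 40, 50]]

# Horizontal stacking: columns of b are placed to the right of a.
h1 = np.hstack((a, b))               # shape (2, 6)
h2 = np.concatenate((a, b), axis=1)  # same result
h3 = np.column_stack((a, b))         # same result for 2D inputs

# Vertical stacking: rows of b are placed below a.
v1 = np.vstack((a, b))               # shape (4, 3)
v2 = np.concatenate((a, b), axis=0)  # same result
```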

Pandas: What is the use of inplace=True in many pandas methods?

With inplace=True the method modifies the DataFrame itself and returns None. With inplace=False (the default) the original DataFrame is left unchanged and a modified copy is returned.
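
As a small illustration, here is drop() called both ways on a made-up two-column DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# Default (inplace=False): a new DataFrame is returned, df is untouched.
result = df.drop(columns="b")
columns_before = list(df.columns)          # still ["a", "b"]

# inplace=True: df itself is modified and the call returns None.
ret = df.drop(columns="b", inplace=True)
columns_after = list(df.columns)           # now ["a"]
```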

Numpy: what is notation for "not a number"?

NaN

Define numpy object

NumPy's main object is the homogeneous multidimensional array. It is a table of elements (usually numbers), all of the same type, indexed by a tuple of non-negative integers. In NumPy, dimensions are called axes.

What happens when you add two Series objects (series1 + series2)?

Values are aligned by index label. The result's index is the union of both indexes; a label that appears in only one of the two Series gets NaN.
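
A short sketch of label alignment, with two made-up Series that share only the labels "b" and "c":

```python
import pandas as pd

s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# Labels are matched, not positions; unmatched labels become NaN.
total = s1 + s2

# add() with fill_value treats a missing label as 0 instead of NaN.
filled = s1.add(s2, fill_value=0)
```

Here total has NaN at "a" and "d", while filled keeps the single available value for those labels.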

Numpy: How do you get a numpy.flatiter object, and what is it used for?

The flat property gives back a numpy.flatiter object. This is the only means to get a flatiter object; we do not have access to a flatiter constructor. The flat iterator enables us to loop through an array as if it were a flat array.
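
A minimal sketch of the flat property on a made-up 2x2 array:

```python
import numpy as np

a = np.arange(4).reshape(2, 2)     # [[0, 1], [2, 3]]

it = a.flat                        # a numpy.flatiter object
kind = type(it).__name__

# Loop over the 2D array as if it were one-dimensional.
values = [int(x) for x in a.flat]

# flat also accepts a flat index, for reading or assigning.
last = int(a.flat[3])
a.flat[0] = -1                     # changes a[0, 0]
```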

Numpy: How do you use dstack()? Give an example.

dstack() stacks a tuple of arrays depth-wise, that is, along the third axis. For example, we could stack 2D arrays of image data on top of each other as follows (the output implies that a holds 0 to 8 in a 3x3 array and that b is 2 * a):
In: dstack((a, b))
Out: array([[[ 0,  0], [ 1,  2], [ 2,  4]],
            [[ 3,  6], [ 4,  8], [ 5, 10]],
            [[ 6, 12], [ 7, 14], [ 8, 16]]])

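Numpy: What is broadcasting?

Broadcasting lets NumPy combine arrays of different but compatible shapes, or an array and a scalar, by virtually stretching the smaller operand to match the larger one. This makes it possible to change many values in one go. The example below replaces every empty string in column 4 with '0': world_alcohol[:,4][world_alcohol[:,4]=='']='0'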
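
A small sketch of broadcasting with made-up arrays. The last two lines mirror the card's mask assignment on a tiny string array rather than the real world_alcohol data:

```python
import numpy as np

m = np.arange(6).reshape(2, 3)     # shape (2, 3)
row = np.array([10, 20, 30])       # shape (3,)  -> stretched across rows
col = np.array([[100], [200]])     # shape (2, 1) -> stretched across columns

plus_row = m + row                 # [[10, 21, 32], [13, 24, 35]]
plus_col = m + col                 # [[100, 101, 102], [203, 204, 205]]

# A scalar is broadcast to every position selected by the Boolean mask.
vals = np.array(["1.5", "", "2"])
vals[vals == ""] = "0"
```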

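Numpy: Explain arithmetic operations on NumPy arrays.

Arithmetic operators work element-wise on arrays of the same (or broadcastable) shape. We can add, subtract, multiply, and divide: arr+arr, arr-arr, arr*arr, arr/arr. Note that arr*arr is an element-wise product, not a matrix product.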

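Numpy: Given a 3D array array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) how do you flatten it? How do you transpose it? How do you change its shape?

To flatten, use ravel() or flatten(). flatten() always allocates new memory, while ravel() returns a view when it can. To transpose, use transpose() (or the T attribute). To change the shape, use reshape(), which returns a reshaped array, or resize(), which modifies the array in place.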
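
A sketch on the same 2x3x4 array, built here with arange (the variable names are illustrative only):

```python
import numpy as np

b = np.arange(24).reshape(2, 3, 4)

r = b.ravel()              # a view here, because b is contiguous
f = b.flatten()            # always a copy

r[0] = -1                  # writes through to b
f[1] = -2                  # does not touch b

t = b.transpose()          # axes reversed: shape (4, 3, 2)
reshaped = b.reshape(6, 4) # same data, new shape
```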

Pandas: How do you combine multiple filter conditions with "and" and "or"?

The Python keywords and / or do not work on pandas Boolean Series. Use & for "and" and | for "or", and wrap each condition in parentheses, for example df[(df['x'] > 5) & (df['y'] < 3)].
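
A short sketch with a made-up DataFrame (the column names age and fare are assumptions for illustration):

```python
import pandas as pd

df = pd.DataFrame({
    "age": [22, 35, 58, 41],
    "fare": [7.25, 71.3, 26.5, 8.05],
})

# "and": both conditions must hold. Parentheses are required
# because & binds more tightly than the comparison operators.
both = df[(df["age"] > 30) & (df["fare"] < 30)]

# "or": at least one condition must hold.
either = df[(df["age"] > 50) | (df["fare"] > 50)]
```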

Explain this code:
totals = {}
year = (world_alcohol[:,0]=='1989')
year = world_alcohol[year,:]
for i in countries:
    is_country = (year[:,2]==i)
    country_consumption = year[is_country,:]
    country_consumption[:,4][country_consumption[:,4]=='']='0'
    l = country_consumption[:,4].astype(float)
    isum = l.sum()
    #print(i,l,'-',isum)
    totals[i] = isum
print(totals)

The code finds the total alcohol consumption for each country in countries (a list of all country names) for the year 1989. It first keeps only the 1989 rows. Then, for each country, it selects that country's rows, replaces empty values in column 4 with '0', converts the column to float, and sums it. When it finishes, totals contains all of the country names as keys, with the corresponding alcohol consumption totals for 1989 as values.

Create a 3x5 numpy array.

a = np.arange(15).reshape(3, 5)

Here is an array: a = np.arange(0,10). Change it to shape (2, 5).

a.reshape(2,5)

Numpy: Given an array a=array([[ 1., 0., 0., 0., 0.], [ 0., 1., 0., 0., 0.], [ 0., 0., 1., 0., 0.], [ 0., 0., 0., 1., 0.], [ 0., 0., 0., 0., 1.]]) how to know shape of an array

a.shape

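Numpy: What is conditional selection? If a = np.array([1,2,3,4,5,6,2,4,3,4,11,2,33,23]), create an array with the numbers > 5.

Conditional selection indexes an array with a Boolean mask of the same shape and keeps the elements where the mask is True: a[a>5]. This requires a NumPy array; a plain Python list does not support it.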

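Pandas: How do you use the apply method? How do you use apply on a row?

apply can be used to apply any function to each column or each row. df.apply(func) passes each column to func (axis=0, the default); df.apply(func, axis=1) passes each row.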
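
A minimal sketch of both directions, using a made-up DataFrame and lambda functions chosen only for illustration:

```python
import pandas as pd

df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})

# axis=0 (default): the function receives one column at a time.
col_ranges = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives one row at a time.
row_sums = df.apply(lambda row: row["x"] + row["y"], axis=1)
```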

What are the different ways to create arrays?

np.array([1,2,3,4]), np.arange(...).reshape(...), np.zeros(), np.ones(), np.empty()

How do you get the index position of the max value in an array?

Use argmax: a.argmax() or np.argmax(a)

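Numpy: Explain the functions np.min(), np.max(), np.argmax(), and np.argmin().

np.min() returns the smallest value and np.max() the largest value. np.argmin() returns the index of the smallest value and np.argmax() the index of the largest value. Without an axis argument, the index refers to the flattened array.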
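
A small sketch on a made-up 2x3 array, showing the difference between values and index positions:

```python
import numpy as np

a = np.array([[3, 9, 1],
              [7, 2, 8]])

smallest = np.min(a)                 # value
largest = np.max(a)                  # value
idx_max = np.argmax(a)               # index into the flattened array
idx_min = np.argmin(a)

# With axis=1, one index is returned per row.
idx_max_per_row = np.argmax(a, axis=1)

# Convert a flat index back to (row, column) coordinates.
pos = tuple(int(i) for i in np.unravel_index(idx_max, a.shape))
```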

Numpy: How do you change the datatype of an array?

Use the astype() method, for example arr.astype(float). It returns a new array with the requested dtype.

Numpy: How to convert the data type of an array?

astype() method. vector = numpy.array(["1", "2", "3"]) vector = vector.astype(float)

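What is the use of the axis parameter in Pandas?

axis=0 refers to the rows (the index) and axis=1 refers to the columns. For a reduction such as sum(), axis=0 aggregates down the rows and gives one result per column; axis=1 aggregates across the columns and gives one result per row.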

How many axes are there in the ndarray below, and what is the length of each? [[ 1., 0., 0.], [ 0., 1., 2.]]

The array has 2 axes. The first axis (rows) has length 2 and the second axis (columns) has length 3, so its shape is (2, 3).

List all numpy numerical types

bool, int_ (platform-dependent integer), int8, int16, int32, int64, uint8, uint16, uint32, uint64, float16, float32, float64 (float), complex64, complex128 (complex)

Pandas: How do you convert all column names into a list?

col_names = food_info.columns.tolist()

Write down attributes of a DataFrame

columns, dtypes, shape, index, etc.

Pandas: How do you do a union of data?

Use concat, for example pd.concat([df1, df2]), which stacks the rows of df2 below those of df1.

Pandas: How do you drop a column? What is the default axis value?

df.drop('a', axis=1) drops column 'a'. The default is axis=0, which drops rows by index label instead.

How do you give a name to the index?

df.index.name = 'a'
For a MultiIndex, set one name per level: df.index.names = ['a','b']

Pandas: How to retrieve specific column and rows?

df.loc[['r1','r2'],['c1','c2']]

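How do you join data using join?

df1.join(df2) joins the two DataFrames on their index. It is a left join by default; pass how='inner', 'outer', or 'right' to change that.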

Pandas: How do you filter data? How do you get multiple columns from the filtered rows?

df[df['x'] > 5][['x','y','z']]
or, in one step: df.loc[df['x'] > 5, ['x','y','z']]

Drop all columns in titanic_survival that have missing values and assign the result to drop_na_columns. Drop all rows in titanic_survival where the columns "age" or "sex" have missing values and assign the result to new_titanic_survival.

drop_na_rows = titanic_survival.dropna(axis=0)
drop_na_columns = titanic_survival.dropna(axis=1)
new_titanic_survival = titanic_survival.dropna(axis=0, subset=["age", "sex"])
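
To see what each call keeps, here is a sketch on a tiny made-up DataFrame standing in for titanic_survival (the values are invented for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [22.0, np.nan, 30.0],
    "sex": ["m", "f", "f"],
    "fare": [np.nan, 8.0, 9.0],
    "pclass": [1, 2, 3],
})

# axis=0: drop every row that has at least one missing value.
rows = df.dropna(axis=0)

# axis=1: drop every column that has at least one missing value.
cols = df.dropna(axis=1)

# subset: only look at "age" and "sex" when deciding which rows to drop.
subset_rows = df.dropna(axis=0, subset=["age", "sex"])
```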

Pandas: List functions that handle missing values

dropna, fillna, isnull, notnull, etc.

Create an array with an explicit complex data type

np.array([1, 2, 3], dtype=complex)

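What is the default datatype of numbers in a Series object?

It depends on the data: integers are stored as int64 on most platforms and floats as float64. A numeric Series that contains missing values (NaN) is stored as float64, which is why numeric Series often end up as float.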

Pandas: How to read CSV file?

food_info = pandas.read_csv("food_info.csv")

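Numpy: What is the use of the hsplit(), vsplit(), and split() functions?

hsplit splits an array horizontally, by columns (along axis 1). vsplit splits it vertically, by rows (along axis 0). split splits along whichever axis is given by the axis parameter (axis 0 by default).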
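
A brief sketch of the three functions on a made-up 3x4 array:

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# hsplit: cut along the column axis into 2 equal pieces of shape (3, 2).
left, right = np.hsplit(a, 2)

# vsplit: cut along the row axis into 3 pieces of shape (1, 4).
top, middle, bottom = np.vsplit(a, 3)

# split: the same operation with the axis chosen explicitly.
cols = np.split(a, 2, axis=1)
```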

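What is the use of numpy character codes? Why are they still there? What do the character codes d, V, and U mean?

The character codes i, f, u, b, d, S, V, U are one-letter shorthand for assigning a data type to a numpy object, for example dtype='d'. They are preserved for backward compatibility with Numeric, the predecessor of NumPy. Among them, d is a double-precision float, V is void (raw data), and U is a Unicode string.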

Numpy: Which function can be used to create numpy array from file

np.genfromtxt() creates an array from a text file:
import numpy as np
world_alcohol = np.genfromtxt("world_alcohol.csv", delimiter=",")
With the default dtype=float, text fields are read as nan; pass a string dtype (for example dtype="U75") to keep them as text.
Signature: np.genfromtxt(fname, dtype=float, comments='#', delimiter=None, skip_header=0, skip_footer=0, converters=None, missing_values=None, filling_values=None, usecols=None, names=None, excludelist=None, deletechars=None, replace_space='_', autostrip=False, case_sensitive=True, defaultfmt='f%i', unpack=None, usemask=False, loose=True, invalid_raise=True, max_rows=None)

Pandas: What is the use of iloc and iat?

iloc selects rows and columns by integer position. iat accesses a single value by integer position and is faster than iloc for that purpose.

Numpy: Explain the linspace function

It generates evenly spaced numbers over a given range: np.linspace(a, b, n) returns n values from a to b, with both endpoints included by default.
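
A minimal sketch with the range 0 to 1 and five points (values chosen only for illustration):

```python
import numpy as np

# Five evenly spaced values from 0 to 1, both endpoints included.
pts = np.linspace(0, 1, 5)

# endpoint=False leaves the stop value out, which changes the spacing.
open_pts = np.linspace(0, 1, 5, endpoint=False)
```

The first call has a step of 0.25 and the second a step of 0.2.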

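What is the use of np.set_printoptions(threshold=np.nan) when printing an array?

It makes NumPy print the full array instead of abbreviating long arrays with "...". It has nothing to do with displaying NaN values. Recent NumPy versions reject threshold=np.nan; use threshold=sys.maxsize instead.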

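Pandas: What is the use of the iloc and loc indexers? Write an example.

loc selects by label and iloc selects by integer position. For example, df.loc['country','state'] returns the value at row label 'country' and column label 'state', while df.iloc[[1,2,3,4]] returns the rows at positions 1 to 4.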
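
A short sketch contrasting label-based and position-based access on a made-up DataFrame (row labels r1 to r3 and column labels c1, c2 are illustrative):

```python
import pandas as pd

df = pd.DataFrame(
    {"c1": [1, 2, 3], "c2": [4, 5, 6]},
    index=["r1", "r2", "r3"],
)

by_label = df.loc["r2", "c2"]            # row label, column label
by_pos = df.iloc[1, 1]                   # row position, column position

block = df.loc[["r1", "r3"], ["c1"]]     # lists of labels give a DataFrame
rows = df.iloc[[0, 2]]                   # list of positions selects rows

single = df.iat[2, 0]                    # fast scalar access by position
```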

Pandas: How do you merge data? How do you do an outer join?

Use merge: pd.merge(left, right, on=[key1, key2]). It performs an inner join by default; pass how='outer' for an outer join.
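
A minimal sketch of inner versus outer merge on two made-up DataFrames that share the column key:

```python
import pandas as pd

left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# Inner join (default): only keys present in both frames survive.
inner = pd.merge(left, right, on="key")

# Outer join: all keys from both frames; missing values become NaN.
outer = pd.merge(left, right, on="key", how="outer")
```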

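Explain the code below:
flt = (world_alcohol[:,0]=='1986') & (world_alcohol[:,2]=='Canada')
canada_1986 = world_alcohol[flt]
print(canada_1986)
canada_1986[:,4][canada_1986[:,4]==''] = '0'
canada_alcohol = canada_1986[:,4].astype(float)
print(canada_alcohol)
total_canadian_drinking = canada_alcohol.sum()

The code builds a Boolean mask for the rows where column 0 (the year) is '1986' and column 2 (the country) is 'Canada', and uses it to select those rows. It then replaces empty strings in column 4 with '0', converts that column to float, and sums it to get the total for Canada in 1986.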

What is NumPy's array class called?

ndarray

List attributes of ndarray

ndarray.ndim: the number of axes (dimensions) of the array.
ndarray.shape: the dimensions of the array.
ndarray.size: the total number of elements of the array.
ndarray.dtype: an object describing the type of the elements in the array.
ndarray.itemsize: the size in bytes of each element of the array.
ndarray.data: the buffer containing the actual elements of the array.

Define each of the attributes below: ndarray.ndim, ndarray.shape, ndarray.size, ndarray.dtype, ndarray.itemsize, ndarray.data

ndarray.ndim: the number of axes (dimensions) of the array.
ndarray.shape: the dimensions of the array.
ndarray.size: the total number of elements of the array.
ndarray.dtype: an object describing the type of the elements in the array.
ndarray.itemsize: the size in bytes of each element of the array.
ndarray.data: the buffer containing the actual elements of the array.

Numpy: Explain the numpy attributes ndim, size, itemsize, nbytes, and T

ndim gives the number of dimensions.
size holds the count of elements.
itemsize returns the count of bytes for each element in the array.
nbytes is the total number of bytes the elements occupy, that is, size * itemsize.
The T property has the same result as the transpose() function.

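Decode this code:
passenger_classes = [1, 2, 3]
fares_by_class = {}
for this_class in passenger_classes:
    pclass_rows = titanic_survival[titanic_survival["pclass"] == this_class]
    pclass_fares = pclass_rows["fare"]
    fare_for_class = pclass_fares.mean()
    fares_by_class[this_class] = fare_for_class

The loop computes the average fare for each passenger class. For each class in passenger_classes it selects the matching rows of titanic_survival, takes their "fare" column, computes the mean, and stores it in fares_by_class under the class number.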

Numpy: How to generate identity matrix

np.eye(4)

Numpy: how to generate 5x5 random matrix between values 0,1

np.random.rand(5,5)

Pandas: How do you use the pivot_table function?

passenger_class_fares = titanic_survival.pivot_table(index="pclass", values="fare", aggfunc=np.mean)
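
As a rough illustration, the sketch below runs a pivot_table call of this kind on a tiny made-up DataFrame (standing in for titanic_survival) and compares it with a per-class mean computed by hand. It passes aggfunc as the string "mean", which is an equivalent spelling of the aggregation.

```python
import pandas as pd

# Invented data; only the column names follow the card.
titanic = pd.DataFrame({
    "pclass": [1, 1, 2, 3, 3],
    "fare": [80.0, 60.0, 20.0, 8.0, 6.0],
})

# Mean fare per class, computed one class at a time.
fares_by_class = {
    c: titanic.loc[titanic["pclass"] == c, "fare"].mean()
    for c in [1, 2, 3]
}

# The same aggregation in a single pivot_table call.
pivot = titanic.pivot_table(index="pclass", values="fare", aggfunc="mean")
```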

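How do you manually create a DataFrame? What arguments can be passed?

pd.DataFrame(data, index, columns, dtype): data can be a dict of lists or Series, a list of dicts, or a 2D array; index and columns supply the row and column labels; dtype forces a data type.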

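What is a Pandas Series object? Explain different ways to create a Series.

A Series is a one-dimensional labeled array. Create one with pd.Series(data=..., index=...), where data can be a list, a NumPy array, a dict (its keys become the index), or a scalar.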

Numpy: What is the difference between rand and randn?

rand returns samples from a uniform distribution over [0, 1). randn returns a sample (or samples) from the "standard normal" distribution (mean 0, variance 1).

Pandas: What is the use of the reset_index() method?

reset_index() moves the current index into a regular column and replaces it with the default integer index (0, 1, 2, ...).

Pandas: What is the use of the set_index() method?

set_index() can be used to set any existing column as the index.
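
A short sketch showing set_index() and reset_index() as a round trip on a made-up DataFrame:

```python
import pandas as pd

df = pd.DataFrame({"name": ["x", "y"], "val": [1, 2]})

# The "name" column becomes the index and is removed from the columns.
indexed = df.set_index("name")

# The index goes back to being a column; a 0..n-1 index takes its place.
restored = indexed.reset_index()
```

Both calls return a new DataFrame and leave their input unchanged.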

Numpy comes with many universal array functions (ufuncs), which are essentially mathematical operations applied element-wise across the array. List a couple of them.

np.sqrt, np.exp, np.log, np.sin, np.maximum, np.minimum, etc.

Numpy: List five aggregate functions for a numpy array

sum, mean, max, min, std, etc. (NumPy arrays have no count method; use the size attribute or np.count_nonzero() instead.)

Numpy: How do you convert a numpy array to a list?

Use the tolist() method: arr.tolist()

Create a 2D array from a list

Pass a nested list to the array function: np.array([[1, 2, 3], [4, 5, 6]])

How do you get the data type of the objects in an array?

Use the array's dtype attribute: arr.dtype

Pandas: How do you select multiple columns?

zinc_copper = food_info[["Zinc_(mg)", "Copper_(mg)"]]
columns = ["Zinc_(mg)", "Copper_(mg)"]
zinc_copper = food_info[columns]
selenium_thiamin = food_info[["Selenium_(mcg)", "Thiamin_(mg)"]]

