ST308 Quiz 3

geom_smooth()

Adds a trend line to a plot

longer (more rows and fewer columns)

Analysis methods often prefer (longer/wider)?

.cols =

Attribute in across() that specifies the columns you want to apply the func to

.fns =

Attribute in across() that specifies the func you want to apply

use = "complete.obs"

Attribute in cov() and cor() functions that removes NA values in calculation
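
A minimal sketch of the use = "complete.obs" card, in base R. The vectors x and y are made up for illustration; only pairs where both values are present are used in the calculation.

```r
# Made-up vectors; x has one missing value
x <- c(1, 2, 3, 4, NA)
y <- c(2, 4, 6, 8, 10)

r_default  <- cor(x, y)                        # NA, because of the missing value
r_complete <- cor(x, y, use = "complete.obs")  # uses only the 4 complete pairs
c_complete <- cov(x, y, use = "complete.obs")  # covariance on the same 4 pairs
```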

sep

Attribute that specifies the character used to separate or unite column(s)

aes(x = ...)

Attribute used to specify that we want our categories across the x-axis

values_to

Attribute within pivot_longer() that names the new column(s) holding the data values

names_to

Attribute within pivot_longer() that names the new column(s) holding the old column names

cols

Attribute within pivot_longer() that specifies columns to pivot to longer format

values_from

Attribute within pivot_wider() that specifies the column(s) to get the cell values from

names_from

Attribute within pivot_wider() that specifies the column(s) to get the names used in the output columns
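
A small round-trip sketch of the pivot_longer() and pivot_wider() cards above, assuming the tidyr package is installed. The data frame and the column names (id, q1, q2, quarter, sales) are hypothetical.

```r
library(tidyr)

# Hypothetical wide data: one row per id, one column per quarter
wide <- data.frame(id = c(1, 2), q1 = c(10, 20), q2 = c(30, 40))

# Longer: q1 and q2 become rows; old column names go to "quarter",
# cell values go to "sales"
long <- pivot_longer(wide, cols = c(q1, q2),
                     names_to = "quarter", values_to = "sales")

# Wider: undo the step above
back <- pivot_wider(long, names_from = quarter, values_from = sales)
```

Here long has more rows and fewer columns than wide, and back has the same columns as wide again.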

.fns = , .cols =

Attributes of across() func

linear relationship

Covariance, Correlation, etc are measures of

dplyr::across()

Func that allows for applying a summarization to multiple columns easily
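
A short sketch of across() with .cols and .fns, assuming dplyr 1.0.0 or later. The data frame df and its columns are invented for the example; where(is.numeric) picks the numeric columns.

```r
library(dplyr)

# Invented data: one character column, two numeric columns
df <- data.frame(g = c("a", "a", "b"), x = c(1, 3, 5), y = c(2, 4, NA))

# Apply the same summary (mean, ignoring NA) to every numeric column
out <- df %>%
  summarise(across(.cols = where(is.numeric),
                   .fns = ~ mean(.x, na.rm = TRUE)))
```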

geom_jitter()

Func used to add jittered points to a plot (e.g., on top of a box plot)

geom_point()

Func used to create scatter plot

geom_violin()

Func used to create violin plot similar to boxplot

levels() <- c()

Func used to specify the levels of a factor (variable)
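
A base R sketch of the levels() card, using a made-up vector called size. Note that assigning to levels() relabels the existing levels in their current order; it does not reorder them.

```r
# Made-up character vector turned into a factor
size <- as.factor(c("s", "m", "l", "m"))
old_levels <- levels(size)   # levels are sorted: "l" "m" "s"

# Relabel the levels, matching the sorted order above
levels(size) <- c("large", "medium", "small")
```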

geom_boxplot()

Function from ggplot2 used to create a boxplot

geom_density()

Function from ggplot2 used to create a kernel smoother (smoothed version of a histogram)

pivot_longer()

Function that lengthens data by increasing the number of rows and decreasing the number of columns

cor()

Function that returns correlation value

cov()

Function that returns covariance value

pivot_wider()

Function that widens data by increasing the number of columns and decreasing the number of rows

as.factor()

Function used to create a new factor version of a variable

table()

Function used to create contingency table

geom_text()

Function used to add text labels to a plot

shape

Histogram, Density plot, etc describe the ______ of the data (numeric)

Describe the relative frequency (or count) for each category

How do we describe the distribution of a categorical variable?

shape, measures of center, and measures of spread

How do we describe the distribution of a numeric variable?

center

Mean, Median, etc are measures of....

position = "..."

Syntax for position attribute

contingency tables

Table that describes the relative frequency (or count) for each category (cat variables)

fill

Used in position attribute and stacks bars and standardizes each stack to have constant height

stack

Used in position attribute and stacks bars on top of each other

jitter

Used in position attribute for continuous data with many points at same values

dodge

Used in position attribute to create side-by-side bar plot
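
A sketch comparing the stack, dodge, and fill position values from the cards above, assuming ggplot2 is installed. The data frame df and its columns class and sex are made up.

```r
library(ggplot2)

# Made-up categorical data
df <- data.frame(class = c("a", "a", "b", "b", "b"),
                 sex   = c("f", "m", "f", "f", "m"))

base <- ggplot(df, aes(x = class, fill = sex))

p_stack <- base + geom_bar(position = "stack")  # bars on top of each other
p_dodge <- base + geom_bar(position = "dodge")  # side-by-side bars
p_fill  <- base + geom_bar(position = "fill")   # stacks scaled to the same height
```

Printing any of the three objects (e.g. p_dodge) draws the plot.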

Categorical (Qualitative) variable

Variable where entries are a label or attribute

Numeric (Quantitative) variable

Variable where entries are a numerical value where math can be performed

spread

Variance, Standard Deviation, Quartiles, IQR, etc are measures of

cols, names_to, and values_to

What are the attributes of pivot_longer()?

names_from and values_from

What are the attributes of pivot_wider()?

sep

What attribute is important to write out in the separate() and unite() functions?

fill = "..."

What do we write in labs() to label the legend for stacked bar plots?

Q3 - Q1

What is IQR?

Shape and measures of linear relationship

What is used to describe the dist of two numeric variables?

summarise(avg = mean(fare, na.rm = TRUE), med = median(fare, na.rm = TRUE), var = var(fare, na.rm = TRUE))

What to write if we want to make variables based on summaries of subgroups of data: an avg variable that gives the mean for each subgroup, a med variable that gives the median for each subgroup, and a var variable that gives the variance for each subgroup

[ , , ]

What to write when you want to find conditional bivariate info from three-way contingency table
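
A base R sketch of indexing a three-way contingency table with [ , , ]. The data frame and its columns (sex, class, survived) are hypothetical; leaving the first two positions empty and fixing the third gives the two-way table for that level.

```r
# Hypothetical data with three categorical variables
df <- data.frame(
  sex      = c("f", "f", "m", "m", "f", "m"),
  class    = c("1st", "2nd", "1st", "2nd", "1st", "1st"),
  survived = c("yes", "no", "no", "no", "yes", "yes")
)

# Three-way table: sex by class by survived
tab <- table(df$sex, df$class, df$survived)

# Conditional two-way table: sex by class, only for survived == "yes"
tab_yes <- tab[ , , "yes"]
```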

group_by(var1, var2) %>% summarise(.....)

What to write when you want to find summary values for subgroups based on two variables?
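
A sketch of group_by() with two variables followed by summarise(), assuming dplyr is installed. The data frame rides and its columns are invented; .groups = "drop" is added here only to return an ungrouped result.

```r
library(dplyr)

# Invented data with two grouping variables and one numeric variable
rides <- data.frame(
  class = c("1st", "1st", "1st", "2nd", "2nd"),
  sex   = c("f", "f", "m", "f", "f"),
  fare  = c(80, 100, 60, 20, NA)
)

# One row of summaries per (class, sex) subgroup
out <- rides %>%
  group_by(class, sex) %>%
  summarise(avg = mean(fare, na.rm = TRUE),
            med = median(fare, na.rm = TRUE),
            .groups = "drop")
```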

In the ggplot() func

Where does the aes(x = ...) go?

na.rm = TRUE

Which attribute is used for numeric functions to remove NA values in calculation?

aes()

Which attribute maps variables in the data frame to plot elements?

if_else()

Which function is used to execute statements conditionally to create a variable?
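
A sketch of if_else() inside mutate(), assuming dplyr is installed. The column names fare and price and the cutoff of 50 are made up. A missing condition gives a missing result.

```r
library(dplyr)

# Made-up data with one missing fare
df <- data.frame(fare = c(10, 60, NA))

# New variable: "high" when fare > 50, otherwise "low" (NA stays NA)
df <- df %>%
  mutate(price = if_else(fare > 50, "high", "low"))
```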

group_by()

Which function is used when creating summaries for groups?

summary()

Which function returns the Min, 1st Qu, Median, Mean, 3rd Qu, Max, and NA's?

Scatter plot

Which kind of plot is used to describe the shape of the distribution of two numeric variables?

tidyr

Which package is used to reshape data?

levels

______ define all possible values for the factor (variable)

probs =

attribute in quantile() that specifies which quantile(s) to return, given as probabilities between 0 and 1

alpha =

attribute used to specify transparency

labs

attribute used to label things in plot

aes()

defines visual properties of objects in the plot

coord_flip()

func that rotates a plot

label = paste()

func used when adding text to a plot

ggplot() + geom_bar() and ggplot() + stat_count()

funcs used to create bar plots

quantile()

function that returns the quantile(s) of a numeric variable for the specified probabilities
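
A base R sketch of quantile() with probs =, which also shows IQR as Q3 - Q1. The vector x is made up, and the default quantile method (type 7) is assumed.

```r
# Made-up numeric vector
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)

# First and third quartiles
q <- quantile(x, probs = c(0.25, 0.75))

# IQR by hand (Q3 - Q1) and with the built-in function
iqr_manual  <- unname(q[2] - q[1])
iqr_builtin <- IQR(x)
```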

where()

function used in across() .cols to select columns based on a specific characteristic

mutate()

function used to add newly created column(s) to current data frame (doesn't overwrite the data frame)

unite()

function used to combine two or more columns into one

tidyr::drop_na(var_name)

function used to remove rows where a variable is NA

separate()

function used to separate a column
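
A sketch of separate() and unite() with sep written out, assuming tidyr is installed. The data frame and the column names (date, year, month) are hypothetical.

```r
library(tidyr)

# Hypothetical data: year and month stored in one column
df <- data.frame(date = c("2024-01", "2024-02"))

# Split one column into two at the "-" character
sep_df <- separate(df, col = date, into = c("year", "month"), sep = "-")

# Combine the two columns back into one, joined by "-"
uni_df <- unite(sep_df, col = "date", year, month, sep = "-")
```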

stat = "identity"

if you have summary data and don't want to use stat = "count", specify y and use .....
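
A sketch of a bar plot from already-summarised data, assuming ggplot2 is installed. The data frame counts and its columns class and n are made up; y is mapped to the precomputed count and stat = "identity" tells geom_bar() to use it as is.

```r
library(ggplot2)

# Made-up summary data: one row per category with its count
counts <- data.frame(class = c("a", "b"), n = c(3, 2))

# Bar heights come straight from n instead of being counted
p <- ggplot(counts, aes(x = class, y = n)) +
  geom_bar(stat = "identity")
```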

ggplot2

package used to create plots

distribution

pattern and frequency with which you observe a variable

factor

special class of vector with a levels attribute

