R Methods

aov(formula, data=___)

Fit an analysis of variance model by a call to lm for each stratum.
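A minimal sketch of a one-way ANOVA. The built-in mtcars data set and the mpg-by-cylinder comparison are assumptions chosen only for illustration:

```r
# Assumption: mtcars and the choice of variables are illustrative only.
# factor() makes cyl a grouping variable rather than a numeric predictor.
fit_aov <- aov(mpg ~ factor(cyl), data = mtcars)

# summary() prints the ANOVA table (Df, Sum Sq, Mean Sq, F value, p-value).
aov_table <- summary(fit_aov)
aov_table
```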

t.test(vectorname,optionalsecondvector,conf.level=___)

Performs one- and two-sample t-tests on vectors of data. Reports the estimated mean with a confidence interval (one sample), or the two group means with a confidence interval for their difference (two sample).
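A short sketch of both forms. The two vectors are made-up numbers, not data from the original card:

```r
# Hypothetical measurements, invented for this illustration.
before <- c(5.1, 4.9, 6.2, 5.8, 6.0, 5.5, 5.3, 6.1)
after  <- c(5.9, 5.4, 6.8, 6.1, 6.6, 6.0, 5.7, 6.9)

# One sample: the estimate is the sample mean.
one <- t.test(before, conf.level = 0.95)
one$estimate   # mean of x
one$conf.int   # confidence interval for the mean

# Two samples (Welch test by default): the estimates are the two means,
# and the confidence interval is for their difference (x minus y).
two <- t.test(before, after, conf.level = 0.95)
two$estimate
two$conf.int
```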

quantile(vectorname, probvalue)

The generic function quantile produces sample quantiles corresponding to the given probabilities. The smallest observation corresponds to a probability of 0 and the largest to a probability of 1. Can be used with the stat column from the infer package, passed as a vector.
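A small sketch with a made-up vector, showing the endpoints and a pair of percentiles:

```r
# Hypothetical data for illustration.
x <- c(2, 4, 4, 5, 7, 9, 10, 12)

q_min <- quantile(x, 0)                  # probability 0: smallest observation
q_max <- quantile(x, 1)                  # probability 1: largest observation
q_mid <- quantile(x, c(0.025, 0.975))    # cutoffs for the middle 95%

q_min
q_max
q_mid
```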

element_blank()

Theme element: blank. This theme element draws nothing, and assigns no space

position_jitter(width = ___, height = ___)

allows you to control how much to jitter points horizontally and vertically

read.csv("cities.csv")

creates a data.frame from a csv file

get_regression_points(modelname)

creates a tibble with, for each observation, the observed value of every model variable, the fitted value (y hat), and the residual

rbinom(numberofresults, numberofflipsineachresult, chanceofeachflip)

creates a vector of how many "heads" you get from a coin flip style probability
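A quick sketch, assuming a fair coin and a fixed seed (both chosen here for illustration):

```r
set.seed(1)                   # fix the seed so the draws are reproducible
heads <- rbinom(5, 10, 0.5)   # 5 results, each the number of heads in 10 fair flips
heads                         # five counts, each between 0 and 10
```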

cor()

takes two numeric variables and returns their correlation. Often called inside summarize(), but also works on its own

generate(reps = ___, type = ___)

generates a certain number of replicates and gives them a type, such as "bootstrap"

which(vectorname == value)[1]

which() returns every index where the condition is TRUE; taking [1] gives the index of the first instance of value in vectorname
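A small sketch with a made-up vector. Note that which() takes a logical condition, so the comparison goes inside the call:

```r
x <- c(3, 7, 2, 7, 9)    # hypothetical data

all_hits  <- which(x == 7)      # every position where x equals 7
first_hit <- which(x == 7)[1]   # only the first such position
all_hits     # 2 4
first_hit    # 2

match(7, x)  # base R shortcut for the first position, also 2
```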

newdata parameter of get_regression_points

specifies a new dataframe to apply your predictions to

first three steps of the infer package

specify() will specify the response and explanatory variables. hypothesize() will declare the null hypothesis. generate() will generate resamples, permutations, or simulations. A fourth step, calculate(), then computes the statistic for each replicate.

pivot_longer()

"lengthens" data, increasing the number of rows and decreasing the number of columns. uses cols, names_to, values_to

aes(aesthetic = value, ...)

Aesthetic mappings describe how variables in the data are mapped to visual properties (aesthetics) of geoms. Examples are x =, y =, color =

pairwise.t.test(responsevector, groupingvector)

Calculate pairwise comparisons between group levels with corrections for multiple testing.
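A minimal sketch, again assuming the built-in mtcars data purely for illustration:

```r
# Assumption: mtcars is used only as example data.
# Compare mpg between every pair of cylinder groups (4 vs 6, 4 vs 8, 6 vs 8).
pw <- pairwise.t.test(mtcars$mpg, mtcars$cyl)

pw$p.value           # matrix of adjusted p-values, one per pair of groups
pw$p.adjust.method   # "holm" is the default multiple-testing correction
```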

%>%

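R pipe operator (from magrittr, loaded with dplyr). Passes the result on its left as the first argument of the function on its right, so steps such as filter() can be chained on a data.frame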

scale_aesthetic_manual(values = c(value1, value2, ...))

These functions, such as scale_color_manual() and scale_fill_manual(), allow you to specify your own set of mappings from levels in the data to aesthetic values.

subsetting data

dataname[row:row, col:col]
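For example, with the built-in mtcars data (an assumption for illustration):

```r
first_rows <- mtcars[1:3, 1:2]   # rows 1 to 3, columns 1 to 2 (mpg and cyl)
first_rows

mtcars[1:3, ]                    # leave the column side empty to keep every column
```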

tibble(attribute1 = ___, attribute2 = ___, ...)

a modern version of the data frame that prints more compactly and is stricter about subsetting and type conversion

annotate(

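adds a one-off embellishment to the plot, such as text, a segment, or a rectangle, positioned by values you pass directly rather than by a data frame mapping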

mutate(col=___,col2=___ ,...)

adds new columns to a data frame

attach(data.framename)

allows you to access the variables of the data.frame without the dollar sign $

element_something()

allows you to alter the various theme components

arrange(colname)

sorts the rows by the values of the given column, ascending by default. Unlike group_by(), it actually changes the order in which rows are displayed

n()

gives the number of observations in the current group. Can only be called inside dplyr verbs such as summarize(), mutate(), and filter()

summarize(summarycol1=____,...)

compiles and displays summary data chosen by the programmer. Can be split up using group_by()

margin(top,right,bottom,left,unit)

creates a border of whitespace. Used to set the spacing around rectangular objects such as the plot area and legend, e.g. legend.margin = margin(20, 30, 40, 50, "pt")

data.frame(attr1=__,attr2=___,...

creates a dataframe with many attributes, which should be set to some list or vector

lm(y ~ x, data=data.framename)

fits a linear model of a chosen response variable against an explanatory variable and returns a model object. To draw the fitted line on a ggplot, use geom_smooth(method = "lm"). Use + on the right-hand side of the formula for multiple regression.
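A minimal sketch of simple and multiple regression, assuming the built-in mtcars data for illustration:

```r
# Assumption: mtcars and these variable choices are illustrative only.
fit1 <- lm(mpg ~ wt, data = mtcars)        # simple regression: one explanatory variable
fit2 <- lm(mpg ~ wt + hp, data = mtcars)   # multiple regression: + adds another

coef(fit1)   # intercept and slope for wt
coef(fit2)   # intercept plus one slope per explanatory variable

# Fitted values plus residuals reproduce the observed response.
rebuilt <- fitted(fit1) + residuals(fit1)
head(rebuilt)
```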

rnorm(n, mean = 0, sd = 1)

creates a vector of n random draws from a normal distribution with the given mean and standard deviation

random poisson, rpois(n, lambda)

creates a random vector of n counts from a Poisson distribution. lambda is the mean (expected count) and can be any non-negative number; it is not a percentage
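A quick check that the second argument of rpois() acts as the mean of the counts. The seed and the value 4 are arbitrary choices for this sketch:

```r
set.seed(123)                        # arbitrary seed for reproducibility
counts <- rpois(10000, lambda = 4)   # 10,000 draws with expected count 4

mean(counts)   # close to 4
var(counts)    # also close to 4, since a Poisson's variance equals its mean
```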

geom_smooth(method=___,se=FALSE)

adds a smoothed trend line to the plot. method = "lm" gives a straight linear-model line; se = FALSE hides the confidence band

ftable(vectorname)

creates frequency table for categorical values

theme(component1=___,....)

customize the non-data components of your plots: i.e. titles, labels, fonts, background, gridlines, and legends.

visualize()

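plots the simulated distribution of the statistic as a histogram; add shade_p_value() to shade the region that makes up the p value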

get_p_value()

computes the p value from the simulated null distribution, given obs_stat and direction

factor(vectorname)

encode a vector as a factor

filter(col==value)

filters out all rows that don't satisfy the given predicate

%in%

returns TRUE for each element on the left that appears in the vector of values on the right. Used inside filter() to keep the rows where a variable is one of those values
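A base R sketch of what the operator returns and how it selects rows. The mtcars data is an assumption for illustration; the dplyr form appears only in a comment:

```r
c("a", "d") %in% c("a", "b", "c")   # TRUE FALSE

# Keep the rows of mtcars whose cyl is 4 or 6.
# The dplyr equivalent would be: filter(mtcars, cyl %in% c(4, 6))
small_engines <- mtcars[mtcars$cyl %in% c(4, 6), ]
nrow(small_engines)   # 18
```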

get_regression_table(modelname)

give a model and it will give you the regression table

mean(dataset)

gives mean

sd(dataset)

gives standard deviation for dataset

IQR(vectorname)

gives the IQR

var(listorvectorname)

gives the variance

group_by(col)

marks rows that share a value in the given column as belonging to the same group, without changing the order of the displayed data. Should be used for categorical data. typically used with summarize.
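A sketch that chains several of the dplyr verbs from this set. It assumes the dplyr package is installed and uses the built-in mtcars data for illustration:

```r
library(dplyr)   # assumption: dplyr is installed

by_cyl <- mtcars %>%
  group_by(cyl) %>%                  # one group per cylinder count
  summarize(n = n(),                 # number of rows in each group
            mean_mpg = mean(mpg)) %>%
  arrange(desc(mean_mpg))            # highest average mpg first

by_cyl
```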

how does the area in the tails change as the degrees of freedom increase for a t distribution

decreases; the tails get thinner and the t distribution approaches the standard normal
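A quick numerical check using pt(). The cutoff of 2 and the list of degrees of freedom are arbitrary choices for this sketch:

```r
dfs <- c(2, 5, 30, 1000)

# Area in the upper tail beyond t = 2 for each value of df.
tail_area <- pt(2, df = dfs, lower.tail = FALSE)
tail_area                        # shrinks as df grows

pnorm(2, lower.tail = FALSE)     # the standard normal tail that it approaches
```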

ggplot(dataset, mapping = aes())

initialize a ggplot object by declaring the input data frame and plot aesthetics

library(libname)

loads a package into the script so that its methods can be used

hist(dataset)

makes a histogram

qt(p, df)

p is a probability and df is the degrees of freedom. Gives the cutoff value for the t distribution with df degrees of freedom for which the probability under the curve to the left of the cutoff is p

how to print to console in R

print()

read_csv("Web address or file")

puts data into a tibble, requires the readr package, allows imports from the web.

pt(q,df)

q is a given cutoff value and df is the degrees of freedom. Gives the probability under the t distribution with df degrees of freedom for values of t less than q.
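pt() and qt() undo each other, which the following sketch illustrates. The values 0.975 and 10 are arbitrary:

```r
cutoff <- qt(0.975, df = 10)   # t value with 97.5% of the area to its left
cutoff                         # about 2.228

prob <- pt(cutoff, df = 10)    # feeding the cutoff back in recovers the probability
prob                           # 0.975
```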

pbinom(q,n,p)

q is a number of heads, n is the number of flips, and p is the probability of heads. Returns the probability of getting q or fewer heads
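pbinom() is cumulative and includes q itself, so it equals the sum of dbinom() from 0 up to q. A sketch with 10 fair flips (numbers chosen for illustration):

```r
at_most_3 <- pbinom(3, 10, 0.5)         # P(3 or fewer heads in 10 flips)
summed    <- sum(dbinom(0:3, 10, 0.5))  # same probability, added up term by term
at_most_3                               # 0.171875

# For strictly fewer than 3 heads, step q down by one.
fewer_than_3 <- pbinom(2, 10, 0.5)
fewer_than_3                            # 0.0546875
```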

sample_frac(size=___,replace=___)

randomly samples the given fraction of the rows; with size = 1 and replace = FALSE it shuffles all the rows

desc(vector/listname)

used inside arrange() to sort by a column in descending order instead of ascending

replicate(n,functioncall)

repeats a function call n times and puts results into a vector
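One common use is resampling. The sketch below builds bootstrap means by hand; the data vector, seed, and 1000 repetitions are all assumptions made for illustration:

```r
set.seed(42)                             # arbitrary seed for reproducibility
x <- c(12, 15, 9, 20, 14, 11, 17, 13)    # hypothetical sample

# Resample x with replacement and take the mean, 1000 times over.
boot_means <- replicate(1000, mean(sample(x, replace = TRUE)))

length(boot_means)                       # 1000
quantile(boot_means, c(0.025, 0.975))    # percentile interval for the mean
```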

theme_set(nameoftheme)

sets the given theme as the default for all plots drawn afterwards

diff()

base R function that returns the differences between consecutive elements of a vector; can also be used inside dplyr verbs
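For example, with a made-up vector:

```r
x <- c(1, 4, 9, 16)

steps <- diff(x)            # each element minus the one before it
steps                       # 3 5 7

wide <- diff(x, lag = 2)    # each element minus the one two places back
wide                        # 8 12
```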

select(col1,col2,...)

selects only specified columns of the data.frame, using - (negation) on a col omits it from the results.

unit(amount, unit)

set a whitespace value for use in margin(). used to set tick marks on axes to a variable amount of some unit such as axis.ticks.length=unit(2,"cm")

labs(labelname=___,...)

set the label texts of the plot

hypothesize(null = ___, ...)

set your null hypothesis using an option such as "point" or "independence"; a point null also needs a value such as mu or p. The stat is set afterwards in calculate(stat = ___).

c(e1, e2, e3,...)

short for concatenate. creates a vector

glimpse(data.framename)

similar to str(), but prints one line per column showing its type and first few values

facet_wrap(~ variablename)

splits a plot by a certain categorical variable

str(objectname)

str short for structure. Gives cursory information on any given object like a list of numbers

built in themes and ggthemes

theme_gray() theme_bw() theme_classic() theme_void() theme_fivethirtyeight() and so on

TRouBLe

top right bottom left unit; a mnemonic for the argument order of margin()

tbl_df(data.framename)

turns a data.frame into a tibble. Deprecated in recent versions of dplyr; as_tibble() is the replacement

prop.test(successnumbervector,count,conf.level=___)

used for testing the null that the proportions (probabilities of success) in several groups are the same, or that they equal certain given values.
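A minimal two-group sketch. The success counts and group sizes are invented for illustration:

```r
# Hypothetical counts: 15 of 50 successes in group 1, 25 of 50 in group 2.
res_prop <- prop.test(c(15, 25), c(50, 50), conf.level = 0.95)

res_prop$estimate   # sample proportions: 0.3 and 0.5
res_prop$p.value    # tests the null that the two proportions are equal
res_prop$conf.int   # confidence interval for the difference in proportions
```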

starts_with(string)

used with select() to pick the columns whose names start with the given string

tidy(x)

x is an object to be converted into a tidy data.frame

dbinom(x,n,p)

x is the desired number of heads, n is the number of flips, and p is the probability of heads. Returns the chance of exactly that many heads appearing
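A sketch checking dbinom() against the binomial formula, using 5 heads in 10 fair flips as an arbitrary example:

```r
exact   <- dbinom(5, 10, 0.5)         # P(exactly 5 heads in 10 fair flips)
by_hand <- choose(10, 5) * 0.5^10     # binomial formula: C(n, x) * p^x * (1 - p)^(n - x)

exact                                 # 0.2460938
all.equal(exact, by_hand)             # TRUE
```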

