NCSU ST370 Chapters 1-8 Important Content

r² sentence

"about {r²}% of the variability in Y can be explained by its linear relationship with X"

μ (ANOVA model)

(ANOVA) baseline mean

i

(ANOVA) index for the levels of a factor; i = 1, ..., t, where t is the number of levels

j

(ANOVA) index for the replicates within each treatment group; j = 1, ..., n, where n is the number of replicates per group

N (ANOVA)

(ANOVA) total number of observations; t × n

significance of a small MSE

(ANOVA) very little variation in response variable within treatment groups

the total area under a pdf, f(y)

1

uniform distribution properties

1) X assumes values only in a bounded interval 2) the pdf of X is constant over the interval, f(x) = c

requirements for a binomial experiment

1) fixed number of identical trials 2) trials are independent 3) two outcomes, often "success" or "failure" 4) each trial has the same probability of success, p

design of experiments procedure (6 steps)

1) identify problem 2) determine factors 3) determine number of experimental units 4) determine how the factors will be handled (controlled, manipulated, etc) 5) collect data and perform analysis 6) draw conclusions (inferential statistics)

types of experimental error

1) inherent variability 2) measurement error 3) variations in applying/creating treatments 4) extraneous/lurking/confounding variables

assumptions of SLR

1) linear relationship between X and Y (scatter plot) 2) Yi's are independent from each other (study design) 3) Y is approximately normal (QQ plot)

assumptions of one-way ANOVA and how to test them

1) random sample is selected from each group (design) 2) true variance of Y is the same for all groups (Levene Test) 3) Y is normally distributed within each population (histogram or QQ plot)

when are observational studies performed?

1) to study relationships among variables 2) to learn about the population distribution 3) obtain info/data for experimental studies 4) when control is unethical or impossible

the two sources of variation in one-way ANOVA

1) treatment effect 2) error

percentage of all values within 1 SD of mean

68.3%

percentage of all values within 2 SDs of mean

95.4%
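
The two percentages above can be checked numerically. This is a minimal sketch, assuming the values follow a normal distribution; the helper name `prob_within` is made up for illustration. It uses the identity P(|Z| < k) = erf(k / √2) for Z ~ N(0,1).

```python
from math import erf, sqrt

def prob_within(k):
    """P(|Z| < k) for Z ~ N(0, 1): share of values within k SDs of the mean."""
    return erf(k / sqrt(2))

# Print the share of values within 1 and 2 standard deviations of the mean.
for k in (1, 2):
    print(f"within {k} SD: {100 * prob_within(k):.1f}%")
```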

E(Y) of a binomial distribution

E(Y) = np
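
As a rough check of E(Y) = np, the expected value can be computed straight from the definition (the sum of y · P(Y = y)) and compared with n · p. The values n = 10 and p = 0.3 below are illustrative assumptions, not course data.

```python
from math import comb

def binomial_pmf(y, n, p):
    """P(Y = y) for Y ~ Bin(n, p)."""
    return comb(n, y) * p**y * (1 - p) ** (n - y)

def binomial_mean(n, p):
    """E(Y) from the definition: sum over y of y * P(Y = y)."""
    return sum(y * binomial_pmf(y, n, p) for y in range(n + 1))

n, p = 10, 0.3  # illustrative values only
print(binomial_mean(n, p), n * p)  # the two numbers should agree up to rounding
```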

replicate

EUs that receive the same treatment

disjoint or mutually exclusive

P(A ∩ B) = 0

r² equation

SS(model) / SS(Tot)
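
A small sketch of how this ratio might be computed for simple linear regression, assuming the line is fitted by least squares. The function names and the data are hypothetical and chosen only for illustration.

```python
def fit_slr(x, y):
    """Least-squares estimates (b0, b1) for the line y = b0 + b1 * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def r_squared(x, y):
    """r² = SS(model) / SS(Tot)."""
    b0, b1 = fit_slr(x, y)
    y_bar = sum(y) / len(y)
    # SS(model): variation of the fitted values around the overall mean
    ss_model = sum((b0 + b1 * xi - y_bar) ** 2 for xi in x)
    # SS(Tot): variation of the observed values around the overall mean
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return ss_model / ss_tot

x = [1, 2, 3, 4, 5]              # made-up data
y = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared(x, y))
```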

cumulative distribution function (CDF) for discrete RV

The probability that the observed value of X will be at most x; denoted as F(x) = P(X≤x).

expected value of a discrete random variable is a parameter (T/F)

True

shape of binomial distribution

X ~ Bin (n, p) ; n = #trials, p= probability of success in each trial

how to write the exponential distribution

X ~ Exp(λ)

how to write a normal distribution

X ~ N(μ,σ²)

how to write the uniform distribution

X ~ U (a,b); where a and b are the bounds/parameters of the distribution

writing of the standard normal distribution

Z ~ N(0,1)

parameter

a (usually) unknown summary value about the population

two-way ANOVA analyzes data from what experiment type?

a factorial experiment

F-stat

a measure of the ratio of the variation between the groups to the variation within the groups
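
One way to see this ratio concretely is to compute it by hand for a one-way layout: the mean square for treatments (between groups) divided by the MSE (within groups). The sketch below assumes each inner list holds the responses of one treatment group; the function name and the data are made up.

```python
def one_way_f(groups):
    """F = MS(treatment) / MSE for a one-way ANOVA."""
    t = len(groups)                          # number of treatment groups
    big_n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / big_n
    means = [sum(g) / len(g) for g in groups]
    # between-group variation: group means around the overall mean
    ss_trt = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # within-group variation: observations around their own group mean
    ss_err = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    return (ss_trt / (t - 1)) / (ss_err / (big_n - t))

groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # made-up data
print(one_way_f(groups))
```

A large value suggests the group means differ by more than the within-group variation would explain.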

random variable ***

a real-valued function that assigns a real number to each possible outcome in the sample space

treatment

a specific experimental condition, either the level of a factor or combinations of levels from multiple factors

correlation

a statistic that measures the strength and direction of the linear association between two quantitative variables

sample

a subset of the population we observe data on

statistic

a summary value calculated from the sample observations

qualitative

a variable that is described by attributes or labels

quantitative

a variable that is described by numerical measurements

methods for accounting for/reducing effects of lurking variables

a) controlled variables b) blocking

significance of total sum of squares

all of the variation in the response variable in our sample compared to the overall mean

full factorial experiment

all possible level combinations are used as treatments

probability distribution

all possible values with corresponding probabilities

population

all the values, items, or individuals of interest

stratified sampling (define and tell how)

allows the researcher to control for variables that may influence the outcome. 1) divide the population into groups (strata) 2) select an SRS from each group

Bernoulli Trial

an experiment with only 2 possible mutually exclusive outcomes

ANOVA (acronym)

analysis of variance

one way ANOVA answers the question:

are the means of these groups different? (for only 1 factor)

control treatment

benchmark treatment sometimes necessary for comparison

factor

categorical (qualitative) explanatory variable of interest

statistical inference

claim about a population based on sample data

properties of a normal distribution

1) continuous 2) unimodal 3) defined entirely by the mean and SD 4) symmetric

continuous

data type in which any value in an interval is possible

ordinal

data type in which categories can be ordered

nominal

data type in which categories have no ordering

discrete

data type with a finite or countably infinite number of values

probability density function (pdf)

describes the probability distribution of a continuous RV; denoted by f(y)

blocking

divide subjects with similar characteristics into "blocks," then in each block randomly assign to treatment groups

variations in applying/creating treatments

error due to treatment not being clearly defined, leaving room for interpretation

extraneous/lurking/confounding variables

error from variables that are not part of the treatment, but may influence the response

inherent variability

error type characterized by the fact that no two experimental units are the same

measurement error

error type due to error in measurement

mutually independent

events are _____________ if the probability of the intersection of any subset of the n events is equal to the product of the individual probabilities

simple random sampling (define and tell how)

every unit in the population has an equal chance of being selected. 1) assign each unit of the population a number 2) use a random number generator to select which units to use
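
The two steps above can be sketched with Python's standard `random` module. The population of 100 numbered units and the helper name `simple_random_sample` are assumptions made for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units so that every unit has the same chance of selection."""
    rng = random.Random(seed)  # optional seed makes the draw repeatable
    return rng.sample(population, n)

# Step 1: assign each unit of the population a number (here 1..100).
population = list(range(1, 101))
# Step 2: let the random number generator pick which units to use.
chosen = simple_random_sample(population, 10, seed=370)
print(chosen)
```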

what type of study establishes causality?

experimental study

for a CDF, F(x), F'(x) = ?

f(x), the pdf

when interaction is not significant, use (treatment effects/main effects)

fitted main effects

when interaction is important, use (treatment effects/main effects)

fitted treatment effects

completely randomized design (CRD)

for t treatments, replicated n_t times each, use a random number generator to assign the treatments to the EUs

controlled variables

holding certain variables constant across the EUs decreases generalizability, but reduces experimental error

independent

if and only if any one of these 3 holds: 1) P(A|B) = P(A) 2) P(B|A) = P(B) 3) P(A ∩ B) = P(A) * P(B)
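
These conditions can be checked on a small example. The sketch below assumes two fair dice, with A = "first die is even" and B = "second die shows 5 or 6"; both events are chosen purely for illustration. Exact fractions are used to avoid rounding.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event as (favorable outcomes) / (all outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def in_a(o):
    return o[0] % 2 == 0   # A: first die is even

def in_b(o):
    return o[1] > 4        # B: second die shows 5 or 6

p_a = prob(in_a)
p_b = prob(in_b)
p_ab = prob(lambda o: in_a(o) and in_b(o))

print(p_ab == p_a * p_b)   # product rule: P(A ∩ B) = P(A) * P(B)
print(p_ab / p_b == p_a)   # conditional form: P(A|B) = P(A)
```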

observational study

individuals in a sample are studied but the investigator does not attempt to manipulate or influence the variables of interest

multiplicative model indicators

interaction p<.05 = interaction = dependent

additive model indicators

interaction p>.05 = no interaction = independent

beta_0 hat meaning

intercept; when X = 0, we expect Y to be beta_0 hat

beta_0 and beta_1 are estimated with the method of

least squares

μ

mean (parameter)

ȳ or x̄

mean (statistic)

randomization

means that the treatments are randomly allocated to the EUs

repeated measures

measuring the same experimental unit multiple times

k (two way ANOVA)

number of replicates in each treatment group

extrapolation

predicting a new Y value that is outside the range of the data.

conditional probability

probability of an event A given event B already occurred

p-value

probability, assuming the null hypothesis is true, of observing an F-stat at least as large as the one we found

p or π

proportion (parameter)

p̂ or π̂

proportion (statistic)

probability

proportion of times something would likely occur in many repeated trials

covariate

quantitative explanatory variable

epsilon_ij

random error corresponding to the j-th observation in the i-th level

replication

repetition of an experiment using a large group of subjects to reduce chance variation in the results

coefficient of determination

r²

r or rho hat

sample (estimated) correlation

SRS avoids ________

selection bias

beta_1 hat meaning

slope; for every 1-unit increase in X, we expect a change of about beta_1 hat in the average of Y

σ

standard deviation (parameter)

S or s

standard deviation (statistic)

discrete random variable

takes on a finite or countably infinite number of values

continuous random variable

takes on any value in an interval (or union of intervals) of real numbers

study

the act or process of investigating something

main effects

the differences in the mean response when the factor goes from one level to another

union

the event consisting of all outcomes that are in A or B, or both (A ∪ B)

intersection

the event consisting of all outcomes that are in both A and B (A ∩ B)

memoryless property

the exponential distribution has this special property, which means that if X is the lifetime of a component, the failure rate is constant across time; the probability that the component lasts more than "a+t" time units, given that it has already lasted "a" units, is the same as the probability that a new component lasts more than "t" time units: P(X > a+t | X > a) = P(X > t)
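
The property can be verified numerically from the survival function P(X > x) = e^(-λx) of X ~ Exp(λ). The values of λ, a, and t below are arbitrary assumptions, and the function names are made up for the sketch.

```python
from math import exp

def survival(x, lam):
    """P(X > x) for X ~ Exp(lam), with x >= 0."""
    return exp(-lam * x)

def conditional_survival(a, t, lam):
    """P(X > a + t | X > a) = P(X > a + t) / P(X > a)."""
    return survival(a + t, lam) / survival(a, lam)

lam, a, t = 0.5, 2.0, 3.0  # arbitrary illustrative values
# Both numbers should match: the time already survived (a) does not matter.
print(conditional_survival(a, t, lam), survival(t, lam))
```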

alpha_i

the main effect of group i

probability mass function (pmf)

the probability distribution function of a discrete random variable; denoted by f(y) = P(Y = y)

complement (A')

the set of all available outcomes not contained in the event

level

the specified value for the factor

experiment

the study environment is regulated, the variables of interest are manipulated by the investigator

MSE in SLR

the variation between observed #'s and predicted #'s

experimental error

the variation in response among replicates

treatment effect

there is an effect due to the variables we are setting

rho

true population correlation

what does a 3x4 factorial design indicate?

two factors, one with 3 levels the other with 4

experimental units

units to which the treatments are assigned

convenience sample

use of the most convenient group available

σ²

variance (parameter)

S² or s²

variance (statistic)

exponential distribution's single parameter

λ; λ > 0

pdf of an exponential distribution

f(x) = λe^(-λx), for x ≥ 0

