MGSC 291 Hendrix


Z=

(x - μ) / σ
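A minimal R sketch of the z-score formula above. The values of x, mu, and sigma are made up for illustration.

```r
# Hypothetical values: an observation of 85 from a population
# with mean 70 and standard deviation 10
x <- 85
mu <- 70
sigma <- 10

# z-score: number of standard deviations x lies above (+) or below (-) the mean
z <- (x - mu) / sigma
z
```

A positive z means the observation is above the mean; a negative z means it is below.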

Pie Charts

-Qualitative -displays parts of a whole -not good when there are too many categories -NEVER make 3D or tilted -not good for comparisons

Boxplot

-displays quantitative data - works for small to large datasets -plots the five number summary -great for side-by-side comparisons

Histograms

-medium to large quantitative sets -bins touch -choice of number of bins can distort features of the shape of the distribution

Bar Graph

-qualitative data -can be horizontal or vertical -display parts of a whole or separate value

Line Graph

-quantitative data changing over time -use different lines to denote separate categories -beware of plotting different scales

Skewed left

-the tail to the left of the peak is longer than the tail to the right of the peak -mean<median

Skewed right

-the tail to the right of the peak is longer than the tail to the left of the peak -mean>median

Scatterplot

-used to depict two potentially related variables -linear, curvilinear, or no relationship -positive or negative relationship

1. Chebychev's 2. Empirical Rule

Two ways to estimate the percent of observations within a certain number of sd of the mean

0.5

=p

dim()

Check the dimensions (number of rows and columns) of a data set in R

name<-read.csv(file.choose(),header=TRUE)

Code to call a data set into R (csv)

name<-read.table(file.choose(),header=TRUE)

Code to call a data set into R (delimited text table, e.g. a table exported from Excel as text)

1-P(A)

Complement rule

P(AintersectB)= P(A|B)P(B)

Multiplication rule (conditional probability rearranged)

X-bar has a normal distribution

Confidence intervals work when

P(X<a)

For a continuous random variable, P(X<=a) is the same as

Poisson Process

Discrete Probability -only one event can occur at a particular point -events occurring in one range are independent of events occurring in other ranges -the expected number of events during any such interval is constant m

Binomial Experiment

Discrete Probability n identical trials where: - each trial has only 2 outcomes - probability of "success" is a constant p for every trial -Trials are independent

np

E[X]= (binomial)

1/lambda

Expected value for the exponential distribution (continuous probability)

Mean

Expected value=

Q3-Q1

IQR

the distribution is normal

If the sample mean is normal

n

Sample size: no matter what, round up. It needs to be a whole number

Event A will not occur

P(A)=0

Event A will surely occur

P(A)=1

Event A will happen 50% of the time

P(A)=1/2

P(AintersectB)/P(B)

P(A|B)=

When events are disjoint

P(A|B)=0 P(B|A)=0

When events are independent

P(A|B)=P(A) P(B|A)=P(B)
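The conditional probability, independence, and union cards can be checked numerically. This is a sketch with made-up probabilities, not values from the course.

```r
# Hypothetical probabilities chosen for illustration
p_a <- 0.4        # P(A)
p_b <- 0.5        # P(B)
p_a_and_b <- 0.2  # P(A intersect B)

# Conditional probability: P(A|B) = P(A intersect B) / P(B)
p_a_given_b <- p_a_and_b / p_b

# Independence check: A and B are independent if P(A|B) equals P(A)
independent <- isTRUE(all.equal(p_a_given_b, p_a))

# Union (addition) rule: P(A U B) = P(A) + P(B) - P(A intersect B)
p_a_or_b <- p_a + p_b - p_a_and_b
```

With these numbers P(A|B) equals P(A), so the events are independent. Equivalently, P(A intersect B) equals P(A) times P(B).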

pbinom(j,n,p)

R code for binomial experiment where P(X<=j)

dbinom(j,n,p)

R code for binomial experiment where P(X=j)
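A short sketch of the two binomial functions, assuming a hypothetical experiment with n = 10 trials and success probability p = 0.5.

```r
# Hypothetical binomial experiment: 10 trials, success probability 0.5
n <- 10
p <- 0.5

p_exactly_3 <- dbinom(3, n, p)    # P(X = 3)
p_at_most_3 <- pbinom(3, n, p)    # P(X <= 3)
p_more_than_3 <- 1 - p_at_most_3  # complement rule: P(X > 3)

expected_value <- n * p           # E[X] = np
variance <- n * p * (1 - p)       # Var(X) = np(1-p)
```

dbinom() gives the probability of one exact count, while pbinom() accumulates the probabilities of every count up to and including j.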

qnorm(prob.,mu,sd)

R code for finding the percentile (the x value) for a given cumulative probability under a normal distribution

ppois(j,lambda)

R code for poisson process where P(X<=j)

dpois(j,lambda)

R code for poisson process where P(X=j)
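The Poisson functions work the same way as the binomial ones. The sketch below assumes a hypothetical rate of lambda = 2 events per interval.

```r
# Hypothetical Poisson process averaging 2 events per interval
lambda <- 2

p_zero_events <- dpois(0, lambda)   # P(X = 0)
p_at_most_one <- ppois(1, lambda)   # P(X <= 1)
p_two_or_more <- 1 - p_at_most_one  # complement rule: P(X >= 2)
```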

pnorm(x,mu,sd)

R code for the probability P(X<=x) under a normal distribution
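pnorm() and qnorm() are inverses of each other: one turns a value into a cumulative probability, the other turns a cumulative probability back into a value. The sketch assumes a hypothetical normal distribution with mean 100 and sd 15.

```r
# Hypothetical normal distribution: mean 100, standard deviation 15
mu <- 100
sd_x <- 15

# Probability of a value at or below 115 (one sd above the mean)
p_below_115 <- pnorm(115, mu, sd_x)

# Probability of a value between 85 and 115 (within one sd of the mean)
p_between <- pnorm(115, mu, sd_x) - pnorm(85, mu, sd_x)

# 90th percentile: the value with 90% of the distribution at or below it
x_90th <- qnorm(0.90, mu, sd_x)
```

The "between" probability comes out close to 68%, which matches the Empirical Rule for one sd of the mean.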

pexp(t,lambda)

R code to compute the probability P(T<=t) under the exponential distribution
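A small sketch of pexp(), assuming a hypothetical rate of lambda = 0.5 events per unit of time.

```r
# Hypothetical exponential distribution with rate 0.5 per unit of time
lambda <- 0.5

expected_wait <- 1 / lambda          # expected value is 1/lambda
p_within_2 <- pexp(2, lambda)        # P(T <= 2)
p_longer_than_2 <- 1 - p_within_2    # complement rule: P(T > 2)
```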

prop.test()

R code to compute the CI for the population proportion, p.

t.test(name,conf.level=.9)

R code to give the 90% confidence interval for the mean

t.test()

R code to give the 95% confidence interval for the mean (the default level)
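The two t.test() cards can be compared side by side. The data vector below is invented for illustration; only t.test() and its conf.level argument come from the cards.

```r
# Made-up sample of 8 measurements
sample_data <- c(12.1, 11.8, 12.5, 12.0, 11.6, 12.3, 12.2, 11.9)

# Default confidence level is 95%
ci_95 <- t.test(sample_data)$conf.int

# Ask for a 90% interval instead
ci_90 <- t.test(sample_data, conf.level = 0.90)$conf.int

# Interval widths: upper limit minus lower limit
width_95 <- ci_95[2] - ci_95[1]
width_90 <- ci_90[2] - ci_90[1]
```

The 95% interval is wider than the 90% interval because a larger confidence level increases the width of the CI.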

s

Sample standard deviation

Coefficient of Variation

Sd expressed as a percent of the mean: (s / x̅) * 100 -compare variation in datasets with different units or means
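The coefficient of variation can be computed in one line of R. The data below are made up for illustration.

```r
# Made-up sample of heights in centimeters
heights_cm <- c(160, 165, 170, 175, 180)

# Coefficient of variation: sample sd as a percent of the sample mean
cv <- sd(heights_cm) / mean(heights_cm) * 100
cv
```

Because the units cancel, the result can be compared with the CV of a dataset measured in different units.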

Big Data

The huge capacity of data warehouses

CLT

The sampling distribution of a sample mean, sum, or percentage will become approximately normal as the sample size gets larger

Empirical Rule

Unimodal distributions that are fairly symmetric -about 68% of observations within 1 sd of the mean -about 95% within 2 sd of the mean -about 99.7% within 3 sd of the mean

U

Union

P(A U B)= P(A)+P(B)-P(AintersectB)

Union (Addition) rule

np(1-p)

Var(X)= (binomial)

mean is pulled in the direction of the outlier, median stays about the same

What happens to mean/ median when an outlier is added?

report the mean and sd

When data is symmetric

report the 5 number summary

When data is skewed left/right

nameofdoc$Columnname

Work with 1 column at a time in R
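A sketch of dim() and the $ operator using mtcars, a data set that ships with R. It stands in here for a file loaded with read.csv(), so the data set and column name are not from the course.

```r
# mtcars is built into R; it plays the role of a loaded data set here
cars <- mtcars

# Number of rows and columns
dim(cars)

# Pull out one column with $ and summarize it
mpg_values <- cars$mpg
mean_mpg <- mean(mpg_values)
sd_mpg <- sd(mpg_values)
```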

Variables

are measured in the columns

Characteristics

are measured in the rows

Union rule

at least one event happened

Descriptive statistics

collecting, organizing, and presenting the data

Data warehouse

data are recorded and stored electronically, in vast digital repositories

Larger sample size

decrease width of CI

Sampling variability

different samples from the same population may yield different values of the sample statistic

Inferential Statistics

drawing conclusions about a population based on sample data from that population.

<-

assignment operator in R (works like "equals" when naming an object)

Chebychev's Rule

for any population with a mean and sd, the percent of observations that lie within k sd of the mean (k > 1) is at least (1 - 1/k^2)*100
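Chebychev's Rule gives a guaranteed minimum for any distribution, while the Empirical Rule gives approximate values for bell-shaped data. A quick sketch comparing the two for k = 2 and k = 3, using the normal distribution as the bell-shaped case:

```r
# Number of standard deviations from the mean
k <- c(2, 3)

# Chebychev's Rule: at least (1 - 1/k^2) * 100 percent, for any distribution
chebyshev_pct <- (1 - 1 / k^2) * 100

# Percent within k sd for a normal distribution (basis of the Empirical Rule)
normal_pct <- (pnorm(k) - pnorm(-k)) * 100

# Side-by-side comparison
data.frame(k, chebyshev_pct, normal_pct)
```

The Chebychev percentages are lower because they must hold even for badly skewed data.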

Standard normal distribution

has a mean of 0 and variance of 1.

Quantitative

have a numerical value (must have units)

Disjoint events/ mutually exclusive

have no intersection

Independent

if the occurrence of one event does not affect the probability of the occurrence of the other event

Larger confidence level

increase width of CI

Qualitative

is categorical

lambda

mean (poisson)

Sampling error

minimize the difference in statistics from sample to sample

Statistic (#)

number calculated from a sample and is used to estimate the parameter

Parameter

number used to describe a population

Random sampling

reduce bias

Increase sample size

reduce variability, reduce sampling error

x̅

sample mean

Z-score

the number of standard deviations a particular score is above or below the mean (normal distribution)

Z-score

the number of standard deviations above or below average

Statistics

the study of the collection, organization, analysis, interpretation, and presentation of data

Y

the variance of a discrete random variable=

If the sample is large enough

then: -the sampling distribution of x-bar is approx. normal -the mean of the distribution is mu -the sd is (sigma)/sqrt(n)
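The sd of the sampling distribution, sigma/sqrt(n), can be plugged straight into pnorm(). The population values below are hypothetical.

```r
# Hypothetical population: mean 100, standard deviation 20
mu <- 100
sigma <- 20
n <- 25  # sample size

# sd of the sampling distribution of x-bar
se <- sigma / sqrt(n)

# Probability that the sample mean is at or below 105
p_xbar_below_105 <- pnorm(105, mu, se)

# Same probability for a single observation, for comparison
p_x_below_105 <- pnorm(105, mu, sigma)
```

The sample mean varies less than a single observation, so it is more likely to fall close to mu.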

Nominal

used only to name categories

Ordinal

variables have an order to them

Time Series

variables that are measured at regular intervals over time

Cross-sectional data

when several variables are all measured at the same time point

Biased sample

when summary characteristics of the sample differ systematically from those of the population

Events are dependent

when the given intersection % does not equal the individual %'s multiplied together

