bstat multiple choice

required sample size

$n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^2$, where $\sigma$ is the population standard deviation and $E$ is the desired margin of error
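
A minimal Python sketch of the sample-size formula above. The function name and the example inputs (z = 1.96, a standard deviation of 10, a margin of error of 2) are illustrative assumptions, and rounding up to the next whole observation is the usual convention rather than something stated on the card.

```python
import math

def required_sample_size_mean(z, sigma, margin):
    """Smallest whole n for which z * sigma / sqrt(n) does not exceed margin."""
    n_exact = (z * sigma / margin) ** 2
    return math.ceil(n_exact)  # assumption: always round up

n = required_sample_size_mean(z=1.96, sigma=10.0, margin=2.0)
print(n)  # 97, since (1.96 * 10 / 2)^2 is about 96.04
```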

4 characteristics of mean

-we need at least *interval* data -unique -sum of deviations = 0 -can be affected by outliers

calculated by dividing a data set's standard deviation by its mean,

the CV is a unitless measure that allows for direct comparisons of mean-adjusted dispersion across different data sets
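
A short Python sketch of why the CV helps when means differ. The two data sets are made up for illustration: they have the same standard deviation but very different means, so their CVs differ.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation divided by the sample mean (unitless)."""
    return statistics.stdev(data) / statistics.mean(data)

# Hypothetical data: same spread, different magnitudes.
low_priced = [10, 12, 14, 16, 18]
high_priced = [100, 102, 104, 106, 108]

print(round(coefficient_of_variation(low_priced), 4))
print(round(coefficient_of_variation(high_priced), 4))
```

The low-priced series has the larger CV, so its dispersion is greater relative to its mean.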

the squaring of differences from the mean emphasizes larger differences more than small ones while

MAD weighs large and small differences equally

interquartile range

Q3 - Q1

computing x for given probabilities

$x = \mu + z\sigma$
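
A small Python sketch of this inverse transformation, assuming the given probability is cumulative (area to the left of x). It uses `statistics.NormalDist` from the standard library (Python 3.8+); the function name and the example mean of 100 and standard deviation of 15 are illustrative.

```python
from statistics import NormalDist

def x_for_cumulative_probability(p, mu, sigma):
    """Return x such that P(X <= x) = p for a normal X with mean mu and sd sigma."""
    z = NormalDist().inv_cdf(p)  # z value whose cumulative probability is p
    return mu + z * sigma        # convert z back to the original scale

x = x_for_cumulative_probability(0.975, mu=100.0, sigma=15.0)
print(round(x, 2))
```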

alternative hypothesis Ha

contradicts the default state or status quo

range is NOT a good measure of dispersion because

it only focuses on the extremes

reject the null if the p-value is less than $\alpha$

do not reject the null if the p-value is greater than or equal to $\alpha$

weighted mean

relevant when some observations contribute more than others
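
A minimal Python sketch of a weighted mean. The grade-point example and the function name are hypothetical; the weights here are credit hours, so courses with more hours contribute more.

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical example: grade points weighted by credit hours.
grade_points = [4.0, 3.0, 2.0]
credit_hours = [3, 4, 1]

print(weighted_mean(grade_points, credit_hours))  # 3.25
```

The unweighted mean of the same grade points is 3.0, so the weighting changes the answer.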

the higher the sharpe ratio,

the better the investment compensates its investors for risk

covariance is 0 when

y and x have no linear relationship

test statistic

z = (estimated value - hypothesized value) / standard error

normal distribution is

-completely described by two parameters (mean and variance)

empirical rule

within 1 standard deviation of the mean: about 68%; within 2 standard deviations: about 95%; within 3 standard deviations: almost 100%
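
These percentages can be checked against the standard normal distribution. A quick Python sketch using `statistics.NormalDist` (Python 3.8+); the helper name is illustrative.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

def probability_within(k):
    """P(-k <= Z <= k) for a standard normal Z."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(probability_within(k), 4))
```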

the variability between samples is measured by the standard error of

$\bar{X}$; if the standard error is small, the sample means are not only close to one another, they are also close to the unknown population mean

t distribution

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

z distribution

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

standard transformation

$z = \dfrac{x - \mu}{\sigma}$

sharpe ratio

a ratio calculated by dividing the difference between the mean return and the risk-free rate by the asset's standard deviation; used to characterize how well the return of an asset compensates for the risk that the investor takes
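
A brief Python sketch of the Sharpe ratio as described above. The two funds, their returns, and the 2% risk-free rate are made-up numbers for illustration, and the sample standard deviation is assumed.

```python
import statistics

def sharpe_ratio(returns, risk_free_rate):
    """(mean return - risk-free rate) / sample standard deviation of returns."""
    excess = statistics.mean(returns) - risk_free_rate
    return excess / statistics.stdev(returns)

# Hypothetical annual returns: same mean (6%), very different volatility.
fund_a = [0.10, 0.02, 0.06]
fund_b = [0.30, -0.18, 0.06]

print(round(sharpe_ratio(fund_a, 0.02), 4))
print(round(sharpe_ratio(fund_b, 0.02), 4))
```

Both funds earn the same average return, but the steadier fund has the higher Sharpe ratio.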

standard normal distribution

a special case of the normal distribution with a mean of 0 and a standard deviation of 1

exponential distribution

a useful nonsymmetric continuous probability distribution, used when we're interested in nonnegative quantities such as times or distances

calculating the pth percentile

a. sort the data from smallest to largest b. compute the location $L_p = (n+1)\,p/100$ c. if $L_p$ is an integer, it denotes the location of the pth percentile d. if not, interpolate between the two neighboring observations
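
The four steps can be sketched in Python as follows. The function name and data are illustrative, and returning the smallest or largest observation when the location falls outside the data is an added assumption that the card does not cover.

```python
def percentile(data, p):
    """pth percentile using the location L_p = (n + 1) * p / 100."""
    ordered = sorted(data)               # step a: smallest to largest
    n = len(ordered)
    location = (n + 1) * p / 100         # step b: 1-based location
    whole = int(location)                # integer part of L_p
    fraction = location - whole          # fractional part of L_p
    if whole < 1:                        # assumption: clamp at the minimum
        return ordered[0]
    if whole >= n:                       # assumption: clamp at the maximum
        return ordered[-1]
    below, above = ordered[whole - 1], ordered[whole]
    # steps c and d: fraction is 0 for an integer location, else interpolate
    return below + fraction * (above - below)

data = [80, 10, 40, 30, 70, 20, 60, 50]
q1, q3 = percentile(data, 25), percentile(data, 75)
print(q1, q3, q3 - q1)  # 22.5 67.5 45.0
```

The last value printed is the interquartile range, Q3 - Q1.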

margin of error

accounts for the standard error of the estimator and the desired confidence level of the interval

mean median and mode

all equal at the center of a symmetric, bell-shaped distribution such as the normal

covariance

an objective numerical measure that reveals the direction of the linear relationship between two variables

we interpret the geometric mean return as the

annualized return

transformation of normal random variables

any normally distributed random variable can be transformed into the standard normal variable with mean of zero and sd of 1

continuous uniform distribution

appropriate when the underlying random variable has an equally likely chance of assuming a value within a specified range

t distribution consists of a family of distributions where the actual shape of each one depends on the degrees of freedom

as the degrees of freedom (df) increase, the t distribution becomes similar to the z distribution; the two are identical as df approaches infinity

it is also

asymptotic which means the tail gets closer and closer to the axis but never touches it

Mean Absolute Deviation

average of all absolute differences between the observations and the mean

variance ($s^2$ and $\sigma^2$)

average of the squared differences between the observations and the mean
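
A small Python sketch contrasting MAD and variance on data with one large outlier. The data set is made up, and the variance shown is the sample variance, which divides by n - 1 rather than n (an assumption; the card only says "average").

```python
import statistics

def mean_absolute_deviation(data):
    """Average absolute difference between each observation and the mean."""
    m = statistics.mean(data)
    return sum(abs(x - m) for x in data) / len(data)

def sample_variance(data):
    """Sum of squared differences from the mean, divided by n - 1."""
    m = statistics.mean(data)
    return sum((x - m) ** 2 for x in data) / (len(data) - 1)

values = [2, 4, 6, 8, 30]  # hypothetical data; 30 is an outlier
print(mean_absolute_deviation(values))  # 8.0
print(sample_variance(values))          # 130.0
```

The outlier supplies 20 of the 40 total absolute deviations but 400 of the 520 total squared deviations, which is why squaring emphasizes large differences.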

chapter 6: a continuous random variable

characterized by uncountable values because it can take on any value within an interval

main difference between chebyshev's theorem and the empirical rule

chebyshev's applies to all data sets whereas empirical is appropriate when the distribution is symmetric and bell-shaped

normal distributions serve as the

cornerstone of statistical inference

lognormal distribution

defined with reference to the normal distribution; positively skewed and relevant for a positive random variable; useful for describing variables such as income, real estate values, and asset prices

variance describes

dispersion (shape)

most researchers favor the p-value approach since

every statistical software package reports p-values

normal probability distribution also referred to as the "gaussian distribution"

familiar bell-shaped distribution closely approximates the probability distribution for a wide range of random variables of interest

if a random sample is taken from a normal population with a finite variance, then the t statistic

follows the t distribution with (n-1) degrees of freedom

approximately (100 - p) percent of observations have values

greater than the pth percentile

examples of random variables that follow a normal distribution

heights and weights of newborn babies, SAT scores, cumulative debt of college graduates, advertising expenditure of firms, rate of return on an investment

a z-score is a unitless measure since

its numerator and denominator have the same units, which cancel each other out

approximately p percent of observations have values

less than the pth percentile

mean describes

location

finding a z value for a given probability

look up probability in the body of the table

finding a probability for a given z value

looking at the z-chart

positive skewness / negative skewness / symmetric

mean > mode (+) / mode > mean (-) / mode = mean

the mode's usefulness seems to diminish with data sets that have

more than three modes

two or more modes

multimodal

geometric mean

multiplicative average as opposed to an additive average
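
A minimal Python sketch of the geometric mean return, computed from growth factors (1 + return) so that compounding is respected. The function name and the example returns are illustrative; `math.prod` requires Python 3.8+.

```python
import math

def geometric_mean_return(returns):
    """nth root of the product of (1 + r), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

# Hypothetical: up 10% one year, down 10% the next.
returns = [0.10, -0.10]
print(round(geometric_mean_return(returns), 6))
```

The arithmetic mean of these two returns is 0, but the geometric mean is slightly negative because 1.10 x 0.90 = 0.99, so the investment actually lost value.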

selecting n to estimate p

$n = \left(\dfrac{z_{\alpha/2}}{E}\right)^2 p(1-p)$
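
A short Python sketch of this sample-size formula. The function name and inputs are illustrative; the default of 0.5 for p is the conservative choice when no better estimate is available, and rounding up is the usual convention rather than something stated on the card.

```python
import math

def required_sample_size_proportion(z, margin, p=0.5):
    """Smallest whole n giving the desired margin of error for a proportion."""
    n_exact = (z / margin) ** 2 * p * (1 - p)
    return math.ceil(n_exact)  # assumption: always round up

print(required_sample_size_proportion(1.96, 0.05))         # 385
print(required_sample_size_proportion(1.96, 0.05, p=0.2))  # 246
```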

4 levels of measurement ( lowest - highest)

nominal: placing into categories, not measuring; ordinal: ranking involved, one has more or less (categories *must be mutually exclusive*); interval: equal intervals between categories; ratio: has a true zero that shows the absence of what is being measured

the p-value is the

observed probability of making a type I error

if the minimum and maximum values of the population are available, a rough approximation for the population standard deviation is given by

range/4

implementing a two-tailed test using a confidence interval

reject the null if the hypothesized value of the mean does not fall within the confidence interval

hypothesis test for a population proportion

$z = \dfrac{\bar{p} - p_0}{\sqrt{p_0(1-p_0)/n}}$, where $\bar{p}$ is the sample proportion and $p_0$ is the hypothesized value
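
A minimal Python sketch of this test statistic. The function name and the numbers (a sample proportion of 0.56 from 400 observations, tested against 0.50) are hypothetical.

```python
import math

def proportion_z_statistic(sample_proportion, hypothesized, n):
    """(sample proportion - hypothesized value) / standard error under the null."""
    standard_error = math.sqrt(hypothesized * (1 - hypothesized) / n)
    return (sample_proportion - hypothesized) / standard_error

z = proportion_z_statistic(0.56, 0.50, 400)
print(round(z, 2))  # 2.4
```

Note that the standard error uses the hypothesized value, not the sample proportion, because the test assumes the null is true.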

chebyshev's theorem

the proportion of observations that lie within k standard deviations from the mean is at least $1 - 1/k^2$, where k is any number greater than 1
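
A small Python sketch that computes the Chebyshev bound and checks it on a deliberately skewed, made-up data set. The data are treated as a whole population, so `statistics.pstdev` is used; that choice and the function names are assumptions for illustration.

```python
import statistics

def chebyshev_lower_bound(k):
    """Minimum proportion of observations within k standard deviations."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2

def proportion_within(data, k):
    """Observed proportion of data within k standard deviations of the mean."""
    mean = statistics.mean(data)
    sd = statistics.pstdev(data)  # assumption: data are the full population
    inside = [x for x in data if abs(x - mean) <= k * sd]
    return len(inside) / len(data)

skewed = [1, 1, 1, 1, 2, 2, 3, 4, 10, 25]  # hypothetical skewed data
print(chebyshev_lower_bound(2))   # 0.75
print(proportion_within(skewed, 2))
```

Even though the data are far from bell-shaped, the observed proportion is at least the bound.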

if no other reasonable estimate of the population proportion is available, we can use .5

the required sample is largest when p=.5

range

the simplest measure of dispersion: greatest value - smallest value

the margin of error in a confidence interval depends on

the standard error of the estimator and the desired confidence level

unlike cumulative probabilities in the z table,

the t table provides probabilities in the upper tail of the distribution

chapter 3: central location relates to

the way quantitative data tend to cluster around some middle or central value

for a given confidence level and population standard deviation $\sigma$, *the smaller the sample size n*,

the wider the confidence interval

for a given confidence level and sample size n, *the larger the population standard deviation $\sigma$*,

the wider the confidence interval

for a given sample size n and population standard deviation $\sigma$, the greater the confidence level,

the wider the confidence interval

the precision is directly linked with the width of the confidence interval

the wider the interval, the lower the precision

width of a confidence interval

two times the margin of error

one mode

unimodal

z-scores

used to find the relative position of a sample value within the data set by dividing the deviation of the sample value from the mean by the standard deviation: $z = (x - \bar{x})/s$

characteristics of a mode

-useful for *nominal* + level data -not affected by outliers -not unique (you can have more than one)

rejecting the null at 1% significance level

very strong evidence that it's false

by reducing the likelihood of a type 1 error,

we increase the likelihood of a type 2 error and vice versa

a two-tailed test

when the alternative hypothesis includes "not equal to"

type 2 error

when we do not reject the null when it is false

type 1 error

when we reject the null when it is true

all t distributions have slightly broader tails than the

z distribution

the probability that a continuous random variable assumes a particular value x is

zero because we can't assign a nonzero probability to each of the uncountable values

rejecting a null at 5% significance level

strong evidence that it's false

confidence coefficient ($1 - \alpha$)

the probability that the estimation procedure will generate an interval that contains $\mu$

for a continuous random variable,

it is only meaningful to calculate the probability that the value of a random variable falls within some interval

significance level, alpha

the probability that the estimation procedure will generate an interval that does not contain $\mu$

chapter 8: when a statistic is used to estimate a parameter,

it is referred to as a point estimator and a particular value of the estimator is called a point estimate

characteristics of the median

-at least *ordinal* level data -not affected by outliers -unique (only one)

correlation coefficient

-describes both the direction and the strength of a linear relationship between x and y -unit-free -value falls between -1 and 1
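
A brief Python sketch of the sample covariance and the correlation coefficient, written out by hand so the formulas are visible. Function names and data are illustrative, and the n - 1 divisor (sample covariance) is an assumption.

```python
import math

def sample_covariance(x, y):
    """Average product of deviations from the means, using n - 1."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    products = ((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sum(products) / (len(x) - 1)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return sample_covariance(x, y) / math.sqrt(
        sample_covariance(x, x) * sample_covariance(y, y)
    )

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
y_rescaled = [200, 400, 600, 800, 1000]  # same relationship, different units

print(sample_covariance(x, y), correlation(x, y))
print(sample_covariance(x, y_rescaled), correlation(x, y_rescaled))
```

Rescaling y multiplies the covariance by 100 but leaves the correlation unchanged, which is what "unit-free" means here.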

probability density function f(x) has the following properties

-$f(x) \ge 0$ for all possible values x of X -the area under f(x) over all values x of X equals one

three steps when formulating the competing hypothesis

1. identify the relevant population parameter of interest 2. determine whether it's a one-sided or two-sided test 3. include some form of the equality sign in the null hypothesis and use the alternative to establish a claim

the median is also called the

50th percentile

selecting the required sample size

if we are able to increase the size of the sample, the larger n reduces the margin of error for the interval estimates

a positive value indicates a positive linear relationship

if x is above its mean, then y tends to be above its mean and vice versa

negative value indicates negative linear relationship

if x is above the mean, y tends to be below and vice versa

a one-tailed test

involves a null hypothesis that can only be rejected on one side of the hypothesized value

the allowed probability of making a type 1 error (rejecting a true hypothesis)

is $\alpha$, the significance level

like the z distribution, the t distribution

is bell shaped and symmetric around 0 with asymptotic tails

main advantage of chebyshev's theorem

it applies to all data sets, regardless of the shape of the distribution

informally, we can report with 95% confidence that $\mu$ lies in the interval

it is not correct to say that there is a 95% chance that $\mu$ lies in the given interval

confidence interval for population proportion

$\bar{p} \pm z_{\alpha/2}\sqrt{\bar{p}(1-\bar{p})/n}$, where $\bar{p}$ is the sample proportion
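
A minimal Python sketch of this interval, using `statistics.NormalDist` (Python 3.8+) to look up the z value. The function name and the example (a sample proportion of 0.40 from 600 observations) are hypothetical.

```python
import math
from statistics import NormalDist

def proportion_confidence_interval(sample_proportion, n, confidence=0.95):
    """Sample proportion plus or minus z * sqrt(p(1 - p) / n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    margin = z * math.sqrt(sample_proportion * (1 - sample_proportion) / n)
    return sample_proportion - margin, sample_proportion + margin

low, high = proportion_confidence_interval(0.40, 600)
print(round(low, 4), round(high, 4))
```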

hypothesis test for population when standard deviation is known

p value approach and the critical value approach

the confidence interval for the population mean and the population proportion is constructed as

point estimate $\pm$ margin of error
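
A short Python sketch of this construction for a population mean when the population standard deviation is known. The function names and numbers are illustrative; the z value comes from `statistics.NormalDist` (Python 3.8+).

```python
import math
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """Point estimate plus or minus z * sigma / sqrt(n)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sigma / math.sqrt(n)
    return x_bar - margin, x_bar + margin

def width(interval):
    """Interval width, which equals two times the margin of error."""
    low, high = interval
    return high - low

base = mean_confidence_interval(50.0, sigma=10.0, n=100)
print(tuple(round(v, 2) for v in base))
print(width(mean_confidence_interval(50.0, 10.0, 25)) > width(base))        # smaller n
print(width(mean_confidence_interval(50.0, 20.0, 100)) > width(base))       # larger sigma
print(width(mean_confidence_interval(50.0, 10.0, 100, 0.99)) > width(base)) # higher confidence
```

Each comparison prints `True`: a smaller sample, a larger standard deviation, or a higher confidence level all widen the interval.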

mean-variance analysis

postulates that the performance of an asset is measured by its rate of return, and this rate of return is evaluated in terms of its mean and variance; higher average returns are generally associated with higher risk

percentiles

provide detailed information about how data are spread over the interval from smallest to largest values

confidence interval or interval estimate

provides a range of values that with a certain level of confidence contains a population parameter of interest

chapter 9: we use hypothesis testing to

resolve conflicts between 2 competing hypotheses on a particular population parameter of interest

sample mean vs population mean

sample: $\bar{x}$, a statistic; population: $\mu$, a parameter

coefficient of variation ($s/\bar{x}$) or ($\sigma/\mu$)

serves as a relative measure of dispersion and adjusts for differences in the magnitudes of the means

rejecting a null at the 10% significance level

some evidence that it's false

standard deviation ($s$ and $\sigma$)

square root of variance

standard error

standard deviation / square root of sample size

converting sample data into z-scores is called

standardizing the data

the degrees of freedom determine

the extent of broadness of the tails of the distribution; the fewer degrees of freedom, the broader the tails

unlike the exponential distribution whose failure rate is constant,

the failure rate of the lognormal distribution may increase or decrease over time

p-value

the likelihood of obtaining a sample mean that is at least as extreme as the one derived from the given sample, under the assumption that the null is true
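
A minimal Python sketch of computing a p-value for a z test, assuming the population standard deviation is known so the standard normal distribution applies. The function names, the `tail` argument, and the numbers are illustrative; `statistics.NormalDist` requires Python 3.8+.

```python
from statistics import NormalDist

def z_statistic(estimate, hypothesized, standard_error):
    """(estimated value - hypothesized value) / standard error."""
    return (estimate - hypothesized) / standard_error

def p_value(z, tail="two"):
    """Probability of a result at least as extreme as z if the null is true."""
    cdf = NormalDist().cdf
    if tail == "right":
        return 1 - cdf(z)
    if tail == "left":
        return cdf(z)
    return 2 * (1 - cdf(abs(z)))  # two-tailed: both extremes count

# Hypothetical: sample mean 52, hypothesized mean 50, standard error 1.
z = z_statistic(52.0, 50.0, 1.0)
p = p_value(z)
alpha = 0.05
print(round(p, 4), "reject" if p < alpha else "do not reject")
```

With these numbers the two-tailed p-value is about 0.0455, so the null is rejected at the 5% level but would not be at the 1% level.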

null hypothesis Ho

the presumed default state of nature or status quo

the arithmetic mean

the primary measure of central location "average"

for a discrete random variable, we can compute

the probability that it assumes a particular value x.

