DSC 210 FINAL EXAM VOCABULARY, DSC 210 QUIZ CHAPTER 7, DSC 210 Chapter 8
in order to develop an interval estimate of a population mean
either the population standard deviation or sample standard deviation must be used to compute the margin of error
The standard deviation determines
how flat and wide the normal curve is. Larger values of the standard deviation result in wider, flatter curves, showing more variability in the data.
ratio data
if the data demonstrate all the properties of interval data and the ratio of two values is meaningful. Ratio data are always numeric.
multiplication rule is used for
independent events, intersection of events, "and" probabilities
=norm.s.inv
insert probability and receive z score
=Norm.s.dist
insert z score and receive probability
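Outside Excel, the same two lookups can be sketched in Python. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper names `z_from_probability` and `probability_from_z` are illustrative and not part of the course material.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)

def z_from_probability(p):
    """Like =NORM.S.INV(p): cumulative probability in, z score out."""
    return STANDARD_NORMAL.inv_cdf(p)

def probability_from_z(z):
    """Like =NORM.S.DIST(z, TRUE): z score in, cumulative probability out."""
    return STANDARD_NORMAL.cdf(z)

print(round(z_from_probability(0.975), 2))   # 1.96
print(round(probability_from_z(1.96), 3))    # 0.975
```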
Because we are sampling, in the NORM.DIST commands where prompted for the "standard_dev" we are NOT using the population standard deviation;
instead we must use the standard deviation of the sampling distribution of means, i.e. the standard error!
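The point about using the standard error in place of the population standard deviation can be sketched as follows. The function name and the numbers in the usage line (μ = 100, σ = 15, n = 36) are hypothetical, chosen only to illustrate the idea.

```python
from math import sqrt
from statistics import NormalDist

def prob_sample_mean_at_most(x_bar, mu, sigma, n):
    """P(sample mean <= x_bar), assuming the sampling distribution is normal."""
    # The spread of the sampling distribution is the standard error, not sigma.
    standard_error = sigma / sqrt(n)
    return NormalDist(mu, standard_error).cdf(x_bar)

# Hypothetical inputs: standard error = 15 / 6 = 2.5, so 105 is 2 standard errors above 100.
print(round(prob_sample_mean_at_most(105, 100, 15, 36), 4))
```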
The expected value of p-bar (i.e., the mean of the sampling distribution of sample proportions)
is always equal to the population proportion, p.
If the sample size is large enough, the sampling distribution of sample means
is approximately normal regardless of the shape of the population distribution; it approaches the normal distribution as the sample size increases.
If the population is normal, the sampling distribution of sample means
is normal.
Because it is symmetric, the normal distribution
is not skewed, its skewness measure is zero.
The Sampling Distribution of Sample Means
is the distribution of all possible sample means from all possible samples of a given sample size n from a population.
The Sampling Distribution of Sample Proportions
is the distribution of all possible sample proportions from all possible samples of a given sample size n from a population.
Confidence coefficient
level of confidence in decimal form, eg 0.90, 0.95, 0.99
t distribution becomes more and more
like the standard normal distribution as the degrees of freedom increase
A specific t distribution depends on a parameter known as the
degrees of freedom
measures of variability
depict diversity of the distribution (range, standard deviation)
if you are given a reasonable estimate of the range of the data, to get a planning value for the population standard deviation you must
divide the range by 4
An increased level of confidence widens the interval estimate
meaning the estimate is less precise.
population proportion
p-bar +/- margin of error (the interval estimate of the population proportion p)
The measure/s of location that is/are the least likely to be influenced by any outliers in a distribution is/are the
median
empirical method
method for acquiring knowledge based on observation, including experimentation, rather than a method based only on forms of logical argument or previous authorities
=norm.dist and =norm.inv
relate to x
=norm.s.inv and =norm.s.dist
relate to z score
Adjust a sample statistic, ie, a point estimate, for two things;
sampling error (the standard error) and level of confidence
margin of error is found from
sampling error and level of confidence
in the FINITE population case, The standard deviation of the sampling distribution of sample means is the
standard error, equal to the population standard deviation divided by the square root of the sample size; but further multiplied by the finite correction factor in the finite population case.
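A minimal sketch of the finite versus infinite population cases, assuming the standard form of the finite population correction factor, √((N − n)/(N − 1)), and the n/N ≤ 0.05 rule of thumb for ignoring it. The function name and example values are hypothetical.

```python
from math import sqrt

def standard_error_mean(sigma, n, N=None):
    """Standard error of x-bar; N=None means an infinite population."""
    se = sigma / sqrt(n)
    # Apply the finite population correction only when the sample is
    # more than 5% of a finite population.
    if N is not None and n / N > 0.05:
        se *= sqrt((N - n) / (N - 1))
    return se

print(standard_error_mean(10, 100))            # infinite population
print(standard_error_mean(10, 100, N=1000))    # n/N = 0.10, correction applied
print(standard_error_mean(10, 100, N=10000))   # n/N = 0.01, correction ignored
```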
volume
the amount of data generated
mean
the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores
sigma known
the case when historical data or other information provides a good value for the population standard deviation prior to taking sample. The interval estimation procedure uses this known value of sigma in computing the margin of error
variables
the characteristics of the individuals within the population, are the columns of the data, run horizontally
when the sample mean happens to fall in the tail of the sampling distribution (which it will 5% of the time)
the confidence interval generated will not contain the population mean
The expected value of x bar equals
the mean of the population from which the sample is selected.
The entire family of normal distributions is differentiated by two parameters
the mean μ and the standard deviation σ.
median
the middle score in a distribution; half the scores are above it and half are below it
mode
the most frequently occurring score(s) in a distribution
the number of observations is equal to
the number of elements
t_(α/2)
the t value providing an area of α/2 in the upper tail of a t distribution with n - 1 degrees of freedom
Because the distribution is symmetric, the area under the curve to the left of the mean is ___ and the area under the curve to the right of the mean is ___
they are both .50
The probability distribution of such a random variable is called a sampling distribution
thus we call the standard deviation of these estimators the standard error
In point estimation we use the data from a sample
to compute a value of a single sample statistic that serves as an estimate of a population parameter.
The purpose of an interval estimate
to provide information about how close the point estimate is to the value of the parameter
purpose of interval estimate
to provide information about how close the point estimate is to the value of the parameter
A point estimator cannot be expected
to provide the exact value of the population parameter.
The mean of the sampling distribution of sample means is always equal
to the population mean.
analytics
transforming data into insight for making better decisions
2t
two tails how much area do you have in both tails
interval data
when ordinal data are numeric and intervals between values are in a fixed unit of measure
higher confidence level
wider confidence interval
smaller sample
wider interval
Approximately 68% of the data values
will be within one standard deviation of the mean
Almost all (approximately 99.7%) of the data values
will be within three standard deviations of the mean.
Approximately 95% of the data values
will be within two standard deviations of the mean
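The 68%, 95%, and 99.7% figures can be checked against the standard normal curve. This is a small sketch using Python's `statistics.NormalDist`; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def share_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return STANDARD_NORMAL.cdf(k) - STANDARD_NORMAL.cdf(-k)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```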
If several different frequency distributions are constructed from a quantitative data set, each with different numbers of classes, the distribution with the widest class widths
will exhibit the fewest classes
The normal distribution is symmetric
with the shape of the normal curve to the left of the mean a mirror image of the shape of the normal curve to the right of the mean. The tails of the normal curve extend to infinity in both directions and theoretically never touch the horizontal axis.
finite population correction factor
the term √((N-n)/(N-1)) that is used in the formulas for the standard deviation of x bar and p bar whenever a finite population, rather than an infinite population, is being sampled. The generally accepted rule of thumb is to ignore this factor when n/N ≤ .05
sample statistic
the value of a variable that is estimated from a sample
So using the t - value results in a less precise interval
reflecting the fact that we do not know the value of the standard error of the mean, but rather only an estimate of standard error by substituting s, the standard deviation of the sample for the unknown value of the population standard deviation sigma.
standard deviation of x bar
σ/√n
prior probability
Initial estimates of the probabilities of events.
one standard deviation
68%
q3
75th percentile
Commonly used confidence levels:
90%, 95%, 99%.
two standard deviations
95%
When the sampling distribution of x bar is normally distributed
95% of the x-bar values must be within +/- 1.96 standard deviations of the mean
three standard deviations
99.7%
α , alpha
= complement of confidence coefficient, in decimal form ( 95% confidence = α alpha 0.05)
In general, more confidence
= less precision
relative frequency method
A method of assigning probabilities that is appropriate when data are available to estimate the proportion of the time the experimental outcome will occur if the experiment is repeated a large number of times.
standard normal distribution
A normal distribution with a mean of 0 and a standard deviation of 1.
population parameter
A numerical value used as a summary measure for a population (e.g., the population mean, the population variance, and the population standard deviation.)
sample statistic
A numerical value used as a summary measure for a sample (e.g., the sample mean, the sample variance, and the sample standard deviation).
sampling distribution
A probability distribution consisting of all possible values of a sample statistic.
Binomial Probability Distribution
A probability distribution showing the probability of x successes in n trials of a binomial experiment.
Poisson Probability Distribution
A probability distribution showing the probability of x occurrences of an event over a specified interval of time or space.
experiment
A process that generates well-defined outcomes.
unbiased
A property of a point estimator that is present when the expected value of the point estimator is equal to the population parameter it estimates.
random sample
A random sample from an infinite population is a sample selected such that the following conditions are satisfied: (1) Each element selected comes from the same population; (2) each element is selected independently.
continuous
A random variable that may assume any numerical value in an interval or collection of intervals. any range of numbers
discrete
A random variable that may assume either a finite number of values or an infinite sequence of values. Distinct, separate.
empirical rule
A rule that can be used to compute the percentage of data values that must be within one, two, and three standard deviations of the mean for data that exhibit a bell-shaped distribution.
simple random sample
A sample of size n selected from the population in such a way that each possible sample of size n has an equal chance of being selected.
simple random sample
A simple random sample of size n from a finite population of size N is a sample selected such that each possible sample of size n has the same probability of being selected.
percentile
A value that provides information about how the data are spread over the interval from the smallest to the largest value.
union of events
All events that are in A or B or both (everything!); A∪B; ∪= or/union
descriptive statistics
Tabular, graphical, and numerical summaries of data.
quartile
The 25th, 50th, and 75th percentiles, referred to as the first quartile, the second quartile (median), and third quartile, respectively. The quartiles can be used to divide a data set into four parts, with each part containing approximately 25% of the data. calculated using either =quartile.exc or .inc
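For comparison, Python's `statistics.quantiles` offers `exclusive` and `inclusive` methods. As far as I can tell these use the same interpolation rules as =QUARTILE.EXC and =QUARTILE.INC, but treat that correspondence as an assumption to verify on your own data. The data set below is made up.

```python
from statistics import quantiles

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical data set

# n=4 asks for the three cut points that split the data into four parts.
q_exclusive = quantiles(data, n=4, method="exclusive")
q_inclusive = quantiles(data, n=4, method="inclusive")

print(q_exclusive)  # [2.0, 4.0, 6.0]
print(q_inclusive)  # [2.5, 4.0, 5.5]
```

The second quartile (the median) is the same under both methods here; only the first and third quartiles differ.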
target population
The population for which statistical inferences such as point estimates are made. It is important for the target population to correspond as closely as possible to the sampled population.
class midpoint
The value halfway between the lower and upper class limits
point estimate
The value of a point estimator used in a particular instance as an estimate of a population parameter.
Probabilities for the normal random variable
are given by areas under the normal curve.
Nominal data
are simply labels or names for attributes; a name can be a number, but the number is arbitrary
The sampling distribution of p-bar can be approximated by a normal distribution
as long as np ≥ 5 and n(1-p) ≥ 5.
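The np ≥ 5 and n(1 − p) ≥ 5 condition, together with the standard error of p-bar for the infinite population case, can be sketched as below. Function names and example values are hypothetical.

```python
from math import sqrt

def normal_approximation_ok(n, p):
    """True when both np >= 5 and n(1 - p) >= 5."""
    return n * p >= 5 and n * (1 - p) >= 5

def standard_error_proportion(n, p):
    """Standard error of p-bar, infinite population case."""
    return sqrt(p * (1 - p) / n)

print(normal_approximation_ok(100, 0.10))    # True: np = 10, n(1-p) = 90
print(normal_approximation_ok(20, 0.10))     # False: np = 2
print(standard_error_proportion(100, 0.50))  # 0.05
```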
for unknown sigma, standard error is calculated
as sample standard deviation divided by the square root of the sample size.
variance of a discrete random variable
Weighted average of the squared deviations of the values of the variable from their mean.
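A short sketch of the "weighted average of squared deviations" idea, where the weights are the probabilities. The probability table below is made up for illustration.

```python
def expected_value(values, probabilities):
    """Mean of a discrete random variable: sum of x * f(x)."""
    return sum(x * p for x, p in zip(values, probabilities))

def discrete_variance(values, probabilities):
    """Sum of (x - mean)^2 * f(x): squared deviations weighted by probability."""
    mu = expected_value(values, probabilities)
    return sum((x - mu) ** 2 * p for x, p in zip(values, probabilities))

values = [0, 1, 2]                 # hypothetical outcomes
probabilities = [0.25, 0.5, 0.25]  # must sum to 1

print(expected_value(values, probabilities))     # 1.0
print(discrete_variance(values, probabilities))  # 0.5
```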
Probability tree
a diagram that can be used to calculate the probabilities of combinations of events resulting from multiple random trials
An interval estimate can be computed
by adding and subtracting a margin of error to the point estimate.
ordinal data
a type of data that refers solely to a ranking of some kind
highly skewed right
a very long tail to the right
simple random sample
can be selected from a finite population or an infinite population.
The data collected from such samples
can be used to develop point estimates of population parameters.
cross sectional data
collected at the same or approximately the same point in time
time series data
collected over several time periods
P(Ac | B)
probability of the complement of A, given that B occurred
P(A | Bc)
probability of A, given that the complement of B occurred
1-α
confidence coefficient
what is the only way to eliminate sampling error
conduct a census
Multiplication Law
P(AnB)=P(A)P(B|A) A probability law used to compute the probability of the intersection of two events.
sigma
population standard deviation
σ
population standard deviation
Bayes' theorem is used to compute
posterior probabilities of an event and its complement.
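A minimal sketch of the calculation for an event and its complement, laid out the way a tabular approach would do it: prior times conditional probability gives the joint probability, and each joint probability divided by their sum gives the posterior. The priors and conditional probabilities below are hypothetical.

```python
def posterior_probabilities(priors, likelihoods):
    """Bayes' theorem for mutually exclusive, collectively exhaustive events.

    priors[i]      = P(A_i)
    likelihoods[i] = P(B | A_i)
    returns          P(A_i | B) for each i
    """
    joint = [prior * like for prior, like in zip(priors, likelihoods)]
    total = sum(joint)  # this is P(B)
    return [j / total for j in joint]

# Hypothetical: P(A) = 0.65, P(Ac) = 0.35, P(B | A) = 0.02, P(B | Ac) = 0.05
posteriors = posterior_probabilities([0.65, 0.35], [0.02, 0.05])
print([round(p, 4) for p in posteriors])  # [0.4262, 0.5738]
```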
The sampling distribution of x bar can be used to provide
probability information about how close the sample mean is to the population mean
P(A | B)
probability of event A given that event B occurred
addition law
provides a way to compute the probability of event A, or B, or both A and B occurring P(A u B) = P(A) + P(B) - P(A n B)
posterior probability
Revised probabilities of events based on additional information.
if confidence coefficient decreases
the margin of error decreases and the interval narrows; the standard error of the mean does not change
Our procedure for selecting a simple random sample of size n from a population of size N involves two steps.
- Step 1 Assign a random number to each element of the population. - Step 2 Select the n elements corresponding to the n smallest random numbers.
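The two steps above can be sketched in Python. The function name, the `seed` parameter, and the example population are assumptions added for illustration; in Excel the random numbers would typically come from =RAND().

```python
import random

def simple_random_sample(population, n, seed=None):
    """Select n elements by the random-number method described above."""
    rng = random.Random(seed)
    # Step 1: assign a random number to each element of the population.
    tagged = [(rng.random(), element) for element in population]
    # Step 2: keep the n elements with the n smallest random numbers.
    tagged.sort(key=lambda pair: pair[0])
    return [element for _, element in tagged[:n]]

population = list(range(1, 21))  # hypothetical population, N = 20
sample = simple_random_sample(population, 5, seed=1)
print(sample)
```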
correlation coefficient
- measure of linear association - just because two variables are highly correlated, it does not mean that one variable is the cause of the other
unknown sigma sample
- take a sample - calculate a sample mean - calculate standard deviations - replace population standard deviation with sample standard deviation
variance
- the average of the squared differences between each data value and the mean - calculated using var.s or .p
correlation coefficient formulated
- the coefficient can take on values between -1 and +1 - values near -1 indicate a strong negative linear relationship - values near +1 indicate a strong positive linear relationship - the closer the correlation is to zero, the weaker the relationship
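To see the -1 to +1 range in action, here is a small sketch that computes the correlation coefficient by hand. The data are invented; perfectly linear data are used so the extremes show up.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
print(correlation(xs, [2, 4, 6, 8, 10]))  # strong positive, close to +1
print(correlation(xs, [10, 8, 6, 4, 2]))  # strong negative, close to -1
```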
The total area under the curve for the normal distribution is
1
The three steps necessary to define the classes for a frequency distribution with quantitative data are as follows
1. Determine the number of nonoverlapping classes. 2. Determine the width of each class. 3. Determine the class limits.
Properties of a Binomial Experiment
1. The experiment consists of a sequence of n identical trials 2. Two outcomes, success and failure, are possible on each trial 3. The probability of a success, denoted by p, does not change from trial to trial 4. The trials are independent
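When the four properties hold, the probability of x successes in n trials follows the binomial probability function f(x) = C(n, x) p^x (1 − p)^(n − x). A minimal sketch, with hypothetical values n = 3 and p = 0.3:

```python
from math import comb

def binomial_probability(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)

# Hypothetical experiment: 3 trials, probability of success 0.3 on each.
print(round(binomial_probability(2, 3, 0.3), 3))  # 0.189
```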
Central Limit Theorem
1. The mean of the sampling distribution of sample means is always equal to the population mean. 2. The standard deviation of the sampling distribution of sample means is the standard error, equal to the population standard deviation divided by the square root of the sample size in the infinite population case; but further multiplied by the finite correction factor in the finite population case. 3. If the population is normal, the sampling distribution of sample means is normal. 4. If the sample size is large enough, the sampling distribution of sample means is approximately normal regardless of the population distribution characteristic.
Basic requirements for assigning probabilities
1. The probability assigned to each experimental outcome must be between 0 and 1 2. The sum of the probabilities for all experimental outcomes must equal 1
A planning value for the population standard deviation must be specified before the sample size can be determined. Three methods of obtaining a planning value for population standard deviation are discussed here
1. Use the estimate of the population standard deviation computed from data of previous studies as the planning value. 2. Use a pilot study to select a preliminary sample. The sample standard deviation from the preliminary sample can be used as the planning value for σ. 3. Use judgment or a "best guess"
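Once a planning value for σ is chosen, the sample size for a desired margin of error E is usually computed as n = (z_(α/2) σ / E)², rounded up. The sketch below assumes that standard formula and the range-divided-by-4 rule of thumb; the function names and the numbers (range 36, E = 2) are hypothetical.

```python
from math import ceil
from statistics import NormalDist

def planning_sigma_from_range(data_range):
    """Rule of thumb: planning value for sigma is the range divided by 4."""
    return data_range / 4

def sample_size_for_mean(sigma_planning, margin, confidence=0.95):
    """Smallest n whose margin of error is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sigma_planning / margin) ** 2)

sigma_planning = planning_sigma_from_range(36)    # 9.0
print(sample_size_for_mean(sigma_planning, 2.0))  # 78
```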
planning value for population proportion
1. Use the sample proportion from a previous sample of the same or similar units. 2. Use a pilot study to select a preliminary sample. The sample proportion from this sample can be used as the planning value, p*. 3. Use judgment or a "best guess" 4. If none of the preceding alternatives applies, use a planning value of .50
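With a planning value p*, the sample size for a desired margin of error E is usually computed as n = (z_(α/2))² p*(1 − p*) / E², rounded up. The sketch below assumes that standard formula; the function name and the 3% margin are hypothetical. It also shows why .50 is the safe fallback: p*(1 − p*) is largest there, so it gives the largest n.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin, p_planning=0.50, confidence=0.95):
    """Smallest n whose margin of error for p-bar is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil(z ** 2 * p_planning * (1 - p_planning) / margin ** 2)

print(sample_size_for_proportion(0.03))                   # 1068 with p* = .50
print(sample_size_for_proportion(0.03, p_planning=0.30))  # smaller n
```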
Note that in this problem the (absolute) t - value for 95% confidence at n - 1 = 69 degrees of freedom (df) is
1.995, which is larger than the z - value for 95% confidence, 1.96.
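The effect of the larger t value on the margin of error can be sketched as below. The critical values 1.995 and 1.96 and n = 70 come from the note above; the sample standard deviation s = 10 is a made-up value, since the original problem's data are not shown here.

```python
from math import sqrt

T_VALUE = 1.995  # 95% confidence, 69 degrees of freedom (from the note above)
Z_VALUE = 1.96   # 95% confidence, standard normal

def margin_of_error(critical_value, s, n):
    """Critical value times the estimated standard error s / sqrt(n)."""
    return critical_value * s / sqrt(n)

s, n = 10.0, 70  # s is hypothetical
t_margin = margin_of_error(T_VALUE, s, n)
z_margin = margin_of_error(Z_VALUE, s, n)
print(round(t_margin, 3), round(z_margin, 3))
```

The t-based margin is always the larger of the two, which is why the t interval is slightly wider.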
q1
25th percentile
q2
50th percentile (median)
event
A collection of sample points.
uniform probability distribution
A continuous probability distribution for which the probability that the random variable will assume a value in any interval is the same for each interval of equal length.
exponential probability distribution
A continuous probability distribution that is useful in computing probabilities for the time it takes to complete a task.
normal probability distribution
A continuous probability distribution. Its probability density function is bell-shaped and determined by its mean μ and standard deviation σ.
tall data
A data set that has so many observations that traditional statistical inference has little meaning
wide data
A data set that has so many variables that simultaneous consideration of all variables is infeasible.
t distribution
A family of probability distributions that can be used to develop an interval estimate of a population mean whenever the population standard deviation σ is unknown and is estimated by the sample standard deviation s.
bar chart
A graphical device for depicting categorical data that have been summarized in a frequency, relative frequency, or percent frequency distribution.
pie chart
A graphical device for presenting data summaries based on subdivision of a circle into sectors that correspond to the relative frequency for each class.
Histogram
A graphical display of a frequency distribution, relative frequency distribution, or percent frequency distribution of quantitative data constructed by placing the class intervals on the horizontal axis and the frequencies, relative frequencies, or percent frequencies on the vertical axis.
scatter diagram
A graphical display of the relationship between two quantitative variables. One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
stem and leaf display
A graphical display used to show simultaneously the rank order and shape of a distribution of data.
covariance
A measure of linear association between two variables. Positive values indicate a positive relationship; negative values indicate a negative relationship
coefficient of variation
A measure of relative variability computed by dividing the standard deviation by the mean and multiplying by 100.
expected value
A measure of the central location, or mean, of a random variable
skewness
A measure of the shape of a data distribution. Data skewed to the left result in negative skewness; a symmetric data distribution results in zero skewness; and data skewed to the right result in positive skewness.
standard deviation
A measure of variability computed by taking the positive square root of the variance.
range
A measure of variability, defined to be the largest value minus the smallest value.
subjective method
A method of assigning probabilities on the basis of judgment.
measure of location
A single value that is typical of the data. It pinpoints the center of a distribution. The arithmetic mean, weighted mean, median, mode, and geometric mean are measures of location
sample
A subset of the population.
sample survey
A survey to collect data on a sample.
census
A survey to collect data on the entire population.
Crosstabulation
A tabular summary of data for two variables. The classes for one variable are represented by the rows; the classes for the other variable are represented by the columns.
relative frequency distribution
A tabular summary of data showing the fraction or proportion of observations in each of several nonoverlapping categories or classes.
frequency distribution
A tabular summary of data showing the number (frequency) of observations in each of several nonoverlapping categories or classes.
percent frequency distribution
A tabular summary of data showing the percentage of observations in each of several nonoverlapping classes.
cumulative frequency distribution
A tabular summary of quantitative data showing the number of data values that are less than or equal to the upper class limit of each class.
five number summary
A technique that uses five numbers to summarize the data: smallest value, first quartile, median, third quartile, and largest value.
Central Limit Theorem definition
A theorem that enables one to use the normal probability distribution to approximate the sampling distribution of x bar whenever the sample size is large.
nonsampling error
All types of errors other than sampling error, such as coverage error, nonresponse error, measurement error, interviewer error, and processing error.
sample point
An element of the sample space. A sample point represents an experimental outcome.
outlier
An unusually small or unusually large data value.
big data
Any set of data that is too large or too complex to be handled by standard data-processing techniques and typical desktop software.
data dashboards
Collections of tables, charts, maps, and summary statistics that are updated as new data become available
mutually exclusive
Events that cannot occur at the same time. P(A n B) = 0 Two events are said to be mutually exclusive if the events have no sample points in common.
The level of confidence is reflected by values of the appropriate sampling distribution.
For example, when estimating a population mean if the sampling distribution of sample means is normal and the population standard deviation σ value is known, values of the normal distribution associated with the assigned level of confidence are used.
data
Facts and statistics collected together for reference or analysis
pth percentile
For a data set containing n observations, the pth percentile divides the data into two parts: Approximately p% of the observations are less than the pth percentile and approximately (100 − p)% of the observations are greater than the pth percentile.
categorical data
Labels or names used to identify an attribute of each element. Uses either the nominal or ordinal scale of measurement and may be nonnumeric or numeric.
Symmetric Skewness
Left tail is the mirror image of the right tail mean = median
Point Estimate +/-
Margin of Error
nonresponse error
Nonsampling error that results when potential respondents that belong to some segment(s) of the population are less likely to respond to the survey mechanism than potential respondents that belong to other segments of the population.
coverage error
Nonsampling error that results when the research objective and the population from which the sample is to be drawn are not aligned.
Quantitative data
Numeric values that indicate how much or how many of something. Obtained using either the interval or ratio scale of measurement.
P(B | A)
Probability of B given A
moderately skewed left
Skewness is negative; mean will usually be less than the median; longer tail to the left
Key Performance Indicators
Specific criteria used to measure the efficiency and effectiveness of the business's performance
Statistics
The art and science of collecting, analyzing, presenting, and interpreting data.
sampling error
The error that occurs because a sample, and not the entire population, is used to estimate a population parameter.
complement of a
The event consisting of all sample points that are not in A.
weighted mean
The mean obtained by assigning each observation a weight that reflects its importance. calculated using =sumproduct
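The =SUMPRODUCT approach amounts to summing value times weight and dividing by the total weight. A small sketch with made-up scores and weights:

```python
def weighted_mean(values, weights):
    """Sum of value * weight divided by the sum of the weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical: a score of 80 with weight 1 and a score of 90 with weight 3.
print(weighted_mean([80, 90], [1, 3]))  # 87.5
```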
sigma unknown
The more common case when no good basis exists for estimating the population standard deviation prior to taking the sample. The interval estimation procedure uses the sample standard deviation s in computing the margin of error.
independent events
The outcome of one event does not affect the outcome of the second event
bayes theorem
The probability of an event occurring based upon other event probabilities.
intersection of events
The probability that Events A and B both occur is the probability of the intersection of A and B. The probability of the intersection of Events A and B is denoted by P(A ∩ B).
confidence coefficient
The probability that the interval estimation procedure will generate an interval that contains the actual value of the population parameter being estimated The confidence level expressed as a decimal value. For example, .95 is the confidence coefficient for a 95% confidence level.
level of significance
The probability that the interval estimation procedure will generate an interval that does not contain μ (population mean) 1- confidence coefficient
statistical inference
The process of using data obtained from a sample to make estimates or test hypotheses about the characteristics of a population.
point estimator
The sample statistic, such as x bar, s, or p bar, that provides the point estimate of the population parameter.
sampling distribution of p bar
The sampling distribution of p bar is the probability distribution of all possible values of the sample proportion p bar.
population
The set of all elements of interest in a particular study.
sample space
The set of all experimental outcomes.
increased confidence
less precision
There is a 1 - α probability that the value of a sample mean will provide
a margin of error of z_(α/2) σ_x̄ or less.
classical method
a method of assigning probabilities that is appropriate when all the experimental outcomes are equally likely
random variable
a numerical description of the outcome of an experiment; can be discrete or continuous
measure of central tendencies
a single value that describes the way in which a group of data cluster around a central value. To put in other words, it is a way to describe the center of a data set. Mean, median, mode
point estimator
a statistic that provides an estimate of a population parameter
point estimate
a summary statistic from a sample that is just one number used as an estimate of the population parameter
A graphical device for depicting quantitative data
histogram
The union of the two events A and B is the event containing
all the sample points belonging to A or B or both
what is the result of the interval estimate of a population parameter
an interval estimate comprised of two values, the Lower Confidence Limit (LCL) and the Upper Confidence Limit, UCL.
in some applications, large amounts of data are known
and can be used to estimate the population standard deviation prior to sampling
The highest point on the normal curve is at the
mean, which is also the median and mode
graphical device for depicting qualitative data that have been summarized in a frequency distribution
bar graph
A random sample of 81 automobiles traveling on a section of an interstate showed an average speed of 60 mph. The distribution of speeds of all cars on this section of highway is normally distributed, with a standard deviation of 13.5 mph. If the sample size was 25 (other factors remain unchanged), the interval for μ would _____.
become wider
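The answer can be checked numerically. The problem statement does not give a confidence level, so the sketch below assumes 95%; the function name is illustrative. The sample mean, σ, and both sample sizes come from the problem.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Sigma-known interval estimate: x-bar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

interval_n81 = z_interval(60, 13.5, 81)  # roughly 60 +/- 2.94
interval_n25 = z_interval(60, 13.5, 25)  # roughly 60 +/- 5.29
print([round(v, 2) for v in interval_n81])
print([round(v, 2) for v in interval_n25])
```

With the smaller sample the standard error grows from 13.5/9 to 13.5/5, so the interval widens.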
As the degrees of freedom increase, the difference between the t distribution and the standard normal probability distribution
becomes smaller and smaller.
weakest correlation
correlation coefficient that is closest to 0
p
population proportion
The summation of the class percentages in a relative percent frequency distribution
equals 100%
interval estimate
estimate of a population parameter that provides an interval believed to contain the value of the parameter
point estimation
form of statistical inference. "Point" is singular; think of a single value in a data set as a single point.
a t distribution with more degrees of freedom
has less dispersion
In this report, "Confidence"
is the margin of error as calculated by excel using the appropriate t - value.
s
is the point estimator of the population standard deviation σ
in the INFINITE population case, The standard deviation of the sampling distribution of sample proportions
is the standard error of proportion, equal to ((p*(1-p))/n)^.5
In the FINITE population case, The standard deviation of the sampling distribution of sample proportions
is the standard error of proportion, equal to ((p*(1-p))/n)^.5; but further multiplied by the finite correction factor
in the INFINITE population case, The standard deviation of the sampling distribution of sample means
is the standard error, equal to the population standard deviation divided by the square root of the sample size
z_(α/2)
is the z value providing an area of α/2 in the upper tail of the standard normal probability distribution
interval estimate of a population parameter
is then the point estimate, ie, a sample statistic computed from a sample of the population, plus and minus the margin of error
smaller the sample size
larger margin of error
interval estimate
margin of error
variance of the data set
may be smaller, equal, or larger than the standard deviation
addition rule is used for
mutually exclusive events, union of events, "or" probabilities
larger sample
narrower interval
lower confidence level
narrower interval
the mean of the distribution can be any numerical value
negative, zero, or positive
scales of measurement
nominal, ordinal, interval, ratio
alpha is the complement
of the level of confidence in decimal
for sigma known samples, we use
population standard deviation
Interval Estimate of a Population Proportion p
p-bar +/- margin of error
upper limit
point estimate + margin of error
lower limit
point estimate - margin of error
p bar
point estimator of the population proportion p.
Because different samples provide different values for the point estimators,
point estimators such as x bar and p bar are random variables.
μ
population mean
interval estimate estimates
population parameter
sampling distribution of x bar
is the probability distribution of all possible values of the sample mean x bar.
point estimate
sample mean
x-bar
sample mean
p bar
sample proportion
n
sample size
degrees of freedom
sample size minus one; a parameter of the t distribution. When the t distribution is used in the computation of an interval estimate of a population mean, the appropriate t distribution has n-1 degrees of freedom, where n is the sample size
if your degrees of freedom were 10
sample size was 11
for sigma unknown samples, we use
sample standard deviation
moderately skewed right
skewness is positive; mean will usually be more than the median; longer tail to the right
larger the sample size
smaller margin of error
sampling distribution of p bar formula
square root of p(1-p) divided by n where p is the population proportion and n is the sample size
population variance =
square of the population standard deviation (σ²)
standard error formula
standard deviation/ square root of n
as the degrees of freedom increase
the difference between the t distribution and the standard normal probability distribution becomes smaller and smaller.
variety
the diversity in types and structures of data generated
elements
the entities on which data are collected, run vertically down the sheet on the left side
confidence level
the estimated probability that a population parameter lies within a given confidence interval
Note, as the sample size increased,
the interval estimate became narrower, i.e., more precise.
in the sigma unknown case
the interval estimate for μ is based on the t distribution.
x̄
the point estimator of the population mean μ
use σ_x̄ = σ/√n whenever
the population is infinite, or, the population is finite and the sample size is less than or equal to 5% of the population size; that is, n/N <= 0.05
In order to develop an interval estimate of a population mean, the margin of error must be computed using either:
the population standard deviation or the sample standard deviation
Hypergeometric probability distribution
the probability distribution that is applied to determine the probability of x successes in n trials when the trials are not independent
z score
the standardized value: the number of standard deviations a data value is away from the mean; calculated using =standardize
veracity
the reliability of the data generated
population has a normal distribution
the sampling distribution of x bar is normally distributed for any sample size.
observation
the set of measurements obtained for a particular element, each number is an observation
velocity
the speed at which the data are generated
standard error of the mean
the standard deviation of the sampling distribution of x bar
if you do not know population standard deviation
use t not z
Population does not have a normal distribution
use the central limit theorem: in selecting random samples of size n from a population, the sampling distribution of the sample mean x bar can be approximated by a normal distribution as the sample size becomes large.
Summarizing Data for a Categorical Variable
uses frequency distribution, relative frequency distribution, bar charts, and pie charts as it is labels and names
summarizing data for a quantitative variable
uses frequency distribution, relative frequency, dot plots, and histograms as it is numeric values that indicate how much or how many of something
Inferential Statistics
using data from a sample of items taken from a larger population of items to make estimates and test hypotheses about characteristics of the population
if t value is larger than z value,
we are going to have a larger margin of error
if we use the sample standard deviation to estimate population standard deviation,
we are in the sigma unknown case
In point estimation, to estimate the value of population parameter,
we can compute the corresponding characteristic of the sample, referred to as the sample statistic.
sigma is rarely known exactly, but often a good estimate can be obtained based on historical data or other information.
we refer to such cases as the sigma known case
Because 95% of all the intervals constructed using x̄ ± 1.96σ_x̄ will contain the population mean,
we say we are 95% confident that the interval x̄ ± 1.96σ_x̄ includes the population mean μ.
the population standard deviation is not known, therefore
we substitute the sample standard deviation s in the calculation of the standard error; this further requires the use of a t-value instead of a z-value in the calculation of the margin of error
BECAUSE WE ARE ASKING ABOUT SAMPLE MEAN RESULTS, in our NORM.DIST commands where prompted by excel for "x"
we use "x-bar"!
If an estimate of the population standard deviation σ cannot be developed prior to sampling,
we use the sample standard deviation to estimate population standard deviation
Interval Estimate of a Population Mean μ
x-bar +/- margin of error
general form of an interval estimate of the population mean
x-bar +/- margin of error
90% confidence interval
z = 1.645
99% confidence interval
z = 2.576
95% confidence interval
z= 1.96
A t distribution with more degrees of freedom
has less dispersion.