Quantitative Research Techniques + Statistics - Peregrine
In a sample proportion, represented by p = x/n, what does x refer to?
The number of successes in the sample
Probability of any outcome must be between
0 and 1
Sum of all probabilities must equal: ____
1
sampling distribution of the mean
1. The sampling distribution of the mean has the same mean as the original population. 2. The standard deviation of the sampling distribution of the mean is called the standard error. 3. If the original population is not normally distributed, the sampling distribution of the mean is still approximately normal when the sample size is large.
Suppose you are given 3 numbers that relate to the number of people in a sample. The three numbers are 10, 20 and 30. If the standard deviation is 10, the standard error equals
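5.77. The standard error is the standard deviation divided by the square root of the sample size: 10 / sqrt(3) ≈ 5.77. (The sample standard deviation of 10, 20, 30 is itself 10, which can be checked in Excel.) See the Bozeman Science video on standard error.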
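The arithmetic on this card can be checked with a short Python sketch. The helper name `standard_error` is made up for illustration; only the standard library is assumed.

```python
import math
import statistics

def standard_error(std_dev, n):
    """Standard error of the mean: standard deviation / sqrt(sample size)."""
    return std_dev / math.sqrt(n)

data = [10, 20, 30]
s = statistics.stdev(data)          # sample standard deviation (divides by n - 1)
se = standard_error(s, len(data))   # 10 / sqrt(3)
print(round(s, 2), round(se, 2))    # 10.0 5.77
```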
Find the sample standard deviation of 5, 10, 15 and 20.
6.45
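A minimal check of this answer, assuming Python's standard `statistics` module. The sample version divides by n - 1; the population version (shown only for contrast) divides by n.

```python
import statistics

data = [5, 10, 15, 20]
sample_sd = statistics.stdev(data)       # divides by n - 1
population_sd = statistics.pstdev(data)  # divides by n
print(round(sample_sd, 2))      # 6.45
print(round(population_sd, 2))  # 5.59
```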
Confidence level + significance level
= 1
one-tailed test
A hypothesis test in which rejection of the null hypothesis occurs for values of the test statistic in one tail of its sampling distribution.
simple random sampling
A sampling procedure in which each member of the population has an equal probability of being included in the sample. (example - tickets in a bin each assigned a number)
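The tickets-in-a-bin idea can be sketched in Python as follows. The population of 100 numbered tickets, the sample size of 10, and the seed are all assumptions chosen for illustration.

```python
import random

rng = random.Random(42)                 # fixed seed so the draw is repeatable
population = list(range(1, 101))        # hypothetical tickets numbered 1..100
sample = rng.sample(population, k=10)   # every ticket equally likely, no repeats
print(sorted(sample))
```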
mutually exclusive
Events that cannot occur at the same time.
union of two events
Given by the outcomes that belong either to one or to both events.
Type 1 error (criminal trial example)
An innocent person is wrongly convicted
Significance level
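The probability of rejecting the null hypothesis when it is true; along with the confidence level, it measures the reliability of a statistical inference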
Calculate conditional probability
P(A1 | B2) = P(A1 and B2) / P(B2)
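As a small worked example, a conditional probability can be computed from a table of joint probabilities: sum the joint probabilities to get the marginal, then divide. The numbers in the table and the helper name `marginal` are hypothetical.

```python
from fractions import Fraction as F

# Hypothetical joint probabilities P(Ai and Bj); they sum to 1.
joint = {
    ("A1", "B1"): F(1, 10), ("A1", "B2"): F(3, 10),
    ("A2", "B1"): F(2, 10), ("A2", "B2"): F(4, 10),
}

def marginal(event, position):
    """Sum the joint probabilities in which `event` appears at `position`."""
    return sum(p for key, p in joint.items() if key[position] == event)

p_b2 = marginal("B2", 1)                      # P(B2)
p_a1_given_b2 = joint[("A1", "B2")] / p_b2    # P(A1 and B2) / P(B2)
print(p_b2, p_a1_given_b2)  # 7/10 3/7
```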
Addition Rule of Probability
P(A or B) = P(A) + P(B) - P(A and B)
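The rule can be verified on a fair six-sided die. The events chosen here (A = even number, B = greater than 3) are assumptions for illustration only.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even number
B = {4, 5, 6}   # greater than 3

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    return Fraction(len(event), len(outcomes))

lhs = prob(A | B)                         # P(A or B) counted directly
rhs = prob(A) + prob(B) - prob(A & B)     # addition rule
print(lhs, rhs)  # 2/3 2/3
```

Subtracting the intersection keeps the outcomes 4 and 6 from being counted twice.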
inferential statistics
Process of using sample statistics (mathematics) to draw conclusions about population parameters
confidence level
Proportion of times that an estimating procedure will be correct
Type 1 error
Rejecting null hypothesis when it is true
Central Limit Theorem (CLT)
The theorem stating that the sampling distribution of a statistic (e.g., x̄) is approximately normal whenever the sample is large and random. It allows us to draw conclusions about the population based strictly on sample data.
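A rough simulation can make the theorem concrete. This sketch assumes a skewed population (exponential with mean 1 and standard deviation 1), a sample size of 50, and 2,000 repeated samples; all of these choices are illustrative.

```python
import random
import statistics

rng = random.Random(0)   # fixed seed for repeatability
n = 50                   # size of each sample
num_samples = 2000       # number of repeated samples

# Draw many samples from a skewed population and record each sample mean.
sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

mean_of_means = statistics.fmean(sample_means)   # should be near 1
sd_of_means = statistics.stdev(sample_means)     # should be near 1 / sqrt(50)
print(round(mean_of_means, 3), round(sd_of_means, 3))
```

A histogram of `sample_means` would look roughly bell-shaped even though the population itself is strongly skewed.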
p-value
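The probability, assuming the null hypothesis is true, of observing a test statistic at least as extreme as the one computed from the sample. It forms the basis for deciding whether results are statistically significant: a small p-value means the result is unlikely to be due to chance alone.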
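A minimal sketch of a p-value calculation for a z test, assuming Python 3.8+ (for `statistics.NormalDist`). The observed z statistic of 2.0 and the 0.05 significance level are hypothetical values.

```python
from statistics import NormalDist

z = 2.0          # hypothetical observed z statistic
alpha = 0.05     # hypothetical significance level

p_one_tailed = 1 - NormalDist().cdf(z)   # area in the upper tail beyond z
p_two_tailed = 2 * p_one_tailed          # both tails, for a two-tailed test
print(round(p_one_tailed, 4), round(p_two_tailed, 4))  # 0.0228 0.0455
print(p_two_tailed < alpha)  # True
```

Because the two-tailed p-value falls below the significance level, the null hypothesis would be rejected in this made-up case.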
The process of using sample statistics to draw conclusions about population parameters is called
inferential statistics
What is needed to compute posterior probabilities?
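The prior probabilities P(sj), the likelihood probabilities P(li | sj), and the sum of all the joint probabilities P(sj and li), which gives the marginal probability P(li)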
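The calculation can be sketched in a few lines of Python. The two states, their prior probabilities, and the likelihoods below are invented numbers for illustration.

```python
from fractions import Fraction as F

priors = {"s1": F(3, 10), "s2": F(7, 10)}       # hypothetical P(s_j)
likelihoods = {"s1": F(4, 5), "s2": F(1, 5)}    # hypothetical P(outcome | s_j)

# Joint probability of each state together with the observed outcome.
joints = {s: priors[s] * likelihoods[s] for s in priors}
# Marginal probability of the observed outcome: the sum of the joints.
p_outcome = sum(joints.values())
# Posterior probability of each state given the outcome.
posteriors = {s: joints[s] / p_outcome for s in priors}
print(p_outcome, posteriors["s1"], posteriors["s2"])  # 19/50 12/19 7/19
```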
Union of two events
The union of events A and B is the event containing all sample points that are in A or B or both
multiplication rule
Gives the probability that two events both occur: P(A and B) = P(A) × P(B | A). When the events are independent, this reduces to P(A) × P(B).
standard deviation
a computed measure of how much scores vary around the mean score
Parameter
a descriptive measure of a population
statistic
a number that describes a sample
A company developed a smartphone whose average lifetime is unknown. In order to estimate the average, 200 smartphones are randomly selected from a large production line and tested; their average is found to be 5 years. The 200 smartphones represent:
a sample
stratified random sample
A sample drawn from selected subgroups (strata) of the target population, in which everyone in each subgroup has an equal chance of being included. Example: the manager of a customer service division wants to know whether customers from the past 12 months are satisfied with their purchase of CDs. There are four types of CDs, so each type can serve as a stratum.
unbiased estimator
a sample statistic which has an expected value equal to the value of the population parameter
Parameter
a summary measure that is computed to describe a characteristic of an entire population
Classical Approach to Probability
Associated with games of chance. If an experiment has n equally likely possible outcomes, this method assigns a probability of 1/n to each outcome. For example, the probabilities of heads and tails in the flip of a coin are equal to each other. Because the sum must be 1, each probability is 1/2. For a die, each face has probability 1/6.
mean
average
Bayes Law
calculates posterior probability
Three approaches to assigning probability
classical approach, relative frequency approach, subjective approach
relative frequency approach to probability
Defines probability as the long-run relative frequency with which an outcome occurs. Example: of the last 1,000 students who took a course, 200 got an A. The relative frequency, 20%, is an estimate of the probability that a student will get an A.
Complement Probability
The complement of event A, written A^c, is the event that occurs when A does not occur. P(A) + P(A^c) = 1
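The complement rule is handy for "at least one" questions. As a sketch, assume four independent rolls of a fair die and ask for the probability of at least one six; the scenario is an illustration, not part of the card.

```python
from fractions import Fraction

p_no_six_one_roll = Fraction(5, 6)
p_no_six_in_four = p_no_six_one_roll ** 4    # independent rolls multiply
p_at_least_one_six = 1 - p_no_six_in_four    # complement rule
print(p_at_least_one_six)  # 671/1296, a little over one half
```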
The branches in a decision tree are equivalent to
events and acts
Direct observation
Example: counting backpacks on campus for a day
Type 2 error
False negative: failing to reject the null hypothesis when it is false
measure of variability
how closely scores bunch up around the central point; a statistic that indicates the spread of a distribution
exhaustive
A set of events is exhaustive if together the events include every possible outcome, so at least one of them must occur
prior probabilities
initial estimates of the probabilities of events
The confidence level
measures the proportion of times an estimation procedure will be correct in the long run
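The "long run" reading can be illustrated by simulation. This sketch assumes a normal population with known mean 50 and standard deviation 10, samples of size 25, and a 95% confidence level; every number here is a made-up assumption.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(1)            # fixed seed for repeatability
mu, sigma, n = 50.0, 10.0, 25     # hypothetical population and sample size
confidence = 0.95

# z value for a two-sided interval, about 1.96 at 95% confidence.
z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
margin = z * sigma / n ** 0.5

trials = 2000
hits = 0
for _ in range(trials):
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = fmean(sample)
    # Does this interval capture the true mean?
    if xbar - margin <= mu <= xbar + margin:
        hits += 1

coverage = hits / trials          # should land close to 0.95
print(round(coverage, 3))
```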
stratified random sampling
obtained by separating the population into mutually exclusive sets of data (strata) then drawing a simple random sample from each stratum
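A minimal sketch of the procedure, assuming three hypothetical strata and an equal sample of five from each. Real designs often sample in proportion to stratum size instead.

```python
import random

rng = random.Random(7)   # fixed seed for repeatability

# Hypothetical population already separated into mutually exclusive strata.
strata = {
    "type_A": [f"A{i}" for i in range(100)],
    "type_B": [f"B{i}" for i in range(60)],
    "type_C": [f"C{i}" for i in range(40)],
}

per_stratum = 5
# Draw a simple random sample from each stratum separately.
sample = {name: rng.sample(members, per_stratum) for name, members in strata.items()}
for name, chosen in sample.items():
    print(name, chosen)
```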
non-sampling error
occurs when the sample data are incorrectly collected, recorded, or analyzed. Three types of errors: data acquisition errors, non-response errors (or bias), selection bias
statistical inference
process of making an estimate, prediction, or decision about a population based on a sample
cluster sampling
random sample of the groups or clusters of elements versus a simple random sample of individual objects
Suppose a population of blue whales is 8,000. Researchers are able to gather a sample of oceanic movements from 100 blue whales within the population. Thus: (a) researchers do not have a significant sample size; (b) researchers need a larger population of blue whales; (c) the finite population correction factor is necessary; (d) researchers can ignore the finite population correction factor.
researchers can ignore the finite population correction factor
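The reasoning can be checked numerically. This sketch uses the usual formula sqrt((N - n) / (N - 1)) for the correction factor; the common rule of thumb that it can be ignored when the sample is under about 5% of the population is an assumption and varies by textbook.

```python
import math

N = 8000   # population size
n = 100    # sample size

sampling_fraction = n / N
fpc = math.sqrt((N - n) / (N - 1))   # finite population correction factor
print(sampling_fraction)  # 0.0125
print(round(fpc, 4))      # 0.9938
```

With a factor this close to 1, applying it would barely change the standard error.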
Design of a good survey components
short survey, short/simple questions, start with demographic questions, use dichotomous (yes-no) and multiple-choice questions
the standard error of the sample mean
the standard deviation of the distribution of sample means: standard deviation / sqrt(sample size)
The hypothesis of most interest to the researcher is:
the alternative hypothesis
Variance
the average of the squared deviations from the mean
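A short worked example, using the made-up data set 5, 10, 15, 20. Dividing by the number of values gives the population variance; dividing by one less gives the sample variance.

```python
import math
import statistics

data = [5, 10, 15, 20]
mean = statistics.fmean(data)                     # 12.5
squared_devs = [(x - mean) ** 2 for x in data]    # 56.25, 6.25, 6.25, 56.25

population_variance = sum(squared_devs) / len(data)      # divide by n
sample_variance = sum(squared_devs) / (len(data) - 1)    # divide by n - 1
print(population_variance)           # 31.25
print(round(sample_variance, 2))     # 41.67

# The standard deviation is the square root of the variance.
print(round(math.sqrt(sample_variance), 2))  # 6.45
```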
Business Statistics
the collection, summarization, analysis, and reporting of numerical findings relevant to a business decision or situation
range
the difference between the highest and lowest scores in a distribution
sampling error
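the difference between a sample statistic and the population parameter it estimates, which arises because only a sample is observed; it varies from one random sample to another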
Conditional Probability
the probability that one event occurs given that another event has occurred, written P(A | B)
Median
the middle score in a distribution; half the scores are above it and half are below it
Mode
the most frequently occurring score(s) in a distribution
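The mean, median, mode, and range can be compared side by side on one small data set. The scores below are hypothetical, and the sketch assumes Python 3.8+ for `statistics.fmean`.

```python
import statistics

scores = [2, 3, 3, 5, 7, 10]   # hypothetical scores

print(statistics.fmean(scores))    # 5.0  (sum 30 divided by 6 scores)
print(statistics.median(scores))   # 4.0  (midpoint of the two middle scores, 3 and 5)
print(statistics.mode(scores))     # 3    (occurs twice)
print(max(scores) - min(scores))   # 8    (range: highest minus lowest)
```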
Three concepts that a statistical inference problem contains
the population, the sample, and the statistical inference
subjective approach to probability
the probability is obtained on the basis of personal judgement; often the only method of assigning likelihood to an outcome; educated guess based on knowledge available
Marginal Probability
the probability of a single event without consideration of any other event
Joint Probability
the probability of the intersection of two events
sample space
the set of all possible outcomes of a probability experiment
Variance
the square of the standard deviation
standard deviation
the square root of the variance; it provides a measure of the standard, or typical, distance from the mean
standard error
the standard deviation of a sampling distribution
descriptive statistics
uses graphical or numerical techniques to summarize and present data