Stats Exam 1 Online Assessments Quiz
You roll 10 dice and note the sum of this first roll. You then keep the results of 3 dice, reroll the other 7, and note the sum of all 10 dice after this second roll. You repeat this procedure 100 times. The expected correlation between the sums of the first rolls and the sums of the second rolls is
.3
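A rough simulation can make the .3 plausible: the two sums share 3 of their 10 dice, so they share 3/10 of their variance. The sketch below is illustrative only; the function names, the seed, and the use of 20,000 repetitions (instead of the 100 in the question, which would give a much noisier estimate) are assumptions.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n_trials, n_dice=10, n_kept=3, seed=0):
    """Roll n_dice, keep n_kept of them, reroll the rest, correlate the two sums."""
    rng = random.Random(seed)
    first_sums, second_sums = [], []
    for _ in range(n_trials):
        dice = [rng.randint(1, 6) for _ in range(n_dice)]
        first_sums.append(sum(dice))
        # Keep the first n_kept dice and reroll the remaining ones.
        rerolled = [rng.randint(1, 6) for _ in range(n_dice - n_kept)]
        second_sums.append(sum(dice[:n_kept]) + sum(rerolled))
    return pearson_r(first_sums, second_sums)


r_estimate = simulate(20_000)
print(round(r_estimate, 2))  # should land close to 3/10
```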
The probabilities of all outcomes in a sample space add up to
1
Why is the MBTI so unreliable?
1. Because the underlying distribution of scores in each dimension is unimodal, not bimodal 2. Because there are 4 independent dimensions and we categorize each one 3. Because the scores of most people fall close to the decision boundary in each dimension
Why are anecdotes not data?
1. Because they do not result from a formal measurement process 2. It is unclear how representative one's subjective experience is, and science strives for intersubjective truths 3. People have all kinds of cognitive biases - who knows if the anecdotes are even true
Why does correlation not imply causality?
1. Correlation is just a statistical relation 2. The relation between two variables in a correlation is bidirectional 3. There are other variables that could mediate the observed correlation
What are data?
1. Data are both "born and made" 2. Data come about as a result of a measurement process Wrong: - Data are given to you - by gods and/or nature (depending on your metaphysics) - Data are made by people
Given measurements on a ratio scale, we can
1. Interpret the distance b/w two numbers 2. Interpret the ratio b/w two numbers 3. Interpret the relative magnitude of two numbers 4. Assume that the scale has an absolute zero
Computer science is the study of algorithms to automate processes. A computer scientist specializes in the theory of computation and the design of computational systems. Given this definition, is computer science a science?
1. No. Computer science is generally concerned with abstract statements about mathematical structures. 2. No. There is no data in computer science, generally speaking
Which statement about experiments is accurate?
1. Participants are randomly assigned to conditions 2. Experiments potentially allow one to establish causality (in contrast to observational studies) 3. All potential confounds are controlled by randomization
What are some challenges to learning statistics?
1. Statistics is not intuitive 2. Most people are exposed to statistical concepts only late in their educational career 3. Most people who are teaching statistics are not optimally suited to communicate the concepts in an engaging way
The black swan problem concerns:
1. The fact that induction can never deliver certainty 2. The fact that something that has never happened before is not impossible Wrong: - The uncertainty of deduction. - The plural of anecdotes.
Why is statistical literacy increasingly important?
1. We have more and more methods to record data 2. We have better and better ways to analyze data 3. Data is becoming ever more ubiquitous 4. All fields of human endeavor are increasingly relying on data for decision making
Which of these plots is most consistent with the variable on the x-axis being independent of the variable on the y-axis?
A and D
Induction is
A method to derive general principles from specific instances Wrong: - A method to make conclusions about specifics from general principles. - A method to create certainty.
Deduction is
A method to make conclusions about specifics from general principles Wrong: - A method to derive general principles from specific instructions - A method to deal with uncertainty
You measure IQ in a group of 200 people with severe learning disorders. In general, this sample is described well by a mean of 75 and a standard deviation of 15. However, one person who absent-mindedly wandered in from the Mensa convention next door was also tested and measured at an IQ of 142. The measure of central tendency that is most affected by this is likely the A. Mean B. Median C. Mode D. SD E. Mean average deviation
A. Mean
| |
Absolute value
Interval Scale
Adding
A general interpretation of the average is that of
An expected value Wrong: - A measure of unity - A measure of dispersion - A measure of diversion
The best way to control for confounds is to do
An experiment
Deduction
Applying general rules to specific instances Certainty
You attend the first lecture of a course and estimate the probability that you will get an "A" in this course. The next lecture, you adjust your estimate of getting an "A" based on the content of the 2nd lecture. This process implies which interpretation of probability?
Bayesian/Subjective Wrong: - Frequentist/Objective - Induction - Anecdotal - Weighted Probability
Someone asserts that "Frogs are green". How would you test this assertion?
By looking for a frog that is not green, e.g. a red frog. Wrong: - By finding a green frog - By remembering that all the frogs you have ever seen were green
Experiment
Causality
χ2
Chi square
Induction
Coming up with general rules based on specific instances Uncertainty
Lack of independence
Conditional Probability
Nominal Scale
Counting
You present Nature with the numbers 3, 6 and 9 and ask whether they are members of the set. Each time, Nature answers that yes, these numbers are members of the set. You conclude that the rule is: "Consecutive multiples of 3 are members of the set". Which of these is the best (maximally informative) number to try next in order to test this rule? A. 12 B. 21 C. 18 D. 15 E. 2
E. 2
Showing that your experimental result replicates (holds) in a population you haven't studied yet is a good example of
External Validity
Science
Falsification Data
Γ
Gamma (function)
Which statement about instruments of measurement is (most) accurate?
If an instrument is not objective, it cannot be reliable or valid
Which statement about instruments of measurement is accurate?
If the instrument is not reliable, it can't be valid
i
Index or instance
Multiplication Rule
Intersection
∩
Intersection of
Standardized tests of achievement like the SAT are a good example of measures on a
Interval Scale
A key characteristic of science is that it
Involves both induction and deduction in an iterative fashion Wrong: - involves deduction - involves induction - creates undeniable facts
The principal problem of the range as a measure of dispersion is that it
Is extremely sensitive to outliers Wrong: - too robust to outliers - Requires ratio scale data - Is correlated with the mean
Which of the following is not a valid description of what a probability is
Is: - Degree of Belief - Quantified plausibility - Relative Frequency - A number between 0 and 1
The probability of the intersection of A and B is equal to
It can't be determined from this information, it depends on whether A and B are independent or not
Given measurements on a nominal scale, numbers should be interpreted as
Labels and Categories
Median
Mean absolute deviation
Given measurements on an ordinal scale, we can
Meaningfully interpret the relative magnitude of two numbers Wrong: - interpret distance between two numbers - Assume the scale has an absolute zero - interpret ratio between two numbers
Data
Measurement
Given measures on an ordinal scale, it is meaningful to interpret the
Mode and Median
The mode of a sample is its
Most common value
Is medicine a science?
No, it is primarily concerned with fixing things that are wrong - specifically with healing the sick - not with a principled understanding of the natural world. Wrong: - Yes, doctors use tests to make decisions
Is math a science?
No. Math is entirely deductive, so it doesn't qualify as a science Wrong: - Yes, math is inductive
Taking note of someone's gender is a good example of taking measures on a
Nominal Scale
~
Not
¬
Not
n
Number of observations
Multiple linear regression is indicated when
One has multiple predictors
A natural experiment is most suitable when
One would like to establish causality, but cannot do an experiment for technical or ethical reasons
Ordinal Scale
Ordering
A Likert scale is a good example of measures on a
Ordinal Scale
Asking someone how much pain they are in, on a scale from 0 to 10 is a good example of a
Ordinal Scale
r
Pearson's product-moment correlation
σ
Population Standard Deviation ("lowercase sigma")
μ
Population mean "mu"
p( )
Probability of
If events A and B are not mutually exclusive, the probability of their union (A happening or B happening) is given by the
Probability of A plus the probability of B minus the probability of A and B
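A small worked case can show why the overlap has to be subtracted. The sketch below assumes one fair six-sided die and two illustrative events (A: the roll is even, B: the roll is greater than 3); the event choice and names are not from the quiz.

```python
from fractions import Fraction

outcomes = set(range(1, 7))                      # one fair six-sided die
event_a = {o for o in outcomes if o % 2 == 0}    # A: even -> {2, 4, 6}
event_b = {o for o in outcomes if o > 3}         # B: greater than 3 -> {4, 5, 6}


def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))


p_a = prob(event_a)                  # 1/2
p_b = prob(event_b)                  # 1/2
p_a_and_b = prob(event_a & event_b)  # {4, 6} -> 1/3
p_a_or_b = prob(event_a | event_b)   # {2, 4, 5, 6} -> 2/3

# Adding p(A) and p(B) counts the overlap twice, so it is subtracted once.
print(p_a + p_b - p_a_and_b == p_a_or_b)  # True
```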
Mode
Range
You measure reaction times in a face recognition task. Most of your measures follow a normal distribution with a mean of 1.2 seconds and a standard deviation of 0.3 seconds. However, in one of the trials, the study participant didn't pay attention and didn't make a response for 20 seconds. The dispersion measure most affected by this is the A. Mean Average Deviation B. Mode C. Median D. Standard Deviation E. Range
Range
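To see how differently the dispersion measures react to a single lapse, the sketch below simulates reaction times and appends one 20-second trial. The sample size of 200, the seed, and the helper names are assumptions for illustration; with very small samples the ordering between range and standard deviation can come out differently.

```python
import random
import statistics

rng = random.Random(42)
# Simulated reaction times: normal with mean 1.2 s and SD 0.3 s.
times = [rng.gauss(1.2, 0.3) for _ in range(200)]
with_lapse = times + [20.0]  # one inattentive trial


def value_range(xs):
    return max(xs) - min(xs)


def mean_abs_dev(xs):
    """Mean absolute deviation from the mean."""
    m = statistics.fmean(xs)
    return statistics.fmean([abs(x - m) for x in xs])


# Factor by which each measure grows once the lapse is included.
range_ratio = value_range(with_lapse) / value_range(times)
sd_ratio = statistics.stdev(with_lapse) / statistics.stdev(times)
mad_ratio = mean_abs_dev(with_lapse) / mean_abs_dev(times)

print(round(range_ratio, 1), round(sd_ratio, 1), round(mad_ratio, 1))
```

In this setup the range grows the most, the standard deviation less, and the mean absolute deviation least.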
Reaction times are a good example of measures used in psychology that can be considered to be on a
Ratio Scale
Interval scales allow us to meaningfully interpret measures in all ways except for
Ratios between two measures
ρ
Rho - Spearman's rank correlation
X bar
Sample Mean
s
Sample Standard Deviation
Knowledge
Science
Mean
Standard Deviation
∑
Sum of ("uppercase sigma")
The two operations that underlie the calculation of the mean and the median, respectively are
Summing and ordering
The fundamental problem with induction is that
The conclusions one draws with inductive methods can be wrong Wrong: - It is only useful if one has a small number of observations - It is the complement to deduction - One needs data to do it
Conceptually, a correlation is
The covariance normalized by the product of the standard deviations Wrong: - the median normalized by the product of SDs - The covariance normalized by the sum of SDs - The covariance normalized by the product of the mean average deviations - The mean normalized by the product of SDs
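The definition can be followed step by step on a tiny data set. The five (x, y) pairs below are made up for illustration, and the variable names are assumptions.

```python
import statistics

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = statistics.fmean(xs)
mean_y = statistics.fmean(ys)

# Sample covariance: average cross-product of deviations (n - 1 denominator).
covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

# Normalizing by the product of the sample standard deviations gives r.
r = covariance / (statistics.stdev(xs) * statistics.stdev(ys))

print(covariance, round(r, 4))
```

Because the same n - 1 denominator appears in the covariance and in both standard deviations, it cancels, and r always falls between -1 and 1.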
What is true about mean and median?
The median is a more robust measure than the mean because it is less affected by outliers.
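A minimal numeric check, using five made-up IQ-like scores plus one extreme value (all numbers here are illustrative assumptions, not data from the quiz):

```python
import statistics

scores = [70, 72, 75, 78, 80]
with_outlier = scores + [142]  # one extreme score added

# How far each measure of central tendency moves because of the outlier.
mean_shift = statistics.fmean(with_outlier) - statistics.fmean(scores)
median_shift = statistics.median(with_outlier) - statistics.median(scores)

print(round(mean_shift, 2), median_shift)
```

The mean is pulled toward the outlier by more than 11 points, while the median moves by only 1.5.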
If events A and B are mutually exclusive, the probability of their union is equal to
The probability of A added to the probability of B Wrong: - The probability of A alone - The probability of B alone - Their intersection - The probability of the union minus the intersection
An event A is independent of an event B if
1. The probability of A equals the probability of A given B 2. The probability of the intersection of A and B is equal to the probability of A times the probability of B
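Both criteria can be checked by enumerating a small sample space. The sketch below assumes two fair dice and two illustrative events (A: the first die is even, B: the dice sum to 7); the events and helper names are assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
rolls = list(product(range(1, 7), repeat=2))


def prob(predicate):
    hits = sum(1 for roll in rolls if predicate(roll))
    return Fraction(hits, len(rolls))


def is_a(roll):
    return roll[0] % 2 == 0   # A: first die is even


def is_b(roll):
    return sum(roll) == 7     # B: the two dice sum to 7


p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda roll: is_a(roll) and is_b(roll))
p_a_given_b = p_a_and_b / p_b  # definition of conditional probability

print(p_a_given_b == p_a, p_a_and_b == p_a * p_b)  # True True
```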
What generative process - in nature - yields normal distributions?
The random combination of many independent factors
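A quick simulation can illustrate this: summing many independent, non-normal factors gives totals that behave roughly like a normal distribution. The choice of 50 uniform factors, 10,000 samples, and the seed are assumptions for the sketch.

```python
import random
import statistics

rng = random.Random(1)
n_factors = 50      # independent influences per observation
n_samples = 10_000

# Each observation is the sum of many independent uniform factors.
totals = [
    sum(rng.uniform(-1, 1) for _ in range(n_factors))
    for _ in range(n_samples)
]

mu = statistics.fmean(totals)
sd = statistics.pstdev(totals)

# For a normal distribution, about 68% of values lie within 1 SD of the
# mean and about 95% within 2 SDs.
share_within_1sd = sum(abs(t - mu) <= sd for t in totals) / n_samples
share_within_2sd = sum(abs(t - mu) <= 2 * sd for t in totals) / n_samples

print(round(share_within_1sd, 3), round(share_within_2sd, 3))
```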
What does simple linear regression minimize?
The sum of squared differences between predicted values and measurements Wrong: - minimizes regression to the mean - sum of absolute value of diff. b/w predicted values and measurements - sum of cubed differences - sum of differences
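The least-squares criterion can be checked on a small made-up data set: the closed-form slope and intercept give a smaller sum of squared errors than nearby alternatives. The data and function names are illustrative assumptions.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares estimates for y = intercept + slope * x.
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x


def sum_squared_errors(b0, b1):
    """Sum of squared differences between predictions and measurements."""
    return sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))


best = sum_squared_errors(intercept, slope)
# Nudging either parameter away from the fitted values increases the error.
worse_slope = sum_squared_errors(intercept, slope + 0.1)
worse_intercept = sum_squared_errors(intercept + 0.1, slope)

print(round(slope, 2), round(intercept, 2), round(best, 2))
```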
In the general linear model, "beta" refers to
The weights of the predictors
What is the fundamental problem with operational definitions of a construct?
1. There is always a degree of arbitrariness and it is unclear whether the operational definition captures the construct fully 2. There are always other operational definitions possible
A measurement is considered to have no criterion validity if
They don't predict anything in the real world Wrong: - they don't correlate with other measures we expect from our theory - they are not standardized - they systematically depend on who makes them - no consistency b/w repeated measures
A measurement is considered to be "not" objective if
They systematically depend on who makes them
Why can't we use the sum of simple differences from the sample mean as a dispersion measure
1. This measure would always be zero 2. This measure is meaningless, as the positive and negative deviations from the sample mean cancel out, due to how the sample mean is defined.
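The cancellation is easy to verify numerically. The five values below are made up for illustration; taking absolute values or squares of the deviations is what prevents them from cancelling.

```python
data = [2, 4, 4, 5, 10]
mean = sum(data) / len(data)  # 5.0

deviations = [x - mean for x in data]  # [-3.0, -1.0, -1.0, 0.0, 5.0]

sum_of_deviations = sum(deviations)                      # cancels to zero
sum_of_abs_deviations = sum(abs(d) for d in deviations)  # basis of the MAD
sum_of_squared_deviations = sum(d ** 2 for d in deviations)  # basis of variance/SD

print(sum_of_deviations, sum_of_abs_deviations, sum_of_squared_deviations)
```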
Addition Rule
Union
∪
Union of
Turkey Problem
Verifying
A good synonym for "dependent variable" would be
What we measure
The mean absolute deviation (MAD) is
a more robust measure of dispersion than standard deviation
α
alpha
What would not be a valid (possible) correlation value
anything not in the range -1 to 1
β
beta
df
degrees of freedom
|
given
The probability of the intersection of two independent events A and B can be calculated as
p(A) x p(B)
In a binary world with only two possible and mutually exclusive outcomes A and B, the probability of B can be arrived at by calculating
p(B) = 1 - p(A) = p(∼A)
In a regression model, the sum of squares explained relative to the sum of squares total follows
r² Not: r³, sqrt(r), r, or the residual sum of squares
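For simple linear regression this identity can be confirmed by hand on a small made-up data set. The data and variable names below are illustrative assumptions.

```python
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n

sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)

# Least-squares fit and its predictions.
slope = sxy / sxx
intercept = mean_y - slope * mean_x
predictions = [intercept + slope * x for x in xs]

ss_total = syy
ss_explained = sum((p - mean_y) ** 2 for p in predictions)
explained_share = ss_explained / ss_total

# Pearson correlation between x and y.
r = sxy / (sxx * syy) ** 0.5

print(round(explained_share, 3), round(r ** 2, 3))  # the two values agree
```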