Statistics Midterm

Hindsight:

The "I knew it all along" effect: once people are confronted with a fact, they tend to believe that they knew it was, or would be, so all along.

How do we know how strong a correlation is?

(Subjective) rules of thumb for describing the size of a correlation: +/-0.1 = small, +/-0.3 = medium, +/-0.5 = large. Always look at the scatterplot.

In a dataset with M = 10 and SD = 2, what score corresponds to z = -1.5?

z = (x - M) / SD, so -1.5 = (x - 10) / 2, which gives x = 10 + (-1.5)(2) = 7
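
As a quick check on the arithmetic, here is a minimal sketch that inverts the z-score formula to recover a raw score. The function name `raw_score_from_z` is made up for illustration; it is not from the course materials.

```python
def raw_score_from_z(z: float, mean: float, sd: float) -> float:
    """Invert z = (x - mean) / sd to recover the raw score x."""
    return mean + z * sd


# The card's numbers: M = 10, SD = 2, z = -1.5
print(raw_score_from_z(-1.5, mean=10, sd=2))  # 7.0
```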

Alpha =

.05, which gives 95% confidence: in the long run, if we draw many samples and compute a confidence interval for whatever statistic we are measuring each time, 95% of those confidence intervals will include the population parameter.

For a standardized exam, scores are normally distributed with μ = 80 and σ = 5. Find each student's z score: 1. Student 1, X = 80 2. Student 2, X = 90 3. Student 3, X = 75

1. z = 0 2. z = 2 3. z = -1
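
The same three answers can be reproduced with a short sketch of the z-score formula, z = (X - μ) / σ. The helper name `z_score` is an illustrative assumption.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


scores = [80, 90, 75]
z_values = [z_score(x, mean=80, sd=5) for x in scores]
print(z_values)  # [0.0, 2.0, -1.0]
```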

Joe scored an 88 on his last statistics exam. His professor wouldn't tell him, but he did share that his z-score was 1 and that the standard deviation for the sample was 3. There are 75 students in class. What was the mean score for his class on this exam? a. 91 b. 83 c. 85 d. 88

C) 85, since z = (88 - M) / 3 = 1 gives M = 88 - 3 = 85.

Covariance:

A measure of how much two variables vary together. For each case, we look at how far its score on each variable deviates from that variable's mean. If the two variables tend to deviate from their means in the same direction at the same time, they are likely to be related. In probability theory and statistics, covariance is a measure of the joint variability of two random variables.
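
To make the idea concrete, here is a minimal sketch of the sample covariance: multiply each pair of deviations from the means and average the products (dividing by n - 1). The data and the names `hours` and `marks` are invented for illustration.

```python
def sample_covariance(xs, ys):
    """Average cross-product of deviations from the two means (n - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cross_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cross_products / (n - 1)


hours = [1, 2, 3, 4, 5]   # hypothetical hours studied
marks = [2, 4, 5, 4, 5]   # hypothetical quiz marks
print(sample_covariance(hours, marks))  # 1.5
```

A positive value means the two variables tend to sit above (or below) their means together.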

Purpose of the t-test:

We are looking at two groups and want to compare their means. We do this a LOT in stats. Example: we might compare two different groups on their average depression (e.g., female vs. male, or a population in New York vs. a population in Colorado). It can also be done within the same group, comparing pre vs. post. So it can be done on one sample measured twice, or between two samples.

Measures of central tendency (or 'location' of the data):

Mean = average. Sensitive to outliers. Median = 'middle' score when data are ordered (e.g., 1,2,3,4,5,6,7,8,9). Less sensitive to outliers (i.e., more representative than the mean in skewed distributions). Mode = most frequently occurring score (e.g., 1,2,2,3,4,4,4,5,6). A data set could have one mode, more than one mode, or no mode. When there is more than one mode, that usually means there is more than one group in your data, and you can figure out what that is (e.g., gender identity).
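
A small sketch using Python's standard `statistics` module shows the three measures on the example scores from this card, and how one added outlier (the value 100, an assumption for illustration) pulls the mean but not the median.

```python
import statistics

scores = [1, 2, 2, 3, 4, 4, 4, 5, 6]
print(statistics.mean(scores))       # mean of the nine scores
print(statistics.median(scores))     # 4
print(statistics.multimode(scores))  # [4]

# Add one extreme score: the mean moves a lot, the median barely at all.
with_outlier = scores + [100]
print(statistics.mean(with_outlier))    # 13.1
print(statistics.median(with_outlier))  # 4.0
```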

The larger the value of 𝝈 the more or the less sample means tend to vary from the population?

More! The larger the population standard deviation, the more the sample means will vary from the population mean. A larger SD means that there is greater variability in our data (i.e., our data are more spread out from the mean), with fewer scores clustered around the mean.

Equation that we need to memorize:

Outcome_i = (Model) + error_i

What is the p value?

P = probability. In null hypothesis significance testing, the p-value is the probability of obtaining test results at least as extreme as the results actually observed, under the assumption that the null hypothesis is correct. Informally, it tells us how likely a difference at least as large as the observed one would be if only random chance were at work; it is not the probability that the null hypothesis is true. Generally, a p-value < 0.05 indicates statistical significance. The p-value can be used as an alternative to, or in addition to, pre-selected confidence levels for hypothesis testing.

Define a parameter and a statistic and note how the two terms differ from each other.

Parameters are numbers that summarize data for an entire population. Statistics are numbers that summarize data from a sample, i.e. some subset of the entire population.

What does the average amount of deviation among our set of scores tell us?

The average amount of deviation among our set of scores tells us how much variability is in our data set (i.e., how well our mean represents or 'fits' our data). The variance is the average of the squared differences from the mean.

Variability:

The concept of variability is central to understanding all statistics. A score's deviation is a number that represents how far that score deviates from the mean score.

Measurement error (always):

The discrepancy between the actual value we're trying to measure and the number we use to represent that value; basically, the discrepancy between the truth and what we find. There are many reasons for error, one being that we are only looking at one sample, so measurement is never the full truth. Your sample might be completely different from the population, and stats tries to estimate how big that error is.

The empirical rule:

The empirical rule, or the 68-95-99.7 rule, tells you where most of the values lie in a normal distribution: Around 68% of values are within 1 standard deviation of the mean. Around 95% of values are within 2 standard deviations of the mean. Around 99.7% of values are within 3 standard deviations of the mean.
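
The three percentages can be checked with a short sketch that uses `NormalDist` from Python's standard `statistics` module to measure the area within k standard deviations of the mean.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

coverage = {}
for k in (1, 2, 3):
    # Area under the curve between -k and +k standard deviations
    coverage[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(k, round(coverage[k], 4))
# prints 0.6827, 0.9545 and 0.9973 for k = 1, 2, 3
```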

How to calculate standard deviation?

The equation will be on the exam. Work out the mean (the simple average of the numbers). Then, for each number, subtract the mean and square the result. Then work out the mean of those squared differences. Take the square root of that and we are done!
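
The steps above can be sketched as follows. This version divides by the number of scores, exactly as the card describes (the population formula); the sample SD usually reported by software divides by n - 1 instead. The data set is made up for illustration.

```python
import math


def population_sd(values):
    """Standard deviation following the card's steps (divides by n, not n - 1)."""
    mean = sum(values) / len(values)                    # step 1: the mean
    squared_diffs = [(v - mean) ** 2 for v in values]   # step 2: squared deviations
    variance = sum(squared_diffs) / len(squared_diffs)  # step 3: mean of those
    return math.sqrt(variance)                          # step 4: square root


print(population_sd([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```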

Margin of error (MoE):

Just know that this is half the width of the confidence interval (so that you understand what it is if you come across it in the readings; however, Salar did not teach this term and it is unimportant).

Form of the relationship:

linear vs. non-linear (hint: shouldn't be using r if non-linear; find this out by looking at your scatterplot)

Confidence interval:

Measures the precision of our statistic: an interval of uncertainty around what we measured.

Reliability:

The ability of the measure to produce the same results under the same conditions.

What you need to know to describe a correlation:

Form, direction, and strength/size

Sampling distribution of the sample means:

A distribution of many means from different samples.

What is a z score?

A standardized score that gives you an idea of how far from the mean a data point is

How to interpret z scores:

A z-score of 0 means the score is the same as the mean of the sample. A z-score of 2 means the score is 2 SDs above the mean, so it is a less likely (rarer) score than one near the mean.

Choose one of the following biases or fallacies that we have discussed in class and describe how it can affect conclusions drawn from the data: 1. Survivorship bias 2. Conjunction fallacy 3. Self-selection bias 4. Prosecutor fallacy

1. Survivorship bias (or survival bias) is the logical error of concentrating on the people or things that made it past some selection process and overlooking those that did not, typically because of their lack of visibility. It matters because it can distort performance figures significantly, and it tends to distort data in only one direction, making the results seem better than they actually are. 2. Conjunction fallacy (the Linda problem): a formal fallacy that occurs when it is assumed that a set of specific conditions is more probable than a single general one. This usually happens when it is easier to imagine two events occurring in combination than occurring alone, and it violates the laws of probability. 3. Self-selection bias: arises in any situation in which individuals select themselves into a group, causing a biased sample with nonprobability sampling. 4. Prosecutor's fallacy: mathematically, this results from misunderstanding the concept of a conditional probability, which is defined as the probability that an event A occurs given that event B is known, or assumed, to have occurred, and is written P(A|B). The error is based on wrongly assuming that P(A|B) = P(B|A) (e.g., a murdered woman having been murdered by her batterer is 7 times more likely than a battered woman being murdered by her batterer).

Depression was measured, after treatment, on a scale from 0-10, where scores over 7 are considered clinically depressed. We found M = 5.0, 95% CI [2.0, 8.0]. Which of the below options is correct? Explain why the one you chose is correct and why the others are not. 1. This is clear evidence that those treated are not depressed. 2. The treatment works for some people, but not for others. 3. The lower limit of the CI is quite low, so most likely treated patients are hardly depressed at all. 4. Encouraging but leaving great uncertainty about the extent to which treated patients are depressed. More data must be collected to better estimate depression after this type of treatment.

1. This statement is inaccurate. The range of plausible values extends up to 8, beyond the clinical cut-off of 7; therefore, the true population mean could plausibly be as high as 8 (however, it is more probable that the true population mean is near 5, as values close to our sample mean are more plausible than values at the ends of our interval). 2. This is a fair statement. The CI is wide, suggesting that some people scored very low on depression after treatment and some scored above the cut-off for clinical depression. 3. This statement is inaccurate. It is more likely that patients are moderately depressed (or whatever a score of 5 corresponds to). As above, it is more probable that the true population mean is near 5, as values close to our sample mean are more plausible than values at the ends of our interval. 4. This is a fair statement. Our CI contains the test value (7), so we fail to reject the null hypothesis that treatment doesn't decrease depression. It's encouraging, however, because the test value is close to the upper limit of our confidence interval, and it is likely to have fallen outside our CI if we had a larger sample size and thus a more precise CI.

Below are some reports of CIs, each with a number of alternative interpretations or statements. Give your comments on each option. As usual, be prepared to use your judgment: 1. We measured subjective well-being on a scale from 1-7 in a sample of participants who had recently won the lottery, and found M = 4.0, 95% CI [1.5, 6.5]. 2. Happiness amongst lottery winners can range from very low to very high. 3. In general, lottery winners are likely to be quite happy. 4. The result is not believable; there must be some flaw in the study. 5. The CI is too long to be interpreted; more data must be collected.

1. [This CI means the true population mean, 95% of the time, will fall between 1.5 and 6.5.] This is a very broad CI, especially because it covers most of the theoretical range of the scale. 2. This is probably an accurate statement. The wide CI indicates that there is a considerable amount of variance in the population (a large SEM), suggesting that happiness amongst lottery winners can range from very low to very high. 3. This is difficult to interpret without knowing the scale anchors, but assuming a score of 4 reflects a 'neutral' or 'neither more nor less happy' kind of response, then this is probably an inaccurate statement. This is because, assuming that our CI is not one of the 5% of 95% CIs that do not capture the true population mean, the values that are close to our sample mean (M = 4) are more plausible values for the population mean than those at the ends of our interval and beyond, suggesting that it is more likely that people are no more or less happy than before they won the lottery, rather than "quite" happy. 4. The CI is very wide, so it may be that we simply do not have a large enough sample size to find a more precise estimate of the population mean. We would need more concrete information surrounding the study design, sample size, etc. to determine if there was a flaw. 5. Yes, more data should be collected. We are not told the N (sample size). If we knew more about the sample, perhaps we would know more. Because the CI is broad, we may assume that the sample size is small. Fewer people result in less precision.

What is the value 1.96 used for?

1.96 is a z score, and we use it for calculating 95% confidence intervals: 95% of a normal distribution lies within 1.96 standard deviations of the mean. Think of confidence intervals as coverage. You have your whole distribution, and you want to say how sure you are that your data lie between two different points. If you want to be really sure, you cover more; if you are willing to be less sure, you cover less.
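
Where 1.96 comes from can be sketched with `NormalDist` from the standard `statistics` module: leaving 2.5% in each tail of the standard normal distribution puts the upper cut-off at the 97.5th percentile. The 99% line is added only to illustrate "cover more".

```python
from statistics import NormalDist

# 95% in the middle leaves 2.5% in each tail, so look up the 97.5th percentile.
z_95 = NormalDist().inv_cdf(0.975)
print(round(z_95, 2))  # 1.96

# Wanting more coverage (99%) pushes the cut-off further out.
z_99 = NormalDist().inv_cdf(0.995)
print(round(z_99, 2))  # 2.58
```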

Confidence Intervals:

Anytime we give a statistic we also have to give our level of uncertainty in that statistic. Ex: voting polls - 80% +/- 3% agree that Trump sucks. We will use a 95% confidence interval in this class. The width of your CI is determined by sample size, variability within your sample, and the confidence level you choose (all the other information comes from the data!). Know what they mean and how to report them.

Drop-out rates from The New School have been increasing exponentially over the past two years, and a couple of very concerned faculty members decide to conduct a study to figure out why. They devise a survey to determine what issues students may have with the school by listing a number of possible issues and asking the participants to rate each issue on a seven-point Likert scale. They give that survey to random undergraduate and graduate students until they reach a sample size of 200. Which of the following is a flaw of this study? A. The sample size is too small to generalize to the population. B. The study falls victim to survivor bias. C. The Likert scale is inappropriate for this study. D. The scale is imbalanced.

B)

The null hypothesis: A. is the opposite of the research hypothesis. B. states that any results are due to chance alone. C. is what we call the alternative hypothesis. D. none of the above

B)

Variables other than the IV that cause differences between groups are: a. extraneous variables b. confounds c. nuisance variables d. dependent variables

B)

What does a correlation of -0.57 mean? A) There is no correlation. B) The correlation is strong and negative. C) The correlation is weak and negative. D) The correlation is strong and positive.

B)

When the correlation coefficient is -0.90, it means that a) There's almost no correlation b) The correlation is very strong c) There is a weak negative correlation d) If the correlation coefficient is negative, the calculation is wrong.

B)

A sample of size n = 100 has M = 10, SD = 2. What is the 95% confidence interval for the population mean? a. 12 b. [9.61, 10.4] c. [12, 14] d. [5.51, 6.61]

B) [9.61, 10.4]
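
The answer can be reproduced with a minimal sketch of the z-based interval M ± 1.96 × SD / √n. The function name `mean_ci_95` is an assumption for illustration, and the sketch treats the sample SD as if it were the population SD, which is reasonable for n = 100.

```python
import math


def mean_ci_95(mean, sd, n, z_crit=1.96):
    """95% CI for a mean: mean +/- z_crit * standard error."""
    standard_error = sd / math.sqrt(n)
    margin = z_crit * standard_error
    return mean - margin, mean + margin


low, high = mean_ci_95(mean=10, sd=2, n=100)
print(round(low, 2), round(high, 2))  # 9.61 10.39
```

The upper limit of 10.39 is the 10.4 shown in option b, rounded to one decimal place.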

Which measure of central tendency is more appropriate for a data set that includes many outliers? A. Mean B. Median C. Mode D. None of the Above

B. Median

Explain the difference between a within-subjects and between-subjects design when creating a study.

Between-subjects (or between-groups) study design: different people test each condition, so that each person is only exposed to a single condition (e.g., a single user interface). Within-subjects (or repeated-measures) study design: the same person tests all the conditions (e.g., all the user interfaces).

What makes a good survey question?

Brief, relevant, unambiguous, specific, objective

Oscar has taken four exams in their history class. Their results are: Exam 1 z = 0 Exam 2 z = 2.2 Exam 3 z = -.6 Exam 4 z = 1.0 On which did they score below the average? A) Exam 1 B) Exam 2 C) Exam 3 D) Exam 4

C)

What does an r = zero mean? a) There is a weak correlation. b) There is a moderately strong correlation. c) There is no correlation. d) There is a normal distribution.

C)

What is the purpose of calculating a z score? a) To make it possible to compare datasets b) To standardize the data c) a & b d) None of the above

C)

Which of the following is not a characteristic of a normal distribution? a) At least 68% of data is within 1 SD of the mean b) The distribution is bell-shaped c) It contains a positive skew. d) The mean lies at Z=0.

C)

A researcher published an article showing that people who got into a competitive school were likelier than not to habitually eat eggs for breakfast. However, the researcher did not consider breakfast patterns of a sample that did not get into the competitive school. This is an example of: A) Information Bias B) Non-Response Bias C) Survivorship Bias D) Confirmation Bias

C) Survivorship Bias

Types of data:

Categorical (can't use any numbers, unless looking at frequencies [percentages]): Nominal: 2+ categories (some say yes/no). Ordinal: there's a logical order to the categories (grades from A to F). Quantitative (not a category, it's a continuum): Discrete: counting (# of people in the room). Continuous: results from measuring height, speed, etc.

What is one type of categorical variable and one type of quantitative variable?

Categorical: nominal (e.g., what is your favorite color). Quantitative: continuous (e.g., high school grade point average).

Ceiling/Floor effect:

Ceiling or floor effects occur when tests or scales are so easy or so difficult that substantial proportions of individuals obtain either maximum or minimum scores, so that the true extent of their abilities cannot be determined. Ceiling and floor effects subsequently cause problems in data analysis.

R^2

Coefficient of determination: the variance in y accounted for by x. The coefficient of determination, R^2, is used to analyze how differences in one variable can be explained by differences in a second variable. For example, when a person gets pregnant has a direct relation to when they give birth. More specifically, R-squared gives you the percentage of variation in y explained by the x-variables. The range is 0 to 1 (i.e., 0% to 100% of the variation in y can be explained by the x-variables).
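
One way to see "variation in y explained by x" is to fit a straight line and compare the leftover (residual) variation with the total variation. The sketch below assumes a simple one-predictor least-squares line; the data are invented for illustration.

```python
def r_squared(xs, ys):
    """R^2 = 1 - (residual sum of squares / total sum of squares) for a fitted line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Least-squares slope and intercept of the line predicting y from x
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    ss_residual = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_residual / ss_total


print(round(r_squared([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]), 2))  # 0.6
```

Here 60% of the variation in y is accounted for by x. With a single predictor, this value equals the square of the correlation r.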

Independent Samples t-test:

Compares two means based on independent data (i.e., data from different groups of people)

Dependent t-tests:

Compares two means based on related data, i.e., data from the same people measured at different times. We look at the difference between the two sets of scores: did they improve significantly, and if they did, was it a big or small effect? This covers pre and post designs and data from 'matched' samples (don't worry too much about this; really remember that independent t-tests compare two different groups, and dependent t-tests measure one group and compare the difference there). Each score in one sample is paired with a specific score in the other sample: an observation in one sample is matched with an observation in the other sample.

Variables (Conceptual vs. Operational):

Conceptual definition:
• provides the broad meaning of an abstract term
• Example (Positive psychological assessment: A handbook of models and measures): Self-esteem is an attitude about the self and is related to personal beliefs about skills, abilities, social relationships, and future outcomes.
Operational definition:
• indicates how a concept is coded, measured or quantified
• no single operational definition can capture fully the concept that it's intended to measure
• each operational definition is one of several possibilities
• Example: Janis, I. L. (1959). The Janis-Field feelings of inadequacy scale. Janis, IL & HowIand, CF.
• How often do you feel inferior to most of the people you know?
• How often do you have the feeling that there is nothing you can do well?
• Have you ever thought of yourself as physically uncoordinated?

Content vs. Ecological validity:

Content Validity: Evidence that the content of a test corresponds to the content of the construct it was designed to cover Ecological validity: Evidence that the results of a study, experiment or test can be applied, and allow inferences, to real-world conditions.

Standardized covariance:

The correlation coefficient. Standardizing the covariance by dividing it by the product of the two standard deviations makes the correlation coefficient independent of the scales of the two variables.
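
A minimal sketch of this standardization: compute the covariance, divide by the product of the two standard deviations, and check that changing the units of one variable leaves r unchanged. The data and the hours-to-minutes conversion are assumptions for illustration.

```python
import statistics


def pearson_r(xs, ys):
    """Covariance divided by the product of the two sample standard deviations."""
    n = len(xs)
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    return covariance / (statistics.stdev(xs) * statistics.stdev(ys))


hours = [1, 2, 3, 4, 5]
marks = [2, 4, 5, 4, 5]
r = pearson_r(hours, marks)

# Re-express hours as minutes: the covariance changes, r does not.
minutes = [h * 60 for h in hours]
r_rescaled = pearson_r(minutes, marks)
print(round(r, 4), round(r_rescaled, 4))  # 0.7746 0.7746
```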

Nonlinear Relationships:

Curvilinear relationship between room comfort and temperature: too cold or too hot. Ceiling effect: scores on a test top out around 30, showing a ceiling effect, where most students scored very high. (A floor effect is possible if the test were much harder and most people did terribly on it.) r may be a misleading representation of a relationship.

Kelsey has gotten 4 scores for her dance numbers in her dance competition. Her results of each competition are: Dance 1 z = 6.7 Dance 2 z = 3 Dance 3 z = 0 Dance 4 z= 6.6 On which dance number did she score exactly the average of the other dance competitors?

Dance 3

Dave has learned how to use Jamovi, but hasn't really bothered to learn much about statistics. He measures education (1 = dropout, 2 = high school degree, 3 = college degree), size of household (adults and children), number of hours worked a week, hourly wage (USD), and happiness (1 = very unhappy to 7 = very happy). He uses Jamovi to calculate descriptive statistics for all 5 measures. Some of his 'findings' do not make any sense. Which of Dave's results is an improper use of statistics? Why? 1. For education, M = 1.9, SD = 0.5 2. For household size: M = 3.5, SD = 2 3. For number of hours worked a week: M = 35, SD = 15 4. For hourly wage: M = $17.25, SD = 9.41 5. For happiness: M = 3.8, SD = 2.3

Education is a categorical variable, so its mean and SD have no meaning! We could also accept that household size could be considered a categorical variable, since having 1.5 children is meaningless; however, this is often how household data is presented.

Anna has taken four exams in her physics class. Her results are: ● Exam 1 z = 1.1 ● Exam 2 z = 0 ● Exam 3 z = -.8 ● Exam 4 z= 2.3 On which exam did she score exactly the class mean? On which did she score below the average? On which did she do her best?

Exactly the class mean: Exam 2. Below the average: Exam 3. Her best: Exam 4.

Hypothesis (Experimental, Alternative, & Null):

Experimental Hypothesis: A statement about how the world works, which your experiment is designed to test (i.e. morning people and night people differ in their level of extraversion) Alternative Hypothesis: Another statement about a specific alternative way that the world works (i.e. morning people are more extraverted than night people) Null Hypothesis: What we would assume to be true if we don't find support for our hypotheses (i.e. morning people and night people do not differ in their level of extraversion)

Experimental Research Definition (what is necessary for research to be a true experiment?):

Experimental research: one set of variables is kept constant while the other set of variables is measured as the subject of an experiment. For something to be a true experiment, there needs to be manipulation and controls.

QALMRI: Logic

Explains the logic behind the alternative hypotheses: If alternative 1 (and not alternative 2) is correct, then when a particular variable is manipulated, the participant's behavior should change in a specific way. For example, the logic of the color experiment would sound like this: If a person's native language influences their perception of color, then speakers who lack a color term should perceive the boundary between that color and another color differently than a speaker who has that color term. Alternatively, if language does not influence the perception of color, then all speakers should perceive all color boundaries similarly.

Outliers:

Extreme values in your data set (extremely high or low; extremely far from the mean). r is impacted by outliers (a single outlier can change the size of the correlation). This is why using r alone to interpret results can be misleading! Always check your scatterplot before interpreting r! An outlier could be the result of a data entry mistake, or it could represent a participant who scored very differently from the majority of participants (this could be for many reasons). Depending on what you decide is the reason for the outlier, you may be justified in removing the outlier from the dataset and re-running your correlation.

On an exam, Geo is told he has a z = 3. Should he be happy?

Geo should be very happy. He scored 3 standard deviations above the class average.

Independent vs. dependent variables:

IV: probable cause of something (e.g., being a night/morning person). DV: a variable thought to be affected by the changes in the IV (e.g., extraversion).

Central Limit Theorem:

If you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the distribution of the sample means will be approximately normal! When the sample is relatively small (<30), the sampling distribution has a different shape, known as a t-distribution.
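
A small simulation can illustrate the theorem. The sketch below assumes a deliberately skewed (exponential) population, a sample size of 50, and 2,000 repeated samples; all of these choices are illustrative, not from the course. The sample means should center on the population mean, and their spread should be close to σ / √n.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# A skewed, clearly non-normal population (assumed for illustration)
population = [random.expovariate(1.0) for _ in range(20_000)]


def draw_sample_means(pop, sample_size, n_samples):
    """Means of repeated random samples drawn with replacement."""
    return [statistics.fmean(random.choices(pop, k=sample_size))
            for _ in range(n_samples)]


means = draw_sample_means(population, sample_size=50, n_samples=2000)

print("population mean:      ", statistics.fmean(population))
print("mean of sample means: ", statistics.fmean(means))
print("sigma / sqrt(n):      ", statistics.pstdev(population) / math.sqrt(50))
print("SD of sample means:   ", statistics.stdev(means))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.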

When to use each measure of central tendency?

Important to know the distinction of when to use mean vs. median! Based on skewness, if your data is super skewed you know your mean may not be representative of where most of the data is, and then you would want to use your median (because it is less sensitive to outliers). Use mode when dealing with nominal data.

Although the standard deviation and the standard error of the mean both measure variability, these are two very different concepts. Please provide a definition for both standard deviation and the standard error of the mean.

In statistics, the standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean of the set, while a high standard deviation indicates that the values are spread out over a wider range. The standard error of the mean is the standard deviation of the sampling distribution of the sample mean. It measures the accuracy with which a sample represents a population. In other words, the standard deviation (SD) measures the amount of variability, or dispersion, from the individual data values to the mean, while the standard error of the mean (SEM) measures how far the sample mean (average) of the data is likely to be from the true population mean.

Types of t-tests:

Independent Dependent

How to report t-test example:

Independent: The XX participants who received the drug intervention (M = XX, SD = XX), compared to the XX participants in the comparison group (M = XX, SD = XX), demonstrated significantly better peak flow scores, t(df) = t value, p = .XX, Cohen's d = XX (explain what Cohen's d means about the finding somewhere: this was a moderate effect, etc.). Dependent: The results from the pre-test (M = XX, SD = XX) and post-test (M = XX, SD = XX) memory task indicate that caffeine in the bloodstream resulted in an improvement in memory recall, t(df) = XX, p = .XX, Cohen's d = XX (explain what Cohen's d means about the finding somewhere: this was a moderate/small/large effect, etc.).

Independent vs. Paired samples t test

Independent: between two samples. Paired: one sample (e.g., pre vs. post).

What does the t test ask?

Is there a difference between the means of these two groups?

What happens to the SE as the sample gets bigger? Explain

It gets smaller - more data means less variation (and more precision) in your results.

If I run an experiment with 20 people and then re-run the experiment with 100 people, what do you think will happen with the SE? Explain why.

It will get smaller - more data means less variation (and more precision) in your results.
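
The effect of sample size can be sketched with the formula SE = SD / √n. The SD of 2 is an arbitrary value chosen for illustration; only the sample sizes 20 and 100 come from the question.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)


for n in (20, 100):
    print(n, round(standard_error(sd=2, n=n), 3))
# 20 0.447
# 100 0.2
```

Multiplying the sample size by 5 shrinks the SE by a factor of √5, not 5, so precision improves more slowly than the sample grows.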

Parametric and non-Parametric Statistics:

Parametric variable: a variable that belongs to a known parameterized family of probability distributions (e.g., normal). Nonparametric variable: no distribution is assumed. Parametric statistics are based on assumptions about the distribution of the population from which the sample was taken. We use parametric statistics (i.e., we assume a normal distribution). Nonparametric statistics are not based on such assumptions; that is, the data can be collected from a sample that does not follow a specific distribution.

QALMRI method

Question, Alternatives, Logic (design), Method, Results, Inferences

t test write up

Results of the independent sample t-tests indicated that there were no significant differences in job satisfaction between males (M = XX, SD = XX) and females (M = XX, SD = XX), t(29) = -1.85, p = .074.

Z-score: What it is and why you would want to use it:

Simply put, a z-score gives you an idea of how far from the mean a data point is. More technically, it's a measure of how many standard deviations below or above the population mean a specific score is. A z-score can be placed on a normal distribution curve. Z-scores typically range from -3 standard deviations (which would fall to the far left of the normal distribution curve) up to +3 standard deviations (which would fall to the far right of the normal distribution curve). In order to use a z-score, you need to know the mean μ and also the population standard deviation σ. Z-scores (a) allow researchers to calculate the probability of a score occurring within a standard normal distribution and (b) enable us to compare two scores that are from different samples or even measured in different units (which may have different means and standard deviations).

Normal Distribution:

Symmetrical bell shape with most data points near the center. Because the shape always looks like a bell, it can be concisely described by only two parameters: mean and SD. Of course, M and SD vary across different data and samples. About 68% of data is within 1 SD of the mean. Some curves are much narrower, some wider.

On the midterm you will need to write a study out, example:

Take a group of people, randomly assign half of them to a treatment condition, and make the other half a no-treatment control group. Then we look at depression levels; everything is controlled for because people were randomly assigned (that makes sure the two groups look similar in diversity). I could randomly assign people into three groups, and we can look at x (maybe look at a few study examples). Example: Do scary movies cause people to eat more popcorn/candy during the movie? I could randomly assign people into three groups (show one group The Shining, one group The Sound of Music, and one group The Shining with no sound). The films were shown to 50 randomly selected undergraduate students from The New School. Food was weighed before and after intake to determine the amount eaten by each viewer. Limitations: allergies, eating disorders or other food conditions may limit participants' food consumption. Does having other participants in the room influence food consumption? Another example: Do colors really impact mood?

What is inferential statistics?

Taking a sample statistic and making inferences about the population's parameter

Rationale for the t-test:

We take our observed difference between sample means (group one's average - group two's average OR post-average - pre-average), and then subtract the expected difference between population means if the null hypothesis is true, which is always 0 (the null hypothesis is no effect: what our data look like if there's no difference, i.e., whatever we are testing, there is none of that). Then we divide that by the estimate of the standard error of the difference between the two sample means. (Whenever we divide by a standard deviation or a standard error, it's because we are standardizing something. That's all you really need to understand about that.) The t distribution is a bell-shaped distribution similar to the normal distribution, with heavier tails for small samples, that we use when comparing groups' data.
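
The rationale above can be sketched in a few lines for two independent groups. This sketch assumes the unpooled (Welch) estimate of the standard error, which is one common choice and not necessarily the one used in class; the function name and the data are invented for illustration.

```python
import math
import statistics


def independent_t(group_a, group_b, expected_diff=0.0):
    """(observed difference - expected difference) / standard error of the difference."""
    observed_diff = statistics.fmean(group_a) - statistics.fmean(group_b)
    # Unpooled (Welch) standard error: an assumption made for this sketch
    se_diff = math.sqrt(statistics.variance(group_a) / len(group_a)
                        + statistics.variance(group_b) / len(group_b))
    return (observed_diff - expected_diff) / se_diff


treatment = [5, 6, 7, 8, 9]  # hypothetical scores
control = [3, 4, 5, 6, 7]    # hypothetical scores
print(independent_t(treatment, control))  # 2.0
```

The observed difference of 2 points is 2 standard errors away from the difference of 0 expected under the null hypothesis.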

How are p-value and CI related?

A 95% CI will exclude the null value of 0 if and only if the p-value < .05. You can tell whether your result is significant just by looking at your CI.

Test-retest reliability:

The ability of a measure to produce consistent results when the same entities are tested at two different points in time.

Explain the difference between a sample mean and a population mean and why they are not always the same.

The sample mean we know; it is derived from our sample. The population mean is (usually) unknown, and we are trying to estimate it using our sample mean.

What is the sampling distribution of the sample mean? What are we talking about when we say that?

The sampling distribution of the sample mean is the distribution that you would get if you drew an infinite number of samples of the same size from your population, computed the mean of each sample, and looked at how all of those means are distributed.
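
The idea can be approximated with a small simulation. This is a sketch under stated assumptions, not part of the course material: we cannot draw infinitely many samples, so it draws 5,000; the population (normal, mean 80, SD 5) is borrowed from the standardized-exam example earlier in this set; and the sample size of 25 and the function name are made up for illustration.

```python
import random
import statistics

def simulate_sample_means(pop_mean, pop_sd, sample_size, n_samples, seed=0):
    """Draw many samples from a normal population and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(n_samples):
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return means

# Assumed population: mean 80, SD 5; assumed sample size: 25
sample_means = simulate_sample_means(pop_mean=80, pop_sd=5, sample_size=25, n_samples=5000)

# The collection of sample means approximates the sampling distribution
print("mean of the sample means:", round(statistics.mean(sample_means), 2))
print("SD of the sample means:  ", round(statistics.stdev(sample_means), 2))
```

The mean of the sample means should land close to the population mean of 80, and their SD should land close to 5 / sqrt(25) = 1, which is the standard error.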

What does the standard deviation tell us?

The standard deviation tells us roughly the average amount that the scores differ from the mean (i.e. where most of the data lies: most of the data falls within a certain number of SDs of the mean).

Difference between validity and reliability:

To help you remember the difference between validity and reliability, think of a scale. If a scale is valid, it is measuring what it is supposed to measure: your true weight. If a scale is reliable, it gives you the same reading every time you step on it, whether or not that reading is correct. A scale can be reliable but not valid: it consistently shows the same weight, but not the right weight.

What does CI display?

Together, the point estimate (i.e., r or M, whatever statistic you are interested in) and the CI provide information to assess the clinical usefulness or 'practical' significance of an intervention, test, etc.

Another example of t-test interpretation (no significance):

We looked at the difference between morning people (N = 24, M = 4.17, SD = 2.26) and night people (N = 44, M = 4.59, SD = 1.86), and found that there was no significant difference in extraversion, t(66) = -0.83, p = .41, d = -0.21. Therefore, at the .05 alpha level, we cannot reject the null hypothesis that there is no difference. Because this is not significant, you don't really need to interpret Cohen's d.
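
The numbers in this example can be roughly reproduced from the summary statistics. The sketch below assumes a pooled-variance (equal variances) independent t-test, which matches the reported df of 66 = 24 + 44 - 2, but the flashcard does not say which variant was used. The function names are made up for illustration.

```python
import math

def pooled_sd(n1, sd1, n2, sd2):
    """Pooled standard deviation of two independent groups."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    return math.sqrt(pooled_var)

def independent_t(n1, m1, sd1, n2, m2, sd2):
    """Return (t, df, Cohen's d) from summary statistics, assuming equal variances."""
    sp = pooled_sd(n1, sd1, n2, sd2)
    se_diff = sp * math.sqrt(1 / n1 + 1 / n2)  # standard error of the difference
    t = ((m1 - m2) - 0) / se_diff              # 0 = expected difference under the null
    d = (m1 - m2) / sp                         # difference in SD units
    return t, n1 + n2 - 2, d

# Morning people vs. night people, values taken from the example above
t, df, d = independent_t(24, 4.17, 2.26, 44, 4.59, 1.86)
print(f"t({df}) = {t:.2f}, d = {d:.2f}")
```

Because the means and SDs above are already rounded, the result can differ from the reported t in the second decimal place.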

Requirements to do an independent t-test:

When both samples are randomly selected, we can make inferences about the population. Samples have to be independent from one another: the samples are made up of different people, cases, etc.

Self-selection bias:

When people "choose" to be in the data (e.g., a dataset that only looks at married people). E.g., suppose you were running a mail-in poll on how many people in a district could read: people who cannot read are unlikely to respond, so the poll would overestimate literacy.

Effect size (t-tests):

Whenever we do any sort of test in this class, it comes with an effect size. For t-tests, it is called Cohen's d. It tells you: okay, this is significant, but HOW BIG is the effect? It gives a sense of how large the effect is!

Validity:

Whether an instrument measures what it was set out to measure.

How do we calculate z-scores?

You subtract the mean from the score you are interested in, and then divide by the standard deviation: z = (X - M) / SD. This gives you how many SDs the score is from the mean: "the distance from the mean in units of standard deviation".
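
As a quick check of the formula, here is a minimal sketch that uses two examples from earlier in this set (exam scores with mean 80 and SD 5, and a dataset with M = 10 and SD = 2). The function names are made up for illustration.

```python
def z_score(x, mean, sd):
    """How many standard deviations x lies from the mean."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """Reverse the formula: turn a z-score back into a raw score."""
    return mean + z * sd

# Exam example: mean 80, SD 5, students scoring 80, 90, and 75
exam_z = [z_score(x, mean=80, sd=5) for x in (80, 90, 75)]
print(exam_z)  # [0.0, 2.0, -1.0]

# M = 10, SD = 2: which score corresponds to z = -1.5?
print(raw_score(-1.5, mean=10, sd=2))  # 7.0
```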

Giraffes have a mean height of 180 inches and a standard deviation of 50.4 inches. What is the Z score of a giraffe that measures 129.6 inches? Z = -0.95 Z = -1 Z = 1 Z = 0.95

Z = -1, because (129.6 - 180) / 50.4 = -50.4 / 50.4 = -1

What do z scores allow us to do?

Z scores allow you to compare things that are not on the same scale, as long as they are normally distributed. For example, heights of people might range from eighteen inches to eight feet, and weights can range from one pound (for a preemie) to five hundred pounds or more. Those wide ranges make it difficult to analyze data, so we "standardize" the normal curve, setting it to have a mean of zero and a standard deviation of one. When the curve is standardized, we can use a Z score and a Z table to find percentages under the curve. The Z score helps us understand the probability of a certain score occurring within our normal distribution.
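
A Z table lookup can be sketched in code. This example assumes Python's standard-library `statistics.NormalDist` as a stand-in for the printed table. The helper names are made up for illustration.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def percent_below(z):
    """Percentage of the curve that lies below a given z-score."""
    return standard_normal.cdf(z) * 100

def percent_between(z_low, z_high):
    """Percentage of the curve that lies between two z-scores."""
    return percent_below(z_high) - percent_below(z_low)

print(round(percent_below(0), 2))        # half the curve lies below the mean
print(round(percent_below(2), 2))        # share of scores below z = 2
print(round(percent_between(-1, 1), 2))  # share within 1 SD of the mean
```

The last line gives roughly 68%, the familiar share of a normal distribution that falls within one SD of the mean.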

Simpson's paradox:

a trend that appears in several different groups of data but disappears or reverses when these groups are combined (e.g., the Berkeley college admissions and gender bias example)

The _____ sample size the ____ standard error. a. smaller, smaller b. larger, larger c. smaller, larger d. larger, smaller (2 correct answers)

c&d

Effect size for t-tests (Cohen's D):

d = 0.2 - 'small' effect size, d = 0.5 - 'medium' effect size, d = 0.8 - 'large' effect size. Has meaning for 'practical significance'. Sometimes a finding can be statistically significant, but if Cohen's d is quite small, the difference is actually quite negligible. Ex. Cohen's d = 0.2 means the difference between the two groups' means is only 0.2 standard deviations, so the difference is small even if it is statistically significant. Report Cohen's d after your p-value and interpret the effect size.

Which of the following correlations is the strongest?: a) r = 0.09 b) r = -0.581 c) r = 0.81 d) r = -0.851

d)

Experimental research...

disproves things rather than "proves" them. You are trying to reject the null hypothesis (through p-values/confidence intervals).

How to report confidence intervals:

Memorize this sentence: We are 95% confident that the true x (i.e. mean, effect size, difference) lies somewhere between the lower bound of the CI and the upper bound of the CI. When reporting confidence intervals, use the format 95% CI [LL, UL], where LL is the lower limit of the confidence interval and UL is the upper limit. We typically use 95% confidence intervals. CIs are reported in brackets: 95% CI [114, 120] for college-student IQ from our sample of 100 students. When the CI is for a difference or an effect (where the null value is 0): if it includes 0, your findings are not significant and you cannot reject the null hypothesis. If it doesn't include 0, your findings are significant, and you can reject the null hypothesis.
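
To see where the two bounds come from, here is a rough sketch. It uses the large-sample normal approximation (critical value about 1.96) rather than a t critical value. The sample mean of 117 and SD of 15 are assumptions chosen so the output resembles the IQ example; the flashcard itself only gives the interval and n = 100. The function name is made up for illustration.

```python
import math
from statistics import NormalDist

def mean_ci(sample_mean, sd, n, confidence=0.95):
    """Approximate CI for a mean using the normal (large-sample) approximation."""
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = sd / math.sqrt(n)                                    # standard error of the mean
    margin = z_crit * se
    return sample_mean - margin, sample_mean + margin

# Assumed values: M = 117, SD = 15, n = 100
lower, upper = mean_ci(117, 15, 100)
print(f"95% CI [{lower:.0f}, {upper:.0f}]")

# Same mean and SD with a larger sample: the interval gets narrower
lower_big, upper_big = mean_ci(117, 15, 400)
print(f"95% CI [{lower_big:.1f}, {upper_big:.1f}]")
```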

The larger your sample, the x your CI will be?

narrower

Direction of the relationship:

positive vs. negative vs. zero

Correlation coefficient (r):

r can only range from -1 to 1. Positive values = positive correlation. Negative values = negative correlation. Zero = no (linear) relationship. r can be used to measure degree of reliability and validity, and it is also the effect size for correlation.
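
One way to see how r is built is to compute it from the covariance and the two standard deviations. This is a minimal sketch: the data are invented purely for illustration, and the function and variable names are hypothetical.

```python
import math

def covariance(xs, ys):
    """Sample covariance: average cross-product of deviations (n - 1 in the denominator)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def pearson_r(xs, ys):
    """Covariance standardized by both standard deviations, so it falls between -1 and 1."""
    sd_x = math.sqrt(covariance(xs, xs))  # covariance of a variable with itself = variance
    sd_y = math.sqrt(covariance(ys, ys))
    return covariance(xs, ys) / (sd_x * sd_y)

# Made-up data, for illustration only
hours_studied = [1, 2, 3, 4, 5]
quiz_score = [2, 4, 5, 4, 5]

r = pearson_r(hours_studied, quiz_score)
print(round(r, 2))  # positive: higher x tends to go with higher y
```

Flipping the sign of one variable flips the sign of r without changing its strength.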

Parameter:

refers to the population

Statistic:

refers to the sample

How to find the t-value?

t = (mean of group A - mean of group B)/ (standard error of the difference between two sample means)

Cohen's d

The effect size for the t-test. Say there is a statistically significant difference; then you want to think about whether or not that difference is even meaningful. This is where Cohen's d comes in. It tells you how big/meaningful/practically significant the difference is. (d = 0.2 - 'small' effect size, d = 0.5 - 'medium' effect size, d = 0.8 - 'large' effect size.)

Z score:

the distance of a score from the mean in standard deviations

What is statistics?

the science of collecting, organizing, analyzing, and interpreting data in order to make decisions. The entire goal of statistics is to estimate something about a population using a sample with a specific degree of uncertainty.

Standard error:

the standard deviation of the sampling distribution of the sample mean (used to find the confidence interval), calculated by dividing the standard deviation by the square root of the sample size (n). The standard error tells you how accurate the mean of any given sample from that population is likely to be compared to the true population mean. When the standard error increases (i.e. the sample means are more spread out), it becomes more likely that any given sample mean is an inaccurate representation of the true population mean.
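
A short sketch of the calculation, assuming an SD of 5 (borrowed from the exam example earlier in this set) and a few made-up sample sizes. The function name is hypothetical.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: SD divided by the square root of the sample size."""
    return sd / math.sqrt(n)

# Same SD, increasing sample size: the standard error shrinks
for n in (25, 100, 400):
    print(n, standard_error(5, n))
```

Quadrupling the sample size halves the standard error, which is why larger samples give narrower confidence intervals.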

Strength/size of the relationship:

weak vs. strong based on r

The more variability with your sample, the x your CI will be?

wider

How do you find variance? What's wrong with variance?

σ² is the symbol for variance. Squaring the deviations from the mean, adding them up, and dividing by the total number of scores (minus 1 - don't worry about this yet) gives us the variance. When we square the deviations, we also square whatever unit of measurement we are using (i.e., height in inches), which isn't easily interpreted (or meaningful). So we take the square root of the variance, which gives us the standard deviation: a measure of variability in our original unit of measurement, telling us how well our mean represents or 'fits' our data. Remember the mean is our expectation.
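
The steps above can be sketched as follows. The scores are invented for illustration and the function name is hypothetical; the result is cross-checked against Python's standard-library `statistics.variance`, which also divides by n - 1.

```python
import math
import statistics

def sample_variance(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    mean = sum(scores) / len(scores)
    squared_deviations = [(x - mean) ** 2 for x in scores]
    return sum(squared_deviations) / (len(scores) - 1)

# Made-up scores, for illustration only (mean = 5)
scores = [2, 4, 4, 4, 5, 5, 7, 9]

variance = sample_variance(scores)  # in squared units
sd = math.sqrt(variance)            # back in the original units

print(round(variance, 2), round(sd, 2))
print(round(statistics.variance(scores), 2))  # library cross-check
```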

