Data Analysis Final

Law of large numbers

The larger the sample size (n), the more probable it is that the sample mean is close to the population mean.

Exploratory Data Analysis

"A set of procedures for arranging and displaying numbers that allow the researcher to quickly organize, summarize, and interpret the data collected from a research project." - Stem and leaf plot - Frequency distribution and grouped frequency distribution - Histogram - Polygon - Bar graph

null hypothesis

(H0) states that in the general population, there is no change, no difference, or no relationship. In the context of an experiment, H0 predicts that the independent variable (treatment) has no effect on the dependent variable (scores) for the population

alternative hypothesis

(H1) states that there is a change, a difference, or a relationship for the general population. In the context of an experiment, H1 predicts that the independent variable (treatment) does have an effect on the dependent variable.

estimated standard error

(sM) is used as an estimate of the real standard error, σM, when the value of σ is unknown. It is computed from the sample variance or sample standard deviation and provides an estimate of the standard distance between a sample mean, M, and the population mean, μ.

Random Sampling

- ***sampling with replacement*** - 2 key components: 1) each individual has an equal chance of being selected, 2) the probability of being selected does not change across repeated draws. Each draw is an ~independent event~

independent groups one-way ANOVA criteria for statistical inferencing tests

- 1 IV with 3 levels ; DV is at least interval data Example: Does caffeine affect errors on a stats exam?

dependent groups t-test

- 1 sample, IV 2 levels

WHY YOU NEED N-1 IN DENOMINATOR FOR STANDARD DEVIATION OF A SAMPLE

- SAMPLE VARIABILITY IS A BIASED ESTIMATE --> A SAMPLE SYSTEMATICALLY UNDERESTIMATES THE VARIABILITY OF THE POPULATION (B/C YOU ARE MOST LIKELY TO GET PEOPLE WHO ARE CLOSE TO THE MEAN), THEREFORE YOU HAVE TO ADJUST FOR THIS BY SUBTRACTING 1 FROM N IN THE DENOMINATOR

Mutually exclusive events

- A and B are mutually exclusive events if A and B cannot occur simultaneously. - If A and B are mutually exclusive, then Pr{A or B} = Pr{A} + Pr{B}

Percentile rank of a score (c%)

- A statistic that describes the relative position of a score within a sample. The sample is the frame of reference - % of individuals that have scores AT or BELOW a given value of X - % of individuals with scores < X - (cf of that score/N) X 100 - More precise ways to calculate percentile rank using interpolation - SPSS and textbook uses them - *** What's the percentile rank for a score of 7.7? 5.2?(AKA IN-BETWEEN THE GIVEN INTERVAL** Think about real limits. ---> CHOSE THE SCORE THAT THE REAL LIMITS CORRESPOND WITH; AKA FOR 7.7 CHOOSE 8 AND FOR 5.2 CHOOSE 5***

Which is more likely when tossing a coin 8 times? HHHHHHHH or HHHHHHHT or HTHTHTHT ?

- ALL EVENTS ARE EQUALLY LIKELY! (.5)^8 despite ppl thinking that HTHTHTHT is more likely b/c more equal

Univariate statistics

- Any distribution - To determine how an individual performed relative to his/her reference group. - To compare an individual's score across two dependent measures. - On the French final, the mean was 75 with a standard deviation of 5. On the math final, the mean was 65 with a standard deviation of 10. Cameron got an 80 on the French final and an 80 on the math final. ****Is she equally proficient at math and French compared to her classmates?**** Why or why not? What is Cameron's z-score on the French final? On the Math final? X-mean = deviation score (X-mean)/s = standardized (or z) score Mean and standard deviation BOTH determine the z-score ALSO for: - Normal distribution - Estimate percentage or number of subjects in a sample that fall within any interval of scores. - Find scores that mark off particular percentages of the sample or areas of the normal distribution.
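
The Cameron question above can be checked with a short sketch. The function name `z_score` is an illustrative choice; the means, standard deviations, and scores are the ones given on the card.

```python
def z_score(x, mean, sd):
    """Standardized score: deviation score (x - mean) divided by the standard deviation."""
    return (x - mean) / sd

# Cameron scored 80 on both finals, but the two classes differ in mean and spread.
french_z = z_score(80, mean=75, sd=5)    # deviation score of 5 points
math_z = z_score(80, mean=65, sd=10)     # deviation score of 15 points

print(french_z, math_z)
```

The same raw score gives different z-scores, so relative to her classmates she is not equally proficient in the two subjects: she stands further above the class mean in math.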

Outliers

- Any score in a distribution that's 3 or more standard deviations away from the mean is considered an outlier - Such scores are very rare with bell-shaped (normally distributed) data - Extreme scores and outliers have a large impact on the range, variance, and the standard deviation

organizing vs. summarizing data

- As you organize your data, you can begin to interpret it, understand it, and communicate its meaning to others. - As you summarize the data, you select the aspects of the data that you wish to highlight; But, as you summarize, you LOSE information. (more descriptive stats)

Mean

- Best measure of typicality for interval or ratio data (under certain conditions); most commonly used - Average score; sum of x/N - Balance point of the weight, size, or magnitude of the scores in the distribution - Easy to work with mathematically - Misleading if skewed distribution or outliers; very sensitive to extreme scores **SENSITIVE TO EXTREME SCORES**

Biased vs. unbiased statistic

- Biased statistic: Any sample statistic that, on average, consistently overestimates or underestimates the corresponding population parameter. - Unbiased statistic: Any sample statistic that, on average, equals the corresponding population parameter. -- M (sample mean) is an unbiased statistic -- M provides a good estimate of the population mean μ - If we calculated s using the same formula that we used for σ, it would be a biased statistic b/c it would consistently underestimate σ. Samples are consistently less variable than populations b/c we are more likely to select typical scores (scores close to the mean) b/c those scores occur more frequently in the distribution (assuming a normal distribution)

Discrete variables: Binomial Distribution

- Binomial Distribution is used whenever the measurement procedure classifies individuals into exactly 2 categories; p(A) = p and p(B) = q - binomial distribution gives the probability for each value of X, where X= the number of occurrences of category A in a series of n events - (SHOWS THE PROBABILITY ASSOCIATED WITH EACH VALUE OF X FROM X=0 TO X=n) - each trial has only 2 POSSIBLE OUTCOMES (i.e., coin toss (H OR T), multiple choice questions (correct or wrong), gender (M or F), class year (1st year or other) - 3 ways to solve: 1) use binomial formula, 2) approx. using normal curve or 3) use trick learned today?

Making Predictions using Linear Regression

- Decide which variable is X and which is Y. - Convert x to a z-score --> zx - Multiply r and zx to get the predicted z-score for y --> zy^ - Convert the predicted z-score to original y units; predicted y = y^
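
The four steps above can be sketched as a single function. All names and the example numbers (means, standard deviations, r) are hypothetical and chosen only for illustration.

```python
def predict_y(x, mean_x, sd_x, mean_y, sd_y, r):
    """Predict y from x using the z-score form of linear regression."""
    z_x = (x - mean_x) / sd_x          # step 2: convert x to a z-score
    z_y_hat = r * z_x                  # step 3: predicted z-score for y
    return mean_y + z_y_hat * sd_y     # step 4: convert back to original y units

# Hypothetical data: x is 2 SDs above its mean and r = 0.5,
# so the predicted y is only 1 SD above the mean of y.
y_hat = predict_y(70, mean_x=50, sd_x=10, mean_y=100, sd_y=15, r=0.5)
print(y_hat)
```

Because r multiplies the z-score, the prediction shrinks toward the mean of y whenever the correlation is less than perfect.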

dependent groups t-test criteria for statistical inferencing tests

- Dependent or Correlated Samples (Within-subject IV) --> Same subjects in both groups; Repeated-measures design; Within-subjects design; Before-and-after design Different subjects in each group; Matched random assignment or matched-groups design in which match (or pair) subjects based on a common characteristic (e.g., same IQ) and then randomly assign each member of each pair to a group. - Normally distributed; at least interval data

Correlation Coefficient

- Descriptive statistic - Measures or describes (quantifies) the strength and direction of a linear relationship between two variables. In particular, to what extent did participants' performance on X relative to the rest of the sample match their performance on Y relative to the rest of the sample? - For interval or ratio data, it's the Pearson product-moment correlation coefficient - Pearson r - rp (sample) - ρ (rho; population) - rp > 0 --> positive (or direct) relationship - rp < 0 --> negative (or inverse) relationship - rp = 0 --> no linear relationship - rp = +1 or -1 --> perfect relationship - rp = 6.25 --> NOT POSSIBLE (r must fall between -1 and +1)

Discrete vs. continuous scales

- Discrete scale: Separate, indivisible categories. No values can exist between two neighboring categories - Continuous scale: There are an infinite number of possible values that fall between any two observed values. Is divisible into an infinite number of fractional parts.

Interpretation of a score with different shaped distributions

- First exam: X = 63 and M = 52 - Second exam: X = 63 and M = 52 - Third exam: X = 63 and M = 52 - In each case, X is 11 points above average (X - M) = 11 = deviation score - Nonetheless, the interpretation of 63 varies across the 3 distributions because of -- Shape of distribution -- Amount of variability in the distribution

Scales of measurement

- Function: To assign scores or values along some dimension according to a set of rules - Qualitative scales : No arithmetic operations beyond <, >, =, Nominal (aka categorical), Ordinal - Quantitative scales: Arithmetic operations allowed, Interval, Ratio

Frequency distribution graphs

- Histograms - Polygons - Bar graphs - Scores (X values) are listed on the X axis and f (count) is listed on the Y axis

Philosophy of statistical inferencing

- Hypothesis testing - Method of indirect proof - Foundation based in probability - all interval/ratio data

Multiplication law of probability

- If A and B are independent, then Pr{A and B} = Pr{A} x Pr{B} - If A and B are not independent, then Pr{A and B} = Pr{A} x Pr{B|A}, where Pr{A and B} is the joint probability, Pr{A} is the marginal probability, and Pr{B|A} is the conditional probability. Note: If A and B are dependent, then Pr{B|A} = Pr{A and B}/Pr{A} and Pr{A|B} = Pr{A and B}/Pr{B}

Histogram

- Interval or ratio data - Each bar is centered above each score (or class interval) - Height corresponds to f for each bar - Width extends to real limits - Adjacent bars touch

Polygon

- Interval or ratio data - Same features as a histogram but use a dot centered above each score (or class interval) - Lines connect dots with line starting and ending at f = 0

independent groups t-test

- Is there an IV? What is it or what are they? How many levels of the IV(s) are there? - What is the DV? What is its scale of measurement? - Describe the characteristics of the population from which the sample is selected if H0 is true. - Sample size? (N1 + N2) - Statistical hypotheses? (H0 and H1) - Level of significance? (α) - Describe the sampling distribution using CLT. - Randomly select all possible PAIRS of samples (get all possible pairs of means and all possible mean differences, M1 - M2). The expected average mean difference is 0 if H0 is true: (μM1 - μM2) = 0. The estimated sampling error of the mean differences is the sq root of the sum of the estimated variances for the sampling distributions created by sampling from each population. It's the pooled variance of the sampling distributions. Variance Sum Law: The variance of the difference between two independent variables equals the sum of their variances. - s^2 = SS/(n-1), so (n-1)s^2 = SS; here n1 = n2 = n - It's a t-distribution (t) with: df (degrees of freedom) = n1 + n2 - 2 = 26 + 26 - 2 = 50 - mean = μ(M1-M2) = 0 - sd = s(M1-M2) = sq root of [(13.26)^2/26 + (12.24)^2/26] = 3.54 - What is the critical rejection region or the critical value of the test statistic? (use table B.2) - Perform the statistical test.

Pearson r

- Measures the extent to which (x, y) points fall on a line - **The SD line is the line on which all data points will fall if r = +1/-1*** - The SD line is the line which is based on the standard deviation of the two variables - sy/sx (since r=1 or -1 then the slope of SD is just sy/sx) - Change in y over change in x - Rise over run - Standard deviation line = sy/sx --> not the same as regression line, it's the line when r=1 or -1

Calculating the Median

- Method 1: N is an odd number - Median is the middle-most score - Rule of thumb to locate the median, esp. for a large data set - (N+1)/2 = location of the median; What is the score at that location? 81 83 85 87 89 95 98 - (7+1)/2 = 4 (cum f of Median = 4); Median is the 4th score - Median = 87 (3 scores below 87 and 3 above) - Why do we use N+1 rather than N? What happens if we use N rather than N+1? - ***Median (use N + 1) vs. 50th percentile (use N) 7/2 = 3.5**** - So the median should be the 3.5th score or 86 but then we have 3 scores below the median and 4 scores above the median so 86 is not the middle-most score. ---> WHY YOU USE N+1 INSTEAD OF JUST N Method 2: N is an even number - Median is the average of the middle-most pair of scores Rule of thumb to locate the median - (N+1)/2 = location of the median; Find the average of the two adjacent scores. 81 83 85 87 89 95 - (6+1)/2 = 3.5 (cum f of Median = 3.5); Median is the 3.5th score - Median = (85+87)/2 = 86 (3 scores below 86 and 3 above) 81 85 92 97 99 100 - What's the median? - What happens if we use N rather than N+1? - 6/2 = 3 - So the median would be the 3rd score or 92 but then we have 2 scores below the median and 3 scores above the median so 92 is not the middle-most score. --> USE N+1 AND NOT N Method 3: Interpolation -- we don't have to know this
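
A minimal sketch of the (N+1)/2 location rule from Methods 1 and 2, applied to the three score lists on the card. The function name `median_by_location` is an illustrative choice.

```python
import math

def median_by_location(scores):
    """Median via the (N + 1) / 2 location rule (1-based position in the ordered scores)."""
    if not scores:
        raise ValueError("need at least one score")
    ordered = sorted(scores)
    location = (len(ordered) + 1) / 2
    # For odd N the location is a whole number, so lower and upper are the same score.
    # For even N the location ends in .5, so average the two adjacent scores.
    lower = ordered[math.floor(location) - 1]
    upper = ordered[math.ceil(location) - 1]
    return (lower + upper) / 2

odd_median = median_by_location([81, 83, 85, 87, 89, 95, 98])    # location 4
even_median = median_by_location([81, 83, 85, 87, 89, 95])       # location 3.5
third_median = median_by_location([81, 85, 92, 97, 99, 100])     # location 3.5
print(odd_median, even_median, third_median)
```

For the last list the median falls halfway between the 3rd and 4th scores (92 and 97), which answers the "What's the median?" question on the card.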

Median

- Midpoint of an *ordered* distribution - Use for interval/ratio or ordinal data - Best measure of typicality for ordinal data - Best measure of typicality for interval/ratio data if distribution is skewed or outliers exist - Mathematically difficult to work with - Balance point of number of observations, but ignores magnitude of each observation

Mode

- Most frequently occurring value in the distribution - Simple yet crude measure - Can be used for all scales of measurement, but especially for nominal data - Useful for describing shape of distribution - unimodal = one peak - Major vs minor mode (taller peak = major, smaller but still peak = minor mode) - bimodal = two peaks (two scores occur most often; peaks have equal heights) - multimodal = more than two peaks - Imprecise and possibly misleading as a measure of typicality

How to interpret a score relative to a comparison group/ sample

- Need frame of reference: - Percentile rank (median) - z-scores or standard scores (mean)

Bar graph

- Nominal or ordinal data - Similar to histogram but gaps between bars

2 kinds of alternative hypotheses

- Non-directional (2-tailed) - Directional (1-tailed)

Ratio scale

- Ordered values that are all intervals of exactly the same size and an absolute zero point - Orderable values - Equal distances between values on the scale reflect equal differences in magnitude; Intervals are objective and equivalent - Truly quantitative scale - Can tell both direction and magnitude of the difference between two individuals - Ratio comparisons of measurements are allowed - Use arithmetic operations (add, subtract, multiply, divide) - Examples: Inches, pounds - What scale of measurement are IQ tests? SAT?

Interval scale

- Ordered values that are all intervals of exactly the same size; no absolute zero point - Orderable values - Equal distances between values on the scale reflect equal differences in magnitude; Intervals are objective and equivalent; This allows for arithmetic operations - Truly quantitative scale - Can tell both direction and magnitude of the difference between two individuals - Use arithmetic operations (add, subtract, multiply, divide) but ratios of magnitudes are not meaningful - Example: Celsius temperature scale (the absence of heat = -273˚, not 0˚) Why can't I say that 40˚C is twice as warm as 20˚C? ---> B/C there's no absolute zero (so ratios are not meaningful)

Representativeness (heuristic)

- People think diverse is more representative - "the degree to which [an event] (i) is similar in essential characteristics to its parent population, and (ii) reflects the salient features of the process by which it is generated". [1] When people rely on representativeness to make judgments, they are likely to judge wrongly because the fact that something is more representative does not actually make it more likely.

Population vs. sample

- Population = all the individuals of interest - Sample is taken from a population for a research study; and results generalized to the population - Population parameters (describe populations) vs. sample statistics (describe samples)

Topics covered on test on 4/21

- Probability - Hypothesis Tests (z-tests) - t-tests: (where variance is unknown): Two Independent Samples (independent measures design), and Two Related Samples (repeated measures design and matched-subjects design) - f-tests (anova - independent groups design, one- way) - Effect size : Cohen's d - Calculating the portion of variability in the DV (in behavior) that's accounted for by the IV: r squared/ omega squared/ eta squared - Post-hoc tests (Tukey HSD test) - Power

standard error for the sample mean difference (ESTIMATED SAMPLING ERROR)

- S(M1-M2) - the t statistic compares how much difference actually exists between the two sample means with how much difference should exist between the two sample means if there is no treatment effect that causes them to be different (the standard error) - a large value for the t statistic is evidence for the existence of a treatment effect - find the error from each sample separately (SM) and then add the two errors together, so S(M1-M2) = sq root of (S1 squared/n1 + S2 squared/n2) (use this form only when n is the same size for both samples; otherwise the formula is biased)

Ordinal scale

- Scale consists of a set of categories that are organized in an ordered sequence - Orderable values - Rank observations in terms of size or magnitude; Measures relative position of observation on scale - Distances between ranks are not held constant across the scale; Intervals are neither objective nor equivalent - Can only tell the direction of difference between two individuals; not true magnitude of difference - No arithmetic operations other than <, >, = - Still a qualitative scale - Examples: Rating scales, rank

Nominal scale

- Scale consists of a set of categories that have different names - Label and categorize - Categories are not orderable - Can only determine if two individuals are the same or different - No quantitative or numerical info is provided by the scale itself - Examples: gender, hair color, occupation

probability of sample means

- To determine if a sample mean is "different enough," we rely on probability. Specifically, the probability that a sample mean will or will NOT be found in the middle 95% of the distribution of sample means. - Effect of treatment, for example, is considered to be "statistically significant" (i.e., not likely due to chance or due to sampling error) if the probability of that sample mean is very small. - How small depends on alpha (α)

Binomial Formula

- Use the binomial formula to calculate the exact probability: - example of coin toss: - p = pr(H) and q = pr(T) - n = # of trials or observations or individuals - X = # of times expect to get a particular outcome - μ = pn - σ = sq root of npq
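
The card names the binomial formula without writing it out. A minimal sketch follows, assuming the standard form Pr{X} = C(n, X) p^X q^(n-X); the function names are illustrative, and the example uses the 8-toss fair coin from the earlier coin-toss card.

```python
from math import comb, sqrt

def binomial_probability(x, n, p):
    """Exact probability of x occurrences of category A in n independent trials."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

def binomial_mean_sd(n, p):
    """Mean (pn) and standard deviation (square root of npq) of the binomial distribution."""
    q = 1 - p
    return n * p, sqrt(n * p * q)

p_all_heads = binomial_probability(8, n=8, p=0.5)   # 8 heads in 8 tosses
mean, sd = binomial_mean_sd(8, 0.5)
print(p_all_heads, mean, sd)
```

Summing `binomial_probability(x, n, p)` over X = 0 to X = n gives 1, which is a quick sanity check on any hand calculation.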

Standard Deviation

- Want a single number that tells me how far, on average, scores deviate from the mean - Average deviation or distance from the mean; standard deviation from the mean - 0 = no deviation (all scores = mean) - Large number = Large spread around the mean; lot of deviation - Small number = Small spread around the mean; tight cluster around mean; little deviation - Population variance (σ^2) and the population standard deviation (σ) - Sample variance (s^2) and the sample standard deviation (s) - conceptual: RMSD = Square Root of the Mean of the Squared Deviation Scores - computational: variance = SS/N for a population and SS/(N-1) for a sample; take the square root to get the standard deviation - remember SS = sum of (x squared) - [(sum of x) squared]/N
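
A small sketch of the computational formulas above. The function names and the four example scores are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from math import sqrt

def sum_of_squares(scores):
    """Computational SS: sum of squared scores minus (sum of scores) squared over N."""
    n = len(scores)
    return sum(x**2 for x in scores) - sum(scores)**2 / n

def population_sd(scores):
    """Treat the scores as a whole population: divide SS by N."""
    return sqrt(sum_of_squares(scores) / len(scores))

def sample_sd(scores):
    """Treat the scores as a sample: divide SS by N - 1 to correct the bias."""
    return sqrt(sum_of_squares(scores) / (len(scores) - 1))

scores = [2, 4, 4, 6]                 # hypothetical data; mean = 4
ss = sum_of_squares(scores)           # 72 - 256/4
print(ss, population_sd(scores), sample_sd(scores))
```

The sample version is always slightly larger than the population version for the same scores, which is the N-1 adjustment at work.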

hypothesis test

- a statistical method that uses sample data to evaluate a hypothesis about a problem

Law of large numbers vs. Law of small numbers

- a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value, and will tend to become closer as more trials are performed. - there is no principle that a small number of observations will coincide with the expected value or that a streak of one value will immediately be "balanced" by the others (see the gambler's fallacy).

t-distributions

- bell-shaped but flatter than normal distributions, with larger tails (more extreme sample means; atypical samples) - s will better estimate σ as N increases - Atypical samples become increasingly rare as N increases because larger samples (N > 30) are more likely to have representative (or typical) individuals - Thus, the exact shape depends on N or, more precisely, df = N-1 - So there is a family of t-distributions

Analysis of Variance (aka ANOVA)

- better than pairwise t-tests b/c it is less time consuming AND repeated pairwise tests increase the probability of making a Type I error - alpha(EW) = experimentwise error rate --> very high if you keep doing t-tests - Simultaneously compares all group means in a single test - Omnibus test (compares all sample means at once) - with a single omnibus test, alpha(EW) = alpha - can only have non-directional hypotheses
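
To see why repeated pairwise tests inflate the experimentwise error rate, here is a rough sketch. It assumes the comparisons are independent and uses the common approximation alpha(EW) = 1 - (1 - alpha)^c, where c is the number of pairwise comparisons; this formula and the function name are not from the card.

```python
def experimentwise_alpha(k_groups, alpha=0.05):
    """Approximate chance of at least one Type I error across all pairwise t-tests.

    Assumes independent comparisons, which is a simplification.
    """
    comparisons = k_groups * (k_groups - 1) // 2   # number of distinct pairs of groups
    return 1 - (1 - alpha) ** comparisons

three_groups = experimentwise_alpha(3)   # 3 pairwise tests
five_groups = experimentwise_alpha(5)    # 10 pairwise tests
print(three_groups, five_groups)
```

With three groups the rate is already close to three times the nominal alpha, which is the motivation for a single omnibus test.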

independent groups t-test criteria for statistical inferencing tests

- between subject IV (different subjects in each group --> Random assignment of subjects to groups; experimental design; manipulated IV Sort subjects into groups based on pre-existing characteristics; quasi-experimental design (or intact-groups design); quasi-IV or participant IV) - NORMALLY distributed

cohen's d

- determines the importance (substantiality) of a significant result - for z-tests (vs use ~estimated~ cohen's d for t-tests) - indicates the relative magnitude of the difference between the means - the treatment mean represented as M and the untreated mean represented as μ - I.E., How large is the mean difference relative to the standard DEVIATION (not standard error)? mean difference/ standard deviation = (M treatment - μ no treatment)/σ - (not influenced by number of scores in sample) - d = 0.2 --> small effect - d = 0.5 --> medium effect - d = 0.8 --> large effect

flaws in hypothesis testing?

- doesn't really evaluate the absolute size of a treatment effect - significance is relative to standard error (which is very dependent on sample size)

Type 2 error

- failure to reject a false null hypothesis: occurs when a researcher fails to reject a null hypothesis that is really false. In a typical research situation, Type II error means that the hypothesis test has failed to detect a real treatment effect. - (not as serious) - beta symbol - Pr{Type II error} = beta - As alpha decreases, Pr{Type II error} increases. - Be careful: beta ≠ 1 - alpha

Not independent

- first trial affects what happens on the second trial

Normal curve approximation

- in the normal approximation to the binomial distribution, each value of X has a corresponding z-score - z = (X-μ)/σ, with μ = pn and σ = sq root of npq - NOTE "X" = "X +/- 0.5" FOR BINOMIAL DATA - "X +/- 0.5": Because the normal curve is continuous and the binomial distribution is discrete, we will have to use LRL (lower real limit) or URL (upper real limit) depending on the problem - X - 0.5 --> "at least" - X + 0.5 --> "at most" - Use the normal curve to approximate it (BUT pn ≥ 10 AND qn ≥ 10); MAKE SURE THIS IS TRUE IN ORDER TO MEET THE CRITERIA FOR THE NORMAL APPROXIMATION TO THE BINOMIAL DISTRIBUTION - With the z-score and the unit normal table, you can find the probability values associated with any value of X. For maximum accuracy, use the appropriate real limits for X when computing z-scores and probabilities
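
A sketch of the "at least" case, using the lower real limit (half a unit below X) and comparing the approximation with the exact binomial sum. The function names and the example (at least 60 heads in 100 fair tosses) are hypothetical; the normal curve area is computed with `math.erf` instead of a printed table.

```python
from math import comb, erf, sqrt

def normal_cdf(z):
    """Proportion of the standard normal curve at or below z."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def approx_prob_at_least(x, n, p):
    """Normal approximation to Pr{X >= x}, using the lower real limit x - 0.5."""
    q = 1 - p
    if n * p < 10 or n * q < 10:
        raise ValueError("pn and qn must both be at least 10")
    mu = n * p
    sigma = sqrt(n * p * q)
    z = (x - 0.5 - mu) / sigma
    return 1 - normal_cdf(z)

def exact_prob_at_least(x, n, p):
    """Exact Pr{X >= x} from the binomial formula, for comparison."""
    q = 1 - p
    return sum(comb(n, k) * p**k * q**(n - k) for k in range(x, n + 1))

approx = approx_prob_at_least(60, n=100, p=0.5)
exact = exact_prob_at_least(60, n=100, p=0.5)
print(approx, exact)
```

For an "at most" question the same idea applies with the upper real limit, half a unit above X, and the area below that z-score.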

Real limits

- located halfway between 2 scores - use real limits (+ or - .5) only for nominal data? - each score has an upper real limit and a lower real limit

Directional alternative hypotheses

- makes a statement about the direction of the difference, change or relationship - specifies whether the difference, change, or relationship will be positive or negative - i.e., states whether there will be an increase or decrease in the population mean

Non directional alternative hypotheses

- makes a statement there will be a difference, a change, a relationship

measures of central tendency

- mean, median, mode Descriptive statistics that determine a single score that defines the center of the distribution Goal is to find the most typical or representative score in a data set. Mode = Most frequently occurring score Median = Middle-most score (the midpoint of the distribution) Mean = Average

3 basic types of descriptive stats

- measures of central tendency - measures of variability or deviation - measures of the shape of the distribution

Gambler's fallacy

- mistaken belief that, if something happens more frequently than normal during some period, it will happen less frequently in the future, or that, if something happens less frequently than normal during some period, it will happen more frequently in the future - In situations where what is being observed is truly random (i.e., independent trials of a random process), this belief, though appealing to the human mind, is false.

Rounding

- normally round to 2 decimal places - If the remainder is exactly "a half," drop it if the preceding digit is even and round up if the preceding digit is odd - Round the following to 2 decimal places - 49.7650000 --> 49.76 - 49.7750000 --> 49.78
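
This rule is usually called round-half-to-even. A minimal sketch with Python's `decimal` module follows; the helper name `round_half_even` is illustrative. Values are passed as strings because binary floats cannot store numbers like 49.765 exactly.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(value, places="0.01"):
    """Round a decimal string to the given precision, sending exact halves to the even digit."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)

first = round_half_even("49.7650000")    # preceding digit 6 is even: drop the half
second = round_half_even("49.7750000")   # preceding digit 7 is odd: round up
print(first, second)
```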

when r = 1

- perfect correlation - predicted y is the same distance (in standard deviations) from its mean as the original x score is from its mean - Sesty is 0

Compound probability statements

- pr {2 first-years and 1 soph or 3 seniors} = (2/7)*(2/7)*(3/7) + (50/700)^3

when r = 0

- predicted y is just the mean of y - Sesty is Sy

measures of variability or deviation

- range - variance - standard deviation - Descriptive statistics that provide a quantitative measure of the degree to which scores in a distribution are spread out or cluster together - Usually a single number - Describes the extent to which scores in the data set differ from each other - Range = Highest score - lowest score - Variance (s^2) and standard deviation (s) measure deviation around the mean

t- test

- same as z-test, but since σ is unknown --> Use sample standard deviation (or s) to estimate σ - ***sM = s/sqrt N OR sM = sq root of (s^2/N) - CLT:: If I draw ALL POSSIBLE RANDOM samples of size N from a NORMAL population with a mean (μ) and an UNKNOWN σ: 1. the mean of the sample means (i.e., the mean of the sampling distribution) will be equal to the population mean, μM = μ 2. the sd of the sample means (i.e., the estimated standard error of the mean) will be equal to the sample sd divided by the square root of N, sM = s/sqrt N 3. the sampling distribution will be a t-distribution with df (degrees of freedom) = N-1 **TO FIND T CRITICAL --> USE TABLE B.2 IN BACK OF BOOK, WHICH USES DF (degrees of freedom) or N-1, alpha, and 1 or 2 tails

Sesty

- standard error of estimate - a new kind of standard deviation - difference between actual y score and predicted/ estimated y score - equation = sq. root of 1 - r squared all times Sy - Why calculate sest y? - It estimates the size of the errors in our predictions. It estimates the average residual. - It estimates the predictive power of r. Compare sest y to sy. - We can use sest y and y^ and assumption about normally distributed data to estimate the percentage of time predicted scores will fall within a certain range. Like we did before with Table B.1 and z-scores.
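
The equation above, written out as a short sketch. The function name and the example values (r = 0.6, Sy = 10) are hypothetical.

```python
from math import sqrt

def standard_error_of_estimate(r, sd_y):
    """Square root of (1 - r squared), all times the standard deviation of y."""
    return sqrt(1 - r**2) * sd_y

perfect = standard_error_of_estimate(1.0, sd_y=10)    # no prediction error
none = standard_error_of_estimate(0.0, sd_y=10)       # errors as large as Sy itself
moderate = standard_error_of_estimate(0.6, sd_y=10)   # somewhere in between
print(perfect, none, moderate)
```

Comparing the result with Sy shows how much the correlation shrinks the typical prediction error.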

Sampling Distributions Overlap

- to find power - assume a treatment effect - find the critical rejection region - find the critical M from the critical z: z = (M - μ)/σM, where σM = σ/sqrt N - with this M, find a new z under the treated distribution: z = (M - μnew)/σM - look up the proportion in the body in the table

homogeneity of variance

- two populations being compared must have the same variance (both samples are estimating the same population variance, ie: two people guessing your IQ, versus one person guessing your IQ and the other guessing number of grapes in a pound --> this way, we cannot average them)

ESTIMATED STANDARD ERROR FOR SAMPLE MEAN DIFFERENCE

- unbiased; use when n is not the same size for both samples - S(M1-M2) = sq root of (s squared p/n1 + s squared p/n2)

pooled variance

- use for correcting bias - obtained by averaging/ pooling the two sample variances using a procedure that allows the bigger sample to carry more weight in determining final value - s squared (p) = (SS1 + SS2)/(df1 + df2) - also an alternative formula
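
A sketch of the pooled variance and the resulting standard error of the mean difference. The function names are illustrative. The example reuses the standard deviations 13.26 and 12.24 with n = 26 per group from the independent groups t-test card, recovering each SS as (n - 1) times s squared.

```python
from math import sqrt

def pooled_variance(ss1, n1, ss2, n2):
    """(SS1 + SS2) / (df1 + df2), so the larger sample carries more weight."""
    return (ss1 + ss2) / ((n1 - 1) + (n2 - 1))

def standard_error_of_difference(ss1, n1, ss2, n2):
    """Square root of (pooled variance / n1 + pooled variance / n2)."""
    sp2 = pooled_variance(ss1, n1, ss2, n2)
    return sqrt(sp2 / n1 + sp2 / n2)

n = 26
ss1 = (n - 1) * 13.26**2    # SS = (n - 1) * s squared
ss2 = (n - 1) * 12.24**2
se = standard_error_of_difference(ss1, n, ss2, n)
print(round(se, 2))
```

With equal sample sizes the pooled formula and the simpler unpooled formula agree; they differ only when n1 and n2 are unequal.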

For nominal data

- use real limits (+ or - .5) only for this type of data! when exactly?

Continuous variables: Finding probabilities using the normal curve

- what % of general population has an IQ greater than/ equal to 115 when μ = 100 and σ = 15? (assuming IQ scores are normally distributed) -- we would compute the z-score, (115 - 100)/15 = +1.00, and then look at the z-score chart for the proportion beyond that z-score (about 16%) - what's the probability of randomly selecting 4 people with IQs less than or equal to 85? -------> use IQ ≤ 85 = 16% or .16 (proportion in tail of z = -1.00), then do (.16)^4 - what's the probability of randomly selecting 2 people with IQs as deviant, as rare, or as unusual as 55? --------> an IQ of 55 is z = -3.00, with a tail proportion of .0013; multiply by 2 to count both tails (as unusual in either direction) = .0026, then square it because the 2 selections are independent: (.0026)^2
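
The three IQ questions can be checked numerically. This sketch computes normal curve areas with `math.erf` instead of a printed z table, so the values differ slightly from table-rounded figures; the function and variable names are illustrative.

```python
from math import erf, sqrt

def normal_cdf(x, mu, sigma):
    """Proportion of a normal distribution at or below x."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))

MU, SIGMA = 100, 15

p_ge_115 = 1 - normal_cdf(115, MU, SIGMA)     # tail beyond z = +1.00
p_le_85 = normal_cdf(85, MU, SIGMA)           # tail beyond z = -1.00
p_four_le_85 = p_le_85 ** 4                   # 4 independent selections

# "As unusual as 55" counts both tails (z = -3.00 and z = +3.00), hence the factor of 2.
p_as_extreme_as_55 = 2 * normal_cdf(55, MU, SIGMA)
p_two_extreme = p_as_extreme_as_55 ** 2       # 2 independent selections

print(p_ge_115, p_four_le_85, p_two_extreme)
```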

Stem and leaf plot

- with first digit on the left hand side (stem) and then a line, and then second digit on the right hand side (leaves) - Efficient method for obtaining and displaying data - Organizes data without summarizing - Displays a frequency distribution - Each score is divided into a stem and a leaf - Stem: First digit or digits - Leaf: Final digit - Stems: List all possible values from lowest (bottom) to the highest (top) obtained value (SO VERTICALLY TOP TO BOTTOM GOES FROM HIGH TO LOW) Unit = 1 or unit > 1 but unit size cannot vary from one stem to another - AND Leaves should be organized left to right from low to high values***

Not mutually exclusive events

- you can be both 2 events at once (i.e., female and sophomore) - therefore to find probability, add up probability of each event, then subtract the probability of being both - If A and B are not mutually exclusive, then Pr{A or B} = Pr{A} + Pr{B} - Pr{both A and B}

Properties of probability

0 <(or equal to) Pr{event} <(or equal to) 1 Pr{all possible outcomes} = 1 Pr{A} + Pr{not A} = 1 Note: Pr{A} = 1 - Pr{not A}

z-test criteria for statistical inferencing tests

1 sample (i.e., college students), no IV (no manipulation); DV is at least interval data (i.e., IQ); normally distributed data **σ OF POPULATION KNOWN***

t-test criteria for statistical inferencing tests

1 sample (i.e., college students), no IV (no manipulation); DV is at least interval data (i.e., IQ); normally distributed data **σ OF POPULATION UNKNOWN***

Null/ Alternative Hypotheses for 1 sample (No IV)

1 sample no IV example: Are college students different in intelligence (IQ) from the general population (not studied, just known already)? OR What's the probability that I would get a result as rare as M= 108 if H0 were true and the IQ of college students is the same as that of adults in general? - Pr {results as rare as those observed|H0} = p Non-directional: - H0 : μ = 100 - H1 : μ ≠ 100 Directional (They're smarter): - H0 : μ ≤100 - H1 : μ > 100 **NO SUBSCRIPTS** b/c that implies 2 different groups you studied

Power of an Inferential Test

1- beta - the probability of rejecting the null hypothesis when it is false (i.e., of finding a statistically significant difference when it exists)

z- test

1. Is there an IV? What is it or what are they? How many levels of the IV(s) are there? 2. What is the DV? What is its scale of measurement? Describe the characteristics of the population from which the sample is selected if H0 is true. 3. Sample size? 4. Statistical hypotheses? (H0 and H1) 5. Level of significance? (alpha) 6. Describe the sampling distribution **USING CLT** --> μM = μ; σM = σ/sq. root of n 7. What is the critical rejection region or the critical value of the test statistic? 8. Perform the statistical test. **CALCULATE Zobs vs. Zcritical*** using z-score formula 9. Reject or fail to reject H0. Draw a probabilistic conclusion: be specific and directional regardless of H0 if you reject H0.
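
Steps 6 through 9 can be sketched for the college-student IQ example (M = 108, population mean 100). The population standard deviation of 15 and the sample size n = 36 are assumptions made only for this illustration, as is the function name; 1.96 is the two-tailed critical z for alpha = .05.

```python
from math import sqrt

def z_test(sample_mean, mu, sigma, n, z_critical=1.96):
    """One-sample z-test; returns the observed z and whether H0 is rejected (two-tailed)."""
    standard_error = sigma / sqrt(n)                 # step 6: sd of the sampling distribution
    z_obs = (sample_mean - mu) / standard_error      # step 8: observed z
    reject_h0 = abs(z_obs) >= z_critical             # steps 7 and 9: compare with critical z
    return z_obs, reject_h0

# sigma = 15 and n = 36 are assumed values, not taken from the example itself.
z_obs, reject_h0 = z_test(sample_mean=108, mu=100, sigma=15, n=36)
print(z_obs, reject_h0)
```

Under these assumed values the observed z falls in the rejection region, so the conclusion would be stated directionally: the sample mean is higher than the general population mean.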

Exhaustive events

A and B are exhaustive events if Pr{A} + Pr{B} = 1 Note: A and not A are exhaustive events

Independent events

A and B are independent events if the occurrence of A will not affect the probability of B occurring (e.g., rolling a die twice)

Measures of Central Tendency 2

A statistical measure that determines a single value that: - accurately describes the center of a distribution - accurately represents the entire distribution - is most typical or most representative of the sample. Three common measures: - Mode = most frequently occurring score or category - Median = middle-most score (data point) in the distribution - Mean = average (sum of the scores divided by the number of scores). Which one is most appropriate to use?

How to standardize/compare

Convert raw scores to z-scores; convert z-scores back to raw scores. - Given a mean, a raw score, and a z-score, we can determine the standard deviation: M = 5, X = 40, z = 5, find s - Given a standard deviation, a raw score, and a z-score, we can determine the mean: s = 10, X = 80, z = -1, find M
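The two practice problems can be checked by rearranging z = (X - M)/s. This is a small sketch with hypothetical helper names.

```python
def z_score(x, mean, sd):
    """Raw score -> z-score."""
    return (x - mean) / sd

def raw_score(z, mean, sd):
    """z-score -> raw score."""
    return mean + z * sd

def sd_from(x, mean, z):
    """Solve z = (X - M) / s for s."""
    return (x - mean) / z

def mean_from(x, sd, z):
    """Solve z = (X - M) / s for M."""
    return x - z * sd

s = sd_from(x=40, mean=5, z=5)     # (40 - 5) / 5
m = mean_from(x=80, sd=10, z=-1)   # 80 - (-1)(10)
print(s, m)
```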

Measures of Variability (or Dispersion or Spread)

Descriptive statistics that summarize the data and describe the extent to which scores differ from each other in the data set; they provide a quantitative measure of the degree to which scores in a distribution are spread out or clustered together. - Range (mode): distance between the highest and lowest observed scores - Semi-interquartile range (deviation around the median) - Standard deviation (deviation around the mean): average (or standard) distance from the mean (M vs. μ) - Sample sd vs. population sd (s vs. σ) - Sample variance vs. population variance (s² vs. σ²)

measures of the shape of the distribution

Descriptive stats that assess various aspects of the shape of the distribution. Measure of skew: - Symmetrical --> normal, mean = median = mode - Positively skewed --> long tail on the positive side, mean is on the right side of the peak, MEDIAN IS IN BETWEEN THE TWO - Negatively skewed --> long tail on the negative side, mean is on the left side of the peak, MEDIAN IS IN BETWEEN THE TWO

How to determine scale of measurement

Are there orderable values? No --> Nominal scale; Yes --> next Q. Are there equal, objective units across the scale? No --> Ordinal scale; Yes --> next Q. Is there an absolute zero point? No --> Interval scale; Yes --> Ratio scale

Sample size of n > 30

Does not produce much additional improvement in how well the sample represents the population

Null/Alternative Hypotheses for 1 sample (one IV with 2 levels); Dependent measures

Example: Do blonds have more fun? Count the number of dates blonds get, then have them dye their hair and count how many dates they get afterward. *Is there an effect of hair color on number of dates?* Non-directional: H0: μd = 0; H1: μd ≠ 0. Directional: H0: μd ≤ 0; H1: μd > 0

Null/ Alternative Hypotheses for 3+ samples (One IV with 3+ levels)

Example: Does caffeine affect errors on a stats exam? (3 different levels: 0 mg/kg, 100 mg/kg, and 200 mg/kg) - H0: μ0 = μ100 = μ200 - H1: At least one population mean is different from another one.
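A sketch of how these hypotheses are tested with an F-ratio, using the set's df rules (dfbetween = k-1, dfwithin = N-k). The error counts below are made up for illustration, and the function name is hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return (SS_between, SS_within, df_between, df_within, F) for independent groups."""
    all_scores = [x for g in groups for x in g]
    N, k = len(all_scores), len(groups)
    grand_mean = mean(all_scores)
    # variability of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # sum of each group's own SS
    ss_within = sum(sum((x - mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, N - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_ratio

# hypothetical exam errors at 0, 100, and 200 mg/kg of caffeine
errors = [[2, 3, 4], [4, 5, 6], [6, 7, 8]]
ss_b, ss_w, df_b, df_w, F = one_way_anova(errors)
print(ss_b, ss_w, df_b, df_w, F)
```

The observed F is then compared with the critical F for (dfbetween, dfwithin) at the chosen alpha.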

Null/ Alternative Hypotheses for 2 samples (one IV with 2 levels); Independent Measures

Example: Does the IQ of students who self-identify as male differ from that of students who self-identify as female? (Independent) OR Does taking a test more than once affect performance? (Dependent) **WITH SUBSCRIPTS** - Non-directional: H0: μmales = μfemales; H1: μmales ≠ μfemales - Directional: H0: μmales ≤ μfemales; H1: μmales > μfemales

Central Limit Theorem

For any population with mean μ and standard deviation σ, the distribution of sample means for sample size n (SAMPLING DISTRIBUTION) will have a mean of μ and a standard deviation of σ/√n and will approach a normal distribution as n approaches infinity. CLT: - 1) μM = μ - 2) σM = σ/√n - 3) the distribution will approach NORMALITY as n approaches infinity
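Parts 1 and 2 of the CLT can be checked exactly on a tiny population by listing every possible sample. The sketch below assumes a population of the six faces of a fair die and samples of n = 2 drawn with replacement; both choices are illustrative.

```python
from itertools import product
from math import sqrt
from statistics import mean, pstdev

population = [1, 2, 3, 4, 5, 6]   # assumed population: one fair die
n = 2

# the distribution of sample means: the mean of every possible sample of size n
sample_means = [mean(sample) for sample in product(population, repeat=n)]

mu, sigma = mean(population), pstdev(population)
mu_M, sigma_M = mean(sample_means), pstdev(sample_means)

print(mu_M, mu)                    # expected value of M equals mu
print(sigma_M, sigma / sqrt(n))    # standard error equals sigma / sqrt(n)
```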

Addition law of probability

If A and B are mutually exclusive, then Pr{A or B} = Pr{A} + Pr{B} If A and B are not mutually exclusive, then Pr{A or B} = Pr{A} + Pr{B} - Pr{A and B}
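Both cases can be verified by counting outcomes in a standard 52-card deck, which is an example chosen here rather than one from the notes. The helper names are hypothetical.

```python
from fractions import Fraction

# ranks 1..13 (12 = queen, 13 = king) in four suits
deck = [(rank, suit) for rank in range(1, 14) for suit in "SHDC"]

def pr(event):
    """Pr{event} = outcomes consistent with the event / total possible outcomes."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_king(card):
    return card[0] == 13

def is_queen(card):
    return card[0] == 12

def is_heart(card):
    return card[1] == "H"

# mutually exclusive: a card cannot be both a king and a queen
pr_king_or_queen = pr(lambda c: is_king(c) or is_queen(c))

# not mutually exclusive: the king of hearts is counted once, so subtract Pr{A and B}
pr_king_or_heart = pr(lambda c: is_king(c) or is_heart(c))
pr_king_and_heart = pr(lambda c: is_king(c) and is_heart(c))
print(pr_king_or_queen, pr_king_or_heart)
```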

Regression effect

In virtually all test-retest situations, the bottom group on the 1st test will, on average, show some improvement on the 2nd test, and the top group will, on average, fall back. That is, extreme scores will become less extreme on the 2nd testing.

Assumptions of ANOVA

- Interval or ratio data - DV is normally distributed - Independent scores - Homogeneity of variance (test with the Fmax test). ANOVA is a fairly robust test, however, if the sample is relatively large (n > 30).

Why is the sum of the deviation scores always 0?

The mean is the balance point of the distribution: the sum of the deviations above the mean is equal in size to the sum of the deviations below the mean, so they cancel.

dfwithin

N-k

Frequency distribution and grouped frequency distribution

Organized tabulation that displays the number of individuals who obtained each value for a variable - List all possible values from the lowest (bottom) to the highest (top) obtained value in column X - Interval width: w = 1 (regular frequency distribution), w > 1 (grouped frequency distribution); width cannot vary from one interval to another - Grouped frequency distribution (w > 1) summarizes data (you lose information about the frequency of individual values) - How large should w be? It depends - f column provides the tally or frequency for each value of X - Σf = N - cf = cumulative frequency - cprop = cf/N = cumulative proportion - c% = cf/N x 100 = cumulative percent - Use to locate an individual's score relative to others in the distribution
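A short sketch of a regular (w = 1) frequency distribution with the f, cf, cprop, and c% columns. The scores are made up, and the function name is hypothetical.

```python
from collections import Counter

def frequency_table(scores):
    """Build X, f, cf, cprop, and c% rows for integer scores (interval width w = 1)."""
    counts = Counter(scores)
    n = len(scores)
    rows, cf = [], 0
    for x in range(min(scores), max(scores) + 1):   # every possible value, lowest first
        f = counts[x]                               # Counter gives 0 for unobserved values
        cf += f
        rows.append({"X": x, "f": f, "cf": cf, "cprop": cf / n, "c%": 100 * cf / n})
    return rows

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 5]   # hypothetical data
table = frequency_table(scores)
for row in reversed(table):               # print with the highest value at the top
    print(row)
```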

Pearson Product-Moment Correlation

"Moment" refers to the fact that it uses deviation scores (z-scores) - use N-1 for samples and just N for populations
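One common way to write the sample correlation with z-scores is r = Σ(zX·zY)/(n-1), where the z-scores use the sample standard deviation. The sketch below assumes that form; the data are made up.

```python
from math import sqrt
from statistics import mean, stdev

def pearson_r(xs, ys):
    """Sample Pearson r as the sum of z-score products divided by n - 1."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)            # sample sd (n - 1 in the denominator)
    zx = [(x - mx) / sx for x in xs]
    zy = [(y - my) / sy for y in ys]
    return sum(a * b for a, b in zip(zx, zy)) / (n - 1)

xs = [1, 2, 3, 4, 5]   # hypothetical scores
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 3))
```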

Percentile rank limitations

Percentile rank: a way to measure or interpret an individual's score based on the LOCATION of their score within a data set - Limitations to using PR for that purpose: - appropriateness of the comparison group (make-up of the sample) - shape of the distribution - based on the location of the score rather than on the magnitude of the score - Alternative measure is z-scores, which are based on the mean and the standard deviation of the sample.

Probability of an event

Pr{event} = # of outcomes consistent with event/ total # of possible outcomes

Quasi-experimental design

Quasi-experimental research shares similarities with the traditional experimental design or randomized controlled trial, but it specifically lacks the element of random assignment to treatment or control. E.g., child wake-up time by academic performance --> wake-up time is not randomly assigned; children are sorted into groups (early riser and late riser) --> therefore it is a quasi-IV; the same applies to gender or before/after comparisons

Range

Range is a crude measure of variability - Only relies on the two endpoints - Fails to consider all the scores - Highly affected by extreme scores - Used with all scales except nominal

Experiment

Research question --> researcher's hypothesis --> conduct a study (collect data) and analyze the data to test the hypothesis. - What are my statistical hypotheses (H0 and H1)? - What is alpha (α)? - Collect data. - Assess the statistical significance of the results by answering the key question: what is the probability of obtaining results as rare as these if H0 is true? That probability is p. - Compare p to α. - Make a decision about H0: reject or fail to reject. - Draw a probabilistic conclusion about the research hypothesis and question.

Standard error of M (σM)

The standard error of a statistic; for M, it is the standard deviation of the distribution of sample means. It provides a measure of how much distance is expected on average between a sample mean and the population mean

General Linear Model

Score = base population mean + effect due to treatment condition + sampling error - Xij = μ + αj + εij

Factors affecting power

- Size of the difference between the means (want a big difference) - Variability in the population (want less variability) - Sample size (want a big sample) - Alpha (a bigger alpha gives more power) - Kind of statistical hypotheses, directional (1-tail) vs. nondirectional (2-tail) (directional gives more power) - Design, independent vs. dependent groups (dependent gives more power) - Parametric vs. nonparametric tests (parametric tests on normally distributed data give more power)
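Several of these factors can be seen numerically for a one-sample z-test. This is a sketch under assumed values (μ0 = 100, true mean 105, σ = 15); the function name is hypothetical, and the one-tailed branch assumes H1: μ > μ0.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu0, mu1, sigma, n, alpha=0.05, two_tailed=True):
    """Probability of rejecting H0: mu = mu0 when the true mean is mu1."""
    z = NormalDist()
    shift = (mu1 - mu0) / (sigma / sqrt(n))   # true effect in standard-error units
    if two_tailed:
        z_crit = z.inv_cdf(1 - alpha / 2)
        return (1 - z.cdf(z_crit - shift)) + z.cdf(-z_crit - shift)
    z_crit = z.inv_cdf(1 - alpha)             # upper-tail test only (assumption)
    return 1 - z.cdf(z_crit - shift)

base = z_test_power(100, 105, 15, 25)
print("baseline         ", round(base, 3))
print("bigger difference", round(z_test_power(100, 110, 15, 25), 3))
print("less variability ", round(z_test_power(100, 105, 10, 25), 3))
print("bigger sample    ", round(z_test_power(100, 105, 15, 100), 3))
print("bigger alpha     ", round(z_test_power(100, 105, 15, 25, alpha=0.10), 3))
print("one-tailed       ", round(z_test_power(100, 105, 15, 25, two_tailed=False), 3))
```

Each variation should print a larger value than the baseline.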

Standardization

Standardization does not change the shape of the distribution. If you standardize a normal distribution, you have a standard normal curve. Standardization is not the same as normalization

Break down of tests (true score + error)

Test score = True score + Error - True score = true measure on the hypothetical construct (e.g., true knowledge about stats); true scores will vary due to individual differences in amount of knowledge (one source of variability) - Error provides additional sources of variability: measurement error, random error, other individual differences (e.g., test-taking skills) - Between-subject variability (or systematic variability) is another source of variability

Expected value of M

The mean of the distribution of sample means is equal to the mean of the population of scores, μ - M is therefore an unbiased estimate of μ

Standard error of M

The standard deviation of the distribution of sample means, σM. Provides a measure of how much distance is expected on average between a sample mean (M) and the population mean (μ). - (when n = 1, σM = σ) - measure of reliability

Variance Sum Law

The variance of the difference between two independent variables equals the sum of their variances.
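The law can be checked exactly by enumerating two independent rolls of a fair die, an example assumed here for illustration. Fractions keep the arithmetic exact.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Population variance computed with exact fractions."""
    values = [Fraction(v) for v in values]
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

die = range(1, 7)
pairs = list(product(die, die))   # all 36 equally likely (X, Y) outcomes; X and Y are independent

var_x = variance([x for x, _ in pairs])
var_y = variance([y for _, y in pairs])
var_diff = variance([x - y for x, y in pairs])
print(var_x, var_y, var_diff)
```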

Regression fallacy

Thinking the regression effect must be due to something important, something nonstatistical, something other than simply the regression effect. Occurs when people use a nonstatistical theory to explain the regression effect. It needs no additional explanation.

Standard deviation (s) or (σ)

To find: - add up all the scores in a set - get the mean by dividing by the number of scores - then subtract the mean from each X value to get deviation scores - then square each of the deviation scores - then add them up (this is SS) - then divide by N for a population (or n-1 for a sample) --> (this equals the variance) - then take the square root = STANDARD DEVIATION - aka √(SS/(n-1)) for a sample or √(SS/N) for a population
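The steps above, sketched in Python on made-up scores and compared with the standard library's `statistics.pstdev` (population) and `statistics.stdev` (sample).

```python
from math import sqrt
from statistics import pstdev, stdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

mean = sum(scores) / len(scores)              # add up the scores, divide by their count
deviations = [x - mean for x in scores]       # subtract the mean from each X
ss = sum(d ** 2 for d in deviations)          # square the deviations and add them up (SS)

population_sd = sqrt(ss / len(scores))        # divide by N, then take the square root
sample_sd = sqrt(ss / (len(scores) - 1))      # divide by n - 1 for a sample

print(sum(deviations))            # deviation scores always sum to 0
print(ss, population_sd, sample_sd)
```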

Bivariate statistics

Use z-scores of two dependent measures to assess the strength and direction of the relationship between 2 variables: calculate the correlation coefficient. Use z-scores to help us make predictions (our best guess) about an individual's score on one dependent measure (Y) using his/her score on another dependent measure (X): **linear regression**

What score is at X percentile?

What score is at the 35th percentile, i.e., what score has a percentile rank of 35% (c%)? (N = 36) - Step 1: Find the location of the score by converting c% to cf: cf/36 x 100% = 35, so cf/36 = .35 and cf = 36(.35) = 12.6; in general, cf = N(c%) - Step 2: Calculate the score: average the score at cf = 12 and the score at cf = 13: (3 + 4)/2 = 3.5, so X = 3.5 - Note: You don't need to interpolate - **SCORE AT 50TH PERCENTILE IS NOT THE MEDIAN**

What's the best measure of central tendency for a particular data set?

What's the goal? - Shape of the distribution (mode) - Representative or typical score (next Q). What is the scale of measurement? - Nominal (mode) - Ordinal (median) - Interval/ratio (next Qs). Skewed (median) or symmetrical (mean)? - Outliers (median)? - Undetermined values or open-ended values (median)?

between-subjects design/ independent-measures research design

a research design that uses a separate group of participants for each treatment condition (or for each population)

critical region

composed of the extreme sample values that are very unlikely (as defined by the alpha level) to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected. - for alpha = .05, 2-tailed, the critical regions are marked off by z = ±1.96 (.025 on either side) - for alpha = .01, 2-tailed, the critical regions are marked off by z = ±2.58 (.005 on either side)
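The boundaries quoted above can be reproduced from the standard normal distribution with Python's `statistics.NormalDist`. The function name is hypothetical.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """z value that marks off the critical region for a given alpha."""
    tail_area = alpha / 2 if two_tailed else alpha   # split alpha across both tails if 2-tailed
    return NormalDist().inv_cdf(1 - tail_area)

print(round(critical_z(0.05), 2))                     # 2-tailed, .025 in each tail
print(round(critical_z(0.01), 2))                     # 2-tailed, .005 in each tail
print(round(critical_z(0.05, two_tailed=False), 3))   # 1-tailed, all of alpha in one tail
```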

degrees of freedom

df = n-1 describes the number of scores in a sample that are independent and free to vary. Because the sample mean places a restriction on the value of one score in the sample, there are n-1 degrees of freedom for a sample with n scores. (The sample mean acts like a RESTRICTION.)

df for the independent measures t-statistic (Which one forreal??)

df = n1+n2 - 2

estimated Cohen's d

for t-tests: mean difference / standard deviation = (M - μ) / s

significant/ statistically significant

a result is significant if it is very unlikely to occur when the null hypothesis is true. That is, the result is sufficient to reject the null hypothesis. Thus, a treatment has a significant effect if the decision from the hypothesis test is to reject H0. - 2 WAYS TO DETERMINE SIGNIFICANCE: - Compare the p value to alpha (the significance level) - Or look at the critical regions: if the test statistic lies beyond a critical value, the result is significant

effect size

intended to provide a measurement of the absolute magnitude of a treatment effect, independent of the size of the sample(s) being used - use Cohen's d

dfbetween

k-1

dftotal

N-1 (N = total number of scores across all groups)

Type 1 error

occurs when a researcher rejects a null hypothesis that is actually true. In a typical research situation, a Type 1 error means that the researcher concludes that a treatment does have an effect when, in fact, it has no effect. - Happens by chance or if the sample is unusual - Can lead to a false report - Pr{Type I error} = alpha (α): if alpha = .05, Pr{Type I error} = .05; if alpha = .001, Pr{Type I error} = .001 - As alpha decreases, Pr{Type I error} decreases.

r^2

percentage of variability in the DV accounted for by the IV

r squared (omega squared/ ω squared)

percentage of variance accounted for by the treatment - r² = t² / (t² + df) - r² < 0.09 --> small effect - r² between 0.09 and 0.25 --> medium effect - r² > 0.25 --> large effect - the sample standard deviation (s) influences both Cohen's d and r²; the sample size (n) has little or no influence on Cohen's d and only a small influence on r² (through df)
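A small sketch of the two effect-size measures for t-tests in these notes: the t²/(t² + df) formula on this card and estimated Cohen's d, (M - μ)/s. All input values are hypothetical.

```python
def cohens_d(sample_mean, mu, s):
    """Estimated Cohen's d: mean difference in standard deviation units."""
    return (sample_mean - mu) / s

def variance_accounted_for(t, df):
    """Proportion of variance accounted for by the treatment: t^2 / (t^2 + df)."""
    return t ** 2 / (t ** 2 + df)

d = cohens_d(sample_mean=108, mu=100, s=16)      # assumed values
r2 = variance_accounted_for(t=3.0, df=16)        # assumed values
print(d, r2)
```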

alpha level/ level of significance

probability value that is used to define the concept of "very unlikely" in a hypothesis test (probability that test will lead to a type 1 error)

independent measures (between-subjects) t statistic

(sample mean difference - population mean difference) / estimated standard error - t = [(M1 - M2) - (μ1 - μ2)] / s(M1-M2) - (M1 - M2) comes from the sample data - (μ1 - μ2) comes from the null hypothesis - s(M1-M2) comes from combining the standard errors of the two samples, made unbiased by using the pooled variance
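A sketch of the calculation with the pooled variance, s²p = (SS1 + SS2)/(df1 + df2), and df = n1 + n2 - 2. The two groups of scores are made up, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean

def independent_t(sample1, sample2, mu_diff=0):
    """Independent-measures t; mu_diff is (mu1 - mu2) under H0, usually 0."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    ss1 = sum((x - m1) ** 2 for x in sample1)
    ss2 = sum((x - m2) ** 2 for x in sample2)
    df = n1 + n2 - 2
    pooled_variance = (ss1 + ss2) / df
    # estimated standard error of the mean difference, s(M1-M2)
    standard_error = sqrt(pooled_variance / n1 + pooled_variance / n2)
    t = ((m1 - m2) - mu_diff) / standard_error
    return t, df

group1 = [10, 12, 14, 16, 18]   # hypothetical scores
group2 = [6, 8, 10, 12, 14]
t, df = independent_t(group1, group2)
print(t, df)
```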

SSbet

see page

Sample space

set of all possible outcomes for an experiment

Grand mean

sum of all the scores / N (ΣX / N)

SSwithin

sum of each SS in group

SS

sum of squared deviations

Final formula for independent- measures t statistic

t = [(M1 - M2) - (μ1 - μ2)] / s(M1-M2)

Distribution of sample means

the collection of sample means for all the possible random samples of a particular size (n) that can be obtained from a population

t distribution

the complete set of t values computed for every possible random sample for a specific sample size (n) or a specific degrees of freedom (df). The t distribution approximates the shape of a normal distribution. - shape = flatter and more spread out than the z-score distribution (because estimating σ with s adds variability)

Sampling distribution

the distribution of statistics obtained by selecting all of the possible samples of a specific size from a population

Sampling Error

the natural discrepancy, or amount of error, between a sample statistic and a corresponding population parameter

power of a statistical test

the probability that the test will correctly reject a false null hypothesis. That is, power is the probability that the test will identify a treatment effect if one really exists. - power = 1 - β, where β is the probability of a Type 2 error (failing to reject a false H0)

Regression equation

Ŷ = MY + r(sY)((X - MX) / sX)
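A sketch of using this equation to predict Y from X. It computes r from z-scores with n - 1 in the denominator, which is an assumption about the intended formula; the data and function name are hypothetical.

```python
from statistics import mean, stdev

def regression_predictor(xs, ys):
    """Return a function that predicts Y from X using means, sample sds, and r."""
    n = len(xs)
    mx, my = mean(xs), mean(ys)
    sx, sy = stdev(xs), stdev(ys)
    # Pearson r as the sum of z-score products divided by n - 1
    r = sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)

    def predict(x):
        z_x = (x - mx) / sx          # z-score of the X value
        return my + r * sy * z_x     # predicted Y

    return predict

xs = [1, 2, 3, 4, 5]   # hypothetical scores
ys = [2, 4, 5, 4, 5]
predict = regression_predictor(xs, ys)
print(predict(3), predict(5))
```

Predicting at X = MX returns MY, because the z-score of the mean is 0.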

t statistic

used to test hypotheses about an unknown population mean, μ, when the value of σ is unknown. The formula for the t-statistic has the same structure as the z-score formula, except that the t-statistic uses the estimated standard error in the denominator. - t = (M - μ) / sM
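A minimal sketch of the one-sample t statistic, with the estimated standard error sM = s/√n and df = n - 1. The IQ scores are made up; μ = 100 is the H0 value from the IQ example.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test of H0: mu = mu0."""
    n = len(sample)
    m = mean(sample)
    s = stdev(sample)          # sample sd, n - 1 in the denominator
    s_m = s / sqrt(n)          # estimated standard error of M
    return (m - mu0) / s_m, n - 1

sample = [104, 108, 112, 100, 116, 108, 104, 112, 108]   # hypothetical scores
t, df = one_sample_t(sample, mu0=100)
print(round(t, 3), df)
```

The observed t is then compared with the critical t for df = n - 1 at the chosen alpha.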

directional hypothesis test/ one-tailed test

when the statistical hypotheses (H0 and H1) specify either an increase or a decrease in the population mean. That is, they make a statement about the direction of the effect.

