Stats Placement Exam
Range
Most common definition: range = Xmax - Xmin. There are other ways to compute it.
Hypothesis testing w Correlations
H0: ρ = 0 (no correlation in the population) H1: ρ ≠ 0 (there is a real correlation) You can test whether the correlation is significant with a t test, t = r√((n - 2)/(1 - r²)) with df = n - 2, or with an F ratio. Complete t statistic example on page 466
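A minimal sketch of this test in Python, assuming SciPy is available; the x/y data are invented, and scipy.stats.pearsonr reports the same two-tailed p-value the hand-computed t would give:

```python
import numpy as np
from scipy import stats

# Invented paired scores for illustration
x = np.array([1, 2, 3, 4, 5, 6, 7, 8])
y = np.array([2, 1, 4, 3, 6, 5, 8, 7])

r, p_value = stats.pearsonr(x, y)        # r and its two-tailed p-value for H0: rho = 0
n = len(x)
t = r * np.sqrt((n - 2) / (1 - r**2))    # same test computed by hand, df = n - 2
print(r, t, p_value)
```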
Sample Variance
How to compute: - Find each deviation from the mean - Square each deviation - Sum the squared deviations (SS) - Divide the sum by n - 1
To change z score back into X (equation)
X = μ + zσ
Test for association with chi square is the test of independence
χ² = ∑(fo - fe)² / fe - Formula measures the discrepancy between the data (fo values) and the hypothesis (fe values) - Large discrepancy produces a large value for chi-square and indicates H0 should be rejected - First have to determine degrees of freedom - Expected frequencies are computed for each cell from the row and column totals: fe = (row total)(column total)/n - df = (R-1)(C-1)
Pos and Neg Z Scores
z = +1.00 → is located above the mean by 1 standard deviation z = -1.50 → is located below the mean by 1½ standard deviations
Z score for sample (equation)
z = (X - M) / s
Population z score (equation)
z = (X - μ) / σ
ANOVA Notation System
k = number of treatment conditions n = number of scores in each treatment N = total number of scores in entire study T = sum of scores for each treatment condition G = sum of all scores in the research study (G = ΣT) If she gives you a chart and you have to find F, do MSbetween/MSwithin
M
mean of a sample
ρ (italic rho)
rho, linear correlation coefficient of a population
SD
standard deviation
Central limit theorem
statistical theorem stating that, for samples of sufficiently large size drawn from a population with finite variance, the distribution of sample means is approximately normal, with a mean equal to the population mean and a standard deviation of σ/√n
Using a z score to describe the exact location of any specific sample mean within the distribution of sample means:
z = (M - μM) / σM, where μM = μ and σM = σ/√n
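A small worked sketch of this formula; the helper name and the numbers are made up:

```python
import math

def z_for_sample_mean(M, mu, sigma, n):
    """z = (M - mu) / sigma_M, locating M within the distribution of sample means."""
    sigma_M = sigma / math.sqrt(n)   # standard error of the mean, sigma / sqrt(n)
    return (M - mu) / sigma_M

# e.g., M = 53, mu = 50, sigma = 10, n = 25: sigma_M = 2, so z = 1.50
print(z_for_sample_mean(53, 50, 10, 25))
```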
Dependent Variable
- (DV) can be thought of as the effect. - It is the measured outcome or behavior, which the researcher then assumes is attributable to the treatment. - The variable that is observed to assess the effect of the treatment (tx)
Type I error
- A Type I error occurs when a researcher rejects a null hypothesis that is actually true. - In a typical research situation, a Type I error means that the researcher concludes that a treatment does have an effect when, in fact, it has no effect. - Serious because the researcher, having rejected the null hypothesis and believing that the treatment has a real effect, is likely to report or even publish the research results. - A Type I error, however, means that this is a false report.
Type II error
- A Type II error occurs when a researcher fails to reject a null hypothesis that is really false. - In a typical research situation, a Type II error means that the hypothesis test has failed to detect a real treatment effect.
Box Plot
- A standardized way of displaying the distribution of data based on the five-number summary (minimum, first quartile, median, third quartile, and maximum) - Reveals the main features of a batch of data, i.e. how the data are spread out. - Any boxplot is a graph of the five-number summary: the minimum score, first quartile (Q1, the median of the lower half of all scores), the median, third quartile (Q3, the median of the upper half of all scores), and the maximum score, with suspected outliers plotted individually. - The boxplot consists of a rectangular box, which represents the middle half of all scores (between Q1 and Q3). - Approximately one-fourth of the values should fall between the minimum and Q1, and approximately one-fourth should fall between Q3 and the maximum. - A line in the box marks the median. - Lines called whiskers extend from the box out to the minimum and maximum scores that are not possible outliers. - If an observation falls more than 1.5x IQR outside of the box, it is plotted individually as an outlier.
Inferential Statistics
- A statistic that is used to draw a conclusion about the characteristics of a larger group from which the sample was drawn. - Consist of techniques that allow us to study samples and then make generalizations about the populations from which they were selected. -Larger group is the population, the characteristic of the population is a parameter
Descriptive Statistics
- A statistic that says something about, or describes, the group of subjects in the study. - Stats procedures used to summarize, organize, and simplify data. - Organize/summarize raw scores in a form that is more manageable. -E.g., We calculated the mean and SD of a group of subjects' score. However, the only subjects described have been those actually observed.
Effect size
- A statistical measure of the size of an effect in a population, which allows researchers to describe how far scores shifted in the population, or the percent of variance that can be explained by a given variable - as the effect size increases, the probability of rejecting H0 also increases, which means that the power of the test increases. - measures of effect size such as Cohen's d and measures of power both provide an indication of the strength or magnitude of a treatment effect
Non-experimental
- A study when a researcher cannot control, manipulate or alter the predictor variable or subjects, but instead, relies on interpretation, observation or interactions to come to a conclusion. - Typically, this means the non-experimental researcher must rely on correlations, surveys or case studies, and cannot demonstrate a true cause-and-effect relationship.
Statistic
- A value, usually numeric, that describes a sample. - Usually derived from measurements of individuals in the sample. -This is a summary number (e.g., an average) for a sample. -E.g., Our statistical average might be 63 in.
Tukey's HSD Test (posthoc)
- Allows you to compute a single value that determines the minimum difference between treatment means that is necessary for significance - the honestly significant difference (HSD) - If the mean difference exceeds Tukey's HSD, you conclude there is a significant difference between the treatments - Otherwise you can't say they are different
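A hedged sketch of the pairwise Tukey comparisons, assuming a recent SciPy (1.8+) that provides scipy.stats.tukey_hsd; the three groups below are invented:

```python
from scipy import stats

# Invented scores for three treatment conditions
g1 = [24, 27, 22, 25, 26]
g2 = [30, 31, 28, 33, 29]
g3 = [23, 25, 24, 26, 22]

res = stats.tukey_hsd(g1, g2, g3)
print(res)  # pairwise mean differences with adjusted p-values;
            # a pair differs significantly when its p-value is below alpha
```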
Percentages
- Another way of expressing a proportion. - A percentage is equal to the proportion times 100. - Equation: percentage = p(100) = (f/N)(100)
Sampling Error
- Any deviation due only to the particular cases falling within the sample. A deviation from expectation that is due to mere chance. - The naturally occurring discrepancy, or error, that exists between a sample statistic and the corresponding population parameter
Pie Charts
- Are best used with categorical data to help us see what percentage of the whole each category constitutes. -Pie charts require all categories to be included in a graph. -Each graph always represents the whole. -One of the reasons why bar graphs are more flexible than pie charts is the fact that bar graphs compare selected categories, whereas pie charts must either compare all categories or none.
Histograms
- Are yet another graphic way of presenting data to show the distribution of the observations. - It is one of the most common forms of graphical presentation of a frequency distribution. - A histogram is constructed by representing the measurements or observations that are grouped on a horizontal scale, the interval frequencies on a vertical scale, and drawing rectangles whose bases equal the class intervals and whose heights are determined by the corresponding class frequencies. - To make a histogram, we break the range of values into intervals of equal length. - We first count and then display the number of observations in each interval. - Bars represent the frequency of observations in the intervals such that the higher the bar is, the higher the frequency. - As mentioned before, the standard format of a histogram usually involves a vertical scale that represents the frequencies or the relative frequencies and a horizontal scale that represents the individual intervals. - Histograms show us shapes of distributions of the observations. - If properly constructed (not too few or too many intervals), histograms allow us to determine whether the shape of our data distribution is bell-curved, right-skewed, left-skewed, or neither, based on the overall heights of the bars. - Histograms are also useful in identifying possible outliers. - If a histogram is symmetric around some value, that value equals the average. - Half the area under the histogram lies to the left of that value, and half to the right.
Limits (Real Limits)
- Boundaries of intervals for scores that are represented on a continuous number line - Applies to any continuous variable, even when the #s are not whole #s - Value of the real limit is set to half the scale's unit. Always halfway between adjacent categories! - E.g., If intelligence is measured to the nearest one point and if Claude scores 116, the real limits of Claude's score are 116 ± 0.5, which is 115.5 and 116.5 (there are upper and lower real limits)
Cohen's d
- Cohen's d is a measure of effect size: Cohen's d = mean difference / standard deviation = (μ treatment - μ no treatment) / σ - In most situations the population values are not known, so you substitute the corresponding sample values: estimated d = mean difference / sample standard deviation = (M - μ) / s - d = 0.2 small, 0.5 medium, 0.8 large
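A minimal sketch of the estimated d; the helper name, sample, and μ are invented:

```python
import numpy as np

def estimated_cohens_d(sample, mu):
    """Estimated d = (M - mu) / s, with s the sample standard deviation."""
    M = np.mean(sample)
    s = np.std(sample, ddof=1)   # ddof=1 divides SS by n - 1
    return (M - mu) / s

scores = np.array([54, 51, 55, 49, 56, 53, 52])   # invented treated sample
print(estimated_cohens_d(scores, mu=50))          # benchmark: 0.2 small, 0.5 medium, 0.8 large
```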
Interpreting Pearson Correlation
- Correlation describes a relationship but not why it exists - Value can be affected by the range of scores - If you are looking at IQ and creativity but only in college students, that's a really restricted range of scores - can't generalize beyond the range you have - One or two extreme scores can have a dramatic effect on the value of the correlation - When judging how good a relationship is, don't think of it as a percentage: r = .5 does not mean the variables are 50% related - r² is called the coefficient of determination: it measures the proportion of variability in one variable that can be determined from the relationship with the other variable - A correlation of 0.80 means that r² = 0.64, or that 64% of the variability in Y scores can be predicted from the relationship with X
Confidence Intervals
- Describe the size of a treatment effect by computing an estimate of the population mean after treatment - Estimating an unknown population mean involves constructing a confidence interval - A confidence interval consists of an interval of values around the sample mean - μ = M ± t·sM - To obtain the value of t, pick the confidence level you want (typically 95%), then look at df and go to the t table - Conclude that your result is between the two calculated values, and you can be 95% sure that the assumption is correct because 95% of all the possible t values are located in that interval - If you increase the confidence in your estimate, you increase the width of your interval - A larger level of confidence produces a larger t value and a wider interval - The bigger the sample size, the smaller the interval
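A minimal sketch of a 95% confidence interval for μ, assuming SciPy; the sample is invented:

```python
import numpy as np
from scipy import stats

sample = np.array([12, 15, 11, 14, 13, 16, 12, 15])   # invented scores
M = sample.mean()
s_M = stats.sem(sample)            # estimated standard error, s / sqrt(n)
df = len(sample) - 1
t_crit = stats.t.ppf(0.975, df)    # t that bounds the middle 95% of the t distribution
print(M - t_crit * s_M, M + t_crit * s_M)   # mu = M +/- t * s_M
```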
ANOVA test statistic
- F = variance (differences) between sample means / variance (differences) expected by chance (with no treatment effect) - Similar in structure to t except based on variances instead of sample mean differences (a large value for the test statistic is evidence that the sample mean differences in the numerator are bigger than expected if there is no treatment effect, the denominator)
Distribution of f ratios
- F values are always positive numbers - When H0 is true, the numerator and denominator are measuring the same variance, so the ratio should be near 1 - In the F table, find the numerator df and denominator df, then look at your alpha level and find the critical F - If your F is bigger than that, then your finding is significant
Sample Mean
- For a sample, the computation is exactly the same, but the formula for the sample mean uses symbols (M and n) that signify sample values - M = ΣX/n
Left-Skewed Distribution
- Has a long left tail. - Left-skewed distributions are also called negatively-skewed distributions. - That's because there is a long tail in the negative direction on the number line. -The mean is also to the left of the peak
ANOVA (analysis of variance)
- Hypothesis-testing procedure that is used to evaluate mean differences between two or more treatments or populations. - Uses sample data as the basis for drawing general conclusions about populations - Decide between two interpretations - No differences between the populations: observed differences are caused by random sampling error that differentiates one sample from another - The populations or treatments really do have different means, and they are responsible for causing systematic differences between the sample means - In an ANOVA the independent variable or quasi-independent variable is called a factor (telephone conditions) - Individual conditions or values that make up a factor are called the levels of the factor (driving with no phone, talking on hands-free, talking on hand-held phone) - ANOVA can be used with independent-measures or repeated-measures designs - Hypotheses: H0: μ1 = μ2 = μ3 H1: there is at least one mean difference among the populations
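A minimal sketch of a single-factor ANOVA on the telephone example, assuming SciPy; the error counts are invented:

```python
from scipy import stats

# Invented driving-error scores for the three levels of the factor
no_phone   = [4, 3, 6, 3, 4]
hands_free = [0, 1, 3, 1, 0]
hand_held  = [1, 2, 2, 0, 0]

F, p = stats.f_oneway(no_phone, hands_free, hand_held)
print(F, p)   # F = MS_between / MS_within; reject H0 when p < alpha
```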
Probability Equation
- If the possible outcomes are identified as A, B, C, D, and so on, then Probability of A = # of outcomes classified as A ÷ total number of possible outcomes Example with cards: - Probability of getting the king of hearts: 1/52 - Probability of getting an ace: 4/52
ANOVA interpretation
- In ANOVA you calculate between-treatments variance and within-treatments variance - Figure 12.3 if confused - An F-ratio near 1 indicates differences between treatments are random - Conclude no evidence to suggest an effect - When there is an effect, the combination of differences in the numerator should be larger than the random differences in the denominator, so you should have an F value larger than 1 - The larger the F value, the more evidence of a difference between treatments - The denominator of the F-ratio is called the error term - it measures the random differences expected within the treatment conditions
Greek letters v. alphabet
- In general, we use Greek letters to identify characteristics of a population (parameters) and letters of our own alphabet to stand for sample values (statistics)
ANOVA Assumptions
- Independence of cases - Normality - the distributions of the residuals are normal. - Equality (or "homogeneity") of variances
Chi square assumptions (for indep?)
- Independence of observations - Size of expected frequencies - Should not be performed when the expected frequency of any cell is less than 5 - When it is, you get a large chi-square value even when you might not be looking at something meaningful
Ratio Scale
- Like an interval scale, in that the distance between adjacent scores is equal throughout the distribution. -However, unlike an interval scale, in a ratio scale there is an absolute zero point. - There is a point at which a person does not have any of the measured traits. -Typically applies to measures in the physical sciences. -E.g., Height, weight, distance, time, etc. If one "lands" on zero, there is nothing
Intro to linear equations
- Line makes the relationship easier to see - Identifies the central tendency of the relationship - Line can be used for prediction
Characteristics of regression line
- Line passes through point defined by mean for X and mean for Y -Sign of the correlation is the same as the sign of the slope of the regression line
Pearson correlation
- Measures the degree and the direction of the linear relationship between two variables - Identified by the letter r - If they are perfect: x and y covary together exactly
Pooled Variance (indep t test)
- Method to correct the bias in standard error - Combine the two sample variances into a single value called the pooled variance - Obtained by averaging the two sample variances using a procedure that allows the bigger sample to carry more weight in determining the final value - Pooled variance: sp² = (SS1 + SS2) / (df1 + df2)
Experimental Method
- Most clearly shows cause-and-effect because it manipulates a single variable in order to clearly show its effect. -Almost always have two distinct variables: 1) First an independent variable is manipulated by an experimenter to exist in at least two levels (usually "none" and "some"). 2) Then, the experimenter measures the second variable, the dependent variable.
Repeated measures t test pros and cons
- Number of subjects: a repeated-measures design requires fewer subjects than an independent-measures design - Study changes over time: measures people at one time and then returns to measure them later - Individual differences: reduces or eliminates problems caused by these differences - Reduces variance by removing individual differences, which increases the chance of finding a significant result - Disadvantages: - Allows for factors other than the treatment effect to cause someone's score to change from one treatment to the next - Time-related factors can cause problems - It is also possible that participation in the first treatment influences performance in the second treatment - order effects - Counterbalancing: counterbalance the order of presentation of treatments to deal with time-related factors and order effects
Assumptions of repeated t test
- Observations in each treatment condition must be independent - Assumption of independence refers to scores within each treatment - Population distribution of difference scores (D values) must be normal. -Normality assumption is not cause for concern unless sample size is relatively small.
Assumptions of indep t test
- Observations within each sample must be independent - The two populations from which they are selected must be normal - To justify using the pooled variance, the two populations from which the samples are selected must have equal variances (homogeneity of variance) - If the two sample variances are estimating different population variances, then the average is meaningless - Most important when there is a big discrepancy in sample size - with equal sample sizes this is important but not critical - If one sample variance is more than 3-4x bigger than the other, you should be concerned
Single factor designs
- One independent variable - Simplest form of the ANOVA - Factor 1: Therapy (3 levels: before, during, after) - ANOVA would compare these groups
Scheffe Test (posthoc)
- One of the safest post hoc tests - Uses an F ratio to evaluate significance of difference between treatment conditions
Frequency Distribution
- Organized tabulation of the # of individuals located in each category on the scale of measurement - High to low scores; groups individuals with same scores - Usually a table or graph -Contains 2 elements: 1) A set of categories that make up the original measurement scale 2) A record of the frequency, or # of individuals in each category
Effect size
- Outcome of hypothesis test should be accompanied by a measure of effect size Phi-Coefficient - Measure of correlation for data consisting of two dichotomous variables - Measures strength of relationship rather than significance so it gives measure of effect size
Post-hoc tests
- Overall F tells you something is different but doesn't tell you what is different - Post hoc tests are done after ANOVA to determine which mean differences are significant and which are not - Making pairwise comparisons -e.g., Tukey's HSD test and Scheffe test
Standard Error (SE)
- Plays an important role in inferential statistics - Computed by dividing the SD by the square root of n - The standard error is always less than or equal to the SD - a sample is not expected to give a perfectly accurate reflection of its population. - In particular, there will be some error or discrepancy between a sample statistic and the corresponding population parameter. - In this chapter, we have observed that a sample mean is not exactly equal to the population mean. The standard error of M specifies how much difference is expected on average between the mean for a sample and the mean for the population. - The natural differences that exist between samples and populations introduce a degree of uncertainty and error into all inferential processes. - Specifically, there is always a margin of error that must be considered whenever a researcher uses a sample mean as the basis for drawing a conclusion about a population mean.
Variability
- Provides a quantitative measure of the differences between scores in a distribution and describes the degree to which the scores are spread out or clustered together - Defined in terms of distance -Measures how well an individual score or group of scores represents the entire distribution - In statistics, our goal is to measure the amount of variability for a particular set of scores: a distribution. - In simple terms, if the scores in a distribution are all the same, then there is no variability. - If there are small differences between scores, then the variability is small, and if there are large differences between scores, then the variability is large. -higher variability can reduce the chances of finding a significant treatment effect
Ordinal Scale
- Ranks people according to the degree to which they possess some measured trait. - Persons are first measured on some attribute (e.g., height). Then, they are assigned ranks according to how much of the attribute they possess. - Ordered sequence -Rank observations in terms of size and magnitude - E.g., 1 = tallest, 2 = second tallest, 3 = third tallest, etc.
Partial correlation
- Used when the relationship between two variables may be distorted by a third variable - measures the relationship between the two variables while holding the third variable constant
Random Sampling
- Requires that each individual has an equal chance of being selected and that the probability of being selected stays constant from one selection to the next if more than one individual is selected. -E.g., you cannot select only members from a certain club - Probability must stay constant Example with cards - If you are looking for the jack of diamonds, the probability of finding the jack of diamonds is 1/52. - However, if you select a card and it is not the jack of diamonds, then the new probability of getting the jack of diamonds is now 1/51. This is not constant. - If you return the first card pulled to the deck before pulling another card, then this is called sampling with replacement.
Deviation (GENERAL)
- Uses the mean as a reference point and measures variability by considering the distance between each score and the mean - Describes whether scores are close to the mean or widely scattered - Same definition for samples and populations, but slightly different equations Equation: deviation = X - μ, where X = individual score If positive (+), the score is above the mean If negative (-), it is below the mean
Independent Samples t test
- Separate group of participants for each of the treatments (or populations) being compared - Most studies require the comparison of two or more sets of data - Might want to compare men and women in terms of their political attitudes - Two sets of data can come from two totally different groups of participants - Independent-measures design or between-subjects design - H0: μ1 - μ2 = 0 (or μ1 = μ2) - H1: μ1 - μ2 ≠ 0 (or μ1 ≠ μ2) - The independent-measures t uses the difference between two sample means to evaluate a hypothesis about the difference between two population means *look at the equation on google doc* - In this formula the standard error measures the amount of error that is expected when you use a sample mean difference to represent a population mean difference - Still measures the difference between sample and population values - When the null is true the difference is 0 - Standard error: a measure of how much difference is reasonable to expect between two sample means if the null hypothesis is true - Or you can think of it as the denominator measuring how much difference should exist if there is no treatment effect that causes them to be different - s(M1-M2) = √(sp²/n1 + sp²/n2)
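A minimal sketch that computes the pooled variance and t by hand and checks it against SciPy; the two groups are invented, and ttest_ind with its default equal_var=True uses the same pooled-variance formula:

```python
import numpy as np
from scipy import stats

g1 = np.array([8, 7, 10, 9, 6])   # invented scores, separate groups
g2 = np.array([5, 4, 7, 6, 3])

ss1 = ((g1 - g1.mean()) ** 2).sum()   # SS for each sample
ss2 = ((g2 - g2.mean()) ** 2).sum()
sp2 = (ss1 + ss2) / ((len(g1) - 1) + (len(g2) - 1))   # pooled variance
se = np.sqrt(sp2 / len(g1) + sp2 / len(g2))           # s(M1-M2)
t = (g1.mean() - g2.mean()) / se
print(t)                         # 3.0 for these numbers
print(stats.ttest_ind(g1, g2))   # same t plus its two-tailed p-value
```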
Cramer's V
- Same as the phi-coefficient formula except it includes df in the denominator: V = √(χ² / (n·df*)) - df* for this test is the smaller of R - 1 and C - 1 - See the formulas for this and phi on page 533
Regression
- Statistical technique for finding the best fitting straight line for a set of data - Resulting straight line is called regression line - Have to determine the line that is the best fit! - There is linear regression and multiple linear regression - Regression tells you how the value of the dependent variable changes when any one of the independent variables is varied, while the other independent variables are held constant
Correlation models
- Statistical technique used to measure and describe the relationship between two variables - The variables usually exist naturally in the environment - no attempt to control or manipulate them - Normally called X and Y
two-way factorial design
- Study that combines two factors - Ex. Factor 1: Therapy (group 1 and group 2), Factor 2: Time (before therapy, after therapy, 6 months after therapy) ^ two factors. First factor has two levels, second factor has three levels Therapy factor uses two separate groups (independent measures) and time factor uses same group for all three levels (repeated measures)
Normal Distribution
- Symmetry means that one half of the distribution is a mirror image of the other half. - For example, the normal distribution is a symmetric distribution with no skew. The tails are exactly the same. - The normal distribution is the most common distribution you'll come across. - A normal distribution, sometimes called the bell curve, is a distribution that occurs naturally in many situations. - For example, the bell curve is seen in tests like the SAT and GRE. The bulk of students will score the average (C), while smaller numbers of students will score a B or D. An even smaller percentage of students score an F or an A. This creates a distribution that resembles a bell (hence the nickname). - The bell curve is symmetrical. -Half of the data will fall to the left of the mean; half will fall to the right.
Alpha level
- The alpha level for a hypothesis test is the probability that the test will lead to a Type I error if the null hypothesis is true. - That is, the alpha level determines the probability of obtaining sample data in the critical region even though there is no treatment effect - The alpha level helps to determine the boundaries for the critical region by defining the concept of "very unlikely" outcomes, aka → minimize the risk of a Type I error. - So → alpha levels tend to be very small probability values - the largest permissible value is .05. - When there is no treatment effect, an alpha level of .05 means that there is a 5% risk, or a 1-in-20 probability, of rejecting the null hypothesis and committing a Type I error - Because the consequences of a Type I error can be relatively serious, some researchers and publications prefer a .01 or .001 alpha level - The trade-off between the demands of the test and the risk of a Type I error is controlled by the boundaries of the critical region. - For the hypothesis test to conclude that the treatment does have an effect, the sample data must be in the critical region - If the treatment really has an effect, it should cause the sample to be different from the original population; essentially, the treatment should push the sample into the critical region. - However, as the alpha level is lowered, the boundaries for the critical region move farther out and become more difficult to reach. - The boundaries for the critical region determine how much distance between the sample mean and μ is needed to reject the null hypothesis. - As the alpha level gets smaller, this distance gets larger. - Thus, an extremely small alpha level, such as .000001 (one in a million), would mean almost no risk of a Type I error but would push the critical region so far out that it would become essentially impossible to ever reject the null hypothesis; that is, it would require an enormous treatment effect before the sample data would reach the critical boundaries
Quasi-Experiment
- The control and treatment groups differ not only in terms of the experimental treatment they receive, but also in other, often unknown or unknowable, ways. - Thus, the researcher must try to statistically control for as many of these differences as possible. - Because control is lacking in quasi-experiments, there may be several "rival hypotheses" competing with the experimental manipulation as explanations for observed results - Study includes a quasi-independent variable - The study lacks a comparison/control group - Quasi-independent variable A pre-existing variable that is often a characteristic inherent to an individual which differentiates the groups or conditions being compared in a research study. Because the levels of the variable are preexisting, it is not possible to randomly assign participants to groups. e.g., Sex
Interval Scale
- The distances between adjacent scores are equal and consistent throughout the scale. - Equal intervals on the scale imply equal amounts of the variable being measured. - Also referred to as the equal-interval scale. - Intervals are exactly the same size - E.g., degrees Fahrenheit or Celsius, golf scores, above- and below-average rainfall, etc. - The zero point on an interval scale is arbitrary and does not indicate a zero amount of the variable being measured - True zero: when the value 0 truly indicates nothing on a scale of measurement; an interval scale lacks a true zero - E.g., zero degrees Fahrenheit does not mean there is no temperature
Z score characteristics
- The distribution of the z-scores will have the same shape as the original distribution of scores - If the original distribution is negatively skewed, the z-scores will be too - The mean: a z-score distribution will always have a mean of 0 - Standard deviation: a z-score distribution will always have a standard deviation of 1; the scores have been transformed to create predetermined values for μ and σ
Sum of Squares (Population)
- The first of these formulas is called the definitional formula because the symbols in the formula literally define the process of adding up the squared deviations: - Definitional formula: SS = Σ(X - μ)² - To find the sum of the squared deviations, the formula instructs you to perform the following sequence of calculations: 1. Find each deviation score (X - μ). 2. Square each deviation score, (X - μ)². 3. Add the squared deviations.
Proportion
- The fraction of the total that possesses a certain attribute. - Often called relative frequencies - Equation: p = f/N, where f = frequency and N = total number
Mean
- The mean (or average) is the most popular and well known measure of central tendency. - It can be used with both discrete and continuous data, although its use is most often with continuous data. -The mean is equal to the sum of all the values in the data set divided by the number of values in the data set.
Population mean
- The mean for a distribution is the sum of the scores divided by the number of scores. - The formula for the population mean is μ=ΣX/N - add all of the scores in the population, and then divide by N
Median
- The median is the middle score for a set of data that has been arranged in order of magnitude. -The median is less affected by outliers and skewed data. -Our median mark is the middle mark
Mode
- The mode is the most frequent score in our data set. -On a histogram it represents the highest bar in a bar chart or histogram. -You can, therefore, sometimes consider the mode as being the most popular option.
Sign Test (Special Case of Binomial)
- The sign test is a statistical method to test for consistent differences between pairs of observations - It tests whether the direction of change is random or not. - The change is expressed as a binary variable taking the value + if the dependent variable is larger for a given observation after the treatment and − if it is smaller. - When there is no change, the change is coded 0 and is ignored in the analysis. - For example, suppose that we measure the number of candies eaten on two different days by 15 children, and that we expose the children to a film showing the danger of eating too much sugar between these two days. On the second day, out of these 15 children, 5 ate the same number of candies, 9 ate less, and 1 ate more. Can we consider that the film diminished candy consumption? This problem is equivalent to comparing 9 positive outcomes against one negative with P = 1/2. From Equation 5, we get that such a result has a p value smaller than α = .05, and we conclude that the film did change the behavior of the children.
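A minimal sketch of the candy example as a binomial test, assuming SciPy 1.7+ (which provides scipy.stats.binomtest); ties are dropped, leaving 9 pluses and 1 minus:

```python
from scipy.stats import binomtest

# 9 children ate less (+), 1 ate more (-); H0: P(+) = 1/2
result = binomtest(k=9, n=10, p=0.5, alternative='two-sided')
print(result.pvalue)   # ~0.021 < .05, so the direction of change is unlikely to be random
```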
Standard deviation def
- The square root of the variance - Provides a measure of the standard, or average, distance from the mean
Nonparametric tests
- Used when situations do not conform to the requirements of parametric tests - Use sample data to evaluate hypotheses about the proportions that exist within populations - Make few assumptions about the population distribution - Sometimes called distribution-free tests - Usually participants are classified into categories (democrat or republican) - Nominal or ordinal scales - Usually frequencies - Parametric tests are usually preferred, but these are situations when transforming scores into categories might be a better choice: - Sometimes it is simpler to obtain category measurements - The original scores might violate assumptions for certain statistical procedures - The original scores might have really high variance - The experiment produces an undetermined score
Least Squares Solution (Regression)
- To measure how well a line fits, you have to define the distance between the line and each data point. - For every X value, the linear equation determines a Y value. - This value is the predicted Y and is called ŷ. - Regression equation for Y: ŷ = bX + a - Standard error of estimate: gives a measure of the standard distance between the predicted Y values on the regression line and the actual Y values in the data
Dependent samples/repeated measures t-test
- Two sets of data could come from the same group of participants - e.g., same people before and after therapy - Matched-subjects design: each individual in one sample is matched with a person in the other sample - Individuals are equivalent with respect to a specific variable that the researcher would like to control - t statistic for a repeated-measures research design - Based on difference scores rather than raw scores - Difference score: D = X2 - X1 - MD = ΣD / n - t = (MD - μD) / sMD - sMD = √(s²/n) - df = n - 1 (n = number of difference scores) - Null hypothesis: for the general population there is no change or difference: H0: μD = 0 (any changes are due to chance and in the long run will average out to 0) - The alternative hypothesis states that there is a treatment effect that causes the scores in one treatment condition to be higher or lower than the scores in the other condition - H1: μD ≠ 0 (general trend toward higher scores as you gain more experience - systematic and predictable)
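A minimal sketch using difference scores, with a SciPy cross-check; the before/after data are invented:

```python
import numpy as np
from scipy import stats

before = np.array([20, 18, 25, 22, 19])   # invented scores, same people measured twice
after  = np.array([23, 20, 28, 24, 22])

D = after - before                                   # difference scores
t = D.mean() / (D.std(ddof=1) / np.sqrt(len(D)))     # t = M_D / s_MD, since mu_D = 0 under H0
print(t)
print(stats.ttest_rel(after, before))                # same test via SciPy, df = n - 1
```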
Binomial Tests
- Use the binomial test when there are two possible outcomes. -You know how many of each kind of outcome (traditionally called "success" and "failure") occurred in your experiment. - You also have a hypothesis for what the true overall probability of "success" is. -The binomial test answers this question: If the true probability of "success" is what your theory predicts, then how likely is it to find results that deviate as far, or further, from the prediction? -It is used to examine the distribution of a single dichotomous variable in the case of small samples. -It involves the testing of the difference between a sample proportion and a given proportion.
t test
- Used to determine if there is a significant difference between the means of two groups - Begin with a known population before treatment - Goal is to use a sample from the treated population as the basis for determining whether the treatment has an effect - Unknown values for the mean and the variance of the treated population, so we have to use a sample - t = (M - μ) / sM - The sample mean is from the sample data - The population mean is hypothesized from H0 - Estimated standard error: computed from the sample data - Numerator measures the actual difference between the sample data and the population hypothesis - Denominator measures how much difference is reasonable to expect between a sample mean and the population mean - When the difference between data and hypothesis (numerator) is greater than expected (denominator), we obtain a large value for t (large positive or negative); then we say the data are not consistent with the hypothesis and we reject H0 - When the difference between data and hypothesis is small relative to the standard error, we get t near zero and we fail to reject H0 - You can also do t tests where you don't know the population mean to serve as a standard: you can base your null hypothesis value (μ) on theory - You just have to state that you don't know the population mean but your hypothesis is based on logic, and then test the actual data you get from your sample against that value
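A minimal sketch of a one-sample t test, assuming SciPy; the sample and the hypothesized μ are invented:

```python
import numpy as np
from scipy import stats

sample = np.array([52, 49, 55, 50, 53, 51])   # invented treated scores
mu = 50                                       # hypothesized population mean under H0

t = (sample.mean() - mu) / (sample.std(ddof=1) / np.sqrt(len(sample)))
print(t)
print(stats.ttest_1samp(sample, popmean=mu))  # same t with its two-tailed p-value
```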
chi-square test for independence
- Uses frequency data from a sample to evaluate the relationship between two variables in the population - Each person in the sample is classified on both of the two variables creating two dimensional frequency distribution matrix . - Frequency distribution is then used to test hypotheses about the corresponding frequency distribution for the population - Null hypothesis is that two variables are independent- the value obtained for one is not related to the value for the second - Either no relationship or the same proportions so they are not dependent
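A minimal sketch with an invented 2x2 frequency matrix, assuming SciPy; chi2_contingency computes the expected frequencies and df = (R - 1)(C - 1) for you:

```python
import numpy as np
from scipy.stats import chi2_contingency

# Invented observed frequencies: rows = categories of variable 1, columns = variable 2
observed = np.array([[30, 10],
                     [20, 40]])

chi2, p, df, expected = chi2_contingency(observed, correction=False)
print(chi2, p, df)   # expected holds the f_e values under independence
```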
Assumptions of t test
- Values in the sample have to be independent observations - The occurrence of the first event can't affect the probability of the second event - Population must be normal - To note: the larger the variance, the larger the error - scores are scattered and it's more difficult to see consistent patterns in the data. - The larger the sample is, the smaller the error is. - Two-tailed test: not making a hypothesis in a specific direction. H0: μ = 10, H1: μ ≠ 10. - One-tailed test: making a hypothesis in a specific direction, e.g., H0: μ ≤ 10, H1: μ > 10.
Continuous Variable
- Variables whose values theoretically could fall anywhere between adjacent scale units. -Data measured on a ratio scale are always continuously scored. -E.g., A person's height or weight or the time a person spends talking on the phone can fall anywhere between the scale units.
Correlational Method
- We have a single group of subjects rather than two or more groups. -In addition, each of the subjects has a score on two different variables. -We do not seek cause-and-effect relationships between independent and dependent variables. Rather, we simply want to know whether or not the scores on two variables are related. -Two different variables are observed to determine whether there was a relationship between them -E.g., SAT forms: Developers gave the same students (single group of subjects) two different forms of the test (two variables). Then, they compared the students' scores on both tests (correlation). They found that the scores were similar for the same students on both forms of the test. This type of correlation is called test reliability.
Independent Variable
- What is manipulated by the researcher - Usually consists of the 2 or more treatment conditions to which the subjects are exposed - The antecedent conditions that are manipulated prior to observing the DV - Can be thought of as the cause. -It is the treatment or condition that the researcher expects will make subjects perform either better or worse on some measure of behavior. -Goes on X axis
t statistic
- When the variance for the population is not known (so you can't calculate a z-score), we use the sample value - Estimated standard error: sM = √(s²/n) - t statistic: t = (M - μ) / sM = (sample mean - population mean) / estimated standard error - Same as z except instead of the standard error in the denominator you have the estimated standard error, because the z-score uses the population variance (σ²) and the t statistic uses the sample variance (s²) - Used to test hypotheses about an unknown population mean (μ) when the value of σ is unknown - The t distribution approximates the normal distribution - As df gets larger, the distribution looks more normal - In general the t distribution is flatter and more spread out than the z distribution
Linear equations
- Y = bX + a - The value of b is the slope - Determines how much the Y variable changes when X is increased by 1 point - The value of a is the Y-intercept because it determines the value of Y when X = 0 - You want to plot at least 3 points for the line
Using regression for prediction
- You can compute a predicted value for Y using ŷ = bX + a - The predicted value is not perfect - As the correlation gets closer to zero, the magnitude of the error increases - The regression equation should not be used to make predictions for X values that fall outside the range covered by the original data
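A minimal sketch of fitting the line and predicting ŷ for an X inside the observed range, assuming SciPy; the data are invented:

```python
import numpy as np
from scipy import stats

x = np.array([1, 2, 3, 4, 5, 6])                  # invented data
y = np.array([2.1, 4.3, 5.9, 8.2, 9.8, 12.1])

fit = stats.linregress(x, y)                      # least-squares line y_hat = bX + a
y_hat = fit.slope * 3.5 + fit.intercept           # predict Y for X = 3.5 (inside the range)
print(fit.slope, fit.intercept, y_hat)
```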
Degrees of freedom (df)
- Describe the number of scores in a sample that are independent and free to vary. - Because the sample mean places a restriction on the value of one score in the sample, there are n - 1 degrees of freedom for a sample with n scores. - As df increases for a sample, the better the sample variance represents the population variance and the better the t statistic approximates the z-score
Right-Skewed Distribution
- has a long right tail. - Right-skewed distributions are also called positive-skew distributions. - That's because there is a long tail in the positive direction on the number line. - The mean is also to the right of the peak.
Probability Definition
- likelihood that a particular event will occur -For a situation in which several different outcomes are possible, the probability for any specific outcome is defined as a fraction or a proportion of all of the possible outcomes. -Defined as a proportion (or part of a whole)
Parametric Test
- make assumptions about population parameters and test hypotheses about specific parameters - Require numeric score for each person in the sample
Semi-quartile range
- measure of spread or dispersion. - It is computed as one half the difference between the 75th percentile [often called (Q3)] and the 25th percentile (Q1). -The formula for semi-interquartile range is therefore: (Q3-Q1)/2
Interquartile Range
- measure of variability, based on dividing a data set into quartiles. - Quartiles divide a rank-ordered data set into four equal parts. - The values that divide each part are called the first, second, and third quartiles; and they are denoted by Q1, Q2, and Q3, respectively. Q3 - Q1 = IQR
Statistical power
- Statistical power is the likelihood that a study will detect an effect when there is an effect there to be detected. - If statistical power is high, the probability of making a Type II error, or concluding there is no effect when, in fact, there is one, goes down. - Power = 1 - β, where β = the probability of making a Type II error - Although the power of a hypothesis test is directly influenced by the size of the treatment effect, power is not meant to be a pure measure of effect size. - Instead, power is influenced by several factors, other than effect size, that are related to the hypothesis test. Some of these factors are considered in the following sections.
Percentile Ranks
- the percentage of individuals with scores at or below a particular X value always corresponds to the proportion to the left of the score in question.
Z-score definition
- To identify and describe the exact location of each score in a distribution. - To standardize an entire distribution. - We can use z-scores to make comparisons! (60 in bio but 85 in psych: can't compare directly, but can transform to z-scores and then compare directly.) - The z-score transforms each X value into a signed number (+ or -) so that: the sign tells whether the score is above or below the mean, and the number tells the distance between the score and the mean in terms of the number of standard deviations - Statistical technique that uses the mean and the standard deviation to transform each score (X value) into a z-score, or a standard score. - Transforming a distribution of raw scores into z-scores will not change the shape of the distribution
Chi square model
- Uses sample data to test hypotheses about the shape or proportions of a population distribution. - Determines how well the obtained sample proportions fit the population proportions specified by the null hypothesis - For the chi-square goodness of fit, the null hypothesis specifies the proportion of the population in each category - H0: no preference (equal proportions), or no difference from a known population - Data: you don't need to calculate anything complex, you just select a sample of n individuals and count how many are in each category - The resulting values are called observed frequencies - Expected frequency: the frequency value that is predicted from the proportions in the null hypothesis and the sample size - Expected frequencies define an ideal hypothetical sample distribution that would be obtained if the sample proportions were in perfect agreement with the proportions in the null hypothesis - Equation and steps listed on page 515 - If you get a big difference between observed and expected, then you can reject the null hypothesis - Reference the chi-square distribution - Positively skewed - df = C - 1 - Reference the table to see if the value is bigger than the critical value, and if so reject the null hypothesis
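A minimal sketch of a goodness-of-fit test against equal proportions, assuming SciPy; the counts are invented:

```python
from scipy.stats import chisquare

observed = [18, 30, 12]    # invented observed frequencies in C = 3 categories, n = 60
expected = [20, 20, 20]    # H0: no preference, equal proportions

chi2, p = chisquare(observed, f_exp=expected)
print(chi2, p)             # df = C - 1 = 2; reject H0 when p < alpha
```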
Proportion of Variance (t test?)
- An alternative method for measuring effect size is to determine how much of the variability in the scores is explained by the treatment effect - The treatment causes scores to increase or decrease, which means the treatment causes scores to vary - If we can measure the variability accounted for by the treatment, then we know the size of the treatment effect - Percentage of variance accounted for by the treatment: r² = t² / (t² + df) - r² = 0.01 small effect, 0.09 medium effect, 0.25 large effect
Nominal Scale
-Categorical -Classifies cases into categories. Also sometimes called a categorical scale. -Label and categorize observations, but does not make any quantitative distinctions between observations. - E.g., m = male, f = female ; 1 = married, 2 = divorced, 3 = separated, 4 = never married ; Tel = owns a telephone, notel = does not own a telephone
ANOVA alpha level
- Test-wise alpha level: risk of Type I error for an individual hypothesis test - Experiment-wise alpha level: total probability of Type I error accumulated from all the individual tests in the experiment - usually greater than what is used for individual tests - ANOVA does all the comparisons at the same time in one test, so there is one alpha level to evaluate the differences and it avoids the problem of an inflated alpha
Sampling
-The process of drawing a sample. -Validity of an inference depends on whether or not the sample was representative.
Parameter
-This is a summary number (e.g., an average) for a population. -A value (usually numeric) that describes a population. -Usually derived from measurements of the individuals in the population -E.g., Our parametric average might be 63.5 in.
2 things that differ experiments from other study designs
- Two things that distinguish this method from others: 1) Manipulation (changing a variable from one level to another and observing the second variable) 2) Control (exercising control over the research situation to minimize extraneous variables)
Discrete Variable
-Values that cannot even theoretically fall between adjacent scale units. -Separate, indivisible categories. -No values can exist between two neighboring categories. -E.g., Number of blue ribbons won, number of children in a family, or number of photographs taken.
Pooled Variance
- When the sample sizes are different, the two sample variances should not be treated equally, because they are not equally good - One method for correcting the bias in the standard error is to combine the two sample variances into a single value called the pooled variance. - The pooled variance is obtained by averaging, or "pooling," the two sample variances using a procedure that allows the bigger sample to carry more weight in determining the final value Equation: sp² = (SS1 + SS2) / (df1 + df2)
Steps in Hypothesis Testing *not done/corrected*
1. Type I error 2. Type II error 3. Selecting an alpha level 4. The trade-off between the demands of the test and the risk of a Type I error is controlled by the boundaries of the critical region. 5. Effect size 6. Power
Measure of Central Tendency
A measure of central tendency is a single value that attempts to describe a set of data by identifying the central position within that set of data
Quasi Experimental Variable
A pre-existing variable that is often a characteristic inherent to an individual which differentiates the groups or conditions being compared in a research study. Because the levels of the variable are preexisting, it is not possible to randomly assign participants to groups. e.g., Sex
χ
Chi
3 characteristics of correlation models
Direction of the relationship - the sign (positive or negative) describes the direction - Positive: the variables change in the same direction - Negative: they go in opposite directions Form of the relationship - tends to be linear - points cluster around a straight line Strength or consistency of the relationship - a perfect correlation is 1.00 - each change in X is accompanied by a predictable change in Y - no relationship is 0 (no clear trend) - intermediate values between 0 and 1 indicate the degree of consistency
Standard Deviation (Sample)
Equation: s = √s² = √(SS/(n - 1)) How to compute: - Calculate the mean (simple average of the numbers). - For each number (data point): subtract the mean. - Square the result. - Add up all of the squared results. - Divide this sum by one less than the number of data points (n - 1). This gives you the sample variance. - Take the square root of this value to obtain the sample standard deviation.
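The same computation as a short sketch (the scores are made up); numpy's ddof=1 matches the n - 1 divisor:

```python
import numpy as np

scores = np.array([4, 8, 6, 5, 7])                # made-up sample
ss = ((scores - scores.mean()) ** 2).sum()        # sum of squared deviations, SS
variance = ss / (len(scores) - 1)                 # sample variance, SS / (n - 1)
print(np.sqrt(variance), np.std(scores, ddof=1))  # both give the sample SD
```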
Assumptions of regression
Independence - The residuals are serially independent (no autocorrelation). Linearity - The relationship between the dependent variable and each of the independent variables is linear. Mean of Residuals - The mean of the residuals is zero. Homogeneity of Variance - The variance of the residuals at all levels of the independent variables is constant. Errors in Variables - The independent (predictor) variables are measured without error. Model Specification - All relevant variables are included in the model. - No irrelevant variables are included in the model. Normality - The residuals are normally distributed. - This assumption is needed for valid tests of significance but not for estimation of the regression coefficients.
Population
Larger group (entire set) of subjects about which we want to draw a conclusion. E.g., We may have sampled only 100 elderly women, but we might want to draw a conclusion about the height for the whole population of elderly women of which the 100 women were a part.
Using the Pearson Correlation
Prediction: - If two variables are known to be related in a systematic way, then it is possible to use one variable to make predictions about the other - e.g., SAT scores and how well you do in college Validity: e.g., if an IQ test measures what it should, then it should be related to other measures of intelligence Reliability: use correlations to determine the relationship between two sets of measurements; when reliability is high, the correlation between the measurements should be strong and positive Theory verification: a theory makes specific predictions about the relationship between two variables; the prediction can be tested by determining the correlation between the two variables
Interpreting Regression
Regression coefficient example: - A slope of 106.5 indicates that for every additional meter in height, you can expect weight to increase by an average of 106.5 kilograms - Null hypothesis: the slope of the regression equation is zero - Testing significance is called analysis of regression - Get a t or an F statistic and then reference the corresponding table to see if your value is in the critical region - Linear regression degrees of freedom are n - 2 - For multiple regression, degrees of freedom are n - k - 1
factors that influence power
Sample size - one of the primary reasons for computing power is to determine what sample size is necessary to achieve a reasonable probability for a successful research study. Before a study is conducted, researchers can compute power to determine the probability that their research will successfully reject the null hypothesis. If the probability (power) is too small, they always have the option of increasing sample size to increase power Alpha level - Reducing the alpha level for a hypothesis test also reduces the power of the test One-tailed versus two-tailed tests - Changing from a regular two-tailed test to a one-tailed test increases the power of the hypothesis test
Σ
Sigma, Summation
σ
Sigma, population SD. If squared, variance
Alternatives to Pearson correlation
Spearman correlation - Measures relationship between X and Y when both variables are measured on ordinal scales -Also useful for interval/ratio scales when you are trying to measure consistency of relationship, independent of form - Original scores first converted to ranks and then this formula is used with the ranks
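A minimal sketch, assuming SciPy (the scores are invented); spearmanr ranks the data internally and then computes a Pearson correlation on the ranks:

```python
from scipy import stats

x = [1, 2, 3, 4, 5]              # invented scores
y = [2, 1, 4, 3, 5]

rho, p = stats.spearmanr(x, y)
print(rho, p)
```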
SE
Standard error; the standard deviation of the population divided by the square root of n
Sample
The group of subjects in a study. Usually, they are a subset of a larger group (population) Usually intended to represent the population in a research study. E.g., May measure the height of a sample of 100 elderly women.
When to use regression
Use regression analysis when you want to use a single independent variable or multiple independent variables to predict the value of a dependent variable
Population Variance
Variance is the average squared distance from the mean. How to compute: - Find each deviation (X - μ) - Square each deviation - Find the average of the squared deviations (this is the variance)
μ
mu, population mean
df
nu (ν), degrees of freedom in a Student's t or chi-squared distribution