300B FINAL!!!

application of chi square to research designs: k = 1, c = 2

(one variable with 2 categories)
- N must be 20 or more to reduce bias
- E is equally distributed across the 2 categories (E = N/2 each under a 50/50 H0)
- df = c - 1
steps:
- data matrix: 4 cells (observed and expected for C1, C2)
- fill in E, then just use the chi square formula
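
as a worked illustration (not from the course notes): a minimal sketch of this k = 1, c = 2 case with scipy, using made-up counts; scipy.stats.chisquare fills in the equal E values itself.

```python
from scipy.stats import chisquare

# hypothetical data: 20 participants (meets the N >= 20 rule) choosing
# between two categories C1 and C2
observed = [14, 6]

# E defaults to an equal split: N / c = 10 per cell; df = c - 1 = 1
result = chisquare(observed)
print(result.statistic, result.pvalue)   # chi square (obs) and p(obs)
```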

(ch 11) assumptions of the random sampling model of HT

1. data are scores (if using the t test, F test, or z test)
- if data are ranks or frequencies, use test ratios specially designed for those
2. participants were randomly sampled from the population to create the sample
- this ensures that the sample is unbiased
- want to make sure participants are independent in our groups
- we need independence in scores for an unbiased sample
** problem: this is almost impossible; we rarely have an actual random sample
3. the DV is normally distributed in the population
- this would mean the sampling distribution would be a specific shape
- the sampling distribution has to be kurtotic to use the random sampling model -- theoretical tables are based on that shape
4. homogeneity of variance
- all individual sample populations have the same variance as the grand population
- in the random sampling model, we either pool these variances together for the t test, or create MSe, which is based on summed variances
- this only works if the variances are equal and are pooled together
5. each group/condition/cell has 7 or more scores
- difficult to reject H0 when n is less than 7 when using a theoretical population: power is too low

3 ways to examine or interpret outcomes from an analysis (ch 10)

1. hypothesis testing
- what was the relationship between p(obs) and p(a), and what was your statistical decision?
2. visual (graphic) display
- what was the pattern of participants' behaviour across levels of your experimental conditions?
3. verbal descriptions of the relationship between 2 factors
- additive or non additive
- independent or dependent
- having or not having a difference between differences

computation of post hoc tests: generic procedure

1. identify values for MSe, dfe, a
2. if relevant, identify the value for k
3. find the critical value from the appropriate table of values
- fisher's: t table
- tukey's: Q table
- dunnett's: dunnett's t table
4. apply the relevant test ratio
- compute the critical value to find the minimum mean difference needed to reject the null
5. compute the mean differences between each possible pair of means
6. list the mean differences
7. compare the mean differences to the computed value
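
a sketch of how steps 3-7 look in practice for tukey's test, using statsmodels' pairwise_tukeyhsd on made-up data (the group names and means are hypothetical):

```python
import numpy as np
from statsmodels.stats.multicomp import pairwise_tukeyhsd

# hypothetical 1-way design: k = 3 groups, n = 7 scores each
rng = np.random.default_rng(1)
scores = np.concatenate([rng.normal(m, 2.0, 7) for m in (10, 12, 15)])
groups = np.repeat(["g1", "g2", "g3"], 7)

# pairwise_tukeyhsd covers steps 3-7: it looks up Q, computes every
# pairwise mean difference, and flags which ones exceed the minimum
print(pairwise_tukeyhsd(scores, groups, alpha=0.05))
```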

6 steps of hypothesis testing for 2 x 3 factorial ANOVA

1. state alternative and null hypotheses
- 3 treatment effects to test: 2 main effects and 1 interaction -- need to set an H1/H0 for each
- H1 for row main effect: there will be a difference in mean scores as a function of the row factor (H0: no difference)
- H1 for column main effect: there will be a difference in mean scores as a function of the column factor (H0: no difference)
- interaction H1: mean scores will vary as a function of the column factor and row factor (H0: will not vary)
2. select the sampling distribution
- model of HT: random sampling model
- research design: balanced 2 x 3 between-groups factorial design (balanced = equal n in each cell)
- type of data: scores, ratio
- statistics for analysis: cell means, marginal means
- population parameters: not known, so estimated from sample statistics; assume the assumptions are met
- sample size: should have at least 7 per group
3. set an a priori criterion for type 1 error: each treatment effect and interaction is evaluated against p(a) = 0.05. also set a df value for each treatment effect as well as for SSerror and SStotal
- df cells: #cells - 1
- df rows: #rows - 1
- df cols: #columns - 1
- df rxc: (r - 1) x (c - 1)
- df error: sum of (ncell - 1)
- df total: NT - 1
4. gather data and compute the appropriate statistics for calculation of the ANOVA
- compute summary stats: cell, row, column means, and the grand mean
- graph the cell means
- compute SST, SSerror, SScells
- compute SSrows, SScols, SSrxc
- complete the ANOVA source table
5. compare p(obs) with p(a) and make a statistical decision for each effect
- row, column, interaction
- then do follow-up probes and repeat steps 4 and 5
6. formally report
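
steps 3-5 can be checked against software; a sketch with statsmodels on made-up balanced 2 x 3 data (the factor names 'row'/'col' are placeholders):

```python
import numpy as np
import pandas as pd
from statsmodels.formula.api import ols
from statsmodels.stats.anova import anova_lm

# hypothetical balanced 2 x 3 between-groups design, n = 7 per cell
rng = np.random.default_rng(2)
df = pd.DataFrame(
    [(r, c, rng.normal(10, 2.0))
     for r in ("a1", "a2") for c in ("b1", "b2", "b3") for _ in range(7)],
    columns=["row", "col", "score"],
)

# 'C(row) * C(col)' requests both main effects plus the interaction;
# anova_lm prints the source table: SS, df, F and p(obs) per effect
model = ols("score ~ C(row) * C(col)", data=df).fit()
print(anova_lm(model, typ=2))
```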

options to fix non normality in data

2 forms of data transformations:
- mathematical procedures applied to scores (data remain scores)
- transform scores to a lower scale of measurement and apply a non parametric test

total number of distributions in 2 x 3 factorial design

22 = 6 + 6 + 6 + 4
- 6 independent conditions
- 6 populations
- 6 sampling distributions
- 4 F distributions: 1 omnibus, 2 main effects, 1 interaction

application of chi square to research designs: contingency tables (k ≥ 2, c ≥ 2)

2x2 or larger design
- homogeneity: same as independence, but with equal E; marginals can be calculated before the data are collected
- independence: marginals cannot be calculated before; calculate E based on the row and column marginals
- then do the chi square formula
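
a sketch of the independence test with scipy on a made-up 2 x 2 table; chi2_contingency derives E from the row and column marginals, as described above:

```python
from scipy.stats import chi2_contingency

# hypothetical 2 x 2 contingency table of observed frequencies
observed = [[30, 10],
            [20, 40]]

# E = (row total x column total) / N; df = (r - 1)(c - 1)
# (for 2 x 2 tables scipy applies the yates correction by default)
chi2, p, dof, expected = chi2_contingency(observed)
print(chi2, p, dof)
print(expected)
```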

assumptions for the random assignment model

3 ASSUMPTIONS:
1. data are scores (tests are chosen that are specific to that type of data)
2. n is 3 or more in each group
3. there is independence among the scores, which is achieved through random assignment to conditions
- neither the tests under the RSM nor the RAM are robust to violations of the assumption of independence among scores !! v important !!!

in a k > 2 design, for each condition, we have ___ distributions (F statistic) ch 7

3 distributions for each condition with the F statistic: population, sample, and sampling distribution. combine the various sampling distributions into one sampling distribution of the F statistic.

describe the strengths and weaknesses of a factorial ANOVA relative to a 1-way ANOVA.

ADVANTAGES:
- convenient/economical: you can manipulate more than one variable within the same experiment
- efficient/powerful: more statistically powerful than doing multiple 1-way ANOVAs; reduces random error due to participants' differences; reduces experimenter error that would occur when using several different testing environments
MAJOR ADVANTAGE: gives the ability to test for interactions between variables. by incorporating more than 1 variable in the same experiment, you can see whether the effect of one variable depends on a different variable
NEED WEAKNESSES???

how are the concepts of power and effect size relevant to ANOVA?

ANOVAs for 1-way designs are more powerful than doing multiple two-sample designs:
- reduces unexplained variability by pooling variability within the groups: pooling variability into one single term leads to less overall variability than if you pooled variability into several terms based on every possible pair of groups
- more power to reject the null: when there is a real effect, pooling variability should reduce the error variance (denominator), leading to an increased F(obs) that exceeds F(crit)
ANOVAs for factorial designs are more powerful than doing multiple 1-way designs:
- reduces unexplained variability by reducing experimental error: variability induced during data collection that is unrelated to the effects being measured
- more power to reject the null: when there is a real effect (eg a main effect), reducing experimental error should reduce the error variance (denominator), leading to an increased F(obs) that exceeds F(crit)

how do the 3 statistical measures, F(obs), η2 (& R2) and f, provide evidence of a treatment effect in your data?

F(obs): for a k > 2 design
- the F ratio is based on two variance estimates: treatment variance and error variance
- if there is a treatment effect, there are differences among the means, and F(obs) exceeds F(crit)
η2 (& R2): explain how much of the variability is accounted for. if lots of variability is accounted for (i.e. these are close to 1), then most of the variation in the DV can be accounted for by our treatment effects
f: effect size for a k > 2 design
- an estimate of the REAL differences among population means; uses sample information to estimate
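
the η2-to-f link can be made concrete in a couple of lines (the SS values here are made up):

```python
import math

# hypothetical SS values from an ANOVA source table
SS_effect, SS_total = 40.0, 100.0

eta2 = SS_effect / SS_total        # proportion of variability accounted for (0 to 1)
f = math.sqrt(eta2 / (1 - eta2))   # cohen's f (0 to infinity)
print(eta2, f)                     # 0.4 and ~0.82
```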

distinguish among 3 types of post hoc probes (fishers, tukeys, dunnetts), both conceptually and mathematically. know how they are applied as probes in the analysis of an independent (between groups), 1 way, or factorial design

FISHER'S:
- k = 3
- assumes homogeneity of variance
- per comparison error rate
- not appropriate for factorial ANOVA, since those almost always have more than 3 conditions
- use the t table for the critical value
- number of comparisons = k(k - 1)/2
TUKEY'S:
- k > 3
- assumes homogeneity of variance
- experiment-wise (1-way ANOVA) or family-wise (factorial ANOVA) error rate
- use the Q statistic for the critical value
- number of comparisons = k(k - 1)/2
DUNNETT'S:
- for comparing each group to a control OR when homogeneity of variance is violated
- experiment-wise (1-way ANOVA) or family-wise (factorial ANOVA) error rate
- uses Dunnett's t
- max comparisons: k - 1
ALL 3:
- are modified t tests for doing multiple comparisons
- work by computing a minimum mean difference

in a data matrix, values in the cells are ___

MEANS for each treatment condition

MSe vs MSeffect

MSe: mean squared error
- pooled s^2e
- an estimate of the population error variance
- reflects unexplained variability in participants and experimenter error
- always present (we can never eliminate unexplained error)
MSeffect: SSeffect / dfeffect
- eg MSrows = SSrows / dfrows, MScols = SScols / dfcols, MSrxc = SSrxc / dfrxc, MSe = SSe / dfe
- MSeffect estimates the variance for a particular treatment effect (between-group variance estimate)
F = MSeffect / MSe

can we tell if an outcome is significant from a data matrix?

NOOOOO the only way to know for sure is to apply hypothesis testing procedures

null and alternative hypotheses etc for RAM VS RSM

RANDOM SAMPLING MODEL
- H0: mu1 - mu2 = 0
- a priori criterion: p(a)
- sampling distribution: theoretical
- assumptions: parametric, n = 7 or more
- test ratio: mean difference / estimated error
- analysis for score data: z, t, F test
- analysis for frequencies: chi square
- p(obs): associated with z(obs), t(obs), F(obs)
- decision to reject the null: p(obs) < p(a)
- interpretation: generality
RANDOM ASSIGNMENT MODEL
- H0: xbar1 - xbar2 = 0
- a priori criterion: p(a)
- sampling distribution: empirical
- assumptions: no parametric assumptions, n ≥ 3
- test ratio: p = frequency of observation / total number of outcomes
- analysis for score data: permutations, total # of possible arrangements in the data
- analysis for frequencies: binomial exact, fisher exact
- p(obs): calculated directly from the test ratio
- decision to reject the null: p(obs) < p(a)
- interpretation: specific to your sample; can establish causality if all 3 assumptions are met

interpreting main effects

SOMETIMES a main effect obscures important info and misleads you in your interpretation of the data
- this happens because main effects average values across individual cell means
- eg if an interaction is actually what's important, the main effect might not be the important thing
- whatever explains more of the variability (the main effect or the interaction) is what's more important

definition terms for the null hypothesis (ch6)

alpha: the probability of making a type 1 error (rejecting the null when the null is true and should have been retained)
1 - alpha: all the area under the H0 curve except the region of rejection -- the probability of retaining H0 when H0 is true

denominator of F ratio in ch 9

always MSe

ANOVA stands for...

analysis of variance

column main effect

applying a 1-way ANOVA to the column marginal means produces an F value for the column factor. determines whether the column factor alone affects the DV.

assumptions associated with applying an ANOVA to data in a factorial design

assumption of normality: for each condition, the data are normally distributed
homogeneity of variance: for each condition, the variance is equal to all others
sample size: n is 7 or more in each condition
homogeneity of sample size: for each condition, the sample size is equal to all others
NEED TO ADD IN RATIOS

assumptions for factorial ANOVA

assumptions refer to the status of the data within each cell of the design, not the study as a whole!
- this is because each cell is associated with its own population
assumed the DV is normally distributed in each population
homogeneity of population variances and sample size:
- estimated variances for each cell must be about the same
- the number of participants or scores in each cell must be about the same
- there is very little flexibility regarding these two assumptions
n of 7 or more for each cell

definition terms for alternative hypothesis

beta: probability of making a type 2 error (retaining the null when it's false and you should have rejected it, silly)
power: 1 - b -- all the area under the H1 curve except the critical area that overlaps the null distribution

why might we have non normality in a population?

ceiling or floor effects
- create a positive or negative skew
- mean that if you sample from the population, your sample is more likely to come from one part of the population
- floor effect: timing people -- can't get less than 0
- ceiling effect: drug dosage -- doesn't do anything different after a certain dose
outliers in the data
- sample-based skewness
- produce large values for s^2 and the estimated standard error, and small values for the observed test statistic
- large variability for other reasons

choosing probes (multiple comparisons) for factorial designs

choice of test is based on the number of means in the treatment effect to be tested
family-wise error rate:
- treats each treatment effect in the analysis as a separate family
- eg a 2-factor design has 3 treatment effects = 3 families
- controls for type 1 error separately for each family by setting p(a) separately for each effect
- set at 0.05 for each probe carried out; the sum across the comparisons within a family is 0.05 or 0.06
- even though you are testing 3 treatment effects, there are no concerns about the 3 separate tests producing an inflated probability of making a type 1 error, because each probe is tested against a unique F distribution
to apply dunn's test: use MSerror from the ANOVA source table; the value for ncols is the number of scores that contribute to each column mean
- use alpha EW
to apply fisher's LSD:
- use alpha PC, alpha = 0.02
- use the value for MSerror from the ANOVA source table; ncols or nrows = the number of scores that contribute to each column or row mean
use post hoc comparisons to probe interactions, because it is difficult to justify a priori which interactions will actually exist
- use MSerror; use ncell (the number of scores that contribute to each cell mean)

problems with randomization tests

computer programs that have these tests are not widely available or user friendly
despite new developments, randomization tests are still not as flexible for application to more complex designs
because randomization tests are not widely understood by most psychologists/researchers, it can be hard to publish your study

distinguish conceptually between r squared, f, d, R squared, η squared

d: cohen's effect size
- difference between means: the standardized distance between means in units of standard deviation
- ranges from 0 to infinity
- describes an effect between 2 groups (k = 2)
- it is a difference value: an estimate of real differences among population means
η squared: variance explained
- a proportion, always between 0 and 1
- describes an effect across multiple groups (k > 2)
- as a proportion, it is limited to the sample data
- SSg / SSt
f: cohen's effect size
- difference between means: how far group means would deviate from the grand mean (in the population)
- ranges from 0 to infinity
- describes an effect across multiple groups (k > 2)
- difference value: an estimate of real differences across population means
r squared, R squared, and η squared are conceptually the same
- variance explained/accounted for by the linear relationship (correlation) between the IV and the DV
- r squared describes variance explained for a single IV on the DV
- R squared (multiple r squared) describes variance explained for multiple IVs (factors) on the DV -- does this by removing the variance explained by other effects from SS total
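
to make the "difference value vs proportion" distinction concrete, a sketch computing cohen's d from made-up k = 2 summary stats:

```python
import math

# hypothetical group summary statistics (k = 2)
m1, m2 = 12.0, 10.0
s1, s2 = 2.0, 2.5
n1, n2 = 7, 7

# d = standardized distance between means, in pooled-SD units
s_pooled = math.sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
d = (m1 - m2) / s_pooled
print(d)   # ~0.88: a difference value, not a proportion
```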

assumptions for applying chi square test to data

data are qualitative
each observation is independent of every other observation
IVs represent discrete categories measured on either a nominal or ordinal scale
categories are exhaustive, so NT is the total number of recorded responses
to apply chi square, you must have NT ≥ 20 and E values of 5 or more

assumptions of the t statistic applied to a set of difference scores ch 2

data are scores
difference scores are normally distributed in the population, so the sampling distribution will be kurtotic
sample is created by randomly sampling participants from the population
- can generalize if you randomly sampled
n is 7 or more for the entire sample
- N too small: not enough power to detect an effect
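
a sketch of this related-samples t test with scipy on made-up pre/post scores; ttest_rel works on the difference scores internally:

```python
import numpy as np
from scipy.stats import ttest_rel

# hypothetical related-samples data, n = 10 for the whole sample
rng = np.random.default_rng(3)
pre = rng.normal(10, 2.0, 10)
post = pre + rng.normal(1.0, 1.0, 10)   # built-in treatment effect

t_obs, p_obs = ttest_rel(post, pre)     # df = n - 1 = 9
print(t_obs, p_obs)
```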

choice of statistical tests for frequency data

depends on 3 factors:
- size of sample
- type of design
- model of hypothesis testing
k = 1, c = 2 design:
- NT < 20, random assignment: binomial exact test, sign test
- NT ≥ 20, RS or RA: z corrected, chi square corrected, binomial exact
- design and parametric equivalent test (with scores): 2-sample independent design (t test)
k = 1, c > 2 design:
- NT < 20, RA: nothing available
- NT ≥ 20, RA or RS: chi square goodness of fit
- design and parametric equivalent: 1-way between-groups design (ANOVA)
k ≥ 2, c ≥ 2 (contingency):
- NT < 20, random assignment: fisher's exact, median split
- NT ≥ 20, RA or RS: chi square contingency: independence or homogeneity
- design and parametric equivalent: factorial between-groups design (ANOVA), 2x2 or larger

3 types of factorial designs

determined by the nature of the factors
independent or between-group factorial design:
- all factors are independent variables
- this design creates a set of experimental conditions which are all independent of each other
- each participant experiences only one unique condition and only contributes data to one cell mean
repeated measures or within-group design:
- all factors are repeated measures variables
- one set of participants contributes data to every single condition (every cell mean)
mixed factorial: has at least 1 independent and 1 repeated measures factor in the design

Phi coefficient

effect size for contingency tables; applies ONLY to contingency tables
measures the strength of association between two variables
for dichotomous data (x and y) we would use r squared; phi is not identical to r squared because it does not measure the proportion of variability accounted for -- it's an effect size
difference between proportion of variability accounted for and effect size: povaf is between 0 and 1, effect size is between 0 and infinity
small: 0.10, medium: 0.30, large: 0.50
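
phi follows directly from the chi square value; a sketch on a made-up 2 x 2 table (yates correction turned off so the uncorrected chi square feeds phi):

```python
import math
from scipy.stats import chi2_contingency

# hypothetical 2 x 2 contingency table
observed = [[25, 15],
            [10, 30]]
N = sum(sum(row) for row in observed)

chi2, p, dof, expected = chi2_contingency(observed, correction=False)
phi = math.sqrt(chi2 / N)   # phi = sqrt(chi square / N) for a 2 x 2 table
print(phi)                  # judge against 0.10 / 0.30 / 0.50
```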

qualities of the sampling distribution in the R A M

empirical null distribution (or sampling distribution)
- a frequency distribution of all possible outcomes (permutation distribution)
- is a posteriori: generated from the data, therefore each sampling distribution is unique to each data set
- each possible outcome is represented by its mean difference on the x axis and the frequency of that difference on the y axis

row main effect

examines whether there is a difference between the row marginal means. produces an F value for the row factor. determines whether the row factor alone affects the DV.

advantages of ANOVA for a factorial design

factorial designs allow you to include 2 or more factors in one analysis, rather than having to carry out a series of 1-way designs
- more economical
- allows some control over unexplained variability, so it is a bit more powerful -- easier to control the experimental situations when examining the influence between the factors and their effect on behaviour
MAJOR advantage: allows you to test for the presence of an interaction, aka you can examine the generalizability of a factor

error rates for the different comparison tests

fisher's: alpha PC
tukey's: alpha EW (1-way ANOVA) or alpha FW (factorial ANOVA)
dunnett's: alpha EW (1-way ANOVA) or alpha FW (factorial ANOVA)
dunn's: alpha EW

how is SS total partitioned for an analysis of independent designs (1 way and factorial)?

for a 1-way ANOVA: SST = SSG + SSE
- group effect: the ratio of the variance of SSG to SSE
for a factorial ANOVA: SST = SScells + SSerror, and SScells = SSrows + SScols + SSrxc
- main effect of the row factor: the ratio of the variance of SSrows to SSerror
- main effect of the column factor: the ratio of the variance of SScols to SSerror
- interaction effect: the ratio of the variance of SSrxc to SSerror
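
the 1-way partition can be verified numerically; a sketch with made-up scores showing SST = SSG + SSE:

```python
import numpy as np

# hypothetical 1-way design: k = 3 groups of 3 scores
groups = [np.array([8.0, 9.0, 10.0]),
          np.array([11.0, 12.0, 13.0]),
          np.array([14.0, 15.0, 16.0])]
all_scores = np.concatenate(groups)
grand_mean = all_scores.mean()

SST = ((all_scores - grand_mean) ** 2).sum()                       # total
SSG = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)   # between
SSE = sum(((g - g.mean()) ** 2).sum() for g in groups)             # within
print(SST, SSG + SSE)   # 60.0 and 60.0: the partition holds
```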

application of chi square to research designs: k = 1, c > 2

goodness of fit!! df = c - 1
2 variations: E distributed proportionally or equally
- for equal distribution: just use the chi square formula
- for proportional distribution: calculate E based on the proportions, then use the formula
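
both variations map onto scipy.stats.chisquare; a sketch with made-up counts and made-up hypothesized proportions:

```python
from scipy.stats import chisquare

# hypothetical k = 1, c = 3 data, N = 60
observed = [20, 25, 15]

# equal-E variation: E = 60 / 3 = 20 per category (the default)
print(chisquare(observed))

# proportional variation: compute E from the hypothesized proportions
proportions = [0.5, 0.3, 0.2]
expected = [60 * p for p in proportions]   # 30, 18, 12
print(chisquare(observed, f_exp=expected))
```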

reporting outcomes for a factorial design

graphic figures only suggest the outcome; you only know whether the interaction is significant by doing a statistical test
report ALL SIGNIFICANT OUTCOMES, but also note non-significant findings of major treatment effects in the factorial
order of report:
- name the design and the type of analysis applied to the data
- report significant outcomes (main effects, interactions): name the groups, give descriptive stats, name the DV, give each F outcome
interpretation:
- always finish the results section by writing a sentence or 2 that tells what the analysis revealed about the effect of the IV on the DV
- be sure that the final paragraph is in lay person language and includes no psych jargon or acronyms

ch 12 when data are tallies or frequencies...

hypothesis testing procedures are adjusted to accommodate data that are qualitative (frequency counts, tallies) and represent a nominal scale of measurement
statistical tests applied to these data are referred to as non parametric tests because their descriptive statistics are not used to estimate population parameters
there are specific statistical tests associated with each type of research design, as well as tests that are unique to each model of hypothesis testing

logic of an analysis of variance as applied to a multi factorial design

hypothesis testing: still applying the random sampling model. assumes each individual condition (cell) is made up of participants that were randomly sampled from a unique population
difference: these separate sampling distributions of means are then recombined to form different null hypothesis distributions (for each F statistic) depending on which effect is being tested
this means you will have more than 1 null hypothesis to test, with each H0 testing each effect separately; each effect involves a unique sampling distribution created specifically for that particular effect

pros and cons of the two types of 2-group analyses- in general ch 11

if applying a t test to data instead of doing a randomization test, the t test will underestimate the value for p(obs), potentially increasing the probability of making a type 1 error
- this increases the probability of making a different statistical decision regarding the null hypothesis (rejecting when you should retain)
most of the time, both randomization and parametric tests give similar outcomes for p(obs), but it is not always possible to predict when this will or will not happen
values for p(obs) are the same when...
- you have a large N for sample size
- the sample approximates the characteristics of a normally distributed population
- you have not violated the parametric assumptions
both models of HT can be applied to complex designs

fixes for violations of assumptions for RSM

if there is no random sampling:
- randomly assign participants to each condition
- can't generalize the outcome from the sample to the population
if there are severe violations of homogeneity of variance:
- use levene's test to check for violations
- for the t test: SPSS recreates the sampling distribution to adjust for inequality of error between groups -- the new sampling distribution adjusts the values for estimated SE, df, t(obs), and p(obs)
- for the F test: the ANOVA is not reliable; SPSS offers a fix
- if probing, choose a multiple comparisons test that adjusts
if n is less than 7:
- no fix, sorry sis -- use the RAM
if the population is non normal (as estimated by the sample):
- do data transformations!!!
- apply to all scores in the set
- makes a non normal population more kurtotic
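
the check-then-adjust sequence for a homogeneity violation can be sketched with scipy on made-up groups; the unequal-variance t here is welch's correction, comparable to the adjusted line SPSS reports:

```python
from scipy.stats import levene, ttest_ind

# hypothetical independent groups with visibly unequal variances
g1 = [10.0, 11.0, 9.0, 10.0, 12.0, 11.0, 10.0]
g2 = [8.0, 15.0, 5.0, 18.0, 3.0, 16.0, 7.0]

print(levene(g1, g2))   # step 1: test for a homogeneity violation

# step 2: equal_var=False requests the welch-adjusted t, which uses a
# separate-variances SE and adjusted df instead of pooling
print(ttest_ind(g1, g2, equal_var=False))
```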

assumptions when applying hypothesis testing to pearson r ch 3

independent random sampling
data are scores, because you are testing pearson r
each variable is independently normally distributed in the population
homogeneity of variance and sample size
n is 7 or more

factor

independent variable
levels of a factor: the subgroups or conditions of that factor

why are they called factorial designs?

it has: two or more factors (IVs) one dependent measure (DV) approximately equal numbers of scores (Ps) in each cell

k and c

k = number of variables
c = number of categories of the variable

frequency analysis with the random assignment model: binomial exact test

k = 1, c = 2, any size n
- gives an exact probability value for a specified outcome and any more extreme outcome
- only 2 possibilities: each trial has only a + (P) or - (q, or not-P) outcome
sampling distribution:
- binomial probability distribution
- when p = 0.5, the shape is kurtotic
- when p does not equal 0.5, the shape is skewed toward the higher probability
H0: the response does not differ from a chance distribution of frequencies; this is a random distribution of frequencies
- H0: p = 0.5
- H1: p does not equal 0.5
p(obs) = frequency of the outcome of interest / total possible outcomes, plus all probabilities more extreme
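
a sketch of the exact test with scipy's binomtest (SciPy 1.7+) on a made-up outcome:

```python
from scipy.stats import binomtest

# hypothetical k = 1, c = 2 data: 9 "+" responses out of n = 12 trials
result = binomtest(9, n=12, p=0.5, alternative="two-sided")
print(result.pvalue)   # exact p(obs): the outcome plus all more extreme ones
```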

interaction

lines in a line graph intersect or are at an angle to each other, NOT parallel
diagonal means are different
bar graph: one high, one low, alternating

2 main effects, no interaction

lines on a line graph are parallel, but far apart from each other and not parallel to the x axis
bar graph: a1 is different from a2, but also the bars are tall and short; however, the pattern is the same for both a1 and a2
column means and row means are different; diagonal means are the same

column main effect, no interactions

look at the angle of the lines relative to the x axis -- the sharper the angle, the more likely the main effect is significant
column means are different
A1 (including b1 and b2) is different from A2 (including b1 and b2)

row main effect

look at the distance between the lines -- if there is a big difference between the levels of the lines, then there is probably a significant main effect for row
row means are different
one tall and one short bar in a bar graph

math fixes

moderate positive skew: square root transformation (limitation: have to deal with negative numbers in a special way)
strong positive skew: log transformation (original data can't be less than 1, or logs will produce negative numbers)
extreme positive skew: inverse transformation (original data can't be between 0 and 1)
if the data have a negative skew, first reverse it to a positive skew, then do one of the 3 above
non kurtotic distribution: arcsine transformation
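
a sketch of these transformations with numpy on made-up positively skewed scores (the reflection constant max + 1 is one common convention for reversing a negative skew):

```python
import numpy as np

x = np.array([1.5, 2.0, 4.0, 9.0, 30.0])   # hypothetical positive skew

sqrt_x = np.sqrt(x)   # moderate positive skew
log_x = np.log10(x)   # strong positive skew (needs scores >= 1)
inv_x = 1.0 / x       # extreme positive skew (avoid scores between 0 and 1)

# negative skew: reflect first so the skew becomes positive, then transform
reflected = (x.max() + 1) - x
print(np.sqrt(reflected))
```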

partitioning the SST for a 2 factor design

must calculate 3 unique forms of between-cells variance and compute an F ratio for each form of between-group variance (these will add up to the omnibus F)
SST = SScells + SSerror
SScells = SSrows + SScols + SSrxc

additive vs non additive

no interaction: additive
- if two variables can relate to each other without interacting, they are additive
- best illustrated with a time variable
significant interaction: non additive, or multiplicative
- if the variables do interact with each other, the effect is multiplicative or non additive: change accumulates across time

independence vs dependence

no interaction: the influence of one factor on behaviour is independent of the effect of the other factor
- you can interpret the effect of one factor without considering the presence or role of the other factor
interaction: a dependent effect between the factors
- you cannot interpret the effect of one factor on performance without talking about the role of the other factor; the effect of one factor is qualified by the presence of the other factor

sampling distribution for the random assignment model

not a theoretical distribution; is empirical (based on real data)
created a posteriori -- only after the data are collected
generated by a computer program

characteristics of the null distribution for the chi square statistic

numerically represents a squared z distribution
the x axis is defined with the standard score values for chi square (which are only positive)
large values of chi square mean there is only a small probability that the distribution of responses was random; Ps were expressing some preference in their responses

transforming data to a lower scale of measurement and applying design equivalent tests

options: transform data to ranks
- apply a rank order transformation through SPSS (or rank them yourself tho)
- if the design is a related samples design, use the wilcoxon signed-rank test
- if the design is independent groups (k = 2), apply the wilcoxon rank sum test or the mann-whitney U test
- for multi-group designs, use the kruskal-wallis H test
transform score data to nominal data
- sign test: data are frequencies but were originally a related samples design
- median split: data are frequencies but were originally an independent samples design
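
a sketch of the design-equivalent rank tests in scipy, with made-up scores for each design type:

```python
from scipy.stats import wilcoxon, mannwhitneyu, kruskal

# related samples design -> wilcoxon signed-rank test
pre = [10.0, 12.0, 9.0, 14.0, 11.0, 13.0, 10.0]
post = [12.0, 14.5, 10.0, 17.0, 12.5, 15.0, 11.0]
print(wilcoxon(pre, post))

# independent groups (k = 2) -> mann-whitney U test
g1, g2 = [3.0, 5.0, 7.0, 9.0], [6.0, 8.0, 10.0, 12.0]
print(mannwhitneyu(g1, g2))

# multi-group design -> kruskal-wallis H test
print(kruskal(g1, g2, [9.0, 11.0, 13.0, 15.0]))
```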

p obs vs p (a)

p(obs): the actual p value we observe; what we use to reject or retain the null
- the observed probability of the outcome
- represents the area under the curve (of the sampling distribution) associated with the observed statistical outcome (the computed value for the test statistic)
- it is the ACTUAL probability that your outcome occurred by chance. the value for p(obs) comes from the test ratio used
p(a): the value we base rejecting or retaining the null on
- p(a) is the probability of alpha, the probability of making a type 1 error

random sampling model vs random assignment model

random sampling model:
- generated through mathematical theory (a priori)
- values are standard scores based on all possible values, whether or not they actually occurred in the data set
- x axis values: standard scores, range from negative infinity to positive infinity
- y axis: the probability of occurrence of each possible outcome
- shape: smoothly changing function of y values, asymptotic to the x axis
random assignment model:
- generated empirically from the data (a posteriori)
- values are means based on values in the data; only values present in the data set are used to create the distribution
- x axis values: values as defined by the data; for a k = 2 design: mean differences
- y axis: the actual frequency of each possible outcome in the data set
- shape: smooth or lumpy, data driven, touches the x axis

numerator of F ratio in ch 9

represents the MS between group variance estimate for whatever effect is being tested (main effect or interaction)

what is the logic associated with applying ANOVA to a factorial design? state the null hypotheses for each effect and determine the value for F critical for each effect understand how each value for F obs tests each null hypothesis

row main effect: H0: there will be no difference in mean scores as a function of the IV
- F = MSrows / MSe
- F crit based on df rows, df error
column main effect: H0: there will be no difference in mean scores as a function of the IV
- F = MScols / MSe
- F crit based on df cols, df error
interaction: H0: mean scores will not vary as a function of IV1 and IV2
- F = MSrxc / MSe
- F crit based on df rxc, df error
df error = NT - #cells

assumptions for independent samples k=2

samples randomly sampled from a population where the DV is normally distributed
data are scores
n is 7 or more for each condition
homogeneity of population variances between both conditions
homogeneity of sample size
robustness of assumptions: the t test is robust to most violations; can handle violation of one assumption at a time
- NOT robust when the larger variance is associated with the smaller n
- a larger population variance with a larger sample size will inflate type 2 error

difference between experiment wise, family wise, and per comparison error rate

special forms of error rates were developed so that the probability of making a type 1 error wouldn't inflate by stacking several multiple comparisons
per comparison error rate refers to how much of the alpha value is budgeted for each comparison; family-wise and experiment-wise refer to the alpha value when accounting for all comparisons (total budgeted across all comparisons)
PC error rate:
- intended for only a small number of multiple comparisons
- multiple t tests, fisher's
- probability of making a type 1 error for any individual comparison
EW/FW error rate:
- intended for many comparisons
- dunn's, tukey's, dunnett's
- probability of making a type 1 error across all tests

H0 AND H1 FOR RAM

still want to disconfirm the null hypothesis
H1: xbar1 will not equal xbar2; xbar1 - xbar2 will not equal 0
- the particular arrangement of scores in the data set was not independent of the random assignment: the data values in that condition are a result of the treatment, which systematically influenced the difference (not by chance)
H0: xbar1 = xbar2; xbar1 - xbar2 = 0
- the arrangement of scores was a random occurrence, independent of the fact that participants were randomly assigned to those 2 conditions (it was by chance)

reality of HT and violations of assumptions

t test and violations:
- the t test is moderately robust to violations
- BUT: t(obs) is biased when there is heterogeneity of sample sizes AND heterogeneity of variances, especially when the larger variance is with the smaller sample size (inflates type 1 error)
F test and violations:
- less robust to violations of assumptions
- need homogeneity of variance and N within a 1:1.5 ratio
REALITY: we rarely randomly sample from the population, and behaviours in a study are rarely normally distributed
- eg SONA systems
we rarely know the characteristics of a theoretical population; the best we can do is estimate these from a sample
we can't prove that a sampling distribution is a specific shape
- when Ni is more than 25 to 30, the sampling distribution of a mean will be kurtotic, but it is sometimes unrealistic or undesirable to have a sample this large

test of independence vs test of homogeneity

test of independence: 1 sample of participants categorised on variable 1, and your survey represents variable 2
- marginals not known until the data are collected
test of homogeneity: 2 survey questions asked; we are testing whether responses are equally distributed
- marginals known before the data are collected

review of logic for factorial anova

testing the significance of a treatment effect (eg the row main effect):
when H0 is true:
- there is no effect of that factor on behaviour (there is no overall difference)
- the variance due to the row main effect is an estimate of the population error variance
- the value for MSrows estimates the population error variance
when F(obs) is bigger than 1:
- it is reflecting the presence of a between-group effect due to that factor

chi square statistic and its distribution

the chi square stat is best known for the analysis of tallies, but when there are small expected frequencies for one or more categories, the chi square statistic can produce a biased underestimate of the p(obs) value, resulting in an increased probability of a type 1 error
- to avoid this, apply a correction or use the appropriate test from the random assignment model
general characteristics of chi square:
- can be used with all 3 design options, as long as N is 20 or more and the value for expected frequencies is 5 or more
- does not rely on population parameters to test the hypothesis, but does assume we have a good estimate of the expected values
H0: the probability that a distribution of observed frequencies across various categories comes from a population where responses are assumed to be evenly or proportionally distributed across categories
H1: the probability that a distribution of observed frequencies in various categories came from a population with a hypothesized non random distribution of those frequencies
the observed chi square statistic reflects the degree to which the observed frequencies diverge from the expected frequencies

effect size (ch 6)

the degree to which the experimental manipulation separates the H0 and H1 distributions
- the distance between the means of the distributions (mu0 and mu1)
- distance expressed in standard deviation units
- larger effect size = smaller overlap of the H0 and H1 distributions
gamma (cohen's effect size): for a whole empirical population
- H1 distribution mean minus H0 distribution mean, over the SD of the H1 distribution
- interpreting: small: 0.20, medium: 0.50, large: 0.80
effect size for a theoretical population: cohen's d. the formula varies for each design (but essentially it is (mean - mean) / sd)
more variability reduces effect size; less variability increases effect size

when you have a significant interaction, it means

the difference in performance across trials is NOT consistent
performance varies as a function of both factors

problems with t tests

the distribution of the behaviour we are examining is rarely normal in populations
the t test is only moderately robust to violations of its assumptions
we rarely randomly sample
we don't really know the characteristics of a theoretical population; we can only estimate them from a sample

main effects

the effect of one factor averaged over the levels of another factor
allow us to test a null hypothesis for each individual factor in the experimental design
mathematically, testing a main effect compares one level of a marginal mean to the other level(s) of a marginal mean
hypothesis testing for main effects produces a separate F(obs) for each main effect
row main effect and column main effect

when an interaction is significant

the effects of the two factors on behaviour are dependent on one another, the difference between the differences is large, the relationship is non additive

when an interaction is not significant

the effects of the two factors on behaviour are independent, the difference between the differences is small, the relationship is additive

interaction (ch 9)

the extent to which the effect of one factor varies across the levels of the 2nd factor
does the presence of a second variable influence how you explain participants' behaviour in the experimental condition?
also a test of a treatment effect, so a separate F(obs) is computed
calculated from cell means

general characteristics of frequency analysis

the independent variable(s) of the research design are discrete categories (nominal or ordinal scale)
data are qualitative, measured as a frequency count or tallies for each category of the IV
each tally is an independent occurrence: a person can contribute only 1 response to just one category
- the degree or intensity of a participant's response is irrelevant
tests in frequency analyses are called non parametric because...
- the analysis involves no estimation of population parameters
- decisions about H0 do not require random sampling of observations
- in the interpretation of outcomes, however, assumptions about random behaviour in the population are made
statistical analyses are a test of the null hypothesis
- you are testing whether the data (frequency counts) are evenly (or proportionally) distributed across categories or conditions
- you are testing the probability that the distribution of the tallies across categories is a random occurrence
- you compare the observed distribution of frequencies (tallies) against the expected distribution of frequencies
caution about small data sets:
- the sampling distribution associated with the chi square statistic does not manage small expected frequencies very well, especially when the value for df is also small
- in these cases, it is better to use a randomization test alternative (fisher's test) or do a yates correction
ADD TO THIS LATER

cell means

the mean of a specific combination of levels of the factors
determine the number of cell means by multiplying the design label (eg 2 x 3 = 6)

marginal means

the mean score for all participants at a particular level of one factor
determine the number of marginal means by adding the levels in the factor design (eg 2 + 3 = 5)

(ch 9) calculate an F ratio for....

the omnibus F (based on SScells)
each main effect
each possible combination of main effects that can form an interaction

power

the probability that the experimental outcome allows for rejection of the null hypothesis if the independent variable has a real effect
aka the probability of correctly rejecting a false H0
the probability of not making a type 2 error
1 - beta
a measure of sensitivity

conclusion when you reject the null for chi square test

the probability that the response rate was random was less than 0.05

when you have no interaction between factors, it means

the value for F(obs) for the interaction was not significant
the change in values across trials is consistent
you can generalize the effect of one factor on the other

no main effect or interactions

there are no significant treatment effects or interactions; performance was the same in all conditions
row marginals are the same, column marginals are the same, diagonal means are the same
line graph: the lines are in the same place
bar graph: all bars are equal height

difference between the differences

this refers to how we use the matrix to determine the presence of a main effect or interaction
no interaction: the difference between the differences is 0
interaction: the difference between the differences is significant
don't use this in a formal report

fisher's exact test

to analyze a 2 x 2 contingency table, which involves 2 categorical variables; any size n
sampling distribution: binomial probability distribution
preferred option to the yates correction
H0: the distribution of frequencies on one variable is not contingent on the distribution of responses on the other variable
is a test of the independence between 2 variables
p = (product of the marginal totals!) / (product of the cell frequencies! x N!)
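
a sketch with scipy's fisher_exact on a made-up small-N 2 x 2 table:

```python
from scipy.stats import fisher_exact

# hypothetical 2 x 2 contingency table with a small N (NT < 20)
observed = [[5, 1],
            [2, 6]]

odds_ratio, p_obs = fisher_exact(observed)   # exact p, per the formula above
print(p_obs)
```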

random assignment model

use when extreme violations of the assumptions for the RSM occur
does not rely on samples being representative of the population
provides a more accurate evaluation of your statistical outcome than the RSM
basic principles of hypothesis testing are retained, including:
- stating an alternative and a null hypothesis
- setting a value for p(a) a priori
- using a sampling distribution to compare p(obs) to p(a). *the difference is that this distribution is empirically determined from the data, not theoretically derived as with the z, t, and F distributions* -- this means steps 2 and 4 are reversed
- carrying out an experiment using sound methods
- applying a test ratio (randomization tests) that does not involve parameters (non parametric; does not estimate population parameters)
- analysis of statistical significance based on p(obs) < p(a)
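
a sketch of a randomization (permutation) test for a k = 2 design with scipy.stats.permutation_test (SciPy 1.7+), on made-up scores:

```python
import numpy as np
from scipy.stats import permutation_test

# hypothetical k = 2 score data after random assignment
g1 = np.array([12.0, 14.0, 11.0, 15.0, 13.0])
g2 = np.array([9.0, 10.0, 8.0, 11.0, 10.0])

def mean_diff(x, y):
    return np.mean(x) - np.mean(y)

# the empirical sampling distribution is built a posteriori by rearranging
# the observed scores across the two conditions
res = permutation_test((g1, g2), mean_diff,
                       permutation_type="independent", n_resamples=10000)
print(res.pvalue)   # p(obs): frequency of arrangements as or more extreme
```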

