PSYCH 218
Null hypothesis
(H subscript 0) Logical counterpart of H1. Formulated so that if H0 is false, then H1 is true. -H0 and H1 are mutually exclusive and exhaustive -if H1 is nondirectional, then H0 states that the IV has no effect on the DV -H1: marijuana affects RT -H0: marijuana does not affect RT
what level of power is more common?
0.40 or 0.60
what is a desirable level of power?
0.80, but this is rare
Chapter two
:)
What is CSIS?
Canadian CIA
Human trafficking: can be recognized under all defs. of crime but....?
Cross cultural- some places it is more accepted in the culture
Ch.12 sampling distribution of a statistic gives:
all values that a statistic can take -probability of getting each value under the assumption that it occurred by chance alone
interval estimate
compute confidence interval for pop. mean
correlation coefficient
expresses quantitatively the magnitude and direction of the relationship
The Lindbergh Law was created in response to the case of:
the Lindbergh baby kidnapping
mode:
the most frequent score in the distribution -calculated by inspection
bimodal distributions
two modes
Alpha, beta and reality: when one does an experiment, does one want to minimize both alpha and beta, and why?
Yes! If H0 is true: p(correctly concluding) = p(retaining null) = 1 - alpha; if alpha is stringent at 0.05, then 1 - 0.05 = 0.95. ALSO, if H0 is false: p(correctly concluding) = p(rejecting null) = 1 - beta; if beta is 0.10, then 1 - 0.10 = 0.90
example of t test for correlated groups
you hypothesize that endorsing increases in social spending affects politicians' popularity -data: 8 politicians receive a rating out of 10 before and after endorsing increases in social spending; sum of D = 22 (sum of the diff. scores) -N = 8, D bar obt. = 22/8 = 2.75. Mean popularity rating went up 2.75 after politicians endorsed increases in social spending.
when r=1 paired raw scores have the same:
z score, knowing one of the paired scores lets us predict the other. Paired raw scores occupy the same relative positions within their own distributions
Alternative hypothesis
(H subscript 1) Claim that the diff. in results between conditions is due to the independent variable, e.g. marijuana affects reaction time -can be directional or nondirectional. Nondirectional: marijuana affects RT. Directional: marijuana makes RT slower
example of binomial distribution for flipping 2 unbiased coins (N=2). List the properties.
(P + Q)^N = P^2 + 2PQ + Q^2 = all the possible outcomes. 1) The letters (P and Q) tell the kinds of events that comprise the outcome 2) the exponents tell how many of that kind of event there are in the outcome 3) the coefficients tell how many ways there are of obtaining the outcome. P^2 = one way of obtaining two P's, 2PQ = two ways of obtaining one P and one Q, Q^2 = one way of obtaining two Q's (two tails)
positive correlational relationship
(also called direct) slope is positive
negative correlational relationship
(also called inverse) slope is negative
z score
(standard score): a transformed score that designates how many standard deviation units the corresponding raw score is above or below the mean
(ΣX) squared
(x1 + x2 + x3 + x4) squared... add all the numbers together first, then square
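A small sketch of the difference between (ΣX) squared and ΣX squared. The four scores are hypothetical, chosen only for illustration.

```python
# Hypothetical scores, not from the course notes.
scores = [1, 2, 3, 4]

# (sum of X) squared: add all the scores first, then square the total.
square_of_sum = sum(scores) ** 2

# sum of (X squared): square each score first, then add.
sum_of_squares = sum(x ** 2 for x in scores)

print(square_of_sum)   # 100
print(sum_of_squares)  # 30
```

The two quantities are generally different, which is why the order of operations matters.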
white collar crime in canada
-Bre-X, gold mining fraud - "salting": no real gold mine, it was fake! -company goes bankrupt, investors lose billions -people rarely go to jail for white collar crime, Canada described as la la land
REMEMBER WITH MS between and MS within
-MS between inc. with effect of IV and MS within is unaffected by effect of IV
Type one and type two errors
-When we make decisions about H0, we can make two kinds of errors
terrorism: labeling theory
-acts are only criminal if people in power label them criminal -institutions (E.g. states/ law enforcement agencies) label and have all the power
terrorism: NR conservatism
-anti-terrorism: terrorism is a violation of law e.g. 9/11 (results from lack of control, terrorists = inherently evil, response = coercion, deterrence)
Using binomial table (Called table B in appendix D)
-any problem involving binomial data can be answered by using the binomial expansion -easier to use Table B in Appendix D -can be used to solve any problem involving binomial data for N less than or equal to 20 and for values of P or Q given in the table -NOTE: can use Table B to solve problems in terms of either P or Q
back to our ex)
-calc. appropriate stat.: z obt. = 1.90 -eval. stat. based on sampling dist.: z crit = 1.645 (alpha = 0.05, 1 tailed, pos. direction) -since z obt. is greater than z crit, we reject H0; we conclude: this year's calc. class is better than previous years -sample with X bar obt. = 63 is a random sample from a pop. where mean > 60
terrorism: bill C51
-combat terrorism in Canada, criminalizes promotion of ideas related to terrorism e.g. on social media -pro: gives gov. power, allows shared info btwn. agencies -against: CSIS gets too much power, unconstitutional, violates civil rights
Multiplication Rule (the and rule) definition
-concerns joint or successive occurrence of several events
review of presentations thus far:
-domestic violence and gangs-- focus on psych. positivism- indicates the criminal is not like us, they are set apart -cyber, white collar, sports crimes-- focus on rationality-> classical theory: imagine ourselves in that situation with a cost/ benefit analysis
Variability and three commonly used measures: calculating standard deviations with sample data
-goal is to estimate the population standard deviation -N in the denominator gives an estimate of the population s.d. that is too small -N-1 in the denominator gives a more accurate estimate of the population s.d.
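A quick sketch of the N versus N-1 denominators, using Python's standard `statistics` module. The sample values are hypothetical.

```python
import statistics

# Hypothetical sample of 8 scores (illustration only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# pstdev divides the sum of squared deviations by N.
sd_with_n = statistics.pstdev(sample)

# stdev divides by N - 1, the usual estimate of the population s.d.
sd_with_n_minus_1 = statistics.stdev(sample)

print(round(sd_with_n, 3))
print(round(sd_with_n_minus_1, 3))
```

The N-1 version is always the larger of the two, which corrects for the N version running too small.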
Ex of deriving sampling distribution
-experiment with sample size 2, N=2 -use sign test for analysis -null population: 6 scores, 3 +'s and 3 -'s, P (= Q) = 0.50 -derive sampling distribution: 1) determine all different samples of size N=2 that can be formed from the null population (sampling one at a time with replacement, 36 possible samples) 2) calculate the statistic for each sample -possible values = 2 +'s, 1 + or 0 +'s 3) calculate the probability of getting each value of the statistic if chance alone is responsible -36 equally likely possible samples: p(2 +'s) = 0.2500, p(1 +) = 0.5000, p(0 +'s) = 0.2500 -would have gotten the same results no matter how big the population
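The enumeration above can be sketched in a few lines. This assumes the same null population of 3 +'s and 3 -'s and sampling with replacement; the variable names are made up for the example.

```python
from collections import Counter
from itertools import product

# Null population from the example: 3 plus signs and 3 minus signs.
null_population = ["+", "+", "+", "-", "-", "-"]
N = 2

# All ordered samples of size N, sampling one at a time with replacement.
samples = list(product(null_population, repeat=N))

# Statistic for each sample: the number of +'s it contains.
counts = Counter(sample.count("+") for sample in samples)

# Probability of each value of the statistic if chance alone is operating.
sampling_dist = {k: counts[k] / len(samples) for k in (2, 1, 0)}

print(len(samples))    # 36
print(sampling_dist)   # {2: 0.25, 1: 0.5, 0: 0.25}
```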
properties of standard deviation
-gives a measure of dispersion relative to mean -sensitive to each score in the distribution -stable with regard to sampling fluctuations
terrorism: Al-Qaeda
-global extremist Islamic terrorist movement founded by Osama bin Laden -created in Afghanistan, religious fight in name of Allah
what can inferential statistics be used for?
-hypothesis testing -parameter estimation: inferential techniques to draw inferences about a population from a sample
Implications for interpreting non-significant results
-if our results are non-significant, it could be: 1) because the null hypothesis is true 2) because the null is false but power was too low in the experiment to reject the null -because of 2), non-significant results lead us to say "we fail to reject the null hypothesis" but NOT "we accept the null" or "we have proven it" -if power is high, we can be fairly confident with non-significant results that the real effect of the IV is not large- BUT power is always low to detect small but real effects of the IV
effect of range on correlation
-if there is a correlation between X and Y, restricting the range of either variable will lower the correlation ex) college entry scores and first year scores
F distribution
-in chapter 12, 13, 14 we used the mean as a basic stat. to evaluate H0 -now the variance, one of most important tests -Part 1) calculate. -part 2) Evaluate using the sampling distribution of F.
Evaluating the tail of the distribution
-in the marijuana example, we found the obtained probability by using the specific outcome (9 +'s) -an oversimplification -in evaluating experimental results the obtained probability is: the prob. of getting the obtained outcome OR any outcome more extreme (further into the tail of the distribution) -we evaluate the tails of the distribution beginning with the obtained result
Why do you need random samples?
-in order to generalize from the sample to the population -the sample must be representative of the population --> "random samples are representative" -Must apply the laws of probability --> "these laws depend on random sampling"
terrorism: sociological positivism
-lack of opportunities to achieve goals -use violence to intimidate society into political/ religious ideologies
how to achieve high power with stringent alpha level?
-large N, use powerful test for data, control conditions to reduce variability in data (e.g. repeated measures design)
setting a more stringent alpha level:
-lowers chance of REJECTING H0 when H0 is true (type I error) -increases chance of retaining H0 when H0 is false (type II error)
Prediction and imperfect relationships
-many possible regression lines could be drawn
assumptions underlying ANOVA
-normality in population, homogeneity of variance -ANOVA is robust- minimally affected by violations of pop. normality or homogeneity of variance, provided sample sizes are equal
rounding
-not following textbook -if we want to report a result, round to two decimal places. We will round up the number in the 2nd decimal place if it is 5 or above, otherwise round down
z test applied: what is the prob. of getting X bar obt. greater than or equal to 63 if the 40 scores are a random sample from a normal population with mean = 60 and s.d. = 10 (assume alpha is one tailed, 0.05)
-our statistic is X bar obt. (sample mean) -so to find the appropriate prob. we need to evaluate the stat. using the sampling distribution of the mean -one solution: use sampling dist. of the mean for samples of N = 40 from a pop. with mean of 60 and s.d. of 10. 1) it is normal (N is over 30 and raw score pop. is normal) 2) it has mean = 60 3) its s.d. (standard error) = 10 / square root of 40 = 1.58 (see equation) 4) calc. z score using X bar = 63, mean = 60, s.d. = 1.58: z obt. = (63 - 60) / 1.58 = 1.90 5) from table A you can see prob. = 0.0287 (area C), which is less than alpha, so reject H0 and conclude this year's calc. class is better than previous years -sample with X bar obt. = 63 is a random sample from a pop. where mean > 60
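A sketch of the same z test done numerically instead of with table A. The normal tail area comes from the standard library's `math.erfc`. Because z is not rounded to 1.90 here, the probability comes out slightly above the table value of 0.0287, but the decision is the same.

```python
import math

# Values from the calculus class example.
pop_mean, pop_sd = 60, 10
n, x_bar_obt = 40, 63

# Standard error of the mean: pop. s.d. divided by the square root of N.
std_error = pop_sd / math.sqrt(n)

# z obt.: how many standard errors the sample mean lies above the pop. mean.
z_obt = (x_bar_obt - pop_mean) / std_error

# One-tailed probability of a z this large or larger under the normal curve.
p_one_tailed = 0.5 * math.erfc(z_obt / math.sqrt(2))

alpha = 0.05
print(round(std_error, 2), round(z_obt, 2))
print("reject H0" if p_one_tailed <= alpha else "retain H0")
```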
if the events are mutually exclusive:
-p(A and B) = 0, they can't happen at the same time! -the addition rule simplifies to p(A or B) = p(A) + p(B) --> ONLY with mutually exclusive events
Link between regression constants and Pearson's r
-Pearson's r can be defined in terms of the linear regression line -recall bY = slope of the least squares regression line when scores are plotted as raw scores -Pearson's r = slope of the least squares regression line when scores are plotted as z scores! Pearson's r = a slope
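A small numeric check of the claim that Pearson's r equals the least squares slope once both variables are converted to z scores. The paired scores are hypothetical, and the helper name `z_scores` is made up for this sketch.

```python
import math
import statistics

# Hypothetical paired scores (illustration only).
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]

def z_scores(values):
    """Convert raw scores to z scores using the mean and s.d. of the set."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

# Pearson r from raw deviation scores.
mx, my = statistics.fmean(x), statistics.fmean(y)
sum_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
ss_x = sum((a - mx) ** 2 for a in x)
ss_y = sum((b - my) ** 2 for b in y)
pearson_r = sum_xy / math.sqrt(ss_x * ss_y)

# Least squares slope for predicting zY from zX (both have mean 0).
zx, zy = z_scores(x), z_scores(y)
slope_z = sum(a * b for a, b in zip(zx, zy)) / sum(a * a for a in zx)

print(round(pearson_r, 4), round(slope_z, 4))
```

Both numbers agree, so the slope of the z score regression line is r.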
how to construct 95% interval?
-pop. mean for random sample of 20 NHL players who weighed, X obt = 185, s = 15 (see notes for exact calc.) produces a range of scores
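A rough sketch of the 95% interval for the NHL example, assuming the usual form X bar obt. plus or minus t crit times the estimated standard error. The t crit of 2.093 is the standard table value for df = 19 at alpha = 0.05, 2 tailed; it is typed in by hand here, not computed.

```python
import math

# Values from the NHL weight example.
x_bar_obt, s, n = 185, 15, 20

# Assumption: critical t read from a t table (df = N - 1 = 19, 0.05 two tailed).
t_crit = 2.093

# Estimated standard error of the mean.
std_error = s / math.sqrt(n)

lower = x_bar_obt - t_crit * std_error
upper = x_bar_obt + t_crit * std_error

print(round(lower, 2), round(upper, 2))
```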
why would we care about the relationship between two variables?
-prediction (=regression) -causation--> other research must be done to establish this -reliability (ex test retest split half reliability)
what assumptions underlie the t test for correlated groups
-sampling dist. of D bar is normal: N greater than or equal to 30, or pop. of difference scores is normal
properties of the mean
-sensitive to exact value of all scores in distribution -sum of the deviations around the mean is zero -very sensitive to extreme scores -sum of squared deviations of all the scores around the mean is a minimum- if every deviation score is squared and you add them up, the mean generates a smaller number than any other number would -least subject to sampling variation
correlation coefficients depend on:
-shape of the relationship between variables: Pearson r --> linear; η (eta) --> curvilinear -type of measuring scale underlying the data: Pearson r --> interval or ratio; Spearman's rho (rs) --> one or both variables are ordinal; point biserial correlation (rb) --> one variable is at least interval and the other is dichotomous (having two options, e.g. sex); phi coefficient --> both variables are dichotomous -rs, rb and phi are the same thing as Pearson's r but applied to lower order scaling
conditions for using t test
-single sample, mean is specified, s.d. unknown, X bar obt. is basic stat., sampling dist. of X bar is normal (N greater than or equal to 30, or pop. of raw scores is normal.)
Using confidence intervals to eval the effect of the indep. variable (IV) in indep. groups design
-so far: H0 approach: lets us determine whether we can reject H0; if we can, lets us conclude that the IV had a real effect -now: confidence intervals approach: lets us determine whether we can reject H0; if we can, lets us conclude that the IV had a real effect. ALSO gives us an estimate of effect size -we already know how to construct a confidence interval for the pop. mean (one sample experiment) -now: construct a confidence interval for a two sample experiment -see evaluation online!
Confidence intervals for population mean
-sometimes we want to know value of pop mean, but we only have X obt. as an estimate of pop mean -What is av. weight of NHL players?
Response to gangs
-state suppression and punishment of gangs has been deeply counterproductive -interaction with pleasure-pain principle -Goal: creation of opportunity, remove strain. Make social structures more appealing -organic solidarity: reintegrate gang offenders!
Notes about t distribution
-symmetrical about 0, becomes closer to normal (z) with inc. df; with df = 20, t looks virtually normal; with df = infinity, t is identical to z -at any df other than infinity, the t dist. has more extreme values (more elevated tails) than z -thus for any alpha level, t crit is higher than z crit -critical values of t for various alpha levels and df appear in table D of appendix D -use table D to eval. t obt. for any expt.
prob. of getting each value of sample mean if sampling is random from H0 population can be derived empirically:
-take a pop. of raw scores with specified mean and s.d. -draw all possible samples of fixed size N -calc. mean of each sample (X bar) -calc. probability of getting each mean value if chance alone were operating
How to empirically generate the sampling distribution of F
-take all possible samples of size n1 and n2 from the same population, estimate the pop. variance from each sample using s1 squared and s2 squared, calc. F obt. for all possible combinations of s1 squared and s2 squared, calc. prob. of each different F obt. -the result is the sampling dist. of F with: all possible values of F and the probability of each value, assuming random sampling from the pop.
Between groups variance (MS between)
-the second estimate of variance is based on variability between groups -H0 states that pop. mean one = pop. mean two -SO if H0 is true, we can use the variance of the sample means to estimate the variance of these pops. How? Recall that if we take all possible samples of size n from a pop. and calculate X bar for each, the variance of the sampling distribution of the mean is s.d. of X bar squared = pop. s.d. squared / n, which can be converted algebraically into pop. s.d. squared = n x s.d. of X bar squared -see notes for equation
properties of normal curve
-theoretical distribution of population scores -bell shaped curve
Generating binomial distributions from the binomial expansion
-there is a math expression that lets us generate everything so far in a simple way -the binomial expansion = (P + Q) ^ N where P = probability of one of the two possible outcomes on a trial, Q = probability of the other possible outcome and N is the number of trials. This is for outcomes that are mutually exclusive and exhaustive!
Why can you not simply add the two squared correlations to get a multiple coefficient of determination?
-there is overlap in the explained variability (the predictors are not independent of each other) -there is overlap in the variability of Y accounted for by both X variables -must use R squared equation
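A sketch of why the overlap matters, assuming the standard two-predictor formula for R squared (the notes only say "R squared equation", so the formula and the function name `multiple_r_squared` are assumptions here). The correlations plugged in are hypothetical.

```python
def multiple_r_squared(r_y1, r_y2, r_12):
    """R squared for Y predicted from X1 and X2 (standard two-predictor formula).

    r_y1, r_y2: correlations of Y with X1 and X2.
    r_12: correlation between the two predictors (the overlap).
    """
    numerator = r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12
    return numerator / (1 - r_12 ** 2)

# Hypothetical correlations (illustration only).
naive_sum = 0.6 ** 2 + 0.5 ** 2                    # just adding the squares
with_overlap = multiple_r_squared(0.6, 0.5, 0.4)   # predictors correlated
no_overlap = multiple_r_squared(0.6, 0.5, 0.0)     # predictors uncorrelated

print(round(naive_sum, 4), round(with_overlap, 4), round(no_overlap, 4))
```

Only when the predictors are uncorrelated does R squared equal the simple sum of the squared correlations.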
Overview of 1 way ANOVA technique
-used for analyzing multi group experiments -F test: allows us to make one overall comparison telling us whether there is a significant diff. between means of groups -avoids the problem of increased type 1 errors when computing many t values -used in both independent groups and repeated measures designs
Pamela Wallin: white collar crime?
-used tax payer dollars for vacationing unrelated to senator work- she claims there was a change in expenses she could attribute to senate -she said law changed in 2012 and would no longer apply but it did back then -ignorance of the crime (classical) is no excuse -she has to pay the money back, no jail time--> special treatment
t test for correlated groups
-used with repeated measures design: 1) same subjects used for both conditions, before and after or control and experimental (same idea) 2) pairs of subjects matched on one or more characteristics serve in both conditions -diff. score (D) calculated for each subject or pair and then analyzed -just like the sign test, but now we use both sign (direction) and size of diff. scores -analysis treats difference scores like raw scores
The normal deviate (z) test
-uses the mean of the sample (Xobt) as a basic statistic -used when: we know the null hypothesis population parameters (mean and s.d.) -sampling distribution of the mean is normally distributed
terrorism: terrorist ideologies, ISIS
-violation of law and morality, crime originates as lack of discipline, offender is evil EX) non muslim countries, response= coercion, slavery, genocide--> propaganda
testing the significance of r
-we can also use the t test to test the significance of Pearson r -does the correlation we obtained in our random sample exist in the pop.? same procedure as before! rho = pop. correlation coefficient
why is white collar crime the least understood? but most consequential type of crime? Sutherland
-we can't agree on a definition (should it be based on the offence/ offender?) -Sutherland's argument- offender with power, committed in their occupation - we dont really know the causes: small number of criminals are caught, we have not identified childhood predictors, existing theories not a good fit--> not good predictors of who engages in white collar crime -Hard to measure- cant rely on victim survey, nearly impossible to acquire funds needed to study
calculate t from original scores
-we could calc. t directly from raw scores w/o first calc. s. -will give same answer calc. using s (see notes for example of Q that will be on final- CH.13)
What if H1 was directional? "Marijuana slows RT"
-we must evaluate only the tail in the direction specified by H1 -we obtained 9 +'s so we must include outcomes as extreme as or more extreme than 9 +'s -if marijuana slows reaction time then we predict mostly +'s, so we evaluate only the tail with the higher number of +'s: 9 +'s, 10 +'s. Why? If we are willing to reject H0 with 9 +'s then we must be willing to reject H0 with 10 +'s (10 +'s is more favourable to H1 than 9 +'s) -to get the obtained prob. use table B with N = 10, P = 0.50, # of P events = 9, 10: p(9 +'s or 10 +'s) = 0.0098 + 0.0010 = 0.0108 -0.0108 is the prob. we compare to alpha when we decide whether to reject/ retain H0 -one-tailed prob. bc the outcomes we evaluate are in one tail of the distribution -so if H1 is directional: we evaluate it with 1 tailed prob. values, alpha is one tailed
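A sketch of the same one-tailed probability computed directly from the binomial formula instead of table B. The helper name `binom_prob` is made up for this example. Table B rounds each entry before adding (0.0098 + 0.0010 = 0.0108); the unrounded sum is a little smaller, about 0.0107.

```python
from math import comb

def binom_prob(k, n, p):
    """Probability of exactly k P events in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

N, P = 10, 0.50

# Obtained outcome (9 +'s) plus the more extreme outcome in the same tail (10 +'s).
p_obtained = sum(binom_prob(k, N, P) for k in (9, 10))

alpha = 0.05  # one tailed
print(round(p_obtained, 4))
print("reject H0" if p_obtained <= alpha else "retain H0")
```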
Hypothesis testing and inferential statistics
-we will introduce the topic of hypothesis testing using an inference test called the sign test -first need to discuss a probability distribution called the binomial distribution
Implications for design and analysis of experiments
-when we design experiments, we dont know whether null is really true or false -so to maximize our chances of reaching correct decision, we must prepare for either possibility by setting alpha to a stringent level, by keeping beta low (=power high)
effect size for correlated groups t test: cohens d
-when we use correlated groups t test, we estimate pop. s.d. with sample s.d. (see text for equation)
Suppose you changed your mind after you see the results, in order to use non-directional alternative hypothesis
-you inflate the prob. of making a type one error, prob is actually 0.075 with 0.05 under originally predicted tail, 0.025 on the other tail
how much can power vary?
0-1
What do the results of using confidence intervals to eval. the effect of the indep. variable (IV) in indep. groups design tell us?
1) Can we reject H0? 95% confidence interval corresponds to alpha of 0.05 2 tailed, 0.025 under each tail -nondirectional H0 predicts pop. mean 1 - pop. mean 2 = 0 -so if our interval does NOT include 0, we can reject H0 -conclude IV had a real effect 2) What about size of effect? When we reject H0 we can also conclude that we are 95% confident that the interval contains the real effect of the IV; if it does, then the interval provides an estimate of the actual size of the real effect
Three factors that affect power
1) N 2) Size of real effect, power varies directly with effect size 3) Alpha- power varies directly with alpha; as alpha gets lower, it kills the power of the experiment. Lowering alpha makes it harder to reject the null hypothesis when it's false
Three factors that affect power:
1) N (sample size) Power varies directly with N
ANOVA partitions total variability of the data (SS total) into two sources:
1) Variability that exists within each group- within groups sum of squares (SS within) 2) Variability that exists between groups- between groups sum of squares (SS between) -Each SS is used to make an indep. estimate of H0 pop. variance: estimate based on SS within- called MS within. -estimate based on SS between is called MS between. -An F stat. is calculated, Fobt. = MS between/ MS within
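A minimal sketch of the partitioning described above for an independent groups design. The three groups of scores are hypothetical; df between = k - 1 and df within = N - k are the usual degrees of freedom.

```python
import statistics

# Hypothetical scores for three independent groups (illustration only).
groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

k = len(groups)                         # number of groups
n_total = sum(len(g) for g in groups)   # total number of scores
grand_mean = statistics.fmean([x for g in groups for x in g])

# SS between: variability of the group means around the grand mean.
ss_between = sum(len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups)

# SS within: variability of the scores around their own group mean.
ss_within = sum((x - statistics.fmean(g)) ** 2 for g in groups for x in g)

# Each SS divided by its degrees of freedom gives a variance estimate (MS).
ms_between = ss_between / (k - 1)
ms_within = ss_within / (n_total - k)

f_obt = ms_between / ms_within
print(ss_between, ss_within, f_obt)
```

SS between plus SS within adds up to SS total, which is the partitioning the card describes.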
for sample sizes of any size of N, sampling dist. of mean is:
1) a dist. of scores that are sample means; it has its own mean and standard deviation 2) it has a mean equal to the mean of the raw score population 3) has a s.d. called the standard error of the mean, equal to the s.d. of the raw score pop. divided by the square root of N -as a result there is a diff. sampling dist. of the mean for each diff. sample size N 4) is often normally shaped, depending on: shape of the population of raw scores, sample size (N)
how do you derive a sampling distribution?
1) a priori approach: use basic probability considerations 2) empirical approach: a) begin with the actual or theoretical set of population scores that exists if the IV has no effect (if the null hypothesis is true) ==> called the null hypothesis population b) then determine all possible different samples of size N that can be formed from the null population c) calculate the statistic for each sample and calculate the probability of getting each value of the statistic if chance alone is responsible
sampling distribution of the mean gives:
1) all values the sample mean (X bar) can take 2) prob. of getting each value of the sample mean if sampling is random from the H0 population -can be derived a priori or empirically
t test for two conditions
1) correlated groups (repeated measures) 2) independent groups (not same individuals in 2 conditions)
When the five requirements of binomial distribution are met, the binomial distribution gives:
1) each possible outcome of the N trials 2) probability of getting each of these outcomes
to generate all possible outcomes of N trials and the associated probabilities we can:
1) expand the expression for any particular value of N 2) evaluate each term in the expansion
Three conditions for the multiplication rule:
1) Mutually exclusive events: p(A and B) = 0 2) Independent events: two events are independent if the occurrence of one has no effect on the probability of occurrence of the other --> sampling with replacement 3) Dependent events: two events are dependent if the probability of occurrence of one (B) is affected by the occurrence of the other (A)
Assumptions underlying t test for independent groups
1) sampling distribution of X bar one minus X bar two is normal (pop from which samples were taken should be normal) 2) Homogeneity of variance (variance of two populations are equal)
Evaluating marijuana expt. using binomial distribution- the data fit the requirements of binomial distribution
1) series of trials, N = 10 2) two possible outcomes per trial (+/-); if a tie occurs (a diff. score of 0) it must be discarded and N must be reduced 3) mutually exclusive outcomes 4) independent trials 5) prob. of getting +/- constant across trials --> so we use binomial distribution ** let P = prob. of getting a + in the experiment (assuming chance alone is operating, not marijuana, b/c we see P through the null hypothesis), so P = 0.50 if chance alone is operating (if H0 is true) -number of P events = 9 (how many people got +'s) -enter table B, p(9 +'s) = 0.0098 -we set alpha to 0.05 and p < 0.05 so we reject H0 -we accept H1; we conclude marijuana affects RT -we sampled randomly from university undergrads, assume conclusion applies to undergrads at university
Overview of 1 way ANOVA technique for independent groups
1) subjects are randomly sampled from pop. and randomly assigned to conditions, ideally with equal n's -as many independent groups as conditions. H1 (nondirectional): one or more of the conditions have different effects from at least one of the others on the DV. H0: the diff. conditions are equally effective; under H0, scores in each group are random samples from populations with the same mean
notes about binomial distribution with P = 0.50
1) symmetrical 2) Has two tails 3) Involves a discrete variable (cant have 2.5 heads) 4) starts to get closer to shape of normal curve as N increases
Properties of z scores
1) z scores have the same shape as the set of raw scores from which they were transformed 2) mean of z scores always equals 0 3) standard deviation of z scores always equals 1
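A quick check of properties 2 and 3 on a hypothetical set of raw scores. This sketch assumes the z scores are formed with the s.d. that uses N in the denominator (`statistics.pstdev`), so the s.d. of the z scores comes out at exactly 1.

```python
import statistics

# Hypothetical raw scores (illustration only).
raw = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.fmean(raw)
sd = statistics.pstdev(raw)   # s.d. with N in the denominator

# z = (raw score - mean) / s.d.
z = [(x - mean) / sd for x in raw]

print(z)
print(statistics.fmean(z))    # mean of the z scores
print(statistics.pstdev(z))   # s.d. of the z scores
```

The order and spacing of the scores is unchanged, so the shape of the distribution stays the same as well.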
Using linear regression for prediction: the criteria (4)
1- Relationship between X and Y must be linear 2- Use regression line to predict only scores of individuals not in the group used for calculating the line 3- Regression line must be calculated from data that well represent the group we want to make predictions about --> random sampling 4- Prediction must be limited to the range of the base data --> don't predict outside base data (e.g. don't predict the GPA of someone with an IQ of 140 from a sample of people with IQs of 90-100)
correlation coefficient:degree
1 = perfect, between 0 and 1 = imperfect, 0 = nonexistent
Chapter one: statistics and the scientific method
:)
MIDTERM 2 STARTS HERE
:)
Post MIDTERM 2
:)
SEE NOTES FOR EXAMPLE
:)
See lecture notes for discussion of power of t test (section C) and comparison btwn. correlated and indep. groups t test (section D)
:)
Symmetrical frequency curve
A curve is symmetrical if, when folded in half, the two sides coincide. If a curve is not symmetrical, it is skewed
REQUIREMENTS (5) for a binomial distribution:
A probability distribution that results with: a series of "N" trials (could be a participant, flip of a penny etc), only two possible outcomes on each trial, mutually exclusive outcomes on each trial, independence between outcomes on each trial, probability of each possible outcome on any trial staying constant from trial to trial
terrorism: prevention and patriot act
American preventative response to terrorist threat, under Bush administration
Four methods to acquire knowledge:
Authority: Something is true because of tradition/ some person of distinction says it's true. Rationalism: Reasoning to arrive at knowledge --> the use of logic. Intuition: a sudden insight, the clarifying idea that springs into consciousness all at once as a whole. Scientific Method: Uses reasoning and intuition but relies on objective assessment. Design an experiment to objectively test a hypothesis.
How can you find the probability of each of the possible outcomes in the binomial expansion?
By evaluating their terms using the numeric values of P and Q. -EX) coins are unbiased, so P = Q = 0.50, so we evaluate the terms: p(2H) = P^2 = (0.50)^2 = 0.2500, p(1H) = 2PQ = 2(0.50)(0.50) = 0.5000, p(0H) = Q^2 = (0.50)^2 = 0.2500 -same results as with enumeration
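The term-by-term evaluation can be sketched for any N. Each term of (P + Q)^N is coefficient x P^k x Q^(N-k), where the coefficient counts the ways of getting k P events. The function name `binomial_distribution` is made up for this example.

```python
from math import comb

def binomial_distribution(n, p):
    """Evaluate every term of (P + Q)^N, keyed by the number of P events."""
    q = 1 - p
    return {k: comb(n, k) * p ** k * q ** (n - k) for k in range(n, -1, -1)}

# Two unbiased coins: P = Q = 0.50, N = 2.
dist = binomial_distribution(2, 0.50)
print(dist)   # {2: 0.25, 1: 0.5, 0: 0.25}
```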
Example of experiment with power being affected by N (N=10)
Conduct same experiment with N = 10. -N = 10, P real = 0.90, alpha = 0.05, 2 tailed --> calculate the power 1) Assume null hypothesis is true. Determine which sample outcomes would allow us to reject the null if P null = 0.50. -best results for rejecting null: 0 +'s or 10 +'s, p(0 or 10 +'s) = 0.0020 (from table B), p < 0.05, we would reject null -next best results for rejecting null: 9 +'s or 1 +, p(0 or 1 or 9 or 10 +'s) = 0.0216 (from table B), p < 0.05, so reject null -next best results for rejecting null: 8 +'s and 2 +'s, p(0 or 1 or 2 or 8 or 9 or 10 +'s) = 0.1094 (from table B), p > 0.05, so we could not reject the null. Stop here. -so if the null is true, the sample outcomes that would allow us to reject the null are 0, 1, 9, 10 +'s 2) Assume alternative hypothesis is true. Determine prob. of getting any of the sample outcomes from step one if P real = 0.90. -power = 0.7361 (from table B, N = 10, P = 0.90 [Q = 0.10]) -we have a 73.61% chance of rejecting null if alternative hypothesis is true
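The two steps above can be sketched as a small function. It assumes a two-tailed sign test with P null = 0.50 and works inward from both tails at once, as in the example; the names `binom_prob` and `sign_test_power` are made up for this sketch.

```python
from math import comb

def binom_prob(k, n, p):
    """Probability of exactly k P events in n trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def sign_test_power(n, p_real, alpha=0.05):
    """Return (power, rejection outcomes) for a 2 tailed sign test, P null = 0.50."""
    # Step 1: assume H0 is true; add outcome pairs from the tails inward
    # until the 2 tailed probability would exceed alpha.
    reject, two_tailed = [], 0.0
    for low in range(n // 2 + 1):
        high = n - low
        pair = {low, high}   # a set, so the middle outcome is not counted twice
        step = sum(binom_prob(k, n, 0.50) for k in pair)
        if two_tailed + step > alpha:
            break
        two_tailed += step
        reject.extend(pair)
    # Step 2: assume H1 is true; power is the probability of landing on
    # any of those outcomes when P = p_real.
    power = sum(binom_prob(k, n, p_real) for k in reject)
    return power, sorted(reject)

power, outcomes = sign_test_power(n=10, p_real=0.90)
print(outcomes, round(power, 4))   # [0, 1, 9, 10] 0.7361
```

Calling the same function with n=5 returns an empty rejection region and power of 0, which is the effect of a small N on power.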
Interval scales
EX) IQ and Celsius. Ranks individuals, with equal intervals between adjacent units on the scale
Conditions for multiplication rule: independent events, also applies with more than two events
EX) Sample four people from pop. of 40 men and 80 women, one at a time, with replacement. What is the probability of picking two men and two women in that order? p(M 1st) p(M 2nd) p(W 3rd) p(W 4th) = 40/120 x 40/120 x 80/120 x 80/120 = 0.0494
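A sketch of the same calculation with exact fractions. Because sampling is with replacement, the four draws are independent and the probabilities simply multiply.

```python
from fractions import Fraction

# Population from the example: 40 men and 80 women.
p_man = Fraction(40, 120)
p_woman = Fraction(80, 120)

# Multiplication rule for independent events: man, man, woman, woman in that order.
p_sequence = p_man * p_man * p_woman * p_woman

print(p_sequence)                    # 4/81
print(round(float(p_sequence), 4))   # 0.0494
```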
Sampling with replacement:
Each member of the population selected for the sample is returned to the population before the next member is selected
ANOVA F test
F test appropriate when scores in experiment can be used to make 2 independent estimates of variance -common where we analyze experiments with more than 2 conditions/ groups-- often need more than one control group to accompany experimental group, often want to explore how DV changes with several levels of IV
Solution to t test problem
H1 (nondirectional): The new movie is diff. from an average movie; sample with X bar obt. = 7.70 is a random sample from a pop. where mean does not equal 6.00. H0: The new movie is not diff. from an average movie; sample with X bar obt. = 7.70 is a random sample from a pop. where mean = 6.00. Step 1) Calc. appropriate statistic: t obt. (see equation) = 4.30 2) Evaluate stat. (t obt.) using sampling dist. of t and the decision rule. We reject H0 if t obt. is greater than or equal to t crit when H1 is directional and positive (alpha is one tailed) OR t obt. is less than or equal to t crit when H1 is directional and neg. (alpha is one tailed) OR the absolute value of t obt. is greater than or equal to the absolute value of t crit when H1 is nondirectional (two tailed). Enter table D! -t crit = +/- 2.262; since t obt. is greater than t crit, we reject H0! We conclude: the new movie is diff. from av. movie -sample with X bar obt. = 7.70 is a random sample from a pop. where mean does not equal 6.00
Solution z test
H1: directional. This years class is better than previous years classes. -OTHER WORDS, STATISTICALLY SPEAKING: sample with X bar obt. = 63 is a random sample from a population where mean > 60. Ho: This years class is not better than previous years classes -sample with X bar obt. = 63 is a random sample from a population where mean is less than or equal to 60
hypothesis and specific stat. lang for ex of t test correlated groups
H1: endorsing inc. social spending affects popularity; sample diff. scores with D bar obt. = 2.75 are a random sample from a pop. of diff. scores with pop. mean not equal to 0 -H0: endorsing inc. social spending does not affect popularity; sample diff. scores with D bar obt. = 2.75 are a random sample from a pop. of diff. scores with pop. mean equal to 0 -test H0 using alpha = 0.05, 2 tailed
hypotheses for example of t test for independent groups
H1: temp. affects amount of frog croaking -the sample mean diff. of 1.60 is due to random sampling from pops. where pop. mean 1 - pop. mean 2 does not equal 0 -H0: temp. does not affect amount of frog croaking -the sample mean diff. of 1.60 is due to random sampling from pops. where pop. mean 1 - pop. mean 2 does equal 0
Ratio scales
Height, weight, reaction time. Rank order, equal intervals btwn. adjacent points, absolute zero point. Can make ratio statements. Absolute absence of what you're measuring = absolute zero, e.g. 0 on the Kelvin scale- absolute absence of temperature. With an absolute zero point it is possible to use ratios EX) someone who is 10 cm tall is twice as tall as someone who is 5 cm tall
IF you change your mind AFTER you see your results, what happens to p value?
If you switch to the neg. directional hypothesis with an alpha level of 0.05 (1 tailed) in the neg. direction (predicts -'s), you have raised the probability of rejecting a true H0 to 0.10 (0.05 in the high tail plus 0.05 in the low tail). This INFLATES the probability of making a type one error: in the long run the probability is 0.10, not 0.05
XL
Lower real limit of interval containing percentile point
sampling without replacement:
Members of a sample don't get returned to the population prior to selecting subsequent members
Example of experiment with power being affected by N (N=5)
N = 5. Suppose P real = 0.90 and alpha = 0.05, 2 tailed; calculate the power. Step one: assume the null hypothesis is true (P null = 0.50) and determine which sample outcomes would allow us to reject the null. HINT: always start at the tails and work your way inwards until you reach the first outcomes for which the obtained 2 tailed prob. exceeds alpha -best results for rejecting null: 0 +'s and 5 +'s. p(0 or 5 +'s) = 0.0312 + 0.0312 = 0.0624 (from Table B, N = 5, P = 0.50) -p > 0.05, so we retain the null hypothesis. If the null is true, there are no sample outcomes that would allow us to reject the null! -Step two: assume the alternative hypothesis is true and determine the probability of getting any of the sample outcomes from step one. If P real = 0.90 there are NO sample outcomes from step one, so this prob. = 0. Power = 0.0000 -we have a zero percent chance of rejecting the null if the alternative hypothesis is true
is white collar crime victimless?
NO! ex) draining pension funds, hazardous waste dumping
Richter magnitude scale
Ordinal scale 0...1...2...3...4...5, each step x10. There is no absolute zero point (you can have negative magnitudes). Not an equal amount between units (each unit is ten times larger than the last, so there is a lack of equivalence between units)
Who introduced bill making gang activity illegal?
Parm Gill (Bill C-394)
EXAMPLE of Z test
Prof teaches calculus classes for many years: final grades form a normal distribution with mean of 60 and s.d. of 10 -this year's class, N = 40, gets a mean final grade of X bar obt. = 63. Is this year's class better than previous years' classes?
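The statistic for this card can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and alpha = 0.05 one tailed (z crit = 1.645) is an assumption, since the card itself does not state alpha.

```python
import math

def z_obtained(sample_mean, pop_mean, pop_sd, n):
    """Single-sample z: how many standard errors X bar obt. lies from the H0 mean."""
    standard_error = pop_sd / math.sqrt(n)
    return (sample_mean - pop_mean) / standard_error

# Values from the calculus-class card: mean 60, s.d. 10, N = 40, X bar obt. = 63.
z_obt = z_obtained(sample_mean=63, pop_mean=60, pop_sd=10, n=40)

# Assumed decision rule: alpha = 0.05, one tailed in the positive direction.
Z_CRIT = 1.645
reject_h0 = z_obt >= Z_CRIT

print(round(z_obt, 2), reject_h0)
```

Under that assumed alpha, z obt. is larger than z crit, so H0 would be rejected.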
White collar crime: wolf of wallstreet
Pump and dump- uses false and misleading positive statements to artificially inflate the price of a stock -people cheated out of their money -takes skilled people to follow the paper trail
usually when there are two mutually exclusive events, we denote the probability of occurrence of one as P and the other as...
Q ex) in flipping a coin, P= probability of getting head 1/2 and Q= probability of getting a tail 1/2 P + Q = 1.0000 when two events are exhaustive and mutually exclusive.
Ordinal scale
Ranking individuals- have property of magnitude/ size. Different intervals between units
true or false: addition rule can be used in situation involving more than 2 events?
TRUE! assuming the events are mutually exclusive p(A or B or C or.... or Z)= p (A)+ p(B) + p(C) +.....+ p(Z) where A,b,c...z are all the events
Measurement scales
The type of scale used to collect data can influence type of statistical inference test used to analyze data
what is meant by the statement: it is impossible to ever prove that the null hypothesis is true because the power to detect very small but real effects of the IV is always low?
This means that if we did an experiment and in reality the IV has a small effect, like P real = 0.51, the power of the experiment is very low. Power of an experiment varies directly with the size of the real effect. -If you can't reject the null even with very high power, then you can be pretty confident that if there is any real effect of the IV it is not very big.
why do we want to minimize E(Y-Yprime)^2?
To avoid negative values- some errors will be positive but some will be negative -they will cancel each other out if summed! -so we square the errors before we sum them -for any linear relationship there is only one least squares regression line
Inferential statistics
Using sample scores to make statements about a characteristic of the population
cumulative percentage curve
Vertical axis plotted in cumulative percentage units, horizontal axis points are plotted at the upper real limit of the interval.
Variables
X or Y stands for variables and subscripts on X stand for specific observations/ people. EX) X1= subject one
In the Monty Hall problem is it in your advantage to switch?
Yes!
how do we achieve random sampling?
You can use tables of random numbers (See the back of textbook)
Why is there a problem with changing your mind about the direction of the hypothesis after you have seen the results?
You give yourself a greater chance to reject the null (extra outcomes to reject the null) before you make your decision. Increases probability of a type one error
Example of t test for independent groups
You hypothesize temp affects number of croaks a frog makes. Data: number of croaks (per five min). Measured in condition one (30 degrees C): EX1 = 40, n1 = 5 (little n = sample size in one of your conditions), X bar 1 = 8.00. Measured in condition two (22 degrees C): EX2 = 32, n2 = 5, X bar 2 = 6.40. X bar 1 - X bar 2 = 1.60
standard error of estimate defined:
a measure of average deviation of prediction errors around the regression line
1st interpretation of pearson r
a measure of the extent to which paired scores occupy the same or opposite positions within their own distributions -same positions- positive correlation, negative correlation- opposite positions
second interpretation of pearson r:
a measure of the variability of Y accounted for by X (r squared gives the proportion)
parameter
a number calculated on population data that quantifies a characteristic of the population
statistic
a number calculated on sample data that quantifies a characteristic of the sample
sample
a subset of the population
Real limits of a continuous variable
accuracy of measurement. -Values that are above or below the recorded value by 1/2 of the smallest measuring unit on the scale
perfect relationships:
all points fall on a straight line
sampling dist. of mean gives:
all values a sample mean can take, prob. of getting each mean value if sampling is random from Ho pop.
partial correlation
allows us to measure the relationship between two variables after controlling for (partialling out) another variable or variables
power decreases with more stringent...
alpha levels
effect of extreme scores on correlation
an extreme score can dramatically change the size of the correlation coefficient
Continuous variables
an infinite number of values between adjacent units on the scale (Theoretically)
descriptive statistics
analysis to describe or characterize obtained data
inferential statistics
analysis to draw inferences about population using data from sample
variable
any property or characteristic of some event, object, person, that may have different values at different times depending on the conditions
Area contained under normal curve:
area under curve represents the percentage of scores contained within the area.
central tendency
average of scores
4 type of initiation:
beat down, blood in (kill to get in) blood out (killed if try to get out), jacked in (stealing), sexed in (sexual favors, usually females)
IF you set your alpha level to be one tailed why should you retain H0 if the results are extreme in the opposite direction?
because if you don't, you inflate the probability of making a type 1 error ex) suppose you have a positive directional alternative hypothesis and set alpha to 0.05 in the positive direction (predict +'s). Your probability of a type one error = 0.05. Suppose you observe an extreme result in the opposite direction (mostly -'s) and as a result you decide to reject the null. You change your mind AFTER you see the results, in order to use a negative directional hypothesis with alpha = 0.05 in the neg. direction (predict -'s) or a nondirectional alternative hypothesis with alpha = 0.05 (two tailed, predict either +'s or -'s)---> there is a problem!
why do we rarely make alpha level 0.000001?
because it increases the likelihood of making a type two error
why would you not do a bunch of t tests instead of ANOVA?
because it increases the probability of a type 1 error. The more tests you do, the higher probability of a type 1 error
NOTE about pearson r
because pearson r is based on z scores- direction and degree of correlation are independent of any differences in units and scaling between the two variables
regression line
best fitting line used for prediction
the higher power is the lower....
beta is, and the lower the probability of making a type two error is
Human trafficking: Domotor case
biggest trafficking case in Canada -Hungarian immigrants, forced unpaid labor -sociological positivism viewpoint: members of the Domotor family were Roma, put into a transition zone in Canada, used the innovation type of deviant behaviour
purpose of gang initiation:
bond over similar experience, test loyalty, traps people legally
preferred solution of sampling dist.
calc. critical region for rejection of H0 -area under curve containing all values of stat. that let you reject H0 -determined by alpha -to find this region, calc. critical value of the statistic (z crit); this is the value of the stat. that bounds the critical region -found using Table A in reverse manner, by searching for the z that corresponds to the area we want -if alpha = 0.05 1 tailed in neg. direction then z crit = -1.645 -in pos. direction then z crit = 1.645
test H0 by assuming the sample set of X and Y scores with correlation of r obt. is a random sample from a pop. where rho = 0
can evaluate significance of r using t test (see notes for equation)
White collar crime...has many definitions:
committed by a person of respectability and high social status in the course of his occupation--> criticized for being too narrow -OR crime committed for financial benefit of an indiv. in an occupation -OR occupational def.: crime committed through their legitimate job
we can construct confidence intervals about which there are specified degrees of confidence
commonly 95% or 99% confidence intervals -these are intervals such that the probability is 0.95 or 0.99 that the interval contains the pop. value
size of effect: eta squared
commonly used but biased estimate of effect of IV in the pop. (usually an overestimate)
point estimate:
compute mean of sample: X bar obt. = 185. How close is it to the pop. mean? We don't know!
a measure of the variability of Y accounted for by X: an example (scenario two) suppose there is a relationship between spelling score and writing score- what is the best guess?
compute the regression line that uses this relationship and base our prediction on it
Sign test
considers directions of diff. scores (their signs are + or -). If we obtained 9/10 pluses in an experiment, can we conclude that marijuana affects RT? Not necessarily! Could have happened by chance or luck alone--> we need a rule to tell us when the obtained prob. is low enough for us to reject chance as an explanation. Must adopt a critical prob. level against which to compare results.
Independent variable
controlled by the researcher- it is manipulated
Must convert question if P> .50
convert into: 1) Into Q terms- P + Q = 1, so Q = 1-P 2) into number of Q events, number of P events plus number of Q events = total number of events, so # of Q events= total number of events - number of P events -now you can use table B to solve
2 forms of white collar crime: occupational
violation of legal codes during the course of an individual's occupation
two major probability rules: Addition rule
deals with probability of occurrence of any one of several possible events p (A or B)= p(A) + p(B) - p(A and B) -the probability of occurrence of A or B equals: probability of occurrence of A plus the probability of occurrence of B minus the probability of occurrence of A and B
What is alpha usually?
decided at beginning of experiment: 0.05 or 0.01
terrorism:HAMAS
democratically elected social movement yet labelled a terrorist group
F distribution varies with...
df. 2 values for df- df for numerator= n1-1, df for denominator = n2-1
constant
does not have different values at different times
Human trafficking: types
domestic service, sexual and labor exploitation, begging and petty crime -90% of Canada's trafficking victims come from within Canada
what should we do to prevent violent crime?
empowerment of women and minorities- look at the structure, early intervention
terrorism: Tamil tigers
ethnic and power relations in Sri Lanka- the root of terrorism-> labelling causes stigma
if H1 is nondirectional then we:
evaluate both tails (both directions)
Marijuana experiment: Our H1 was nondirectional (marijuana affects RT) so we must:
evaluate both tails. We obtained 9 +'s, so we must include outcomes as extreme as or more extreme than 9 +'s in both directions--> 9 +'s, 10 +'s, 1 +, 0 +'s. Why? If we are willing to reject H0 with 9 +'s then we must also be willing to reject H0 with 10 +'s, 0 +'s, 1 +, because they are at least as favourable to H1 as 9 +'s in both directions -to get the obtained prob. use Table B with N = 10, P = 0.50, number of P events = 0, 1, 9, 10 -p(0 or 1 or 9 or 10) = 0.0010 + 0.0098 + 0.0098 + 0.0010 = 0.0216 -0.0216 is the prob. we compare to alpha when we decide whether to reject/retain the null -b/c the outcomes we evaluate are in both tails of the distribution -so if H1 is nondirectional: we evaluate it with 2 tailed prob. values, alpha level is two tailed
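The two tailed probability on this card can be checked without Table B. The sketch below is illustrative only (the helper name is an assumption, not from the course); it computes exact binomial probabilities, so the sum comes out as 0.0215 rather than 0.0216, because Table B rounds each entry to four decimals before adding.

```python
from math import comb

def binomial_prob(k, n, p):
    """Probability of exactly k P events in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p_null = 10, 0.50
# Outcomes at least as extreme as 9 pluses, in both tails.
extreme_outcomes = [0, 1, 9, 10]

p_two_tailed = sum(binomial_prob(k, n, p_null) for k in extreme_outcomes)
print(round(p_two_tailed, 4))

# Compare with a two tailed alpha to decide about H0.
alpha = 0.05
reject_h0 = p_two_tailed <= alpha
```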
define mutually exclusive:
event that both cannot occur together ex) picking a two and a 9 in one draw from a regular deck: they are mutually exclusive, they cannot happen together!
the addition rule is commonly used when:
events are mutually exclusive
how can we establish causation?
experiment
Conditions for using z test, when is single sample z test appropriate?
experiment has one sample, pop. mean and s.d. are specified, X bar obt. is the basic statistic, sampling dist. of the mean is normal (N is greater than or equal to 30, or the pop. of raw scores is normal)
differential association theory: gangs
exposure- more likely to join bc of the people around them, frequent interaction with criminals
recurring theme in other violent crime:
feminist perspective: committed against women (hate crime, kidnapping, sexual assault)
Variability and three commonly used measures: Standard deviation for population
first consider the deviation score (tells how far away a raw score is from the mean of the distribution): X - X bar (sample) and X - u (population). Consider the population data 1, 2, 3, 6, 8: the mean is 4, then calculate the deviation scores (they tell you how spread out the scores are around the mean). Average the deviations? They sum to zero, so square the deviations, average the squares, and then take the square root
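A minimal sketch of those steps for the card's population data (1, 2, 3, 6, 8). Variable names are illustrative; the course's preferred computational formula is in the notes and is not reproduced here.

```python
import math

scores = [1, 2, 3, 6, 8]           # population data from the card
n = len(scores)
mu = sum(scores) / n                # population mean

deviations = [x - mu for x in scores]          # X - mu for each score
sum_of_squares = sum(d**2 for d in deviations)  # squaring removes the signs
sigma = math.sqrt(sum_of_squares / n)           # population standard deviation

print(mu, deviations, sum_of_squares, round(sigma, 2))
```

The raw deviations add up to zero, which is why they are squared before averaging.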
white collar crime: classical
focus on crime itself -free will, looking out for own interest, cost benefit analysis , deterrence based on punishment
Ms13 gang
focus on drug sales and territory -levels in the hierarchy, multinational -deportation of MS-13 members: many here are illegal or immigrated -instead offer opportunities/activities to youth
white collar crime: Marxism
focuses on power inequalities -divided into powerful crimes ( like dumping hazardous waste) and unpowerful (like tax fraud) -only way to fix it is overthrow capitalism
NOTICE the effect of N on power example
for alpha = 0.05, 2 tailed, p real = 0.90, if N=5 then power = 0.0000, but if N=10, power = 0.7361
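Both power values on this card can be reproduced with a short sketch of the two-step procedure (find the rejection outcomes under the null, then find their probability under P real). Function names are illustrative, and the tail-walking shortcut assumes P null = 0.50 so that the two tails are symmetric.

```python
from math import comb

def binomial_prob(k, n, p):
    """Probability of exactly k pluses in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def rejection_outcomes(n, alpha, p_null=0.50):
    """Step one: start at both tails and work inwards while the
    two tailed probability under the null stays at or below alpha."""
    outcomes = []
    for k in range(n // 2 + 1):
        candidate = sorted(set(outcomes + [k, n - k]))
        prob = sum(binomial_prob(j, n, p_null) for j in candidate)
        if prob > alpha:
            break
        outcomes = candidate
    return outcomes

def power(n, p_real, alpha=0.05):
    """Step two: probability of landing in the rejection outcomes if P real is true."""
    return sum(binomial_prob(k, n, p_real) for k in rejection_outcomes(n, alpha))

print(rejection_outcomes(5, 0.05), round(power(5, 0.90), 4))
print(rejection_outcomes(10, 0.05), round(power(10, 0.90), 4))
```

With N = 5 no outcome is extreme enough to reject, so power is zero; with N = 10 the outcomes 0, 1, 9 and 10 qualify.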
Cumulative frequency distribution
for each interval -cum f= frequency of that interval + frequency of all classes below it
Cumulative percentage distribution
for each interval cum %= cum f/ N x 100
relative frequency distribution
for each interval: rel f= f/ N (frequency at a certain interval/ total number)
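The three formulas (rel f, cum f, cum %) can be sketched on a small table. The intervals and frequencies below are hypothetical, made up only to show the arithmetic.

```python
# Hypothetical grouped data, listed from the lowest interval to the highest.
intervals = ["10-19", "20-29", "30-39", "40-49"]
freqs = [2, 5, 8, 5]
n = sum(freqs)

# rel f = f / N
rel_f = [f / n for f in freqs]

# cum f = frequency of the interval plus all intervals below it
cum_f = []
running_total = 0
for f in freqs:
    running_total += f
    cum_f.append(running_total)

# cum % = cum f / N x 100
cum_pct = [c / n * 100 for c in cum_f]

for row in zip(intervals, freqs, rel_f, cum_f, cum_pct):
    print(row)
```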
homoscedasticity
for standard error of estimate to be meaningful we assume the variability of Y remains constant as we go from one X score to the next -if we divided the x scores into columns the variability of Y would not change from column to column
F stands for
frequency
probability values: range
from 0.0000 to 1.0000 (round to four decimal places with probability) p(A)= 0.0000 means event certain not to occur p(A)= 1.0000 means event A is certain to occur
sig figs
general practice in psych: report the value to two decimal places -carry all intermediate calculation to 4 decimal places -there will be exceptions later in the course
Binomial distribution example:
generate binomial distribution for flipping unbiased coins (using enumeration). Three coins, each flipped once (3 flips = 3 trials). Possible outcomes: HHH, HHT, HTH, THH, TTH, THT, HTT, TTT. p(3H) = p(HHH) = 1/8 = 0.1250, p(2H) = 3/8 = 0.3750, p(1H) = 3/8 = 0.3750, p(0H) = 1/8 = 0.1250. Why is this a binomial distribution? Because it satisfies the five requirements
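The enumeration on this card can be sketched as follows; the code simply lists all eight equally likely outcomes and counts heads. Names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of three flips of an unbiased coin.
outcomes = list(product("HT", repeat=3))

# Count how many outcomes contain 0, 1, 2 or 3 heads.
head_counts = Counter(outcome.count("H") for outcome in outcomes)

# a priori probability = outcomes classifiable as k heads / total outcomes
distribution = {k: Fraction(c, len(outcomes)) for k, c in head_counts.items()}

for k in sorted(distribution, reverse=True):
    print(k, "heads:", distribution[k], "=", float(distribution[k]))
```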
gangs: sociological positivism
generated by structure of society
size of effect: omega squared
gives unbiased estimate of proportion of total variance of DV in the pop. accounted for by IV
scatterplot
graph of paired X and Y values
Gangs : defined
group of recurrently associating individuals with identifiable leadership and internal organization, claiming control over territory in the community and engaging in illegal activities
random sampling is important because:
helps achieve a sample that is representative of population - allows the laws of probability to apply
Variability and three commonly used measures: The range
highest score in the distribution minus the lowest score EX) 2, 3, 7, 18, 6 = 18-2 =the range is 16
if pop. raw score is:
i) normal, then so too is the sampling dist. of the mean regardless of N ii) not normal, then the sampling dist. of the mean approaches normal as N increases -if N is greater than or equal to 30 you can assume the sampling dist. will look normal
cohens d continued
if H1 is directional, then X bar obt. should be in direction predicted by H1 in order to calc. d -if X bar obt. is in opp. direction, we would conclude by retaining H0 and it would not make sense to inquire about size of effect
correlation does not imply causation
if X and Y are correlated the correlation may be spurious, X may cause Y, Y may cause X, a third variable may be the cause
what does standard error of estimate tell us? generally
if it is larger there is less confidence in predicting Y given X; if it is smaller there is more confidence in predicting Y given X
when might we set alpha level to be one tailed?
if it makes no practical difference if the result is in the opposite direction, e.g. comparing a new treatment to an old treatment (which one is better?), OR if there is a theoretical reason to predict the direction of the result (e.g. established in 1000s of papers)
standard error of estimate
if the linear relationship between X and Y is imperfect, most actual Y values will not fall on the regression line for predicting Y given X -these are prediction errors -it's useful to know how big these errors are (how far off the regression line the actual Y scores are) b/c they tell you how strongly we can rely on the prediction
what does standard error of estimate tell us? specifically
if we assume points are normally distributed around the regression line, then we construct two lines parallel to the regression line at distances of +/- 1 standard error of estimate: about 68% of scores fall between the lines; +/- 2: about 95% of scores fall between the lines; +/- 3: about 99% of scores fall between the lines
NOTE on a priori and a posteriori
if you roll a die many times, then the a priori and a posteriori probabilities will probably approach each other
2 forms of white collar crime: corporate crime
illegal activity done by employees to benefit company
the normal curve
important in behavioural sciences 1) many variables are normally distributed (height, weight, intelligence, achievement) 2) Many statistical tests have sampling distributions that become normally distributed as N increases (sign test) 3) Many statistical tests require sampling distributions that are normally distributed (z test, t test, F test)
size of effect for t test: cohens d
in addition to determining whether there is a real effect, we also may want to determine the size of the effect (Cohen's d) -gives a measure of the size of the real effect (ignoring direction) using the absolute value of the mean diff. -standardizes the measure by dividing the absolute value of the mean diff. by the pop. s.d.
example of exhaustive events
in rolling one die, the set of events of getting 1,2,3,4,5,6 is exhaustive (nothing else that could happen)
Ch 14. Students t test for correlated and independent groups
in z and t tests for single sample experiments, we had to specify at least one parameter (pop. mean) -but usually we don't know the pop. mean -if we think we know the pop. mean, we may not know if it is accurate -two condition experiments help us deal with this issue -we've already seen 1 test for a two condition expt. (sign test)
MS between
increases with size of IV's effect
positive relationship: correlation
indicates that there is a direct relationship between the variables
Negative relationship:
indicates that there is an inverse relationship between X and Y
cumulative frequency distribution defined
indicates the number of scores that fall below the upper real limit of each interval
cumulative percentage distribution
indicates the percentage of scores that fall below the upper real limit of each interval
Relative frequency distribution definition
indicates the proportion of the total number of scores that occurs in each interval
the histogram
interval or ratio data. A bar is drawn for each class interval; each bar begins and terminates at the real limits of the interval. If an interval spans an odd number of score values, plot the middle value to represent that bar
frequency polygon
interval or ratio data. Point is used to represent data (at midpoint of each interval). Points joined with straight lines
correlational research
investigator focuses attention on two or more variables to determine whether they are related
beta
is the probability of making a type two error
MS within
is unaffected by size of IV's effect
terrorism: ISIS
Islamic extremist group controlling territories in Iraq and Syria, also operating in Lebanon, Libya, Egypt -formerly a branch of Al-Qaeda
Terrorism group definition
it is ambiguous. -political, religious, ideological purpose, objective or cause
terrorism: limitations of labeling theory
it proposes to stop labeling and decriminalize -we can't decriminalize terrorism
why do people join gangs?
lack of jobs, poverty, social isolation, domestic violence, neg. peer networks, lack of parental supervision, lack of school attachment, and early academic failure, attractive lifestyles
6 levels of gangs
leaders, hard core, associates, clique, fringe, wannabes
Properties of the median
less sensitive to extreme scores than the mean, usually more subject to sampling variability than the mean but less subject to sampling variability than the mode
alpha
limits the probability of making a type one error
negatively skewed distribution mean vs mode
mean dragged in direction of skew (toward the low tail): mean < median < mode
sampling dist. of mean is made up of:
mean scores
observational: correlational research
measure individuals on multiple variables and explore their relationship
what is the most informative measure of central tendency? skewed distribution
median
what is the most informative measure of central tendency? bimodal or multimodal distribution
mode
negatively skewed distribution
most of the scores occur at the higher values of the horizontal axis and the curve tails off toward the lower end
positively skewed distribution
most of the scores occur at the lower values of the horizontal axis and the curve tails off towards the higher end
R squared in words
multiple coefficient of determination
observational: 3 types
naturalistic, parameter estimation, correlational
Standard deviation can ____ be negative
never
Sum of sqaures can ____ be negative
never
Using the t test example
new movie shown to test audience (random) who rate it from 1 (bad) to 10 (great) -over the years, test audience ratings have a mean of 6.00 but s.d. unknown -new movie rating by 10 members of test audience: X bar obt. = 7.70, s = 1.25 -is the new movie different from an average movie? Use alpha = 0.05, 2 tailed
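The statistic for the movie example can be sketched as below. The function name is illustrative, and the critical value is the Table D entry for df = 9 with alpha = 0.05, 2 tailed (+/- 2.262), typed in by hand rather than computed.

```python
import math

def t_obtained(sample_mean, pop_mean, s, n):
    """Single-sample t: like z, but the sample s.d. (s) estimates the pop. s.d."""
    estimated_standard_error = s / math.sqrt(n)
    return (sample_mean - pop_mean) / estimated_standard_error

# Values from the movie card: pop. mean 6.00, N = 10, X bar obt. = 7.70, s = 1.25.
n = 10
t_obt = t_obtained(sample_mean=7.70, pop_mean=6.00, s=1.25, n=n)
df = n - 1

T_CRIT = 2.262  # Table D, df = 9, alpha = 0.05, 2 tailed
reject_h0 = abs(t_obt) >= T_CRIT

print(round(t_obt, 2), df, reject_h0)
```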
discrete variable
no possible values between adjacent units on the scale
The bar graph
nominal or ordinal data. Bar drawn for each category, where the height of each bar represents the frequency or number of members in that category. The bars do not touch each other; arrange in any order, usually in descending frequency
Degrees of freedom for a stat (df):
number of scores that are free to vary in calc. the stat. (free to take on any value you want to assign to them) -for X bar, df = N. Why? All scores are free to vary -if N = 3 and we know only two scores, the third score is free to take on any value -thus N degrees of freedom -for s, df = N-1. Why? It is computed from deviation scores and E(X - X bar) = 0, so only N-1 of the deviation scores are free to take on any value, thus N-1 degrees of freedom -when we calc. t for single samples, we must first calc. s, so there are only N-1 df associated with t
two categories of scientific research:
observational and experiments
observational: parameter estimation
observe certain individuals on variables- use observations to estimate what is going on in the population
Conditions for multiplication rule: dependent events
occurrence of B is affected by occurrence of A. Sampling without replacement. p(A and B) = p(A) x p(B|A) ex) draw 2 cards randomly from a regular deck one at a time, without replacement. What is the probability that both cards are spades? 13/52 x 12/51 = 0.0588
unimodal distributions
one mode
How do we test our diff scores with t test correlated groups
our stat. is the mean diff. score from the sample (D bar obt.). To evaluate it we can treat it like any other sample mean (X bar obt.), so we can compute a t test! Solution: 1) calc. appropriate stat. (see equations). It is identical to the single sample t test, but uses diff. scores (D) instead of raw scores (X) 2) Eval. stat.: if |t obt.| is greater than or equal to |t crit| then reject H0 (for nondirectional H1) -df = number of diff. scores - 1 -enter Table D! -we concluded endorsing increases in social spending affects popularity; sample diff. scores with D bar obt. = 2.75 are a random sample from a pop. of diff. scores with pop. mean not equal to zero
Conditions for multiplication rule: dependent events, also applies to situations with more than two events
p(A and B and C... and Z) = p(A) x p(B|A) x p(C|AB)... where p(A) = prob. of A, p(B|A) = probability of B given A has already occurred, and so on. ex) sample four people from a pop. of 40 men and 80 women one at a time without replacement. What is the probability of picking 2 men and 2 women in that order? 40/120 x 39/119 x 80/118 x 79/117 = 0.0500
Conditions for multiplication rule: independent events
p(A and B) = p(A) x p(B|A) = p(A) x p(B) Ex) Draw two cards randomly from a regular deck, one at a time, with replacement. What is the probability that they will both be spades? This is sampling with replacement, so the events are independent: 13/52 x 13/52 = 169/2704 = 0.0625
Multiplication Rule (the and rule)
p(A and B) = p(A) x p(B|A), which means the probability of occurrence of both A and B is equal to the probability of occurrence of A times the probability of occurrence of B, given A has occurred
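The card-drawing and sampling examples for the multiplication rule can be checked with exact fractions. This is just the arithmetic from the cards written out; variable names are illustrative.

```python
from fractions import Fraction

# Dependent events (without replacement): p(A and B) = p(A) x p(B|A)
both_spades_without = Fraction(13, 52) * Fraction(12, 51)

# Independent events (with replacement): p(B|A) = p(B)
both_spades_with = Fraction(13, 52) * Fraction(13, 52)

# More than two dependent events: 2 men then 2 women from 40 men and 80 women.
two_men_two_women = (
    Fraction(40, 120) * Fraction(39, 119) * Fraction(80, 118) * Fraction(79, 117)
)

for p in (both_spades_without, both_spades_with, two_men_two_women):
    print(p, "=", f"{float(p):.4f}")
```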
if a variable is continuous and normally distributed
p(A) = area under curve corresponding to A / total area under curve ex) suppose IQs of all students at UBC (population) are normally distributed, mean = 125, s.d. = 15. If one student is randomly sampled from the population, what is the probability that the IQ would be equal to or less than 108? TWO STEPS: convert raw score to z score, then look up the required area in Table A 1) z = (108 - 125)/15 = -1.13 2) look in Table A for area C (beyond z) corresponding to z = 1.13: p(X less than or equal to 108) = 0.1292
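The Table A lookup can be approximated with the standard library. The sketch below is an illustration, not the course method: it uses the error function to get the area below z, and rounds z to two decimals first so that the result lines up with the table.

```python
import math

def normal_cdf(z):
    """Area under the standard normal curve at or below z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mean, sd, raw_score = 125, 15, 108

# Step 1: convert the raw score to a z score (rounded as in the table).
z = round((raw_score - mean) / sd, 2)

# Step 2: area in the lower tail, i.e. p(X <= 108).
p = normal_cdf(z)

print(z, round(p, 4))
```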
Why in the monty hall problem is it to your advantage to switch?
p(car @ door 1 or car @ door 2 or car @ door 3) = 1, so p(car @ door 1) + p(car @ door 2) + p(car @ door 3) = 1. We use the addition rule because this is a mutually exclusive, exhaustive set of events. Suppose you pick door 1: p(car @ door 1) = 1/3, so p(car @ door 2) + p(car @ door 3) = 1 - p(car @ door 1) = 1 - 1/3 = 2/3 (the probability that the car is behind door 2 or door 3). The host then opens door 2 or door 3, whichever has no car behind it. If the host opens door 2, that means p(car @ door 2) = 0, so p(car @ door 2) + p(car @ door 3) = 2/3 becomes 0 + p(car @ door 3) = 2/3, i.e. p(car @ door 3) = 2/3. If you stay with your original door (door 1), the probability of winning the car is 1/3 or 33.33%; if you switch to the other door (door 3), the probability of winning the car is 2/3 or 66.67%. We used the addition rule along with the properties of exhaustive sets of mutually exclusive events (a priori approach)
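The same result can be confirmed by enumeration rather than algebra. The sketch below fixes the first pick at door 1 and walks through the three equally likely car positions; when the car is behind door 1 the host has two doors to choose from, and the code arbitrarily takes the first, which does not change the counts.

```python
from fractions import Fraction

doors = [1, 2, 3]
pick = 1            # contestant's original choice
stay_wins = 0
switch_wins = 0

for car in doors:   # each car position has probability 1/3
    # Host opens a door that is neither the pick nor the car.
    opened = next(d for d in doors if d != pick and d != car)
    # Switching means taking the one remaining closed door.
    switched_to = next(d for d in doors if d != pick and d != opened)
    stay_wins += (pick == car)
    switch_wins += (switched_to == car)

p_stay = Fraction(stay_wins, len(doors))
p_switch = Fraction(switch_wins, len(doors))
print(p_stay, p_switch)
```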
Repeated measures design
paired scores in conditions, same subjects in both conditions, analyze the diffs. between paired scores
terrorism:Irish Republican Army (IRA)
paramilitary group, fights for secession of Northern Ireland from the U.K. -forced British to withdraw
percentile rank
percentage of scores with values lower than the score in question
example of addition rule using mutually exclusive events:
pick one card from a regular deck, what is the probability of picking a queen or jack? --> mutually exclusive! p(queen) + p(jack)= 4/52 + 4/52 = 8/52 or 0.1538
Example of addition rule:
pick one card from regular deck, what is the probability of picking a diamond or a ten? p(diamond)= 13/52 p(ten)=4/52 p(diamond and ten)=1/52 SO p(diamond or ten) = 13/52 + 4/52 - 1/52= 16/52 or 0.3077
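Both card examples of the addition rule can be written as one small function. The function name is illustrative; for mutually exclusive events the overlap term is simply zero.

```python
from fractions import Fraction

def p_a_or_b(p_a, p_b, p_a_and_b=Fraction(0)):
    """Addition rule: p(A or B) = p(A) + p(B) - p(A and B)."""
    return p_a + p_b - p_a_and_b

# Mutually exclusive: a single card cannot be both a queen and a jack.
queen_or_jack = p_a_or_b(Fraction(4, 52), Fraction(4, 52))

# Not mutually exclusive: the ten of diamonds is both a diamond and a ten.
diamond_or_ten = p_a_or_b(Fraction(13, 52), Fraction(4, 52), Fraction(1, 52))

print(f"{float(queen_or_jack):.4f}", f"{float(diamond_or_ten):.4f}")
```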
if alpha is 2 tailed = 0.05, combined area under two tails of curve must equal 0.05, so area under each must equal 0.025 so z crit=
plus or minus 1.96 -decision rule for z test: we reject H0 if z obt. is less than or equal to z crit when alpha is one tailed in the neg. direction, or if z obt. is greater than or equal to z crit when alpha is one tailed in the pos. direction, or if |z obt.| is greater than or equal to |z crit| when alpha is two tailed
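As a rough sketch, the critical values and the decision rule can be reproduced with Python's `statistics.NormalDist` in place of reading Table A in reverse. The function name and the `tail` labels are illustrative choices, not course notation.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, s.d. 1

def reject_h0(z_obt, alpha=0.05, tail="two"):
    """Apply the z test decision rule for a one or two tailed alpha."""
    if tail == "two":
        # Split alpha across both tails: area alpha/2 beyond each z crit.
        z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
        return abs(z_obt) >= z_crit
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha)
    if tail == "positive":
        return z_obt >= z_crit
    if tail == "negative":
        return z_obt <= -z_crit
    raise ValueError("tail must be 'two', 'positive' or 'negative'")

print(round(STANDARD_NORMAL.inv_cdf(0.975), 2))  # two tailed, alpha = 0.05
print(round(STANDARD_NORMAL.inv_cdf(0.95), 3))   # one tailed, alpha = 0.05
```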
imperfect relationships
points tend to fall near a straight line but not all points fall on it
power mathematical expression
power= 1- beta, smaller the beta the higher the power, we want power to be as high as possible
Y prime
predicted Y value for any X value
least squares regression line:
prediction line that minimizes total errors of prediction minimizes E(Y-Yprime) ^2
Frequency distribution
presents score values and their frequency of occurrence -table or graph -when presented in a table, score values are listed in order -lowest score at bottom of table
P real
probability of getting a + with any subject in the sample if the IV has a real effect (if the alternative hypothesis is true) -P real never equals 0.50. Ex) marijuana slows reaction time (+'s): if P real = 1.00, very large real effect, prob. of getting a + is 1.00; if P real = 0.70, moderate real effect, prob. of getting a + is 0.70; if P real = 0.00, prob. of getting a + is 0.00, also a very large real effect -size of effect is measured by how far from 0.50 the number is
beta limits:
probability of type two error
Power
probability that results of an experiment will allow rejection of the null hypothesis if the independent variable (IV) has a real effect ex) if marijuana affects RT
Two approaches to probability: empirical or a posteriori
problems solved after some data have been collected p(A)= number of times A has occurred/ total number of occurrences. ex) roll a die many times and count the total number of occurrences of three-> number of times three occurred/ total number of occurrences.
Two approaches to probability: Classical or a priori
problems solved by reason alone p(A) = number of events classifiable as A/ total number of possible events ex) roll the die once, what is the probability it will be a three? -> number of events classifiable as 3/ total number of events possible. = 1/6.
recurring theme in violent crime:
psychological positivism (serial killers and infanticide)
terrorism: conservatism in guantanamo bay
punitive response to terrorist threat, meant to detain dangerous persons
confidence interval:
range of values that probably contains the pop. value; the wider the range, the greater the confidence that it contains the pop. value. Confidence limits: values that state the boundaries
type two error
retain H0 when H0 is false
Real limits example
recorded measurement on scale is 70 kg. smallest measuring unit of the scale is .50 kg. What are the real limits of the measurement? 69.75 and 70.25
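The real limits are the recorded value plus and minus half of the smallest measuring unit. A minimal sketch of that arithmetic (the function name `real_limits` is made up for this example):

```python
def real_limits(measurement, unit):
    """Return (lower, upper) real limits: measurement -/+ half the smallest unit."""
    half_unit = unit / 2
    return measurement - half_unit, measurement + half_unit

print(real_limits(70, 0.50))  # (69.75, 70.25)
```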
observational: naturalistic research
recording what you see in an environment, without intervening
type one error
reject H0 when H0 is true
meaning of the signs in the R squared equation
ryx1 = the correlation between Y and X1; ryx2 = the correlation between Y and X2; rx1x2 = the correlation between X1 and X2
sampling dist. of mean varies with size of:
sample
median
scale value below which 50% of the scores fall (P50) -for raw scores: Mdn is the centre most score if the number of scores is odd; if you have an even number of scores, it is the av. of the two centre most scores
Standard deviation computational equation**
see notes- this is the formula hall would like us to use
Random sampling:
selected from population by a process that ensures each possible sample of a given size has an equal chance of being selected, all members of the population have an equal chance of being selected into the sample
exhaustive defined:
set includes all the possible events
Σ (summation sign) is called
sigma (add up everything that comes after sigma)
-how big should size of effect be for sm. medium or lrg. effect?
sm: 0-0.20, medium: 0.21-0.79, large: greater than or equal to 0.80
gangs: strain
society values wealth, not everyone can achieve wealth -deviant subcultures engage in crim. activities to achieve culturally defined goals -gangs are innovators - trying to achieve prosperity by unconventional means, vs. conformists
the variance
square of standard deviation
s is estimate of:
standard deviation
how to know if prediction of Y increased by adding another predictor variable
standard error of estimate; proportion of variability of Y accounted for by X --> (the higher R squared is, the better you can explain)
importance of size of effect
statistically significant does not mean important -statistically significant = results probably not due to chance -important effect (practically or theoretically) usually related to size of effect
How to calculate power for the sign test:
step one: assume null hypothesis is true, use Pnull = 0.50 to determine possible sample outcomes that allow null to be rejected step two: assume alternative hypothesis is true, for given size of P real, determine probability of getting any sample outcomes from step one- this is power
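The two steps can be sketched in code. This is a rough illustration, not the course's own procedure: the function names are invented, and N = 10, alpha = 0.05 two-tailed and P real = 0.90 are assumed values chosen for the example. The one-tailed branch assumes H1 predicts pluses.

```python
from math import comb

def binom_pmf(k, n, p):
    """Probability of exactly k pluses in n difference scores when P(+) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def rejection_outcomes(n, alpha=0.05, two_tailed=True):
    """Step one: with P null = 0.50, list the numbers of pluses that allow rejecting H0."""
    tail_alpha = alpha / 2 if two_tailed else alpha
    upper, cumulative = [], 0.0
    for k in range(n, -1, -1):            # work inward from the most extreme outcome
        prob = binom_pmf(k, n, 0.5)
        if cumulative + prob > tail_alpha:
            break
        cumulative += prob
        upper.append(k)
    # The binomial with p = 0.50 is symmetric, so the lower tail mirrors the upper tail.
    lower = [n - k for k in upper] if two_tailed else []
    return sorted(set(lower + upper))

def sign_test_power(n, p_real, alpha=0.05, two_tailed=True):
    """Step two: probability of getting any step-one outcome when P(+) = p_real."""
    return sum(binom_pmf(k, n, p_real) for k in rejection_outcomes(n, alpha, two_tailed))

print(rejection_outcomes(10))                 # [0, 1, 9, 10]
print(round(sign_test_power(10, 0.90), 4))    # 0.7361
```

Calling `sign_test_power(10, 0.70)` instead shows how power drops as P real moves closer to 0.50.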
linear relationships can be described by a :
straight line
4 types of gangs
street, motorcycle, mafia, organized crime
ΣX is called
sum X
the arithmetic mean
sum of all the scores divided by the number of scores
stem and leaf diagram
summarizing data with less than 100 scores Stem is placed left of the vertical line, leaf to the right
what if assumptions of t test for independent groups are violated?
t test is very robust- insensitive to violations of underlying assumptions esp. if n1 = n2, and n1, n2 greater than or equal to 30 -effect size for independent groups t test: Cohen's d general equation = mean diff/ pop s.d., see notes for conceptual equation
negative skewed graph
tail on the left
positive skewed graph
tail on the right
How to empirically evaluate stat for independent t test groups
take all possible samples of size n1 from pop. with a given mean and standard deviation -take all possible samples of size n2 from pop. with a given mean and s.d. -calculate sample mean 1 and 2 for each sample -calculate sample mean 1 minus sample mean 2 for all possible pairings of samples of size n1 and n2 -calc. prob. of getting each value of sample mean 1 minus sample mean 2, assuming random sampling
Deviation score
tells how far away the raw score is from the mean of its distribution
frequency distributions typically between ___ and ___ intervals
ten and twenty
Population
the complete set of individuals, objects, or scores that you are interested in studying
what is sampling dist. of t?
the dist. giving: 1) all possible values of t for samples of size N 2) prob. of getting each value if sampling is random from H0 pop. -can be derived theoretically/ empirically -empirically: take all possible diff. samples of size N from pop. of raw scores, calc. t for each sample, calc. prob of getting each t value if randomly sampling from pop. -if H0 pop is normal or N is greater than or equal to 30 then t dist. looks like z dist, BUT there is a family of t curves that vary with N (like the sampling dist of mean) -there is only one z curve -slight complication: t dist. varies with N but varies uniquely with degrees of freedom rather than N
cumfp is...
the frequency of scores below the percentile point
Xi stands for
the ith score, where i can range from 1 to N (max number)
Positively skewed distribution mean vs mode
the mean is pulled in the direction of the skew. mean > median
a measure of the variability of Y accounted for by X: an example (scenario one) suppose there is no relationship between spelling score and writing score- what is the best guess?
the mean is the best predictor if there is no relationship between X and Y because the sum of the squared deviations from the mean is a minimum (not possible to make a better guess)
data
the measurements made on your research subjects
asymptotic
the normal curve gets closer and closer to the horizontal axis but never quite touches it. It is said to be asymptotic to the horizontal axis
in the sign test: our statistic we calculated was...
the number of +'s in sample of N difference scores -statistic was evaluated using binomial distribution with p = 0.50, this is sampling distribution of statistic used in sign test.--> different for each sample size (N)
N stands for
the number of subjects/ scores in data sets EX) N=5
dependent variable
the outcome
P null
the probability of getting a + with any subject in sample if the IV has NO effect (if null is true) -P null is always 0.50 (marijuana has no effect on RT). The probability of anyone in the sample getting a + is 0.50
alpha limits:
the probability of type one error
Human trafficking
the recruitment, transportation, transfer or receipt of people in order to exploit them
linear relationship
the relationship can be most accurately represented by a straight line
if a set of events is mutually exclusive and exhaustive:
the sum of the individual probabilities of each event in the set must be equal to one ex) rolling a die once (p of 1 or 2 or 3 or 4 or 5 or 6). This set is mutually exclusive and exhaustive. = 1/6 + 1/6 + 1/6 + 1/6 + 1/6 + 1/6 = 1.0000
in a unimodal symmetrical distribution what can be said about the mode median and mean?
they all equal each other
Details of the two variance estimates: Within group variance
this estimate of variance is based on variability within groups (see notes for equation) -numerator is called "within groups sum of squares" (SS within) -denominator equals df for within groups variance, MS within = SS within/ df within NOTE: this estimate of variance is unaffected by effect of IV ... why? because it is based on variability within each group and the IV is manipulated between groups
regression
topic that considers using the relationship between two or more variables for prediction
relative frequency
proportion of the total number of scores that occur in each interval
Nominal scale
units on the scale are names/ categories, purely qualitative or categorical
So as F obt. increases, H0 becomes more...
unreasonable -we evaluate F obt. by comparing it with F crit. Using sampling dist. of F, decision rule: if F obt. > or equal to F crit we reject the null!
overall mean:
used when calculating the mean of multiple means, you must take into account the size (N) of each group
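Taking group size into account means weighting each group mean by its N. A short sketch with made-up numbers (two hypothetical groups with means 70 and 80 and sizes 10 and 30; the function name is also an assumption):

```python
def overall_mean(group_means, group_sizes):
    """Weight each group mean by its N, then divide by the total N."""
    weighted_sum = sum(m * n for m, n in zip(group_means, group_sizes))
    return weighted_sum / sum(group_sizes)

print(overall_mean([70, 80], [10, 30]))  # 77.5
```

Simply averaging the two means would give 75, which ignores that the second group is three times larger.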
recall z test
used when mean and s.d. of pop. are known
Students t test for single samples
used when mean of H0 pop is known but s.d. is not known
t test for independent groups
used when: each subject tested only once, no basis for pairing scores, subjects are randomly sampled from pop. and assigned to one of 2 conditions -both sample means (X1 and X2) are calculated. Diff btwn. sample means (X1-X2) is then calculated and analyzed.
what is the most informative measure of central tendency? Unimodal, symmetrical distribution
usually the mean. (but all measures would be similar)
One and two tailed probability evaluations- when we set alpha level for an experiment, should prob. evaluation be one tailed or two tailed?
usually two tailed! -unless we are willing to retain H0 if the results are extreme in the direction opposite to the one predicted
correlation coefficient: direction
varies from -1 (neg) to +1 (pos)
Percentile point
value on measurement scale below which a specified percentage of the scores in the distribution fall
binomial expansion is general and applies to:
values of P other than .50 -what if question asks for prob. of some number of P events, but P > .50? Can't use table B!
like the t test, ANOVA assumes that only the means of the population scores are affected by the IV not the...
variances
how to construct frequency distributions?
view pg. 51-54
Human trafficking NR libertarianism
violates natural rights of victim -Hungarian victims brought to Canada
Human trafficking : Kayiley Oliver Machada
was one of three girls running prostitution ring out of Ottawa. -used SNS to lure girls to serve "johns" -psych positivism viewpoint: troubled upbringing, surrounded by drugs, mom was a sex worker -she was determined to be a sociopathic narcissist -restorative justice: sees crime as more than breaking the law, causes harm to community -3 propositions: crime is a violation of people and interpersonal relationships, violations create obligations and liabilities, heal and put right the wrong
decision rule
we always evaluate the results of an experiment by evaluating H0. Why? because we can calculate the prob. of chance events under H0 but not under H1. -we evaluate H0 by assuming it is true and testing how reasonable that assumption is -How? -By calc. the prob. of obtaining our results by chance or luck alone
if H1 is directional:
we evaluate only the tail in the direction specified by H1
if we obtained a probability greater than alpha
we fail to reject H0, retain H0 and do not accept H1. Results are not significant
To evaluate stat of independent t test groups
we need to know sampling dist. of the difference btwn sample means. -can be derived theoretically/ empirically.
Measurement scales in psychology
we often treat scales as interval w/o clearly establishing that they measure equal intervals btwn. adjacent points-- clinical, social, educational psychology
if the obtained prob, is less than or equal to the critical prob. called alpha
we reject H0 and accept H1 and we say results are significant (probably not due to chance)
Relationship between ANOVA and the t test
when a study involves just two independent groups, and we're testing the null that pop mean one = pop mean two (or pop mean one minus pop mean two = 0), we can use either the t test for independent groups or the F test. In such situations t squared = F
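The t squared = F relationship can be checked numerically. The sketch below uses the pooled-variance t for independent groups and a one-way ANOVA F written out by hand. The two small data sets and the function names are invented for illustration and do not come from the course notes.

```python
from statistics import mean

def t_independent(g1, g2):
    """t for independent groups, using the pooled (weighted) variance estimate."""
    n1, n2 = len(g1), len(g2)
    m1, m2 = mean(g1), mean(g2)
    ss1 = sum((x - m1) ** 2 for x in g1)
    ss2 = sum((x - m2) ** 2 for x in g2)
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    return (m1 - m2) / (pooled_var * (1 / n1 + 1 / n2)) ** 0.5

def f_oneway(groups):
    """F = MS between / MS within for any number of independent groups."""
    scores = [x for g in groups for x in g]
    grand_mean = mean(scores)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for two independent groups.
group_1 = [4, 6, 5, 7, 8]
group_2 = [2, 3, 5, 4, 1]

t_obt = t_independent(group_1, group_2)
f_obt = f_oneway([group_1, group_2])
print(t_obt ** 2, f_obt)  # the two values agree up to rounding error
```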
when do we care about power?
when designing/ analyzing experiments, when interpreting non-significant results
the "birthday problem"
when there are about 23 people in a room there is about a 50% chance that at least two of them have the same birthday... the chance increases with the number of people in the room. At 80 people there is almost a 100% chance.
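These figures can be reproduced by working out the probability that everyone has a different birthday and subtracting it from one. The sketch assumes 365 equally likely birthdays and ignores leap years; the function name is made up for the example.

```python
def p_shared_birthday(n_people, days=365):
    """P(at least two people share a birthday) = 1 - P(all birthdays differ)."""
    p_all_different = 1.0
    for i in range(n_people):
        # Person i must avoid the i birthdays already taken.
        p_all_different *= (days - i) / days
    return 1 - p_all_different

print(round(p_shared_birthday(23), 4))  # 0.5073
print(round(p_shared_birthday(80), 4))  # 0.9999
```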
ΣX squared
x1 squared + x2 squared + x3 squared + x4 squared... square each individual number and add them together
do we ever need to use both addition and multiplication rules in a single equation? Give example
yes ex) Roll two dice at once. What is the probability that the sum of the numbers showing on the dice is three? -two possible outcomes yield a three: die 1 = 1 and die 2 = 2, or die 1 = 2 and die 2 = 1. -Dice are independent, so use the multiplication rule for independent events to find the prob. of each outcome. p(A) = p(die 1 = 1 and die 2 = 2) = 1/6 x 1/6 = 1/36, p(B) = p(die 1 = 2 and die 2 = 1) = 1/6 x 1/6 = 1/36 -A and B are mutually exclusive, so add the probabilities. p(sum of 3) = p(A or B) = p(A) + p(B) = 1/36 + 1/36 = 2/36 or 0.0556
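The same answer can be checked two ways: by applying the multiplication and addition rules with exact fractions, and by listing all 36 equally likely outcomes for two fair dice. This is only a sketch and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

p_face = Fraction(1, 6)          # probability of any one face on a fair die

# Multiplication rule (independent events) for each ordered outcome.
p_a = p_face * p_face            # die 1 = 1 and die 2 = 2
p_b = p_face * p_face            # die 1 = 2 and die 2 = 1

# Addition rule (mutually exclusive events).
p_sum_3 = p_a + p_b

# Cross-check by enumerating every (die 1, die 2) pair.
outcomes = list(product(range(1, 7), repeat=2))
favourable = [pair for pair in outcomes if sum(pair) == 3]
p_enumerated = Fraction(len(favourable), len(outcomes))

print(p_sum_3, round(float(p_sum_3), 4))  # 1/18 0.0556
```

`Fraction` reduces 2/36 to 1/18 automatically, which is the same value.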
are violent crimes viewed as common?
yes bc they are sensationalized; people think they are frequent because they are vivid