Psyc 301 Exam 1
Data
information collected from the sample on the variable we are interested in
Nominal scales
measures that split people into categories - categories must be mutually exclusive - Nominal = Nameable
Validity
measuring exactly what you intend to measure
inter-rater reliability
A measure of how similarly two different test scorers would score a test.
What are correlations?
A measure of the relationship between two variables
Why is variability important?
Two data sets can have the same sample size and the same mean but have a different amount of variability
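A minimal Python sketch of this point. The two data sets are made up for illustration (they are not from the course), and the standard deviation used is the sample version from Python's `statistics` module:

```python
import statistics

# Two invented data sets: same sample size (5) and same mean (50).
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

for name, scores in [("tight", tight), ("spread", spread)]:
    n = len(scores)
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # sample SD (divides by n - 1)
    print(name, n, mean, round(sd, 2))
# tight 5 50 1.58
# spread 5 50 31.62
```

The mean alone cannot tell these two sets apart; the standard deviation can.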
Bar graphs vs histograms
- Bar graphs: compare the frequency of categorical responses
- Histograms: compare the distribution of continuous variables
Misleading graphs
Beware the scale of the axes
Goals of science and how statistics help achieve those
- Description: How do people behave?
- Prediction: identifying the factors that influence behavior
- Explanation: identifying the underlying causes of a behavior
- Research methods and statistics are the ways we learn and discover
Correlation vs. Causation
Correlation does not equal causation. Other explanations for a correlation between X and Y:
1. Reverse causation: causation may flow from Y to X
2. Reciprocal causation: the two variables cause each other
3. Common causal variable: a third variable influences both of the other two variables
Variability
The extent to which the scores in a data set tend to vary from each other and from the mean - Range - Standard Deviation - Variance
Reporting correlations
There was a strong positive correlation between age and happiness (r=.61), suggesting that as age increases, so does happiness.
When to use which measure of central tendency
- Use MODE when the data are categorical
- Use MEAN when the data are continuous and you don't have any extreme scores
- Use MEDIAN when the data are continuous and you think the mean is misleading
- When in doubt, report both the mean and the median
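A small sketch of why the median is preferred when there are extreme scores. The numbers and category labels are hypothetical, chosen only to illustrate the rule:

```python
import statistics

# Hypothetical continuous scores; not data from the course.
scores = [30, 32, 35, 38, 40]
with_extreme = scores + [500]  # add one extreme score

print(statistics.mean(scores), statistics.median(scores))              # 35 35
print(statistics.mean(with_extreme), statistics.median(with_extreme))  # 112.5 36.5

# Categorical data: only the mode applies.
majors = ["psych", "bio", "psych", "math", "psych"]
print(statistics.mode(majors))  # psych
```

One extreme score pulls the mean from 35 to 112.5, while the median barely moves.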
Continuous Data
data measured on a continuum, all numbers between two end points are possible scores - Ex: height, weight, age
Categorical Data
data that sorts people into categories, only so many options for the variable - Ex: gender, major, experimental condition
Bimodal
a data set with two modes
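As an illustration, Python's `statistics.multimode` (available in Python 3.8 and later) returns every value tied for most frequent, so a bimodal data set gives back two values. The scores are invented:

```python
import statistics

scores = [1, 2, 2, 3, 4, 4, 5]  # invented scores; 2 and 4 each occur twice

modes = statistics.multimode(scores)
print(modes)            # [2, 4]
print(len(modes) == 2)  # True, so this data set is bimodal
```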
Scatterplots
a graphed cluster of dots which represents the values of two variables
Statistics
a set of tools and techniques used for describing, organizing, and interpreting information or data
Central Tendency
a single number that represents a group of scores - Mean - Median - Mode
Identifying Extreme Values
any data point more than 2 SD away from the mean is a potential outlier
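The 2 SD rule can be sketched as a short function. `potential_outliers` is a hypothetical helper name, the scores are made up, and the sketch assumes the sample standard deviation (n - 1 in the denominator):

```python
import statistics

def potential_outliers(scores, cutoff=2.0):
    """Return scores lying more than `cutoff` sample SDs from the mean."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)  # assumption: sample SD (n - 1)
    return [x for x in scores if abs(x - mean) > cutoff * sd]

scores = [10, 12, 11, 13, 12, 11, 10, 12, 13, 40]  # invented data
print(potential_outliers(scores))  # [40]
```

A flagged score is only a potential outlier; whether to keep or drop it is a separate judgment call.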
Limitations of the correlation coefficient
can only identify LINEAR relationships
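A quick demonstration of this limitation, using invented data in which Y depends perfectly on X but not in a straight line (Y = X squared). The `pearson_r` function here is an illustrative sketch written in the deviation-score form:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation computed from deviations around each mean."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    ss_x = sum((x - x_bar) ** 2 for x in xs)
    ss_y = sum((y - y_bar) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)

xs = [-2, -1, 0, 1, 2]
ys = [x ** 2 for x in xs]  # perfect U-shaped (curvilinear) relationship

r = pearson_r(xs, ys)
print(r)  # 0.0
```

Here r is 0 even though Y is completely determined by X, which is why it helps to look at a scatterplot before trusting r.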
parallel forms reliability
consistency between alternate versions of the same test
reliability
consistency of a measure
error score
discrepancy between observed score and true score
Kurtosis
how flat or peaked a distribution is
- Platykurtic: low kurtosis, flat, more variability
- Leptokurtic: high kurtosis, peaked, less variability
coefficient of determination
the proportion of variance that two variables share - r^2
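A worked example, using the r = .61 value from the age and happiness example in these notes:

```python
r = 0.61  # correlation from the age/happiness reporting example

r_squared = r ** 2
print(round(r_squared, 4))      # 0.3721 -> about 37% of the variance is shared
print(round(1 - r_squared, 4))  # 0.6279 -> about 63% of the variance is not shared
```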
Measurement scales
nominal, ordinal, interval, ratio
Ordinal scales
numbers represent a ranking - unclear how much distance separates the ranks - Ordinal = ordering
Pearson correlation
only used for 2 continuous variables - most common type
Interval scales
ordered events with equal spacing - zero not necessarily meaningful - most common type
Direction of a Correlation
positive or negative
- positive relationship: direct relationship, the variables change in the same direction
- negative relationship: indirect (inverse) relationship, the variables change in opposite directions
Computing a correlation coefficient
r = (ΣXY - n * Xbar * Ybar) / [√(ΣX² - n * Xbar²) * √(ΣY² - n * Ybar²)]
- Do the numerator and denominator separately in order to prevent mistakes.
- For r, it doesn't matter which variable is X or Y because each variable goes through the exact same process in the formula.
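The formula can be sketched in Python as follows, computing the numerator and denominator separately as the notes suggest. The function name `pearson_r` and the data are illustrative assumptions:

```python
import math

def pearson_r(xs, ys):
    """Pearson r from the computational formula in the notes."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Numerator: sum of XY minus n * Xbar * Ybar
    numerator = sum(x * y for x, y in zip(xs, ys)) - n * x_bar * y_bar

    # Denominator: one square-root term per variable
    x_term = math.sqrt(sum(x * x for x in xs) - n * x_bar ** 2)
    y_term = math.sqrt(sum(y * y for y in ys) - n * y_bar ** 2)
    denominator = x_term * y_term

    return numerator / denominator

xs = [1, 2, 3, 4, 5]  # invented scores
ys = [2, 4, 5, 4, 5]

print(round(pearson_r(xs, ys), 4))  # 0.7746
print(round(pearson_r(ys, xs), 4))  # 0.7746, swapping X and Y changes nothing
```

For these numbers the numerator is 66 - 60 = 6 and the denominator is √10 * √6, which gives r of about .77.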
Skewness
refers to the lack of symmetry in a distribution - positive skewness: longer tail on the right (right foot) - negative skewness: longer tail on the left (left foot)
Percentile Points
refers to the percentage of cases equal to and below a certain point in a group of scores - The median is the 50th percentile
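A minimal sketch of this definition. `percentile_rank` is a hypothetical helper name and the scores are made up:

```python
def percentile_rank(scores, point):
    """Percentage of cases equal to or below `point`."""
    at_or_below = sum(1 for s in scores if s <= point)
    return 100 * at_or_below / len(scores)

scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]  # invented exam scores

print(percentile_rank(scores, 75))   # 50.0
print(percentile_rank(scores, 100))  # 100.0
```

Half of these ten scores are at or below 75, so 75 sits at the 50th percentile.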
Cronbach's alpha
reflects the degree of internal consistency
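The notes do not give a formula for alpha. The sketch below assumes the standard textbook form, alpha = [k / (k - 1)] * [1 - (sum of item variances) / (variance of total scores)], where k is the number of items. The function name and the response data are hypothetical:

```python
import statistics

def cronbach_alpha(rows):
    """rows: one list of item scores per respondent, all the same length."""
    k = len(rows[0])  # number of items
    item_vars = [statistics.variance([row[i] for row in rows]) for i in range(k)]
    total_var = statistics.variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Invented data: 4 respondents answering 3 items.
identical_items = [[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]]
similar_items = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 3, 4]]

print(round(cronbach_alpha(identical_items), 3))  # 1.0
print(round(cronbach_alpha(similar_items), 3))    # 0.875
```

When every item ranks respondents identically, alpha reaches 1; as items agree less, alpha drops.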
discriminant validity
scores on the measure are not related to other measures that are theoretically different
convergent validity
scores on the measure are related to other measures of the same construct
Ratio scales
similar to an interval scale, but 0 has a specific meaning - zerO = ratiO - uncommon
Correlation matrix
simple way to report multiple correlations
Variable
something that can change or have different values for different individuals
Variance
standard deviation squared -s^2
Measures
the act or process of assigning numbers to phenomena according to a rule - Behavioral measures - Self-report measures - Physiological measures
coefficient of alienation
the amount of unexplained variance
Standard Deviation
the average amount of variability in a set of scores
- most common measure of variability
- low SD: data points are close to the sample mean
- high SD: data points are far away from the sample mean
- s = √[Σ(x - xbar)^2 / (n - 1)]
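The formula can be checked step by step in Python. `sample_sd` is an illustrative name and the scores are invented; the last line compares the result with `statistics.stdev`, which uses the same n - 1 denominator:

```python
import math
import statistics

def sample_sd(scores):
    """s = sqrt( sum((x - xbar)^2) / (n - 1) )"""
    n = len(scores)
    x_bar = sum(scores) / n
    sum_sq_dev = sum((x - x_bar) ** 2 for x in scores)
    return math.sqrt(sum_sq_dev / (n - 1))

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # invented scores, mean = 5

s = sample_sd(scores)
print(round(s, 3))       # 2.138
print(round(s ** 2, 3))  # 4.571, the variance (s^2)
print(math.isclose(s, statistics.stdev(scores)))  # True
```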
Mean
the average value of a group of numbers - most common measure of central tendency - X-bar = (Σx)/n
Strength of a Correlation
the closer the absolute value of the correlation is to 1, the stronger the relationship
- Very strong: .8 to 1
- Strong: .6 to .8
- Moderate: .4 to .6
- Weak: .2 to .4
- None to weak: 0 to .2
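These bands can be sketched as a lookup function. The notes list shared endpoints (for example, .8 appears in two bands), so this sketch assumes a boundary value belongs to the stronger band; `describe_strength` is a hypothetical name:

```python
def describe_strength(r):
    """Label a correlation by the absolute-value bands in the notes."""
    size = abs(r)  # direction does not affect strength
    if size > 1:
        raise ValueError("a correlation must be between -1 and 1")
    if size >= 0.8:  # assumption: boundary values go to the stronger band
        return "very strong"
    if size >= 0.6:
        return "strong"
    if size >= 0.4:
        return "moderate"
    if size >= 0.2:
        return "weak"
    return "none to weak"

print(describe_strength(0.61))   # strong
print(describe_strength(-0.85))  # very strong
print(describe_strength(0.05))   # none to weak
```

A correlation of -.85 is stronger than one of +.61, because only the absolute value matters for strength.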
internal consistency (inter-item reliability)
the degree to which a test yields similar scores across its different items
Range
the difference between the highest and lowest scores in a distribution - most general measure of variability - ignores the middlemost values - emphasis on extreme scores - r = h - l (range = highest score - lowest score)
criterion validity
the extent to which a measure is related to an outcome
predictive validity
the extent to which a score on a scale predicts scores on a later criterion measure
content validity
the extent to which a test samples the behavior that is of interest
concurrent validity
the extent to which two measures of the same trait or ability agree
construct validity
the extent to which variables measure what they are supposed to measure or don't measure what they shouldn't
Sample
the group you actually collect data from
Population
the group you are actually interested in drawing some conclusions about
Histograms
the height of each bar is the number of times each value occurs in our data set -allows us to see the distribution of our data
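A rough text-only sketch of the idea, using `collections.Counter` to count how often each value occurs. The scores are invented, and each `#` stands for one occurrence:

```python
from collections import Counter

scores = [1, 2, 2, 3, 3, 3, 4, 4, 5]  # invented scores

counts = Counter(scores)  # value -> number of times it occurs
for value in sorted(counts):
    print(value, "#" * counts[value])
# 1 #
# 2 ##
# 3 ###
# 4 ##
# 5 #
```

The bar lengths show the shape of the distribution: here it is symmetric with a single peak at 3.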
Median
the midpoint in a set of scores - point at which half of the scores are bigger and half of the scores are smaller - to find it, list the values in order and find the middlemost score
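The procedure can be sketched as below. The notes only describe finding the middlemost score; averaging the two middle scores when the number of scores is even is the usual convention and is an assumption here:

```python
def median(scores):
    """List the values in order and find the middlemost score."""
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]  # odd n: a single middle score
    # even n (assumed convention): average the two middle scores
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 3, 9, 1, 5]))  # 5
print(median([7, 3, 9, 1]))     # 5.0
```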
observed scores
the score you actually got
Mode
the value that occurs most frequently in data - typically used with categorical data
true scores
true reflection of what you really know
Inferential Statistics
used to make inferences about a larger group from a smaller group - The next step after description
Descriptive Statistics
used to organize and describe data - Ex: counts, means, percentages
test-retest reliability
using the same test on two different occasions to measure consistency