AD RESEARCH FINAL: Descriptive Statistics
Regression Analysis
Most common method for determining correlation. Regression and correlation are used to determine whether the relationship between sets of variables is significant, that is, unlikely to occur by chance.
Tool that enables us to make inferences from samples to populations
Normal Curve
Basic Correlation Statistics
Numerical expressions of the degree to which two variables change in relation to one another (covary) are called measures of association or correlation. Typically, the variable presumed to be antecedent is designated "X" and the criterion "Y". Correlation is valuable because it is the simplest statistic that can be used to assess the relationship between two interval or ratio variables.
What kind of technique is a thematic apperception test?
Projective Technique
Central Limit Theorem
Provides assurance that any random sample statistic comes from a population of statistics that form a normal curve around the true population parameter.
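A minimal simulation sketch of the idea above, assuming a made-up skewed population (the exponential scores, sample size of 100, and 2,000 repetitions are all illustrative choices, not from the course material). It suggests how sample means cluster around the population mean even when the population itself is not normal:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: 10,000 exponential "scores" (not normal).
population = [random.expovariate(1 / 50) for _ in range(10_000)]
population_mean = statistics.fmean(population)

# Draw many random samples and record the mean of each one.
sample_means = [
    statistics.fmean(random.sample(population, 100))
    for _ in range(2_000)
]

# The sample means center on the population mean, and their spread
# is much narrower than the spread of the population itself.
mean_of_means = statistics.fmean(sample_means)
spread_of_means = statistics.stdev(sample_means)
print(round(population_mean, 2), round(mean_of_means, 2), round(spread_of_means, 2))
```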
What is the probability in the area below a normal curve called?
Rejection of the null hypothesis
What has to be at least at the ordinal level for the median to be a meaningful measure of central tendency?
A distribution of numbers
Rule for Percentaging Calculations
Always calculate percentages in the direction of the causal factor (antecedent/x/independent) and across the effect (criterion/y/dependent) factor.
Where does the independent variable go in a contingency table?
At the top of a cross tabulation or contingency table
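One way to picture the percentaging rule and the table layout is the sketch below. The survey data and the variable names (saw_ad, purchased) are hypothetical. It assumes the independent variable runs across the top, so percentages are computed within each column and then compared across a row:

```python
from collections import Counter

# Hypothetical survey rows: (saw_ad, purchased).
# saw_ad is the independent (X) variable; purchased is the dependent (Y) variable.
responses = (
    [("yes", "bought")] * 30 + [("yes", "did not buy")] * 20
    + [("no", "bought")] * 10 + [("no", "did not buy")] * 40
)

counts = Counter(responses)
column_totals = Counter(x for x, _ in responses)

# Percentage within each category of the independent variable,
# so every column sums to 100 and columns can be compared across a row.
column_pct = {
    (x, y): 100 * n / column_totals[x] for (x, y), n in counts.items()
}

for y in ("bought", "did not buy"):
    print(y, {x: column_pct[(x, y)] for x in ("yes", "no")})
```

In this made-up table, 60 percent of those who saw the ad bought, against 20 percent of those who did not.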
Curves and their shapes
- Bell-shaped - Positive skew - Negative skew
"What can we do with that data using what kind of analyses that help us address the original research problem at greater depth?"
Bivariate
use a scatterplot
Bivariate
Means that two or more sets of variables are related to one another at the interval or ratio level
Correlation
What is another form of determining correlation?
Significance
Inferential Statistics
Studying a sample of data and inferring that its results project onto the larger population - More powerful than descriptive statistics - The process of statistical inference
Cross Tabulation
Technique for studying the relationships between variables. Simplest technique for analyzing nominal variables in bivariate ways
The arithmetic average. The most powerful measure of central tendency, but it comes at a price: it is especially sensitive to extreme scores.
The Mean
The midpoint of a distribution
The Median
The score or scores occurring most frequently
The Mode
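A short sketch of the three measures of central tendency, using a hypothetical list of scores with one extreme value to show the "price" the mean pays:

```python
import statistics

# Hypothetical scores; the 40 is a deliberate extreme value.
scores = [2, 3, 3, 4, 5, 5, 5, 40]

mean = statistics.mean(scores)          # arithmetic average, uses every value
median = statistics.median(scores)      # midpoint of the ordered distribution
modes = statistics.multimode(scores)    # most frequent score or scores

print(mean, median, modes)
```

The single extreme score pulls the mean (8.375) well above the median (4.5), while the mode (5) is unaffected.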
Confidence Level
The level of statistical confidence that is built into the sample. Commonly in advertising and public relations research, we want to be confident that our results are right maybe 90 percent of the time. Confidence levels tell us how likely it is that our estimate is on target.
Purpose of research and methods
The main purpose of summary and descriptive methods of quantitative analysis is to reduce data and to enable us to see patterns and trends. You run the stats. If you find something important, you highlight it in the report. You always start with univariate statistics and analyses. Then you move on to whatever bivariate analyses will reveal something important about the research problem.
Statistical Inference
The process of estimating parameters (characteristics of a population) from statistics (characteristics of a sample). It is almost always necessary because we typically study samples instead of populations.
All projective techniques share what element in common?
They are all based on the projective hypothesis
True or False? All quantitative data, regardless of level of measurement, can be analyzed with frequency distribution tables, proportions, percentages and ratios
True
True or False? If you want to know how well the mean represents a distribution of numbers, you should also look at the standard deviation
True
True or False? Inferential statistics are more powerful than descriptive statistics.
True
True or False? Measure of dispersion can be used to reveal the "shape" of a distribution of scores or numbers
True
True or False? The chi-square statistic and test for significance is typically used with data measured at the nominal level.
True
True or False? The term that's used to describe whether a distribution's curve is tall and peaked or short and flat is called "kurtosis"
True
True or False? Unlike the mode and the median, the mean takes into account all the values in a distribution making it especially sensitive to the effects of extreme scores
True
True or False? You could reduce data using a frequency distribution table.
True
True or False? A distribution of numbers has to be at least ordinal level for the median to be a meaningful measure of central tendency.
True
True or False? Frequency distributions are one method for presenting data in descriptive statistical analysis
True
True or false? The square root of the variance of a distribution of numbers is the standard deviation
True
Once you've collected your data, you begin with univariate analysis. One good way to begin is by looking at the items you have that measure related concepts and variables.
Univariate
Inferential Statistics are used for?
Used to make estimates of how likely it is these statistics represent the population
The T-Test
Used to test for significance between two means. The independent or antecedent variable is nominal and the dependent or criterion variable must be interval or ratio.
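A hand-computed sketch of an independent-samples t statistic, assuming equal variances (the pooled-variance form) and two hypothetical groups. The group names and scores are illustrative only:

```python
import math
import statistics

# Hypothetical interval-level scores for two nominal groups.
group_a = [5, 6, 7, 8, 9]
group_b = [1, 2, 3, 4, 5]

n_a, n_b = len(group_a), len(group_b)
mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

# Pooled variance weights each group's sample variance by its degrees of freedom.
df = n_a + n_b - 2
pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df

# Standard error of the difference between the two means.
se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
t_stat = (mean_a - mean_b) / se_diff

print(t_stat, df)
```

The resulting t is then compared with a critical value from a t table for the same degrees of freedom (about 2.306 for df = 8 at the two-tailed .05 level).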
ANOVA
Used to test for significant differences among three or more means. Type of regression analysis.
Pearson's r
Varies between -1.00 and +1.00. Values closer to -1.00 or +1.00 indicate a stronger relationship; the sign indicates its direction. Pearson's r can be tested for significance. Relationships can be "spurious". Most researchers use adjectives such as "moderate" to "high" to describe the strength of correlations, but it depends on the nature of the phenomenon under study. The r value indicates the strength of a linear relationship.
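A minimal sketch of how Pearson's r can be computed from deviation scores, using hypothetical X and Y values. Squaring r gives the proportion of variance the two variables share:

```python
import math

# Hypothetical paired interval-level scores.
x = [1, 2, 3, 4, 5]   # antecedent variable (X)
y = [2, 4, 5, 4, 5]   # criterion variable (Y)

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Sum of cross-products and sums of squares of the deviation scores.
sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sum_xx = sum((a - mean_x) ** 2 for a in x)
sum_yy = sum((b - mean_y) ** 2 for b in y)

r = sum_xy / math.sqrt(sum_xx * sum_yy)
r_squared = r ** 2

print(round(r, 4), round(r_squared, 2))
```

For these made-up scores r is about 0.77 (positive direction), and r squared is 0.60.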
interval level of measurement
linear scale of measurement in which the distance or interval between all integers along the scale is equal. Reported as frequencies and percentages PLUS mean scores.
nominal level of measure
lowest level of measurement, data is placed into categories or classes without any order, value or structure.
- Used with interval and ratio - Average - Most powerful
mean
The sum of all scores in a distribution of numbers, divided by the number of scores is the definition of _____
mean
ordinal level of measure
measures and defines attributes and characteristics in an ordered sequence. ranks responses from the smallest to largest, best to worst and first to last
- Used with ordinal, interval and ratio - The midpoint of a distribution
median
- The only measure of central tendency that can be used at the nominal level of measurement (weakest)
mode
Probability level is usually expressed by a lowercase _____?
p (in italics) followed by a less than or equal to sign and a value
ratio level of measurement
highest level of measurement; has a true zero point (e.g., height, weight, age, length)
Standard error of the mean indicates
how well the sample mean is likely to represent the population mean.
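A small sketch of the standard error of the mean, computed as the sample standard deviation divided by the square root of the sample size. The weekly-hours figures are hypothetical:

```python
import math
import statistics

# Hypothetical sample: weekly hours spent online by nine respondents.
hours = [10, 12, 14, 16, 18, 20, 22, 24, 26]

sample_mean = statistics.mean(hours)
sample_sd = statistics.stdev(hours)           # sample standard deviation (n - 1)
sem = sample_sd / math.sqrt(len(hours))       # standard error of the mean

print(sample_mean, round(sample_sd, 3), round(sem, 3))
```

A smaller standard error suggests the sample mean is a tighter estimate of the population mean; larger samples shrink it.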
What are examples of bivariate statistics?
- Chi Square - Correlation - ANOVA - cross tabulations
Pearson's Product Moment Correlation
- Has a sign indicating direction of relationship - Number indicates strength of relationship - Meaningful value when squared
What are examples of univariate statistics?
- Measures of central tendency - Dispersion
Bell-Shaped Curve
- Normal distribution of scores - Cluster around the mean
The different measures of dispersion (3)
- Range - Variance - Standard deviation
Three Measures of Dispersion
- Range - the difference between the highest and lowest scores in a distribution of scores - Variance - provides a mathematical index of the degree to which scores deviate from, or are at variance with, the mean. - Standard Deviation - overcomes the problem of variance not being calibrated in the same units as the original data by taking the square root of the variance.
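The three measures can be sketched on a hypothetical set of scores. This example treats the scores as a whole population (dividing by N); a sample version would divide by N - 1 instead:

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores, mean = 5

score_range = max(scores) - min(scores)    # highest minus lowest score
variance = statistics.pvariance(scores)    # mean squared deviation from the mean
std_dev = statistics.pstdev(scores)        # square root of the variance

print(score_range, variance, std_dev)
```

The variance (4) is in squared units, while the standard deviation (2.0) is back in the same units as the original scores.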
Univariate Variable
- Single variable - Simplest form of data tabulation - Involves the presence of frequency and distribution
Quantitative Analysis
- Starts with little background info - Objective - Deductive and logical - Provides generalizations - Sample size is essential - Goal is to classify and count - Uses measuring instruments
Chi-Square
- Values identified by "χ²" - Used with data measured at the nominal level - Used to test the frequency distribution of a single variable or to compare two or more groups - Chi-square is a nonparametric statistic. - Chi-square is best thought of as a discrepancy statistic (AKA, "goodness of fit").
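A hand-computed sketch of chi-square as a discrepancy statistic, comparing observed counts with the counts expected if the two variables were unrelated. The 2 x 2 table is hypothetical:

```python
# Hypothetical observed counts.
# Rows: bought, did not buy (dependent). Columns: saw ad, did not see ad (independent).
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the row and column variables were independent.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(round(chi_square, 3), df)
```

The statistic is then compared with a critical value from a chi-square table (3.841 for df = 1 at the .05 level); a larger value means a bigger discrepancy between observed and expected counts.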
What are some data tabulation issues?
- hand tabulation is inefficient - human error while recoding - lose sight of micro-level trends - average of data does not show the entire picture - collapsing data lessens precision of analysis
A variable that goes down the side of a cross tabulation table and whose values are in the rows should be the
Dependent Variable
Dispersion measures
Describe the way in which the scores are spread out about a central point
True or False? A cross-tabulation is used when one nominal-level variable and one interval-level variable need to be analyzed simultaneously to see if there's a relationship between them
False
True or False? If you wanted to test more than two mean scores you'd have to use a t-test instead of ANOVA
False
True or False? It is generally assumed that 100% of the scores in a distribution of numbers will fall within plus or minus two standard deviations from the mean, especially if the distribution is believed to be normal
False
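The fixed areas under a normal curve can be checked with a quick sketch using the standard normal distribution (mean 0, standard deviation 1):

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Proportion of scores within plus or minus k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = area_within(1)
within_2 = area_within(2)
within_3 = area_within(3)

print(round(within_1, 4), round(within_2, 4), round(within_3, 4))
```

About 95 percent of scores, not 100 percent, fall within two standard deviations of the mean, which is why the statement above is false.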
True or False? Measures of central tendency and dispersion are examples of bivariate statistics?
False
True or False? When testing for Mr. Pearson's r for statistical significance, the null hypothesis is almost always that the correlation between the two variables in the population is greater than .05
False
What is the most important feature of a normal curve?
Fixed areas below the normal curve represent the frequency of the scores or values of a variable that fall in those areas. Allows us to know how variables vary without measurement.
Are summary descriptive statistics a way to reduce data?
Yes. - Measures of Central Tendency - Mean, Median, Mode
parameter
a characteristic of a population and is generally unknown.
statistic
a characteristic of a sample (e.g., average weekly hours spent online by OU Freshmen).
True or False? The Median and Mean can be used to analyze all four levels of measurement
False