stats
How are the mean, median, and mode affected by skewness?
(Alternatively, where are the mean, median, and mode located relative to one another in a skewed distribution?) The mean is dragged toward the direction of skew the most, the median less so, and the mode even less so. (There are good pictures in Chapter 3 illustrating this.)
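A quick way to see this ordering is to compute all three measures on a small skewed data set. The scores below are made up for illustration (they are not from the textbook), with most scores low and a long tail to the right, so the skew is positive.

```python
from statistics import mean, median, mode

# Made-up, positively skewed scores: most are low, with a long right tail.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9, 19]

print("mode:  ", mode(scores))    # the most frequent score, least affected by the tail
print("median:", median(scores))  # the middle of the sorted list
print("mean:  ", mean(scores))    # the balance point, dragged furthest toward the tail
```

With positive skew the order should come out mode, then median, then mean, from low to high. Flipping the data so the tail points left would reverse that order.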
Ch 2
...
Ch 3
...
Ch 4
...
Ch 5
...
Ch 6
...
How is a z-score computed?
...
How is a z-score transformed to a raw score?
...
What is the mean of a distribution of z-scores?
0 (zero). Always!
What is the standard deviation of a distribution of z-scores?
1 (one). Always!
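These two facts can be checked on any set of scores. Here is a minimal sketch, assuming a small made-up population and the usual conversion of subtracting the mean and dividing by the SD.

```python
from statistics import mean, pstdev

# Made-up population of scores (not from the textbook).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mu = mean(scores)       # population mean
sigma = pstdev(scores)  # population standard deviation

# Convert every raw score to a z-score.
z_scores = [(x - mu) / sigma for x in scores]

print(z_scores)
print("mean of z-scores:", mean(z_scores))
print("SD of z-scores:  ", pstdev(z_scores))
```

Any other set of scores (with at least two different values) should give the same mean and SD for its z-scores.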
What is a construct?
A construct is an unobservable thing (like memory or intelligence) that is useful for explaining or describing behavior.
What is a dependent variable?
A dependent variable is the variable that is measured to see if a manipulation had an effect.
What is a discrete variable, and what is a continuous variable?
A discrete variable is one that has no "values" between adjacent labels (like a multiple-choice question). A continuous variable is one that always has values between any two labels.
What is the general goal of finding a measure of central tendency?
A measure of central tendency provides a single value to represent an entire set of values. Usually we look to the middle/center for this best representative.
What is the general goal of finding a measure of spread or variability?
A measure of spread provides a quantity that helps describe how spread out or clustered a group of scores is.
What is a (population) parameter? What is a (sample) statistic? How are the two related?
A parameter is something (usually a quantity, such as an average) that describes a population. Similarly, a statistic is something that describes a sample. Because we hope that samples represent populations, we similarly hope that statistics provide good estimates for parameters.
In terms of research, what is a population? What is a sample? How are the two related?
A population is the group of interest in a study. A sample is the subset of the population that is actually studied. Populations are generally too large to study, so instead a sample is taken from the population, with the hope being that the sample represents the population well.
What is a quasi-experiment?
A quasi-experiment is a type of correlational research in which one or more variables are used to create groups to compare to one another. Structurally, this is similar to an experiment in which groups are created by a researcher's manipulation, but because these groups already exist, no cause-and-effect conclusions can be drawn.
How does a simple frequency distribution table differ from a grouped frequency distribution table?
A simple frequency distribution has one score per row, whereas a grouped frequency distribution has many scores per row.
Describe a skewed distribution. And what makes a distribution negatively skewed or positively skewed?
A skewed distribution is one that has scores clustered at the high end (this is negative skew) or at the low end (positive skew).
Describe a symmetrical distribution.
A symmetrical distribution is one whose left side is a mirror image of its right side.
What is a z-score?
A z-score is a value that describes the exact location of a score in a distribution relative to the mean.
What information does a z-score contain?
A z-score's sign (+, -) tells whether a score is above or below the mean of a distribution. A z-score's numerical value tells how far away from the mean a score is, counting in standard deviations.
What effect does transforming a variable have on the SD?
Adding or subtracting a constant to/from every score will not change the SD. Multiplying or dividing by a constant will multiply or divide the SD by that constant.
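Both rules are easy to confirm numerically. This is only a sketch on made-up scores; the constants 10 and 3 are arbitrary choices.

```python
from statistics import pstdev

# Made-up scores (not from the textbook).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

shifted = [x + 10 for x in scores]  # add a constant to every score
scaled = [x * 3 for x in scores]    # multiply every score by a constant

print("original SD:", pstdev(scores))
print("after + 10: ", pstdev(shifted))  # same as the original SD
print("after * 3:  ", pstdev(scaled))   # 3 times the original SD
```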
What is an independent variable? What is a quasi-independent variable? How are they similar, and how are they different?
An IV is a variable manipulated by a researcher to create groups to compare. A quasi-IV is a variable selected (but not manipulated) to create groups to compare. They are similar in that in both cases groups are being compared, but in an experiment the groups are created by the researcher, whereas in a quasi-experiment the groups are selected by the researcher.
What is an operational definition?
An operational definition is a set of procedures used to measure some construct. For example, we can measure memory by seeing how many words someone remembers from a list they studied.
Can z-scores be used to decide what scores are "central" and what scores are "extreme"?
As a guideline, z-scores between +1 and -1 are called "central" (because they are close to the middle, that is, the mean). The textbook suggests that scores beyond ±2 are "extreme". This leaves the scores that are between +1 and +2 (and between -1 and -2) without a label. And that's okay!
Why can scores from different scales be compared by using z-scores?
Because all distributions of z-scores have the same mean and same standard deviation.
How are z-scores related to probability?
Because we know the mean (0) and standard deviation (1) of the standard normal curve (that is, any normal curve that's been converted to z-scores), and its shape, we can make use of the table in the back of the book (which is based on calculus - integration in particular) to find out the probability of finding scores above or below any z-score, or between any two z-scores. This is very convenient, because it allows us to figure out the probability of sampling ranges of scores from a normally-shaped population.
What is the purpose of descriptive statistical techniques?
Descriptive statistics are those that are used to describe (summarize, organize) data.
What is sampling error?
Although we hope that statistics provide good estimates for parameters, they are rarely perfect estimates. This imperfection is called sampling error. Sampling error occurs because it is all but inevitable that a sample will not tell us everything we want to know about a population. Sampling error is NOT due to errors in computation or mistakes that researchers make or bad research methods or designs.
How are z-scores used to find the probability of events (that is, ranges of scores) occurring?
Draw the population, labeling the mean and SD. Determine if you are interested in the body, the tail, or some segment of the normal curve. Convert the score(s) of interest to a z-score. Look up the z-score(s) in the table in the back of the book and figure out the area of the part of the curve you're interested in.
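The same steps can be sketched in code, with Python's `statistics.NormalDist` standing in for the table in the back of the book. The population values (mean 100, SD 15) and the scores of interest are made up for illustration.

```python
from statistics import NormalDist

# Made-up, normally shaped population (not from the textbook).
mu, sigma = 100, 15

# The standard normal curve: mean 0, SD 1. Its cdf plays the role of the table.
standard_normal = NormalDist(0, 1)

def z_score(x):
    """Convert a raw score to a z-score using the population mean and SD."""
    return (x - mu) / sigma

# Tail: probability of a score above 130 (z = +2).
tail = 1 - standard_normal.cdf(z_score(130))

# Segment: probability of a score between 85 and 115 (z = -1 to z = +1).
between = standard_normal.cdf(z_score(115)) - standard_normal.cdf(z_score(85))

print(f"p(X > 130)      = {tail:.4f}")     # about 0.0228
print(f"p(85 < X < 115) = {between:.4f}")  # about 0.6827
```

The `cdf` method gives the area below a z-score (the body, for a positive z), so a tail is one minus that area and a segment is the difference between two such areas.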
What is experimental research? What is correlational research? How do the two differ from one another?
Experimental research is research in which one or more variables are manipulated by a researcher to see if these manipulations cause something to happen. Correlational research is research in which variables are observed to see if they are related. The major difference between these two types of research is that in experiments there is manipulation that is under the control of the researcher, which allows for cause-and-effect conclusions to be drawn, whereas this is not possible in correlational research.
How can scores be transformed from one scale to another?
First, transform a raw score from the original scale to z using the original scale's mean and SD. Then transform the z to a raw score using the new scale's mean and SD.
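The two steps can be written as one small function. This is a sketch: the function name `rescale` is made up here, and the raw score of 38 is an arbitrary example on a scale with mean 29 and SD 4.5 being moved to a scale with mean 50 and SD 10.

```python
def rescale(x, old_mean, old_sd, new_mean, new_sd):
    """Move a raw score from one scale to another by way of its z-score."""
    z = (x - old_mean) / old_sd   # step 1: raw score -> z on the original scale
    return new_mean + z * new_sd  # step 2: z -> raw score on the new scale

# A score of 38 is 2 SDs above the old mean, so it lands 2 SDs above the new mean.
print(rescale(38, old_mean=29, old_sd=4.5, new_mean=50, new_sd=10))  # 70.0
```

The z-score is the same on both scales, which is the whole point: the score keeps its relative location.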
Under what circumstances should bar graphs be used? Histograms? Frequency polygons? How are these graphical displays similar? How are they different?
Histograms and frequency polygons should be used for continuous variables, whereas bar graphs should be used for discrete variables. They are all similar in that they have scores on the x-axis and frequencies on the y-axis, but differ in whether frequencies are represented with bars (histograms and bar graphs) or with points on a polygon (frequency polygons), and whether the bars touch (histogram) or not (bar graph).
What dictates whether a simple or grouped frequency distribution table should be used for a particular data set?
If "too many" rows are needed for a simple frequency distribution table, a grouped table should be used. "Too many" usually means more than about 10 rows.
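Here is a rough sketch of that decision on made-up scores. The interval width of 10 is an assumption chosen for the example, not a rule from the textbook.

```python
from collections import Counter

# Made-up exam scores (not from the textbook).
scores = [52, 55, 55, 61, 64, 67, 67, 67, 70, 73, 78, 78, 81, 85, 92]

# Simple table: one row per distinct score.
simple = Counter(scores)
print("rows in simple table:", len(simple))

# Grouped table: one row per interval of scores (assumed width of 10).
width = 10
grouped = Counter((x // width) * width for x in scores)

# Print the grouped table with the highest interval first.
for low in sorted(grouped, reverse=True):
    print(f"{low}-{low + width - 1}: {grouped[low]}")
```

The simple table needs 11 rows here, which is past the rough limit of 10, while the grouped table needs only 5.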
Why are cause-and-effect conclusions possible in experiments but not in correlational research?
In a well-designed experiment, if the manipulation of one variable is related to a change in another variable, the only possible reason for that change is the manipulation, thus allowing a causal conclusion to be drawn. In correlational research, if two variables are found to be related, it is always possible that some other variable besides the ones that have been observed could be causing the relationship. (Pages 12-14 of the textbook have a nice discussion of this.)
What good are z-scores?
In addition to providing a descriptive tool, they are useful for transforming scores from one scale to another (for example, from a scale that has μ = 29 and σ = 4.5 to one with μ = 50 and σ = 10) and for comparing scores on different scales.
What is the purpose of inferential statistical techniques?
Inferential statistics are those that are used to generalize the information gathered from a sample to a larger population.
What are the three main measures of central tendency?
Mean, median, and mode.
Why is probability important?
Probability allows us to determine the likelihood of getting different scores and samples from a population, if we know the shape, mean, and standard deviation. Because of this, we will also be able to go the other direction: We'll be able to say that, given a certain sample, how likely is it that the sample came from a certain population? This leap from samples to populations is what research is all about.
What is probability?
Probability is the likelihood of some event or outcome occurring when more than one event or outcome is possible. (That is, it doesn't make a lot of sense to ask what the probability of the sun rising is. There's only one possible event. But to ask what the probability of rain is does make sense, according to this definition.)
How are range statistics similar to the median?
Range statistics are similar to the median in that not every score affects them.
What are some measures of variability?
Range, interquartile range, semi-interquartile range, variance, and standard deviation are the most important of these.
What does the SD tell us?
SD tells us the typical distance between the scores in a distribution and the mean of the distribution.
How is SS computed?
SS is computed by first finding deviations (from the mean) for every score, then by squaring these deviations, and finally by summing the squared deviations. This is why this value is called the Sum of Squared deviations.
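Those three steps translate directly into code. A minimal sketch, using made-up scores and a hypothetical function name `sum_of_squares`:

```python
def sum_of_squares(scores):
    """Sum of squared deviations from the mean (SS)."""
    m = sum(scores) / len(scores)
    deviations = [x - m for x in scores]  # step 1: deviation of every score from the mean
    squared = [d ** 2 for d in deviations]  # step 2: square each deviation
    return sum(squared)                     # step 3: add up the squared deviations

# Made-up scores with a mean of 5.
print(sum_of_squares([2, 4, 4, 4, 5, 5, 7, 9]))  # 32.0
```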
What is a good general definition for the term statistics?
Statistics are procedures for organizing, summarizing, and interpreting information. They are tools for use in doing research.
Why are there different formulas for variance and SD for a population and a sample?
The formulas for variance and SD for a sample are different from those for a population to make the sample variance an unbiased estimator of the population variance. (See pp. 101-102 for how this works.)
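One way to see the bias is to list every possible sample from a tiny population and average the sample variances. This is only an illustration under stated assumptions: the population of three scores is made up, and samples of size 2 are drawn with replacement.

```python
from itertools import product
from statistics import mean, pvariance, variance

# Tiny made-up population (not from the textbook).
population = [1, 5, 9]
sigma2 = pvariance(population)  # the parameter we want to estimate

# Every possible ordered sample of size 2, drawn with replacement (9 samples).
samples = list(product(population, repeat=2))

# Average the sample variances computed each way across all samples.
avg_n_minus_1 = mean(variance(s) for s in samples)  # SS / (n - 1)
avg_n = mean(pvariance(s) for s in samples)         # SS / n

print("population variance:    ", sigma2)
print("average using n - 1:    ", avg_n_minus_1)  # matches the population variance
print("average using n instead:", avg_n)          # too small, so it is biased
```

Dividing by n - 1 does not make any single sample's variance correct. It makes the estimates correct on average.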
What is the general goal of using frequency distribution tables and graphs?
The goal of these tables and graphs is to summarize the information in a distribution of values succinctly, by representing the frequencies of different scores.
Define the mean.
The mean is the balance point of a set of values, or the total of scores equally divided among all the scores.
What is the preferred measure of central tendency, and under what conditions should it not be the preferred measure?
The mean is the default, preferred measure of central tendency. Another measure should be considered in its place if there is a lot of skew or some very extreme scores (use the median) or if a nominal scale is used (use the mode).
What happens to the mean if a score is changed?
The mean will change if any score is changed.
What happens to the mean if a constant is added to (or subtracted from) every score? What about if every score is multiplied by a constant?
The mean will increase or decrease by the constant in the case of adding or subtracting a constant, respectively. In the case of multiplication, the mean will be multiplied by the same constant.
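A quick check of both cases, using made-up scores and arbitrary constants (4 and 2):

```python
from statistics import mean

# Made-up scores with a mean of 6.
scores = [3, 5, 7, 9]

original = mean(scores)
added = mean(x + 4 for x in scores)       # add 4 to every score
multiplied = mean(x * 2 for x in scores)  # multiply every score by 2

print(original, added, multiplied)  # 6 10 12
```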
Define the median.
The median is the value that divides a list into equal halves, with half the values above the median and half below; that is, it's the 50th percentile.
Define the mode.
The mode is the most frequent score.
What are the scales of measurement that can be used when measuring variables?
There are four scales of measurement. In order from the least informative to the most informative, they are nominal, ordinal, interval, and ratio.
What else are z-scores good for?
They can be used to compute the probability of finding ranges of scores from a normally-shaped distribution. This is a big part of what Chapter 6 was about.
What is a standard score?
This is just another label for a z-score.
How are variance and SD similar to the mean?
Variance and SD, just like the mean, are affected by every single score.
How are variance and SD computed?
Variance is simply SS divided by N (for a population) or n - 1 (for a sample). (It is the "average squared deviation".) SD is the square root of the variance.
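Here is a minimal sketch of both versions, starting from SS and taking the square root to get from variance to SD. The scores are made up, and the last line cross-checks the results against Python's `statistics` module.

```python
from math import sqrt
from statistics import pstdev, stdev

# Made-up scores (not from the textbook).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(scores)
m = sum(scores) / n
ss = sum((x - m) ** 2 for x in scores)  # sum of squared deviations

# Treating the scores as a whole population: divide SS by N.
pop_variance = ss / n
pop_sd = sqrt(pop_variance)

# Treating the scores as a sample: divide SS by n - 1.
sample_variance = ss / (n - 1)
sample_sd = sqrt(sample_variance)

print("population:", pop_variance, pop_sd)
print("sample:    ", sample_variance, sample_sd)
print("library:   ", pstdev(scores), stdev(scores))  # should match the two SDs above
```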
Is there a relationship between area and probability?
Yes! In a frequency polygon (or histogram), the proportion of the total area that corresponds to the outcome of interest is precisely the same as the probability of that outcome. This is illustrated very well in Figure 6.2 (on p. 133).
Can I compute z-scores for distributions that are not normally shaped?
Yes! The only time you need to worry about a distribution being normal is if you are trying to determine probability using z-scores.
Is there a relationship between z-scores and frequency?
Yes, if the distribution is bell-shaped or normal. If this is the case, then z-scores that are close to zero (that is, close to the mean) are more frequent, and z-scores that are far from zero (that is, far from the mean) are less frequent. More generally, this means that raw scores that are close to the mean are more frequent, and those that are far from the mean are less frequent.
Is there a relationship between frequency and probability?
Yes. The frequency of the outcome of interest and the frequency of all possible outcomes are in the numerator and denominator, respectively, of the fraction that determines probability.
What is degrees of freedom (df)?
For a sample, df is n - 1 (see p. 99 for why this is).
How is the probability of an event determined?
probability of A = number of A outcomes / number of possible outcomes
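As a small worked example of this fraction, consider drawing one card from a standard 52-card deck. The deck is an illustration chosen here, and the function name `probability` is hypothetical. Note that the fraction assumes every outcome is equally likely.

```python
from fractions import Fraction

def probability(n_outcomes_a, n_possible_outcomes):
    """Probability of A when every possible outcome is equally likely."""
    return Fraction(n_outcomes_a, n_possible_outcomes)

# Drawing one card from a standard 52-card deck.
p_king = probability(4, 52)    # 4 kings in the deck
p_heart = probability(13, 52)  # 13 hearts in the deck

print(p_king, float(p_king))    # 1/13, about 0.077
print(p_heart, float(p_heart))  # 1/4, exactly 0.25
```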
interval -
this scale provides labels + order + distance (interval) information (the last of these is needed to do meaningful addition and subtraction)
ratio -
this scale provides labels + order + distance + a true zero (the last of these is needed to do meaningful multiplication and division)
ordinal -
this scale provides labels + order information (like letter grades)
nominal -
this scale provides only labels or names for things being measured