Psych Statistics-Chapter 2
A skewed distribution with the tail on the right-hand side is _____ because the tail points toward the positive (above-zero) end of the X-axis. If the tail points to the left, the distribution is _____.
Positively skewed, negatively skewed
An example of a frequency distribution polygon for grouped data:
The same set of data is presented in a frequency distribution table and in a polygon
An example of a frequency distribution histogram:
The same set of quiz scores is presented in a frequency distribution table and in a histogram
For a nominal scale, the space between bars emphasizes that:
The scale consists of separate, distinct categories
It is customary to list categories from highest to lowest, but:
This is an arbitrary arrangement (many computer programs list categories from lowest to highest)
In a skewed distribution, the scores tend to:
Pile up toward one end of the scale and taper off gradually at the other end
A frequency distribution takes a disorganized set of scores and does what with them?
Places them in order from highest to lowest, grouping together individuals who all have the same score (ex. if the highest score is X = 10, the frequency distribution groups together all the 10s, then all the 9s, then the 8s, and so on)
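The grouping step described above can be sketched in a few lines of Python. This is only an illustration: the function name `frequency_table` and the sample scores are made up for the example, and score values that never occur are simply left out of the result.

```python
from collections import Counter

def frequency_table(scores):
    """Return (X, f) pairs ordered from the highest score to the lowest."""
    counts = Counter(scores)  # tallies how many individuals share each score
    return sorted(counts.items(), reverse=True)

scores = [8, 9, 8, 7, 10, 9, 6, 4, 9, 8]  # hypothetical quiz scores
for x, f in frequency_table(scores):
    print(f"X = {x:2d}   f = {f}")
```

Adding up the f column should give back the total number of scores, which is a quick check that no one was dropped.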
The second option for graphing a distribution of numerical scores from an interval or ratio scale of measurement (besides a histogram) is called a:
Polygon
When you can obtain an exact frequency for each score in a _____, you can construct frequency distribution graphs that are exactly the same as the histograms, polygons, and bar graphs that are typically used for samples
Population
For a very difficult exam, most scores tend to be low, with only a few individuals earning high scores. This produces what kind of distribution?
Positively skewed distribution
Not all distributions are perfectly symmetrical or obviously skewed in one direction. Therefore, it is common to modify these descriptions of shape with phrases like:
"Roughly symmetrical" or "tends to be positively skewed" (the goal is to provide a general idea of the appearance of the distribution)
On the second graph, what does the break (-/ /-) on the x-axis mean?
-The graph does not list all of the possible heights starting from zero and going up to 48 inches -Instead, it clearly shows a break between zero and 30, indicating that some scores have been omitted
How do you find where to position a dot on a polygon with data that has been grouped into class intervals?
-For a grouped distribution, you position each dot directly above the MIDPOINT of the class interval -The midpoint can be found by averaging the highest and the lowest scores in the interval -For example, a class interval that is listed as 20-29 would have a midpoint of 24.5
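The midpoint rule is a one-line calculation. A minimal sketch, with `interval_midpoint` as a hypothetical helper name:

```python
def interval_midpoint(low, high):
    """Average the lowest and highest scores listed for a class interval."""
    return (low + high) / 2

print(interval_midpoint(20, 29))  # 24.5
```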
Give an example of how you can construct these frequency distribution graphs mentioned in the previous flashcard:
-For example, if a population is defined as a specific group of N = 50 people, we could easily determine how many have IQs of X = 110 -However, if we were interested in the entire population of adults in the United States, it would be impossible to obtain an exact count of the number of people with an IQ of 110
Give an example of what a relative frequency is:
-For example, no one knows the exact number of male and female human beings living in the United States because the exact numbers keep changing -However, based on past census data and general trends, we can estimate that the two numbers are very close, with women slightly outnumbering men
Example of how the same set of data can be presented in two entirely different ways by manipulating the structure of a graph (1):
-For the past several years, the city has kept records of the number of homicides occurring per year. The data are summarized as follows: -These data are shown in 2 different graphs
Give an example of how with continuous variables, measurements correspond to INTERVALS on the number line rather than SINGLE POINTS:
-If you are measuring time in seconds, for example, a score of 8 actually represents an interval bounded by the real limits 7.5 seconds and 8.5 seconds -Thus, a frequency distribution table showing a frequency of f = 3 individuals all assigned a score of X = 8 does not mean that all three individuals had exactly the same measurement -The three measurements are simply located in the same interval between 7.5 and 8.5
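The real limits in this example can be computed directly. The sketch below assumes scores are recorded to the nearest measurement unit; the helper name `real_limits` and its `unit` parameter are illustrative, not taken from the text.

```python
def real_limits(score, unit=1):
    """Return the (lower, upper) real limits of a score measured to the nearest unit."""
    half = unit / 2  # real limits sit halfway to the neighboring scores
    return (score - half, score + half)

print(real_limits(8))  # (7.5, 8.5)
```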
Give an example of how the wider the class intervals are, the more information is lost:
-In the table in the notebook, the interval width is 5 points, and the table shows that there are three people with scores in the lower 60s and one person with a score in the upper 60s -This information would be lost if the interval width were increased to 10 points -With an interval width of 10, all of the 60s would be grouped together into one interval labeled 60-69 -The table would show a frequency of four people in the 60-69 interval, but it would not tell whether the scores were in the upper 60s or the lower 60s
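The loss of information can be seen by grouping the same scores with two different widths. This is a rough sketch: the four scores are invented to match the description (three in the lower 60s, one in the upper 60s), and the function assumes whole-number scores with each interval starting at a multiple of the width.

```python
from collections import Counter

def grouped_frequencies(scores, width):
    """Count scores per class interval, listed from the highest interval down."""
    counts = Counter((x // width) * width for x in scores)  # key = bottom score of the interval
    return {f"{low}-{low + width - 1}": counts[low]
            for low in sorted(counts, reverse=True)}

scores = [61, 62, 64, 68]  # hypothetical scores in the 60s
print(grouped_frequencies(scores, 5))   # {'65-69': 1, '60-64': 3}
print(grouped_frequencies(scores, 10))  # {'60-69': 4}
```

With a width of 10 the split between lower and upper 60s is no longer recoverable from the table.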
What is a version of an easy to draw and simple to understand histogram?
-Instead of drawing a bar above each score, the modification consists of drawing a stack of blocks -Each block represents one individual, so the number of blocks above each score corresponds to the frequency for that score
In the previous flashcard, which graph is correct?
-Neither one is very good -The purpose of a graph is to provide an ACCURATE display of the data (the first graph exaggerates the differences between years and the second graph conceals them, so some compromise is needed)
What are 2 features of normal distributions?
-Normal-shaped distributions occur commonly -This shape is mathematically guaranteed in certain situations
What are 2 things to notice about the graphs on the previous flashcards?
-Notice that the values on both the vertical and horizontal axes are clearly marked and that both axes are labeled -The second graph shows a distribution of heights measured in inches (whenever possible, the units of measurement are specified)
What are some advantages of the simplified histogram?
-The number of blocks in each stack makes it very easy to see the absolute frequency for each category -It is easy to see the exact difference in frequency from one category to another -Because the frequencies are clearly displayed by the number of blocks, this type of display eliminates the need for a vertical line (the Y-axis) showing frequencies -In general, this kind of graph provides a simple and concrete picture of the distribution for a sample of scores
Example of how the same set of data can be presented in two entirely different ways by manipulating the structure of a graph (2):
-These data are shown in two different graphs -In the first graph, we have exaggerated the height and started numbering the Y-axis at 40 rather than at zero (as a result, the graph seems to indicate a rapid rise in the number of homicides over the four-year period) -In the second graph, we have stretched out the X-axis and used zero as the starting point for the Y-axis (the result is a graph that shows little change in the homicide rate over the 4-year period)
How can you represent the relative frequencies on the previous flashcard (the example with men and women) on a graph?
-You can represent these relative frequencies in a bar graph by making the bar above Female slightly taller than the bar above Male -The graph does not show the absolute number of people; instead, it shows the RELATIVE number of males and females
When constructing or working with a grouped frequency distribution table, a common mistake is to calculate the interval width by using the highest and lowest values that define each interval. For example, some students are tricked into thinking that an interval identified as 20-24 is only 4 points wide. To determine the correct interval width, you can:
1. Count the individual scores in the interval. For this example, the scores are 20, 21, 22, 23, and 24 for a total of 5 values. Thus, the interval width is 5 points. 2. Use the real limits to determine the real width of the interval. For example, an interval identified as 20-24 has a lower real limit of 19.5 and an upper real limit of 24.5 (halfway to the next score). Using the real limits, the interval width is 24.5-19.5 = 5 points
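Both ways of finding the interval width can be checked with a short sketch. It assumes whole-number scores, and the two function names are hypothetical.

```python
def width_by_counting(low, high):
    """Count the whole-number scores from low to high, inclusive."""
    return len(range(low, high + 1))

def width_by_real_limits(low, high):
    """Subtract the lower real limit from the upper real limit."""
    lower_real, upper_real = low - 0.5, high + 0.5
    return upper_real - lower_real

print(width_by_counting(20, 24))     # 5
print(width_by_real_limits(20, 24))  # 5.0
```

The tempting shortcut `high - low` gives 4 for this interval, which is the mistake the card warns about.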
When the data consist of numerical scores that have been measured on an interval or ratio scale, there are two options for constructing a frequency distribution graph. The two types of graphs are called:
1. Histograms 2. Polygons
Although it is still possible to construct graphs showing frequency distributions for extremely large populations, the graphs usually involve two special features:
1. Relative frequencies 2. Smooth curves
There are 3 characteristics that completely describe any distribution:
1. Shape 2. Central tendency 3. Variability
A frequency distribution can be structured either as a table or as a graph, but in either case, the distribution presents the same 2 elements:
1. The set of categories that make up the original measurement scale 2. A record of the frequency, or number of individuals in each category
It is recommended that a REGULAR (not grouped) frequency distribution table have a maximum of _____ rows to keep it simple
10 to 15
What is a final general rule for the way in which a graph should be constructed?
A final general rule is that the graph should be constructed so that its height (Y-axis) is approximately two-thirds to three-quarters of its length (X-axis)
Modified (simplified) histogram:
A frequency distribution graph in which each individual is represented by a block placed directly above the individual's score. For example, three people had scores of X = 2.
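A simplified histogram is easy to mimic in plain text. The sketch below is one possible rendering, not a standard one: it uses `#` for each block, assumes single-digit whole-number scores so the columns line up, and uses invented scores in which three people scored X = 2.

```python
from collections import Counter

def block_histogram(scores):
    """Stack one '#' block per individual above each score."""
    counts = Counter(scores)
    xs = range(min(scores), max(scores) + 1)
    tallest = max(counts.values())
    rows = []
    for level in range(tallest, 0, -1):  # draw from the top row down
        row = " ".join("#" if counts[x] >= level else " " for x in xs)
        rows.append(row.rstrip())
    rows.append(" ".join(str(x) for x in xs))  # X-axis labels
    return "\n".join(rows)

scores = [1, 2, 2, 2, 3, 3, 4]  # hypothetical scores
print(block_histogram(scores))
```

Counting the blocks in a column gives the frequency for that score, so no Y-axis is needed.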
When data have been grouped into class intervals, you can construct:
A frequency distribution histogram by drawing a bar above each interval so that the width of the bar extends to the real limits of the interval (adjacent bars touch with no space between bars)
A frequency distribution graph is basically:
A picture of the information available in a frequency distribution table
A frequency distribution graph provides:
A picture showing exactly where the individual scores are located. To make this concept more concrete, you might find it useful to think of the graph as showing a pile of individuals, just like the pile of blocks in the simplified histogram. For the population of IQ scores shown, the pile is highest at an IQ score around 100 because most people have average IQs.
The word "normal" in normal curve refers to:
A specific shape that can be precisely defined by an equation
What kind of distribution would an easy exam produce?
A very easy exam tends to produce a negatively skewed distribution, with most of the students earning high scores and only a few with low values
The horizontal line is the X-axis, or the _____. The vertical line is the Y-axis, or the _____.
Abscissa, ordinate
What does the frequency distribution allow?
Allows the researcher to see the entire set of scores "at a glance"
The shape of a distribution is defined by:
An equation that prescribes the exact relationship between each X and Y value on the graph
The population distribution of IQ scores:
An example of a normal distribution
What is a result of each bar in a histogram extending to the midpoint between adjacent categories?
As a result, adjacent bars TOUCH and there are no spaces or gaps between bars
When the scores are measured on a nominal or ordinal scale (usually non-numerical values), the frequency distribution can be displayed in a:
Bar graph
A bar graph showing the distribution of personality types in a sample of college students:
Because personality type is a discrete variable measured on a nominal scale, the graph is drawn with space between the bars
We can describe a normal distribution as:
Being symmetrical, with the greatest frequency in the middle and relatively smaller frequencies as you move toward either extreme
How does the simplest frequency distribution table present the measurement scale?
By listing the different measurement categories (X values) in a column from highest to lowest. Beside each X value, we indicate the frequency, or the number of times that particular measurement occurred in the data (it is customary to use an X as the column heading for the scores and an f as the column heading for the frequencies)
A polygon also can be used with data that have been grouped into:
Class intervals
Rather than drawing a complete frequency distribution graph, researchers often simply:
Describe a distribution by listing its characteristics
In a symmetrical distribution, it is possible to:
Draw a vertical line through the middle so that one side of the distribution is a mirror image of the other
Although graphs are intended to provide an accurate picture of a set of data, they can be used to:
Exaggerate or misrepresent a set of scores
These misrepresentations on a graph generally result from:
Failing to follow the basic rules for graph construction
In addition to using frequencies (f) and proportions (p), researchers often describe a distribution of scores with percentages. Give an example of this:
For example, an instructor might describe the results of an exam by saying that 15% of the class earned As, 23% Bs, and so on
The _____ are listed on the Y-axis with values increasing from bottom to top
Frequencies
One of the most common procedures for organizing a set of data is to place the scores in a:
Frequency distribution
Whenever the term distribution appears, you should conjure up an image of a:
Frequency distribution graph
In general, the wider the class intervals are, the more:
Information is lost
When a continuous variable is measured, the resulting measurements correspond to:
Intervals on the number line rather than single points
What does the frequency distribution show?
It shows whether the scores are generally high or low, whether they are concentrated in one area or spread out across the entire scale, and generally provides an organized picture of the data
When a set of data covers a wide range of values, it is unreasonable to:
List all the individual scores in a frequency distribution table
You should note that after the scores have been placed in a grouped table, you:
Lose information about the specific value for any individual score
The _____ is listed along the X-axis with values increasing from left to right
Measurement scale (set of X values)
What is variability?
Measures the degree to which the scores are spread over a wide range or are clustered together
What is central tendency?
Measures where the centre of the distribution is located
One commonly occurring population distribution is the _____ curve
Normal
A frequency distribution showing the relative frequency of females and males in the United States:
Note that the exact number of individuals is not known; the graph simply shows that there are slightly more females than males
Give an example of how you lose information about the specific value for any individual score in a grouped frequency distribution table:
One person having a score between 65 and 69, but the table does not identify the EXACT value for the score
The results from a research study usually consist of pages of numbers corresponding to the measurements or scores collected during the study. The immediate problem for the researcher is to:
Organize the scores into some comprehensible form so that any patterns in the data can be seen easily and communicated to others (descriptive statistics)
What does proportion measure?
Proportion measures the fraction of the total group that is associated with each score
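As a rough sketch, the proportion for each score is p = f / N, and the percentage is that proportion times 100. The frequency table below is invented for illustration, and `relative_frequencies` is a hypothetical helper name.

```python
def relative_frequencies(freqs):
    """Map each score X to (f, p, percent), where p = f / N."""
    n = sum(freqs.values())  # N is the total number of individuals
    return {x: (f, f / n, 100 * f / n) for x, f in freqs.items()}

freqs = {5: 1, 4: 2, 3: 3, 2: 3, 1: 1}  # hypothetical X -> f table, N = 10
for x, (f, p, pct) in relative_frequencies(freqs).items():
    print(f"X = {x}  f = {f}  p = {p:.2f}  % = {pct:.0f}%")
```

The proportions should add up to 1 (and the percentages to 100) across the whole table.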
The concept of _____ also applies to the class intervals of a grouped frequency distribution table
Real limits
Although you usually cannot find the absolute frequency for each score in a population, you very often can obtain:
Relative frequencies
Because proportions describe the frequency (f) in relation to the total number (N), they often are called:
Relative frequencies
In some cases a graph may not be the best way to display information. For these data, what would be a better way to display them?
Showing the numbers in a table
When a population consists of numerical scores from an interval or a ratio scale, it is customary to draw the distribution with a _____ instead of the jagged, step-wise shapes that occur with histograms and polygons
Smooth curve
A bar graph is essentially the same as a histogram, except that:
Spaces are left between adjacent bars
Nearly all distributions can be classified as being either:
Symmetrical or skewed
The section where the scores taper off toward one end of a distribution is called the:
Tail of the distribution
A frequency distribution presents a picture of how:
The individual scores are distributed on the measurement scale
In addition to providing a picture of the entire set of scores, a frequency distribution allows you to see:
The location of any individual score relative to all of the other scores in the set
For both continuous and discrete variables, each bar in a histogram extends to:
The midpoint between adjacent categories
A good example of a normal distribution is:
The population distribution for IQ scores
An example of a frequency distribution histogram for grouped data:
The same set of children's heights is presented in a frequency distribution table and in a histogram
Does a simplified histogram replace a regular histogram?
This kind of display simply provides a sketch of the distribution and is NOT a substitute for an accurately drawn histogram with two labeled axes
How do you construct a bar graph?
To construct a bar graph, list the categories of measurement along the X-axis and then draw a bar above each category so that the height of the bar corresponds to the frequency for the category
How do you construct a histogram?
To construct a histogram, you first list the scores (measurement categories), equally spaced along the X-axis. Then you draw a bar above each X value so that: 1. The height of the bar corresponds to the frequency for that category 2. For continuous variables, the width of the bar extends to the real limits of the category. For discrete variables, each bar extends exactly half the distance to the adjacent category on each side.
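The bar geometry can be sketched without any plotting library. The example assumes whole-number scores one unit apart, so each bar runs from X - 0.5 to X + 0.5; the function name and the frequency table are made up for illustration.

```python
def histogram_bars(freqs):
    """Return (left_edge, right_edge, height) for the bar above each score."""
    return [(x - 0.5, x + 0.5, f) for x, f in sorted(freqs.items())]

freqs = {1: 2, 2: 3, 3: 1}  # hypothetical X -> f table
bars = histogram_bars(freqs)
for left, right, height in bars:
    print(f"bar from {left} to {right}, height {height}")
```

Because each right edge equals the next left edge, adjacent bars touch with no gaps.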
How do you construct a polygon?
To construct a polygon, you begin by listing the scores (measurement categories), equally spaced along the X-axis, then: 1. A dot is centred above each score so that the vertical position of the dot corresponds to the frequency for the category 2. A continuous line is drawn from dot to dot to connect the series of dots 3. The graph is completed by drawing a line down to the X-axis (zero frequency) at each end of the range of scores (the final lines are usually drawn so that they reach the X-axis at a point that is one category below the lowest score on the left side and one category above the highest score on the right side)
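The three steps can be sketched as a list of points to connect in order. This assumes whole-number scores one unit apart; `polygon_points` and the sample frequencies are hypothetical.

```python
def polygon_points(freqs):
    """Return the (X, f) dots for a polygon, anchored at zero frequency on both ends."""
    lowest, highest = min(freqs), max(freqs)
    # Step 1: one dot above each score at the height of its frequency
    dots = [(x, freqs.get(x, 0)) for x in range(lowest, highest + 1)]
    # Step 3: drop to the X-axis one category below and one category above the range
    return [(lowest - 1, 0)] + dots + [(highest + 1, 0)]

freqs = {1: 1, 2: 2, 3: 4, 4: 2, 5: 1}  # hypothetical X -> f table
print(polygon_points(freqs))
```

Step 2, drawing the continuous line, amounts to connecting these points from left to right.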
Why does the graph have to be constructed this way (as told in the previous flashcard)?
Violating these guidelines can result in graphs that give a misleading picture of the data
The smooth curve indicates that:
You are not connecting a series of dots (real frequencies) but instead are showing the relative changes that occur from one score to the next
For ordinal scales, separate bars are used because:
You cannot assume that the categories are all the same size (you know only the ORDER of the categories, not the actual differences between them)
As a general rule, the point where the two axes intersect should have a value of _____ for both the scores and the frequencies
Zero