Chapter 5. Organizing and Displaying Data
histogram vs bar chart
A bar chart displays categorical data, so its x-axis has no numeric scale. A histogram displays quantitative (continuous) data, and each bar represents an interval of values. A histogram shows the shape of the data distribution and the most common data values. In both, frequency is on the y-axis; the x-axis holds categories (bar chart) or intervals (histogram).
Frequency Distribution
A method for presenting data that includes possible values for a given variable and the number of times each value is present
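A minimal Python sketch of an ungrouped frequency distribution, assuming a small set of made-up pain scores (the data and the function name `frequency_distribution` are illustrative only):

```python
from collections import Counter

def frequency_distribution(values):
    """Return (value, frequency) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())

# Hypothetical pain scores (0-3) reported by ten patients.
pain_scores = [2, 0, 1, 2, 3, 2, 1, 0, 2, 1]
table = frequency_distribution(pain_scores)

# Print one row per possible value: the value, then how often it occurs.
for value, freq in table:
    print(f"{value}\t{freq}")
```

The frequencies always add up to the number of observations, which is a quick check that no value was missed.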
boxplot whiskers
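Top & bottom = min/max values when there are no outliers. Lower whisker = reaches at most 1.5 x interquartile range (IQR) below the 1st quartile (25%). Upper whisker = reaches at most 1.5 x IQR above the 3rd quartile (75%). Values beyond these limits are plotted individually as outliers.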
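The 1.5 x IQR rule can be sketched in Python as follows. This assumes the common convention that each whisker ends at the most extreme data value inside the 1.5 x IQR limit, and it uses the "inclusive" quartile method from the standard library; textbooks and software compute quartiles in slightly different ways, so results may differ a little. The data and the function name `boxplot_whiskers` are hypothetical.

```python
import statistics

def boxplot_whiskers(data):
    """Compute quartiles, whisker ends, and outliers using the 1.5 x IQR rule."""
    # Assumption: "inclusive" quartiles; other methods give slightly different cut points.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    # Whiskers stop at the most extreme values that are still inside the limits.
    inside = [x for x in data if lower_limit <= x <= upper_limit]
    outliers = [x for x in data if x < lower_limit or x > upper_limit]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "lower_whisker": min(inside),
        "upper_whisker": max(inside),
        "outliers": outliers,
    }

# Hypothetical lengths of stay in days; 30 is far from the rest.
result = boxplot_whiskers([1, 2, 3, 4, 5, 6, 7, 8, 30])
print(result)
```

Here the upper limit is 7 + 1.5 x 4 = 13, so the upper whisker stops at 8 and the value 30 is flagged as an outlier.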
Boxplot
A chart that represents the distribution of data values; it also illustrates the quartiles and any outliers
Pie Chart
A circular chart in which the sections are proportionally representative of the frequencies of specific values of the given variable. It is most useful for the nominal and ordinal levels of measurement
Bar Chart
A graphical representation of data that is most useful for data at the nominal or ordinal level of measurement; the data categories are on the horizontal axis, whereas the frequency of each category is on the vertical axis.
Which of the following is true regarding frequency distribution? a. The simplest form involves four columns. b. Cumulative frequency is the product of frequency of the current category with that of previous categories. c. Cumulative percentage is the ratio of cumulative frequency of the category of interest to the total number of subjects. d. A grouped frequency distribution is useful when the data set is small.
c. cumulative percentage is the ratio of cumulative frequency of the category of interest to the total number of subjects
Which type of graphic is best for examining trends of a variable over time? a. scatter plot b. pie chart c. line chart d. bar chart
c. line chart
best way to represent data when the data set is large
charts and graphs
boxplot middle box
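Contains the middle 50% of the data. The middle line is the median (the exact center of the data set). The top line is the 75th percentile (3rd quartile); the bottom line is the 25th percentile (1st quartile). If the median line is not in the exact center of the box, the data are not evenly distributed.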
cumulative percentage
cumulative frequency of the category of interest divided by the total number of subjects, usually expressed as a percentage
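A short Python sketch of the arithmetic, assuming three made-up severity categories with hypothetical counts. Cumulative frequency is a running sum of the frequencies, and cumulative percentage divides each running sum by the total number of subjects:

```python
from itertools import accumulate

# Hypothetical ordinal categories and their frequencies (10 subjects in total).
categories = ["Mild", "Moderate", "Severe"]
frequencies = [5, 3, 2]

total = sum(frequencies)

# Running sum: frequency of the current category plus all previous categories.
cumulative_freq = list(accumulate(frequencies))

# Each running sum as a share of all subjects, in percent.
cumulative_pct = [100 * cf / total for cf in cumulative_freq]

for name, f, cf, cp in zip(categories, frequencies, cumulative_freq, cumulative_pct):
    print(f"{name}\t{f}\t{cf}\t{cp:.1f}%")
```

The last cumulative frequency equals the total number of subjects, so the last cumulative percentage is always 100%.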
Which format is best for conveying data when the data set is large? a. histograms b. pie charts c. graphs d. all of these are correct
d. all of these are correct
Which presentation method is most suitable for conveying data about gender? a. Language b. Table c. Graph d. Bar chart
d. bar chart
Which of the following graphical displays is most suited for illustrating the distribution of continuous data? a. bar charts b. histograms c. box plots d. leaf plots
d. leaf plots
Stem and leaf plot pros cons
Displays the distribution of continuous data. Similar to a histogram but more flexible and more informative: shows the overall shape of the distribution plus information on individual data values.
Scatter plot pros
Examines the relationship between two continuous variables. The relationship can be positive (both variables change in the same direction) or negative (the variables change in opposite directions).
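One common way to put a number on the direction of a relationship is the Pearson correlation coefficient, which these notes do not name; it is used here only as an illustration. In this sketch the data are made up, and the sign of `r` indicates a positive or negative relationship:

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation: positive if the variables rise together, negative if opposite."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (sign gives the direction).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical paired measurements.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases as hours increase
falling = [10, 8, 6, 4, 2]   # decreases as hours increase

r_pos = pearson_r(hours, rising)
r_neg = pearson_r(hours, falling)
print(r_pos, r_neg)
```

On a scatter plot, the first pairing would slope upward from left to right and the second would slope downward.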
Mark True or False: When designing a graph, the independent variable is placed on the vertical axis and the dependent variable is placed on the horizontal axis
false
Line chart pros
Good for showing frequency (similar to bar and pie charts). Best for showing trends of a variable over time.
best way to represent data with small number of data values
language or written
pie chart pros cons
For nominal and ordinal data. Visualizes the most commonly occurring class relative to the whole. Difficult to compare slices when percentages are similar across categories.
bar
nominal/ordinal with multiple subjects
Histogram
A visual method for presenting data that is similar to a bar chart, but instead groups data points into intervals, rather than individual categories; most useful for showing the distribution of continuous data
Line Chart
A visual representation of data that is useful for following changes over time or for finding patterns in the data
Scatterplot
A visual representation of the relationship between two continuous variables
Stem and Leaf Plot
A visualization of continuous data that shows both frequency distribution and information on individual data values.
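A minimal Python sketch of how such a plot is built, assuming non-negative whole-number values where the tens digit is the stem and the ones digit is the leaf (the ages and the function name `stem_and_leaf` are hypothetical):

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (tens part) to its sorted list of leaves (ones digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # e.g. 34 -> stem 3, leaf 4
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical patient ages.
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]
plot = stem_and_leaf(ages)

# Each row reads like a sideways histogram bar that keeps the individual values.
for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

The row for stem 3 is the longest, which shows the shape of the distribution, while the leaves still let you read back each original value (31, 34, 34, 38).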
Boxplot pros cons
For continuous data. Displays the most information and allows comparison across groups. Shows: overall distribution, center of distribution, quartiles, possible outliers. Con: does not show individual data points.
Mark True or False: Matching data points with the frequency with which they occur is called a box plot
False
Percentile
Where a data point falls within the data set; specifically, how many data values fall above or below a specific point
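As an illustration, the sketch below computes a percentile rank as the percentage of values at or below a given point. This is one of several conventions (some definitions count only values strictly below the point), so treat it as an assumption; the scores and the function name `percentile_rank` are made up.

```python
def percentile_rank(data, value):
    """Percentage of data values that fall at or below `value`."""
    at_or_below = sum(1 for x in data if x <= value)
    return 100 * at_or_below / len(data)

# Hypothetical exam scores for ten students.
scores = [55, 60, 65, 70, 75, 80, 85, 90, 95, 100]

rank_80 = percentile_rank(scores, 80)
print(rank_80)  # 60.0: six of the ten scores are at or below 80
```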
Which of the following is capable of displaying the most information? a. box plot b. Leaf plot c. Stem plot d. Scatter plot
a. box plot
Bar charts and pie charts are useful ways to represent which form of data? a. continuous b. categorical c. interval d. ratio
b. categorical
Best graphs for nominal (gender)/ordinal
bar graphs, pie charts
cumulative frequency
sum of frequency of current category and frequency of previous categories
Ungrouped frequency
used for nominal and ordinal (categorical) data, and for ratio/interval data with a limited number of subjects
grouped frequency
used for ratio/interval data with a large number of subjects. Drawback: information on individual data values is lost.
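A minimal Python sketch of grouping, assuming whole-number values and equal-width intervals (the ages, the interval width of 10, and the function name `grouped_frequency` are illustrative choices, not rules from the notes):

```python
from collections import Counter

def grouped_frequency(values, width):
    """Count values per interval; returns (interval_start, interval_end, frequency) rows."""
    # Each value is assigned to the interval that starts at the nearest lower multiple of `width`.
    counts = Counter((v // width) * width for v in values)
    # Note: intervals with no values are simply omitted from the result.
    return [(start, start + width - 1, counts[start]) for start in sorted(counts)]

# Hypothetical patient ages grouped into 10-year intervals.
ages = [23, 25, 31, 34, 34, 38, 42, 47, 51]
rows = grouped_frequency(ages, width=10)

for start, end, freq in rows:
    print(f"{start}-{end}\t{freq}")
```

The row for 30-39 reports a frequency of 4 but no longer tells you that the values were 31, 34, 34, and 38, which is the loss of individual information that grouping causes.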