Chapter 2 Organizing and Summarizing Data

If the data are discrete but there are many different values of the variable, or if the data are continuous, the categories of data (the classes)

must be created using intervals of numbers.

If the data are discrete and there are relatively few different values of the variable, the categories of data (classes) will be the

observations (as in qualitative data).

time-series plot

obtained by plotting the time in which a variable is measured on the horizontal axis and the corresponding value of the variable on the vertical axis.

When data are collected from a survey or a designed experiment, they must be __________

organized into a manageable form

stem and leaf plot

uses digits to the left of the rightmost digit to form the stem. Each rightmost digit forms a leaf.

time series data

value of a variable is measured at different points in time

first step in summarizing quantitative data is to determine

whether the data are discrete or continuous

Raw data.

data that is not organized

Advantage of Stem-and-Leaf Diagrams over Histograms?

Once a frequency distribution or histogram of continuous data is created, the raw data are lost (unless they are reported with the frequency distribution). However, the raw data can be retrieved from a stem-and-leaf plot.

cumulative frequency distribution

displays the aggregate frequency of the category. For continuous data, it displays the total number of observations less than or equal to the upper class limit of a class.

cumulative relative frequency distribution

displays the proportion (or percentage) of observations less than or equal to the category for discrete data and the proportion (or percentage) of observations less than or equal to the upper class limit for continuous data.
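A minimal Python sketch of both cumulative distributions, assuming a hypothetical set of classes and frequencies (the numbers are made up for illustration):

```python
from itertools import accumulate

# Hypothetical upper class limits and class frequencies, in increasing order.
upper_limits = [9, 19, 29, 39]
frequencies = [4, 7, 6, 3]

# Cumulative frequency: running total of observations up to each upper class limit.
cumulative = list(accumulate(frequencies))

# Cumulative relative frequency: running total divided by the number of observations.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for limit, cf, crf in zip(upper_limits, cumulative, cumulative_relative):
    print(f"<= {limit}: {cf} observations ({crf:.2f})")
```

The last cumulative frequency always equals the total number of observations, so the last cumulative relative frequency is 1.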

relative frequency formula

frequency/sum of all frequencies
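As a minimal sketch of this formula in Python, assuming hypothetical category counts (the category names and numbers are illustrative only):

```python
# Hypothetical frequencies for a qualitative variable (illustrative data only).
frequencies = {"red": 12, "blue": 18, "green": 10}

def relative_frequencies(freqs):
    """Divide each frequency by the sum of all frequencies."""
    total = sum(freqs.values())
    return {category: count / total for category, count in freqs.items()}

rel = relative_frequencies(frequencies)
for category, value in rel.items():
    print(f"{category}: {value:.3f}")
```

The relative frequencies of all categories should sum to 1 (up to rounding).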

Guidelines for Determining the Lower Class Limit of the First Class and Class Width: Determining the Class Width

Decide on the number of classes. Generally, there should be between 5 and 20 classes. The smaller the data set, the fewer classes you should have.

Horizontal Bars

Bar graphs may also be drawn with horizontal bars. Horizontal bars are preferable when category names are lengthy.

Determine the class width by computing

Class width ≈ (largest data value − smallest data value) / (number of classes). Round this value up to a convenient number.
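A short Python sketch of this guideline, assuming the common textbook rule of dividing the range of the data by the number of classes. The function name `class_width` and the data are hypothetical, and "convenient" is taken here to mean the next whole number:

```python
import math

def class_width(data, num_classes):
    """Range of the data divided by the number of classes, rounded up to a whole number."""
    raw = (max(data) - min(data)) / num_classes
    return math.ceil(raw)

# Hypothetical data set (illustrative only).
data = [3, 7, 12, 18, 21, 25, 30, 34, 41, 47]
width = class_width(data, 5)
print(width)
```

A different rounding, for example to the next multiple of 5, may suit other data better. If the raw quotient is already a whole number, a slightly larger width may be needed so that the largest value still falls inside the last class.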

Ex. 11: Constructing a Stem-and-Leaf Plot. An individual is considered to be unemployed if he or she does not have a job but is actively seeking employment. The following data represent the unemployment rate in each of the fifty states plus the District of Columbia in June 2008.

We let the stem represent the integer portion of the number and the leaf represent the decimal portion. For example, the stem for Alabama (4.7) will be 4 and the leaf will be 7.

Stem-and-leaf Plot Example

A data value of 267 would have 26 as the stem and 7 as the leaf. Repeated leaves are possible.
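The stem and leaf split can be sketched in Python as follows, assuming non-negative whole-number data. The function name `stem_and_leaf` and the data values are hypothetical:

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group whole numbers by stem (all digits but the last) and leaf (last digit)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # 267 -> stem 26, leaf 7
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical observations; 267 appears twice, so its leaf repeats.
data = [267, 254, 261, 267, 273, 258]
plot = stem_and_leaf(data)
for stem in sorted(plot):
    print(stem, "|", " ".join(str(leaf) for leaf in plot[stem]))

# The raw data can be rebuilt from the plot, unlike from a histogram.
recovered = [stem * 10 + leaf for stem in sorted(plot) for leaf in plot[stem]]
```

The `recovered` list holds the original observations in sorted order, which illustrates why no raw data are lost in a stem-and-leaf plot.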

Pie Charts Ex. 7 (continued): a) What variable is described by this pie chart? Is it qualitative or quantitative? b) What proportion of fathers stayed home because they were ill or disabled? c) What percentage of fathers stayed home for reasons not related to school, being retired, or other? d) If 560 of the fathers surveyed stayed home because they were ill or disabled, how many stay-at-home fathers were sampled in total?

a) Reason why fathers stay home; qualitative b) 0.34 c) 100% − 22% = 78% d) Let N = total number of fathers sampled. Proportion ill or disabled = 560/N = 0.34, so 560 = 0.34N and N = 560/0.34 ≈ 1647

Classes

are categories into which data are grouped.

Pareto chart

is a bar graph where the bars are drawn in decreasing order of frequency or relative frequency.
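The ordering that defines a Pareto chart can be sketched in Python by sorting categories from highest to lowest frequency. The categories and counts below are hypothetical, and the bars are drawn as text rather than with a plotting library:

```python
# Hypothetical complaint counts by category (illustrative data only).
frequencies = {
    "late delivery": 14,
    "damaged item": 31,
    "wrong item": 9,
    "billing error": 22,
}

# Sort categories in decreasing order of frequency.
pareto_order = sorted(frequencies.items(), key=lambda item: item[1], reverse=True)

for category, count in pareto_order:
    print(f"{category:<14} {'#' * count} ({count})")
```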

pie chart

is a circle divided into sectors. Each sector represents a category of data. The area of each sector is proportional to the frequency of the category.

ogive

is a graph that represents the cumulative frequency or cumulative relative frequency for the class. It is constructed by plotting points whose x-coordinates are the upper class limits and whose y-coordinates are the cumulative frequencies or cumulative relative frequencies of the class. Line segments are drawn connecting consecutive points.

frequency polygon

is a graph that uses points, connected by line segments, to represent the frequencies for the classes. It is constructed by plotting a point above each class midpoint on a horizontal axis at a height equal to the frequency of the class. Line segments are drawn connecting consecutive points.

histogram

is constructed by drawing rectangles for each class of data. The height of each rectangle is the frequency or relative frequency of the class. The width of each rectangle is the same and the rectangles touch each other
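The rectangle heights of a histogram are the class frequencies. A minimal Python sketch of counting them, assuming equal-width classes that start at a chosen lower class limit (the function name `bin_counts` and the data are hypothetical):

```python
def bin_counts(data, lower_limit, width, num_classes):
    """Count how many observations fall into each equal-width class."""
    counts = [0] * num_classes
    for x in data:
        index = int((x - lower_limit) // width)
        if 0 <= index < num_classes:  # ignore values outside the classes
            counts[index] += 1
    return counts

# Hypothetical observations, grouped into the classes 0-9, 10-19, ..., 40-49.
data = [3, 7, 12, 18, 21, 25, 26, 30, 41, 47, 22]
counts = bin_counts(data, lower_limit=0, width=10, num_classes=5)

for i, count in enumerate(counts):
    low = i * 10
    print(f"{low:>2}-{low + 9:<2} {'#' * count}")
```

Dividing each count by `len(data)` would give the heights for a relative frequency histogram instead.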

bar graph

is constructed by labeling each category of data on either the horizontal or vertical axis and the frequency or relative frequency of the category on the other axis.

dot plot

is drawn by placing each observation horizontally in increasing order and placing a dot above the observation each time it is observed.
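A rough text version of a dot plot in Python, assuming small single-digit whole-number data so the columns line up. The data are hypothetical:

```python
from collections import Counter

# Hypothetical discrete observations (illustrative only).
data = [2, 3, 3, 5, 2, 3, 4, 5]

counts = Counter(data)
tallest = max(counts.values())

# Horizontal axis: every value from the smallest to the largest observation.
values = list(range(min(data), max(data) + 1))

# Build the plot from the top row down so the marks stack above each value.
rows = []
for level in range(tallest, 0, -1):
    rows.append(" ".join("*" if counts[v] >= level else " " for v in values))
rows.append(" ".join(str(v) for v in values))

print("\n".join(rows))
```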

Frequency of a category

is the number of observations in that category

relative frequency

is the proportion (or percent) of observations within a category

class midpoint

is the sum of consecutive lower class limits divided by 2.
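Following this definition, class midpoints can be sketched in Python by averaging each pair of consecutive lower class limits. The limits below are hypothetical, and the list includes the lower limit of the class that would follow the last one:

```python
# Hypothetical lower class limits for the classes 0-9, 10-19, 20-29, 30-39.
# The final entry (40) is the lower limit of the next class, needed for the last midpoint.
lower_limits = [0, 10, 20, 30, 40]

# Midpoint = (lower limit of this class + lower limit of the next class) / 2
midpoints = [(a + b) / 2 for a, b in zip(lower_limits, lower_limits[1:])]
print(midpoints)
```

These midpoints are the x-coordinates used when plotting a frequency polygon.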

frequency distribution

lists each category of data and the number of occurrences for each category of data.
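A minimal Python sketch of building a frequency distribution from raw qualitative data, using `collections.Counter` from the standard library. The observations are hypothetical:

```python
from collections import Counter

# Hypothetical raw observations of a qualitative variable (illustrative only).
observations = ["A", "B", "A", "O", "AB", "O", "O", "A"]

# Count the number of occurrences of each category.
distribution = Counter(observations)

for category, count in sorted(distribution.items()):
    print(f"{category:<3} {count}")
```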

relative frequency distribution

lists each category of data with the relative frequency

Ways to Organize Data

• Tables • Graphs • Numerical Summaries

