Statistics Exam 1
parameter
a characteristic of a population
pie chart
a chart that shows the proportion or percentage that each class represents of the total number of frequencies; shows QUALITATIVE data
sample
a portion, or part, of the population of interest
cumulative frequency distribution
to calculate, add each class frequency to the frequencies of all the classes before it; shows the number of observations less than a given value
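A minimal sketch of the running total behind a cumulative frequency distribution. The class frequencies are hypothetical values chosen only for illustration.

```python
from itertools import accumulate

# Hypothetical class frequencies, lowest class first (illustrative values only).
class_frequencies = [8, 11, 23, 38, 45]

# Each cumulative frequency is the class's own frequency plus all frequencies before it.
cumulative_frequencies = list(accumulate(class_frequencies))

print(cumulative_frequencies)
```

The last cumulative frequency always equals the total number of observations.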
Continuous variables
can assume any value within a specific range. Example: weight, grade point average, tire pressure; result from measuring
midpoint
halfway between the lower or upper limits of 2 consecutive classes; computed by adding the lower or upper limits of consecutive classes and dividing by 2
Statistics
a collection of numerical information stated as a value or percentage. It is the science of collecting, organizing, presenting, analyzing, and interpreting data to assist in making more effective decisions
frequency table
a grouping of qualitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class; SUMMARIZES QUALITATIVE VARIABLES; everything can be assigned to only one class, everything must be accounted for
frequency distribution
a grouping of quantitative data into mutually exclusive and collectively exhaustive classes showing the number of observations in each class
negatively skewed distribution
mode is greater than the median, median is greater than mean
Qualitative
non-numeric characteristics: gender, eye color, race, religion, etc. Also known as an attribute
class frequency
number of observations in each class
relative class frequencies
number of observations in a specific class / total number of observations
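As a rough illustration of the ratio above, the sketch below converts class frequencies into relative frequencies. The class labels and counts are hypothetical.

```python
# Hypothetical class frequencies (labels and counts are illustrative only).
class_frequencies = {"A": 10, "B": 25, "C": 15}

total = sum(class_frequencies.values())

# Relative frequency = class frequency / total number of observations.
relative_frequencies = {
    label: count / total for label, count in class_frequencies.items()
}

print(relative_frequencies)
```

The relative frequencies of all classes add up to 1 (or 100%).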
measures of location
pinpoint the center of a distribution of data; examples are averages such as the mean, median, and mode
bar chart
presents a QUALITATIVE variable; the horizontal axis shows the variable of interest; the vertical axis shows the frequency of each of the possible outcomes; GAPS BETWEEN THE BARS
Quantitative
reported numerically and can be discrete or continuous variables
frequency polygon
shows the shape of a distribution; uses the midpoint; allows us to compare directly two or more frequency distributions; consists of line segments connecting the points formed by the intersection of the class midpoint and the class frequency
Range
simplest measure of dispersion; difference between the maximum and minimum values in a data set
symmetrical distribution
special because all 3 measures of location (mean, median, and mode) are at the center of the distribution
population SD
square root of the population variance
population mean
sum of all the values in the population divided by the number of values in the population
mode
the class of a distribution with the highest frequency; extremely high or low values do not affect its value
population
the entire set of individuals or objects of interest or the measurements obtained from all individuals or objects of interest
variance
the mean of the squared deviations from the mean
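A small sketch of this definition for a population, with the standard deviation taken as the square root of the variance. The data values are hypothetical; the standard-library calls `statistics.pvariance` and `statistics.pstdev` are included only as a cross-check.

```python
import math
import statistics

# Hypothetical population values (illustrative only).
population = [2, 4, 4, 4, 5, 5, 7, 9]

# Population mean: sum of all values divided by the number of values.
mu = sum(population) / len(population)

# Population variance: mean of the squared deviations from the mean.
variance = sum((x - mu) ** 2 for x in population) / len(population)

# Population standard deviation: square root of the variance.
std_dev = math.sqrt(variance)

print(mu, variance, std_dev)
print(statistics.pvariance(population), statistics.pstdev(population))
```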
median
the midpoint of the values after they have been ordered from the minimum to the maximum values; measure of location; data must be at least an ordinal level of measurement
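A brief sketch using Python's `statistics.median` on hypothetical data. It assumes the usual convention that, for an even number of values, the median is the average of the two middle values after sorting.

```python
import statistics

# Hypothetical data sets (illustrative only); median() orders them internally.
odd_count = [7, 1, 5]       # ordered: 1, 5, 7 -> the middle value is 5
even_count = [7, 1, 5, 3]   # ordered: 1, 3, 5, 7 -> average of 3 and 5

median_odd = statistics.median(odd_count)
median_even = statistics.median(even_count)

print(median_odd, median_even)
```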
sample mean
the sum of all the sampled values divided by the total number of sampled values
dispersion
the variation or the spread of data; measured by, for example, the range, variance, and standard deviation
Inferential stats
used to find something about a population from a sample taken from that population; based on a limited set of data; the methods used to estimate a property of a population on the basis of a sample
sample variance
variance of a sample; the sum of the squared deviations from the sample mean divided by n - 1
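A minimal sketch of a sample variance, assuming the usual n - 1 divisor. The sample values are hypothetical, and `statistics.variance` from the standard library serves as a cross-check.

```python
import statistics

# Hypothetical sample values (illustrative only).
sample = [2, 4, 4, 4, 5, 5, 7, 9]

n = len(sample)
x_bar = sum(sample) / n  # sample mean

# Sample variance: sum of squared deviations from the sample mean, divided by n - 1.
sample_variance = sum((x - x_bar) ** 2 for x in sample) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = sample_variance ** 0.5

print(sample_variance, statistics.variance(sample))
```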
arithmetic mean
widely used measure of location
A good working knowledge of statistics is useful to organize and summarize information.
How does statistics help us understand large amounts of numerical information?
Ratio
(the highest level of measurement) gives you the most information about an observation. Includes characteristics of interval level. Practically all quantitative data are recorded on the ratio level of measurement; meaningful interpretation of zero on the scale; examples: wages, units of production, weight, changes in stock prices, height
Nominal
(lowest level of measurement) - only qualitative variables can be classified in this category; recorded as labels and names; THEY HAVE NO ORDER
Steps of frequency distributions
1. decide on the number of classes (k is the number of classes; choose the smallest k such that 2^k is greater than the number of observations, n) 2. determine the class interval 3. set the individual class limits (must avoid overlapping or unclear class limits) 4. determine the number of observations in each class
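The four steps can be sketched roughly as follows. The data, the chosen class width of 10, and the starting lower limit of 10 are hypothetical. The sketch also assumes the common guideline that the class interval should be at least (maximum - minimum) / k, rounded up to a convenient number.

```python
def number_of_classes(n):
    """Step 1: smallest k such that 2**k is greater than n."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k

# Hypothetical observations (illustrative only).
data = [12, 15, 21, 24, 27, 30, 33, 35, 38, 41, 44, 48]

k = number_of_classes(len(data))

# Step 2: the interval should be at least (max - min) / k (assumed guideline).
minimum_width = (max(data) - min(data)) / k
width = 10  # assumed: minimum width rounded up to a convenient number

# Step 3: non-overlapping limits: 10 up to 20, 20 up to 30, 30 up to 40, 40 up to 50.
lower_limit = 10  # assumed starting point at or below the smallest value

# Step 4: tally the observations falling in each class.
counts = [0] * k
for x in data:
    counts[(x - lower_limit) // width] += 1

print(k, minimum_width, counts)
```

Because each value lands in exactly one class and every value is counted, the classes are mutually exclusive and collectively exhaustive.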
False. Truth: Statistics is about collecting and processing information to provide a basis for decision making
True or false: The primary purpose of statistics is to present numerical data.
Ordinal
(next highest level of measurement) - includes characteristics of nominal level and is used to rank data. Example: 5 - Superior, 4 - Good, 3 - Average, 2 - Poor, 1 - Inferior; is based on a relative ranking or rating of items based on a defined attribute or qualitative variable; RANKED OR COUNTED
Interval
(next highest level of measurement) - includes characteristics of ordinal level and the difference between values is a constant size. Data classifications are ordered according to the amount of the characteristic they possess. Example: clothing size, temperature; the difference or interval between values is meaningful; based on a scale with a known unit of measurement; HAS NO NATURAL ZERO
positively skewed distribution
mean is greater than the median, median is greater than the mode
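A quick check of this ordering on hypothetical data with a long right tail. The values are made up for illustration, and the ordering is a typical pattern rather than a guarantee for every data set.

```python
import statistics

# Hypothetical data with one large value pulling the mean to the right.
data = [2, 2, 3, 4, 5, 20]

mean = statistics.mean(data)      # 36 / 6
median = statistics.median(data)  # average of the two middle values, 3 and 4
mode = statistics.mode(data)      # the most frequent value

print(mean, median, mode, mean > median > mode)
```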
Discrete variables
can assume only certain values and there are gaps. Example: number of bedrooms in a house
statistic
characteristic of a sample
weighted mean
a convenient way to compute the arithmetic mean when there are several observations of the same value; the denominator of a weighted mean is always the sum of the weights
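A minimal sketch of a weighted mean, where each weight is the number of times a value occurs. The prices and counts are hypothetical.

```python
# Hypothetical prices and the number of items sold at each price (illustrative only).
values = [0.90, 1.25, 1.50]
weights = [3, 4, 3]

# Weighted mean: sum of (weight * value) divided by the sum of the weights.
weighted_mean = sum(w * x for w, x in zip(weights, values)) / sum(weights)

# Same result as the plain arithmetic mean of the fully expanded list of values.
expanded = [x for w, x in zip(weights, values) for _ in range(w)]
plain_mean = sum(expanded) / len(expanded)

print(weighted_mean, plain_mean)
```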
cumulative relative frequency
cumulative frequency/ number of observations; shows the percent of observations less than a given value
empirical rule
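applies when data have a symmetrical, bell-shaped distribution; about 68% of observations lie within 1 SD of the mean, about 95% within 2 SD, and about 99.7% within 3 SD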
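The empirical rule's usual percentages (roughly 68%, 95%, and 99.7% within 1, 2, and 3 SD of the mean) can be checked against a normal distribution. This sketch assumes Python 3.8 or later for `statistics.NormalDist` and uses a standard normal curve as the bell-shaped model.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Proportion of the distribution lying within k standard deviations of the mean.
shares = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)
}

for k, share in shares.items():
    print(f"within {k} SD: {share:.4f}")
```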
class interval
difference between the lower limits of two consecutive classes
histogram
displays a frequency distribution; BASED ON QUANTITATIVE data; there are no gaps between bars
relative frequency distribution
each of the class frequencies is divided by the total number of observations; shows the percent of observations in each class
Chebyshev's theorem
for any set of observations, the proportion of the values that lie within k SD of the mean is at least 1 - 1/k^2, for any k greater than 1; e.g., with k = 2 SD, the proportion of the data set's values that lie between 8 and 12 is at least 1 - 1/2^2, or 75% (the minimum proportion)
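A small sketch of the bound 1 - 1/k^2. The function name `chebyshev_min_proportion` is hypothetical, introduced only for this illustration.

```python
def chebyshev_min_proportion(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is informative only for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, chebyshev_min_proportion(k))
```

Unlike the empirical rule, this bound holds for any shape of distribution, which is why it is a minimum rather than an approximation.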
Descriptive statistics
give you information about current statistics, not future information; methods of organizing and summarizing and presenting data in an informative way; summarize a large amount of data
sample SD
standard deviation of a sample; the square root of the sample variance