Statistics - Chapter 1

Categorical Variable

A categorical variable divides the cases into groups, placing each case into exactly one of two or more categories.

Boxplot

A graph of the five-number summary, drawn as a central box that spans the first quartile (Q1) to the third quartile (Q3) with a line at the median (the second quartile). Whiskers extend from the box to the smallest and largest data values that are not outliers; outliers are plotted individually.

Population

All of the individuals or objects of interest in a study; the entire group about which we want to draw conclusions.

Correlation

A measure of the strength and direction of the relationship between two quantitative variables or sets of data. It describes the extent to which two factors vary together, and thus how well either factor predicts the other.
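
The card does not say which correlation measure is meant. As a minimal sketch, assuming the usual Pearson correlation coefficient r, the calculation looks like this. The function name pearson_r and the hours/scores data are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists (assumed measure)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: hours studied and exam scores for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 61, 70, 77]
r = pearson_r(hours, scores)
print(round(r, 3))
```

A value of r near +1 or -1 indicates that the two variables vary together closely; a value near 0 indicates little linear relationship.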

Standard Deviation (SD)

A measure of the variability of a data set, calculated as the square root of the variance (V). It indicates roughly how far a typical score lies from the mean of the data set.
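
A short sketch of the calculation, assuming the sample standard deviation (dividing by n - 1); the card does not say whether the sample or population version is intended. The scores are invented for illustration.

```python
import statistics
from math import sqrt

# Hypothetical scores.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(scores) / len(scores)
squared_deviations = [(x - mean) ** 2 for x in scores]

# Sample variance divides by n - 1 (assumption: these scores are a sample).
sample_variance = sum(squared_deviations) / (len(scores) - 1)
sample_sd = sqrt(sample_variance)

# The standard library gives the same result.
print(sample_sd, statistics.stdev(scores))
```

Dividing by n instead of n - 1 gives the population standard deviation, which statistics.pstdev computes.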

Quantitative Variable

A quantitative variable measures or records a numerical quantity for each case. Numerical operations like adding and averaging make sense for quantitative variables.

Sample

A subset of the population on which data are actually collected.

Variable

A variable is any characteristic that is recorded for each case. The variables generally correspond to the columns in a data table.

Mean

The arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores. For approximately bell-shaped distributions, about 95% of the data are within two standard deviations of the mean.

Median

The median is another statistic used to summarize the center of a set of numbers. If the numbers in a dataset are arranged in order from smallest to largest, the median is the middle value in the list. If there is an even number of values in the dataset, there is no unique middle value, and we use the average of the two middle values. The median splits the data in half.
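
The rule above can be sketched in a few lines. The function name median is chosen for illustration; Python's statistics.median does the same job.

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd count: a unique middle value exists.
        return ordered[mid]
    # Even count: average the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([7, 1, 3]))     # odd number of values
print(median([4, 1, 3, 2]))  # even number of values
```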

Explanatory Variable

Also called the independent variable. A variable that we think helps explain, influences, or causes changes in the response variable.

Minimum, Median and Maximum

The minimum and maximum in a dataset identify the extremes of the distribution: the smallest and largest values, respectively. The median is the 50th percentile, since it divides the data into two equal halves. If we divide each of those halves again, we obtain two additional statistics known as the first (Q1) and third (Q3) quartiles, which are the 25th and 75th percentiles. Together these five numbers provide a good summary of important characteristics of the distribution and are known as the five number summary.

Z-Score

The number of standard deviations a particular score lies above or below the mean of its group. It is a type of standard score: a positive z-score means the score is above the mean, and a negative z-score means it is below the mean.
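
As a minimal sketch, a z-score is the distance from the mean divided by the standard deviation. The function name and the numbers below are hypothetical.

```python
def z_score(score, mean, sd):
    """Number of standard deviations that score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

# Hypothetical exam with mean 70 and standard deviation 10.
print(z_score(85, 70, 10))  # above the mean, so positive
print(z_score(60, 70, 10))  # below the mean, so negative
```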

Mode

The value or values that occur most frequently in a given data set. A distribution can have more than one mode.
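
A small sketch that returns every value tied for the highest count, since a data set may have more than one mode. The function name modes is chosen for illustration.

```python
from collections import Counter

def modes(values):
    """Return all values that occur most frequently."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(modes([2, 3, 3, 5, 3, 2]))  # one mode
print(modes([1, 1, 4, 4, 6]))     # two modes
```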

First Quartile (Q1)

The 25th percentile. If the observations in a data set are ordered from lowest to highest, the first quartile (Q1) is the median of the observations positioned to the left of the overall median. About 25% of the data are less than Q1.

Third Quartile (Q3)

The 75th percentile. If the observations in a data set are ordered from lowest to highest, the third quartile (Q3) is the median of the observations positioned to the right of the overall median. About 75% of the data are less than Q3.
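
The median-of-halves description of Q1 and Q3 can be sketched as follows. One assumption is labeled in the code: when the number of observations is odd, the overall median is left out of both halves. Statistical software often uses other quartile conventions, so results may differ slightly from this sketch.

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]
    # Assumption: with an odd count, skip the overall median itself.
    upper = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower), median(upper)

q1, q3 = quartiles([1, 3, 5, 7, 9, 11, 13])
print(q1, q3)
```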

95% Rule

If a distribution of data is approximately symmetric and bell-shaped, about 95% of the data should fall within two standard deviations of the mean, that is, in the interval x̄ ± 2s, where x̄ is the sample mean and s is the sample standard deviation. In a normal distribution, about 95% of the cases lie within two standard deviation units on either side of the mean.
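
A quick simulation can illustrate the rule. This is only a sketch: the data are randomly generated from a normal distribution with an arbitrary mean of 100 and standard deviation of 15, and the seed is fixed so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable
# Hypothetical bell-shaped data: 10,000 draws from a normal distribution.
data = [random.gauss(100, 15) for _ in range(10_000)]

x_bar = statistics.mean(data)
s = statistics.stdev(data)

# Interval x-bar +/- 2s.
lower, upper = x_bar - 2 * s, x_bar + 2 * s
inside = sum(lower <= x <= upper for x in data)
fraction_inside = inside / len(data)

print(f"{fraction_inside:.1%} of values fall within two standard deviations of the mean")
```

The printed fraction should be close to, but not exactly, 95%. For skewed or otherwise non-bell-shaped data the rule need not hold.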

Data

Facts and statistics collected together for reference or analysis

Residual

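What is left over after a prediction is accounted for: the difference between an observed value of the response variable and the value predicted by the regression line. Residual = observed y − predicted y.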

Regression Line

Line of best fit. A line that describes how a response variable, y, changes as an explanatory variable x changes.
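
As a sketch of how a line of best fit might be computed, the code below assumes the common least-squares method, which the card does not name. It also computes the residuals, taken here as observed y minus predicted y. The function name fit_line and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x (assumed method)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical explanatory (x) and response (y) values.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

slope, intercept = fit_line(xs, ys)
predicted = [intercept + slope * x for x in xs]
residuals = [y - y_hat for y, y_hat in zip(ys, predicted)]

print(slope, intercept)
print(residuals)
```

With a least-squares line that includes an intercept, the residuals sum to zero apart from rounding error.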

Response Variable (Dependent Variable)

Measures an outcome of a study; the variable whose value we want to predict or explain. Its values change in response to the explanatory variable.

Five Number Summary

Minimum, 1st Quartile (Q1), Median, 3rd Quartile (Q3), Maximum
Q1 = First quartile = 25th Percentile
Q3 = Third quartile = 75th Percentile
The five number summary divides the dataset into fourths: about 25% of the data fall between any two consecutive numbers in the five number summary.

Range and Interquartile Range

The five number summary provides two additional opportunities for summarizing the amount of spread in the data, the range and the interquartile range. From the five number summary, we can compute the following two statistics:
Range = Maximum − Minimum
Interquartile Range (IQR) = Q3 − Q1
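
A minimal worked example of both formulas, using a made-up five number summary stored in a dictionary (the key names are chosen for illustration):

```python
# Hypothetical five number summary.
summary = {"min": 1, "q1": 3, "median": 7, "q3": 11, "max": 13}

# Range = Maximum - Minimum
data_range = summary["max"] - summary["min"]

# Interquartile Range (IQR) = Q3 - Q1
iqr = summary["q3"] - summary["q1"]

print(data_range, iqr)
```

The range uses only the two extreme values, while the IQR describes the spread of the middle half of the data and so is less affected by outliers.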

