STA 215 Section 3.1-3.6 (Part 1) Descriptive Statistics: The five-number summary and percentiles

Of the following options, which ones include 50% of the data?

Median to maximum; first quartile to third quartile; minimum to median

Numerical Summaries of Center

NSC

Skewed Right (a.k.a. positively skewed)

if most of the data occurs on the left (lower) side (i.e., most of the data values are small), with a long tail on the right (upper) side.

Bell-Shaped

if the data set is unimodal and roughly symmetric, with its peak at the center, so that the graph looks similar to a bell.

Steps for calculating standard deviation

1.) Calculate the sample mean, x-bar.
2.) For each observation, calculate the difference between the data value and the mean.
3.) Square each difference calculated in Step 2.
4.) Sum the squared differences calculated in Step 3, and then divide the sum by n - 1. The answer for this step is called the variance.
5.) Take the square root of the variance calculated in Step 4.
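The five steps above can be sketched in Python. This is a minimal illustration, not course-provided code: the function name `sample_std` and the data values are made up for the example, and the function assumes at least two observations so that n - 1 is not zero.

```python
import math

def sample_std(values):
    """Sample standard deviation, following the five steps on the card."""
    n = len(values)                              # assumes n >= 2
    mean = sum(values) / n                       # Step 1: sample mean
    deviations = [x - mean for x in values]      # Step 2: value minus mean
    squared = [d ** 2 for d in deviations]       # Step 3: square each difference
    variance = sum(squared) / (n - 1)            # Step 4: divide the sum by n - 1
    return math.sqrt(variance)                   # Step 5: square root of variance

data = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data, mean = 5
print(sample_std(data))           # about 2.138 (the square root of 32/7)
```

Python's standard library function `statistics.stdev` uses the same n - 1 divisor, so it can serve as a cross-check.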

What percent of data is located below the median?

50%

If I told you that you scored in the 90th percentile on the ACT math portion, what percent of students who took the ACT did you score higher than?

90%

In a symmetric distribution, the mean will be

About equal to the median

A histogram is useful for identifying shape, but it is also capable of identifying the other three descriptions of a distribution: _________, _________, and ________.

Center, spread, and outliers

The maximum is the largest value, unless there are outliers, in which case it is the largest value in the data set that is not an outlier.

False

The minimum is the smallest value, unless there are outliers, in which case it is the smallest value in the data set that is not an outlier.

False

Variability

How much variation (spread) is there in the data? Measured by the range, IQR, and SD.

The mean tends to be _______________________ outliers and skewness. However, it should be noted that if there are outliers at both ends of a distribution, they will tend to "cancel out" their effect on the mean.

Impacted by

In a left skewed distribution, the mean will be

Less than the median

Descriptive Statistics

Quantitative data

Deviation

The distance of a data value from the mean. The deviations always sum to zero. Example with x-bar = 2.71: 0 - 2.71 = -2.71, 1 - 2.71 = -1.71, 2 - 2.71 = -0.71.

Median (NSC)

The median locates the center of the data and splits it in half.

Histograms

These are essentially bar graphs for quantitative data. Good for showing shape.

Outliers

Values that lie far away from the rest of the data. If present, give their count and location. Identify which observations they are, not just how many there are.

Measures of Variability

Variance and standard deviation, IQR, range

Shape

What does the data look like?

Boxplot

a graphical display of the five number summary. Good for outliers

Variance

a measure of spread. It is the squared value of the standard deviation.

standard deviation

a measure of the spread of values. Another way to think of it is as roughly the average distance values fall from the mean. The SD is the square root of the variance.

Median

a value such that at least half of the data values are less than or equal to the median and at least half of the data values are greater than or equal to the median. We can think of the median as splitting the ordered data set in half.

The median tends to be a ________________ measure of center in the case of skewness or outliers.

better

A simple ______ does not show outliers

boxplot

We calculate the deviation of a data value as: ___________________________________, where xi is the data value and x-bar is the mean.

deviation for xi = xi - x-bar
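As a quick check of the deviation formula above, the sketch below computes xi - x-bar for each value in a small made-up data set and confirms that the deviations sum to zero (up to floating-point rounding). The data values and variable names are assumptions chosen for illustration.

```python
data = [0, 1, 2, 3, 4, 5]          # hypothetical data values
x_bar = sum(data) / len(data)      # sample mean, 2.5 here

# Deviation of each data value from the mean: xi - x_bar
deviations = [x - x_bar for x in data]
print(deviations)

# Negative and positive deviations cancel, so the total is (essentially) zero
total = sum(deviations)
print(abs(total) < 1e-9)
```

This cancellation is why the deviations are squared before being averaged when computing the variance.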

Interquartile Range (IQR)

the distance from Q1 to Q3: IQR = Q3 - Q1. It spans the middle 50% of the data.

Range

the distance from the minimum to the maximum: Range = MAX - MIN.

The mean will be approximately ________________ to the median in a symmetric distribution.

equal

The mean will be _______________ than the median in a right skewed distribution.

greater

Skewed Left (a.k.a. negatively skewed)

if most of the data occurs on the right (upper) side (i.e. most of the data values are large) with a long tail on the left, lower side

Unimodal Distribution

if the data set has a single peak

Bimodal Distribution

if the data set has two distinct peaks separated by a valley.

Symmetric Distribution

if when you draw a line at the center of the distribution the two halves are mirror images. Real data is almost never perfectly symmetric but is often roughly symmetric.

First Quartile

is a value such that at least 25% of the data values are less than or equal to Q1 and at least 75% of the data values are greater than or equal to Q1. We can think of Q1 as splitting the lower 50% of the ordered data set in half.

Third Quartile

is a value such that at least 75% of the data values are less than or equal to Q3 and at least 25% of the data values are greater than or equal to Q3. We can think of Q3 as splitting the upper 50% of the ordered data set in half.

Five-Number Summary

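is the minimum (abbreviated min), the first quartile (denoted Q1), the median (abbreviated med), the third quartile (denoted Q3), and the maximum (abbreviated max). These five values divide the ordered data into four parts (min to Q1, Q1 to med, med to Q3, Q3 to max), each containing about 25% of the data.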
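The five-number summary, together with the range and IQR, can be sketched as follows. This follows the "split each half in half" description of Q1 and Q3 on these cards; when n is odd, the sketch leaves the median out of both halves, which is one common convention but an assumption here. Statistical software often uses interpolation rules instead and may report slightly different quartiles. The function names and data are made up for the example, and at least two observations are assumed.

```python
def median(ordered):
    """Median of an already sorted, non-empty list."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max); assumes at least two values."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                       # lower 50% of the data
    # For odd n, skip the middle value so it is in neither half (assumed convention)
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])

data = [7, 1, 3, 12, 9, 5, 15, 10]               # hypothetical data
summary = five_number_summary(data)
minimum, q1, med, q3, maximum = summary
print(summary)                                   # (1, 4.0, 8.0, 11.0, 15)
print("IQR =", q3 - q1)                          # 7.0
print("Range =", maximum - minimum)              # 14
```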

The larger the area the ________ the variation of the data

larger

The mean will be _______________ than the median in a left skewed distribution

less

Describing a distribution (graph)

looking at the movement

The ________ of a sample is denoted by x-bar and is calculated as: x-bar = (x1 + x2 + ... + xn) / n, where n is the sample size, x1 is the first value, x2 is the second value, and so on.

mean

The mean is _________ robust in the presence of outliers or skewness.

not

The mean does _________ have to have the data ordered, but the median does have to have the data ______________.

not; ordered

Modified boxplot shows ________

outliers

The median is more ________________ in the presence of outliers or skewness

robust (not affected by outliers and skewness)
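A small numerical illustration of this robustness, using Python's standard `statistics` module. The data values are invented for the example: one very large value is appended to an otherwise tight data set, and the mean moves a lot while the median barely changes.

```python
import statistics

values = [20, 22, 23, 25, 27]          # hypothetical data, no outliers
with_outlier = values + [200]          # same data plus one large outlier

mean_before = statistics.mean(values)            # 23.4
median_before = statistics.median(values)        # 23
mean_after = statistics.mean(with_outlier)       # about 52.8
median_after = statistics.median(with_outlier)   # 24.0

print(mean_before, median_before)
print(mean_after, median_after)
```

The outlier also creates a long right tail, and the result matches the right-skew card: the mean ends up greater than the median.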

Percentiles

The pth percentile of a data set is a value such that at least p% of the data values are less than or equal to it.
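To connect this definition to data, the sketch below computes the percent of values that are less than or equal to a given value. The helper name `percent_at_or_below` and the scores are hypothetical, chosen only for illustration.

```python
def percent_at_or_below(values, x):
    """Percent of data values that are less than or equal to x."""
    count = sum(1 for v in values if v <= x)
    return 100 * count / len(values)

scores = [12, 15, 17, 18, 20, 21, 23, 25, 28, 30]   # hypothetical test scores

# 9 of the 10 scores are <= 28, so 28 satisfies the definition for p = 90
print(percent_at_or_below(scores, 28))   # 90.0
# Half of the scores are <= 20
print(percent_at_or_below(scores, 20))   # 50.0
```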

The smaller the boxplot, the __________ the variability of the data

smaller

s

standard deviation of a sample (statistic)

Mean (NSC)

the arithmetic average of the data values

Maximum

the largest value in a data set

Minimum

the smallest value in a data set

s^2

variance of a sample (statistic)

Center

What is the typical value? Mean or median.

Is it possible for a variable to have a distribution that is both unimodal and left skewed?

yes

