Lesson 7: Quartiles, Quantiles, and Interquartile Range

Quartiles

A common way to communicate a high-level overview of a dataset is to find the values that split the data into four groups of equal size. By doing this, we can then say whether a new data point falls in the first, second, third, or fourth quarter of the data. The values that split the data into fourths are the quartiles. Those values are called the first quartile (Q1), the second quartile (Q2), and the third quartile (Q3).
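
As a rough illustration, the three quartiles can be computed with the base R quantile() function, and findInterval() can report which quarter a new value lands in. The dataset and new_point below are made up for this sketch.

```r
# Hypothetical dataset, invented for this sketch
dataset <- c(2, 4, 6, 8, 10, 12, 14, 16, 18)

# Q1, Q2, and Q3 are the 25%, 50%, and 75% points of the data
quartiles <- quantile(dataset, c(0.25, 0.5, 0.75))

# A hypothetical new data point to place relative to the quartiles
new_point <- 13

# findInterval() counts how many quartiles are <= new_point,
# so adding 1 gives the quarter (1 through 4) it falls in
quarter <- findInterval(new_point, quartiles) + 1

print(quartiles)  # Q1 = 6, Q2 = 10, Q3 = 14
print(quarter)    # 3
```

Here 13 sits between Q2 and Q3, so it falls in the third quarter of the data.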

Quantiles in R

Base R has a function named quantile() that quickly calculates the quantiles of a dataset. In its simplest form, quantile() takes two arguments. The first is the dataset that you are using. The second is a single number or a vector of numbers between 0 and 1. These numbers represent the places in the data where you want to split.

    dataset <- c(5, 10, -20, 42, -9, 10)
    ten_percent <- quantile(dataset, 0.10)

ten_percent now holds the value -14.5. Taken alone, this result technically isn't a quantile in the sense used here, because it isn't splitting the dataset into groups of equal size: it splits the data into one group with 10% of the data and another with 90%.
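
The value -14.5 does not appear in the dataset because quantile() interpolates between neighboring sorted values by default (its type = 7 method). The sketch below reproduces that arithmetic by hand; the helper variable names are made up for illustration.

```r
dataset <- c(5, 10, -20, 42, -9, 10)
sorted <- sort(dataset)   # -20 -9 5 10 10 42
n <- length(sorted)
p <- 0.10

# Default (type 7) position in the sorted data: 1 + (n - 1) * p
h <- 1 + (n - 1) * p      # 1.5, halfway between the 1st and 2nd values
lower <- floor(h)
upper <- ceiling(h)

# Linear interpolation between the two neighboring sorted values
by_hand <- sorted[lower] + (h - lower) * (sorted[upper] - sorted[lower])

print(by_hand)                       # -14.5
print(unname(quantile(dataset, p)))  # -14.5
```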

Common Quantiles

One of the most common quantiles is the 2-quantile, better known as the median. This value splits the data into two groups of equal size. The 4-quantiles, or the quartiles, are the three values that split the data into four groups of equal size. Finally, the percentiles, or the 99 values that split the data into 100 groups of equal size, are commonly used to compare new data points to the dataset.
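
A minimal sketch of how each of these could be requested from quantile(), reusing a small example dataset. The variable names are chosen only for illustration.

```r
dataset <- c(5, 10, -20, 42, -9, 10)

# 2-quantile: the single value that splits the data in half (the median)
two_quantile <- quantile(dataset, 0.5)

# 4-quantiles (quartiles): three values, four groups
quartiles <- quantile(dataset, c(0.25, 0.5, 0.75))

# Percentiles: 99 values at 0.01, 0.02, ..., 0.99, giving 100 groups
percentiles <- quantile(dataset, (1:99) / 100)

print(two_quantile)
print(quartiles)
print(length(percentiles))  # 99
```

With only six data points the percentiles are not very meaningful; they are included only to show the shape of the call.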

Quantiles

Quantiles are points that split a dataset into groups of equal size. The quartiles are some of the most commonly used quantiles. The quartiles split the data into four groups of equal size.

Many Quantiles

Quantiles are usually a set of values that split the data into groups of equal size. For example, if you wanted to get the 5-quantiles, or the four values that split the data into five groups of equal size, you could use this code:

    dataset <- c(5, 10, -20, 42, -9, 10)
    five_quantiles <- quantile(dataset, c(0.2, 0.4, 0.6, 0.8))

If you have n quantiles, the dataset will be split into n+1 groups of equal size.
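
The same idea can be generalized so the probabilities do not have to be typed by hand. The function equal_split_points() below is a hypothetical helper written for this sketch, not part of base R.

```r
# Hypothetical helper: the cut points that split `data` into
# `groups` groups of equal size
equal_split_points <- function(data, groups) {
  # Probabilities 1/groups, 2/groups, ..., (groups - 1)/groups
  probs <- seq_len(groups - 1) / groups
  quantile(data, probs)
}

dataset <- c(5, 10, -20, 42, -9, 10)

# Four cut points, five groups
five_quantiles <- equal_split_points(dataset, 5)
print(five_quantiles)
```

Calling equal_split_points(dataset, 4) would return the three quartiles in the same way.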

Quartiles in R

The base R function that we'll be using is named quantile().

    dataset <- c(50, 10, 4, -3, 4, -20, 2)
    third_quartile <- quantile(dataset, 0.75)

In this usage, quantile() takes two arguments. The first is the dataset you're interested in. The second is a number between 0 and 1. Since we calculated the third quartile, we used 0.75: we want the point that splits the first 75% of the data from the rest. For the second quartile, we'd use 0.5. This gives the point that 50% of the data is below and 50% is above.
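
As a small sketch, all three quartiles of this dataset can be requested in one call by passing a vector of probabilities. The variable name quartiles is chosen only for illustration.

```r
dataset <- c(50, 10, 4, -3, 4, -20, 2)

# One call returns Q1, Q2, and Q3 together
quartiles <- quantile(dataset, c(0.25, 0.5, 0.75))
print(quartiles)  # Q1 = -0.5, Q2 = 4, Q3 = 7

# The second quartile is the same value that median() returns
print(unname(quartiles[2]) == median(dataset))  # TRUE
```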

Interquartile Range in R

The stats package, which R loads by default, has a function that can calculate the IQR all in one step. The IQR() function takes a dataset as an argument and returns the interquartile range, the difference between the third and first quartiles.

    dataset <- c(4, 10, 38, 85, 193)
    interquartile_range <- IQR(dataset)
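
The IQR is the distance between the third and first quartiles, so the result of IQR() can be checked against two calls to quantile(). This is only a sketch; the names q1, q3, and manual_iqr are made up here.

```r
dataset <- c(4, 10, 38, 85, 193)

q1 <- quantile(dataset, 0.25)  # first quartile: 10
q3 <- quantile(dataset, 0.75)  # third quartile: 85

# Subtract Q1 from Q3; unname() drops the leftover "75%" label
manual_iqr <- unname(q3 - q1)

print(manual_iqr)    # 75
print(IQR(dataset))  # 75
```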

