Applied Probability and Statistics - C955
What percent of 65 is 32?
x/100 = 32/65; 65x = 100*32; 65x = 3200; x = 3200/65; x ≈ 49.2 (about 49.2%)
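The proportion above reduces to dividing the part by the whole and multiplying by 100. A minimal sketch, where the function name `percent_of` is only an illustrative choice:

```python
def percent_of(part, whole):
    """Return what percent `part` is of `whole`."""
    # Same as solving x/100 = part/whole for x.
    return part / whole * 100

print(round(percent_of(32, 65), 1))  # 49.2
```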
Finding First Quartile
- Arrange in numerical order - Identify median of lower half
What does it mean to say that a number, 𝑥, is a factor of another number, 𝑦?
- x is smaller than or equal to y - y can be divided evenly by x - x and y are whole numbers
Finding Quartiles
1) Arrange the data in ascending order. 2) Determine the median, M, or second quartile, Q2. 3) Determine the first and third quartiles, Q1 and Q3, by dividing the data set into two halves; the bottom half will be the observations below (to the left of) the location of the median, and the top half will be the observations above (to the right of) it. The first quartile is the median of the bottom half and the third quartile is the median of the top half - odd number of values, Q2 = median
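The three steps can be sketched in Python as follows. This assumes the median-of-halves method described on this card, with the median itself left out of both halves when the number of values is odd; other textbooks and software use slightly different quartile conventions. The function name `quartiles` is illustrative.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    data = sorted(values)              # step 1: ascending order
    n = len(data)
    half = n // 2
    q2 = median(data)                  # step 2: the median
    lower = data[:half]                # observations below the median's location
    # With an odd count, skip the middle value; with an even count, do not.
    upper = data[half + 1:] if n % 2 else data[half:]
    return median(lower), q2, median(upper)   # step 3

print(quartiles([7, 1, 3, 9, 5, 11, 13]))  # (3, 7, 11)
```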
Interval
A set of numbers between two specified values - can be visualized as a segment of the number line
Check Sheet
A structured form or table that allows data to be collected by marking how often an event has occurred in a certain interval
Measure of Central Tendency
A summary measure that is used to describe an entire set of data with one value that represents the middle or center of the data set's distribution. There are three main measures: mean, median, and mode
Coordinate Plane
A tool used for graphing that is a display of a two-dimensional plane. It consists of an x-axis and a y-axis; the x-axis being a horizontal number line, the y-axis is a vertical number line, and the axes meet at the origin (0, 0).
Mean
Average - a single value that represents the center of a set of data values - add all values together, divide by the number of values
Whole Numbers
Natural numbers (counting numbers) and zero; 0, 1, 2, 3...
Net Margin
Net Income/Revenue
Does a box plot include the mean of a set of data?
No
A linear equation has a degree of ___
One - creates a straight line when graphed
1.5 IQR Criterion Rule
Outliers are defined to be any points that are more than 1.5 × IQR above Q3 or below Q1. Upper cutoff = Q3 + (1.5 × IQR); lower cutoff = Q1 - (1.5 × IQR). Any point above the upper cutoff or below the lower cutoff is an outlier.
Which of the following are best used to display categorical data?
Pie Chart
Categorical Variable
Places an individual into one of several groups or categories - aka Qualitative
Range
The difference between the highest and lowest scores in a distribution
Interquartile Range
The difference between the third and first quartiles AND An indicator of the distribution of a sample that can help identify outliers
Greatest Common Factor (GCF)
The largest factor that two or more numbers have in common
Standard Deviation
The measure of how far, on average, the data points are from the mean
Mode
The most frequently occurring value - The mode is only relevant if a data set has values that are repeated, and unlike mean and median, there can be more than one mode in a data set
Negative Square Root
The negative square root of a perfect square - for example, -6 is the negative square root of 36
Principal Square Root
The nonnegative square root of a number
Radicand
The number or expression inside a radical symbol
Reciprocal
The number that, when multiplied by the original number, gives 1 - e.g., the reciprocal of 4/5 is 5/4 because 4/5 x 5/4 = 1
X Intercept
The x-coordinate of a point where a graph crosses the x-axis
The return on investment (ROI) in a sample population of investments ranged from 8% to 23%. What percentage of the sample will fall within 1 standard deviation above the mean? Assume the data is normally distributed.
34% - The Standard Deviation Rule says that 68% of the values will fall within 1 standard deviation of the mean. Since we are looking for the percentage that will fall within 1 standard deviation above the mean, we divide this value in half. Therefore, the answer is 34%.
When observing a box plot, what percentage of the measured data lies between Q1 and the maximum value in the list?
75% - Twenty-five percent of the data is to the left of Q1, so that leaves 75% to the right.
Estimation
= (best-case estimate + (4 times the most-likely estimate) + worst-case estimate) / 6
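A minimal sketch of this weighted estimate, assuming the whole sum (best case, four times most likely, worst case) is divided by 6. The function name and the sample figures are illustrative only.

```python
def three_point_estimate(best, most_likely, worst):
    """Weighted average that counts the most-likely estimate four times."""
    return (best + 4 * most_likely + worst) / 6

# Hypothetical task: best case 2 days, most likely 4 days, worst case 12 days.
print(three_point_estimate(2, 4, 12))  # 5.0
```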
Set
A collection of numbers in mathematics
Data
A collection of numbers in statistics
Slope Intercept Form
A common format that a linear equation can take that is helpful for graphing purposes. It is of the form y = mx + b, where m is the slope of the line and b is the y-intercept
Ratio
A comparison of two quantities by division - use butterfly method
Factor Tree
A diagram showing the prime factorization of a number
Skewness
A measure of the shape of a data distribution. Data skewed to the left result in negative skewness; a symmetric data distribution results in zero skewness; and data skewed to the right result in positive skewness
Standard Deviation Rule
A normal distribution contains 68% of the data between one standard deviation above and below the mean, 95% of the data between two standard deviations above and below the mean, and 99.7% of data between three standard deviations above and below the mean
Measures of Spread
A number of measures used to determine the distance of data from the center of the data set, such as range and standard deviation - variability, range
Numerical Summary
A number used to describe a specific characteristic about a data set
Rate
A ratio that compares two quantities measured in different units
Proportion
A true statement in which two ratios are equal to each other
Graphical Display
A visual representation of data sets
Linear Equation
An algebraic equation in which each term is either a constant or the product of a constant and (the first power of) a single variable.
Distribution
An arrangement of values that illustrates their frequency or occurrence
Names of Expressions Based on Their Degree
An expression of degree 0 is known as a constant. An expression of degree 1 is known as linear. An expression of degree 2 is known as quadratic. An expression of degree 3 is known as cubic.
Irrational Number
Any number that can NOT be expressed as a ratio of two integers or as a repeating or terminating decimal - Pi or any square root of an imperfect square are considered irrational - An infinite, repeating decimal is NOT considered an irrational number
Rational Numbers
Any number that can be expressed as a fraction - a decimal whose expansion terminates or repeats
Outliers
Any points that are more than 1.5 x IQR above Q3 or below Q1 - to find the cutoff for high outliers, multiply IQR x 1.5 and add the result to the Q3 value - to find the cutoff for low outliers, multiply IQR x 1.5 and subtract the result from the Q1 value
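The two cutoffs can be computed directly from Q1 and Q3. This is a small sketch that takes the quartiles as given inputs; the function names and the sample data are illustrative, not from the course material.

```python
def iqr_cutoffs(q1, q3):
    """Return the (low, high) cutoffs for the 1.5 x IQR rule."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

def find_outliers(values, q1, q3):
    """Return the values that fall outside the 1.5 x IQR cutoffs."""
    low, high = iqr_cutoffs(q1, q3)
    return [v for v in values if v < low or v > high]

data = [1, 3, 5, 7, 9, 11, 40]   # Q1 = 3 and Q3 = 11 by the median-of-halves method
print(iqr_cutoffs(3, 11))          # (-9.0, 23.0)
print(find_outliers(data, 3, 11))  # [40]
```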
You are a professional trainer at a local sports academy. You ask your athletes to determine the number of grams of protein they consume for a particular meal. Which of the following would be the best choice to illustrate the shape of the data you collect?
Box Plot - As the data you are collecting is quantitative data, from the choices below a box plot would be your best choice to illustrate the shape of the data.
A bar chart is used to show frequencies of what type of data?
Categorical (qualitative, discrete)
Dividing Fractions
1) Change the division sign (÷) to a multiplication sign (×) 2) Write the reciprocal of the second fraction 3) Multiply the numerators 4) Multiply the denominators 5) Write the answer in the form of a fraction 6) Reduce the fraction to the lowest terms, if necessary
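These steps can be checked with Python's standard `fractions` module, which reduces results to lowest terms automatically. A short sketch, with `divide_fractions` as an illustrative name:

```python
from fractions import Fraction

def divide_fractions(first, second):
    """Divide two fractions by multiplying by the reciprocal of the second."""
    # Flip the second fraction (raises ZeroDivisionError if it is zero).
    reciprocal = Fraction(second.denominator, second.numerator)
    # Multiply numerators and denominators; Fraction reduces the result.
    return first * reciprocal

print(divide_fractions(Fraction(3, 4), Fraction(2, 5)))  # 15/8
```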
Adding Fractions
Change the fractions into the lowest common denominator and add the numerators eg. 3/4 + 7/8 = 6/8 + 7/8 = 13/8 = 1 5/8
Finding Equivalent Fractions
Changing a mixed fraction into an improper fraction: multiply the denominator by the whole number, add the product to the numerator, and keep the denominator the same
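As a quick illustration of the conversion, here is a sketch that assumes a non-negative mixed number; the function name is hypothetical.

```python
def mixed_to_improper(whole, numerator, denominator):
    """Convert a mixed number such as 2 3/4 into (numerator, denominator)."""
    # Multiply the denominator by the whole number, then add the numerator.
    new_numerator = denominator * whole + numerator
    # The denominator stays the same.
    return new_numerator, denominator

print(mixed_to_improper(2, 3, 4))  # (11, 4), i.e. 2 3/4 = 11/4
```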
The Butterfly Method
Cross-multiplication method used to determine whether two fractions are equal. The numerator of one fraction is multiplied by the denominator of the opposite fraction on both sides
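The cross-multiplication check takes only one comparison. In this sketch the fractions a/b and c/d are passed as four integers, and the function name is illustrative.

```python
def fractions_equal(a, b, c, d):
    """Return True if a/b equals c/d, using cross-multiplication."""
    # Multiply each numerator by the opposite denominator and compare.
    return a * d == b * c

print(fractions_equal(1, 2, 3, 6))  # True  (1*6 == 2*3)
print(fractions_equal(2, 3, 3, 5))  # False (2*5 != 3*3)
```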
Valid Data
Data resulting from a test that accurately measures what it is intended to measure
Reliable Data
Data that is both consistent and repeatable
Bar Chart
Displays frequencies (i.e., counts) or relative frequencies for categorical data
Histogram
Displays frequencies or relative frequencies for quantitative data Type of data: Quantitative Used for: The distribution (shape and spread) of quantitative data - using a check sheet
Fundamental Theorem of Arithmetic
Every composite number can be expressed as a unique product of prime numbers - any integer greater than 1 is either prime, or is the product of prime numbers
Outliers do not often affect the mean measure of center. True or False?
False - Outliers can heavily affect the mean in a distribution. When a distribution is skewed, the mean value will see a direct impact
Calculating Standard Deviation
First, calculate the mean of the set of numbers. Second, calculate how far away from the mean each data point lies. Third, square each of these numbers. Fourth, sum these squared numbers. Fifth, divide this sum by the number of data points minus 1. Finally, take the square root of this number, yielding a sample standard deviation.
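The six steps map onto code one line at a time. The sketch below uses a made-up data set; Python's built-in `statistics.stdev` computes the same sample standard deviation and can be used to check the result.

```python
import math

def sample_std_dev(values):
    """Sample standard deviation, following the six steps above."""
    n = len(values)
    mean = sum(values) / n                     # 1. mean of the values
    deviations = [v - mean for v in values]    # 2. distance of each point from the mean
    squared = [d ** 2 for d in deviations]     # 3. square each distance
    total = sum(squared)                       # 4. sum the squares
    variance = total / (n - 1)                 # 5. divide by (count - 1)
    return math.sqrt(variance)                 # 6. square root

data = [2, 4, 4, 4, 5, 5, 7, 9]   # illustrative values
print(round(sample_std_dev(data), 3))  # 2.138
```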
What would be the best type of graph to use to display the age of all employees in a particular division in a company?
Histogram - This is quantitative data that will be grouped into ranges or bins. Therefore, a histogram is the best choice to display this data - Quantitative data is represented on a histogram (NOT categorical)
Division Principle of Equality
If both sides of an equation are divided by the same number, the result is an equivalent equation
Multiplication Principle of Equality
If both sides of an equation are multiplied by the same number, the result is an equivalent equation
Multiplication and division of fractions
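Multiplication - Denominators do not have to be the same. Multiply straight across: numerator by numerator, denominator by denominator. ex: 3/7 x 3/5 = 9/35 or 4/6 x 5/8 = 20/48. Division - Denominators do not need to be the same, BUT you need to KCF = keep, change, flip (keep the first fraction, change ÷ to ×, flip the second fraction), aka multiply by the INVERSE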
Qualitative Data
Information describing color, odor, shape, or some other physical characteristic - non-numeric information based on some quality or characteristic
Modified Box Plot
Just like a regular box plot, except that outliers are shown as points above or below the minimum and maximum
Five Number Summary
Lists the minimum, 1st quartile, median, 3rd quartile, maximum in a data set
Median
Middle - the "halfway" point of a set of values; an equal number of values will fall above and below the median of a data set - best used when data is skewed - sort values from smallest to largest, count number of values from each side to reach the middle value - if even number of values, add both and divide by 2 (take the number between the two values)
Arrange the mean, median, and mode in order from least to greatest in a distribution that is positively skewed.
Mode, median, mean - The median will always remain in the middle for skewed data, and the mean goes towards the skew. The mode goes in the opposite direction of the skew
When multiplying and dividing decimals . . .
Move decimal over to the left two points
A histogram that has more than two modes is known as:
Multimodal - vs. unimodal (one) or bimodal (two)
Multiplying and dividing percentages
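Multiplication by a percentage - multiply the numbers, then move the decimal two places to the left, e.g. 87.5% x 42.4: 87.5 x 42.4 = 3710, so the answer is 37.10. Division by a percentage - divide the numbers, then move the decimal two places to the right, e.g. 45 / 10%: 45 / 10 = 4.5, so the answer is 450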
Prime vs. Composite Numbers
Prime numbers: exactly two factors, 1 and the number itself. Composite numbers: more than two factors
Stem plots display which type of data?
Quantitative Data
Exponents
Repeated multiplication (#^3 = # x # x #) (#^1 = #) (#^0 = 1)
Net Income Equation
Revenues - Expenses = Revenues - (Cost of goods sold + Operating expenses + fixed costs)
Slope Equation
Rise / Run = (y2 - y1) / (x2 - x1)
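A short sketch of rise over run for two points (x1, y1) and (x2, y2). The function name is illustrative, and raising an error for a vertical line is a design choice made here, since the run would be zero and the slope undefined.

```python
def slope(p1, p2):
    """Return the slope of the line through points p1 and p2."""
    x1, y1 = p1
    x2, y2 = p2
    if x2 == x1:
        # Run is zero, so rise / run is undefined.
        raise ValueError("slope is undefined for a vertical line")
    return (y2 - y1) / (x2 - x1)

print(slope((1, 2), (3, 6)))  # 2.0
```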
Preferred Measures of Center & Spread
Skewed - Measures of Center: Median; Measures of Spread: Range or IQR. Normal/Symmetric - Measures of Center: Mean; Measures of Spread: Standard Deviation
Of the following sets of data, which would you assume should have the smallest range?
The ages of interns currently in the college summer internship program - 21 years is the average age of a college student in the summer internship program. The variation from this amount is generally one or two years + or -. Therefore, we can assume the data set would be 19,20,21,22,23 years of age. The range would be equal to 23−19=4 Each of the other options has a far greater probability of having data that is more spread out.
Calculating Standard Deviation Example
The average female height in the U.S. in 2010 was 63.8 inches, with a standard deviation of 2.7 inches. Assuming a normal distribution: The key to solving this problem is using what we know from the Standard Deviation Rule, specifically that 95% of the data will fall within 2 standard deviations of the mean. Since we know that the standard deviation is 2.7 inches, we can calculate that 2 standard deviations is equal to 5.4 inches (2.7 × 2). Now knowing this value, we can determine the values that 95% of the data will fall between. To obtain the first value, we subtract 5.4 inches from the mean: 63.8 inches - 5.4 inches = 58.4 inches. To obtain the second value, we add 5.4 inches to the mean: 63.8 inches + 5.4 inches = 69.2 inches. Therefore, 95% of the data will fall between 58.4 inches and 69.2 inches.
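The same arithmetic as a small sketch, using the mean of 63.8 inches and the standard deviation of 2.7 inches from the example. The function name and the parameter `k` (number of standard deviations) are illustrative.

```python
def sd_interval(mean, sd, k):
    """Return the range lying within k standard deviations of the mean."""
    return mean - k * sd, mean + k * sd

# About 95% of normally distributed data lies within 2 standard deviations.
low, high = sd_interval(63.8, 2.7, 2)
print(round(low, 1), round(high, 1))  # 58.4 69.2
```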
Factorization
The process of breaking up a composite number into its prime factors
Statistics
The science that deals with the interpretation of numerical facts or data through theories of probability, as well as the facts or data itself
Real Numbers
The set of all rational and irrational numbers - any number that can be placed on a number line including negatives & fractions
Integers
The set of whole numbers and their opposites (whole numbers plus negatives)
Least Common Multiple (LCM)
The smallest multiple (other than zero) that two or more numbers have in common - If the two numbers do NOT share a common factor other than 1, multiply them together to find the least common multiple. For example, 2 and 3 have no common factors other than 1, so the LCM is 2 x 3 = 6
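For two positive whole numbers, the LCM can be found from the greatest common factor: multiply the numbers and divide by their GCF. This sketch uses `math.gcd` from the Python standard library; the helper name `lcm` is defined here for illustration.

```python
from math import gcd

def lcm(a, b):
    """Least common multiple of two positive whole numbers."""
    # When the GCF is 1, this is just a * b.
    return a * b // gcd(a, b)

print(gcd(12, 18))  # 6  (greatest common factor)
print(lcm(2, 3))    # 6  (no common factor other than 1, so 2 x 3)
print(lcm(4, 6))    # 12 (common factor of 2, so smaller than 4 x 6)
```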
Least Common Denominator (LCD)
The smallest of all possible common denominators
Radical Sign
The symbol used to indicate a positive square root, √.
What is the most significant difference between histograms and bar charts?
The type of data depicted in the graph - A bar chart measures categorical data. A histogram displays quantitative data.
Box Plot
Type of data: Quantitative Used for: The center, spread, and outliers in a given data set
Dot Plot
Type of data: Quantitative Used for: The distribution of data, particularly clusters, gaps, and outliers. Most useful for smaller data sets
Stem Plot
Type of data: Quantitative Used for: The distribution or shape of data according to place values aka stem & leaf plot
Dividing by Zero
Undefined (0/# = 0) (#/0 = Undefined )
Discrete Data
Values are distinct, separate, and unconnected - can only have certain, distinct values - is 'counted'
Quartiles
Values that divide a data set into four equal parts
Continuous Data
Values within the set are connected, without gaps - can have any value within an interval - is 'measured'
Negative Division
When dividing a positive number by a negative number (or a negative number by a positive number), the quotient will always be negative. Likewise, dividing a negative number by a negative number equals a positive number
Negative Multiplication
When multiplying a positive number by a negative number, the product will always be negative (+ x - = -) - multiplying any number of positive numbers by a negative number changes the sign. Multiplying a negative number by a negative number results in a positive product (- x - = +)
The Identity Property
a) the sum of a number and 0 is the number b) the product of a number and 1 is the number
Simplifying fractions
on calculator = math - frac (1) - enter