IS 310 Exam 1
Methods of assigning probabilities
1) The probability assigned to each experimental outcome must be between 0 and 1, inclusive: 0 ≤ P(Ei) ≤ 1 for all values of i, where Ei is the i-th experimental outcome and P(Ei) is its probability. 2) The sum of the probabilities for all experimental outcomes must equal 1: P(E1) + P(E2) + . . . + P(En) = 1, where n is the number of experimental outcomes.
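The two requirements can be checked mechanically. The following is a minimal sketch, not part of the course material; the function name `is_valid_assignment` and the tolerance `tol` are assumptions chosen for illustration.

```python
import math

def is_valid_assignment(probs, tol=1e-9):
    """Check both basic requirements for a list of outcome probabilities.

    `tol` is an assumed tolerance for floating-point rounding.
    """
    # Requirement 1: every probability lies between 0 and 1, endpoints included.
    each_in_range = all(0.0 <= p <= 1.0 for p in probs)
    # Requirement 2: the probabilities of all outcomes sum to 1.
    sums_to_one = math.isclose(sum(probs), 1.0, abs_tol=tol)
    return each_in_range and sums_to_one

print(is_valid_assignment([0.2, 0.3, 0.5]))   # satisfies both requirements
print(is_valid_assignment([0.6, 0.6, -0.2]))  # violates requirement 1
```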
Scales of measurement
1. Nominal 2. Ordinal 3. Interval 4. Ratio
Bar chart
A bar chart is a graphical display for depicting qualitative data. • On one axis (usually the horizontal axis), we specify the labels that are used for each of the classes. X-axis = classes • A frequency, relative frequency, or percent frequency scale can be used for the other axis (usually the vertical axis). Y-axis = Frequencies • Using a bar of fixed width drawn above each class label, we extend the height appropriately. • The bars are separated to emphasize the fact that each class is a separate category.
Percentile
A point on a ranking scale of 0 to 100. The 50th percentile is the midpoint; half the people in the population being studied rank higher and half rank lower. Position of pth Percentile = (p / 100) * (n + 1) Where: p is the desired percentile (e.g., 25 for the first quartile, 50 for the median, 75 for the third quartile). n is the total number of data points in the dataset.
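The position formula can be turned into a short function. This is a sketch under stated assumptions: the name `percentile` is hypothetical, a fractional position is handled by linear interpolation between the two neighboring values, and positions that fall outside the data are clamped to the smallest or largest value (the formula itself does not say what to do there).

```python
def percentile(data, p):
    """p-th percentile using the (p / 100) * (n + 1) position rule."""
    values = sorted(data)               # ascending order
    n = len(values)
    position = (p / 100) * (n + 1)      # 1-based position in the sorted data
    whole = int(position)               # whole part of the position
    fraction = position - whole         # fractional part, used to interpolate
    # Assumption: clamp positions that fall outside 1..n.
    if whole < 1:
        return values[0]
    if whole >= n:
        return values[-1]
    lower = values[whole - 1]           # value at position `whole`
    upper = values[whole]               # value at position `whole + 1`
    return lower + fraction * (upper - lower)

data = [9, 7, 4, 8]
print(percentile(data, 25))  # first quartile
print(percentile(data, 50))  # median
print(percentile(data, 75))  # third quartile
```

With the four values above, position 1.25 for the first quartile falls a quarter of the way between 4 and 7.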
scatter diagram (scatterplot)
A scatter diagram is a graphical presentation of the relationship between two quantitative variables. • One variable is shown on the horizontal axis and the other variable is shown on the vertical axis. • The general pattern of the plotted points suggests the overall relationship between the variables. • A trendline provides an approximation of the relationship.
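In a sample of 200 students in a university, 20% are Business majors. The 20% is an example of _______ a. A sample b. A population c. Statistical inference d. Descriptive statistics
D. Descriptive statistics. The 20% summarizes the sample itself; it would become statistical inference only if it were used to draw a conclusion about all students at the university.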
Simpson's Paradox
Conclusions drawn from two or more separate crosstabulations that can be reversed when the data are aggregated into a single crosstabulation.
Categorical data
Data that consists of names, labels, or other nonnumerical values
Data
Facts, figures, and other evidence gathered through observations.
[9,7,4,8] Find 60th percentile of the above numbers. Find first quartile of the above numbers. (25%)
S1: Arrange the numbers in ascending order: [4, 7, 8, 9]. S2: Find the 60th percentile. First determine its position in the ordered data with Position = (p / 100) * (n + 1), where p = 60 and n = 4 is the total number of data points: Position = (60 / 100) * (4 + 1) = 0.6 * 5 = 3. The position is a whole number, so no interpolation is needed; the 60th percentile is the value at position 3, which is 8. S3: Find the first quartile (Q1, the 25th percentile) with the same formula and p = 25: Position = (25 / 100) * (4 + 1) = 0.25 * 5 = 1.25. This position is not a whole number, so interpolate between the values at positions 1 and 2 using the fractional part 0.25: Q1 = (Value at position 1) + 0.25 * (Value at position 2 - Value at position 1) = 4 + 0.25 * (7 - 4) = 4 + 0.75 = 4.75. So the first quartile (25th percentile) is 4.75.
mean = 30 range = 10 mode = 53 variance = 144 median = 54 StDev is ______ Coeff. Of variation is ________
StDev = √(Variance) = √144 = 12. CV = (StDev / Mean) × 100% = (12 / 30) × 100% = 0.4 × 100% = 40%. So the standard deviation is 12 and the coefficient of variation is 40%.
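The same arithmetic as a small sketch; the function name `std_dev_and_cv` is an assumption for illustration.

```python
import math

def std_dev_and_cv(variance, mean):
    """Return (standard deviation, coefficient of variation in percent)."""
    std_dev = math.sqrt(variance)   # standard deviation is the square root of the variance
    cv = 100 * std_dev / mean       # CV expresses the standard deviation as a percent of the mean
    return std_dev, cv

std_dev, cv = std_dev_and_cv(variance=144, mean=30)
print(std_dev, cv)
```

The range, mode, and median given in the question are not needed for either value.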
Cumulative Distribution
The cumulative frequency is calculated by adding each frequency from a frequency distribution table to the sum of its predecessors.
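A running total is all that is required. In this sketch the class frequencies are hypothetical values chosen only to show the calculation.

```python
from itertools import accumulate

# Hypothetical class frequencies, listed in class order.
frequencies = [5, 8, 12, 5]

# Each cumulative frequency is the class frequency plus the sum of its predecessors.
cumulative = list(accumulate(frequencies))
print(cumulative)
```

The last cumulative frequency always equals the total number of observations.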
Frequency
The number of times the value or event occurs in a data set
Pie Chart
The pie chart is a commonly used graphical display for presenting relative frequency and percent frequency distributions for categorical data. • First draw a circle; then use the relative frequencies to subdivide the circle into sectors that correspond to the relative frequency for each class. • Since there are 360 degrees in a circle, a class with a relative frequency of .25 would consume .25(360) = 90 degrees of the circle.
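The sector sizes follow directly from the relative frequencies. This is a minimal sketch; the function name `sector_angles` and the example frequencies other than .25 are assumptions.

```python
def sector_angles(relative_frequencies):
    """Convert relative frequencies into pie-chart sector angles in degrees."""
    # A full circle has 360 degrees, so each class gets its share of 360.
    return [rf * 360 for rf in relative_frequencies]

# Hypothetical relative frequencies that sum to 1.
angles = sector_angles([0.25, 0.5, 0.25])
print(angles)
```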
The variance of a sample of 1200 observations equals 324. The standard deviation of the sample equals
To find the standard deviation of a sample when the variance is known, take the square root of the variance. Here the variance is 324, so: Standard Deviation (StDev) = √(Variance) = √324 = 18. The standard deviation of the sample is 18; the sample size of 1200 is not needed for this step.
Mutually Exclusive events
Two events are said to be mutually exclusive if the events have no sample points in common. EX: A = {1, 2, 3} and B = {4, 5, 6} are mutually exclusive.
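Treating each event as a set of sample points makes the definition easy to test. A minimal sketch; the function name `mutually_exclusive` is an assumption.

```python
def mutually_exclusive(event_a, event_b):
    """True if the two events share no sample points."""
    return set(event_a).isdisjoint(event_b)

print(mutually_exclusive({1, 2, 3}, {4, 5, 6}))  # no sample points in common
print(mutually_exclusive({1, 2, 3}, {3, 4, 5}))  # both contain the sample point 3
```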
Quartile
a division of the total into four segments, each one representing one-fourth of the total Position of Q1 = (25 / 100) * (n + 1) Position of Q3 = (75 / 100) * (n + 1)
independent event (probability)
An event whose probability is not affected by whether another event occurs.
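One common way to test this uses the multiplication rule for independent events, P(A and B) = P(A) × P(B), which is not stated on this card. The sketch below assumes a single roll of a fair six-sided die; the names `SAMPLE_SPACE`, `prob`, and `independent` are hypothetical.

```python
from fractions import Fraction

# Assumed experiment: one roll of a fair six-sided die.
SAMPLE_SPACE = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event when all sample points are equally likely."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))

def independent(event_a, event_b):
    """True if P(A and B) equals P(A) * P(B)."""
    return prob(event_a & event_b) == prob(event_a) * prob(event_b)

even = {2, 4, 6}
at_most_four = {1, 2, 3, 4}
at_most_three = {1, 2, 3}

print(independent(even, at_most_four))   # 1/3 == 1/2 * 2/3
print(independent(even, at_most_three))  # 1/6 != 1/2 * 1/2
```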
Element
The entities on which data are collected.
variable
A characteristic of interest for the elements.
Probability
A numerical measure of the likelihood that an event will occur. • Probability values are always assigned on a scale from 0 to 1. • A probability near zero indicates an event is quite unlikely to occur. • A probability near one indicates an event is almost certain to occur. Relative frequency (RF) is one way of determining probability.
Quantitative data
numerical data
Variance
standard deviation squared
class width
the distance between lower (or upper) limits of consecutive classes
Relative Frequency
The fraction or percent of the time that an event occurs in an experiment. Relative frequency = number of occurrences (frequency) / n, where n is the total number of observations (the sum of all the frequencies).
Statistical inference
the process of using data from a sample to gain information about the population
percent frequency distribution
• The percent frequency of a class is the relative frequency multiplied by 100. RF * 100 • A percent frequency distribution is a tabular summary of a set of data showing the percent frequency for each class.
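A tabular summary of this kind can be built in a few lines. This is a sketch with hypothetical observations; the function name `frequency_table` is an assumption.

```python
from collections import Counter

def frequency_table(observations):
    """Map each class to (frequency, relative frequency, percent frequency)."""
    counts = Counter(observations)   # frequency of each class
    n = len(observations)            # total number of observations
    table = {}
    for label, freq in counts.items():
        rf = freq / n                # relative frequency
        table[label] = (freq, rf, rf * 100)  # percent frequency = RF * 100
    return table

# Hypothetical sample of eight students' majors.
majors = ["Business", "Arts", "Business", "Science",
          "Arts", "Business", "Science", "Business"]

table = frequency_table(majors)
for label, (freq, rf, pct) in table.items():
    print(f"{label:<10} {freq:>3} {rf:>6.2f} {pct:>6.1f}%")
```

The relative frequencies sum to 1 and the percent frequencies sum to 100.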