Stats ch1-ch4


Cluster Sampling

Cluster sampling is a sampling method that is particularly useful when the members of the population are widely scattered geographically. Procedure: (1) divide the population into groups (clusters); (2) obtain a simple random sample of the clusters; (3) use all the members of the clusters obtained in step 2 as the sample.

Simple Random Sampling

A sampling procedure for which each possible sample of a given size is equally likely to be the one obtained.

Skewness

A unimodal distribution that is not symmetric is either right skewed or left skewed. A right-skewed distribution has its peak on the left and a longer tail trailing off to the right. A left-skewed distribution has its peak on the right and a longer tail trailing off to the left.

Stem and leaf diagram

A visual method of displaying quantitative data where each value of a data set is separated into two parts: a stem, which consists of the leftmost digits, and a leaf, which consists of the last digit.

Example of standard deviation: The heights, in inches, of 5 players are 72, 73, 76, 76, and 78. Find each deviation from the mean, which is x-x̅.

x̅ = ∑xi/n = (72+73+76+76+78)/5 = 375/5 = 75. The deviations from the mean, x-x̅, are: 72-75 = -3, 73-75 = -2, 76-75 = 1, 76-75 = 1, 78-75 = 3.
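
The arithmetic in this example can be checked with a few lines of Python. This is only a sketch; the variable names are illustrative and not part of the original card.

```python
# Heights (in inches) of the 5 players from the example
heights = [72, 73, 76, 76, 78]

# Mean: sum of the observations divided by the number of observations
mean = sum(heights) / len(heights)

# Deviation from the mean for each observation: x - mean
deviations = [x - mean for x in heights]

print(mean)
print(deviations)
```

The deviations always add up to 0, which is a quick check that the mean was computed correctly.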

Relationships among events

In a Venn diagram, the sample space is drawn as a rectangle and events as circles inside it. (not E): the event "E does not occur"; everything in the rectangle outside the circle for E is shaded. (A & B): the event "both A and B occur"; only the region where the two circles overlap is shaded. (A or B): the event "either A or B or both occur"; both circles, including their overlap, are shaded.

Sample space for event

An event is a collection of outcomes for the experiment, that is, any subset of the sample space. An event occurs if and only if the outcome of the experiment is a member of the event.

Venn diagram

A diagram used to show relationships between sets. The sample space is drawn as a rectangle, with one, two, or three circles inside it representing events.

Boxplot

A graph of the five-number summary.

The Meaning of Probability

A probability near 0 indicates that the event in question is very unlikely to occur. A probability near 1 indicates that the event is very likely to occur.

What's in a deck of cards?

A standard deck contains 52 cards: 4 suits (hearts, diamonds, spades, clubs) with 13 cards in each suit, 4 cards of each kind, and 12 face cards.

Event

An event is some specified result that may or may not occur when an experiment is performed.

Experiment

An experiment is an action whose outcome cannot be predicted with certainty

Single-value grouping

Classes in which each class represents a single possible value are called single-value classes. This method is particularly suitable for discrete data in which there are only a small number of distinct values.

Example: The information lists the world's highest waterfalls. The list shows that Angel Falls in Venezuela is 3281 feet high, more than twice as high as Ribbon Falls in Yosemite, California, which is 1612 feet high. What kind of data are these heights?

Continuous, Quantitative data.

Qualitative data

Data in the form of words or categories; the values of a nonnumerically valued variable.

Quantitative data

Data in the form of numbers; the values of a numerically valued variable.

Descriptive

Descriptive statistics consists of methods for organizing and summarizing information, such as graphs, charts, tables, and the calculation of averages, measures of variation, and percentiles.

What are the two major types of statistics?

Descriptive, Inferential

Procedure for dotplot

Draw a horizontal axis that displays the possible values of the quantitative data, record each observation by placing a dot over the appropriate value on the horizontal axis, and label the horizontal axis with the name of the variable.

Dotplots

Each data value is shown as a dot above its location on a number line.

General Addition Rule

For any two events A and B, P(A or B) = P(A) + P(B) - P(A and B). The rule holds whether or not the events are mutually exclusive; when they are mutually exclusive, P(A and B) = 0 and it reduces to the special addition rule.

Notation for events

For convenience, we use letters such as A, B, C, D, ... to represent events.

Example of The interquartile range: Q1 = 23, Q2=30.5, Q3=36.5 Find IQR

IQR = Q3-Q1=13.5

P(E)

If E is an event, then P(E) represents the probability that event E occurs. It is read "the probability of E."

Example of what mutually exclusive events look like

In the first figure, the two circles in the rectangle do not touch or overlap, so the two events are mutually exclusive. In the second figure, the two circles overlap, so the events have outcomes in common and are not mutually exclusive.

Example of mode: Find the mode(s) of the data set; if there is none, say no mode. 300, 500, 300, 440, 600, 700, 500

This data set has two modes: 300 occurs twice and 500 occurs twice, and no other value occurs more than once. So the modes are 300 and 500.
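
A minimal sketch of the same rule in Python, assuming the convention that a data set has no mode when no value repeats. The function name `modes` is hypothetical.

```python
from collections import Counter


def modes(data):
    """Return all modes in increasing order, or [] if no value repeats."""
    if not data:
        return []
    counts = Counter(data)
    highest = max(counts.values())
    if highest == 1:
        # Every value occurs exactly once, so there is no mode
        return []
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([300, 500, 300, 440, 600, 700, 500]))
```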

Inferential

Inferential statistics consists of methods for drawing and measuring the reliability of conclusions about a population based on information obtained from a sample of the population.

Example for outliers: Q1 = 23, Q2 = 30.5, Q3 = 36.5, IQR = Q3-Q1 = 13.5. Find the outliers, if any.

Lower limit = Q1-1.5(IQR) = 23-1.5(13.5) = 2.75. Upper limit = Q3+1.5(IQR) = 36.5+1.5(13.5) = 56.75. In the data table, only 66 lies outside these limits, so 66 is the only potential outlier.

Measure of center

Numbers that are used to describe the center of a set of data. These measures include the mean, median, and mode.

When randomly selecting a faculty member from the above university, calculate the probability that the faculty member selected is a. under 30 (the total for A1 is 68) b. an associate professor (the total for R2 is 381) c. an associate professor under 30 (find the cell of the table where A1 and R2 meet to get f). Since there are 1164 faculty in total, N = 1164. Therefore,

a. P(A1) = f/N = 68/1164 ≈ 0.058 b. P(R2) = f/N = 381/1164 ≈ 0.327 c. P(A1 and R2) = f/N = 3/1164 ≈ 0.003

A table shows a relative frequency distribution for the size of farms in the US, along with the events corresponding to the size classes:
Size (acres) | Relative frequency | Event
...
50-179 | 0.300 | C
180-499 | 0.167 | D
500-999 | 0.068 | E
1000-1999 | 0.042 | F
2000 and over | 0.036 | G
What is the probability that a randomly selected farm has between 180 and 1999 acres, inclusive?

P(D)+P(E)+P(F) = 0.167+0.068+0.042 = 0.277

Find the probability of the indicated event if P(E) = 0.25 and P(F) = 0.55, find P(E or F) if P(E and F) = 0.20

P(E or F) = P(E) + P(F) - P(E and F) = 0.25 + 0.55 -0.20 = 0.6

Find the probability P(E or F) if E and F are mutually exclusive. P(E) = 0.30 and P(F) = 0.47

P(E or F) = P(E) + P(F) = 0.30 + 0.47 = 0.77

Find the probability P(not E) if P(E) = 0.22

P(not E) = 1 - P(E) = 1 - 0.22 = 0.78

Probability of an event

P(E) = number of favorable outcomes/total number of possible outcomes: Probability of an event = f/N

Example of general addition rule: Records for one year show that 73.9% of the people arrested were male, 12.0% were under 18 years of age, and 8.5% were males under 18 years of age. If a person arrested that year is selected at random, what is the probability that that person is either male or under 18? Let M = event the person obtained is male and E = event the person obtained is under 18. Then we want to find P(M or E).

P(M or E) = P(M) + P(E) - P(M and E) = 0.739 + 0.120 - 0.085 = 0.774
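
As a sketch, the general addition rule can be written as a small Python function. The name `prob_a_or_b` is hypothetical; leaving the joint probability at its default of 0 gives the special addition rule for mutually exclusive events.

```python
def prob_a_or_b(p_a, p_b, p_a_and_b=0.0):
    """General addition rule: P(A or B) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_a_and_b


# Arrest example: P(M) = 0.739, P(E) = 0.120, P(M and E) = 0.085
p_male_or_under_18 = prob_a_or_b(0.739, 0.120, 0.085)

# Mutually exclusive example: P(E) = 0.30, P(F) = 0.47, nothing in common
p_exclusive = prob_a_or_b(0.30, 0.47)

print(round(p_male_or_under_18, 3))
print(round(p_exclusive, 2))
```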

Random Number tables

Picking slips of paper from a box is an impractical way to obtain a simple random sample when the population is large. A table of random numbers can be used instead.

Basic Properties of Probability

Property 1: The probability of an event is always between 0 and 1, inclusive. Property 2: The probability of an event that cannot occur is 0 (an impossible event). Property 3: The probability of an event that must occur is 1 (a certain event).

Lower limit

Q1-1.5(IQR)

Relative frequency Formula

Relative frequency = Frequency/Number of observations

Example for mean: Find the mean of the data set

For the data set 300, 400, 500, 330, 800, add all the observations and divide by how many there are. The sum is 2330 and there are 5 observations, so the mean is 2330/5 = 466.

The probability is 0.303 that the gestation period of a woman will exceed 9 months. In 2000 human gestation periods, roughly how many will exceed 9 months?

The probability is 0.303 and N = 2000, so the formula gives 0.303 = f/2000, and f = 0.303 × 2000 = 606. Roughly 606 of the 2000 human gestation periods will exceed 9 months.

Sample

That part of the population from which information is obtained. (Obtaining information from the entire population is a census.)

Population

The collection of all individuals or items under consideration in a statistical study.

Distribution shapes

The distribution of a data set is a table, graph, or formula that provides the values of the observations and how often they occur.

The interquartile range

The interquartile range, or IQR, is the difference between the first and third quartiles; that is, IQR = Q3-Q1.

The mean

The mean is the most commonly used measure of center; it is what is usually meant by "average." The mean of a data set is the sum of the observations divided by the number of observations.

The Median

The median is the middle number of the data set. Put the observations in order from least to greatest and find the middle value; if the number of observations is even, the median is the average of the two middle values.

Which of the following numbers could not possibly be a probability? 0.361, -0.211, 1

A probability must be between 0 and 1, inclusive, so -0.211 could not be a probability.

At least, at most, and inclusive

The phrase "at least x" means "greater than or equal to x." The phrase "at most x" means "less than or equal to x." The phrase "between x and y, inclusive" means "greater than or equal to x and less than or equal to y."

joint probability distribution

The probability distribution giving the probabilities of outcomes involving two or more random variables. To build one from a contingency table, divide each cell frequency by the grand total and write the result as a decimal; do this for every cell.

When a fair die is rolled, what is the probability that the die comes up an even number?

Three of the six equally likely outcomes (2, 4, 6) are even, so the probability is 3/6 = 0.50.

Special Rule

The special addition rule states that for mutually exclusive events, the probability that one or another of the events occurs is the sum of the individual probabilities: P(A or B) = P(A) + P(B).

Example of single-value grouping: The following table gives the number of TV sets per household for 50 randomly selected households. Use single-value grouping to organize these data into frequency and relative frequency distributions.

The table lists the number of TVs, 0-6, with frequencies 0: 1, 1: 16, 2: 14, 3: 12, 4: 3, 5: 2, 6: 2, for a total of 50. The relative frequency of each value is its frequency divided by 50.
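
The division step can be sketched in Python using the frequencies listed in the answer above. The dictionary layout is an illustrative choice, not something given in the original table.

```python
# Number of TV sets -> number of households (frequencies from the example)
frequencies = {0: 1, 1: 16, 2: 14, 3: 12, 4: 3, 5: 2, 6: 2}

# Total number of observations
total = sum(frequencies.values())

# Relative frequency = frequency / number of observations
relative = {value: count / total for value, count in frequencies.items()}

for value, count in frequencies.items():
    print(value, count, round(relative[value], 2))
```

The relative frequencies should add up to 1, apart from rounding.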

The Mode

The value that occurs most frequently in a given data set. If no value occurs more than once, the data set has no mode; otherwise, every value that occurs with the greatest frequency is a mode.

mutually exclusive events

Two events that cannot occur at the same time (i.e., they have no outcomes in common).

When are the adjacent values just minimum and maximum observation?

When the data has no outliers.

standard deviation

A computed measure of how much the observations vary around the mean (the average). It is based on the deviations of the observations from the mean, x-x̅.

When one die is rolled, six outcomes are possible: 1, 2, 3, 4, 5, 6. Let A = event the die comes up even, B = event the die comes up 4 or more, C = event the die comes up at most 2, D = event the die comes up 5. a. What outcomes constitute event A? b. What outcomes constitute event B? c. What outcomes constitute event C? d. What outcomes constitute event D?

a. 2,4,6 b. 4,5,6 c. 1,2 d. 5

A deck has 52 cards in four suits (spades, hearts, clubs, diamonds) with 13 cards in each suit. Spades and clubs are black; diamonds and hearts are red. If one of these cards is selected at random, what is the probability that it is: a. a nine? b. red? c. not a spade?

a. 4/52 = 1/13 b. 26/52 = 1/2 c. 39/52 = 3/4

Example: A quantitative data set has mean 24 and standard deviation 3. Apply Chebyshev's rule. a. At least 75% of the observations lie between which two values? b. At least what percent of the observations lie between 15 and 33?

a. 75% corresponds to k = 2, with x̅ = 24 and s = 3: x̅ - ks = 24 - (2)(3) = 18 and x̅ + ks = 24 + (2)(3) = 30. At least 75% of the observations lie between 18 and 30. b. 33 = x̅ + ks gives 33 = 24 + 3k, so k = 3 (and 24 - (3)(3) = 15). At least 89% of the observations lie between 15 and 33.
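
A short Python sketch of the two calculations used here, the bound 1-1/k^2 and the interval x̅ ± ks. The function names are hypothetical.

```python
def chebyshev_bound(k):
    """Minimum fraction of observations within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("Chebyshev's rule needs k >= 1")
    return 1 - 1 / k**2


def chebyshev_interval(mean, s, k):
    """Endpoints mean - k*s and mean + k*s."""
    return mean - k * s, mean + k * s


# Part a: k = 2 with mean 24 and standard deviation 3
print(chebyshev_bound(2), chebyshev_interval(24, 3, 2))

# Part b: 33 = 24 + 3k gives k = 3
k = (33 - 24) / 3
print(round(chebyshev_bound(k) * 100), chebyshev_interval(24, 3, 3))
```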

Consider the set consisting of the first 10 positive whole numbers, that is, 1-10. a. Determine the numbers in the set that are at least 8. b. Determine the numbers in the set that are at most 9. c. Determine the numbers in the set that are between 7 and 10, inclusive.

a. 8,9,10 b. 1,2,3,4,5,6,7,8,9 c. 7,8,9,10

The table gives the relative frequency distribution for farm size, with an event for each size class:
Size (acres) | Relative frequency | Event
under 10 | 0.106 | A
10-49 | 0.281 | B
50-179 | 0.300 | C
180-499 | 0.167 | D
500-999 | 0.068 | E
1000-1999 | 0.042 | F
2000 and over | 0.036 | G
Find the probability that a randomly selected farm has a. less than 2000 acres b. 50 acres or more

a. Let J be the event the farm selected has less than 2000 acres. Then P(J) = 1-P(not J) = 1-P(G) = 1-0.036 = 0.964 b. Let K be the event the farm selected has 50 acres or more. Then P(K) = 1-P(not K) = 1-[P(A) + P(B)] = 1-(0.106 + 0.281) = 1-0.387 = 0.613

Example of general addition rule: A standard deck of playing cards has 52 cards. Find the probability that the card selected is either a spade or a face card.
a. Without using the general addition rule: let E = event the card selected is either a spade or a face card. Event E contains 22 outcomes, the 13 spades plus the other 9 face cards that are not spades; use f/N.
b. Using the general addition rule: let C = event the card selected is a spade and D = event the card selected is a face card. Event C consists of the 13 spades, event D of the 12 face cards, and event (C and D) of the three face cards that are spades.

a. P(E) = 22/52 = 0.423 b. P(C or D) = P(C) + P(D) - P(C and D) = 13/52 + 12/52 - 3/52 = 22/52 = 0.423
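
Both routes can be checked by listing the deck in Python. This is a sketch; the rank and suit labels are illustrative choices.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["spades", "hearts", "clubs", "diamonds"]
deck = list(product(ranks, suits))  # 52 (rank, suit) pairs

spades = {card for card in deck if card[1] == "spades"}
faces = {card for card in deck if card[0] in {"J", "Q", "K"}}

# a. Count the outcomes in (spade or face card) directly: f/N
p_direct = Fraction(len(spades | faces), len(deck))

# b. General addition rule: P(C) + P(D) - P(C and D)
n = len(deck)
p_rule = Fraction(len(spades), n) + Fraction(len(faces), n) - Fraction(len(spades & faces), n)

print(p_direct, p_rule, round(float(p_direct), 3))
```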

Games played. Complete a-e for the data: 45 48 64 70 73 74 79 79 80 80 80 80 81 82 82 a. Obtain and interpret the quartiles. b. Determine and interpret the interquartile range. c. Find and interpret the five-number summary. d. Identify potential outliers, if any. e. Obtain and interpret a boxplot.

a. Q1 = (70+73)/2 = 71.5, Q2 = 79, Q3 = (80+80)/2 = 80. Roughly a quarter of the observations are at or below 71.5 games, half are at or below 79, and three quarters are at or below 80.
b. IQR = Q3-Q1 = 80-71.5 = 8.5. The numbers of games played in the middle 50% of the seasons span about 8.5 games.
c. Min, Q1, Q2, Q3, Max = 45, 71.5, 79, 80, 82. The distance between the minimum and the first quartile and the distance between the first quartile and the median are larger than the distance between the median and the third quartile and the distance between the third quartile and the maximum, so there is more variation in the first two quarters of the data than in the last two.
d. Lower limit = Q1-1.5(IQR) = 71.5-1.5(8.5) = 58.75. Upper limit = Q3+1.5(IQR) = 80+1.5(8.5) = 92.75. The potential outliers are 45 and 48, which lie below the lower limit.
e. Draw a horizontal axis covering 45-82 and draw a box with vertical lines at the three quartiles. Interpretation: the numbers of games played range from 45 to 82; the distribution is markedly left skewed; the potential outliers fall well below the rest of the data; and there is more variation in the first quarter than in any other quarter.
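
The quartiles, IQR, and limits above can be reproduced with a short Python sketch. It assumes the "median of each half" method, with the median included in both halves when the number of observations is odd, which is the convention that yields Q1 = 71.5 here. Library routines such as `statistics.quantiles` use other conventions and may give slightly different values.

```python
def median(sorted_values):
    """Median of a list that is already sorted."""
    n = len(sorted_values)
    mid = n // 2
    if n % 2:
        return sorted_values[mid]
    return (sorted_values[mid - 1] + sorted_values[mid]) / 2


def quartiles(data):
    """Q1, Q2, Q3 as medians of the bottom half, the whole set, and the top half."""
    values = sorted(data)
    n = len(values)
    half = (n + 1) // 2  # the halves share the median when n is odd
    return median(values[:half]), median(values), median(values[n - half:])


games = [45, 48, 64, 70, 73, 74, 79, 79, 80, 80, 80, 80, 81, 82, 82]
q1, q2, q3 = quartiles(games)
iqr = q3 - q1
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
potential_outliers = [x for x in games if x < lower_limit or x > upper_limit]
five_number_summary = (min(games), q1, q2, q3, max(games))

print(five_number_summary, iqr, lower_limit, upper_limit, potential_outliers)
```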

Example for probability: A table shows data on family income in the US. A US family is selected at random.
Income | Frequency
under 15000 | 6827
15000-24999 | 7194
(and so on; the total frequency is 78867)
Determine the probability that the family selected has an annual income of: a. between 50000 and 74999, inclusive b. between 15000 and 49999 c. under 25000

a. f/N = 15260/78867 = 0.193 b. Add the frequencies of all the classes in the range: f/N = (7194+7863+10898)/78867 = 0.329 c. Add the frequencies of the classes below 25000: f/N = (6827+7194)/78867 = 0.178

Suppose A and B are events such that P(A) = 1/5, P(B) = 1/3, and P(A or B) = 1/2. a. Are events A and B mutually exclusive? b. Determine P(A and B).

a. No; P(A) + P(B) = 8/15 is not equal to P(A or B) = 1/2. b. P(A or B) = P(A) + P(B) - P(A and B), so P(A and B) = P(A) + P(B) - P(A or B) = 1/5 + 1/3 - 1/2 = 1/30

Example for mutually exclusive events: For the experiment of randomly selecting one card from a standard deck of 52, let C = event the card selected is a heart, D = event the card selected is a face card, E = event the card selected is an ace, F = event the card selected is an 8, G = event the card selected is a 10 or a jack. Which of the following collections of events are mutually exclusive? a. C and D b. C and E c. D and E d. D, E, and F e. D, E, F, and G

a. not mutually exclusive (some face cards are hearts) b. not mutually exclusive (the ace of hearts is in both) c. mutually exclusive d. mutually exclusive e. not mutually exclusive (a jack is a face card, so D and G have outcomes in common)

Example of probability: When two fair dice are rolled, 36 equally likely outcomes are possible. Find the probability that: a. the sum of the dice is 11 b. doubles are rolled, that is, both dice show the same number

a. The probability that the sum of the dice is 11 = f/N = 2/36 = 0.056 b. The probability that doubles are rolled = f/N = 6/36 = 0.167

Example of randomly selecting one card from a standard deck: A = event the card selected is the king of hearts, B = event the card selected is a king, C = event the card selected is a heart, D = event the card selected is a face card. Find the number of outcomes in each of the following events: a. (not D) b. (B and C) c. (B or C) d. (C and D)

a. There are 52 cards; removing the 12 face cards leaves 52-12 = 40 outcomes. b. There is 1 outcome: B selects the kings and C selects the hearts, so the only card in both is the king of hearts. c. This is an "or" statement: B has 4 kings and C has 13 hearts, which adds to 17, but the king of hearts is counted twice, so there are 17-1 = 16 outcomes. d. There are 3 outcomes in the event (C and D), because three of the hearts are face cards.

A picture shows the 36 equally likely outcomes when two balanced dice are rolled. a. Determine the probability that the sum of the dice is 6. b. Determine the probability that the sum of the dice is odd. c. Determine the probability that the sum of the dice is 4 or 11. d. Determine the probability that the sum of the dice is 2, 5, or 8.

a. There are 5 ways to roll a sum of 6, so 5/36 = 0.139 b. Half of the 36 outcomes (18) have an odd sum, so 18/36 = 0.50 c. There are 3 ways to roll a 4 and 2 ways to roll an 11, so 3/36 + 2/36 = 5/36 = 0.139 d. There is 1 way to roll a 2, 4 ways to roll a 5, and 5 ways to roll an 8, so 1/36 + 4/36 + 5/36 = 10/36 = 0.278
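
The counts of "ways" can be verified by enumerating all 36 outcomes in Python. This is a sketch, and the helper name `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes (first die, second die)
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """f/N: favorable outcomes divided by the total number of outcomes."""
    favorable = sum(1 for roll in outcomes if event(roll))
    return Fraction(favorable, len(outcomes))


p_sum_6 = prob(lambda roll: sum(roll) == 6)
p_sum_odd = prob(lambda roll: sum(roll) % 2 == 1)
p_sum_4_or_11 = prob(lambda roll: sum(roll) in (4, 11))
p_sum_2_5_8 = prob(lambda roll: sum(roll) in (2, 5, 8))

for p in (p_sum_6, p_sum_odd, p_sum_4_or_11, p_sum_2_5_8):
    print(p, round(float(p), 3))
```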

Example: The student-to-faculty ratios of the 73 fifth-grade classes sampled have a mean of 17.26 and a standard deviation of 2.02. a. Compute x̅ - 3s, x̅ + 3s, x̅ - 2s, x̅ + 2s, x̅ - s, x̅ + s, and x̅. b. Apply property 1 of the empirical rule. c. Apply property 2 of the empirical rule. d. Apply property 3 of the empirical rule.

a. x̅ - 3s = 11.20, x̅ + 3s = 23.32, x̅ - 2s = 13.22, x̅ + 2s = 21.30, x̅ - s = 15.24, x̅ + s = 19.28, x̅ = 17.26 b. 73 × 0.68 ≈ 50: approximately 50 of the 73 fifth-grade classes have student-to-faculty ratios between 15.24 and 19.28. c. 73 × 0.95 ≈ 69: approximately 69 of the 73 fifth-grade classes have student-to-faculty ratios between 13.22 and 21.30. d. 73 × 0.997 ≈ 73: approximately 73 of the 73 fifth-grade classes have student-to-faculty ratios between 11.20 and 23.32.

Modality for bimodal

A distribution is bimodal if it has two peaks, so the graph shows two peaks.

Outliers

Extreme values that don't appear to belong with the rest of the data; observations that fall well outside the overall pattern of the data.

Complementation rule

for any event E, P(E) = 1 -P(not E)

Special Addition Rule

if event A and B are mutually exclusive, then P(A or B) = P(A) +P(B)

The five-number summary

Minimum, first quartile (Q1), median (Q2), third quartile (Q3), maximum. The variation in each quarter of the data is measured by Q1 - Min for the first quarter, Q2 - Q1 for the second quarter, Q3 - Q2 for the third quarter, and Max - Q3 for the fourth quarter.

Modality for multimodal

A distribution is multimodal if it has three or more peaks, so the graph shows three or more peaks.

Potential outliers

observations that lie below the lower limit or above the upper limit

Example of histogram: For single-value, limit, and cutpoint grouping, you graph the frequencies and the relative frequencies in separate histograms, so you have two graphs for one problem.

The y-axis shows the frequency or the relative frequency. The x-axis shows the variable, which can be, for example, the number of TVs or the number of DVDs. Then draw a bar for each class.

computing formula for a sample standard deviation

s = √[ (∑xi² - (∑xi)²/n) / (n - 1) ], where n is the sample size. It gives the same answer as the defining formula s = √[ ∑(xi - x̅)² / (n - 1) ]; it is just a different way to compute it.

The Sample Variance

s² = Σ ( xi - x̄ )² / ( n - 1 )
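
As a sketch, the defining formula for the sample variance and the computing formula can be compared in Python on the five player heights used earlier in this set. The variable names are illustrative.

```python
import math

data = [72, 73, 76, 76, 78]
n = len(data)
mean = sum(data) / n

# Defining formula: sum of squared deviations divided by n - 1
variance_defining = sum((x - mean) ** 2 for x in data) / (n - 1)

# Computing formula: (sum of squares - (sum)^2 / n) divided by n - 1
variance_computing = (sum(x * x for x in data) - sum(data) ** 2 / n) / (n - 1)

# The sample standard deviation is the square root of the sample variance
s = math.sqrt(variance_defining)

print(variance_defining, variance_computing, round(s, 3))
```

Both formulas give a sample variance of 6 for these data, so s = √6 ≈ 2.449.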

Contingency Table (Two-Way Table)

table that lists the outcomes of two categorical variables; the values of one category are given as the row variable, and the values of the other category are given as the column variable

class mark

the average of the two class limits of a class

upper class limit

the largest value within the class

Conditional Probability

The probability that an event B occurs given that another event A has already occurred: P(B | A) = P(A & B)/P(A)

Sample space

the set of all possible outcomes for an experiment

Lower class limit

the smallest value within the class

Example of boxplot: Construct a boxplot from the quartiles. The data set is: 5 15 16 20 21 25 26 27 30 30 31 32 32 34 35 38 38 41 43 66. Q1 = 23, Q2 = 30.5, Q3 = 36.5

Draw a horizontal axis from 0 to 70 to cover the data set. Above it, mark the quartiles with vertical lines and join them to form a box. Then draw lines (whiskers) from the box out to the smallest and largest observations that are not potential outliers, and plot each potential outlier, here 66, as a separate point.

The quantitative data set under consideration has roughly a bell-shaped distribution. Apply the empirical rule: a quantitative data set has mean 15 and standard deviation 3. Approximately what percentage of the observations lie between 12 and 18?

x̅ = 15, s = 3, lower = 12, upper = 18. 18 = x̅ + ks gives 18 = 15 + 3k, so k = 1. Approximately 68% of the observations lie between 12 and 18.

The quantitative data set under consideration has roughly a bell-shaped distribution. Apply the empirical rule: a quantitative data set of size 80 has mean 35 and standard deviation 3. Approximately how many observations lie between 26 and 44?

x̅ = 35, s = 3, lower = 26, upper = 44. 44 = x̅ + ks gives 44 = 35 + 3k, so k = 3. Approximately 99.7% of the observations lie between 26 and 44. The size is 80, so the number of observations is 80 × 0.997 ≈ 80; approximately 80 observations lie between 26 and 44.

Summation notation form

∑xi

Example for summation notation: x1=88,x2=75,x3=95,x4=100, find the sum using the summation notation

∑xi = x1+x2+x3+x4= 88+75+95+100 = 358

Indices for summation notation form

∑xi from i equals 1 to n, where n goes on top and i=1 goes on bottom.

Modality for unimodal

A distribution that has one peak, so the graph shows only one peak.

Chebyshev's Rule

For any quantitative data set and any real number k greater than or equal to 1, at least 1-1/k^2 of the observations lie within k standard deviations to either side of the mean, that is, between x̅ - k*s and x̅ + k*s. Two cases of Chebyshev's rule show up frequently. k=2: at least 75% of the observations in any data set lie within two standard deviations to either side of the mean, that is, between x̅ - 2s and x̅ + 2s. k=3: at least 89% of the observations in any data set lie within three standard deviations to either side of the mean, that is, between x̅ - 3s and x̅ + 3s.

Example: Human beings have one of four blood types: A, B, AB, or O. What kind of data do you receive when you are told your blood type?

Qualitative Data

Example of chebyshevs rule: when k = 2...

1-1/k^2 = 1-1/2^2 = 1-1/4= 3/4= 0.75 or 75% Therefore when k=2 at least 75% of the observation lie within two standard deviations to either side of the mean.

Example of quartiles: A sample of 20 people yields the weekly viewing times, in hours, of television. Find the quartiles of this data set: 25 41 27 32 43 66 35 31 15 5 34 26 32 38 16 30 38 30 20 21. 1. Find the median (Q2). 2. Find the first quartile (Q1). 3. Find the third quartile (Q3).

1. In order, the data are 5 15 16 20 21 25 26 27 30 30 31 32 32 34 35 38 38 41 43 66. The two middle values are 30 and 31, so Q2 = (30+31)/2 = 30.5. 2. The bottom half is 5 15 16 20 21 25 26 27 30 30, whose two middle values are 21 and 25, so Q1 = (21+25)/2 = 23. 3. The top half is 31 32 32 34 35 38 38 41 43 66, whose two middle values are 35 and 38, so Q3 = (35+38)/2 = 36.5.

Bar Chart

A bar chart displays the distinct values of the qualitative data on a horizontal axis and the relative frequencies of those values on a vertical axis. The relative frequencies are on the y-axis and the names of the categories are on the x-axis.

Variables

A characteristic that varies from one person or thing to another

Symmetry

A distribution that can be divided into two pieces that are mirror images of one another is said to be symmetric. Three common symmetric shapes are bell shaped, triangular, and uniform (rectangular). The bell-shaped and triangular distributions are unimodal.

Frequency Distribution

A frequency distribution of qualitative data is a listing of the distinct values and their frequencies (or counts). Procedure: List the distinct values of the observations in the data set in the first column of a table. For each observation, place a tally mark in the second column next to its value (for example, D, R, or O). Count the tallies for each distinct value and record the totals in the third column of the table to find the frequencies.

Modality

When describing the shape of a distribution, a key feature is its number of peaks.

Measure of variation

A measure of how much a collection of data is spread out. Commonly used types include range and quartiles or standard deviation. (Also known as spread or dispersion.)

Pie chart

A pie chart is a disk divided into wedge-shaped pieces proportional to the relative frequencies of the qualitative data. Procedure: Obtain a relative frequency distribution of the data, divide a disk into wedge-shaped pieces proportional to those relative frequencies, and label the slices with the distinct values and their relative frequencies. For the labels, write the relative frequencies as percentages.

Percentile

A point on a ranking scale of 0 to 100. The 50th percentile is the midpoint; half the people in the population being studied rank higher and half rank lower.

Continuous Variable

A quantitative variable whose possible values form some interval of numbers.

Discrete Variable

A quantitative variable whose possible values can be listed; in particular, a quantitative variable with only a finite number of possible values.

Relative frequency

A relative frequency distribution of qualitative data is a listing of the distinct values and their relative frequencies. The relative frequency of a value is the ratio of its frequency to the total number of observations.

Simple Random Sample

A sample obtained by simple random sampling

Example for Systematic Random Sampling

To select a sample of 15 students from a population of 728 students, calculate 728/15 ≈ 48.5 and round down, so m = 48. Suppose the random-number table then yields k = 22 (a number between 1 and 48). The sample consists of the students numbered 22, 70, 118, and so on, adding 48 each time.

Example: The members of a population have been numbered 1-50. A sample of size 20 is to be taken from the population using cluster sampling. The cluster size is 10, where cluster #1 is numbered 1-10, cluster #2 is numbered 11-20, and so forth. A. What are the clusters taken from the population? B. Suppose clusters #1 and #3 are selected from the clusters determined in part A. List all of the members of the sample.

A. 1-10,11-20,21-30,31-40,41-50 B. 1,2,3,4,5,6,7,8,9,10,21,22,23,24,25,26,27,28,29,30

Example: The members of a population have been numbered 1-337. A sample of size 5 is to be taken from the population using systematic random sampling. Random-number table: 13881 51320 96418 98103 32808 83902 A. Use the random-number table to obtain the sample: start at the two-digit number at the top left, read down the column, up the next, and so on. B. Suppose instead that the random number chosen from the random-number table is 58 (that is, k = 58). What is the sample?

A. Divide the population size by the sample size: 337/5 = 67.4, rounded down to m = 67. k must be between 1 and 67; the first two-digit number in the random-number table is 13, so k = 13. The sample size is 5, so start at 13 and keep adding 67 to get the sample 13, 80, 147, 214, 281. B. The procedure is the same but starts at 58. Adding 67 each time gives 58, 125, 192, 259, 326.
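
The selection step k, k+m, k+2m, ... can be sketched in Python. The function name `systematic_sample` is hypothetical, and the starting number k is passed in (rather than drawn at random) so that the results match the worked answers.

```python
def systematic_sample(population_size, sample_size, k):
    """Members numbered k, k+m, k+2m, ... where m = population size // sample size."""
    m = population_size // sample_size  # divide and round down
    if not 1 <= k <= m:
        raise ValueError("k must be between 1 and m")
    return [k + i * m for i in range(sample_size)]


print(systematic_sample(337, 5, 13))
print(systematic_sample(337, 5, 58))
```

In practice k would come from a random-number table or a call such as `random.randint(1, m)`.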

Example: The top five state officials of Oklahoma are as shown. Consider these five officials a population of interest. Governor (G), Lieutenant Governor (L), Secretary of State (S), Attorney General (A), Treasurer (T). A. List the possible samples of two officials from this population of five officials. B. Describe a method for obtaining a simple random sample of two officials from this population. C. For the sampling method from part B, what are the chances that any particular sample of two officials will be selected?

Answers: A. GL, GS, GA, GT, LS, LA, LT, SA, ST, AT B. One possible way to obtain a simple random sample is to write each official's name on a separate piece of paper and draw two out of a box without looking. C. There are 10 possible samples and each is equally likely to be chosen, so the chances are 1/10, or 10%.
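
The list of possible samples in part A can be generated with Python's standard library. This is a sketch; the one-letter labels follow the card, and `random.sample` stands in for drawing slips of paper from a box.

```python
import random
from fractions import Fraction
from itertools import combinations

officials = ["G", "L", "S", "A", "T"]

# A. Every possible sample of two officials (order does not matter)
samples = ["".join(pair) for pair in combinations(officials, 2)]
print(samples)

# B. One way to draw a simple random sample of size 2
drawn = random.sample(officials, 2)
print(drawn)

# C. Each possible sample is equally likely
chance = Fraction(1, len(samples))
print(chance)
```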

Designed experiment

In a designed experiment, researchers impose treatments and controls and then observe characteristics and take measurements. The data do not exist until someone does something to produce them.

Example: The US collects data on household size and publishes the information in Current Population Reports. What kind of data is the number of people in your household?

Discrete, Quantitative Data

Example for lower, upper, and mark for limit grouping:

For example, take the class 50-59 from the other example. The lower limit is 50 and the upper limit is 59. The width is 60-50 = 10 and the mark is (50+59)/2 = 54.5.

Observational Study

In an observational study, researchers simply observe characteristics and take measurements, as in a simple survey. The researcher observes data that already exist.

Systematic Random Sampling

Procedure: Divide the population size by the sample size and round the result down to the nearest whole number, m. Use a random-number table or a similar device to obtain a number k between 1 and m. Select for the sample those members of the population that are numbered k, k+m, k+2m, ...

Upper limit

Q3+1.5(IQR)

Example for sampling: The members of a population are numbered 1-50. A simple random sample of size 6 is to be taken from the population. Start at the two-digit number in line number 3 and column numbers 20-21 of the random-number table, read down the column, then go back up the next two-digit column. Select a simple random sample of 6 subjects from the population numbered 1-50.

Going down the column in the example table, mark each two-digit number yes or no: any number between 1 and 50 is accepted and goes into the simple random sample. In the picture, the first two-digit number, in line number 3 and columns 20-21, is 81, which is not between 1 and 50, so it is skipped. Keep going down until you find numbers between 1 and 50, then move to the next two-digit column, until the SRS has 6 members.

Example of dot plot: Before purchasing a new DVD player, a shopper researched different brands and found the following 16 prices of DVD players. Construct a dot plot of the data.

The prices in the table range from 190 to 230, so put that range on the horizontal axis, then place a dot above the axis at each observed price and see what the plot shows.

Example of median: Determine the median. 300,500,400,600,700

Put the data in order from least to greatest: 300, 400, 500, 600, 700. The median is 500 because it is the middle value of the ordered data set.

Example for Cluster Sampling

The 300 members of a population have been divided into clusters of equal size 20. Use cluster sampling to obtain a sample of size 60 from the population. 1. First observe that there are 300/20 = 15 clusters. We give numbers 1-20 to the members of cluster #1, 21-40 to the members of cluster #2, and so on. 2. Since we need a sample of size 60, we need 60/20 = 3 clusters. We use a random-number table to obtain clusters #3, #4, and #10. 3. The sample we obtain is then the members numbered 41-60, 61-80, and 181-200.
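A rough Python sketch of this example follows. The helper names `cluster_members` and `cluster_sample` are hypothetical, and `random.sample` stands in for the random-number table, so the clusters it picks will generally differ from #3, #4, and #10.

```python
import random

def cluster_members(cluster_number, cluster_size):
    """Member numbers in a cluster, e.g. cluster #3 of size 20 -> 41..60."""
    first = (cluster_number - 1) * cluster_size + 1
    return list(range(first, first + cluster_size))

def cluster_sample(population_size, cluster_size, sample_size, rng=None):
    """Pick whole clusters at random and return (clusters, members)."""
    rng = rng or random.Random()
    n_clusters = population_size // cluster_size      # 300 / 20 = 15
    clusters_needed = sample_size // cluster_size     # 60 / 20 = 3
    chosen = sorted(rng.sample(range(1, n_clusters + 1), clusters_needed))
    members = []
    for c in chosen:
        members.extend(cluster_members(c, cluster_size))
    return chosen, members

# The clusters from the worked example map to these member numbers.
for c in (3, 4, 10):
    m = cluster_members(c, 20)
    print(f"cluster #{c}: {m[0]}-{m[-1]}")
```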

The Range

The range of a data set is given by the formula Range = Max - Min, where Max and Min denote the maximum and minimum observations.

Empirical rule

The rule gives the approximate percentage of observations within 1 standard deviation of the mean (68%, between x̅-s and x̅+s), within 2 standard deviations (95%, between x̅-2s and x̅+2s), and within 3 standard deviations (99.7%, between x̅-3s and x̅+3s). It applies when the histogram is well approximated by a normal, bell-shaped curve.
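As a small illustration, the three intervals can be computed from a sample mean and standard deviation. The function name `empirical_rule_intervals` and the values mean = 100, s = 10 are made up for this sketch and do not come from the course examples.

```python
def empirical_rule_intervals(mean, s):
    """Return (percent, low, high) for 1, 2, and 3 standard deviations."""
    percents = {1: 68, 2: 95, 3: 99.7}
    return [(pct, mean - k * s, mean + k * s) for k, pct in percents.items()]

# Hypothetical data summary: mean 100, standard deviation 10.
for pct, low, high in empirical_rule_intervals(100, 10):
    print(f"about {pct}% of observations lie between {low} and {high}")
```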

Example of lower cutpoint, upper cutpoint, and mark

The lower cutpoint is the smallest value that could go in a class, the upper cutpoint is the smallest value that could go in the next higher class, and the mark is the average of the two cutpoints of a class. So if the lower cutpoint is 160 and the upper cutpoint is 180, the width is 180 - 160 = 20 and the mark is (160 + 180)/2 = 170.

Example of stem and leaf diagram

Use the first digit of each number as its stem and write the stems to the left of a vertical line. For example, with the numbers 70, 62, 77, 57, 54, and 44, the stems in order are 4, 5, 6, 7. Then write the second digit of each number as a leaf to the right of the line, in order horizontally next to its stem, so that each stem and leaf together give back the original number. The diagram then shows 44, 54, 57, 62, 70, 77, read down the vertical line.
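The construction can be sketched in Python for the six numbers in this example. The function name `stem_and_leaf` is hypothetical, and the sketch assumes non-negative whole numbers whose last digit is the leaf.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (leading digits) to its sorted list of leaves (last digit)."""
    stems = defaultdict(list)
    for v in sorted(values):
        stems[v // 10].append(v % 10)   # e.g. 57 -> stem 5, leaf 7
    return dict(stems)

data = [70, 62, 77, 57, 54, 44]
for stem, leaves in sorted(stem_and_leaf(data).items()):
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
# 4 | 4
# 5 | 4 7
# 6 | 2
# 7 | 0 7
```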

summation notation (sigma notation)

a concise way to express the sum of a set of numbers

Example of Limit Grouping: The following table displays the number of days to maturity for 40 short-term investments. Use limit grouping, with grouping by 10s, to organize these data into frequency and relative frequency distributions.

The data run from 30 to 99, so the classes for days to maturity are 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99. The frequencies are 3, 1, 8, 10, 7, 7, 4, and the total is 40. Each relative frequency is the class frequency divided by the total, 40. For the class 30-39 the frequency is 3, so the relative frequency is 3/40 = 0.075.
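Using the class frequencies stated in this example, the relative frequencies can be checked with a few lines of Python. The raw table of 40 observations is not reproduced here, so the sketch starts from the frequencies; the variable names are illustrative.

```python
classes = ["30-39", "40-49", "50-59", "60-69", "70-79", "80-89", "90-99"]
frequencies = [3, 1, 8, 10, 7, 7, 4]

total = sum(frequencies)                      # 40 investments in all
relative = [f / total for f in frequencies]   # frequency / total

for cls, f, rf in zip(classes, frequencies, relative):
    print(f"{cls}: frequency {f}, relative frequency {rf:.3f}")
print(f"total: {total}, relative frequencies sum to {sum(relative):.3f}")
```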

Example of Cutpoint Grouping: The following table consists of the weights of a sample of 18-to-24-year-old males. Use cutpoint grouping to organize the data into frequency and relative frequency distributions. Use a class width of 20 and a first cutpoint of 120.

The table shows weights with decimals. The first class is 120-under 140, because the first cutpoint is 120 and the width is 20. The classes keep going in steps of 20 until they stop at 280, where the data in the table end. The total frequency is 37. Each relative frequency is the class frequency divided by the total frequency, and the relative frequencies add up to 0.999, slightly under 1 because each one is rounded.

Shape of distribution

There are three general aspects of the shape of a distribution: modality, symmetry, and skewness.

Example of relative frequency

There are three groups: Democrat, Republican, and Other. There are 13 Democrats, 18 Republicans, and 9 Other, for a total of 40. The relative frequency (RF) for Democrat is 13/40 = 0.325, the RF for Republican is 18/40 = 0.450, and the RF for Other is 9/40 = 0.225. The total of the RFs is 1.000.

Grouping Data

To organize quantitative data, we first group the observations into classes and then treat the classes as the distinct values of qualitative data.

Limit Grouping

Use when the data are expressed as whole numbers and there are too many distinct values to employ single-value grouping. Uses a lower and upper limit.

Histogram (relative frequency histogram)

Uses adjacent bars to show the distribution of a quantitative variable. Each bar represents the frequency (or relative frequency) of values falling in each bin.

Quartiles

Values that divide a data set into four equal parts. The first quartile (Q1) is the median of the bottom half of the data set. The second quartile (Q2) is the median of the entire data set. The third quartile (Q3) is the median of the top half of the data set.
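A minimal Python sketch of the median-of-halves definition follows, together with the upper limit Q3+1.5(IQR), where IQR = Q3 - Q1. Two things are assumptions here: the data set is made up for illustration, and when the number of observations is odd the sketch leaves the middle value out of both halves (some textbooks include it instead, which can change Q1 and Q3).

```python
def median(values):
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median of each half of the data."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    bottom = ordered[:half]
    # Assumption: for odd n, the middle value belongs to neither half.
    top = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(bottom), median(ordered), median(top)

data = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical data set
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
upper_limit = q3 + 1.5 * iqr
print(q1, q2, q3, iqr, upper_limit)      # 2.5 4.5 6.5 4.0 12.5
```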

The Sample Standard Deviation

The square root of the sample variance: s = √(∑(x-x̅)^2/(n-1)), where n is the sample size and x̅ is the sample mean.

sample data/sample distribution

the values of a variable for a sample of the population

Population data/ population distribution

the values of a variable for the entire population

The sum of squared deviations

The total, over all the observations, of each observation's squared difference from the mean. Square each deviation x-x̅ to get (x-x̅)^2, then add the squares: ∑(x-x̅)^2.

Cutpoint grouping

use when the data are continuous and are expressed with decimals

Xi for summation

In summation notation, xi represents the i-th observation of the variable x.

Example of sample standard deviation: The heights in inches of 5 players are 67, 72, 76, 76, 84. Determine the sample standard deviation.

x̅ = ∑xi/n = (67+72+76+76+84)/5 = 75. Deviations x-x̅: 67-75 = -8, 72-75 = -3, 76-75 = 1, 76-75 = 1, 84-75 = 9. Squared deviations (x-x̅)^2: (-8)^2 = 64, (-3)^2 = 9, 1^2 = 1, 1^2 = 1, 9^2 = 81. s = √(∑(x-x̅)^2/(n-1)) = √(156/(5-1)) = √39 ≈ 6.2
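The calculation can be checked step by step in Python, using the heights given in the question (67, 72, 76, 76, 84). The variable names are illustrative only.

```python
import math

heights = [67, 72, 76, 76, 84]
n = len(heights)

mean = sum(heights) / n                          # 375 / 5 = 75.0
deviations = [x - mean for x in heights]         # x - mean for each player
squared = [d ** 2 for d in deviations]           # (x - mean)^2
sum_of_squares = sum(squared)                    # 64 + 9 + 1 + 1 + 81 = 156.0

s = math.sqrt(sum_of_squares / (n - 1))          # sqrt(156 / 4) = sqrt(39)
print(round(s, 1))                               # 6.2
```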

Example of sum of squared deviations: The heights in inches of 5 players are 72, 73, 76, 76, and 78. Find the deviations from the mean, x-x̅, and the squared deviations, (x-x̅)^2.

x̅ = ∑xi/n = (72+73+76+76+78)/5 = 75. Deviations from the mean, x-x̅: 72-75 = -3, 73-75 = -2, 76-75 = 1, 76-75 = 1, 78-75 = 3. Squared deviations, (x-x̅)^2: (-3)^2 = 9, (-2)^2 = 4, 1^2 = 1, 1^2 = 1, 3^2 = 9. The total of the squared deviations is 24.

Example of sample variance: Determine the sample variance of the players' heights. The heights in inches of 5 players are 72, 73, 76, 76, and 78.

x̅ = ∑xi/n = (72+73+76+76+78)/5 = 75. Deviations from the mean, x-x̅: 72-75 = -3, 73-75 = -2, 76-75 = 1, 76-75 = 1, 78-75 = 3. Squared deviations, (x-x̅)^2: (-3)^2 = 9, (-2)^2 = 4, 1^2 = 1, 1^2 = 1, 3^2 = 9. The total of the squared deviations is 24. s^2 = ∑(xi-x̅)^2/(n-1) = 24/(5-1) = 6
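As a cross-check, Python's standard `statistics` module computes the sample variance with the same n-1 divisor. The sketch below compares it with the hand calculation; the variable names are illustrative.

```python
import statistics

heights = [72, 73, 76, 76, 78]

mean = statistics.mean(heights)                        # 75
sum_of_squares = sum((x - mean) ** 2 for x in heights) # 9 + 4 + 1 + 1 + 9 = 24
by_hand = sum_of_squares / (len(heights) - 1)          # 24 / 4 = 6.0

# statistics.variance is the sample variance (divides by n - 1).
print(by_hand, statistics.variance(heights))
```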

Sample Mean

Written x̅ and read "x bar". Symbolically, x̅ = ∑xi/n, where n is the sample size.

Indexed summation notation: x1 = 88, x2 = 75, x3 = 95, x4 = 100. Find the sum using summation notation.

∑xi for i = 1 to n = 4: x1+x2+x3+x4 = 88+75+95+100 = 358

