INTRO TO STATS EXAM # 2
On a 9-question multiple-choice test, where each question has 2 answer choices and every answer is a random guess, what is the probability of getting at least one question wrong?
1 - (1/2)^9
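A quick numeric check of the complement rule used above. This is only a sketch: it assumes every question is answered by an independent random guess with a 1/2 chance of being right, and the variable names are illustrative.

```python
from fractions import Fraction

# Assumption: each question is guessed independently, P(correct) = 1/2.
n_questions = 9
p_correct = Fraction(1, 2)

p_all_correct = p_correct ** n_questions    # (1/2)^9
p_at_least_one_wrong = 1 - p_all_correct    # complement rule

print(p_at_least_one_wrong, float(p_at_least_one_wrong))
```

Using exact fractions keeps the result as 1 - (1/2)^9 without rounding.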
Here is the setup for a non-traditional casino game: You draw a card from a well shuffled full deck and if the card is a king you win $100. The game costs $2 to play and you decide to play the game until you win the $100. Each time you draw a card you pay $2, and if the card is not a king, the card is put back in the deck, and the deck is reshuffled. How much money should you expect to spend on this game?
A complete deck has 52 cards, out of which 4 are kings. So there is a 1 in 13 chance of drawing a king. Thus the expected number of draws it will take to draw a king is 13. So, expected amount of money that will be spent on this game = Expected number of draws*Payment per draw = 13*2 = 26 dollars
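The expected cost can be checked with a small simulation. This is a rough sketch, not part of the original answer: the function name, the seed, and the number of simulated games are arbitrary choices.

```python
import random

P_KING = 4 / 52      # chance of a king on any single draw
COST_PER_DRAW = 2    # dollars paid for every draw

def draws_until_king(rng):
    """Count draws (card replaced and deck reshuffled each time) until a king."""
    draws = 1
    while rng.random() >= P_KING:   # this draw was not a king
        draws += 1
    return draws

# Exact value: the mean of a geometric distribution is 1/p draws.
expected_cost = COST_PER_DRAW / P_KING

# Simulation with a fixed seed (arbitrary) so the run is repeatable.
rng = random.Random(0)
n_games = 100_000
total_draws = sum(draws_until_king(rng) for _ in range(n_games))
simulated_cost = COST_PER_DRAW * total_draws / n_games

print(expected_cost, round(simulated_cost, 2))
```

The simulated average should land close to the exact figure of 26 dollars, though any single run will differ slightly.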
Convenience Sample
A convenience sample is one of the main types of non-probability sampling methods. A convenience sample is made up of people who are easy to reach. Consider the following example. A pollster interviews shoppers at a local mall.
Parameter
A numerically valued attribute of a model. For example, the values of μ and σ in a N(μ, σ) model are parameters.
Simple Random Sample
A sampling procedure for which each possible sample of a given size is equally likely to be the one obtained.
Statistic
A value calculated from data to summarize aspects of the data. For example, the mean and standard deviation are statistics.
Voluntary Response Sample
A voluntary response sample is a sample made up of volunteers. Compared to a random sample, these types of samples are always biased. For example, people who call in to a radio show poll may have strong opinions about a topic in either direction.
Cluster Sample
Cluster samples slice vertically across the layers to obtain clusters, each of which may have parts of the entire population.
Control Group
Control Group. In an experiment, a control group is a baseline group that receives no treatment or a neutral treatment. To assess treatment effects, the experimenter compares results in the treatment group to results in the control group. In the design of experiments, treatments are applied to experimental units in the treatment group. In comparative experiments, members of the complementary group, the control group, receive either no treatment or a standard treatment
What is an experiment?
Data for statistical studies are obtained by conducting either experiments or surveys. Experimental design is the branch of statistics that deals with the design and analysis of experiments. ... In a completely randomized experimental design, the treatments are randomly assigned to the experimental units.
Find the probability P(number less than 4) when a fair die is rolled, given that the number is odd.
E = {1, 2, 3} (number less than 4), so n(E) = 3. O = {1, 3, 5} (odd number), so n(O) = 3. E ∩ O = {1, 3}, so n(E ∩ O) = 2. P(E | O) = n(E ∩ O) / n(O) = 2/3.
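The same conditional probability can be found by listing outcomes. A minimal sketch, assuming all six faces are equally likely. The set names are illustrative.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}                   # fair six-sided die
less_than_4 = {x for x in sample_space if x < 4}    # E = {1, 2, 3}
odd = {x for x in sample_space if x % 2 == 1}       # O = {1, 3, 5}

# With equally likely outcomes, P(E | O) = n(E and O) / n(O).
p_e_given_o = Fraction(len(less_than_4 & odd), len(odd))

print(p_e_given_o)
```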
Event
An event is a subset of the sample space. It can also be defined as a collection of one or more outcomes of an experiment. ... An event with a single outcome is called a simple event, and an event with two or more outcomes is called a compound event.
Factor
Factor. In an experiment, the factor (also called an independent variable) is an explanatory variable manipulated by the experimenter. Each factor has two or more levels (i.e., different values of the factor). Combinations of factor levels are called treatments.
Sample Size
How large a random sample do we need for the sample to be reasonably representative of the population? Most people think that we need a large percentage, or fraction, of the population, but it turns out that all that matters is the number of individuals in the sample. A random sample of 100 students in a college represents the student body just about as well as a random sample of 100 voters represents the entire electorate of the United States. This is the third idea and probably the most surprising one in designing surveys.
The IQs in a certain population are normally distributed with a mean score of 104 and a standard deviation of 14. Find the IQR of the IQ in this population.
IQR = Q3 - Q1, where Q1 and Q3 are the 25th and 75th percentiles. When p = 0.25, z = -0.675. Thus, -0.675 = (Q1 - 104)/14, so Q1 = -0.675*14 + 104 = 94.55. When p = 0.75, z = 0.675. Thus, 0.675 = (Q3 - 104)/14, so Q3 = 0.675*14 + 104 = 113.45. Thus, IQR = 113.45 - 94.55 = 18.9.
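The quartiles can also be computed without a table. This sketch uses Python's standard-library `statistics.NormalDist`. It carries more decimal places for z than the table value 0.675, so the results differ slightly in the second decimal.

```python
from statistics import NormalDist

iq = NormalDist(mu=104, sigma=14)   # population model from the question

q1 = iq.inv_cdf(0.25)   # 25th percentile
q3 = iq.inv_cdf(0.75)   # 75th percentile
iqr = q3 - q1           # interquartile range is a single number

print(round(q1, 2), round(q3, 2), round(iqr, 2))
```

The quartiles come out near 94.55 and 113.45, and their difference is close to 18.9.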
Double Blinding
In "double-blind" trials, neither the patient nor the clinical team (the people involved in the patient's management, the people collecting samples from the patient, and the people analyzing the data) has any knowledge of whether the patient is getting the placebo or not.
what it means for two events to be disjoint
In a single trial, two events are said to be disjoint if they have no outcome in common. Equivalently, events A and B are disjoint (mutually exclusive) if they cannot occur at the same time. That is, disjoint events do not overlap.
Trial
In probability theory, an experiment or trial is any procedure that can be infinitely repeated and has a well-defined set of possible outcomes, known as the sample space. An experiment is said to be random if it has more than one possible outcome, and deterministic if it has only one.
Outcome
In probability theory, an outcome is a possible result of an experiment. Each possible outcome of a particular experiment is unique, and different outcomes are mutually exclusive (only one outcome will occur on each trial of the experiment).
the law of large numbers
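In probability theory, the law of large numbers is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results from a large number of trials should be close to the expected value, and it tends to get closer as more trials are performed.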
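A small simulation can make the law of large numbers concrete. This is an illustrative sketch only: the fair die, the seed, and the checkpoint sizes are assumptions chosen for the example.

```python
import random

rng = random.Random(0)              # fixed seed (arbitrary) for repeatability
checkpoints = (10, 1_000, 100_000)  # sample sizes at which to record the mean

total = 0
running_means = {}
for n in range(1, max(checkpoints) + 1):
    total += rng.randint(1, 6)      # one roll of a fair six-sided die
    if n in checkpoints:
        running_means[n] = total / n

# The expected value of a fair die roll is 3.5.
for n, mean in running_means.items():
    print(n, round(mean, 3))
```

The mean after 10 rolls can be well away from 3.5, while the mean after 100,000 rolls is typically very close to it.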
Response Variable
In statistics and data science, a response variable is a variable whose value depends on that of another, often one that we can't directly control. It is often treated as the output of changes to the explanatory variable, and it is also called the dependent variable (or y variable).
observational study
Like experiments, observational studies attempt to understand cause-and-effect relationships. However, unlike experiments, the researcher is not able to control (1) how subjects are assigned to groups and/or (2) which treatments each group receives. ... Therefore, a sample survey is an example of an observational study.
A group of people were asked if they had run a red light in the last year. 134 responded "yes", and 259 responded "no". Find the probability that if a person is chosen at random, they have run a red light in the last year.
Number of persons = 134 + 259 = 393. P(yes) = 134/393 ≈ 0.341. So the probability that a randomly chosen person has run a red light in the last year is about 0.341.
If you draw one card from a standard 52-card deck, what is the probability of drawing a 10?
P(10) = 4/52 = 1/13 ≈ 0.0769
the multiplication rule for independent events
P(A and B) = P(A) · P(B)
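The rule can be checked by enumeration on a simple case. A minimal sketch, assuming two tosses of a fair coin with A = "heads on the first toss" and B = "tails on the second toss". The helper `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # 4 equally likely outcomes

def prob(event):
    """Probability of an event given as a true/false test on an outcome."""
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

p_a = prob(lambda o: o[0] == "H")                       # heads first
p_b = prob(lambda o: o[1] == "T")                       # tails second
p_a_and_b = prob(lambda o: o[0] == "H" and o[1] == "T")

print(p_a_and_b, p_a * p_b)
```

Both printed values agree because the two tosses do not influence each other.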
For any event A, what is the probability of the event A occurring?
P(A) is a number between 0 and 1, inclusive.
The Statistics Aversion Scale (SAS) is normally distributed and has a population mean of 500 and a population standard deviation of 20. Compute the z-score and locate the percentile rank in the normal curve table for each of the following scores.
P(Z < -2) = 0.0228 (2.28%) (check standard normal table). P(Z < 2) = 0.9772 (97.72%) (check standard normal table).
If you draw one card from a standard 52-card deck, what is the probability of NOT drawing a 6?
P(not 6) = 1 - 4/52 = 48/52 = 12/13 ≈ 0.9231
A poll showed that 62.2% of Americans say they believe that some people see the future in their dreams. What is the probability of randomly selecting someone who does not believe that some people see the future in their dreams?
P(someone does not believe that some people see the future in their dreams) = 1 - P(someone believes that some people see the future in their dreams) = 1 - 0.622 = 0.378
Placebo
Placebo. In an experiment, subjects respond differently after they receive a treatment, even if the treatment is neutral. A neutral treatment that has no "real" effect on the dependent variable is called a placebo, and a subject's positive response to a placebo is called the placebo effect.
Placebo Effect
Placebo. In an experiment, subjects respond differently after they receive a treatment, even if the treatment is neutral. A neutral treatment that has no "real" effect on the dependent variable is called a placebo, and a subject's positive response to a placebo is called the placebo effect.
Random assignment
Random assignment or random placement is an experimental technique for assigning human participants or animal subjects to different groups in an experiment (e.g., a treatment group versus a control group) using randomization, such as by a chance procedure (e.g., flipping a coin) or a random number generator.
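Random assignment with a random number generator might look like the sketch below. The function name `randomly_assign`, the subject labels, and the seed are all hypothetical, and an even split into two groups is just one possible design.

```python
import random

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into treatment and control halves."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

subjects = [f"subject_{i}" for i in range(1, 11)]   # hypothetical participants
treatment, control = randomly_assign(subjects, seed=42)

print("treatment:", treatment)
print("control:  ", control)
```

Because chance alone decides the groups, neither the experimenter nor the subjects can influence who receives the treatment.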
Bias
Selecting a sample to represent the population fairly is more difficult than it sounds. Polls or surveys most often fail because they use a sampling method that tends to over- or underrepresent parts of the population. The method may overlook subgroups that are harder to find (such as the homeless) or favor others (such as Internet users who like to respond to online surveys). Sampling methods that, by their nature, tend to over- or underemphasize some characteristics of the population are said to be biased. Bias is the bane of sampling—the one thing above all to avoid. Conclusions based on samples drawn with biased methods are inherently flawed. There is usually no way to fix bias after the sample is drawn and no way to salvage useful information from it.
In a study, the sample is chosen by putting people's names on a dartboard and blindly throwing darts. What is the sampling method?
Simple Random
In a study, the sample is chosen by writing everyone's name on a playing card, shuffling the deck, then choosing the top 20 cards. What is the sampling method?
Simple random
Systematic Sample
Some samples select individuals systematically. For example, you might survey every 10th person on an alphabetical list of students. To make it random, you still must start the systematic selection from a randomly selected individual. When the order of the list is not associated in any way with the responses sought, systematic sampling can give a representative sample. ex. In a study, the sample is chosen by surveying every 3rd driver coming through a tollbooth.
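A systematic sample with a random starting point could be sketched as follows. The function name, the list of students, and the seed are hypothetical and only there to make the example runnable.

```python
import random

def systematic_sample(items, k, seed=None):
    """Take every k-th item, starting from a randomly chosen position in the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)    # random start keeps the selection random
    return items[start::k]

# Hypothetical alphabetical class list of 60 students.
students = [f"student_{i:02d}" for i in range(1, 61)]
sample = systematic_sample(students, k=10, seed=1)

print(sample)
```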
In a study, the sample is chosen by dividing the population by gender and choosing 30 people of each gender. What is the sampling method?
Stratified
Stratified Sample
Stratified samples represent the population by drawing some from each layer, reducing variability in the results that could arise because of the differences among the layers.
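Drawing the same number of individuals from each layer can be sketched as below. The population, the stratum labels, and the function name `stratified_sample` are invented for illustration. Real designs often sample each stratum in proportion to its size instead.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    strata = {}
    for unit in population:
        strata.setdefault(stratum_of(unit), []).append(unit)
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 200 people, each tagged with one of two strata.
population = [(i, "group_a" if i % 2 == 0 else "group_b") for i in range(200)]
sample = stratified_sample(population, stratum_of=lambda p: p[1],
                           per_stratum=30, seed=7)

for name, members in sample.items():
    print(name, len(members))
```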
Why we might choose an observational study rather than performing an experiment.
The researcher has no control over the variables in an observational study. An experiment is a method of applying treatments to a group and recording the effects. Remember, a good group experiment will have two basic elements: a control and a treatment. The study to determine the relation between smoking and lung cancer is a typical example of an observational study. ... 1. The main difference between an observational study and an experiment is in the way the observation is done. 2. In an experiment, the researcher will undertake some experiment and not just make observations.
Sample Space events.
The set of all the possible outcomes is called the sample space of the experiment and is usually denoted by S. Any subset E of the sample space S is called an event. Here are some examples. Example 1 Tossing a coin. The sample space is S = {H, T}.
Treatment
Treatment. In an experiment, the factor (also called an independent variable) is an explanatory variable manipulated by the experimenter. Each factor has two or more levels, i.e., different values of the factor. Combinations of factor levels are called treatments.
the general addition rule
Two events: P(A ∪ B) = P(A) + P(B) - P(A ∩ B)
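The addition rule can be verified by counting cards. A sketch under assumed events: A = "the card is a king" and B = "the card is a heart", drawn from a standard 52-card deck. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["hearts", "diamonds", "clubs", "spades"]
deck = list(product(ranks, suits))   # 52 equally likely cards

def prob(event):
    """Probability of an event given as a true/false test on a card."""
    return Fraction(sum(1 for card in deck if event(card)), len(deck))

def is_king(card):
    return card[0] == "K"

def is_heart(card):
    return card[1] == "hearts"

# Left side: count the union directly.
p_union = prob(lambda c: is_king(c) or is_heart(c))
# Right side: add the two probabilities, then subtract the overlap
# (the king of hearts would otherwise be counted twice).
p_by_rule = prob(is_king) + prob(is_heart) - prob(lambda c: is_king(c) and is_heart(c))

print(p_union, p_by_rule)
```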
Sample
a smaller group of individuals, selected from the population.
a) Subjects, participants b) Experimental units
a) Individuals who take part in research. b) The experimental unit is the physical entity which can be assigned, at random, to a treatment. Commonly it is an individual animal. The experimental unit is also the unit of statistical analysis. However, any two experimental units must be capable of receiving different treatments.
Single Blinding
Of or relating to an experiment or clinical trial in which the researchers but not the subjects know which subjects are receiving the active medication or treatment and which are not: a technique for eliminating subjective bias, such as the placebo effect, from the test results.
Population
an entire group of individuals
The concentration of SO_2 (an air pollutant) for a particular day and city is normally distributed with a mean of 0.04 ppm and standard deviation of 0.01. Clean air standards require that daily SO_2 concentration does not exceed 0.06 ppm. 1) What is the Z-score associated with the concentration 0.06 ppm? 2) What is the probability that the clean air standard is violated?
Since z = (X - mean)/std dev: 1) For 0.06 ppm, Z = (0.06 - 0.04)/0.01 = 2. 2) The clean air standard is violated when X > 0.06, so P(X > 0.06) = P(Z > 2) = 1 - P(Z < 2) = 1 - 0.97725 = 0.02275.
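Both parts can be checked with the standard library instead of a z-table. A minimal sketch using `statistics.NormalDist`, with the mean 0.04 ppm and standard deviation 0.01 ppm taken from the question. The variable names are illustrative.

```python
from statistics import NormalDist

so2 = NormalDist(mu=0.04, sigma=0.01)   # daily SO_2 concentration in ppm
limit = 0.06                            # clean air standard in ppm

z = (limit - so2.mean) / so2.stdev      # z-score of the limit
p_violation = 1 - so2.cdf(limit)        # P(X > 0.06) = P(Z > 2)

print(round(z, 2), round(p_violation, 5))
```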
"at least one."
complementary probabilities.
The concentration of SO_2 (an air pollutant) for a particular day and city is normally distributed with a mean of 0.04 ppm and standard deviation of 0.01. Clean air standards require that daily SO_2 concentration does not exceed 0.06 ppm. What is the SO_2 concentration associated with the Z-score for the 90th percentile? What is the concentration associated with a Z-score of -2.04?
For the 90th percentile, SO2 concentration = mean + z*std deviation = 0.04 + 1.281552*0.01 = 0.052816 ppm. For a Z-score of -2.04, SO2 concentration = 0.04 + (-2.04)*0.01 = 0.0196 ppm.
The concentration of SO_2 (an air pollutant) for a particular day and city is normally distributed with a mean of 0.04 ppm and standard deviation of 0.01. Clean air standards require that daily SO_2 concentration does not exceed 0.06 ppm. Using a z-table, find the Z-scores associated with the a) 90th percentile, b) 95th percentile.
a) For the 90th percentile, z = 1.281552. b) For the 95th percentile, z = 1.644854.
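The percentile z-scores, and the concentrations they correspond to, can be reproduced with `statistics.NormalDist.inv_cdf`. This is a sketch only, and the variable names are illustrative.

```python
from statistics import NormalDist

standard = NormalDist()        # standard normal: mean 0, std dev 1
mean, std_dev = 0.04, 0.01     # SO_2 model from the question, in ppm

z_90 = standard.inv_cdf(0.90)  # z-score of the 90th percentile
z_95 = standard.inv_cdf(0.95)  # z-score of the 95th percentile

# Convert each z-score back to a concentration: x = mean + z * std_dev
conc_90 = mean + z_90 * std_dev
conc_95 = mean + z_95 * std_dev

print(round(z_90, 6), round(z_95, 6))
print(round(conc_90, 6), round(conc_95, 6))
```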
Suppose that 5 independent trials, each of which results in any of the outcomes 0, 1, or 2, with respective probabilities 0.3, 0.5, and 0.2, are performed. Find the probability that both outcome 1 and outcome 2 occur at least once. (Hint: Consider the complementary probability.)
Let A be the event that outcome 1 never occurs and B the event that outcome 2 never occurs in the 5 trials. P(both outcome 1 and outcome 2 occur at least once) = 1 - P(A or B) = 1 - [P(A) + P(B) - P(A and B)]. P(A) = (1 - 0.5)^5 = 0.03125, P(B) = (1 - 0.2)^5 = 0.32768, and P(A and B) = P(all five trials give outcome 0) = 0.3^5 = 0.00243. So the probability is 1 - 0.03125 - 0.32768 + 0.00243 = 0.6435.
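One way to evaluate this probability is to apply the complement together with the addition rule, then confirm the number by brute force over all 3^5 = 243 possible sequences. A hedged sketch: the variable names are illustrative.

```python
from itertools import product
from math import prod

probs = {0: 0.3, 1: 0.5, 2: 0.2}   # outcome -> probability on each trial
n_trials = 5

# Brute force: add up the probability of every sequence of 5 outcomes
# that contains at least one 1 and at least one 2.
brute_force = sum(
    prod(probs[outcome] for outcome in seq)
    for seq in product(probs, repeat=n_trials)
    if 1 in seq and 2 in seq
)

# Complement: 1 - P(no 1s) - P(no 2s) + P(neither 1s nor 2s, i.e. all 0s)
by_complement = 1 - 0.5**n_trials - 0.8**n_trials + 0.3**n_trials

print(round(brute_force, 4), round(by_complement, 4))
```

Both approaches give about 0.6435.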
The concentration of SO_2 (an air pollutant) for a particular day and city is normally distributed with a mean of 0.04 ppm and standard deviation of 0.01. Clean air standards require that daily SO_2 concentration does not exceed 0.06 ppm. What is the SO_2 concentration associated with the Z-score for the 95th percentile?
For the 95th percentile, SO2 concentration = mean + z*std deviation = 0.04 + 1.644854*0.01 = 0.056449 ppm.
every sixth student on the class list
systematic
How we compute complementary probabilities
The probability of an event and the probability of its complement must add to one, so P(not A) = 1 - P(A). For example, P(outcome 1 and outcome 2 each occur at least once) = 1 - P(at least one of them never occurs).
what it means for two events to be independent.
Two events E and F are independent when the probability of E does not depend on whether F occurs. Suppose we flip a fair coin twice. If H is the event "heads on first toss" and T is the event "tails on second toss", then H and T are independent events.
The IQs in a certain population are normally distributed with a mean score of 104 and a standard deviation of 14. Find the 20th percentile of IQ.
z = (x - μ)/σ, where μ = 104 and σ = 14. The 20th percentile corresponds to a probability of 0.2, and the value of z corresponding to probability 0.2 is -0.84 (from the standard normal table). Thus, -0.84 = (x - 104)/14, so x = -0.84*14 + 104 = 92.24.