CIM 250 Midterm Review

mathematical expression of probability

Probability is always expressed as a value between 0 and 1: closer to 0 means less probable; closer to 1 means more probable. This means it is always written as a fraction, decimal, or percent.

Emma's probability of winning a school contest is 0.20. What is her probability of not winning the contest?

0.80

Steps in Hypothesis Testing

1. Formulate the research question and state the null (H0) and alternative (H1) hypotheses. 2. State a significance (alpha) level. 3. Conduct the relevant statistical test and obtain the test statistic, p-value, and/or confidence interval. 4. Use these results to decide whether to reject or fail to reject the null hypothesis.

3 important reasons for random sampling

1. It avoids known and unknown biases on average. 2. It helps convince others that the trial was conducted properly. 3. It is the basis for the statistical theory that underlies hypothesis tests and confidence intervals.

Concern was expressed by the health educators on a particular college campus that students with serum cholesterol levels above the mean level of 195 would be at increased risk of heart disease. If the mean cholesterol level of students is 195, what percentage of students are at risk?

50%

In a normal distribution, about what percent of values lie within one standard deviation of the mean?

68%

Empirical Rule

68% of all data is within one standard deviation of the mean; 95% of all data is within two standard deviations of the mean; 99.7% of all data is within three standard deviations of the mean.
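
The three percentages can be checked numerically. The sketch below assumes a normal distribution and uses the standard identity P(|Z| < k) = erf(k / sqrt(2)); the helper name `fraction_within` is made up for this illustration.

```python
import math


def fraction_within(k: float) -> float:
    """Probability that a normally distributed value lies within
    k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


# Print the share of data within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} SD: {fraction_within(k):.2%}")
```

The printed values are slightly more precise than the rounded 68 / 95 / 99.7 figures on the card.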

Which of the following would indicate a test with the greatest sensitivity?

99.8% [highest number]

Which of the following would indicate a test with the greatest specificity?

99.8% [highest number]

All of the following are examples of inferential studies except:

A calculation of colon cancer incidence rates in the California population.

normal distribution

A function that represents the distribution of variables as a symmetrical bell-shaped graph.

Histogram

A graph of vertical bars representing the frequency distribution of a set of data.

probability

A number that describes how likely it is that an event will occur

cluster sampling

A probability sampling technique in which clusters of participants within the population of interest are selected at random (a simple random sample of groups), followed by data collection from all individuals in each selected cluster.
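
A minimal sketch of cluster sampling, assuming the population is already organized into named groups. The classroom data and the function name `cluster_sample` are hypothetical and only serve the illustration.

```python
import random


def cluster_sample(clusters, n_clusters, rng=None):
    """Pick n_clusters whole groups at random and keep every member of each."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population: 10 classrooms of 20 students each.
classrooms = {
    f"room-{r}": [f"room-{r}-student-{s}" for s in range(20)]
    for r in range(10)
}

# Select 3 classrooms; every student in a selected classroom is surveyed.
sample = cluster_sample(classrooms, 3, random.Random(1))
print({name: len(members) for name, members in sample.items()})
```

The random step happens at the group level only; no sampling takes place inside a chosen cluster.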

Bias occurs when:

A study sample is not representative of the underlying population

stratified sampling

A type of probability sampling in which the population is divided into groups with a common attribute and a random sample is chosen within each group. Used when we wish the sample to represent various subgroups of the population proportionally or to increase the precision of the estimate.

Which of the following are properties of the normal distribution?

ALL OF THE ABOVE: It has the appearance of a symmetrical, bell-shaped curve. It is defined by two parameters: the mean and the standard deviation. The area under the probability curve is always equal to 1 or 100%.

Probability Rule 5

Addition Rule: To determine the probability that one or another event (but not necessarily both) will occur, we use the addition rule. It states that the probability that event A or event B (or both) will occur equals the sum of the probabilities of each individual event minus the probability of both: P(A or B) = P(A) + P(B) - P(A and B). P(A and B) is subtracted because this portion would otherwise be counted twice when the events are not mutually exclusive.
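
As a worked illustration of the addition rule, the sketch below uses one roll of a fair six-sided die. The events (A = "even", B = "greater than 3") are assumptions chosen for the example, not part of the course material.

```python
from fractions import Fraction

outcomes = set(range(1, 7))  # faces of a fair die


def prob(event):
    """Probability of an event, given equally likely outcomes."""
    return Fraction(len(event & outcomes), len(outcomes))


A = {2, 4, 6}  # roll is even
B = {4, 5, 6}  # roll is greater than 3

# Addition rule: subtract the overlap so {4, 6} is not counted twice.
p_a_or_b = prob(A) + prob(B) - prob(A & B)

print(p_a_or_b)     # probability from the addition rule
print(prob(A | B))  # same probability, counted directly from the union
```

Both lines print the same fraction, because the outcomes 4 and 6 belong to both events and are subtracted once.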

In which of the following scenarios would the mode be the most appropriate measure to calculate?

An inventory manager at a store wants to determine the most frequently purchased television model

In interpreting the results of epidemiologic studies, the size of a sample is more important than the way in which it was selected.

FALSE

Measures of central tendency are useful when we are trying to draw conclusions from a larger population and apply them to a sample.

FALSE

Suppose we conduct a clinical trial to test the efficacy of a particular treatment. We find that the mean difference between two groups is 40%, with a 95% confidence interval of (35.0, 42.0). Which of the following interpretations of the confidence interval is true?

If we were to conduct this clinical trial 100 times, in 95 of 100 trials, the mean difference between the treatment groups would be between 35% and 42%.

The mode tumor size in a sample of breast cancer patients is 4 centimeters. Which of these interpretations is correct?

More patients have a tumor size of 4 centimeters than any other tumor size

Probability Rule 4

Multiplication Rule: Two events are independent if the occurrence of one has no effect on the other; for independent events, P(A and B) = P(A) × P(B). The outcomes of coin tosses are independent because the outcome of one toss does not affect another. NOTE: independent and mutually exclusive are NOT the same.
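
The coin-toss case can be checked by listing every outcome of two tosses. This is a small sketch assuming a fair coin; the helper `prob` is made up for the example.

```python
from fractions import Fraction
from itertools import product

# All equally likely outcomes of two tosses: HH, HT, TH, TT.
tosses = list(product("HT", repeat=2))


def prob(event):
    """Share of outcomes for which the event function returns True."""
    favourable = [o for o in tosses if event(o)]
    return Fraction(len(favourable), len(tosses))


p_first_heads = prob(lambda o: o[0] == "H")
p_second_heads = prob(lambda o: o[1] == "H")
p_both_heads = prob(lambda o: o == ("H", "H"))

# For independent events the joint probability equals the product.
print(p_both_heads, p_first_heads * p_second_heads)
```

The two printed fractions match, which is what independence of the tosses implies.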

Suppose we wish to compare treatment efficacy of a drug for two groups in a breast cancer clinical trial. In the treatment group, the drug is effective for 35% of the patients, while in the placebo (control) group, it is effective for 39% of patients. We wish to determine if this difference of 4% is statistically significant or not. We perform a significance test, which yields a p-value of 0.08. If our significance level is set at 0.05, what can we say about the difference?

Since our calculated p-value of 0.08 is greater than our significance level of 0.05, we fail to reject the null hypothesis and conclude that the difference is not statistically significant.

central tendency

Summary measures that describe a whole set of data with a single value around which other values tend to cluster

In simple random sampling, each subject has an equal chance of being selected.

TRUE

Probability is the likelihood that an event will occur.

TRUE

When data are symmetrically distributed, the mean, median and mode are equal.

TRUE

alternative hypothesis

The hypothesis that states there is a difference between two or more sets of data.

relevance

The quality of information that indicates the information makes a difference in a decision.

Probability Rule 2

The sum of the probabilities of all possible outcomes is 1

Which of the following represents information classified on an ordinal scale?

The top five most watched shows on prime time TV

A major advantage of retrospective (case-control) studies is:

They are economical and particularly applicable to the study of rare diseases.

Specificity is calculated according to which of the following?

True negatives / (True negatives + False positives)

Sensitivity is calculated according to which of the following?

True positives / (True positives + False negatives)
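
Both formulas can be sketched as two small functions. The screening counts below are hypothetical numbers invented for the example.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of people with the disease who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Share of people without the disease who test negative."""
    return true_neg / (true_neg + false_pos)


# Hypothetical screening results: 100 diseased, 1,000 non-diseased people.
tp, fn = 90, 10    # diseased: correctly flagged vs. missed
tn, fp = 950, 50   # non-diseased: correctly cleared vs. wrongly flagged

sens = sensitivity(tp, fn)
spec = specificity(tn, fp)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Each denominator contains only people with the same true disease status, which is why the parentheses in the formulas matter.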

The mean is the measure of central tendency best used in which of the following situations?

When data have a relatively symmetric distribution

All of the following are important reasons for random sampling except:

[It decreases the validity of a study]; It is the basis for statistical theory that underlies hypothesis tests and confidence intervals; It helps convince others that the trial/study was conducted properly; It avoids known and unknown biases on average

All of the following statements about probability are true except:

[Probabilities are expressed as values between 1 and 10]; Probability is a numeric expression of uncertainty about an event; The probability of a given event is equal to 1 minus the probability of its complement; The sum of the probabilities of all possible outcomes in a given situation is always equal to 1

All of the following statements are true about hypothesis testing EXCEPT:

[The null hypothesis in hypothesis testing is that there is a significant difference between groups]; Hypothesis testing allows researchers to compare descriptive statistics between two populations or samples; Hypothesis testing allows researchers to make statistical inferences about a population from a sample; Hypothesis testing refers to the formal procedures used to determine the probability that a given hypothesis is true

pie chart

a circular chart divided into wedge-shaped sectors proportional to the percentages of the whole

sampling frame

a complete non-overlapping list of the persons or objects constituting the population

variance

a difference between what is expected and what actually occurs; in statistics, it indicates the spread or dispersion of the data, but is useful in practical terms only for calculating the standard deviation

line graph

a graph that uses one or more lines to show changes in statistics over time or space

bar graph

a graph that uses vertical or horizontal bars to show comparisons among two or more items

frequency distribution

a graphical representation of measurements arranged by the number of times each measurement was made

population

a set of persons (or objects) having a common observable characteristic

statistical significance

a statistical statement of how likely it is that an obtained result occurred by chance

longitudinal study

a study that observes the same participants on many occasions over a long period of time

sample

a subset of a population

To determine the probability that one or another event will occur, we use the ______Rule.

addition

A census is:

an enumeration of an entire population

Probability Rule 1

any probability is a number between 0 and 1

All except which of the following are common elements of a frequency table?

axis

What type of graph would you use if you wanted to display the number of lung cancer cases in 2011 for each of four major race/ethnic groups?

bar chart

continuous variables

can assume an infinite number of values between any two specific values. They are obtained by measuring. They often include fractions and decimals.

Which of the following tests is used to compare categorical outcomes from two samples to determine if the differences are statistically significant?

chi-squared test
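
A minimal sketch of Pearson's chi-squared test for a 2x2 table, written with the standard library only and without a continuity correction. The counts are hypothetical. The p-value line relies on the fact that, with 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).

```python
import math

# Hypothetical 2x2 table: rows are samples, columns are outcome categories.
observed = [
    [30, 70],  # sample 1: improved, not improved
    [45, 55],  # sample 2: improved, not improved
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum of (observed - expected)^2 / expected over all four cells.
chi2 = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi2 += (count - expected) ** 2 / expected

# Upper-tail probability of a chi-squared variable with 1 degree of freedom.
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-squared = {chi2:.2f}, p = {p_value:.3f}")
```

With these made-up counts the p-value falls below 0.05, so at that significance level the difference between the two samples would be called statistically significant.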

false positive

classifying a person as diseased when they actually do not have the disease

false negative

classifying a person as not diseased when they actually do have the disease

Probability Rule 3

complement rule [P(not A)=1-P(A)]
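
The complement rule in one line of arithmetic, using exact fractions to avoid floating-point noise. The example of a contestant with a 0.20 chance of winning is used as the illustration; the function name `complement` is made up.

```python
from fractions import Fraction


def complement(p_event: Fraction) -> Fraction:
    """P(not A) = 1 - P(A)."""
    return 1 - p_event


p_win = Fraction(20, 100)        # P(winning) = 0.20
p_not_win = complement(p_win)    # P(not winning)

print(float(p_not_win))
```

Because an event and its complement cover every possible outcome, the two probabilities always add up to 1.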

inferential statistics

concerned with reaching conclusions from incomplete information (generalizing from the specific); uses information obtained from a sample to say something about the entire population (example: opinion poll)

A study of attitudes about flexible workplace policies is being conducted in a large company. An e-mail survey is distributed to a complete list of employees. Participation in the survey is anonymous and voluntary. This is an example of a:

convenience sample

descriptive statistics

deals with the enumeration, organization and graphical representation of data (example: census)

The area of statistics that describes the data is called ______ statistics:

descriptive

experiments

design a research plan; imposes controls

The specificity of a test refers to the ability of the test to:

detect negative diagnoses among individuals who do not have the disease

The sensitivity of a test refers to the ability of the test to:

detect positive diagnoses among individuals who actually have the disease

ordinal

do have intrinsic order but differences between levels are not relevant, examples: low, medium and high; age ranges

random sample

every subject has an equal chance at being selected

hypothesis testing

formal procedure used to determine the probability that a hypothesis is true

When you want to make a statement about a population using information from a sample, you use________statistics.

inferential

variables

information on specific characteristics

significance level

known as the alpha level, this refers to the probability of rejecting the null hypothesis when the null hypothesis is true

standard deviation

measure of how spread out the observations/data points are from the mean; it is equal to the square root of the variance; a low standard deviation means the values are clustered around the mean, while a high standard deviation indicates that they are spread out.

The survival time from diagnosis until death of four cancer patients was as follows: 8 months, 2 months, 1 month, and 3 months. Which of the following measures of central tendency best describes the distribution of survival times?

median
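
The four survival times can be run through Python's `statistics` module to see why the median is preferred here: the single 8-month value pulls the mean upward.

```python
import statistics

survival_months = [8, 2, 1, 3]  # survival times from the question

mean_months = statistics.mean(survival_months)
median_months = statistics.median(survival_months)

print(f"mean = {mean_months}, median = {median_months}")
```

With an even number of observations, the median is the average of the two middle values (2 and 3).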

prospective studies

members of the cohort are identified before the outcome occurs. Advantage: permit the accurate estimation of disease incidence in a population. Disadvantage: they take a lot of time and they are expensive.

The _____________ Rule of probability is used to determine the probability of occurrence of two independent events.

multiplication

nominal

no intrinsic order, and the differences between levels of the variable have no meaning; examples: sex, race or exposure

The variable eye color can be classified using a ___________ scale.

nominal

The ______is a number that indicates the probability that measures from two samples or groups are similar.

p-value

A characteristic of a population is called a(n):

parameter

All of the following except __________ are ways to graphically display continuous data?

pie charts

A group of healthy teachers is assembled and followed over time in order to determine how many of them develop breast cancer. This is an example of a:

prospective cohort study

systematic sampling

randomly select a first case then proceed by selecting every nth case, where n depends on the desired sample size
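
A minimal sketch of this procedure, assuming the population is available as an ordered list. The function name `systematic_sample` and the numbered population are hypothetical.

```python
import random


def systematic_sample(population, sample_size, rng=None):
    """Pick a random starting case, then take every n-th case after it."""
    rng = rng or random.Random()
    step = len(population) // sample_size  # the "n" in "every nth case"
    if step < 1:
        raise ValueError("sample_size cannot exceed the population size")
    start = rng.randrange(step)  # random first case within the first block
    return [population[start + i * step] for i in range(sample_size)]


# Hypothetical sampling frame of 1,000 numbered cases; draw 100 of them.
frame = list(range(1000))
sample = systematic_sample(frame, 100, random.Random(0))
print(sample[:5])
```

Only the starting point is random; once it is fixed, the rest of the sample is fully determined by the step size.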

The difference between the highest and lowest value in a data set is referred to as the:

range

survey

represents observations; controls are seldom possible

parameter

a characteristic of a population; a set of observations may be summarized by a descriptive statistic

Which is a measure that describes how spread out the observations in a data set are from the mean?

standard deviation

A characteristic of a sample is called a(n):

statistic

An auto analyst is conducting a satisfaction survey, sampling from a list of 10,000 new car buyers. The list includes 2,500 Ford buyers, 2,500 GM buyers, 2,500 Honda buyers, and 2,500 Toyota buyers. The analyst then randomly samples 100 buyers of each brand. This is an example of:

stratified sampling
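
The car-buyer scenario can be sketched as follows. The buyer identifiers are placeholders generated for the example, and `stratified_sample` is a hypothetical helper name.

```python
import random


def stratified_sample(strata, per_stratum, rng=None):
    """Draw a simple random sample of equal size from every stratum."""
    rng = rng or random.Random()
    return {
        name: rng.sample(members, per_stratum)
        for name, members in strata.items()
    }


# Placeholder list of 10,000 buyers: 2,500 per brand, as in the question.
brands = ["Ford", "GM", "Honda", "Toyota"]
strata = {brand: [f"{brand}-{i}" for i in range(2500)] for brand in brands}

# Randomly sample 100 buyers within each brand.
sample = stratified_sample(strata, 100, random.Random(42))
print({brand: len(buyers) for brand, buyers in sample.items()})
```

Every brand is guaranteed to appear in the sample, which a simple random sample of 400 buyers would not guarantee in fixed proportions.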

placebos

substances or treatments that have no therapeutic value

A radio station is conducting a promotion where they are giving away a total of 100 free iPhones. Every 10th caller will receive an iPhone until all of them have been given away. This is an example of a:

systematic sample

mean

the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores

range

the difference between the highest and lowest scores in a distribution

null hypothesis

the hypothesis that there is no significant difference between specified populations, any observed difference being due to sampling or experimental error.

median

the middle score in a distribution; half the scores are above it and half are below it

mode

the most frequently occurring score(s) in a distribution

p-value

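the number that indicates the probability that measures from 2 samples or groups are similar; more precisely, the probability of observing a difference at least as large as the one found if the null hypothesis were true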

Sensitivity

the probability that a clinical test correctly identifies individuals with disease (i.e., produces a positive result)

Specificity

the probability that a clinical test correctly identifies individuals without disease (i.e., produces a negative result)

statistic

the same kind of characteristic as a parameter, but pertaining to a sample rather than to the population

convenience sample

type of non-probability sampling that involves the sample being drawn from that part of the population that is close to hand

measures of variation

useful for measuring how spread out the data are; 3 main measures: range, variance and standard deviation
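
The three measures can be computed with Python's `statistics` module. The data set is hypothetical, and the population versions (`pvariance`, `pstdev`) are used here as an assumption; the sample versions (`variance`, `stdev`) divide by n - 1 instead of n.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical observations

data_range = max(data) - min(data)          # highest minus lowest value
data_variance = statistics.pvariance(data)  # mean squared deviation from the mean
data_sd = statistics.pstdev(data)           # square root of the variance

print(f"range = {data_range}, variance = {data_variance}, sd = {data_sd}")
```

The standard deviation is in the same units as the data, which is why it is usually reported instead of the variance.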

data

values of the observations recorded for variables; the raw materials of statistics

discrete variables

variables that are integers; variables that usually consist of whole number units or categories and are made up of chunks or units that are detached and distinct from one another

retrospective studies

where the cohort is identified after the outcome occurs. Advantages: economical and particularly applicable to the study of rare disease. Disadvantages: data are usually collected for different purposes and may be missing or incomplete; surveys fail to include relevant variables; unknown biases frequently hinder such studies.

