Stats ch.1 sec. 1,2,3,4.
Principles of Experimental Design
1. Randomize the control and treatment groups. 2. Control for outside effects on the response variable. 3. Replicate the experiment a significant number of times to see meaningful patterns.
Identifying Population and Variables. Neurologists want to study the effect of vitamin C on nerve disorders. The goal of the study is to see if taking an intravenous dose of vitamin C will reduce the amount of nerve pain reported by patients. Identify the population of interest and the variables in this study.
The population of the study would be limited to patients with these specific types of disorders. The variables of interest are the amount of vitamin C administered to a patient and the amount of nerve pain each patient reports.
An Institutional Review Board (IRB)
is a group of people who review the design of a study to make sure that it is appropriate and that no unnecessary harm will come to the subjects involved.
A control group
is a group of subjects to which no treatment is applied in an experiment.
A treatment group
is a group of subjects to which researchers apply a treatment in an experiment.
A meta-analysis
is a study that compiles information from previous studies.
A placebo
is a substance that appears identical to the actual treatment but contains no intrinsic beneficial elements.
A treatment
is some condition that is applied to a group of subjects in an experiment.
The explanatory variable
is the variable in an experiment that causes the change in the response variable.
The response variable
is the variable in an experiment that responds to the treatment.
A case study
looks at multiple variables that affect a single event.
In a double-blind experiment
neither the subjects nor the people interacting with the subjects know to which group each subject belongs.
An observational study
observes data that already exist.
Data at the nominal level
of measurement are qualitative data consisting of labels or names.
Data at the ordinal level
of measurement are qualitative data that can be arranged in a meaningful order, but calculations such as addition or division do not make sense.
Data at the interval level
of measurement are quantitative data that can be arranged in a meaningful order, and differences between data entries are meaningful.
Data at the ratio level
of measurement are quantitative data that can be ordered, differences between data entries are meaningful, and the zero point indicates the absence of something.
In a single-blind experiment
subjects do not know if they are in the control group or the treatment group, but the people interacting with the subjects in the experiment know in which group each subject has been placed.
Conducting a Statistical Study.
1. Design the study. a. State the question to be studied. b. Determine the population and variables. c. Determine the type of study: observational or experimental. 2. Collect the data. 3. Organize the data. 4. Analyze the data to answer the question.
Population, Variable, Data, Census, Parameter, Sample & Sample Statistics
>A *population* is a particular group of interest. >A *variable* is a value or characteristic that changes among members of the population. >*Data* are the counts, measurements, or observations gathered about a specific variable in a population in order to study it. >A *census* is a study in which data are obtained from every member of the population. >A *parameter* is a numerical description of a population characteristic. >A *sample* is a subset of the population from which data are collected. >*Sample statistics* are numerical descriptions of sample characteristics.
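The difference between a parameter and a sample statistic can be sketched in a few lines of Python. This is only an illustration: the population of student ages, the seed, and the sample size of 50 are made-up assumptions, not data from these notes.

```python
import random
import statistics

# Hypothetical population: ages of 1,000 students (made-up values).
rng = random.Random(0)
population = [rng.randint(18, 30) for _ in range(1000)]

# Census: data come from every member, so the mean is a parameter.
parameter_mean = statistics.mean(population)

# Sample: a subset of the population; its mean is a sample statistic.
sample = rng.sample(population, 50)
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

The two means will usually be close but not equal, which is why a statistic is treated as an estimate of the parameter.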
Stratified sampling:
A few members from each stratum (or group) are randomly chosen.
Cluster sampling:
All members from a few randomly chosen clusters (or groups) are selected.
Random sampling:
Every member of the population has an equal chance of being selected.
Simple Random sampling:
Every possible sample of a given size from the population has an equal chance of being selected.
Systematic sampling:
Every nth member of the population is chosen.
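The sampling methods above can be compared side by side in a short Python sketch. The 12-member population, the group names, and the function names are all hypothetical, chosen only to make the differences visible.

```python
import random

rng = random.Random(1)

# Hypothetical population: 12 members split into 3 groups of 4.
groups = {
    "A": ["A1", "A2", "A3", "A4"],
    "B": ["B1", "B2", "B3", "B4"],
    "C": ["C1", "C2", "C3", "C4"],
}
population = [m for members in groups.values() for m in members]

def simple_random_sample(pop, n):
    # Every possible sample of size n is equally likely.
    return rng.sample(pop, n)

def systematic_sample(pop, k):
    # Random starting point, then every k-th member.
    start = rng.randrange(k)
    return pop[start::k]

def stratified_sample(strata, per_stratum):
    # A few randomly chosen members from EACH group.
    return [m for members in strata.values()
            for m in rng.sample(members, per_stratum)]

def cluster_sample(clusters, n_clusters):
    # ALL members from a few randomly chosen groups.
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [m for name in chosen for m in clusters[name]]

print(simple_random_sample(population, 4))
print(systematic_sample(population, 3))
print(stratified_sample(groups, 2))
print(cluster_sample(groups, 2))
```

Note the contrast between the last two functions: stratified sampling touches every group but takes only part of each, while cluster sampling takes whole groups but only some of them.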
1.1:Identifying Population and Sample EX.
Identify the population and the sample: a. In a survey, 359 college students at the University of Jackson were asked if they had tried the October flavor of the month at the campus coffee shop. Eighty-three of the students surveyed said yes. Solution: a. Population: All college students at the University of Jackson. Sample: The 359 college students who were surveyed. b. A survey of 1125 households in the United States found that listen to satellite radio. Solution: b. Population: All households in the United States. Sample: The 1125 households in the United States that were surveyed.
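As a small worked illustration for part (a), the share of surveyed students who said yes is a sample statistic, because it describes only the 359 students in the sample. The variable names below are my own; the counts come from the example.

```python
surveyed = 359   # sample size, from the example
said_yes = 83    # surveyed students who had tried the flavor

# Sample proportion: a statistic describing the sample only.
sample_proportion = said_yes / surveyed
print(f"{sample_proportion:.1%}")  # prints 23.1%
```

The matching parameter would be the proportion among all students at the university, which the survey does not measure directly.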
Placebo Effect
In a bizarre instance of the placebo effect, the Archives of General Psychiatry reported a study in which two groups of patients with Parkinson's disease underwent brain surgery. In the first group, human neurons were transplanted into the patients' brains. In the second group, the patients were merely told by their doctor that the transplant had taken place, when it had not. However, both groups showed significant postsurgical increases in body and brain functioning.
The branch of inferential statistics
involves using descriptive statistics to estimate population parameters.
EX. of stat
Population goes with Parameter; Sample goes with Statistic. Notice that the first letters match!
Example 1.1.2: Identifying Population, Sample, Parameters, and Statistics
Read each of the shortened survey reports below. For each report: a.Identify the population. b.Identify the sample. c.Determine whether the highlighted value is a parameter or statistic. 1.Online gaming is gaining in popularity according to a 2017 survey by the Pew Research Center. But there are sizeable differences by age and gender. The survey asked 3930 US adults if they play video games on a computer, TV, game console or portable device such as a cell phone. The report stated that overall 43% of American adults say they often or sometimes play video games on one of the devices. However, those younger than 50 were almost twice as likely as those older than 50 to say they play games (55% vs. 28%).
Example 1.1.2: Identifying Population, Sample, Parameters, and Statistics solution:
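Solution 1. a. Population: All US adults. b. Sample: The 3930 US adults who were surveyed. c. The value 43% was computed from the 3930 adults who were surveyed, not from all US adults; thus, it is a sample statistic (one that is used to estimate the corresponding population parameter).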
Classifying Data by the Level of Measurement. The birth years of your classmates are collected. What level of measurement are these data?
Solution Birth years can be ordered. It is also meaningful to subtract years to determine the difference in age. However, the year 0 A.D. does not mean the beginning of time. Therefore, birth years are at the interval level of measurement.
Classifying Data by the Level of Measurement. Consider the ages in whole years of US presidents when they were inaugurated. What level of measurement are these data?
Solution The ages of US presidents are measurable, can be ordered, and differences between data entries are meaningful. Therefore, ages are at the ratio level of measurement. In contrast to Example 1.2.5, involving birth years, you can be twice as old as someone else.
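The contrast between interval and ratio data can be checked numerically. In the sketch below the birth years, the ages, and the 1000-year calendar shift are hypothetical values chosen for illustration. Moving the zero point of the calendar leaves differences unchanged but alters ratios, which is why ratios of birth years carry no meaning.

```python
# Hypothetical birth years of two classmates (interval level).
birth_years = [1980, 2000]

# A hypothetical calendar whose year zero falls 1000 years earlier.
offset = 1000
shifted = [y + offset for y in birth_years]

# Differences do not depend on where zero is placed...
diff_original = birth_years[1] - birth_years[0]
diff_shifted = shifted[1] - shifted[0]

# ...but ratios do, so they are not meaningful for interval data.
ratio_original = birth_years[1] / birth_years[0]
ratio_shifted = shifted[1] / shifted[0]

# Ages (ratio level) have a true zero, so ratios make sense.
ages = [40, 20]
age_ratio = ages[0] / ages[1]  # one person is twice as old

print(diff_original, diff_shifted)
print(ratio_original, ratio_shifted)
print(age_ratio)
```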
Which type of study would you conduct: an observational study or an experiment? a. You want to determine the average age of college students across the nation. b. Researchers wish to determine if flu shots actually help prevent severe cases of the flu.
Solution a.An observational study would be used since you just need to consider existing records of college students to determine the average age of college students. b.An experiment would need to be used in order to establish a cause-and-effect relationship between flu shots and flu prevention.
Classifying Studies as Meta-Analysis or Case Study. Categorize the following studies as either a meta-analysis or a case study. a.Oceanographers study research on tsunamis dating from 1900 to 2000 to determine their effects on the ocean floor. b.Meteorologists study the Indian Ocean tsunami of December 2004 to try to identify warning signs.
Solution a.Because the oceanographers are looking at multiple studies relating to the single variable of tsunamis' effects on the ocean floor, this is a meta-analysis study. b.In order to identify tsunami warning signs, meteorologists would most likely look at multiple variables relating to the 2004 tsunami. Because they are studying several aspects of a single tsunami, it is a case study.
Classifying Studies as Cross-Sectional or Longitudinal. a.A group of 220 patients is followed for 15 years in order to determine the long-term health effects resulting from gastric bypass surgery. b.A gastroenterologist surveys 130 of his patients six months after having gastric bypass surgery to determine the average amount of weight lost.
Solution a.For this study, a group of gastric bypass patients is followed for a period of time. By definition, this is a longitudinal study. b.In this study, a snapshot of the amount of weight lost at a specific point in time is gathered; thus, this is a cross-sectional study.
Classifying Data as Nominal or Ordinal. Determine whether the data are nominal or ordinal. a.The seat numbers on your concert tickets, such as A23 and A24. b.The genres of the music performed at the original Grammys in 1959.
Solution a.Seat numbers are ordinal because there is a meaningful order to the data, namely, the position in the theater. b.Despite the fact that you may have your own personal preference for specific genres of music, there is no standard order. Therefore, music genres are nominal data.
Classify the following data as either qualitative or quantitative. a.Shades of red paint in a home improvement store b.Rankings of the most popular paint colors for the season c.Amount of red primary dye necessary to make one gallon of each shade of red paint d.Numbers of paint choices available at several stores
Solution a.Shades of paint are descriptions and cannot be measured, so these are qualitative data. b.Rankings are numeric but not measurements or counts, so these are qualitative data. c.The amounts of dye needed are measured and therefore are quantitative data. d.The numbers of paint choices must be counted, so they are quantitative data as well.
Identify the type of sampling used in each of the following scenarios. a.A pollster surveys 50 people in each of a senator's 12 voting precincts. b.The quality control department at a cereal manufacturer measures the weight of every 10th box off of the assembly line. c.A student walks down the halls in her dorm asking students how much money they would spend in a food court in the dorm lobby in an effort to persuade the administration to offer such an option. d.An educator chooses 5 of the school districts in the Chicago area and asks each household in those districts how many school-age children are in the home. e.To determine who will win a shopping spree at the mall, the manager draws a name out of a box of entries.
Solution a. Stratified sampling: The voting precincts are the strata. b. Systematic sampling: The system of selecting the sample is to choose every 10th box. c. Convenience sampling: This is a very easy way to survey in this particular scenario, but it is unlikely to provide a representative sample of the dorm residents. d. Cluster sampling: Each school district in the Chicago area is a cluster, and every household in the 5 chosen districts is surveyed. e. Random sampling: Every name in the box has an equal chance of being chosen.
Classifying Data as Continuous or Discrete. a.Temperatures in Fahrenheit of cities in South Carolina b.Numbers of houses in various neighborhoods in a city c.Numbers of elliptical machines in every YMCA in your state d.Heights of doors
Solution a.Temperatures could be measured to any level of precision based on the thermometer used, so these are continuous data. b.Numbers of houses are discrete data because houses are counted in whole numbers. A house under construction is still a house. c.The numbers of elliptical machines are counts, so these are discrete data. d.Heights are measurements and again, depending on the ruler, the heights could be measured to any level of precision, so they are continuous data.
Understanding the Nominal Level of Measurement.
Solution a.These data are nominal because the data simply describe or label the different toppings of the pizza. b.In the second scenario, the data value is a count of students who prefer sausage. This data value is quantitative, not qualitative, so it is not a label and would not be considered to be at the nominal level of measurement.
Statistics is the science of gathering, describing, and analyzing data.
Statistics are the actual numerical descriptions of sample data.
Convenience sampling:
The sample is chosen because it is convenient for the researcher.
Identifying Descriptive and Inferential Statistics: In a news report on the state of the media by Tom Rosenstiel and Amy Mitchell, they write the following: "AOL had 900 journalists, 500 of them at its local Patch news operation.... By the end of 2011, Bloomberg expects to have 150 journalists and analysts for its new Washington operation, Bloomberg Government."
When the authors identify AOL as having "had 900 journalists, 500 of them at its local Patch news operation," they are describing the actual counts, not estimates; thus, these numbers of journalists are descriptive statistics. On the other hand, when the authors state "By the end of 2011, Bloomberg expects to have 150 journalists and analysts for its new Washington operation," they are referring to an estimate based on past descriptive statistics. Therefore the estimate, 150 journalists and analysts, is an inferential statistic.
Stratified sample:
a few members of each group
The placebo effect
is a response to the power of suggestion, rather than the treatment itself, by participants of an experiment.
Qualitative Data. Determine the following classifications for the given data sets: qualitative or quantitative; discrete, continuous, or neither; and level of measurement. a.Finishing times for runners in the Labor Day 10 K race b.Colors contained in a box of crayons c.Boiling points (on the Celsius scale) for various caramel candies d.The top ten Spring Break destinations as ranked by USA Today
a. The amount of time it takes each runner to finish the race is quantitative. A finishing time is a measurement, so the data are continuous. A time of zero means no elapsed time and ratios of times are meaningful, so the data are at the ratio level of measurement. b. Colors are labels, so these data are qualitative. Qualitative data are neither discrete nor continuous. Colors have no meaningful order, so the data are at the nominal level of measurement. c. Calculations can be performed on boiling points because they are measurements, making these data quantitative. Temperatures are measurements, so the data are continuous. Zero degrees Celsius does not indicate the absence of temperature, so data on the Celsius scale are at the interval level of measurement. d. Since the rankings cannot be meaningfully added or subtracted, the data are qualitative, and therefore neither discrete nor continuous. The rankings are in a specific order, so the data are at the ordinal level of measurement.
Informed consent
involves completely disclosing to participants the goals and procedures involved in a study and obtaining their agreement to participate.
Consider the study from an earlier example, in which neurologists want to determine if taking an intravenous dose of vitamin C will reduce the amount of nerve pain reported by patients. Suppose that the study was narrowed to focus only on patients with the nerve disorder, multiple sclerosis (MS). After study approval, the neurologists solicit volunteers who are patients with MS who are reporting nerve pain. The participants are then randomly assigned to two groups, each having 20 participants. Participants in Group A are administered intravenous doses of vitamin C, and their nerve pain is tracked. Participants in Group B are administered intravenous doses of saline (which has no active ingredients) and their pain levels are also tracked. The patients are not told which of the two groups they are in; however, the nurses administering the IVs are aware of the group assignments. After a predetermined length of time, the amounts of pain reported by the separate groups are compared to determine if an intravenous dose of vitamin C will reduce the amount of nerve pain.
a. Identify the explanatory and response variables. The explanatory variable is the dose of vitamin C, and the response variable is the amount of nerve pain reported by each patient. b. What is the treatment? The treatment is what is being applied to the group, so the treatment is the dose of vitamin C. c. Which group is the treatment group and which group is the control group? The group that received the treatment of vitamin C, namely Group A, is the treatment group. The group that did not receive the treatment, Group B, is the control group. d. What is the purpose of administering saline to Group B? The saline that is administered to Group B is a placebo, and is administered to compensate for the placebo effect, so that all patients are responding to the same suggestion that they are receiving treatment. e. Is this a single-blind or double-blind study? Do you think this is the best choice for this study? It is single-blind: the nurses administering the IVs know the group assignments, but the patients do not. A double-blind design would generally be preferable, because nurses who know the assignments could unintentionally influence how patients report their pain.
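The random assignment step in this study can be sketched in Python. The participant IDs and the seed are hypothetical; only the group sizes (two groups of 20) and the treatments come from the example.

```python
import random

rng = random.Random(42)

# Hypothetical IDs for the 40 volunteers with MS.
participants = [f"P{i:02d}" for i in range(1, 41)]

# Shuffle a copy, then split it in half, so that chance alone
# decides who receives the treatment.
shuffled = participants[:]
rng.shuffle(shuffled)
group_a = shuffled[:20]   # treatment group: intravenous vitamin C
group_b = shuffled[20:]   # control group: intravenous saline (placebo)

print(group_a)
print(group_b)
```

Shuffling before splitting keeps the researchers from choosing, consciously or not, which patients get vitamin C, which helps guard against confounding variables.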
Confounding variables
are factors other than the treatment that cause an effect on the subjects of an experiment.
Participants
are people being studied in an experiment.
Subjects
are people or things being studied in an experiment.
Continuous data
are quantitative data that can take on any value in a given interval and are usually measurements.
Discrete data
are quantitative data that can take on only particular values and are usually counts.
The branch of descriptive statistics
as a science, gathers, sorts, summarizes, and displays the data.
Quantitative data
consist of counts or measurements (quantities).
Qualitative data
consist of labels or descriptions of traits (qualities).
In a cross-sectional study
data are collected at a single point in time.
In a longitudinal study
data are gathered by following a particular group over a period of time.
Cluster sample:
each member of a few groups
An experiment
generates data to help identify cause-and-effect relationships.
A representative sample
has the same relevant characteristics as the population and does not favor one group from the population over another.