Chapter 1: Introduction
Which of the following do NOT go together?
quantitative data and frequency data
random sample
-each member of the population has an equal chance of inclusion -a sample selected so that every member of the population has an equal chance of being included in the sample
logical inference vs statistical inference
Statistical inference is the drawing of conclusions about a population from the characteristics of a sample from that population. Logical inference is using logic to draw conclusions, ideally based on evidence.
categorical data
information representing counts, or the number of observations, in each category
population
complete set of events in which one is interested
Which of the following are examples of measurement data? Check all that apply. -Blood alcohol level -Performance on a vigilance task -Favorite TV show -Gender
-Blood alcohol level -Performance on a vigilance task
categorical vs measurement data
-Categorical data may also be referred to as qualitative data, while measurement data are often referred to as quantitative data. -Measurement data are obtained by measuring objects or events on a numerical scale, while categorical data record which category (dichotomous or nominal) each observation falls into.
Which of the following are examples of categorical data? Check all that apply. -Balance in a bank account -Number of organisms in each genus -Number of immigrants from each nation of origin -Number of base pairs in the organism's genome
-Number of immigrants from each nation of origin -Number of organisms in each genus
statistic
-a numerical value summarizing the sample data -(as a field, statistics) a set of procedures and rules, not always computational or mathematical, for reducing large masses of data to manageable proportions and for allowing us to draw conclusions from those data
parameter vs statistic
-a parameter is a fixed measure describing the whole population (a population being a group of people, things, animals, or phenomena that share common characteristics), while a statistic is a characteristic of a sample, a portion of the target population. -a parameter is a fixed, usually unknown numerical value, while a statistic is a known number that varies from sample to sample, depending on which portion of the population is drawn. -Sample statistics and population parameters use different notation (for example, x̄ and s for a sample, μ and σ for a population)
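A minimal Python sketch of the distinction, using a made-up population of 10 scores. The values and the names `population`, `sample`, `mu`, and `x_bar` are illustrative assumptions, not data from the text. The population mean is a parameter: it is fixed once the population is defined. The sample mean is a statistic: it depends on which members happen to be drawn.

```python
import random
from statistics import mean

# Hypothetical population: every score we care about (illustrative values only).
population = [4, 8, 15, 16, 23, 42, 7, 9, 11, 25]

# Parameter: computed from the whole population, so it is a fixed value.
mu = mean(population)

# Statistic: computed from a sample, so it depends on which members are drawn.
rng = random.Random(1)  # seeded only so the sketch is repeatable
sample = rng.sample(population, 4)
x_bar = mean(sample)

print("parameter (population mean):", mu)
print("statistic (sample mean):", x_bar)
```

Changing the seed changes `x_bar` but never `mu`, which is the point of the card above.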
descriptive vs. inferential statistics
-descriptive statistics merely describe data, while inferential statistics use sample data to draw conclusions about the population the sample came from -inferential statistics allow one to make predictions from data (i.e., take data from samples and make generalizations about the population)
measurement data
-information obtained by assessing objects or events -data values that represent measurements of objects or events
parameter
-numerical value summarizing population data -a number calculated from data in a population that quantifies a characteristic of the population
sample
-set of actual observations; subset of the population -a set of items of interest drawn from the population
descriptive statistics
-used to describe a set of data -methods of organizing, summarizing, and presenting data
Amy loves to eat Skittles candies, but she doesn't like lime-flavored Skittles. She claims that the company produces a higher proportion of lime Skittles compared to the other flavors. To check her claim, she buys a large bag of Skittles, counts the total number of Skittles and the number of lime Skittles, and uses her counts to compute the proportion of lime Skittles in the bag. There are five flavors of Skittles. If the flavors are produced in equal quantities, the proportion of Skittles that are any one particular flavor should be 1/5, or 20%. Amy finds that 21% of the Skittles in her bag are lime. Identify the following elements from the preceding story:
1) All Skittles produced = population 2) The Skittles in the bag = sample 3) The proportion of Skittles in the bag that are lime = statistic 4) The proportion of all Skittles produced that are lime = parameter
Do women wait longer than men to receive their orders at coffee shops? This question was studied by an economics major at Middlebury College. [Source: Myers, C.K. (Sept. 2007). Ladies First: A Field Study of Discrimination in Coffee Shops. Middlebury College Economics Discussion Paper No. 07-11.] A researcher compiled the following information from a sample of 30 customers: -Wait Time (in seconds) -Gender -Age -Order (regular or fancy) -Height (in inches) The researcher's question concerns whether there is a ____(1)_____ between the experiences of men and women at coffee shops. Whether the researcher's question concerns a difference or a relationship typically affects _____(2)_____. For each of the variables listed in the following table, select whether it produces categorical or measurement data. -Wait Time (in seconds) -Gender -Age -Order (regular or fancy) -Height (in inches)
1) difference 2) the type of statistical methods used -Wait Time (in seconds) = measurement data -Gender = categorical data -Age = measurement data -Order (regular or fancy) = categorical data -Height (in inches) = measurement data
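The practical consequence of the categorical/measurement split can be sketched in Python: categorical variables are usually summarized with counts per category, measurement variables with numerical summaries such as a mean. The four records below are invented for illustration and are not data from the coffee-shop study; the field names are assumptions as well.

```python
from collections import Counter
from statistics import mean

# Made-up records for illustration; NOT data from the coffee-shop study.
customers = [
    {"wait_seconds": 95, "gender": "F", "order": "fancy"},
    {"wait_seconds": 80, "gender": "M", "order": "regular"},
    {"wait_seconds": 110, "gender": "F", "order": "regular"},
    {"wait_seconds": 75, "gender": "M", "order": "fancy"},
]

# Categorical data: summarize with the number of observations per category.
gender_counts = Counter(c["gender"] for c in customers)
order_counts = Counter(c["order"] for c in customers)

# Measurement data: summarize with a numerical measure such as the mean.
mean_wait = mean(c["wait_seconds"] for c in customers)

print(gender_counts, order_counts, mean_wait)
```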
Suppose your mother is used to having a cup of regular coffee early in the morning. What would you expect to happen if you put decaf in her early morning cup of coffee?
She would continue to wander around in a daze, wondering what happened.
sample of convenience
a sample selected because it is easy to obtain, such as a sample of volunteers
The mean number of arrests for those who rarely attended high school would be _____.
a statistic
statistics
numerical values calculated from data in a sample intended to summarize the data
One of the most important skills that students learn in statistics courses is ______.
the ability to logically interpret data
decision tree
graphical representation of decisions involved in the choice of statistical procedures
variability
how much a value differs across different elements in the population or sample of interest
Which of the following is an example of a study in which you don't care about the actual numerical value of a population average, but in which you would want to know whether the average of one population is greater than the average of a different population?
Mice that have been getting a steady dose of morphine and mice that have not are injected with morphine, and their endurance on an uncomfortably warm surface before they lick their paws is timed.
Drawing from a telephone book has always been used as an example of bad random sampling. With the rapid expansion of Internet use, why would a standard telephone book be an even worse example than it used to be?
Not everyone in a city is listed in the telephone book.
Which of the following is the best way in which you could draw an approximately random sample from people in a small city? (The Census Bureau has to do this kind of thing frequently.)
Divide the city into neighborhoods, select houses and apartment buildings at random from a map of each neighborhood, and go door to door to the selected housing units.
Suppose you are a researcher conducting a study about the academic performance of the 856 students who attend a particular elementary school. Which of the following describes a method to draw a random sample from your population of interest? (Hint: In some cases, it is simply not possible to draw a true random sample.)
Obtain an alphabetical list of all students at the school and permission to review the school records of the students. Number the students from 1 to 856, and randomly select 40 numbers between 1 and 856. Review the school records of the 40 students corresponding to the 40 numbers selected.
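The numbering-and-drawing step above can be sketched in Python with the standard library. The function name `draw_student_numbers` and the seed are hypothetical; `random.sample` draws without replacement, so each of the 856 students has the same chance of being chosen and no student is picked twice.

```python
import random

def draw_student_numbers(population_size=856, sample_size=40, seed=None):
    """Return `sample_size` distinct student numbers from 1..population_size."""
    rng = random.Random(seed)
    # range(1, population_size + 1) stands in for the numbered alphabetical list.
    return sorted(rng.sample(range(1, population_size + 1), sample_size))

# The seed is an arbitrary choice, used only to make the draw repeatable.
selected = draw_student_numbers(seed=2024)
print(selected)
```

The records to review are those of the students whose positions in the list match the numbers in `selected`.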
Suppose Amy knew that there is a lot of variability in the proportion of lime flavor Skittles that is in every bag. What would this mean?
The proportion of lime flavor varies from bag to bag. Thus, the sample statistic from any one bag may differ from the population parameter.
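A small simulation can make this bag-to-bag variability concrete. This is only a sketch under stated assumptions: 60 pieces per bag, a true lime proportion of exactly 1/5, and each piece filled independently. None of these figures come from the manufacturer, and the function name `lime_proportion` is hypothetical.

```python
import random

def lime_proportion(bag_size, true_proportion, rng):
    """Simulate one bag and return the proportion of lime pieces in it."""
    lime = sum(1 for _ in range(bag_size) if rng.random() < true_proportion)
    return lime / bag_size

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Assumptions: 60 pieces per bag, population proportion of lime = 0.20.
bag_proportions = [lime_proportion(60, 0.20, rng) for _ in range(20)]

print("lowest bag: ", min(bag_proportions))
print("highest bag:", max(bag_proportions))
```

Even though the parameter is fixed at 0.20 in the simulation, the individual bags scatter around it, so a single bag showing 21% lime is weak evidence about the population.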
Suppose that you design a study that involves following heroin addicts around and noting the context within which they inject themselves and the kind of reaction that results. -In this hypothetical study, what would the population of interest be? (a) -In this study, how would you define your sample? (b) -Which of the following is a parameter of interest in this study? (c) -Which of the following is a statistic of interest in this study? (d)
a) all heroin addicts b) the heroin addicts who participate in the study c) the average score on a euphoria measurement scale for all heroin addicts d) the average score on a euphoria measurement scale for the heroin addicts participating in the study
Mars, Inc. actually keeps track of the number of pieces of each color there are in each batch of M&M'S® candy. You are interested in conducting a sampling experiment of your own. You purchase several bags of M&M'S® candy from different stores, open the bags, and record the number of pieces of each color. The data you collect are ___(a)___ data. For each item in the following table, indicate whether the item is a population, a sample, a parameter, or a statistic. i) All pieces of M&M'S® candy produced by Mars, Inc. ii) The pieces of M&M'S® candy used in your study iii) The proportion of green pieces reported by Mars, Inc. iv) The proportion of red pieces in your sample
a) categorical i) population ii) sample iii) parameter iv) statistic
You wish to develop a study to understand better the role played by tolerance and context in humans. Rather than using a more dangerous drug, you decide to study the development of tolerance to caffeine. People who do not normally drink caffeinated coffee are often startled by the effect of one or two cups of regular coffee, whereas those who normally drink regular coffee see no such effect. You need to conduct a test to see whether context plays a role in tolerance to caffeine. -To test tolerance to caffeine, you need to decide how to measure the effect you want to observe. Which of the following would be an appropriate measurement for this study? (a) -Now that you have decided what to measure, how could you go about testing to see whether, in a specific context, tolerance to caffeine develops? (b) -After observing tolerance develop in this manner, how could you test to see whether context has played a role in the tolerance effect? (c) -The set of all the people to whom you want to generalize your results is the ___(d)___. -The sample is the ___(e)___. -Which of the following everyday-life examples shows how context can affect behavior? (f)
a) performance on a vigilance task b) serve a group of users of decaffeinated coffee two cups of regular coffee, and then administer the measurement, every weekday morning in their offices for a month, but have them stick to decaf at all other times. c) at the end of the month, serve the subjects regular coffee and administer the measurement on a weekday morning in their offices and again at a different time of the day and in a different place. d) population e) set of the specific people who participated in your study f) Out on a date late at night, Janet laughs uproariously when her charming partner tells an off-color joke. A couple of weeks later, the new manager tells the same joke at a department meeting, and Janet feels very uncomfortable and complains to the human resources department.
A psychology student is very nervous about taking statistics. She is interested in finding out the statistics anxiety levels of fellow psychology students at her university. She selects 50 students from the 500 psychology majors at her university and asks them about their levels of anxiety regarding statistics. The set of 50 students is the ___(a)____, and the set of 500 psychology majors at her university is the ___(b)_____. According to the data she collects on the 50 students, 66% had some anxiety before taking statistics, and 34% did not. These descriptive measures are ___(c)____. According to existing data for the 500 psychology majors at her university, 80% had some anxiety before taking statistics, and 20% did not. These descriptive measures are ___(d)___. The student uses the sample data to draw conclusions about characteristics of the population. She concludes that about 66% of her fellow students had anxiety about taking statistics. This scenario is an example of ___(e)___. The psychology student presents her results to a panel discussion for students who have not yet declared a major. A student in the audience thinks those results would probably apply to herself and other undeclared students as well. This student from the audience is using ___(f)___
a) sample b) population c) sample statistics d) population parameters e) inferential statistics f) logical inference
You are attempting to estimate the average milk production of a group of cows. How would you expect that variability would contribute to the size of the sample you would need? You would expect that __(a)____ sample size would be needed when variability is small than when it is great. What would you have to do if you suspected that some varieties of cows gave relatively little milk, while other varieties gave quite a lot of milk? ___(b)___
a) smaller b) take a separate sample from each variety of interest.
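One way to see why less variability calls for a smaller sample is the standard error of the mean, σ/√n. The sketch below assumes the goal is to keep that standard error at or below a chosen target; the standard deviations and the target (in liters per day) are invented numbers, and the function name is hypothetical.

```python
import math

def required_sample_size(std_dev, target_standard_error):
    """Smallest n for which std_dev / sqrt(n) <= target_standard_error."""
    return math.ceil((std_dev / target_standard_error) ** 2)

# Hypothetical herds: same target precision, different variability.
low_variability_n = required_sample_size(std_dev=2.0, target_standard_error=0.5)
high_variability_n = required_sample_size(std_dev=6.0, target_standard_error=0.5)

print(low_variability_n)   # 16
print(high_variability_n)  # 144
```

Tripling the standard deviation multiplies the required sample size by nine, which is also why sampling each variety separately helps when varieties differ widely: within a variety, the variability is smaller.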
Tolerance to a drug such as morphine means that the user requires greater and greater doses to experience the same effect. Shepard Siegel hypothesized that this tolerance applies primarily to the context (such as a particular time and environment) in which the doses were given. If the user receives the drug in a different context, tolerance will not apply and the user will experience the full effect of the dose. [Siegel, S. (1975). Evidence from rats that morphine tolerance is a learned response. Journal of Comparative and Physiological Psychology, 89, 498-506.] To test this hypothesis, two groups of mice are made morphine-tolerant by repeated doses in a training condition. Then one group receives a standard dose of morphine in the same environment as in the training condition. The other group receives the same dose in a new environment, the testing condition. After receiving the morphine, each mouse is placed on an uncomfortably warm surface, and the time until the mouse licks its paws (called paw-lick latency) is recorded. Longer paw-lick latency means less discomfort and therefore a stronger effect of the drug. Could you obtain more information on this question by using a third group of mice? How? ___(a)___; you could ____(b)___ in the training context ___(c)___ in the testing condition.
a) yes b) give the third group a placebo c) and morphine
If you want to study the effect of hormonal changes in adolescent boys, your population would be _______.
all adolescent males
An example of a statistical inference is _______.
generalizing data from a sample of girls to a population of girls
For each question in the following table, select whether the question can be answered by a study in which your primary interest is in looking at relationships between variables, or whether the question can be answered by a study in which your primary interest is in looking at group differences. i) Is driving speed linked to accident rate? ii) Do children who attend preschool perform better in kindergarten than children who do not attend preschool? iii) Does amount of ice cream consumed during a laboratory task relate to future weight gain? iv) Are premature babies more susceptible to disease than full-term babies are?
i) relationships ii) differences iii) relationships iv) differences
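The two kinds of question lead to different computations, which a short Python sketch can illustrate. A difference question compares a summary (here, the mean) across groups; a relationship question asks how two variables measured on the same cases move together (here, the Pearson correlation, chosen as one common measure; the cards themselves do not name a specific one). All numbers are made up for illustration.

```python
from statistics import mean

def mean_difference(group_a, group_b):
    """Difference question: how far apart are the two group means?"""
    return mean(group_a) - mean(group_b)

def pearson_r(x, y):
    """Relationship question: linear association between paired values."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical kindergarten scores for two separate groups of children.
preschool = [82, 90, 78, 86]
no_preschool = [75, 80, 72, 77]
diff = mean_difference(preschool, no_preschool)

# Hypothetical paired observations: driving speed and accident count per road.
speed = [50, 60, 70, 80]
accidents = [2, 4, 6, 8]
r = pearson_r(speed, accidents)

print(diff, r)
```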
selection among statistical procedures
measurement data, decision tree, categorical data