Ch. 1: Intro to Statistics


average

also called mean: a number that describes the central tendency of the data

Data is a subset of the population studied.

false

representative sample

A sample that contains the characteristics of the population.

Selecting a portion (or subset) of the larger population and studying that portion (the sample) to gain information about the population. Data are the result of sampling from a population.

sampling

Which of the following is the situation in which not all members of the population are equally likely to be selected?

sampling bias. A sampling bias is defined as the situation in which not all members of the population are equally likely to be selected

What is the natural variation that results from selecting a sample to represent a larger population?

sampling error. A sampling error is defined as the natural variation that results from selecting a sample to represent a larger population.

population

In a statistical study, the entire group of individuals about which we want information.

Numerical variables

take on values with equal units such as weight in pounds and time in hours.

ratio scale

Like interval scale data, but with a true zero point, so ratios between values can be calculated. This scale gives the most information.

Sample

The part of the population from which we actually collect data.

Categorical variables

place the person or thing into a category

Margaret is investigating if gender has any effect on political party associations. What is the response variable?

Political party associations. The response variable is the dependent variable, which is the variable that is being measured or tested in response to changes in the independent variable. In this case the response variable is political party associations.

What is all individuals, objects, or measurements whose properties are being studied?

population

systematic sample

randomly select a starting point and take every nth piece of data from a listing of the population.

statistics

The science that deals with the collection, analysis, interpretation, and presentation of data. A statistic is a number that represents a property of the sample.

lurking variables

Additional variables that can cloud a study

quantitative discrete data

All data that are the result of counting

A scientist is interested in finding out the effect of soil quality on crop quality. Would an experimental or observational study design be more appropriate?

An experimental study should be used with the soil quality as the controlled factor. An experimental study should be used because the goal involves a cause and effect relationship.

A researcher wants to evaluate the effect of using Facebook on happiness. Would an experimental or observational study design be more appropriate?

An experimental study should be used with the use of Facebook as the controlled factor. An experimental study should be used because the researcher is attempting to control one or more aspects of the individuals in the sample.

Which of the following scenarios contain nonbiased samples? Select all that apply.

Christopher wants to estimate the male to female ratio of the residents of his city. He collects data by recording the sex of every 50th resident after selecting a random starting point on a list of residents. Ruby wants to estimate the mean grade point average of students at her school. She collects data by recording the grade point average of every 25th student on the list of students after a randomly selected first student. To estimate the mean salary of professors at his university, Benjamin collects data by recording the salaries of all professors included in 12 randomly selected departments. A sample is biased if some individuals of the population are more or less likely to be selected than others. The sample from choice A is nonbiased because every resident has an equal chance of being selected. The sample from choice B is nonbiased because every student has an equal chance of being selected. The sample from choice C is nonbiased because every professor has an equal chance of being selected.

Annie wants to estimate the average number of AP classes a student at her high school takes. She decides to randomly select 6 classes and use all the students in those 6 classes to estimate the average. What type of sampling did Annie use?

Cluster sampling. Because Annie chose several classes at random and then took all the students from those classes, she used cluster sampling. Note that cluster sampling could be misrepresentative in this case because she might select one or more AP classes in her sample.

quantitative continuous data

Data that are not only made up of counting numbers, but that may include fractions, decimals, or irrational numbers

nonsampling errors

Factors not related to the sampling process. A defective counting device can cause a nonsampling error.

Cluster sampling includes the steps: divide the population into groups; use simple random sampling to identify a proportionate number of individuals from each group.

False

Continuous is the type of quantitative data that is the result of counting.

False. Continuous data are defined as the type of quantitative data that is the result of measuring.

A sampling error is the situation in which not all members of the population are equally likely to be selected.

False. A sampling error is defined as the natural variation that results from selecting a sample to represent a larger population.

Select all of the following scenarios below that contain biased samples.

Kelly wants to estimate the mean number of calories consumed by students at her school. She collects data from randomly selected individuals in the cafeteria for breakfast. To estimate the mean amount of money spent on clothes per week by mall shoppers, Paul collects data from every 10th person entering a clothing store at the mall. A sample is biased if some individuals of the population are more or less likely to be selected than others. The sample from choice A is biased because students that do not eat breakfast are not included in the sample. The sample from choice C is biased because mall shoppers that do not visit this particular store are not represented.

You use histograms on...

Quantitative Data

Which of the following scenarios contain biased samples?

Ruth wants to estimate the mean height of students at her school. She collects data by selecting a random group of students within her classroom. Jennifer wants to estimate the mean amount of money spent on clothes per week by mall shoppers. She collects data from every 10th person entering one clothing store at the mall. To estimate the political party distribution of residents in his state, Patrick collects data from a large group of randomly selected residents of his city. A sample is biased if some individuals of the population are more or less likely to be selected than others. The sample from choice A is biased because the students in this classroom may only represent one grade level, which could be shorter or taller in general than other grade levels. The sample from choice B is biased because mall shoppers that do not visit this particular store are not represented. The sample from choice D is biased because this specific city is most likely not representative of the entire state depending on its demographics.

sampling errors

The actual process of sampling causes sampling errors. Ex: the sample may not be large enough.

response variable

The affected variable

Brian researched how a person's intelligence quotient (IQ) is affected by exercise. First, people took an IQ test. Then, over 2 months, they did different amounts of exercise. Then, Brian gave them a second IQ test. What is the response variable in this experiment?

The change in IQ score from the first to second test. Different people will have different IQs before they begin the experiment, which is why Brian has the subjects take an IQ test before they begin exercising. He expects that different amounts of exercise will affect the amount their IQ changes. The cause is the amount of exercise, and the effect is the change in IQ. So the response variable is the change in IQ.

A study was performed with a random sample of 5 people from a certain city. What population would be appropriate for generalizing conclusions from the study, assuming the data collection methods used did not introduce biases?

The conclusions apply only to the sample. The sample size is very small, so it is not likely to be representative of the city or any larger population.

Tina collected data from a random sample of 600 students in her university asking whether or not they exercise more than 30 minutes per day. Based on the results, she reports that 53% of the students in the nation exercise more than 30 minutes per day. Why is this statistic misleading?

The sample is biased. The individuals included in the sample are not representative of the entire population because the students at her university do not necessarily have the same exercise habits as the entire national student population.

Gloria collected data from a random sample of 1200 students in her school district asking whether or not they read more than one book per month. Based on the results, she reports that 35% of the students in all state schools read more than one book per month. Why is this statistic misleading?

The sample is biased. The individuals included in the sample are not representative of the entire population because the students in her school district may read more or fewer books on average than students in other districts.

Kenneth collected data from a random sample of 800 voters in his city asking whether or not they would vote to reelect the current governor. Based on the results, he reports that 64% of the voters in his state would vote to reelect the current governor. Why is this statistic misleading?

The sample is biased. The individuals included in the sample are not representative of the entire population because the voters in his city do not necessarily have the same opinion as the rest of the state.

After asking 8 randomly selected individuals from her city whether or not they were born within the state, Kathryn reports that 63% of the individuals from her city were born within the state based on her survey. Why is this statistic misleading?

The sample size is insufficient. Although the sampling was random, using such a small sample size to draw conclusions about a much larger number of individuals is bound to be unreliable. This sample size is too small.

A farmer divided his land into 2 groups of sections randomly. There is no difference in the quality of the soil between the 2 groups of land. He used Type A seeds in the first group and Type B seeds in the second group. After 3 months, the heights of the crops are measured across the two groups of land sections. Is the study observational or experimental? If it is an experiment, what is the controlled factor?

The study is an experiment. The controlled factor is the seed type. Since the land is divided into two groups and each group is treated with a different type of seed, the study is an experiment. The type of seed is the controlled factor.

control group

A treatment group that researchers set aside and give a placebo treatment, to counter the power of suggestion.

Which of the following scenarios contain biased samples? Select all that apply.

To estimate the political party distribution of residents in his state, Joseph collects data from a large group of randomly selected residents of his city. Victor wants to estimate the mean number of calories consumed by students at his school. He collects data from randomly selected individuals in the cafeteria for breakfast. Howard wants to estimate the mean weight of people in his town. He collects data by interviewing members of a fitness club. A sample is biased if some individuals of the population are more or less likely to be selected than others. The sample from choice B is biased because this specific city is most likely not representative of the entire state depending on its demographics. The sample from choice C is biased because students that do not eat breakfast are not included in the sample. The sample from choice D is biased because members of the fitness club will have different exercise and eating habits than the rest of the population.

Which of the following scenarios contain biased samples? Select all that apply.

To estimate which presidential candidate is likely to win the vote of a town, Mark collects data by interviewing people leaving a local Christian church. Kenneth wants to estimate the mean height of students at his school. He collects data by selecting a random group of students within his classroom. A sample is biased if some individuals of the population are more or less likely to be selected than others. The sample from choice A is biased because members of the Christian church are likely to have different political views than the rest of the population. The sample from choice B is biased because the students in this classroom may only represent one grade level, which could be shorter or taller in general than other grade levels.

A nonsampling error is an issue that affects the reliability of sampling data other than natural variation.

True

Cluster sampling includes the steps: use simple random sampling to select a set of groups; every individual in the chosen groups is included in the sample. Stratified sampling includes the steps: divide the population into groups; use simple random sampling to identify a proportionate number of individuals from each group.

True

In reference to different sampling methods, systematic sampling includes the steps: list the members of the population; use simple random sampling to select a starting point in the population; let k = (number of individuals in the population)/(number of individuals needed in the sample); choose every kth individual in the list starting with the one that was randomly selected.

True
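
The systematic sampling steps above can be sketched in Python. This is a minimal illustration, not a prescribed method: the function name `systematic_sample` and the list of residents are hypothetical, k is rounded down to a whole number, and the random start is drawn from the first k positions so that every later pick stays inside the list.

```python
import random

def systematic_sample(population, sample_size, rng=None):
    """Pick a random start in the first k items, then take every k-th item."""
    rng = rng or random.Random()
    # k = (individuals in the population) / (individuals needed), rounded down
    k = len(population) // sample_size
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    start = rng.randrange(k)  # assumption: start is limited to the first interval
    return [population[start + i * k] for i in range(sample_size)]

# Hypothetical listing of 1,000 residents, numbered 1 to 1000.
residents = list(range(1, 1001))
sample = systematic_sample(residents, 20, random.Random(0))
print(len(sample), sample[1] - sample[0])  # 20 picks, spaced k = 50 apart
```

With 1,000 residents and 20 needed, k is 50, so the sample is one random resident among the first 50 followed by every 50th resident after that.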

A candy manufacturer is interested in the distribution of colors in each of its packages of candy sold. Which question is appropriate for their research?

What is the typical distribution of colors for the packages of candies?

A 2013 study at Johns Hopkins University wanted to measure if caffeine has an effect on short term memory. The researchers showed images to a group of people. They then gave half the group a 200-mg caffeine pill and the other half a placebo pill containing no caffeine. A day later they measured how many images each person could remember. What is the explanatory variable in this experiment?

Whether or not a person received caffeine. Remember that the explanatory variable is the cause and the response variable is the effect. This experiment is trying to determine the relationship between receiving caffeine (the cause) and short term memory (the effect). So in this case, the explanatory variable is whether the person received caffeine.

Quantitative data

Are always numbers. Quantitative data are the result of counting or measuring attributes of a population, and may be either discrete or continuous. Examples: amount of money, pulse rate, weight, number of people living in your town, and number of students who take statistics.

Qualitative data

Are the result of categorizing or describing attributes of a population. Qualitative data are also often called categorical data and are generally described by words or letters. Examples: hair color, blood type, ethnic group, the car a person drives, and the street a person lives on.

Population

A collection of persons, things, or objects under study.

interval scale data

Data with a definite ordering in which the differences between values can be measured, though the data do not have a true starting (zero) point.

pie chart

Categories of data are represented by wedges in a circle, each proportional in size to the percent of individuals in that category. Used for qualitative (categorical) data.

Pareto chart

consists of bars that are sorted into order by category size (largest to smallest).
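
As a rough sketch of the ordering a Pareto chart uses, the snippet below counts categories and sorts them from largest to smallest. The function name `pareto_order` and the commuting data are made up for illustration; drawing the bars themselves is left out.

```python
from collections import Counter

def pareto_order(observations):
    """Return (category, count) pairs sorted from largest to smallest count."""
    return Counter(observations).most_common()

# Hypothetical survey answers: how six people commute.
commute = ["car", "bus", "car", "bike", "car", "bus"]
print(pareto_order(commute))  # [('car', 3), ('bus', 2), ('bike', 1)]
```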

Given that Justin is collecting data on reaction time, what type of data is he working with?

Continuous quantitative. Reaction time is continuous quantitative data because it is obtained by measuring and is not limited to a certain set of numbers.

ordinal scale

Data that can be ordered. Example: a list of the top five national parks in the United States.

Organizing and summarizing data is

descriptive statistics

cluster sample

divide the population into clusters (groups) and then randomly select some of the clusters. All the members from these clusters are in the cluster sample
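
A minimal sketch of cluster sampling as described above, assuming the population is already grouped into named clusters. The function name `cluster_sample` and the class rosters are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, rng=None):
    """Randomly choose whole clusters and keep every member of each one."""
    rng = rng or random.Random()
    chosen = rng.sample(sorted(clusters), n_clusters)  # pick cluster names at random
    members = [m for name in chosen for m in clusters[name]]
    return chosen, members

# Hypothetical school with four classes of three students each.
classes = {
    "class_1": ["Ana", "Ben", "Cy"],
    "class_2": ["Dee", "Eli", "Fay"],
    "class_3": ["Gus", "Hal", "Ivy"],
    "class_4": ["Jo", "Kai", "Lu"],
}
chosen, members = cluster_sample(classes, 2, random.Random(0))
print(chosen, len(members))
```

Because whole classes are taken, the sample size depends on the sizes of the clusters that happen to be drawn; here two classes of three students always give six students.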

When considering different sampling methods, stratified sampling includes the steps: _______.

divide the population into groups; use simple random sampling to identify a proportionate number of individuals from each group
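
The two stratified sampling steps can be sketched as follows. This is only an illustration under stated assumptions: the function name `stratified_sample` and the grade-level strata are hypothetical, and each stratum's share is rounded to the nearest whole number, so the total can differ slightly from the requested size when the shares are not whole numbers.

```python
import random

def stratified_sample(strata, sample_size, rng=None):
    """Take a proportionate simple random sample from each stratum."""
    rng = rng or random.Random()
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # proportionate share of the sample for this stratum (rounded)
        share = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical school of 1,000 students split by grade level.
sizes = {"freshman": 400, "sophomore": 300, "junior": 200, "senior": 100}
strata = {g: [f"{g}_{i}" for i in range(n)] for g, n in sizes.items()}
picked = stratified_sample(strata, 50, random.Random(0))
print({g: len(s) for g, s in picked.items()})
# {'freshman': 20, 'sophomore': 15, 'junior': 10, 'senior': 5}
```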

random sampling

each member of a population initially has an equal chance of being selected for the sample
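
One way to draw such a sample in Python is the standard library's `random.sample`, which selects without replacement so that every member is equally likely to be chosen. The wrapper name `simple_random_sample` and the student numbers are hypothetical.

```python
import random

def simple_random_sample(population, n, rng=None):
    """Draw n distinct members, each equally likely to be selected."""
    rng = rng or random.Random()
    return rng.sample(population, n)

# Hypothetical roster of 30 students, numbered 1 to 30.
roster = list(range(1, 31))
drawn = simple_random_sample(roster, 5, random.Random(0))
print(drawn)
```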

Janice is investigating if grade level has any effect on time spent studying. What is the explanatory variable?

Grade level. The explanatory variable is the independent variable, which is the variable that is changed to determine its effect on a dependent variable. In this case the explanatory variable is grade level.

Blinding

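Keeping a person involved in a randomized experiment from knowing who receives the active treatment and who receives the placebo. Blinding preserves the power of suggestion.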

Convenience sampling

involves using results that are readily available.

Statistic

is a number that represents a property of the sample

parameter

is a numerical characteristic of the whole population that can be estimated by a statistic.
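
To make the parameter versus statistic distinction concrete, the sketch below builds a made-up population, computes its mean (the parameter), and then estimates it with the mean of a random sample (the statistic). All numbers are simulated for illustration; the gap between the two values is the sampling error.

```python
import random
from statistics import mean

rng = random.Random(1)

# Hypothetical population: 10,000 simulated heights in centimeters.
population = [rng.gauss(170, 10) for _ in range(10_000)]
parameter = mean(population)          # property of the whole population

sample = rng.sample(population, 100)  # simple random sample of 100 people
statistic = mean(sample)              # property of the sample only

sampling_error = statistic - parameter
print(round(parameter, 2), round(statistic, 2), round(sampling_error, 2))
```

In practice the parameter is usually unknown, which is why the statistic is used to estimate it.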

experimental unit

is a single object or individual to be measured.

Datum

is a single value

sampling bias

is created when a sample is collected from a population and some members of the population are not as likely to be chosen as others (remember, each member of the population should have an equally likely chance of being chosen).

double-blind experiment

is one in which both the subjects and the researchers involved with the subjects are blinded

Variation

is present in any set of data.

nominal scale

Is qualitative (categorical). Examples: categories, colors, names, labels, and favorite foods, along with yes or no responses. Nominal scale data are not ordered.

Cumulative relative frequency

is the accumulation of the previous relative frequencies.

relative frequency

is the ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes.
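
A short worked example of relative frequency, together with its running total (the cumulative relative frequency). The function name `frequency_table` and the data values are hypothetical; exact fractions are used so the last cumulative value comes out to exactly 1.

```python
from collections import Counter
from fractions import Fraction

def frequency_table(data):
    """Return rows of (value, frequency, relative freq, cumulative relative freq)."""
    counts = Counter(data)
    total = len(data)
    rows, running = [], Fraction(0)
    for value in sorted(counts):
        rel = Fraction(counts[value], total)  # occurrences / total outcomes
        running += rel                        # add up the relative frequencies so far
        rows.append((value, counts[value], rel, running))
    return rows

# Hypothetical data: hours worked per day reported by 10 students.
hours = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]
for value, freq, rel, cum in frequency_table(hours):
    print(value, freq, rel, cum)
```

For the value 4, the frequency is 3, the relative frequency is 3/10, and the cumulative relative frequency is 1/10 + 2/10 + 3/10 = 3/5.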

Sampling

is to select a portion (or subset) of the larger population and study that portion (the sample) to gain information about the population.

Probability

mathematical tool used to study randomness.

Which of the following best describes the term explanatory variable?

The independent variable in an experiment. An explanatory variable is defined as the independent variable in an experiment.

bar graph

the length of the bar for each category is proportional to the number or percent of individuals in each category. Bars may be vertical or horizontal.

stratified sample

Divide the population into groups called strata and then take a proportionate number from each stratum.

placebo

A treatment that cannot influence the response variable.

The different values of the explanatory variable

treatments

A parameter is a number that is used to represent a population characteristic and that generally cannot be determined easily.

true

A sample is defined as a subset of the population studied

true

Data is defined as a set of observations or possible outcomes.

true

inferential statistics

uses probability to determine how confident we can be that our conclusions are correct.

Variable

usually notated by capital letters such as X and Y, is a characteristic or measurement that can be determined for each member of a population.

explanatory variable.

When one variable causes change in another, we call the first variable the explanatory variable.

Descriptive Statistics

Organizing and summarizing data.

