Stats: Winter

A boxplot

(or box-and-whisker diagram) is a graph of a data set that consists of a line extending from the minimum value to the maximum value, and a box with lines drawn at the first quartile Q1, the median, and the third quartile Q3. (See Figure 3-7.)

A z score

(or standard score or standardized value) is the number of standard deviations that a given value x is above or below the mean. The z score is calculated by using one of the following: z = (x − x̄) / s for a sample, or z = (x − μ) / σ for a population.
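
As a minimal sketch of the z score calculation, the function below divides a value's distance from the mean by the standard deviation. The function name and the numbers in the usage line are hypothetical and chosen only for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical values: x = 85, mean = 70, standard deviation = 10
print(z_score(85, 70, 10))  # 1.5
```

A positive result means the value is above the mean; a negative result means it is below the mean.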

stemplot

(or stem-and-leaf plot) represents quantitative data by separating each value into two parts: the stem (such as the leftmost digit) and the leaf (such as the rightmost digit). Better stemplots are often obtained by first rounding the original data values. Also, stemplots can be expanded to include more rows and can be condensed to include fewer rows.
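
The stem/leaf split can be sketched in a few lines of Python. This sketch assumes non-negative whole numbers, treats the last digit as the leaf and everything before it as the stem, and simply skips stems that have no leaves; the data values are hypothetical.

```python
from collections import defaultdict

def stemplot(values):
    """Return stemplot rows: stem = all digits but the last, leaf = last digit."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)  # 23 -> stem 2, leaf 3
        rows[stem].append(leaf)
    return [f"{stem} | {''.join(str(leaf) for leaf in rows[stem])}"
            for stem in sorted(rows)]

# Hypothetical two-digit data values
for row in stemplot([12, 15, 21, 23, 23, 30]):
    print(row)
# 1 | 25
# 2 | 133
# 3 | 0
```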

Discrete Data

Results when the data are quantitative and the number of values is finite or "countable." (If there are infinitely many values, the collection of values is countable if it is possible to count them individually, such as the number of tosses of a coin before getting tails.)

What do we call the characteristics of a sample used to infer information about the population?

Statistic

Practical vs. Statistical Significance

Statistical significance is achieved in a study when we get a result that is very unlikely to occur by chance. A common criterion is that we have statistical significance if the likelihood of an event occurring by chance is 5% or less. Practical significance: it is possible that some treatment or finding is effective, but common sense might suggest that the treatment or finding does not make enough of a difference to justify its use or to be practical.

Median

The middle number in a set of numbers that are listed in order

During the first four Summer Olympic games attended by the United States, these medal counts were awarded. Calculate the mode number of medals that the United States won during these Olympic Games.

The mode is the data point that appeared the most. 47 is the only number that is used more than once.

Mode

The number that occurs most frequently

During the first four Summer Olympic games attended by the United States, these medal counts were awarded. Calculate the range number of medals that the United States won during these Olympic Games.

The range is found by subtracting the lowest value from the highest value.

Ratio

There is a natural zero starting point and ratios make sense. Heights, lengths, distances, volumes

Observational Study

We observe and measure specific characteristics, but we don't attempt to modify the individuals being studied

Randomization

is used when individuals are assigned to different groups through a process of random selection. The logic behind randomization is to use chance as a way to create two groups that are similar.

Blinding

is used when the subject doesn't know whether he or she is receiving a treatment or a placebo. Blinding is a way to get around the placebo effect, which occurs when an untreated subject reports an improvement in symptoms. Double-blind means that the blinding occurs at two levels: neither the subject nor the evaluators know who is receiving the treatment and who is receiving the placebo.

Midrange

(Max + Min) / 2
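
A short sketch of the midrange calculation follows. The sum of the maximum and minimum is divided by 2 as a whole, so the parentheses matter. The function name is an assumption; the data values are the ten numbers from the "best method for summarizing the data" card in this set.

```python
def midrange(values):
    """Value midway between the maximum and the minimum."""
    return (max(values) + min(values)) / 2

data = [3, 3, 4, 5, 2, 9, 5, 2, 3, 4]
print(midrange(data))  # 5.5
```

Because it uses only the two most extreme values, the midrange is very sensitive to outliers.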

Sampling Error

occurs when the sample has been selected with a random method, but there is a discrepancy between a sample result and the true population result; such an error results from chance sample fluctuations.

The mean (or arithmetic mean)

of a set of data is the measure of center found by adding all of the data values and dividing the total by the number of data values.

Standard Deviation

of a set of sample values, denoted by s, is a measure of how much data values deviate from the mean.
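
One common way to compute the sample standard deviation s is to take the square root of the sum of squared deviations from the mean divided by n − 1. The sketch below assumes that formula; the function name and the data are hypothetical.

```python
import math

def sample_sd(values):
    """Sample standard deviation s, using n - 1 in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))

# Hypothetical sample: mean is 3, squared deviations sum to 10, 10 / 4 = 2.5
print(round(sample_sd([1, 2, 3, 4, 5]), 2))  # 1.58
```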

Big data

refers to data sets so large and so complex that their analysis is beyond the capabilities of traditional software tools. Analysis of big data may require software simultaneously running in parallel on many different computers.

Continuous (numerical) data

results from infinitely many possible quantitative values, where the collection of values is not countable. (That is, it is impossible to count the individual items because at least some of them are on a continuous scale, such as the lengths of distances from 0 cm to 12 cm.)

A statistic is resistant if

the presence of extreme values (outliers) does not cause it to change very much.
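
To illustrate resistance, this sketch compares the mean and the median with and without an outlier. The data are the eight test scores from the Mr. Miller's science class card in this set; the variable names are assumptions.

```python
from statistics import mean, median

scores = [92, 91, 89, 95, 45, 88, 90, 91]
without_outlier = [s for s in scores if s != 45]

mean_shift = mean(without_outlier) - mean(scores)        # mean moves by several points
median_shift = median(without_outlier) - median(scores)  # median moves by half a point

print(mean(scores), median(scores))    # 85.125 90.5
print(median(without_outlier))         # 91
```

The median barely changes when the 45 is removed, so it is resistant; the mean changes much more, so it is not.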

bar graph

uses bars of equal width to show frequencies of categories of categorical (or qualitative) data. The bars may or may not be separated by small gaps.

frequency polygon

uses line segments connected to points located directly above class midpoint values. A frequency polygon is very similar to a histogram, but a frequency polygon uses line segments instead of bars.

Experiment

we apply some treatment and then proceed to observe its effects on the individuals. (The individuals in experiments are called experimental units, and they are often called subjects when they are people.)

In cluster sampling

we first divide the population area into sections (or clusters). Then we randomly select some of those clusters and choose all the members from those selected clusters.
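
A rough sketch of cluster sampling: randomly pick whole clusters, then keep every member of the picked clusters. The function name, the `seed` parameter, and the cluster labels are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly choose n_clusters clusters and return ALL of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical population split into three clusters (e.g., city blocks)
clusters = {"A": [1, 2], "B": [3, 4], "C": [5, 6]}
print(cluster_sample(clusters, 2, seed=1))
```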

In systematic sampling

we select some starting point and then select every kth (such as every 50th) element in the population.
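
Systematic sampling maps naturally onto list slicing. In this sketch the function name and the population of 200 numbered elements are hypothetical, and the starting point is chosen at random when none is given.

```python
import random

def systematic_sample(population, k, start=None):
    """Select every kth element, beginning at a starting index below k."""
    if start is None:
        start = random.randrange(k)
    return population[start::k]

# Hypothetical population of 200 numbered elements, every 50th selected
print(systematic_sample(list(range(1, 201)), 50, start=0))  # [1, 51, 101, 151]
```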

convenience sampling

we simply use data that are very easy to get.

stratified sampling

we subdivide the population into at least two different subgroups (or strata) so that subjects within the same subgroup share the same characteristics (such as gender). Then we draw a sample from each subgroup (or stratum).
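
A sketch of stratified sampling, assuming the same number of subjects is drawn at random from each stratum. The function name, the `seed` parameter, and the strata are hypothetical.

```python
import random

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a random sample of n_per_stratum members from every stratum."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical strata: two subgroups of ten subjects each
strata = {"x": list(range(10)), "y": list(range(10, 20))}
print(stratified_sample(strata, 3, seed=1))
```

Unlike cluster sampling, every subgroup contributes to the sample, but only some members of each subgroup are chosen.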

relative frequency polygon

which uses relative frequencies (proportions or percentages) for the vertical scale. An advantage of relative frequency polygons is that two or more of them can be combined on a single graph for easy comparison

Which of the following answers is an example of continuous data?

3.14 cm

Select the statement that represents discrete quantitative data.

7 people came to my party.

Which of the following answers is an example of discrete data?

8 puppies

A true/false survey question that limits the participant's response is an example of:

A limiting question.

What is a control in statistics?

A method used to make sure only the issue at hand is examined

What is convenience sampling?

A sampling method where the researcher selects the research sample based on ease and proximity to the researcher.

What is the term for the method in which two treatments are given to groups when few individuals are aware of the actual treatment and control groups? Is this confounding?

Blinding; no, the lack of blinding is a source of confounding.

How do you demonstrate an understanding of an experiment?

By organizing and critically analyzing the data.

Katie is collecting data about the number of people that are interested in different dance styles. After passing out a survey, she has gathered the following information (style: number of people interested in this style). Ballet: 20; Jazz: 12; Hip Hop: 8; Lyrical: 26; Tap: 10. What type of data did Katie collect?

Categorical

Samantha is gathering information about how many calories the athletes on her team consume during game week. Which answer correctly indicates how the data should be analyzed for both categorical and quantitative research?

Categorical Data: 1,000-1,200 cal; 1,201-1,400 cal; 1,401-1,600 cal, etc. Quantitative Data: All individual caloric consumption amounts are collected.

What is data that is collected in groups or topics, where the number of events in each group is counted numerically?

Categorical data

What is a single number that summarizes an entire data set?

Center of data

What is data that can be divided infinitely?

Continuous

What is information that is collected for analysis?

Data

Samantha is using convenience sampling to conduct her research. After collecting the data, she runs into a few issues associated with the limitations of convenience sampling. What are these limitations likely to be?

Data bias and inaccurate parameters.

An association for the corn syrup industry presented data from its own poll on obesity in order to emphasize exercise over healthy eating. This is an example of which type of bias:

Funding bias

One trick to finding out if information is categorical or quantitative is to analyze the answer to the question. Each answer choice pairs the answer to the question with categorical or quantitative data. Select the answer selection that is correct.

If the answer to the question is numerical, then the information is quantitative. If the answer to the question is a characteristic, preference, etc., then the information is categorical.

What is it called when an experiment measures the variables that it was designed to measure?

Internal validity

Experts indicate violent video games are a risk factor for aggressive behavior. Do you think it is important for young people to limit their time playing violent video games? This is an example of which type of question?

Leading question

What is the sum of the numbers in a data set divided by the total number of values in the data set?

Mean

What is the midpoint value of a data set, where the values are arranged in ascending or descending order?

Median

When data is visually displayed in a way that misleads from an accurate interpretation, what is it called?

Misleading graphs.

Which of the following is least likely to be reported due to it being the one most likely to change by the introduction of additional scores?

Mode

A survey is handed out by loud volunteers on a street corner. Some people are suspicious of the volunteers and choose to not participate in the survey. This is an example of:

Nonresponse bias

In a study, a group of mice are put into a maze with multiple paths and options. The time it takes each mouse to complete the maze and the path taken are recorded. This is an example of a(n) _____ study.

Observational

Mitch is being asked questions by a friendly pollster. He doesn't have a lot to say about the questions presented, so he embellishes his answers so that the pollster can ask more follow-up questions. What kind of bias is this and why?

Observer bias; Mitch is altering his responses to make the pollster happy

What do we call a value that is much smaller or larger than the other values in a data set?

Outlier

What are the characteristics used to describe a population?

Parameter

Sandra conducts an experiment in which she gives ten people an herbal drink to help them with anxiety. She gives another ten people hot water with no tea to help with anxiety. Members of both groups report improvement with their anxiety. What type of confounding is this?

Placebo effect

What is the term for the result when a fake treatment causes a patient to believe that the fake treatment has worked?

Placebo effect

All members of a specified group are referred to as what?

Population

What is data that can be measured and ordered?

Quantitative

Brown school has decided to review their ratio of girls to boys in the Second grade class. They decided to count the number of boys in Second grade. This is an example of what type of data?

Quantitative and discrete

David wants to collect information about his friends and how many siblings they each have. David has eight friends. How should he collect this data? Why?

Quantitatively: Since David isn't dealing with a large amount of information, he can simply ask each friend and analyze the data quantitatively.

Elizabeth is conducting an experiment, and she uses the students in her class as the sample. If she randomly chooses which students will receive the treatment and which will receive a placebo, then what process is she executing?

Random allocation

What is the method used to choose members of a group to participate in an experiment?

Random selection

In a survey, people on average responded that they floss seven times a week despite the fact that the actual averages are drastically lower. Which form of bias likely accounts for this discrepancy?

Social desirability

Sally Sue is filling out a survey about health and fitness. One question asks about her height and weight. She responds with her real height, but puts a weight that is ten pounds lighter than what she really weighs. What is this an example of, and why?

Social desirability, because Sally Sue is answering the question the way that she feels she should.

What are the characteristics of a sample used to infer information about the population?

Statistic

Madison is trying out for the track team. There are 26 people trying out for the team. There are 4 people wanting to run distance, 6 people wanting to pole vault, 8 people wanting to sprint, and 2 people that want to throw shot put. Madison can run a half mile in 6 minutes and she finished 6th in the 100 meter dash. Which of these pieces of data is discrete and which is continuous?

The 6 minute half mile time is continuous data; the number of people trying out for the team is discrete data.

Brittany is conducting an experiment in her home economics class. She wants to know if adding an extra ingredient, cinnamon, to her pancakes will make them taste better. She decides to give the first period home economics class regular pancakes, and the second period home economics class the cinnamon pancakes. She asks each group to rank the pancakes on a scale from 1 to 5. What is the control in this scenario?

The first period class.

During the first four Summer Olympic games attended by the United States, these medal counts were awarded. Calculate the median number of medals that the United States won during these Olympic Games.

The median is the middle number when in order from least to greatest. Since there are two numbers in the middle, add them together and divide by 2.

Marco is conducting an experiment on training certain breeds of dogs. He wants to know how long, on average, it would take to teach a Labrador to fetch an object. He gets a group of dogs to conduct his experiment. 5 of the dogs are Labradors and 3 of the dogs are Dalmatians. What is the population and sample in this experiment?

The population is all Labradors, and the sample is the 5 Labradors that Marco trained.

Analyze the following statements and indicate which one best describes the difference between the population and the sample.

The population is all members of a specified group while the sample is the part of a population used to describe the whole group.

What is the variable in an experiment that is used on an experimental group?

Treatment

The following data represents the test scores of eight students in Mr. Miller's science class. 92%, 91%, 89%, 95%, 45%, 88%, 90%, 91% The student who got a 45% is _____.

an outlier

Why do different types of statistical models exist?

because there are many different types of variables and the models provide ways to analyze them

Variables are collected in the form of data. What are the two major forms of data?

categorical and quantitative

The _____ is a condition or piece of data in an experiment that is controlled or influenced by an outside factor, most often the independent variable?

dependent variable

Sheila is a baker experimenting with a cake recipe. Specifically, she is monitoring how changes in the volume of cream affect the moistness of the cake. What kind of variable is the volume of cream, and why?

it is an independent variable because it is a condition in the experiment that can be changed.

Questions that encourage the answer the researcher expects are known as _____ questions.

leading

Jim is studying how differences in the screen color of electronic devices make an impact on eye strain experienced by the user. In these experiments, which is the response variable and which is the explanatory variable?

the response variable is eye strain; the explanatory variable is screen color

Why is it important to know the limits of an experiment?

In order to analyze and infer information about the experiment, you have to know exactly what the experiment can tell you and what it can't.

For the following data set, select the best method for summarizing the data. 3, 3, 4, 5, 2, 9, 5, 2, 3, 4

Median
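
A quick check of the three common measures of center for this data set, using Python's standard `statistics` module. The single large value 9 pulls the mean above the median.

```python
from statistics import mean, median, mode

data = [3, 3, 4, 5, 2, 9, 5, 2, 3, 4]

print(mean(data))    # 4
print(median(data))  # 3.5
print(mode(data))    # 3
```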

The numbers on lacrosse jerseys are an example of _____ data.

Nominal

A coach records the levels of ability in martial arts of various kids. What type of data is this?

Ordinal

What is a mathematical comparison between two numbers?

Ratio

Charlotte is part of her local track team. She can jump 4 hurdles and can long jump 5 feet 5 inches. There are seven girls and ten boys on her track team. Six of the team members are ranked among the top 10 regional athletes. Which piece of this data is discrete and which is continuous?

The number of hurdles Charlotte can jump is discrete, and the length of her long jump is continuous.

A political group invites people in the local communities to participate in a survey about prayer in schools. This is an example of:

Voluntary bias

All of the following statements are correct, EXCEPT:

An outlier is likely to skew the median of a sample.

Laura needs to conduct research for her statistics class. She has access to funding and a large number of volunteers for the research. The research requires that she attempt to make inferences about population parameters. Should she use convenience sampling? Why or why not?

No; convenience sampling should only be used when resources and sample accessibility are limited.

Which of these is an example of confounding in statistics?

Only asking people of one race or ethnicity a general question

Which best describes selection bias?

Only choosing certain people to answer a question

What do we call a part of a population used to describe the whole group?

Sample

What is a part of a population used to describe the whole group?

Sample

Why is confounding not a form of bias?

Confounding involves issues with the structure of data gathering, while bias involves problems in the ways the data is gathered.

What is the term for the other factors that can have an effect on research results?

Confounding variable

What do we call the group that remains untreated throughout the duration of an experiment?

Control group

In a study, a group of mice are given an injection of a drug proposed to shrink lung tumors. After one month, the tumors are measured. This is an example of a(n) _____ study.

Experimental

Alyssa wants to determine whether or not there is a correlation between reading to children and their IQ level. Which type of study and variables does she need to use?

Experimental; independent and dependent

Bob proposed that the more you exercise, the more likely you are to be promoted at your job. He reviewed the exercise regimens of coworkers who were promoted and those that were not. In this example, exercise is a(n) _____ variable, and job status is a(n)_____ variable.

Explanatory; response

What is it called when the results of an experiment are true in the outside population?

External validity

When researchers have a limited amount of time to collect data, they use convenience sampling because it is often:

Fast.

What is data that is grouped in evenly distributed values and measured based on the group to which the variable is attributed?

Interval measurement

What is the effect of bias?

It creates a data set that is significantly different from what the reality actually is

Which of the following statements is NOT quantitative data?

Last night's gathering was ten times better than last week's.

What do we call a sample selected by a method that specifically excludes certain groups from the research, either intentionally or unintentionally?

Non-representative.

Why might a study about climate change by a company that sells fossil fuels be an instance of funding bias?

One could reasonably ask questions about the objectivity of the study, since such a company has a financial interest in climate change denial.

Julie wants to understand the growth rates of children in kindergarten. She asks a local elementary school for their help, and she takes the measurements of 45 kindergartners throughout the year. She finds that the children grow at a rate of 2-4 inches over the year, while the national research shows that children of that age grow at a rate of 2-3 inches per year. Identify the population, sample, statistic, and parameter in this scenario.

Population: all kindergarten children Sample: the 45 children at the elementary school Statistic: 2-4 inches per year Parameter: 2-3 inches per year

Brittany is conducting an experiment in her home economics class. She wants to know if adding an extra ingredient, cinnamon, to her pancakes will make them taste better. She decides to give the first period home economics class regular pancakes, and the second period home economics class the cinnamon pancakes. She asks each group to rank the pancakes on a scale from 1 to 5. What is the treatment in this scenario?

The cinnamon.

Sandra is selling her new line of all-natural makeup products and wants to know the popularity of certain colors among teenagers. She hands out a survey to all of the students in her teenage daughter's class and asks them to identify their favorite makeup colors. Which is the population and which is the sample in this scenario?

The population is all teenagers, and the sample is the teenagers in Sandra's daughter's class.

Elizabeth is waiting for her flight at the airport. She decides to conduct an experiment. She wants to know how many people at the airport carry a laptop with them for their flight. She sees 50 people walk past her as she waits. Thirty of those people are carrying laptops. What is the population and sample in this scenario?

The population is everyone at the airport, the sample is the 50 people that walked by Elizabeth.

Why is it important to avoid a non-representative sample when conducting a survey?

The survey will reflect the characteristics only of the group that was sampled

During the first four Summer Olympic games attended by the United States, these medal counts were awarded. Calculate the mean number of medals that the United States won during these Olympic Games.

To calculate the mean, add all four medal counts and then divide by 4.

Which of the following is NOT a question you need to ask when organizing your data for analysis?

What are the values of the independent and dependent variables?

If all other statistical procedures fail, the best predictor of a score will be the

mean

What is the name of the group that remains untreated throughout the duration of an experiment?

Control group

How does reporting bias occur in a poll or survey?

It occurs when only certain responses are included

Jean is conducting an experiment. She has her researchers help her administer hair growth supplements to ten people in a research group. The researchers administer a placebo to ten different people in a research group. The researchers know which people are getting the real supplement and which are getting the placebo. They begin to act differently in front of each group and make comments to the group receiving the placebo that may indicate the treatment will not work. Which type of confounding is this?

Lack of blinding

What are all members of a specified group?

Population

What do we call all the members of a specified group?

Population

Madison and Charlotte are having a competition, seeing how long they can each hold a note on their flutes. Meanwhile, Charlotte wants to know how many notes Madison can produce accurately on her flute. Are these pieces of data discrete or continuous?

The number of notes is discrete; the length of the note held is continuous.

Brittany is conducting an experiment in her home economics class. She wants to know if adding an extra ingredient, cinnamon, to her pancakes will make them taste better. She decides to give the first period home economics class regular pancakes, and the second period home economics class the cinnamon pancakes. She asks each group to rank the pancakes on a scale from 1 to 5. What is the experimental group in this scenario?

The second period class.

Why might a researcher use a sample rather than an entire population for their study?

Using a sample is more practical than using an entire population.

Aubree is trying to understand why a student would pick computer technology as a college major. 64% of the sample she surveyed said that they picked computer technology because of the vast employment opportunities. Her survey comes from 1200 students in 30 different colleges across the nation. Can we infer a parameter from this information? What would it be?

Yes; the statistic is 64% of computer technology majors choose the major because of the employment opportunities. We can infer that the parameter is the same.

If there are two modes, you should _____.

report both

In an experimental study, one group was required to water their lawn once a week and another was required to water three times per week. The control group did not water their lawn at all. Watering the lawn is the _____, and the number of times is the _____.

treatment; factor

parameter

is a numerical measurement describing some characteristic of a population.

statistic

is a numerical measurement describing some characteristic of a sample.

A pie chart

is a very common graph that depicts categorical data as slices of a circle, in which the size of each slice is proportional to the frequency count for the category. Although pie charts are very common, they are not as effective as Pareto charts.

simple event

is an outcome or an event that cannot be further broken down into simpler components.

event

is any collection of results or outcomes of a procedure.

The nominal level of measurement

is characterized by data that consist of names, labels, or categories only. The data cannot be arranged in some order (such as low to high).

A voluntary response sample (or self-selected sample)

is one in which the respondents themselves decide whether to be included.

simple random sample of n subjects

is selected in such a way that every possible sample of the same size n has the same chance of being chosen. (A simple random sample is often called a random sample, but strictly speaking, a random sample has the weaker requirement that all members of the population have the same chance of being selected. That distinction is not so important in this text. See

census

is the collection of data from every member of the population. A sample is a subcollection of members selected from a population.

A population

is the complete collection of all measurements or data that are being considered. Typically, a population is the complete collection of data that we would like to make inferences about.

Class Width

is the difference between two consecutive lower class limits (or two consecutive lower class boundaries) in a frequency distribution.
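
The class width can be sketched as the difference between each pair of consecutive lower class limits. The function name and the lower limits used here are hypothetical.

```python
def class_widths(lower_limits):
    """Differences between consecutive lower class limits."""
    return [b - a for a, b in zip(lower_limits, lower_limits[1:])]

# Hypothetical classes 50-59, 60-69, 70-79, 80-89
print(class_widths([50, 60, 70, 80]))  # [10, 10, 10]
```

Note that the width is 10, not 9: subtracting a class's lower limit from its own upper limit (59 − 50) is a common mistake.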

Replication

is the repetition of an experiment on more than one individual. Good use of replication requires sample sizes that are large enough so that we can see effects of treatments

nonsampling error

is the result of human error, including such factors as wrong data entries, computing errors, questions with biased wording, false data provided by respondents, forming biased conclusions, or applying statistical methods that are not appropriate for the circumstances.

Nonrandom sampling error

is the result of using a sampling method that is not random, such as using a convenience sample or a voluntary response sample.

Statistics

is the science of planning studies and experiments; obtaining data; and organizing, summarizing, presenting, analyzing, and interpreting those data and then drawing conclusions based on them.

There are 17,246,372 high school students in the United States. In a study of 8505 U.S. high school students 16 years of age or older, 44.5% of them said that they texted while driving at least once during the previous 30 days (based on data in "Texting While Driving and Other Risky Motor Vehicle Behaviors Among US High School Students," by Olsen, Shults, Eaton, Pediatrics, Vol. 131, No. 6). Which is a parameter and which is a statistic?

1. Parameter: The population size of 17,246,372 high school students is a parameter, because it is the entire population of all high school students in the United States. If we somehow knew the percentage of all 17,246,372 high school students who reported they had texted while driving, that percentage would also be a parameter. 2. Statistic: The sample size of 8505 surveyed high school students is a statistic, because it is based on a sample, not the entire population of all high school students in the United States. The value of 44.5% is another statistic, because it is also based on the sample, not on the entire population.

How are a frequency polygon and a histogram different?

A frequency polygon is very similar to a histogram, but a frequency polygon uses line segments instead of bars.

A measure of center

is a value at the center or middle of a data set.

During the first four Summer Olympic games attended by the United States, these medal counts were awarded. If there is an outlier, identify it from the number of medals that the United States won during these Olympic Games.

An outlier is a number that is significantly higher or lower than the other data points. The medal count 239 is much higher than the other totals.

Nominal

Categories only. Data cannot be arranged in order. Eye colors

Prospective (or longitudinal or Cohort) study

Data are collected in the future from groups that share common factors (such groups are called cohorts).

Ordinal

Data can be arranged in order, but differences either can't be found or are meaningless. Ranks of colleges in U.S. News & World Report

Interval

Differences are meaningful, but there is no natural zero starting point and ratios are meaningless. Body temperatures in degrees Fahrenheit or Celsius

time-series graph

is a graph of time-series data, which are quantitative data that have been collected at different points in time, such as monthly or yearly.

Range

Max - Min

Data

are collections of observations, such as measurements, genders, or survey responses. (A single data value is called a datum, a term rarely used. The term "data" is plural, so it is correct to say "data are . . ." not "data is . . .")

Percentiles

are measures of location, denoted P1, P2, . . . , P99, which divide a set of data into 100 groups with about 1% of the values in each group.

Upper Class Limits

are the largest numbers that can belong to each of the different classes

Class boundaries

are the numbers used to separate the classes, but without the gaps created by class limits.

Lower Class Limits

are the smallest numbers that can belong to each of the different classes.

Class Midpoints

are the values in the middle of the classes. Each class midpoint can be found by adding the lower class limit to the upper class limit and dividing the sum by 2
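
The midpoint rule translates directly into code. The function name is an assumption; the class limits are taken from the calorie classes listed in the Samantha card in this set.

```python
def class_midpoint(lower_limit, upper_limit):
    """Add the lower and upper class limits and divide the sum by 2."""
    return (lower_limit + upper_limit) / 2

# Calorie classes 1,000-1,200 and 1,201-1,400
print(class_midpoint(1000, 1200))  # 1100.0
print(class_midpoint(1201, 1400))  # 1300.5
```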

Retrospective (or case-control) study

data are collected from a past time period by going back in time (through examination of records, interviews and so on)

Cross-sectional Study

data are observed, measured, and collected at one point in time, not over a period of time.

Categorical (or qualitative or attribute)

data consist of names or labels (not numbers that represent counts or measurements).

Quantitative (or numerical)

data consist of numbers representing counts or measurements.

What's the difference between descriptive and inferential statistics?

Descriptive statistics summarize or describe relevant characteristics of data. Inferential statistics are used to make inferences, or generalizations, about populations.

Range Rule of Thumb

dividing the range by 4 gives an approximation of the standard deviation (s ≈ range / 4)
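
A small sketch of the range rule of thumb, which gives only a rough estimate of the standard deviation. The function name and the data values are hypothetical.

```python
def estimate_sd(values):
    """Range rule of thumb: standard deviation is roughly range / 4."""
    return (max(values) - min(values)) / 4

# Hypothetical data: range is 22 - 10 = 12
print(estimate_sd([10, 12, 15, 18, 22]))  # 3.0
```

The estimate is crude; for small samples like this one it can differ noticeably from the computed standard deviation.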

The sample space

for a procedure consists of all possible simple events. That is, the sample space consists of all outcomes that cannot be broken down any further.

skewed

if it is not symmetric and extends more to one side than to the other. Data skewed to the right (also called positively skewed) have a longer right tail, as in Figure 2-4(c). Annual incomes of adult Americans are positively skewed. Data skewed to the left (also called negatively skewed) have a longer left tail, as in

missing completely at random

if the likelihood of its being missing is independent of its value or any of the other values in the data set. That is, any data value is just as likely to be missing as any other data value.

missing not at random

if the missing value is related to the reason that it is missing.

interval level of measurement

if they can be arranged in order, and differences between data values can be found and are meaningful. Data at this level do not have a natural zero starting point at which none of the quantity is present.

ratio level of measurement

if they can be arranged in order, differences can be found and are meaningful, and there is a natural zero starting point (where zero indicates that none of the quantity is present). For data at this level, differences and ratios are both meaningful.

ordinal level of measurement

if they can be arranged in some order, but differences (obtained by subtraction) between data values either cannot be determined or are meaningless.

relative frequency distribution or percentage frequency distribution

in which each class frequency is replaced by a relative frequency (or proportion) or a percentage. In this text we use the term "relative frequency distribution" whether we use relative frequencies or percentages. Relative frequencies and percentages are calculated as follows: relative frequency = (class frequency) / (sum of all frequencies), and percentage = relative frequency × 100%. The sum of the percentages in a relative frequency distribution must be very close to 100% (with a little wiggle room for rounding errors).
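
As an illustration, the sketch below converts frequencies to relative frequencies by dividing each one by the total. It reuses the dance-style counts from the Katie card in this set as a stand-in frequency table; the function name is an assumption.

```python
def relative_frequencies(freqs):
    """Replace each frequency with frequency / (sum of all frequencies)."""
    total = sum(freqs.values())
    return {name: count / total for name, count in freqs.items()}

counts = {"Ballet": 20, "Jazz": 12, "Hip Hop": 8, "Lyrical": 26, "Tap": 10}
for style, rel in relative_frequencies(counts).items():
    print(f"{style}: {rel:.1%}")
```

The relative frequencies sum to 1 (or 100%), apart from small rounding differences in the printed values.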

cumulative frequency distribution

in which the frequency for each class is the sum of the frequencies for that class and all previous classes.
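
Cumulative frequencies are running totals, which `itertools.accumulate` from the standard library computes directly. The class frequencies below are hypothetical.

```python
from itertools import accumulate

def cumulative_frequencies(freqs):
    """Each entry is the sum of that class's frequency and all previous ones."""
    return list(accumulate(freqs))

# Hypothetical class frequencies, listed in class order
print(cumulative_frequencies([11, 21, 6, 9, 3]))  # [11, 32, 38, 47, 50]
```

The last cumulative frequency always equals the total number of data values.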

Data science

involves applications of statistics, computer science, and software engineering, along with some other relevant fields (such as sociology or finance).

Pareto chart

is a bar graph for categorical data, with the added stipulation that the bars are arranged in descending order according to frequencies, so the bars decrease in height from left to right.
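
The ordering step of a Pareto chart can be sketched without any plotting library: sort the categories by frequency, largest first. The function name is an assumption; the counts are the dance-style data from the Katie card in this set.

```python
def pareto_order(freqs):
    """Sort (category, frequency) pairs in descending order of frequency."""
    return sorted(freqs.items(), key=lambda item: item[1], reverse=True)

counts = {"Ballet": 20, "Jazz": 12, "Hip Hop": 8, "Lyrical": 26, "Tap": 10}
for style, count in pareto_order(counts):
    print(style, count)
# Lyrical 26
# Ballet 20
# Jazz 12
# Tap 10
# Hip Hop 8
```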

histogram

is a graph consisting of bars of equal width drawn adjacent to each other (unless there are gaps in the data). The horizontal scale represents classes of quantitative data values, and the vertical scale represents frequencies. The heights of the bars correspond to frequency values.

