Business Stat Exam 1 Definitions

Nominal scale

- least sophisticated level of measurement
- data are simply categories for grouping

Interval Scale

- data can be categorized and ranked
- differences between values are meaningful
- no absolute zero or starting point is defined
- meaningful ratios cannot be obtained
- ex: Fahrenheit temperature

Ordinal Scale

- data may be categorized and ranked with respect to some characteristic or trait
- ex: excellent, good, fair, poor
- differences between categories are meaningless because the numbers assigned to them are arbitrary

Ratio Scale

- strongest level of measurement
- data can be categorized and ranked
- differences between values are meaningful
- has an absolute zero
- ex: sales, weight, time, distance

Ratio scale; discrete

A discrete variable takes on individually distinct values. The ratio scale has a meaningful zero point and we can interpret ratios of values. In this case, a value of zero means the linebacker had no tackles.

covariance

A positive value of covariance indicates a positive linear relationship between x and y; on average, if x is above (below) its mean, then y tends to be above (below) its mean, and vice versa. A negative value of covariance indicates a negative linear relationship between x and y; on average, if x is above (below) its mean, then y tends to be below (above) its mean, and vice versa.
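The sign interpretation above can be checked with a small sketch. The data below are hypothetical, and the function name sample_covariance is an illustrative choice; it uses the usual sample formula with n - 1 in the denominator.

```python
def sample_covariance(x, y):
    """Sample covariance: sum of (xi - x_bar)(yi - y_bar), divided by n - 1."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)

# Hypothetical data: y_up tends to rise with x, y_down tends to fall with x.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 5, 4, 6]
y_down = [6, 4, 5, 4, 2]

cov_up = sample_covariance(x, y_up)      # positive: x and y move together
cov_down = sample_covariance(x, y_down)  # negative: x and y move in opposite directions
print(cov_up, cov_down)
```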

When interpreting the covariance between variables x and y, which of the following statements is the most accurate?

A positive value of covariance indicates that, on average, if x is above its mean, then y tends to be above its mean.

Event

A subset of the sample space. A collection of events is mutually exclusive if the events share no outcomes, and exhaustive if together they cover the entire sample space.

Risk Loving

Accept risky prospect even if expected gain is negative

Which of the following statements is most accurate when defining percentiles?

Approximately p% of the observations are less than the pth percentile, and approximately (100 - p)% of the observations are greater than the pth percentile.
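A minimal sketch of this definition, assuming one common convention for locating the pth percentile: position (n + 1)p/100 in the sorted data, with linear interpolation. Other conventions exist and give slightly different values. The data and the function name percentile are hypothetical.

```python
def percentile(data, p):
    """pth percentile at location (n + 1) * p / 100, with linear interpolation."""
    values = sorted(data)
    n = len(values)
    loc = (n + 1) * p / 100
    lower = int(loc)       # integer part of the location
    frac = loc - lower     # fractional part used for interpolation
    if lower < 1:
        return values[0]
    if lower >= n:
        return values[-1]
    return values[lower - 1] + frac * (values[lower] - values[lower - 1])

data = list(range(1, 101))  # hypothetical observations 1, 2, ..., 100
p75 = percentile(data, 75)
below = sum(v < p75 for v in data) / len(data)  # share of observations below p75
above = sum(v > p75 for v in data) / len(data)  # share of observations above p75
print(p75, below, above)
```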

Which of the following represents a population and a sample from that population?

Attendees at a sporting event, and those who purchased popcorn at said sporting event: Those individuals who purchase popcorn at said sporting event are clearly a subset of all attendees at a given sporting event.

Which of the following is not a graphical technique to display quantitative data?

Bar Chart

Two defining properties of a probability

1. The probability of any event is a value between 0 and 1.
2. The probabilities of any list of mutually exclusive and exhaustive events sum to 1.

What is an advantage of the correlation coefficient over the covariance?

Both answers are correct: it falls between -1 and 1, and it is a unit-free measure. The correlation coefficient is preferred in evaluating the direction and strength of the linear relationship between two variables. It is a unit-free measure that takes values in the interval [-1, 1].
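To illustrate the unit-free property, the sketch below rescales one variable from meters to centimeters. The heights and weights are hypothetical, and the helper names are illustrative. The covariance changes by the scale factor, while the correlation coefficient should stay the same.

```python
import math

def sample_cov(x, y):
    """Sample covariance with n - 1 in the denominator."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    return sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / (n - 1)

def sample_corr(x, y):
    """Correlation = covariance divided by the product of the standard deviations."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]    # hypothetical heights in meters
weights_kg = [55, 65, 70, 72, 85]             # hypothetical weights in kilograms
heights_cm = [h * 100 for h in heights_m]     # same heights in centimeters

cov_m = sample_cov(heights_m, weights_kg)
cov_cm = sample_cov(heights_cm, weights_kg)   # 100 times larger than cov_m
r_m = sample_corr(heights_m, weights_kg)
r_cm = sample_corr(heights_cm, weights_kg)    # unchanged by the change of units
print(cov_m, cov_cm, r_m, r_cm)
```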

Sampling is used heavily in manufacturing and service settings to ensure high-quality products. In which of the following areas would sampling be inappropriate?

Custom cabinet making: Custom cabinets are not meant to be standardized in their characteristics. Therefore, sampling would make no sense.

Sample Space

Denoted S; the sample space of an experiment contains all possible outcomes of the experiment

Events are considered _________ if the occurrence of one is related to the probability of the occurrence of the other.

Dependent: We generally test for the independence of two events by comparing the conditional probability of one event, P(A|B), to its unconditional probability, P(A). If they are the same, the two events, A and B, are independent; otherwise, they are dependent.
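The comparison of P(A|B) with P(A) can be sketched as follows. The probabilities are hypothetical, and the function name and tolerance are illustrative choices rather than part of the definition.

```python
def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True when P(A|B) = P(A and B) / P(B) equals P(A), within a tolerance."""
    p_a_given_b = p_a_and_b / p_b
    return abs(p_a_given_b - p_a) < tol

# Hypothetical case 1: P(A|B) = 0.20 / 0.40 = 0.50 = P(A), so A and B are independent.
case_independent = is_independent(p_a=0.50, p_b=0.40, p_a_and_b=0.20)

# Hypothetical case 2: P(A|B) = 0.30 / 0.40 = 0.75, which differs from P(A) = 0.50.
case_dependent = is_independent(p_a=0.50, p_b=0.40, p_a_and_b=0.30)
print(case_independent, case_dependent)
```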

Unstructured Data

Does not conform to a pre-defined column format: reports, emails, multimedia

Objective probabilities

Empirical probability and classical probability

T or F: Cross-sectional data contain values of a characteristic of one subject collected over time.

False: Cross-sectional data contain values of a characteristic of many subjects at the same point or approximately the same point in time, or without regard to differences in time.

T or F: Geometric mean is greater than the arithmetic mean.

False: For positive values, the geometric mean is less than or equal to the arithmetic mean (equal only when all values are identical), and it is less sensitive to outliers.
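A minimal sketch comparing the two means on hypothetical annual returns. The geometric mean return is computed from growth factors (1 + r), which is the usual way to account for compounding; the function names are illustrative.

```python
import math

def arithmetic_mean(values):
    """Additive average: sum divided by count."""
    return sum(values) / len(values)

def geometric_mean_return(returns):
    """Multiplicative average: nth root of the product of (1 + r), minus 1."""
    growth = math.prod(1 + r for r in returns)
    return growth ** (1 / len(returns)) - 1

returns = [0.10, -0.10]                  # hypothetical: +10% then -10%
arith = arithmetic_mean(returns)         # 0.0, ignores compounding
geo = geometric_mean_return(returns)     # slightly negative: 1.1 * 0.9 = 0.99 < 1

equal_returns = [0.05, 0.05]             # identical values: the two means coincide
geo_equal = geometric_mean_return(equal_returns)
print(arith, geo, geo_equal)
```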

T or F: Population parameters are used to estimate corresponding sample statistics.

False: Sample statistics are used to estimate the corresponding population parameter.

T or F: The total probability rule is defined as P(A) = P(A ∩ B) P(A ∩ B^c)

False: The total probability rule is defined as P(A) = P(A ∩ B) + P(A ∩ B^c).
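As a worked instance of the total probability rule, the sketch below builds P(A) from its two mutually exclusive pieces, writing each joint probability as a conditional probability times a marginal. All numbers are hypothetical.

```python
# Hypothetical inputs.
p_b = 0.30                 # P(B)
p_not_b = 1 - p_b          # P(B^c)
p_a_given_b = 0.40         # P(A | B)
p_a_given_not_b = 0.20     # P(A | B^c)

# Joint probabilities: P(A ∩ B) and P(A ∩ B^c).
p_a_and_b = p_a_given_b * p_b
p_a_and_not_b = p_a_given_not_b * p_not_b

# Total probability rule: add the two joint probabilities.
p_a = p_a_and_b + p_a_and_not_b
print(p_a)
```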

Which of the following is true when using the empirical rule for a set of sample data?

For a set of sample data, the empirical rule states that approximately 68% of all observations are in the interval x̄ ± s, approximately 95% of all observations are in the interval x̄ ± 2s, and almost all observations are in the interval x̄ ± 3s.
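The percentages can be checked by simulation. This sketch assumes bell-shaped data, generated here from a normal distribution with an arbitrary mean of 100 and standard deviation of 15. The simulated shares should land near 68%, 95%, and almost 100%, though the exact values depend on the random draw.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable
data = [random.gauss(100, 15) for _ in range(10_000)]  # hypothetical bell-shaped sample

x_bar = statistics.mean(data)
s = statistics.stdev(data)

def share_within(k):
    """Proportion of observations inside the interval x_bar ± k * s."""
    return sum(x_bar - k * s <= x <= x_bar + k * s for x in data) / len(data)

shares = {k: share_within(k) for k in (1, 2, 3)}
print(shares)
```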

What graphical tool is best used to display the relative frequency of grouped quantitative data?

Histogram: Histograms are used to display the relative frequency of quantitative data. An ogive is used to display the cumulative frequency, while the bar chart and pie chart display qualitative data.

The Fahrenheit scale for measuring temperature would be classified as a(n)

Interval Scale: Zero in Fahrenheit degrees does not mean "no temperature." We cannot say, for example, that today is twice as warm as six months ago; that kind of ratio comparison is only meaningful on the ratio scale.

Which scales of data measurement are associated with quantitative data?

Interval and ratio: Two scales are associated with quantitative data: interval scale and ratio scale.

What graphical tool would you use to display the cumulative relative frequency of the grouped data?

Ogive

An undergraduate student's status (freshman, sophomore, junior, or senior) is an example of which scale of measurement?

Ordinal scale: Undergraduate students are classified into the four categories based on the number of credit hours earned. There is a natural ordering between the four categories; sophomores have more credit hours than freshmen, and so on.

For which of the following data sets will a pie chart be most useful?

Percentage of net sales by product for Lenovo in Year 1: Only percentage of net sales by product for Lenovo in Year 1 looks at multiple categories of a single qualitative variable, in which the percentage of net sales by product may be meaningfully displayed.

The accompanying chart shows the number of books written by each author in a collection of cookbooks. What type of data is being represented?

Qualitative, nominal: The data are qualitative and nominal (no ordering is present in the categories).

Which of the following is an example of time series data?

Quarterly housing starts collected over the last 60 years: Time series data refers to data collected by recording a characteristic of a subject over several time periods.

Which of the following scales represents the strongest level of measurement?

Ratio Scale

San Francisco 49ers' linebacker Patrick Willis won the Defensive Rookie of the Year Award in 2007 with a total of 174 tackles. Tackles are measured on what kind of a scale? Is a variable measuring the number of tackles considered continuous or discrete?

Ratio scale; discrete

Which of the following is an example of cross-sectional data?

Results of market research testing consumer preferences for soda: Cross-sectional data refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.

A stem-and-leaf diagram is constructed by separating each value of a data set into two parts. What are these parts?

A stem, consisting of the leftmost digits, and a leaf, consisting of the remaining digits.

Bar chart for qualitative data

The data are qualitative and the chart is a bar chart.

When using a polygon to graph quantitative data, what does each point represent?

The midpoint of a particular class and its associated frequency or relative frequency

Which of the following variables is not continuous?

The number of heads obtained when a fair coin is tossed 20 times: Although in practice the exact values of such variables as height, time, and temperature are approximated, they are continuous in nature. If a fair coin is tossed 20 times, the possible numbers of heads obtained are 0, 1, 2, ..., 20.

Which of the following are examples of cross-sectional data?

- The sales prices of single-family homes sold last month in California
- The current average prices of regular gasoline in different states
- The test scores of students in a class
Cross-sectional data refers to data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.

Which of the following can be represented by a continuous random variable?

The time of a flight between Chicago and New York: A discrete random variable assumes a countable number of possible values, whereas a continuous random variable is characterized by uncountable values.

Why do we sample?

Collecting data on the entire population is too expensive and too difficult

T or F: Consider these events. A = The survey respondent is less than 40 years old. B = The survey respondent is 40 years or older. Events A and B are mutually exclusive and exhaustive.

True: Events are mutually exclusive if they do not share any common outcome of a random experiment. Events are exhaustive if all possible outcomes of a random experiment are included in the events.

T or F: Permutations are used when the order in which different objects are arranged matters.

True: If the order in which objects are arranged matters, we should use permutations.
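A short sketch of the difference, using five hypothetical items and choosing three. Permutations count ordered arrangements, while combinations count unordered selections, so the permutation count is larger.

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["A", "B", "C", "D", "E"]  # hypothetical objects

n_perm = perm(5, 3)  # ordered arrangements: 5! / (5 - 3)!
n_comb = comb(5, 3)  # unordered selections: 5! / (3! * (5 - 3)!)

# Listing the arrangements explicitly gives the same counts.
listed_perm = len(list(permutations(items, 3)))
listed_comb = len(list(combinations(items, 3)))
print(n_perm, n_comb)
```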

T or F: Structured data tends to include numbers, dates, and groups of words and numbers called strings.

True: Structured data generally refers to data that has a well-defined length and format. This type of data is not open to interpretation.

T or F: The coefficient of variation is a unit-free measure of dispersion.

True: The coefficient of variation is computed as CV = s/x̄ and is a relative measure of dispersion.
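To see why the coefficient of variation is unit-free, the sketch below computes it for a hypothetical data set and for the same data multiplied by 10, as if measured in a different unit. The standard deviation and mean both scale by 10, so their ratio should not change.

```python
import statistics

def coefficient_of_variation(data):
    """CV = sample standard deviation divided by sample mean."""
    return statistics.stdev(data) / statistics.mean(data)

data_a = [10, 12, 14]                # hypothetical values in one unit
data_b = [v * 10 for v in data_a]    # the same values expressed in a smaller unit

cv_a = coefficient_of_variation(data_a)  # s = 2, mean = 12
cv_b = coefficient_of_variation(data_b)  # s = 20, mean = 120
print(cv_a, cv_b)
```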

T or F: Two events A and B are independent if the probability of one does not influence the probability of the other.

True: Two events are independent if the conditional probability of one event equals its unconditional probability. Using formulas, P(A | B) = P(A).

T or F: Ordinal scale reflects a stronger level of measurement than the nominal scale.

True: With ordinal data we are able both to categorize and rank the data with respect to some characteristic.

Scatterplot

When looking at the plotted points, the variables have a positive relationship (y tends to increase as x increases), and the relationship appears linear or slightly curvilinear.

Sample

a subset of the population

probability distribution

describes all possible values of a random variable together with their probabilities; every random variable has one

The complement of an event A, within the sample space S, is the event consisting of ____________.

all outcomes in S that are not in A: The complement of event A, A^c, is the event consisting of all outcomes in the sample space S that are not in A.

Chebyshev's theorem is applicable when the data are___________________.

any shape: There are no restrictions on the shape of a data distribution when using Chebyshev's theorem.
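A minimal sketch of the theorem on deliberately skewed, hypothetical data. Chebyshev's theorem says that at least 1 - 1/k² of the observations lie within k standard deviations of the mean, for any k > 1. The observed share can be well above this lower bound.

```python
import statistics

def chebyshev_bound(k):
    """Minimum proportion of observations within k standard deviations (k > 1)."""
    return 1 - 1 / k**2

skewed = [1, 1, 1, 2, 2, 3, 4, 5, 8, 30]  # hypothetical right-skewed data
x_bar = statistics.mean(skewed)
s = statistics.stdev(skewed)

k = 2
observed = sum(abs(x - x_bar) <= k * s for x in skewed) / len(skewed)
print(chebyshev_bound(k), observed)  # the observed share should be at least the bound
```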

Subjective probability

assigned on personal judgement

Random Variable

assigns numerical value to the outcomes of an experiment

We need statistics in order to:

avoid making uninformed decisions and costly mistakes; make sound statistical conclusions rather than questionable choices

Classical probability

based on logical analysis rather than on observation or personal judgement

Population parameters are difficult to calculate due to

both cost prohibitions on data collection and the infeasibility of collecting data on the entire population.

Sample Statistics

calculated from the sample data and used to make inferences about the unknown population parameter

Hypergeometric probability distribution

cannot assume trials are independent: use when sampling without replacement from a population whose size N is not significantly larger than the sample size n (population not much bigger than sample)
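The hypergeometric probability can be sketched with the standard counting formula C(S, x) · C(N - S, n - x) / C(N, n). The population size N, the number of successes S in the population, and the sample size n below are hypothetical, and the function name is illustrative.

```python
from math import comb

def hypergeom_pmf(x, n, S, N):
    """P(X = x): x successes in a sample of n drawn without replacement
    from a population of N items that contains S successes."""
    return comb(S, x) * comb(N - S, n - x) / comb(N, n)

# Hypothetical setup: 20 items, 5 of them successes, sample 4 without replacement.
p_one = hypergeom_pmf(1, n=4, S=5, N=20)  # probability of exactly one success
total = sum(hypergeom_pmf(x, n=4, S=5, N=20) for x in range(0, 5))  # sums to 1
print(p_one, total)
```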

When constructing a frequency distribution for quantitative data, it is important to remember that _____________.

classes should be mutually exclusive, exhaustive, and the total number of classes should be between 5 and 20.

Descriptive Statistics

collecting, organizing and presenting data

Population

consists of all items of interest

probability density

function used to describe continuous random variables

A(n) ____________ variable is characterized by uncountably many values and can take any value within an interval.

continuous: A continuous variable can take on any value within an interval, while a discrete variable assumes a countable number of distinct values.

Cross sectional data

data collected by recording a characteristic of many subjects at some (one) point in time

Time series data

data collected over several time periods

positively skewed data

data forms a long, narrow tail to the right

Binomial random variable

defined as number of successes achieved in n trials of Bernoulli process
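For a binomial random variable, the probability of x successes in n Bernoulli trials with success probability p is C(n, x) · p^x · (1 - p)^(n - x). The sketch below evaluates this formula for hypothetical values of n and p; the function name is illustrative.

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: 5 tosses of a fair coin, exactly 2 heads.
p_two = binomial_pmf(2, n=5, p=0.5)  # 10 / 32
total = sum(binomial_pmf(x, n=5, p=0.5) for x in range(6))  # all outcomes sum to 1
print(p_two, total)
```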

Risk averse

demand positive expected gain from taking risk

Scatterplot

depicts relationship between x and y

Correlation Coefficient

describes both direction and strength of relationship

cumulative distribution function

describes either continuous or discrete random variables

The two branches of the study of statistics are generally referred to as

descriptive and inferential statistics.

Your business statistics class had a test last week. The average score for the class is an example of

descriptive statistics: Descriptive statistics refers to summarizing a set of data.

Inferential Statistics

drawing conclusions about a population based on sample data: more important, more useful, harder to analyze

Frequency Distribution

for qualitative data, groups the data into categories and records how many observations fall into each category

negatively skewed data

forms a long, narrow tail to the left

In order to summarize qualitative data, a useful tool is a __________.

frequency distribution

Variable

general characteristic being observed on objects of interest

Risk neutral

ignore risk; always accept a prospect that offers a positive expected gain

Bernoulli Process

a series of independent and identical trials in which each trial has only two possible outcomes, success or failure; the probabilities of success and failure remain the same from trial to trial

Big Data

massive volume of data that is difficult to manage, process and analyze using traditional tools

median

measure of central location that is not affected by outliers

Nominal and Ordinal Scales

measure qualitative data

Interval and Ratio

measure quantitative data

Statistics

methodology of extracting useful information from a data set

Geometric mean

multiplicative average that incorporates compounding

Possible responses to the survey question were: "Yes," "No," or "Don't Know." This data is best classified as

nominal scale: With nominal data all we can do is categorize or group the data.

Probability

numerical value between 0 and 1 that measures the likelihood that an uncertain event occurs. 0 indicates an impossible event; 1 indicates a definite event

A sample statistic is an estimate of

population parameter: Population parameter is estimated by sample statistic.

Arithmetic mean

primary measure of central location, often referred to as the average: an additive average that ignores the effects of compounding

experiment

process that leads to one of the several possible outcomes

Histograms and stem-and-leaf diagrams describe

quantitative data

Continuous

random variable that assumes uncountably many values in an interval

Discrete

random variable that assumes a countable number of distinct values; ex: number of credit cards

Empirical probability

relative frequency of occurrence

weighted mean

relevant when some observations contribute more than others
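A minimal sketch of a weighted mean, using hypothetical course scores and weights. Each observation is multiplied by its weight, and the sum is divided by the total weight.

```python
def weighted_mean(values, weights):
    """Sum of weight * value, divided by the sum of the weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)

scores = [80, 90, 70]         # hypothetical: homework, midterm, final
weights = [0.2, 0.3, 0.5]     # hypothetical: the final counts the most

course_grade = weighted_mean(scores, weights)   # 16 + 27 + 35 = 78
simple_average = sum(scores) / len(scores)      # 80, treats all scores equally
print(course_grade, simple_average)
```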

cumulative frequency distribution

shows the number of items with values less than or equal to the upper limit of each class
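Building a cumulative frequency distribution amounts to taking running totals of the class frequencies. The class limits and frequencies below are hypothetical. Dividing by the total gives the cumulative relative frequencies that an ogive would plot.

```python
from itertools import accumulate

upper_limits = [10, 20, 30, 40]   # hypothetical upper limit of each class
frequencies = [3, 7, 5, 5]        # hypothetical number of observations in each class

# Running total: observations less than or equal to each upper limit.
cumulative = list(accumulate(frequencies))

total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for limit, c, cr in zip(upper_limits, cumulative, cumulative_relative):
    print(limit, c, cr)
```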

valid probability

sum of all probabilities equals 1

Covariance

indicates the direction (positive or negative) of the linear relationship between two variables

probability mass function

used to describe discrete random variables

When a characteristic of interest differs among various observations, then it can be termed a

variable: A variable is the general characteristic being observed on a set of people, objects, or events, where each observation varies in kind or degree.

Structured Data

well defined length and format: numbers, dates, strings of words

The empirical rule can be used to estimate some proportions_________________.

when the data are approximately symmetric and bell-shaped: The empirical rule is only applicable for approximately symmetric and bell-shaped data sets.

