Test 1 Material - BUSN 3000
A company was concerned with the low participation rate in its 401(k) plan. It sampled 37 similar companies and obtained data on their 401(k) participation rates, typical employee salary and number of employees. What are the cases?
37 companies
A company was concerned with the low participation rate in its 401(k) plan. It sampled 37 similar companies and obtained data on their 401(k) participation rates, typical employee salary and number of employees. What is the response variable?
401(k) participation rates
In order to analyze website sales, you collect data on the average item price ($), number of items sold, and total value of sales ($) for various websites. What would the cases be (to be put in the rows of the data table before analyzing it)? A. average item price ($) B. websites C. total value of sales ($) D. number of items sold
B. websites
For a study of TV shows, you obtain the rating, market share, and advertising revenue for 80 popular TV shows. Which of these are statistical questions that could be answered from this data set? (Select all that apply.) A. Is there a relationship between the rating and advertising revenue of the 80 shows in this sample? B. What is the distribution of ratings for the 80 shows in this sample? C. Does the market share of TV shows differ based on the gender of viewers? D. For all TV shows on network television, what percentage have ratings of 5.0 or lower? E. Which of the 80 popular shows in this sample would you rate the highest?
A, B
An important precursor to analyzing data is to organize it into a data table. For a study of football teams, you obtain the number of touchdowns scored during the preceding season, number of interceptions during the preceding season, and number of passing yards during the preceding season. What goes in the rows, representing the cases? A. different teams B. number of touchdowns scored during the preceding season C. number of passing yards during the preceding season D. number of interceptions during the preceding season
A. different teams
Which of the following variables are categorical? (Select all that apply.) A. expected letter grade in MSIT 3000 B. number of hours studied per week C. hours per week spent on extracurricular activities D. total number of credit hours taken so far
A. expected letter grade in MSIT 3000
A food retailer that specializes in organic food wants to open a store at a new location in the state of North Carolina, but first they want to predict whether the new store will be profitable in the town that is being considered. The researchers examine data from their existing organic food stores across the country to find out if there is a relationship between the median age of town residents and the monthly sales of the store. What is the explanatory variable? A. monthly sales B. zip code of the store C. town name D. median age of town residents
D. median age of town residents
A food retailer that specializes in organic food wants to open a store at a new location in the state of North Carolina, but first they want to predict whether the new store will be profitable in the town that is being considered. The researchers examine data from their existing organic food stores across the country to find out if there is a relationship between the median age of town residents and the monthly sales of the store. What is the response variable? A. monthly sales B. town name C. median age of town residents D. zip code of the store
A. monthly sales
A listing posted by a restaurant chain gives, for each of the pastas it sells, the types of sauces for the pastas, number of calories, and serving size in ounces. The data might be helpful to assess the nutritional value of the various pastas. What are the categorical variables? (Select all that apply.) A. the types of sauces for the pastas B. the number of calories in the pastas C. the serving size of the pastas
A. the types of sauces for the pastas
A listing posted by a restaurant chain gives, for each of the pastas it sells, the types of sauces for the pastas, number of calories, and serving size in ounces. The data might be helpful to assess the nutritional value of the various pastas. What are the quantitative variables? A. the types of sauces for the pastas B. number of calories C. serving size
B, C
An MBA program wants to see how the graduating GPA of MBA students is related to their GMAT scores over the period 2014-2020. Which of the following are statistical questions that can be answered from this study? (Select all that apply.) A. How GMAT score is related to graduating GPA for all MBA students from this school. B. How GMAT score is related to graduating GPA for MBA students from this school who graduated from 2014-2020. C. The distribution of GMAT scores for students who applied to the MBA program from 2014-2020. The distribution of GMAT scores for students from this school who successfully graduated with MBA from 2014-2020. D. The proportion of MBA students who graduated from 2014-2020 who had GPA above 3.65.
B, C, D
In 2005, a magazine published an article evaluating refrigerators. It listed 65 models, giving the brand, cost ($), size (cubic ft), type, estimated annual energy cost ($), a summary rating (poor/ fair/ excellent), and percentage requiring repairs over the past 5 years. What are the categorical variables? (Select all that apply.) A. brand B. price ($) C. size (cubic ft) D. type of refrigerators E. estimated annual energy cost($) F. summary rating (poor/fair/excellent) G. percentage requiring repairs over the past 5 years
A, D, F
School administrators collect data on the students attending the school. Which of the following variable(s) are quantitative? (Select all that apply.) A. whether the student has GPA above 2.8 B. current GPA C. class standing (Freshman, Sophomore, etc.) D. 811 number E. whether the student has taken more than 75 credit hours F. zip code of the student's home address
B. current GPA
A listing posted by a restaurant chain gives, for each of the pastas it sells, the types of sauces for the pastas, number of calories, and serving size in ounces. The data might be helpful to assess the nutritional value of the various pastas. What are the cases for this data set? A. the serving size of the pastas B. the different pastas C. the number of calories D. the different branches of this restaurant chain E. the types of sauces for the pastas
B. the different pastas
Statistical Question
Based on responses from different people and can be answered directly by analyzing the relevant data set - ex. What are the average salaries of graduates with BBA vs. MBA degrees?
Research Question
Broad scope, ongoing investigation with multiple aspects - ex. How can a company increase sales?
In 2005, a magazine published an article evaluating refrigerators. It listed 65 models, giving the brand, cost ($), size (cubic ft), type, estimated annual energy cost ($), a summary rating (poor/ fair/ excellent), and percentage requiring repairs over the past 5 years. Is this data set cross-sectional or time series? A. time series B. both cross-sectional and time series C. cross-sectional
C. cross-sectional
Indicate how the following data set should be organized in order to analyze it. Indicate which items should go in the rows (the cases), and what the headings of columns should be (the variables). Data collected for debt consolidation: weekly debt, week number of the year, weekly debt predicted by last year, difference between predicted debt and realized debt. A. Each row is a difference between the actual and predicted debt. Columns hold the differences (identifier), week number, predicted debt, and actual debt for that week. B. Each row is a different debt amount. Columns hold the actual debt amount (identifier), week number, predicted debt, and the difference between the actual and predicted debt. C. Each row is a debt prediction. Columns hold the debt (identifier), the week in which the debt was predicted, the actual debt of the predicted week, and the difference between the actual and predicted debt. D. Each row is a week. Columns hold the week number (identifier), the debt prediction, the actual debt, and the difference between the actual and predicted debt.
D. Each row is a week. Columns hold the week number (identifier), the debt prediction, the actual debt, and the difference between the actual and predicted debt.
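The layout in answer D can be sketched in a few lines of Python. This is only an illustration: the debt figures are made up, the column names are hypothetical, and the difference is assumed to be actual minus predicted (the card does not fix the sign).

```python
# Hypothetical weekly debt figures; the numbers are invented for illustration.
# Each dict is one row (one week); each key is a column (a variable).
weeks = [
    {"week": 1, "predicted_debt": 1200.0, "actual_debt": 1150.0},
    {"week": 2, "predicted_debt": 1180.0, "actual_debt": 1210.0},
    {"week": 3, "predicted_debt": 1160.0, "actual_debt": 1160.0},
]


def add_difference(rows):
    """Return new rows with a 'difference' column (assumed: actual minus predicted)."""
    return [
        {**row, "difference": row["actual_debt"] - row["predicted_debt"]}
        for row in rows
    ]


table = add_difference(weeks)

# Print the table with the week number (the identifier) in the first column.
for row in table:
    print(row["week"], row["predicted_debt"], row["actual_debt"], row["difference"])
```

The week number serves only as the identifier; the other three columns are the variables that would be analyzed.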
In 2005, a magazine published an article evaluating refrigerators. It listed 65 models, giving the brand, cost ($), size (cubic ft), type, estimated annual energy cost ($), a summary rating (poor/ fair/ excellent), and percentage requiring repairs over the past 5 years. What are the cases? A. the overall recommendation of the refrigerators B. the brands of the refrigerators C. the number of refrigerators that were sold in 2005 D. the price of the refrigerators E. the 65 models of refrigerators
E. the 65 models of refrigerators
Statistical Problem-Solving Process
Formulate Questions, Collect Data, Analyze Data, Interpret Results
An MBA program wants to see how the graduating GPA of MBA students is related to their GMAT scores over the period 2014-2020. What is the explanatory variable?
GMAT scores
An MBA program wants to see how the graduating GPA of MBA students is related to their GMAT scores over the period 2014-2020. What are the cases?
MBA students
Quantitative Variable
tells us how much of something was measured and quantifies exactly how far apart individual items are - ex. height, weight, salary, score, distance, time, GPA
A large company is interested in seeing how various promotional activities are related to sales. For every month over the past three years (2017-2019), they add up the total amount spent on various forms of advertising ($ thousands) as well as the monthly sales ($ millions). What is the explanatory variable?
amount spent on advertising
To help investigate discrepancies between how men and women are treated in the workforce, an economist collects data on the average annual salaries of men vs. women at the Fortune 100 companies for 2019. Since this data was collected during 2019, this data is:
cross-sectional
Time Series
data that consists of the same item measured repeatedly - ex. price of Bitcoin at the end of each day for the past year, annual GDP of the US for each of the past 20 years
Cross-sectional
data on many different items, each measured only once at the same point in time - ex. GDP of all European countries for 2020, closing price on December 31 of all stocks in NASDAQ 100
Where is data organized and how?
Excel; usually organized with cases in rows and variables in columns
An MBA program wants to see how the graduating GPA of MBA students is related to their GMAT scores over the period 2014-2020. What is the response variable?
graduating GPA
Cases
individual items about which we record several different measurements; contained in rows; includes identifiers
Variables
measurements about cases; contained in columns
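A rough sketch of the cases-in-rows, variables-in-columns layout, loosely modeled on the refrigerator example. The model IDs, brand names, and values are all invented, and the column names are hypothetical.

```python
# Hypothetical data table: each dict is one case (a row),
# and each key is one variable (a column). All values are made up.
table = [
    {"model_id": "R-001", "brand": "BrandA", "cost": 899.0, "size_cuft": 18.2},
    {"model_id": "R-002", "brand": "BrandB", "cost": 1249.0, "size_cuft": 22.5},
    {"model_id": "R-003", "brand": "BrandA", "cost": 1049.0, "size_cuft": 20.1},
]

# The identifier labels each case; it is listed first and is not analyzed.
identifier = "model_id"

# Cases: one per row, named by the identifier.
cases = [row[identifier] for row in table]

# Variables: every column other than the identifier.
variables = [name for name in table[0] if name != identifier]

print("cases:", cases)
print("variables:", variables)
```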
A large company is interested in seeing how various promotional activities are related to sales. For every month over the past three years (2017-2019), they add up the total amount spent on various forms of advertising ($ thousands) as well as the monthly sales ($ millions). What is the response variable?
sales
Categorical Variable
separates cases into distinct categories; cannot specify how far apart two items are and does not support any type of mathematical calculation - ex. gender, race, nationality, hair color, student ID number, major, class standing, grade, occupation, zip code
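One way to see the difference between the two variable types is in how each is summarized. The sketch below uses made-up student records with hypothetical column names: the quantitative variable is averaged, while the categorical ones are only counted. The zip code is stored as text because, although it looks numeric, arithmetic on it is meaningless.

```python
from collections import Counter
from statistics import mean

# Hypothetical student records; every value here is invented for illustration.
students = [
    {"student_id": "S1", "gpa": 3.2, "class_standing": "Sophomore", "zip_code": "30601"},
    {"student_id": "S2", "gpa": 3.8, "class_standing": "Senior", "zip_code": "30605"},
    {"student_id": "S3", "gpa": 2.9, "class_standing": "Sophomore", "zip_code": "30601"},
]

# Quantitative variable: arithmetic such as an average is meaningful.
average_gpa = mean(row["gpa"] for row in students)

# Categorical variables: we can only count how many cases fall in each category.
standing_counts = Counter(row["class_standing"] for row in students)
zip_counts = Counter(row["zip_code"] for row in students)

print("average GPA:", round(average_gpa, 2))
print("class standing counts:", dict(standing_counts))
print("zip code counts:", dict(zip_counts))
```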
A large company is interested in seeing how various promotional activities are related to sales. For every month over the past three years (2017-2019), they add up the total amount spent on various forms of advertising ($ thousands) as well as the monthly sales ($ millions). What are the cases?
the 36 months
A financial analyst records the CEO total compensation for each of the Fortune 100 companies over the past 12 years. These data are:
time series
Identifier
unique identification assigned to each individual or item, listed in the first column of data table, categorical variable, not analyzed - ex. SSN, student ID number, transaction number
How do you graph a time series?
use a line indicating changes over time (line graph)
Explanatory Variable
used to calculate predictions - ex. type of customer (gender, age, location, etc.)
Response Variable
what we would like to predict - ex. amount spent by online customers
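To make the two roles concrete, here is a minimal sketch that uses an explanatory variable (advertising spend) to predict a response variable (sales) with a least-squares line. Fitting a straight line is an assumption made for illustration, not something these cards prescribe, and the monthly figures are invented.

```python
# Hypothetical monthly data; the numbers are made up for illustration.
advertising = [10.0, 20.0, 30.0, 40.0]  # explanatory variable ($ thousands)
sales = [1.5, 2.5, 3.5, 4.5]            # response variable ($ millions)


def fit_line(x, y):
    """Return (slope, intercept) of the least-squares line for predicting y from x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(advertising, sales)


def predict_sales(ad_spend):
    """Predict the response (sales) from a value of the explanatory variable."""
    return intercept + slope * ad_spend


print("predicted sales for $50 thousand of advertising:", round(predict_sales(50.0), 2))
```

Swapping the two lists would answer a different question (predicting advertising from sales), which is why it matters which variable is labeled explanatory and which is labeled response.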