Exam 1 Study Guide

A soft drinks company developed a new fizzy drink (Delicious Fizz). A researcher conducted a series of blind tasting trials to measure consumer response to the new drink; this involved consumers drinking Delicious Fizz and a rival fizzy drink (Rival Fizz) and then rating the products on a ten-point taste scale. The result for consumption of Rival Fizz and taste was p = 0.02, while the result for consumption of Delicious Fizz and taste was p = 0.015. How should the researcher interpret her findings?

'Delicious Fizz' had greater significance

Why do business analysts use SPSS rather than performing calculations by hand?

All of the above: quantitative data analysis is so complex today that it is essential to use a stats package; it reduces the chance of making errors in your calculations; and it equips you with a useful transferable skill.

Which of the following is not a file extension for files saved in SPSS?

.doc

Which of the numbers below might IBM SPSS report as 10.574 E−05? Answer choices 0.00010574 10.569 1057400.0 0000.10574

0.00010574

Given a test is normally distributed with a mean of 30 and a standard deviation of 6: What is the probability that a single score drawn at random will be greater than 34? 0.0228 0.2524 0.1826

0.2524
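The listed answer can be checked numerically. This is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not part of the original question. The computed upper-tail area agrees with 0.2524 to about three decimal places.

```python
from statistics import NormalDist

# Test scores: normal with mean 30 and standard deviation 6 (from the question).
scores = NormalDist(mu=30, sigma=6)

# P(X > 34) is the upper-tail area: 1 minus the cumulative probability at 34.
p_greater_than_34 = 1 - scores.cdf(34)

print(round(p_greater_than_34, 4))
```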

A researcher working in a Human Resources department was interested in gender and sales figures, so he conducted a t-test. The mean for males was 66.25 and the mean for females was 78.24, with both groups having a standard deviation of 7. What is the effect size using Cohen's d?

1.712
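Cohen's d is the difference between the two means divided by the standard deviation. Because both groups have the same standard deviation here, the pooled value is simply 7. The sketch below uses a hypothetical helper name, `cohens_d`, to show the arithmetic; the result is roughly 1.71, in line with the listed answer.

```python
def cohens_d(mean_a: float, mean_b: float, pooled_sd: float) -> float:
    """Standardized mean difference: (mean_a - mean_b) / pooled_sd."""
    return (mean_a - mean_b) / pooled_sd


# Females (78.24) versus males (66.25), common standard deviation of 7.
d = cohens_d(78.24, 66.25, 7)

print(round(d, 3))
```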

The owner of the large chain of coffee shops called 'MoonBucks' decided to calculate how much revenue was gained from lattes each month in a nationwide sample of 2445 cafés. To measure the variance of revenue gained from lattes, he computes SS = 351,936 for this sample. What are the degrees of freedom for variance?

2444

Twenty-one cats were given 300g of tuna each. The time in seconds was measured until they had eaten all of the tuna: 16, 18, 18, 22, 22, 23, 23, 24, 26, 29, 32, 34, 34, 36, 36, 42, 43, 46, 46, 49, 57 Compute the median.

32
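With 21 ordered scores, the median is the 11th value. A short check using Python's standard library, with the data copied from the question:

```python
from statistics import median

# Eating times in seconds for the 21 cats (from the question).
times = [16, 18, 18, 22, 22, 23, 23, 24, 26, 29, 32,
         34, 34, 36, 36, 42, 43, 46, 46, 49, 57]

# With an odd number of scores, the median is the middle one after sorting.
middle_position = (len(times) + 1) // 2   # 11th score
median_time = median(times)

print(middle_position, median_time)
```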

If you see in SPSS the number 8.51 E-02 reported, what is the actual value of this number?

8.51 × 10^-2 (that is, 0.0851)

Which of the numbers below might IBM SPSS report as 8.96 E+03? Answer choices 89.60 8960.0 0.008960 8.960

8960.0
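The E notation in these questions means "times ten to the power of". As a rough illustration, the sketch below converts the three values quoted in this guide into ordinary decimals. The helper `parse_e_notation` is hypothetical (it is not an SPSS function); it only strips the space and normalizes the minus sign so that Python's built-in `float` can read the text.

```python
def parse_e_notation(text: str) -> float:
    """Convert text such as '8.96 E+03' into a plain float."""
    # Remove the space before 'E' and replace a typographic minus with '-'.
    cleaned = text.replace(" ", "").replace("\u2212", "-")
    return float(cleaned)


for reported in ["10.574 E-05", "8.51 E-02", "8.96 E+03"]:
    print(reported, "=", parse_e_notation(reported))
```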

An HR manager conducted a review of overtime worked by employees. Her sample size was 60 employees and there was a mean of 90 hours worked overtime per month. Her confidence interval was 95%. What would be the upper boundary of the confidence interval for this study?

91.86 hours

Which of the following is true about a 95% confidence interval of the mean:

95 out of 100 confidence intervals will contain the population mean.

Approximately what percentage of people would have scores lower than an individual with a z-score of 1.65 in a normally distributed sample? Answer choices 95% 98% It is not possible to calculate this unless the mean and standard deviation are given. 1%

95%
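The percentage below a z-score is the cumulative probability of the standard normal distribution at that point. A minimal check with Python's standard library (the variable name is illustrative):

```python
from statistics import NormalDist

# Proportion of a standard normal distribution lying below z = 1.65.
proportion_below = NormalDist().cdf(1.65)

print(f"{proportion_below:.1%}")
```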

Which of the following terms best describes the sentence: 'organizations with employee training programmes will not employ fewer men or women'. A) A directional hypothesis B) An operational definition C) A null hypothesis D) A non-directional hypothesis

A non-directional hypothesis

A researcher in a Human Resources unit presented a recent study that showed a statistically significant relationship between the length of staff lunch breaks and low productivity. How can she explain to her manager that this does not mean that the length of staff lunch breaks should be reduced?

A significant result does not mean that the effect is important

In SPSS, what is the data view window?

A spreadsheet into which data can be entered

Which of these statements is correct about one- and two-tailed tests?

A statistical model that tests a directional hypothesis is called a one-tailed test, whereas one testing a non-directional hypothesis is called a two-tailed test

'Children can learn a second language differently before the age of 7 than after.' Is this statement:

A two-tailed hypothesis

What is the standard error?

All of the options describe the standard error.

Which of the following best describes the variable 'Gender'? Answer choices: A between-group variable. A coding variable. All of the possible answers are correct. A grouping variable.

All of the possible answers are correct

If my null hypothesis is 'Dutch people do not differ from English people in height', what is my alternative hypothesis?

All of the statements are plausible alternative hypotheses.

What is the SPINE of statistics?

An acronym for the five core concepts needed to understand statistical models

You have just joined the sales modeling team for a start-up software company. Your boss has decided that from now on the team will adopt a Bayesian approach. Not all staff understand what this is, so your boss asks you to present a training session. How would you explain a Bayesian approach in your session introduction?

An approach that allows you to update the likelihood of your statistical model as more data is collected

To generate a correlation coefficient between two variables with ordinal data, which set of instructions should you give SPSS?

Analyze->Crosstabs->Descriptive Statistics->Spearman->OK

A human resources manager in the IT sector was concerned about unconscious bias in recruitment panels. There were two posts and seven candidates, four men and three women. Theoretically, all the candidates have an equal probability of being hired as they all match the selection criteria. However, the manager has data that suggests that it is more likely men will be hired based on data from across the IT sector and within her own company. However, the manager has implemented many equality initiatives within her company and therefore wants to determine the probability that still no women will be hired. What formula could she use to determine this probability and assess the impact of unconscious bias in her company's recruitment?

Bayes' theorem
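Bayes' theorem needs a starting probability to update. As one illustrative baseline (an assumption added here, not part of the original answer), the sketch below works out the chance that no women are hired if all seven candidates really were equally likely to fill the two posts. It is a counting calculation, not a Bayesian update.

```python
from fractions import Fraction
from math import comb

men, women, posts = 4, 3, 2   # figures from the question

# Ways to fill both posts with men, out of all ways to pick 2 of the 7 candidates.
all_male_panels = comb(men, posts)
all_panels = comb(men + women, posts)

p_no_women = Fraction(all_male_panels, all_panels)

print(p_no_women, float(p_no_women))
```

The manager's sector-wide data would then act as evidence for revising a baseline of this kind.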

Confidence intervals:

Can be used instead of conventional statistics based on point estimates.

Your manager has asked you to identify the number of men responding in your annual staff survey. How would you generate this output?

Click on Analyze->Descriptive Statistics->Frequencies

How would you use the drop-down menus in SPSS to generate a frequency table?

Click on: Analyze; Descriptive Statistics; Frequencies

An HR manager was interested in employee use of company on-site gyms across twenty sites. Different researchers collected and analyzed data at each of the sites, but the resulting twenty reports showed differing p-values: some sites found a statistically significant relationship between the opening hours of on-site gyms and employee usage, and others did not. Which of the following would be useful for her to review?

Confidence Intervals

When items on a questionnaire appear to correspond to the construct that the questionnaire claims to measure it is said to have: Answer choices: Factorial validity Ecological validity Content validity Criterion validity

Content validity

A colleague in your research agency has phoned and asked you in which sub-dialog box the chi-square test can be found. Which do you recommend?

Crosstabs-Statistics

Ordinal level data are characterized by: Answer choices: Equal intervals between each adjacent score. A fixed zero. Data that can be meaningfully arranged by order of magnitude. None of the above.

Data that can be meaningfully arranged by order of magnitude

An analyst at your firm and you are discussing missing data. What might you suggest as an appropriate strategy for dealing with larger quantities of missing data?

Define missing values using the 'recode' function

In your experiment, you also ask some qualitative questions to enrich the statistical data. What is the correct way to record non-numerical values in SPSS?

Define the variable as 'string'.

For what is the 'variable view' in IBM SPSS's data editor used? Answer choices: Entering data. Writing syntax. Viewing output from data analysis. Defining characteristics of variables.

Defining characteristics of variables

There are basically two types of statistics - descriptive and inferential. Which of the following sentences are true about descriptive statistics?

Descriptive statistics describe the data.

'Reducing the advertising budget will reduce short-term sales performance'. State the direction of this hypothesis.

Directional

The degree to which a statistical model represents the data collected is known as the: Answer choices: Fit Homogeneity Reliability Validity

Fit

You have been asked to assess various atmospheric environments for a brand new fashion retail store. If you are constructing a data file for a repeated-measures design with 10 subjects and three conditions (light and airy, warm and cosy, dark and intense), how many columns and rows will the file have?

Four columns and ten rows

Which of the following statements is true?

If the confidence interval for the difference between two means does not include zero, then the difference between the means is statistically significant.

Why is the standard error important?

It gives you a measure of how well your sample parameter represents the population value.

How is a variable name different from a variable label?

It refers to codes rather than variables

Which of the following could not be represented by columns in the SPSS data editor?

Levels of between-group variables

If we calculated an effect size and found it was r = .42 which expression would best describe the size of effect? Answer choices: Small Small to medium Medium to large Large

Medium to large
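The verbal labels follow the widely used benchmarks for r of about .10 (small), .30 (medium) and .50 (large). The sketch below is one hypothetical way to turn those benchmarks into labels; treating each benchmark as an inclusive lower bound is a simplifying assumption, and the function name is illustrative.

```python
def describe_r(r: float) -> str:
    """Label an effect size r using the .10 / .30 / .50 benchmarks."""
    size = abs(r)  # the sign gives direction, not magnitude
    if size >= 0.50:
        return "large"
    if size >= 0.30:
        return "medium to large"
    if size >= 0.10:
        return "small to medium"
    return "below small"


print(describe_r(0.42))
print(describe_r(0.21))
```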

Why are large samples desirable in statistical models?

More likely to reflect the population under study

Your CEO has just read a book on criticisms of the NHST and worries that all company data analysis is now flawed and will lead to huge financial losses. How might you reassure her?

NHST does have its flaws but if we incorporate an examination of effect sizes into our analysis, we should be able to trust our research outputs

Assume a researcher found that the correlation between a test she had developed and exam performance was .5 in a study of 25 students. She had previously been informed that correlations under .30 are considered unacceptable. The 95% confidence interval was [0.131, 0.747]. Can you be confident that the true correlation is at least 0.30?

No you cannot, because the lower boundary of the confidence interval is .131, which is less than .30, and so the true correlation could be less than .30.
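The interval quoted in the question can be reproduced with the Fisher z transformation, a common method for confidence intervals around a correlation. Whether the original study used this method is an assumption; the function name below is illustrative.

```python
import math
from statistics import NormalDist


def correlation_ci(r: float, n: int, confidence: float = 0.95):
    """Approximate confidence interval for a Pearson r via Fisher's z."""
    z = math.atanh(r)                 # transform r to the z scale
    se = 1 / math.sqrt(n - 3)         # standard error on the z scale
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    lower = math.tanh(z - z_crit * se)  # transform back to the r scale
    upper = math.tanh(z + z_crit * se)
    return lower, upper


lower, upper = correlation_ci(r=0.5, n=25)
print(round(lower, 3), round(upper, 3))
```

Because the lower bound falls below .30, the data are compatible with a true correlation under the acceptable threshold.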

A stockmarket trader conducted a Bayesian analysis of variations in skirt length and stock market growth. He calculated a Bayes factor of 1. Should he use skirt length as a predictor of stock market growth?

No, a Bayes factor of 1 suggests that it is not worth investing in the stock market based on skirt length variations.

You work in a data analyst unit for a large fast food restaurant chain, planning a customer survey and a colleague informs you that a 95% confidence interval has a 95% probability of containing a population parameter. Because of this, she insists that a survey distributed at one restaurant will provide significant results. Do you agree with her?

No, because the 95% probability is a long-run probability that requires multiple tests to be done

A recruitment analyst wanted to examine the likelihood that advertising on social media is more effective than in print media for recruiting the best candidates. She conducted one study where the probability of making a Type I error was 0.05 and a Type II error was 0.2. Does her research have empirical probability?

No, to have empirical probability the likelihood of an effect being detected requires a series of repeated identical experiments, where the probability of making a Type I error is 0.05 and a Type II error is 0.2.

An experimenter measured 30 children's IQ. He then rank-ordered the children and assigned them a score from 30 (most intelligent) to 1 (least intelligent) to create a new variable. Does this new variable consist of: Nominal data Interval data Ratio data Ordinal data

Ordinal data

In our previous example, the human resources manager had already calculated the probability of women being hired based on sector wide data. In the Bayesian approach, what sort of probability is this?

Prior probability

What operation does the 'Recode into Different Variables' initiate?

Redistributes a range of values into a new set of categories and creates a new variable.

When cross-tabulating two variables, it is conventional to?

Represent the independent variable in rows and the dependent variable in columns.

Why might your experimental data file have 'missing data'?

Some of a participant's responses might be missing.

Which of the following is not a transformation that can be used to correct skewed data? Answer choices Log transformation Square root transformation Reciprocal transformation Tangent transformation

Tangent Transformation

If we use the mean as a model, what does the variance represent? Answer choices: The average error between the model and the observed data. The total error between the model and the observed data. The squared total error between the model and the observed data. The square-rooted average error between the model and the observed data.

The average error between the model and the observed data

In general, as the sample size (N) increases:

The confidence interval gets narrower.

Which of the following is the least affected by outliers? Answer choices The range The mean The median The standard deviation

The median

If my experimental hypothesis were 'Eating cheese before bed affects the number of nightmares you have', what would the null hypothesis be?

The number of nightmares you have is not affected by eating cheese before bed.

What is the power of a statistical test?

The probability that it will find an effect when one exists

A 95% confidence interval is:

The range of values of the statistic which probably contains the true value of the statistic in the population.

Your business studies lecturer has devoted the past ten weeks to teaching you the Bayesian approach and is now asking that you offer a critique of it. What key criticism could you raise?

The reliance on a prior probability is overly subjective and therefore can be open to a researcher's degrees of freedom.

You are the newly appointed business analyst for a large national bank (250,000 customers). At a team meeting, your boss presents the results of a survey of customers regarding their opinion (measured on a ten-point scale) of a new financial product. The survey sample (of 500 customers) showed a significant (p = 0.23) level of satisfaction with the new financial product. How should you interpret these results for your boss?

The result is not significant, but the small sample size may be missing large differences in customer satisfaction.

If we were to pull all possible samples from a population, calculate the mean for every sample, and construct a graph of the shape of the distribution based on all of the means, what would we have? Answer choices The population distribution of the mean The sampling distribution of the mean The bootstrap distribution of the mean The standard error of the mean

The sampling distribution of the mean

What is the relationship between sample size and the standard error of the mean?

The standard error decreases as the sample size increases.
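The relationship follows from the formula for the standard error of the mean: the sample standard deviation divided by the square root of the sample size. In the sketch below, the standard deviation of 10 and the sample sizes are made-up illustrative values.

```python
import math


def standard_error(sd: float, n: int) -> float:
    """Standard error of the mean: sd / sqrt(n)."""
    return sd / math.sqrt(n)


# Hypothetical standard deviation of 10 with increasing sample sizes.
sample_sizes = [25, 100, 400]
errors = [standard_error(10, n) for n in sample_sizes]

for n, se in zip(sample_sizes, errors):
    print(n, se)
```

Quadrupling the sample size halves the standard error, which is also why confidence intervals get narrower as N increases.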

Which of the following statements is true?

The standard error is calculated solely from sample attributes.

Of what is the standard error a measure?

The variability of sample estimates of a parameter.

Which of the following is not strictly a legitimate business hypothesis?

There will be no difference in productivity between younger and older employees

What is the null hypothesis for the following question: Is there a relationship between heart rate and the number of cups of coffee drunk within the last 4 hours?

There will be no relationship between heart rate and the number of cups of coffee drunk within the last 4 hours.

Under a null hypothesis, a sample value yields a p-value of .015. Which of the following statements is true?

This finding is statistically significant at the .05 level of significance.

A member of your market research team conducted tests of a new television advert with twenty different groups of consumers, in which they rated their satisfaction with the advert (on a ten-point scale) and their likelihood of purchasing the advertised product (on a five-point scale). He is worried about the familywise error rate across the tests. What advice would you give him?

Use a Bonferroni Correction
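A Bonferroni correction divides the overall significance level by the number of tests, so each test is judged against a stricter threshold. The sketch below assumes an overall alpha of .05 and twenty tests, as in the question; the p-values in the usage example are invented for illustration, and the function names are hypothetical.

```python
def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after a Bonferroni correction."""
    return alpha / n_tests


def significant_after_bonferroni(p_values, alpha=0.05):
    """Flag each p-value as significant or not at the corrected threshold."""
    threshold = bonferroni_alpha(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# Twenty tests at an overall alpha of .05 give a per-test threshold of .0025.
print(bonferroni_alpha(0.05, 20))

# Invented p-values from three tests: only the first survives the correction.
print(significant_after_bonferroni([0.001, 0.02, 0.04]))
```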

What are variables?

Variables are measured constructs that vary across entities in the sample.

A 95% confidence interval for the difference between two population means is found to be (−0.08, 0.15). Which of the following statements is true?

We can be 95% confident that the true difference between the population means falls between −0.08 and 0.15.

A Type I error occurs when:

We conclude that there is an effect in the population when in fact there is not.

A Type II error occurs when :

We conclude that there is not an effect in the population when in fact there is.

You lead a product-testing unit for a large pharmaceutical company. Your team has conducted forty trials of a new antibiotic but you are not sure if the results are conclusive enough to urge the company to start producing the new drug. A new data analyst has joined your team suggesting that meta-analysis might be a good idea, do you agree?

Yes, because the forty trials were identical and tested the same research question and therefore we can calculate an average effect size for the new drug.

You are the CEO of a small financial forecasting company. You have decided to adopt a Bayesian approach to data analysis and modeling. When you announce the new policy, your staff are unhappy and unconvinced, as they are used to an NHST approach. You stress that the Bayesian approach has several key advantages, including which of the following?

You can evaluate the likelihood of the null hypothesis being true

Your CEO has followed your advice and now wants you to measure effect sizes. You report a Pearson's r of 0.50 for the impact of Unblock Me Now drain cleaner on reducing drain blockage time. Your CEO wants to know if this is bad, as she remembers that a p-value of 0.30 is not good. What do you tell her?

You tell her that effect sizes and p-values are not the same and that a Pearson's r of 0.50 is a large effect, suggesting she should roll out the launch of Unblock Me Now.

Hypothesis

A proposed explanation for a fairly narrow phenomenon or set of observations. It is not a guess, but an informed, theory-driven attempt to explain what has been observed.

When a null hypothesis is rejected, the probability of committing a type II error is _____.

all of the above

Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)? Answer choices All of the options are true. Some feature of the data should be normally distributed. The samples being tested should have approximately equal variances. The data should be at least interval level.

All of the options are true.

Theory

an explanation or set of principles that is well substantiated by repeated testing and explains a broad phenomenon

In which sub-dialog box can the Chi Square test be found?

Crosstabs: Statistics

A trainee data analyst for a large social media company, which has falling site usage, has just completed a study into factors that affect site users' satisfaction levels. However, he finds only one statistically significant factor, which he includes in his report but he deliberately omits the other six non-significant findings. What is the term for what the data analyst has done?

p-hacking

What is the relationship between the sum of squared errors (SS), the sample size (n) and the variance (s2)? Answer choices: SS = s2/(n - 1). s2 = SS(n - 1). n = (s2/SS) - 1. s2 = SS/(n - 1).

s2 = SS/(n - 1)
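As a worked illustration of this formula, the sketch below reuses the figures from the MoonBucks question earlier in this guide (SS = 351,936 from a sample of 2445 cafés). The variable names are illustrative.

```python
import math

ss = 351_936   # sum of squared errors (from the MoonBucks question)
n = 2445       # sample size

df = n - 1                 # degrees of freedom
variance = ss / df         # s2 = SS / (n - 1)
std_dev = math.sqrt(variance)

print(df, variance, std_dev)
```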

If we calculated an effect size and found it was r = .21 which expression would best describe the size of effect?

Small to medium

What is the standard error?

The standard deviation of sample means

Your business studies lecturer has set you the following hypothesis to test, 'there will be no association between consumer socioeconomic status and level of private health insurance'. What type of hypothesis is this?

Two-tailed

What symbol is used to represent the standard error of the mean?

σx̅

Theory and Hypothesis similarities and differences

Both theories and hypotheses seek to explain the world, but a theory explains a wide set of phenomena with a small set of well-established principles, whereas a hypothesis typically seeks to explain a narrower phenomenon and is, as yet, untested. Both theories and hypotheses exist in the conceptual domain, and you cannot observe them directly.

Which of the following best describes the relationship between sample size and significance testing?

In large samples even small effects can be deemed 'significant'.

The 99% confidence interval usually is:

Wider than the 95% confidence interval.

A researcher was assessing customer satisfaction with MakeMebeautiful, a new beauty product. He had a sample size of 75 and a p-value of 0.10. Should the researcher recommend that the company stop promoting the product?

No, because the sample size is small and p-values are easily affected by sample size.

In hypothesis testing, which hypothesis do we test?

Null

What does NHST stand for?

Null Hypothesis Significance Testing

What are parameters?

Parameters are estimated from the data and are (usually) constructs believed to represent some fundamental truth about the relations between variables in the model.

What is the alternative hypothesis for the following question: Does eating salmon make your skin glow?

People who eat salmon will have a more glowing complexion compared to those who don't.

Which of these statements about statistical power is not true?

All of the statements are true: power is the ability to detect an effect; power can be used to determine how big a sample is required to detect an effect of a certain size; and power is linked to the probability of making a Type I error.

A null hypothesis

Predicts that an experimental treatment will have no effect on a dependent variable of interest

Variation due to some genuine effect is known as: Answer choices: Unsystematic variation Systematic variation Homogeneous variance Residual variance

Systematic variation

A business analyst in a software design company reviewed a series of national surveys of user satisfaction (rated on a ten-point satisfaction scale) with a new gaming interface the company had recently launched. She found that the survey mean was 8. However, her standard error was high (89.5). How should she interpret her results?

The sample mean might not be representative of the population mean

What does a significant test statistic tell us?

The test statistic is larger than would be expected if there were no effect in the population; significance alone does not show that the effect is large or important.

A Type 1 error is when?

We conclude that there is a meaningful effect in the population when in fact there is not.

If research suggests that the mean number of insurance quotations a person makes in a year with a standard deviation of 4, what is the z-score for a score of 18?

-2

What is the conventional level of probability that is often accepted when conducting statistical tests in social science and business?

0.05

'Children can learn a second language faster before the age of 7.' Is this statement:

A one-tailed hypothesis

