STAT-150 Mid Term

sigma

sigma notation explanation

unsystematic variability

the variation (error) not due to the variables tested, but due to something else that may have affected the outcome

measurement error (error variance)

the variation of a number around its true mean due to uncontrolled, essentially random influences

STAT Value

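effect tested / error: the variation explained by what was tested (systematic) divided by the variation left unexplained (unsystematic)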

Which of the numbers below might IBM SPSS report as 10.574 E−05? 1. 0.00010574 2. 10.569 3. 1057400.0 4. 0000.10574

1. 0.00010574
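
A quick way to check E-notation cards like this one is to let a language parse the number. The sketch below assumes only that the SPSS display "10.574 E−05" means 10.574 × 10^-5 and "8.96 E+03" means 8.96 × 10^3; the variable names are illustrative.

```python
# E-notation: the digits after "E" are the power of ten to multiply by.
small = float("10.574e-05")   # 10.574 x 10^-5
large = float("8.96e+03")     # 8.96 x 10^3

small_fixed = f"{small:.8f}"  # fixed-point form with 8 decimal places
large_fixed = f"{large:.1f}"  # fixed-point form with 1 decimal place

print(small_fixed)
print(large_fixed)
```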

The degree to which a statistical model represents the data collected is known as the: 1. Fit 2. Homogeneity 3. Reliability 4. Validity

1. Fit

Out of the following options, which type of graph could we use to compare frequency distributions of several groups simultaneously? 1. Population pyramid 2. Simple histogram 3. Frequency polygon 4. Simple 3-D bar chart

1. Population pyramid

If we use the mean as a model, what does the variance represent? 1. The average error between the model and the observed data. 2. The total error between the model and the observed data. 3. The squared total error between the model and the observed data. 4. The square-rooted average error between the model and the observed data.

1. The average error between the model and the observed data.

Differences between group means can be characterized as a regression (linear) model if: 1. The experimental groups are represented by a binary variable (i.e. coded 0 and 1). 2. The outcome variable is categorical. 3. The groups have equal sample sizes. 4. Differences between group means cannot be characterized as a linear model, they must be analyzed with an independent t-test.

1. The experimental groups are represented by a binary variable (i.e. coded 0 and 1).

What is b0 in regression analysis? 1. The value of the outcome when all of the predictors are 0. 2. The relationship between a predictor and the outcome variable. 3. The value of the predictor variable when the outcome is zero. 4. The gradient of the regression line.

1. The value of the outcome when all of the predictors are 0.

A correlation of .7 was found between time spent studying and percentage on an exam. What is the proportion of variance in exam scores that can be explained by time spent studying? 1. .70 2. .49 3. .30 4. .7

2. .49

Given a test is normally distributed with a mean of 30 and a standard deviation of 6: • What is the probability that a single score drawn at random will be greater than 34? • What is the probability that a sample of 9 scores will have a mean greater than 34? • What is the probability that the mean of a sample of 16 scores will be either less than 28 or greater than 32? 1. 0.0228 2. 0.2524 3. 0.1826

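2. 0.2524 (single score greater than 34); 1. 0.0228 (mean of 9 scores greater than 34); 3. 0.1826 (mean of 16 scores less than 28 or greater than 32)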
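
The three probabilities in this card can be checked with a short sketch. It assumes the standard error of the mean is sd / sqrt(n) and uses Python's standard-library `NormalDist`; exact values may differ from the card's table-based figures in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mean, sd = 30, 6

# Single score greater than 34: use the distribution of raw scores.
p_single = 1 - NormalDist(mean, sd).cdf(34)

# Mean of 9 scores greater than 34: use the sampling distribution,
# whose standard deviation (the standard error) is sd / sqrt(n).
p_mean_9 = 1 - NormalDist(mean, sd / sqrt(9)).cdf(34)

# Mean of 16 scores less than 28 or greater than 32: add both tails.
sampling_16 = NormalDist(mean, sd / sqrt(16))
p_mean_16 = sampling_16.cdf(28) + (1 - sampling_16.cdf(32))

print(round(p_single, 4), round(p_mean_9, 4), round(p_mean_16, 4))
```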

Which of the numbers below might IBM SPSS report as 8.96 E+03? 1. 89.60 2. 8960.0 3. 0.008960 4. 8.960

2. 8960.0

Out of the following options, which type of bar chart would we produce to look at the mean ratings of 'taste' and 'value for money' for two new varieties of Sauvignon Blanc wine? 1. Clustered bar chart 2. All of the options are possible. 3. Stacked bar chart 4. 3-D bar chart

2. All of the options are possible.

A researcher measured people's physiological reactions to horror films. He split the data into two groups: males and females. The resulting data were normally distributed and men and women had equal variances. What test should be used to analyze the data? 1. Dependent t-test 2. Independent t-test 3. Mann-Whitney test 4. Wilcoxon signed-rank test

2. Independent t-test

Which of the following statements about outliers is not true? 1. Outliers are values very different from the rest of the data. 2. Influential cases will always show up as outliers. 3. Outliers have an effect on the mean. 4. Outliers have an effect on regression parameters.

2. NOT TRUE: Influential cases will always show up as outliers.

A researcher was interested in stress levels of lecturers during lectures. She took the same group of 8 lecturers and measured their anxiety (out of 15) during a normal lecture and again in a lecture in which she had paid students to be disruptive and misbehave. What test is best used to compare the mean level of anxiety in the two lectures? 1. Independent samples t-test 2. Paired-samples t-test 3. One-way independent ANOVA 4. Mann-Whitney test

2. Paired-samples t-test

Variation due to some genuine effect is known as: 1. Unsystematic variation 2. Systematic variation 3. Homogeneous variance 4. Residual variance

2. Systematic variation

If we were to pull all possible samples from a population, calculate the mean for every sample, and construct a graph of the shape of the distribution based on all of the means, what would we have? 1. The population distribution of the mean 2. The sampling distribution of the mean 3. The bootstrap distribution of the mean 4. The standard error of the mean

2. The sampling distribution of the mean

The owner of the large chain of coffee shops called 'MoonBucks' decided to calculate how much revenue was gained from lattes each month in a nationwide sample of 2445 cafés. To measure the variance of revenue gained from lattes, he computes SS = 351,936 for this sample. • What are the degrees of freedom for variance? • Compute the variance. • Compute the standard deviation. 1. 144 2. 12 3. 2444

3. 2444 (degrees of freedom); 1. 144 (variance); 2. 12 (standard deviation)
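
The arithmetic behind this card is short enough to sketch. It assumes the sample variance is SS / (n - 1) and the standard deviation is its square root; the variable names are illustrative.

```python
from math import sqrt

ss = 351_936  # sum of squared deviations given in the question
n = 2445      # number of cafés in the sample

df = n - 1            # degrees of freedom for a sample variance
variance = ss / df    # s^2 = SS / (n - 1)
sd = sqrt(variance)   # standard deviation is the square root of the variance

print(df, variance, sd)
```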

How much variance has been explained by a correlation of .9? 1. 18% 2. 9% 3. 81% 4. None of these

3. 81%

Which type of graph can we use to compare frequency distributions of several groups simultaneously? 1. A histogram 2. A bar chart 3. A population pyramid 4. A boxplot.

3. A population pyramid

Which of the following best describes the variable 'Gender'? 1. A between-group variable. 2. A coding variable. 3. All of the possible answers are correct. 4. A grouping variable.

3. All of the possible answers are correct.

When items on a questionnaire appear to correspond to the construct that the questionnaire claims to measure, it is said to have: 1. Factorial validity 2. Ecological validity 3. Content validity 4. Criterion validity

3. Content validity

Ordinal level data are characterized by: 1. Equal intervals between each adjacent score. 2. A fixed zero. 3. Data that can be meaningfully arranged by order of magnitude. 4. None of the above.

3. Data that can be meaningfully arranged by order of magnitude.

If we calculated an effect size and found it was r = .42 which expression would best describe the size of effect? 1. Small 2. Small to medium 3. Medium to large 4. Large

3. Medium to large

Which of the following statistical tests allows causal inferences to be made? 1. Analysis of variance 2. Regression 3. None of these, it's the design of the research that determines whether causal inferences can be made. 4. t-test

3. None of these, it's the design of the research that determines whether causal inferences can be made.

Which of the following is the least affected by outliers? 1. The range 2. The mean 3. The median 4. The standard deviation

3. The median

What symbol represents the test statistic for the Mann-Whitney test? 1. Ws 2. T 3. U 4. H

3. U

If Pearson's correlation coefficient between stress level and workload is .8, how much variance in stress level is not accounted for by workload? 1. 20% 2. 2% 3. 8% 4. 36%

4. 36%
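
The "variance explained" cards all rest on the same step: square the correlation to get the shared proportion of variance, and subtract from 1 for the part not accounted for. A minimal sketch, with a hypothetical helper name:

```python
def shared_variance(r):
    """Proportion of variance shared by two variables with correlation r."""
    return r ** 2

for r in (0.7, 0.9, 0.8):
    explained = shared_variance(r)
    unexplained = 1 - explained
    print(f"r = {r}: explained = {explained:.2f}, unexplained = {unexplained:.2f}")
```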

For what is the 'variable view' in IBM SPSS's data editor used? 1. Entering data. 2. Writing syntax. 3. Viewing output from data analysis. 4. Defining characteristics of variables.

4. Defining characteristics of variables.

For which regression assumption does the Durbin-Watson statistic test? 1. Linearity 2. Homoscedasticity 3. Multicollinearity 4. Independence of errors

4. Independence of errors

What does the error bar on an error bar chart represent? 1. The confidence interval around the mean. 2. The standard error of the mean. 3. The standard deviation of the mean. 4. It can represent any of these.

4. It can represent any of these.

An experimenter measured 30 children's IQ. He then rank-ordered the children and assigned them a score from 30 (most intelligent) to 1 (least intelligent) to create a new variable. Does this new variable consist of: 1. Nominal data 2. Interval data 3. Ratio data 4. Ordinal data

4. Ordinal data

Out of the following options, which type of bar chart would we produce to look at the mean ratings of two new varieties of Sauvignon Blanc (wine)? 1. Clustered bar chart 2. Stacked bar chart 3. Simple 3-D bar chart 4. Simple bar chart

4. Simple bar chart

Which of the following is not a transformation that can be used to correct skewed data? 1. Log transformation 2. Square root transformation 3. Reciprocal transformation 4. Tangent transformation

4. Tangent transformation

A researcher was interested in stress levels of lecturers during lectures. She took the same group of 8 lecturers and measured their anxiety (out of 15) during a normal lecture and again in a lecture in which she had paid students to be disruptive and misbehave. The data were not normally distributed. Which test should she use to compare her experimental conditions? 1. Paired samples t-test 2. Mann-Whitney test 3. Wilcoxon rank-sum test 4. Wilcoxon signed-rank test

4. Wilcoxon signed-rank test

platykurtic distribution

A flatter, less peaked distribution with thinner tails, indicating that fewer returns with large deviations from the mean have occurred, or are expected to occur, than with a normal distribution. (Plat is Flat).

central tendency

A measure that represents the typical response or the behavior of a group as a whole (e.g., the mean, median, or mode). The mean in particular is influenced quite heavily by extreme values.

leptokurtic distribution

A more peaked distribution with heavier tails: more returns are clustered around the mean, and more returns with large deviations from the mean occur, than with a normal distribution.

standard normal distribution

A normal distribution with a mean of 0 and a standard deviation of 1.

Null Hypothesis (H0)

A statement of "no difference."

dependent variable

A variable thought to be affected by changes in an independent variable. You can think of this variable as an outcome.

independent variable

A variable thought to be the cause of some effect. This term is usually used in experimental research to describe a variable that the experimenter has manipulated.

outcome variable

A variable thought to change as a function of changes in a predictor variable. For the sake of an easy life this term could be synonymous with 'dependent variable'.

predictor variable

A variable thought to predict an outcome variable. This term is basically another way of saying 'independent variable'.

interval variable

Equal intervals on the variable represent equal differences in the property being measured (e.g., the difference between 6 and 8 is equivalent to the difference between 13 and 15).

Variability

The extent to which the scores in a data set tend to vary from each other and from the mean.

ratio variable

The same as an interval variable, but the ratios of scores on the scale must also make sense (e.g., a score of 16 on an anxiety scale means that the person is, in reality, twice as anxious as someone scoring 8). For this to be true, the scale must have a meaningful zero point.

binary variable

There are only two categories (e.g., dead or alive).

T/F: If the variables are correlated, when given a measurement of one variable, we can predict the value of the other.

True. But this does not mean we can change the outcome by removing or changing the variable. This is simply an observation.

T/F: The smaller the p-value, the better.

True, in the sense that a smaller p-value is stronger evidence against the null hypothesis. Under .05, you reject the null hypothesis and can state that your result was statistically significant.

standard deviation

a computed measure of how much scores vary around the mean score

continuous variable

a quantitative variable that has an infinite number of possible values that are not countable

positive correlation

a relationship between two variables in which both variables either increase or decrease together

representative sample

a sample that accurately reflects the characteristics of the population as a whole

method of least squares

a statistical way to find the best-fitting line through a set of data points

extraneous variable

any variable other than the independent variable that could affect the results of the experiment

alpha

the significance level: a threshold set by the researcher in advance (commonly .05); the null hypothesis is rejected when the p-value falls below it

confounding variable

extraneous factor that interferes with the action of the independent variable on the dependent variable "extra variable"

kurtosis

how flat or peaked a distribution is compared with a normal distribution

estimations of likelihood

judging how likely it is that something will occur

slope and intercept calculation

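in linear regression the slope is rise / run (the change in the outcome for a one-unit change in the predictor); the intercept is the value of the outcome when the predictor is 0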

systematic variability

variation in the outcome that is due to the variables (or experimental manipulation) tested

random assignment

placing research participants into the conditions of an experiment in such a way that each participant has an equal chance of being assigned to any level of the independent variable

experimental research

research designed to discover causal relationships between various factors

RMSE

root mean squared error

variance

sigma squared is variance of the entire population

confidence interval

a range of values around an estimate, built with a given probability (e.g., 95%), that takes random error into account

p-value < 0.05

statistically significant

Sum of Squares (SS)

sum of squared deviations from the mean

sum of squared errors

sum of the squared differences between each predicted score and actual score on the criterion variable

What does correlational research measure?

the degree of relationship between two or more variables.

slope and line intercept with predictor

the difference between the actual value Y_i and the fitted (or predicted) value Ŷ_i is called the residual (or error): e_i = Y_i − Ŷ_i
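
The regression cards above (method of least squares, slope and intercept, sum of squared errors, residuals) can be tied together in one small sketch. The data are hypothetical, invented only for illustration, and the names `b0` and `b1` follow the usual intercept and slope notation rather than anything specific to this course.

```python
# Hypothetical data, invented purely for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

mean_x = sum(x) / len(x)
mean_y = sum(y) / len(y)

# Least-squares slope: sum of cross-products divided by sum of squares of x.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sxx = sum((xi - mean_x) ** 2 for xi in x)
b1 = sxy / sxx

# Intercept: the fitted line passes through (mean_x, mean_y).
b0 = mean_y - b1 * mean_x

# Residuals: actual value minus fitted value for each observation.
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]

# Sum of squared errors: the quantity least squares makes as small as possible.
sse = sum(e ** 2 for e in residuals)

print(b0, b1, sse)
```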

deviance (error)

the distance of each score from the mean

reliability

the extent to which a test yields consistent results, as assessed by the consistency of scores on two halves of the test, on alternate forms of the test, or on retesting

control group

the group that does not receive the experimental treatment.

Research Hypothesis (H1)

the hypothesis that the experiment was designed to investigate

Dispersion

the spread of scores in a data set around its center, summarized by measures such as the range, interquartile range, variance, and standard deviation

negative correlation

the relationship between two variables in which one variable increases as the other variable decreases

Central Limit Theorem (CLT)

as the sample size grows, the sampling distribution of the mean derived from simple random samples will be approximately normally distributed, whatever the shape of the population distribution

standard error

the standard deviation of a sampling distribution

correlation coefficient

will tell the strength and direction of the relationship

T/F: The smaller the stat value, the better.

False. The bigger the stat value, the larger the effect is relative to the error, and the less likely it is that a result this large would occur if the null hypothesis were true.

nominal variable

There are more than two categories (e.g., whether someone is an omnivore, vegetarian, vegan, or fruitarian).

normal distribution

a bell-shaped curve, describing the spread of a characteristic throughout a population

ordinal variable

a qualitative variable that incorporates an ordered position, or ranking

Parameter

(n.) a determining or characteristic element; a factor that shapes the total outcome; a limit, boundary

Twenty-one cats were given 300g of tuna each. The time in seconds was measured until they had eaten all of the tuna: 16, 18, 18, 22, 22, 23, 23, 24, 26, 29, 32, 34, 34, 36, 36, 42, 43, 46, 46, 49, 57 • Compute the median. • Compute the lower quartile. • Compute the upper quartile. • Compute the interquartile range. 1. 32 seconds 2. 22.5 seconds 3. 42.5 seconds 4. 20 seconds

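1. 32 seconds (median); 2. 22.5 seconds (lower quartile); 3. 42.5 seconds (upper quartile); 4. 20 seconds (interquartile range)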
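
The four values in this card can be reproduced with a short sketch. It assumes the textbook convention of splitting the ordered scores at the median and taking the median of each half; other quartile conventions, including some software defaults, can give slightly different answers.

```python
from statistics import median

times = [16, 18, 18, 22, 22, 23, 23, 24, 26, 29, 32,
         34, 34, 36, 36, 42, 43, 46, 46, 49, 57]
times.sort()
n = len(times)

med = median(times)

# With an odd number of scores, leave the median out of both halves.
lower_half = times[: n // 2]
upper_half = times[n // 2 + 1:]

lower_q = median(lower_half)
upper_q = median(upper_half)
iqr = upper_q - lower_q

print(med, lower_q, upper_q, iqr)
```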

Rank the score of 5 in the following set of scores: 9, 3, 5, 10, 8, 5, 9, 7, 3, 4 1. 4.5 2. 4 3. 3 4. 6

1. 4.5
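
Tied scores share the average of the positions they occupy once the data are ordered, which is where 4.5 comes from. A minimal sketch, with a hypothetical helper name:

```python
scores = [9, 3, 5, 10, 8, 5, 9, 7, 3, 4]

def average_rank(value, data):
    """Mean of the 1-based positions that `value` occupies in the sorted data."""
    ordered = sorted(data)
    positions = [i for i, s in enumerate(ordered, start=1) if s == value]
    return sum(positions) / len(positions)

rank_of_5 = average_rank(5, scores)
print(rank_of_5)
```

Here the two 5s sit in positions 4 and 5 of the ordered scores, so each receives the mean of those positions.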

Approximately what percentage of people would have scores lower than an individual with a z-score of 1.65 in a normally distributed sample? 1. 95% 2. 98% 3. It is not possible to calculate this unless the mean and standard deviation are given. 4. 1%

1. 95%
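
No mean or standard deviation is needed here because a z-score already refers to the standard normal distribution (mean 0, standard deviation 1). A quick check using Python's standard-library `NormalDist`:

```python
from statistics import NormalDist

z = 1.65
# Proportion of the standard normal distribution lying below z.
proportion_below = NormalDist().cdf(z)

print(f"{proportion_below:.1%}")
```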

Which of the following are assumptions underlying the use of parametric tests (based on the normal distribution)? 1. All of the options are true. 2. Some feature of the data should be normally distributed. 3. The samples being tested should have approximately equal variances. 4. The data should be at least interval level.

1. All of the options are true.

The covariance is: 1. All of these. 2. A measure of the strength of relationship between two variables. 3. Dependent on the units of measurement of the variables. 4. An unstandardized version of the correlation coefficient.

1. All of these.

Assuming the assumptions of parametric tests are met, non-parametric tests, compared to their parametric counterparts: 1. Are all of these. 2. Are more conservative. 3. Are less likely to accept the alternative hypothesis. 4. Have less statistical power.

1. Are all of these.

R2 is known as the: 1. Coefficient of determination. 2. Multiple correlation coefficient. 3. Partial correlation coefficient. 4. Semi-partial correlation coefficient.

1. Coefficient of determination.

Which of the following does a box-whisker plot not display? 1. The mean 2. The median 3. Outliers

1. The mean

A researcher measured the same group of people's physiological reactions while watching horror films and compared them to when watching erotic films. The resulting data were skewed. What test should be used to analyze the data? 1. Independent t-test 2. Wilcoxon signed-rank test 3. Dependent (related) t-test 4. Mann-Whitney test

2. Wilcoxon signed-rank test

What is the relationship between the sum of squared errors (SS), the sample size (n) and the variance (s2)? 1. SS = s2/(n - 1) 2. s2 = SS(n - 1) 3. n = (s2/SS) - 1 4. s2 = SS/(n - 1)

4. s2 = SS/(n - 1)

A researcher measured the same group of people's physiological reactions while watching horror films and compared them to when watching erotic films, and a documentary about wildlife. The resulting data were skewed. What test should be used to analyze the data? 1. Independent analysis of variance 2. Repeated-measures analysis of variance 3. Friedman's ANOVA 4. Kruskal-Wallis test

3. Friedman's ANOVA

The t-test tests for: 1. Differences between means 2. Whether a correlation is significant 3. Whether a regression coefficient is equal to zero 4. All of these

4. All of these

