Social Statistics Final
In the ANOVA test, degrees of freedom within (dfw) are equal to ______ and degrees of freedom between (dfb) are equal to (k - 1).
(N - k)
To calculate degrees of freedom, a researcher uses which formula?
(N-1)
In the ANOVA test, degrees of freedom within (dfw) are equal to (N - k) and degrees of freedom between (dfb) are equal to ____.
(k - 1)
Lambda for Table 12.1 is
TABLE 12.1
                      GENDER
CHURCH ATTENDANCE     Male    Female    Total
Attends                200       350      550
Doesn't Attend         300       150      450
Total                  500       500     1000
.22
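The lambda value can be checked by hand from the cell frequencies. The sketch below assumes the usual PRE definitions (E1 = N minus the largest row total, E2 = the sum over columns of the column total minus the largest cell in that column); the function name `lambda_pre` is only illustrative.

```python
# Observed frequencies from Table 12.1
# (rows: attends, doesn't attend; columns: male, female).
table = [
    [200, 350],
    [300, 150],
]

def lambda_pre(table):
    """Return (E1, E2, lambda), treating the column variable as independent."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    columns = list(zip(*table))
    e1 = n - max(row_totals)                          # errors ignoring the independent variable
    e2 = sum(sum(col) - max(col) for col in columns)  # errors predicting within each column
    return e1, e2, (e1 - e2) / e1

e1, e2, lam = lambda_pre(table)
print(e1, e2, round(lam, 2))  # 450 350 0.22
```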
In constructing interval estimates based on sample proportions (Ps), the population proportion (Pu) must be estimated. It is most conservative to estimate Pu at
.50
For variables measured at the nominal level, measures of association will have a lower limit of _______ and an upper limit of _______.
0, 1
In computing Spearman's rho, the sum of the D column must be
0.
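As a quick check on why the D column must sum to zero, here is a minimal sketch using made-up ranks (the five cases and their ranks are hypothetical, with no ties). It also applies the standard formula rho = 1 - 6ΣD² / (N(N² - 1)).

```python
# Hypothetical ranks for five cases on two variables (no ties).
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]

d = [x - y for x, y in zip(rank_x, rank_y)]  # D column: difference in ranks
sum_d = sum(d)                               # positive and negative differences cancel
sum_d2 = sum(v * v for v in d)               # sum of squared differences
n = len(d)
rho = 1 - (6 * sum_d2) / (n * (n**2 - 1))

print(sum_d, sum_d2, rho)
```

If the D column does not sum to zero, a ranking or subtraction error has been made somewhere.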
Given that the null hypothesis is actually true, the probability of Type II error is
0.00
Research involving potentially harmful drugs would be likely to use an alpha level of
0.001
In the formulas for constructing interval estimates based on sample proportions, the expression Pu(1 - Pu) has a maximum value of
0.25
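The maximum is easy to confirm numerically. This small sketch simply scans candidate values of Pu in steps of 0.01 (the step size is an arbitrary choice) and shows that Pu(1 - Pu) peaks at Pu = .50, which is why .50 is the most conservative estimate.

```python
# Scan Pu from 0.00 to 1.00 and find where Pu * (1 - Pu) is largest.
candidates = [i / 100 for i in range(101)]
best = max(candidates, key=lambda p: p * (1 - p))
peak = best * (1 - best)

print(best, peak)  # 0.5 0.25
```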
The probability that a sample mean is within ± 1 Z of the population mean is about
0.68.
We have used an alpha of 0.01 to estimate the average hours of television viewing for residents in a retirement home. What is the chance that our interval estimate does not contain the true population mean?
1%
The shape of the sampling distribution of sample means can be assumed to be normal when N is
100 or more.
Which sample size will produce the confidence interval with the smallest width?
1000
The mean age of all college graduates is 35. If the population distribution is normal, the mean of any sampling distribution of sample mean ages of college graduates will be
35.
In Table 12.1, E2 is
TABLE 12.1
                      GENDER
CHURCH ATTENDANCE     Male    Female    Total
Attends                200       350      550
Doesn't Attend         300       150      450
Total                  500       500     1000
350.
In Table 12.1, E1 is
TABLE 12.1
                      GENDER
CHURCH ATTENDANCE     Male    Female    Total
Attends                200       350      550
Doesn't Attend         300       150      450
Total                  500       500     1000
450.
An alpha level of 0.05 is the same as a confidence level of
95%
A researcher is analyzing regional differences in family size. She has information on number of children for samples of families from four regions. Which of the following would be an appropriate statistical test?
ANOVA
Which pattern of cell frequencies in a 2x2 table would indicate that the variables are independent?
All cell frequencies are exactly the same.
For all tests of hypothesis, the probability of rejecting the null hypothesis is a function of
All of the answer choices
In a t test of differences between means, increasing sample size will affect
All of the answer choices
Measures of association provide the researcher with information that
All of the answer choices
Some potential difficulties arise in the chi square test when
All of the answer choices
The characteristic(s) of a relationship between two variables that must be analyzed for a full understanding of the relationship is (are)
All of the answer choices
The simple random sample requires
All of the answer choices
Which assumption about level of measurement is made for the Chi square test?
All variables are nominal in level of measurement.
The width of an interval estimate can be controlled by
Any of the answer choices
Random samples of 1546 men and 1678 women have been given a scale that measures support of legal abortion. Men average 12.45 and women average 12.46, and the difference is significant at the 0.05 level. What can we conclude?
Because of the large sample sizes, these results may be statistically significant but trivial.
For a 3 x 3 table, the appropriate chi square based measure of association would be
Cramer's V
A researcher has analyzed differences in average College Board scores for random samples of students from four different colleges. The obtained F score is 0.45. What can be concluded about the null hypothesis?
Fail to reject, differences are not statistically significant
For ordinal level variables with only a few categories or values, an appropriate measure of association would be
Gamma.
A researcher can demonstrate a strong association between gender and income. Which variable is independent?
Gender
Which is the correct expression for finding E1?
N - (largest row total)
A difference between samples that is shown to be statistically significant is always
None of the answer choices
To prove that one variable causes another, we use
None of the answer choices: neither measures of association nor tests of significance can prove causal relationships.
A researcher conducted a survey to determine if older people have different feelings about abortion than younger people. He used an alpha level of 0.05 (Z critical = ± 1.96) to test for significance and found that his computed test statistic was 2.76. Which of the following conclusions is justified?
Older and younger people have significantly different feelings about abortion.
Analysis of Variance based upon the effect of a single variable upon another is a
One-way ANOVA.
A random sample of 200 includes 100 Protestants. The researcher estimates, at the 95% confidence level, that between 43% and 57% of the population is Protestant. In this research situation
Ps is .50
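The 43% to 57% interval in this card can be reproduced with a short sketch. It assumes the conservative formula c.i. = Ps ± Z·√(Pu(1 - Pu)/N) with Pu set to .50 and Z = 1.96 for the 95% level; the function name is illustrative.

```python
import math

def proportion_interval(ps, n, z=1.96, pu=0.5):
    """Interval estimate for a proportion, with Pu fixed at its most conservative value."""
    margin = z * math.sqrt(pu * (1 - pu) / n)
    return ps - margin, ps + margin

lower, upper = proportion_interval(ps=100 / 200, n=200)
print(f"{lower:.2f} to {upper:.2f}")  # 0.43 to 0.57
```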
When testing the significance of the difference between two sample proportions, the null hypothesis is
Pu1 = Pu2
A Chi square test has been conducted to assess the relationship between marital status and church attendance. The obtained Chi square is 23.45 and the critical Chi square is 9.488. What may be concluded?
Reject the null hypothesis, church attendance and marital status are dependent
An obtained chi square of 10.78 has been calculated. Critical Chi square is 3.841. What should be concluded?
Reject the null hypothesis, the variables are not independent
Which of the following correctly states the relationship between SST (the total sum of squares), SSB (the sum of squares between), and SSW (the sum of squares within)?
SST = SSB + SSW
A useful computational shortcut for the ANOVA test is expressed as
SSW = SST - SSB
The sizes of four samples vary as follows: Sample A, N = 100; Sample B, N = 76; Sample C, N = 1000; Sample D, N = 150. Which sample will produce the most efficient estimate?
Sample C
Which of the following is NOT an assumption required for a test of hypothesis with a single sample mean?
Sample size (N) larger than 1,000
A researcher reports a χ² (obtained) of 75.34 (α = .05, df = 1, χ² (critical) = 3.84, N = 13,678). The researcher claims that the relationship is extremely significant. What issue should be raised?
Sample size is very large and the actual relationship between the variables might be trivial.
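One way to see how weak the relationship may be is to convert the reported chi square into a measure of association. The sketch below assumes a 2 x 2 table (which df = 1 implies) and uses the standard formulas phi = √(χ²/N) and Cramér's V = √(χ²/(N·min(r - 1, c - 1))); the function names are illustrative.

```python
import math

def phi(chi_square, n):
    """Phi for a 2 x 2 table."""
    return math.sqrt(chi_square / n)

def cramers_v(chi_square, n, rows, cols):
    """Cramer's V for tables larger than 2 x 2 (equals phi when the table is 2 x 2)."""
    return math.sqrt(chi_square / (n * min(rows - 1, cols - 1)))

strength = phi(75.34, 13678)
print(round(strength, 2))  # 0.07
```

A chi square nearly twenty times the critical value corresponds here to a phi of only about .07, a weak association.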
From a University population, random samples of 45 seniors and 37 freshmen have been given a scale that measures sexual experiences. The freshmen report an average of 1.6 sexual partners over their lifetimes while seniors report an average of 2.5 partners. The t (obtained) for this difference was -3.56 while the t (critical) was ± 2.34. What can be concluded?
Seniors and freshmen are significantly different in their sexual experiences.
Which of the following is necessary to calculate the standard error of the mean?
Standard deviation of the population
Four tests of significance were conducted on the same set of results: For test 1: alpha = 0.05, two-tailed test. For test 2: alpha = 0.10, one-tailed test. For test 3: alpha = 0.01, two-tailed test. For test 4: alpha = 0.01, one-tailed test. Which test is most likely to result in a rejection of the null hypothesis?
Test 2
From a University population, random samples of 145 men and 237 women have been asked if they have ever cheated in a college class. 8% of the men and 6% of the women said that they have. What is the appropriate test to assess the significance of this difference?
Test for the significance of the difference between two sample proportions, large samples.
Samples of Republicans and Democrats have been tested for their level of support for a new immigration policy. The test statistic is 0.54 and the critical region begins at ± 1.96. What may we conclude?
The difference is not significant.
The text reports the results of a test for the significance of the difference in average education for random samples of males and females. Both males and females averaged 13.5 years of schooling. The Z score computed in step 4 for this difference was -0.29. Given these results, which of the following is a reasonable conclusion?
The difference is not statistically significant and was probably caused by random chance.
The text reports the results of a test for the significance of the difference in average income for random samples of males and females. Males earned an average of about $17,000 more per year and the Z score computed in step 4 was 6.28. Given these results, which of the following is a reasonable conclusion?
The difference is statistically significant, large, and important.
When testing for the significance of the difference between two sample means, which parameters must be estimated with sample values?
The population standard deviations
Why would researchers use the t distribution?
The population standard deviation is usually unknown.
A sample of people attending a professional football game averages 13.7 years of formal education while the surrounding community averages 12.1. The difference is significant at the .05 level. What could we conclude?
The sample is significantly more educated than the community as a whole.
A researcher has computed a gamma of -0.45 between marital happiness and number of children. What can she conclude?
There is a moderate, negative relationship between number of children and marital happiness.
Very large random samples of Catholics and Protestants have been questioned about their opinions on cohabitation. Forty-six percent of the Protestant and 47% of the Catholics approve of males and females living together without being married. The difference has been tested and found to be statistically significant. What is the most reasonable conclusion?
This difference may be statistically significant but it seems unimportant.
When cell frequencies are small and 2 x 2 tables are used in the chi square test, an adjustment to the value of chi square can be made by applying
Yates' correction for continuity.
Based on an EPSEM sample of 300 state university students, we estimate that the average number of hours of study time each week is 30 ± 2. In this example, the population is
all state university students.
The probability that an interval estimate does not include the population value is called
alpha.
In the Chi square test for independence, the null hypothesis and the research hypothesis
always contradict each other.
Compared to probability samples, nonprobability samples
are usually cheaper to assemble.
ANOVA is a one tailed test and we are concerned only with those outcomes in which there is more variance
between categories than within categories.
To conduct a chi square test, the variables must first be organized into a
bivariate table.
If you have a sample statistic and an unknown population standard deviation, which formula would you use to calculate the confidence interval?
c.i. = X̄ ± Z(s / √(N - 1))
A problem with both Cramer's V and phi is that values between 0.00 and 1.00
can be interpreted only as an index of the relative strength of the association.
In a sampling distribution of sample means, most of the sample means will
cluster around the true population value.
The EPSEM sampling method that does not require the researcher to have access to a complete list of the population is
cluster sampling.
In order to identify the pattern of the relationship in a bivariate table, we need to compute
column percentages.
The ANOVA test is most appropriate for dependent variables that are
continuous and interval-ratio.
In the ANOVA test, the symbol 'dfb' refers to
degrees of freedom associated with SSB
In the ANOVA test, the symbol 'dfw' refers to
degrees of freedom associated with SSW.
Since critical values of t vary by sample size, before using the t table we must first calculate
degrees of freedom.
To construct estimates of the population variance, we divide each sum of squares by its respective
degrees of freedom.
If two variables are independent, the cell frequencies will be
determined by random chance.
Unlike the chi square test of independence, in the chi square goodness of fit test, the expected frequencies are
determined from the null hypothesis.
In terms of hypothesis testing, "significance" refers to the
difference between the sample and population values.
The degrees of freedom for the Sum of Squares Within (SSW) is based upon the
difference between total number of cases and categories.
One limitation of ANOVA is that, when the null hypothesis is rejected, the test
does not tell us which sample mean(s) is/are different.
If the critical region begins at Z (critical) = ± 2.56 and the test statistic is -2.50, we
fail to reject the null hypothesis
Which of the following combinations would result in a Type II error? The null hypothesis is actually _______ and our decision in Step 5 is to ________ the null hypothesis.
false, fail to reject
The t distribution, compared to the Z distribution, is
flatter for small sample sizes but increasingly like the Z distribution as N increases.
If we cannot reduce our errors of prediction for the order of pairs of cases on one variable by predicting from the order of pairs of cases on the other variable,
gamma will be zero
Cluster sampling often involves selecting
geographical areas.
With the ANOVA test, the ___________ the difference between categories, relative to the differences within, the ________ likely that the null hypothesis will be rejected
greater, more
If we reject a null hypothesis which is in fact true, we
have made a Type I error.
To maximize the probability of rejecting the null hypothesis, use
high alphas, large samples, and one-tailed tests.
The central problem in the case of two-sample hypothesis test is to determine
if two populations differ significantly on the trait in question.
One limitation of the Chi square test (and all tests of hypothesis) is that they cannot tell us if relationships between variables are
important.
Table 1 Support for Restrictive Immigration Policy by Education
                     EDUCATION
SUPPORT    Less than High School    High School    College
Low                           60             30         10
Neutral                       20             30         20
High                          10             40         70
                            100%           100%       100%
In Table 1, support for immigration restriction _________ as education increases.
increases
A problem with large samples in the chi square test is that the test statistic
increases at the same rate as sample size and trivial relationships may seem important.
As our confidence in an interval estimate increases, the width of the interval
increases.
When random samples are drawn so that the selection of a case for one sample has no effect on the selection of cases for another sample, the samples are
independent.
Unlike lambda, phi
is based on chi square.
The more efficient the estimate, the more the sampling distribution
is clustered around the mean.
Republicans average 1.4 children while the state as a whole averages 1.7 children. The Z score computed in this test is 0.78 and the alpha level is 0.05. Therefore, the difference in size of family
is due to random chance.
To construct the narrowest possible confidence intervals, use __________.
large samples
When using systematic sampling, the researcher must
make sure that the list of the population is random or, at least, non-cyclical with respect to the variables of interest.
Given the same alpha level, the one-tailed test
makes it more likely that Ho will be rejected.
For a relationship involving education and library use, gamma was +0.37. This relationship is
moderate and positive.
In the Chi square test, expected frequencies are computed by
multiplying the proper row and column marginals for each cell and dividing by N.
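The rule above can be sketched in a few lines. This example reuses the observed frequencies from Table 12.1 and computes each expected frequency as (row marginal × column marginal) / N, then sums (fo - fe)² / fe to get chi square (obtained); the variable names are illustrative.

```python
# Observed frequencies from Table 12.1
# (rows: attends, doesn't attend; columns: male, female).
observed = [
    [200, 350],
    [300, 150],
]

def expected_frequencies(observed):
    """fe for each cell = (row marginal * column marginal) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

expected = expected_frequencies(observed)
chi_square = sum(
    (fo - fe) ** 2 / fe
    for obs_row, exp_row in zip(observed, expected)
    for fo, fe in zip(obs_row, exp_row)
)

print(expected)              # [[275.0, 275.0], [225.0, 225.0]]
print(round(chi_square, 2))  # 90.91
```

Note that the expected frequencies have the same row and column marginals as the observed frequencies.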
A bivariate table in which both variables have three categories has
nine cells.
The chi square test is frequently used because it is relatively easy to satisfy the model assumptions (step 1 of the five-step model). These assumptions require, in the case of chi square,
no assumption about the shape of the sampling distribution.
Lambda is a PRE measure that is used when variables are measured at the
nominal level
Phi and Cramer's V would be appropriate as measures of association for variables measured at the
nominal level
Chi square is one of a class of statistics called
nonparametric.
A researcher tests a theory about sexism by administering a survey to the 200 students in her sociology classes. This sample is best characterized as a
nonprobability sample.
In order to conduct a test of hypothesis with means or proportions, the sampling distribution must be
normal.
In hypothesis testing, the _____________ is the critical assumption, the assumption which is actually tested.
null hypothesis
The standard deviation of the sampling distribution is represented by which of the following symbol(s)?
σ / √N
The goal of estimation procedures is to infer ___________ from ____________.
parameters, statistics
In the two sample case, the null hypothesis is always about the difference in the
populations
Table 1 shows a relationship between education and support for a policy to restrict immigration to the United States.
Table 1 Support for Restrictive Immigration Policy by Education
                     EDUCATION
SUPPORT    Less than High School    High School    College
Low                           60             30         10
Neutral                       20             30         20
High                          10             40         70
                            100%           100%       100%
The relationship between education and opposition is
positive.
In an ANOVA test, when the null hypothesis is rejected, we know that at least one of the means is significantly different from the others. In order to find out which mean(s) are significantly different, we must conduct a
post hoc test.
To increase the probability that a confidence interval will include the population parameter
lower the alpha level.
In the formula for gamma, Nd represents the number of pairs of cases
ranked differently on both variables
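To make Ns and Nd concrete, here is a minimal sketch for a table whose rows and columns both run from low to high. The 2 x 2 frequencies are hypothetical, and the function assumes the standard formula gamma = (Ns - Nd) / (Ns + Nd).

```python
def gamma(table):
    """Return (Ns, Nd, gamma) for a table ordered low-to-high on both variables."""
    ns = nd = 0
    rows, cols = len(table), len(table[0])
    for i in range(rows):
        for j in range(cols):
            for a in range(i + 1, rows):       # cells in lower rows
                for b in range(cols):
                    if b > j:                  # below and to the right: same order
                        ns += table[i][j] * table[a][b]
                    elif b < j:                # below and to the left: different order
                        nd += table[i][j] * table[a][b]
    return ns, nd, (ns - nd) / (ns + nd)

# Hypothetical 2 x 2 table (rows low/high on Y, columns low/high on X).
ns, nd, g = gamma([[10, 5], [5, 10]])
print(ns, nd, g)  # 100 25 0.6
```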
Yates' correction for small sample size
reduces the value of the difference between fo and fe by .5 before squaring.
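The effect of the correction can be seen in a small sketch. The 2 x 2 frequencies below are hypothetical, and the function simply subtracts .5 from |fo - fe| before squaring when `yates=True`.

```python
def chi_square_2x2(observed, yates=False):
    """Chi square for a 2 x 2 table, optionally with Yates' correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    total = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            fe = row_totals[i] * col_totals[j] / n
            diff = abs(fo - fe)
            if yates:
                diff -= 0.5  # shrink the difference by .5 before squaring
            total += diff ** 2 / fe
    return total

small_table = [[8, 2], [3, 7]]  # hypothetical small sample, N = 20
uncorrected = chi_square_2x2(small_table)
corrected = chi_square_2x2(small_table, yates=True)
print(round(uncorrected, 2), round(corrected, 2))  # 5.05 3.23
```

With df = 1 and alpha = .05 (critical chi square = 3.841), the uncorrected value falls in the critical region and the corrected value does not, which is why the correction matters for small samples.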
If the test statistic falls in the critical region, we
reject the null and conclude that there is strong - although indirect - support for the research hypothesis.
A researcher questioned 45 randomly-selected members of the freshman class about their experiences drinking alcohol and used these responses to estimate the drinking behavior of the entire freshman class of 1500. In this example, the 45 interviewees were the __________ and the ___________ was the population.
sample, freshman class
There are about 70 million eligible voters in a society. A public opinion pollster has estimated their probable choices for the next president with a sample of 2,000 randomly selected citizens. In this example, the 2,000 citizens are a __________ and the 70 million eligible voters are a _______.
sample, population
What are the three distributions involved in every application of inferential statistics?
sample, sampling, and population
Of the three distributions used in inferential statistics, which is only theoretical?
sampling distribution
Using the General Social Survey as an example, the concept of the sampling distribution allows us to link the sample of about ____ respondents to the population of about _______ adult Americans.
several thousand, 235 million
To calculate the confidence interval based on sample means, you need all but which of the following
standard deviation of the population.
A professor wants to determine if the grades in her introductory criminal justice class were higher than the rest of the school. With a total of 101 students in her class, they had a mean exam grade of 87, and a standard deviation of 3. She was not able to obtain the standard deviation for the other introductory classes, but the mean grade of all classes was 74. Which of the following would be the test statistic used, and what is the correct answer?
t = 43.33
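The arithmetic for this card can be checked directly. The sketch assumes the one-sample formula that substitutes s for the unknown population standard deviation and uses N - 1 in the standard error: t = (X̄ - μ) / (s / √(N - 1)). The function name is illustrative.

```python
import math

def one_sample_t(sample_mean, pop_mean, s, n):
    """Test statistic for a single sample mean when sigma is unknown."""
    standard_error = s / math.sqrt(n - 1)
    return (sample_mean - pop_mean) / standard_error

# Values from the question: N = 101, mean = 87, s = 3, population mean = 74.
t_obtained = one_sample_t(sample_mean=87, pop_mean=74, s=3, n=101)
print(round(t_obtained, 2))  # 43.33
```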
Samples from two high schools are being tested for the difference in their average levels of prejudice. One sample contains 39 respondents and the other sample contains 47 respondents. The appropriate sampling distribution is the
t distribution.
When we have an interval-ratio dependent variable and are comparing two groups, the appropriate test of significance is the ________. When we have an interval-ratio dependent variable and are comparing three or more groups, the appropriate test of significance is the ________.
t test, ANOVA
The research hypothesis for the ANOVA test is
that at least one population mean is different.
The F ratio is equal to
the "mean square between" divided by the "mean square within."
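The pieces of the ANOVA computation fit together as in the sketch below. The three groups of scores are hypothetical; the code assumes the formulas used in these cards: SST = SSB + SSW, dfb = k - 1, dfw = N - k, and F = mean square between / mean square within.

```python
# Hypothetical scores for k = 3 categories.
groups = [
    [1, 2, 3],
    [2, 3, 4],
    [5, 6, 7],
]

all_scores = [x for g in groups for x in g]
n, k = len(all_scores), len(groups)
grand_mean = sum(all_scores) / n

sst = sum((x - grand_mean) ** 2 for x in all_scores)
ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
ssw = sst - ssb            # computational shortcut: SSW = SST - SSB

dfb, dfw = k - 1, n - k    # degrees of freedom between and within
msb, msw = ssb / dfb, ssw / dfw
f_ratio = msb / msw

print(round(sst, 2), round(ssb, 2), round(ssw, 2), round(f_ratio, 2))  # 32.0 26.0 6.0 13.0
```

The larger the variation between categories relative to the variation within them, the larger F becomes and the more likely the null hypothesis is to be rejected.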
When we use larger samples (N > 100) we can assume a normal sampling distribution because of
the Central Limit Theorem.
The sampling distribution for the ANOVA test is
the F distribution.
The critical region is
the area under the curve that includes those values of a sample statistic that will lead to rejection of the null.
The distribution of scores on the dependent variable for a specific category of the independent variable is called
the conditional distribution of Y.
Regarding importance versus statistics:
the distinction is entirely dependent upon data.
The chi square goodness-of-fit test can be used when
the distribution of a single variable must be tested for significance.
In systematic random sampling, the researcher randomly selects
the first case and every kth case thereafter.
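A minimal sketch of the procedure follows, assuming the population list is already in a non-cyclical order. The function name, the `seed` parameter, and the population of 100 ID numbers are all hypothetical.

```python
import random

def systematic_sample(population, k, seed=None):
    """Pick a random starting case among the first k, then take every kth case."""
    rng = random.Random(seed)
    start = rng.randrange(k)      # random start between 0 and k - 1
    return population[start::k]

population = list(range(100))     # hypothetical list of 100 case IDs
sample = systematic_sample(population, k=10, seed=1)
print(sample)
```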
The higher the alpha level,
the greater the probability of rejecting the null hypothesis.
The ANOVA test is designed for dependent variables that have been measured at
the interval-ratio level.
In tests of significance, if the test statistic falls in the critical region, we may conclude that
the null hypothesis can be rejected.
When conducting hypothesis tests for two sample means, the term μ1 - μ2 in the numerator of the formula reduces to zero because
the null hypothesis is assumed to be true.
All tests of hypothesis are based on the assumption that
the null hypothesis is true.
If we reject a null hypothesis of "no difference" at the 0.05 level
the odds are 20 to 1 in our favor that we have made a correct decision
Comparing one- and two-tailed tests (with a constant alpha level and sample size), the probability of rejection will be higher for
the one-tailed test, if you have correctly predicted the direction of the difference.
With a sample size of 75, a normal sampling distribution can be assumed if
the population distribution is normal.
An estimator is unbiased if the mean of its sampling distribution is equal to
the population value.
For Chi square, a small sample is one in which
a high percentage of the cells have expected frequencies of 5 or less.
The row and column marginals for the expected frequencies are always ________ those of the observed frequencies.
the same as
By the theorems presented in the text, we know that the mean of a sampling distribution of sample means will be
the same as the population mean.
In the ANOVA test, if the null hypothesis is true
the sample means should be roughly equal in value.
For tests of significance involving two sample proportions, the value of the population proportion is estimated from
the sample proportions.
The Central Limit Theorem states that as sample size becomes large
the sampling distribution of sample means approaches normality.
As noted in the text, telemarketers often use
the sampling techniques used by social scientists.
In the context of chi square, variables are independent if
the score of a case on one variable has no effect on the score of the case on the other variable.
Your sample size is 1000. It is safe to assume that
the shape of the sampling distribution of sample means is normal.
When testing for the significance of the difference between sample means with small samples, the proper sampling distribution is
the t distribution.
Lambda is asymmetric, which means that
the value of lambda may vary depending on which variable is taken as independent.
In a research study conducted to determine if arrests were related to the socioeconomic class of the offender, the chi square critical score was 9.488 and the chi square test statistic was 12.2. We can conclude that
the variables are dependent.
Unlike other tests of significance, Chi square easily handles situations in which
the variables of interest have more than two categories or scores.
Unlike the sample and population distributions, the sampling distribution is
theoretical.
A study was made of marital satisfaction experienced by men and women. A phi of .07 was calculated for the relationship between gender and satisfaction. This result shows that
there is a weak relationship between the variables.
I surveyed 48 randomly-selected residents of the apartment complex where I live to determine their voting habits. I can only use this information to generalize to all the residents if
there is evidence of a normal population distribution.
The degrees of freedom for the Sum of Squares Between (SSB) is based upon the
total number of categories.
A researcher is interested in the effect that neighborhood crime-watch efforts have on the crime rate in the inner city, but he is unwilling to predict the direction of the difference. The appropriate test of hypothesis would be
two-tailed.
Sample proportions are
unbiased.
The sample standard deviation is a biased estimator of the population standard deviation. When using sample means as estimators, we correct for bias in the formula for finding confidence intervals by
using N - 1 rather than N.
The efficiency of a sample estimator is essentially a matter of
variation.
A random sample of 500 reports an average yearly income of $42,000 with a standard deviation of $1000. An estimate of the parameter at the 95% level is about $175 wide. In this research situation
we can be 95% confident that the population mean lies in an interval about $175 wide centered on $42,000 (roughly $42,000 ± $88).
If a researcher changes from the 90% confidence level to the 95% level, the confidence interval will
widen.
The quantity SSW measures the amount of variation
within the categories.
With alpha set at .05, the Critical Region for a two-tailed test would begin at ± 1.96. In a one-tailed test at the same alpha level the Critical Region would begin at
± 1.65.
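These critical values come from the standard normal curve and can be recovered with Python's standard library. The helper name `critical_z` is illustrative; it places alpha in one tail or splits it across two.

```python
from statistics import NormalDist

def critical_z(alpha, tails):
    """Z score marking the start of the critical region (upper tail)."""
    return NormalDist().inv_cdf(1 - alpha / tails)

two_tailed = critical_z(0.05, tails=2)
one_tailed = critical_z(0.05, tails=1)
print(round(two_tailed, 3), round(one_tailed, 3))  # 1.96 1.645
```

The one-tailed value is 1.645 to three decimals; Z tables conventionally report it as 1.65. Because the one-tailed critical region begins closer to the mean, the null hypothesis is easier to reject when the direction has been predicted correctly.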
In estimation procedures, the Z score that corresponds to an alpha of .05 is
± 1.96
When testing for the significance of the difference between two sample means, the null hypothesis can be stated as
μ1 = μ2
Stated generally, the null hypothesis for the ANOVA test is
μ1 = μ2 = μ3 = ... = μk