S110 Midterm
A group that includes all the cases in which a researcher is interested is referred to as a: sample; population; case; dataset; None of the above
population
A method of sampling that enables researchers to specify for each case in the population the probability of its inclusion in the sample is referred to as ______ sampling. Options: inclusionary; case; probability; density; None of the above
probability
ecological fallacy
refers to the mistake of drawing conclusions about the micro level of society when the unit of analysis was at the macro level
Quantitative Research
research that collects and reports data primarily in numerical form
Determining a confidence interval
sample mean ± (z score x estimated standard error)
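The formula above can be sketched in Python as follows. The function name `confidence_interval`, the example numbers (mean 50, standard deviation 10, n = 100), and the default z of 1.96 for a 95% interval are illustrative assumptions, not values from the course.

```python
import math

def confidence_interval(sample_mean, sample_sd, n, z=1.96):
    """Return (lower, upper) = sample mean +/- z * estimated standard error."""
    # Estimated standard error of the mean: sample SD divided by sqrt(n)
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, SD 10, n = 100, so the SE is 1.0
lower, upper = confidence_interval(sample_mean=50.0, sample_sd=10.0, n=100)
print(round(lower, 2), round(upper, 2))  # 48.04 51.96
```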
What percentage of the data falls within 2 standard deviations of the mean? (a)90% (b)95% (c)76% (d)99.7%
(b)95%
Five measures of variability
Index of qualitative variation (IQV), range, interquartile range, standard deviation, variance
calculating z-scores
(observation - sample mean)/sample standard deviation
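A minimal sketch of the z-score formula. The helper name `z_score` is hypothetical, and the example values (score 725, mean 501, standard deviation 117) are borrowed from the SAT question elsewhere in this set.

```python
def z_score(observation, mean, sd):
    """Number of standard deviations an observation lies above or below the mean."""
    return (observation - mean) / sd

z = z_score(725, 501, 117)
print(round(z, 2))  # 1.91
```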
interval/ratio
- *interval* is based on elapsed clock time - *ratio* is based on the number of times something is done - ask: did the subject have to do something a particular number of times, or did a certain amount of clock time have to pass?
sampling frame
a list of population members from which a probability sample is drawn.
standard deviation
a measure of variability that describes an average distance of every score from the mean
An upper-level sociology class has 120 registered students: 34 seniors, 57 juniors, 22 sophomores, and 7 freshmen. If instead you are to select a disproportionate sample of size 20 from the classroom, with equal numbers of students from each class level in the sample, how many freshmen will there be in the sample?
20/4 = 5 freshmen
stratified sampling
A probability sampling strategy in which: •The population is divided into groups, or strata. •Members are selected in strategic proportions from each group.
Steps in the Research Process
1. Identify the research question 2. Conduct a review of the literature 3. Identify a theoretical framework 4. Select a research design 5. Implement the study 6. Analyze data 7. Draw conclusions 8. Disseminate findings
Scores for an exam are normally distributed with a mean of 235 and a standard deviation of 52. How high must an individual score to be in the highest 5%?
1.65(52)+235=320.8
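As a check on the table-based answer above, the cutoff can also be computed with the inverse normal CDF. This sketch assumes Python's standard-library `statistics.NormalDist`; the variable names are illustrative. The exact z for the top 5% is about 1.645, so the result comes out slightly lower than the 1.65 table value.

```python
from statistics import NormalDist

# Exam scores: mean 235, standard deviation 52 (from the question)
exam = NormalDist(mu=235, sigma=52)

# Score below which 95% of test takers fall, i.e. the cutoff for the top 5%
cutoff_exact = exam.inv_cdf(0.95)

# Same calculation using the rounded table value z = 1.65
cutoff_table = 1.65 * 52 + 235

print(round(cutoff_exact, 1), round(cutoff_table, 1))
```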
SELECT ALL THAT APPLY. The probability of rolling an even number with a six-sided, equally weighted die is: 1/3; 1/2; 1/6; 3/6
1/2 and 3/6
The mean SAT math score was 501 with a standard deviation of 117. What percentage scored between 600 and 700 points?
15.31%
The annual salaries of employees in a large company are normally distributed with a mean of $50,000 and a standard deviation of $20,000. What percent of people earn between $45,000 and $65,000?
37.21%
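The two "percentage between" questions above (SAT scores and salaries) follow the same pattern: convert both bounds to areas under the normal curve and subtract. A minimal sketch, assuming `statistics.NormalDist` and a hypothetical helper named `percent_between`. Because it does not round z to two decimals, the SAT result differs slightly from the table-based 15.31%.

```python
from statistics import NormalDist

def percent_between(low, high, mean, sd):
    """Percentage of a normal distribution that falls between low and high."""
    dist = NormalDist(mu=mean, sigma=sd)
    return 100 * (dist.cdf(high) - dist.cdf(low))

sat_pct = percent_between(600, 700, mean=501, sd=117)
salary_pct = percent_between(45_000, 65_000, mean=50_000, sd=20_000)

print(round(sat_pct, 2))     # close to the table answer of 15.31
print(round(salary_pct, 2))  # 37.21
```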
An upper-level sociology class has 120 registered students: 34 seniors, 57 juniors, 22 sophomores, and 7 freshmen. Imagine that you choose one random student from the classroom. What is the probability that the student will be a junior?
57/120 = 0.475
An upper-level sociology class has 120 registered students: 34 seniors, 57 juniors, 22 sophomores, and 7 freshmen. Imagine that you choose one random student from the classroom. What is the probability that the student will be a freshman?
7/120 = 0.0583
The mean SAT math score was 501 with a standard deviation of 117. Your score is 725. What is your percentile rank?
97.19th
What are the measures of central tendency?
A measure of central tendency is an "average" that describes what is "typical" of a distribution. The three measures are the mean, median, and mode.
Sample
A relatively small subset selected from a population
social desirability bias
A tendency to give socially approved answers to questions about oneself.
Which is the best example of how unemployment can be viewed as a public issue instead of only a personal choice? A. A poor economy can lead to job layoffs. B. Poor job performance can lead to being fired. C. Having a poor relationship with your boss can stagnate career advancement. D. All of the above are good examples. E. None of the above
A. A poor economy can lead to job layoffs.
C. Wright Mills' The Sociological Imagination encourages sociologists to identify the relationship between... A. Experience and wider society B. Individuals and their family C. Ourselves and our personal issues D. Private and public polls E. None of the above
A. Experience and wider society
measurement
Assign numbers to events in a systematic way
limitations of longitudinal design
Attrition in panel studies •Costs are higher
benefits of longitudinal design
Can assess causal ordering and change over time
Limitations of cross-sectional studies
Cannot assess causal ordering •Cannot assess change over time
types of descriptive statistics
Frequencies, measures of central tendency, measures of variability, measures of relative position, measures of relationship
categorical variables
Have a finite set of possible values •No known distances between values •Includes nominal and ordinal variables
continuous variables
Have an infinite set of possible values •Values have fixed distances between them •Includes interval and ratio variables
properties of normal distribution
It is bell-shaped •Perfectly symmetrical •The mean, median, and mode coincide at the same point •Most of the observations are clustered around the middle, with the frequencies gradually decreasing at both ends of the distribution.
Which of the following measures can have more than one value for a set of data? Options: Mean; Median; Mode; None of the above; All of the above
Mode
Name and define three measures of central tendency.
Mode: the value with the highest frequency. Median: the value in the middle of the distribution. Mean: the arithmetic average of the values.
Ordinal Measurement
Categories that can be ranked from low to high. The categories can be ordered in some way, but the distances between them are not known.
nominal measurement
Numbers or other symbols are assigned to a set of categories for the purpose of naming, labeling, or classifying the observations. Nominal categories cannot be rank-ordered -catalogs states or statuses that are parallel and cannot be ranked or ordered.
On average, students in our class have families that include 3.1 children. For the country as a whole, the mean number of children per family is approximately 2.0. Give the main reason why our average is so much higher than the national average.
Our class is a selected sample in that it excludes families with 0 children. This only drives up the mean.
descriptive statistics
Procedures that help us organize and describe data collected from either a sample or a population
Inter-Quartile Range (IQR)
Q3-Q1
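A short sketch of computing Q3 - Q1 with the standard library. The scores are made up for illustration. Note that textbooks and software use slightly different conventions for locating quartiles, so results can differ a little between methods; this uses the default ("exclusive") method of `statistics.quantiles`.

```python
from statistics import quantiles

# Hypothetical scores, for illustration only
scores = [2, 4, 4, 5, 7, 8, 9, 11, 12, 15, 18]

# n=4 splits the data into quartiles and returns the three cut points
q1, q2, q3 = quantiles(scores, n=4)
iqr = q3 - q1

print(q1, q3, iqr)  # 4.0 12.0 8.0
```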
benefits of cross-sectional studies
Relatively inexpensive •No need to recontact subjects •Can be quickly implemented to address current events and hot-button issues
An upper-level sociology class has 120 registered students: 34 seniors, 57 juniors, 22 sophomores, and 7 freshmen. If you are asked to select a proportionate stratified sample of size 30 from the classroom, stratified by class level (senior, junior, etc.), how many students from each group will there be in the sample?
Seniors = 34/120 x 30 = 8.5 Juniors = 57/120 x 30 = 14.25 Sophomores = 22/120 x 30 = 5.5 Freshmen = 7/120 x 30 = 1.75
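The allocation arithmetic above, along with the equal-allocation (disproportionate) version from the earlier question, can be sketched as below. The function names are hypothetical. The proportionate results are fractional, so in practice they would have to be rounded to whole students.

```python
class_sizes = {"seniors": 34, "juniors": 57, "sophomores": 22, "freshmen": 7}

def proportionate_allocation(strata, sample_size):
    """Each stratum's share of the sample matches its share of the population."""
    total = sum(strata.values())
    return {name: count / total * sample_size for name, count in strata.items()}

def equal_allocation(strata, sample_size):
    """Disproportionate design: the same number is drawn from every stratum."""
    return {name: sample_size / len(strata) for name in strata}

proportionate = proportionate_allocation(class_sizes, 30)
equal = equal_allocation(class_sizes, 20)

for name in class_sizes:
    print(name, round(proportionate[name], 2), equal[name])
```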
point estimate
Single value that serves as an estimate of a population parameter
Briefly explain the main properties of central limit theorem (e.g. what do we know about the mean for the sampling distribution)
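The central limit theorem tells us what the shape of the distribution of means will be when we draw repeated samples from a given population. As the sample sizes get larger, the distribution of means calculated from repeated sampling will approach normality. This result holds no matter what shape the original population distribution may have been. The mean of the sampling distribution equals the population mean, and its standard deviation (the standard error) equals the population standard deviation divided by the square root of the sample size.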
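A small simulation can make the theorem concrete. This is a sketch under assumed settings: the skewed (exponential) population, the seed, the sample sizes of 5 and 50, and the 2,000 repetitions are all arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# A strongly right-skewed population (exponential), clearly not normal
population = [random.expovariate(1.0) for _ in range(100_000)]

def sample_means(pop, n, reps=2000):
    """Draw `reps` random samples of size n and return the mean of each one."""
    return [statistics.fmean(random.sample(pop, n)) for _ in range(reps)]

results = {n: sample_means(population, n) for n in (5, 50)}

print("population mean:", round(statistics.fmean(population), 3))
for n, means in results.items():
    # The mean of the sample means stays near the population mean,
    # while their spread (the standard error) shrinks as n grows.
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```

Plotting a histogram of each list in `results` would show the distribution of means looking more bell-shaped at n = 50 than at n = 5.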
reliability
The degree to which the processes of operationalizing and measuring a concept yield a consistent reflection of that concept
Validity
The degree to which the processes of operationalizing and measuring a concept yield an accurate reflection of that concept
confidence level
The level of certainty that a population parameter lies within the calculated confidence interval.
unit of analysis
The level of social life on which social scientists focus •Individual •Family •Organization •City •Researchers are interested in establishing cause-and-effect relationships. •Researchers should take care to avoid the ecological fallacy
inferential statistics
The logic and procedures concerned with making predictions or inferences about a population from observations and analyses of a sample
What is the position of the mean, median and mode in a negatively skewed distribution
The mean is pulled toward the low (left) side because the long tail of a negatively skewed distribution is on the left. The median is higher than the mean, and the mode is higher than both.
sampling
The process of identifying and selecting the subset of the population for study
Explain why the standard error should decrease as we increase the sample size.
The standard error equals the standard deviation of the underlying population divided by the square root of the sample size. It will be large when the population standard deviation is large, and because the sample size is in the denominator, it will decrease as we increase the sample size.
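The standard formula for the standard error of the mean is the standard deviation divided by the square root of n. The sketch below uses it to show the decrease numerically; the standard deviation of 117 is borrowed from the SAT questions, and the sample sizes are arbitrary. Quadrupling the sample size cuts the standard error in half.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)

sizes = (25, 100, 400)
errors = [standard_error(117, n) for n in sizes]

for n, se in zip(sizes, errors):
    print(n, round(se, 2))  # 23.4, then 11.7, then 5.85
```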
Population
The total set of individuals, objects, groups, or events in which the researcher is interested
Levels of Measurement
The type of statistical operation we employ depends on how our variables are measured. • Three levels of measurement: (1) nominal (2) ordinal (3) interval-ratio
In the data that we collected for class, 8.6% of respondents were African American. Yet according to the University's official enrollment reports, African Americans make up only 3.9% of the undergraduate population. Drawing on concepts that we've discussed in recent class sessions, describe some reasons why our sample estimate is so different from the true population percentage.
There's a higher concentration of African American students in the class than in the university as a whole.
True or False: Variables that can be measured at the interval-ratio level of measurement can also be measured at the ordinal and nominal levels.
True
continuous measurement
Variables for which the actual distance separating values is expressed in meaningful standard intervals
Calculations. Partial credit will be awarded, so please show your work. X = 3, 9, 4, 2 Y = 1, 10, 7, 6 a) Use the above information to calculate the standard deviation and variance of X.
Variance = 9.67 SD = 3.11
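The answer above uses the sample formulas (dividing by n - 1). A quick check with the standard library, where `statistics.variance` and `statistics.stdev` are the sample versions and `statistics.pvariance` is the population version (dividing by n), shown here only for contrast:

```python
import statistics

x = [3, 9, 4, 2]

mean_x = statistics.fmean(x)              # 18 / 4 = 4.5
sample_variance = statistics.variance(x)  # sum of squared deviations (29) / (n - 1)
sample_sd = statistics.stdev(x)           # square root of the sample variance
population_variance = statistics.pvariance(x)  # 29 / n, for comparison only

print(mean_x, round(sample_variance, 2), round(sample_sd, 2), population_variance)
```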
The admissions committee decides to exclude student applicants scoring below the 20th percentile on the math SAT. Translate this percentile into a Z score. Then, calculate the equivalent SAT math test score.
Z: -0.84 Answer: 402.72
The mean SAT math score was 501 with a standard deviation of 117. What percentage scored lower than 300 on the Math SAT?
Z: -1.72, Answer: 4.27%
confounding variable
a factor other than the independent variable that might produce an effect in an experiment
probability sampling
a method used by pollsters to select a representative sample in which every individual in the population has a known probability of being selected as a respondent
Parameter
a numeric value that describes a population characteristic
cluster samples
a probability sampling strategy in which researchers divide up the target population into groups, or "clusters." •Researchers first select clusters randomly, and then select individuals within those clusters randomly.
systematic sample
a probability sampling strategy in which sample members are selected by using a fixed interval.
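A minimal sketch of fixed-interval selection. It assumes a random starting point within the first interval, which is a common way to keep this a probability sample; the function name, the seed, and the sampling frame of 120 ID numbers are hypothetical.

```python
import random

def systematic_sample(frame, sample_size, rng=random):
    """Select every k-th member of the frame after a random start."""
    interval = len(frame) // sample_size  # the fixed interval, k
    start = rng.randrange(interval)       # random start within the first interval
    return [frame[start + i * interval] for i in range(sample_size)]

# Hypothetical sampling frame: ID numbers 1..120
frame = list(range(1, 121))
sample = systematic_sample(frame, 20, rng=random.Random(1))

print(sample)
```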
secondary data
a resource that was collected by someone else
survey
a social research method in which researchers ask a sample of individuals to answer a series of questions.
sociology of science
a sociological perspective that examines how scientific knowledge develops.
cross-sectional design
a study in which data are collected at only one point in time.
repeated cross-sectional design
a type of longitudinal study in which data are collected at multiple time points, but from different subjects at each time point.
panel design
a type of longitudinal study in which data are collected on the same subjects at multiple time points.
simple random sample
a type of probability sample in which: •Each individual has the same probability of being selected. •Social researchers can use computer software to ensure that sample selection is truly random.
A sample that is representative: a. Does not systematically differ from the population. b. Includes as many population members as possible. c. Contains no sampling error. d. Is impossible for researchers to obtain. e. None of the above
a. Does not systematically differ from the population.
For which mode of survey administration is the risk of interviewer effects highest? a. Face-to-face interviews b. Telephone surveys c. Mail surveys d. Internet-based surveys
a. Face-to-face interviews
SELECT ALL THAT APPLY. The box plot can visually present: a. The range b. The IQR c. The median d. The IQV e. The mode
a. The range b. The IQR c. The median
Interquartile range is the difference between: a. The third quartile and the first quartile b. Mean deviation and standard deviation c. The highest and lowest value of a dataset d. Mean and median e. None of the above
a. The third quartile and the first quartile
Identify the issue with the following survey question: "Do you agree or disagree that the government should be doing more to help racial and ethnic minorities and the poor?" a. It is a leading question. b. It is a double-barreled question. c. It contains a double negative. d. It is too controversial to be included in a survey. e. None of the above
b. It is a double-barreled question.
Which measure of central tendency is the score that divides the distribution into two equal parts so that half of the cases are above and half are below it? a. Mode b. Median c. Mean d. Percentile
b. Median
Which sampling strategy is particularly useful for studying social groups about which little information is known? a. Stratified b. Snowball c. Systematic d. Purposive e. cluster
b. Snowball
Which is a limitation of cross-sectional survey designs? a. They are more costly to conduct than longitudinal designs. b. They cannot assess the causal ordering of variables. c. They do not allow researchers to compare across subgroups. d. They can only be administered via face-to-face interviews. e. All of the above
b. They cannot assess the causal ordering of variables.
When the mean is greater than the median and the median is greater than the mode, what kind of distribution do you have? a. Negatively (left) skewed b. Positively (right) skewed c. Normal distribution d. Average distribution e. It's not a distribution anymore
b. Positively (right) skewed
What is the unit of analysis in the following research question: How do rates of crime differ across New York City neighborhoods? a. cities b. neighborhoods c. crime rates d. individuals e. none of the above
b. neighborhoods
For a Z score of zero, what is the proportion of area beyond the Z score? a. 0.0 b. .05 c. .50 d. 1.0 e. None of the above
c. .50
The index of qualitative variation can vary from: a. -1.00 to 0.00. b. 0.00 to 0.50. c. 0.00 to 1.00. d. -1.00 to 1.00. e. None of the above
c. 0.00 to 1.00.
Which type of probability sample is useful when there is no readily available sampling frame for researchers to use? a. Simple random b. Systematic c. Cluster d. Stratified e. Convenience
c. Cluster
An extra variable that the researcher did not take into consideration is called: a. Mediating variable b. Moderating variable c. Confounding variable d. All of the above e. None of the above
c. Confounding variable
Which of the following two quantities are used to calculate the standard error of the mean? a. Sample mean and population mean b. Sample mean and sample standard deviation c. Population standard deviation and sample size d. Population standard deviation and sample mean e. None of the above
c. Population standard deviation and sample size
face-to-face interviews
high cost, high response rate, high researcher control over interview, high interviewer effects
SELECT ALL THAT APPLY. "Summary" or "descriptive" statistics include which three of the following characteristics: The direction of a relationship; Central tendency; Dispersion or spread; Differences in group means; Shape of the distribution
central tendency, dispersion or spread, and shape of distribution
Qualitative Research
collect and analyze data that enable rich description in words or images
economic sociology
concerned with the social and cultural aspects of the ways in which humans produce the means of our subsistence
The number of standard deviations that a given raw score is above or below the mean is referred to as a a. Mean b. Frequency distribution c. Variance d. Standard (Z) score e. None of the above
d. Standard (Z) score
events
human beliefs, attitudes, or experiences
true or false: The range is more useful than the inter-quartile range because it avoids the extreme scores in the distribution.
false
true or false: One benefit of using the arithmetic average, or mean, to summarize the central tendency of a variable is that it is unaffected by outliers.
false
interval variables
have a continuum of values with meaningful distances between them, but no true zero. •The values can be compared directly and their differences are meaningful, but they cannot be expressed as proportions or ratios.
The objective of statistics is to draw ___________________ about a population based on information from the sample. Options: inferences; measurements; calculations; mathematical principles; operationalizations
inferences
ratio variables
interval variables that do have a true zero. •The distance between values can be measured, and values can be expressed as proportions.
Operationalization
is the process of linking the conceptualized variables to a set of procedures for measuring them.
greatest limitation of a non-representative sample
it's impossible to know how well you are representing the population
Sources of Error in Surveys
nonresponse, measurement error, coverage error, sampling error
Primary Data Collection
occurs when social researchers design and carry out their own data collection.
political sociology
the area of sociology that examines the nature and consequences of power within or between societies, as well as the social and political conflicts that lead to changes in the allocation of power
Conceptualization
the process of precisely defining ideas and turning them into variables
margin of error
the number of percentage points above and below a sample estimate within which the population value is expected to fall at a given confidence level
confidence interval
the range of values within which a population parameter is estimated to lie
Sociology
the scientific study of the social lives of individuals, groups, and societies
Objective of Statistics
to draw inferences about a population based on information contained in a sample
true or false: The simplest and most straightforward measure of variation is the range.
true
true or false: The most commonly occurring value for a variable is referred to as the mode (or modal value).
true
snowball sampling
type of non-random sample that relies on gaining access to a population through one participant and then asking them to refer other participants
Convenience Sampling
type of non-random sampling that involves drawing from an easily accessible population
mixed methods research
uses both quantitative and qualitative techniques, in an effort to build convincing claims about the relationships between attributes and outcomes
The central limit theorem is vital in statistics for two main reasons
(1) the normality assumption and (2) the precision of the estimates.