Research Methods Final

Student's t-distribution

a probability distribution that can be used for making inferences about a population mean when the sample size is small

Controlled effect

a relationship between a causal variable and a dependent variable within one value of another causal variable

Direct relationship

a relationship that runs in a positive direction, increase in x = increase in y

Inverse relationship

a relationship that runs in the negative direction, increase in x = decrease in y

Random sample

a sample that has been randomly drawn from the population

Conceptual dimension

a set of concrete traits of similar type

Variance

average of the squared deviations, indicator of how dispersed the data is, allows us to find standard deviation

Cross-sectional study

interviewed at one point in time, reliability is an issue but it's cheaper

Panel study

interviewed at two different times, better reliability but it's more expensive

Cross-tabulation

table that presents the distribution of the dependent variable, in percentages, across the categories of the independent variable

Frequency distribution

tabular summary of a variable's values

Measure of association

tells the researcher how well the independent variable works in explaining the dependent variable; examples are PRE measures, R-square and adjusted R-square, the regression coefficient, Pearson's r, and lambda

Alternative-form method

like the test-retest method, but reliability is evaluated by administering two different forms of the test to the same group

Laboratory experiment

the control group and the test group are studied in an environment created wholly by the investigator, ex. going to a specific research venue

Additive relationship

the control variable is a cause of the dependent variable but defines a small compositional difference across values of the independent variable; because the relationship between X and Z is weak, X retains a causal relationship with Y after controlling for Z, and Z also helps to explain the dependent variable; in a set of additive relationships, the tendency and strength of the relationship between the independent variable and the dependent variable are the same or very similar in all values of the control variable

Population

the universe of cases the researcher wants to describe

Independent variable

the variable that represents the causal factor in an explanation

Dependent variable

the variable that represents the effect in a causal explanation; it depends on the other variable; in graphs it is plotted on the vertical axis, like a pillar whose position depends on the foundation beneath it

Sample

a number of cases or observations drawn from a population

Reliability

consistency

Negative relationship

downward sloping line \

Test-retest method

as it sounds: administer the same measure to the same group at two points in time and compare the results; evaluates reliability

Central tendency

typical or average value, measured by mean, median, and mode

Positive relationship

upward sloping line /

Normal distribution

used to describe interval-level variables; bell curve

Face validity

using informed judgment to determine whether an operational procedure is measuring what it is supposed to measure: "On the face of it, are there good reasons to think that this measure is not an accurate gauge of the intended characteristic?"

Median

value of a variable that divides the cases right down the middle

one-tailed test of statistical significance

in normal estimation, the absolute value of Z that marks the boundary between .95 of the curve and .05 in one tail is 1.645; therefore, the lowest plausible difference is defined by the sample statistic minus 1.645 standard errors, and if this value is greater than 0 we can reject the null hypothesis
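
A minimal sketch of this decision rule in Python; the sample difference and standard error are hypothetical values for illustration:

# One-tailed test: reject the null hypothesis if the sample statistic,
# minus 1.645 standard errors, still exceeds 0.
sample_difference = 7.0  # hypothetical observed sample difference
standard_error = 3.2     # hypothetical standard error of the difference

lowest_plausible = sample_difference - 1.645 * standard_error
print(f"lowest plausible difference: {lowest_plausible:.2f}")
if lowest_plausible > 0:
    print("reject the null hypothesis (.05 level, one-tailed)")
else:
    print("cannot reject the null hypothesis")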

Adjusted R-square

often close to but always less than regular R-square; because squaring any error, positive or negative, yields a positive number, adding independent variables can only inflate the estimated value of R-square; the adjustment corrects for this by penalizing the number of predictors relative to the sample size

Types of measures of association

PRE measures, R-square and adjusted R-square, regression coefficient, Pearson's r, lambda

Pearson's r

Pearson's correlation coefficient: r = Σ[((xi − x bar)/sx) × ((yi − y bar)/sy)] / (n − 1), where xi = individual observations of x, x bar = sample mean of x, sx = sample standard deviation of x, and yi, y bar, and sy are the corresponding quantities for y; a symmetrical measure of association, meaning that the correlation between the dependent variable and the independent variable is the same as the correlation between the independent variable and the dependent variable, so it is neutral on the question of which variable is the cause and which is the effect; it is not a PRE measure because it does not tell us how well we can predict the dependent variable by knowing the independent variable; it is bounded by −1 and +1 and communicates strength and direction on a common metric
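
A short Python sketch of this formula, using hypothetical observations:

import math

# Pearson's r as the average product of standardized scores.
x = [2.0, 4.0, 6.0, 8.0, 10.0]  # hypothetical observations of x
y = [1.0, 3.0, 5.0, 9.0, 12.0]  # hypothetical observations of y
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sx = math.sqrt(sum((xi - mean_x) ** 2 for xi in x) / (n - 1))  # sample sd of x
sy = math.sqrt(sum((yi - mean_y) ** 2 for yi in y) / (n - 1))  # sample sd of y

r = sum(((xi - mean_x) / sx) * ((yi - mean_y) / sy)
        for xi, yi in zip(x, y)) / (n - 1)
print(f"Pearson's r = {r:.3f}")  # always bounded by -1 and +1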

Asymmetric measure of association

a measure of association whose value depends on which variable is treated as the dependent variable; lambda is an example

Error sum of squares

the sum of the squared prediction errors, Σ(yi − y hat)²; the variation in the dependent variable left unexplained by the regression

Regression sum of squares

the portion of the variation in the dependent variable explained by the regression, Σ(y hat − y bar)²; the total sum of squares equals the regression sum of squares plus the error sum of squares

Rule of direction for nominal relationships

_

Symmetric measure of association

a measure of association that takes the same value regardless of which variable is treated as the dependent variable; Pearson's r is an example

R-square

a PRE measure (of association) bracketed between 0 and 1 that may be interpreted as the proportion of the variation in the dependent variable that is explained by the independent variable; to calculate = regression sum of squares / total sum of squares = Σ(y hat − y bar)² / Σ(yi − y bar)²; measures the goodness of the fit between the regression line and the actual data
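
A minimal Python illustration of this ratio; the observed values and regression predictions are hypothetical:

# R-square = explained variation / total variation.
y = [1.0, 3.0, 5.0, 9.0, 12.0]      # hypothetical observed values
y_hat = [1.4, 3.5, 5.6, 8.3, 11.2]  # hypothetical predictions from a regression line
y_bar = sum(y) / len(y)

tss = sum((yi - y_bar) ** 2 for yi in y)      # total sum of squares
rss = sum((yh - y_bar) ** 2 for yh in y_hat)  # regression sum of squares
print(f"R-square = {rss / tss:.3f}")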

Population parameter

a characteristic of a population, ex: dollar amount of the average PAC contributions or percent of adults who voted; = sample statistic + random sampling error

Aggregate-level unit of analysis

a collection of individual entities (ex: neighborhoods or census tracts)

Bimodal distribution

a frequency distribution having two different values that are heavily populated - not a bell curve

Standard error of the difference

a more formal derivation of the standard error of the mean difference; to calculate = square each standard error, sum them, and take the square root

Hypothesis

a testable statement about the empirical relationship between cause and effect

Dummy variable

a variable for which all cases falling into a specific category assume a value of 1, and all cases not falling into that category assume a value of 0

Intervening variable

a variable that acts as a go-between or mediator between an independent and dependent variable; for example, higher education by itself may not make you vote, but the collaboration with peers that college involves does

Controlled comparison

accomplished by examining the relationship between an independent and a dependent variable, while holding constant other variables suggested by rival explanations and hypotheses

Index

additive combination of ordinal variables coded identically

Controlled comparison table

aka control table; presents a cross-tabulation between an independent variable and a dependent variable for each value of the control variable

Zero-order relationship

aka gross relationship, uncontrolled relationship - a difference obtained from a simple comparison; an overall association between two variables that does not take into account other possible differences between the cases being studied; summarizes a relationship between two variables

Systematic measurement error

aka measurement bias; consistently distorts empirical measurements because of inherent problems with the measurement instrument

Census

allows researchers to obtain measurements from all members of a population

Ecological fallacy

an aggregate-level phenomenon is used to make inferences at the individual level (whole-to-part)

Rival explanation

an alternative cause for the dependent variable, ex.: everyone in the control group was healthier

Variable

an empirical measurement of a characteristic; variable name (marital status), variable values (married), numeric codes (corresponds to 1)

Sample statistic

an estimate of a population parameter, based on a sample drawn from the population

Linear relationship

an increase in the independent variable is associated with a consistent increase or decrease in the dependent variable

Compositional difference

any characteristic that varies across categories of an independent variable, ex: Democrats and Republicans differing in gender, income, or preferred ice cream flavor; not all compositional differences present a plausible rival explanation

Variation component of random sampling error

as variation in the population goes up, random sampling error increases in direct relation to the population's standard deviation

Mean

average

Conceptual definition

clearly describes the concept's measurable properties; one rule is that you cannot use one concept to define another

Nominal-level variable

communicates differences between units of analysis on the characteristic being measured, ex: marital status, religious denominations, gender, race, etc.; must measure with mode

Interval-level variable

communicates exact differences between units of analysis, ex: age, family members, commute time

Ordinal-level variable

communicates relative differences between units of analysis; can be ranked, ex: support for school prayer; measure with mode or median

Control group

composed of citizens who did not receive the treatment

Test group

composed of subjects who receive a treatment that the researcher believes is causally linked to the dependent variable, ex: patients with a certain disease undergoing treatment

Types of tests of statistical significance

confidence interval, p-value, two-tailed (eyeball), one-tailed (1.645), chi-square, z-score

Field experiment

control and test groups are studied in their normal surroundings, probably unaware an experiment is taking place

Operational definition

describes the instrument to be used in measuring the concept and putting a conceptual definition into operation, describes explicitly how the concept is to be measured empirically

chi-square test of significance (x2)

determines whether the observed dispersal of cases departs significantly from what we would expect to find if the null hypothesis were correct; to calculate, find each cell's expected frequency (the frequency we would observe if the variables were unrelated, with the totals applied equally across the independent samples), compute (observed frequency − expected frequency)² / expected frequency for every cell, and sum the results to find chi-square; the null hypothesis says the result should be close to 0; if chi-square exceeds the critical value for the appropriate degrees of freedom, the null hypothesis can be rejected
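
A small Python sketch of this calculation for a 2x2 cross-tabulation with hypothetical counts:

# Chi-square from observed counts: rows and columns are hypothetical groups.
observed = [[30, 20],   # e.g., test group
            [18, 32]]   # e.g., control group

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (obs - expected) ** 2 / expected

print(f"chi-square = {chi_square:.2f}")
# Compare with the critical value for (rows-1)*(cols-1) = 1 degree of freedom,
# which is 3.841 at the .05 level.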

Prediction error

difference between the actual position of a data point on a scatter plot and the value estimated from the line that best fits the data; = yi − y hat, where yi = an individual value of y and y hat = the estimated value of y

Negative skew

distribution with a skinnier left-hand tail <

Positive skew

distribution with a skinnier right-hand tail >

Confidence interval approach

equivalent to the +/-2 rule of thumb; uses the standard error to determine the smallest plausible mean difference in the population; if the smallest plausible difference is greater than 0, the null hypothesis can be rejected

Partial regression coefficient

estimates the mean change in the dependent variable for each unit change in the independent variable controlling for the other independent variables in the model

+/-2 rule of thumb

estimation of 95 percent confidence interval to find boundaries of the lower and upper ends; the sample mean +/- 1.96 (rounded to 2) standard errors
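
A minimal Python sketch of this rule, computing the interval from hypothetical sample data:

import math

# 95 percent confidence interval via the +/-2 (1.96) rule of thumb.
sample = [52.0, 48.0, 51.0, 47.0, 50.0, 53.0, 49.0, 50.0]  # hypothetical data
n = len(sample)
mean = sum(sample) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))
se = sd / math.sqrt(n)  # standard error of the mean

lower, upper = mean - 1.96 * se, mean + 1.96 * se
print(f"95% CI: [{lower:.2f}, {upper:.2f}]")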

Random assignment

every participant has equal chance of ending up in the control or test group

Construct validity

examines the empirical relationships between a measurement and other concepts to which it should be related: "Does this measurement have a relationship with other concepts that one would expect it to have?"

Conceptual question

expressed using ideas, frequently unclear and difficult to answer empirically

Concrete question

expressed using tangible properties, can be answered empirically

Proportional reduction in error (PRE)

gauges the strength of a relationship (measure of association); a prediction-based metric that varies between 0 and 1: if knowledge of the independent variable does not provide any help in predicting the dependent variable, PRE will be 0; if it allows the dependent variable to be predicted perfectly, PRE will be 1

Test of statistical significance

helps you decide whether an observed relationship between an independent variable and a dependent variable really exists in the population or whether it could have happened by chance when the sample was drawn; examples are the confidence interval approach, the p-value approach, two-tailed and one-tailed tests, the chi-square test, and the z-score

External validity

whether the results of a study can be generalized and applied to situations in the non-artificial, natural world (a common weakness of lab experiments)

Central limit theorem

if we were to take an infinite number of samples of size n from a population of N members, the means of these samples would be normally distributed; this distribution of sample means would have a mean equal to the true population mean and a random sampling error equal to the population standard deviation divided by the square root of n; it underpins the 95 percent confidence interval
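
A quick Python simulation of this idea, drawing repeated samples from a non-normal (uniform) population; the seed and sample sizes are arbitrary:

import random
import statistics

random.seed(0)
n = 100  # sample size

# Means of 10,000 samples from a uniform(0, 1) population.
sample_means = [statistics.mean(random.random() for _ in range(n))
                for _ in range(10_000)]

print(f"mean of sample means: {statistics.mean(sample_means):.4f}")  # near the true mean, 0.5
print(f"sd of sample means:   {statistics.stdev(sample_means):.4f}")  # near 0.2887/sqrt(100) = 0.0289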

Critical value

marks the upper plausible boundary of random error and so defines the null hypothesis' limit

Regression analysis

measure of association; produces the regression coefficient, which estimates the size of the effect of the independent variable on the dependent variable; indicates the direction of the relationship and whether it could have happened by chance

Lambda

measure of association designed to gauge the strength of a relationship between two categorical variables, at least one of which is nominal-level; to calculate = (prediction error without knowledge of the independent variable − prediction error with knowledge) / prediction error without knowledge; it is a PRE measure
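
A small Python sketch of this PRE calculation on a hypothetical cross-tabulation (rows are dependent-variable categories, columns are independent-variable categories):

# Lambda: how much knowing the IV category reduces errors in predicting
# the modal DV category (hypothetical counts).
table = [[40, 10],   # DV category A
         [20, 30]]   # DV category B

n = sum(sum(row) for row in table)
row_totals = [sum(row) for row in table]

# Errors predicting the overall modal DV category, without the IV:
errors_without = n - max(row_totals)

# Errors predicting the modal DV category within each IV category:
errors_with = sum(sum(col) - max(col) for col in zip(*table))

lam = (errors_without - errors_with) / errors_without
print(f"lambda = {lam:.3f}")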

Cronbach's alpha

measures internal consistency; in effect an average of all possible split-half results for a set of items

Split-half method

measures internal consistency by splitting the items in half and correlating the scores on the two halves

Sampling frame

the list or procedure used to define the population the researcher wants to study; poor sampling frames lead to sampling bias or selection bias

Mode

most common answer

Null hypothesis

negative hypothesis, states that in the population there is no relationship between the variables and any relationship observed in a sample was produced by random sampling error

Raw frequency

number of responses

p-value approach

researcher determines the exact probability of obtaining the observed sample difference under the assumption that the null hypothesis is correct. If the probability value (p-value) is less than or equal to .05, then the null hypothesis can be rejected

Interaction effect

occurs in multiple regression analysis when the effect of an independent variable cannot be summarized by a single partial effect. Instead, the effect varies depending on the value of another independent variable in the model; if interaction is going on in the data, or the researcher has described an explanation or process that implies interaction, then a different model needs to be identified.

Response bias

occurs when some cases in the sample are more likely than others to be measured

Multicollinearity

occurs when the independent variables are related to each other so strongly that it becomes difficult to estimate the partial effect of each independent variable; the sample provides too little independent variation among the predictors to separate their effects

Standardization

occurs when the numbers in a distribution are converted into standard units of deviation from the mean of the distribution; a standardized value is called a Z-score

Type I error

occurs when the researcher concludes that there is a relationship in the population when in fact there is none; more serious than Type II

Type II error

occurs when the researcher infers that there is no relationship in the population when in fact there is

Total sum of squares

overall summary of the variation in the dependent variable; to calculate = Σ(yi − y bar)²; it also represents all our errors in guessing the value of the dependent variable for each case, using the mean of the dependent variable as a predictive instrument

Correlation analysis

produces a measure of association, Pearson's r, between interval-level variables; measures strength and direction of relationship

Random measurement error

random errors such as fatigue, commotion, unavoidable distractions

Standard error

random sampling error; the standard error of the mean is calculated as the standard deviation divided by the square root of the sample size

Inferential statistics

refers to a set of procedures for deciding how closely a relationship we observe in a sample corresponds to the unobserved relationship in the population from which the sample was drawn

two-tailed test of statistical significance

same as eyeball test; indicates upper and lower limits of random sampling error

Mean comparison table

shows the mean of a dependent variable for cases that have different values on an independent variable

Regression coefficient

slope of the regression line, rise/run

Sample size component of random sampling error

square root of n; as the sample size goes up, random sampling error declines as a function of the square root of the sample size

Degrees of freedom

statistical property of a large family of distributions, including the Student's t-distribution; the number of degrees of freedom is equal to the sample size n minus the number of parameters being estimated from the sample

Partial relationship/partial effect

summarizes a relationship between two variables after taking into account rival variables

Standard deviation

summarizes the extent to which the cases in an interval-level distribution fall on or close to the mean of the distribution; to calculate: square each case's deviation from the mean, take the average of all the squared deviations, then take the square root; about 68% of cases fall within one standard deviation of the mean (one on either side) and about 95% fall within two standard deviations (two on either side)
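
The same recipe in a few lines of Python, using hypothetical interval-level data:

import math

values = [4.0, 8.0, 6.0, 5.0, 3.0, 7.0]  # hypothetical interval-level data
mean = sum(values) / len(values)

squared_deviations = [(v - mean) ** 2 for v in values]
variance = sum(squared_deviations) / len(values)  # average squared deviation
sd = math.sqrt(variance)                          # standard deviation

print(f"mean = {mean:.2f}, variance = {variance:.2f}, sd = {sd:.2f}")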

Spurious relationship

the control variable, Z, defines a large compositional difference across values of the independent variable, X, and this compositional difference is a cause of the dependent variable Y - after holding Z constant, the empirical association between X and Y turns out to be completely coincidental; in a spurious relationship, after holding the control variable constant, the relationship between the independent variable and the dependent variable weakens or disappears

Unit of analysis

the entity we want to analyze

Random sampling error

the extent to which a sample statistic differs by chance from a population parameter; to calculate = standard deviation/square root of the sample size OR = square root of [sample proportion times (1 - sample proportion)]/square root of n; inverse relationship between random sampling error and sample size

95 percent confidence interval

the interval within which 95 percent of all possible sample estimates will fall by chance; the upper and lower confidence boundaries are defined by the sample mean plus or minus (1.96 × standard error); stated by the central limit theorem

Hawthorne effect

subjects' knowledge that they are being studied changes their responses

Probability

the likelihood of the occurrence of an event or set of events

.05 level of significance

the minimum standard to reject the null hypothesis to ensure Type I error is committed less than 5 times out of 100; if the null hypothesis is true how often by chance will we obtain the relationship observed - if the answer is more than 5 times out of 100 we do not reject the null hypothesis

Interaction variable

the multiplicative product of two or more independent variables

Sample proportion

the number of cases falling into one category of the variable divided by the number of cases in the sample

Cumulative percentage

the percentage of cases at or below any given value of the variable

Salient

describes an issue the person really cares about; personal importance

Interactive relationship

the relationship between the independent and dependent variable depends on the value of the control variable - for one value of Z, the X-Y relationship might be stronger than for another value of Z; in a set of interaction relationships, the tendency or strength of the relationship between the independent and dependent variable is different, depending on the value of the control variable

Curvilinear relationship

the relationship between the variables depends on which interval or range of the independent variable is being examined; relationship may change from positive to negative or just change in strength

z-score

to calculate: z = (value − mean) / standard deviation; this number indicates how many standard deviations the observation lies from the mean and how unusual the measurement is (the probability of obtaining that measurement); then look up the z-score on a normal table: if it is positive, the table value indicates what percentage of cases lie above it; if negative, what percentage lie below it
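
A short Python sketch that computes a z-score and the tail probability directly (via the normal CDF), so no lookup chart is needed; the measurement, mean, and standard deviation are hypothetical:

import math

value, mean, sd = 130.0, 100.0, 15.0  # hypothetical measurement and distribution

z = (value - mean) / sd
tail = 0.5 * (1 - math.erf(abs(z) / math.sqrt(2)))  # area beyond |z| in one tail
print(f"z = {z:.2f}, proportion of cases beyond z: {tail:.4f}")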

t-ratio

to calculate, (observed sample difference - 0)/standard error of the difference
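
A minimal Python sketch combining this with the standard error of the difference defined above; the observed difference and group standard errors are hypothetical:

import math

mean_difference = 4.5            # hypothetical observed sample difference
se_group1, se_group2 = 1.2, 1.5  # hypothetical standard errors of the two means

# Square each standard error, sum them, take the square root.
se_difference = math.sqrt(se_group1 ** 2 + se_group2 ** 2)
t_ratio = (mean_difference - 0) / se_difference

print(f"standard error of the difference = {se_difference:.3f}")
print(f"t-ratio = {t_ratio:.2f}")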

Validity

truthfulness

Multidimensional concept

two or more distinct groups of empirical characteristics

Multiple regression

allows us to isolate the effect of one independent variable while controlling for the other independent variables
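
A compact Python sketch using least squares (numpy) on hypothetical data; each estimated coefficient is a partial effect, holding the other predictor constant:

import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # hypothetical predictor 1
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])   # hypothetical predictor 2
y = np.array([3.1, 3.9, 7.2, 7.8, 11.1, 11.9])  # hypothetical dependent variable

X = np.column_stack([np.ones_like(x1), x1, x2])  # intercept column plus predictors
coefs, *_ = np.linalg.lstsq(X, y, rcond=None)    # least-squares estimates
a, b1, b2 = coefs
print(f"y-hat = {a:.2f} + {b1:.2f}*x1 + {b2:.2f}*x2")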

Individual-level unit of analysis

when a concept describes a phenomenon at its lowest possible level

Random selection

when every member of the population has an equal chance of being included in the sample; needed for a valid sample

Selection bias

occurs when nonrandom processes determine the composition of the test and control groups, creating compositional differences between them, often unbeknownst to the researcher

Internal validity

within the conditions created artificially by the researcher, the effect of the independent variable is isolated from other plausible explanations (a key strength of lab experiments)

Regression line

y = a + b(x); a is the y-intercept, b is the slope (regression coefficient), and x and y are the independent and dependent variables; the equation provides a general summary of the relationship and allows predictions of future values
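
A minimal Python sketch that fits this line to hypothetical data and uses it for a prediction:

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # hypothetical independent variable
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])  # hypothetical dependent variable

b, a = np.polyfit(x, y, 1)  # degree-1 fit returns slope first, then intercept
print(f"y = {a:.2f} + {b:.2f}x")
print(f"predicted y at x = 6: {a + b * 6:.2f}")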

