Guth Final PSC-150


research report

A research report is one type of report often used in the sciences, engineering, and psychology. Here your aim is to write clearly and concisely about your research topic so that the reader can easily understand the purpose and results of your research.

experimental research

This is an experiment where the researcher manipulates one variable and controls or randomizes the rest of the variables. It has a control group, the subjects have been randomly assigned between the groups, and the researcher tests only one effect at a time. Experimental research is a direct application of the scientific method.

summative index/scale

a multi-item measure in which individual scores on a set of items are combined to form a summary measure.
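As a minimal sketch, a summative index just adds up the item scores; the item battery and scores below are hypothetical.

```python
# Sketch of a summative (additive) index: each respondent answers
# several Likert-style items, and the summary score is their sum.
# The item battery and scores are hypothetical.

def summative_index(item_scores):
    """Combine individual item scores into one summary measure."""
    return sum(item_scores)

# One respondent's answers to a 4-item battery
# (1 = strongly disagree ... 5 = strongly agree):
respondent = [4, 3, 5, 2]
score = summative_index(respondent)
print(score)  # 14, on a scale that can range from 4 to 20
```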

literature review

a systematic examination and interpretation of the literature for the purpose of informing further work on a topic.

hypothesis

a tentative, provisional, or unconfirmed statement that can (in principle) be verified.

cross-tabulation

also called a cross-classification or contingency table, this array displays the joint frequencies and relative frequencies of two categorical (nominal or ordinal) variables.

In statistics, a contingency table (also referred to as a cross tabulation or crosstab) is a type of table in a matrix format that displays the (multivariate) frequency distribution of the variables. They are heavily used in survey research, business intelligence, engineering, and scientific research. They provide a basic picture of the interrelation between two variables and can help find interactions between them. The term contingency table was first used by Karl Pearson in "On the Theory of Contingency and Its Relation to Association and Normal Correlation,"[1] part of the Drapers' Company Research Memoirs Biometric Series I, published in 1904.

A crucial problem of multivariate statistics is finding the (direct-)dependence structure underlying the variables contained in high-dimensional contingency tables. If some of the conditional independences are revealed, then even the storage of the data can be done in a smarter way (see Lauritzen (2002)). To do this, one can use information theory concepts, which gain the information only from the distribution of probability; that distribution can be expressed easily from the contingency table by the relative frequencies.

Suppose that we have two variables, sex (male or female) and handedness (right- or left-handed). Further suppose that 100 individuals are randomly sampled from a very large population as part of a study of sex differences in handedness. A contingency table can be created to display the numbers of individuals who are male and right-handed, male and left-handed, female and right-handed, and female and left-handed. Such a contingency table is shown below.

             Right-handed   Left-handed   Total
  Males           43             9          52
  Females         44             4          48
  Totals          87            13         100

The numbers of males, females, and right- and left-handed individuals are called marginal totals. The grand total, i.e., the total number of individuals represented in the contingency table, is the number in the bottom right corner.

The table allows us to see at a glance that the proportion of men who are right-handed is about the same as the proportion of women who are right-handed, although the proportions are not identical. The significance of the difference between the two proportions can be assessed with a variety of statistical tests, including Pearson's chi-squared test, the G-test, Fisher's exact test, and Barnard's test, provided the entries in the table represent individuals randomly sampled from the population about which we want to draw a conclusion. If the proportions of individuals in the different columns vary significantly between rows (or vice versa), we say that there is a contingency between the two variables; in other words, the two variables are not independent. If there is no contingency, we say that the two variables are independent.

The example above is the simplest kind of contingency table, one in which each variable has only two levels; this is called a 2 × 2 contingency table. In principle, any number of rows and columns may be used. There may also be more than two variables, but higher-order contingency tables are difficult to represent on paper. The relation between ordinal variables, or between ordinal and categorical variables, may also be represented in contingency tables, although such a practice is rare.

The degree of association between the two variables can be summarized by the phi coefficient,

  φ = √(χ² / N)

where χ² is derived from Pearson's chi-squared test and N is the grand total of observations. φ varies from 0 (corresponding to no association between the variables) to 1 or −1 (complete association or complete inverse association). This coefficient can be calculated only for frequency data represented in 2 × 2 tables. φ can reach a minimum value of −1.00 and a maximum value of 1.00 only when every marginal proportion is equal to .50 (and two diagonal cells are empty); otherwise, the phi coefficient cannot reach those minimal and maximal values.
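As a sketch, the phi coefficient for the handedness table can be computed directly from the four cell counts: for a 2 × 2 table, the signed form (ad − bc) divided by the square root of the product of the four marginal totals agrees with √(χ²/N) up to sign.

```python
from math import sqrt

# The handedness-by-sex table from the example above.
# Rows: males, females; columns: right-handed, left-handed.
table = [[43, 9],
         [44, 4]]

a, b = table[0]
c, d = table[1]
n = a + b + c + d

# Marginal totals.
row1, row2 = a + b, c + d
col1, col2 = a + c, b + d

# Signed phi coefficient for a 2x2 table.
phi = (a * d - b * c) / sqrt(row1 * row2 * col1 * col2)

# phi**2 * N recovers Pearson's chi-squared statistic for a 2x2 table.
chi2 = n * phi ** 2

print(round(phi, 3))   # -0.133
print(round(chi2, 3))  # 1.777
```

The small |φ| matches the reading of the table above: handedness is only weakly associated with sex in this sample.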

interviews

interviewing respondents in a nonstandardized, individualized manner. Interviewer bias is the interviewer's influence on the respondent's answers; an example of reactivity.

participant observation

observation in which the observer becomes a regular participant in the activities of those being observed.

dependent variable

the phenomenon thought to be influenced, affected, or caused by some other phenomenon.

direction

(of a relationship) an indication of which values of the dependent variable are associated with which values of the independent variable. A directional hypothesis is a hypothesis that specifies the expected relationship between two or more variables.

measure of association

statistics that summarize the relationship between two variables

nominal measure

variables are variables that have two or more categories, but which do not have an intrinsic order. For example, a real estate agent could classify their types of property into distinct categories such as houses, condos, co-ops or bungalows. So "type of property" is a nominal variable with 4 categories called houses, condos, co-ops and bungalows. Of note, the different categories of a nominal variable can also be referred to as groups or levels of the nominal variable. Another example of a nominal variable would be classifying where people live in the USA by state. In this case there will be many more levels of the nominal variable (50 in fact).

question order

(effect) the effect on responses of question placement within a questionnaire.

An example of a contrast effect can be seen in a Pew Research poll conducted in October 2003 that found that people were more likely to favor allowing gays and lesbians to enter into legal agreements that give them the same rights as married couples when this question was asked after one about whether they favored or opposed allowing gays and lesbians to marry (45% favored legal agreements when asked after the marriage question, but 37% favored legal agreements without the immediately preceding context of a question about gay marriage). Responses to the question about gay marriage, meanwhile, were not significantly affected by its placement before or after the legal agreements question.

Another experiment embedded in a December 2008 Pew Research poll also resulted in a contrast effect. When people were asked "All in all, are you satisfied or dissatisfied with the way things are going in this country today?" immediately after having been asked "Do you approve or disapprove of the way George W. Bush is handling his job as president?", 88% said they were dissatisfied, compared with only 78% without the context of the prior question. Responses to presidential approval remained relatively unchanged whether national satisfaction was asked before or after it. A similar finding occurred in December 2004, when both satisfaction and presidential approval were much higher (57% were dissatisfied when Bush approval was asked first vs. 51% when general satisfaction was asked first).

Several studies have also shown that asking a more specific question before a more general question (e.g., asking about happiness with one's marriage before asking about one's overall happiness) can result in a contrast effect. Although some exceptions have been found, people tend to avoid redundancy by excluding the more specific question from the general rating.

dichotomous

(variable) a nominal-level variable having only two categories that for certain analytical purposes can be treated as a quantitative variable. Dichotomous variables are nominal variables which have only two categories or levels. For example, if we were looking at gender, we would most probably categorize somebody as either "male" or "female"; this is an example of a dichotomous variable (and also a nominal variable). Another example might be if we asked a person if they owned a mobile phone; here, we may categorize mobile phone ownership as either "yes" or "no". In the real estate agent example, if type of property had been classified as either residential or commercial, then "type of property" would be a dichotomous variable. Examples of dichotomous variables: male or female; heads or tails; rich or poor; Democrat or Republican; pass or fail; under age 65 or 65 and over.

control variable

A value or values that are held constant in order to assess or clarify the relationship between the independent and dependent variables; it must be kept the same for both the control group and the experimental group. In other words, a factor that is held constant to test the relative impact of an independent variable.

Another feature of the scientific method is the ceteris paribus (all other things being equal) clause. This is about holding all the other possibly relevant conditions constant in order to check for a significant causal relationship between the two variables that interest you. Social scientists collect information on more than just the independent and dependent variables so that they can check on the action of the I.V. and D.V. in different pools of information where other conditions were in fact the same. In our example, other possible things that might influence the D.V. (election outcome) are campaign spending, personality, success or failure in foreign policy, and so forth. These are the kinds of things you will want to operationalize and hold constant ("control for") in order to make sure that it is your I.V. alone that is causing the variation in your D.V.

Human behavior is usually too complicated to be studied with only two variables. Often we will want to consider sets of three or more variables (called multivariate analysis). We will want to consider three or more variables when we have discovered a relationship between two variables and want to find out (1) if this relationship might be due to some other factor, (2) how or why these variables are related, or (3) if the relationship is the same for different types of individuals. In each situation, we identify a third variable that we want to consider. This is called the control or the test variable. (Although it is possible to use several control variables simultaneously, we will limit ourselves to one control variable at a time.)
To introduce a third variable, we identify the control variable and separate the cases in our sample by the categories of the control variable. For example, if the control variable is age divided into two categories, younger and older, we would separate the cases into two groups: one group would consist of individuals who are younger, and the other group would be those who are older. We would then obtain the crosstabulation of the independent and dependent variables for each of these age groups. Since there are two categories in this control variable, we obtain two partial tables, each containing part of the original sample. (If there were three categories in our control variable, for example young, middle-aged, and old, we would have three partial tables.)

The process of using a control variable in the analysis is called elaboration and was developed at Columbia University by Paul Lazarsfeld and his associates. There are several different types of outcomes to the elaboration process. We will discuss each briefly.

Table 2.3 showed that females were more likely than males to say they were willing to vote for a woman. Let's introduce a control variable and see what happens. In this example we are going to use age as the control variable. Table 3.1 is the three-variable table with voting preference as the dependent variable, sex as the independent variable, and age as the control variable. When we look at the older respondents (the left-hand partial table), we discover that this partial table is very similar to the original two-variable table (Table 2.3). The same is true for the younger respondents (the right-hand partial table). Each partial table is very similar to the original two-variable table. This is often referred to as replication because the partial tables repeat the original two-variable table (see Babbie 1997: 393-396). It is not necessary that they be identical, just that each partial table be basically the same as the original two-variable table.
Our conclusion is that age is not affecting the relationship between sex and voting preference. In other words, the difference between males and females in voting preference is not due to age.

Table 3.1 -- Voting Preference by Sex Controlling for Age

                                  Older                        Younger
                        Male %  Female %  Total %    Male %  Female %  Total %
  Willing to Vote
    for a Woman          43.8     56.1      49.0      44.2     55.8      52.9
  Not Willing to Vote
    for a Woman          56.2     43.9      51.0      55.8     44.2      47.1
  Total                 100.0    100.0     100.0     100.0    100.0     100.0
  (N)                   (240)    (180)     (420)     (120)    (360)     (480)

Since this is a hypothetical example, imagine a different outcome. Suppose we introduce age as a control variable and instead of getting Table 3.1, we get Table 3.2. How do these two tables differ? In Table 3.2, the percentage difference between males and females has disappeared in both of the partial tables. This is called explanation because the control variable, age, has explained away the original relationship between sex and voting preference. (We often say that the relationship between the two variables is spurious, not genuine.) When age is held constant, the difference between males and females disappears. The difference in the relationship does not have to disappear entirely, only be reduced substantially in each of the partial tables. This can only occur when there is a relationship between the control variable (age) and each of the other two variables (sex and voting preference).

Next, we are interested in how or why the two variables are related. Suppose females are more likely than males to vote for a woman and that this difference cannot be explained away by age or by any other variable we have considered. We need to think about why there might be such a difference in the preferences of males and females.
Table 3.2 -- Voting Preference by Sex Controlling for Age

                                  Older                        Younger
                        Male %  Female %  Total %    Male %  Female %  Total %
  Willing to Vote
    for a Woman          32.9     33.9      33.3      65.8     66.9      66.7
  Not Willing to Vote
    for a Woman          67.1     66.1      66.7      34.2     33.1      33.3
  Total                 100.0    100.0     100.0     100.0    100.0     100.0
  (N)                   (240)    (180)     (420)     (120)    (360)     (480)

Perhaps females are more often liberal than males, and liberals are more likely to say they would vote for a woman. So we introduce liberalism/conservatism as a control variable in our analysis. If females are more likely to support a woman because they are more liberal, then the difference between the preferences of men and women should disappear or be substantially reduced when liberalism/conservatism is held constant. This process is called interpretation because we are interpreting how one variable is related to another variable. Table 3.3 shows what we would expect to find if females supported the woman because they were more liberal. Notice that in both partial tables, the difference in the percentages between men and women has disappeared. (It is not necessary that it disappear entirely, only that it be substantially reduced in each of the partial tables.)

Table 3.3 -- Voting Preference by Sex Controlling for Liberalism/Conservatism

                               Conservative                    Liberal
                        Male %  Female %  Total %    Male %  Female %  Total %
  Willing to Vote
    for a Woman          32.9     33.9      33.3      65.8     66.9      66.7
  Not Willing to Vote
    for a Woman          67.1     66.1      66.7      34.2     33.1      33.3
  Total                 100.0    100.0     100.0     100.0    100.0     100.0
  (N)                   (240)    (180)     (420)     (120)    (360)     (480)

Finally, let's focus on the third of the situations outlined at the beginning of this section: whether the relationship is the same for different types of individuals. Perhaps the relationship between sex and voter preference varies with other characteristics of the individuals.
Maybe among whites, females are more likely to prefer women candidates than males are, but among blacks, there is little difference between males and females in terms of voter preference. This is the outcome shown in Table 3.4. This process is called specification because it specifies the conditions under which the relationship between sex and voter preference varies.

Table 3.4 -- Voting Preference by Sex Controlling for Race

                                  White                   African American
                        Male %  Female %  Total %    Male %  Female %  Total %
  Willing to Vote
    for a Woman          42.9     56.5      51.2      50.0     50.0      50.0
  Not Willing to Vote
    for a Woman          57.1     43.5      48.8      50.0     50.0      50.0
  Total                 100.0    100.0     100.0     100.0    100.0     100.0
  (N)                   (310)    (490)     (800)      (50)     (50)     (100)

In the earlier section on bivariate analysis, we discussed the use of chi square. Remember that chi square is a test of independence used to determine if there is a relationship between two variables. Chi square is used in multivariate analysis the same way it is in bivariate analysis: there will be a separate value of chi square for each partial table in the multivariate analysis.

You should keep a number of warnings in mind. Chi square assumes that the expected frequencies for each cell are five or larger. As long as 80% of these expected frequencies are five or larger and no single expected frequency is very small, we don't have to worry. However, the expected frequencies often drop below five when the number of cases in a column or row gets too small. If this should occur, you will have to either recode (i.e., combine columns or rows) or eliminate a column or row from the table. Another point to keep in mind is that chi square is affected by the number of cases in the table: with a lot of cases it is easy to reject the null hypothesis of no relationship, and with few cases it can be quite hard to reject the null hypothesis. Also, consider the percentages within the table. Look for patterns. Do not rely on any single piece of information.
Look at the whole picture. We have concentrated on crosstabulation and chi square. There are other types of statistical analysis such as regression and log-linear analysis. When you have mastered these techniques, look at some other types of analysis.
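The elaboration procedure described above can be sketched in a few lines of Python: split the sample by the categories of the control variable, then crosstab the independent and dependent variables within each partial group. The records and labels below are hypothetical, not the data from the tables above.

```python
# Sketch of elaboration: build one partial crosstab per category
# of a control variable. The sample records are hypothetical.
from collections import Counter, defaultdict

records = [
    # (sex, age_group, willing_to_vote_for_a_woman)
    ("female", "younger", "yes"), ("female", "younger", "yes"),
    ("male",   "younger", "no"),  ("male",   "younger", "yes"),
    ("female", "older",   "yes"), ("female", "older",   "no"),
    ("male",   "older",   "no"),  ("male",   "older",   "no"),
]

def partial_tables(records, control_index=1):
    """Return one crosstab (a Counter of (I.V., D.V.) cells) per
    category of the control variable."""
    tables = defaultdict(Counter)
    for record in records:
        control = record[control_index]          # e.g., age group
        sex, _, pref = record                    # I.V. and D.V.
        tables[control][(sex, pref)] += 1
    return dict(tables)

# One partial table per age group, ready to be percentaged and compared.
for category, cells in partial_tables(records).items():
    print(category, dict(cells))
```

Percentaging each partial table within columns, and comparing it to the original two-variable table, is then the replication/explanation/interpretation/specification diagnosis described above.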

survey research

Gathering primary data by asking people questions about their knowledge, attitudes, preferences, and buying behavior. A Pew Research survey is a familiar example.

replication

In engineering, science, and statistics, replication is the repetition of an experimental condition so that the variability associated with the phenomenon can be estimated. ASTM, in standard E1847, defines replication as "the repetition of the set of all the treatment combinations to be compared in an experiment." In an experiment, replication refers to the practice of assigning each treatment to many experimental subjects. In general, the more subjects in each treatment condition, the lower the variability of the dependent measures.

empirical relationship

In science, an empirical relationship is a relationship or correlation based solely on observation rather than theory. An empirical relationship requires only confirmatory data, irrespective of theoretical basis. Sometimes theoretical explanations for what were initially empirical relationships are found, in which case the relationships are no longer considered empirical. Thus correlation is not causation, but in some cases can be found to result from it. Other times the empirical relationships are merely approximations, often equivalent to the first few terms of the Taylor series of the "real" answer (though in practice these approximations may be so accurate it is difficult to tell they're approximations). Still other times the relationships may later be found to hold only under certain specific conditions, reducing them to special cases of more general relationships. Historically, the discovery of empirical relationships has been important as a stepping stone to the discovery of theoretical relationships. And on occasion, what was thought to be an empirical factor is later deemed to be a fundamental physical constant. An empirical equation is simply a mathematical statement of one or more empirical relationships in the form of an equation. Elaine focuses her studies and research on empirical political theory. In the simplest terms, empirical political theory is focused on explaining 'what is' through observation. In this approach, scholars seek to generate a hypothesis, which is a proposed explanation for some phenomenon that can be tested empirically. After formulating a hypothesis, a study will be designed to test it. Let's look at an example. Elaine is interested in the role of money in modern Senate elections. She develops a hypothesis that candidates who outspend their opponents win.
Elaine then goes about designing a study to test her hypothesis by examining election results and campaign finance reporting disclosures. Elaine must be careful to control for other variables that may affect the result, such as incumbency, and focus on states with a relatively equal balance of political party membership. After collecting the data, she will determine through statistical analysis if it tends to support or not support her hypothesis.

spurious relationship

In statistics, a spurious relationship (not to be confused with spurious correlation) is a mathematical relationship in which two events or variables have no direct causal connection, yet it may be wrongly inferred that they do, due to either coincidence or the presence of a certain third, unseen factor (referred to as a "common response variable," "confounding factor," or "lurking variable"). Suppose there is found to be a correlation between A and B. Aside from coincidence, there are three possible relationships: Where A is present, B is observed. (A causes B.) Where B is present, A is observed. (B causes A.) OR Where C is present, both A and B are observed. (C causes both A and B.) In the last case there is a spurious relationship between A and B. In a regression model where A is regressed on B but C is actually the true causal factor for A, this misleading choice of independent variable (B instead of C) is called specification error. Because correlation can arise from the presence of a lurking variable rather than from direct causation, it is often said that "Correlation does not imply causation". A spurious relationship should not be confused with a spurious regression, which refers to a regression that shows significant results due to the presence of a unit root in both variables
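A toy simulation (illustrative, not from the source) makes the lurking-variable case concrete: here C causes both A and B, so A and B correlate strongly even though neither causes the other.

```python
# Simulate a spurious relationship: lurking variable C drives both
# A and B. All variable names and parameters are illustrative.
import random

random.seed(0)
n = 1000
C = [random.gauss(0, 1) for _ in range(n)]
A = [c + random.gauss(0, 0.5) for c in C]   # C causes A (plus noise)
B = [c + random.gauss(0, 0.5) for c in C]   # C causes B (plus noise)

def pearson_r(x, y):
    """Pearson correlation coefficient, computed from scratch."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

# A and B correlate strongly despite no direct A -> B link.
print(round(pearson_r(A, B), 2))
```

Regressing A on B here would be exactly the specification error described above: the correlation disappears once C is held constant.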

missing data

In statistics, missing data, or missing values, occur when no data value is stored for the variable in an observation. Missing data are a common occurrence and can have a significant effect on the conclusions that can be drawn from the data. Missing data can occur because of nonresponse: no information is provided for several items or no information is provided for a whole unit. Some items are more sensitive for nonresponse than others, for example items about private subjects such as income. Dropout is a type of missingness that occurs mostly when studying development over time. In this type of study the measurement is repeated after a certain period of time. Missingness occurs when participants drop out before the test ends and one or more measurements are missing. Sometimes missing values are caused by the researcher—for example, when data collection is done improperly or mistakes are made in data entry.[1] Data often are missing in research in economics, sociology, and political science because governments choose not to, or fail to, report critical statistics.[2]

random sample

Simple random sampling refers to a sampling method that has the following properties: the population consists of N objects, the sample consists of n objects, and all possible samples of n objects are equally likely to occur. An important benefit of simple random sampling is that it allows researchers to use statistical methods to analyze sample results. For example, given a simple random sample, researchers can use statistical methods to define a confidence interval around a sample mean. Statistical analysis is not appropriate when non-random sampling methods are used. There are many ways to obtain a simple random sample. One way would be the lottery method. Each of the N population members is assigned a unique number. The numbers are placed in a bowl and thoroughly mixed. Then, a blindfolded researcher selects n numbers. Population members having the selected numbers are included in the sample.
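The lottery method can be sketched with the standard library's random.sample, which draws n of the N members without replacement; the population names and seed below are illustrative.

```python
# Lottery-method sketch of a simple random sample.
# Population labels and the seed are illustrative.
import random

population = [f"member_{i}" for i in range(1, 101)]  # N = 100 objects

random.seed(42)  # fixed seed so the draw is reproducible
sample = random.sample(population, k=10)  # n = 10, without replacement

print(len(sample))       # 10
print(len(set(sample)))  # 10 -- no member is drawn twice
```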

documentary record

The documentary record is the body of primary sources relating to a topic. Primary sources may include contemporaneous newspapers, books, magazine articles, diaries, correspondence, maps, laws, court cases, memoirs and autobiographies, photographs, records of government agencies and organizations, research data, and public opinion polls. A good place to get a handle on primary sources may be to look at archival records; an excellent archive that shows the active collection of the documentary record is the Vietnam Center and Archive at Texas Tech University.

open-end question

a question with no response alternatives provided for the respondent

scatter diagram

a graph that plots joint values of an independent variable along one axis (usually the x-axis) and a dependent variable along the other axis (usually the y-axis). To read a scatter plot you need to look for the overall pattern.

Positive gradient: when the larger values of the horizontal (explanatory) variable are associated with larger values of the vertical (response) variable. As the explanatory variable increases, so does the response variable. Can you see how the data, as we move from left to right, are gradually rising?

Negative gradient: when the larger values of the explanatory variable are associated with smaller values of the response variable. As the explanatory variable increases, the response variable decreases. Can you see how the data, as we move from left to right, are gradually decreasing?

We need to know whether there is association or not, and whether it is linear or not. The relationship might be linear or curved, or there might be no underlying form. In this course we will mainly concentrate on linear relationships, but we must be aware of the existence of non-linear ones. The strength of the pattern is related to how tightly clustered the points are around the underlying form. We often use phrases like those following to describe the strength of the relationship, whether negative or positive. These phrases are, of course, subjective.

bar chart

a graphic display of the data in a frequency or percentage distribution

interval measure

a measure for which a one-unit difference in scores is the same throughout the range of the measure. The interval type allows for the degree of difference between items, but not the ratio between them. Examples include temperature on the Celsius scale, which has two defined points (the freezing and boiling points of water under specific conditions) separated into 100 intervals; dates measured from an arbitrary epoch (such as AD); and direction measured in degrees from true or magnetic north. Ratios are not allowed, since 20 °C cannot be said to be "twice as hot" as 10 °C, nor can multiplication or division be carried out between any two dates directly. However, ratios of differences can be expressed; for example, one difference can be twice another. Interval-type variables are sometimes also called "scaled variables", but the formal mathematical term is an affine space (in this case an affine line).

(statistics) An ordinal variable with the additional property that the magnitudes of the differences between two values are meaningful. An interval variable is similar to an ordinal variable, except that the intervals between the values of the interval variable are equally spaced. For example, suppose you have a variable such as annual income measured in dollars, and we have three people who make $10,000, $15,000 and $20,000. The second person makes $5,000 more than the first person and $5,000 less than the third person, and the size of these intervals is the same. If there were two other people who make $90,000 and $95,000, the size of the interval between these two people is also the same ($5,000).

Statistical computations and analyses assume that the variables have specific levels of measurement. For example, it would not make sense to compute an average hair color; an average of a categorical variable does not make much sense because there is no intrinsic ordering of the levels of the categories. Moreover, if you tried to compute the average of educational experience as defined in the ordinal section above, you would also obtain a nonsensical result. Because the spacing between the four levels of educational experience is very uneven, the meaning of this average would be very questionable. In short, an average requires a variable to be interval.

Sometimes you have variables that are "in between" ordinal and interval, for example, a five-point Likert scale with the values "strongly agree", "agree", "neutral", "disagree" and "strongly disagree". If we cannot be sure that the intervals between each of these five values are the same, then we would not be able to say that this is an interval variable; we would say that it is an ordinal variable. However, in order to be able to use statistics that assume the variable is interval, we will assume that the intervals are equally spaced.

ratio measure

a measure for which the scores possess the full mathematical properties of the numbers assigned. A ratio variable, has all the properties of an interval variable, and also has a clear definition of 0.0. When the variable equals 0.0, there is none of that variable. Variables like height, weight, enzyme activity are ratio variables. Temperature, expressed in F or C, is not a ratio variable. A temperature of 0.0 on either of those scales does not mean 'no heat'. However, temperature in Kelvin is a ratio variable, as 0.0 Kelvin really does mean 'no heat'. Another counter example is pH. It is not a ratio variable, as pH=0 just means 1 molar of H+. and the definition of molar is fairly arbitrary. A pH of 0.0 does not mean 'no acidity' (quite the opposite!). When working with ratio variables, but not interval variables, you can look at the ratio of two measurements. A weight of 4 grams is twice a weight of 2 grams, because weight is a ratio variable. A temperature of 100 degrees C is not twice as hot as 50 degrees C, because temperature C is not a ratio variable. A pH of 3 is not twice as acidic as a pH of 6, because pH is not a ratio variable. A categorical variable, also called a nominal variable, is for mutual exclusive, but not ordered, categories. For example, your study might compare five different genotypes. You can code the five genotypes with numbers if you want, but the order is arbitrary and any calculations (for example, computing an average) would be meaningless. A ordinal variable, is one where the order matters but not the difference between values. For example, you might ask patients to express the amount of pain they are feeling on a scale of 1 to 10. A score of 7 means more pain that a score of 5, and that is more than a score of 3. But the difference between the 7 and the 5 may not be the same as that between 5 and 3. The values simply express an order. Another example would be movie ratings, from * to *****. 
An interval variable is a measurement where the difference between two values is meaningful: the difference between a temperature of 100 degrees and 90 degrees is the same as the difference between 90 degrees and 80 degrees.
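The ratio-versus-interval distinction above can be sketched in a few lines of Python; the weights and temperatures are made-up illustrative values:

```python
# Ratio variable: a true zero makes ratios meaningful.
weight_a, weight_b = 4.0, 2.0          # grams
assert weight_a / weight_b == 2.0      # 4 g really is twice 2 g

# Interval variable: Celsius has an arbitrary zero, so ratios mislead.
temp_a_c, temp_b_c = 100.0, 50.0       # degrees Celsius
naive_ratio = temp_a_c / temp_b_c      # 2.0, but physically meaningless

# Converting to kelvin (a ratio scale) shows the real ratio of heat content.
temp_a_k = temp_a_c + 273.15
temp_b_k = temp_b_c + 273.15
true_ratio = temp_a_k / temp_b_k       # about 1.15, not 2.0

print(round(naive_ratio, 2), round(true_ratio, 2))
```

The point of the sketch: 100 C is only about 15% hotter than 50 C in absolute terms, even though the Celsius numbers differ by a factor of two.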

ordinal measure

a measure for which the scores represent ordered categories that are not necessarily equidistant from each other. Ordinal variables have two or more categories, just like nominal variables, but the categories can also be ordered or ranked. So if you asked someone whether they liked the policies of the Democratic Party and they could answer either "Not very much", "They are OK" or "Yes, a lot", then you have an ordinal variable. Why? Because you have three categories, namely "Not very much", "They are OK" and "Yes, a lot", and you can rank them from the most positive (Yes, a lot), through the middle response (They are OK), to the least positive (Not very much). However, while we can rank the levels, we cannot assign a "value" to them; we cannot say that "They are OK" is twice as positive as "Not very much", for example. That is, an ordinal variable is similar to a categorical variable; the difference between the two is that there is a clear ordering of the categories. For example, suppose you have a variable, economic status, with three categories (low, medium and high). In addition to being able to classify people into these three categories, you can order the categories as low, medium and high. Now consider a variable like educational experience (with values such as elementary school graduate, high school graduate, some college and college graduate). These can also be ordered as elementary school, high school, some college, and college graduate. Even though we can order these from lowest to highest, the spacing between the values may not be the same across the levels of the variable. Say we assign scores 1, 2, 3 and 4 to these four levels of educational experience and we compare the difference in education between categories one and two with the difference between categories two and three, or the difference between categories three and four.
The difference between categories one and two (elementary and high school) is probably much bigger than the difference between categories two and three (high school and some college). In this example, we can order the people by level of educational experience, but the size of the difference between categories is inconsistent (because the spacing between categories one and two is bigger than that between categories two and three). If these categories were equally spaced, then the variable would be an interval variable.
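As a sketch (the category labels and respondent data here are invented for illustration), ordinal categories can legitimately be ranked, even though arithmetic on the codes is questionable:

```python
# Hypothetical ordinal coding for educational experience.
levels = ["elementary", "high school", "some college", "college graduate"]
rank = {label: i + 1 for i, label in enumerate(levels)}  # 1..4, order only

respondents = ["college graduate", "elementary", "some college"]

# Sorting by rank is legitimate for ordinal data...
ordered = sorted(respondents, key=rank.get)
print(ordered)  # lowest to highest educational experience

# ...but arithmetic on the codes is not: this "average" has no clear meaning,
# because the spacing between adjacent codes is not known to be equal.
dubious_average = sum(rank[r] for r in respondents) / len(respondents)
```

The sort is defensible because only order is used; the average is the kind of computation the paragraph above warns against.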

arrow diagram

a pictorial representation of a researcher's explanatory scheme

falsifiability

a property of a statement or hypothesis such that it can (in principle, at least) be rejected in the face of contravening evidence. For example, by the problem of induction, no number of confirming observations can verify a universal generalization, such as "All swans are white," yet it is logically possible to falsify it by observing a single black swan. Thus, the term falsifiability is sometimes synonymous with testability. Some statements, such as "It will be raining here in one million years," are falsifiable in principle, but not in practice.[2]

closed-ended question

a question with response alternatives provided. Researchers will sometimes conduct a pilot study using open-ended questions to discover which answers are most common. They will then develop closed-ended questions that include the most common responses as answer choices. In this way, the questions may better reflect what the public is thinking or how they view a particular issue. In a conversation, when completing a research survey, being interviewed for a job or working on a homework assignment, you might find yourself presented with a series of closed-ended or open-ended questions. Closed-ended questions are those which can be answered by a simple "yes" or "no," while open-ended questions are those which require more thought and more than a simple one-word answer. If you can answer a question with only a "yes" or "no" response, then you are answering a closed-ended question. Examples of closed-ended questions are: Are you feeling better today? May I use the bathroom? Is the prime rib a special tonight? Should I date him? Will you please do me a favor? Have you already completed your homework? Is that your final answer? Were you planning on becoming a fireman? Should I call her and sort things out? Is it wrong to want to live on my own at this age? Shall we make dinner together tonight? Could I possibly be a messier house guest? Might I be of service to you ladies this evening? Did that man walk by the house before? Can I help you with that? May I please have a bite of that pie? Would you like to go to the movies tonight? Is math your favorite subject? Does four plus four equal eight? Is that haunted house really scary? Will you be going to Grandmother's house for Christmas? Did Dad make the cake today? Is there a Mass being held at noon? Are you pregnant? Are you happy? Is he dead? Closed-ended questions should not always be thought of as simple questions that anyone can quickly answer merely because they require a yes or no answer.
Closed-ended questions can also be very complicated. For example, "Is 1 in binary equal to 1 in counting numbers?" is a closed-ended question that not everyone would be able to answer quickly.

negative relationship

a relationship in which high values of one variable are associated with low values of another variable (a rise in x corresponds to a decrease in y).

theory

a statement or series of related statements that organize, explain, and predict phenomena.

correlation

a statement that the values or states of one thing systematically vary with the values or states of another; an association between two variables

content analysis

a systematic procedure by which records are transformed into quantitative data. Example: a study of Hispanics in televised entertainment programming might 1) count whether there was at least one Hispanic present, 2) count how many Hispanics there were, 3) measure how much time Hispanics were on the screen, and 4) rate how favorable the portrayal of Hispanics was, or how important the portrayal of Hispanics was for the overall story.
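A toy version of the first three coding steps can be sketched as follows; the program records are entirely hypothetical:

```python
# Hypothetical coding sheet: one record per program, giving the number of
# Hispanic characters and their total screen time in seconds.
programs = [
    {"title": "Show A", "hispanic_characters": 2, "screen_seconds": 340},
    {"title": "Show B", "hispanic_characters": 0, "screen_seconds": 0},
    {"title": "Show C", "hispanic_characters": 1, "screen_seconds": 95},
]

# 1) programs with at least one Hispanic character
with_any = sum(1 for p in programs if p["hispanic_characters"] > 0)
# 2) total count of Hispanic characters across programs
total_chars = sum(p["hispanic_characters"] for p in programs)
# 3) total screen time in seconds
total_time = sum(p["screen_seconds"] for p in programs)

print(with_any, total_chars, total_time)  # 2 3 435
```

Step 4 (rating favorability) would require a coding scheme with explicit rules, which is exactly what makes content analysis systematic rather than impressionistic.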

intervening variable

a variable coming between an independent variable and a dependent variable in an explanatory scheme. Example: The statistical association between income and longevity needs to be explained because just having money does not make one live longer. Other variables intervene between money and long life. People with high incomes tend to have better medical care than those with low incomes. Medical care is an intervening variable. It mediates the relation between income and longevity.

concept

an abstract idea; a general notion. Government means the "formal institutions and processes of a politically organized society with authority to make, enforce, and interpret laws and other binding rules about matters of common interest and concern. Government also refers to the group of people, acting in formal political institutions at national, state, and local levels, who exercise decision making power or enforce laws and regulations." (Taken from: Civics Framework for the 1998 National Assessment of Educational Progress, NAEP Civics Consensus Project, The National Assessment Governing Board, United States Department of Education, p. 19). Civic Values refer to those important principles that serve as the foundation for our democratic form of government. These values include justice, honesty, self-discipline, due process, equality, majority rule with respect for minority rights, and respect for self, others, and property. • Justice means the fair, equal, proportional, or appropriate treatment rendered to individuals in interpersonal, societal, or government interactions.

regression

analysis: a technique for measuring the relationship between two interval- or ratio-level variables. coefficient: a statistic that tells how much the dependent variable changes per unit change in the independent variable. constant: the value of the dependent variable when all of the independent variables in the equation equal 0. In statistics, regression analysis is a statistical process for estimating the relationships among variables. It includes many techniques for modeling and analyzing several variables, when the focus is on the relationship between a dependent variable and one or more independent variables. More specifically, regression analysis helps one understand how the typical value of the dependent variable (or 'criterion variable') changes when any one of the independent variables is varied, while the other independent variables are held fixed. Most commonly, regression analysis estimates the conditional expectation of the dependent variable given the independent variables - that is, the average value of the dependent variable when the independent variables are fixed. Less commonly, the focus is on a quantile, or other location parameter of the conditional distribution of the dependent variable given the independent variables. In all cases, the estimation target is a function of the independent variables called the regression function. In regression analysis, it is also of interest to characterize the variation of the dependent variable around the regression function, which can be described by a probability distribution. Regression analysis is widely used for prediction and forecasting, where its use has substantial overlap with the field of machine learning. Regression analysis is also used to understand which among the independent variables are related to the dependent variable, and to explore the forms of these relationships.
In restricted circumstances, regression analysis can be used to infer causal relationships between the independent and dependent variables. However, this can lead to illusory or false relationships, so caution is advisable;[1] for example, correlation does not imply causation. Many techniques for carrying out regression analysis have been developed. Familiar methods such as linear regression and ordinary least squares regression are parametric, in that the regression function is defined in terms of a finite number of unknown parameters that are estimated from the data. Nonparametric regression refers to techniques that allow the regression function to lie in a specified set of functions, which may be infinite-dimensional. The performance of regression analysis methods in practice depends on the form of the data-generating process, and how it relates to the regression approach being used. Since the true form of the data-generating process is generally not known, regression analysis often depends to some extent on making assumptions about this process. These assumptions are sometimes testable if a sufficient quantity of data is available. Regression models for prediction are often useful even when the assumptions are moderately violated, although they may not perform optimally. However, in many applications, especially with small effects or questions of causality based on observational data, regression methods can give misleading results.
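The slope-and-intercept machinery for the one-predictor case can be sketched directly from the least-squares formulas; the data here are invented to lie roughly on y = 2x + 1:

```python
# A minimal least-squares sketch for one predictor (invented data).
x = [1.0, 2.0, 3.0, 4.0]
y = [3.1, 4.9, 7.2, 8.8]  # roughly y = 2x + 1

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# slope = covariance(x, y) / variance(x); the intercept places the line
# through the point of means (mean_x, mean_y).
slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
        / sum((xi - mean_x) ** 2 for xi in x)
intercept = mean_y - slope * mean_x

print(round(slope, 2), round(intercept, 2))  # 1.94 1.15
```

The fitted coefficient (about 1.94) is the regression coefficient of the definition above: the estimated change in y per unit change in x. (Python 3.10+ also offers `statistics.linear_regression` for the same computation.)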

deduction/induction

deduction: a process of reasoning from a theory to specific observations. induction: a process of reasoning in which one draws an inference from a set of premises and observations; the premises of an inductive argument support its conclusion but do not prove it.

multivariate analysis

data analysis techniques designed to test hypotheses involving more than two variables. Simple and multiple regression: The very simplest case of a single scalar predictor variable x and a single scalar response variable y is known as simple linear regression. The extension to multiple and/or vector-valued predictor variables (denoted with a capital X) is known as multiple linear regression, also known as multivariable linear regression. Nearly all real-world regression models involve multiple predictors, and basic descriptions of linear regression are often phrased in terms of the multiple regression model. Note, however, that in these cases the response variable y is still a scalar. Another term, multivariate linear regression, refers to cases where y is a vector, i.e., the same as general linear regression. The difference between multivariate linear regression and multivariable linear regression should be emphasized, as it causes much confusion and misunderstanding in the literature.

secondary data analysis

data not collected by the researcher. Secondary data analysis is the use of data that were collected by someone else for some other purpose. In this case, the researcher poses questions that are addressed through the analysis of a data set that they were not involved in collecting. The data were not collected to answer the researcher's specific research questions and were instead collected for another purpose. The same data set can therefore be a primary data set to one researcher and a secondary data set to a different researcher.

causal relationship

is the relation between an event (the cause) and a second event (the effect), where the first event is understood to be responsible for the second. In some data sets, it is possible to conclude that one variable has a direct influence on the other. This is called a causal relationship. A scientist in a dairy factory tries four different packaging materials for blocks of cheese and measures their shelf life. The packaging material might influence shelf life, but the shelf life cannot influence the packaging material used. The relationship is therefore causal. A bank manager is concerned with the number of customers whose accounts are overdrawn. Half of the accounts that become overdrawn in one week are randomly selected and the manager telephones the customer to offer advice. Any difference between the mean account balances after two months of the overdrawn accounts that did and did not receive advice can be causally attributed to the phone calls. If two variables are causally related, it is possible to conclude that changes to the explanatory variable, X, will have a direct impact on Y.

field research

open-ended and wide-ranging (rather than structured) interviewing and qualitative analysis; field research examines an intervention in the real world (or, as many experimentalists like to say, in naturally occurring environments) rather than in the laboratory. Field experiments, like lab experiments, generally randomize subjects (or other sampling units) into treatment and control groups and compare outcomes between these groups. Field experiments are so named in order to draw a contrast with laboratory experiments, which enforce scientific control by testing a hypothesis in the artificial and highly controlled setting of a laboratory. Often used in the social sciences, and especially in economic analyses of education and health interventions, field experiments have the advantage that outcomes are observed in a natural setting rather than in a contrived laboratory environment. For this reason, field experiments are sometimes seen as having higher external validity than laboratory experiments. However, like natural experiments, field experiments suffer from the possibility of contamination: experimental conditions can be controlled with more precision and certainty in the lab. Yet some phenomena (e.g., voter turnout in an election) cannot be easily studied in a laboratory. Jot notes: key words or phrases written down while in the field. Field notes proper: a description of the physical context and the people involved, including their behavior and nonverbal communication. Methodological notes: new ideas that the researcher has on how to carry out the research project. Journals and diaries: notes recording the ethnographer's personal reactions, frustrations, and assessments of life and work in the field.

external validity

the ability to generalize from one set of research findings to other situations. "A threat to external validity is an explanation of how you might be wrong in making a generalization."[3] Generally, generalizability is limited when the cause (i.e. the independent variable) depends on other factors; therefore, all threats to external validity interact with the independent variable - a so-called background factor × treatment interaction.[4] Aptitude-treatment interaction: The sample may have certain features that interact with the independent variable, limiting generalizability. For example, inferences based on comparative psychotherapy studies often employ specific samples (e.g. volunteers, highly depressed, no comorbidity). If psychotherapy is found effective for these sample patients, will it also be effective for non-volunteers, the mildly depressed, or patients with concurrent other disorders? Situation: All situational specifics (e.g. treatment conditions, time, location, lighting, noise, treatment administration, investigator, timing, scope and extent of measurement, etc.) of a study potentially limit generalizability. Pre-test effects: If cause-effect relationships can only be found when pre-tests are carried out, then this also limits the generality of the findings. Post-test effects: If cause-effect relationships can only be found when post-tests are carried out, then this also limits the generality of the findings. Reactivity (placebo, novelty, and Hawthorne effects): If cause-effect relationships are found, they might not be generalizable to other settings or situations if the effects occurred only as an artifact of studying the situation. Rosenthal effects: Inferences about cause-consequence relationships may not be generalizable to other investigators or researchers.
Cook and Campbell[5] made the crucial distinction between generalizing to some population and generalizing across subpopulations defined by different levels of some background factor. Lynch has argued that it is almost never possible to generalize to meaningful populations except as a snapshot of history, but it is possible to test the degree to which the effect of some cause on some dependent variable generalizes across subpopulations that vary in some background factor. That requires a test of whether the treatment effect being investigated is moderated by interactions with one or more background factors.[6][7]

internal validity

the ability to show that manipulation or variation of the independent variable actually causes the dependent variable to change; (internally) the correspondence between a measure and the concept it is supposed to measure. For example, if we are studying the variable of pay and the result of hard work, we want to be able to say that no other factor (not personality, not motivation, not competition) causes the hard work. We want to say that pay, and pay alone, makes people like Sean work harder.

reliability

the extent to which a measure yields the same results on repeated trials Reliability in research methods concerns the quality of measurement. Reliability refers to the "repeatability" or "consistency" of research measures.[1] Reliability, like validity, is a way of assessing the quality of the measurement procedure used to collect data in a dissertation. In order for the results from a study to be considered valid, the measurement procedure must first be reliable. In this article, we: (a) explain what reliability is, providing examples; (b) highlight some of the more common threats to reliability in research; (c) briefly discuss each of the main types of reliability you may use in your dissertation, and the situations where they are appropriate; and (d) point to the various articles on the Lærd Dissertation website where we discuss each of these types of reliability in more detail, including articles explaining how to run these reliability tests in the statistics package, SPSS, as well as interpret and write up the results of such tests. In order to examine reliability, a number of statistical tests can be used. These include Pearson correlation, Spearman's correlation, independent t-test, dependent t-test, one-way ANOVA, repeated measures ANOVA and Cronbach's alpha. You can learn about these statistical tests, how to run them using the statistics package, SPSS, and how to interpret and write up the results from such tests in the Data Analysis section of Lærd Dissertation.

independent variable

the phenomenon thought to influence, affect, or cause some other phenomenon

parsimony

the principle that among explanations or theories with equal degrees of confirmation, the simplest-- the one based on the fewest assumptions and explanatory factors-- is to be preferred; sometimes known as Occam's razor. Event: One of the fence posts is broken. Of possible explanations a) A moose ran through it or b) Some screws fell out of it because it is old, "b" is the likelier explanation. Event: The tire on the car is flat. Of possible explanations a) It has a screw in it or b) A serial tire-flattener came through the neighborhood and sliced the tire open, explanation "a" is more likely. Event: It is raining and I saw a bright flash through my curtains. Of possible explanations a) There was lightning or b) Someone is trying to take pictures of me in the house, explanation "a" is more likely. Event: A student failed the statistics test. Of possible explanations a) The student needed to study harder or b) The professor changed his answers on the test because he does not like the student, explanation "a" is more likely. Event: A car rear-ended another in highway traffic during rush hour. Of possible explanations a) The driver did not expect traffic to come to a stop so quickly in rush hour and therefore did not apply the brakes quickly enough or b) The driver was distracted by an elephant on the side of the road, explanation "a" is more likely. Event: A loud noise is heard in an apartment that is next to a busy highway. Of possible explanations a) A bomb was dropped in the immediate area or b) A truck backfired, explanation "b" is more likely. Event: A woman is nauseous several hours after eating at a restaurant. Of possible explanations a) She may have food poisoning or b) She is suffering from stomach cancer, explanation "a" is more likely. Event: A roast beef in the oven burns to a crisp after being in the oven for only one hour.
Of possible explanations a) Someone came into the house and turned up the oven temperature temporarily or b) The oven's temperature gauge needs to be re-calibrated, explanation "b" is more likely. Event: A dog owner comes home to the trash can tipped over and trash scattered on the floor. Of possible explanations a) The dog tipped the trash can over or b) Someone broke into the house and sorted through the trash can, explanation "a" is more likely. Event: A ball rolls out into the road in a residential area in front of a man's car. Of possible explanations a) Some children are playing ball and it accidentally rolled into the road or b) Someone is maliciously attempting to cause an accident, explanation "a" makes the most sense. Example: Two trees have fallen down during a windy night. Think about these two possible explanations: The wind has blown them down. Two meteorites have each taken one tree down, and after that hit each other and removed any trace of themselves.[14] Even though both are possible, several other unlikely things would also need to happen for the meteorites to have knocked the trees down (they would have to hit each other and also not leave any marks). In addition, meteorites are fairly rare. Since this second explanation needs several assumptions to all be true, it is probably the wrong answer. Occam's razor tells us that the wind blew the trees down, because that is the simplest answer and therefore probably the right one. Occam's razor also comes up in medicine. When there are many explanations for symptoms, the simplest diagnosis is the one to test first. If a child has a runny nose, it probably has the common cold rather than a rare birth defect. Medical students are often told, "When you hear hoof beats, think horses, not zebras".[15]

statistical significance

the probability of making a Type I error. A Type I error occurs when one rejects the null hypothesis when it is true. The probability of a Type I error is the level of significance of the test of hypothesis, and is denoted by alpha (α).
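The defining property of alpha can be illustrated with a seeded Monte Carlo sketch: when the null hypothesis is actually true, a test at alpha = 0.05 should falsely reject about 5% of the time. The sample size, trial count, and seed below are arbitrary choices:

```python
import math
import random

random.seed(42)

def two_sided_p(z):
    """Two-sided p-value for a standard-normal test statistic."""
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

alpha = 0.05
n, trials = 30, 2000
rejections = 0
for _ in range(trials):
    # The null is true by construction: the population mean really is 0.
    sample = [random.gauss(0, 1) for _ in range(n)]
    z = (sum(sample) / n) / (1 / math.sqrt(n))  # z-test, known sigma = 1
    if two_sided_p(z) < alpha:
        rejections += 1  # a Type I error

type_i_rate = rejections / trials
print(round(type_i_rate, 3))  # close to 0.05
```

The observed false-rejection rate hovers near alpha, which is exactly what "the probability of making a Type I error" means.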

measurement

the process by which phenomena are observed systematically and represented by scores or numerals

R squared (variance explained)

the proportion of the total variation in a dependent variable explained by an independent variable. The value r2 is a fraction between 0.0 and 1.0, and has no units. An r2 value of 0.0 means that knowing X does not help you predict Y. There is no linear relationship between X and Y, and the best-fit line is a horizontal line going through the mean of all Y values. When r2 equals 1.0, all points lie exactly on a straight line with no scatter. Knowing X lets you predict Y perfectly.

validity

the quality of being logically or factually sound; soundness or cogency.

means

the sum of the values of a variable divided by the number of values The mean may often be confused with the median, mode or the mid-range. The mean is the arithmetic average of a set of values, or distribution; however, for skewed distributions, the mean is not necessarily the same as the middle value (median), or the most likely (mode). For example, mean income is skewed upwards by a small number of people with very large incomes, so that the majority have an income lower than the mean. By contrast, the median income is the level at which half the population is below and half is above. The mode income is the most likely income, and favors the larger number of people with lower incomes. The median or mode are often more intuitive measures of such data. Nevertheless, many skewed distributions are best described by their mean - such as the exponential and Poisson distributions.
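The income example above can be sketched in a few lines; the incomes are made-up values chosen so that one large earner skews the distribution:

```python
import statistics

# Hypothetical incomes (in $1000s); one very high earner skews the mean.
incomes = [20, 22, 25, 27, 30, 200]

mean_income = statistics.mean(incomes)      # 54: pulled up by the outlier
median_income = statistics.median(incomes)  # 26: half below, half above
print(mean_income, median_income)
```

Most people in this made-up sample earn well below the mean, while the median sits in the middle of the typical incomes, which is why the median is often the more intuitive summary for skewed data.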

unit of analysis

the type of actor (individual, group, institution, nation) specified in a researcher's hypothesis. The unit of analysis is the major entity that is being analyzed in a study. It is the 'what' or 'who' that is being studied. In social science research, typical units of analysis include individuals (most common), groups, social organizations and social artifacts. The literature of international relations provides a good example of units of analysis. In "Man, the State and War", Kenneth N. Waltz creates a tripartite analysis with three different units of analysis: the man (individual), the state (a group), and the international system (the system in which groups interact).

