Research Exam 2


Provide an example of when snowball sampling would probably be used

A good example is a study of the transgender population. Transgender people rarely come forward on their own, but they tend to know other transgender people, which allows the snowball to start.

__________________ variables pose a higher level of complexity in approaching analysis.

Heterogeneous

What is ALWAYS the first variable placed into a data spreadsheet

ID number

Discuss interquartile range

The interquartile range (IQR = Q3 - Q1) is used to check whether a variable has outliers. Any value higher than (Q3 + 1.5*IQR) or lower than (Q1 - 1.5*IQR) is considered an outlier.
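
The 1.5*IQR rule can be sketched in Python. This is a minimal illustration with made-up scores; the function name `iqr_outliers` is hypothetical, and the "inclusive" quartile method is an assumption (SPSS and other packages may compute quartiles slightly differently).

```python
import statistics

def iqr_outliers(values):
    """Return (lower_fence, upper_fence, outliers) using the 1.5*IQR rule."""
    # quantiles(n=4) returns the three quartile cut points: Q1, median, Q3.
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return lower_fence, upper_fence, outliers

scores = [2, 4, 4, 5, 6, 7, 8, 9, 30]  # made-up data
lower_fence, upper_fence, outliers = iqr_outliers(scores)
print(lower_fence, upper_fence, outliers)
```

Here only the score of 30 falls outside the fences, so it is the only value flagged.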

Give an example of an interval, ratio, nominal, and ordinal variables

Interval - temperature. Ratio - GPA, height, weight, salary. Nominal - race, political party, major. Ordinal - class status, letter grade.

Discuss skewed distributions

Left-skewed (negatively skewed) distributions have a long tail on the left side, and the majority of the data are on the right. Right-skewed (also known as positively skewed) distributions are the reverse. RULE OF THUMB: If the peak is on the RIGHT, it is LEFT-skewed. If the peak is on the LEFT, it is RIGHT-skewed.

Difference between a sample and a population. What is a sampling frame?

Population - the entire group of people the study is focused on. Sample - a subset of the population. Sampling frame - a list of the entire population of a study.

Discuss variance and standard deviation

Variance: the spread of the distribution around the mean (the average squared deviation from the mean). Standard deviation: the square root of the variance; tells the typical distance of scores from the mean.
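
As a small worked sketch (made-up scores), the variance can be computed by hand and checked against Python's `statistics` module. The population formulas divide by n; the sample formula divides by n - 1, which is what most statistical packages report by default.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data

mean = statistics.mean(scores)

# Variance by hand: average of the squared deviations from the mean.
squared_deviations = [(x - mean) ** 2 for x in scores]
pop_variance_by_hand = sum(squared_deviations) / len(scores)

# Same quantities from the standard library.
pop_variance = statistics.pvariance(scores)   # divides by n
pop_sd = statistics.pstdev(scores)            # square root of pvariance
sample_variance = statistics.variance(scores) # divides by n - 1

print(mean, pop_variance, pop_sd, sample_variance)
```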

Discuss scatterplots

Visually depicts a bivariate relationship. Each point represents a case, with the dependent variable on the vertical axis and the independent variable on the horizontal axis.

Points to consider when selecting a statistical test. (4)

What type of question are you asking? What type of variables do you have? Number of groups? Have you met assumptions for each statistical test?

Give an example of purposive sampling

When a TV reporter stops a passerby on the street to get their opinion on current political changes. It's important to note that the reporter must apply some judgment when deciding whom to stop and question. Otherwise it would just be convenience (haphazard) sampling rather than purposive sampling.

When inputting data, what should you do with missing data?

When coding missing data, leave the variable blank or use a standard code (e.g., 9999). A blank in SPSS is considered missing by default. It is a good idea to use a code so you know the value is indeed missing. In some cases blanks indicate N/A or skipped questions.
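
A minimal sketch of handling a missing-data code before analysis, assuming 9999 is the code chosen in the codebook. The function name `recode_missing` and the ages are made up for illustration.

```python
import statistics

MISSING_CODE = 9999  # assumed sentinel; use whatever your codebook defines

def recode_missing(values, missing_code=MISSING_CODE):
    """Replace blanks and the missing-data code with None."""
    return [None if v in ("", None, missing_code) else v for v in values]

raw_ages = [34, 9999, 27, "", 41]  # made-up data: one coded missing, one blank
cleaned_ages = recode_missing(raw_ages)

# Exclude missing values so the code 9999 is never averaged in as a real age.
valid_ages = [v for v in cleaned_ages if v is not None]
mean_age = statistics.mean(valid_ages)
print(cleaned_ages, mean_age)
```

If the 9999 were left in place, it would be treated as a real age and badly inflate the mean.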

In what type of situation might an experimental design not be feasible, ethical, or appropriate?

When the research question can be broad and exploratory, or it can be about what it is like to have a particular experience (e.g., What is it like to be a working mother diagnosed with depression?).

Provide an example of when you'd use stratified sampling.

When trying to compare GPAs between health majors and art majors. You would create a stratum for each major, then randomly pick an equal number of people from each stratum.

Pros/Cons of Non-Probability Sampling

Pros: cost-effective and easy to accomplish; good for exploratory studies. Con: difficult to generalize.

Purpose of Entering and Organizing Quantitative Data (3)

Place the results in the categories where they belong. It is an essential step to understanding the content. Remove or fix information that will affect analysis (errors, missing information).

Steps to creating a questionnaire (9)

1. Develop a list of constructs 2. Determine how each construct will be measured in a question 3. Think of all possible answers to the questions 4. Avoid biased, misleading questions 5. Organize to attract and hold participant attention 6. Use clarity and brevity 7. Pay attention to contingency questions 8. Create an answer scale 9. Conduct a pretest to evaluate the instrument

What are the ways in which we can collect data? (3)

1. Personally collecting questionnaires 2. Computer-Assisted Telephone Interview (CATI) 3. Virtual Data Collection

Why is personally collecting questionnaires the preferred method of data collection?

1. Researcher gains ability to note body language, record impressions, and keep records of additional information that would not be included in the questionnaire 2. More likely that a participant will fully complete the questionnaire 3. A longer questionnaire is easier to complete face-to-face since people are more likely to talk at length with someone than focus on a long, written questionnaire

What are the proportions of scores found between certain z-scores within a normal distribution?

1. The area between 0 and 1 (and between 0 and -1) always equals .3413. 2. The area between 1 and 2 (and between -1 and -2) always equals .1359. 3. The area between 2 and 3 (and between -2 and -3) always equals .0214. 4. The proportion of scores greater than 3 (and, likewise, smaller than -3) is .0013. We can conclude that approximately 68.3% of the distribution falls within 1 standard deviation of the mean.
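
These areas can be checked with the standard normal cumulative distribution function. The sketch below uses only `math.erf`; the helper names `normal_cdf` and `area_between` are hypothetical.

```python
import math

def normal_cdf(z):
    """Proportion of a standard normal distribution falling below z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def area_between(z_low, z_high):
    """Proportion of scores between two z-scores."""
    return normal_cdf(z_high) - normal_cdf(z_low)

for low, high in [(0, 1), (1, 2), (2, 3)]:
    print(f"area between z={low} and z={high}: {area_between(low, high):.4f}")

print(f"area beyond z=3 (one tail): {1 - normal_cdf(3):.4f}")
print(f"within 1 SD of the mean: {area_between(-1, 1):.4f}")
```

Because the distribution is symmetric, each area on the negative side equals the matching area on the positive side.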

What is the max number of answer choices permitted per question?

7

List the internal consistency levels based on the different values of Cronbach's alpha

0.9 or higher (excellent); 0.8 to below 0.9 (good); 0.7 to below 0.8 (acceptable); 0.6 to below 0.7 (questionable); 0.5 to below 0.6 (poor); below 0.5 (unacceptable)

What are line graphs? Define the axes

A line connecting points in a distribution. Useful in depicting two or more categories simultaneously. The horizontal axis is called the abscissa, and the vertical axis is called the ordinate.

Discuss the linear model

A linear relationship is a "perfect" relationship between two variables that, when graphed, resembles a straight line. It is rare in research. A linear relationship is expressed as Y = bX.

Advantages and disadvantages of virtual data collection

Advantages: the ability to recruit a very large number of people, and a much higher response rate than collecting data face-to-face or via phone. Drawbacks: i. A substantial proportion of the population still doesn't have internet access. ii. Participants who feel strongly about a subject are more likely to participate, resulting in more +/- extremes. iii. The researcher cannot guarantee the identity of the respondent in an online setting.

Discuss cluster sampling

Allows the researcher to create separate random clusters first and then randomly select participants from each cluster. Clusters need to be as similar to the entire population as possible. Example: 5 counties in VA vs. the whole state. Randomly select the 5 counties to compare to the state, then randomly select individuals within those counties.

What is convenience sampling? What are some benefits (3)?

Allows researchers to select any participants for the study, even if they are not representative of the population. Often means gathering information on participants available to you at a given point in time; participation happens by availability and accident. Particularly cost- and time-effective for pilot studies. Pilot studies give the researcher a chance to gauge the viability of questions, check for redundancy, and adjust language. Cost-effective. Can ask participants for feedback before implementing the study on a larger scale.

How does a pretest benefit our questionnaire?

Allows us to see how long it takes to complete the survey. Helps determine any issues in understanding the questions or their sequence, or whether they cause discomfort.

Discuss quota sampling and its two forms

Allows us to compare different groups within the population of interest. Researchers need a sample with predetermined quotas or proportions. Proportional Quota Sampling - the sample proportions match the proportions that exist in the whole population of interest. Non-Proportional Quota Sampling - uses a different quota than what is found in the population of interest because the study's aim is to compare two or more different groups of interest.

When developing answer choices for questions on a questionnaire, what should answers always be?

Answers should ALWAYS be MUTUALLY EXCLUSIVE. Answers should also be collectively exhaustive (i.e., response options cover the entire realm of possible answers).

When developing answer choices for questions on a questionnaire, how can the researcher avoid making biased and misleading questions?

Avoid guiding the participants' answers: instead of asking "Do you have any problems with your spouse?" ask "How do you rate your relationship with your spouse?" Avoid social desirability in questions: instead of asking "How many books do you read to your toddler during a typical day?" ask "How many books are you able to read to your toddler during a typical day?" It's socially desirable to read to toddlers, so the first question may cause parents to answer untruthfully (higher). The second question includes "are you able to read," taking blame off parents who don't read to their toddlers often. Avoid double-barreled questions, which ask about more than one concept but allow only one answer: for example, "Were you treated with respect by the nurse and the doctor at our facility?" with only a Yes/No answer choice. This doesn't consider that maybe the nurse was respectful but not the doctor, or vice versa.

Differentiate between bar graphs and histograms

Bar graphs are often used for qualitative (categorical) variables; the bars display the frequency of separate and distinct categories (no relationship). Histograms are often used for quantitative (numerical) variables where the categories are related to one another; they represent the frequency of a variable in distinct but connected bars.

Discuss stratified random sampling and its two forms. When is this sampling method preferred?

Becomes valuable when the study is focused on understanding, comparing, or analyzing different groups of a population. Requires equal numbers of participants from each group (e.g., men and women, different ethnicities). Strata - the list of different groups of people, with an equal number of participants required from each group. Once the researcher creates the strata, we randomly choose an equal number of people from each group. Proportionate Stratified Sampling - follows the proportions of the true population, but still creates specific strata for the interest of the study; therefore, the sample will represent the entire population. Disproportionate Stratified Sampling - proportions are not equivalent to those of the entire population.
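
The two allocation schemes can be sketched as follows. The strata, participant IDs, and function names are all made up for illustration, and simple rounding is assumed for the proportionate case (with awkward proportions, rounded counts may not add up exactly to the requested total).

```python
import random

def equal_allocation(strata, per_stratum, rng):
    """Randomly draw the same number of people from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

def proportionate_allocation(strata, total_n, rng):
    """Draw from each stratum in proportion to its share of the population."""
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        n = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical sampling frame: 60 health majors and 40 art majors.
strata = {
    "health": [f"H{i}" for i in range(60)],
    "art": [f"A{i}" for i in range(40)],
}
rng = random.Random(42)  # fixed seed only so the sketch is repeatable

equal = equal_allocation(strata, 10, rng)
proportionate = proportionate_allocation(strata, 20, rng)
print({k: len(v) for k, v in equal.items()})
print({k: len(v) for k, v in proportionate.items()})
```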

Discuss multistage sampling and provide an example

A combination of two or more types of sampling to best suit a specific study. Example: Begin with cluster sampling (randomly select 5 VA counties). Go into each of those counties and randomly select high schools. Within the schools, use stratified sampling to select students from different SES classes.

You are using a tested scale that includes multiple related items to measure the same construct. Assuming no items are reverse coded, what is the first step you need to do before your analysis?

Compute a total score for items in the scale

What are the four types of non-probability sampling? (4)

Convenience, Snowball, Purposive, Quota

Differentiate between regression and correlation

Correlation can be used to determine the extent to which two quantitative variables approximate a linear relationship; it gives the strength and the direction of the relationship. Regression can be used to identify the line that, though imperfect, best describes this relationship. Regression also tells us the direction of the relationship.

Description of variables should include.....(3)

Description of variables should include how the variable was measured, a list of the response choices, and other significant notes

Differentiate between descriptive and inferential analyses

Descriptive Analysis - allows researchers to sketch the details of variables and gain familiarity with the sample. Researchers become familiar with min and max values, range, quartiles, mean, median, mode, etc. Includes collecting, organizing, summarizing, and presenting. Inferential Analysis - provides a deeper understanding of variables because it draws conclusions about the population from the sample at hand. It is the best tool researchers have to make approximate generalizations about the entire population, and it attempts to draw relationships between two or more variables. Includes generalizing to the population, hypothesis testing, determining relationships, and making predictions.

What is meant by 'logical formatting' when preparing for data entry?

Determining how the variables can best be organized to ensure the data analysis phase is simple and straightforward. Follow the questionnaire UNLESS you realize a question is out of place. Ask: which variables should be entered in the first, second, and third columns of the spreadsheet?

Differentiate between continuous and discrete variables

Discrete variables are those that can only take on whole numbers and do not have any continuity in terms of measurement. Continuous variables are those that can take on all possible values, such as a measure of income, height, or weight.

When organizing and inputting variables into a spreadsheet, each row represents one __________________ and each column represents one ________________________.

Each row in a spreadsheet represents one participant and each column represents one variable

Variable Names Rules when inputting data (6)

Each variable name must be unique. Up to 8 characters (including letters, numbers, underscore). Must begin with a letter or @. Last character cannot be a period or underscore. No spaces. Keywords cannot be used (e.g., ALL, BY, NOT, OR, TO, WITH).
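
The rules on this card can be turned into a simple checker. This is only a sketch of the six rules as listed here: the function name `name_problems` is hypothetical, the reserved list contains only the keywords named on the card, and treating uniqueness as case-insensitive is an assumption.

```python
import re

# Only the keywords listed on the card; the real reserved list may be longer.
RESERVED = {"ALL", "BY", "NOT", "OR", "TO", "WITH"}

def name_problems(name, existing=()):
    """Return a list of rule violations for a proposed variable name."""
    problems = []
    if name.upper() in {other.upper() for other in existing}:
        problems.append("not unique")
    if len(name) > 8:
        problems.append("longer than 8 characters")
    if not re.match(r"[A-Za-z@]", name):
        problems.append("must begin with a letter or @")
    if name.endswith((".", "_")):
        problems.append("cannot end with a period or underscore")
    if " " in name:
        problems.append("contains a space")
    if name.upper() in RESERVED:
        problems.append("is a reserved keyword")
    return problems

for candidate in ["id", "age", "1stvisit", "income_", "my var", "WITH", "Age"]:
    print(candidate, name_problems(candidate, existing=["age"]))
```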

Give an example of when non-probability sampling is preferred

For instance, say you were interested in understanding why new immigrants engage in HIV risk behaviors. You may have a tough time recruiting participants, so you sample whoever you can get.

Discuss the different measures of central tendency (3)

Mean - the widely used notation for the mean is x-bar. Median - useful because it gives us an understanding of the distribution of a variable. Mode - most preferred source of information when the variable includes outliers because it is resistant to additional numbers in the distribution (distributions can be unimodal, bimodal, or multimodal).

If the data is continuous and normally distributed, what is the preferred source of information? What about when it's not normally distributed?

Mean is preferred if normally distributed and continuous data. Median is preferred when data are not normally distributed.

Discuss how certain measures are more appropriate for certain variable types

Mean, median, and mode are appropriate with continuous variables. Frequencies and percentages are appropriate with discrete variables.

Internal consistency reliability is measured as _______________________________ and describes ____________________________________________________________________.

Measured as Cronbach's alpha and describes the extent to which all items measure same construct (i.e., the inter-relatedness of all items)
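
As a rough sketch, Cronbach's alpha can be computed from item scores with the usual formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The three items and five respondents below are made up, and the function names are hypothetical; the labels follow the cutoffs given in this set.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, respondents in the same order."""
    k = len(items)
    item_variances = [statistics.variance(scores) for scores in items]
    # Total score per respondent = sum across that respondent's items.
    totals = [sum(row) for row in zip(*items)]
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)

def consistency_label(alpha):
    """Map an alpha value to the internal consistency levels in this set."""
    if alpha >= 0.9:
        return "excellent"
    if alpha >= 0.8:
        return "good"
    if alpha >= 0.7:
        return "acceptable"
    if alpha >= 0.6:
        return "questionable"
    if alpha >= 0.5:
        return "poor"
    return "unacceptable"

# Made-up responses: 3 items answered by 5 participants.
items = [
    [2, 3, 4, 4, 5],
    [1, 3, 3, 4, 5],
    [2, 2, 4, 5, 5],
]
alpha = cronbach_alpha(items)
print(round(alpha, 3), consistency_label(alpha))
```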

When a variable includes outliers, what is the most preferred source of information and why?

Mode because it is resistant to additional numbers in the distribution

Match the variable type (nominal, ordinal, interval/ratio (not skewed), interval/ratio (skewed)) with the preferred source of the measure of central tendency.

Nominal - Mode; Ordinal - Median; Interval/Ratio (not skewed) - Mean; Interval/Ratio (skewed) - Median

What type of test (parametric/nonparametric) do you use for nominal, ordinal, interval, or ratio variables?

Nominal - nonparametric; Ordinal - nonparametric; Interval - parametric; Ratio - parametric

Contrast nominal, ordinal, and numerical variables.

Nominal = no numerical value and no relationship. Ordinal = no numerical value, but represents an ordered sequence of responses. Numerical = answers have mathematical meaning.

Differentiate between the 2 types of categorical variables

Nominal variables - also called qualitative variables; describe categories that are not mathematically related to each other (e.g., eye color). For instance, people either say they're male or female; we could enter "0" for female and "1" for male. Ordinal variables - responses that do not have a numeric value but represent an ordered sequence, for example: A) Strongly agree B) Somewhat agree C) Somewhat disagree D) Strongly disagree (strongly agree = 4, somewhat agree = 3, etc.). The categories have some degree of relationship, but still cannot be compared mathematically (e.g., letter grades, level of educational attainment).

What type of measurement uses a scale to measure data? For example, people are asked to indicate if their pain is A. Slight, B. Moderate, or C. Severe

Ordinal

Differentiate between parametric and nonparametric tests

Parametric - continuous data; data normally or near normally distributed; must meet other statistical assumptions (equal variances and same standard deviations); frequently considered more powerful. Nonparametric - discrete data; do not require a normally distributed population; have wider applications and are less difficult to compute.

List the 4 different tests for both parametric and nonparametric tests

Parametric tests: Pearson correlation, Student's t-test, analysis of variance (ANOVA), regression. Nonparametric tests (commonly used): chi-square, Spearman rank correlation coefficient, Mann-Whitney U test, Kruskal-Wallis test.

What is snowball sampling? When does this method work better? What is the most important thing as the researcher?

Participants are selected by word of mouth. The researcher connects with one participant, that participant finds another person, and this continues until there are enough participants. Similar to convenience sampling in that it may not be representative of the population. Useful when the population is hard to reach. Works better when the population of interest requires a more personal, internal connection to the study. IT IS OF UTMOST IMPORTANCE TO GAIN THE TRUST OF THE FIRST FEW PARTICIPANTS.

Discuss normal distribution and z-scores

A perfect theoretical distribution where the mean, median, and mode are all equal. Uses standard scores (z-scores), which indicate the number of standard deviations a score lies from the mean and what proportion of scores fall in given ranges.

Discuss the difference between probability and non-probability sampling

Probability sampling: participants are randomly selected and a sampling frame is often used; often used in quantitative studies. Non-probability sampling: participants are often not randomly selected and a sampling frame may not be available; often used in qualitative studies.

Define data collection

Process of transforming the raw information from our participants into measurable units that can later be analyzed

Pros/Cons of Probability Sampling

Pros - excellent for larger studies; easier to use if you have completed a study in the past. Cons - study will be costly; not always appropriate for exploratory studies; cannot go back to participants for more info.

What are the requirements of experimental research?

Randomized selection of participants; control and experimental groups; pre- and post-testing.

Differentiate between the 2 types of numerical variables

Ratio and interval. Interval variables cannot express a ratio between numbers because they lack a true zero. Temperature is a good example: a reading of zero does not mean there is no temperature. Ratio variables have a true zero point (e.g., GPA, income in dollars). For "How many children do you have?" respondents can say zero.

Discuss simple random sampling and its advantages and disadvantages. When is this sampling type preferred?

Relies on complete randomization without any specific boundaries. Preferred when we definitively know the population to investigate. Techniques include random number generation, picking numbers from a fish bowl, etc. Its two biggest advantages are that every member has an equal chance of participating and that it ensures the representation of the population. Its biggest disadvantage is that it REQUIRES a complete list of the population from which the sample is drawn.

Ratio and Interval variables are identified as what in SPSS?

Scale

Discuss systematic random sampling

Selecting every nth element on the sampling frame, where n = the sampling interval.
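
A minimal sketch of systematic sampling from a numbered sampling frame. The random starting point within the first interval is a common practice that this card does not mention, so treat it as an assumption; the function name and frame are hypothetical.

```python
import random

def systematic_sample(frame, interval, rng):
    """Pick a random start, then take every `interval`-th element."""
    start = rng.randrange(interval)  # assumed: random start in first interval
    return frame[start::interval]

frame = list(range(1, 101))  # hypothetical frame of 100 numbered people
rng = random.Random(0)       # fixed seed only so the sketch is repeatable
sample = systematic_sample(frame, 10, rng)
print(sample)
```

With a frame of 100 and a sampling interval of 10, the sample always contains 10 people spaced 10 positions apart.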

How does quasi-experimental differ from experimental study designs?

Similar to experimental designs, but with no random assignment to control and experimental groups. Used because experimental research is not possible in every context.

Discuss frequency distributions. What is the first indicator?

A simple procedure that can illustrate key characteristics of participants in the sample. The first indicator is absolute frequency, which reflects the number of times an event or instance has been repeated. Percentages are then assigned to each category.

List the types of probability sampling

Simple random, Systematic, Stratified, Cluster, Multistage

Difference between survey and questionnaire

Survey - refers to the method of data collection. Questionnaire - the tangible instrument containing the questions.

Discuss CATI (Computer-Assisted Telephone Interviews) and what interviewers should consider when conducting these. Give a pro and a con

Used to find respondents or possibly conduct the entire interview electronically (e.g., a customer service survey). The computer initiates the interview and then hands off to a person. It is critical that interviewers be trained well. They must: i. Be familiar with the content ii. Be mindful of bias and judgment iii. Set a comfortable environment iv. Not lead the participant. Pro - more convenient; interviewer can clarify information. Con - not everyone has a phone; people are more likely to hang up.

Discuss bivariate analysis. What does it use?

Used to investigate the relationship between two variables: are changes in one variable reflected by changes in the other variable? Uses correlation and regression.

Unimodal, bimodal, and multimodal

Unimodal (1 mode), Bimodal (2 modes), Multimodal (3+ modes)

When do pie charts work best? What is its main advantage when used?

Work best with categorical variables that have a small number of categories. When the number of categories is relatively small and one category is much larger than the rest, a pie chart may be most effective in displaying the data. Its main advantage is its visual impact on the audience.

How is the regression line expressed?

Y = a + bX, where a = the intercept on the y-axis and b = the slope of the relationship.
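
A small sketch of how a and b can be found by least squares for made-up data: b is the sum of cross-products of deviations divided by the sum of squared deviations of X, and a is chosen so the line passes through the two means. The function name `least_squares_line` is hypothetical.

```python
import statistics

def least_squares_line(x, y):
    """Return (a, b) for the line Y = a + bX fitted by least squares."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    b = sum_xy / sum_xx       # slope
    a = mean_y - b * mean_x   # intercept: line passes through the means
    return a, b

x = [1, 2, 3, 4, 5]  # made-up independent variable
y = [2, 4, 5, 4, 5]  # made-up dependent variable
a, b = least_squares_line(x, y)
predicted_at_6 = a + b * 6  # using the line to predict a new case
print(a, b, predicted_at_6)
```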

What is a standardized survey questionnaire?

a questionnaire that has been tested and is a reliable and valid form of data collection

Define data science

ability of a computer or a machine to think and make decisions based on previous data or information

What is the Pearson Correlation Coefficient

Also called 'r'. An index that determines the level of correlation between two variables. Can range from -1 to 1. The closer the coefficient is to 1 or -1, the more closely the relationship under investigation resembles a linear relationship. A coefficient of 0 indicates no linear relationship between the variables.
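
For illustration, r can be computed directly from deviations around the means. The data are made up and the function name `pearson_r` is hypothetical.

```python
import math
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length lists."""
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sum_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sum_xx = sum((xi - mean_x) ** 2 for xi in x)
    sum_yy = sum((yi - mean_y) ** 2 for yi in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

r_moderate = pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])  # imperfect, positive
r_perfect_pos = pearson_r([1, 2, 3], [2, 4, 6])           # perfectly linear, rising
r_perfect_neg = pearson_r([1, 2, 3], [6, 4, 2])           # perfectly linear, falling
print(round(r_moderate, 4), r_perfect_pos, r_perfect_neg)
```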

What is purposive sampling? What are the 2 forms? Drawback?

Also called judgmental sampling. Allows the researcher to select participants of interest for the study. Homogeneous Sampling - participants are chosen based on a trait or characteristic of interest; each participant must have the characteristic the researcher is looking for (e.g., women hockey players, women 65+). Deviant Case Sampling - focuses on unusual or very specific cases, i.e., the outliers (e.g., men experiencing domestic abuse, high school dropouts). Drawback: handpicking participants is a benefit to the researcher, but the results cannot be generalized to the greater population, so the method offers low external validity.

Discuss numerical variables and a possible issue associated with entering them into the spreadsheet

Also called quantitative or scale variables (age, income, height, etc.). It is possible to enter data as is, but there's an issue. For instance, say we're measuring someone's treadmill activity: we record the time of day, the duration, and the speed (5:00 am, 45 min, 2 miles/hour). All of these are in different units, so we can't compare the numbers. We need consistent units within each variable. We may say that "1" is 5:00 am, "2" is 6:00 am, etc., and that "1" is one mile/hour, "2" is two miles/hour, etc.

Define univariate analysis. What does it help researchers do?

Analysis of only one variable; often descriptive in nature but can also be inferential. Univariate analysis helps to organize variables and gives us a clearer picture of our sample population. It can help us better understand demographics.

It is recommended that demographic questions on a questionnaire be placed.....

at the end

A regression line, or line of best fit, represents the _________________________________________ between two variables

closest possible linear relationship

Descriptive statistics describe the _______________________________________.

first-time impressions of a variable

In data entry, ID numbers for participants always _______________________________.

go in the first column of the spreadsheet

When data sets have little variability, we call them _________________________ and when they have a lot of variability we call them ___________________.

homogeneous; heterogeneous

What is a codebook?

The key to the codes in your study and how they correspond to participant responses. It is extremely important that it is kept in a separate and secure file; you may lose information otherwise.

Criterion for finding the value of the intercept and slope is formally known as the _________________________ because the regression line is the one line that minimizes the sum of the squared distances between the data points and the line

least squares criterion

Statistics uses......

mathematical logic to interpret phenomena and draw conclusions from accumulated data.

When creating an answer scale for a questionnaire, with an even number of choices, participants must __________________. With an odd number of choices, people can ___________________________________.

pick a side; choose a neutral option

Regression is used to __________________________ of variables as much as it explains current behavior

predict future behaviors

Discuss cumulative percent

Shows the collective percent of categories together. It adds up the category counts one by one to display the combined percentage of two or more categories. Cumulative proportions become cumulative percentages when multiplied by 100. Sensitive to the order in which the categories are organized.
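
A short sketch of a frequency table with absolute frequency, percent, and cumulative percent. The responses, the category order, and the function name `frequency_table` are made up; changing the order of the categories changes the cumulative column, which is why the measure is order-sensitive.

```python
from collections import Counter

def frequency_table(responses, order):
    """Return rows of (category, frequency, percent, cumulative percent)."""
    counts = Counter(responses)
    n = len(responses)
    rows = []
    running_total = 0
    for category in order:
        frequency = counts[category]
        running_total += frequency
        rows.append((category,
                     frequency,
                     100 * frequency / n,
                     100 * running_total / n))
    return rows

# Made-up responses from 10 participants.
responses = ["agree"] * 5 + ["neutral"] * 3 + ["disagree"] * 2
table = frequency_table(responses, ["agree", "neutral", "disagree"])
for row in table:
    print(row)
```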

Missing information from a question in a survey can....

signal an underlying problem with a specific question if a lot of people aren't answering it

By coding information, we can _______________________________________________

simplify raw data in a way that is easier to analyze

Measures of central tendency are shifted in a ________________________________.

skewed distribution

Define absolute frequency

the number of times an event or instance has been repeated

Why are outliers a threat to data analysis?

they can distort research findings

Inferential statistics operate by __________________________________________.

trying to calculate the random error that occurs during data collection and analysis. They cannot identify systematic error.

Percentages are capable of.....

zooming out from the raw numbers by giving us a deeper layer of information because it compares the raw numbers to the entire sampling population

