CNSR SCI 201 Final Exam

Factors related to sample size

Size of research budget Type of analyses ‒ Minimum sample requirements based on certain analysis, cross-tabs or statistical analysis Historical practice

Primary research involves collecting data by

"asking" or "watching"

Simple Random Sample

- A probability sampling plan in which each unit included in the population has a known and equal chance of being selected for the sample. - If a digital version of the sampling frame is available, implementing a simple random sample is relatively easy
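
As a rough illustration of why a digital sampling frame makes this easy, here is a minimal Python sketch. The frame of 200 customer IDs, the function name, and the seed are all hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units without replacement; every unit has the same chance."""
    rng = random.Random(seed)  # seeded only so the draw can be repeated
    return rng.sample(frame, n)

# Hypothetical sampling frame: 200 customer IDs
sampling_frame = [f"customer_{i:03d}" for i in range(1, 201)]
sample = simple_random_sample(sampling_frame, 20, seed=42)
print(sample)
```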

Systematic Sample

- A probability sampling plan in which every kth element in the population is selected from the sample pool after a random start. - If a digital version of the sampling frame is NOT available, but a list of population members exists, this is a useful approach
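
A minimal Python sketch of the "every kth element after a random start" idea. It assumes the sampling interval k is the list length divided by the desired sample size (rounded down); the list and function name are hypothetical.

```python
import random

def systematic_sample(frame, n, seed=None):
    """Pick every kth element of the list after a random start."""
    k = len(frame) // n  # sampling interval (assumed: list size / sample size)
    if k < 1:
        raise ValueError("sample size cannot exceed the size of the list")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start within the first k elements
    return [frame[start + i * k] for i in range(n)]

# Hypothetical list of 200 population members, numbered 1 to 200
frame = list(range(1, 201))
sample = systematic_sample(frame, 20, seed=7)
print(sample)
```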

Filter Question

- A question used to determine if a respondent has the knowledge or "qualifies" for the study - Examples: "Do you do the grocery shopping for your family?" "Have you shopped at Trader Joe's within the past six months?" "Did you vote in the last presidential election?"

Probability Samples

- A sample in which each target population element has a known, nonzero chance of being included in the sample - Central idea of probability samples is random selection that makes the task objective . Several kinds: - simple random - systematic - stratified - cluster (area)

Nonprobability Samples

- A sample that relies on personal judgment in the selection process Several kinds: - Convenience - Judgment - Quota With nonprobability samples, sampling error cannot be estimated and we cannot make inferences about the population

T-tests for Means

- A t-test is a hypothesis test for the difference in means of a single variable - Typically used with smaller samples - The t distribution is bell-shaped, like the normal distribution, but with heavier tails - Also known as Student's t - Several different types of t-test, depending on the type of data/analysis

VALIDITY of measures

- A test is valid if it measures what it is supposed to measure - The extent to which differences in scores are true differences rather than error

STEP 1: Define the Target Population

- All individuals or entities that meet the designated criteria to qualify for a research study - Must be very clear and precise in defining the population Examples: - College students - Primary grocery shoppers - Household with kids 6-12

Chi-square goodness-of-fit test

- Applies when you have one categorical variable from a single population - Tests how likely it is that an observed value (your sample estimate) is due to chance - Goodness of fit statistic shows if your sample data represents the data you would expect to find in the actual population - The chi-square statistic is also used to test relationships between categorical variables - These are tests of independence, where Ho is that no relationship exists - Statistic is based on the difference between what is observed in the data and what would be expected if there was no relationship
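
A small worked example of the goodness-of-fit statistic, sketched in Python. The observed and expected counts are invented for illustration; the statistic is the usual sum of (observed − expected)² / expected across categories.

```python
def chi_square_gof(observed, expected):
    """Sum of (O - E)^2 / E across the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Hypothetical counts for one categorical variable with three categories
observed = [50, 30, 20]   # what the sample shows
expected = [40, 40, 20]   # what the population proportions would predict
stat = chi_square_gof(observed, expected)   # 100/40 + 100/40 + 0 = 5.0
df = len(observed) - 1                      # categories minus one

print(stat, df)
```

With 2 degrees of freedom the 0.05 critical value is 5.991, so in this made-up case the statistic of 5.0 falls short and we would fail to reject Ho.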

STEP 3: Begin to "work" the question areas

- Begin with general issues, specific wording comes later - Idea is to capture needed data using as FEW questions as possible Some key issues: - Is the question necessary? - Are several questions needed instead of one? - Do respondents have the necessary information? - Will respondents give the information?

Convenience Sample

- Being included in a sample as a matter of convenience, right place, right time - Easy to conduct - No way to know if sample is representative of the population - Examples: mall intercepts, surveys on websites

STEP 7: Determine sequence of the questions

- Best practice is funnel approach --- Start with broad questions and progressively narrow the scope - Branching questions direct respondents to different places in a questionnaire, based on their responses to the question at hand

Lower structure questions

- Best when the goal is hearing from consumers in their own words - Appropriate when responses are not as clear cut - More suited to emotional based questions and asking about opinions about issues - Useful for understanding motivations

Other considerations and items on the backend of coding

- Build a codebook with all the details about how data from data collection forms are coded in the data file --- Variable name --- Variable description --- Source of data --- Process for handling missing data - Identify blunders that are administrative errors that arise during editing, coding, or data entry --- Run frequencies on all the variables --- Check questionnaires against data --- Double entry of data ---Scanning devices the way of the future

Why would it be useful to convert continuous to categorical data?

- Can be easier to interpret, facilitates charting like the histogram - Higher levels of measurement (continuous measures like visits) have all the properties of lower levels of measurement (categorical measures like age).

Why is sampling error less troublesome?

- Can reduce it by increasing sample size - Can account for it by calculating margin of error (assuming probability sample).

Cumulative Percentage

- Categories are formed based on the cumulative percentages obtained in a frequency analysis - How do we determine the split, or groups? We can "eyeball" and use judgement to

Cross Tabulation

- Cross Tabulation is the most used multivariate tool for categorical variables - "Two Way" Frequency Analysis. Also known as contingency tables - Examines the relationship between two or more categorical variables - Reveals relationships that otherwise may not be apparent - With cross tabulation often determining whether one variable has an influence on another - In standard cross tabs, percentages are always based on the IV/causal variable Two general uses: 1. Examine the cause and effect relationship between variables, does word of mouth (IV) have an effect on exercise class use (DV)? ‒ IV known as the cause, or the predictor variable ‒ DV known as the effect, or the outcome variable 2. Understand the joint distributions of two variables, e.g., the number of males that are ages 25-34
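
A minimal Python sketch of a two-way cross tab, following the word-of-mouth example. The eight responses are made up, and percentages are computed within each level of the IV, as the card describes.

```python
from collections import Counter

# Hypothetical responses: (heard via word of mouth [IV], uses exercise class [DV])
responses = [
    ("yes", "yes"), ("yes", "yes"), ("yes", "no"), ("yes", "yes"),
    ("no", "yes"), ("no", "no"), ("no", "no"), ("no", "no"),
]

cell_counts = Counter(responses)                # joint frequencies
iv_totals = Counter(iv for iv, _ in responses)  # marginal totals for the IV

# Percentages are based on the IV (each IV group sums to 100%)
pct = {
    (iv, dv): 100 * count / iv_totals[iv]
    for (iv, dv), count in cell_counts.items()
}

for (iv, dv), p in sorted(pct.items()):
    print(f"word of mouth={iv:<3}  uses class={dv:<3}  {p:.0f}%")
```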

Step 2: Confirm method of administration

- Desired information guides choice of method and nature of the questions - Personal Interview - Telephone Interview - Mail Survey - Online Survey - Consider advantages and disadvantages along with degree of structure and disguise

Sampling Error

- Difference between sample results and population results - Often an issue with the sampling frame - Usually less of an issue than other kinds of error. - Along with margin of error, the level of confidence in survey estimates can also be reported - As sampling error results from collecting data from a subset of the population, it is highly dependent on the size of the sample

Snake Diagram Scale (variation of semantic differential)

- Display of multiple semantic-differential ratings - Lines represent average scores

Rules when asking question

- Don't ask unless absolutely necessary! - Guarantee anonymity - Place sensitive questions near the end - counter biasing - Ask in terms of "other people," e.g., Do you think most students cheat? - Ask for general rather than the specifics (e.g., income categories vs specific income) - Consider a randomized response option

Nonresponse error

- Error from failing to obtain information from some of the sample elements of the population - Only an issue when those that did not respond are systematically different in a relevant way

Noncoverage error

- Error that arises because of failure to include qualified elements of the defined population in the sampling frame - One or more respondents met the criteria but were not on the list and so had no chance of being in the sample. - This is a sampling frame issue. Can mitigate by enhancing quality of the sampling frame

Response error

- Error that occurs when an individual provides an inaccurate response, consciously or subconsciously Key considerations ‒ understand the question? ‒ know the answer to the question? ‒ willing to provide the truth? ‒ wording of the question likely to bias the response?

Personal interviews

- Face to face conversations between interviewer & respondent - Can be conducted in many different locations and handle a variety of information, either open end or fixed - Including using mall intercepts - Generally strong sampling control with higher response rates - Great flexibility, but higher levels of interviewer bias - Time- and cost-intensive

Recall loss

- Forgetting that an event happened at all ‒ Best time frame when asking consumers to recall something is 2-4 weeks

Histogram

- Form of a bar chart that shows values of the variable on the x-axis and the frequency of the value on the y-axis - Differs from a bar graph, which relates two variables, while a histogram shows only one

Why use observation research?

- Gathering insights on behaviors is a near certainty, watching what people are "doing" - Don't rely on memory - Don't depend on willingness - DO often see the unexpected

transforming raw data into symbols

- Given descriptive data are largely closed-ended, task involves assigning a number to each of the response categories - Single responses are straightforward whether nominal (gender, male=1, female=2) or interval (Likert or Semantic Differential) - Multiple responses can be more complex, often turn the one question into "many"

Administration

- How the data are collected 1. Human - researchers systematically observe a behavior - record specific events that take place 2. Mechanical Observation - an electrical or mechanical device captures activity

sums of squares

- In regression, "sums of squares" are used to represent variation - "Sums of squares" is a measure of how the data varies around the mean - Variance is, literally, the average of the sum of squares - A high sum of squares indicates that most of the values are farther away from the mean - Means there is large variability in the data
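
A short Python sketch with made-up numbers showing how the sum of squares is built and how variance follows from it. It assumes the sample variance, which divides by n − 1 rather than n.

```python
import statistics

values = [2, 4, 4, 6, 9]              # hypothetical data
mean = sum(values) / len(values)      # 25 / 5 = 5.0

# Sum of squared deviations around the mean: 9 + 1 + 1 + 1 + 16 = 28
sum_of_squares = sum((x - mean) ** 2 for x in values)

# Sample variance "averages" the sum of squares over n - 1
sample_variance = sum_of_squares / (len(values) - 1)   # 28 / 4 = 7.0

print(sum_of_squares, sample_variance, statistics.variance(values))
```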

Special consideration given to the response options

- Include a "don't know" if applies to a sizeable portion (>/= 20%) - Responses must be exhaustive, may need to include an "other" option - Responses must be mutually exclusive --- Include "check all that apply" if several responses are possible --- Indicate "most important" to help them focus - Response order bias occurs when responses to a question are influenced by the order --- Options earlier in a list tend to be selected more often --- Randomize or "split ballot"

Pearson Product-Moment Correlation Coefficient

- Indicates the degree of linear association between two continuous variables - Tells us whether there is any linear relationship between two variables - Sample correlation coefficient (r) can range from −1 to +1, the closer to −1 or +1 the stronger the association
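
A minimal Python sketch of the sample correlation coefficient, computed from deviations around the two means. The function name and the example data are hypothetical.

```python
import math

def pearson_r(x, y):
    """Sample correlation: co-variation divided by the product of the spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data: perfectly increasing and perfectly decreasing pairs
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])
print(r_up, r_down)
```

Both made-up cases are equally strong associations; only the sign (direction) differs.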

The problems with R2

- Is sensitive to the number of I.V.'s included - Can increase R2 simply by adding more I.V.s regardless of the quality of their contribution (adjusted R2 tries to account for this) - Does not account for degrees of freedom - What constitutes a "good" R2 varies widely by subject area - difficult to have a general "rule of thumb"
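
For reference, the standard adjusted R2 formula penalizes extra predictors through the degrees of freedom: adjusted R2 = 1 − (1 − R2)(n − 1)/(n − k − 1), where n is the number of observations and k the number of I.V.s. A small Python sketch with hypothetical inputs:

```python
def adjusted_r2(r2, n, k):
    """Adjusted R2 for n observations and k independent variables."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

# Hypothetical model: same R2 of 0.50 on 21 cases, with 2 vs. 8 predictors
adj_small_model = adjusted_r2(0.50, n=21, k=2)
adj_large_model = adjusted_r2(0.50, n=21, k=8)
print(adj_small_model, adj_large_model)
```

With the same R2, the model that uses more I.V.s ends up with the lower adjusted value.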

common self-report scales

- Itemized-Ratings Scales - Graphic-Ratings Scales - Comparative-Ratings Scales - Rating scales used to measure "unobservable" concepts

We are always looking to REJECT the NULL hypothesis and not accept the alternative. Why?

- Looking to reject the null hypothesis if the observed value is in the critical region - Fail to reject ("accept") the hypothesis otherwise

Online surveys

- Many ways to administer a survey online, such as through a website or over email - Explosion in use over the past decade - Email lists and panels are readily available, but response rates are often very low - Flexibility; visuals and complex material possible - Usually quick and inexpensive

Office & Recording errors

- Mistakes made by people or machines. Example, errors due to aggregation of data - Errors due to data editing, coding, or analysis errors.

Regression Analysis

- Most common type of multivariate analysis - Used to understand the influence of a set of independent variables (X's) on a continuous dependent, or outcome, variable (Y).

High structure questions

- Most useful when possible replies are known, limited in number and clear cut - Work well for obtaining factual information and assessing opinions about issues - Often used to collect ratings on attitudes, perceptions, and awareness

Nominal Scales

- Numbers are assigned for the sole purpose of identification or to "label." - Example is geography, west = 1, east = 2. ‒ Used to categorize objects, just "labels" ‒ Categories don't overlap, are mutually exclusive ‒ Numbers don't mean anything, just for coding purposes! Example: "Which of the following soft drinks do you like." Assign a label so 1= coke and 3= sprite

Ordinal Scales

- Numbers are assigned to data based on some order, more than or greater than. Example is income (higher # is meaningful). ‒ Only the order matters (6=$100k, 1=$20k) ‒ Differences between each is not known (can't subtract 1 from 6) ‒ Consistency in order is retained - example rank from 1 highest to 6 least

Interval Scales (Rating Scales)

- Numbers represent meaningful differences in both the order and the value between them - Example is 5-pt satisfaction scale ‒ Differences can be compared: rating of "5" is higher than "1," ALSO difference between 1 vs. 2 same as 4 vs. 5 ‒ No "true" zero, just another number on the scale - Example: "please indicate your liking for each of the following soft drinks by circling the number that best reflects your opinion"

Why pretesting is a good idea

- Often we want to send surveys out as quickly as possible...but this can result in unforeseen problems - Pretesting or pilot testing of a survey "tests the survey" before actual data collection begins - Running a "pre-test" to check for potential problems saves rework and ensures that you get the data that you want

Univariate analysis

- One variable is analyzed at a time. - Major purpose is to describe - Categorical measures are the simplest form of univariate analysis Example: What is the average salary?

Standardized equations

- Provides standardized coefficients (somewhat like the correlation coefficients we studied) - Does NOT have a Y-Intercept Allows you to compare the contribution of I.V.s

Graphic Ratings Scale

- Ratings of an attribute represented by a point on a line, vs. a fixed number, that runs from one extreme of the attribute to the other ‒ Similar to Itemized Rating Scale except a larger/undefined number of categories are used (vs. limited)

Ratio Scales

- Ratio scales tell us about the order, exact value between units, AND have an absolute zero. Examples include height & weight. ‒ "Zero" is meaningful, can compare the differences between the numbers ‒ Can compare intervals, rank the numbers, use numbers to identify ‒ Numbers can be added, subtracted, multiplied, divided (ratios)

Testing for statistical significance

- Regression seeks to identify the "best fit" between the predictors and the outcome - Predictors/independent variables (health, social, physical, or medical reasons) - Outcome/dependent variable: revenue (fees paid) - Analysis produces coefficients for each predictor variable that shows the individual impact - Also produces a model "coefficient of multiple determination" (R2) that reveals the combined fit, or how much variation is explained

RELIABILITY of measures

- Reliability is another term for consistency - How well the measure obtains consistent scores across time or situations

Response rates

- Response rate serves as an indicator of the overall quality - Provides insight into influence of nonresponse error

Judgement Sample

- Sample elements are handpicked because they are expected to serve the research purpose - Sample elements may be representative - Or they can offer the information needed - Snowball - Onsite panels

Quota Sample

- Sample is constructed with certain characteristics that reflect the target population - Goal is to build a sample that looks like the population - Sample elements at discretion of researcher - Online panels

Itemized-Rating Scales

- Scales that indicate ratings of an attribute or object - Respondent selects the category that best describes their position or feeling - ex: itemized verbal --> "very satisfied" "Somewhat satisfied" "neither satisfied or dissatisfied" "somewhat dissatisfied" "completely dissatisfied"

Other ways to improve response rates:

- Shorter surveys - Guarantee of confidentiality or anonymity - Tighten interviewer characteristics and training - Enhance personalization - Offer incentives - Implement follow-up surveys

Step 1: Decide what information is needed

- Start by thinking about the general issues and questions that need to be addressed - It is good practice to develop hypotheses or educated guesses about what to expect ‒ Helps clarify what information is needed - Careful of the need to know vs nice to know trap!

Step 8: Optimize the appearance of the questionnaire

- Streamline instructions - Beware of clutter - Short as possible - Use graphics and other visuals to improve appearance - Build in other interactive features, esp. online, such as 3-D and virtual reality This is your opportunity to engage!

Significance Level (α)

- The acceptable level of error is usually set at 0.05. - The level of error refers to the probability of rejecting the null hypothesis when it is actually true for the population.

Degree of Disguise

- The amount of knowledge people have about a study - Essential when knowing the purpose or sponsor is likely to bias respondents' answers - May cause respondent to change their answer, e.g., evaluations like satisfaction - Or interrupt true spontaneous answers, e.g., recall questions such as top of mind - Also useful when recreating the natural environment is necessary, particularly in experimental research 1. Disguised people are not aware ‒ Captures more authentic behavior, "can become part of the scene" ‒ Ethical considerations → Debriefing 2. Undisguised people know they are being watched ‒ Likely to forget after the first few minutes unless reminded ‒ Can collect additional background information

Degree of Structure

- The degree of standardization used in research - With highly structured questionnaires, questions and responses are completely standardized 1. Structured: - looking for - sometimes we have something specific we want to observe - leadership skills, level of participation, etc. We use a structured preset guide of what to observe or a checklist 2. Unstructured: - looking at - sometimes we want to see what is naturally occurring or exists without predetermined ideas. - We use an open-ended approach to observation and record all that we observe

Sampling error

- The difference between results obtained from a sample and results that would have been obtained from the population - Can be estimated (assuming probability sample) - Due to chance - Usually less troublesome than other kinds of error - Decreased by increasing sample size

Sample size tradeoffs

- The greater the precision, the larger the sample needed - The bigger the sample, the more confident that the true value falls within the range - The greater the variability in the sample the larger the sample size needed Increases in desired precision, confidence or variation lead to increases in necessary sample size
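
These tradeoffs can be seen in the common textbook formula for estimating a mean, n = (z × s / E)², where z reflects the confidence level, s the variability, and E the desired precision (half-width of the interval). The formula choice and all numbers below are assumptions for illustration, not taken from the course.

```python
import math

def sample_size_for_mean(z, sd, precision):
    """n = (z * sd / precision)^2, rounded up to a whole respondent."""
    return math.ceil((z * sd / precision) ** 2)

# Hypothetical inputs: 95% confidence (z = 1.96), standard deviation of 10
n_loose = sample_size_for_mean(1.96, sd=10, precision=2)   # within +/- 2 units
n_tight = sample_size_for_mean(1.96, sd=10, precision=1)   # within +/- 1 unit
print(n_loose, n_tight)
```

Halving the allowed error roughly quadruples the sample needed.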

Hypotheses testing for statistical significance

- The null hypothesis for the independent t-test is that the population means are equal - Looking to reject the null hypothesis and accept the alternative, that the population means are not equal

One-sample t-test

- The one sample t test compares the mean of your sample data to a known value, such as a known population mean 1. The sample mean (x̄) 2.The population mean (μ) 3.The sample standard deviation (s) 4.Number of observations (n)
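
The four ingredients listed above combine into the test statistic t = (x̄ − μ) / (s / √n). A minimal Python sketch, with a hypothetical sample of five ratings and a hypothetical known value of 4:

```python
import math
import statistics

def one_sample_t(sample, mu):
    """Return the t statistic and degrees of freedom for H0: mean = mu."""
    n = len(sample)                       # number of observations
    xbar = statistics.mean(sample)        # sample mean
    s = statistics.stdev(sample)          # sample standard deviation
    t = (xbar - mu) / (s / math.sqrt(n))
    return t, n - 1

ratings = [4, 5, 6, 5, 5]                 # hypothetical ratings
t_stat, df = one_sample_t(ratings, mu=4)
print(round(t_stat, 3), df)
```

The resulting t value is then compared against the t distribution with n − 1 degrees of freedom to obtain a p-value.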

p-VALUE

- The probability of obtaining a given result if the null hypothesis were true in the population - A result is regarded as statistically significant if the p-value is less than the chosen significance level of the test, usually .05 - Reject the null if the p-value < 0.05

range

- The simplest measure of dispersion is the range - The range is the difference between the minimum and maximum values - Other measures of dispersion include the quartile deviation, the mean deviation and the standard deviation

Telescoping error

- Thinking something occurred more recently than it did ‒ Gets worse as time frame asked about is shorter ‒ Tend to bring in purchases from broader time frames

Bivariate analysis

- Two variables analyzed together - Purpose is to understand relationships (how is X related to Y?) - The focus shifts to analyzing the relationship between the variables - Example: What is the relationship between having an MBA and salary? - Just as with univariate analysis, variable type determines appropriate bivariate analysis

Disguise helps to get at the truth but can also raise questions such as

- Use of disguise can be a violation of the respondent's right to know - Can lessen concerns by letting respondents know the study is "blind" and why (Respondent may wish to "opt out") - "Debriefing" provides respondents with information afterwards, including reason for the study or sponsor

Median Split

- Using a median split resulted in two education groups - A lower education group (64%; less than high school to four-year college degree) and higher education group (36%; advanced degree)
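
A minimal Python sketch of a median split on a continuous measure. The visit counts are made up, and putting values equal to the median in the "low" group is an assumption; ties at the median are one reason the two groups are often not exactly 50/50.

```python
import statistics

# Hypothetical continuous measure: number of store visits per respondent
visits = [1, 2, 2, 3, 5, 8, 12]
cut = statistics.median(visits)           # middle value

# Assumption: values at or below the median go in the "low" group
groups = ["low" if v <= cut else "high" for v in visits]
print(cut, groups)
```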

Many ways to aided ("mechanical") observation

- Utilizing technological advances and borrowing from medical sciences - Reflect the big data revolution - Exploiting the various tools getting increasingly smaller and personalized

Median

- Value separating the upper half from the lower half of a data sample - Often thought of as the "middle" value

Split-Ballot Technique

- Version A and B have a different order of the responses

Devices used for observation

- Video cameras --> ubiquitous application - Bar code scanners --> first revolution in retailers - People Meters --> stands alone for capturing media behavior

counter biasing

- When asking a question, include statement showing the situation is NOT unusual - EX of counter biasing statement: Recent studies show that one of every four households has trouble meeting its monthly financial obligations.

Setting

- Where the study takes place 1. Natural - observed where behavior normally takes place ‒ Realistic behaviors without prompting ‒ Shopping in a store, using or consuming a product at home 2. Contrived - observed in an environment that has been specially designed for recording their behavior ‒ Control over extraneous influences ‒ "fake" store, computer simulation

Descriptive statistics

- are at the core of most analyses for continuous variables - Aim is to describe properties of data based on measures central tendency and of dispersion. (ex: mean and standard deviation)

Marginal totals

- are simply the frequency counts of each variable

benefit of a probability sample

- because it is objective and random, you can make generalizations to a broader population

normal distribution

- built around all measures of central tendency - For the datasets that follow a normal distribution, the mean, median, and mode are located on the same spot on the graph

Central Tendency

- central point of distribution - one of the most quintessential concepts in statistics - A single value that reflects the center of the data distribution - Delivers a comprehensive summary of the whole dataset

Categorical variables

- contain a finite number of categories or distinct groups - Categorical measures are the simplest form of univariate analysis - Nominal and Ordinal scales often referred to as "categorical measures" because they are used to categorize respondents

regression line

- expressed as an equation showing the relationship between X & Y - regression indicates the impact of a unit change in the known variable (x) on the estimated variable (y). - For a simple regression equation (only one I.V.), your standardized coefficient is the same as your correlation coefficient. - Randomness and unpredictability are essential to regression - Residuals (e) = the difference between the observed value of y (DV) and the predicted value of y (ŷ) - A residual plot shows the residuals on the vertical axis and the independent variable on the horizontal axis - Randomness in the residuals is GOOD, means the model fits the data - Overall idea is that the independent variables explain (or predict) the response well and that only the inherent randomness remains leftover for the error portion - Equation: y = a + bx - Objective is to estimate a and b - a = intercept - b = slope - The slope and the intercept define the relationship between two variables
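
A minimal Python sketch of estimating a and b by ordinary least squares and then computing the residuals. The five (x, y) pairs and the function name are hypothetical.

```python
def fit_line(x, y):
    """Least-squares estimates of the intercept a and slope b in y = a + bx."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    b = sxy / sxx              # slope: change in y per unit change in x
    a = mean_y - b * mean_x    # intercept: predicted y when x = 0
    return a, b

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
a, b = fit_line(x, y)

# Residual = observed y minus predicted y
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print(a, b, residuals)
```

Plotting these residuals against x and seeing no pattern is the "randomness is good" check described above.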

Correlation coefficient

- indicates the extent to which two variables move together.

Confidence intervals

- integral to univariate statistics - A range of values around the estimate that is believed to contain the true value (population parameter) - Indicates that one can be 90%, 95% or 99% confident that the population mean lies somewhere in these ranges - The wider the confidence interval, the less precise the estimate of the true value - Only takes sampling error into account. It DOES NOT account for other kinds of error
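
A minimal Python sketch of a confidence interval for a mean, computed as x̄ ± z × s / √n. It assumes the large-sample z multipliers (1.96 for 95%, 2.576 for 99%); with small samples a t multiplier would normally be used instead. The ratings are made up.

```python
import math
import statistics

def mean_confidence_interval(sample, z=1.96):
    """Return (lower, upper) bounds around the sample mean."""
    n = len(sample)
    xbar = statistics.mean(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    margin = z * standard_error           # margin of error
    return xbar - margin, xbar + margin

ratings = [3, 4, 4, 5, 4, 3, 5, 4, 4, 4]  # hypothetical 5-pt ratings
low95, high95 = mean_confidence_interval(ratings)           # 95% confidence
low99, high99 = mean_confidence_interval(ratings, z=2.576)  # 99% confidence
print((low95, high95), (low99, high99))
```

Asking for more confidence with the same data widens the interval.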

Coding open-end items

- is more involved - Factual open-ended items seeking concrete responses are relatively easy to code - Numeric answers are typically recorded as given in the survey, e.g., year born = 1996 - Other types of responses are given a specific code number, Starbucks = 1, Caribou = 2, Peets = 3

Hypothesis testing

- is the process for how we determine if the sample result is true - Uses confidence intervals to provide a set of standards for making decisions - Decisions about whether (to accept) the results as a true measure of the population - In hypothesis testing we indicate which of the two hypotheses is true since we cannot prove results

types of measures of dispersions

- mean deviation - range - variance - standard deviation

standard deviation

- measure of the variation of responses on a variable - square root of the calculated variance on a variable

Coding Open-ended, less structured responses

- more difficult to code Steps in open end coding: 1. Identify the themes and patterns in various responses 2. Develop categories for responses 3. Sort responses and give the categories codes 4. Ensure consistency in codes

Multivariate analysis

- more than two variables at a time - Example: What is the relationship between having an MBA and salary, controlling for years of experience? - Multivariate analysis enables a more comprehensive look at the data - When more than two variables are involved, we move to multivariate analysis

Frequency Analysis

- most common type of categorical univariate analysis - Involves a count of the number of cases that fall into each of the possible response categories - Percentages are used to interpret the results of categorical analyses - Can follow up with a chi-square test, a histogram, or other univariate analysis Uses include: - Univariate categorical analysis - Identify blunders and cases with excessive item nonresponse - Identify outliers - Identify the median
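
A minimal Python sketch of a frequency analysis: count the cases in each response category and convert the counts to percentages. The ten coffee-shop answers are invented.

```python
from collections import Counter

# Hypothetical answers to "Which coffee shop do you visit most often?"
answers = [
    "Starbucks", "Caribou", "Starbucks", "Peets", "Starbucks",
    "Caribou", "Starbucks", "Peets", "Starbucks", "Caribou",
]

counts = Counter(answers)
total = len(answers)

# Frequency table: category -> (count, percentage of all cases)
table = {cat: (n, 100 * n / total) for cat, n in counts.items()}

for cat, (n, pct) in sorted(table.items(), key=lambda kv: -kv[1][0]):
    print(f"{cat:<10} {n:>3} {pct:5.1f}%")
```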

Continuous variables

- numeric variables that have an infinite number of values between any two values - Continuous measures add greater depth - Interval and ratio-level data are both considered "continuous" variables and can accommodate numerous types of statistics - Most commonly used measures for continuous variables are mean and standard deviation - When both variables are continuous, correlation is most useful

Nonresponse

- often a significant problem - Reminder degree of nonresponse is a good indicator of overall quality Many strategies for handling nonresponse: - Cases with a significant amount of item nonresponse should be eliminated during the editing process - Eliminate the case with the missing item(s) from all further analyses - Substitute values for the missing items - Contact the respondent again

Types of Error

- sampling error - noncoverage error - nonresponse error - response error - recording error - office error

Measurement is ...

- simply "rules" for how to interpret data or outcomes - Process of assigning numbers to objects (like people) to represent quantities of attributes OR membership to groups

"Paper" surveys

- taking on new forms - Traditionally sent by mail, respondents complete and return to organization - Lower degree of sampling control (mailing lists often available, but no control over who completes survey, and often low response rates) - No interviewer bias and can offer anonymity, but less flexibility (no explanation or follow-up, no complex materials) - Lower cost than personal or telephone interviews Warranties are a form of paper survey

Standard deviation

- The number of standard deviations shows how far you move away from the point estimate (the mean) as the confidence band (interval) gets wider

sample mean

- the average value of the responses on a variable

Unstandardized equations

-Provides coefficients in the original metric of the I.V. - Has a Y-Intercept - Does not allow you to compare the contribution of I.V.s

Two Methods of Data Collection

1. COMMUNICATION - (asking) - Surveying respondents about desired information using questionnaire 2. OBSERVATION - (watching) - Watching and capturing the relevant facts, actions, or behaviors - people watching

Univariate Top 3 Points

1. Categorical measures simplest form, Frequency Analysis most common 2. Continuous measures add greater depth, mean and standard deviation most common 3. Confidence Intervals integral to univariate statistics

Census vs parameter

1. Census --> Collecting data from all members of a population. 2. Parameter --> A characteristic or measure of a population.

Common probability sampling techniques (cont.)

1. Cluster Sample 2. Stratified Sample

Attributes of Observation and Communication

1. Communication advantages: - versatility - speed - cost 2. Observation advantages: - objectivity - accuracy

Understanding nonresponse error

1. Contact a sample of non responders - goal is to get them to answer just a few questions - compare responses to see if they are different 2. Compare respondent demographics against population - see if certain groups are over or underrepresented 3. Conduct an analysis of late responders vs early responders - looking to see where they are different

Common nonprobability sampling techniques

1. Convenience Sample 2. Judgement Sample 3. Quota Sample

Inspecting the data to ensure quality standards Top 5 Tasks:

1. Convert all responses to consistent units, e.g. months to years, days to weeks, cents to dollars 2. Assess degree of nonresponse, delete record if >/= 50% are missing 3. Check for consistency across responses ‒ Indicated primary grocery shopper in "filter" question but didn't shop in the P3M 4. Look for evidence that the respondent wasn't thinking about answers ‒ "Straight lining" (all 5's) 5. Verify that branching questions were followed correctly ‒ "If yes, continue to 3" "Otherwise skip to 5"
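
Two of these checks are easy to automate. The Python sketch below flags a record for deletion when at least 50% of its items are missing (task 2) and flags possible straight lining when every answered item has the same value (task 4). The record layout, with None marking a missing answer, and the function name are assumptions.

```python
def inspect_record(answers):
    """Flag a respondent's record; None marks a missing answer."""
    missing = sum(a is None for a in answers)
    answered = [a for a in answers if a is not None]
    return {
        # Task 2: delete the record if 50% or more of the items are missing
        "delete": missing / len(answers) >= 0.5,
        # Task 4: every answered item has the same value, e.g. all 5's
        "straight_lining": len(answered) > 1 and len(set(answered)) == 1,
    }

# Hypothetical records of six rating-scale items each
careful = inspect_record([4, 2, 5, 3, 4, 1])
all_fives = inspect_record([5, 5, 5, 5, 5, 5])
mostly_blank = inspect_record([3, None, None, 4, None, None])
print(careful, all_fives, mostly_blank)
```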

Bivariate Top 3 Points

1. Cross Tabs most used tool for categorical variables 2. Difference in means, independent and pairwise t-test 3. Correlation coefficient, r, to see if two continuous variables are linearly related.

Six steps for "drawing a sample"

1. Define the target population 2. identify the sampling frame 3. select a sampling procedure 4. determine the sample size 5. select the sampling elements 6. collect the data from the designated elements

Observation can occur in two ways

1. Direct - observing the actual behavior or activity, e.g., how people shop at a retailer or eat at a restaurant 2. Indirect - observing the effect or result of a behavior or activity, e.g., the aftermath of a party, scuff marks on a museum floor to gauge the popularity of a display

Three key steps in interpreting Multiple Regression Results

1. Does the set of predictors explain a statistically significant portion of variation in the dependent variable? (look at F-statistic) 2. How much of the variation in the dependent variable does our set of predictors explain? (look at the coefficient of multiple determination) 3. Which of the individual predictors explain variation in the dependent variable, and what is the direction of the relationship (positive or negative)? (look at the t-values and p-values of the individual predictors)

Primary Research Approaches

1. Exploratory Research - Research to gain ideas and insights to better define the problem or opportunity 2. Descriptive Research - Research describing characteristics of a group or the relationships between variables 3. Causal Research - Research to determine cause-and-effect relationships

Types of standardized questions

1. Fixed Alternative questions - everyone sees the same questions and responses, commonly ask about "evaluations" - close ended - Questions where respondents choose from among a set of alternatives. 2. Open-ended Questions - standardized question but response is "open," tend to ask about feelings - Questions in which respondents are free to reply in their own words.

Two common kinds of T-tests for Means

1. Independent Samples T-Test for Means - Test the difference in two means (different samples): H0: μ1 - μ2 = 0 Examples: - Satisfaction ratings among men vs. women - Age in years among customers vs. noncustomers 2. Paired Samples (Dependent Samples) T-Test for Means - Test the mean of pairwise differences (same sample): H0: μd = 0 Examples: - Before and after measures (pre/post advertising tests) - Applying same measure to different objects
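The two test statistics can be sketched with the standard library alone. This is an illustration, not the course's prescribed procedure: the independent-samples version assumes equal variances in the two groups (pooled variance), and the function names and data are hypothetical.

```python
import math
from statistics import mean, stdev, variance


def independent_t(sample1, sample2):
    """t statistic and degrees of freedom for H0: mu1 - mu2 = 0 (pooled variance)."""
    n1, n2 = len(sample1), len(sample2)
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample1) - mean(sample2)) / std_error, n1 + n2 - 2


def paired_t(before, after):
    """t statistic and degrees of freedom for H0: mu_d = 0 on pairwise differences."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    std_error = stdev(diffs) / math.sqrt(n)
    return mean(diffs) / std_error, n - 1


# Hypothetical satisfaction ratings for two different groups of respondents.
print(independent_t([4, 5, 6], [1, 2, 3]))

# Hypothetical pre/post advertising scores from the same four respondents.
print(paired_t([1, 2, 3, 4], [2, 4, 3, 7]))
```

The resulting t value is then compared against a t distribution with the returned degrees of freedom to get a p-value.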

Most common rating scales:

1. Likert Summated-Ratings Scales 2. Semantic Differential Scales

Common approaches to conversion

1. Median split 2. Cumulative % breakdowns 3. Two-box technique

Some places to People Watch

1. Museums 2. Airports 3. The Mall 4. Festivals 5. Beaches & pools

Two forms of hypotheses testing

1. Null Hypothesis (H0): - Proposed result is not true for the population - Difference is caused by random chance. 2. Alternative Hypothesis (HA): - Proposed result is true for the population - Difference is "real" - We are always looking to REJECT the NULL hypothesis and not accept the alternative.

Other Considerations in Designing Scales

1. Number of items in a scale 2. Individual vs composite measures - liking of one attribute, e.g., color - overall liking, e.g., based on color + flavor + size 3. Number of scale positions - usually 5-10 pts 4. Odd or even number - Odd is more common, provides a midpoint - Including a "don't know" or "not applicable" response category

Forms of data collection

1. Personal Interviews - remain the most versatile 2. Telephone Interviewing - becoming increasingly difficult 3. Mail (Paper) Surveys - shift to paperless among reasons for decline 4. Online Surveys - today's predominant method

Precision vs. Confidence

1. Precision --> The degree of error in an estimate of a population parameter 2. Confidence --> How confident we can feel that an estimate approximates the true value • As the confidence level increases, the width of the confidence interval increases as well. This results in greater accuracy • However, the precision goes down because we are "less precise" in our estimate
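To see the trade-off numerically, the sketch below computes the margin of error (half the interval width) for a sample mean at several confidence levels. It assumes a normal approximation with a known standard deviation. The function name and the numbers are illustrative only.

```python
import math
from statistics import NormalDist


def margin_of_error(std_dev, n, confidence):
    """Half-width of a confidence interval for a mean (normal approximation)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return z * std_dev / math.sqrt(n)


# Same sample (hypothetical: std dev 10, n = 100), three confidence levels.
for level in (0.90, 0.95, 0.99):
    print(level, round(margin_of_error(10, 100, level), 2))
```

With the sample held fixed, a higher confidence level gives a wider interval, which is what "less precise" means here.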

Four main regression statistics

1. R (Multiple R): The correlation between your predicted value and observed value on your D.V. 2. R2: The proportion of variance which is 'explained' by the regression equation. - Simply the square of the multiple R - "Goodness of fit" measure 3. F: The statistic used to test whether the model fits the data well. 4. T: A measure of whether your I.V. has a significant relationship with your D.V
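A small sketch of how the four statistics relate, using a regression with a single predictor so it fits in plain Python (a real multiple regression would normally be run in a statistics package). The function name and data are hypothetical.

```python
import math


def simple_regression(x, y):
    """Fit y = intercept + slope * x and return R, R2, F and the slope's t value."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    predicted = [intercept + slope * a for a in x]

    ss_total = sum((b - mean_y) ** 2 for b in y)              # total variation in the D.V.
    ss_resid = sum((b - p) ** 2 for b, p in zip(y, predicted))  # unexplained variation
    r_squared = 1 - ss_resid / ss_total

    k = 1                      # number of predictors in this sketch
    df_resid = n - k - 1
    f_stat = (r_squared / k) / ((1 - r_squared) / df_resid)
    se_slope = math.sqrt(ss_resid / df_resid / sxx)
    return {
        "R": math.sqrt(r_squared),
        "R2": r_squared,
        "F": f_stat,
        "t": slope / se_slope,
    }


# Hypothetical data: five observations of one I.V. and the D.V.
result = simple_regression([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```

With only one predictor, F equals the square of the slope's t value. With several predictors, F tests the whole set while each predictor gets its own t.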

4 types of scales used in Primary Research

1. Ratio --> comparison of absolute magnitudes - ex: units sold, income - average: geometric mean and harmonic mean 2. Interval --> comparison of intervals - ex: customer satisfaction or brand attitude - average: mean 3. Ordinal --> order - ex: brand preference or income (categories) - average: median 4. Nominal --> identity - ex: gender or brand purchase (yes/no) - average: mode

Two primary sources of nonresponse error

1. Refusals - Overcoming refusals is important in any method - Making multiple requests helps to mitigate 2. Not-at-Homes - Those who do not answer calls or contacts - Aim for a higher response rate from a smaller sample

"Neuroscience" applications

1. Response latency - traditional technique in online surveys - Measures the time it takes to respond 2. Galvanometer (GSR) - measures skin response 3. Voice-pitch analysis - measures involuntary response (as with GSR) to changes in frequency of voice 4. Eye camera - studies eye movement, can see "heat maps"

Sample vs statistics

1. Sample --> A subset of individuals or entities from a larger group 2. Statistics --> A characteristic or measure of a sample.

Three general considerations with each method

1. Sample control ‒ Ability to project or be representative 2. Information control ‒ Managing interview bias ‒ Level of anonymity 3. Administrative control ‒ Costs of sending out survey, such as paper and stamps

Common probability sampling techniques

1. Simple Random Sample 2. Systematic Sample

Two types of asking error:

1. Telescoping error 2. Recall loss - Recall loss and telescoping error work in opposite directions

Key considerations in observation research

1. Versatility - Observation is behavior in the moment ‒ Can't watch past behavior or future intentions ‒ Hard (not impossible) to "see" attitudes & values 2. Efficiency (speed & cost) - Observation on their terms ‒ Must "wait" for behaviors to happen ‒ Scanners are an exception, rapid and ongoing 3. Objectivity & Accuracy - Observation is the "real deal" ‒ Highly accurate ‒ Less subjective

Two key considerations with data analysis

1. What type of analysis will be done? 2. What level of measurement will be used? - Four levels of measurement based on the data - nominal: attributes are only named; weakest - ordinal: attributes can be ordered - interval: distance is meaningful - ratio: absolute ratio

Considerations in Observation Studies

1. degree of structure 2. degree of disguise 3. setting 4. method of administration

The 4 most popular chart types in Excel

1. pie chart - percentages of a whole 2. column chart - using vertical columns, displays values for one or more series over time or other category 3. bar chart - displays values for one or more series using horizontal columns 4. line chart - displays values as equally spaced points connected with a line

Populations vs. Samples

1. population - the measurable quality is called a parameter - the population is a complete set - reports are a true representation of opinion - it contains all members of a specific group 2. samples - the measurable quality is called a statistic - the sample is a subset of the population - reports have a margin of error and confidence interval - it is a subset that represents the entire population

Marketing Research Process Stages

1. problem/opportunity definition 2. data and information gathering 3. data analysis 4. communication of results

Process for developing an effective questionnaire

1. specify what information will be sought 2. determine method of administration 3. determine content of individual questions 4. determine form of response to each question 5. determine wording of each question 6. prepare dummy tables 7. determine question sequence 8. determine appearance of questions 9. develop recruiting message or script 10. reexamine steps 1-9, pretest questionnaire, and revise if necessary

Why is it better to try for a higher response rate from a smaller sample than to start off with a larger pool?

A larger pool with a low response rate can yield the same number of responses in the end as a smaller pool with a high response rate, but the higher response rate is better because fewer nonrespondents means less nonresponse error, and therefore lower total error

Constant-sum Method

A comparative-ratings scale in which an individual divides some given sum among two or more attributes on a basis such as importance or favorability

Outlier

A data point that differs significantly from other observations; it may be due to variability in the measurement or it may indicate an error

Comparative-Rating Scales

A rating scale based on a series of relative judgments or comparisons rather than as independent assessments

Advantages and disadvantages of Highly standardized questions

Advantages: - ease of administration - ease of coding and analysis - measure reliability Disadvantage: - Response Bias such as: forced choice, omitted response, and precision of response

Close-Ended Questions Advantages/Disadvantages

Advantages: - fast answer - less complicated - easy to find using internet Disadvantages: - doesn't tell you a lot - you don't have to think, which means you're not stretching your brain

Open-Ended Questions Advantages/Disadvantages

Advantages: - learn more by answering these questions - get more detailed information - more specific answers Disadvantages: - can take a long time to answer properly

How are precision and confidence inversely related?

As one increases, the other decreases, all else equal.

Two general kinds of classification questions:

Ask for classification & sensitive information last 1. Target Information: The basic information that addresses the subject of the study 2. Classification Information: Information used to classify respondents, typically for demographic breakdowns

Two-Box Technique

Converting interval-level ratings into categorical measures by only showing top two positions on a rating scale
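A brief sketch of two of the conversion approaches, the two-box technique and the median split. The function names and ratings are hypothetical. Sending values equal to the median to the "low" group is an assumption of this sketch, not a rule from the course.

```python
from statistics import median


def top_two_box(ratings, scale_max=5):
    """Share of respondents who chose one of the top two scale positions."""
    return sum(1 for r in ratings if r >= scale_max - 1) / len(ratings)


def median_split(values):
    """Label each value 'high' or 'low' relative to the median (ties go to 'low')."""
    cut = median(values)
    return ["high" if v > cut else "low" for v in values]


# Hypothetical ratings on a 5-point scale.
ratings = [5, 4, 3, 2, 5, 4, 1, 3, 4, 5]
print(top_two_box(ratings))
print(median_split([1, 2, 3, 4]))
```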

What is the most prevalent type of Primary Research?

Descriptive research

Step 10: Reexamine!

Developing a survey normally requires several revisions of the data collection form. It's an iterative process.

What is the best way to mitigate response error?

Doing a pretest

random error

Error in measurement due to temporary aspects of the person or situation

systematic error

Error in measurement that is constant, affects the measurement in a constant way.

What can be done to avoid unstated alternatives?

Exploratory research and pre-testing

Step 6: Prepare a summary table showing hypothetical results

Forces you to think about all the questions and each piece of information

What is the primary benefit of Probability Sampling?

Generalizations (inferences) can be made.

What makes a ratio scale different

Has a meaningful zero that indicates the absence of the quantity being measured; on the other scales, zero is arbitrary or has no such meaning

Snowball Sampling

Judgement sample used to sample special, hard-to-find populations, in which an initial set of respondents is located and asked to identify others with the same special characteristics.

WHY is it better to try for a higher response rate from a smaller pool than start with a larger pool?

Lower total error

Unstated Alternatives

One that is not expressed in the "options" provided in the question - rephrase! 1. "Would you like to increase your monthly pay salary?" VS. 2. "Would you prefer to increase your monthly pay salary or reduce your tax obligation at the end of the year?"

Which method of primary research is best at mitigating refusals?

Personal Interviews

Step 9: Prepare a welcome/recruiting message

Provides helpful context: 1. Sponsor of study (if not disguised) 2. Reason for study 3. Promise of anonymity & confidentiality 4. The request for help 5. How long it will take 6. The incentives Get them motivated to complete the study

Semantic-Differential (Rating) Scale

Respondents check the position between a pair of bipolar adjectives or phrases that best describes their feelings toward the object

Likert-Summated Ratings Scale

Respondents indicate their degree of agreement or disagreement with each of a number of statements

2. identify the sampling frame

Sampling Frame --> The list of population elements from which a sample will be drawn; could be geographic areas, institutions, individuals, or other units. Commonly used sampling frames: • Customer database • Member directories • Lists developed by data compilers • Others

Systematic Sample Interval Formula

The number of population elements to count (k) when selecting the sample members in a systematic sample k = (# elements in sampling frame) / (total sampling elements)
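As an illustration, the interval k and the random start can be sketched as follows. The function name is hypothetical, and rounding k down when the frame size is not an exact multiple of the sample size is an assumption of this sketch.

```python
import random


def systematic_sample(frame, sample_size, rng=None):
    """Select every kth element of the sampling frame after a random start."""
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and the frame size")
    rng = rng or random.Random()
    k = len(frame) // sample_size      # sampling interval
    start = rng.randrange(k)           # random start within the first interval
    return [frame[start + i * k] for i in range(sample_size)]


# Hypothetical frame of 1,000 member IDs, sample of 50, so k = 20.
frame = list(range(1, 1001))
sample = systematic_sample(frame, 50, random.Random(42))
print(sample[:5])
```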

Stratified Sample

The population is divided into subsets and a random sample of elements is chosen from each subset

Cluster Sample

The population is divided into subsets and a random sample of one or more subsets (clusters) is selected
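The difference between a stratified sample and a cluster sample can be sketched side by side: a stratified sample draws elements from every subset, while a cluster sample draws whole subsets. The function names and groups are hypothetical, and keeping every element of a chosen cluster (one-stage cluster sampling) is an assumption of this sketch.

```python
import random


def stratified_sample(strata, per_stratum, rng):
    """Draw a random sample of elements from EVERY subset (stratum)."""
    return {name: rng.sample(members, per_stratum) for name, members in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole subsets (clusters) and keep all of their elements."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


# Hypothetical population of 30 customers divided into three regions.
groups = {
    "north": list(range(0, 10)),
    "south": list(range(10, 20)),
    "west": list(range(20, 30)),
}
rng = random.Random(1)
strat = stratified_sample(groups, 3, rng)
clust = cluster_sample(groups, 1, rng)
print(strat)
print(clust)
```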

Sampling Plan

The process of selecting people to be interviewed

Question Order Bias:

The tendency for earlier questions to influence respondents' answers to later questions

STEP 4: Determine the Sample Size

Three pieces of information are needed to determine the sample size: 1. How much precision is desired in the estimate 2. How confident we need to be that the true value falls within the precision range established 3. How homogeneous (similar) the population is on the characteristic to be estimated NOTE: The size of the population does not impact the size of the sample
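One common way to combine the three inputs when estimating a mean is n = (z × s / H)², where H is the desired half-precision, z reflects the confidence level, and s is the standard deviation of the characteristic. The sketch below uses that textbook formula as an assumption, since the card itself does not give a formula. The function name and numbers are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(std_dev, half_precision, confidence=0.95):
    """Sample size needed to estimate a mean within +/- half_precision."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided critical value
    return math.ceil((z * std_dev / half_precision) ** 2)


# Hypothetical: std dev of 10, estimate wanted within +/- 2 at 95% confidence.
print(sample_size_for_mean(10, 2))
```

Population size does not appear anywhere in the calculation, which matches the note above.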

STEP 3: Select a Sampling Procedure

Two categories of sampling techniques 1. Probability Samples 2. Nonprobability Samples

Step 5: Determine wording of each question

Use simple words, err on the side of simplicity. AVOID THESE QUESTIONS: 1. Avoid ambiguous words & questions 2. Avoid leading questions 3. Avoid assumed consequences 4. Avoid generalizations 5. Avoid double-barrel questions

Assumed Consequences

When a question does not clearly state the consequences and can generate different responses from individuals who assume different consequences

Variability

how the scores are scattered around the central point

Biggest factor in response rates =

interest in topic

sample drawn from a population to...

make inferences about the population

Sampling error can be calculated with...

probability sampling

coding involves transforming raw data into

symbols (usually numbers)

Observed response =

truth + systematic error + random error
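A small simulation can make the decomposition concrete: random error tends to average out across many responses, while systematic error shifts every response the same way. The function name and the values (a true score of 7.0, a bias of 0.5) are invented for this sketch, and drawing the random error from a normal distribution is an assumption.

```python
import random
from statistics import mean


def observe(truth, systematic_error, random_sd, n, rng):
    """Simulate n observed responses = truth + systematic error + random error."""
    return [truth + systematic_error + rng.gauss(0, random_sd) for _ in range(n)]


rng = random.Random(0)
observed = observe(truth=7.0, systematic_error=0.5, random_sd=1.0, n=10000, rng=rng)

# The average stays near truth + systematic error: the bias does not wash out.
print(round(mean(observed), 2))
```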

Two key considerations of measurements

‒ Measure characteristics of a person not the person ‒ The way we measure these characteristics varies, attributes have different qualities

