Marketing Research Exam 3
Categorical Measures
Because both nominal and ordinal measures are easily used to group respondents or objects into groups or categories, researchers often refer to these types of measures as categorical measures.
Codebook
contains explicit directions about how raw data for each case are coded in the data file
Branching questions
(also called skip pattern) a question that routes people to different survey items based on their responses.
Open-Ended Questions
- Respondents are free to reply in their own words rather than being limited to choosing from a set of alternatives - quite versatile - often used to begin a survey because they help respondents begin to focus on the topic at hand - 2 types of open-ended questions: 1. factual information 2. exploratory information
Optical scanning
takes information directly from the data collection forms and reads it into a data file
Types of Probability Samples
- Simple random sample: each unit included in the sample has a known and equal chance of being selected for study, and every combination of population elements is a sample possibility. - Systematic sample: a probability sampling plan in which every kth element in the population is selected for the sample pool after a random start. - Sampling interval: the number of population elements to count (k) when selecting the sample members in a systematic sample. - Total sampling elements: the number of population elements that must be drawn from the population and included in the initial sample pool in order to end up with the desired sample size. - Stratified sample: a probability sample in which (1) the population is divided into mutually exclusive and exhaustive subgroups and (2) samples are chosen from each of the subgroups. - Cluster sample: similar to a stratified sample in that the population is divided into mutually exclusive and exhaustive subgroups, but a random sample of one or more of the subgroups is selected. - Area sampling: a form of cluster sampling in which areas serve as the primary sampling units. Using maps, the population is divided into mutually exclusive and exhaustive areas, and a random sample of areas is selected.
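As a rough illustration of the systematic sample and sampling interval described above, here is a minimal Python sketch. The function name `systematic_sample`, the 100-element frame, and the seed are assumptions made for the example, not part of the course material.

```python
import random

def systematic_sample(population, sample_size, seed=None):
    """Select every k-th element after a random start within the first interval."""
    k = len(population) // sample_size      # sampling interval k
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    rng = random.Random(seed)
    start = rng.randrange(k)                # random start in positions 0 .. k-1
    return [population[i] for i in range(start, len(population), k)][:sample_size]

frame = list(range(1, 101))                 # hypothetical sampling frame of 100 elements
sample = systematic_sample(frame, 10, seed=1)
print(sample)                               # 10 elements, each 10 apart
```

With 100 elements and a desired sample of 10, the sampling interval is k = 10, so only the starting point is random; every later element follows from it.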
Close-ended questions
- choose an answer from the options given - Negative aspects: close-ended questions often don't reveal the respondent's true beliefs - if 20-25% of respondents truly don't know the answer, include a "don't know" option - Response categories must be exhaustive and mutually exclusive - response order bias: occurs when responses are likely to be affected by the order in which the alternatives are presented - split-ballot technique: generally paper-based; uses multiple versions of the questionnaire, each varying the order of the response categories * answer choices listed first tend to be chosen most often
Significance Level
-Alpha -Acceptable level of error, usually at .05
Different types of Nonprobability samples
-Convenience samples: being included in the sample is a matter of convenience. -Judgement samples: the sample elements are handpicked by the researcher because she believes that they can serve the research purpose. -Snowball sample: a judgment sample that is sometimes used to sample special populations, in particular those that are difficult to find and identify. -Quota Samples: a nonprobability sample chosen so that the proportion of sample elements with certain characteristics is about the same as the proportion of the elements with the characteristics in the target population
When testing One-Tail Test with Computer...
-Divide the two-tailed probability (p-value) reported by the software in half -Check that the direction of the difference matches the hypothesis
How to Improve Response Rates?
-Respondent interest in topic -Guarantee anonymity -Follow up surveys -Survey Length -Personalization -Response Incentives -Interviewer characteristics and training
Even though you want to use simple words...
...remember to be precise enough to avoid ambiguity.
Coding open-ended items is a time consuming process that involves a...
...significant amount of subjective interpretation of respondents' comments
Paired sample t-test
1 group, 2 measurements. Examples: before-and-after measures; longitudinal designs; asking the same questions about two brands for each person
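A minimal sketch of the paired-sample t statistic, computed on the within-person differences. The before/after ratings are made-up numbers and `paired_t` is a hypothetical helper name; the resulting t would still need to be compared against a t table (or software output) for n - 1 degrees of freedom.

```python
import math
import statistics

def paired_t(before, after):
    """t statistic for one group measured twice (uses the per-person differences)."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)            # sample standard deviation (n - 1)
    t = mean_diff / (sd_diff / math.sqrt(n))     # mean difference / its standard error
    return t, n - 1                              # t statistic and degrees of freedom

before = [3, 4, 2, 5, 4]    # hypothetical ratings before an ad exposure
after = [4, 6, 3, 5, 6]     # the same five people afterwards
t_stat, df = paired_t(before, after)
print(round(t_stat, 3), df)
```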
3 Elements of Written Report
1. Completeness 2. Accuracy 3. Clarity
Primary tasks in editing process
1. Convert all responses to consistent units 2. assess degree of nonresponse. If limited, keep the available data 3. Where possible, check for consistency across responses 4. Look for evidence that the respondent wasn't really thinking about his or her answers 5. Verify that branching questions were followed correctly 6. add any needed codes
Hypothesis Testing
1. Create mutually exclusive hypotheses, H0 and HA 2. Generate evidence bearing on H0 and HA 3. Assess the probability of the evidence assuming H0 is true 4. Reject H0 if the probability is small
Steps in Drawing a Sample (6 steps)
1. Define the Target Population 2. Identify the Sampling Frame 3. Select a Sampling Procedure 4. Determine the Sample Size 5. Select the Sample Elements 6. Collect the Data from the Designated Elements
Preliminary Steps before Analysis
1. Editing raw data 2. Coding data 3. Aggregating behavioral data to usable units 4. Merging data from different sources 5. Building the data set 6. Creating a codebook 7. Cleaning the data 8. Handling missing data
3 Questions to Answer When Deciding How Big a Sample We Need
1. How homogeneous the population is 2. How precise the estimate must be 3. How confident we must be
3 Underlying Assumptions of Regression
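1. Independence of errors (violated when the residuals show a pattern such as a sine wave) 2. Normality of errors (violated when the residuals at each cross section do not tail off like a bell curve) 3. Homoscedasticity of errors (violated when the error variance is not the same across observations)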
When is a Mean worth using to determine sample size?
1. Interval Scales 2. Ratio Scales
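For interval or ratio data, one common textbook form of the sample-size calculation is n = z² · σ² / H², where σ is an estimate of the population standard deviation and H is the desired precision (half-width of the interval). The sketch below assumes that formula; the input values and the helper name are illustrative only.

```python
import math

def sample_size_for_mean(sigma, precision, z=1.96):
    """n = z^2 * sigma^2 / H^2, rounded up to the next whole respondent."""
    return math.ceil((z * sigma / precision) ** 2)

# Hypothetical inputs: estimated sd of 10, precision of +/- 2, 95% confidence
n_mean = sample_size_for_mean(sigma=10, precision=2)
print(n_mean)  # 97
```

The three inputs map onto the three questions about sample size: homogeneity (sigma), precision (H), and confidence (z).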
What are the different types of probability with a Cross Tabulation?
1. Joint Probabilities 2. Marginal Probabilities 3. Conditional Probabilities
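A small sketch of the three probabilities, using a made-up 2 x 2 cross tabulation of gender by purchase (all counts are hypothetical):

```python
# Hypothetical cross tabulation of counts: (gender, purchased?) -> number of respondents
table = {
    ("men", "yes"): 30, ("men", "no"): 20,
    ("women", "yes"): 10, ("women", "no"): 40,
}
total = sum(table.values())                       # 100 respondents

# Joint probability: P(men AND purchased)
joint = table[("men", "yes")] / total

# Marginal probability: P(men), summing across the purchase categories
marginal_men = sum(v for (gender, _), v in table.items() if gender == "men") / total

# Conditional probability: P(purchased | men) = joint / marginal
conditional = joint / marginal_men

print(joint, marginal_men, round(conditional, 2))  # 0.3 0.5 0.6
```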
When is Proportion used to determine sample size?
1. Nominal Scales 2. Ordinal Scales
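For nominal or ordinal data, the analogous textbook formula works with a proportion: n = z² · p(1 − p) / H². The sketch below assumes that formula; p = 0.5 is the conservative choice because it maximizes p(1 − p). The helper name and inputs are illustrative.

```python
import math

def sample_size_for_proportion(p, precision, z=1.96):
    """n = z^2 * p * (1 - p) / H^2, rounded up to the next whole respondent."""
    return math.ceil(z ** 2 * p * (1 - p) / precision ** 2)

n_conservative = sample_size_for_proportion(p=0.5, precision=0.05)   # least homogeneous case
n_skewed = sample_size_for_proportion(p=0.2, precision=0.05)         # more homogeneous population
print(n_conservative, n_skewed)  # 385 246
```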
What are the 6 Types of Error?
1. Sampling Error- difference in sample results from population 2. Non-Coverage Error- failure to include qualified elements in the sample 3. Non-Response Error- failing to obtain info from a selected sample member (refusals or not at home) 4. Response Error- individual provides inaccurate response either consciously or accidentally 5. Recording Error- researcher or technology records wrong information 6. Office Error- error due to data editing, coding, or analysis
Interpreting Multiple Regression Results
1. Does the set of predictors explain a significant portion of the variation? (look at ANOVA) 2. How much variation in the dependent variable do our predictors explain? (R2) 3. Which individual predictors explain the variation, and in what direction?
What are some Additional Approaches to determining Sampling Size?
1. Size of Research Budget 2. Anticipated Analyses 3. Historical Practice
Steps to making a questionnaire (10 steps)
1. Specify what information will be sought 2. Determine method of administration 3. Determine content of individual questions 4. Determine form of response to each question 5. Determine wording of each question 6. Prepare dummy tables 7. Determine question sequence 8. Determine appearance of questionnaire 9. Develop recruiting message or script 10. Reexamine steps 1-9, pretest questionnaire, and revise if necessary
Rules that researchers should keep in mind in developing bias-free questions are:
1. Use simple words 2. Avoid ambiguous words and questions 3. Avoid leading questions 4. Avoid unstated alternatives 5. Avoid assumed consequences 6. Avoid generalizations and estimates 7. Avoid double barreled questions
Duration of oral presentation
1/3 to 1/2 of the allotted time for the formal presentation; the remainder for question-and-answer interaction (the time needed can be underestimated because people ask questions)
Independent samples t-test for means
2 groups, 1 measurement. Examples: satisfaction ratings, men vs. women; age in years, customers vs. noncustomers; sales, those who saw the ad vs. those who didn't
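A minimal sketch of the independent samples t statistic. It assumes the pooled-variance (equal variances) version of the test; the satisfaction ratings are invented and `independent_t` is a hypothetical helper name.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two separate groups on one measure."""
    n1, n2 = len(group_a), len(group_b)
    mean1, mean2 = statistics.mean(group_a), statistics.mean(group_b)
    var1, var2 = statistics.variance(group_a), statistics.variance(group_b)
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, n1 + n2 - 2   # t statistic, degrees of freedom

saw_ad = [7, 8, 9, 8, 8]        # hypothetical satisfaction ratings
did_not_see = [6, 7, 5, 6, 6]
t_value, t_df = independent_t(saw_ad, did_not_see)
print(round(t_value, 3), t_df)
```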
Pictograms
A bar chart in which pictures represent amounts—for example, piles of dollars for income, pictures of cars for automobile production, people in a row for population.
Bar Chart
A chart in which the relative lengths of the bars show relative amounts of variables or objects.
Pie Chart
A circle representing a total quantity that is divided into sectors, with each sector showing the size of the segment in relation to that total.
Pearson chi-square test of independence
A commonly used statistic for testing the null hypothesis that categorical variables are independent of one another. Examples: purchased our product, men vs. women; customers vs. noncustomers, by marital status; used the coupon, those who saw the ad vs. those who didn't
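The statistic can be sketched by comparing each observed cell count with the count expected under independence (row total × column total / n). The 2 x 2 counts below are hypothetical and the function name is an assumption for the example.

```python
def chi_square_independence(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected if independent
            chi2 += (obs - expected) ** 2 / expected
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df

# Hypothetical counts: rows = men, women; columns = purchased, did not purchase
chi2_stat, chi2_df = chi_square_independence([[30, 20], [10, 40]])
print(round(chi2_stat, 2), chi2_df)   # 16.67 1
```

With 1 degree of freedom the critical value at the .05 level is 3.84, so a statistic this large would lead to rejecting the null hypothesis of independence.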
Coefficient of multiple determination (R2)
A measure representing the relative proportion of a total variation in the dependent variable that can be explained or accounted for by the fitted regression equation. When there is only one predictor variable, this value is referred to as a coefficient of determination.
Banner
A series of cross tabulations between an outcome, or dependent variable, and several (sometimes many) explanatory variables in a single table.
Stratum Chart
A set of line charts in which quantities are aggregated or a total is disaggregated so that the distance between two lines represents the amount of some variable.
Cramer's V
A statistic used to measure the strength of the relationship between categorical variables
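Cramer's V is usually computed from the chi-square statistic as V = sqrt(chi² / (n · (min(rows, columns) − 1))), giving a value between 0 and 1. A minimal sketch with hypothetical counts:

```python
import math

def cramers_v(observed):
    """Cramer's V for a table of counts (0 = no association, 1 = perfect association)."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = sum(
        (obs - row_totals[i] * col_totals[j] / n) ** 2 / (row_totals[i] * col_totals[j] / n)
        for i, row in enumerate(observed)
        for j, obs in enumerate(row)
    )
    smaller_dim = min(len(observed), len(observed[0]))
    return math.sqrt(chi2 / (n * (smaller_dim - 1)))

v = cramers_v([[30, 20], [10, 40]])   # hypothetical 2 x 2 table
print(round(v, 3))                    # 0.408
```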
Line Chart
A two-dimensional chart with the x-axis representing one variable (typically time) and the y-axis representing another variable.
Always calculate percentages in the direction of the causal variable
Always calculate percentages in the direction of the causal variable
Sample Mean
Arithmetic average of values of responses
Data Visualization
The process of using graphic illustrations to understand and communicate important relationships in large data sets.
Bayesian vs. Frequentist Statistics
Bayesian: -Probability indicates strength of belief -Subjectivity plays a role -Worries about everything -Uses highest density region. Frequentist: -Probability means long-run frequency -Subjectivity is not allowed -Uses confidence interval
Three writing standards that a report should meet if it is to communicate effectively with readers:
Completeness: The degree to which the report provides all the information readers need in language they understand (the report must be complete without being too complete). Accuracy: The degree to which the reasoning in the report is logical and the information is correct. Clarity: The degree to which the phrasing in the report is precise and concise.
Correlation vs Regression
Correlation: symmetric. Regression: asymmetric
What is the Goal with error?
Decrease overall TOTAL error
Content
Definitions and bulleted lists; maps; diagrams; tables and figures; how much
Precision
Degree of error in an estimate of a population parameter Precision and Confidence are INVERSELY related
Confidence
Degree to which one can feel confident that an estimate approximates the true value Confidence and Precision are INVERSELY related
Coding open-ended items
Factual open-ended questions are easy to code; exploratory open-ended questions are difficult. 1. Identify usable responses 2. Develop response categories 3. Sort responses into categories and use coders 4. Assess degree of agreement between coders
Structure of an oral presentation
General purpose, research problems, evidence, conclusions. Can also present conclusions first, then evidence
Oral Report
Goal is to communicate with the audience at hand. Quality research will not improve a poor presentation
Explain how the time allotted for an oral presentation should be organized
Honor the time limit set for the meeting. Use no more than half of the time for the formal presentation. Reserve the remaining time for questions and discussion.
Research reports are evaluated based on one fundamental issue:
How well they communicate with the reader
Classification Information
Information used to classify respondents usually demographically. WAIT TO ASK THIS AT THE END
Continuous Measures
Interval and ratio measures
How to convert Continuous measures into Categorical?
Judgment. Median Split: technique to convert continuous measures into categorical measures by splitting the measure at its median value. Cumulative Percentage Breakdown: similar to median split but with any number of groups. Two-Box Technique: converting an interval-level rating into categories
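A short sketch of two of these conversions, the median split and the (top) two-box technique, applied to made-up 5-point ratings:

```python
import statistics

ratings = [1, 2, 2, 3, 4, 4, 5, 5]      # hypothetical 5-point scale responses

# Median split: values above the median become "high", the rest "low"
median = statistics.median(ratings)      # 3.5 for these data
median_split = ["high" if r > median else "low" for r in ratings]

# Two-box technique: share of respondents choosing one of the top two scale positions
top_two_box = sum(1 for r in ratings if r >= 4) / len(ratings)

print(median, median_split.count("high"), top_two_box)   # 3.5 4 0.5
```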
Two Fundamental rules for a good presentation
Know your stuff Know your audience
When should you ask difficult or sensitive information?
Late in the questionnaire
Coding closed-ended items
Limited number of response categories and will ask the respondent to choose the best response, or all that apply. Most items included in questionnaire will be closed-ended.
Regression Analysis
Linear equation showing influence of independent variable(s) (Simple or Multiple Regression) on a dependent variable
What doesn't go in an appendix?
Main Results
Sample Standard Deviation
Measure of variation of responses (square root of variance)
Hypothesis Types
Null Hypothesis (H0)- hypothesis that a proposed result is NOT true for the population Alternative Hypothesis (HA)- hypothesis that a proposed result is true for a population
Response Rate
Number of completed interviews with responding units divided by the number of eligible responding units in the sample. An indicator of the overall quality of data collection efforts
How to decide when to use each graph
One nominal variable (histogram, pie chart); two nominal variables (grouped bar chart); one numeric variable (frequency distribution, box & whisker plot); two numeric variables (scatterplot); several numeric variables (stratum chart, grouped bar chart); one nominal and one numeric variable (pictogram, presenting a t-test)
2 general classes of open-ended questions
One seeks factual info; one is exploratory
How is Structure handled for questions?
Open-Ended or Close-Ended
The exceptional presenter must OPEN UP
Organized, Passionate, Engaging, Natural, Understand the audience, Practice
Parameters vs. Statistics
Parameter: a characteristic or measure of a population Statistic: a characteristic or measure of a sample
Bayesian Stat Symbols
Pr = probability of the hypothesis being true -Posterior probability -Likelihood of the data -Prior for the parameter -Vertical bar (|) = "assuming that"
Hypothesis Outcomes
Probability is smaller than .05 (or the specified significance level): -Reject H0 -Confirm HA. Probability is NOT smaller than .05 (or the specified significance level): -Fail to reject H0 -Do not confirm HA
P-Value
Probability of obtaining a given result if in fact the null were true. -A result is statistically significant if the p-value is less than the chosen significance level of the test
Sample
Selection of a subset of elements from a larger group of objects (Drawn to make inferences about the population)
Chi-Square Goodness of Fit Test
Statistical test to determine whether some observed pattern of frequencies corresponds with an expected pattern
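A minimal sketch of the goodness-of-fit statistic, comparing observed frequencies with the frequencies expected under some hypothesized pattern. The brand-choice counts and the equal-shares expectation are assumptions made for illustration.

```python
def chi_square_goodness_of_fit(observed, expected_proportions):
    """Sum of (observed - expected)^2 / expected across categories."""
    n = sum(observed)
    chi2 = 0.0
    for obs, prop in zip(observed, expected_proportions):
        expected = n * prop
        chi2 += (obs - expected) ** 2 / expected
    return chi2, len(observed) - 1       # statistic, degrees of freedom

# Hypothetical: 100 shoppers choosing among three brands, expected to split evenly
gof_stat, gof_df = chi_square_goodness_of_fit([50, 30, 20], [1 / 3, 1 / 3, 1 / 3])
print(round(gof_stat, 2), gof_df)        # 14.0 2
```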
Descriptive Statistics
Used to describe distribution of responses on a variable, commonly the Mean and Standard Deviation
Recall Loss
Tendency to forget an event entirely - the problem is reduced the shorter the period of time respondents are asked about - best time frame tends to be between 2 weeks and one month
What is in an executive summary?
The executive summary is the most important part of the report; think about what you would most want to communicate about the project if you only had 60 seconds to do so. (Introduction, Results, Conclusions, Recommendations)
Possible strategies for handling missing data
There is no single right or simple answer for how missing data should be handled. Options: -Eliminate the case -Substitute values -Contact the respondent again
Pretest
Use of a questionnaire on a trial basis in a small pilot study to determine how well the questionnaire works; pretest and make revisions before conducting the official questionnaire
Why use multivariate analysis?
We do not live in a univariate world (too broad). Multivariate analysis adds meaning by looking for differences across groups or associations among variables.
Paradox of Completeness
Written reports must be complete but not too complete Completeness must be balanced with clarity
Histogram
a form of bar chart that is based on information from a frequency count
Sampling Frame
a listing of population elements from which you'll draw the sample Ex- geographic areas, institutions, individuals or other units
Item nonresponse
occurs when a respondent agrees to participate but skips or refuses to answer a specific question, for example because the question is poorly worded
Avoid Assumed Consequences
a problem that occurs when a question is not framed so as to clearly state the consequences, and this generates different responses from individuals who assume different consequences. - Key point: it is important to ask questions that are precise and that don't require respondents to make assumptions.
Avoid Leading Questions
a question framed so as to give the respondent a clue as to how they should answer - sugging is sales disguised as research
Census
a type of sampling plan in which data are collected from or about each member of a population
Population
all cases that meet the designated specifications for membership in the group of interest - those who qualify are often referred to as population elements
Avoid Unstated Alternatives
an alternative answer that is not expressed in a question's options.
Funnel Approach
an approach to question sequencing that gets its name from starting with broad questions and progressively narrowing down the scope
two-box technique
analysts often report the results of rating scale questions by presenting the percentage of respondents who checked one of the top two positions on a rating scale.
Pearson product-moment correlation coefficient
assesses the degree of linear association between two continuous variables; The correlation coefficient can range from -1 to +1.
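A minimal sketch of the coefficient computed from its definition (covariation divided by the product of the two variables' variation). The x and y values are hypothetical and `pearson_r` is a helper name chosen for the example.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

ad_spend = [1, 2, 3, 4, 5]      # hypothetical values
sales = [2, 4, 5, 4, 5]
r = pearson_r(ad_spend, sales)
print(round(r, 3))              # 0.775
```

Swapping the two arguments gives the same value, which is what makes correlation symmetric.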
Merging
combining data from different data sources into a single database
Frequency Analysis
consists of counting the number of cases that fall into the various response categories Uses: -Univariate categorical analysis -Identify blunders and nonresponse -Identify outliers -Identify the median
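A small sketch of a frequency count using Python's `collections.Counter`. The responses are invented, and the coding scheme (valid codes 1 to 3, `None` for a skipped item) is an assumption for the example.

```python
from collections import Counter

# Hypothetical responses to a 3-category item; 9 is an out-of-range code, None is nonresponse
responses = [1, 2, 2, 3, 3, 3, 9, None]
valid_codes = (1, 2, 3)

counts = Counter(responses)                               # raw frequency of every value
blunders = [r for r in counts if r is not None and r not in valid_codes]
valid = [r for r in responses if r in valid_codes]
percentages = {code: round(100 * valid.count(code) / len(valid), 1) for code in valid_codes}

print(counts[None], blunders, percentages)
```

The raw counts expose the blunder (the stray 9) and the nonresponse, while the percentages are based on valid cases only.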
Descriptive statistics
describe the distribution of responses on a variable, including measures of central tendency (mean, median, mode), measures of the spread, or variation, in the distribution (range, variance, standard deviation), and the various measures of the shape of the distribution
Probability Samples
each member of the target population has a known, nonzero chance of being included in the sample -Random component to an element being selected -Sampling error can be estimated; this is why we use it
Editing Data
inspection and correction of data -Convert responses to consistent units -Assess degree of nonresponse -Check for consistency among responses -Verify branching questions -Add necessary codes
Nonprobability samples
involve personal judgment somewhere in the selection process; i.e., because of that judgment, you can't really say what would be true for the population -Sampling error cannot be estimated
Cross tabulation
a multivariate technique used to study the relationship between two (or more) categorical variables; considers their joint distribution
Categorical Measures
nominal and ordinal measures used to group respondents or objects into groups or categories
Outliers
observations that are SO different from the rest of the observations that they ought to be treated as special cases
Blunders
office errors that occur during editing, coding, or data entry- most frustrating kind of error
Data aggregation
process of creating summary data (count, sum, mean) for a particular repeated behavior over a specified period of time
Significance level
with any type of hypothesis testing, you'll need to select an appropriate level of error related to the probability of rejecting the null hypothesis when it is actually true for the population.
Confidence Interval
projection of the range within which a population parameter will lie at a given level of confidence based on a statistic obtained from an appropriately drawn sample. To produce one, all we need to do is calculate the degree of sampling error for the particular statistic
Confidence Interval
projection of the range within which a population parameter will lie at a given level of confidence based on a statistic obtained from an appropriately drawn sample -A z-score of 1.96 (95% confidence) is regularly used -Only takes sampling error into account (not nonresponse)
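A minimal sketch of a 95% confidence interval for a mean, using the usual form mean ± z × (s / √n). The summary statistics are hypothetical and the function name is an assumption for the example.

```python
import math

def confidence_interval_for_mean(sample_mean, sample_sd, n, z=1.96):
    """Interval = mean +/- z * standard error, where standard error = sd / sqrt(n)."""
    standard_error = sample_sd / math.sqrt(n)      # estimated sampling error of the mean
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Hypothetical survey result: mean 50, standard deviation 10, n = 100
ci_low, ci_high = confidence_interval_for_mean(50, 10, 100)
print(round(ci_low, 2), round(ci_high, 2))         # 48.04 51.96
```

Raising z (more confidence) widens the interval (less precision) for the same sample, which is the inverse relationship between confidence and precision.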
Filter question
question used to determine whether or not the individual is likely to have the information
Double-entry
requires that data be entered by 2 separate people in two separate data files and then the data files can be compared for discrepancies
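The comparison step can be sketched as below. The data layout (a dictionary keyed by case ID, each holding variable-to-value pairs) and the function name are assumptions for illustration; real data files would first need to be read into a structure like this.

```python
def find_discrepancies(entry_a, entry_b):
    """Return (case_id, variable, value_a, value_b) for every mismatch between two entries."""
    mismatches = []
    for case_id in sorted(set(entry_a) | set(entry_b)):
        row_a = entry_a.get(case_id, {})
        row_b = entry_b.get(case_id, {})
        for variable in sorted(set(row_a) | set(row_b)):
            if row_a.get(variable) != row_b.get(variable):
                mismatches.append((case_id, variable, row_a.get(variable), row_b.get(variable)))
    return mismatches

# Hypothetical files keyed in by two different people
file_a = {1: {"q1": 4, "q2": 2}, 2: {"q1": 5, "q2": 3}}
file_b = {1: {"q1": 4, "q2": 2}, 2: {"q1": 3, "q2": 3}}
discrepancies = find_discrepancies(file_a, file_b)
print(discrepancies)   # [(2, 'q1', 5, 3)]
```

Each flagged case would then be checked against the original data collection form to see which entry is correct.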
Prepare Dummy Tables
simply a table (or figure) used to show how the results of an analysis will be presented (there are no actual data because they haven't been collected yet)
Sample mean
simply the arithmetic mean value across all responses for a variable
Target Information
the basic information that addresses the subject of the study
Sampling Error
the difference between results obtained from a sample and results that would have been obtained had information been gathered from or about every member of the population -Decreases when sample size increases -Can be estimated if you have a probability sample -Less troublesome than other errors
Coding
the process of transforming raw data into symbols. It is the technical procedure by which data are categorized. Most often the symbols are numerals because computers can handle them easily
Question Order Bias
the tendency for earlier questions on a questionnaire to influence respondents' answers to later questions
Telescoping error
the tendency to remember an event as having occurred more recently than it did - error gets worse when respondents are asked to consider shorter time periods
Avoid Double-Barreled Questions
two questions rolled into one, leading to confusion for respondents
Formula for Regression
$y_i = \hat{y}_i + e_i$, where $\hat{y}_i = \alpha + \beta x_i$. Beta ($\beta$) = slope; Alpha ($\alpha$) = y-intercept
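A minimal sketch of fitting a simple regression by ordinary least squares, recovering alpha (intercept), beta (slope), the residuals e, and R2. The data are hypothetical and the helper name is an assumption for the example.

```python
def fit_simple_regression(x, y):
    """Least-squares estimates: beta = Sxy / Sxx, alpha = mean(y) - beta * mean(x)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta

x = [1, 2, 3, 4, 5]             # hypothetical predictor values
y = [2, 4, 5, 4, 5]             # hypothetical outcome values
alpha, beta = fit_simple_regression(x, y)

fitted = [alpha + beta * xi for xi in x]               # predicted values (y-hat)
residuals = [yi - fi for yi, fi in zip(y, fitted)]     # errors e = y - y-hat
sse = sum(e ** 2 for e in residuals)                   # unexplained variation
sst = sum((yi - sum(y) / len(y)) ** 2 for yi in y)     # total variation
r_squared = 1 - sse / sst                              # share of variation explained

print(round(alpha, 2), round(beta, 2), round(r_squared, 2))   # 2.2 0.6 0.6
```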
If the p-value is less than the significance level established,
you can reject the null hypothesis and tentatively accept the alternative hypothesis