data analysis and statistics
percentile
(statistics) any of the 99 numbered points that divide an ordered set of scores into 100 parts each of which contains one-hundredth of the total
histogram
A bar graph depicting a frequency distribution. The height of the bars indicates the frequency of a group of scores.
symmetrical distribution
A distribution where the left-hand side is a mirror image of the right-hand side
variable
A factor in the experiment that can be changed
percentage distribution
A frequency distribution organized into a table (or graph) that summarizes percentage values associated with particular values of a variable.
research process
A general sequence of steps that can be followed when designing and conducting research.
time-series chart
A graph that displays changes in a variable at different points in time. It shows time on the horizontal axis and the frequencies or percentages of a variable (nominal, ordinal, or interval-ratio) on the vertical axis.
line graph
A graph that uses points connected by lines to show how something changes in value
population
The total set of individuals, objects, groups, or events in which the researcher is interested.
theory
A hypothesis that has been tested with a significant amount of data.
nominal measurement
A measure for which different scores represent different, but not ordered, categories
mean
A measure of center in a set of numerical data, computed by adding the values in a list and then dividing by the number of values in the list.
median
A measure of central tendency for a distribution, represented by the score that separates the upper half of the scores in a distribution from the lower half
percentage
A ratio (i.e., a proportion) formed by combining the same dimensional quantities, such as count (number/number) or time (duration/duration; latency/latency); expressed as a number of parts per 100; typically expressed as a ratio of the number of responses of a certain type per total number of responses (or opportunities or intervals in which a response could have occurred). Presents a proportional quantity per 100 (Source: CHH, 2 Ed).
rate
A ratio that compares two quantities measured in different units
sample
A segment of the population selected for marketing research to represent the population as a whole
bivariate analysis
A statistical method designed to detect and describe the relationship between two variables
hypothesis
A testable prediction, often implied by a theory
proportion
A relative frequency obtained by dividing the frequency in each category by the total number of cases.
descriptive variable
Characteristics that describe the sample and provide a composite picture of the subjects of the study; they are not manipulated or controlled by the researcher.
statistics
Collection of methods for planning experiments, obtaining data, organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on data.
negatively skewed distribution
Contains a preponderance of scores on the high end of the scale, and moves the mean down on the scale.
positively skewed distribution
Contains a preponderance of scores on the low end of the scale, and moves the mean higher up on the scale.
ordinal measurement
Describes a variable whose attributes may be rank-ordered
data
Facts and statistics collected together for reference or analysis
inferential statistics
Help researchers decide how confident they can be in judging that the results observed are not due to chance.
mode
Measure of central tendency that uses the most frequently occurring score.
interval-ratio measurement
Measurements for all cases are expressed in the same units. Examples: age, income, SAT scores. Applicable statistics: mode, median, mean, range, IQV, IQR, variance, and standard deviation. The median is preferred if the distribution is skewed; the mean is preferred if the distribution is symmetrical.
empirical research
Research that operates from the ideological position that questions about human behavior can be answered only through controlled, systematic observations in the real world
unit of analysis
The entity or element that is being studied in market research (e.g., individual, household, etc.)
independent variable
The experimental factor that is manipulated; the variable whose effect is being studied.
dependent variable
The outcome factor; the variable that may change in response to manipulations of the independent variable
marginals
The row and column totals in a bivariate table
interquartile range
The width of the middle 50% of the distribution. It is defined as the difference between the lower and upper quartiles (Q1 and Q3)
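A minimal sketch of the interquartile range, assuming Python's standard `statistics.quantiles` with its default ("exclusive") quartile convention. Textbooks use several slightly different quartile rules, so the exact Q1 and Q3 may differ from a hand calculation. The scores are made up for illustration.

```python
import statistics

# Hypothetical ordered scores, invented for this illustration.
scores = [1, 2, 3, 4, 5, 6, 7, 8]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), and Q3.
q1, q2, q3 = statistics.quantiles(scores, n=4)

# The IQR is the width of the middle 50% of the distribution: Q3 minus Q1.
iqr = q3 - q1
print(q1, q3, iqr)
```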
normal distribution
a bell-shaped and symmetrical theoretical distribution with the mean, the median, and the mode all coinciding at its peak and with the frequencies gradually decreasing at both ends of the curve.
negative relationship
a bivariate relationship between two variables measured at the ordinal level or higher in which the variables vary in opposite directions
positive relationship
a bivariate relationship between two variables measured at the ordinal level or higher in which the variables vary in the same direction
direct causal relationship
a bivariate relationship that cannot be accounted for by other theoretically relevant variables
pie chart
a circular chart divided into wedge-shaped sectors whose areas are proportional to the percentages of the whole
intervening variable
a control variable that follows an independent variable but precedes the dependent variable in a causal sequence
dichotomous variable
a discrete variable that has only two possible amounts or categories
cumulative frequency distribution
a distribution showing the frequency at or below each category (class interval or score) of the variable
cumulative percentage distribution
a distribution showing the percentage at or below each category (class interval or score) of the variable
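A short sketch of how cumulative frequencies and cumulative percentages can be built from an ordinary frequency distribution. The four category frequencies are hypothetical, and the categories are assumed to be listed from lowest to highest.

```python
from itertools import accumulate

# Hypothetical frequencies for four ordered categories of one variable.
frequencies = [5, 15, 20, 10]
total = sum(frequencies)

# Running totals: the frequency at or below each category.
cumulative_freq = list(accumulate(frequencies))

# The same running totals expressed as percentages of all cases.
cumulative_pct = [100 * c / total for c in cumulative_freq]

print(cumulative_freq)
print(cumulative_pct)
```

The last cumulative frequency always equals the total number of cases, and the last cumulative percentage is always 100.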
skewed distribution
a distribution where the majority of the values fall either to the left or the right when graphed and the data is then spread out with a small number of values in the opposite direction
parameter
a measure (e.g., mean or standard deviation) used to describe the population distribution
asymmetrical measure of association
a measure of association whose value may vary depending on which variable is considered the independent variable and which the dependent variable
index of qualitative variation
a measure of variability for nominal variables. It is based on the ratio of the total number of differences in the distribution to the maximum number of possible differences within the same distribution
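A sketch of the index of qualitative variation, assuming the common computational form IQV = K(N² − Σf²) / (N²(K − 1)), where K is the number of categories, N the number of cases, and f each category frequency. That formula is an assumption brought in for illustration; the definition above does not state it. The function name `iqv` is also hypothetical.

```python
def iqv(frequencies):
    """Index of qualitative variation from a list of category frequencies.

    Assumes at least two categories and at least one case.
    """
    k = len(frequencies)                       # number of categories (K)
    n = sum(frequencies)                       # total number of cases (N)
    sum_sq = sum(f * f for f in frequencies)   # sum of squared frequencies
    return k * (n * n - sum_sq) / (n * n * (k - 1))

# Cases spread evenly over the categories: maximum variation.
print(iqv([10, 10, 10]))
# Every case in one category: no variation.
print(iqv([30, 0, 0]))
```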
standard deviation
a measure of variation for interval-ratio variables; it is equal to the square root of the variance
variance
a measure of variation for interval-ratio variables; it is the average of the squared deviations from the mean
range
a measure of variation for interval-ratio variables. It is the difference between the highest (maximum) and the lowest (minimum) scores in the distribution
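The three measures of variation just defined can be sketched with Python's standard library. The scores are hypothetical. `pvariance` and `pstdev` divide by N, matching the "average of the squared deviations" wording; many textbooks divide by N − 1 for sample data, which `statistics.variance` and `statistics.stdev` do instead.

```python
import statistics

# Hypothetical interval-ratio scores; their mean is 5.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: average squared deviation from the mean (dividing by N).
variance = statistics.pvariance(scores)

# Standard deviation: square root of the variance.
std_dev = statistics.pstdev(scores)

print(score_range, variance, std_dev)
```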
systematic random sampling
a method of sampling in which every Kth member (K is a ratio obtained by dividing the population size by the desired sample size) in the total population is chosen for inclusion in the sample after the first member of the sample is selected at random from among the first K members in the population
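A minimal sketch of systematic random sampling as described above. The function name `systematic_sample` and its parameters are hypothetical, and the sketch assumes the sample size is no larger than the population. When the population size is not an exact multiple of the sample size, K is rounded down.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick a random start among the first K members, then every Kth member."""
    k = len(population) // sample_size   # sampling interval K
    start = rng.randrange(k)             # random position among the first K members
    return [population[start + i * k] for i in range(sample_size)]

# Hypothetical population of 100 numbered cases, sample of 10, so K = 10.
population = list(range(1, 101))
sample = systematic_sample(population, 10, random.Random(0))
print(sample)
```

Passing a seeded `random.Random` instance is only there to make the illustration repeatable; omit it to get a different random start each run.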
stratified random sample
a method of sampling obtained by 1) dividing the population into subgroups based on one or more variables central to our analysis and 2) then drawing a simple random sample from each of the subgroups
probability sampling
a method of sampling that enables the researcher to specify for each case in the population the probability of its inclusion in the sample
standard normal distribution
a normal distribution represented in standard (z) scores
left-tailed test
a one-tailed test in which the sample outcome is hypothesized to be at the left tail of the sampling distribution
right-tailed test
a one-tailed test in which the sample outcome is hypothesized to be at the right tail of the sampling distribution
elaboration
a process designed to further explore a bivariate relationship; it involves the introduction of control variables
estimation
a process whereby we select a random sample from a population and use a sample statistic to estimate a population parameter
confidence interval (interval estimate)
a range of values defined by the confidence level within which the population parameter is estimated to fall
spurious relationship
a relationship in which both the independent and dependent variables are influenced by a causally prior control variable, and there is no causal link between them. The relationship between the independent and dependent variables is said to be "explained away" by the control variable
intervening relationship
a relationship in which the control variable intervenes between the independent and dependent variables
conditional relationship
a relationship in which the control variable's effect on the dependent variable is conditional on its interaction with the independent variable. The relationship between the independent and dependent variables will change according to the different conditions of the control variable.
simple random sample
a sample designed in such a way as to ensure that 1) every member of the population has an equal chance of being chosen and 2) every combination of N members has an equal chance of being chosen
point estimate
a sample statistic used to estimate the exact value of a population parameter
statistic
a set of procedures used by social scientists to organize, summarize, and communicate information
measurement of central tendency
a single typical value that characterizes a distribution of values. The three types of such a measure are the mean, the median, and the mode.
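The three measures of central tendency can be compared on one small, hypothetical set of scores using Python's standard `statistics` module:

```python
import statistics

# Hypothetical scores with a few high values (a positively skewed set).
scores = [1, 2, 2, 3, 4, 7, 9]

mean = statistics.mean(scores)      # sum of the scores divided by their count
median = statistics.median(scores)  # middle score once the scores are ordered
mode = statistics.mode(scores)      # most frequently occurring score

print(mean, median, mode)
```

Here the mean is 4, the median is 3, and the mode is 2. The high scores pull the mean above the median, which is why the median is often preferred for skewed distributions.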
null hypothesis
a statement of "no difference," which contradicts the research hypothesis and is always expressed in terms of population parameters
research hypothesis
a statement reflecting the substantive hypothesis. It is always expressed in terms of population parameters, but its specific form varies from test to test.
gamma
a symmetrical measure of association suitable for use with ordinal variables or with dichotomous nominal variables. It can vary from 0.0 to +/-1 and provides us with an indication of the strength and direction of the association between the variables
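A sketch of gamma computed from a bivariate table, assuming the usual formula gamma = (Ns − Nd) / (Ns + Nd), where Ns counts pairs of cases ranked in the same order on both variables and Nd counts pairs ranked in opposite order. The formula and the function name `gamma` are assumptions for illustration; the definition above does not spell them out. Rows and columns must both run from the lowest to the highest category.

```python
def gamma(table):
    """Gamma for a table whose rows and columns are both in ascending order.

    Raises ZeroDivisionError if the table has no same-order or
    opposite-order pairs at all.
    """
    n_rows, n_cols = len(table), len(table[0])
    same = opposite = 0
    for i in range(n_rows):
        for j in range(n_cols):
            # Pair each cell with every cell in a lower row of the table.
            for i2 in range(i + 1, n_rows):
                for j2 in range(n_cols):
                    if j2 > j:      # higher on both variables: same order
                        same += table[i][j] * table[i2][j2]
                    elif j2 < j:    # higher on one, lower on the other
                        opposite += table[i][j] * table[i2][j2]
                    # j2 == j is a tie on the column variable and is ignored
    return (same - opposite) / (same + opposite)

# Hypothetical 2 x 2 table: Ns = 20 * 20 = 400, Nd = 10 * 10 = 100.
print(gamma([[20, 10], [10, 20]]))
```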
Kendall's tau-b
a symmetrical measure of association suitable for use with ordinal variables. Unlike gamma, it accounts for pairs tied on the independent and dependent variable. It can vary from 0.0 to +/-1. It provides an indication of the strength and direction of the association between the variables
frequency distribution
a table reporting the number of observations falling into each category of the variable
standard normal table
a table showing the area (as a proportion, which can be translated into a percentage) under the standard normal curve corresponding to any Z score or its fraction
bivariate table
a table that displays the distribution of one variable across the categories of another variable
cross-tabulation
a technique for analyzing the relationship between two variables that have been organized in a table
sampling distribution
a theoretical probability distribution of all possible sample values for the statistics in which we are interested.
sampling distribution of the difference between means
a theoretical probability distribution that would be obtained by calculating all the possible mean differences from all the possible independent random samples of size N1 and N2 drawn from two populations, where N1 and N2 are each greater than 50
sampling distribution of the mean
a theoretical probability distribution of sample means that would be obtained by drawing from the population all possible samples of the same size
bar graph
a type of graph in which the lengths of bars are used to represent and compare data in categories
one-tailed test
a type of hypothesis test that involves a directional hypothesis. It specifies that the values of one group are either larger or smaller than some specified population value.
column variable
a variable whose categories are the columns of a bivariate table
row variable
a variable whose categories are the rows of a bivariate table
control variable
an additional variable considered in a bivariate relationship. The variable is controlled for when we take into account its effect on the variables in the bivariate relationship
lambda
an asymmetrical measure of association, lambda is suitable for use with nominal variables and may range from 0.0 to 1.0. It provides an indication of the strength of an association between the independent and dependent variables.
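A sketch of lambda, assuming the usual proportional-reduction-of-error formula lambda = (E1 − E2) / E1. E1 is the number of prediction errors made when guessing the modal category of the dependent variable for every case; E2 is the number of errors made when guessing the modal category within each category of the independent variable. The formula, the table layout (dependent variable in rows, independent variable in columns), and the name `lambda_measure` are assumptions for illustration.

```python
def lambda_measure(table):
    """Lambda for a table with dependent-variable rows and
    independent-variable columns."""
    n = sum(sum(row) for row in table)

    # E1: errors when predicting the overall modal row for every case.
    e1 = n - max(sum(row) for row in table)

    # E2: errors when predicting the modal row separately within each column.
    columns = list(zip(*table))
    e2 = sum(sum(col) - max(col) for col in columns)

    return (e1 - e2) / e1

# Hypothetical table: E1 = 100 - 60 = 40, E2 = 20 + 10 = 30.
print(lambda_measure([[30, 10], [20, 40]]))
```

Swapping which variable is placed in the rows can change the result, which is what makes the measure asymmetrical.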
chi-square test
an inferential statistics technique designed to test for a significant relationship between two variables organized in a bivariate table
partial tables
bivariate tables that display the relationship between the independent and dependent variables while controlling for a third variable
central limit theorem
if all possible random samples of size N are drawn from a population with a mean $\mu_Y$ and a standard deviation $\sigma_Y$, then as N becomes larger, the sampling distribution of sample means becomes approximately normal, with mean $\mu_{\bar{Y}} = \mu_Y$ and standard deviation $\sigma_{\bar{Y}} = \sigma_Y / \sqrt{N}$
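A small simulation can illustrate the theorem. This sketch is only an approximation: it draws 2,000 random samples rather than all possible samples, and it assumes a uniform population on [0, 1), which has mean 0.5 and standard deviation equal to the square root of 1/12. The seed, sample size, and number of samples are arbitrary choices made for the illustration.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
sample_size = 25          # N in the theorem
num_samples = 2000        # stand-in for "all possible samples"

# Assumed population: uniform on [0, 1).
pop_mean = 0.5
pop_sd = math.sqrt(1 / 12)

# Draw many samples of size N and record each sample mean.
sample_means = [
    statistics.fmean(rng.random() for _ in range(sample_size))
    for _ in range(num_samples)
]

# Center and spread of the simulated sampling distribution of the mean.
observed_mean = statistics.fmean(sample_means)
observed_se = statistics.pstdev(sample_means)

# Spread predicted by the theorem: population SD divided by sqrt(N).
predicted_se = pop_sd / math.sqrt(sample_size)

print(round(observed_mean, 3), round(observed_se, 4), round(predicted_se, 4))
```

The mean of the sample means should land close to the population mean, and their standard deviation should be close to the predicted value, even though the population itself is not normal.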
measures of variability
numbers that describe diversity or variability in the distribution
expected frequencies (Fe)
the cell frequencies that would be expected in a bivariate table if the two variables were statistically independent
sampling error
the discrepancy between a sample estimate of a population parameter and the real population parameter.
alpha
the level of probability at which the null hypothesis is rejected. It is customary to set alpha at the .05, .01, or .001 level
confidence level
the likelihood, expressed as a percentage or a probability, that a specified interval will contain the population parameter
degrees of freedom
the number of scores that are free to vary in calculating a statistic.
standard (Z) score
the number of standard deviations that a given raw score is above or below the mean
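As a quick illustration, a raw score is converted to a Z score by subtracting the mean and dividing by the standard deviation. The function name and the numbers below (a mean of 100 and a standard deviation of 15) are hypothetical.

```python
def z_score(raw_score, mean, std_dev):
    """Number of standard deviations a raw score lies above (+) or below (-) the mean."""
    return (raw_score - mean) / std_dev

# Hypothetical distribution with mean 100 and standard deviation 15.
print(z_score(130, 100, 15))   # two standard deviations above the mean
print(z_score(85, 100, 15))    # one standard deviation below the mean
```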
p value
the probability associated with the obtained value of Z
margin of error
the radius of a confidence interval
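A sketch of how a margin of error and the matching confidence interval for a mean might be computed, assuming a large sample so that the normal (Z) distribution applies and the standard deviation can be treated as known. The function name `margin_of_error` and the sample figures are hypothetical.

```python
import math
from statistics import NormalDist

def margin_of_error(std_dev, n, confidence=0.95):
    """Z multiplier for the confidence level times the standard error of the mean."""
    # Two-tailed critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * std_dev / math.sqrt(n)

# Hypothetical sample: mean 50, standard deviation 10, 100 cases.
sample_mean, std_dev, n = 50.0, 10.0, 100
moe = margin_of_error(std_dev, n)

# The confidence interval extends one margin of error on each side of the mean.
interval = (sample_mean - moe, sample_mean + moe)
print(round(moe, 2), interval)
```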
partial relationship
the relationship between the independent and dependent variables shown in a partial table
proportionate stratified sample
the size of the sample selected from each subgroup is proportional to the size of that subgroup in the entire population
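A minimal sketch of proportionate allocation: each subgroup contributes a share of the sample equal to its share of the population, and a simple random sample of that size is drawn within the subgroup. The function name, the strata labels, and the sizes are hypothetical. Because shares are rounded, the total can differ slightly from the requested sample size when the proportions do not divide evenly.

```python
import random

def proportionate_stratified_sample(strata, sample_size, rng=random):
    """Draw a simple random sample from each stratum, sized by population share."""
    total = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Stratum's share of the sample matches its share of the population.
        share = round(sample_size * len(members) / total)
        sample[name] = rng.sample(members, share)
    return sample

# Hypothetical population: 600 urban and 400 rural cases, sample of 50.
strata = {
    "urban": list(range(600)),
    "rural": list(range(600, 1000)),
}
sample = proportionate_stratified_sample(strata, 50, random.Random(1))
print({name: len(cases) for name, cases in sample.items()})
```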
disproportionate stratified sample
the size of the sample selected from each subgroup is disproportional to the size of that subgroup in the population
standard error of the mean
the standard deviation of the sampling distribution of the mean. It describes how much dispersion there is in the sampling distribution of the mean.
chi-square (obtained)
the test statistic that summarizes the differences between the observed (fo) and the expected (fe) frequencies in a bivariate table
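A sketch of the obtained chi-square for a bivariate table, assuming the standard formulas: each expected frequency is fe = (row marginal × column marginal) / N, and chi-square is the sum over all cells of (fo − fe)² / fe. These formulas are not stated in the definitions above and are brought in as assumptions. The function name and the table are hypothetical.

```python
def chi_square(observed):
    """Obtained chi-square for a table of observed frequencies (list of rows)."""
    row_totals = [sum(row) for row in observed]          # row marginals
    col_totals = [sum(col) for col in zip(*observed)]    # column marginals
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, fo in enumerate(row):
            # Frequency expected if the two variables were independent.
            fe = row_totals[i] * col_totals[j] / n
            chi2 += (fo - fe) ** 2 / fe
    return chi2

# Hypothetical 2 x 2 table: every expected frequency is 25.
print(chi_square([[30, 20], [20, 30]]))
```

For this table each cell differs from its expected frequency by 5, so each contributes 25 / 25 = 1 and the obtained chi-square is 4.0.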