Chapter 12 - Analyzing Data

- estimation of parameters - hypothesis testing

Inferential statistical procedures are divided into two types

positive correlation

Correlation in which high scores for one variable are paired with high scores for the other variable, or low scores for one variable are paired with low scores for the other variable.

negative correlation

Correlation in which high scores for one variable are paired with low scores for the other variable.

probability

Likelihood that an event will occur, given all possible outcomes

negative; positive

Raw scores below the mean have a _____ deviation, whereas raw scores above the mean have a ______ deviation

Source (source of variation), SS (sums of squares), df (degrees of freedom), MS (mean square), and F (F-statistic)

5 columns in ANOVA summary

negative

A (positive/ negative) correlation reflects an inverse relationship between two variables.

central tendency; dispersion

A ________ descriptive stat will approximate the location of the center of the data while a ________ descriptive stat will depict the variability or spread

Pearson Product-Moment Correlation

A common correlational technique researchers use to examine the relationship between two variables using ratio/interval data
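
A minimal sketch of how the Pearson coefficient can be computed from deviation scores. The function name `pearson_r` and the example scores are made up for illustration; they are not from the chapter.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of the deviation scores
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)

hours = [1, 2, 3, 4, 5]          # hypothetical interval/ratio data
rising = [2, 4, 6, 8, 10]        # high paired with high
falling = [10, 8, 6, 4, 2]       # high paired with low

r_positive = pearson_r(hours, rising)    # perfect positive relationship
r_negative = pearson_r(hours, falling)   # perfect inverse relationship
```

The two results have the same strength (absolute value) and differ only in direction.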

skewed distribution

A distribution of scores with a few outlying observations in either direction.

mean

A measure of central tendency (aka descriptive statistic) that is calculated by adding up all of the scores and dividing the sum by the number of scores

mean

A measure of central tendency calculated by summing a set of scores and dividing the sum by the total number of scores

correlation

A measure that defines the relationship between two variables

level of confidence

A probability level in which the null hypothesis can be rejected with confidence and the research hypothesis can be accepted with confidence

confidence interval

A range of values that has some specified probability (e.g., 0.95 or 0.99) of including a particular population parameter.

correlation matrix

A specialized form of a correlation table that presents all possible combinations of the study variables

Analysis of variance (ANOVA)

An inferential procedure used to determine whether there is a significant difference among three or more group means

- The dependent variable being measured on an interval/ratio scale that is normally distributed in the population - Groups being mutually exclusive (independent of each other)

General assumptions associated with parametric procedures include

less

If a number of samples are randomly drawn from a target population, the mean varies (more/ less) compared with the median and mode.

larger

If the between-group variance is (smaller/larger) than the within-group variance, scores in the different groups are far enough apart for a researcher to conclude that the group means are significantly different

smaller than or about the same as

If the between-group variance is (smaller/larger) the within-group variance, the group means are not significantly different.

assumptions associated with a t test include those associated with parametric procedures: - dependent variable being measured on an interval/ratio scale that is normally distributed in the population - groups being independent of each other AND - both groups have similar variances

Assumptions with a t test

homogeneity of variance

Situation in which the variances of the dependent variable do not differ significantly between or among groups.

descriptive statistics

Statistics that describe and summarize data.

inferential statistics

Statistics that generalize findings from a sample to a population.

- arrange all scores in rank order - odd # of scores = the middle score is the median - even # of scores = add the two middle scores and divide by two

Steps in order to calculate the median
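
The steps above can be sketched in a few lines. The function name `median` and the sample scores are illustrative only.

```python
def median(scores):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(scores)      # step 1: arrange scores in rank order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                # odd count: the middle score is the median
        return ordered[mid]
    # even count: add the two middle scores and divide by two
    return (ordered[mid - 1] + ordered[mid]) / 2

odd_result = median([7, 3, 9, 1, 5])   # ranked: 1 3 5 7 9
even_result = median([7, 3, 9, 1])     # ranked: 1 3 7 9
```

With an even number of scores the result (here 5.0, halfway between 3 and 7) does not have to be one of the scores in the distribution.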

true

T/F Distribution of scores can have more than one mode

TRUE! (if there is an even number of scores in the data set, you have to add the two middle scores and divide by two)

T/F The median is not necessarily one of the scores in the distribution.

FALSE!

T/F The median is sensitive to extreme scores

false (it is based on only two values in the distribution)

T/F The range is a stable measurement

true!

T/F The range is extremely sensitive to outliers

false

T/F The standard deviation (SD) is not sensitive to outliers

true

T/F The standard deviation (SD) takes into account all of the scores in the distribution

x̄ ; ∑

The symbol that represents the mean is ______ the symbol to denote sum of values is _____

median

To describe a skewed distribution, the researcher chooses the (median/mode/mean) to give a balanced picture of the extreme scores or outliers

measures of central tendency and measures of dispersion

Two groups of procedures that are found within descriptive statistics

descriptive; inferential

Two groups of statistical procedures that can be applied to a study to describe, analyze, and interpret quantitative data

two-way ANOVA

Type of ANOVA in which there are two independent variables and several levels

one-way ANOVA

Type of ANOVA in which there is one independent variable with several levels

nonparametric

Type of statistical test that makes no assumptions about the shape of the distribution and is referred to as a "distribution-free" test.

- range - variance - standard deviation - percentile - interquartile range

Types of Measures of dispersion

- mean - median - mode

Types of measures of central tendency that describe the location or approximate the center of a distribution of data

deviated scores

What are variance and standard deviation based on?

both groups must have similar variances with respect to the dependent variable

What is homogeneity of variance with t test?

always somewhere in between the mode and the mean

Where is the median located for a skewed distribution?

the larger the F stat, the greater the probability of finding a statistically significant difference among group means

How does the F stat relate to the probability of finding a statistically significant difference

Mean square of between-group variance / mean square of within-group variance

How is the F value calculated
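
A rough sketch of where the SS, df, MS, and F values in an ANOVA summary come from, with F taken as the between-group mean square divided by the within-group mean square. The function name `one_way_anova` and the three small groups are hypothetical.

```python
from statistics import mean

def one_way_anova(groups):
    """Return SS, df, MS and F for a list of independent groups of scores."""
    all_scores = [x for g in groups for x in g]
    grand_mean = mean(all_scores)
    group_means = [mean(g) for g in groups]

    # Between-group: how far each group mean sits from the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group: how far each score sits from its own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "df_between": df_between, "df_within": df_within,
        "ms_between": ms_between, "ms_within": ms_within,
        "F": ms_between / ms_within,
    }

summary = one_way_anova([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

For these made-up scores the group means (2, 5, 8) are far apart relative to the spread inside each group, so F is large (27). Whether an F value is statistically significant still depends on the degrees of freedom and the chosen probability level.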

mean is "pulled" in the direction of those extreme values.

How is the mean affected by outliers (a data point isolated from other data points)?
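
A small illustration with made-up scores, using Python's standard `statistics` module, of how a single outlier pulls the mean while the median barely moves.

```python
from statistics import mean, median

scores = [10, 12, 13, 14, 16]
with_outlier = scores + [95]      # one extreme high score added

mean_before = mean(scores)              # 13
median_before = median(scores)          # 13
mean_after = mean(with_outlier)         # pulled toward the outlier
median_after = median(with_outlier)     # 13.5
```

The mean jumps from 13 to about 26.7, while the median shifts only from 13 to 13.5.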

examined to determine whether between-group variance is greater than within-group variance

How is variation examined in ANOVA?

on a nominal scale

How would an IV be measured in ANOVA?

separate t test (takes into account the fact that the variances are not equal)

T test that is used if a violation of the assumption of homogeneity of variance exists and the variances are significantly different

true

T/F Because the mode requires only a frequency count, it can be applied to any set of data at the nominal, ordinal, or interval-ratio level of measurement.

true

T/F It is not uncommon for researchers to use parametric procedures for data that are measured on an ordinal scale or data that violate the assumption of normality.

true

T/F Range does not take into account variations in scores between extremes

true (take absolute value)

T/F The direction of the relationship does not affect the strength of the relationship

true

T/F The variance takes into account all of the scores in the distribution

true

T/F The variance is sensitive to outliers

false (it's not based on ranks; nominal is things like marital status, religious affiliation)

T/F You should apply the median to a nominal set of data

mode

The (median/mode/mean) is always used in nominal data

range

The (range, SD, variance) although quick and simple to obtain, is not very reliable

SD

The (range, SD, variance) squares the deviated scores and returns them to their original units of measure

research design; type of data collected

The ______ and _______ determine selection of appropriate statistical procedures

estimation of parameters

Type of inferential stat that is based on data collected from a study sample, evaluated by a single number or an interval

inferential statistics

Type of statistic that focuses on the process of selecting a sample and using the information to make generalizations to a population

dependent/matched t test

Type of t test that tests for situations in which scores in the first group can be paired with a score in the second group and the means of the two related groups are compared

dependent/ matched t test

Type of t test used in pretest-posttest design, in which a single group of subjects is measured twice
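
A minimal sketch of the dependent (matched) t statistic for a pretest-posttest design, computed from the paired difference scores. The function name `paired_t` and the scores are hypothetical; `stdev` is the sample standard deviation from Python's standard library.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(pretest, posttest):
    """Return (t, df) for two sets of scores from the same subjects."""
    # One difference score per subject: posttest minus pretest
    diffs = [post - pre for pre, post in zip(pretest, posttest)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)
    t = mean(diffs) / standard_error
    return t, n - 1               # df = number of pairs minus one

pre = [10, 12, 9, 11]             # made-up pretest scores
post = [12, 15, 10, 15]           # same four subjects measured again
t_value, df = paired_t(pre, post)
```

The t value and df would then be compared against a critical value at the chosen probability level.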

accuracy of a statistic and test a hypothesis

What does probability help to evaluate?

how severely the assumptions are broken

What does the degree of risk for a robust procedure depend on?

Is the difference between the two sample means large enough to conclude that the population means are different from one another?

What does the statistical question become with a T test?

allows researchers to make inferences about the larger population; normally can't sample everyone in a population

What does the use of inferential stats do for a researcher? Why is this important?

the average squared deviation from the mean

What does variance mean mathematically?

symmetrical about the mean and is unimodal

What is a normal curve?

probability

What is central to hypothesis testing?

you have to have numbers! Otherwise you can't add or divide

What is needed to calculate the mean?

OUTLIERS! Sensitive to extreme values

What is the mean of a data set extremely sensitive to?

answer research questions and/or test hypotheses

What is the purpose of data analysis?

relative positions remain constant in moving away from the peak toward the tail (i.e. mode towards the peak, median ALWAYS in the middle, and mean towards the tail)

What remains constant regarding the mean, median, and mode for skewed data?

in experimental and quasi-experimental designs

When are t test commonly used?

nominal or ordinal data

When is a nonparametric statistical test usually used?

when both variables are ratio/interval measures

When is Pearson Product-Moment Correlation used?

nominal data

When is mode an appropriate measure of central tendency to use?

when the data represent either an interval or ratio scale

When is the mean the preferred measure of central tendency to use?

nominal

What sort of data does chi-square analysis use?

measures of central tendency

Subtype of descriptive statistics that describes the location or approximates the center of a distribution of data

Point estimation

Type of estimation of parameter in which the estimate consists of a single value

inferential statistics

Group of statistical procedures that will make generalizations about populations based on data collected from samples

If randomization is used

How can independence be met for parametric procedures?

Each deviated score is squared, so that all numbers become positive

How do researchers stop the deviation sum from being zero?

off-centered peaks and longer tails in one direction

How does a Skewed Distribution look?

outliers

A data point isolated from other data points

symmetrical distribution

A distribution of scores in which the mean, median, and mode are all the same.

median (50th percentile)

A measure of central tendency that is the middle score or midpoint of a distribution

range

A measure of variability that is the difference between the lowest and highest values in a distribution.

range

A measurement of dispersion that is the simplest. It is calculated by subtracting the lowest score in the distribution from the highest score

chi-square analysis

A nonparametric procedure used to assess whether a relationship exists between two nominal-level variables; symbolized as χ²

parameter

A numerical characteristic of a population (e.g., population mean, population standard deviation)

population; sample

A parameter is a characteristic of a ____, whereas a statistic is a characteristic of a _____.

Analysis of Variance (ANOVA)

A parametric procedure used to test whether there is a difference among three or more group means.

t test

A popular parametric procedure for assessing whether two group means are significantly different from one another.

robust procedure/statistic

A statistical procedure that is appropriate even when its assumptions have been violated

F stat

A three- or four-digit number that indicates the size of the difference between the groups relative to the size of the variation within each group.

t-test

An inferential statistical procedure used to determine whether the means of two groups are significantly different

close to the low score values

Because the mean lies closest to the tail in a skewed distribution, the mean score in negatively skewed distributions lies toward what score values?

the high score values

Because the mean lies closest to the tail in a skewed distribution, the mean score in positively skewed distributions lies toward what score values?

NO, only for symmetric data

Can you use SD as a measure of dispersion for skewed data?

symmetrical distribution

Data distribution in which the mean, median, and mode are all the same

skewed distribution

Data distribution that has outlying observations in either direction, above or below the mean

outlier

Data point isolated from other data points; extreme score in a data set

unimodal; bimodal

Data with a single mode are called _____; data with two modes, _____
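
A quick illustration using `statistics.multimode` from Python's standard library (available in Python 3.8 and later). The data are made up. Because the mode needs only a frequency count, it also works on nominal categories.

```python
from statistics import multimode

unimodal_scores = [1, 2, 2, 3]
bimodal_scores = [1, 1, 2, 3, 3]
marital_status = ["single", "married", "married", "widowed"]  # nominal data

modes_one = multimode(unimodal_scores)     # [2]
modes_two = multimode(bimodal_scores)      # [1, 3]
modes_nominal = multimode(marital_status)  # ['married']
```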

measures of dispersion

Descriptive statistics that depict the spread or variability among a set of numerical data.

measures of central tendency

Descriptive statistics that describe the location or approximate center of a distribution of data.

variance

Despite its relative stability, the (range, SD, variance) is not widely used because it cannot be employed in many statistical analyses.

degree of freedom

Element that is often reported with a t test that represents the freedom of a score's value to vary given what is known about the other scores and the sum of scores

Descriptive statistics

Group of statistical procedures that will describe, organize, and summarize data

variance

Measure of variability, which is the average squared deviation from the mean.

parameter

Numerical characteristic of a population (e.g., population mean, population standard deviation).

takes into account each score in the distribution

One major characteristic of the mean

it DOES NOT take into account each score in the distribution

One major characteristic of the median

level of confidence

Probability level in which the research hypothesis is accepted with confidence. A 0.05 level of confidence is the standard among researchers.

robust

Referring to results from statistical analyses that are close to being valid even though the researcher does not rigidly adhere to assumptions associated with parametric procedures.

mode

Regardless of the purpose of the study, the researcher uses the (median/mode/mean) as a preliminary indicator of central tendency

level of measurement, shape, form of the distribution

Researchers choose a measure of central tendency depending on the ________, __________, or __________ of data and research objective.

Measures of dispersion

Subtype of Descriptive statistics that depict the spread or variability among a set of numerical data

0

The closer the coefficient is to ___, the lower or weaker the correlation

-1 or +1

The closer the coefficient is to ____, the higher or stronger the correlation.

deviated score

The difference between a raw score and the mean of that distribution (deviated score = X - x̄)

number of groups being compared (ANOVA is 3 or more; t-test is 2)

The difference between the t-test and ANOVA

between-group variance

The distance among group means

probability (p)

The likelihood that an event will occur, given all possible outcomes

validity

The magnitude of p does not indicate the amount of _____ associated with each research hypothesis

less powerful than parametric tests

The major drawback to nonparametric tests

do not

The mean, median, and mode (do /do not) coincide in skewed distributions

do

The mean, median, and mode (do /do not) coincide in symmetrical distributions

mode

The measure of central tendency that is the most frequently occurring score in a distribution

median (because it always falls between the mean and the mode)

The most appropriate measure of central tendency for describing a skewed distribution

standard deviation (SD)

The most commonly reported measure of dispersion and the most stable

chi-square

The most commonly reported nonparametric statistic that compares the frequency of an observed occurrence (actual number in each category) with the frequency of an expected occurrence (based on theory or past experience).
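
A minimal sketch of the comparison described above: the chi-square statistic sums (observed - expected)² / expected over the categories. The function name `chi_square` and the frequencies are invented for illustration.

```python
def chi_square(observed, expected):
    """Chi-square statistic from matching lists of category frequencies."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [30, 20]   # actual number counted in each category (made up)
expected = [25, 25]   # number expected from theory or past experience
statistic = chi_square(observed, expected)   # 25/25 + 25/25 = 2.0
```

This gives only the statistic; judging significance still requires the degrees of freedom and a critical value at the chosen probability level.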

standard deviation (SD)

The most frequently used measure of variability; the distance a score varies from the mean.

single value in the F column

The most important number in the ANOVA summary column

MEAN !

The most stable/ reliable measure of central tendency and is the most commonly used

-1.0 to +1.0

The range for correlation coefficient

mode

The score or value that occurs most frequently in a distribution; a measure of central tendency used most often with nominal-level data.

hypothesis testing

The second type of inferential statistical procedure, which involves stating a hypothesis concerning the parameter, sampling the population of interest, and making objective decisions about the sample results of the study.

standard deviation

The square root of the variance

0

The sum of the deviated scores in a distribution always equals...

variance

The sum of the squared deviations divided by the number of scores
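
A short worked example with made-up scores tying the deviation, variance, and standard deviation cards together. It divides by the number of scores, as the definition above states; many texts divide by n - 1 when estimating from a sample.

```python
from math import sqrt

scores = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(scores)
mean_score = sum(scores) / n                      # 40 / 8 = 5.0

# Deviated score: raw score minus the mean
deviations = [x - mean_score for x in scores]
deviation_sum = sum(deviations)                   # always 0

# Squaring makes every deviation positive before averaging
variance = sum(d ** 2 for d in deviations) / n    # 32 / 8 = 4.0

# The square root returns the measure to the original units
sd = sqrt(variance)                               # 2.0
```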

parametric statistical test

Type of statistical test that requires assumptions to be met for statistical findings to be valid

independent t test

Type of T test that is used when scores in one group have no logical relationship with scores in the other group.
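
A rough sketch of the independent t statistic using a pooled variance, which assumes the two groups have similar variances (homogeneity of variance). The function name `independent_t` and the scores are hypothetical; `variance` here is the sample variance from Python's standard library.

```python
from math import sqrt
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, df) for two unrelated groups, assuming similar variances."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled = ((n_a - 1) * variance(group_a)
              + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled * (1 / n_a + 1 / n_b))
    t = (mean(group_a) - mean(group_b)) / standard_error
    return t, df

control = [1, 2, 3]       # made-up scores
treatment = [4, 5, 6]
t_value, df = independent_t(control, treatment)
```

The sign of t only reflects which group mean was subtracted from which.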

negative correlation (0 to -1)

Type of correlation in which low scores on one variable are paired with high scores on the other variable

positive correlation (0 to +1)

Type of correlation which shows that high scores on one variable are paired with high scores on the other variable

interval estimate

Type of estimation of parameter in which the estimate consists of an interval; need to specify the amount of confidence

when the data have a skewed distribution, meaning there is an outlier in either direction (because the median does not take the outliers that skew the data into account as much as the other measures would)

When is the median an appropriate measure of central tendency to use?

level of confidence

When looking at probability, what must be established in order to determine if the outcome is statistically significant?

closest to the tail of the curve (where the relatively few extreme scores are located; the mean is an average and the average will be thrown off accordingly by the outlier)

Where is the mean located for a skewed distribution?

closest to the peak of the curve (this is where the most frequent scores are found and the mode is a measure of the most frequent number)

Where is the mode located for a skewed distribution?

median

Which measure of central tendency is an ordinal statistic based on ranks?

Used to assess whether a relationship exists between two categorical variables (used for variables that are on a nominal scale)

Why is chi-square used?

it takes into account every single score in the distribution

Why is the mean a more precise and stable measure than the median or mode?

Statistics

a descriptive index calculated from data as an estimate of a population parameter

