CHAPTER 9 Descriptive Statistics, Significance Levels, and Hypothesis Testing

mode

1. score that appears most often in a dataset 2. a distribution is bimodal or multimodal when more than one score has the largest frequency of occurrence 3. most distributions are bimodal or multimodal, which makes it impossible for the researcher to use the mode to represent the average in later calculations
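A minimal sketch of finding the mode in Python, using made-up scores. `statistics.multimode` returns every score tied for the highest frequency, so a bimodal dataset yields two values.

```python
from statistics import multimode

# Hypothetical scores, invented for illustration
unimodal_scores = [2, 3, 3, 4, 5]
bimodal_scores = [1, 2, 2, 3, 4, 4, 5]

unimodal_modes = multimode(unimodal_scores)  # one score occurs most often
bimodal_modes = multimode(bimodal_scores)    # two scores tie for most often

print(unimodal_modes)
print(bimodal_modes)
```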

probability level

significance level - established for each statistical test before the test is computed - the level of error the researcher is willing to accept - symbolized in written research reports as the letter p, or referred to as the alpha level

Two techniques hyp testing relies on

1. significance testing 2. sampling

range

simplest measure of dispersion 1. calculate by subtracting the lowest score from the highest score 2. used to report the high and low scores on questionnaires - crude measure of dispersion, because changing any values between the highest and lowest scores has no effect on it
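A short sketch of the range calculation, with invented scores. The second list changes every value between the lowest and highest score, and the range stays the same, which is why the range is called a crude measure. The helper name `score_range` is hypothetical.

```python
def score_range(scores):
    """Range = highest score minus lowest score."""
    return max(scores) - min(scores)

# Hypothetical scores; only the middle values differ between the two lists
original = [10, 12, 15, 18, 20]
changed_middle = [10, 19, 19, 19, 20]

range_original = score_range(original)
range_changed = score_range(changed_middle)

print(range_original, range_changed)
```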

medium or large datasets

spreadsheet program or a statistics program

Frequencies

the number of times a particular value of a variable occurs - often used to report on the occurrence of communication events - data are at the nominal level, because the researcher makes a decision for each occurrence: did this communication phenomenon occur or did it not? - ex: presidential elections
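One way to tally frequencies for nominal data is `collections.Counter`. The observations below are invented: each entry records whether a hypothetical communication event (an interruption) occurred in a speaking turn.

```python
from collections import Counter

# Hypothetical coding decisions, one per speaking turn
observations = ["yes", "no", "no", "yes", "no", "no"]

# Count how many times each value of the variable occurs
frequencies = Counter(observations)

print(frequencies["yes"], frequencies["no"])
```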

operationalized

the researcher must specify exactly what data were collected and how

If probability level of the statistical test is acceptable, or within the traditions of the discipline

then the findings are believed to be real, not random - inference can be presumed to be valid

descriptive statistics definition

1. those numbers that supply information about the sample or about the variables 2. simply describe what is found

population inference

accepting the conclusions derived from the sample and assuming that those conclusions are also applicable to the population

If sig level computed for the stat test is .05 or less

alternative hypothesis is accepted

descriptive statistics

1. another set of numbers computed from the dataset 2. convey essential basic information about each variable and the dataset as a whole 3. mean, standard deviation, range, and number of cases are commonly used to provide a summary interpretation of each variable 4. functions: describe (summarize) the data, provide information about the relationships between or among variables, and help researchers draw conclusions about a population by examining the data of the sample. This last use of numbers is known as inferential statistics.

Significance levels

- a criterion for accepting or rejecting hypotheses, based on probability

social significance

- achieving statistical significance does not guarantee social significance of the result 1. using a very large sample can create statistically significant differences that have little relevance in application 2. stat significance must always be interpreted with respect to the social and practical significance- or how the results might actually be applied or used in everyday life.
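A rough sketch of how sample size alone can produce statistical significance. It uses a two-sample z test with a known, shared standard deviation and equal group sizes; that test, the helper name `two_sample_z_p_value`, and all the numbers are assumptions chosen for illustration, not part of the chapter.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_p_value(mean_difference, sd, n_per_group):
    """Two-tailed p value for a difference between two group means.

    Assumes both groups share a known standard deviation and the same size.
    """
    standard_error = sd * sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Same tiny difference (0.1 points, sd = 5) tested with two sample sizes
p_small_sample = two_sample_z_p_value(0.1, 5, 100)
p_large_sample = two_sample_z_p_value(0.1, 5, 100_000)

print(p_small_sample > 0.05, p_large_sample < 0.05)
```

The difference between the groups is identical in both calls; only the number of cases changes, so a significant result from the large sample says nothing about whether a 0.1-point difference matters in practice.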

alternative hypothesis

- an assertion that states how researcher believes the variables are related or are different

skewed distributions

1. when a distribution of scores is not normal 2. one side is not a mirror image of the other 3. asymmetrical

positively skewed curve

1. represents a distribution in which there are very few scores on the right side of the distribution 2. very few very high scores 3. most of the scores are lumped together on the left side of the curve, below the mean

small dataset

calculator with sq root function

process inference

claim that the theory would likely work in similar situations - are the data consistent with the predictions the researcher drew from the theory? - inference based on the probability level computed for each statistical test

dataset

collection of the raw data

how to know which one is appropriate to use

compute the variability in the scores

skewness

1. degree to which the distribution of data is bunched to one side or the other 2. direct reflection of the variability, or dispersion, of the scores

Description of data

describe data for each quantitative variable in three ways: 1. the number of cases or data points 2. central tendency 3. dispersion or variability. Each of these descriptions provides information about the frequency of scores.

first step whenever collecting data

develop a frequency distribution for each variable in the dataset for which you have collected quantitative data

positively Skewed distribution mean

mean will typically be the largest value of the three measures of central tendency in a positively skewed distribution, because the few very high scores pull it upward

negatively skewed distribution mean

mean will typically be the smallest value of the three measures of central tendency in a negatively skewed distribution, because the few very low scores pull it downward
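A small check of the ordering of mean, median, and mode under skew, using invented scores. The negatively skewed list is simply the mirror image of the positively skewed one (each score subtracted from 16), which is an arbitrary choice for illustration.

```python
from statistics import mean, median, mode

# Hypothetical positively skewed scores: a few very high values on the right
positive_skew = [1, 2, 2, 2, 3, 3, 4, 10, 15]
# Mirror image: a few very low values on the left
negative_skew = [16 - score for score in positive_skew]

pos_mean, pos_median, pos_mode = (
    mean(positive_skew), median(positive_skew), mode(positive_skew))
neg_mean, neg_median, neg_mode = (
    mean(negative_skew), median(negative_skew), mode(negative_skew))

print(pos_mode, pos_median, round(pos_mean, 2))
print(round(neg_mean, 2), neg_median, neg_mode)
```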

if probability level is unacceptable

no conclusion can be drawn

raw data

numerical data collected from each participant compiled into a dataset for the same variables for a sample of participants

probability

scientific term to identify how much error the researcher finds acceptable in a particular statistical test 1. in scientific research, probability is an estimate of "what you think would happen if the study were actually repeated many times, telling the researcher how wrong the results can be" 2. a calculation about the validity of the results. 3. provides an estimate of the degree to which data from a sample would reflect data from the population the sample was drawn from

how normal curve is used

scientists look for the normality of their data and the degree to which the distribution of their data deviates from the normal curve

Measures of Dispersion

1. to fully describe a distribution of data, a measure of dispersion, or variability, is also needed 2. two distributions can have the same mean but different spreads of scores 3. when a measure of central tendency is used, a measure of dispersion should also be reported

disadvantages of using a spreadsheet or statistical software

- can create a false sense of security - 5 issues to consider if you use these programs for data entry and statistical comparison: 1. computers can fail and programs can stall, so never trust all your data to one file on one storage device 2. results can only be as good as the data entered 3. researchers tend to limit their thinking to statistical procedures they know they can do on the computer 4. the power of computing makes it possible to create an abundance of analyses 5. as the researcher, you are the person responsible for the results and their interpretations

data

- information about communication phenomena -capture quality, intensity, value, or degree of the variables used in quantitative communication studies

deciding on null hypothesis

-belief in null hypothesis continues until there is sufficient evidence to make the assertion of the null hypothesis unreasonable. - decision is based on a comparison between the significance level established by the researcher prior to conducting the study and the significance level produced by the calculation of the statistical test.
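The comparison described above can be sketched as a single function. The name `decide_on_null` and the wording of the returned labels are hypothetical; the .05 default follows the convention used in these notes.

```python
def decide_on_null(p_value, alpha=0.05):
    """Compare the p value from the test to the level set before the study."""
    if p_value <= alpha:
        return "reject null"   # finding labeled statistically significant
    return "retain null"       # finding labeled nonsignificant

print(decide_on_null(0.03))
print(decide_on_null(0.15))
print(decide_on_null(0.03, alpha=0.01))  # a more rigorous level
```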

Type I error

-occurs when the null hypothesis is rejected even when it is true. - error is set or controlled by the researcher when he or she chooses the significance level for the statistical test - thus if set at .05, there is a 5% chance the null will be rejected even though it is true

Type II error

- occurs when the null hypothesis is retained even though it is false, that is, when the alternative hypothesis is rejected even though it is true

Normal curve

1. "bell curve" 2. a theoretical distribution of scores or other numerical values. 3. majority of scores are distributed around the peak in the middle, with progressively fewer cases as one moves away from the middle of the distribution. 4. more responses are average or near-average than extremely high or extremely low.

Caution with using a software program

1. although programs compute the statistical tests, the program relies on you to request the appropriate test and to make the appropriate interpretations of the outcome 2. relies on you to indicate which data should be included in the test - if you specify the wrong test or indicate the wrong data to be used for the test, the program will provide a result, but it will be wrong or uninterpretable

negatively skewed curve

1. distribution in which there are very few scores on the left side of the distribution 2. very few very low scores 3. most of the scores are lumped together on the right side of the curve, above the mean

Causes of high probability level (greater than .05)

1. items on a survey or questionnaire intended to measure a construct may be so poorly written that participants respond inconsistently to them 2. researcher can also create bias that generates unacceptable levels of probability 3. theory that was the foundation of the study is not accurate, or the theory was not adequately or appropriately tested

Median

1. middle of all the scores on one variable 2. compute: arrange data in order from smallest to largest, then find the middle score (with an even number of scores, average the two middle scores) 3. may or may not be the same as the mean for a set of scores 4. scores in the dataset can change without the median being affected 5. better to use if scores are skewed, because it better reflects the middle of the distribution
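A brief sketch of the median with invented scores, using `statistics.median`. Replacing the highest score with an extreme value leaves the median where it was.

```python
from statistics import median

# Hypothetical scores
odd_count = [11, 3, 9, 5, 7]        # median() sorts the data internally
with_extreme = [3, 5, 7, 9, 110]    # highest score replaced by an extreme value
even_count = [3, 5, 7, 9]           # two middle scores are averaged

median_odd = median(odd_count)
median_extreme = median(with_extreme)
median_even = median(even_count)

print(median_odd, median_extreme, median_even)
```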

Mean

1. most common measure of central tendency 2. "average" computed by adding up all the scores on one variable and then dividing by the number of cases for that variable 3. most sensitive to extremely high or extremely low values of a distribution 4. most commonly reported measure of central tendency.
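The mean calculation written out for invented scores: add up the scores and divide by the number of cases. The second list shows how a single extreme score pulls the mean upward.

```python
# Hypothetical scores
scores = [3, 5, 7, 9, 11]
mean_score = sum(scores) / len(scores)           # 35 / 5

# Same scores, but the highest is replaced by an extreme value
with_extreme = [3, 5, 7, 9, 110]
mean_with_extreme = sum(with_extreme) / len(with_extreme)  # 134 / 5

print(mean_score, mean_with_extreme)
```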

Application of descriptive statistic

1. reported in the method section of a written research report 2. reader can assess the normalcy of the data 3. help reader interpret the conclusions drawn from data 4. frequencies and percentages also commonly used to provide summaries of nominal data

Number of cases

1. simply indicates the number of sources from which data were collected 2. data points 3. the more data points (number of cases) the more reliable the data 4. found in the methods or results section of the written research report 5. represented by n or N 6. N= total number in a sample, n= subsample, a group of cases drawn from the sample. 7. may not always be the number of people, rather, number of speaking turns, arguments, conflict episodes, commercials, etc.

standard deviation

1. tells how close or far apart the scores are from one another 2. standard calculation and representation of the variability of a dataset 3. both mean and sd reported because mean alone is not interpretable 4. if sd is small, scores were very similar or close to one another 5. the larger the sd, the greater the degree the scores differ from the mean
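A sketch of why the mean and the sd are reported together, using two invented datasets with the same mean. `statistics.stdev` computes the sample standard deviation (dividing by n - 1); the notes do not say which formula the chapter uses, so that choice is an assumption.

```python
from statistics import mean, stdev

# Hypothetical datasets with identical means but different spreads
tight_scores = [9, 10, 10, 10, 11]
spread_scores = [2, 6, 10, 14, 18]

tight_mean, spread_mean = mean(tight_scores), mean(spread_scores)
tight_sd, spread_sd = stdev(tight_scores), stdev(spread_scores)

print(tight_mean, round(tight_sd, 2))
print(spread_mean, round(spread_sd, 2))
```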

another use of range

1. to describe demographic characteristics of the research participants 2. simply report the highest and lowest values 3. Ex: range 18 to 67

ways to control Type I and Type II errors

Type I control: set an appropriate significance level. Type II control: increase the sample size.

Percentages

a comparison between the base, which can be any number, and a second number that is compared to the base. - frequently used to describe attributes of participants or characteristics of their communication behavior. - make it easy for the reader to get an idea of the degree to which the sample is relevant and appropriate for the study.
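A percentage is the compared number divided by the base, times 100. A minimal sketch with invented participant counts; the helper name `percentage` is hypothetical.

```python
def percentage(count, base):
    """Express count as a percentage of base."""
    return count / base * 100

# Hypothetical sample: 45 of 120 participants are first-year students
first_year_students = 45
total_participants = 120

pct_first_year = percentage(first_year_students, total_participants)

print(f"{pct_first_year:.1f}%")
```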

if probability level of statistical test is greater than .05

ex: p=.15 - finding is labeled nonsignificant - means that difference could easily be caused by chance or random error

when the probability level of a statistical test is .05 or less,

finding is real - labeled as statistically significant

Hypothesis Testing

hypotheses state the expected relationship or difference between two or more variables. - it is the null hypothesis that is statistically tested

creating a frequency distribution

1. list the scores in order, from highest to lowest, and then identify the number of times each score occurs 2. create a polygon to get a good sense of the normality of the distribution 3. range of possible scores for the variable is displayed on the horizontal axis 4. frequencies with which those scores occur are listed on the vertical axis 5. plot each data point according to its frequency of occurrence
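The first step above can be sketched with `collections.Counter`: list each score from highest to lowest with its frequency. The scores are invented, and the row of asterisks is only a rough text stand-in for the polygon.

```python
from collections import Counter

# Hypothetical scores on a 1-to-5 scale
scores = [5, 3, 4, 4, 2, 5, 4, 3, 4, 1]

counts = Counter(scores)
# (score, frequency) pairs ordered from highest score to lowest
distribution = sorted(counts.items(), reverse=True)

for score, frequency in distribution:
    print(f"{score}: {'*' * frequency}")
```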

Measures of Central Tendency

primary summary form for data - research reports do not report the data collected for every case on every variable, but report summary statistics for each variable: 1. mean 2. median 3. mode

two most common measures of dispersion

1. range 2. standard deviation - both provide information about the variability of the dataset

horizontal axis

represents all possible values of a variable

vertical axis

represents the relative frequency with which those values occur

reason to make significance level more rigorous (.01, .001)

results of the study have direct implications for people whose health is at risk - have to create greater certainty about the results achieved
