Q & Q Test #3

Sometimes researchers refer to the risk of making a Type I Error as the "alpha risk". Why?

Because alpha is the area set aside to reject the null hypothesis --> This represents the area in which a Type I Error could be made

Nominal scale

(Name) - Subjects fall into categories, but no category is considered higher or lower than another; the categories are essentially equivalent (e.g., Leo vs. Virgo). MEASURE OF CENTRAL TENDENCY: *MODE*

1-tailed test

*1-tailed test*: If your research hypothesis suggests a direction to your findings (more or less of something), you will run a one-tailed test. --> Example: "Psychologists are more sociable than engineers." --> M1 > M2 --> What is the null hypothesis? Psychologists are not more sociable than engineers. If your research hypothesis suggests that you will be finding more of something (e.g., sociability in psychologists), you will be examining the *right tail* of the distribution. Here's another possibility: "Engineers are *less* sociable than psychologists." In this case, you would still do a one-tailed test, but you would compare your calculated "t" value to the cut-off or critical "t" value in the left-hand tail of the sampling distribution.
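
A minimal sketch of a one-tailed test in Python with SciPy (an assumption: SciPy 1.6+ is available; the sociability scores below are made-up illustration data, not from the course):

```python
from scipy import stats

# Hypothetical sociability scores (illustrative only)
psychologists = [7, 8, 6, 9, 7, 8, 7, 9]
engineers     = [6, 5, 7, 6, 5, 6, 7, 5]

# Directional hypothesis: psychologists score HIGHER, so test the right tail
t_stat, p_value = stats.ttest_ind(psychologists, engineers, alternative="greater")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.4f}")
# If p < alpha (e.g., .05), reject the null "psychologists are not more sociable"
```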

Factors affecting power - controlling Type II Errors

*1. The statistical significance criterion used in the test* --> In other words, whether you set aside .05 or .01 of the distribution as the rejection area will affect how easy it is to reject the null. --> If you make it too hard to reject the null, you may "miss" a real effect.
*2. The magnitude of the effect of interest in the population* --> If you're doing an experiment on a robust phenomenon, it's going to be easier to demonstrate that effect experimentally. Big phenomenon = big experimental effect. --> If you can demonstrate an effect experimentally, it's more likely you'll be able to reject the null hypothesis, increasing power. --> Bottom line: examining a robust phenomenon increases power.
*3. The sample size used to detect the effect* --> Boosting sample size may be the easiest way of increasing power in an experiment. --> Why? Because bigger sample sizes are better able to detect the true magnitude of experimental effects. Bigger sample sizes are more reliable too; smaller sample sizes are more subject to sampling bias.
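
A numeric illustration of the three factors, sketched with statsmodels (an assumption that the library is available; the effect sizes and sample sizes are arbitrary examples):

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Baseline: medium effect (d = 0.5), alpha = .05, 30 subjects per group
base = analysis.power(effect_size=0.5, nobs1=30, alpha=0.05)

# 1. Stricter alpha (.01) makes the null harder to reject -> lower power
strict_alpha = analysis.power(effect_size=0.5, nobs1=30, alpha=0.01)

# 2. A bigger (more robust) effect -> higher power
big_effect = analysis.power(effect_size=0.8, nobs1=30, alpha=0.05)

# 3. A bigger sample -> higher power
big_sample = analysis.power(effect_size=0.5, nobs1=100, alpha=0.05)

print(base, strict_alpha, big_effect, big_sample)
```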

Have an understanding of the various scales of measurement and measures of central tendency - how do they relate to each other? (Which measures of central tendency would be used for which scales of measurement?)

*4 scales exist:* 1. Nominal 2. Ordinal 3. Interval 4. Ratio

What is effect size and how is it evaluated (i.e., what does it mean to have a large or small effect size)?

*Effect size:* --> a descriptive statistic giving the magnitude of the experimental effect. --> Effect size indicates the amount of control of the DV by the IV. --> Effect size is thus a measure of the effectiveness of the treatment or IV. --> If the IV causes big changes in the DV, there will be a large effect size.
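
For a 2-group design, Cohen's d (named later in this set) is one common effect-size measure: the mean difference divided by the pooled SD. A minimal sketch with made-up data, assuming equal group sizes:

```python
import statistics

# Illustrative scores from a 2-group experiment (made-up data)
treatment = [12, 14, 15, 13, 16, 14]
control   = [10, 11, 12, 10, 11, 12]

mean_diff = statistics.mean(treatment) - statistics.mean(control)

# Pooled standard deviation (equal group sizes assumed here)
pooled_sd = ((statistics.variance(treatment) + statistics.variance(control)) / 2) ** 0.5

cohens_d = mean_diff / pooled_sd
print(f"Cohen's d = {cohens_d:.2f}")  # ~0.2 small, ~0.5 medium, ~0.8 large (Cohen's rough guidelines)
```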

What are the properties of normal distributions and nonsymmetrical (e.g., skewed distributions)? Have an understanding of skewness and kurtosis.

*Skewness* refers to the position of the tail of the distribution --> Negative skew: tail in the negative direction --> Positive skew: tail in the positive direction. *Kurtosis* refers to the "peakedness" of the distribution. Skewness and kurtosis are the shape statistics least often reported for data sets. Leptokurtic = peaked ("lepto" means skinny); platykurtic = flat ("platy" rhymes with "flatty").

Which effect sizes are used with different designs? t-test correlational study one-way ANOVA two-way ANOVA chi-square

*t-test:* --> statistic is "t"; Cohen's d is calculated for experiments having 2 groups. *Correlational study:* --> statistic is "r"; effect size is given by r squared. *One-way ANOVA:* --> the statistic that you calculate is called "F" (Fisher's F); effect size is given by omega squared. *Two-way ANOVA:* --> the statistic is still an "F"; effect size is given by omega squared. *Chi-square:* --> χ²; used when there are 2 nominal variables and scores are frequencies; effect size is calculated with Cramer's V.

Why would a Type II Error happen?

-Because the experiment lacked statistical power. -Statistical power refers to the probability that you will reject the null hypothesis when the null hypothesis is false -Several things affect power in an experiment ....

What factors do you consider when determining sample size?

-If the population is small enough, include everyone in the study (conduct a census); -Use the sample size of a similar study; -Use published tables; -Use formulas to calculate sample size; OR, ideally, conduct a pilot study and calculate the effect size. If the effect size is small (i.e., there is a subtle effect of the IV on the DV), it may be necessary to boost sample size to increase the likelihood of obtaining significant results. This may be the best way to decide upon sample size.

*2. Measuring how much the data varies from the mean (common ways: variance, standard deviation)*

-In order to find out how different the field of scores is from the average score, you can calculate the variance of the scores. *Variance* is how much the scores vary from the mean - a measure of the dispersion of scores around the mean. An even more useful measure is the *standard deviation (SD)* - the most commonly reported measure of variability. The SD is the average dispersion of scores around the mean score (average "spread") - on average, how different scores are from each other. Low SD values mean less spread between scores (the scores are more like each other); high SD values mean more spread (the scores are more different from each other). Variance and standard deviation are related to each other: the SD is the square root of the variance for a set of scores.
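
A small sketch with Python's built-in statistics module (the scores are made up), showing that the SD is the square root of the variance:

```python
import math
import statistics

scores = [70, 75, 80, 85, 90]                 # made-up exam scores

variance = statistics.pvariance(scores)        # average squared deviation from the mean
sd = statistics.pstdev(scores)                 # standard deviation

print(variance, sd)                            # 50.0  ~7.07
print(math.isclose(sd, math.sqrt(variance)))   # True: SD is the square root of the variance
```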

*1. Central tendency (mean, median, mode)*

-Most common measure of central tendency: *the mean* - "finding the average" of a group of scores. (The mean is typically used as the measure of central tendency when scores are on an interval or ratio scale.) The mean is not always the most representative score for a data set. Why? --> The mean is influenced by extreme scores. --> At times, the median may be more appropriate, e.g., when using ordinal scales. *The median* is the middle score. *The mode* is the number or response that occurs most often. Why find the mode? -May have intrinsic value (e.g., what was the mode for the test?) -When data appear as words instead of numbers, you have to use the mode. --> Least commonly used in research. --> A data set can have more than one mode.
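
A minimal sketch of all three measures with the statistics module (the salary figures are invented to show an outlier pulling the mean up):

```python
import statistics

# Made-up salaries (in thousands) with one extreme score
salaries = [30, 32, 35, 36, 38, 200]

print(statistics.mean(salaries))     # ~61.8 -- inflated by the outlier
print(statistics.median(salaries))   # 35.5  -- more representative here
print(statistics.mode(["agree", "agree", "neutral", "disagree"]))  # "agree" -- the mode works on words too
```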

Controlling Type I Error

-Notice that as the amount of the distribution set aside as the rejection area (alpha) decreases, the chances of incorrectly rejecting the null decrease too. -This is the major way of decreasing the probability of making a Type I Error. -This is termed "decreasing the p value" or "decreasing alpha."

1. Name a trait that inherently lends itself to nominal measurement. Explain your answer. 2. Suppose a fellow student gave a report in class and said, "The average is 25.88". What additional information should you ask for? Why?

1. Ethnicity or marital status - they can be made categorical, there is no mathematical order to the results, and no category is higher than another. It is what it is. (Zodiac sign, e.g., Virgo vs. Leo, also works.) 2. Average of what? (Mean, median, or mode?) "Average" can refer to multiple aspects of data. --> The mean can be affected by outliers (extreme highs/lows).

What are the 3 common ways of describing data?

1. Finding the measure of central tendency (mean, median, mode) 2. Measuring how much the data varies from the mean (common ways: variance, standard deviation) 3. Next: If possible, create a frequency distribution

Why is effect size important to consider in an experiment?

1. It is considered good practice when presenting empirical research findings in many fields; it is simple to calculate and allows you to be a more sophisticated consumer of research. 2. It plays an important role in meta-analyses, which summarize findings across studies in a specific area of research. 3. Effect sizes can be used to help determine sample size.

1. Imagine that you have conducted an experiment and are now analyzing the results. You will be conducting a 2-tailed t-test in which you are setting the overall alpha at 0.05. Draw a sampling distribution for t and indicate what area of each tail would be "shaded in" as the region of null rejection. If your calculated value for t equals 3.98 and the critical value for t equals plus or minus 2.63, will you reject the null hypothesis? Why or why not? 2. Discuss 3 important factors that affect statistical power in an experiment.

1. You would reject the null hypothesis because the calculated t value (3.98) exceeds the critical value (±2.63). 2. (a) The statistical significance criterion used (where you set alpha); (b) the magnitude of the effect of interest in the population; (c) the sample size used to detect the effect.

Type II Error

Another possibility exists: accidentally not rejecting the null when it should be rejected. --> Your results are actually "real," but your calculated statistic is too small to allow the null hypothesis to be rejected. --> This is called a Type II Error. --> Researchers desperately guard against this: it means that there was something "real" in their results that they could not establish. Results that are not significant are not typically published (unless they refute other findings).

Interval scale

Differences between scores have equal value, but there is no absolute zero (e.g., 0 B.C. does not mean an absence of time). Scores can also be ranked from highest to lowest. Ex: time (the interval between 4 B.C. and 3 B.C. is the same as any other; 0 doesn't mean there is no time, it's just another number on the scale); temperature (equal intervals between degrees, but 0 degrees doesn't mean NO temperature, just another number on the scale). MEASURE OF CENTRAL TENDENCY: *MEAN*

How do effect size and sample size interact?

Effect size can be used to help determine sample size. *Small effect size = larger sample size needed* --> if the effect size is small, you may miss subtle effects of the IV on the DV unless the sample is large. *Large effect size = smaller sample size needed* --> if the effect size is large, a very large sample is of limited added value, and the results can be expensive to replicate. ** The larger the effect size, the smaller the sample size you will require and the smaller the chance that you will fail to notice a change.
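
A sketch of this trade-off using a power calculation in statsmodels (library availability and the particular effect sizes are assumptions for illustration): the smaller the expected effect, the more subjects per group are needed for the same power.

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Per-group n needed for 80% power at alpha = .05
n_small = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.80)  # small effect
n_large = analysis.solve_power(effect_size=0.8, alpha=0.05, power=0.80)  # large effect

print(round(n_small), round(n_large))  # roughly 394 vs 26 per group
```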

What do you do when effect sizes are equal?

Either or both treatments can be recommended. To compare the results, you need to standardize the findings. --> How do you do this? Describe the difference between the means in the same unit: take the difference between the means and divide it by the SD. --> What is the standard unit used? A convenient unit is the standard deviation (SD).

How does standard deviation affect curve shape?

If there is more spread of scores (a high SD), the curve will be wider and flatter; it will be narrower if the SD is low.

2-tailed test

If your research hypothesis does not imply directionality, you would do a 2-tailed test. --> Example: "There is a difference in sociability between psychologists and engineers." M1 ≠ M2. What's the null hypothesis? M1 = M2. Your 5% rejection area would be divided between the 2 tails; each alpha area represents 2.5% (i.e., 0.025) of the distribution.
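
A minimal sketch of finding the two-tailed critical t with SciPy (assumed available); alpha = .05 is split into .025 per tail, and the degrees of freedom below are just an illustrative value:

```python
from scipy import stats

alpha = 0.05
df = 30                                    # illustrative degrees of freedom

# Two-tailed test: put alpha/2 = 0.025 in each tail
t_crit = stats.t.ppf(1 - alpha / 2, df)
print(f"Reject H0 if the calculated t is below -{t_crit:.2f} or above +{t_crit:.2f}")
```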

Why do researchers desperately guard against Type II errors?

It means that there was something "real" in their results that they could not establish. Results that are not significant are not typically published (unless they refute other findings).

Ordinal scale

Measurements are ranked from high to low. Numbers can be subjective. Ranking scale; on a hospital pain chart, my 6 may be your 10 Common use of ordinal scale in research? --> Likert scale (used a lot as a way of measuring attitudes/preferences) MEASURE OF CENTRAL TENDENCY: *MEDIAN*

Can you draw different kinds of distributions (e.g., 2 distributions that have the same mean but different SDs, etc.)?

Normal distributions can have different means and SDs, which affect their shape

Ratio scale

Same as interval scale, but an absolute zero exists (e.g., 0 MPH means no miles per hour). Intervals are equal. MEASURE OF CENTRAL TENDENCY: *MEAN*

What is another way of saying "setting the region of null rejection at .05"?

Setting alpha at .05

Understand the way that hypothesis testing is conducted.

T-test - a two-group comparison. *In practice, calculated t values usually fall roughly between -5 and +5.* *Bigger differences between groups yield a larger t value.* Hypothesis testing is based on the group means, SD, and sample size. A statistic is calculated based on your group means, the SD, and the sample size --> This is a 2-group experiment, so a "t" statistic might be calculated. You have calculated your "t" value based on mean group differences ... --> How do you know if this calculated "t" is located in the tails? You compare your calculated "t" value to a cut-off "t" or critical "t" value. --> Critical values of "t" are cut-off scores demarcating some proportion of the tails. --> You find critical values for statistics in tables. If your calculated "t" value falls in the tails of the sampling distribution beyond the cut-off "t" value, reject the null hypothesis. --> Such results are unlikely when the null is true. --> This represents a good day for the researcher - the experimental results appear to be based on "real" differences between groups.
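
A hedged sketch of this whole decision rule in Python (SciPy assumed; the two groups are made-up data): compute t from the group data, look up the two-tailed critical t, and reject the null only if the calculated value falls beyond the cut-off.

```python
from scipy import stats

group1 = [14, 15, 13, 16, 15, 14, 17, 15]   # made-up scores
group2 = [12, 11, 13, 12, 11, 12, 13, 12]

t_calc, p = stats.ttest_ind(group1, group2)   # t based on means, SDs, and sample size
df = len(group1) + len(group2) - 2
t_crit = stats.t.ppf(1 - 0.05 / 2, df)        # two-tailed cut-off at alpha = .05

if abs(t_calc) > t_crit:
    print(f"t = {t_calc:.2f} exceeds ±{t_crit:.2f}: reject the null hypothesis")
else:
    print(f"t = {t_calc:.2f} does not exceed ±{t_crit:.2f}: fail to reject the null")
```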

If you were measuring voice quality on a 5-point Likert scale, what measure of central tendency would you use?

The Likert scale is ordinal so you would use the MEDIAN

What does alpha represent?

The region of rejection is also called the alpha region (α). When the researcher demarcates 5% of the distribution as the region of null rejection, it is called "setting alpha at .05". Alpha is always divided evenly between the two tails in a 2-tailed test.

Type I Error

The situation in which the null hypothesis is *mistakenly* rejected is called a Type I Error. How could this happen? --> Threats to internal validity or sampling error. How would you ever know that a Type I Error was committed? --> These results represent a "dead end" in the field; nobody can replicate them. If 1% of the sampling distribution is set aside as the null rejection area, what are the odds of making a Type I Error? --> The odds of mistakenly rejecting the null (a Type I Error) are, in theory, 1 out of 100 (1%) or less. This is typically written p < .01.

What is the null hypothesis?

The supposition that the observed difference reflects sampling error or threats to internal validity rather than a real difference between groups. It can be stated various ways: --> No difference between groups: M1 = M2 --> The true difference between groups is zero: M1 - M2 = 0 --> The observed difference is due to sampling error. The null hypothesis states that the observed differences in the experiment are NOT real but due to error alone.

*3. Next: If possible, create a frequency distribution*

There are several ways of describing the number of subjects that are similar along a given characteristic or when measured on a dependent variable. --> You are taking a tally of some sort. --> Data make more sense when viewed graphically. *Pie charts* are used when you: --> have made a tally --> are expressing the results as percentages. *Bar graphs vs. histograms* --> You use bar graphs when your data are a tally of different categories (NOMINAL DATA). --> You use histograms when your data are a tally that can be described as a range of numbers. *A frequency histogram* is a special histogram that shows score frequencies: --> not a bar chart, because the x-axis is quantitative --> discrete scores on the x-axis rather than ranges of values.
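
A small sketch of making a tally (frequency distribution) with Python's collections.Counter; the category labels are made-up nominal data:

```python
from collections import Counter

# Made-up nominal data: zodiac signs of subjects
signs = ["Leo", "Virgo", "Leo", "Aries", "Virgo", "Leo"]

tally = Counter(signs)
print(tally)                       # Counter({'Leo': 3, 'Virgo': 2, 'Aries': 1})

# Frequencies as percentages (the kind of numbers a pie chart would show)
total = sum(tally.values())
for sign, count in tally.items():
    print(f"{sign}: {count / total:.0%}")
```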

Have a general understanding of the chi-square design. Which statistics are associated with which designs?

You would run a Χ2 test if you did a study in which you obtained frequencies and your two variables were categorical (nominal). The test determines whether there is a significant association between the two variables. The Chi-square test generates a statistic which compares the expected frequencies to the observed frequencies.
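
A minimal sketch of a chi-square test of association with SciPy (assumed available) on a made-up 2×2 frequency table, with Cramer's V computed by hand from the statistic as the effect size:

```python
import math
from scipy.stats import chi2_contingency

# Made-up observed frequencies: rows and columns are both nominal variables
observed = [[30, 10],
            [20, 40]]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.4f}")   # compares observed vs. expected frequencies

# Cramer's V: sqrt(chi2 / (N * (min(rows, cols) - 1)))
n = sum(sum(row) for row in observed)
cramers_v = math.sqrt(chi2 / (n * (min(len(observed), len(observed[0])) - 1)))
print(f"Cramer's V = {cramers_v:.2f}")
```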

Can you develop a null hypothesis and alternative hypothesis for a 2-group test?

Yes. For a 2-group test: null hypothesis: M1 = M2 (the observed difference is due to sampling error); alternative hypothesis: M1 ≠ M2 (or M1 > M2 / M1 < M2 if a direction is predicted). A statistic is calculated based on your group means, the SD, and the sample size --> This is a 2-group experiment, so a "t" statistic might be calculated.

