Ed15: Statistics in Education Research (Exam #1)

What is the big picture?

1. The population is the large group we are interested in, but it is usually very difficult to collect detailed data about the whole group. 2. So we draw a sample, a smaller group, at random through a sampling mechanism. 3. We analyze that sample to understand it and use it to draw conclusions about the population. (The image is in the first lecture.)

N vs. n-1

N represents the number of cases in an entire population; n represents the number of cases in a sample. Because we are aiming to view the population through the sample, dividing by n-1 instead of n corrects the bias that arises when a sample is used to estimate the population variance.

Quantitative Variables

Quantitative variables have measurable differences. E.g., an individual's score on a reading test is a quantitative variable because it can be measured: a higher score means more verbal ability.

How do you calculate the quartiles?

Sort the observations and locate the median Mdn. The first quartile Q1 (also P25) is the median of the observations located to the left of the median. The third quartile Q3 (also P75) is the median of the observations located to the right of the median.
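
The median-of-halves procedure on this card can be sketched in Python. This is a minimal illustration, not the course's official method: the helper name `quartiles` is made up for the example, the scores are borrowed from the Range card, and it assumes that when n is odd the median itself is left out of both halves.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    # Observations located to the left of the median.
    lower = ordered[:half]
    # Observations to the right; for odd n the median itself is skipped (assumption).
    upper = ordered[half + 1:] if n % 2 else ordered[half:]
    return median(lower), median(ordered), median(upper)

scores = [70, 83, 83, 83, 90, 90, 94, 95, 95]  # test scores from the Range card
q1, mdn, q3 = quartiles(scores)
print(q1, mdn, q3)  # 83.0 90 94.5
```

Other textbooks and software use slightly different quartile rules, so results can differ a little for small data sets.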

Qualitative Variables

They differ by CATEGORY rather than by AMOUNT. Numbers are used to represent a group but have NO mathematical significance. Also known as categorical variables. E.g., an individual's ethnicity is a qualitative variable because it represents the category to which the person belongs (1 = white, 0 = other).

Constants

They do not vary. Because they do not vary, they hold little analytical value: a characteristic that is the same for every case cannot explain differences between cases.

Measurement

Using rules to assign numbers. Different "kinds" of measures gauge different qualities; the way these kinds of measures are categorized is referred to as a data scale.

Variability/Dispersion

Variability, also referred to as dispersion, indicates how much individual cases tend to deviate from what is typical in the data.

Population

A population represents all members of a defined group. Greek letters indicate the characteristics of a population. For example, μ (pronounced "mu") is the symbol that represents the arithmetic average (mean) of some characteristic in a population. Example: the entire admitted freshman class for fall 2014 at UCI (around 24,000 students). Mean of Math SAT scores: μSAT = 654.

Research Design

A research design is a formal plan for gathering and analyzing data. Included in the research design is the researcher's specification of relevant variables, both independent and dependent. Independent variables are considered antecedent variables and affect other variables; they are also known as explanatory variables. Dependent variables are affected by independent variables; they are also known as response variables.

Sample

A sample represents any subset of a population. We want the sample obtained to be representative of its population. Roman letters indicate the characteristics of a sample. For example, M represents the arithmetic average (mean) of a sample. Example: take a random sample of 1000 freshmen at UCI; M = 656 is the mean of their Math SAT scores.

Interval Scales

Allow rankings and have equal distances between the points of the scale. No absolute or true zero: there is no point that captures a complete lack of the characteristic being measured. Examples: temperature (in °C or °F), IQ, achievement tests. The distance between IQ 70 and 90 is the same as the distance between 100 and 120. Is there an IQ of 0? No. So is an IQ of 120 twice as intelligent as an IQ of 60? No.

Effect of Outliers

An outlier is a value or score in a group that is much higher or lower than the other values in the group. For example, add a score of 0 to the scores 90 92 88 90, giving 90 92 88 90 0. The mean changes from 90 to 72. The median stays the same at 90 (0 88 90 90 92). The mode is not affected. So the mean is less resistant to outliers than the median and the mode.
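
The arithmetic in this example can be checked with a short Python sketch using the standard-library `statistics` module. The variable names are chosen only for this illustration.

```python
from statistics import mean, median, mode

scores = [90, 92, 88, 90]
with_outlier = scores + [0]  # add the outlying score of 0

# The mean is pulled down sharply by the single outlier.
print(mean(scores), mean(with_outlier))      # 90 72

# The median and mode are unchanged.
print(median(scores), median(with_outlier))  # 90.0 90
print(mode(scores), mode(with_outlier))      # 90 90
```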

Inferential Statistics

Analyze the larger group (i.e., population) through the smaller group's (i.e., sample) characteristics to draw inferences about the population. (Chapters 5+)

Nominal Scale

Classification/categorization of characteristics. Categories are independent. Ex: gender (male, female), political affiliation (Republican, Liberal, Independent, etc.), ethnicity (1 = Italian, 2 = Irish, 3 = Asian, 4 = Hispanic, 5 = African American, 6 = Other).

When exploring a new data set what should you ask yourself?

WHO? What cases do the data describe? How many cases does a data set have? WHAT? How many variables does the data set have? How are these variables defined? What are the units of measurement for each variable? WHY? What purpose do the data have? Do the data contain the information needed to answer the questions of interest?

When we construct a set of data what do we decide first?

We construct a set of data by first deciding which cases or units we want to study. For each case, we record information about characteristics that we call variables.

Choosing Measures of Center and Spread

The five-number summary and the IQR are usually better than the mean and standard deviation for describing a skewed distribution or a distribution with outliers. Use the mean and standard deviation only for reasonably symmetric distributions that don't have outliers. NOTE: Numerical summaries do not fully describe the shape of a distribution. ALWAYS PLOT YOUR DATA!

Variables

Variables are characteristics of individuals that have different manifestations. They are either qualitative or quantitative. Qualitative (categorical): classifies an individual into a group (smoker/nonsmoker; male/female; job title). Quantitative: takes on a numerical value on which it makes sense to perform calculations (number of cigarettes smoked per day; salary).

Quartiles

Quartiles result when each half of the distribution is itself divided in half, giving quarters of the distribution. The first quartile (Q1) is the point below which 25% of the scores occur (P25). The second quartile (Q2) is the point below which 50% of the scores occur; it is also the median and marks the 50th percentile (P50). The third quartile (Q3) is the point below which 75% of the scores occur (P75). Another option is to create deciles, which divide the distribution of scores into tenths.

Ratio Scale

Includes all the characteristics of the interval scale, with the addition of a meaningful absolute zero. Zero indicates an absence of what is being measured. Examples: Height: 8 feet is twice the height of 4 feet. Money: $50 is twice as much as $25 ($50 - $25 = $25); zero means there is no money! Number of school absences.

Independent Variables/Antecedent Variables

Independent variables are considered antecedent variables and affect other variables.

What does M represent?

M represents the arithmetic average (mean) of a sample. Example: take a random sample of 1000 freshmen at UCI; M = 656 is the mean of their Math SAT scores.

What are some tools that can help you describe the variability of the data?

You can use the range, variance, standard deviation, and quartiles.

What are some tools you can use when describing the central tendency/what is typical?

You can use the mode, median, and mean.

Mean

The average of n observations: M = (sum of the observations) / n. Look for the compact notation in the lecture notes.

What are four scales of measurement?

Nominal, ordinal, interval, and ratio.

What are parameters?

Parameters indicate population characteristics, which are represented by Greek letters: μ represents the mean, σ represents the standard deviation, and σ² represents the variance.

What is the difference between qualitative and quantitative variables?

Qualitative variables do not have mathematical significance; instead, the numbers are used to represent a group (which is why they are also called categorical variables). Quantitative variables have measurable differences; the numbers mean something.

Descriptive Statistics

Classify, organize, and summarize the characteristics of a set of data. Reduce the information to summarize the data: what kind and how much. (More in Chapters 2 and 3.)

Distribution

The distribution of a variable tells us the values that a variable takes and how often it takes each value.

Cases

The objects described by a set of data (e.g., individuals, companies, schools, and states).

What does the μ (pronounced "mu") symbol represent?

μ represents the arithmetic average (mean) of some characteristic in a population. Example: the entire admitted freshman class for fall 2014 at UCI (around 24,000 students). Mean of Math SAT scores: μSAT = 654.

ID Variable/Label

A special variable that uniquely identifies (distinguishes) each case.

What is standard deviation?

The standard deviation "s" is used to describe the variation around the mean. Like the mean, it is not resistant to outliers.

What does a zero mean in an interval scale?

The zero is arbitrary: it does not indicate an absence of what is being measured. In the interval scale there is no absolute zero or true zero.

What does a zero mean in a ratio scale?

The zero means that there is an absence of what is being measured.

What are two types of variables?

There are qualitative and quantitative variables.

How do you solve for the standard deviation?

1. First calculate the variance s² (look in the textbook for the formula). 2. Then take the square root to get the standard deviation s.
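
As a rough illustration of the two steps, the sketch below uses Python's standard-library `statistics.variance`, which divides by n-1, and then takes the square root. The scores are borrowed from the Range card; the variable names are just for this example.

```python
import math
from statistics import variance

scores = [70, 83, 83, 83, 90, 90, 94, 95, 95]  # test scores from the Range card

# Step 1: sample variance s^2 (sum of squared deviations divided by n - 1).
s2 = variance(scores)

# Step 2: the standard deviation s is the square root of the variance.
s = math.sqrt(s2)

print(s2)           # 66.5
print(round(s, 2))  # 8.15
```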

Ordinal Scale

Data are ordered and allow relative rankings, but they say nothing about the size of the differences between rankings. Ex: percentile rank (expresses a person's performance relative to others). Ex: Likert surveys: Strongly Agree (1), Agree (2), Disagree (3), Strongly Disagree (4).

Dependent Variables/Response Variables

Dependent variables are affected by independent variables.

Variance: how can you find the variance?

1. Determine the mean for the group, M. 2. Determine the difference between each individual score xi and the mean, from the first score to the last. 3. Square each of these differences. 4. Sum the squared differences. 5. Divide by the number of observations minus 1: n-1.
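
The steps above can be followed one at a time in a few lines of Python. This is only a sketch: the four scores are borrowed from the outliers card and the variable names are illustrative.

```python
scores = [90, 92, 88, 90]  # scores from the outliers card
n = len(scores)

# Determine the mean for the group, M.
m = sum(scores) / n

# Difference between each individual score and the mean.
deviations = [x - m for x in scores]

# Square each difference, then sum the squared differences.
sum_of_squares = sum(d ** 2 for d in deviations)

# Divide by the number of observations minus 1.
s2 = sum_of_squares / (n - 1)

print(m, deviations, sum_of_squares)  # 90.0 [0.0, 2.0, -2.0, 0.0] 8.0
print(round(s2, 3))                   # 2.667
```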

Range: how can you find the range?

The distance from the largest to the smallest number. Example: test scores 70 83 83 83 90 90 94 95 95. The range is calculated by subtracting the smallest value (70) from the largest value (95): R = 95 - 70 = 25.

What is the interquartile range? (IQR)

The interquartile range (IQR) is defined as IQR = Q3 - Q1. Semi-interquartile range = IQR/2.
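
A small Python sketch of these two formulas, using the eight school-absence counts from the Mode card. It assumes the quartiles are found as the medians of the lower and upper halves, and it only handles an even number of observations, where the halves split cleanly.

```python
from statistics import median

absences = [0, 0, 0, 2, 3, 3, 3, 5]  # school absences from the Mode card
ordered = sorted(absences)
half = len(ordered) // 2             # n is even here, so the halves split cleanly

q1 = median(ordered[:half])  # median of the lower half
q3 = median(ordered[half:])  # median of the upper half

iqr = q3 - q1
semi_iqr = iqr / 2

print(q1, q3, iqr, semi_iqr)  # 0.0 3.0 3.0 1.5
```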

Which is least resistant to outliers: the median, the mode, or the mean?

The mean is the least resistant to outliers.

Median

The median Mdn is the midpoint of a distribution: half of the observations are smaller and the other half are larger. How to find it: arrange all observations from smallest to largest. If n is odd, the median Mdn is the center observation in the ordered list, at position (n+1)/2. If n is even, the median Mdn is the average of the two center observations in the ordered list.
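
The odd and even cases can be sketched in Python as follows. The function name `find_median` is made up for this illustration, and the two data sets are borrowed from the Range and Mode cards.

```python
def find_median(values):
    """Return the median by sorting and picking the center observation(s)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # Odd n: position (n + 1) / 2 counting from 1 is index mid counting from 0.
        return ordered[mid]
    # Even n: average of the two center observations.
    return (ordered[mid - 1] + ordered[mid]) / 2

print(find_median([70, 83, 83, 83, 90, 90, 94, 95, 95]))  # 90  (n = 9, odd)
print(find_median([0, 0, 0, 2, 3, 3, 3, 5]))              # 2.5 (n = 8, even)
```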

Mode

The most frequently occurring value in a group; the number that repeats most often. Unimodal: the data have one mode. Example: number of school absences 0 0 0 2 3 5; the mode is 0. Bimodal: the data have two modes. Example: number of school absences 0 0 0 2 3 3 3 5; the two modes are 0 and 3.
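
Both examples can be reproduced with `statistics.multimode` from Python's standard library (available from Python 3.8), which returns every value tied for most frequent. The variable names are illustrative.

```python
from statistics import multimode

unimodal = [0, 0, 0, 2, 3, 5]
bimodal = [0, 0, 0, 2, 3, 3, 3, 5]

# multimode returns a list of all values that share the highest frequency.
print(multimode(unimodal))  # [0]
print(multimode(bimodal))   # [0, 3]
```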

Degrees of Freedom = df

The number of scores in the group from which the statistic is calculated that are free to vary independently when the final value of the statistic is known. Typically, the df is n - 1, or one less than the number of scores in a group. The -1 is used as a correction, which can be seen in the denominator of the variance equation.
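
One way to see why only n - 1 scores are free to vary: once the mean is known, the deviations from it must sum to zero, so the last deviation is fixed by the others. The Python sketch below illustrates this with the four scores from the outliers card; it is an illustration, not a formal derivation.

```python
scores = [90, 92, 88, 90]  # scores from the outliers card
n = len(scores)
m = sum(scores) / n

deviations = [x - m for x in scores]

# Deviations from the mean always sum to zero.
total = sum(deviations)

# So knowing the first n - 1 deviations pins down the last one.
implied_last = 0 - sum(deviations[:-1])

df = n - 1  # only n - 1 deviations are free to vary
print(total, implied_last == deviations[-1], df)  # 0.0 True 3
```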
