STAT 302 - Exam 1

~

"has the distribution" so X ~ N(u) for example

Categorical Variable

- Based on groupings - Which one?

Quantitative Variable

- Based on numbers in which a mean makes sense - How many? How much?

Statistic Types Used for Quantitative Variables

- Center (mean/median) - Spread (range, standard deviation, IQR)

Charts for Quantitative Variables

- Dot plots - Histograms - Box plots

Statistic Types Used for Categorical Variables

- Frequency - Proportion - Percentage

Describing the distribution of quantitative variables

- Shape (modality/skewed) - Center (mean/median) - Spread (range) - Abnormalities

Correlation, r

- Values between -1 and 1 - r = 0 means no linear correlation - Not resistant to outliers
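
The outlier point can be checked numerically. The sketch below assumes the usual Pearson formula for r, which the card does not spell out; the function name `pearson_r` and the data are made up for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
on_a_line = [2, 4, 6, 8, 10]      # points fall exactly on a rising line
with_outlier = [2, 4, 6, 8, -10]  # same data with one extreme value

r_line = pearson_r(xs, on_a_line)
r_outlier = pearson_r(xs, with_outlier)
print(r_line, r_outlier)
```

A single extreme point is enough to flip the sign of r here, which is what "not resistant" means in practice.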

Moderate r values

0.3 - 0.7

Strong r values

0.7+

1.5 X IQR Criteria

An observation is an outlier if it lies more than 1.5 x IQR below Q1 or more than 1.5 x IQR above Q3
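
As a rough sketch, the criterion can be applied like this. It assumes Q1 and Q3 are the medians of the lower and upper halves of the sorted data (other quartile conventions give slightly different fences) and that there are at least two values; the function names and the sample data are hypothetical.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # values left of the overall median
    upper = data[-half:]  # values right of the overall median
    return median(lower), median(upper)

def iqr_outliers(values):
    """Return the values lying beyond 1.5 x IQR from Q1 or Q3."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

sample = [2, 4, 5, 6, 7, 8, 9, 30]  # made-up data
outliers = iqr_outliers(sample)
print(quartiles(sample), outliers)
```

For this sample Q1 = 4.5 and Q3 = 8.5, so the fences sit at -1.5 and 14.5 and only 30 is flagged.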

Summary Statistic

A calculation for a group of data, such as a total, an average, or a count

Random Variable

A numerical description of the outcome of an experiment - X, Y refer to the random variable itself (weight) - x, y refer to values taken by the random variable (152 lbs)

Random Experiment

A situation involving chance that leads to an outcome

Resistant

A statistic that is not strongly influenced by outliers

Influential

A statistic that is strongly influenced by outliers

Confounding Variable

Associated with both the response and explanatory variable, which makes it hard to tell which variable is responsible for an observed effect

Simpson's Paradox

Associations between variables are reversed when different categories are combined

Mean

Average of a sample obtained by dividing the sum of all values by the number of values obtained - Not resistant to outliers

Statistic

Calculated, numerical value of the sample

Bar Chart

Categorized, order doesn't matter

Response Variable

Dependent Variable - Y axis

Deviation

Difference between an observation and the mean: x - x̄

Population

Entire group of people that the researcher is interested in

Association

Exists between two variables if a particular value for one variable is more likely to occur with certain values of the other variable

Proportion

Frequency divided by total number of individuals
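
A small sketch of how frequency, proportion, and percentage relate for one categorical variable. The survey responses below are invented for illustration.

```python
from collections import Counter

responses = ["agree", "disagree", "agree", "neutral", "agree"]  # made-up data

frequency = Counter(responses)   # count of individuals per category
total = sum(frequency.values())  # total number of individuals
proportion = {cat: count / total for cat, count in frequency.items()}
percentage = {cat: 100 * p for cat, p in proportion.items()}

for cat in frequency:
    print(cat, frequency[cat], proportion[cat], percentage[cat])
```

The proportions always add up to 1 across the categories, and the percentages to 100.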

Sample

Group of people we collect data on

Correlation

How closely the points on a scatterplot follow a straight line

Explanatory Variable

Independent Variable, explains why the response variable is the way it is - X axis

Random

Individual outcomes are uncertain, but there is nonetheless a regular distribution of outcomes in a large number of repetitions

Inferential Statistics

Inferences made based on the data collected

IQR

Inter-Quartile Range - Chop off the bottom 25% and the top 25% of values - Middle 50% of all data values - IQR = Q3 - Q1 - Resistant to outliers

Weak r values

Less than 0.3

Range

Maximum - minimum - Worst measure of spread - Not resistant to outliers

If the shape of the distribution is skewed left...

Mean < Median

If the shape of the distribution is symmetric...

Mean = Median (or very close)

If the shape of the distribution is skewed right...

Mean > Median
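
The three shape rules can be seen on small data sets. This is only an illustration with made-up numbers; the rules are tendencies, and unusual data sets can break them.

```python
from statistics import mean, median

right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]         # long tail to the right
left_skewed = [-20, -4, -3, -3, -3, -2, -2, -1]  # mirror image, tail to the left
symmetric = [1, 2, 3, 4, 5]

for name, data in [("right", right_skewed),
                   ("left", left_skewed),
                   ("symmetric", symmetric)]:
    # The tail pulls the mean toward it, while the median barely moves
    print(name, mean(data), median(data))
```

The extreme value drags the mean toward the tail because the mean is not resistant to outliers, while the median stays near the bulk of the data.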

Not resistant to outliers

Mean, range, standard deviation

Q2

Median of the entire distribution - Middle value of the whole data set

Q1

Median of first half of values - Left of the overall median

Q3

Median of second half of values - Right of the overall median

Resistant to outliers

Median, IQR

Median

Middle value of a sample - Resistant to outliers

Independent

No relationship between explanatory and response variable

Normal Distribution

Normal, bell-shaped curve - Continuous random variable - Area under the curve = 1 - Centered at the mean u (u = 0 for the standard normal)

Frequency

Number of individuals in a category

Z-score

Number of standard deviations an observed measurement x is from the mean - Above the mean: positive - Below the mean: negative - "Observation minus mean over standard deviation": z = (x - mean) / standard deviation
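
The "observation minus mean over standard deviation" recipe is short enough to sketch directly. The weights below are hypothetical numbers chosen only to show the sign convention.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd

# Hypothetical example: weights with mean 140 lbs and standard deviation 8 lbs
z_above = z_score(152, mean=140, sd=8)  # observation above the mean
z_below = z_score(132, mean=140, sd=8)  # observation below the mean
print(z_above, z_below)
```

An observation of 152 lbs sits 1.5 standard deviations above the mean, and 132 lbs sits 1 standard deviation below it.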

Parameter

Numerical summary of the POPULATION

Descriptive Statistics

Numerical summary of the SAMPLE

Positive Deviation

Observation is above average

Negative Deviation

Observation is below average

Rule of Multiplication

P(AnB) = P(A) x P(B|A) - Probability that A and B both occur

Rule of Addition

P(AuB) = P(A) + P(B) - P(AnB) - Probability that either event occurs

Conditional Probability

P(A|B) - The probability that A occurs given that B has occurred
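
The multiplication and addition rules, together with conditional probability, can be verified by counting outcomes. The sketch below assumes one roll of a fair six-sided die, with events A (even number) and B (greater than 3) chosen purely as examples.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}  # one roll of a fair die
A = {2, 4, 6}                  # roll is even
B = {4, 5, 6}                  # roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

def cond_prob(event, given):
    """P(event | given): count only the outcomes inside `given`."""
    return Fraction(len(event & given), len(given))

p_a_and_b = prob(A & B)  # P(AnB), both occur
p_a_or_b = prob(A | B)   # P(AuB), either occurs

# Rule of multiplication: P(AnB) = P(A) x P(B|A)
print(p_a_and_b, prob(A) * cond_prob(B, A))
# Rule of addition: P(AuB) = P(A) + P(B) - P(AnB)
print(p_a_or_b, prob(A) + prob(B) - p_a_and_b)
```

Exact fractions are used so both sides of each rule can be compared without rounding error.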

Bayes's Formula

P(A|B) = [P(B|A) x P(A)] / P(B) - Probability of an event, based on conditions that might be related to the event - Helps us find P(A|B) given P(B|A)
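
A minimal sketch of the formula as a function. The function name `bayes` and the input probabilities are hypothetical values picked for easy arithmetic.

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) from P(B|A), P(A), and P(B)."""
    return p_b_given_a * p_a / p_b

# Hypothetical inputs: P(B|A) = 0.5, P(A) = 0.2, P(B) = 0.25
p_a_given_b = bayes(p_b_given_a=0.5, p_a=0.2, p_b=0.25)
print(p_a_given_b)
```

With these inputs the numerator is 0.5 x 0.2 = 0.1, and dividing by P(B) = 0.25 gives P(A|B) = 0.4.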

u

Parameter mean

M

Parameter median

What types of charts can you use for categorical variables?

Pie charts and bar charts

Continuous Random Variable

Possible values form an interval rather than a set of separate values

Histogram

Quantitative (use numbers), order matters

Conditional Distribution

Referring to a specific variable in a table--in the picture, it would be one of the "Neither Disagree or Agree" columns/rows

Marginal Distribution

Referring to the margins of a table--usually the calculated totals of rows/columns (in the picture, the totals)

X̄ (X-bar)

Sample mean

M̂ (M-hat)

Sample median

Discrete Random Variable

Set of separate values (0, 1, 2, etc.) - Find the probability for any event by adding the probabilities of the individual outcomes for that event
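
Adding up individual outcome probabilities might look like the sketch below. It assumes X is the number of heads in two fair coin flips; the helper name `event_probability` is made up.

```python
from fractions import Fraction

# Probability distribution of X = number of heads in two fair coin flips
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

def event_probability(pmf, event):
    """Add the probabilities of every value that belongs to the event."""
    return sum(p for value, p in pmf.items() if event(value))

at_least_one_head = event_probability(pmf, lambda x: x >= 1)
print(at_least_one_head)
```

Here P(X >= 1) = P(X = 1) + P(X = 2) = 1/2 + 1/4 = 3/4.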

Distribution

Shows all possible values of data

SRS

Simple Random Sample - Set of individuals chosen at random from a larger population, so that every possible sample of the same size is equally likely

Probability Distribution

Specifies values and their probabilities for a random variable

Variance

Square of the standard deviation

p-value

Probability, assuming the null hypothesis is true, of observing a result at least as extreme as the one actually observed - Smaller values mean stronger evidence against the null hypothesis

Complement

The event does not occur - A' - P(A') = 1 - P(A)

Law of Large Numbers

The larger the number of individuals that are randomly drawn from a population, the more representative the resulting group will be of the entire population
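
One way to see this idea is a quick simulation. The sketch below assumes repeated fair coin flips and a fixed random seed so the run is repeatable; the function name `proportion_heads` is hypothetical.

```python
import random

def proportion_heads(n_flips, seed=0):
    """Simulate n_flips fair coin flips and return the share of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 1_000, 100_000):
    # Larger samples tend to land closer to the true proportion 0.5
    print(n, proportion_heads(n))
```

Small samples can stray far from 0.5, while the proportion from 100,000 flips should sit very close to it.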

Probability

The likelihood that a particular event will occur - Before the event has occurred

Mutually Exclusive / Disjoint Events

Two events that cannot occur at the same time

Discrete Variable

Type of Quantitative Variable - Amount of something

Continuous Variable

Type of Quantitative Variable - Numerical values over an interval - How much?

Standard Deviation

Typical distance of an observation from the mean - Not resistant to outliers
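
A short sketch tying standard deviation to variance. It assumes the sample versions (dividing by n - 1), which is what Python's `statistics.stdev` and `statistics.variance` compute; the data are made up.

```python
from statistics import stdev, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample

s = stdev(data)       # sample standard deviation
var = variance(data)  # sample variance, the square of s
print(s, var)

# When every value is the same, no observation deviates from the mean
constant = [3, 3, 3]
print(stdev(constant))
```

The last line shows the converse as well: a standard deviation of 0 means all the numbers are equal.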

Mode

Value that appears the most in a set of data

Lurking Variable

Variable not considered in a study but has an effect on the results

Empirical Rule

When the distribution of data is normal: - 68% of observations fall within 1 standard deviation of the mean - 95% of observations fall within 2 standard deviations of the mean - 99.7% of observations fall within 3 standard deviations of the mean
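
The three percentages can be reproduced from the normal curve itself. The sketch below uses `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is made up, and the default mean and standard deviation are arbitrary because the result does not depend on them.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Share of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
```

The exact shares are about 68.27%, 95.45%, and 99.73%, which the rule rounds to 68%, 95%, and 99.7%.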

If the standard deviation = 0, then ___

all numbers are equal.

Mean is ___ than median if the graph is skewed right

larger

Side-by-side bar charts are best at measuring ___ values.

numerical

Stacked bar charts are best at measuring ___

proportions

Mean is __ than median if the graph is skewed left

smaller

