Introduction to Statistics, Exam 1 Ch.1,2,3

Discrete Data

A finite or countable number of values; can be counted (e.g. number of kids, number of classes taken). Used for single-value grouping.

Histogram

A graph representing the distribution of a quantitative variable. (displays cross-sectional data).

Parameter

A numerical measurement describing some characteristic of a population (Greek letters, e.g. μ = mean (average)).

Statistic

A numerical measurement describing some characteristic of a sample (Roman letters, e.g. x̄ = mean (average)).

Categorical variable

A variable whose values fall into particular categories (signs of categorical variables appear in question form).

Splitting stems

A way to double the number of stems when all the leaves would otherwise fall on just a few stems.
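A minimal Python sketch of splitting stems, assuming two-digit whole numbers and the common convention that leaves 0-4 go on the first copy of a stem and leaves 5-9 on the second. The function name `split_stemplot` and the data are made up for illustration.

```python
def split_stemplot(values):
    """Return stemplot rows with each stem split in two (leaves 0-4, leaves 5-9)."""
    rows = {}
    for v in sorted(values):
        stem, leaf = divmod(v, 10)       # stem = all but last digit, leaf = last digit
        half = 0 if leaf <= 4 else 1     # which copy of the stem the leaf belongs to
        rows.setdefault((stem, half), []).append(leaf)
    return [
        f"{stem} | {''.join(map(str, leaves))}"
        for (stem, _), leaves in sorted(rows.items())
    ]

ages = [20, 21, 23, 24, 25, 27, 28, 29, 30, 31, 36, 38]
split_rows = split_stemplot(ages)
print("\n".join(split_rows))
```

Without splitting, these twelve values would sit on only two stems (2 and 3); splitting spreads them over four rows.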

AKA Qualitative Data

AKA Categorical Data

Stem

The part of an observation that consists of all but the final (rightmost) digit.

Designed Experiment

Applying some treatment to the subjects (experimental units), then observing its effects and taking measurements.

Variable(s)

Any characteristics of an individual (can take different values for different individuals). The type of data is determined by the type of variable.

Normal distributions

Bell-shaped, symmetric curves. The mean and standard deviation describe them well, as they do other symmetric distributions without outliers.

Boxplots

Based on the five-number summary. Used to compare several distributions. The box spans the quartiles and shows the variability of the central half of the distribution; the median is marked within the box. Lines extend from the box to the extremes (lower and upper bounds), showing the full variability of the data.

Qualitative Data

Can be separated into different categories that are distinguished by some non-numerical characteristic. Used for pie and bar charts.

Ratio (Quantitative)

Data that can be arranged in order, where differences are meaningful and there is a natural starting point of zero. (Always a number) (e.g. cost of a book, distance traveled, weight, age)

Interval (Quantitative)

Data that can be arranged in order, where differences are meaningful but there is no natural starting point of zero. (Always a number) (e.g. body temperature, years as in dates)

Ordinal (Qualitative)

Data that can be arranged in an order, but the differences are meaningless. (Some order) (e.g. grades: A, B, C, D, F; drink sizes: small, medium, large)

Nominal (Qualitative)

Data that consists of names, labels, or categories. Cannot be arranged in an order. (Category only) (e.g. colors: red, blue, green, yellow; survey responses: yes, no, undecided)

Median

Describes the center of a distribution: the midpoint of the ordered values, with half the observations below it and half above.
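A short Python sketch of finding the midpoint, assuming the usual rule that an even number of observations uses the average of the two middle values. The function name and sample values are illustrative.

```python
def median(values):
    """Midpoint of the sorted values; averages the two middle values when n is even."""
    xs = sorted(values)
    n = len(xs)
    mid = n // 2
    if n % 2 == 1:
        return xs[mid]                   # odd n: the single middle value
    return (xs[mid - 1] + xs[mid]) / 2   # even n: average of the two middle values

odd_case = median([7, 1, 5])
even_case = median([7, 1, 5, 3])
print(odd_case, even_case)
```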

Quartiles

Describe the variability (spread) of a distribution (e.g. whether the middle half of the data is tightly packed).

2 Types of Statistics

Descriptive & Inferential

Quantitative Data can be broken down into...

Discrete & Continuous

Stemplot

Display of the distribution of a small data set that presents more detailed information than a histogram. Consists of stems and leaves.
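A minimal Python sketch of building a stemplot, assuming non-negative whole numbers so that the leaf is the last digit and the stem is everything before it. The function name `stemplot` and the scores are made up for illustration.

```python
from collections import defaultdict

def stemplot(values):
    """Return one text row per stem, with the leaves in increasing order."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # stem = all but last digit, leaf = last digit
        rows[stem].append(leaf)
    lines = []
    # Walk every stem from smallest to largest so empty stems still show as gaps.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(leaf) for leaf in rows.get(stem, []))
        lines.append(f"{stem} | {leaves}")
    return lines

scores = [62, 75, 78, 81, 83, 83, 90, 97]
plot = stemplot(scores)
print("\n".join(plot))
```

Unlike a histogram, every original value can be read back from the rows.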

Stratified Sampling

Divide the population into at least 2 subgroups (strata) whose members share characteristics, then draw a sample from each subgroup.
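A minimal Python sketch of the idea, assuming the same number of individuals is drawn at random from every subgroup (other allocations are possible). The names `stratified_sample`, `stratum_of`, and the student list are hypothetical.

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=0):
    """Group the population by stratum, then randomly draw from each group."""
    rng = random.Random(seed)            # fixed seed only so the sketch is repeatable
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, per_stratum))
    return sample

students = [("Ana", "freshman"), ("Ben", "freshman"), ("Cy", "freshman"),
            ("Dee", "senior"), ("Eli", "senior"), ("Fay", "senior")]
chosen = stratified_sample(students, stratum_of=lambda s: s[1], per_stratum=2)
print(chosen)
```

Every subgroup is guaranteed to appear in the sample, which a plain random draw does not promise.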

Cluster Sampling

Divide the population into sections (clusters), randomly select some clusters, choose all members from selected clusters.
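A minimal Python sketch of the three steps, assuming the clusters are already known. The dorm names, member names, and the function name `cluster_sample` are made up for illustration.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly pick whole clusters, then keep every member of the picked clusters."""
    rng = random.Random(seed)            # fixed seed only so the sketch is repeatable
    picked = rng.sample(sorted(clusters), n_clusters)
    sample = [member for name in picked for member in clusters[name]]
    return picked, sample

dorms = {
    "North": ["Ana", "Ben"],
    "South": ["Cy", "Dee", "Eli"],
    "East": ["Fay"],
    "West": ["Gus", "Hal"],
}
picked, sample = cluster_sample(dorms, 2)
print(picked, sample)
```

The randomness is in which clusters are chosen; inside a chosen cluster nobody is left out.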

Random Sample

Each individual has an equal chance of being selected (picture putting all the names in a hat and then picking a name).

Simple Random Sample

Every possible sample of the same size (n) has the same chance of being chosen (picture voting districts and choosing a district at random)
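A short Python sketch, assuming a population of 20 numbered individuals and a sample size of n = 5 (both values are illustrative). `random.sample` draws without replacement so that every set of 5 is equally likely, and `math.comb` counts how many such sets exist.

```python
import math
import random

population = list(range(1, 21))      # individuals labeled 1..20
n = 5

rng = random.Random(42)              # fixed seed only so the sketch is repeatable
srs = rng.sample(population, n)      # one simple random sample of size n

n_possible = math.comb(len(population), n)   # number of distinct samples of size n
print(srs, n_possible)
```

Each of the 15504 possible samples of size 5 has the same chance, 1/15504, of being the one drawn.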

Skewed distributions

Have a tail that extends to the left or right of the bulk of the data. The five-number summary is well suited to describing skewed distributions, but it does not always fully describe the shape of a particular distribution.

Both start with P

How to remember Parameter is for Population

Both start with S

How to remember Statistic is for a Sample

Inferential

Involves using sample data to draw conclusions about a population. A conclusion is made.

Descriptive

Involves collecting, organizing, summarizing, and presenting data with graphs, charts, and tables. The tables use averages, measures of variation, and percentages. No conclusions are made.

Outlier

An individual value that falls outside the overall pattern.

Symmetric distribution

The left and right sides of the histogram are approximately mirror images of each other.

Trend

Long-term upward or downward movement over time.

Mu (μ)

Mean of a density curve (population mean): the true mean of all individuals in the population, including those outside your sample data. Also the balance point of the curve.

Continuous Data

Measured; contains no gaps, interruptions, or jumps (e.g. temperature, weight, time, distance).

Variance (s²) and standard deviation (s)

Measures variability about the mean as center.
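A minimal Python sketch, assuming the usual sample formulas (divide the sum of squared deviations from the mean by n - 1, then take the square root). The formulas are not spelled out on the card, and the data values are made up.

```python
import math
import statistics

data = [1, 2, 3, 4, 5]
n = len(data)

x_bar = sum(data) / n                                    # sample mean, the center
s2 = sum((x - x_bar) ** 2 for x in data) / (n - 1)       # sample variance
s = math.sqrt(s2)                                        # sample standard deviation

# The standard library gives the same results.
print(s2, statistics.variance(data))
print(s, statistics.stdev(data))
```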

4 Categories of Data

Nominal, Ordinal, Interval, Ratio

Sampling Bias

Non-Response, Voluntary Response, Convenience Sample

Quantitative Data

Numbers, representing counts or measures

Quantitative variable

Numerical values for which arithmetic operations such as adding and averaging make sense (recorded with a unit of measurement).

Observational Study

Observing and measuring specific characteristics WITHOUT attempting to modify the subjects being studied. Can reveal association.

First quartile, Q1

One-fourth of the observations fall below it, and 3/4 above it.

Symbols Representing Measurements

Parameter & Statistic

Five-number summary

Provides a quick overall description of a distribution by looking at the median, the quartiles, and the smallest and largest individual observations.
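A minimal Python sketch, assuming the textbook convention that Q1 is the median of the observations below the overall median and Q3 is the median of those above it. Software packages often use other quartile rules and may give slightly different values. The function name and data are illustrative.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum); needs at least two values."""
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]        # observations below the median position
    upper = xs[n - half:]    # observations above the median position
    return (
        xs[0],
        statistics.median(lower),
        statistics.median(xs),
        statistics.median(upper),
        xs[-1],
    )

data = [13, 15, 16, 16, 19, 20, 21, 22, 25]
summary = five_number_summary(data)
print(summary)
```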

Types of Data

Quantitative & Qualitative

Bar graphs

Represent each category as a bar.

Sigma (σ)

Standard deviation of a density curve (population standard deviation).

Systematic Sampling

Selecting every nth element
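A short Python sketch, assuming a random starting point among the first k elements (a common refinement that is not stated on the card) and a step of k = 5. The roster and the function name `systematic_sample` are made up.

```python
import random

def systematic_sample(items, k, seed=0):
    """Pick a random start among the first k items, then take every k-th item."""
    start = random.Random(seed).randrange(k)   # fixed seed only for repeatability
    return items[start::k]

roster = list(range(1, 31))          # 30 individuals labeled 1..30
every_fifth = systematic_sample(roster, 5)
print(every_fifth)
```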

Pie charts

Show the distribution of a categorical variable as a "pie" whose slices are sized by the counts or percents for the categories.

Mean

Sometimes denoted x̄ (x bar); describes the center of a distribution (arithmetic average of the observations).

Variability

Sometimes referred to as 'spread': how spread out the data is.

Sampling Strategies (Not Random)

Systematic, Convenience, Stratified, Cluster

Distribution

Tells us what values a variable takes and how often it takes these values.
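A small Python sketch of a distribution for a discrete variable, assuming made-up data on the number of kids in eight households. Each value is paired with its count and its proportion of all observations.

```python
from collections import Counter

kids = [0, 2, 1, 2, 3, 2, 0, 1]      # hypothetical observations
counts = Counter(kids)               # how often each value occurs
n = len(kids)

# value -> (count, proportion of all observations)
distribution = {value: (counts[value], counts[value] / n) for value in sorted(counts)}

for value, (count, share) in distribution.items():
    print(value, count, share)
```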

Leaf

The final digit of an observation (right-most).

Third quartile, Q3

Three-fourths of the observations fall below it, and 1/4 above it.

Convenience Sampling

Using results that are easy to get

Skewed to the left distribution

When the left side of the histogram extends much farther out than the right side.

Skewed to the right distribution

When the right side of the histogram extends much farther out than the left side.

Z - score

z = (x - μ) / σ says how many standard deviations x lies from the distribution mean (standardized).
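A minimal Python sketch of the formula, assuming a distribution with mean 100 and standard deviation 15. These numbers are illustrative and not from the card.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

z = z_score(130, mu=100, sigma=15)
print(z)
```

A positive z means x is above the mean; a negative z means it is below.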

Probability sample

A random device, such as tossing a coin or referring to a table of random numbers, is used to decide which members of the population will be in the sample, instead of leaving the decision to humans.

Representative sample

A sample that reflects as closely as possible the relevant characteristics of the population under consideration.

