Math 202: Probability and Statistics Final Test Prep

Quartile

The division of an ordered data set into four equal parts; the quartiles Q1, Q2 (the median), and Q3 are the three cut points.

Stemplot

A graphical representation of a quantitative data set. Leading values of each data point are presented as stems and second digits are given as leaves.

Ogive

A line graph that depicts cumulative frequencies.

Correlation

A measure of the extent to which two factors vary together.

Standard Deviation

A measure of variability that describes the typical distance of each score from the mean (σ for a population, s for a sample).
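
As a minimal sketch of the computation, the snippet below uses Python's standard `statistics` module. The `scores` list is hypothetical data chosen only for illustration; it shows both the population form (divide by n) and the sample form (divide by n - 1).

```python
import statistics

# Hypothetical scores, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)

# Population standard deviation: squared deviations are averaged over n.
pop_sd = statistics.pstdev(scores)

# Sample standard deviation: squared deviations are divided by n - 1.
sample_sd = statistics.stdev(scores)

print(mean, pop_sd, round(sample_sd, 3))
```

The sample version is always a little larger than the population version for the same data, because it divides by n - 1 instead of n.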

Probability

A number that describes how likely it is that an event will occur

Parameter

A numerical measurement describing some characteristic of a population.

Statistic

A numerical measurement describing some characteristic of a sample.

Degrees of Freedom

A parameter of the t distribution. When the t distribution is used in the computation of an interval estimate of a population mean, the appropriate t distribution has n-1 degrees of freedom, where n is the size of the simple random sample.

z-Test

A parametric inferential statistical test of the null hypothesis for a single sample where the population standard deviation is known.

t-Test

A parametric inferential statistical test of the null hypothesis for a single sample where the population standard deviation is unknown.

Outcome

A possible result of a probability experiment

Independent

A relationship between two events or two sets of data in which the outcome of one has no effect on the outcome of the other.

Experiment

A research method in which an investigator manipulates one or more factors to observe the effect on some behavior or mental process

Quota Sample

A sample deliberately constructed to reflect several of the major characteristics of a given population.

Systematic Sample

A sample drawn by selecting individuals systematically from a sampling frame.

Random Sample

A sample in which every element in the population has an equal chance of being selected.

Convenience Sample

A sample that includes members of the population that are easily accessed.

Normal

A distribution that is symmetric and bell-shaped and follows the Empirical Rule.

Cluster Sample

A sampling design in which entire groups are chosen at random.

Line of Best Fit

A straight line that comes closest to the points on a scatter plot.

Two-way Table

A table containing counts for two categorical variables. It has r rows and c columns.

Frequency Table

A table for organizing a set of data that shows the number of times each item or number appears.

Wording Bias

A type of response bias where the question is posed to achieve a desired result.

Standardized Value

A value found by subtracting the mean and dividing by the standard deviation.
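
A short sketch of this calculation follows. The function name `z_score` and the example numbers (a score of 85 against a mean of 70 and a standard deviation of 10) are illustrative assumptions, not values from the course.

```python
def z_score(x, mean, sd):
    """Standardize x: subtract the mean, then divide by the standard deviation."""
    return (x - mean) / sd


# Hypothetical example: a score of 85 when the mean is 70 and the SD is 10.
z = z_score(85, 70, 10)
print(z)  # 1.5, meaning 1.5 standard deviations above the mean
```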

Lurking Variable

A variable other than x and y that simultaneously affects both variables, accounting for the correlation between the two.

Sample Space

All possible outcomes of an experiment.

Type I Error

An error that occurs when a researcher concludes that the independent variable had an effect on the dependent variable, when no such relation exists; a false positive.

Type II Error

An error that occurs when a researcher concludes that the independent variable had no effect on the dependent variable, when in truth it did; a false negative.

Binomial

An experiment with a fixed number of independent trials, each with the same probability of success, in which the number of successes is counted.

Geometric

An experiment with no fixed number of trials; independent trials are repeated until the first success occurs.

Observational Study

A study that observes individuals and measures variables of interest but does not attempt to influence the responses.

Outlier

An extreme deviation from the mean.

Confounded Variable

An unintended difference between the conditions of an experiment that could have affected the dependent variable.

Undercoverage

Occurs when some groups in the population are left out of the process of choosing the sample.

Experimental Probability

Probability based on what happens when an experiment is actually done.

IQR

Range of the middle 50% of the values; Q3-Q1 = 75th percentile - 25th percentile.
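
The sketch below computes the IQR with `statistics.quantiles` from the Python standard library. The data set is hypothetical. Note that textbooks and software use slightly different quartile conventions, so the default method used here may not match a hand calculation for every data set.

```python
import statistics

# Hypothetical data, already sorted for readability.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 returns the three cut points Q1, Q2 (median), and Q3.
q1, q2, q3 = statistics.quantiles(data, n=4)

iqr = q3 - q1
print(q1, q3, iqr)  # 3.0 9.0 6.0
```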

Steps Used in a Hypothesis Test

Regardless of the type of hypothesis being considered, the process of carrying out a significance test is the same and relies on four basic steps:
Step One: State the null and alternative hypotheses (see section 11.2). Also think about the Type I error (rejecting a true null) and Type II error (declaring the plausibility of a false null) possibilities at this time and how serious each mistake would be in terms of the problem.
Step Two: Collect and summarize the data so that a test statistic can be calculated. A test statistic is a summary of the data that measures the difference between what is seen in the data and what would be expected if the null hypothesis were true. It is typically standardized so that a p-value can be obtained from a reference distribution like the normal curve.
Step Three: Use the test statistic to find the p-value. The p-value represents the probability of getting our test statistic or any test statistic more extreme, if in fact the null hypothesis is true. For a one-sided "greater than" alternative hypothesis, "more extreme" refers to test statistic values larger than the one observed. For a one-sided "less than" alternative hypothesis, "more extreme" refers to test statistic values smaller than the one observed. For a two-sided "not equal to" alternative hypothesis, "more extreme" refers to test statistic values that are farther away from the null hypothesis than the one observed, at either the upper end or the lower end of the reference distribution (both "tails").
Step Four: Interpret what the p-value is telling you and make a decision using the p-value. Does the null hypothesis provide a reasonable explanation of the data or not? If not, the result is statistically significant and we have evidence favoring the alternative. State a conclusion in terms of the problem.
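
The four steps can be sketched for a two-sided, one-sample z-test using `statistics.NormalDist`. Every number below (null mean of 100, known population SD of 15, sample of 36 with mean 105, alpha of 0.05) is a made-up assumption for illustration only.

```python
from math import sqrt
from statistics import NormalDist

# Step One: state the hypotheses (hypothetical values).
# H0: mu = 100, Ha: mu != 100 (two-sided).
mu0 = 100
sigma = 15   # population standard deviation, assumed known
alpha = 0.05

# Step Two: summarize the data and compute the test statistic.
n = 36
sample_mean = 105
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step Three: two-sided p-value, the area in both tails beyond |z|.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step Four: compare the p-value with alpha and decide.
reject_null = p_value < alpha
print(z, round(p_value, 4), reject_null)
```

For a one-sided "greater than" alternative, the p-value would instead be `1 - NormalDist().cdf(z)`; for "less than", it would be `NormalDist().cdf(z)`.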

Marginal Frequency

Row and column totals in a contingency table (cross-tabulation) that represent the univariate frequency distributions for the row and column variables.
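
As an illustration, the sketch below computes the marginal frequencies of a small two-way table. The table (survey answer by class year) and its counts are hypothetical.

```python
# Hypothetical two-way table: rows are survey answers, columns are class years.
table = {
    "Yes": {"Freshman": 12, "Senior": 8},
    "No": {"Freshman": 18, "Senior": 22},
}

# Row marginals: total count for each row category.
row_totals = {row: sum(cols.values()) for row, cols in table.items()}

# Column marginals: total count for each column category.
col_totals = {}
for cols in table.values():
    for col, count in cols.items():
        col_totals[col] = col_totals.get(col, 0) + count

grand_total = sum(row_totals.values())
print(row_totals, col_totals, grand_total)
```

Each individual cell, such as the 12 freshmen who answered "Yes", is a joint frequency; the totals around the edge of the table are the marginal frequencies.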

Snowball Sample

Samples in which informants provide contact information about other people who share some of the characteristics necessary for a study.

Empirical rule

States that, in a normal distribution, about 68% of the terms are within one standard deviation of the mean, about 95% are within two standard deviations, and about 99.7% are within three standard deviations (normal curve).
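
The percentages in this rule can be checked numerically. The sketch below uses `statistics.NormalDist` for the standard normal curve; the helper name `share_within` is an illustrative choice.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(share_within(k), 4))
# 1 0.6827
# 2 0.9545
# 3 0.9973
```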

Descriptive Statistics

Statistical procedures used to describe characteristics and responses of groups of subjects.

Inferential Statistics

Statistics that are used to interpret data and draw conclusions.

Statistical Method

Step 1, Prepare: 1) Context: What is the goal of the study? 2) Source of the data: Is there a conflict of interest behind the data? 3) Sampling method: Which method was used?
Step 2, Analyze: 1) Graph the data. 2) Explore the data (are there outliers? how is the data distributed?). 3) Apply the statistical method.
Step 3, Conclude: Is there statistical significance?

Experiment

The act of conducting a controlled test or investigation.

Simulation

The act of repeating an experiment to get more accurate statistical evidence.

Margin of Error

The range of percentage points in which the sample accurately reflects the population; the range surrounding a sample's response within which researchers are confident the larger population's true response would fall.

Theoretical Probability

The ratio of the number of favorable outcomes to the number of possible outcomes if all outcomes have the same chance of happening.
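
A minimal sketch of this ratio, assuming a fair six-sided die as the example (the die and the "even number" event are illustrative choices):

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die (equally likely outcomes).
sample_space = [1, 2, 3, 4, 5, 6]

# Favorable outcomes for the event "roll an even number".
favorable = [outcome for outcome in sample_space if outcome % 2 == 0]

p_even = Fraction(len(favorable), len(sample_space))
print(p_even)  # 1/2
```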

Central Limit Theorem

The sampling distribution of the mean will approach the normal distribution as the sample size n increases (n > 30 is a common rule of thumb).
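
The simulation below is a rough sketch of this idea. It draws many samples from a clearly non-normal (exponential) population and looks at the distribution of their means. The population, the sample size of 40, the number of samples, and the seed are all arbitrary assumptions chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 40     # n for each sample
NUM_SAMPLES = 2000   # how many sample means to collect


def sample_mean(n):
    """Mean of n draws from an exponential population (mean 1, SD 1)."""
    draws = [random.expovariate(1.0) for _ in range(n)]
    return statistics.fmean(draws)


sample_means = [sample_mean(SAMPLE_SIZE) for _ in range(NUM_SAMPLES)]

center = statistics.fmean(sample_means)
spread = statistics.stdev(sample_means)

# The center should be near the population mean (1.0), and the spread
# should be near sigma / sqrt(n) = 1 / sqrt(40), about 0.158.
print(round(center, 3), round(spread, 3))
```

Even though the individual draws are strongly skewed, a histogram of `sample_means` would look roughly bell-shaped, and its spread is the standard error of the mean.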

Sample Space

The set of all possible outcomes

Standard Error

The standard deviation of a sampling distribution.

Interpolation

Using the Least Squares Regression Line to predict a y-value for an x-value within the x-data set.

Discrete Random Variable

Variable where the number of outcomes can be counted and each outcome has a measurable and positive probability.

Population Probability

The proportion of times you expect something to occur when you draw randomly from a population.

Coefficient of Determination

Measures the percentage of variation in a dependent variable explained by one or more independent variables (r^2).
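
For a single explanatory variable, r^2 is the square of the correlation coefficient r. The sketch below computes both by hand on a small hypothetical data set; the variable names are illustrative.

```python
from math import sqrt

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Sums of cross-products and squared deviations from the means.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
s_yy = sum((b - y_bar) ** 2 for b in y)

r = s_xy / sqrt(s_xx * s_yy)
r_squared = r ** 2

# About 60% of the variation in y is explained by the linear relationship with x.
print(round(r, 4), round(r_squared, 4))  # 0.7746 0.6
```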

Scatter Plot

A graph with points plotted to show a possible relationship between two sets of data.

Scatterplot

A graphed cluster of dots, each of which represents the values of two variables.

Data

Information gathered from observations.

Sample

Items selected at random from a population and used to test hypotheses about the population.

Histogram

A bar graph that shows the frequency of data within equal intervals.

Non-Response Bias

A bias caused by a number of people who did not respond to the survey.

Causation

A cause and effect relationship in which one variable controls the changes in another variable.

Event

A collection of one or more outcomes of an experiment

Spread

A descriptive feature that describes how widely the data vary, such as the range shown on a graph.

Center

A descriptive feature that describes where the middle of the data lies, such as the position of the median relative to the rest of the graph.

Bar Graph

A graph that uses horizontal or vertical bars to display data

Response Bias

Anything in the survey design that influences the responses from the sample.

Voluntary Response Bias

Bias introduced to a sample when individuals can choose on their own whether to participate in the sample.

Law of Large Numbers

Law stating that as the number of items taken at random from a population grows, the sample statistics (such as the sample mean) tend toward the corresponding population values.

Qualitative

Data identified by something other than numbers.

Quantitative

Data that are defined numerically.

Boxplot

Displays the 5-number summary as a central box with whiskers that extend to the non-outlying data values.

Mutually Exclusive

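Events that cannot occur at the same time; if one occurs, the other cannot, so P(A and B) = 0. This is not the same as independence: mutually exclusive events with nonzero probabilities are dependent, because knowing that one occurred tells you the other did not.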

Matched Pairs

Either two measurements are taken on each individual such as pre and post OR two individuals are matched by a third variable (different from the explanatory variable and the response variable) such as identical twins.

Extrapolation

Estimating a value outside the range of measured data.

Simple Random Sample

A sample in which every member of the population, and every possible group of the chosen size, has an equal chance of selection.

Placebo Effect

Experimental results caused by expectations alone; any effect on behavior caused by the administration of an inert substance or condition, which is assumed to be an active agent.

Dotplot

Graphs a dot for each case against a single axis.

Trial

In probability, a single repetition or observation of an experiment

Mean

The arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores.

Block Design

The random assignment of subjects to treatments is carried out separately within each block.

Mode

The value that occurs most often in a set of data.

Residual

The difference between an observed value of the response variable and the value predicted by the regression line.

Sampling Distribution

The distribution of values taken by the statistic in all possible samples of the same size from the same population.

Population

The entire aggregation of items from which samples can be drawn.

Null Hypothesis

The hypothesis that states there is no difference between two or more sets of data in a significance test.

Alternative Hypothesis

The hypothesis which states the Null Hypothesis is incorrect in a significance test.

Probability

The likelihood that a particular event will occur.

Least Squares Regression Line

The line that minimizes the sum of squared residuals.
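
A minimal sketch of fitting this line by hand is shown below. It uses the standard slope and intercept formulas for simple linear regression on a small hypothetical data set, then computes the residuals (observed minus predicted).

```python
# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

x_bar = sum(x) / len(x)
y_bar = sum(y) / len(y)

# Slope = sum of cross-products / sum of squared x-deviations.
s_xy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
s_xx = sum((a - x_bar) ** 2 for a in x)
slope = s_xy / s_xx

# The least squares line always passes through (x_bar, y_bar).
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y.
predicted = [intercept + slope * a for a in x]
residuals = [b - p for b, p in zip(y, predicted)]
sum_squared_residuals = sum(e ** 2 for e in residuals)

print(round(slope, 2), round(intercept, 2), round(sum_squared_residuals, 2))
```

For this data the fitted line is approximately y = 2.2 + 0.6x. The residuals of a least squares line with an intercept sum to zero, up to rounding.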

Correlation

The measure of a relationship between two variables or sets of data.

Median

The middle score in a distribution; half the scores are above it and half are below it.

Joint Frequency

The number of responses that share a given combination of characteristics; the count in one cell of a two-way table.

Stratified Sample

The population is divided into strata and a random sample is taken from each stratum.

p-Value

The probability of getting a result at least as extreme as the one observed, assuming the null hypothesis is true. The lower the value, the stronger the evidence against the null hypothesis.

Conditional Probability

The probability that a particular event will occur, given that another event has already occurred.
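
In symbols, P(A | B) = P(A and B) / P(B). The sketch below applies this to one roll of a fair die, with A = "roll a 2" and B = "roll an even number"; the events and helper names are illustrative assumptions.

```python
from fractions import Fraction

# Sample space for one roll of a fair six-sided die.
sample_space = [1, 2, 3, 4, 5, 6]


def prob(event):
    """Probability of an event, given as a true/false test on each outcome."""
    favorable = sum(1 for outcome in sample_space if event(outcome))
    return Fraction(favorable, len(sample_space))


def is_two(outcome):
    return outcome == 2


def is_even(outcome):
    return outcome % 2 == 0


p_b = prob(is_even)
p_a_and_b = prob(lambda outcome: is_two(outcome) and is_even(outcome))

# P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 1/3
```

Knowing the roll is even shrinks the possible outcomes to {2, 4, 6}, so the chance of a 2 rises from 1/6 to 1/3.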

Statistical Significance

A result is statistically significant when the p-value is less than the significance level alpha (0.05 if not given). It means that chance alone would rarely produce an equally extreme result.

Outlier

A number in a set of data that is much larger or much smaller than most of the other numbers in the set.

Box-and-Whisker Plot

Shows the distribution of data. The middle half of the data is represented by a "box" with a vertical line at the median. The lower fourth and upper fourth are represented by "whiskers" that extend to the smallest and largest values.

Histogram

Shows the frequency of continuous data grouped into intervals.

Chi-Squared Goodness of Fit

Uses sample data to test hypotheses about the shape or proportions of a population distribution. The test determines how well the obtained sample proportions fit the population proportions specified by the null hypothesis.
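
As a sketch, the test statistic is the sum of (observed - expected)^2 / expected over all categories. The example below tests whether a die is fair using hypothetical counts from 120 rolls. The critical value 11.07 is the standard table value for 5 degrees of freedom at alpha = 0.05; it is hard-coded here because the Python standard library has no chi-squared distribution.

```python
# Hypothetical counts of each face from 120 rolls of a die.
observed = [18, 22, 20, 25, 15, 20]

# Under the null hypothesis (fair die), every face is equally likely.
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-squared statistic: sum of (O - E)^2 / E over the categories.
chi_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of categories - 1.
degrees_of_freedom = len(observed) - 1

# Table value for df = 5 at alpha = 0.05 (assumed significance level).
CRITICAL_VALUE = 11.07
reject_null = chi_squared > CRITICAL_VALUE

print(round(chi_squared, 2), degrees_of_freedom, reject_null)  # 2.9 5 False
```

Here the statistic is well below the critical value, so these hypothetical counts give no evidence that the die is unfair.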
