STATISTICS

BAR CHART

A bar chart is made up of columns or rows plotted on a graph. Each column is positioned over a label that represents one category of a categorical variable. The height of the column indicates the size of the group defined by the column label.

BOXPLOTS

A box plot is a graphical rendition of statistical data based on the minimum, first quartile, median, third quartile, and maximum (a five-number summary). The term "box plot" comes from the fact that the graph looks like a rectangle with lines extending from the top and bottom.
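
As a rough illustration of the five-number summary behind a box plot, here is a minimal Python sketch. The function name `five_number_summary` and the sample data are made up for this card, and the "inclusive" quartile method is only one of several conventions, so other software may report slightly different quartiles.

```python
import statistics

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # method="inclusive" is an assumption; other quartile conventions
    # can give slightly different values for Q1 and Q3.
    q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return (min(data), q1, q2, q3, max(data))

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)
```

The box is drawn from Q1 to Q3, with a line at the median.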

PIE CHART

A pie chart (or a circle chart) is a circular statistical graphic that is divided into slices to illustrate numerical proportion. In a pie chart, the arc length of each slice (and consequently its central angle and area) is proportional to the quantity it represents.

T-TEST

A t-test compares the means of two populations using sample data. A two-sample t-test is commonly used with small sample sizes to test the difference between the sample means when the variances of the two normal distributions are not known.

VARIABLE

A variable is any characteristic, number, or quantity that can be measured or counted. A variable may also be called a data item. Age, sex, business income and expenses, country of birth, capital expenditure, class grades, eye colour and vehicle type are examples of variables. A variable must vary (meaning it has more than one possible value).

BIMODAL HISTOGRAM

Basically, a bimodal histogram is just a histogram with two obvious relative modes, or data peaks.

FREQUENCY TABLE FROM SPSS

Breaks down a specific variable into its categories (for example, it shows exactly how many males and females there are, with their percentages). It is communicated visually with a Bar Chart. If there are 2 or 3 categories, Pie Charts are also a good way to represent the data. Use the Frequency procedure.

STATISTICS TABLE FROM SPSS

Describes the sample size (N) and the number of missing values to provide an overview of the data. Use the Frequency procedure.

DESCRIPTIVE STATISTICS

Descriptive statistics are brief descriptive coefficients that summarize a given data set, which can represent either the entire population or a sample of it. Descriptive statistics are broken down into measures of central tendency and measures of variability (spread).

METRIC VARIABLE SPSS USE

For Metric variables, it is best to use the EXPLORE procedure. Examples of metric variables: number of visits to the supermarket last month, or weight.

SIGNIFICANCE VALUE (P-VALUE)

If the value is p < 0.050, then it is a significant finding (meaning there is evidence against the null hypothesis). If the value produced is p >= 0.050, then the finding is NOT significant.

FREQUENCY

In statistics the frequency (or absolute frequency) of an event is the number of times the event occurred in an experiment or study.
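
To make the idea concrete, the following Python sketch counts absolute frequencies and percentages, similar in spirit to a frequency table. The function name `frequency_table` and the data are invented for illustration; this is not how SPSS itself computes its output.

```python
from collections import Counter

def frequency_table(values):
    """Map each category to (absolute frequency, percentage of all cases)."""
    counts = Counter(values)
    total = len(values)
    return {category: (count, 100 * count / total)
            for category, count in counts.items()}

sexes = ["male", "female", "female", "male", "female"]  # made-up data
table = frequency_table(sexes)
for category, (count, percent) in table.items():
    print(category, count, percent)
```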

NEGATIVELY SKEWED HISTOGRAM

Negative skew indicates that the tail on the left side of the probability density function is longer or fatter than the tail on the right side. The term does not distinguish between these two shapes (longer versus fatter).

RULES FOR BELL-SHAPED CURVE OR SYMMETRICAL DISTRIBUTION

Commonly known as the 68-95-99.7 rule.

POPULATION

Population is the entire pool from which a statistical sample is drawn. The information obtained from the sample allows statisticians to develop hypotheses about the larger population

POSITIVELY SKEWED HISTOGRAM

Positive skew indicates that the tail on the right side is longer or fatter than the left side.

RANDOM SAMPLE

A random sample is a subset of a statistical population in which each member of the population has an equal probability of being chosen.

OUTLIER (Boxplots & SPSS)

By convention, SPSS (Statistical Package for the Social Sciences) treats any value more than one and a half box lengths (and up to three box lengths) below the bottom of the box or above the top of the box as an outlier. The outlier is usually labelled with its case number, so you can go back and check the case if necessary.

EXTREME VALUE (Boxplots & SPSS)

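SPSS (Statistical Package for the Social Sciences) defines an extreme value as anything more than three box lengths above the top of the box or below the bottom of the box. The box length is the distance between the first and third quartiles.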
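
The outlier and extreme-value rules can be sketched in Python as follows. This is an illustration under assumptions, not SPSS's own code: the function name `classify_boxplot_points` and the data are invented, and the quartiles here use Python's "inclusive" method, which may differ slightly from the hinges SPSS uses to draw the box.

```python
import statistics

def classify_boxplot_points(data):
    """Return (outliers, extremes) using the 1.5 and 3 box-length rules."""
    q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
    box_length = q3 - q1  # the interquartile range
    outliers, extremes = [], []
    for x in data:
        # Distance beyond the nearer edge of the box (<= 0 inside the box).
        distance = max(q1 - x, x - q3)
        if distance > 3 * box_length:
            extremes.append(x)
        elif distance > 1.5 * box_length:
            outliers.append(x)
    return outliers, extremes

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 18, 30]  # made-up data
outliers, extremes = classify_boxplot_points(values)
print(outliers, extremes)
```

With these data the box runs from 3.5 to 8.5, so 18 lies between 1.5 and 3 box lengths above the box and 30 lies more than 3 box lengths above it.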

SKEWNESS

Skewness is a measure of the asymmetry of the probability distribution of a real-valued random variable about its mean. The skewness value can be positive or negative, or even undefined
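
One common way to compute a skewness value is the moment coefficient (third central moment divided by the standard deviation cubed). The Python sketch below assumes that definition; the function name `skewness` is made up, and statistical packages often report a sample-adjusted variant, so their figures may differ somewhat.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2 ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5

print(skewness([1, 2, 3, 4, 10]))   # long right tail: positive
print(skewness([1, 7, 8, 9, 10]))   # long left tail: negative
print(skewness([1, 2, 3, 4, 5]))    # symmetric: zero
```

The value is undefined when all values are identical, because the denominator is zero.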

SYMMETRICAL DISTRIBUTION

A symmetrical distribution is one in which the values on either side of the centre mirror each other, occurring at regular frequencies, and the mean, median and mode occur at the same point.

EXPLORE PROCEDURES SPSS AND PERCENTILES

Percentiles can be produced with the Explore procedure.

MEAN

The mean is the average of the numbers: a calculated "central" value of a set of numbers. To calculate it, add up all the numbers, then divide by how many numbers there are. Example: 10 + 15 + 20 + 30 + 45 = 120; there are 5 numbers, so 120 / 5 = 24. Mean: 24. The mean is used to measure the central tendency of the distribution for metric data.
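
The same calculation as a short Python sketch, using the numbers from the example (the function name `mean` is simply chosen for illustration):

```python
def mean(values):
    """Add up all the numbers, then divide by how many there are."""
    return sum(values) / len(values)

result = mean([10, 15, 20, 30, 45])
print(result)  # 24.0
```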

MEDIAN

The median is a simple measure of central tendency. To find the median, we arrange the observations in order from smallest to largest value. If there is an odd number of observations, the median is the middle value. If there is an even number of observations, the median is the average of the two middle values. The median is the preferred measure of central tendency for metric data with skewed distributions.
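
The odd and even cases can be sketched in Python like this (the function name `median` and the data are illustrative only):

```python
def median(values):
    """Middle value of the sorted data, or the average of the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]  # odd count: the single middle value
    return (ordered[middle - 1] + ordered[middle]) / 2  # even count

print(median([45, 10, 30, 15, 20]))  # 20
print(median([10, 15, 20, 30]))      # 17.5
```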

BELL SHAPED CURVE

The normal distribution has a very special property. For any given mean and standard deviation, about 95% of values fall within two standard deviations of the mean.

RULES FOR SYMMETRICAL DISTRIBUTION

There are two useful rules. 1. In all normal distributions 68% of values lie within one standard deviation of the mean. 2. In all normal distributions, 99.7% of values fall within 3 standard deviations of the mean.
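
These percentages can be checked with Python's standard library. The sketch below is only a verification aid; the function name `share_within` is made up, and the mean of 100 and standard deviation of 15 in the last line are arbitrary values chosen to show that the result does not depend on them.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))  # 68.3, 95.4, 99.7
print(round(100 * share_within(2, mean=100, sd=15), 1))  # 95.4
```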

METRIC DATA SPSS PROCEDURES

Use the Explore procedure. Example of a metric variable: how many pets do you have? Contrast this with a categorical variable: do you prefer cats or dogs? The Explore procedure produces a Descriptives table. The preferred visual for metric data is a Histogram.

TESTS (BINOMIAL VS ONE-SAMPLE T-TEST)

With Categorical variables, we use Binomial tests. With Metric variables, we use One-sample t-test.
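
For the metric case, the test statistic of a one-sample t-test can be sketched in Python as below. The function name `one_sample_t`, the data, and the hypothesised mean of 10 are all invented for illustration. The sketch stops at the t value; turning it into a p-value requires the t distribution with n - 1 degrees of freedom, which a statistics package such as SPSS provides.

```python
import math
import statistics

def one_sample_t(values, hypothesised_mean):
    """t = (sample mean - hypothesised mean) / (sample SD / sqrt(n))."""
    n = len(values)
    sample_mean = statistics.mean(values)
    sample_sd = statistics.stdev(values)  # sample SD (divides by n - 1)
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - hypothesised_mean) / standard_error

hours = [12, 10, 14, 11, 13]  # made-up study hours per week
t_value = one_sample_t(hours, hypothesised_mean=10)
print(round(t_value, 3))  # 2.828
```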

HISTOGRAM

A histogram is a display of statistical information that uses rectangles to show the frequency of data items in successive numerical intervals of equal size. In the most common form of histogram, the independent variable is plotted along the horizontal axis and the dependent variable is plotted along the vertical axis. Histograms give a very good impression of the distribution.

STANDARD DEVIATION

A measure of the spread of values in a set of numbers. Standard deviation is a statistic used as a measure of the dispersion or variation in a distribution, equal to the square root of the arithmetic mean of the squares of the deviations from the arithmetic mean.
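
The definition above translates almost word for word into Python. Note that it divides by the number of values (the population form); the sample standard deviation reported by most software divides by n - 1 instead. The function name `population_sd` and the data are illustrative.

```python
import math

def population_sd(values):
    """Square root of the mean of the squared deviations from the mean."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = [(x - mean) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / n)

sd = population_sd([2, 4, 4, 4, 5, 5, 7, 9])
print(sd)  # 2.0
```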

ORDINAL VARIABLE

An ordinal variable is one where the order matters but not the difference between values. For example, you might ask patients to express the amount of pain they are feeling on a scale of 1 to 10. A score of 7 means more pain than a score of 5, and that is more than a score of 3.

MODE

A statistical term that refers to the most frequently occurring number found in a set of numbers. The mode is found by collecting and organizing the data in order to count the frequency of each result. The result with the highest occurrences is the mode of the set. The mode is used as a measure of central tendency in categorical data.
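
Counting the frequency of each result and picking the highest count can be sketched in Python as follows. The function name `modes` is made up; it returns a list because a data set can have more than one mode.

```python
from collections import Counter

def modes(values):
    """Return every value that shares the highest frequency."""
    counts = Counter(values)
    highest = max(counts.values())
    return [value for value, count in counts.items() if count == highest]

print(modes(["cat", "dog", "cat", "bird"]))  # ['cat']
print(modes([1, 1, 2, 2, 3]))                # [1, 2]
```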

STANDARDISED VALUES OR Z-SCORES

A z-score is a measure of how many standard deviations below or above the population mean a raw score is. A z-score is also known as a standard score and it can be placed on a normal distribution curve. Formula: (data value - mean) / standard deviation.

PERCENTILES

Each of the 100 equal groups into which a population can be divided according to the distribution of values of a particular variable. Percentiles are useful for placing an individual case within a distribution, and they also help summarise the distribution. Example: the 10th percentile is the value below which 10% of the sample falls.

CATEGORICAL VARIABLE SPSS USE

For Categorical variables, it is best to use the FREQUENCY procedure. Example: Location: 1. Inner city, 2. Suburbs, 3. Country. - or - Level of participation in a sport: 1. Once a week, 2. Twice a week, etc.

TYPE OF GRAPH: METRIC VARIABLES

Histograms are best to represent metric variables.

NOMINAL VARIABLE

A nominal variable assigns names (labels) to the groups being measured; the numeric codes carry no order. Example: 1 = cats, 2 = dogs.

CATEGORICAL VARIABLE

In statistics, a categorical variable is a variable that can take on one of a limited, and usually fixed, number of possible values, assigning each individual or other unit of observation to a particular group or nominal category on the basis of some qualitative property. Example: gender (male vs. female). To best analyse a categorical variable with SPSS, run the Frequency procedure and produce graphs.

BIAS

In statistics, the bias (or bias function) of an estimator is the difference between this estimator's expected value and the true value of the parameter being estimated. An estimator or decision rule with zero bias is called unbiased. Otherwise the estimator is said to be biased.

SAMPLE

A limited number of observations selected from a population on a systematic or random basis, which (upon mathematical manipulation) yield generalizations about the population.

Z-SCORE EXAMPLE

Mean: 10. Hours you spend studying per week: 12. Standard deviation: 2 (2 hours per week). Then: (data value - mean) / standard deviation = (12 - 10) / 2 = 2 / 2 = 1.
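
The same worked example as a small Python sketch (the function name `z_score` is chosen for illustration; the numbers are the ones from the example):

```python
def z_score(value, mean, sd):
    """Number of standard deviations the value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z = z_score(12, mean=10, sd=2)
print(z)  # 1.0
```

A z-score of 1 means that 12 hours of study per week is one standard deviation above the mean.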

