Statistics & Research Methods Exam One

When box plots are used to display means they also provide information about the interquartile ranges and outliers?

False

(Σ X(i) ) ^2 =

(Σ X) ^2

A third summation expression differs from expression 1.2 because the parentheses signal that the summation operation should be executed first (that is, the X scores should be summed) and then this sum should be squared. What does this equation look like?

(ΣX) ^2 = (X1 + X2 + X3)^2
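
A minimal Python sketch of why the parentheses matter. The three scores are hypothetical, and the comparison expression is assumed to be the sum of squared scores, ΣX^2:

```python
# Hypothetical scores X1, X2, X3 (illustrative values only).
scores = [1, 2, 3]

# (ΣX)^2: sum the scores first, then square the sum.
square_of_sum = sum(scores) ** 2

# ΣX^2: square each score first, then sum the squares.
sum_of_squared_scores = sum(x ** 2 for x in scores)

print(square_of_sum)          # (1 + 2 + 3)^2 = 36
print(sum_of_squared_scores)  # 1 + 4 + 9 = 14
```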

Identify each of the following as a variable or a constant. Explain your answer 1. The number of hours in a day 2. People's attitudes toward abortion 3. The country of birth of presidents in the United States 4. The value of a number divided by itself 5. The total number of points scored in a football game

1. A constant because there are always 24 hours in a day 2. A variable because different people have different attitudes toward abortion 3. A constant because all presidents of the United States must be born in the United States 4. A constant because a number divided by itself always equals 1.00 5. A variable because different games have different total numbers of points scored

What are the steps in sum of squares?

1. First compute the mean and subtract it from each score to get the deviation scores 2. Square each deviation score 3. Now sum all of the squared deviation scores

In the computational formula for the sum of squares what are the steps?

1. First calculate the sum of all of the X scores 2. Next we square each score and add the squared scores together to get a sum 3. Finally, we substitute these values along with N (the number of cases) into the computation formula and solve for the sum of squares
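
The two procedures above can be sketched in Python. The card does not spell out the computational formula, so this assumes the standard form SS = ΣX^2 - (ΣX)^2 / N; the data are hypothetical:

```python
def ss_definitional(scores):
    """Sum of squared deviations from the mean."""
    mean = sum(scores) / len(scores)
    return sum((x - mean) ** 2 for x in scores)


def ss_computational(scores):
    """Assumed computational formula: SS = ΣX^2 - (ΣX)^2 / N."""
    n = len(scores)
    sum_x = sum(scores)
    sum_x_squared = sum(x ** 2 for x in scores)
    return sum_x_squared - sum_x ** 2 / n


scores = [2, 4, 6]  # hypothetical data
print(ss_definitional(scores))   # 8.0
print(ss_computational(scores))  # 8.0
```

Both routes give the same value; the computational version avoids calculating each deviation score by hand.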

3 rules for presenting grouped data

1. How many groups should be reported? 2. What should the interval size be for each group? 3. What should be the beginning value for the lowest interval? No strict rules govern these issues, but these are central questions to ask, and the answers given here are particular to the textbook and do not necessarily apply to all cases.

Steps of finding the interquartile range

1. Order the scores from lowest to highest 2. Section the scores into four groups (quartiles) of equal size, for example three scores each when there are 12 scores 3. Eliminate the bottommost and topmost groups to be left with just the middle groups 4. Now calculate the range from the highest score in the middle groups to the lowest score in the middle groups
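
The steps above can be sketched as follows. This is a simplified illustration that assumes the number of scores divides evenly by four; statistical software typically uses interpolation and may give slightly different values. The function name and data are hypothetical:

```python
def trimmed_iqr(scores):
    """IQR by trimming the top and bottom quarters of the ordered scores."""
    ordered = sorted(scores)
    n = len(ordered)
    if n == 0 or n % 4 != 0:
        raise ValueError("this sketch needs a non-empty list whose length is a multiple of 4")
    quarter = n // 4
    middle = ordered[quarter:n - quarter]  # drop the bottom and top groups
    return middle[-1] - middle[0]          # range of the middle 50%


scores = [7, 3, 12, 1, 9, 5, 10, 2, 8, 4, 11, 6]  # hypothetical, 12 scores
print(trimmed_iqr(scores))  # middle half is 4..9, so 9 - 4 = 5
```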

Identify each of the following as a quantitative or qualitative variable. 1. age 2. gender 3. eye color

1. Quantitative 2. Qualitative 3. Qualitative

Identify each of the following as a quantitative or qualitative variable. 1. weight 2. religion 3. income

1. Quantitative 2. Qualitative 3. Quantitative

Indicate whether each of the following variables is discrete or continuous 1. grains of sand on a beach 2. height 3. the annual federal budget 4. shyness

1. discrete 2. continuous 3. discrete 4. continuous

Rounding a mean to ____ or _____ decimal places provides a better measure of where scores for a discrete variable tend to cluster, and is more useful than a crude index that rounds the mean to the nearest integer.

2, more

What is the upper real limit of 20?

20.5

Frequency histogram

A bar graph that represents the frequency distribution of a data set. The horizontal dimension of this graph is referred to as the X axis or as the abscissa, and the vertical dimension is referred to as the Y axis or the ordinate

What can a box plot also be referred to as?

A box and whiskers plot

What is another type of graph for displaying information about central tendency and variability?

A box plot

Frequency polygon

A graph of a frequency distribution that shows the number of instances of obtained scores, usually with the data points connected by straight lines. Similar to a frequency histogram, except that solid dots corresponding to the appropriate frequencies are placed directly above the score values and the dots are connected with lines.

What is a probability distribution? Why is the nature of probability distributions for qualitative and discrete variables different from that for continuous variables?

A probability distribution is a distribution that represents the probabilities associated with all possible score values for a variable. The nature of probability distributions for qualitative variables and discrete quantitative variables differs from that for continuous variables because in the former case it is possible to list all possible values of the variable and their corresponding probabilities. Because the number of values that a continuous variable can have is, in principle, infinite, that is not possible for continuous variables. Instead, probabilities for continuous variables are conceptualized as the areas within the corresponding intervals of the density curve.

X with a bar over it is typically used to represent what?

A sample mean

Outliers

A score that is very extreme relative to the majority of the scores in the data set; in fact, it is so extreme that the score is suspect. When one of these scores occurs, it is important to figure out why.

Frequency Distribution

A useful tool for summarizing a large set of data. This is a compilation of all of the scores in a set of scores and the number of times that each occurs. These distributions are often presented in the form of frequency tables.

Ordinal Measurement

A variable is said to be measured on this level when the categories can be ordered on some continuum. Classifying individuals/things into categories that are ordered along a dimension of interest. EX: finishing places in a race

Frequency graphs

A way of displaying frequency on the Y axis and the dependent variable on the X axis

Any outliers would be indicated by dots above or below the whiskers as is done with box plots of what type?

All types of box plots

The standard deviation thus represents?

An average deviation from the mean

Independent Variable

Any variable that is presumed to influence a second variable

Probability distributions

Are most commonly represented in graph form. When the score values of a qualitative variable are mutually exclusive and exhaustive, the probabilities associated with the individual score values represent this form of distribution with respect to that variable. This also holds for discrete quantitative variables.

Theoretical Distributions

Are not constructed by taking formal measurements, but rather by making assumptions and representing these assumptions mathematically

Line plot

Are particularly useful when one or more research participants receive the lowest or highest score possible on a particular measure. Line plots do not need to be closed. They can also be useful when comparing two groups

Why does the median qualify as a measure of central tendency?

Because it is closer to the scores in a distribution than any other value in an absolute sense

Why does the mode qualify as a measure of central tendency?

Because it represents a typical case

Probability Distributions for Continuous Variables

Because the number of values that a continuous variable can have is, in principle, infinite, it is not possible to specify a probability distribution for a continuous variable by listing all possible values of the variable and their corresponding probabilities. Specifically, probability distributions for continuous variables are conceptualized in terms of probability density functions

Frequency distributions for grouped and ungrouped data

Because the scores are grouped together in intervals, tables of this type are referred to as grouped frequency tables. Note that, as with ungrouped frequency tables, the scores can only pertain to one individual or category, and the grouped and ungrouped data are both listed from highest to lowest values in their respective tables.

If you want to make the smallest amount of absolute error across all scores your best guess would be to use the median, why?

Because you are worried about the SIZE of the error, and the median minimizes absolute or unsigned error across all scores.
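
A small Python check of this property, using hypothetical scores with one extreme value. It compares the total unsigned error when guessing the median with the total when guessing the mean:

```python
import statistics

scores = [1, 2, 3, 4, 100]  # hypothetical data with one extreme score


def total_absolute_error(guess, data):
    """Sum of unsigned differences between each score and the guess."""
    return sum(abs(x - guess) for x in data)


median = statistics.median(scores)  # 3
mean = statistics.mean(scores)      # 22

print(total_absolute_error(median, scores))  # 101
print(total_absolute_error(mean, scores))    # 156
```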

Stem and Leaf plot

Best for small sets of data. Uses the digits to the left of the rightmost digit to form the stem. Each rightmost digit forms a leaf. Use when the number of stem values is more than 1 but less than 20. These plots provide a compact way of conveying both individual scores and the general "shape" of a frequency distribution.
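
A minimal sketch of building such a plot in Python, assuming non-negative whole-number scores so that the last digit is the leaf and the remaining digits are the stem. The function name and data are hypothetical:

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Map each stem (all digits but the last) to its ordered leaves (last digit)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)
        plot[stem].append(leaf)
    return dict(plot)


scores = [23, 12, 30, 15, 21, 23]  # hypothetical scores
for stem, leaves in stem_and_leaf(scores).items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Each printed row lists one stem followed by its leaves, so longer rows mark the score ranges where cases cluster.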

In a box plot graph how is the IQR represented?

By the height

Distributions of scores can have identical variability, but very different ______ _______

Central tendencies

Bar Graphs

Charts that represent information using a series of vertical or horizontal bars

A line plot is constructed exactly like a frequency polygon except that it is not " __________"

Closed

Absolute frequencies

Commonly referred to simply as "frequencies", is a statistical term describing the number of times a particular piece of data or a particular value appears during a trial or set of trials

Third Approach to computing the median DUPLICATION OF MIDDLE SCORES

Computation of the median when there are duplications of the middle score(s). This formula is based on the assumption that the median occurs within the real limits of the middle score(s), and it is applicable regardless of whether there is an even or odd number of scores. 1. Order the scores from lowest to highest 2. Apply the approach for when there is an even number of scores (1st approach) 3. Think real limits: the median must be defined so that two more scores (two of the three 8s) are less than it 4. We specify a score two-thirds, or .67 units, greater than the lower real limit 5. Plug into the median formula 6. Solve for the answer. EX: given 8, 6, 9, 6, 10, 8, 9, 6, 10, 8 1. 6, 6, 6, 8, 8, 8, 9, 9, 10, 10 2. median = 8 3. URL = 8.5, LRL = 7.5 4. 7.5 + .67 = 8.17 5. MDN = 7.5 + [((10)(.50) - 3) / 3](1.0) 6. ANSWER = 8.17

First Approach to computing the median EVEN NUMBERED

Computation of the median when there is an EVEN number of scores. 1. Order the scores from lowest to highest 2. The median is the average of the two middle scores, that is, their sum divided by 2 3. Solve the equation. EX: given 20, 16, 25, 27, 22, 18 1. 16, 18, 20, 22, 25, 27 2. (20 + 22) / 2 = 21 3. answer = 21

Second Approach to computing the median ODD NUMBERS

Computation of the median when there is an ODD number of scores. 1. First, order the scores from lowest to highest 2. The median is simply the middle score; the question is how to treat that middle score, and the answer lies within the REAL LIMITS 3. Take the real limits of the number and divide the interval in half to define the median 4. Solve. EX: given 6, 16, 12, 14, 6, 8, 16 1. 6, 6, 8, 12, 14, 16, 16 2. 12 (URL = 12.5, LRL = 11.5) 3. Answer = 12

Frequency polygons are typically utilized when the variables being reported are _________ in nature, whereas frequency histograms are typically utilized when the variables being reported are ________.

Continuous, discrete

Clearly there will typically be much greater variability in the first group (all men in America) rather than the second group (college aged men) because the individuals who comprise this group are much more diverse on characteristics such as age, nationality, gender, and education, and that should influence many different attitudes and behaviors. In fact, individual differences of this type are an important type of ______ ________

Disturbance Variable

If there were no _____ ______, the scores within a given group would all be the same. The greater the role of ______ ______, the more variability there will be, and this will be reflected in things such as a larger variance and standard deviation.

Disturbance Variables

The fact that members of a given group have different standing on a particular dimension is due to ________ _______.

Disturbance variables

The median is the point that...

Divides the distribution into halves

Constant (variables)

Does not vary within given constraints

ROUNDING RULE THREE: If the remainder to the right of the decimal place that you wish to round to is exactly one half a measurement unit (that is, if it is a 5 followed by nothing, or by nothing but zeros), leave the last digit kept as it is if it is an _______ number. But increase it by ____ if it is an _____ number.

Even, 1, odd
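
Taken together, the three rounding rules in this set amount to what is often called "round half to even". A hedged Python sketch using the standard library's decimal module follows; values are passed as strings so that binary floating-point representation does not disturb the exact-half case. The function name and example values are hypothetical:

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_score(value, places="0.01"):
    """Round a decimal string to the given place, rounding exact halves to even."""
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)


print(round_score("2.346"))  # rule one: remainder > 1/2 unit -> 2.35
print(round_score("2.344"))  # rule two: remainder < 1/2 unit -> 2.34
print(round_score("2.345"))  # rule three: last digit kept (4) is even -> 2.34
print(round_score("2.335"))  # rule three: last digit kept (3) is odd -> 2.34
```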

Maximum variability occurs when the number of individuals is _______ ______ among the possible categories. When there are for example an equal amount of boys to girls.

Evenly divided

T or F: These deviation scores sum to ONE. If any value other than the mean were subtracted from the set of scores, the sum of the signed deviation scores would be greater than ONE in absolute value. This is true for any set of scores: the sum of signed deviations from the mean will always equal ONE.

FALSE, they will always sum to ZERO
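
A quick Python illustration with hypothetical scores: signed deviations from the mean sum to zero, while deviations from any other value do not.

```python
scores = [2, 4, 6, 8]             # hypothetical data
mean = sum(scores) / len(scores)  # 5.0

signed_from_mean = [x - mean for x in scores]  # [-3.0, -1.0, 1.0, 3.0]
signed_from_other = [x - 4 for x in scores]    # [-2, 0, 2, 4]

print(sum(signed_from_mean))   # 0.0
print(sum(signed_from_other))  # 4
```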

T or F: The sum of a set of signed deviations around the mean will always equal 1.00

False

T or F: Measures of variability can help to interpret measures of Standard deviations?

False, it is that measures of variability can help to interpret measures of CENTRAL TENDENCY

T or F: (for rounding rule TWO) If the remainder to the right of the decimal place that you wish to round to is less than 1/2 a measurement unit, decrease the digit kept by one.

False, you would leave the last digit kept as it is.

Probability Density Functions

For continuous variables, the probability distribution is always represented by a smooth curve over the abscissa. The ordinate, though also not formally demarcated, is used to determine the probability of observing a specified range of score values. The higher the curve is between two points, the more "dense" the corresponding scores are and the more likely they are to occur. Thus statisticians refer to the probability associated with a specific range of score values as a density.

The ordinate lists _______ ________ from 0 to the highest frequency that was observed in the study

Frequency Values

A stem and leaf plot is similar to a

Frequency histogram turned on its side

What does the X(i) to the right of the summation sign tell us to do?

General term that stands for the individual X scores

How do you determine the number of groups?

Generally speaking, the use of 5 to 15 groups tends to strike the appropriate balance between imprecision and incomprehensibility. If the number of possible score values is small, fewer groups can be used, whereas if the number of possible score values is large, more groups are required.

It is traditional to denote indexes derived from populations with what kind of letters?

Greek letters

The symbols for population variance and population standard deviation are ?

Greek sigmas

Platykurtic Distribution

Has a flat peak and short, steep tails compared to a normal distribution

Leptokurtic Distribution

Has a sharp peak, and long, flat tails compared to a normal distribution

Ratio Measurement

Have all of the properties of interval and ordinal measures, but provide even more information. Specifically, these measures map onto the underlying dimension in such a way that ratios between the numbers represent ratios on the dimension that is being measured. EX: height

Interval Measurement

Have all of the properties of ordinal measures but allow us to do more than order people on a dimension. They also provide information about the magnitude of the differences between the individuals. Have the property that numerically equal distances on the scale represent equal distances on the dimension that is being measured. Ex: Temperature

Real Limits of a number

If a variable is continuous, it follows that the measurements taken on that variable must be approximate. Thus, it is often inaccurate to talk about a specific value of a particular measurement for a continuous variable; rather, such measurements are more accurately represented in terms of their real limits.

Compare the general shape of this graph with the cumulative frequency of the graph from the previous exercise, what accounts for the similarities in their shapes?

In an ungrouped frequency table, each different score value is listed. Whereas in a grouped frequency table, scores are grouped together into intervals

What is the difference between inferential and descriptive statistics?

In inferential statistics we are again trying to describe a population, however, we do so not by taking a measure on all cases in the population, but rather by selecting a sample, observing scores on the variable of interest for that sample, and then inferring something with respect to that variable for the entire population.

What is probability?

In statistics it has a precise meaning: in a given situation, there may be several different possible outcomes that are equally likely to occur, and any one of them can occur at random. Usually statisticians report a probability to convey the likelihood that sample results do not accurately represent what is occurring in the population.

ROUNDING RULE ONE: If the remainder to the right of the decimal place that you wish to round to is greater than 1/2 a measurement unit, _______ the last digit kept by _____.

Increase, 1

Relative Frequency

Indicates the proportion of times that a score occurred and is derived by dividing the number of scores of a given value by the total number of scores in the distribution
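
A minimal Python sketch of absolute and relative frequencies. It reuses the ten scores from the duplicated-middle-scores median card; the variable names are illustrative:

```python
from collections import Counter

scores = [8, 6, 9, 6, 10, 8, 9, 6, 10, 8]

frequencies = Counter(scores)  # absolute frequency of each score value
n = len(scores)                # total number of scores
relative = {score: count / n for score, count in frequencies.items()}

# List score values from highest to lowest, as in a frequency table.
for score in sorted(frequencies, reverse=True):
    print(score, frequencies[score], relative[score])
```

The relative frequencies are proportions, so they sum to 1.00 across all score values.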

Frequency histograms and frequency polygons can be constructed for grouped as well as ungrouped scores. When the scores are grouped the abscissa might list the midpoints of the score rather than the _______ _______ _______

Individual score values

Usually we are not interested in how the scores themselves vary, but rather what?

Instead, how the phenomenon that the scores supposedly represent varies.

Inferential Statistics

Involves taking measurements on a sample and then, from these observations, drawing a conclusion about a population.

Descriptive Statistics

Involves the use of numerical indexes to describe either a sample or a population, when measurements have been taken on all members of a population. In either case, the goal is to DESCRIBE a group of scores in a clear and precise manner

Nominal Measurement

Involves using numbers merely as labels with no special meaning. In behavioral science, the statistics of interest for variables that involve these measures are frequencies, proportions, and percentages. EX: types of religion

A probability density function

Is a smooth curve including all possible values of a continuous variable

Mean

It is simply the arithmetic average of the scores, that is, the sum of all the scores in the data set divided by the total number of scores. Written formally as X̄ = ΣX / N

ROUNDING RULE TWO: If the remainder to the right of the decimal place that you wish to round to is less than 1/2 a measurement unit, ______ ____ ____ ____ ____ ___ ___ ___.

Leave the last digit kept as it is.

A _______ in the body of the graph identifies the groups

Legend

Distributions are often said to be ________ or _______ relative to a normal distribution

Leptokurtic, platykurtic

_____ ______ are particularly useful when one or more research participants receive the lowest or highest score possible on a particular measure.

Line plots

Formula for computing median

MDN = L + [((N)(.50) - nL) / nW](i), where MDN = median, L = lower real limit of the category that contains the median, N = total number of scores in the distribution, nL = the number of scores that are less than L, nW = the number of scores that are within the category that contains the median, and i = the size of the interval of the category that contains the median
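
A hedged Python sketch of the formula MDN = L + [((N)(.50) - nL) / nW](i) for ungrouped scores. It assumes the median falls within the real limits of a single score value, as in the odd-N and duplicated-middle-score cards; when an even number of scores has two different middle values, the averaging approach applies instead. The function name and the `unit` parameter are hypothetical:

```python
def median_real_limits(scores, unit=1.0):
    """Median via MDN = L + [((N)(.50) - nL) / nW](i) for the category holding the median."""
    ordered = sorted(scores)
    half = len(ordered) * 0.50  # (N)(.50)
    # Find the score value whose category contains the (N)(.50)th case.
    for value in sorted(set(ordered)):
        n_below = sum(1 for x in ordered if x < value)  # nL
        n_within = ordered.count(value)                 # nW
        if n_below + n_within >= half:
            lower_limit = value - unit / 2              # L
            return lower_limit + ((half - n_below) / n_within) * unit
    raise ValueError("scores must not be empty")


print(round(median_real_limits([8, 6, 9, 6, 10, 8, 9, 6, 10, 8]), 2))  # 8.17
print(median_real_limits([6, 16, 12, 14, 6, 8, 16]))                   # 12.0
```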

The ________ is the arithmetic average of a set of numbers.

Mean

What measure of central tendency is most influenced by extreme scores?

Mean

Because the _______ is sensitive to extreme scores, variables like income that tend to include a meaningful percentage of very low or very high scores are often reported in terms of modes, and especially medians, rather than ________.

Mean, means

The frequency histogram tends to highlight?

The frequency of specific scores rather than the entire distribution

If we subtract the ______ from each score and retain the signs of the resulting differences, we get a set of ________ deviation scores

Mean, signed

The most frequently used descriptive statistics are the, _____, _____, _______

Mean, variance, and standard deviation

Kurtosis

Measure of the flatness of the tails of a probability distribution relative to that of a normal distribution. Indicates likelihood of extreme outcomes. It also reflects how long and flat the peaks are relative to the tails.

Quantitative Measures

Measures that permit expression of various amounts of something, such as a trait. Variables that are measured on an ordinal, interval or ratio level.

The closest concept to central tendency for a qualitative variable is the ______ _______, that is, the category that occurs most frequently

Modal category

Which of the following measures is appropriate for use with qualitative variables?

Modal category

If you were to want the highest probability of being exactly correct, your best guess would be to use the value of the _____ because it is the most frequently occurring score and has the highest probability of predicting each score exactly.

Mode

When a quantitative variable is measured on an ordinal level that departs markedly from interval characteristics, the mean is NOT an appropriate index of central tendency, and the _____ or _____ should be used instead.

Mode, median

What are the three measures of central tendency?

Mode, median, and mean

Can any of these measures ever be negative?

No

What are the four types of measurement?

Nominal, ordinal, interval, ratio

One very important type of theoretical distribution that is being studied by statisticians is the ______ ________

Normal distribution

What is one problem with the sum of squares being used as an index of variability?

One problem with the sum of squares as an index of variability is that its size depends not only on the amount of variability among scores, but also on the number of scores.

The abscissa is broken by a double slash anytime it "jumps" from zero to a larger number such that it is not drawn to scale. The same holds true for the ______

Ordinate

A case that shows very extreme scores relative to the majority of the cases in the data set is known as an

Outlier

In probability, P = ? and A = ?

P= probability A= the given event

In a graph of distribution, the modal score has the highest "_______" in the graph?

Peak

Frequency graphs can also be constructed for ________ variables

Qualitative

When a ___________ variable is measured on a level that at least approximates interval characteristics, all three measures of central tendency are ________.

Quantitative, meaningful

The frequency of a score in comparison to the total number of scores in the group is called a

Relative frequency

S squared, and S are symbols for what?

Sample variance and sample standard deviation

Sample

Simply a subset of a population

What are two other dimensions on which distributions of scores can differ?

Skewness and Kurtosis

Why would you prefer to use the IQR as opposed to the range?

Some researchers prefer to use the IQR rather than the range because the IQR is not sensitive to distortions from extreme cases. Unlike the range, the IQR is not biased by one extreme score.

How are line plots sometimes structured?

Sometimes, line plots are structured so that each vertical line encompasses a full standard deviation.

Standard Deviation Formula

Square root of the variance

The ______ _______ is the most easily interpreted measure of variability among a set of scores. Recall that the variance is the mean squared deviation score.

Standard Deviation

The "typical" sample has a relatively small ______ ________

Standard deviation

The vertical lines on a line plot convey information about the _______ associated with each mean

Standard deviation

Deviation scores

Such scores are calculated by subtracting some constant from each score in a set of scores. These deviations are said to be unsigned when their absolute values are taken.

What is the difference between populations and samples?

A population is the group of individuals/categories to which statements are generalized. It is often not possible to make observations on every member of a population, so an investigator must use a sample. Based on sample observations, it is often possible to generalize to the underlying population

One index of variability that considers all of the scores in a data set is the ____ __ _______?

Sum of squares

The standard deviation is found by?

Taking the positive square root of the variance
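
A short Python sketch of the variance as the mean squared deviation (sum of squares divided by N) and the standard deviation as its positive square root. The data are hypothetical. Note that some texts divide by N - 1 when estimating a population variance from a sample; that variant is not shown here:

```python
import math
import statistics

scores = [2, 4, 6]  # hypothetical data
n = len(scores)
mean = sum(scores) / n
sum_of_squares = sum((x - mean) ** 2 for x in scores)  # 8.0

variance = sum_of_squares / n  # mean squared deviation
std_dev = math.sqrt(variance)  # positive square root of the variance

print(round(variance, 2), round(std_dev, 2))  # 2.67 1.63

# Cross-check against the standard library's N-denominator versions.
print(math.isclose(variance, statistics.pvariance(scores)))  # True
print(math.isclose(std_dev, statistics.pstdev(scores)))      # True
```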

Median

The ______ is the point in the distribution of scores that divides the distribution into two equal parts. In other words, 50% of the scores occur below the _______ and 50% of the scores occur above the ______. When it is used as a measure of central tendency, it presents the "middlemost" score as a representative value for the set of scores.

Mode

The ______ of a distribution of scores is the most easily computed index of central tendency. It is simply the score that occurs most frequently.

Population

The aggregate of all cases that one wishes to generalize to

Steps in the Scientific Method

The basic steps of the scientific method are: 1. make an observation that describes a problem 2. create a hypothesis 3. test the hypothesis 4. draw conclusions and refine the hypothesis.

Where do you start?

The conventional starting point is the closest number that is evenly divisible by the interval size that is equal to or less than the lowest score

What is the interquartile range?

The difference between the highest and lowest scores (hence, it is a range) after the top 25% of the scores and the bottom 25% of the scores have been trimmed or eliminated from the data set. In short, it is the range of the middle 50% of the scores.

What is the measurement hierarchy?

The four types of measurement can be thought of as a hierarchy. At the lowest level, nominal measurement allows us only to classify individuals into categories. The second level, ordinal measurement, not only allows us to classify individuals into categories but also indicates the relative ordering of the categories on a dimension of interest. Interval measurement is the next level; it possesses the same properties as ordinal measurement but, in addition, is sensitive to the magnitude of differences along the dimension. However, ratio statements are NOT possible at the interval level. It is only at the final level, ratio measurement, that such statements are possible. Ratio measures have all of the properties of interval measures and also permit ratio statements to be made.

What is the symbol for a population mean?

The Greek letter m, or "mu" (μ)

What is the balance point, and that is its reason for being a measure of central tendency?

The mean

If your goal was to make the signed error as close as possible to zero, then what would your best measure of central tendency to use be?

The mean, because across all scores, the sum of signed error from the mean will always equal zero.

With graphs of central tendency and variability what does the set up look like?

The names of the levels of the independent variable appear on the abscissa and each of the various levels of this variable is represented by a bar that extends to the heights on the ordinate that corresponds to the mean score on the dependent variable. Unlike frequency graphs, it is not necessary for the ordinate for this type of graph to start at zero.

To avoid a potential problem or inconsistency, measures of variability should take into account what?

The number of cases in the data set.

How tall should the ordinate of the frequency graph be relative to the abscissa? Why is this important?

The ordinate of a frequency graph should be presented such that its height at the demarcation for the highest frequency is approximately two-thirds to three-fourths the length of the abscissa. This helps to ensure a uniform, clearly interpretable presentation of graphed results

Dependent Variable

The outcome factor; the variable that may change in response to manipulations of the experiment. It is the variable being tested and measured in an experiment, and is '__________' on the other variable.

Standard Deviation

The positive square root of the variance is denoted by the lower case letter S

Under what condition is the range a misleading index of variability?

The range is a misleading index of variability when there is an extreme score in a set of scores that are otherwise similar to each other.

Definition of Real Limits

The real limits of a number are those points that fall one half a measurement unit below that number and one half a measurement unit above that number. Real limits can be stated for numbers that are expressed as decimals as well as whole numbers
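
The definition can be sketched in Python. The decimal module and string inputs are used so that half-unit arithmetic stays exact; the function name and the `unit` parameter are hypothetical:

```python
from decimal import Decimal


def real_limits(value, unit="1"):
    """Return (lower, upper) real limits: half a measurement unit on either side."""
    half_unit = Decimal(unit) / 2
    number = Decimal(value)
    return number - half_unit, number + half_unit


print(real_limits("20"))          # (Decimal('19.5'), Decimal('20.5'))
print(real_limits("8.5", "0.1"))  # (Decimal('8.45'), Decimal('8.55'))
```

The first call matches the card stating that the upper real limit of 20 is 20.5; the second shows a score measured to the nearest tenth.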

The frequency polygon tends to highlight what?

The shape of the entire distribution more so than does the frequency histogram

What do the parts of a box plot represent?

The small square in the middle of the rectangular box represents the mean, the top of the box represents one standard deviation above the mean, and the bottom of the box represents one standard deviation below the mean. The "T's" extending away from the box, which are referred to as "whiskers", are the criteria that are used to define outliers. The solid dots above and below the whiskers reflect outlying scores.

Why is the standard deviation more "interpretable" than the variance ? That is, what is the advantage of reporting statistics in terms of the standard deviation as opposed to the variance?

The standard deviation is more "interpretable" than the variance because it represents an average deviation from the mean in the original unit of measurement. In contrast, the variance is in terms of squared deviation units.

Cumulative Frequency

The sums of the frequencies of the data values from smallest to largest. For any given score, the _____ ______ is the frequency associated with that score, plus the sum of all of the frequencies below that score
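
A minimal Python sketch of cumulative frequencies built from the lowest score value upward. The scores and variable names are illustrative:

```python
from collections import Counter
from itertools import accumulate

scores = [8, 6, 9, 6, 10, 8, 9, 6, 10, 8]  # illustrative data

frequencies = Counter(scores)
values = sorted(frequencies)               # [6, 8, 9, 10]
counts = [frequencies[v] for v in values]  # [3, 3, 2, 2]

# Running total: each score's frequency plus all frequencies below it.
cumulative = dict(zip(values, accumulate(counts)))
print(cumulative)  # {6: 3, 8: 6, 9: 8, 10: 10}
```

The cumulative frequency for the highest score value always equals the total number of scores.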

In skewed distributions, the three measures of central tendency will take on what type of values?

The three measures of central tendency all take on different values in skewed distributions

What is the difference between parameters and statistics?

There are various numerical indexes that are based on data from either populations or samples. When such indexes are based on data from an entire population, they are referred to as parameters. When they are based on data from a sample, they are referred to as statistics.

What is the major problem with mode?

There can be more than one modal score, these can be referred to as multimodal or if it only has 2 then it would be bimodal

What do standard deviations and other measures of variability have?

They have the potential to cause real world implications

For quantitative variables what do measures of central tendency indicate?

They indicate where the scores tend to cluster in a distribution

What does the notation above the summation sign tell us to do?

To add through that given individual's number

What does the notation below the summation sign tell us to do?

To start with that individual at that given number

T or F: (for rounding rule ONE) If the remainder to the right of the decimal place that you wish to round to is greater than 1/2 a measurement unit, increase the last digit kept by 1.

True

T or F: A negatively skewed distribution will be when the "tail" is towards the left, or is facing the negative end of the abscissa. Most scores in negatively skewed distributions occur above the mean and only a relatively few extreme scores occur below it. This is reflected in the fact that the mean of a negatively skewed distribution will always be smaller than the median.

True

T or F: Because the measure is only ordinal, the standard deviation of those scores does not help us appreciate how much variability there is on underlying dimensions.

True

T or F: By taking the square root of variance, we are in essence, eliminating the square and returning it to the original unit of measurement.

True

T or F: Central tendency and variability represent different characteristics of a distribution

True

T or F: In terms of variability for qualitative variables, we cannot meaningfully define a range because the categories cannot be ordered to define a low and a high score.

True

T or F: It is that property (the fact that the median minimizes the sum of the absolute differences between it and the scores in the distribution) that qualifies the median as a measure of central tendency

True

T or F: It is understood that X has an i subscript, that i = 1 applies below the summation sign, and that N applies above it?

True

T or F: Positively skewed is when the "tail" is toward the right, or positive end of the abscissa. In positively skewed distributions, most scores occur below the mean and only a relatively few scores occur above it. Thus, the mean will always be larger than the median in a positively skewed distribution.

True

What is the general decimal place that you should round to?

Two decimal places.

How do you determine the size of the interval?

Typically an interval size of 2, 3, or a multiple of 5 (for instance 5, 10, or 15) is used. The first step in determining the appropriate interval size for a particular set of data is to subtract the lowest score from the highest score. The difference should then be divided by the desired number of groups and the result rounded to the nearest of the commonly used interval sizes.
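
The procedure above can be sketched in Python as follows. The function name `suggest_interval_size`, the `max_size` cap, and the sample scores are assumptions made for illustration. The source does not say how to break ties, so this sketch simply keeps the smaller size.

```python
def suggest_interval_size(scores, n_groups, max_size=100):
    """Suggest an interval size for grouping scores into about n_groups groups."""
    # Commonly used sizes: 2, 3, or a multiple of 5 (capped at max_size here).
    candidates = [2, 3] + list(range(5, max_size + 1, 5))
    # Step 1: highest score minus lowest score.
    # Step 2: divide by the desired number of groups.
    raw = (max(scores) - min(scores)) / n_groups
    # Step 3: pick the nearest commonly used size (ties go to the smaller size).
    return min(candidates, key=lambda c: abs(c - raw))

# Hypothetical data: lowest score 12, highest 97, about 10 groups wanted
size = suggest_interval_size([12, 45, 97], 10)   # raw value is 8.5
print(size)
```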

In graphs of central tendency and variability what do the bars look like?

Usually the bars for the different groups do not touch one another, to indicate that the means are from different sets of scores. Sometimes, however, the bars are allowed to touch if the independent variable is quantitative and are kept apart only if the independent variable is qualitative.

Distributions can also have identical central tendencies but very different _________

Variability

A value of 0 for these statistics means that there is no ________ in the scores; they are all the same. As the values of the three statistics become increasingly greater than 0, more ______ among the scores is indicated, other things being equal.

Variability, variability

Qualitative Measures

Variables measured on a nominal level

Dividing the sum of squares by N (that is, computing an average squared deviation from the mean) would give you the?

Variance

Percentage

When a relative frequency is multiplied by 100, it reflects the _______ of times that score occurred.

What is the note on rounding rule three?

When this rule is used, the last digit of the answer will always be an even number (0,2,4,6, or 8)
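
Assuming rounding rule three is the usual "round half to even" convention (which is what the note about even last digits suggests), Python's `decimal` module can illustrate it. The helper name `round_half_even` and the sample numbers are illustrative assumptions.

```python
from decimal import Decimal, ROUND_HALF_EVEN

def round_half_even(text, places=3):
    """Round a number given as a string to `places` decimals, ties going to even."""
    quantum = Decimal(10) ** -places          # 3 places gives Decimal('0.001')
    return Decimal(text).quantize(quantum, rounding=ROUND_HALF_EVEN)

# Remainder is exactly half a unit, so the last digit kept ends up even.
tie_up = round_half_even("6.3155")     # last digit kept is 5 (odd): goes up to 6
tie_down = round_half_even("6.3145")   # last digit kept is 4 (even): stays 4
# Remainder is not exactly half, so the ordinary rules apply.
above_half = round_half_even("2.615501")
below_half = round_half_even("8.9749")
print(tie_up, tie_down, above_half, below_half)
```

Numbers are passed as strings so that the exact decimal digits are preserved; binary floats cannot represent values such as 6.3155 exactly.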

What is the conceptual difference between probabilities and relative frequencies?

Whereas a relative frequency indicates the proportion of times that some score was previously observed, a probability represents the likelihood of observing that score in the future

In statistical notation, X =? Σ = ?

X = general name for a variable; Σ = sigma, or the summation sign

We will sometimes want to simultaneously consider two variables, when this is the case, the capital letter ______ can be used to represent the second variable.

Y

Is the standard deviation a type of average?

Yes

Can box plots be used to represent anything aside from means and standard deviations?

Yes, they can also be used to present medians and interquartile ranges. In fact, box plots were originally designed to represent the median and IQR.

What do you need to do in order to get the unsigned deviations from the median?

You must subtract the median from each score and take the absolute values of the resulting differences

When rounding do not forget to include _______

Zeros

Normal Distributions

a function that represents the distribution of many random variables as a symmetrical bell-shaped graph.

round each of the following numbers to three decimal places: a. .39572 b. .9999 c. 3.6666 d. 12.2538 e. 9.724001 f. 1.9950 g. 2.0060

a. .396 b. 1.000 c. 3.667 d. 12.254 e. 9.724 f. 1.995 g. 2.006

State the real limits of each of the following numbers, assuming they are measured in the units reported. a. 21,384.11 b. 0.689 c. 13 d. 13.0 e. 13.00

a. 21,384.105 and 21,384.115 b. 0.6885 and 0.6895 c. 12.5 and 13.5 d. 12.95 and 13.05 e. 12.995 and 13.005
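
A small sketch of the idea behind these answers: the real limits lie half a measurement unit below and above the reported number, where the unit is set by the number of decimal places reported. The function name `real_limits` is an assumption for illustration, and numbers are passed as strings so trailing zeros (13 vs. 13.0) are kept.

```python
from decimal import Decimal

def real_limits(text):
    """Return (lower, upper) real limits of a number reported as a string."""
    value = Decimal(text)
    # The exponent records the reported precision: "13.0" has exponent -1.
    unit = Decimal(10) ** value.as_tuple().exponent   # one measurement unit
    half = unit / 2
    return value - half, value + half

for reported in ["21384.11", "0.689", "13", "13.0", "13.00"]:
    print(reported, real_limits(reported))
```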

Round each of the following numbers to three decimal places a. 4.8932 b. 8.9749 c. 1.4153 d. 4.1450 e. 6.245002 f. 2.615501 g. 6.3155

a. 4.893 b. 8.975 c. 1.415 d. 4.145 e. 6.245 f. 2.616 g. 6.316

The ______ lists the score values from low to high, extending from one unit below the lowest score to one unit above the highest score

abscissa

Parameters

are numbers that summarize data for an entire population

Statistics

are numbers that summarize data from a sample, i.e. some subset of the entire population

Frequency Tables

are useful for summarizing how data are distributed; score values are usually listed from highest to lowest

The bars do not touch one another because each bar in a ____ _____ represents a distinct category

bar graph

Empirical Distributions

based on measurements that are actually taken on a variable

How do you know if it is a measure VS. a scale?

Because a measure has as its referent not only a particular scale but also the individual on whom the measurement is taken, the time, and the setting.

The use of numerical information to describe a group of scores in a clear and precise manner is referred to as _______ statistics.

descriptive

________ polygons are _______ "closed" with the abscissa in the sense that the abscissa always includes a value that is a unit lower than the lowest observed score and a value that is a unit higher than the highest score observed, with a frequency of 0 denoted for each. This serves to connect the lines to the abscissa, and thus, to form the _________.

frequency, always, polygon

If we let the "typical" score be represented by the mean, then we are concerned with what in the sum of squares?

how much each score deviates from the mean

Variability

in a set of numbers, how widely dispersed the values are from each other and the extent to which they vary from the mean

Discrete Variables

is a variable whose value is obtained by counting (e.g., the number of red marbles in a jar, or the number of heads when flipping three coins).

Continuous Variables

is a variable whose value is obtained by measuring (e.g., a person's height or the time taken to finish a task).

Summation Notation

is used in statistics as a shorthand way of indicating that a set of scores should be summed

When the mean and median are different, the sum of the squared deviations from the mean will always be ______ than the sum of the squared deviations from the median

less

The more extreme a score is relative to the other scores in a distribution, the more it will alter the ______

mean

The ______ is the value that minimizes the sum of _______ deviations

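mean, signed (the signed deviations from the mean always sum to 0, so no other value brings that sum closer to 0)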

The ________ is the value that minimizes the sum of ________ deviations

median, unsigned
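
A quick numerical check of these properties, using made-up scores with one extreme value. The helper names are illustrative assumptions. The sketch compares the mean and the median as reference points for squared, unsigned (absolute), and signed deviations.

```python
from statistics import mean, median

def sum_squared_dev(scores, c):
    """Sum of squared deviations of the scores from the value c."""
    return sum((x - c) ** 2 for x in scores)

def sum_abs_dev(scores, c):
    """Sum of unsigned (absolute) deviations of the scores from the value c."""
    return sum(abs(x - c) for x in scores)

# Hypothetical scores; the extreme score 20 pulls the mean above the median.
scores = [1, 2, 3, 4, 20]
m, md = mean(scores), median(scores)

print("mean:", m, "median:", md)
print("squared deviations:", sum_squared_dev(scores, m), "vs", sum_squared_dev(scores, md))
print("unsigned deviations:", sum_abs_dev(scores, m), "vs", sum_abs_dev(scores, md))
print("signed deviations from the mean:", sum(x - m for x in scores))
```

In this example the squared deviations are smaller around the mean, the unsigned deviations are smaller around the median, and the signed deviations from the mean sum to 0.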

The ______ is the most frequently occurring score in a distribution

mode

If all individuals fall into a single category for qualitative variables there is ___ _____

no variability

Formula for Probability

p(A) = number of observations favoring event A / total number of possible observations
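
As a small worked example of this formula, consider drawing one card from a standard 52-card deck, where 4 cards are aces. The function name `probability` and the card example are assumptions for illustration, not from the source.

```python
from fractions import Fraction

def probability(favorable, total):
    """Observations favoring the event divided by all possible observations."""
    return Fraction(favorable, total)

p_ace = probability(4, 52)    # 4 aces out of 52 cards
print(p_ace, float(p_ace))    # exact fraction and its decimal value
```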

Numerical indexes derived from population data are __________; numerical indexes derived from sample data are _________

parameters, statistics

A second reason for preferring the mean concerns the important role that the means and their sums of squares play in making inferences about?

populations from sample data

The simplest index of variability is the ________, which is the highest score minus the lowest score in a distribution

range

If a variable is continuous, the vertical boundaries of the bar for a score represents the _____ ______ of the score

real limits

Skewness

refers to the tendency for scores to cluster on one side of the mean, so that one of the "tails" of the distribution is disproportionately long relative to the other tail.

Random sampling is a procedure for generating ________ samples.

representative

The formula for converting absolute frequency into relative frequency is

rf = f / N, where rf = relative frequency, f = absolute frequency of the score, and N = total number of cases
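
A minimal sketch of this conversion, which also multiplies each relative frequency by 100 to get a percentage. The data and the function name `relative_frequencies` are made up for illustration.

```python
from collections import Counter

def relative_frequencies(scores):
    """Map each score to its relative frequency: absolute frequency / N."""
    n = len(scores)                      # total number of cases
    counts = Counter(scores)             # absolute frequency of each score
    return {score: counts[score] / n for score in sorted(counts)}

# Hypothetical data: four cases
rf = relative_frequencies([1, 2, 2, 4])
percentages = {score: value * 100 for score, value in rf.items()}
print(rf)
print(percentages)
```

The relative frequencies of all scores sum to 1, and the percentages sum to 100.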

Variance Formula

s ^2 = SS / N
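
The sketch below works through this formula on made-up scores. It computes the sum of squares two ways (from deviation scores, and with the computational formula covered earlier in this set), then the variance and the standard deviation. Function names and data are illustrative assumptions.

```python
import math

def sum_of_squares(scores):
    """Definitional formula: sum of squared deviations from the mean."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** 2 for x in scores)

def sum_of_squares_computational(scores):
    """Computational formula: sum of X^2 minus (sum of X)^2 divided by N."""
    n = len(scores)
    return sum(x ** 2 for x in scores) - sum(scores) ** 2 / n

# Hypothetical scores with a mean of 5
scores = [2, 4, 4, 4, 5, 5, 7, 9]
ss = sum_of_squares(scores)
variance = ss / len(scores)      # s^2 = SS / N
sd = math.sqrt(variance)         # back in the original unit of measurement
print(ss, variance, sd)
```

Both formulas give the same sum of squares here; dividing by N follows the formula on this card rather than the N - 1 version used for estimating a population variance.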

Stem and leaf plots provide a compact way of conveying both individual scores and the general " ________ " of a frequency distribution

shape

Cumulative Relative Frequency

the accumulation of the previous relative frequencies. They are computed in the same manner as cumulative frequencies but use the column of relative frequencies instead. For any given score, the ____ _____ ______ is the relative frequency associated with that score plus the sum of all of the relative frequencies below that score.

In an investigation on the effect of religious upbringing on moral development, moral development is the

dependent variable

Frequency

the number of times the observation occurred and was recorded in an experiment or study.

The sum of squares, the variance, and the standard deviation will always be greater than or equal to _____

zero

Σ X(i) ^2 =

Σ X ^ 2

Other summation expressions require that a mathematical operation be applied to each score before the individual results are added together. Here, each X score should first be squared and the squared scores then summed. What does this equation look like?

ΣX^2 = (X1)^2 + (X2)^2 + (X3)^2
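
To make the contrast between the two summation expressions concrete, here is a short sketch with three made-up scores. The variable names are illustrative only.

```python
# Hypothetical scores X1, X2, X3
scores = [2, 3, 5]

# Sum of the squared scores: square each X first, then add.
sum_of_squared_scores = sum(x ** 2 for x in scores)   # 4 + 9 + 25

# Squared sum of the scores: add the X scores first, then square the total.
squared_sum_of_scores = sum(scores) ** 2              # (2 + 3 + 5) squared

print(sum_of_squared_scores, squared_sum_of_scores)
```

The two results differ, which is why the placement of the parentheses matters.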

