Math 15 Midterm 1

The null hypothesis for ________________ is that the population means are all equal, or (not listed here) that the samples are drawn from the same population.

analysis of variance

In a histogram, what does probability equal?

area

The sign test uses the _______________ distribution.

binomial; it tests whether values are above or below (two possibilities!) the population median.
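
As an illustration of why the sign test is binomial, here is a minimal sketch. The function name, the made-up sample, the two-sided form, and the choice to drop ties are all assumptions for the example, not part of the course material.

```python
from math import comb

def sign_test_p_value(data, median0):
    """Two-sided sign test p-value against a hypothesized median (ties dropped)."""
    above = sum(1 for x in data if x > median0)
    below = sum(1 for x in data if x < median0)
    n = above + below
    k = min(above, below)
    # Under H0 each non-tied value is above or below with probability 0.5,
    # so the smaller count follows a Binomial(n, 0.5) distribution.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

sample = [12, 15, 11, 18, 20, 14, 17, 19]  # made-up data
p = sign_test_p_value(sample, 13)          # 6 above, 2 below
```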

__________ plots provide a quick visualization of the range of data and feature a box spanning the first and third quartiles with a center line at the median

box and whisker (based on quartiles)

__________ is the standardized measure of risk per unit of return

coefficient of variation

____________ is calculated as the standard deviation divided by the expected return: σ/μ (a one-number measurement of spread)

coefficient of variation
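
A short sketch of the σ/μ calculation, assuming a made-up list of percentage returns treated as a whole population (hence `pstdev` rather than `stdev`):

```python
from statistics import mean, pstdev

returns = [4, 6, 5, 7, 3]  # made-up returns, in percent

# Coefficient of variation: standard deviation divided by the mean.
cv = pstdev(returns) / mean(returns)
```

Because the units cancel, the result can be compared across investments with different average returns.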

T or F: In probability density functions the values are never negative and the total area under the function is 0

False; the total area under the function is 1

When there is almost no linear correlation in a scatterplot, what does that indicate about the value of r?

No linear correlation indicates that the r value is 0 or close to 0. r can never be below -1 or above 1.
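
To make the range of r concrete, here is a minimal sketch that computes the Pearson correlation by hand. The function name and the three small data sets are invented for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

r_perfect = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])    # points on a rising line
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])   # points on a falling line
r_none = pearson_r([1, 2, 3, 4], [1, -1, -1, 1])     # U-shaped pattern, no linear trend
```

The last case gives r = 0 even though the points follow a clear U shape, which is why r near zero only rules out a linear relationship.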

The ___________, a measure of variability less sensitive to outliers than s, is the difference between the upper and lower quartiles.

interquartile range

________ is the height of the box and = Q3 - Q1

interquartile range
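
A sketch of the interquartile range using the "median of the lower half, median of the upper half" convention. Quartile conventions differ between textbooks and software; leaving out the middle value when the count is odd is an assumption here, and the data are made up.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, median, Q3) using medians of the lower and upper halves."""
    s = sorted(data)
    half = len(s) // 2
    lower = s[:half]    # values below the middle
    upper = s[-half:]   # values above the middle (middle value excluded if odd)
    return median(lower), median(s), median(upper)

data = [1, 3, 4, 6, 7, 8, 10, 12, 50]  # made-up data with one outlier
q1, q2, q3 = quartiles(data)
iqr = q3 - q1
full_range = max(data) - min(data)
```

Here the outlier 50 stretches the full range to 49 while the IQR stays at 7.5.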

40 former smokers have their breathing rate tested in January and June of the same year. The two samples should be treated as what two variables?

large and dependent

If you are analyzing net worths in the WLS dataset, and you break them down by college degree, the samples are what two variables?

large and independent

The _______ quartile separates the smallest 25% of the data from the remaining 75%.

lower

_________ = (max + min)/2 and is determined entirely by the extreme values, so it is very sensitive to outliers

midrange

__________ is a distribution-based measurement that applies to non-numeric data; it depends on the shape of the distribution, and in noisy data it becomes less useful and may not exist

mode

The chi-square (χ²) distribution (variances) assumes the population is distributed in what way?

normally distributed or close to it

T or F: For the additional college degree variable, the mean can be used.

false

T or F: For the body mass index (2003-5 survey) variable, the mean can be used.

false

T or F: For the population of graduate's high school variable, the mean can be used.

false

True or False: A graph is honest as long as the size of the bar(or lines, or pie slices, or columns) matches the data

false

True or False: An r value close to -1 means that the two variables are not correlated.

false

True or False: An r value close to zero means that the two variables are not correlated

false

True or False: Sorted histograms use ordinal and numeric data frequently

false

True or False: The "lie ratio" of a graph measures its efficiency or inefficiency.

false

True or False: The best graphs have the most detail

false

True or false: Always use as many variables as you can in a multiple linear correlation.

false

True or false: For the number of children variable, the mean can be used.

true

True or false: Ordinal data can be ranked lowest to highest and is not treated as numbers

true

μ

represents the population mean

Σ Xi

represents the sum of all scores present in the population (say, in this case) X1 X2 X3 and so on

N

represents the total number of individuals or cases in the population
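
Put together, the three symbols above give the usual formula for the population mean, written here as a worked equation:

```latex
\mu = \frac{\sum_{i=1}^{N} X_i}{N} = \frac{X_1 + X_2 + \cdots + X_N}{N}
```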

________________ show bars ordered from low to high or high to low

sorted histogram

In a probability distribution, the probability is given by ___________________

the area under the curve over the range

How do you determine the confidence interval?

- Apply α or α/2 to the distribution and determine the critical value(s)
- Convert the critical values into parameter values
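
The two steps can be sketched for the simplest case, a two-tailed interval for a mean with σ known. The function name and the numbers are hypothetical; other parameters would use a t or chi-square distribution instead.

```python
from math import sqrt
from statistics import NormalDist

def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-tailed confidence interval for a population mean, sigma known."""
    alpha = 1 - confidence
    # Step 1: apply alpha/2 to the standard normal distribution.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Step 2: convert the critical value back into parameter (mean) units.
    margin = z_crit * sigma / sqrt(n)
    return sample_mean - margin, sample_mean + margin

low, high = z_confidence_interval(100, 15, 36)  # made-up sample summary
```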

How do you test a hypothesis?

- Convert the parameter value (null hypothesis) into a value on the distribution and compare
- Determine the p-value via the distribution; if the p-value is less than α or α/2, reject the null hypothesis
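
A minimal sketch of these steps for a z-test of a mean, assuming σ is known. The function name and sample numbers are made up, and the one-tailed branch assumes the sample mean falls on the side named by the alternative hypothesis.

```python
from math import sqrt
from statistics import NormalDist

def z_test_p_value(sample_mean, mu0, sigma, n, two_tailed=True):
    """Return (z, p-value) for H0: population mean equals mu0, sigma known."""
    # Convert the sample result into a value on the standard normal distribution.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Probability of a value at least this extreme in one tail.
    tail = 1 - NormalDist().cdf(abs(z))
    return z, (2 * tail if two_tailed else tail)

alpha = 0.05
z, p = z_test_p_value(105, 100, 15, 36)  # made-up sample summary
reject_null = p < alpha
```

Doubling the tail area and comparing with α is equivalent to comparing the single tail area with α/2.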

In a distribution the z score of a mean equals __________

0

What would be the smallest number in a distribution? (Not the lowest frequency, but the smallest number.)

1st quartile

How do you find the proportionally weighted mean?

= Σ(pi∙xi); multiply each value by some weight (pi); Σpi must equal 1. Note that if pi = ni/N, this is the same as the population weighted mean.
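
A small sketch, with made-up values and counts, showing that weights pi = ni/N give the same result as the population weighted mean:

```python
values = [1, 2, 3]
counts = [2, 5, 3]                 # n_i: how often each value occurs (made up)
N = sum(counts)

weights = [n / N for n in counts]  # p_i = n_i / N, which sum to 1
prop_mean = sum(p * x for p, x in zip(weights, values))

# Population weighted mean: sum of n_i * x_i divided by N.
pop_mean = sum(n * x for n, x in zip(counts, values)) / N
```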

Suppose you were trying to find the minimum value for a one-tailed confidence interval of a standard deviation at 99% confidence level. Which Excel formula would you use to find the chi-square value in the calculation?

CHISQ.INV(0.99, n-1). Use 0.99 as the probability and n-1 as the degrees of freedom. 0.005 and 0.995 are the probabilities for a two-tailed confidence interval, and you would use 0.99 instead of 0.01 because for the standard deviation/variance, the higher critical value gives you the lower bound.
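
As a worked equation, the reason the higher critical value gives the lower bound: the chi-square value sits in the denominator. Here s is the sample standard deviation, n the sample size, and the subscript 0.99 denotes cumulative probability; this notation is an assumption for the sketch.

```latex
\sigma_{\min} = \sqrt{\frac{(n-1)\,s^{2}}{\chi^{2}_{0.99,\;n-1}}}
```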

What is used to find a confidence interval for the standard deviation of a population?

Chi-Square

Analysis of variance requires the use of which type of distribution?

F

population number symbol

N

Suppose you were trying to find the minimum value for a one-tailed confidence interval of a mean at 99% confidence level. Which Excel formula would you use to find the z-score in the calculation?

NORM.S.INV(0.01), Here, you'd use 0.01 as the probability.
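
For comparison outside Excel, Python's standard library offers an inverse normal CDF that plays the same role. The sample summary below is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

# Same role as NORM.S.INV(0.01): the z-score with 1% of the area to its left.
z = NormalDist().inv_cdf(0.01)

# Hypothetical sample summary, assumed for illustration only.
sample_mean, sigma, n = 100, 15, 36
lower_bound = sample_mean + z * sigma / sqrt(n)  # z is negative, so this subtracts
```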

_________ orders the bins from highest to lowest frequency and adds a cumulative percentage line

Pareto chart
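
A minimal sketch of the numbers behind a Pareto chart, using made-up defect counts: sort by frequency, then accumulate the percentage.

```python
counts = {"scratch": 12, "dent": 30, "crack": 5, "stain": 3}  # made-up data

# Order the bins from highest to lowest frequency.
ordered = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
total = sum(counts.values())

rows = []
running = 0
for category, count in ordered:
    running += count
    # Cumulative percentage: this bin plus everything to its left.
    rows.append((category, count, 100 * running / total))
```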

How do you find the standard deviation for a chi-square (χ²) distribution (variances)?

Take square root of variance

_________ is an example of a distribution: a general function (mathematical or empirical) of a set of x values

a histogram

__________________ establishes a confidence interval for the population mean. (Remember that we only establish confidence intervals for population values, not sample values. We know sample values.)

central limit theorem

_________ in Excel works for standard deviation/variance, if the population is known to be normally distributed

chi-square

What are the two methods of predictive statistics?

confidence interval and hypothesis testing

The cumulative percentage line on a histogram shows what?

everything in that category or to the left of it; can be displayed as a percentage or a whole

What does the cumulative percentage line on a histogram show?

everything in that category or to the left of it; displayed as a percentage or the whole number

How is the correlation between two variables expressed?

expressed by the correlation coefficient r, a value between -1 and +1

A ________ is a bar chart that displays data sorted into categories

histogram

In the WLS data set, if you compared all the cognition scores from 1993 and all the cognition scores from 2003, these two samples would be independent or dependent?

independent. The sample sizes are certainly above 30, so they're large, and since you're comparing all the scores, they're independent.

A one tailed chi-square (χ²) distribution is often used for what type of standard deviation?

maximum

What are the three types of distribution based measurements?

mean, median, and mode

The centroid on a plot is determined by the ______________ of two variables.

means

__________ is the middle of the distribution and is useful because it refers to the distribution, it is resistant to outliers and can be used for ordinal data but usually isn't.

median

The median of the lower group equals what?

median of the lower group = Q1

The median of the upper group equals which quartile?

median of the upper group = Q3

How do you compute the geometric mean?

multiply all the values and take the Nth root of the product; (Πxi)^(1/N)
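
A short sketch of that calculation with made-up values:

```python
from math import prod

values = [2, 8, 4]  # made-up data

# Multiply all the values, then take the Nth root of the product.
gm = prod(values) ** (1 / len(values))
```

The product is 64 and its cube root is 4 (up to floating-point rounding). Python's `statistics.geometric_mean` computes the same quantity.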

How do you compute the population weighted mean?

multiply each value by the frequency it occurs (ni) and divide by the total (N =Σni)

sample number symbol

n

The chi-square (χ²) distribution (variances) has a degree of freedom equal to what?

n-1

________ in Excel works for means and any population if σ is known or assumed

normal

What does the mean say about the distribution?

nothing

sample fraction symbol

p̂

A straight line on a log-log plot indicates a ______________ relationship between two variables.

power
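
To see why, take logs of y = a·x^b: log y = log a + b·log x, which is a straight line with slope b. The sketch below checks this on synthetic data generated from y = 3x²; the data and variable names are assumptions for the example.

```python
from math import exp, log

xs = [1, 2, 4, 8, 16]
ys = [3 * x ** 2 for x in xs]  # synthetic power-law data: a = 3, b = 2

lx = [log(x) for x in xs]
ly = [log(y) for y in ys]
n = len(xs)
mx, my = sum(lx) / n, sum(ly) / n

# Least-squares line through the log-log points.
slope = sum((a - mx) * (b - my) for a, b in zip(lx, ly)) / sum((a - mx) ** 2 for a in lx)
intercept = my - slope * mx

exponent = slope            # recovers b
coefficient = exp(intercept)  # recovers a
```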

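In a negative correlation, what does r equal?

r is below 0 (r = -1 for a perfect negative correlation)

In a weak negative correlation, what does r equal?

r is slightly below 0 (close to 0)

In a strong correlation what does r equal?

r is close to 1 in magnitude (r = 1 or r = -1 for a perfect correlation)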

The chi-square (χ²) distribution (variances) deals with what type of samples?

random

__________frequency is the frequency divided by the total number of data points. It can be displayed as a percentage or decimal.

relative

We measure the heights of 100 UC/Merced students. Is this a population or a sample?

sample

We measure the sleeping habits of 150 UC/Merced students, all women. Is this a population or a sample?

sample

_______ is a small subset of the population

sample

R values only measure linear correlations so they are always in which type of plot?

scatter plot

Which of the following is based on the "third moment" of a data set? (That is, which of the following requires raising something to the third power?)

skew
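
A minimal sketch of skew as a third-moment calculation. This is the unadjusted population form (third central moment divided by the 1.5 power of the variance); spreadsheet and statistics packages may apply a sample-size correction, so their numbers can differ. The data are made up.

```python
def skewness(data):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(data)
    m = sum(data) / n
    m2 = sum((x - m) ** 2 for x in data) / n  # second central moment (variance)
    m3 = sum((x - m) ** 3 for x in data) / n  # third central moment
    return m3 / m2 ** 1.5

symmetric = skewness([1, 2, 3, 4, 5])       # balanced around the mean
right_skewed = skewness([1, 1, 1, 2, 10])   # long tail to the right
```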

A dataframe in R is most closely equivalent to a _______________ in Excel.

spreadsheet

standard deviation formula

sqrt(sum of squared deviations from the mean / (n-1))
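
A short sketch of the formula on made-up data, checked against the standard library's sample standard deviation:

```python
from math import sqrt
from statistics import stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up sample
n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, divided by n - 1, then square-rooted.
sum_sq = sum((x - mean) ** 2 for x in data)
s = sqrt(sum_sq / (n - 1))

library_s = stdev(data)  # same sample formula from the statistics module
```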

__________ organizes printed data like a histogram; it uses the last digit or digits as a horizontal bar

stem and leaf plot

_________ in Excel works for means if the population is known to be normally distributed; useful for small samples as it does not require σ to be known

t

________ is designated as H₀.

the null hypothesis

True or False: The adjusted r² value is always less than the unadjusted r² value.

true

T or F: For the cognition score (1993 survey) variable, the mean can be used.

true

T or F: For the number of days in bed (2011 survey) variable, the mean can be used.

true

T or F: For the number of marriages variable, the mean can be used.

true

T or F: For the parental income variable, the mean can be used.

true

True or False: It's more important for a graph to be clear than use its elements efficiently.

true

The __________ quartile separates the largest 25% from the smallest 75%.

upper

What is on the x and y axis of histograms?

x-axis = categories; y-axis = number in each category

The symbol for sample mean is

x̄

__________ is the number of standard deviations an observation lies above or below the mean

z score
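
A minimal sketch of the z-score calculation; the function name and the numbers (mean 100, standard deviation 15) are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

z_high = z_score(130, 100, 15)  # two standard deviations above the mean
z_mean = z_score(100, 100, 15)  # the mean itself always has z = 0
```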

standard deviation symbol

σ

population variance symbol

σ²

