BUAL 5380

The correlation value ranges from

-1 to +1

The percentage of variation (R2) ranges from

0 to +1

Approximately what percentage of the observed Y values are within one standard error of the estimate of the corresponding fitted Y values?

67%

In a generic box plot, the x inside the box indicates the location of the

A Mean

Which of the following are the three most common measures of central tendency?

A Mean, median, and mode

How is the median defined if the number of observations is even

A The average of the two middle observations
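
A minimal sketch of the even-count rule, assuming a small list of made-up scores; the helper name `median_of` is illustrative and not part of the course material.

```python
import statistics

def median_of(values):
    """Return the median; for an even count, average the two middle values."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

scores = [70, 85, 60, 90]   # hypothetical scores, even count
print(median_of(scores))    # 77.5
```

The standard library's `statistics.median` applies the same rule and returns the same value for this list.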

Measure of central location

A central value that best represents a distribution of data. Measures of central location include the mean, median, and mode. Also called the measure of central tendency.

ordinal data

A statistical data type that exists on an arbitrary numerical scale where the exact numerical value has no significance other than to rank a set of data points. Deals with the order or position of items such as words, letters, symbols or numbers arranged in a hierarchical order. Quantitative assessment cannot be made.

Frequency table

A table for organizing a set of data that shows the number of times each item or number appears.

If a value represents the 95th percentile, this means that

A. 95% of all values are below this value

Another term for constant error variance is

A. Homoscedasticity

In regression analysis, the variables used to help explain or predict the response variable are called the

A. Independent variables

Which of the following definitions best describes parsimony?

A. explaining the most with the least

The average score for a class of 30 students was 75. The 20 male students in the class averaged 70. What did the 10 female students in the class average?

B 85
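
The arithmetic behind this answer can be sketched as follows: the class total equals the sum of the two group totals, so the missing group's mean falls out by subtraction. The function name `missing_group_mean` is an assumption made for illustration.

```python
def missing_group_mean(overall_mean, n_total, known_mean, n_known):
    """Solve overall_total = known_total + missing_total for the missing group's mean."""
    n_missing = n_total - n_known
    missing_total = overall_mean * n_total - known_mean * n_known
    return missing_total / n_missing

# 30 students averaging 75, of whom 20 averaged 70.
female_mean = missing_group_mean(75, 30, 70, 20)
print(female_mean)  # 85.0
```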

Coding males as 1 and females as 0 in a data set illustrates the use of

B. Dummy variables

Many statistical packages have three types of equation-building procedures. They are:

B. Forward, backward and stepwise

The appropriate hypotheses test for an ANOVA test is

B. H0: all β = 0; Ha: at least one β is not equal to 0

What is the most common type of chart for showing the distribution of a numerical variable?

B. Histogram

The appropriate hypothesis test for a regression coefficient is:

B. H0: β = 0; Ha: β ≠ 0

In regression analysis, if there are several explanatory variables, it is called

B. Multiple regression

Researchers may gain insight into the characteristics of a population by examining a

B. Sample of the population

Which of the following is not one of the assumptions of regression

B. The response variable is not normally distributed

The interquartile range (IQR) represents what percent of the observations?

B. The middle 50%

A sample of a population taken at one particular point in time is categorized as

C. Cross-sectional

Gender and state are examples of which type of data

C. Categorical data

If you can determine that the outlier is not really a member of the relevant population, then it is appropriate and probably best to

C. Delete it

In linear regression, we can have an interaction variable. Algebraically, the interaction variable is the ___ of two other variables in the regression equation

C. Product
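
As a small sketch of the idea, an interaction column can be built by multiplying two explanatory columns row by row. The data and the names `rows` and `add_interaction` are hypothetical; the second column is a dummy variable with males coded 1 and females coded 0.

```python
# Hypothetical rows: (years_experience, is_male), males coded 1 and females 0.
rows = [(2, 1), (5, 0), (8, 1), (3, 0)]

def add_interaction(rows):
    """Append the product of the two explanatory variables to each row."""
    return [(x1, x2, x1 * x2) for x1, x2 in rows]

design = add_interaction(rows)
print(design)  # [(2, 1, 2), (5, 0, 0), (8, 1, 8), (3, 0, 0)]
```

Because the dummy is 0 for females, the interaction term vanishes for them, which lets the slope on experience differ between the two groups.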

In order for the characteristics of a sample to be generalized to the entire population, it should be:

C. Representative of the population

The percentage of variation (R2) can be interpreted as the fraction (or percent) of variation of the

C. Response variable explained by the regression line.
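
One way to see this interpretation is to compute R² as 1 - SSE/SST, which holds for a least squares line with an intercept. The sketch below uses made-up data; the fitted values come from the least squares line y = 2.2 + 0.6x for x = 1, ..., 5.

```python
def r_squared(y, fitted):
    """Fraction of the variation in y explained by the fitted values: 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # unexplained variation
    sst = sum((yi - mean_y) ** 2 for yi in y)                # total variation
    return 1 - sse / sst

y = [2.0, 4.0, 5.0, 4.0, 5.0]
fitted = [2.8, 3.4, 4.0, 4.6, 5.2]   # from y = 2.2 + 0.6x, x = 1..5
print(round(r_squared(y, fitted), 3))  # 0.6
```

Here 60% of the variation in the response is explained by the line and 40% is left in the residuals.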

The test statistic in an ANOVA analysis is

C. The F-Statistic

Which one of the following is not one of the assumptions of regression

C. The standard deviation of the response variable increases as the explanatory variables increase

In the standardized value (b1 - β1)/s_b1, the symbol s_b1 represents the

C. standard error of b1

Data collected at approximately the same period of time from a cross-section of a population are called

Cross-sectional data

A multiple regression analysis including 50 data points and 5 independent variables results in (formula). The multiple standard error estimate will be.

D. 0.953

Expressed in percentiles, the interquartile range is the difference between the

D. 25th and 75th percentiles

Categorizing age variables as "young" "middle-aged" and "elderly" is an example of

D. Binning
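
A minimal sketch of binning a numeric age into the three labels. The cutoffs (34 and 64) are illustrative assumptions only; the course material does not specify them.

```python
def bin_age(age, young_max=34, middle_max=64):
    """Map a numeric age to a category; the cutoffs are illustrative assumptions."""
    if age <= young_max:
        return "young"
    if age <= middle_max:
        return "middle-aged"
    return "elderly"

ages = [22, 41, 67, 35]
print([bin_age(a) for a in ages])  # ['young', 'middle-aged', 'elderly', 'middle-aged']
```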

Data that arise from counts are called

D. Discrete Data

___ is/are especially helpful in identifying outliers

D. Scatterplots

continuous data

Data that can take on any value. There is no space between data values for a given domain. Graphs are represented by solid lines.

Nominal Data

Data which consists of names, labels, or categories.

discrete data

Data with space between possible data values. Graphs are represented by dots.

In regression analysis, the variable we are trying to explain or predict is called the

Dependent variable

What is the decision making process?

A purposeful, goal-directed effort that uses a systematic process to choose among options: 1. Identify and define the problem. 2. Gather data. 3. Analyze the data. 4. Identify the options or solutions. 5. Weigh the pros and cons of each option. 6. Select an option and make the decision.

Outliers are observations that

Lie outside the typical pattern of points on a scatterplot

Empirical Rule

The rule gives the approximate percentage of observations within 1 standard deviation (68%), 2 standard deviations (95%), and 3 standard deviations (99.7%) of the mean when the histogram is well approximated by a normal curve
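
The three percentages can be checked against the normal curve directly. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

def share_within(k):
    """Probability that a normal value lies within k standard deviations of its mean."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * share_within(k), 1))
# 1 68.3
# 2 95.4
# 3 99.7
```

The rule's 68%, 95%, and 99.7% are rounded versions of these values.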

Quartiles

Values that divide a data set into four equal parts

In regression analysis, which of the following causal relationships are possible?

X causes Y to vary; Y causes X to vary; other variables cause both X and Y to vary

In multiple regression, the coefficients reflect the expected change in:

Y when the associated X value increases by one unit

The weakness of scatterplots is that they

do not actually quantify the relationships between variables

A single variable X can explain a large percentage of the variation in some other variable Y when the two variables are

highly correlated

Regression analysis asks

how a single variable depends on other relevant variables

In multiple regression, the constant

is the expected value of the dependent variable Y when all of the independent variables have the value zero

The covariance is not used as much as the correlation because

it is difficult to interpret
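
One reason covariance is hard to interpret is that its size depends on the units of measurement, while correlation does not. The sketch below uses made-up height and weight values; the function names are illustrative.

```python
def covariance(x, y):
    """Sample covariance (divides by n - 1)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance rescaled by both standard deviations, so it is unit-free."""
    return covariance(x, y) / (covariance(x, x) ** 0.5 * covariance(y, y) ** 0.5)

height_m = [1.60, 1.70, 1.80, 1.90]
weight_kg = [55.0, 68.0, 72.0, 85.0]
height_cm = [100 * h for h in height_m]   # same data, different unit

# Covariance grows by a factor of 100 when metres become centimetres...
print(covariance(height_m, weight_kg), covariance(height_cm, weight_kg))
# ...while the correlation is unchanged (up to floating-point rounding).
print(correlation(height_m, weight_kg), correlation(height_cm, weight_kg))
```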

A correlation value of zero indicates

no linear relationship
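
The word "linear" matters here. In the sketch below, with made-up data, y is completely determined by x (y = x²), yet the correlation is zero because the relationship is not linear.

```python
def correlation(x, y):
    """Pearson correlation computed from deviations about the means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

x = [-2, -1, 0, 1, 2]
y = [v ** 2 for v in x]      # y depends entirely on x, but not linearly
print(correlation(x, y))     # 0.0
```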

A scatterplot that appears as a shapeless mass of data points indicates

no relationship among the variables

An error term represents the vertical distance from any point to the

population regression line

In linear regression, we fit the least squares line to a set of values (or points on a scatterplot). The vertical distance from the line to a point is called the

residual

In choosing the "best-fitting" line through a set of points in linear regression, we choose the one with the

smallest sum of squared residuals

The standard error of the estimate (Se) is essentially the

standard deviation of the residuals
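
A minimal sketch tying the least squares line, the residuals, and the standard error of the estimate together for simple regression. The data are made up and the function names are illustrative. The divisor n - 2 reflects one explanatory variable plus an intercept; with k explanatory variables it would be n - k - 1.

```python
def fit_line(x, y):
    """Least squares intercept and slope for a simple regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

def standard_error_of_estimate(x, y):
    """sqrt(SSE / (n - 2)), roughly the standard deviation of the residuals."""
    intercept, slope = fit_line(x, y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]  # observed minus fitted
    sse = sum(r ** 2 for r in residuals)
    return (sse / (len(x) - 2)) ** 0.5

x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
print(fit_line(x, y))                     # intercept about 2.2, slope about 0.6
print(standard_error_of_estimate(x, y))   # about 0.894
```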

mean

the arithmetic average of a distribution, obtained by adding the scores and then dividing by the number of scores

Interquartile Range (IQR)

the difference between the first and third quartiles
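
As a sketch, the IQR is the third quartile minus the first (75th minus 25th percentile). Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of the standard library's `statistics.quantiles`, and the data are made up.

```python
import statistics

def iqr(values):
    """Third quartile minus first quartile (75th minus 25th percentile)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(iqr(data))  # 4.0
```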

Median

the middle score in a distribution; half the scores are above it and half are below it

Mode

the most frequently occurring score(s) in a distribution

In linear regression, the fitted value is

the predicted value of the dependent variable

Given the least squares regression line y = 8 - 3x, which statement is true?

the relationship between x and y is negative

population standard deviation

the square root of the population variance

Correlation is a summary measure that indicates

the strength of the linear relationship between pairs of variables

The term autocorrelation refers to

time series variables are usually related to their own past values
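
One common way to quantify this is the lag-1 autocorrelation: the correlation between a series and the same series shifted back by one period. The sketch below uses a made-up trending series and the usual sample formula, with deviations taken from the overall mean; the function name is illustrative.

```python
def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation using deviations from the overall mean."""
    mean = sum(series) / len(series)
    dev = [v - mean for v in series]
    numerator = sum(dev[t] * dev[t - 1] for t in range(1, len(dev)))
    denominator = sum(d ** 2 for d in dev)
    return numerator / denominator

trend = [1, 2, 3, 4, 5, 6]          # each value is close to the one before it
print(lag1_autocorrelation(trend))  # 0.5
```

A positive value means that high values tend to follow high values and low values tend to follow low ones.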

