Research/Program Eval (Purple Book)

The hunch is known as the experimental or alternative hypothesis. The experimental hypothesis suggests that a difference will be evident between the control group and the experimental group (i.e., the group receiving the IV). Thus, if the experiment in question 708 were conducted, the experimental hypothesis would suggest that a. the biofeedback would raise board scores. b. the control group will score better on the board exam. c. there will be no difference between the experimental and the control groups. d. the experiment has been confounded.

a. the biofeedback would raise board scores. An alternative hypothesis—which may be called the "affirmative hypothesis" on your exam—asserts that the IV has indeed caused a change.

The median is a. the middle score when the data are arranged from highest to lowest. b. the arithmetic average. c. the most-frequent value obtained. d. never more useful than the mean.

a. the middle score when the data are arranged from highest to lowest. In studies measuring variables with extreme scores, the median would be the best statistic.

The simplest form of descriptive research is the ________, which requires a questionnaire return or completion rate of ________ to be accurate. a. survey; 5% b. survey; 10-25% c. survey; 50-75% d. survey; 95%

c. survey; 50-75% It has been estimated that in most surveys the return rate hovers around the 40% mark. In a survey, the researcher attempts to gather large amounts of data, often utilizing a questionnaire or an interview, in order to generate generalizations regarding the behavior of the population as a whole.

Assume the experiment in question 708 is conducted. The results indicate that the biofeedback helped raise written board exam scores but in reality this is not the case. The researcher has made a a. Type I error. b. Type II error. c. beta error. d. b and c.

a. Type I error. First, write down (or mentally picture) the null hypothesis regarding the experiment in question. In this case null would indicate that biofeedback did not raise board exam scores. This question tells you that the experimental results revealed that biofeedback did raise board scores, so you will reject the null hypothesis. The question then goes on to say that in reality the biofeedback did not really cause the results. Therefore, you have rejected null when it is true/applicable. This is the definition of a Type I or alpha error. Since the experimenter sets the alpha level, they are always cognizant of the probability of making a Type I error.

In the social sciences the accepted probability level is usually a. .05 or less. b. 1.0 or higher. c. .0001 or less. d. .05 or higher.

a. .05 or less. The two most popular levels of significance are .05 and .01.

The standard deviation (SD) is the square root of the variance. A z-score of +1 would be the same as a. 1 SD above the mean. b. 1 SD below the mean. c. the same as a so-called t-score. d. the median score if the population is normal.

a. 1 SD above the mean. Z-scores are the same as standard deviations! In fact, z-scores are often called standard scores. A z-score is the most elementary type of standard score. The area between the mean and a z-score of +1 (1 SD above the mean) includes about 34% of the cases in a normal population. The normal distribution also can be described using t-scores, sometimes called "transformed scores." The t-score uses a mean of 50 with each SD as 10. Hence, a z-score of -1.0 would be a t-score of 40. A z-score of -1.5 would be a t-score of 35 and so on.

Z-scores (also called standard scores) are the same as standard deviations, thus a z-score of -2.5 means a. 2.5 SD below the mean. b. 2.5 SD above the mean. c. a CEEB score of 500. d. -.05% of the population falls within this area of the curve.

a. 2.5 SD below the mean. This would be a t-score of 25. Let's examine choice "c," which expresses the abbreviation for the College Entrance Examination Board (CEEB) scores. This standard score is used for tests such as the GRE or the SAT. The scale ranges from 200 to 800 with a mean of 500. CEEB scores use a standard deviation of 100. Scores lower than 200 or above 800 are simply rated as end-point scores. A score of 200 corresponds to 3 SD below the mean with 800 landing at a point 3 SD above the mean. Therefore, in this case, choice "c" would need to read "a CEEB score of 250" to be accurate. An exam could refer to a CEEB score as an ETS score. The scale was created to eliminate negative scores.
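
The standard-score conversions described above (t-scores with a mean of 50 and SD of 10, CEEB scores with a mean of 500 and SD of 100) can be sketched in a few lines of Python. This is only an illustration; the function names t_score and ceeb_score are hypothetical, and the clamping to 200 and 800 follows the end-point description in the answer above.

```python
def t_score(z):
    """Convert a z-score to a t-score (mean 50, SD 10)."""
    return 50 + 10 * z


def ceeb_score(z):
    """Convert a z-score to a CEEB score (mean 500, SD 100).

    Values beyond the scale are reported as the end points 200 or 800.
    """
    raw = 500 + 100 * z
    return max(200, min(800, raw))


print(t_score(-1.0))     # 40.0
print(t_score(-2.5))     # 25.0
print(ceeb_score(-2.5))  # 250.0
```

A z-score of -2.5 therefore lands at a t-score of 25 and a CEEB score of 250, matching the explanation for choice "c."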

The range is a measure of variability and usually is calculated by determining the difference between the highest and the lowest score. Thus, on a test where the top score was a 93 and the lowest score was a 33 out of 100, the range would be a. 61. b. 77. c. 59. d. more information is necessary.

a. 61. The range is the simplest way to measure the spread of scores. Technically, statistics that measure the spread of scores are known as "measures of variability." The range is usually calculated by subtracting the lowest score from the highest score. If the test specifies the "inclusive range," then use the formula with plus 1: 93 - 33 + 1 = 61. If not, go with the "exclusive range" formula, which does not add the 1: 93 - 33 = 60. Because 60 is not among the choices here, the inclusive range of 61 is the intended answer.
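
The two range formulas mentioned above can be compared directly. The sketch below is illustrative only: the function names are hypothetical, and the two middle scores are invented so the list has more than the top and bottom values from the question.

```python
def exclusive_range(scores):
    """Highest score minus lowest score."""
    return max(scores) - min(scores)


def inclusive_range(scores):
    """Highest score minus lowest score, plus 1."""
    return max(scores) - min(scores) + 1


# 93 and 33 come from the question; 58 and 71 are made-up filler scores.
scores = [33, 58, 71, 93]

print(exclusive_range(scores))  # 60
print(inclusive_range(scores))  # 61
```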

All of the following describe the analysis of covariance technique except: a. It is a correlation coefficient. b. It controls for sample differences which exist. c. It helps to remove confounding, extraneous variables. d. It statistically eliminates differences in average values influenced by covariates.

a. It is a correlation coefficient. The ANCOVA is similar to the ANOVA yet more powerful because it can help to eliminate differences between groups which otherwise could not be solely attributed to the experimental IVs. Although ideally each random sample will be equal to every other random sample this is not always the case. A so-called COVARIATE, which correlates with the DV, could be present. The ANCOVA tests a null hypothesis regarding the means of two or more groups after the random samples are adjusted to eliminate average differences. It is often referred to as an "adjusted average" statistical procedure.

The ordinal scale rank orders variables, though the relative distance between the elements is not always equal. An example of this would be a. a horse categorized as a second-place winner in a race. b. an IQ score of 111. c. the weight of an Olympic barbell set. d. a temperature of 78 degrees Fahrenheit.

a. a horse categorized as a second-place winner in a race. This is the second level of measurement. Nominal data do not rank order the data like ordinal data. The rank does not indicate absolute differences. Thus, you could not say that the first-, second-, and third-place horses were equidistant apart. The ordinal scale provides relative placement or standing but does not delineate absolute differences.

In a normal curve the mean, the median, and the mode all fall precisely in the middle of the curve. From a graphical standpoint the so-called normal or Gaussian curve (named after the astronomer/mathematician K. F. Gauss) looks like a. a symmetrical bell. b. the top half of a bowling ball. c. the top half of a hot dog. d. a mountain which is leaning toward the left.

a. a symmetrical bell. The normal curve is a theoretical notion often referred to as a "bell-shaped curve." The bell is symmetrical. Most physical and psychological traits are normally distributed. If enough data are collected in regard to a given trait, and a frequency polygon is constructed, it will resemble the bell-shaped curve. Curves that are not symmetrical (those which are asymmetrical) are called "skewed distributions." The 68-95-99.7 rule (the empirical rule) states that in a normal distribution 68% of the scores fall within +/-1 standard deviation (SD) of the mean; 95% within +/-2 SDs of the mean; and 99.7% within +/-3 SDs of the mean. The verdict: almost all the scores will fall within 3 SDs of the mean.
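
The 68-95-99.7 rule can be checked numerically. The sketch below assumes a perfectly normal distribution and uses the error function from Python's standard math module; the function name proportion_within is hypothetical.

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    # Prints roughly 68.3, 95.4, and 99.7 percent.
    print(k, round(proportion_within(k) * 100, 1))
```

Half of the 1 SD figure, about 34%, is the share of cases between the mean and 1 SD on one side.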

The interval scale has numbers scaled at equal distances but has no absolute zero point. Most tests used in school fall into this category. You can add and subtract using interval scales but cannot multiply or divide. An example of this would be that a. an IQ of 70 is 70 points below an IQ of 140, yet a counselor could not assert that a client with an IQ of 140 is twice as intelligent as a client with an IQ of 70. b. a 20 lb weight is half as heavy as a 40 lb weight. c. a first-place runner is three times as fast as the third-place finisher. d. a baseball player with number 9 on his uniform can get 9 times more hits than player number 1.

a. an IQ of 70 is 70 points below an IQ of 140, yet a counselor could not assert that a client with an IQ of 140 is twice as intelligent as a client with an IQ of 70. Since the intervals are the same, the amount of difference can be stipulated (e.g., three IQ points). Using this scale, distances between each number are equal yet it is unclear how far each number is from zero. Division is not permissible inasmuch as division assumes an absolute zero. (If you had an absolute zero then you could in fact assert that a person with an IQ of 140 would be twice as smart as someone with an IQ of 70. But of course, zero on an IQ test does not equal zero knowledge; hence, IQ tests provide interval measurement.)

A good guess would be that if you would correlate the length of CACREP graduates' baby toes with their CPCE scores the result would be a. close to 0.00. b. close to a perfect 1.00. c. close to a perfect negative correlation of -1.00. d. be about +.70.

a. close to 0.00. There is an absence of association here because as one variable changes the other variable varies randomly. The variation of one variable is most likely totally unrelated to the variation of the other.

A client goes to a string of 14 chemical dependency centers that operate on the 12-step model. When his current therapist suggests a new inpatient program the client responds with, "What for, I already know the 12 steps?" This client is using a. deductive logic. b. inductive logic. c. an empathic assertion. d. an I statement.

a. deductive logic. Here the client assumes that the general (his experience in 14 treatment facilities) can be reduced (deduction—remember your memory device) to the specific (the new treatment program).

P = .05 really means that a. differences truly exist; the experimenter will obtain the same results 95 times out of 100. b. differences truly exist; the experimenter will obtain the same results 99 times out of 100. c. there is a 95% error factor. d. there is a 10% error factor.

a. differences truly exist; the experimenter will obtain the same results 95 times out of 100.

A large study at a major university gave an experimental group of clients a new type of therapy that was intended to ameliorate test anxiety. The control group did not receive the new therapy. Neither the clients nor the researchers knew which students received the new treatment. This was a a. double-blind study. b. single-blind study. c. typical AB design. d. case of correlational research.

a. double-blind study. A double-blind study goes one step beyond the single-blind version by making certain that the experimenter is also unaware of the subjects' status. In fact, in the double-blind situation the persons assigned to rate or judge the subjects are often unaware of the hypothesis. This procedure helps eliminate confounding caused by "experimenter effects." Experimenter effects can flaw an experiment because the experimenter might unconsciously communicate his or her intent or expectations to the subjects. Choice "c," though incorrect, is a must-know concept. An AB or ABA time-series design is the simplest type of single-subject research and was initially popularized by behavior modifiers in the 1960s and 1970s. (You will recall that Freud used the case study paradigm, but needless to say, did not rely on the AB or ABA model.) Single-subject case studies of various types are once again gaining in popularity. Okay, back to the AB and ABA models that rely on "continuous-measurement." A baseline is secured (A); intervention is implemented (B); and the outcome is examined via a new baseline (A) in the case of the ABA design. In order to improve the research process, an ABAB design can be utilized to better rule out extraneous variables. If the pattern for the second AB administration mimics that of the first AB, then the chances increase that B (the intervention or so-called treatment) caused the changes rather than an extraneous variable. Some exams will refer to ABA or ABAB paradigms as "withdrawal designs." The rationale is that the behavior will move in the direction of the initial baseline each time the treatment is withdrawn if the treatment IV is responsible for the change. The ethical counselor must forego using a withdrawal or reversal design if the removal of the treatment variable could prove har

Test scores on an exam that fell below 3 SD of the mean or above 3 SD of the mean could be described as a. extreme. b. very typical or within the average range. c. close to the mean. d. very low scores.

a. extreme. If you graph this situation you will note that these scores would be unusually high (which negates choices "b," "c," and "d") or very low.

Standardized tests always have a. formal procedures for test administration and scoring. b. a mean of 100 and an SD of 15. c. a mean of 100 and a standard error of measurement of 3. d. a reliability coefficient of +.90 or above.

a. formal procedures for test administration and scoring. Standardization implies that the testing format, the test materials, and the scoring process are consistent.

A counselor educator is teaching two separate classes in individual inventory. In the morning class the counselor educator has 53 students and in the afternoon class she has 177 students. A statistician would expect that the range of scores on a test would be a. greater in the afternoon class than the morning class. b. smaller in the afternoon class. c. impossible to speculate about without more data. d. nearly the same in either class.

a. greater in the afternoon class than the morning class. The range generally increases with sample size.

A distribution with class intervals can be graphically displayed via a bar graph also called a a. histogram. b. sociogram. c. genogram. d. genus.

a. histogram. Most bar graphs are drawn in a vertical fashion. When the bars are drawn horizontally it is sometimes called a "horizontal bar chart." A "double-barred histogram" can be used to compare two distributions of scores such as pre- and posttest scores.

In experimental terminology IV stands for ________ and DV stands for ________. a. independent variable; dependent variable b. dependent variable; independent variable c. individual variable; dependent variable d. independent variable; designer variable

a. independent variable; dependent variable Variables in an experiment are categorized as independent variables (IVs) or dependent variables (DVs). A variable is a behavior or a circumstance that can exist on at least two levels or conditions. In an experiment the IV is the variable that the researcher manipulates, controls, alters, or wishes to experiment with. The DV expresses the outcome, that is, the data regarding factors you wish to measure. IVs and DVs can be discrete (e.g., a brand of counseling or occupation) or continuous (e.g., height or weight). If your exam describes a true experiment (such as the biofeedback research described in the next several questions) except for the fact that the groups were not randomly assigned, then the new exams are calling this a causal comparative design. Data gleaned from the causal comparative, ex post facto, or after-the-fact design can be analyzed with a test of significance (e.g., a t test or ANOVA) just like any true experiment.

Occam's Razor suggests that experimenters a. interpret the results in the simplest manner. b. interpret the results in the most complex manner. c. interpret the results using a correlation coefficient. d. interpret the results using a clinical interview.

a. interpret the results in the simplest manner. Exams often refer to parsimony as Occam's Razor, the principle of economy, or Lloyd Morgan's 1894 Canon (canon in this sense means "law"). Conway Lloyd Morgan was an English psychologist/physiologist, while William of Occam was a fourteenth-century philosopher and theologian. The early behaviorists (e.g., Watson) adhered closely to this principle.

Experiments emphasize parsimony, which means a. interpreting the results in the simplest way. b. interpreting the results in the most complex manner. c. interpreting the results using a correlation coefficient. d. interpreting the results using a clinical interview.

a. interpreting the results in the simplest way. Parsimonious literally means a tendency to be miserly and not overspend. A parsimonious individual is said to be overly economical and stingy. In research, we strive for parsimony in the sense that the easiest and least complex explanation is said to be the best; an economical description if you will. The simplest explanation of the findings is always preferred. The factor analysis mentioned in the previous answer is parsimonious in the sense that 10 tests which measure the dimensions of an effective counselor can be explained via a short measure which describes three underlying variables. Factor analysis, then, is concerned with data reduction.

The most useful measure of central tendency is the a. mean, often abbreviated by an X with a bar over it. b. median, often abbreviated by Md. or Mdn. c. mode, often abbreviated by Mo. d. point of maximum concentration.

a. mean, often abbreviated by an X with a bar over it. The mean is the most useful of the three measures of central tendency. If a distribution is plagued with extreme scores then the "median" is the statistic of choice. The median is best for skewed distributions. Choice "d" is a definition of choice "c," mode, which researchers consider the least-important measure of central tendency.
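
To see why the median is preferred when extreme scores are present, the three measures of central tendency can be computed with Python's standard statistics module. The scores below are hypothetical, with 200 included as a deliberately extreme value.

```python
import statistics

# Hypothetical scores; 200 is an extreme score that pulls the mean upward.
scores = [70, 72, 75, 75, 78, 80, 200]

print(statistics.mean(scores))    # about 92.9, inflated by the extreme score
print(statistics.median(scores))  # 75, the middle score
print(statistics.mode(scores))    # 75, the most frequent score
```

Only the mean moves noticeably when the extreme score is added; the median and the mode stay at 75.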

The y axis is used to plot the frequency of the DVs. The y axis could also be called the ______ on your exam. a. ordinate b. abscissa c. IV d. horizontal axis

a. ordinate The ordinate plots the DV or experimental data.

An operational definition a. outlines a procedure. b. is theoretical. c. outlines a construct. d. is synonymous with the word axiom.

a. outlines a procedure. Choice "d" or axiom, unlike a theory, is a universally accepted idea needing no additional proof. It is very important that researchers "operationally define" procedures so that other researchers can attempt to "replicate" an experimental procedure. Replication implies that another researcher can repeat the experiment exactly as it was performed before. In most cases, counselors would not accept a finding as scientific unless an experiment has been replicated.

If data indicate that students who study a lot get very high scores on state counselor licensing exams, then the correlation between study time and LPC exam scores would be a. positive. b. negative. c. 0.00. d. impossible to ascertain.

a. positive. A positive correlation is evident when both variables change in the same direction. A negative correlation is evident when the variables are inversely associated; one goes up and the other goes down. In the scenario for this question the relationship is positive since as study time increases, LPC exam scores also increase. A negative correlation (choice "b") would be expected when correlating an association like the number of dental cavities and time spent brushing one's teeth; as brushing time goes up dental cavities probably go down. Choice "c" or a zero correlation indicates an absence of a relationship between the variables in question. An exam could throw the term biserial correlation at you. This indicates that one variable is continuous (i.e., measured using an interval scale) while the other is dichotomous. An example would be evident if you decided to correlate state licensing exam scores to NCC status (dichotomy is licensed/unlicensed). If both variables are dichotomous (i.e., two valued) then a phi-coefficient correlation is necessary.

A researcher notes that a group of clients who are not receiving counseling, but are observed in a research study, are improving. Her hypothesis is that the attention she has given them has been curative. The best explanation of their improvement would be a. the Hawthorne effect. b. the Halo effect. c. the Rosenthal effect. d. a Type II error in the research.

a. the Hawthorne effect. This relates to the famous study by Australian psychologist Elton Mayo and Fritz Roethlisberger and colleagues that took place from 1924 to 1932 at the Hawthorne Works of the Western Electric Company, Cicero, Illinois. The research indicated that work production tended to increase with better lighting or worse lighting conditions. If subjects know they are part of an experiment, or if they are given more attention because of the experiment, their performance sometimes improves. When observations are made and the subjects' behavior is influenced by the very presence of the researcher, it is often called a "reactive effect" or "reactivity" of observation/experimentation. The subject is said to be reacting to the presence of the investigation. As mentioned in question 760, this is sometimes known as an observer effect.

A t-score is different from a z-score. A z-score is the same as the SD. A t-score, however, has a mean of 50 with every 10 points landing at a SD above or below the mean. Thus a t-score of 60 would equal +1 SD while a t-score of 40 would be a. -2 SD. b. -1 SD. c. a z-score of +2. d. a z-score of +1.

b. -1 SD. Choice "a" would be a t-score of 30, choice "c" a t-score of 70, and choice "d" a t-score of 60.

The most common measures of central tendency are the mean, the median, and the mode. The mode is a. the most frequently occurring score and the least-important measure of central tendency. b. always 10% less than the mean. c. the arithmetic average. d. the middle score in the distribution of scores.

a. the most frequently occurring score and the least-important measure of central tendency. The mode is the highest or maximum point of concentration. The French phrase à la mode means "in style" or "in vogue." The mode is the score that is most "in style" or occurs the most. Just remember that pie à la mode has a "high concentration of calories." The modal score is the highest point on the curve. Statisticians refer to choice "c" as the "mean" and choice "d" as the "median."

When a researcher uses correlation, then there is no direct manipulation of the IV. A researcher might ask, for example, how IQ correlates with the incidence of panic disorder. Again, nothing is manipulated; just measured. In cases such as this a correlation coefficient will reveal a. the relationship between IQ and panic disorder. b. the probability that a significant difference exists. c. an F test. d. percentile rank.

a. the relationship between IQ and panic disorder. A statistic that indicates the degree or magnitude of relationship between two variables is known as a "correlation coefficient" and is often abbreviated using a lower-case r. A coefficient of correlation makes a statement regarding the association of two variables and how a change in one is related to the change in another. Correlations range from 0.00, no relationship, to 1.0 or -1.0 which signify perfect relationships. A positive correlation is not a stronger relationship than a negative one of the same numerical value. A correlation of -.70 is still indicative of a stronger relationship than a positive correlation of .60.
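
A correlation coefficient can be computed by hand with a short function. The sketch below uses the standard Pearson r formula; the function name pearson_r and all of the data are hypothetical, chosen only to show one positive and one negative relationship.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical data: more study time goes with higher exam scores.
study_hours = [2, 4, 6, 8, 10]
exam_scores = [60, 68, 71, 80, 88]

# Hypothetical data: more brushing goes with fewer cavities.
brushing_minutes = [1, 2, 3, 4, 5]
cavities = [6, 5, 3, 2, 1]

print(pearson_r(study_hours, exam_scores))    # positive, close to +1
print(pearson_r(brushing_minutes, cavities))  # negative, close to -1
```

The sign gives the direction of the relationship, while the absolute value gives its strength, which is why -.70 indicates a stronger relationship than +.60.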

A counselor believes that clients who receive assertiveness training will ask more questions in counseling classes. An experimental group receives assertiveness training while a control group does not. In order to test for significant differences between the groups the counselor should utilize a. the student's t test. b. a correlation coefficient. c. a survey. d. an analysis of variance (ANOVA).

a. the student's t test. When comparing two sample groups the t test, which is a simplistic form of the analysis of variance, is utilized. The t test is used to ascertain whether two sample means are significantly different. The researcher sets the level of significance and then runs the experiment. The t test is computed and this yields a t value. The researcher then goes to a t table found in the appendix of most statistics texts. If the t value obtained statistically is lower than the t value (sometimes called "critical t") in the table, then you accept the null hypothesis. Your computation must exceed the number cited in the table in order to reject null. If there are more than two groups, then the analysis of variance (choice "d") is utilized. The results of an ANOVA yield an F-statistic. The researcher then consults an F table for a critical value of F. If F obtained (i.e., computed) exceeds the critical F value in the table, then the null hypothesis is rejected. Other tests are: the analysis of covariance (ANCOVA), which tests two or more groups while controlling for extraneous variables that are often called "covariates"; the Kruskal-Wallis, which is used instead of the one-way ANOVA when the data are nonparametric; the Wilcoxon signed-rank test, used in place of the t test when the data are nonparametric and you wish to test whether two correlated means differ significantly (use the "co" to remind you of "correlated"); the Mann-Whitney U test, to determine whether two uncorrelated means differ significantly when data are nonparametric (the "u" can remind you of "uncorrelated"); the Spearman correlation or Kendall's tau, which is used in place of the Pearson r when parametric assumptions cannot be utilized; and the chi-square nonparametric test, which examines whether obtained frequencies differ significantly f

An experiment is said to be confounded when a. undesirable variables are not kept out of the experiment. b. undesirable variables are kept out of the experiment. c. basic research is used in place of applied research. d. the sample is random.

a. undesirable variables are not kept out of the experiment. Confounding is said to occur when an undesirable variable which is not controlled by the researcher is introduced in the experiment; this could be referred to as a contaminating variable. Basic research is conducted to advance our understanding of theory, while applied research (also called action research or experience-near research) is conducted to advance our knowledge of how theories, skills, and techniques can be used in terms of practical application. Often counselors assert that much of the research is not relevant to the actual counseling process and indeed they are correct.

There are four basic measurement scales: the nominal, the ordinal, the interval, and the ratio. The nominal scale is strictly a qualitative scale. It is the simplest type of scale. It is used to distinguish logically separated groups. Which of the following illustrates the function of the nominal scale? a. A horse categorized as a second-place winner in a show. b. A DSM or ICD diagnostic category. c. An IQ score of 111. d. The weight of an Olympic barbell set.

b. A DSM or ICD diagnostic category. The order of complexity of S. S. Stevens's four types of measurement scales can be memorized by noting the French word noir meaning black (nominal, ordinal, interval, ratio). Parametric tests rely strictly on interval and ratio data, while nonparametric tests are designed only for nominal or ordinal information. The nominal scale is the most elementary as it does not provide "quantitative" (measurable) information. The nominal scale merely classifies, names, labels, or identifies by group (choice "b" is thus correct). A nominal scale has no true zero point and does not indicate order. Other examples would be a street address, telephone number, political party affiliation, brand of therapy, or number on a player's uniform. Adding, subtracting, multiplying, or dividing the aforementioned nominal categories would prove meaningless.

A doctoral student who begins working on his bibliography for his thesis would most likely utilize a. SPSS. b. ERIC, for primary and secondary resources. c. O*NET. d. a random number table or random number-generation computer program.

b. ERIC, for primary and secondary resources. The Educational Resources Information Center (ERIC), www.eric.ed.gov/, is a resource bank of scholarly literature and resources. If you say that Ellis said such and such and reference a book Ellis wrote then the resource or documentation is primary. If you say that Ellis said such and such and quote a general counseling text then the resource is considered secondary. The Statistical Package for the Social Sciences (SPSS) is a popular computer software program that can ease the pain of computing your statistics by hand (e.g., a t test, correlation, or ANOVA).

A bimodal distribution has two modes (i.e., most frequently occurring scores). Graphically, this looks roughly like a. a symmetrical bell-shaped curve. b. a camel's back with two humps. c. the top half of a bowling ball. d. a mountain which is leaning toward the left.

b. a camel's back with two humps. When a curve exhibits more than two peaks it is known as a "multimodal" distribution. This can be contrasted to the curve with just a single peak (e.g., the normal curve) which is said to be "unimodal."

The x axis is used to plot the IV scores. The x axis could also be called the _______ on your exam. a. y axis b. abscissa c. DV d. vertical axis

b. abscissa The horizontal axis plots the IV, the factor manipulated by the experimenter.

A group of first-semester graduate students in counseling took an experimental counseling exam that was much more difficult than the NCE. All of the students scored very low. A distribution of their scores would a. always be a bimodal distribution. b. be positively skewed. c. be negatively skewed. d. produce a curve with a long tail to the left side of the graph.

b. be positively skewed. Most of the scores would fall on the left or the low side of the distribution. Graphically then, the "tail" of the distribution would point to the right or the positive side. The tail indicates whether the distribution is positively or negatively skewed.
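
The direction of skew can also be checked numerically. The sketch below uses the moment-based skewness coefficient, which is one common choice among several; the function name and the scores are hypothetical. Most scores are low, with one high score forming the tail on the right.

```python
def skewness(scores):
    """Moment-based skewness: positive means the tail points right."""
    n = len(scores)
    mean = sum(scores) / n
    m2 = sum((x - mean) ** 2 for x in scores) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in scores) / n  # third central moment
    return m3 / m2 ** 1.5


# Hypothetical scores on a very hard exam: mostly low, one high score.
hard_exam = [10, 12, 12, 13, 14, 15, 15, 16, 40]

# Mirror image of the same scores: mostly high, one low score.
easy_exam = [100 - s for s in hard_exam]

print(skewness(hard_exam) > 0)  # True: positively skewed
print(skewness(easy_exam) < 0)  # True: negatively skewed
```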

Switching the order in which stimuli are presented to a subject in a study is known as a. the Pygmalion effect. b. counterbalancing. c. ahistoric therapy. d. multiple treatment interference.

b. counterbalancing. Let choice "a" come as no surprise if it shows up on an exam. The Rosenthal/Experimenter effect often shows up wearing this name tag. The experimenter falls in love with their own hypothesis and the experiment becomes a self-fulfilling prophecy. Choice "b," the correct answer, is used to control for the fact that the order of an experiment could impact upon its outcome. The solution is merely to change the order of the experimental factors. Choice "c," "ahistoric therapy," connotes any psychotherapeutic model that focuses on the here and now rather than the past. Choice "d" warns us that if a subject receives more than one treatment, then it is often tough to discern which modality truly caused the improvements.

A researcher studies a single session of counseling in which a counselor treats a client's phobia using a paradoxical strategy. He then writes in his research report that paradox is the treatment of choice for phobics. This is an example of a. deductive logic or reasoning. b. inductive logic or reasoning. c. attrition or so-called experimental mortality. d. construct validity.

b. inductive logic or reasoning. This is inductive since the research goes from the specific to a generalization. Deductive reduces the general to the specific. Choice "c" refers to subjects that drop out of a study.

In a new study the clients do not know whether they are receiving an experimental treatment for depression or whether they are simply part of the control group. This is, nevertheless, known to the researcher. Thus, this is a a. double-blind study. b. single-blind study. c. baseline for an intensive N = 1 design. d. participant observer model.

b. single-blind study. In the single-blind study the subject would not know whether he or she is a member of the control group or the experimental group. This strategy helps eliminate "demand characteristics" which are cues or features of a study which suggest a desired outcome. A subject can manipulate and confound an experiment by purposely trying to confirm or disprove the experimental hypothesis. Just in case you erroneously chose choice "c," please notice that the question used the word clients which is plural. N = 1 designs rely on a single individual for investigation purposes. Choice "d" describes a study in which the researcher actually participates in the study, while making observations about what transpired.

A panel of investigators discovered that a researcher who completed a major study had unconsciously rated attractive females as better counselors. This is an example of a. the Hawthorne effect. b. the Halo effect. c. the Rosenthal effect. d. trend analysis.

b. the Halo effect. The Halo effect occurs when a trait which is not being evaluated (e.g., attractiveness or how well he or she is liked) influences a researcher's rating on another trait (e.g., counseling skill). Choice "d," trend analysis, refers to a statistical procedure performed at different times to see if a trend is evident.

There are two distinct types of developmental studies. In a cross-sectional study, clients are assessed at one point in time. In a longitudinal study, however, a. the researcher has an accomplice pose as a client and act in a certain manner. b. the same people are studied over a period of time. c. the researcher relies on a single observation of a variable being investigated. d. all of the above.

b. the same people are studied over a period of time. Some exams refer to the cross-sectional method as the "synchronic method" and the longitudinal as the "diachronic method." The longitudinal study is beneficial in the sense that age itself can be used as an IV. In a longitudinal study, data are collected at different points in time. In the cross-sectional method, data are indicative of measurements or observations at a single point in time, and thus it is preferable in terms of time consumption. The person in choice "a" is known as a "confederate" or a "stooge." Social psychology studies routinely employ "confederates" or "stooges," who are not real participants but in reality work with the researcher.

P = .05 really means that a. five subjects were not included in the study. b. there is only a 5% chance that the difference between the control group and the experimental groups is due to chance factors. c. the level of significance is .01. d. no level of significance has been set.

b. there is only a 5% chance that the difference between the control group and the experimental groups is due to chance factors. Many experts in the field feel it is misleading when many exams still refer to this as the "95% confidence interval," meaning that the results would be due to chance only five times out of 100. When P = .05, differences between the experimental group and the control group are evident at the end of the experiment, and the odds are only one in 20 that this can be explained by chance. An exam could refer to the "level of significance" as the level of confidence or simply the confidence level. The meaning is intended to be the same.

The study that would best rule out chance factors would have a significance level of P = a. .05. b. .01. c. .001. d. .08.

c. .001. The smaller the value for P the more stringent the level of significance. Here, the .001 level is the most stringent level listed, indicating that there is only one chance in 1,000 that the results are due to chance, versus one in 20 for .05, and one in 100 for .01. It is easier to get significant results using .08, .05, or .01, than it is using .001.

In World War II the Air Force used stanine scores as a measurement. Stanine scores divide the distribution into nine equal intervals with stanine 1 as the lowest ninth and 9 as the highest ninth. In this system 5 is the mean. Thus a Binet IQ score of 101 would fall in stanine a. 1. b. 9. c. 5. d. 7.

c. 5. Stanine is the contraction of the words standard and nine. The mean or average score on the Binet is 100, so a Binet score of 101 would fall in stanine 5.
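
The stanine lookup can be sketched in a few lines of code. This is a minimal illustration, not an official scoring routine: it assumes the common convention that each stanine is half a standard deviation wide and centered on 5, and it assumes a mean of 100 and an SD of 15 (some editions of the Binet use 16). The function name `stanine` is hypothetical.

```python
import math

def stanine(score, mean=100.0, sd=15.0):
    """Map a raw score to a stanine (1-9), assuming normally distributed scores.

    Assumption: mean=100 and sd=15 are illustrative defaults, not taken from the text.
    """
    z = (score - mean) / sd
    # Each stanine spans half an SD; stanine 5 covers z from -0.25 up to +0.25.
    band = math.floor(2 * z + 5.5)
    # Scores far out in either tail are capped at stanine 1 or 9.
    return max(1, min(9, band))

print(stanine(101))  # a score just above the mean lands in the middle stanine
```

A score of 101 is less than a tenth of an SD above the mean, so it stays in the middle band.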

If an ANOVA yields a significant F value, you could rely on ________ to test significant differences between group means. a. one- and two-tailed t tests b. percentile rank c. Duncan's multiple-range, Tukey's, or Scheffe's test d. summative or formative evaluation

c. Duncan's multiple-range, Tukey's, or Scheffe's test Choice "a" refers to whether a statistical test places the rejection area at one end of the distribution (one-tailed) or both ends of the distribution curve (two-tailed). A two-tailed test is often called a "nondirectional experimental hypothesis," while a one-tailed test is a "directional experimental hypothesis." In a one-tailed test your hypothesis specifies that one mean is larger than another. A two-tailed hypothesis would be, "The average patient who has completed psychoanalysis will have a statistically different IQ from the average patient who has not received analysis." The one-tailed hypothesis would be, "The average patient who has completed psychoanalysis will have a statistically significantly higher IQ than the average patient who has not received analysis." When appropriate, one-tailed tests have the advantage of having more "power" than the two-tailed design (the statistical ability to correctly reject a false null hypothesis). In choice "d" you should be aware that summative evaluation is used to assess a final product (e.g., how many high school students are not indulging in alcoholic beverages after completing a yearly program focusing on drug awareness education?). Summative research attempts to ascertain how well the goal has been met. Formative process research is ongoing while the program is underway (e.g., after three weeks of a proposed year-long drug awareness education program how many high school students are taking drugs?). The correct answer to this question, of course, is alternative "c." An F table for the ANOVA is analogous to the Student's t table used when performing a t test. In order to further discriminate between the ANOVA groups the post hoc measures mentioned in choice "c" would be appropriate.

Which of the following would most likely yield a perfect correlation of 1.00? a. IQ and salary. b. ICD diagnosis and salary. c. Length in inches and length in centimeters. d. Height and weight.

c. Length in inches and length in centimeters. In the real world, correlations may be strong (e.g., choice "d"), yet they are rarely 1.00. Correlation is concerned with what statisticians call "covariation." When two variables vary together statisticians say the variables "covary positively," and when one increases while the other decreases they are said to "covary negatively."
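
To see why a unit conversion yields a perfect correlation, here is a small sketch that computes the Pearson r by hand. The lengths are made-up illustrative values and `pearson_r` is a hypothetical helper, not a library function.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation computed from the definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariation: how the two variables move together around their means.
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariation / (spread_x * spread_y)

inches = [1, 2, 5, 10, 36]                     # illustrative lengths
centimeters = [i * 2.54 for i in inches]       # exact linear conversion

r_positive = pearson_r(inches, centimeters)    # variables covary positively
r_negative = pearson_r(inches, [-c for c in centimeters])  # covary negatively
print(round(r_positive, 6), round(r_negative, 6))
```

Because centimeters are just inches multiplied by a constant, every point falls on one straight line, which is the only way to reach 1.00 (or -1.00 when the line slopes downward).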

A researcher creates a new motoric test in which clients throw a baseball at a target 40 feet away. Each client is given 100 throws, and the mean on the test is 50. (In other words, out of 100 throws the mean number of times the client will hit the target is 50 times.) Sam took the test and hit the target just two times out of the 100 throws allowed. Jeff, on the other hand, hit the target an amazing 92 times out of 100 trials. Using the concept of statistical regression toward the mean the researcher would predict that a. Sam's and Jeff's scores will stay about the same if they take the test again. b. Sam and Jeff will both score over 95 next time. c. Sam's score will increase while Jeff's will go down. d. Sam will beat Jeff if they both are tested again.

c. Sam's score will increase while Jeff's will go down. Statistical regression is a threat to internal validity. Statistical regression predicts that very high and very low scores will move toward the mean if a test is administered again. This concept is based on "the law of filial regression," which is a genetic principle that asserts that generational traits move toward the mean. The statistical analogy suggests that extremely low scores on an exam or a pretest will improve while the unusually high scores will get lower. Statistical regression results from errors (i.e., lack of reliability) in measurement instruments and must be taken into account when interpreting test data. Most scores don't change that much, and although Sam's score will probably inch up a bit and Jeff's will lose a little ground, Sam will probably still be in the lower quartile and Jeff the upper quartile. The term quartile is common and refers to the points that divide a distribution into fourths. This indicates that the 25th percentile is the first quartile, the second quartile is the median, and the third lies at the 75th percentile. The score distance between the 25th percentile and the 75th percentile is called the interquartile range.
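
The quartile vocabulary above can be checked with a short sketch. The eleven scores are invented for illustration, and the exact quartile values depend on the interpolation method a given tool uses; this example relies on the default method of Python's `statistics.quantiles`.

```python
import statistics

# Hypothetical scores, already sorted for readability.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 asks for the three cut points that divide the distribution into fourths.
q1, q2, q3 = statistics.quantiles(scores, n=4)

median = statistics.median(scores)
interquartile_range = q3 - q1   # distance between the 25th and 75th percentiles

print(q1, q2, q3, interquartile_range)
```

The second quartile matches the median, and the interquartile range covers the middle half of the scores.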

In a random sample each individual in the population has an equal chance of being selected. Selection is by chance. In a new study, however, it will be important to include 20% African Americans. What type of sampling procedure will be necessary? a. Standard (i.e., simple) random sampling is adequate. b. Cluster sampling is called for. c. Stratified sampling would be best. d. Horizontal sampling is required.

c. Stratified sampling would be best. In the random sample each subject has the same probability of being selected, and the selection of one subject does not affect the selection of another subject. The simple random sampling procedure eliminates the researcher's tendency to pick a biased sample of subjects. In this case, a simple random sampling procedure will not suffice, since a "stratum" (plural "strata") or a "special characteristic" needs to be represented. The stratification variable in your sample should mimic the population at large. In a research situation where a specific number of cases are necessary from each stratum, the procedure often is labeled as "quota sampling." Quota sampling is a type of stratified sampling. The "cluster sample" (choice "b") is utilized when it is nearly impossible to find a list of the entire population. The cluster sample solves the problem by using an existing sample or cluster of people, or by selecting a portion of the overall sample. A cluster sample will not be as accurate as a random sample yet it is often used due to time and practical considerations. Horizontal sampling occurs when a researcher selects subjects from a single socioeconomic group. Horizontal sampling can be contrasted with "vertical sampling," which occurs when persons from two or more socioeconomic classes are utilized. Since this question does not specify socioeconomic factors, you could have eliminated choice "d." A snowball sample or a chain-referral sample uses subjects to drum up other subjects for your study.
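
A minimal sketch of proportional stratified sampling follows. Everything in it is hypothetical: the stratum labels, the population of 1,000 ID numbers, the sample size of 50, and the helper name `stratified_sample`. The only idea taken from the explanation above is that each stratum is sampled at random in a fixed proportion (here 20% and 80%).

```python
import random

def stratified_sample(strata, proportions, total, rng=random):
    """Draw a simple random sample inside each stratum, sized by its proportion."""
    sample = []
    for name, members in strata.items():
        quota = round(total * proportions[name])   # cases required from this stratum
        sample.extend(rng.sample(members, quota))  # random selection within the stratum
    return sample

# Hypothetical population of ID numbers split into two strata.
strata = {
    "stratum_a": list(range(0, 200)),
    "stratum_b": list(range(200, 1000)),
}
proportions = {"stratum_a": 0.20, "stratum_b": 0.80}

chosen = stratified_sample(strata, proportions, total=50)
from_a = [person for person in chosen if person < 200]
print(len(chosen), len(from_a))
```

Simple random sampling could land near 20% by luck, but only the stratified draw guarantees the quota.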

In order for the professor of counselor education to conduct the experiment suggested in question 708 the experimental group would need to receive a. the manipulated IV. b. the biofeedback training. c. a and b. d. the organismic IV.

c. a and b. The experimental group receives the IV, which in this case is the biofeedback training. An organismic variable is one the researcher cannot control yet exists, such as height, weight, or gender. To determine whether an organismic IV exists you ask yourself if there is an experimental variable being examined which you cannot manipulate. In most cases, when you are confronted with IV/DV identification questions, the IV will be of the "manipulated variety."

A sociogram is to a counseling group as a scattergram is to a. the normal curve. b. the range. c. a correlation coefficient. d. the John Henry effect.

c. a correlation coefficient. A scattergram, also known as a scatterplot, is a pictorial diagram or graph of two variables being correlated. The John Henry effect (also called compensatory rivalry of a comparison group) is a threat to the internal validity of an experiment that occurs when subjects strive to prove that an experimental treatment that could threaten their livelihood really isn't all that effective. One way for the researcher to handle this problem is to make observations before the experiment begins. Another control group phenomenon that threatens internal validity in research is the "Resentful Demoralization of the Comparison Group" (also called "compensatory equalization"). Here, the comparison group lowers their performance or behaves in an inept manner because they have been denied the experimental treatment. When this occurs, the experimental group looks better than they should. If the comparison group deteriorates throughout the experiment while the experimental group does not, then demoralization could be noted. This could be measured via a pretest and a posttest.

When a distribution of scores is not distributed normally, statisticians call it a. Gauss's curve. b. a symmetrical bell-shaped curve. c. a skewed distribution. d. an invalid distribution.

c. a skewed distribution. In a skewed distribution the left and right side of the curve are not mirror images. In a skewed distribution the mean, the median, and the mode fall at different points. In a normal curve they will fall at the same point.

To complete a t test you would consult a tabled value of t. In order to see if significant differences exist in an ANOVA you would consult a. the mode. b. a table for t values. c. a table for F values. d. the chi-square.

c. a table for F values. More elaborate tests (e.g., Tukey's, Duncan's multiple range, and Scheffe's test) can determine whether a significant difference exists between specific groups. Group comparison tests such as these are called "post hoc" or "a posteriori" tests for ANOVA calculations.

If a researcher changes the significance level from .05 to .001, then a. alpha and beta errors will increase. b. alpha errors increase but beta errors decrease. c. alpha errors decrease; however, beta errors increase. d. this will have no impact on Type I and Type II errors.

c. alpha errors decrease; however, beta errors increase.

A researcher wants to run a true experiment but insists she will not use a random sample. You could safely say that a. she absolutely, positively cannot run a true experiment. b. her research will absolutely, positively be causal comparative research. c. she could accomplish this using systematic sampling. d. her research will be correlational.

c. she could accomplish this using systematic sampling. Today researchers are slowly but surely embracing systematic sampling, since it is often easier to use. With this approach you take every nth person. Say you have a list of 10,000 folks. You want 1,000 in your study. You pick the first person between one and 10 at random and then use every 10th person. According to some statisticians your results will be virtually the same as if you used good old random sampling. One problem is that small samples intended to mimic the population sometimes do not! The margin of error stated in political poll results is based on this concept.
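
The every-nth-person procedure described above can be sketched as follows. The list of 10,000 ID numbers stands in for the list of people, and `systematic_sample` is a hypothetical helper name.

```python
import random

def systematic_sample(population, sample_size, rng=random):
    """Pick a random starting point, then take every nth member of the list."""
    step = len(population) // sample_size   # n = 10 for 10,000 people and 1,000 slots
    start = rng.randrange(step)             # random position among the first n people
    return population[start::step][:sample_size]

people = list(range(1, 10001))              # stand-in IDs 1 through 10,000
sample = systematic_sample(people, 1000)

print(len(sample), sample[:3])
```

Only the starting point is random; every later selection is fixed by the interval, which is what distinguishes this from simple random sampling.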

The null hypothesis suggests that there will not be a significant difference between the experimental group which received the IV and the control group which did not. Thus, if the experiment in question 708 was conducted, the null hypothesis would suggest that a. all students receiving biofeedback training would score equally well on the board exam. b. systematic desensitization might work better than biofeedback. c. biofeedback will not improve the board exam scores. d. meta-analysis is required.

c. biofeedback will not improve the board exam scores. The null hypothesis asserts that the samples will not change even after the experimental variable is applied. According to the null hypothesis the control group and the experimental group will not differ at the end of the experiment. The null hypothesis is simply that the IV does not affect the DV. Null means "nil" or "nothing." Null is a statement of "no difference." Choice "d" introduces the term meta-analysis, which is a study that analyzes the findings of numerous studies. Hence, a study of reality therapy that looked at the results of 20 reality therapy studies would be a meta-analysis.

A professor of counselor education hypothesized that biofeedback training could reduce anxiety and improve the average score on written board exams. If this professor decides to conduct a formal experiment the IV will be the ________, and the DV will be the ________. a. professor; anxiety level b. anxiety level; board exam score c. biofeedback; board exam score d. board exam score; biofeedback

c. biofeedback; board exam score The researcher here hypothesized that the training lowers anxiety, but you won't have any direct data regarding this trait. Hence it will not be your DV in this experiment.

In order for the professor of counselor education (see question 708) to conduct an experiment regarding his hypothesis he will need a(n) ________ and a(n) ________. a. biofeedback group; systematic desensitization group b. control group; systematic desensitization group c. control group; experimental group d. control group with at least 60 subjects; experimental group with at least 60 subjects

c. control group; experimental group The control group and the experimental group both have the same characteristics except that members of the control group will not have the experimental treatment applied to them. In this case, the control group will not receive the biofeedback training. The control group does not receive the IV. The experimental group receives the IV. The basic presupposition is that the averages (or means) of the groups do not differ significantly at the beginning of the experiment. Choice "d" would also be a correct answer if it said 15 per group instead of 60. Remember that if you cannot randomly assign the subjects to the two groups then your exam will consider the research a quasiexperiment. Most experts suggest that you need at least 30 people to conduct a true experiment. Correlational research requires 30 subjects per variable while a survey should include at least 100 people.

In a new experiment, a counselor educator wants to ferret out the effects of more than one IV. She will use a ________ design. a. Pearson Product-Moment r. b. Spearman rank order rho c. factorial d. Solomon four-group

c. factorial In a factorial experiment, several experimental variables are investigated and interactions can be noted. Factorial designs include two or more IVs. Sometimes the IVs in a factorial design are called levels. Level does not connote hierarchy. Choices "a" and "b" are not considered pure experimental. Now even though choice "d" is incorrect, it is indeed a must-know term. In the Solomon four-group design (design created by psychologist Richard L. Solomon) the researcher uses two control groups. Only one experimental group and one control group are pretested. The other control group and experimental group are merely post-tested. The genius of the design is that it lets the researcher know if results are influenced by pretesting. The two control groups as well as the two experimental groups can then be compared.

A counselor educator is running an experiment to test a new form of counseling. Unbeknownst to the experimenter one of the clients in the study is secretly seeing a gestalt therapist. This experiment a. is parsimonious. b. is an example of Occam's Razor. c. is confounded/flawed. d. is valid and will most likely help the field of counseling.

c. is confounded/flawed. The experiment is said to be invalid (so much for choice "d") due to an extraneous independent variable (IV) (e.g., the gestalt therapy). Variables which are undesirable confound or "flaw" the experiment. The only experimental variable should be the IV—in this case the new form of counseling. The IV must have the effect on the dependent variable (here the DV would be some measure of the client's mental health). In this experiment any changes could not be attributed with any degree of certainty to the new form of counseling since DV changes could be due to the gestalt intervention (an extraneous confounding variable). All correlational research is said to be confounded.

In a basic curve or so-called frequency polygon the point of maximum concentration is the a. mean. b. median. c. mode. d. range.

c. mode. The mean, choice "a," would be computed by adding the numbers provided (i.e., the scores) and then dividing by the number of scores. The median, choice "b," is defined as the score which is the exact middle of the distribution. Choice "d," the range, which is a measure of variability, is the distance between the largest and the smallest scores. The larger the range the greater the dispersion or spread of scores from the mean. Since the computation of the range is based solely on the computation of two scores, the variance and the standard deviation (the square root of the variance) are more stable statistics.
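
The statistics named above can be computed side by side on a small data set. The eight scores are invented for illustration, and the variance and standard deviation shown are the population versions (dividing by the number of scores).

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical scores

mean = statistics.mean(scores)             # sum of scores divided by their count
median = statistics.median(scores)         # exact middle of the ordered scores
mode = statistics.mode(scores)             # most frequent score (point of maximum concentration)
score_range = max(scores) - min(scores)    # depends on only two scores
variance = statistics.pvariance(scores)    # mean squared distance from the mean
std_dev = statistics.pstdev(scores)        # square root of the variance

print(mean, median, mode, score_range, variance, std_dev)
```

Changing only the highest score from 9 to 90 would change the range dramatically while leaving the median and mode untouched, which illustrates why the range is the least stable of these measures.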

Nine of the world's finest counselor educators are given an elementary exam on counseling theory. Needless to say, all of them scored extremely high. The distribution of scores would most likely be a. a bell-shaped curve. b. positively skewed. c. negatively skewed. d. indicative that more information would be necessary.

c. negatively skewed. Since high scores pack the right side of the distribution, this gives you a long tail that points to the left, which indicates a negative skew.

When you see the letter P in relation to a test of significance it means a. portion. b. population parameter. c. probability. d. the researcher is using an ethnographic qualitative approach.

c. probability. A parameter is technically a value obtained from a population while a statistic is a value drawn from a sample. A parameter summarizes a characteristic of a population. The correct answer is choice "c" which refers to the probability or the level of significance. Traditionally, the probability in social science research (often indicated by a P) has been set at .05 or lower (.01 or .001). The .05 level indicates that differences would occur via chance only five times in 100. The significance level must be set before the experiment begins. Ethnographic research involves research that is collected via interviews, observations, and inspection of documents.

Behaviorists often utilize N = 1, which is called intensive experimental design. The first step in this approach would be to a. consult a random number table. b. decide on a nonparametric statistical test. c. take a baseline measure. d. compute the range.

c. take a baseline measure. "N," or the number of persons being studied, is one. This is a "case study of one" approach. This method is popular with behaviorists who seek overt (measurable) behavioral changes. The client's dysfunctional behavior is measured (this is called a baseline measure), the treatment is implemented, and then the behavior is measured once again (i.e., another baseline is computed). Exams sometimes delineate this paradigm using upper-case As and Bs and Cs such that As signify baselines, Bs intervention implementation, and Cs a second or alternative form of intervention. Single case investigations are often called "idiographic studies" or "single-subject designs." The original case study methodology was popularized by Freud, though needless to say, unlike the behaviorists, Freud did not rely on numerical baseline measures. Case studies are often misleading because the results are not necessarily generalizable.

An elementary school counselor tells the third-grade teacher that a test revealed that certain children will excel during the school year. In reality, no such test was administered. Moreover, the children were unaware of the experiment. By the end of the year, all of the children who were supposed to excel did excel! This would best be explained via a. the Hawthorne effect. b. the Halo effect. c. the Rosenthal effect or the experimenter expectancy effect. d. observer bias.

c. the Rosenthal effect or the experimenter expectancy effect. Forget the Hawthorne effect this time around since the kids don't even know an experiment is in progress. Here the "Rosenthal effect" or experimenter expectancy effect is probably having the impact. The Rosenthal effect, named after noted psychologist Robert Rosenthal, and no relation to this author, asserts that the experimenter's beliefs about the individual may cause the individual to be treated in a special way so that the individual begins to fulfill the experimenter's expectations. Choice "d" is self-explanatory. The observer has perceptions regarding the research that are not accurate.

The most valuable type of research is a. always conducted using a factor analysis. b. conducted using the chi-square. c. the experiment, used to discover cause-and-effect relationships. d. the quasi-experiment.

c. the experiment, used to discover cause-and-effect relationships. Experimental research is the process of gathering data in order to make evaluative comparisons regarding different situations. An experiment must have the conditions of treatment controlled via the experimenter and random assignments (randomization) used in the groups. An experiment attempts to eliminate all extraneous variables. In the quasi-experiment (choice "d") the researcher uses preexisting groups, and hence the IV (independent variable) cannot be altered (e.g., gender or ethnicity). In a quasi-experiment you cannot state with any degree of statistical confidence that the IV caused the DV (dependent variable). One popular type of quasi-experiment is known as the "ex post facto study." Ex post facto literally means "after the fact," connoting a correlational study or research in which intact, preexisting groups are utilized. In the case of the ex post facto study, the IV was administered before the research began. When conducting or perusing a research study a counselor is very concerned with "internal and external validity." Threats to internal validity include maturation of subjects (psychological and physical changes including fatigue due to the time involved), mortality (i.e., subjects withdrawing), instruments used to measure the behavior or trait, and statistical regression (i.e., the notion that extremely high or low scores would move toward the mean if the measure is utilized again). Internal validity refers to whether the DVs were truly influenced by the experimental IVs or whether other factors had an impact. External validity, on the other hand, refers to whether the experimental research results can be generalized to larger populations (i.e., other people, settings or conditions). Thus, if the results of the study only apply to the p

A researcher gives a depressed patient a sugar pill and the individual's depression begins to lift. This is known as a. the Hawthorne effect. b. the Halo effect. c. the placebo effect. d. the learned helplessness syndrome.

c. the placebo effect. Researchers often give clients involved in studies an inert substance (i.e., a placebo such as a gelatin capsule) so it can be compared with the real drug or treatment procedure. A nocebo, on the other hand, has a negative effect such as when a doctor comments that a person with such and such condition has only six weeks to live.

Researchers often utilize naturalistic observation when doing ethological investigations or studying children's behavior. In this approach a. the researcher manipulates the IV. b. the researcher manipulates the IV and the DV. c. the researcher does not manipulate or control variables. d. the researcher will rely on a 2 × 3 factorial design.

c. the researcher does not manipulate or control variables. When utilizing naturalistic observation the researcher does not intervene. Preferably, the setting is "natural" rather than an artificial laboratory environment. Historically speaking, this is the oldest method of research. Choice "d" indicates a study using two independent variables. The 2 × 3 is called factorial notation. The first variable has two levels (e.g., male or female) and the second independent variable has three levels.
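
Factorial notation can be made concrete by listing the cells of the design. In this sketch the two levels of the first IV come from the example above, while the three levels of the second IV are hypothetical placeholders.

```python
from itertools import product

# First IV: two levels (from the example above).
first_iv = ["male", "female"]
# Second IV: three levels (hypothetical labels chosen for illustration).
second_iv = ["no training", "four sessions", "six sessions"]

# A 2 x 3 design crosses every level of one IV with every level of the other.
cells = list(product(first_iv, second_iv))

for cell in cells:
    print(cell)
print(len(cells), "cells in the design")
```

The numbers in the notation count levels, not variables: 2 × 3 means two IVs and six treatment combinations.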

If a distribution is bimodal, then there is a good chance that a. the curve will be normal. b. the curve will be shaped like a symmetrical bell. c. the researcher is working with two distinct populations. d. the research is useless in the field of counseling.

c. the researcher is working with two distinct populations.

Mike takes a math achievement test. In order to predict his score if he takes the test again the counselor must know a. the range of scores in his class. b. the standard deviation. c. the standard error of measurement (SEM). d. the mode for the test.

c. the standard error of measurement (SEM). The standard error is all you need to know. The SEM tells the counselor what would most likely occur if the same individual took the same test again. The question does not ask how well he did on the test, nor does it ask you to compare him to others.
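
As a rough sketch of how the SEM is obtained, the standard textbook formula multiplies the test's SD by the square root of one minus its reliability coefficient. The formula is not stated in the explanation above, and the SD of 15, the reliability of .91, and Mike's score of 100 are all assumed values chosen only to make the arithmetic clean.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """SEM = SD * sqrt(1 - reliability coefficient)."""
    return sd * math.sqrt(1.0 - reliability)

# Assumed values for illustration only.
sem = standard_error_of_measurement(sd=15, reliability=0.91)

observed_score = 100  # hypothetical score for Mike
# Roughly 68% of retest scores would be expected within one SEM of the observed score.
likely_band = (observed_score - sem, observed_score + sem)

print(round(sem, 2), tuple(round(x, 2) for x in likely_band))
```

A more reliable test gives a smaller SEM, so the band of likely retest scores narrows.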

A platykurtic distribution would look approximately like a. the upper half of a bowling ball. b. the normal distribution. c. the upper half of a hot dog, lying on its side over the abscissa. d. a camel's back.

c. the upper half of a hot dog, lying on its side over the abscissa. The word kurtosis refers to the peakedness of a frequency distribution. A "platykurtic" distribution is flatter and more spread out than the normal curve. When a curve is very tall, thin, and peaked it is considered "leptokurtic."

If the researcher in the previous question utilized two IVs then the statistic of choice would be the a. median. b. t test. c. two-way ANOVA or MANOVA. d. ANOVA.

c. two-way ANOVA or MANOVA. Two IVs require a two-way ANOVA, three IVs a three-way ANOVA, etc.

A Type I error occurs when a. you have a beta error. b. you accept null when it is false. c. you reject null when it is true. d. you fail to use a test of significance.

c. you reject null when it is true. Since all statistical tests rely on probability there is always the possibility that the results were merely chance occurrences. Researchers call these chance factors "errors."

Using the data in question 764 one could say that a person with an IQ score of 122 would fall within a. + or -1 SD of the mean. b. the average IQ range. c. an IQ score which is more than 2 SD above the mean. d. + or -2 SD of the mean.

d. + or -2 SD of the mean. Two SD would be IQs from 70 to 130 since 2 SD would be 30 IQ points. If everybody scored the same on the test then the SD would be zero. The greater the SD, the greater is the spread.
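
The reasoning can be checked by converting the score to a z-score, assuming the mean of 100 and SD of 15 used in this set of questions. The helper name `z_score` is hypothetical.

```python
def z_score(score, mean=100.0, sd=15.0):
    """Number of standard deviations a score lies above (+) or below (-) the mean."""
    return (score - mean) / sd

z = z_score(122)

within_1_sd = abs(z) <= 1   # would require a score between 85 and 115
within_2_sd = abs(z) <= 2   # requires a score between 70 and 130

print(round(z, 2), within_1_sd, within_2_sd)
```

A score of 122 sits 22 points above the mean, which is more than one SD (15 points) but less than two (30 points).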

Which level of significance would best rule out chance factors? a. .05 b. .01 c. .2 d. .001

d. .001 Some researchers refer to the level of significance as where "one draws the line" or the "cutoff point" between findings that should or should not be ascribed to chance factors. The significance level must be set low. If, for example, a researcher foolishly set the level at .50, then the odds would be 50-50 that the results were due to pure chance.

The variance is a measure of dispersion of scores around some measure of central tendency. The variance is the standard deviation squared. A popular IQ test has a standard deviation (SD) of 15. A counselor would expect that if the mean IQ score is 100, then a. the average score on the test would be 122. b. 95% of the people who take the test will score between 85 and 115. c. 99% of the people who take the test will score between 85 and 115. d. 68% of the people who take the test will score between 85 and 115.

d. 68% of the people who take the test will score between 85 and 115. Statistically speaking 68.26% of the scores fall within + or -1 SD of the mean; 95.44% of the scores fall within + or -2 SD of the mean; and 99.74% of the scores fall within +/-3 SD of the mean.
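
The three percentages can be checked against the normal curve directly. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`, assuming the mean of 100 and SD of 15 given in the question; the helper name `share_within` is made up for illustration. The computed values agree with the figures above to within about a hundredth of a percentage point, since printed tables round slightly differently.

```python
from statistics import NormalDist

# Assumed from the question: mean IQ of 100, standard deviation of 15.
iq = NormalDist(mu=100, sigma=15)


def share_within(k):
    """Proportion of a normal distribution lying within +/- k SD of the mean."""
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    return iq.cdf(high) - iq.cdf(low)


for k in (1, 2, 3):
    low, high = 100 - 15 * k, 100 + 15 * k
    print(f"+/-{k} SD ({low} to {high}): {share_within(k):.2%}")
```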

The WAIS-IV IQ test is given to 100 adults picked randomly. How many of the adults most likely would receive an IQ score between 85 and 115? a. 7. b. 99. c. 95. d. 68.

d. 68. One SD on most popular IQ tests is 15 or 16, and the mean score is generally 100. Choice "c" is indicative of +/-2 SD, while choice "b" approximates +/-3 SD.

The researcher in question 727 now attempts a more complex experiment. One group receives no assertiveness training, a second group receives four assertiveness training sessions, and a third receives six sessions. The statistic of choice would be the a. mean. b. t test. c. two-way ANOVA. d. ANOVA.

d. ANOVA. A one-way analysis of variance is used for testing one independent variable, while a two-way analysis of variance is used to test two independent variables. When a study has more than one DV the term multivariate analysis of variance (MANOVA) is utilized. The answer is choice "d" since the simple ANOVA or one-way analysis of variance is used when there is more than one level of a single IV, which in this case is the assertiveness training.
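
To make the one-way ANOVA concrete, here is a rough sketch of how the F ratio is built for one IV with three levels. The scores are invented purely for illustration, and the function name `one_way_anova_f` is hypothetical; a real study would use a statistics package and would also look up the critical F value.

```python
from statistics import mean

# Hypothetical assertiveness scores, invented for illustration only.
groups = {
    "no training": [1, 2, 3],
    "four sessions": [2, 3, 4],
    "six sessions": [5, 6, 7],
}


def one_way_anova_f(samples):
    """Return the F ratio for a one-way ANOVA (one IV, several levels)."""
    all_scores = [x for s in samples for x in s]
    grand_mean = mean(all_scores)
    # Between-groups sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    df_between = len(samples) - 1
    df_within = len(all_scores) - len(samples)
    return (ss_between / df_between) / (ss_within / df_within)


f_ratio = one_way_anova_f(list(groups.values()))
print(f"F = {f_ratio:.2f}")
```

A large F means the group means differ by more than the spread inside the groups would lead one to expect.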

Three years ago an inpatient addiction treatment center in a hospital asked their clients if they would like to undergo an archaic form of therapy created by Wilhelm Reich known as "vegotherapy." Approximately half of the clients stated they would like to try the treatment while the other 50% stated that they would stick with the tried-and-true program of the center. Outcome data on their drinking was compiled at the end of seven weeks. Today, three years later, a statistician compared the two groups based on their drinking behavior at the end of the seven weeks using a t test. This study could best be described as a. correlation research. b. a true experiment. c. a cohort study. d. causal comparative research.

d. causal comparative research. Since the groups were not randomly assigned and the current researcher did not truly control the IV in the study (since it took place three years ago), "d" is the best answer. Just for the record a cohort study examines people who were born at the same time (or shared an event; for example, fought in Vietnam) in regard to a given characteristic.

In a parametric test the assumption is that the scores are normally distributed. In nonparametric testing the curve is not a normal distribution. Which of these tests are nonparametric statistical measures? a. Mann-Whitney U test, often just called the U test. b. Wilcoxon signed-rank test for matched pairs. c. Solomon and the Kruskal-Wallis H test. d. All of the above are nonparametric measures.

d. All of the above are nonparametric measures. All of the above-referenced tests are categorized as "nonparametric." Many exams refer to nonparametric statistical tests as "distribution-free" tests. In a matched design the subjects are literally "matched" in regard to any variable that could be "correlated" with the DV, which is really the post-experimental performance. This procedure is logically termed "matched sampling." A special kind of "matched subjects design" is the "repeated-measures" or "within-subjects" design in which the same subjects are used, once for the control condition and again for the experimental IV conditions. The theory here is that ultimately a subject is best matched by themselves assuming that counterbalancing is implemented. Now the Mann-Whitney U test (choice "a") is used to determine whether two uncorrelated/unmatched means differ significantly, while the Wilcoxon signed-rank test examines whether two correlated means differ significantly from each other. By employing ranks, it is a good alternative to the correlated t test. Unmatched/uncorrelated groups could be termed independent groups. The U test, like the Wilcoxon, is an alternative to the t test when parametric precepts cannot be accepted. If parametric assumptions are in doubt, the Mann-Whitney U test or the Wilcoxon can be used for two groups; however, when the number of groups reaches three or above, the Kruskal-Wallis one-way ANOVA H test noted in choice "c" is utilized. The Solomon controls for pretest effects.
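
As an illustration of how a rank-based test sidesteps the normal-curve assumption, the sketch below computes the Mann-Whitney U statistic for two independent groups by simple pair counting. The scores and the function name `mann_whitney_u` are hypothetical, and the sketch stops at the statistic itself; judging significance would still require a table of critical values or a statistics package.

```python
def mann_whitney_u(a, b):
    """U statistic for sample a: the number of (a, b) pairs in which a's score is higher.

    Tied pairs count as half a pair.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


# Hypothetical scores from two independent (unmatched) groups.
control = [12, 15, 11, 18]
treated = [20, 17, 22, 19]

u_control = mann_whitney_u(control, treated)
u_treated = mann_whitney_u(treated, control)

# The two U values always sum to the number of possible pairs.
print(u_control, u_treated, len(control) * len(treated))
```

Only the order of the scores matters here, not their distance from a mean, which is why the test is called distribution-free.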

Experimenters should always abide by a code of ethics. The variable you manipulate/control in an experiment is the a. DV. b. dependent variable. c. the variable you will measure to determine the outcome. d. IV or independent variable.

d. IV or independent variable. Choices "a," "b," and "c" all mention the DV, which deals with outcome "data." Now, in any experiment the counselor researcher is guided by ethics: this suggests first, that subjects are informed of any risks; second, that negative after-effects are removed; third, that you will allow subjects to withdraw at any time; fourth, that confidentiality of subjects will be protected; fifth, that the results of research reports will be presented in an accurate format that is not misleading; and sixth, that you will use only techniques that you are trained in. Research is considered a necessary factor for professionalism in counseling.

Hypothesis testing is most closely related to the work of a. Robert Hoppock. b. Sigmund Freud. c. Lloyd Morgan. d. R. A. Fisher.

d. R. A. Fisher. Hypothesis testing was pioneered by R. A. Fisher. A hypothesis is a hunch or an educated guess which can be tested utilizing the experimental model. A hypothesis is a statement which can be tested regarding the relationship of the IV and the DV.

A researcher performs a study that has excellent external or so-called population validity, meaning that the results have generalizability. To collect his data the researcher gave clients a rating scale in which they were to respond with strongly agree, somewhat agree, neutral, somewhat disagree, or strongly disagree. This is a. a projective measure. b. unacceptable for use in standardized testing. c. a speed test. d. a Likert Scale.

d. a Likert Scale. Created by Rensis Likert in the early 1930s, this scale helps improve the overall degree of measurement. Response categories include such choices as strongly agree, agree, disagree, or strongly disagree.

From a mathematical standpoint, the mean is merely the sum of the scores divided by the number of scores. The mean is misleading when a. the distribution is skewed. b. the distribution has no extreme scores. c. there are extreme scores. d. a and c.

d. a and c.
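
A quick way to see why the mean misleads when there are extreme scores is to compare it with the median on a small, invented data set (the scores below are hypothetical):

```python
from statistics import mean, median

# Hypothetical scores; 200 is the extreme score that skews the distribution.
scores = [10, 11, 12, 13, 200]

average = mean(scores)    # pulled upward by the single extreme score
middle = median(scores)   # unaffected by how extreme the top score is

print(average, middle)
```

Four of the five scores sit between 10 and 13, yet the mean comes out at 49.2, while the median stays at 12.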

A Type II error a. is also called a beta error. b. means you reject null when it is applicable. c. means you accept null when it is false. d. a and c.

d. a and c. Although lowering the significance level (e.g., .01 to .001) lowers Type I errors, it "raises" the risk of committing a Type II or beta error. Think of the Type I/Type II relationship as a seesaw in the sense that when one goes up the other goes down. Hence, in determining an alpha level, the researcher needs to decide which error results in the most serious consequences. The safest bet is to set alpha at a very stringent level and then use a large sample size. If this can be accomplished, it is possible to make the correct decision (accept or reject null) the majority of the time.
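
The seesaw can be sketched numerically. The example below assumes a one-tailed, one-sample z test and a true effect of half a standard deviation; both are illustrative assumptions, and `beta_risk` is a made-up helper name. Under those assumptions, beta rises as alpha is made more stringent and falls again when the sample size grows.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def beta_risk(alpha, effect_size, n):
    """Type II error probability for a one-tailed, one-sample z test.

    effect_size is the true mean difference in SD units (an assumed value).
    """
    z_critical = STANDARD_NORMAL.inv_cdf(1 - alpha)
    # Chance the test statistic falls short of the cutoff even though null is false.
    return STANDARD_NORMAL.cdf(z_critical - effect_size * sqrt(n))


for alpha in (0.05, 0.01, 0.001):
    for n in (25, 100):
        print(f"alpha={alpha} n={n} beta={beta_risk(alpha, 0.5, n):.3f}")
```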

From a purely statistical standpoint, in order to compare a control group (which does not receive the IV or experimental manipulation) to the experimental group the researcher will need a. a correlation coefficient b. only descriptive statistics. c. percentile rank. d. a test of significance

d. a test of significance Choice "a" or correlational research does not make use of the paradigm in which an IV is experimentally introduced. Descriptive statistics (choice "b"), as the name implies, merely describes data (the mean, the median, or the mode). In order to compare two groups, "inferential statistics," which infer something about the population, are necessary. Choice "c," percentile rank, is a descriptive statistic that tells the counselor what percentage of the cases fell below a certain level. A percentage score is just another way of stating a raw score. A percentage score of 50 could be a very high, a very low, or an average score on the test. Graphically speaking, a distribution of percentile scores will always appear rectangular and flat. The correct answer is that the researcher in this experiment will need a test of significance. Such statistical tests are used to determine whether a difference in the groups' scores is "significant" or just due to chance factors. In this case a t test would be used to determine if a significant difference between two means exists. This has been called the "two-groups" or "two-randomized-groups" research design. In this study, the two groups were independent of each other in the sense that the change (or lack of it) in one group did not influence the other group. Thus, this is known as an "independent group comparison design." If the researcher had measured the same group of subjects without the IV and with the IV, then the study would be a "repeated-measures comparison design." When a research study uses different subjects for each condition, some exams refer to the study as a "between-subjects design." If the same subjects are employed (such as in repeated measures) your exam could refer to it as a "within-subjects design." In a between-subjects design, each subj

Type I and Type II errors are called ________ and ________ respectively. a. beta; alpha b. .01; .05 c. a and b d. alpha; beta

d. alpha; beta A Type I error (alpha error) occurs when a researcher rejects the null hypothesis when it is true; and a Type II error (beta error) occurs when you accept null when it is false. The probability of committing a Type I error equals the level of significance mentioned earlier. Therefore, the level of significance is often referred to as the "alpha level." 1 minus beta is called "the power of a statistical test." In this respect, power connotes a statistical test's ability to reject correctly a false null hypothesis. Parametric tests have more power than nonparametric statistical tests. Parametric tests are used only with interval and ratio data.

Dr. X discovered that the correlation between therapists who hold NCC status and therapists who practice systematic desensitization is .90. A student who perused Dr. X's research told his fellow students that Dr. X had discovered that attaining NCC status causes therapists to become behaviorally oriented. The student is incorrect because a. systematic desensitization is clearly not a behavioral strategy. b. this can only be determined via a histogram. c. the study suffers from longitudinal and maturational effects. d. correlation does not imply causation.

d. correlation does not imply causation. Correlation does not mean causation! Correlational research is quasi-experimental, and hence, it does not yield cause-effect data. When correlational data describe the nature of two variables, the term bivariate is utilized. If more than two variables are under scrutiny, then the term multivariate is used to describe the correlational paradigm.

Experimental is to cause and effect as correlational is to a. blind study. b. double-blind study. c. N = 1 design. d. degree of relationship.

d. degree of relationship. A correlation coefficient is a descriptive statistic which indicates the degree of "linear relationship" between two variables. Statisticians use the phrase "linear relationship" to indicate that when a perfect relationship exists (a correlation of 1.0 or -1.0) and it is graphed, a straight line is formed. The Pearson Product-Moment Correlation r is used for interval or ratio data while the Spearman rho correlation is used for ordinal data. Correlational research is not experimental and hence does not imply causality.
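
The difference between the two coefficients can be sketched in a few lines. The data are invented, the function names are hypothetical, and the ranking helper assumes there are no tied scores. Pearson r reaches 1.0 only when the points fall on a straight line, whereas Spearman rho, which works on ranks, reaches 1.0 for any relationship that rises consistently.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation for paired interval/ratio scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """Rank scores from 1 (lowest) upward; this sketch assumes no tied scores."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(x, y):
    """Spearman rho: the Pearson r of the ranks, suited to ordinal data."""
    return pearson_r(ranks(x), ranks(y))


# Hypothetical paired scores.
x = [1, 2, 3, 4, 5]
straight_line = [2, 4, 6, 8, 10]
curved = [1, 4, 9, 16, 25]

print(pearson_r(x, straight_line))
print(pearson_r(x, curved), spearman_rho(x, curved))
```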

A counselor educator, Dr. Y, is doing research on his classes. He hypothesizes that if he reinforces students in his morning class by smiling each time a student asks a relevant question, then more students will ask questions and exam grades will go up. Betty and Linda accidentally overhear Dr. Y discussing the experiment with the department chair. Betty is a real people pleaser and decides that she will ask lots of questions and try to help Dr. Y confirm his hypothesis. Linda, however, is angry that she is being experimented on and promises Betty that Dr. Y could smile until the cows came home but she still wouldn't ask a question. Both Linda and Betty exemplify a. internal versus external validity. b. ipsative versus normative interpretation of test scores. c. the use of the nonparametric chi-square test. d. demand characteristics of experiments.

d. demand characteristics of experiments. Ipsative implies a within-person analysis rather than a normative analysis between individuals. In other words, are you looking at an individual's own patterns revealed via measurement (e.g., highs and lows) or whether their score is compared to others evaluated by the same measure. The former is "ipsative" while the latter is "normative." Choice "c" mentions what is perhaps the most popular nonparametric (i.e., a distribution which is not normal) statistical test, the chi-square. The chi-square—threatening as it sounds—is merely used to determine whether an obtained distribution differs significantly from an expected distribution. You must be able to have mutually exclusive categories to use the chi-square (such as "will seek therapy" or "won't seek therapy"). The answer to this question is choice "d." A demand characteristic relates to any bit of knowledge—correct or incorrect—that the subject in an experiment is aware of that can influence their behavior. Demand characteristics can confound an experiment. Deception has been used as a tactic to reduce this dilemma.
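
For the chi-square mentioned in choice "c," the comparison of an obtained distribution with an expected one comes down to a short sum. A minimal sketch follows, using invented counts for the two mutually exclusive categories; the function name `chi_square` is hypothetical.

```python
def chi_square(observed, expected):
    """Sum of (observed - expected) squared, divided by expected, over the categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for 100 clients: "will seek therapy" vs. "won't seek therapy".
observed = [30, 70]
expected = [50, 50]  # what an even split would predict

statistic = chi_square(observed, expected)
print(statistic)
```

With two categories there is one degree of freedom, and the .05 critical value is 3.84, so a statistic of 16 would indicate that the obtained split differs significantly from the expected one.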

Regardless of the shape, the ________ will always be the high point when a distribution is displayed graphically. a. degrees of freedom (df) b. mean c. median d. mode

d. mode The mode will be highest because it is the point where the most frequently occurring score falls.

Billy received an 82 on his college math final. This is Billy's raw score on the test. A raw score simply refers to the number of items correctly answered. A raw score is expressed in the units by which it was originally obtained. The raw score is not altered mathematically. Billy's raw score indicates that a. he is roughly a B student. b. he answered 82% correctly. c. his percentile rank is 82. d. more information is obviously necessary.

d. more information is obviously necessary. You couldn't choose choice "b" since you don't have enough information to figure out the percentage. You see, if Billy scored an 82 on a test with 82 questions, then he had a perfect score. If, however, the exam had several thousand items, his score may not have been all that high. I say "may not have been all that high," since a raw score of 82 might have been the highest score of anybody tested. You would need a "transformed score" or "standard score" (such as choice "c") to make this determination. The benefit of standard scores such as percentiles, t-scores, z-scores, stanines, or standard deviations over raw scores is that a standard score allows you to analyze the data in relation to the properties of the normal bell-shaped curve.
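
To show what the extra information buys, the sketch below converts Billy's raw score into standard scores. The class mean of 70 and SD of 8 are invented for illustration, and the percentile step assumes the scores are normally distributed.

```python
from statistics import NormalDist


def z_score(raw, mean, sd):
    """Number of standard deviations a raw score lies above or below the mean."""
    return (raw - mean) / sd


def t_score(z):
    """T-score: a z-score rescaled to a mean of 50 and an SD of 10."""
    return 50 + 10 * z


# Hypothetical class statistics; the raw score of 82 alone does not supply them.
z = z_score(82, mean=70, sd=8)
t = t_score(z)
percentile = NormalDist().cdf(z) * 100  # assumes a normal distribution

print(z, t, round(percentile, 1))
```

Change the assumed mean to 90 and the same raw score of 82 becomes a below-average result, which is why the raw score alone settles nothing.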

A ratio scale is an interval scale with a true zero point. Ratio measurements are possible using this scale. Addition, subtraction, multiplication, and division all can be utilized on a ratio scale. In terms of counseling research a. the ratio scale is the most practical. b. all true studies utilize the ratio scale. c. a and b. d. most psychological attributes cannot be measured on a ratio scale.

d. most psychological attributes cannot be measured on a ratio scale. Ratio scale is the highest level of measurement. Time, height, weight, temperature on the Kelvin scale, volume, and distance meet the requirements of this scale. Occasionally, a trait such as GSR (galvanic skin response) biofeedback could be classified as a ratio scale measurement. Since most measurements used in counseling studies do not qualify as ratio scales, choices "a" and "b" are misleading.

Nondirective is to person-centered as a. psychological testing is to counseling. b. confounding is to experimenting. c. appraisal is to research. d. parsimony is to Occam's Razor.

d. parsimony is to Occam's Razor. Nondirective and person-centered therapy are synonymous; both refer to names given to Rogerian counseling. Parsimony is roughly synonymous with Occam's Razor. Most counselors see themselves as practitioners rather than researchers. Research, nevertheless, helps the entire field of counseling advance. It has been pointed out that we know about the work of many famous counselors and career counselors because of their published research not because of what transpired in their sessions. The APA's Journal of Counseling Psychology publishes more counseling research articles than any other periodical in our field.

A counselor educator decides to increase the sample size in her experiment. This will a. confound the experiment in nearly every case. b. raise the probability of Type I and Type II errors. c. have virtually no impact on Type I and Type II errors. d. reduce Type I and Type II errors.

d. reduce Type I and Type II errors. Raising the size of a sample helps to lower the risk of chance/error factors. Differences revealed via large samples are more likely to be genuine than differences revealed using a small sample size.

If an experiment can be replicated by others with almost identical findings, then the experiment is a. impacted by the observer effect. b. said to be a naturalistic observation. c. the result of ethological observation. d. said to be reliable.

d. said to be reliable. Choice "a" refers to a situation in which the person observing in a participant observer research study influences/alters the situation. You will recall that the term reliability in the social sciences is also used in regard to testing to indicate consistency in measurement. Choice "b" occurs when clients are observed in a "natural" setting or situation. Choice "c" relates to the observation of animals.

In a career counseling session an electrical engineer mentions three jobs he has held. The first paid $10 per hour, the second paid $30 per hour, and the third paid a higher rate of $50 per hour. The counselor responds that the client is averaging $30 per hour. The counselor is using a. a Pearson Product-Moment Correlation coefficient. b. a factorial design. c. the harmonic mean. d. the mean.

d. the mean. The mean is the sum of scores divided by the number of scores. A factorial design has virtually nothing to do with this question! The term factorial design (which can easily be confused with the term factor analysis) can be used when there are two or more independent variables. Choice "c," the harmonic mean, refers to a central tendency statistic that is the reciprocal of the arithmetic mean of the reciprocals of the set of values. The statistic has limited usage; however, it is occasionally called for if measurements were not made on an appropriate scale (e.g., data revealed the number of behaviors per hour, when the number of minutes per behavior would be more useful). The harmonic mean cannot be utilized with negative numbers or if the data include a score of zero.
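
Both statistics can be computed for the three pay rates with Python's standard-library `statistics` module. This is only an illustrative sketch; the variable names are made up.

```python
from statistics import StatisticsError, harmonic_mean, mean

rates = [10, 30, 50]  # dollars per hour, from the question

arithmetic = mean(rates)          # (10 + 30 + 50) / 3
harmonic = harmonic_mean(rates)   # 3 / (1/10 + 1/30 + 1/50)

# The library refuses negative values, in line with the restriction noted above.
try:
    harmonic_mean([10, -30, 50])
    negative_allowed = True
except StatisticsError:
    negative_allowed = False

print(arithmetic, round(harmonic, 2), negative_allowed)
```

The arithmetic mean is the $30 the counselor quoted, while the harmonic mean of the same three rates is about $19.57.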

When a horizontal line is drawn under a frequency distribution it is known as a. mesokurtic. b. the y axis. c. the ordinate. d. the x axis.

d. the x axis. Choice "a" is from the Greek root "meso" or middle, and kurtic refers to the peakedness of a curve. The normal Gaussian curve is said to be mesokurtic since the peak is in the middle. When graphically representing data, the "x axis" (also called the abscissa) is used to plot the independent variable. The x axis is the horizontal axis. The "y axis" (also called the ordinate) is the vertical axis which is used as a scale for the dependent variable.

An IQ score on an IQ test which was 3 SD above the mean would be a. about average. b. slightly below the norm for adults. c. approximately 110. d. very superior.

d. very superior. Over 99% of the population will score within + or -3 SD of the mean. Therefore, less than 1% of the population would score at a level 3 SD above the mean. That would be a very high IQ score; 145 on the WAIS-IV to be exact. Lewis M. Terman, a pioneer in the study of intelligence, classified any children with IQs over 140 as "geniuses."
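
The arithmetic behind the 145 figure, and the share of a normal distribution lying above it, can be checked with a short sketch. It assumes a mean of 100 and an SD of 15, as on the WAIS-IV.

```python
from statistics import NormalDist

# Assumed scale: mean 100, SD 15.
iq = NormalDist(mu=100, sigma=15)

score = iq.mean + 3 * iq.stdev   # 100 + 3 * 15
share_above = 1 - iq.cdf(score)  # proportion of the curve beyond +3 SD

print(score, f"{share_above:.2%}")
```

Under those assumptions only about 0.13% of scores fall above 145, well under the 1% bound stated above.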

