Unit 11

inferential analysis

Uses the laws of probability to make inferences about populations based on sample data

in vivo codes

Codes that use the language and words of the participants

Chi-square test for contingency tables

Used to determine whether a relationship observed in a contingency table is statistically significant; both the independent and dependent variables are categorical.

Confidence Interval

is a range of numbers inferred from the sample that has a certain probability or chance of including the population parameter. The endpoints of a confidence interval are called confidence limits; the smallest number is called the lower limit, and the largest number is called the upper limit. In other words, rather than using a point estimate (which is a single number), the researcher uses a range of numbers, bounded by the lower and upper limits, as the interval estimate. This way, researchers can increase their chances of capturing the true population parameter.
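As a rough illustration, here is a minimal sketch that computes a 95% confidence interval for a population mean from a small sample. The data values are made up for the example, and the t critical value comes from scipy, which is assumed to be available.

```python
from statistics import mean, stdev
from math import sqrt
from scipy import stats  # assumed available for the t critical value

# Hypothetical sample of test scores (illustration only)
sample = [72, 85, 78, 90, 66, 81, 77, 88, 73, 84]
n = len(sample)

point_estimate = mean(sample)                   # single best guess (point estimate)
standard_error = stdev(sample) / sqrt(n)        # estimated standard error of the mean
t_critical = stats.t.ppf(0.975, df=n - 1)       # two-tailed 95% critical value

margin_of_error = t_critical * standard_error
lower_limit = point_estimate - margin_of_error  # lower confidence limit
upper_limit = point_estimate + margin_of_error  # upper confidence limit

print(f"95% CI: ({lower_limit:.2f}, {upper_limit:.2f})")
```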

Examining Probability Value

After the researcher states the null hypothesis, collects the research data, and selects a statistical test using SPSS, the computer program analyzes the research data and provides something called a probability value as part of the computer output.

union operator

Another operator is OR, also called the union operator. This operator finds all instances that take on any one of the provided words or codes. For example, if you searched a document with the command "female OR first grade" you would come up with instances that are either "female" or "first grade," or both. Another kind of search command is called FOLLOWED-BY in one popular program. Using this can find instances in which two codes occur in a specific order in the data (e.g., punishment FOLLOWED-BY quiet behavior).
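These search operators are easy to mimic outside a dedicated package. The sketch below is a minimal illustration of the logic of an OR search and a FOLLOWED-BY search, not any particular program's API; the coded segments and their positions are hypothetical.

```python
# Hypothetical coded segments: (position in transcript, code)
segments = [
    (1, "female"), (2, "first grade"), (3, "punishment"),
    (4, "quiet behavior"), (5, "male"), (6, "first grade"),
]

def search_or(segments, codes):
    """OR (union) search: segments tagged with any one of the given codes."""
    return [s for s in segments if s[1] in codes]

def search_followed_by(segments, first, second):
    """FOLLOWED-BY search: positions where `first` is later followed by `second`."""
    hits = []
    for pos_a, code_a in segments:
        if code_a == first:
            if any(code_b == second and pos_b > pos_a for pos_b, code_b in segments):
                hits.append(pos_a)
    return hits

print(search_or(segments, {"female", "first grade"}))
print(search_followed_by(segments, "punishment", "quiet behavior"))
```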

co-occurring codes

Co-occurring codes are sets of codes (i.e., two or more codes) that overlap partially or completely. Co-occurring codes might merely show conceptual redundancy in coding (i.e., the two codes mean basically the same thing). More interestingly, co-occurring codes might suggest a relationship among categories within a set of text for a single individual (e.g., an interview transcript) or across multiple sets of text for different individuals (i.e., across several interview transcripts). The key point to remember is that you can allow codes to overlap when coding data.

significance level

(also called the alpha level) is the cutoff that the researcher uses to decide when to reject the null hypothesis: 1. When the probability value is less than or equal to the significance level, the researcher rejects the null hypothesis, and 2. when the probability value is greater than the significance level, the researcher fails to reject the null hypothesis. Choosing a significance level of .05 means that if your sample result would occur only 5% of the time or less (when the null hypothesis is true, as indicated by the probability value), then you are going to question the veracity of the null hypothesis, and you will reject the null hypothesis. The significance level is the value with which the researcher compares the probability value.
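The two decision rules reduce to a single comparison. Here is a minimal sketch, with the .05 alpha used only as the conventional default.

```python
def statistical_decision(p_value, alpha=0.05):
    """Compare the probability value to the significance (alpha) level."""
    if p_value <= alpha:
        return "Reject the null hypothesis (statistically significant)"
    return "Fail to reject the null hypothesis (not statistically significant)"

print(statistical_decision(0.03))  # p <= .05 -> reject
print(statistical_decision(0.20))  # p > .05  -> fail to reject
```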

Checklist for Evaluating a Qualitative Study

1. Are the findings presented clearly and supported with evidence (e.g., quotes, content analysis)?
2. Were any potentially important data ignored by the researcher(s)?
3. Do the results provide a deep understanding of the inner views and meanings of the people studied?

To calculate the variance and standard deviation, follow these five steps:

1. Find the mean of a set of numbers. Add the numbers in column 1 and divide by the number of numbers. (Note that we use the symbol "X-bar" to stand for the mean.)
2. Subtract the mean from each number. Subtract the mean from each number in column 1 and place the result in column 2.
3. Square each of the numbers you obtained in the last step. Square each number in column 2 and place the result in column 3. (To square a number, multiply the number by itself. For example, 2 squared is 2 × 2, which is equal to 4.)
4. Put the appropriate numbers into the variance formula. Insert the sum of the numbers in column 3 into the numerator (the top part) of the variance formula. The denominator (the bottom part) of the variance formula is the number of numbers in column 1. Now divide the numerator by the denominator, and you have the variance.
5. You obtained the variance in the previous step. Now take the square root of the variance, and you have the standard deviation. (To get the square root, type the number into your calculator and press the square root [√] key.)
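The five steps translate directly into code. The sketch below follows the formula described above (dividing by the number of numbers, N) on made-up values; note that many statistics packages divide by N - 1 instead when estimating from a sample.

```python
from math import sqrt

numbers = [2, 4, 4, 6, 9]                   # column 1: hypothetical data values

# Step 1: find the mean (X-bar)
x_bar = sum(numbers) / len(numbers)

# Step 2: subtract the mean from each number (column 2)
deviations = [x - x_bar for x in numbers]

# Step 3: square each deviation (column 3)
squared_deviations = [d ** 2 for d in deviations]

# Step 4: sum of column 3 divided by the number of numbers gives the variance
variance = sum(squared_deviations) / len(numbers)

# Step 5: the square root of the variance is the standard deviation
standard_deviation = sqrt(variance)

print(x_bar, variance, standard_deviation)  # 5.0 5.6 2.366...
```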

There are four important points about inferential statistics

1. First, the distinction between samples and populations is essential. You will recall that a sample is a subset of cases drawn from a population, and a population is the complete set of cases.
2. Second, a statistic (also called a sample statistic) is a numerical characteristic of a sample, and a parameter (also called a population parameter) is a numerical characteristic of a population. Here is the main idea: If a mean or a correlation (or any other numerical characteristic) is calculated from sample data, it is called a statistic; if it is based on all the cases in the entire population (such as in a census), it is called a parameter.
3. Third, in inferential statistics, we study samples when we are actually much more interested in populations. We do not study populations directly because it would be cost prohibitive and logistically impossible to study everyone in most populations that are the focus of research studies. However, because we study samples rather than populations, our conclusions will sometimes be wrong. The solution provided by inferential statistics is that we can assign probabilities to our statements and we can draw conclusions that are very likely to be correct.
4. Fourth, random sampling is assumed in inferential statistics. You will recall that random sampling produces representative samples (i.e., samples that are similar to the populations from which they are selected). The assumption of random sampling is important in inferential statistics because it allows researchers to use the probability theory that underlies inferential statistics. Basically, statisticians have studied and come to understand the behavior of statistics based on random samples.

Checklist for Evaluating an Action Research Study

1. Is it clear how data were collected, and why, for each phase of the project?
2. Were data collection and record-keeping systematic?
3. If methods were modified during data collection, is an explanation provided?
4. Did the researchers undertake method and theoretical triangulation?
5. Were the key findings of the project fed back to participants at key stages? If so, how was their feedback used?
6. Were data analysis procedures described, and were they sufficiently rigorous?
7. Was the study design flexible and responsive?
8. Was the data analysis embedded in a logical process that included problem identification, planning, action (change or intervention that was implemented), and evaluation?
9. Did data analysis drive the actions taken (the change or intervention)?

Steps in Hypothesis Testing

1. State the null and alternative hypotheses.
2. Set the significance level before analyzing the data.
3. Obtain the probability value based on the analysis of your empirical data.
4. Compare the probability value to the significance level and make the statistical decision. Step 4 includes two decision-making rules:
Rule 1: If: Probability value ≤ significance level (i.e., probability value ≤ alpha). Then: Reject the null hypothesis. And: Conclude that the research finding is statistically significant. In practice, this usually means the following: If: Probability value ≤ .05. Then: Reject the null hypothesis. And: Conclude that the research finding is statistically significant.
Rule 2: If: Probability value > significance level (i.e., probability value > alpha). Then: Fail to reject the null hypothesis. And: Conclude that the research finding is not statistically significant. In practice, this usually means the following: If: Probability value > .05. Then: Fail to reject the null hypothesis. And: Conclude that the research finding is not statistically significant.
5. Compute effect size, interpret the results, and make a substantive, real-world judgment about practical significance. This means that you must decide what the results of your research study actually mean. Statistics are only a tool for determining statistical significance. If you have obtained statistical significance, you must now interpret your results in terms of the variables used in your research study. For example, you might decide that females perform better, on average, than males on the GRE Verbal test, that client-centered therapy works better than rational emotive therapy, or that phonics and whole language in combination work better than phonics only. You must also determine the practical significance of your findings. A finding is practically significant when the difference between the means or the size of the relationship is big enough, in your opinion, to be of practical use. For example, a correlation of .15 would probably not be practically significant, even if it were statistically significant. On the other hand, a correlation of .85 would probably be practically significant. Effect size indicators are important aids when you are making a judgment about practical significance.
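As a hedged example, the sketch below walks through Steps 1-5 for a two-group comparison, using made-up scores and scipy's independent-samples t test (assumed available); Cohen's d is used as one common effect size indicator.

```python
from statistics import mean, stdev
from scipy import stats  # assumed available

# Step 1 (stated in words): H0 says the two group means are equal; H1 says they differ.
group_a = [78, 84, 90, 72, 88, 81, 79, 85]   # hypothetical scores
group_b = [70, 75, 68, 74, 80, 66, 73, 71]

alpha = 0.05                                  # Step 2: set the significance level

t_stat, p_value = stats.ttest_ind(group_a, group_b)   # Step 3: probability value

# Step 4: compare the probability value to the significance level
decision = "reject H0" if p_value <= alpha else "fail to reject H0"

# Step 5: effect size (Cohen's d with pooled standard deviation) for practical significance
pooled_sd = ((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2) ** 0.5
cohens_d = (mean(group_a) - mean(group_b)) / pooled_sd

print(f"t = {t_stat:.2f}, p = {p_value:.4f}, {decision}, d = {cohens_d:.2f}")
```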

Checklist for Evaluating a Quantitative Study

1. Were appropriate statistical tests and calculations of effect sizes used to analyze the data?
2. Are the results presented clearly?
3. Was any part of the data ignored, such as some participants being dropped?
4. Can the results be generalized to the populations and settings the researcher desires?

Checklist for Evaluating a Mixed Research Study

1. Were appropriate techniques of data analysis used?
2. Were any potentially important data ignored by the researcher(s)?
3. Were the data merged, connected, or linked to show integration?
4. Was enough evidence provided to convince you of the validity or trustworthiness or legitimacy of the findings?

The following will always be true if the data fully follow a normal distribution:

68.26% of the cases fall within 1 standard deviation of the mean, 95.00% fall within 1.96 standard deviations, 95.44% fall within 2 standard deviations, and 99.74% fall within 3 standard deviations. A good rule for approximating the area within 1, 2, and 3 standard deviations is what we call the "68, 95, 99.7 percent rule."
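These percentages can be checked from the standard normal distribution itself. The sketch below uses the error function from Python's math module; the area within k standard deviations of the mean is erf(k / sqrt(2)).

```python
from math import erf, sqrt

def area_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return erf(k / sqrt(2))

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {area_within(k) * 100:.2f}%")
# Prints approximately 68.3%, 95.0%, 95.4%, and 99.7%
```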

Enumeration

A data analyst might want to determine how frequently words or coded categories appear in the data. This process of quantifying data is called enumeration. Enumeration helps qualitative researchers communicate concepts such as "amount" or "frequency" when writing up the results. Word or code frequencies can help researchers determine the importance of words and ideas. Listing frequencies can also help in identifying prominent themes in the data (e.g., What kinds of things did the participants say many times?). When numbers are reported in qualitative research reports, it is important to check the basis of the numbers being used, or they could be misleading.
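A minimal sketch of enumeration, assuming the codes applied to a set of segments have already been collected into a list; the codes themselves are hypothetical.

```python
from collections import Counter

# Hypothetical list of codes applied while reading interview transcripts
applied_codes = [
    "career choice", "family support", "career choice", "mentoring",
    "career choice", "family support", "workload",
]

code_frequencies = Counter(applied_codes)

# Most frequently occurring codes first, as a rough indicator of prominent themes
for code, count in code_frequencies.most_common():
    print(f"{code}: {count}")
```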

Memoing

A helpful tool for recording ideas generated during data analysis is memoing (writing memos). Memos are reflective notes that researchers write to themselves about what they are learning from their data. Memos can include notes about anything, including thoughts on emerging concepts, themes, and patterns found in the data; the need for further data collection; a comparison that needs to be made in the data; and virtually anything else. Memos written early in a project tend to be more speculative, and memos written later in a project tend to be more focused and conclusive. Memoing is an important tool to use during a research project to record insights gained from reflecting on data. Because qualitative data analysis is an interpretative process, it is important that you keep track of your ideas by recording insights as they occur and not relying strictly on memory.

descriptive analysis

Goal is to describe, summarize, or make sense of a set of data

Creating Hierarchical Category Systems

Categories are the basic building blocks of qualitative data analysis because qualitative researchers make sense of their data by identifying and studying the categories that appear in their data. Think of the set of categories for a collection of data as forming a classification system characterizing those data. Rather than having to think about each sentence or each word in the data, the researcher will, after coding the data, be able to focus on the themes and relationships suggested by the classification system.

Hypothesis Testing

As you may recall, hypothesis testing is another branch of inferential statistics; one that is concerned with how well the sample data support a particular hypothesis, called the null hypothesis, and when the null hypothesis can be rejected. Unlike estimation, in which the researcher usually has no clear hypothesis about the population parameter, in hypothesis testing, the researcher states his or her null and alternative hypotheses and then uses inferential statistics on a new set of data to determine what decision needs to be made about these hypotheses. In hypothesis testing, the researcher hopes to "nullify" the null hypothesis (i.e., they hope to find relationships or patterns in the world, which means that they want to reject the null hypothesis).

Mixed Analysis Matrix

Before conducting a mixed analysis, a researcher needs to make two decisions. First, determine the number of data types that you intend to analyze. Of course, this depends on the number of data types obtained during data collection. Data types are classified as either quantitative data or qualitative data. For example, quantitative data include measurements based on standardized tests, rating scales, self-reports, symptom checklists, or personality inventories. Qualitative data include open-ended interview responses, open-ended questionnaire responses, observations and field notes, personal journals, diaries, permanent records, transcription of meetings, social and ethnographic histories, and photographs. If only one data type (i.e., quantitative only or qualitative only) is used, then we refer to this as monodata. Conversely, if both qualitative and quantitative data types are used, then we refer to this as multidata. Second, you should determine how many data analysis types you intend to use. These data analysis types can be either quantitative (i.e., statistical) or qualitative. If you use only one type of data analysis (i.e., quantitative analysis only or qualitative analysis only), then it is called monoanalysis. Conversely, if you use both types of data analysis, then it is called multianalysis. The two considerations just mentioned generate what is called the mixed analysis matrix. You may recall this matrix from earlier in the course. Crossing the two types of data (monodata and multidata) with the two types of analysis (monoanalysis and multianalysis) produces a 2 × 2 matrix with four cells. Each of the four cells is described in the entries that follow.

A simple or complex search can be performed with computer packages that use Boolean operators. Boolean operators

Boolean operators are words used to create logical combinations based on basic laws of thought. Boolean operators are used every day when we think and talk about things. Some common Boolean operators we all use are AND, OR, NOT, IF, THEN, and EXCEPT. Qualitative data analysis computer programs are written so that you can search your data or a set of codes using these and many other operators. For example, you can search the codes or text in a set of interview transcripts concerning teacher satisfaction using the following string of words: "male AND satisfied AND first grade." The Boolean operator AND is called the intersection operator because it finds all intersections of the words or codes. This search would locate all instances of male, first-grade teachers who were satisfied.
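A minimal sketch of an intersection (AND) search, assuming each interview transcript has already been tagged with a set of codes; this illustrates the logic only, not any specific program's search syntax, and the transcript names and codes are hypothetical.

```python
# Hypothetical transcripts, each tagged with a set of codes
transcripts = {
    "teacher_01": {"male", "satisfied", "first grade"},
    "teacher_02": {"female", "satisfied", "first grade"},
    "teacher_03": {"male", "dissatisfied", "third grade"},
}

def search_and(transcripts, required_codes):
    """AND (intersection) search: cases tagged with every required code."""
    return [name for name, codes in transcripts.items()
            if required_codes <= codes]          # subset test = all codes present

print(search_and(transcripts, {"male", "satisfied", "first grade"}))  # ['teacher_01']
```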

facesheet codes

Codes that apply to a complete document or case (e.g., to an interview) are called facesheet codes. The origin of the term facesheet probably comes from researchers attaching a sheet of paper to each transcript with codes listed that apply to the whole transcript. Demographic variables are frequently used as facesheet codes (e.g., gender, age, race, occupation, school). Researchers might later decide to sort their data files by facesheet codes to search for group differences (e.g., differences between older and younger teachers) or other relationships in the data.

coding

Coding is the process of marking segments of data (usually text data) with symbols, descriptive words, or category names. Codes are tags or labels for assigning units of meaning to the descriptive or inferential information compiled during a study. Codes usually are attached to "chunks" of varying size—words, phrases, sentences, or whole paragraphs. . . . They can take the form of a straightforward category label or a more complex one.

QUALITATIVE DATA ANALYSIS

Data analysis begins early in a qualitative research study, and during a single research study, qualitative researchers alternate between data collection (e.g., interviews, observations, focus groups, documents, physical artifacts, and field notes) and data analysis (creating meaning from raw data). Segmenting and coding go hand in hand because segmenting involves locating meaningful segments of data and coding involves marking or labeling those segments with codes or categories. Qualitative data analysis involves the analysis of text from interview or field note transcripts, or the examination of visual material. Some basic procedures in qualitative data analysis are transcribing data, reading and rereading transcripts (i.e., immersing yourself in your data to understand what is going on), segmenting and coding the data, counting words and coded categories (enumeration), searching for relationships and themes in the data, and generating diagrams to help in interpreting the data. The goal of data analysis is to be able to summarize your data clearly and generate inductive theories based on the data. Mixed methods analyses are highly dependent on the purpose and questions of a study. Once these are established, an analysis strategy can be developed, along with specific analytical procedures that facilitate the combining of qualitative and quantitative data.

Data correlation

Data correlation involves correlating or cross-classifying different data types, such as transforming qualitative data into categorical variables and examining their relationships with quantitative variables.

Data display

Data display refers to describing visually your quantitative data (e.g., using tables and graphs) and/or your qualitative data (e.g., using graphs, charts, matrices, checklists, rubrics, networks, and Venn diagrams).

Data reduction

Data reduction involves reducing the number of dimensions in the quantitative data (e.g., via descriptive statistics, exploratory factor analysis) and/or in the qualitative data (e.g., via thematic analysis, memoing).

Mixed Methods Analysis Techniques

Data reduction: Process of reducing the number of dimensions in the quantitative data (e.g., via descriptive statistics, exploratory factor analysis) and/or in the qualitative data (e.g., via thematic analysis, memoing).
Data display: Refers to the visual display of quantitative data (e.g., using tables and graphs) and/or qualitative data (e.g., using graphs, charts, matrices, checklists, rubrics, networks, and Venn diagrams).
Data transformation: Involves quantitizing and/or qualitizing data.
Data correlation: Involves correlating or cross-classifying different data types, such as transforming qualitative data into categorical variables and examining their relationships with quantitative variables.
Data consolidation: Quantitative and qualitative data are combined to create new or consolidated codes, variables, or data sets.
Data comparison: Findings from qualitative and quantitative data sources or analyses are compared.
Data integration: Qualitative and quantitative findings are integrated into a coherent whole (typically done last in the analysis process).

Data transformation

Data transformation involves quantitizing and/or qualitizing data.

qualitative analysis

Focused on the analysis of nonnumerical data, such as words and pictures

Quantitative Analysis Techniques

Descriptive: Describe, summarize, or make sense of a particular set of data.
Frequency distribution: A systematic arrangement of data values in which the data are rank ordered and the frequencies of each unique data value are shown.
Graphic Representations of Data: Created in order to represent data in two-dimensional space.
Measures of Central Tendency: Use of a single numerical value most typical of the values of a quantitative variable.
Measures of Variability: A numerical index that provides information about how spread out or dispersed the data values are or how much variation is present in a data set.
Measures of Relative Standing: Provide information about where a score falls in relation to the other scores in the distribution of data.
Contingency table (also called cross-tabulation or crosstab): Displays information in cells formed by the intersection of two or more categorical variables. Various kinds of information can be put into the cells of a contingency table (e.g., observed cell frequencies, row percentages, column percentages).
Regression analysis: A set of statistical procedures used to explain or predict the values of a dependent variable based on the values of one or more independent variables. In regression analysis, there is always a single quantitative dependent variable, although the independent variables can be either categorical or quantitative.
Inferential: Use the laws of probability to make inferences about populations based on sample data.
Sampling distribution: The theoretical probability distribution of the values of a statistic that results when all possible random samples of a particular size are drawn from a population.
Point estimation: The use of the value of a sample statistic as the estimate of the value of a population parameter (e.g., using the sample mean to estimate the population mean, the sample percentage to estimate the population percentage, or the sample correlation to estimate the population correlation).
Confidence interval: A range of numbers inferred from the sample that has a certain probability or chance of including the population parameter.
t Test for independent samples: Used to determine whether the difference between the means of two groups is statistically significant.
One-way analysis of variance: Used to compare two or more group means. It is appropriate whenever you have one quantitative dependent variable and one categorical independent variable.
Post hoc test: A follow-up test to analysis of variance that is used to determine which means are significantly different.
Test for correlation coefficients: Used to determine whether a correlation coefficient is statistically significant.
t Test for regression coefficients: Uses the t distribution (sampling distribution) to test each regression coefficient for statistical significance.
Chi-square test for contingency tables: Used to determine whether a relationship observed in a contingency table is statistically significant.
Analysis of covariance: A control method used to equate comparison groups that differ on a pretest or some other variable or variables.
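To make a couple of these tests concrete, the sketch below runs a chi-square test on a small contingency table and a significance test for a correlation coefficient, using scipy (assumed available) and made-up numbers.

```python
from scipy import stats  # assumed available

# Chi-square test for a 2 x 2 contingency table of observed cell frequencies
observed = [[30, 10],    # e.g., group A: yes / no
            [20, 25]]    # e.g., group B: yes / no
chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)

# Test for a correlation coefficient between two quantitative variables
x = [2, 4, 5, 7, 8, 10, 11, 13]
y = [1, 3, 6, 6, 9, 10, 13, 14]
r, p_r = stats.pearsonr(x, y)

print(f"chi-square = {chi2:.2f}, p = {p_chi2:.4f}")
print(f"r = {r:.2f}, p = {p_r:.4f}")
```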

In organizing a data set

Each participant gets a row, each variable gets a column
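A quick sketch of that layout using pandas (assumed available), with hypothetical participants and variables.

```python
import pandas as pd  # assumed available

# Each participant is a row; each variable is a column
data = pd.DataFrame({
    "participant_id": [1, 2, 3],
    "gender":         ["F", "M", "F"],
    "gpa":            [3.4, 2.9, 3.8],
    "test_score":     [88, 75, 94],
})

print(data)
```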

positively skewed

If a tail appears to be stretched or pulled toward the right, the distribution is said to be skewed to the right. In a positively skewed distribution, the numerical value of the mean is greater than the median, which is greater than the mode. A good example of positive skew is housing prices. Most people live in modest houses, so the mode is at the lower end of the distribution. The small number of million-dollar mansions at the high end of the distribution pulls the mean up, but the mode remains where most of the scores appear, at the low end of the distribution.
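A small numeric illustration of the mean > median > mode pattern, using made-up housing-price-like values.

```python
from statistics import mean, median, mode

# Hypothetical housing prices: mostly modest, with one very expensive home
prices = [150_000, 150_000, 160_000, 175_000, 180_000, 200_000, 1_000_000]

print(mean(prices))    # pulled upward by the expensive home
print(median(prices))  # middle value
print(mode(prices))    # most common value, at the low end
```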

negatively skewed

If one tail appears to be stretched or pulled toward the left, the distribution is said to be skewed to the left. In a negatively skewed distribution, the numerical value of the mean is less than the median, and the numerical value of the median is less than the mode. A good example of a variable that is negatively skewed is self-esteem. Most people generally score high on self-esteem, with a few people scoring very low. Those few people with low scores pull down the mean, while the modal score remains at the high end of the distribution.

There are two kinds of estimation procedures in inferential statistics.

If you use a single number (the value of your sample statistic) as your estimate (your best guess) of the population parameter, then you are engaged in point estimation. If you use a range of numbers that you believe includes the population parameter, then you are engaged in interval estimation.

MIXED DATA ANALYSIS TECHNIQUES

In a mixed research study, after you have collected qualitative and/or quantitative data, you will be in a position to analyze these data. That is, you will be ready to conduct a mixed analysis. The term mixed data analysis simply means that a researcher uses both quantitative and qualitative analytical techniques in a single research study. The researcher might use quantitative and qualitative techniques at approximately the same time (concurrently). For example, the qualitative and quantitative data might be merged into a single data set and analyzed concurrently. On the other hand, the researcher might use quantitative and qualitative techniques at different times (i.e., sequentially or iteratively). For example, initial qualitative data might be analyzed, interpreted, and used to inform a quantitative phase of the study, after which quantitative data are analyzed. More complex possibilities also exist. For example, during each phase of a research study, both types of data might be collected, analyzed, and used in multiple ways. The key idea is that in mixed data analysis, quantitative and qualitative data and/or quantitative and qualitative data analytic approaches are used in the same research study.

Data comparison

In data comparison, the findings from the qualitative and quantitative data sources or analyses are compared.

Data consolidation

In data consolidation, the quantitative and qualitative data are combined to create new or consolidated codes, variables, or data sets.

Data integration

In data integration (typically done last), the qualitative and quantitative findings are integrated into a coherent whole.

hierarchical analysis

In hierarchical analysis, categories are organized into different levels, typologies, and hierarchical systems. A set of subcategories might fall beneath a certain category, and that certain category might itself fall under an even higher-level category. Think about the category called fruit. In this case, some possible subcategories are oranges, grapefruit, kiwi, apples, and bananas. These are subcategories of fruit because they are "part of" or "types of" the higher-level category called fruit. The category fruit may itself be a subcategory of yet a higher category called food group. Systems of categories like this are called hierarchies because they are layered or fall into different levels.

Purpose of INFERENTIAL STATISTICS

In inferential statistics, researchers attempt to go beyond their data. In particular, they use the laws of probability to make inferences about populations based on sample data. In the branch of inferential statistics known as estimation, researchers want to estimate the characteristics of populations based on their sample data. They use random samples (i.e., "probability" samples) to make valid statistical estimations about populations. In the branch of inferential statistics known as hypothesis testing, researchers test specific hypotheses about populations based on their sample data.

Qualitative Analysis Techniques

Interim Analysis: A cyclical or recursive process of collecting data, analyzing the data, collecting additional data, and analyzing those data throughout a research project.
Memoing: Memos are reflective notes that researchers write to themselves about what they are learning from their data. Memos can include notes about anything, including thoughts on emerging concepts, themes, and patterns found in the data.
Photo-interviewing analysis: Photo interviewing is a method of data collection in which researchers show images to research participants during formal or informal interviews.
Semiotic visual analysis: Semiotics is the study of signs and what they stand for in a human culture.
Visual content analysis: Visual content analysis is based on what is directly visible to the researcher in an image or set of images. It differs from other methods of visual analysis in that it is more quantitative. Unlike more qualitative visual data analysis methods, visual content analysis concentrates on studying a representative sample rather than individual instances of images.
Enumeration: The process of quantifying qualitative data. Enumeration helps qualitative researchers communicate concepts such as "amount" or "frequency" when writing up the results.
Thematic analysis: Identification of themes in qualitative research findings.
Hierarchical analysis: The process of organizing categories of data into different levels, typologies, and hierarchical systems. A set of subcategories might fall beneath a certain category, and that certain category might itself fall under an even higher-level category.

intracoder reliability

Intracoder reliability is also important. That is, it is also important that each individual coder be consistent. To help you remember the difference between intercoder reliability and intracoder reliability, remember that the prefix inter- means "between" and the prefix intra- means "within." Therefore, intercoder reliability means reliability, or consistency, between or across coders, and intracoder reliability means reliability within a single coder. If the authors of qualitative research articles that you read address the issues of intercoder and intracoder reliability, you should upgrade your evaluation of their research.

Photo interviewing

Photo interviewing is a method of data collection in which researchers show images to research participants during formal or informal interviews. What is unique in this approach is that the researcher has the participant "analyze" the pictures shown to him or her; the researcher records the participant's thoughts, memories, and reactions as "results." In this approach, the pictures are the stimulus and the participant is the analyst. The researcher reports these descriptive findings as the primary results. In addition to this photo-interviewing analysis, the researcher can interpret the results further.

Computer Programs for Qualitative Data Analysis

Qualitative data analysis programs can be used to do virtually every qualitative analysis technique previously discussed. They can, for example, be used to store and code your data. During coding, most programs allow complex hierarchical classification systems to be developed. Most programs allow the use of many different kinds of codes, including co-occurring and facesheet codes. Enumeration is easily done with just a few clicks of the computer mouse. Many programs allow you to attach memos or annotations to the codes or data documents so that you can record the important thoughts you have during analysis. Some programs will produce graphics that can be used in presenting the data. Finally, the heart and soul of most qualitative data analysis programs are their searching capabilities, the topic to which we now turn.

The most popular programs are MAXQDA (www.maxqda.com), hyperRESEARCH (www.researchware.com/products/hyperresearch.html), QDA Miner (http://provalisresearch.com/products/qualitative-data-analysis-software/), and NVivo (www.qsrinternational.com/products_nvivo.aspx). Many others also work well, such as the newer package Dedoose (www.dedoose.com) and older packages such as Ethnograph (www.qualisresearch.com) and atlas (www.atlasti.com/qualitative-analysis-software.html).

Advantages of using qualitative data analysis computer programs are that they can help in storing and organizing data, they can be used for many kinds of analyses, they can reduce the time required to analyze data (e.g., an analysis procedure that takes a lot of time by hand may take virtually no time with a computer program), and they can make procedures available that are rarely done by hand because they are either too time-consuming or too complex. Some disadvantages are that computer programs can take time to learn, they cost money and require computer availability, and they can become outdated. The biggest disadvantage is start-up time.

Level of Confidence

Researchers are able to state the probability (called the level of confidence) that a confidence interval to be constructed from a random sample will include the population parameter. We use the future tense because our confidence is actually in the long-term process of constructing confidence intervals. For example, 95% confidence intervals will capture the population parameter 95% of the time (the probability is 95%), and 99% confidence intervals will capture the population parameter 99% of the time (the probability is 99%). A higher level of confidence comes at a cost, however: 99% confidence intervals are wider than 95% confidence intervals, and wider intervals are less precise. An effective way to achieve both a higher level of confidence and a narrower (i.e., more precise) interval is to increase the sample size. Bigger samples are therefore better than smaller samples. As a general rule, most researchers use 95% confidence intervals, and as a result, they make a mistake about 5% of the time. Researchers also attempt to select sample sizes that produce intervals that are narrow (i.e., precise) enough for their needs. In general, confidence interval = point estimate ± margin of error.
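The long-run interpretation can be demonstrated by simulation: draw many random samples from a population with a known mean, build a 95% confidence interval from each, and count how often the interval captures the true mean. The sketch below uses numpy and scipy (both assumed available) with made-up population values.

```python
import numpy as np
from scipy import stats  # assumed available

rng = np.random.default_rng(0)
true_mean, true_sd, n, trials = 100, 15, 30, 10_000

captured = 0
for _ in range(trials):
    sample = rng.normal(true_mean, true_sd, n)
    point_estimate = sample.mean()
    margin_of_error = stats.t.ppf(0.975, df=n - 1) * sample.std(ddof=1) / np.sqrt(n)
    if point_estimate - margin_of_error <= true_mean <= point_estimate + margin_of_error:
        captured += 1

print(captured / trials)  # close to 0.95 over the long run
```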

statistically significant

Researchers claim their finding is statistically significant when they do not believe (based on the evidence of their data) that their observed result was due only to chance or sampling error.

Significance Tests

Researchers report statistical significance to add credibility to their conclusions. Researchers do not want to interpret findings that are not statistically significant because these findings are probably a reflection of chance fluctuations. There are a number of statistical significance tests that are used for hypothesis testing. Several of the most common tests of significance are described elsewhere in this unit.

When you engage in hypothesis testing, you follow these two rules:

Rule 1. If the probability value (which is a number obtained from the computer printout and is based on your research results) is less than or equal to the significance level (the researcher usually uses .05), then the researcher rejects the null hypothesis and tentatively accepts the alternative hypothesis. The researcher also concludes that the observed relationship is statistically significant (i.e., the observed difference between the groups is not just due to chance fluctuations). Rule 2. If the probability value is greater than the significance level, then the researcher cannot reject the null hypothesis. The researcher can only claim to fail to reject the null hypothesis and conclude that the relationship is not statistically significant (i.e., any observed difference between the groups is probably nothing but a reflection of chance fluctuations).

Semiotic visual analysis

Semiotic visual analysis is based on the theory of semiotics. A researcher who conducts semiotic analysis is therefore very concerned with what the signs in visual images mean. Semiotic researchers are not concerned with finding images that are statistically representative of a large set of images. Rather, they are concerned with individual images that have conceptual meaning or with how meaning is produced by images. Images often have layered meanings. From a semiotic perspective, images are denotative and connotative (Barthes, 1973). In the first layer, called denotative meaning, researchers simply want to know what is being depicted in the images. This layer assumes that we can only recognize what we already know, and this knowledge can be affected by verbal captions placed under photographs, for example, or by visual stereotypes in our cultures. The second semiotic layer, connotative meaning, builds on what researchers and participants know and explores the ways in which ideas and values are expressed and represented in images.

Semiotics

Semiotics is the study of signs and what they stand for in a human culture. A sign is something that stands for something else and may mean something different to different people or in different contexts.

a priori codes

Sometimes researchers bring an already developed coding scheme to the research project. These codes are called a priori codes, or preexisting codes , because they were developed before or at the very beginning of the current research study. A priori codes are used when a researcher is trying to replicate or extend a certain line of previous research. Researchers may also establish some a priori codes before data collection based on their relevance to the research questions. When researchers bring a priori codes to a research study, they come in with a start list of codes—an already developed master list that they can use for coding. During coding, however, the researcher should apply these codes only when they clearly fit segments of data. The codes should not be forced onto the data, and new codes should be generated when data segments are found that do not fit any of the codes on the list. In practice, many researchers employ both preexisting and inductive codes.

Analytical Procedures in Mixed Data Analysis

The conduct of mixed analysis potentially can involve many analytical strategies and procedures (Onwuegbuzie & Teddlie, 2003). Although Onwuegbuzie and Teddlie viewed the following as mixed data analysis stages, we prefer to view these as mixed data analysis strategies or procedures, some of which you will use and some of which you will not use in a particular research study:

monodata-monoanalysis

The first cell represents analysis of one data type using its standard analysis type, which involves either a quantitative (i.e., statistical) analysis of quantitative data or a qualitative analysis of qualitative data. Such analysis indicates that the underlying study is either a quantitative or a qualitative study, neither of which represents mixed research.

multidata-multianalysis

The fourth cell represents the analysis of both data types (e.g., quantitative and qualitative) using both analysis types (i.e., qualitative and quantitative). This class of analysis is called multidata-multianalysis. Because both quantitative and qualitative analytical techniques are used, the analysis is mixed. Multidata-multianalysis might be done concurrently, involving a statistical analysis of the quantitative data combined with a qualitative analysis of the qualitative data, followed by meta-inferences being made in which interpretations stemming from the quantitative and qualitative findings are integrated in some way into a coherent whole (Tashakkori & Teddlie, 2003). Alternatively, multidata-multianalysis could be sequential in nature such that findings from the qualitative analysis inform the subsequent quantitative analysis, or vice versa. Cell 4 can accommodate rather complex analytical designs. For example, Li et al. (2000) used what they called cross-tracks analysis. This was characterized by a concurrent analysis of both qualitative and quantitative data such that the data analysis oscillated continually between both sets of data types throughout various stages of the data analysis process.

Purpose of DESCRIPTIVE STATISTICS

The goal of descriptive statistics is to describe, summarize, or make sense of numerical data. Researchers use graphic representations such as bar graphs, line graphs, and scatter plots as well as measures of central tendency (i.e., mode, median, and mean) to display and analyze these data. Before this analysis can start, a researcher must have a data set to interpret. The researcher can summarize the variables in a data set one at a time, as well as examine how the variables are interrelated (e.g., by examining correlations). The key question in descriptive statistics is how researchers should communicate the essential characteristics of the data.

Probability Value

The probability value (also called the p value) is the probability of the observed result of your research study (or a more extreme result) under the assumption that the null hypothesis is true. As an aside, the probability value is a conditional probability because it tells you the probability of the observed value of your test statistic (or a more extreme value) if the null hypothesis is true. The probability value is not the probability that the null hypothesis is true, not the probability that the null hypothesis is false, not the probability that the alternative hypothesis is true, and not the probability that the alternative hypothesis is false. The probability value is the long-run frequency (through repeated sampling) that the particular observed value of your sample statistic or a more extreme value would occur simply due to chance fluctuations when the null hypothesis is true. If the probability value is very small, the researcher is allowed to reject the null hypothesis because the research results call into question the null hypothesis. If the probability value is large, the researcher will fail to reject the null hypothesis. The researcher will also make the claim that the research finding is not statistically significant (i.e., the observed difference between the two means may simply be a random or chance fluctuation). How small is small enough? Most researchers consider a probability value that is less than or equal to .05 to be small and a probability value that is greater than .05 to be relatively large.

monodata-multianalysis

The second cell represents analysis of one data type (e.g., quantitative only or qualitative only) using both analysis types (i.e., qualitative and quantitative). This class of analysis is called monodata-multianalysis. Because both quantitative and qualitative analytical techniques are used, this type of analysis is mixed. The first analysis employed in this cell should directly match the data type. Thus, if the data type is quantitative, then the first phase of the mixed analysis should be quantitative (i.e., statistical). Similarly, if the data type is qualitative, then the first phase of the mixed analysis should be qualitative. The data stemming from the initial analyses are then converted into the other data type. That is, the quantitative data are transformed into data that can be analyzed qualitatively, which is known as qualitizing data (Tashakkori & Teddlie, 1998); or the qualitative data are transformed into numerical codes that can be analyzed statistically, which is known as quantitizing data (Tashakkori & Teddlie).

Qualitizing data. One way of qualitizing data is by forming narrative profiles (e.g., modal profiles, average profiles, holistic profiles, comparative profiles, normative profiles), in which narrative descriptions are constructed from statistical data. For example, Teddlie & Stringfield (1993) conducted a longitudinal study of eight matched pairs of schools that were initially classified as either effective or ineffective with regard to baseline data. Five years after the study was initiated, these researchers used eight empirical criteria to reclassify the schools' effectiveness status. These criteria were (a) norm-referenced test scores, (b) criterion-referenced test scores, (c) time-on-task in classrooms, (d) scores on quality of classroom instruction measures, (e) faculty stability, (f) student attendance, (g) changes in socioeconomic status of the schools' student bodies, and (h) measures of school "climate." Teddlie & Stringfield converted (i.e., qualitized) these quantitative data into the following four qualitatively defined school profiles: (a) stable-more effective, (b) stable-less effective, (c) improving, and (d) declining. These school profiles were used to add greater understanding to the researchers' evolving perspectives on the schools.

Quantitizing data. When researchers quantitize data, "qualitative 'themes' are numerically represented, in scores, scales, or clusters, in order more fully to describe and/or interpret a target phenomenon" (Sandelowski, 2001, p. 231). This allows researchers to understand how often various categories or statements occurred in qualitative data, rather than only knowing what categories or statements occurred. Quantitizing sometimes involves reporting "effect sizes" associated with qualitative observations (Onwuegbuzie, 2003; Sandelowski & Barroso, 2003), which can range from manifest effect sizes (i.e., counting qualitative data to determine the prevalence rates of codes, observations, words, or themes) to latent effect sizes (i.e., quantifying nonobservable content, for example, by factor-analyzing emergent themes).
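Quantitizing, in its simplest (manifest) form, is just counting how often each theme appears for each participant. The sketch below is a minimal illustration that turns hypothetical coded interview data into a participant-by-theme count matrix that could then be analyzed statistically.

```python
from collections import Counter

# Hypothetical themes coded in each participant's interview transcript
coded_interviews = {
    "participant_01": ["support", "stress", "support", "workload"],
    "participant_02": ["stress", "stress", "workload"],
    "participant_03": ["support", "mentoring"],
}

themes = sorted({t for codes in coded_interviews.values() for t in codes})

# Quantitized matrix: one row per participant, one count per theme
for participant, codes in coded_interviews.items():
    counts = Counter(codes)
    row = {theme: counts.get(theme, 0) for theme in themes}
    print(participant, row)
```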

t distribution

The t distribution used in significance testing is the sampling distribution under the assumption that the null hypothesis is true. Therefore, the researcher rejects the null hypothesis when the value of t is large (i.e., when it falls in one of the two tails of the t distribution). Typically, t values that are greater than +2.00 (e.g., +2.15) or less than -2.00 (e.g., -2.15) are considered to be large t values.
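The ±2.00 rule of thumb can be checked against exact critical values from the t distribution. The sketch below uses scipy (assumed available) to print the two-tailed .05 critical value for a few illustrative degrees of freedom.

```python
from scipy import stats  # assumed available

for df in (10, 30, 60, 120):
    critical_t = stats.t.ppf(0.975, df)   # two-tailed test at alpha = .05
    print(f"df = {df}: reject H0 if |t| > {critical_t:.2f}")
# Critical values shrink toward 1.96 as the degrees of freedom increase
```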

t test for independent samples

The t test for independent samples is one of the most common statistical significance tests and is used with a quantitative dependent variable and a dichotomous (i.e., composed of two levels or groups) independent variable. The purpose of this test is to see whether the difference between the means of two groups is statistically significant. The reason this test is called a t test is that the sampling distribution used to determine the probability value is known as the t distribution. The mean of the t distribution is equal to zero. Just like the normal curve, the t distribution is symmetrical, is higher at the center, and has a "right tail" and a "left tail" that represent extreme events.

multidata-monoanalysis

The third cell represents analysis of two data types (i.e., qualitative and quantitative) using only one data analysis type—that is, multidata-monoanalysis. This combination is uncommon in research. In fact, this cell generally should be avoided because it would entail one of the types of data being analyzed using a nonstandard analysis (e.g., only analyzing qualitative data using quantitative analysis or only analyzing quantitative data using qualitative analysis).

Variance/Standard Deviation

The two most popular measures of variability among researchers are the variance and standard deviation because these measures are the most stable and are the foundations of more advanced statistical analysis. These measures are also based on all the data values of a variable and not just the highest and lowest numbers, as is the case with the range. They are essentially measures of the amount of dispersion or variation around the mean of a variable. The variance and standard deviation will be larger when the data are spread out (heterogeneous) and smaller when the data are not very spread out (homogeneous).

thematic analysis

This analysis of themes, or thematic analysis, is a common type of qualitative data analysis.

Transcription

To analyze qualitative data carefully, most data should be transcribed. Transcription is the process of transforming qualitative research data, such as audio recordings of interviews or field notes written from observations, into typed text. The typed text is called a transcript. If the original data source is an audio recording, transcription involves sitting down, listening to the recording, and typing what was said into a word processing file. If the data are memos, open-ended questionnaires, or observational field notes, transcription involves typing the handwritten text into a word processing file. In short, transcription involves transferring data from a less usable to a more usable form. After transcription, it is good practice to put the original data somewhere for safekeeping. Some qualitative researchers use a voice recognition computer program, which can make transcribing relatively easy. These programs create transcriptions of data while you read the words and sentences into a microphone attached to your computer. Two popular programs are IBM ViaVoice and Dragon Naturally Speaking. The main advantage of voice recognition software is that it is easier to talk into a microphone than it is to type. Time savings are not currently large in comparison with typing, but the efficiency of these programs will continue to improve over time. It is important to note that these principles also apply when your qualitative data do not directly lend themselves to text (e.g., videotapes of observations, still pictures, and artifacts). You cannot directly transcribe these kinds of data sources. What can be done, however, is to employ the principles of coding (discussed previously) and enter codes and comments into text files for further qualitative data analysis.

Action Research Analysis Techniques

Triangulation: Studying each research question from multiple separate pieces of data and points of view. Introspection: A parallel process to data analysis, involving the examination and reduction of local biases and emotions in relation to the action research data.

Visual content analysis

Visual content analysis is different from semiotic analysis. Visual content analysis is based on what is directly visible to the researcher in an image or set of images. It differs from other methods of visual analysis in that it is more quantitative. For example, with visual content analysis, researchers might examine the relative frequencies of women or minorities in school texts or on websites that recruit college professors. Unlike more qualitative visual data analysis methods, visual content analysis concentrates on studying a representative sample rather than individual instances of images. It is less concerned with deep meaning and more concerned with prevalence. Visual content analysis begins with assertions or hypotheses that categorize and compare visual content. The categories are observable. The corpus (sample size or domain) of the study is decided ahead of time based on the research questions, how important it is to generalize the findings, and the statistical procedures to be employed. Visual content analysis is often limited to isolated content that represents particular variables under study. The variables are limited by clearly defined values that coders can classify consistently (reliably). For example, the variable setting takes on one or more of the values of office, domestic, public, religious, school, outside, or other.

Interval Estimation

When researchers use interval estimation, they construct confidence intervals

intercoder reliability

Intercoder reliability refers to high consistency among different coders about the appropriate codes. It is a type of interrater reliability. Intercoder reliability adds to the objectivity of the research and reduces errors due to inconsistencies among coders. Achieving high consistency requires training and a good deal of practice.

Standard Deviation

When you take the square root of the variance, you obtain the standard deviation.

Skewed distributions

Why does the mean change more than the other measures of central tendency in the presence of a skewed distribution? The answer is that the mean takes into account the magnitude of all of the scores. In contrast, the median takes into account only the number of scores and the values of the middle scores, and the mode provides information only about which data value occurs most often. As a general rule, the mean is the best measure because it is the most precise, and it is also the most stable from sample to sample. The mode is usually the least desirable measure and should be used only when it is important to express which single number or category occurs the most frequently. Otherwise, the mean or the median is usually the preferred measure of central tendency. One helpful rule to remember: if the mean is less than the median, the distribution is skewed left; if the mean is greater than the median, the distribution is skewed right.

statistics

a branch of mathematics that deals with the analysis of numerical data. It can be divided into two broad categories called descriptive statistics and inferential statistics. Inferential statistics was discussed previously in Unit 3, Module 3 in regard to quantitative research. In descriptive statistics, however, the goal is to describe, summarize, or make sense of a particular set of data. The focus is more on interpretation than on calculation.

Line Graphs

a format for illustrating data that relies on the drawing of one or more lines. One useful way to draw a graphical picture of the distribution of a variable is to construct a line graph. A line graph of the distribution of a single variable (e.g., grade point average from a college student data set) is known as a frequency polygon. A second type of line graph is commonly constructed in factorial research designs: the dependent variable is placed on the vertical axis, one of the independent variables is placed on the horizontal axis, and the categories of a second independent variable are represented by separate lines. Another common use of line graphs is to show trends over time. In this case, the variable that a researcher wishes to observe changing over time is placed on the vertical axis, and time is placed on the horizontal axis. The key point is that there is not just one type of line graph. Line graphing is a versatile tool for educational researchers.

Histograms

a graphic presentation of a frequency distribution. Bar graphs are used when the variable is a categorical variable; however, if your variable is a quantitative variable, a histogram is preferred. A histogram is especially useful because it shows the shape of the distribution of values. For example, comparing a table of starting salaries with a histogram built from the same data set makes the shape of the salary distribution much easier to see.
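A minimal plotting sketch, assuming matplotlib is available and using made-up starting-salary values; the histogram makes the shape of the distribution visible at a glance.

```python
import matplotlib.pyplot as plt  # assumed available

# Hypothetical starting salaries (in dollars)
salaries = [32000, 34000, 35000, 35000, 36000, 37000, 38000, 38000,
            39000, 40000, 41000, 42000, 45000, 52000, 60000]

plt.hist(salaries, bins=6, edgecolor="black")  # bars show frequency per salary interval
plt.xlabel("Starting salary")
plt.ylabel("Frequency")
plt.title("Histogram of starting salaries")
plt.show()
```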

master list

a master list is simply a list of all the codes used in the research study. The master list should include each code followed by the full code name and a brief description or definition of the code. A well-structured master list will enable other researchers working on the project to use the list readily. During coding, the codes on the master list should be reapplied to new segments of text each time an appropriate segment is encountered. For example, one category from the master list for the interview data in this study would be "career choice." Therefore, when the data analyst for this research study encountered another segment of data in which the same or a different person being interviewed made a comment about career choice, the researcher would reapply the label "career choice." Every time a segment of text was about career choice, the researcher would use the code "career choice" to refer to that segment.

Variance

a measure of the average deviation of all the numbers from the mean in squared units.

measure of variability

a numerical index that provides information about how spread out or dispersed the data values are or how much variation is present. A set of numbers with little variability is described as homogeneous; a set with a great deal of variability is heterogeneous. When a set of numbers is relatively homogeneous, you can place more trust in the measure of central tendency (mean, median, or mode) as being typical. Conversely, when a set of numbers is relatively heterogeneous, you should view the measure of central tendency as being less typical or representative of the data values. When the numbers are not very spread out, the mean is more representative of the set of numbers than when the numbers are quite spread out. Therefore, a measure of variability should usually accompany measures of central tendency. The three most commonly used indexes of variability are the range, the variance, and the standard deviation.

Scatter Plots

a way to visualize the relationship between two quantitative variables. The dependent variable is represented on the vertical axis, and the independent variable is represented on the horizontal axis. Dots are plotted within the graph to represent the cases (i.e., individuals). Questions when dealing with scatter plots: 1. Does there appear to be a relationship between the two variables? 2. Is it a linear relationship or a curvilinear relationship? 3. If there is a relationship, how strong is it? 4. If a linear relationship is present, is it positive or negative?
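
A minimal sketch of a scatter plot with matplotlib; the hours-studied and exam-score values are hypothetical.

```python
# Scatter plot: independent variable on the x-axis, dependent variable on the y-axis.
import matplotlib.pyplot as plt

hours_studied = [1, 2, 3, 4, 5, 6, 7, 8]        # independent variable
exam_scores = [55, 60, 62, 70, 72, 78, 85, 88]  # dependent variable

plt.scatter(hours_studied, exam_scores)
plt.xlabel("Hours studied")
plt.ylabel("Exam score")
plt.show()
# The upward drift of the dots suggests a fairly strong positive linear relationship.
```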

Bar Graphs

A bar graph is a graph that uses vertical bars to represent the data. Notice that the x-axis of the bar graph represents the variable called "College Major" and the y-axis represents "Frequency of occurrence." The bars provide graphical representations of the frequencies of the three different college majors.
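
A minimal sketch of a bar graph with matplotlib; the majors match the example wording above, but the frequency counts are made up.

```python
# Bar graph of a categorical variable with made-up frequencies.
import matplotlib.pyplot as plt

majors = ["Education", "Psychology", "Business"]
frequencies = [12, 20, 8]

plt.bar(majors, frequencies)
plt.xlabel("College Major")
plt.ylabel("Frequency of occurrence")
plt.show()
```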

inductive codes

codes that are generated by the researcher by directly examining the data during the coding process. Inductive codes can be based on emic terms (terms that are used by the participants themselves).

Analysis of variance (ANOVA)

compare two or more group means (categorical independent variable; quantitative dependent variable)
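
The text assumes the analysis is run in SPSS; as an illustration only, here is a minimal sketch of a one-way ANOVA in Python with scipy, using made-up group scores.

```python
# One-way ANOVA comparing three group means (made-up scores).
from scipy import stats

group_a = [80, 85, 78, 90, 88]
group_b = [70, 75, 72, 68, 74]
group_c = [82, 79, 85, 81, 84]

f_statistic, p_value = stats.f_oneway(group_a, group_b, group_c)
print(f_statistic, p_value)   # reject the null hypothesis of equal means if p_value <= .05
```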

t test for correlation coefficients

determine whether an observed correlation coefficient is statistically significant (quantitative independent and dependent variables)
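
As an illustration (the text assumes SPSS), a minimal sketch using scipy's pearsonr, which returns both the correlation coefficient and its probability value; the GPA and SAT values are hypothetical.

```python
# Test whether an observed correlation is statistically significant (made-up data).
from scipy import stats

gpa = [2.1, 2.8, 3.0, 3.2, 3.5, 3.9]
sat = [900, 1000, 1050, 1100, 1200, 1350]

r, p_value = stats.pearsonr(gpa, sat)
print(r, p_value)   # reject the null hypothesis of zero correlation if p_value <= .05
```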

segmenting

involves dividing qualitative data into meaningful analytical units. When segmenting text data (such as the transcript from an interview or notes from observations), it is important to read the text line by line and continually ask the following kinds of questions: Do I see a segment of text that has a specific meaning that might be important for my research study? Is this segment different in some way from the text coming before and after it? Where does this segment start and end? A meaningful unit (i.e., segment) of text can be a word, a single sentence, or several sentences, or it might include a larger passage such as a paragraph or even a complete document. The segment of text must have meaning that the researcher thinks should be documented.

Point Estimation

is defined as the use of the value of a sample statistic as the estimate of the value of a population parameter. You might use the sample mean to estimate the population mean, the sample percentage to estimate the population percentage, or the sample correlation to estimate the population correlation. The specific value of the statistic is called the point estimate, and it is the estimated value of the population parameter. The point estimate is your best guess about the likely value of the unknown population parameter. An insight from our earlier study of sampling distributions is that the value of a statistic varies from sample to sample. That is why a point estimate will rarely equal the population parameter exactly. Because of the presence of sampling error, many researchers recommend the use of interval estimation.
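
A minimal sketch showing why point estimates vary from sample to sample, using a simulated population; all numbers are made up.

```python
# Two samples from the same simulated population give two different point estimates.
import random

random.seed(1)
population = [random.gauss(100, 15) for _ in range(10000)]   # population mean is about 100

sample_1 = random.sample(population, 50)
sample_2 = random.sample(population, 50)

print(sum(sample_1) / len(sample_1))   # point estimate of the population mean from sample 1
print(sum(sample_2) / len(sample_2))   # a somewhat different point estimate from sample 2
# Sampling error is why interval estimation (a confidence interval) is often preferred.
```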

skewed

one tail is stretched out longer than the other tail, making the distribution asymmetrical.

Median

or 50th percentile, is the middle point in a set of numbers that has been arranged in order of magnitude (either ascending order or descending order).
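
A minimal sketch with made-up scores:

```python
# The median is the middle point of the ordered scores.
import statistics

scores = [7, 3, 9, 5, 1]
print(sorted(scores))             # [1, 3, 5, 7, 9]
print(statistics.median(scores))  # 5
```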

normal distribution

or normal curve, is a unimodal, symmetrical distribution that is the theoretical model used to describe many physical, psychological, and educational variables. It is said to be bell shaped, because the curve is highest at the center and tapers off as you move away from the center. The height of the curve shows the frequency or density of the data values. Now, the most important characteristic of the normal distribution is that the mean, the median, and the mode are the same number.
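
A minimal sketch that simulates draws from a normal distribution and checks that the mean and median fall in roughly the same place (the theoretical distribution also has its mode at that same center); the parameters 100 and 15 are arbitrary.

```python
# Simulated normal data: mean and median land in roughly the same place.
import random
import statistics

random.seed(0)
values = [random.gauss(100, 15) for _ in range(100000)]

print(statistics.mean(values))    # close to 100
print(statistics.median(values))  # also close to 100
```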

Graphs

pictorial representations of data in two-dimensional space. Many graphs display the data on two dimensions or axes. These two axes are the x- and y-axes, where the x-axis is the horizontal dimension and the y-axis is the vertical dimension. When graphing the data for a single variable, the values of this variable are represented on the x-axis, and the frequencies or percentages are represented on the y-axis. If two variables are being graphed, the values of the independent variable are put on the x-axis, and the values of the dependent variable are put on the y-axis. Graphs can also be constructed for more than two variables.

Null Hypothesis

represented by the symbol H0, is a statement about a population parameter and states that some condition concerning the population parameter is true. In most educational research studies, the null hypothesis (H0) predicts no difference or no relationship in the population. Please remember this key point: Hypothesis testing operates under the assumption that the null hypothesis is true. Then, if the results obtained from the research study are very different from those expected under the assumption that the null hypothesis is true, the researcher rejects the null hypothesis and tentatively accepts the alternative hypothesis. Again, the null hypothesis is the focal point in hypothesis testing because it is the null hypothesis, not the alternative hypothesis, that is tested directly. The null hypothesis is the hypothesis that the researcher hopes to be able to nullify by conducting the hypothesis test.

Alternative Hypothesis

represented by the symbol H1, states that the population parameter is some value other than the value stated by H0. The alternative hypothesis asserts the opposite of the H0 and usually represents a statement of a difference between means or a relationship between variables. The null and alternative hypotheses are logically contradictory because they cannot both be true at the same time. If hypothesis testing allows the researcher to reject the null hypothesis, then the researcher can tentatively accept the alternative hypothesis. The alternative hypothesis is almost always more consistent with the researcher's research hypothesis; therefore, the researcher hopes to support the alternative hypothesis, not the null hypothesis. The null hypothesis is like a "means to an end." The researcher has to use the null hypothesis because that is what must be stated and tested directly in statistics; several examples of research questions, null hypotheses, and alternative hypotheses are provided in the table below.
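
As an illustration of the logic (the text assumes SPSS produces the probability value), a minimal sketch of a one-sample t test in Python with scipy, using made-up scores, a hypothetical null value of 70, and a .05 significance level.

```python
# One-sample t test of H0: population mean = 70 versus H1: population mean != 70.
from scipy import stats

sample_scores = [74, 78, 69, 81, 77, 73, 80, 76]   # made-up sample
significance_level = 0.05

t_statistic, p_value = stats.ttest_1samp(sample_scores, popmean=70)

if p_value <= significance_level:
    print("Reject H0; tentatively accept H1")
else:
    print("Fail to reject H0")
```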

Range

simply the difference between the highest and lowest numbers. In formula form, the range is the highest (i.e., largest) number minus the lowest (i.e., smallest) number in a set of numbers: Range = H − L, where H is the highest number and L is the lowest number.
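
A minimal sketch with made-up scores:

```python
# Range = highest number minus lowest number.
scores = [55, 61, 73, 80, 94]
print(max(scores) - min(scores))   # 94 - 55 = 39
```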

t test for regression coefficients

test each regression coefficient for statistical significance (categorical or quantitative independent variables; quantitative dependent variable)
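
As an illustration for the simple one-predictor case (the text assumes SPSS and allows multiple predictors), a minimal sketch using scipy's linregress, which reports the slope coefficient and its probability value; the data are hypothetical.

```python
# Simple regression: test whether the slope coefficient differs from zero (made-up data).
from scipy import stats

hours_of_tutoring = [1, 2, 3, 4, 5, 6, 7, 8]
reading_scores = [52, 55, 61, 60, 68, 70, 74, 79]

result = stats.linregress(hours_of_tutoring, reading_scores)
print(result.slope, result.pvalue)   # reject the null of a zero coefficient if pvalue <= .05
```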

Mean

the arithmetic average, commonly called simply the average. In the formula X̄ = (∑X)/n, the symbol X stands for the variable whose observed values are 1, 2, and 3 in our example. The symbol ∑ (the Greek letter sigma) means "sum what follows." Therefore, the numerator (the top part) in the formula says "sum the X values." The n in the formula stands for the number of numbers. The average is obtained by summing the observed values of the variable and dividing that sum by the number of numbers.
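
A minimal sketch using the 1, 2, 3 example mentioned above:

```python
# Mean = (sum of the X values) / n
x_values = [1, 2, 3]
mean = sum(x_values) / len(x_values)   # (1 + 2 + 3) / 3
print(mean)                            # 2.0
```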

Mode

the most frequently occurring number in a set of data. Bimodal: two modes; multimodal: multiple modes.
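
A minimal sketch with made-up numbers, using Python's statistics module (multimode returns every most frequent value, so it handles bimodal data):

```python
# Mode with made-up data; multimode handles bimodal sets.
import statistics

unimodal = [2, 3, 3, 4, 5]
bimodal = [2, 2, 3, 5, 5]

print(statistics.mode(unimodal))       # 3
print(statistics.multimode(bimodal))   # [2, 5]
```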

Descriptive statistics starts with a data set

the researcher attempts to convey the essential characteristics of the data by arranging it into a more interpretable form (e.g., by forming frequency distributions and generating graphical displays) and by calculating numerical indexes, such as averages, percentile ranks, and measures of spread. The researcher can summarize the variables in a data set one at a time, as well as examine how the variables are interrelated (e.g., by examining correlations). The key question in descriptive statistics is how researchers can communicate the essential characteristics of the data.
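
A minimal sketch of these steps with pandas on a small made-up data set: a frequency distribution for a categorical variable, summary statistics for a quantitative variable, and a correlation between two variables.

```python
# Summarizing a small made-up data set.
import pandas as pd

data = pd.DataFrame({
    "major": ["Education", "Psychology", "Education", "Business", "Psychology"],
    "gpa":   [3.1, 3.5, 2.9, 3.8, 3.4],
    "sat":   [1000, 1150, 980, 1300, 1120],
})

print(data["major"].value_counts())   # frequency distribution for one variable
print(data["gpa"].describe())         # mean, spread, and percentiles
print(data["gpa"].corr(data["sat"]))  # how two variables are interrelated
```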

measure of central tendency

the single numerical value that is considered the most typical of the values of a quantitative variable. For example, if someone asked a teacher how well his or her students did on their last exam, using a measure of central tendency would provide an indication of what a typical score was.

interim analysis

this cyclical or recursive process of collecting data, analyzing the data, collecting additional data, analyzing those data, and so on throughout the research project is called interim analysis. Interim analysis is used in qualitative research because qualitative researchers usually collect data over an extended time period and they continually need to learn more and more about what they are studying during this time frame. In other words, qualitative researchers use interim analysis to develop a successively deeper understanding of their research topic and to guide each round of data collection. This is a strength of qualitative research. By collecting data at more than one time, qualitative researchers are able to get data that help refine their developing theories and test their inductively generated hypotheses (i.e., hypotheses developed from examining their data or developed when they are in the field). Grounded theorists use the term theoretical saturation to describe the situation in which understanding has been reached and there is no current need for more data. Refer to Figure 15.1 for the qualitative data-collection process.

