CHP 10: DNA Polymorphisms and Human Identification
Combined DNA Index System (CODIS)
13 core loci that are used for identification
Linkage Disequilibrium
A lack of separation of inheritance of the rare disease and the haplotype. (See it as they are linked.) Ex: 17 members of a population of 1,000 people have a rare disease, AND all 17 people have the same haplotype at a particular genetic location on chromosome 3.
Split Chimerism
A more defined condition can be uncovered by cell type separation. Some cell fractions, such as granulocytes, engraft before others. For example, isolated granulocytes may show full chimerism, whereas the T-cell fraction still shows mixed chimerism. This is a case of split chimerism.
Subpopulations
Allele frequencies differ between subpopulations or ethnic groups. Diff allele frequencies in subpopulations are determined through the study of each ethnic group. Table 10.3 pg. 272 illustrates the differences in the polymorphic nature of alleles in diff subpopulations. It appears there are more polymorphisms in Hispanic Americans than in African Americans, and even more than in White Americans.
Analysis of Test Results
Analysis of polymorphisms at multiple loci results in very high levels of discrimination. Discovery of the same set of alleles from different sources, or shared alleles between allegedly related individuals, is strong evidence of identity, paternity, or relatedness. Results from such studies must be expressed in terms of the background probability of chance matches.
STR (Short tandem repeat) Typing by PCR
HLA-DQ used to be used --> then the PM system; the STR system is used now, with specialized versions such as mini-STR and Y-STR (the 3 types of STR differ in the primers used). The first commercial typing test based on PCR specifically for forensic use was HLA DQ alpha, now called DQA1, made in 1986. This system could distinguish 28 DQA1 types. With the addition of the Polymarker (PM) system, the analyst could type five additional genetic markers. The PM system is a set of primers complementary to sequences flanking STRs, or microsatellites. STRs are similar to VNTRs (minisatellites) but have repeat units of 1 to 7 bp. (The upper limit of repeat unit size for STR varies from 7 to 10 bp depending on diff texts and reports.) Because of the increased power of discrimination and ease of use of STR, the HLA DQA forensic DNA amplification and typing kit was discontinued. Primers are designed to amplify short regions containing the tandem repeats. Allelic ladders consisting of all alleles in the human population are used to determine the number of repeats in the locus by the size of the amplicon. Example: STR TH01 (repeat unit TCAT) is linked to the human tyrosine hydroxylase gene on chromosome 11p15.5. One individual has 7 repeats on allele 1 and 8 repeats on allele 2; thus this person is heterozygous for TH01 with a genotype of 7/8. So each band on the gel is an allele. STR alleles are identified by PCR product size. Primers are designed to produce amplicons of 100 to 400 bp in which the STRs are embedded. If one of each primer pair is labeled with a fluorescent marker, the PCR product can be analyzed in fluorescent detection systems. Silver-stained gels may also be used; however, capillary electrophoresis with fluorescent dyes is the preferable method, especially for high-throughput requirements. At least 8 to more than 20 loci are usually included in STR applications. Allele population frequency has the most limiting effect on inclusion. Mutations, or gains or losses of repeats, do not have as much of an effect.
In abnormal cells with genetic instability, such as cancer cells, gain or loss of repeats can occur frequently enough to affect the identification of genotypes. Microsatellite markers, or short tandem repeats (STRs): Occasionally, STRs contain repeat units with altered sequences, or microvariants, which are repeat units missing one or more bases of the repeat. These differences have arisen through mutation or recombination events. In contrast to VNTRs, the smaller STRs are efficiently amplified by PCR, easing specimen demands significantly. Long, intact DNA fragments are not required to detect the STR products; therefore, degraded or otherwise less-than-optimal specimens are potentially informative. The amount of specimen required for STR analysis by PCR is reduced from 1 µg to 10 ng, a key factor for forensic analysis. Takes 24-48 hrs. Careful design of primers and amplifications facilitated multiplexing and automation of the process. Mini-STR - these STRs are amplified with PCR primers located closer to the tandem repeat than in standard STR. Compared with standard STR products, the small amplicons are more efficiently produced from such challenging starting material as fixed tissue and degraded specimens. Y-STR - developed for surname testing and forensic identification of male offenders or victims. This primer set only amplifies STRs located on the Y chromosome. There is only one allele at each locus, and because the Y chromosome is inherited as a single haplotype, paternally related men share all Y loci. Surname testing: When two males share a surname, a test of their Y-chromosome markers will determine either that they are not related or that they are related.
RFLP and Parentage Testing
In diploid organisms, chromosomal content is inherited half from each parent. This includes the DNA polymorphisms located throughout the genome. Taking advantage of the unique combination of RFLPs in each individual, one can infer a parent's contribution of alleles to a child from the combination of alleles in the child and those of the other parent. The fragment sizes of an individual are a combination of those from each parent. The alleles in the child will be a combination of one allele from each parent. In a paternity test, the alleles or fragment sizes of the offspring and the mother are analyzed. The remaining fragments (ones that do not match the mother) have to come from the father. Alleged fathers are identified based on the ability to provide the remaining alleles (inclusion). Aside from possible mutations, a difference in just one allele may exclude paternity. A parentage test requires analysis of at least eight loci. The more loci tested, the higher the probability of a positive identification of the father.
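A minimal sketch of the inclusion/exclusion logic at one locus, assuming genotypes are written as pairs of allele labels (repeat numbers or fragment sizes). The function names and the example genotypes are hypothetical, and mutation is ignored here.

```python
def obligate_paternal_alleles(child, mother):
    """Return the child alleles that could have come from the father."""
    child_set, mother_set = set(child), set(mother)
    if not (child_set & mother_set):
        raise ValueError("child shares no allele with the mother at this locus")
    if len(child_set) == 1:
        # Homozygous child: the father must carry that same allele.
        return child_set
    unmatched = child_set - mother_set
    if unmatched:
        # The allele the mother cannot supply must be paternal.
        return unmatched
    # Mother carries both child alleles: either one could be paternal.
    return child_set


def father_included(child, mother, alleged_father):
    """True if the alleged father can provide a remaining (paternal) allele."""
    return bool(obligate_paternal_alleles(child, mother) & set(alleged_father))


# Hypothetical genotypes at a single locus.
print(obligate_paternal_alleles((7, 8), (7, 9)))   # {8}
print(father_included((7, 8), (7, 9), (8, 10)))    # True  (inclusion)
print(father_included((7, 8), (7, 9), (6, 9)))     # False (possible exclusion)
```

A real test repeats this over at least eight loci, as the card says, and a single mismatch still has to be weighed against the mutation rate.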
Legal Identity (CODIS)
Legal Identity (CODIS) is based on 12 core loci plus amelogenin, with recent recommendations for additional loci to be added. (So the answer is really 13 core loci according to the HCL m.c. quiz.) Five loci are not sufficient for legal identification. Nonmatching alleles at any of these loci may support exclusion.
Matching with Y-STRs
Matching probabilities from Y-STR data are determined differently than for the autosomal STR. Haplotype diversity (HD) is calculated from the freq of occurrence of a given haplotype in a tested pop. The prob of two random males sharing the same haplotype is estimated at 1-HD; that is, if the haplotype diversity is high, the probability of two random males in the pop having the same haplotype is low. Discriminatory capacity (DC) is another measure of profile uniqueness. It is determined by the number of diff haplotypes seen in the tested pop and the total number of samples in the population. DC expresses the percentage of males in a population who can be identified by a given haplotype. Just as the number of loci included in an autosomal STR genotype increases the power of discrimination, DC is increased by increasing the number of loci defining a haplotype. For instance, six loci tested can distinguish 82% of African American males. Using 22 loci raises the DC to almost 99%.
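A rough sketch of the two measures, assuming each haplotype is recorded as a tuple of repeat numbers. DC is taken as the number of different haplotypes divided by the number of samples, as described in the card. The HD formula used here, n/(n-1) × (1 - Σ p_i²), is the commonly used unbiased estimator and is an assumption, since the card does not spell the formula out. The sample data are made up.

```python
from collections import Counter


def discriminatory_capacity(haplotypes):
    """Number of different haplotypes divided by the number of samples."""
    return len(set(haplotypes)) / len(haplotypes)


def haplotype_diversity(haplotypes):
    """Assumed estimator: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two samples")
    counts = Counter(haplotypes)
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical three-locus Y haplotypes from five males (two share a haplotype).
sample = [(14, 13, 29), (14, 13, 29), (15, 12, 28), (16, 13, 30), (14, 14, 31)]

dc = discriminatory_capacity(sample)   # 4 different haplotypes / 5 samples
hd = haplotype_diversity(sample)       # about 0.9 for this sample
print(f"DC = {dc:.0%}, HD = {hd:.2f}, random match probability = {1 - hd:.2f}")
```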
Genetic Mapping with RFLPs
Polymorphisms are inherited in a Mendelian fashion, and the locations of many polymorphisms in the genome are known. Therefore, polymorphisms can be used as landmarks in the genome to determine the location of other genes. In addition to showing clear family history or direct identification of a genetic factor, one can confirm that a disease has a genetic component by demonstrating a close genetic association or linkage to a known marker. Formal statistical methods are used to determine the probability that an unknown gene is located close to a known marker in the genome. TLDR: The more frequently a particular polymorphism is present in persons with a disease phenotype, the more likely an affected gene is located close to the polymorphism. This is the basis for linkage mapping and one of the ways genetic components of disease are identified. EX: Mary-Claire King used RFLP to find the location of the BRCA1 gene.
Other identification Methods
Protein-Based Identification and Epigenetic Profiles
Post-Transplant Engraftment Testing
Quantification of the percentage of recipient and donor cells post-transplant is performed using the informative locus or loci selected during the pretransplant informative analysis. The raw data for these calculations are the peak heights or areas under the peaks generated by the PCR products after amplification. Peaks are generated by the emission from the fluorescent dyes attached to the primers, and thus to the ends of the PCR products, collected as each product migrates past the detector. The fluorescent signal is converted into fluorescence units by the computer software. The software displays the PCR products as peaks of fluorescence units (y-axis) versus migration speed (x-axis). The amount of fluorescence in each product or peak, represented as the height or area under the peak, is used to calculate the percentage of recipient and donor cells. Analysis pg. 287 FIG 10.23: Post-engraftment analysis of an informative locus, D16S539. The area (fluorescence units) under the peaks is calculated automatically. The recipient and donor patterns are shown in the first and second traces. Results from the whole blood and T-cell fraction are in the third and fourth traces. The formula R(unshared)/[R(unshared) + D(unshared)] × 100 yields 4,616/(4,616 + 16,413) × 100 = 22% recipient cells in the unfractionated blood and 516/(516 + 15,608) × 100 = 3.2% recipient cells in the T-cell fraction. Because cell lineages engraft with diff kinetics, testing of blood and bone marrow may yield diff levels of chimerism. Bone marrow will contain more myeloid cells, and blood will contain more lymphoid cells.
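The two percentages in this card can be checked with a short sketch, assuming the peak areas are the unshared recipient and donor peaks at one informative locus. The function name is made up for illustration.

```python
def percent_recipient(recipient_area, donor_area):
    """R(unshared) / [R(unshared) + D(unshared)] x 100."""
    total = recipient_area + donor_area
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return 100.0 * recipient_area / total


# Peak areas quoted in the card for locus D16S539.
whole_blood = percent_recipient(4616, 16413)
t_cells = percent_recipient(516, 15608)

print(f"whole blood: {whole_blood:.0f}% recipient")   # 22%
print(f"T-cell fraction: {t_cells:.1f}% recipient")   # 3.2%
```

Percent donor is simply 100 minus the recipient value when no alleles are shared.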
Engraftment Analysis of Cellular Subsets ■T cells (CD3), NK cells (CD56), granulocytes, myeloid cells (CD13, CD33), myelomonocytic cells (CD14), B cells (CD19), stem cells (CD34) Methods ■Flow cytometric sorting ■Immunomagnetic cell sorting ■Immunohistochemistry + XY FISH The first determination to be made from engraftment testing is whether donor engraftment has occurred, and secondly whether there is split chimerism. In split chimerism, cell separation techniques may be used to determine which lineages are mixed and which are fully donor. ■Detection of different levels of engraftment in cellular subsets is split chimerism.
Detecting Restriction Fragment Length Polymorphisms RFLP (on study guide)
SNPs, larger sequence variants, and tandem repeats can be detected by observing changes in the restriction map of a DNA region. Analysis of restriction fragments by Southern blot reveals RFLPs. Particular types of polymorphisms specifically SNPs, VNTRs, STRs, and RFLPs are routinely used in the lab. "Restriction fragment length polymorphisms, or RFLPs, are differences among individuals in the lengths of DNA fragments cut by enzymes" pg. 262 table explaining diff of each https://www.quora.com/What-are-the-differences-between-RFLPs-STRs-and-SNPs
Guidelines for use of mtDNA for identification purposes
SWGDAM and ISFG have recommended guidelines for the use of mtDNA from bone, teeth, or hair for identification purposes. 1. The process begins with visual inspection of the specimen. Bone or teeth specimens are examined and ascertained to be of human origin. For hair samples, the hairs are examined microscopically and compared with hairs from a known source. Sequencing is performed only if the specimen meets the criteria of origin and visual matching to the reference source. 2. Before DNA isolation, the specimens are cleaned with detergent or, for bone or teeth, by sanding to remove any possible source of extraneous DNA adhering to the specimen. 3. The cleaned specimen is then ground in an extraction solution. Hair shafts yield mtDNA, as does the fleshy pulp of teeth or bone. The dentin layer of old tooth samples will also yield mtDNA. DNA is isolated by organic extraction and amplified by PCR. 4. The PCR products are then purified and subjected to dideoxy sequencing. A positive control of a known mt sequence is included with every run, along with a reagent blank for PCR contamination and a negative control for contamination during the sequencing reaction. If the negative or reagent blank controls yield sequences similar to the specimen seq, the results are rejected. Both strands of the specimen PCR product must be sequenced. 5. Raw mt seq data are imported into a software program for analysis. With the seq software, the heavy-strand sequences should be reverse-complemented so that the bases are aligned in the light-strand orientation for strand comparison and base designation. Analysis: Occasionally, more than one mtDNA population is present in the same individual. This is called heteroplasmy. In point heteroplasmy, two DNA bases are observed at the same nt position. Length heteroplasmy is typically a variation in the number of bases in tracts of like bases (homopolymeric tracts, e.g. CCCCC). A length variant alone cannot be used to support an interpretation of exclusion.
In general, if two or more nt differences occur between a reference and a test sample, the test sample can be excluded as originating from the reference or a maternally related person. One nt diff between the samples is interpreted as an inconclusive result. If the test and reference samples show seq concordance, then the test specimen cannot be excluded as coming from the same individual or maternal relative as the source. The conclusion that an individual can or cannot be eliminated as a possible source of mtDNA is reached under conditions defined by each individual laboratory. In addition, evaluation of cases in which heteroplasmy may have occurred is lab defined. The mtDNA profile of a test sample can also be searched in a population database. Population databases, such as the mtDNA population database and CODIS, are used to assess the weight of forensic evidence, based on the number of different mitochondrial sequences previously identified. The quality of seq information used and submitted for this purpose is extremely important. Based on the number of known mtDNA sequences, the probability of seq concordance in two unrelated individuals is estimated at 0.003; the probability that two unrelated individuals will differ by a single base is 0.014. MtDNA analysis is also used for lineage studies and to track population migrations. As with the Y chromosome, there is no recombination between mitochondria, and polymorphisms arise mostly through mutation. The location and divergence of specific sequences in the HV regions of mt are a historical record of the relatedness of populations. Because mt are naturally amplified (hundreds per cell and tens of circular genomes per mt) and because of the nuclease- and damage-resistant circular nature of the mtDNA, mtDNA typing has been a useful complement to other types of DNA identification. It is useful for identification of missing persons in mass disasters or for typing ancient specimens.
MtDNA typing can also be applied to quality assurance issues as described for STR typing of pathology specimens.
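The interpretation rule in this card (two or more nt differences, one difference, or concordance) can be sketched as follows. This assumes the two sequences are already aligned to the same length; it deliberately ignores heteroplasmy and length variants, which the card says are handled by lab-defined criteria. The function name and the example sequences are made up.

```python
def interpret_mtdna(reference, test):
    """Classify a test sequence against a reference by counting nt differences."""
    if len(reference) != len(test):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for ref_nt, test_nt in zip(reference, test) if ref_nt != test_nt)
    if differences >= 2:
        return "exclusion"
    if differences == 1:
        return "inconclusive"
    return "cannot exclude"


# Hypothetical short aligned fragments, not real HV-region sequence.
reference = "ACCTGTAGCA"
print(interpret_mtdna(reference, "ACCTGTAGCA"))  # cannot exclude
print(interpret_mtdna(reference, "ACCTGTAGCG"))  # inconclusive
print(interpret_mtdna(reference, "ATCTGTAGCG"))  # exclusion
```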
STR Analysis
TLDR: STR analysis allows accurate identification of sample alleles using primers with PCR and then gel electrophoresis. To identify STR alleles, test DNA is mixed with the primer pairs, buffer, and polymerase to amplify the test loci. A control DNA standard is also amplified, as well as a sensitivity control if the relative allele percentage in a mixture will be calculated. Following amplification, each sample PCR product is combined with allelic ladders (sets of fragments representing all possible alleles of a repeat locus) and internal size standards (molecular weight markers) in formamide for electrophoresis. After electrophoresis, detection and analysis software will size and identify the alleles based on co-migration (same size) with specific alleles in the allelic ladders. In contrast to RFLPs and VNTRs, STRs are discrete allele systems in which a finite number of alleles is defined by the number of repeat units in the tandem repeat. Multiple STRs can be resolved on a single gel. You just need the allelic ladders to show that the ranges of potential amplicon sizes do not overlap, so as to allow resolution of multiple loci in the same lane. The bands will provide the alleles present at that locus (two bands if heterozygous and one band if homozygous) and thus the genotype, I believe. Theoretically, the minimal sample requirement for PCR analysis is a single cell. A single cell has 6 pg of DNA. There are 3 billion bp in one copy of the human genome, so a diploid cell really has 6 billion bp, as it has two copies. Advances in fluorescence technology have increased the ease/sensitivity of STR allele identification. Although capillary electrophoresis is faster and more automated than gel electrophoresis, a single run through a capillary of single dye-labeled products can resolve only loci whose allele ranges do not overlap. The number of loci that can be resolved on a single run was increased by the use of multicolor dye labels.
Primer sets labeled with dyes that can be distinguished by their emission wavelength generate products that are resolved according to fluorescent color as well as size. Test DNA amplicons, allelic ladders, and size standards for multiple loci are thus run simultaneously through each capillary. Genotyping software provides automated resolution of fluorescent dye colors and genotyping by comparison with the size standards and the allelic ladder. STR analysis by capillary gel electrophoresis: Instead of bands on a gel, peaks of fluorescence on an electropherogram reveal the PCR product sizes. Alleles are determined by comparison with allelic ladders representing all possible alleles (from 5 to 11 repeats) for this locus, run through the capillary simultaneously with the sample amplicons. By labeling primers with diff fluorescent dye colors, STRs with overlapping size ranges can be resolved by color. As in RFLP testing, an STR "match" is made by comparing profiles (alleles at all loci tested) followed by probability calculations. The HLA DQ in conjunction with the PM system generated highly discriminatory allele frequencies. Extra info: Methods of DNA typing for identity, parentage, and family relationships -RESTRICTION FRAGMENT LENGTH POLYMORPHISM (RFLP) ANALYSIS. ... -POLYMERASE CHAIN REACTION (PCR). ... -PARENTAGE AND FAMILY RELATIONSHIP. Consequently, there is a class of RFs that differ in the number of repeated segments present. Some VNTR polymorphisms have a small number of alleles, and the patterns of RFs that represent each of the alleles at a given locus can be readily distinguished. But highly polymorphic VNTR loci have 50-100 alleles or even more. In that situation, the distribution of RF size is essentially continuous; alleles with RFs close in size might not be resolvable with electrophoresis, and the limit of resolution must be defined operationally. https://www.ncbi.nlm.nih.gov/books/NBK234533/
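A minimal sketch of how software might call alleles by comparing sample peak sizes with an allelic ladder. Everything numeric here is an assumption for illustration: the ladder sizes (a 4-bp repeat with 5 to 11 repeats, starting at a made-up 150 bp), the sample peaks, and the 0.5 bp matching window. Real genotyping software uses its own sizing and binning rules.

```python
def call_allele(peak_size_bp, ladder, tolerance_bp=0.5):
    """Return the ladder allele closest to the peak, or None if off-ladder."""
    allele, ladder_size = min(ladder.items(), key=lambda item: abs(item[1] - peak_size_bp))
    if abs(ladder_size - peak_size_bp) <= tolerance_bp:
        return allele
    return None  # no ladder allele within the window (e.g. a microvariant)


def call_genotype(peak_sizes_bp, ladder, tolerance_bp=0.5):
    """Call every peak; one peak is read as a homozygous genotype."""
    alleles = sorted(call_allele(size, ladder, tolerance_bp) for size in peak_sizes_bp)
    if len(alleles) == 1:
        alleles = alleles * 2
    return tuple(alleles)


# Hypothetical ladder: repeat number -> amplicon size in bp.
ladder = {repeats: 150 + 4 * (repeats - 5) for repeats in range(5, 12)}

print(call_genotype([158.2, 162.1], ladder))  # (7, 8), heterozygous
print(call_genotype([166.0], ladder))         # (9, 9), homozygous
print(call_allele(160.0, ladder))             # None, falls between ladder alleles
```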
Hardy-Weinberg Equilibrium
The certainty of a matching pattern increases with decreased frequency of alleles in the general population. Under defined conditions, the relative frequency of two alleles in a population remains constant. This is Hardy-Weinberg equilibrium. The population frequencies of two alleles, p and q, can be expressed as p^2 + 2pq + q^2 = 1. This equilibrium assumes a large population with random mating and no immigration, emigration, mutation, or natural selection.
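A small worked example of the equation, assuming a locus with only two alleles so that q = 1 - p. The allele frequency 0.7 and the genotype labels are made up.

```python
def hardy_weinberg(p):
    """Expected genotype frequencies for a two-allele locus with q = 1 - p."""
    if not 0 <= p <= 1:
        raise ValueError("allele frequency must be between 0 and 1")
    q = 1 - p
    return {"pp": p * p, "pq": 2 * p * q, "qq": q * q}


freqs = hardy_weinberg(0.7)
for genotype, value in freqs.items():
    print(f"{genotype}: {value:.2f}")        # pp: 0.49, pq: 0.42, qq: 0.09
print(f"sum: {sum(freqs.values()):.2f}")     # sum: 1.00
```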
Y-STR
Unlike conventional STRs (autosomal STRs), where each locus is defined by two alleles, one from each parent, Y-STRs are represented only once per genome and only in males. A set of Y-STR alleles comprises a haplotype, a series of linked alleles always inherited together. This is because the Y chromosome cannot extensively exchange info (recombine) with the X chromosome or another Y chromosome. Thus marker alleles on the Y chromosome are inherited from generation to generation in a single block. This means that the frequency of entire Y-STR profiles (haplotypes) in a given pop can be determined by empirical studies. The discrimination power of Y-haplotype testing will depend on the number of subjects tested and will be less definitive than that of autosomal STR. Except for rare mutation events, every male member of a family (brothers, uncles, cousins, and grandfathers) will have the same Y-chromosome haplotype. Thus Y-chromosome inheritance can be applied to lineage, population, and human migration studies. The Y-STR/paternal lineage test can determine whether two or more males have a common paternal ancestor. In addition to family history studies, the results of a paternal lineage test serve as supportive evidence for adoptees and their biological relatives or for individuals filing inheritance and benefit claims. Because Y chromosomes are inherited intact, spontaneous mutations in the DNA seq of the Y chromosome are used to follow human migration patterns and historical lineages. Y-chromosome genotyping has been used to locate the geographical origin of populations. Because all male relatives in a family will share the same allele combination or profile, the statistical significance of a Y-STR DNA match cannot be assessed by multiplying likelihood ratios as with autosomal STR. Instead of the allele freq used in autosomal STR match calculations, haplotype frequencies are used. Estimation of haplotype frequencies, however, is limited by the number of known Y haplotypes.
This smaller data set accounts for the reduced inclusion probabilities and a discrimination rate that is significantly lower than that for autosomal STR polymorphisms. Traditional STR loci are, therefore, preferred for identity or relationship analyses, and Y-STRs are used to aid in special situations, such as confirming sibship between two males who share commonly occurring alleles, that is, have a low likelihood ratio based on traditional STRs. Y-STRs have been utilized in forensic tests where the evidence consists of a mix of male and female DNA, such as semen, saliva, other body secretions, or fingernail scrapings. For example, in specimens from the evidence of sexual assault, the female DNA may be in vast excess (more than 100-fold) compared to the male DNA in the sample. Autosomal STRs are not consistently informative under these circumstances. Using Y-specific primers, Y-STRs can be specifically amplified by PCR from the male-female mix, resulting in an analyzable marker that has no female background. So the ID of the male donor is more accurate. The Y chromosome has a low mutation rate. The overall mutation rate for Y chromosome loci is estimated at 7.4 X 10^-10 mutations per position per year. Assuming that Y-chromosome mutations generally occur once every 500 generations/locus, for 25 loci, 1 locus should have a mutation every 20 generations (500 generations/25 markers = 20 generations). Useful for missing persons cases in which reference samples can be obtained from paternally related males. Several Y-STRs are located in regions that are duplicated on the Y chromosome. Ex: DYS389I and DYS389II. Like autosomal STRs, Y-STRs have microvariant alleles containing incomplete repeats and alleles containing repeat sequence differences. BOOK PPT CHP 10: ■The Y chromosome is inherited in a block without recombination. ■STRs on the Y chromosome are inherited paternally as a haplotype. ■Y haplotypes are used for exclusion and paternal lineage analysis.
■Formula for calculation of % recipient or % donor (no shared alleles).
% Recipient DNA = A(R) / [A(R) + A(D)] × 100; % Donor DNA = A(D) / [A(R) + A(D)] × 100
Allele frequency example
500 people from a population of 1,000 people have the same SNP on chromosome 3
Chimerism vs mosaicism
Chimera - individual carrying two populations of cells that arose from diff zygotes. Mosaic - cells arising from the same zygote have undergone a genetic event, resulting in two clones of phenotypically diff cells in the same individual.
Tandem Repeat
Direct repeat of 1 to more than 100 nts in length. A gain or loss of repeat units forms a diff allele. Diff alleles are detected as variations in fragment size on digestion with RE.
Why are SNPs superior to STR and RFLP for mapping and association studies?
SNPs are more numerous in the genome than STRs and RFLPs and therefore offer higher resolution and mapping of precise genome locations.
Stated many times in the chapter!
To establish the identity of an individual by an allele of a locus, the chance that the same allele could arise in the population randomly is taken into account.
Comparing genotypes in a database
When comparing genotypes with those in a database looking for a match, it is important to consider whether the database is representative of a population or subpopulation, because allele frequencies will be diff in different defined groups. Also consider whether the population is homogeneous (a random mixture) with respect to the alleles tested.
Genetic Concordance
A matching genotype is not necessarily an absolute determination of the identity of an individual. Genetic concordance is a term used to express the situation where all locus genotypes (alleles) from two sources are the same. Concordance is interpreted as inclusion of a single individual as the donor of both genotypes. Two samples are considered diff if at least one locus genotype differs (exclusion). An exception is paternity testing, in which mutational events may generate a new allele in the offspring at one locus, and this difference may not rule out paternity. Problems that can arise with identification: Technical artifacts such as air bubbles, crystals, and dye blobs, as well as sample contaminants, temperature variations, and voltage spikes, can interfere with consistent band migration during electrophoresis. Also, amplification artifacts occur during PCR. Some polymerases add an additional non-template adenine residue to the 3' end of the PCR product. If this 3' nt addition does not include all the amplified products, a mixed set of amplicons will result in extra bands or peaks located very close together. Stutter is an anomaly of PCR amplification in which the polymerase may miss a repeat during the replication process, resulting in two or more diff species in the amplified product. These also appear as extra bands or peaks. Generally, the larger the repeat unit length, the less stutter is observed. These or other aberrant band patterns confuse the analysis software and can result in the miscalling of alleles.
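A short sketch of the concordance check, assuming each profile is stored as a mapping from locus name to a pair of alleles. The function name and the allele values are hypothetical, and the paternity-mutation exception and PCR artifacts described above are not modeled.

```python
def compare_profiles(profile_a, profile_b):
    """Return ('concordance', []) or ('exclusion', [loci that differ])."""
    shared_loci = profile_a.keys() & profile_b.keys()
    if not shared_loci:
        raise ValueError("profiles have no loci in common")
    mismatches = sorted(
        locus for locus in shared_loci
        if sorted(profile_a[locus]) != sorted(profile_b[locus])
    )
    return ("exclusion", mismatches) if mismatches else ("concordance", [])


# Hypothetical two-locus profiles (allele order does not matter).
evidence = {"TH01": (7, 8), "D16S539": (9, 12)}
suspect_1 = {"TH01": (8, 7), "D16S539": (9, 12)}
suspect_2 = {"TH01": (7, 8), "D16S539": (9, 11)}

print(compare_profiles(evidence, suspect_1))  # ('concordance', [])
print(compare_profiles(evidence, suspect_2))  # ('exclusion', ['D16S539'])
```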
Allelic Frequencies in Paternity Testing
A paternity test is designed to choose between two hypotheses: the test subject is not the father of the tested child (H0), or the test subject is the father of the tested child (H1). Paternity is first assessed by observation of shared alleles between the alleged father and the child. The identification of shared alleles is a process of matching, as with identity testing. Because each allele of a genotype is inherited from one parent, a child will share one allele of every locus with the paternal parent. A paternity index, or likelihood of paternity, is calculated for each locus in which the alleged father and the child share an allele. The paternity index is an expression of how many times more likely the child's allele is inherited from the alleged father than from another man in the general population. An allele that occurs frequently in the pop has a low paternity index; a rare allele has a high paternity index. The paternity index is written as: the child is (index, like 5.719) times more likely to have inherited the 9 allele of locus D16S539 from the alleged father than from another random man in the pop (by random occurrence). When each tested locus is on a different chromosome (not linked), the inheritance or occurrence of each allele can be considered an independent event. The paternity index for each locus, therefore, can be multiplied together to calculate the combined paternity index (CPI), which summarizes the genotype information. After multiplying, these data indicate the child is 8,045 times more likely to have inherited the four observed alleles from the alleged father than from another man in the population. If a paternal allele does not match between the alleged father and the child, H1 for that allele is 0. However, CPI is not 0. Nonmatching alleles between the alleged father and the child found at one locus (exclusion) are traditionally not regarded as a demonstration of nonpaternity because of the possibility of mutation.
To account for mutations, the paternity index for the nonmatching alleles is calculated as paternity index for a mutant allele = u. Where u is the observed mutation rate (mutations/meiosis) of the locus. A high mutation rate (close to 1) would not lower the CPI, whereas a very low mutation rate (closer to 0) would do so. In a paternity report, the combined paternity index is accompanied by the probability of paternity, a number calculated from the combined paternity index (genetic evidence) and prior odds (nongenetic evidence). For the prior odds the lab as a neutral party assumes a 50/50 chance that the test subject is the father. Pg. 277 for the calculation unsure if need to know how to calculate the probability of paternity.
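A rough sketch of how the per-locus indices combine, assuming unlinked loci so the indices can be multiplied. Only the 5.719 value comes from the notes; the other three per-locus indices are made up. The probability formula, CPI × prior / [CPI × prior + (1 - prior)], is the usual Bayesian form and is an assumption here, since this card only points to pg. 277 for the calculation.

```python
from math import prod


def combined_paternity_index(locus_indices):
    """Multiply the per-locus paternity indices (loci assumed independent)."""
    return prod(locus_indices)


def probability_of_paternity(cpi, prior=0.5):
    """Combine genetic evidence (CPI) with the prior (0.5 = neutral 50/50)."""
    return cpi * prior / (cpi * prior + (1 - prior))


# 5.719 is the D16S539 index quoted in the notes; the rest are hypothetical.
indices = [5.719, 3.0, 10.0, 2.5]
cpi = combined_paternity_index(indices)
print(f"CPI = {cpi:.3f}")
print(f"probability of paternity = {probability_of_paternity(cpi):.4f}")

# Using the CPI of 8,045 quoted in the notes with the neutral prior.
print(f"{probability_of_paternity(8045):.4%}")
```

With a neutral prior of 0.5 the formula reduces to CPI / (CPI + 1).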
Y typing
Because there is no recombination between loci on the Y chromosome, the product rule cannot be applied. (I think because on the Y chromosome and mtDNA the loci are genetically linked and close together.) "The product rule is used to estimate the chance of finding a given STR profile within a population. This is done by multiplying the frequency of each of the genotypes (combination of alleles) found at all loci in the STR profile. In the case of Y-chromosome STRs and mitochondrial DNA (mtDNA) the product rule can not be used." according to Google. The results of a Y typing might be reported accompanied by the number of observations or frequency of the analyzed haplotype in a database of adequate size. Ex on pg. 281. Suppose a haplotype containing the 17 allele (this allele at this locus has 12 repeats) of DYS390...Even with a 1/12,400 or 99.9% DC (discrimination capacity), the matching probability is orders of magnitude lower than that for autosomal STR. Y-chromosome haplotypes can be used to exclude paternity. Taking into account the mutation rate of each allele, any alleles that differ between the male child and alleged father are strong evidence for non-paternity. Conversely, if a Y haplotype is shared between a child and alleged father, a paternity index is calculated in a manner similar to that of the autosomal STR analysis. Ex: 6 Y-STR alleles are tested and match between the alleged father and child. If the haplotype has not been observed before in the population, the occurrence of that haplotype in the population database is 0/1,200, and the haplotype frequency will be 1/1,200, or 0.0008333. The paternity index (PI) is the probability that an alleged father with that haplotype could produce one sperm carrying the haplotype, divided by the probability that a random man could produce one sperm carrying the haplotype. The PI is then 1/0.0008333 = 1,200 (takes the place of the CPI value on pg. 277).
With a prior odds probability of 0.5, the probability of paternity is (1,200 X 0.5) / [(1,200 X 0.5) + 0.5], or 99.9%. Y-STRs as marker loci for Y-chromosome, or surname, tests are used to determine ancestry. For example, a group of males of a strictly male descent line (having the same last name or surname) is expected to be related to a common male ancestor. Therefore they should all share the same Y-chromosome alleles (except for mutations, which should be minimal given 1 mutation per 20 generations). The Y-chromosome haplotype does not provide enough info about the degree of relatedness, just inclusion or exclusion from a family. An analysis to find a most recent common ancestor (MRCA) is possible, however, using a combination of researched family histories, Y-STR results, and statistical formulas for mutation frequencies.
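The Y-haplotype example in this card can be reproduced with a short sketch. Treating an unseen haplotype as one observation follows the 0/1,200 to 1/1,200 step in the notes; applying the same `max(observations, 1)` rule to haplotypes that have been seen is an assumption, and the function names are made up.

```python
def haplotype_frequency(observations, database_size):
    """Haplotype frequency; an unseen haplotype is counted as 1/database_size."""
    return max(observations, 1) / database_size


def y_paternity_index(frequency):
    """PI = 1 / haplotype frequency (father passes his single Y haplotype)."""
    return 1.0 / frequency


freq = haplotype_frequency(0, 1200)      # 1/1,200, about 0.0008333
pi = y_paternity_index(freq)             # about 1,200
prior = 0.5
probability = pi * prior / (pi * prior + (1 - prior))

print(f"frequency = {freq:.7f}")
print(f"PI = {pi:.0f}")
print(f"probability of paternity = {probability:.1%}")   # 99.9%
```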
Bone marrow engraftment testing using DNA polymorphisms
Bone marrow transplantation is a method used to treat malignant and nonmalignant blood disorders as well as some solid tumors. A) Autologous. One transplant approach is autologous (from self), in which cells from the patient's own bone marrow are removed and stored. The patient then receives high doses of chemotherapy and/or radiotherapy. The portion of marrow previously removed from the patient may also be purged of cancer cells before being returned to the patient. B) Allogeneic. Alternatively, allogeneic transplants (between two individuals) are used. The donor supplies healthy cells to the recipient patient. Donor cells are supplied as bone marrow, peripheral blood stem cells (also called hematopoietic stem cells), or umbilical cord stem cells. To assure successful establishment of the transplanted donor cells, the immune compatibility of the donor and the recipient is tested prior to the transplant by HLA typing. HLA typing is necessary for allogeneic transplants. Currently, donor registries and advances in the use of hematopoietic stem cells have broadened the application of transplants for a variety of diseases. Google - A conditioning regimen may include chemotherapy, monoclonal antibody therapy, and radiation to the entire body. It helps make room in the patient's bone marrow for new blood stem cells to grow, helps prevent the patient's body from rejecting the transplanted cells, and helps kill any cancer cells that are in the body. Allogeneic transplant strategies - high doses of therapy completely remove the recipient bone marrow, particularly the stem cells that give rise to all the other cells in the marrow (conditioning). The allogeneic stem cells are then expected to reestablish a new bone marrow in the recipient (engraftment). The toxicity of this procedure was reduced by using sub-myeloablative transplant procedures, or mini-transplants. In this approach, pretransplant therapy does not completely remove the recipient bone marrow.
The donor bone marrow is expected to eradicate the remaining recipient cells through recognition of residual recipient cells as foreign to the new bone marrow. This process also imparts a graft-versus-leukemia (GVL) or graft-versus-tumor (GVT) effect (positive impact on host), which is closely related to graft-versus-host disease (GVHD) (negative impact on host). In GVHD, the donated bone marrow or peripheral blood stem cells view the recipient's body as foreign, and the donated cells attack the body (per google). The T-cell fraction of the donor marrow is particularly important for engraftment and for the GVT effect. This was realized when efforts to avoid GVHD by removing the T-cell fraction before infusion of donor cells resulted in increased incidence of graft failure and relapse.
Epigenetic Profiles
DNA and probably protein identification systems cannot distinguish between syngeneic individuals (identical twins). Epigenetic changes occur as a result of environmental events, such that a putative epigenetic profile is unique to each individual because no two individuals will have the same environmental exposures. Epigenetic alterations, particularly DNA methylation, change in the absence of cell division or DNA sequence alterations. Many of these changes are stable and can be detected at the DNA sequence level. There are a variety of methods to detect methylated DNA, including methylation-sensitive restriction enzymes, methylation-specific PCR, and bisulfite sequencing by Sanger or massively parallel sequencing. Although shared epigenetics in families is evidence for the inheritance of epigenetic traits, epigenetic differences due to environmental exposures add an additional level of distinction among individuals. Epigenetic differences, not present at early ages, have been observed in adult identical twins. Estimation of the age of persons whose biological materials are recovered at a crime scene is valuable in forensic applications. Strategies have been designed to use epigenetic patterns that accumulate over time to predict chronological age. Through epigenetics, materials such as blood stains may provide useful information on human age. Epigenetic markers can also be used to identify body fluids (saliva, semen, blood). Unique methylation patterns occur at specific gene promoters in these cell types. A marker set defined for fluid type discrimination may also be used to eliminate fluids from other non-primate species.
RFLP and inheritance of chromosomes
DNA is inherited as one haploid chromosome complement from each parent. Each chromosome carries its polymorphisms so that the offspring inherits a combination of the parental polymorphisms. Due to recombination and random assortment, each person has a unique set of RFLPs, half inherited maternally and half paternally. Over many generations, intra- and interchromosomal recombination, gene conversion, and other genetic events have increased the diversity of DNA sequences. One consequence of this genetic diversity is that a single locus, that is, a gene or region of DNA, will have several versions or alleles. Human beings are diploid with two copies of every locus. In other words, every person has two alleles of each locus. If these alleles are the same, the locus is homozygous; if the two alleles are diff, the locus is heterozygous. More closely related individuals are likely to share more alleles than unrelated persons. In figure 10.3 (++)(--) describes four alleles of the locus detectable by Southern blot. (+-) represents one chromosome. Two individuals can share both alleles at a single locus, but the chances of two individuals, except for identical twins, sharing the same alleles decrease 10-fold with each additional locus tested. There are more than 2,000 RFLP loci in the human genome. HaeIII and HinfI enzymes cut DNA frequently enough to reveal polymorphisms in multiple locations throughout the genome.
Post engraftment percent calculations
Diff scenarios: a) For homozygous or heterozygous donor and recipient peaks with no shared alleles, the percentage of recipient cells is equal to R/(R+D) x 100, where R is the height or area under the recipient-specific peak(s) and D is the height or area under the donor-specific peak(s). b) Shared alleles, where one allele is the same for donor and recipient, can be dropped from the calculation, and the percentage of recipient cells is calculated as R(unshared)/[R(unshared)+D(unshared)] x 100. Chimerism/engraftment results are reported as the percentage of recipient cells and/or percentage of donor cells in the bone marrow, blood, or cell fraction. These results do not reflect the absolute cell number, which could change independently of the donor/recipient ratio. Inability to detect donor or recipient cells does not mean the cell population is completely absent, because capillary electrophoresis and fluorescent detection methods offer a sensitivity of only 0.1% to 1% for autosomal STR markers. Time trends may be more important than single-point results following transplantation.
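A minimal Python sketch of the percent-recipient calculation, covering both the no-shared-allele and shared-allele cases by dropping any allele carried by both donor and recipient. The function name, allele numbers, and peak areas are made up for illustration.

```python
def percent_recipient(peak_areas, recipient_alleles, donor_alleles):
    """Percent recipient = R / (R + D) x 100, using unshared (informative) alleles only.

    peak_areas: dict mapping allele -> peak area (or height) at one locus.
    """
    recipient_only = set(recipient_alleles) - set(donor_alleles)
    donor_only = set(donor_alleles) - set(recipient_alleles)
    if not recipient_only or not donor_only:
        raise ValueError("locus is not informative for both donor and recipient")
    r = sum(peak_areas.get(allele, 0.0) for allele in recipient_only)
    d = sum(peak_areas.get(allele, 0.0) for allele in donor_only)
    if r + d == 0:
        raise ValueError("no informative peaks detected")
    return 100.0 * r / (r + d)


# Hypothetical post-transplant locus: recipient 7/9, donor 8/9 (allele 9 is shared and dropped).
areas = {7: 500.0, 8: 4500.0, 9: 5200.0}
pct = percent_recipient(areas, recipient_alleles=(7, 9), donor_alleles=(8, 9))
print(f"{pct:.1f}% recipient, {100 - pct:.1f}% donor")
```

A missing recipient peak gives 0% here, which should be read as "below the sensitivity of the method" and not as proof that no recipient cells remain.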
Mitochondria sequence data
Divided into two components: forensic and public. a) The forensic component consists of anonymous population profiles and is used to assess the extent of certainty of mtDNA identifications in forensic casework. All forensic profiles include, at a min, a sequence region in HV1 (nt positions 16024 to 16383) and a seq region in HV2 (nt positions 53 to 372). These data are searched through the CODIS program in open case files and missing persons cases. Approx 610 bp, including the hypervariable regions of mtDNA, ARE ROUTINELY SEQUENCED FOR FORENSIC ANALYSIS. Deviations from the Cambridge reference sequence are recorded as the number of the position and a base designation. For ex, a transition from A to G at position 263 would be recorded as 263 G. b) Public data consist of mtDNA sequence data from scientific literature and the GenBank and European Molecular Biology Lab databases. The public data have not been subjected to the same quality standards as the forensic data. The public databases provide info on worldwide population groups not contained within the forensic data and can be used for investigative purposes. As all maternal relatives share mitochondrial sequences, the mtDNA of sisters and brothers or mothers and daughters will exactly match in the hypervariable region in the absence of mutations. Therefore the use of mtDNA polymorphism is for exclusion. There is an avg of 8.5 nt differences between mtDNA sequences of unrelated individuals in the hypervariable region. In contrast to nuclear DNA, the human mtDNA genome is completely sequenced and numbered. Variants in the mtDNA are indicated in relation to the full mtDNA sequence. Descriptions are preceded by "m." and reported with the terms used for nuclear DNA (a T to C change at position 8993 would be m.8993T>C). Descriptions of changes at the protein level include a reference to the protein changed; replacement of leucine with proline at position 156 in ATP synthase 6 would be ATP6:p.Leu156Pro.
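The "position plus base" recording convention can be sketched in Python as below. The sequences are a toy stand-in, not the real Cambridge reference, and the sketch assumes the sample is already aligned to the reference with no insertions or deletions; the function name is made up.

```python
def record_differences(reference: str, sample: str, start: int) -> list:
    """List deviations from the reference as '<position> <base>' strings.

    start is the nucleotide position (reference numbering) of the first base compared.
    """
    if len(reference) != len(sample):
        raise ValueError("sketch assumes equal-length, pre-aligned sequences (no indels)")
    return [
        f"{start + offset} {sample_base}"
        for offset, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


# Toy 5-nt window standing in for reference positions 261-265 (not real mtDNA sequence).
toy_reference = "CCACT"
toy_sample = "CCGCT"
print(record_differences(toy_reference, toy_sample, start=261))  # A to G at position 263
```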
Human Identification Using RFLPs
First genetic tool to ID a person was the ABO blood group antigens. Discrimination power was low: there are only 4 possible groups, so it was only good for exclusion (elimination of a person) and was informative in only 15-20% of cases. Analysis of the polymorphic HLA loci added a higher level of discrimination, with exclusion in 90% of cases. Testing both ABO and HLA did not provide positive identification, however. The initial use of DNA as an identification tool relied on RFLPs detectable by Southern blot. RFLPs can arise from a number of genetic events, including point mutations in the restriction site, mutations that create a new restriction site, and insertion or deletion of repeated sequences (tandem repeats). The insertion or deletion of nt occurs frequently in repeated sequences in DNA. Tandem repeats are found in genomic DNA. Repeat units can be large enough so that loss or gain of one repeat is resolved by gel electrophoresis of an RE digest. The frequent cutters HaeIII (recognition site GGCC) and HinfI (recognition site GANTC) generate fragments that are small enough to resolve those that contain diff numbers of repeats and thereby give an informative pattern by Southern blot.
In order to Match
Gels: Because alleles are identified by gel resolution, good intragel precision (comparing bands or peaks on the same gel or capillary) and intergel precision (comparing bands or peaks of separate gels or capillaries) are important. Intergel precision is less stringent than intragel precision because the same samples may run with slightly diff migration speeds on diff gels. Because some microvariant alleles differ by only a single bp, sizing precision must be better than +/-0.5 bp. The TH01 9.3 allele must be distinguished from the 10 allele, which is a single bp larger than the 9.3 allele. Capillary electrophoresis: To establish the identity of peaks from capillary electrophoresis (or peaks from densitometry tracings of gel data), the peak is assigned a position relative to some landmark within the gel lane or capillary, such as the loading well or the start of migration. Upon replicate resolutions of a band or peak, electrophoretic variations from capillary to capillary, lane to lane, or gel to gel may occur. Normalization of migration is achieved by relating the migration of the test peaks to the simultaneous migration of size standards. Size standards can be internal (in the same gel lane or capillary) or external (in a separate gel lane). Even with normalization, however, tiny variations in position, height, and area of peaks or gel bands may persist. If the same fragments are run repeatedly, a distribution of observed sizes can be established. An acceptable range of sizes in this distribution is a bin. A bin is like an uncertainty window surrounding the mean position (size) of multiple runs of each peak or band. All bands or peaks that fall in this window are considered identical. Collection of all peaks or bands within a characteristic distribution of positions and areas is called binning. Bins are determined for each allele manually in the lab or by software.
All peaks within a bin are interpreted as representative of the same allele of a locus. Each band or peak in a genotype is binned and identified according to its migration characteristics. The group of bands or peaks makes up the characteristic pattern or profile of the specimen.
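Binning can be pictured with a small Python sketch: each allele has a mean size from replicate runs, and an observed peak is assigned to an allele only if it falls inside that allele's window. The mean sizes below are hypothetical values chosen so that 9.3 and 10 sit 1 bp apart; real bins come from the lab's own allelic ladder runs.

```python
def call_allele(observed_size_bp, bin_means, window_bp=0.5):
    """Return the allele whose bin (mean +/- window) contains the observed size, else None."""
    for allele, mean_size in bin_means.items():
        if abs(observed_size_bp - mean_size) < window_bp:
            return allele
    return None  # off-ladder peak: falls in no bin


# Hypothetical mean sizes (bp) for three TH01 alleles; 9.3 and 10 differ by a single bp.
th01_bins = {"9": 179.0, "9.3": 182.0, "10": 183.0}

for size in (182.2, 183.1, 180.5):
    print(size, "->", call_allele(size, th01_bins))
```

With a window wider than 0.5 bp the 9.3 and 10 bins would overlap, which is why single-bp microvariants set the precision requirement.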
Genotyping
Genotype is allele identification. What alleles are present? That is the genotype. An STR locus genotype is defined by the number of repeats in its alleles. A heterozygous locus with 7 repeats on one chromosome and 8 repeats on the other would be 7/8 or 7,8. A homozygous locus where both homologous chromosomes carry the same allele would be 7/7 or 7,7 or just 7. Microvariant alleles containing partial repeat units are indicated by the number of complete repeats followed by a decimal point and then the number of bases in the partial repeat. The 9.3 allele of the TH01 locus has 10 repeats: 9 full 4 bp repeat units and 1 partial repeat unit of 3 bp. Microvariants are detected as bands or peaks very close to the full-length alleles; a microvariant migrates between full-length alleles, such as between 9 and 10 repeats. The genotype, or profile, of a specimen is the collection of alleles in all loci tested.
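The allele notation can be made concrete with a short Python sketch that converts an allele name into the length of its repeat region and labels a genotype as homozygous or heterozygous. The function names are made up; alleles are kept as strings so that "9.3" is read as 9 full repeats plus 3 bases and not as a decimal number.

```python
def repeat_region_length(allele: str, unit_bp: int = 4) -> int:
    """Length in bp of the repeat region: full repeats x unit size + bases in the partial repeat."""
    full_repeats, _, partial_bases = allele.partition(".")
    return int(full_repeats) * unit_bp + (int(partial_bases) if partial_bases else 0)


def zygosity(genotype) -> str:
    """genotype is a pair of allele names, e.g. ("7", "8")."""
    first, second = genotype
    return "homozygous" if first == second else "heterozygous"


print(repeat_region_length("9.3"), repeat_region_length("10"))  # 1 bp apart for a 4 bp unit
print(zygosity(("7", "8")), zygosity(("7", "7")))
```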
Human Haplotype Mapping (HapMap) Project (good def!!!)
Goal: Develop a haplotype map of the human genome (identify blocks of DNA polymorphisms that are inherited together). See pg. 290 figure. Sections of DNA along chromosomes can be inherited as a unit or block of sequence in which no recombination occurs. All the SNPs on that block comprise a haplotype. Ex in fig is 10,000 bp. This map would then be used to identify common disease associations and patterns of human DNA sequence variation. Result: Millions of SNPs, RFLPs, VNTRs, and STRs were discovered through the HapMap project. However, advances in genomic sequencing technology and development of comprehensive population-based databases such as the 1000 Genomes Project have supplanted its contributions to research. In 2016 the HapMap resource was retired, and researchers are now directed to the 1000 Genomes Project for population genetic and genomic information.
Single-Nucleotide Polymorphisms
The Human Genome Project revealed that the nt seq differs every 1,000 to 1,500 bases from one individual to another. The majority of these seq differences are variations of single nucleotides, or SNPs. The traditional def of polymorphism requires that the genetic variation be present at a frequency of at least 1% of the population. This rate predicts 11 million sites in a genome of 3 billion bp that vary in at least 1% of the world's population. Each individual has 11 million SNPs. High-density analysis with next gen sequencing showed SNPs are more frequent: one SNP per every 300 nt. Importance: Due to the density of SNPs, these polymorphisms were of great interest for genetic mapping, disease prediction, and human identification. Problem: Single bp changes are not as easy to detect as STRs, VNTRs, or even RFLPs. The most definitive way to detect SNPs is direct sequencing. Next gen sequencing greatly accelerated both the discovery and detection of SNPs. So far 10 million SNPs have been identified in the human genome. Almost all (99%) of these have no biological effect. Over 60,000, however, are within genes, and some are associated with disease. EX: the SNP responsible for the formation of hemoglobin S in sickle cell anemia. (People who have sickle cell disease have abnormal hemoglobin, called hemoglobin S or sickle hemoglobin.) Classification: SNPs have been classified according to location, relation to coding sequences, and whether they cause a conservative or nonconservative sequence alteration. SNP databases such as dbSNP, dbVar, ClinVar, and others are collections of DNA sequence variants used as references for screening genomic sequencing data. A variant detected by sequencing may already be described or associated with a disease phenotype noted in these databases. In addition to SNPs, these databases include short deletions, insertions, and duplications that involve more than one nt.
Protein-Based Identification
Like DNA, protein contains polymorphisms. Protein polymorphisms are in the form of a.a. seq. variations. Some proteins are chemically more stable than DNA in harsh environments. Proteins such as keratin and collagen are also more abundant. Protein polymorphisms may serve as supportive confirmation of, or even an alternative to, DNA id results. Nonsynonymous DNA polymorphisms produce single a.a. polymorphisms in proteins. A hair shaft contains over 300 nuclear and mitochondrial proteins, adequately representing the whole genome. A collection of peptide variants comprises a profile. Like STR profiles, peptide profiles could be collected for id and population-based studies. By associating known peptide changes with the known SNPs that code for them, a putative DNA profile might be generated from peptide data where sufficient DNA was not available (so like working backwards from the protein to figure out the DNA seq). Method: Peptide variants in these proteins can be identified using liquid chromatography followed by mass spectrometry. Proteins isolated from test samples are reduced, alkylated, and digested with trypsin, and the resulting peptides are resolved by liquid chromatography. On-column concentration can increase target molecule concentrations. Particles are then deposited on the matrix, ionized, and subjected to separation by size and charge to generate a spectrum that can be compared with peptide reference spectra. A peptide spectrum reflects the underlying DNA sequence variant in the form of a variant a.a. Peak patterns (profiles) are compared to reference databases. The probability of a proteomic profile being shared by unrelated individuals is computed from the size of the population contributing spectra to the database. Y axis is intensity and x axis is m/z = mass/charge. Software matching algorithms identify peptide variants from the reference spectra. A set of variants is the proteomic profile. This profile can then be compared to a peptide database.
There are close to 4 million spectra in available libraries, accounting for diff ionization methods. Based on the population frequency of variants (or their associated SNP alleles), the probability of a proteomic profile being shared by unrelated individuals is computed from the size of the population contributing spectra. Annotated reference peptide libraries from various organisms and proteins are being developed for the rapid matching and id of acquired peptide spectra from non-human species as well. Among these, smaller, more focused libraries have been collected specifically for humans and mice.
DNA Fingerprinting with RFLP (aka genetic profiling)
MLP first used --> then SLP used. The first human DNA profiling system was introduced in 1985 using Sir Alec Jeffreys's Southern blot multiple-locus probe (MLP)-RFLP system. This method utilized three to five probes to analyze three to five loci on the same blot. Probing multiple loci at once produced patterns that were highly variable between individuals but that required some expertise to optimize and interpret. In 1990, single-locus probe (SLP) systems were established in Europe and North America. Analysis of one locus at a time yielded simpler patterns, which were much easier to interpret, especially in cases where specimens might contain a mixture of DNA from more than one individual. EX: One column has evidence from the crime scene, and it matches the bands in the lane for suspect 2. Positive identification requires further determination of the frequencies of these specific alleles in the population and the probability of matching them by chance. Each gel only probes for a single locus. The RFLP Southern blot technique required 100 ng to 1 ug of relatively high-quality DNA, 1 to 20 kbp in size. Furthermore, large, fragile 0.7% gels were required to achieve adequate band resolution, and the 32P-based probe system could take 5 to 7 days to yield clear results. After visually inspecting the band patterns, profiles were subjected to computer analysis to accurately size the restriction fragments and apply the results to an established matching criterion. RFLP is an example of a continuous allele system in which the sizes of the fragments define alleles. Therefore precise band sizing was critical to the accuracy of the results. A match implied inclusion, which was refined by determination of the genotype frequency of each allele in the general or local population. This process established the likelihood of the same genotype occurring by chance. The probability of two people having the same set of RFLPs, or profile, becomes lower and lower as more loci are analyzed.
Sir Alec Jeffreys first developed techniques for genetic profiling, or DNA fingerprinting, using RFLP to identify humans. It has been used by forensics and law enforcement and to resolve paternity and immigration disputes.
Quality Assurance for Surgical Sections using STR
Molecular diagnostics labs can assist in ensuring tissue sections are properly identified and not contaminated. During processing of tissue specimens, microscopic fragments may persist in paraffin baths (floaters). These fragments can adhere to subsequent tissue sections, resulting in an anomalous appearance of tissue under the microscope. If a tissue sample is questioned, STR identification can confirm the origin of the tissue. Procedure: Suspect tissue must be carefully removed from the slide by microdissection. Reference DNA isolated from the patient and DNA isolated from the tissue in question are subjected to multiplex PCR. The results are compared for matching alleles. If the tissue in question originated from the patient, all alleles should match. Assuming good quality data, one nonmatching locus excludes the tissue in question as coming from the reference patient. You are testing at multiple STR loci for matching alleles. Limitations: Using STR for this application has some limitations. Tissue quality and quantity can adversely affect amplification of the STR loci, especially the larger products. Also, DNA isolated from small fragments may be mixed with reference DNA, complicating interpretation and comparison of alleles. A reported case study also demonstrated the effect of inherent genomic instability in tumor tissue. Allele differences between the suspected floater of malignant tissue and reference tissue from the patient led to the initial conclusion that it was from another source. A second biopsy, however, also contained malignant cells similar to those found in the first biopsy. The allele differences were determined to be a result of microsatellite instability in malignant cells. Another study comparing STR alleles in genetically stable and unstable tumors directly demonstrated the presence of new alleles in unstable tumor tissue compared to normal ref tissue from the same patient.
It is advisable, therefore, to take into account whether tumor cells might be genetically unstable when testing for contaminants.
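The allele comparison in the floater workup amounts to checking every tested locus for a mismatch. A minimal Python sketch follows; the locus names are real STR loci, but the allele values and the function name are made up for illustration.

```python
def nonmatching_loci(reference_profile, questioned_profile):
    """Return the loci (typed in both samples) whose allele sets differ."""
    shared_loci = reference_profile.keys() & questioned_profile.keys()
    return sorted(
        locus
        for locus in shared_loci
        if set(reference_profile[locus]) != set(questioned_profile[locus])
    )


# Hypothetical profiles: locus -> pair of alleles.
patient = {"TH01": ("7", "8"), "TPOX": ("8", "8"), "D5S818": ("11", "12")}
tissue_a = {"TH01": ("8", "7"), "TPOX": ("8", "8"), "D5S818": ("11", "12")}
tissue_b = {"TH01": ("7", "9.3"), "TPOX": ("8", "8"), "D5S818": ("11", "12")}

print(nonmatching_loci(patient, tissue_a))  # no mismatches: consistent with the patient
print(nonmatching_loci(patient, tissue_b))  # one mismatch: exclusion, unless tumor instability explains it
```

The sketch only flags differences; deciding whether a mismatch reflects a different source or microsatellite instability in tumor cells still needs the clinical context described above.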
Sibling Tests
Polymorphisms are also used to generate a probability of siblings or other blood relationships (familial searches). This is more complicated than paternity testing. Mutations and allele frequencies further complicate the analysis. More confident conclusions can be made with multiple siblings. A full sibling test is a determination of the likelihood that two people tested share a common mother and father. A half sibling test is a determination of the likelihood that two people tested share one common parent. The likelihood ratio generated by a sibling test is sometimes called a kinship index, sibling index, or combined sibling index. Another type of relationship analysis is avuncular testing, which measures the probability that two alleged relatives are related as an aunt or uncle of a niece or nephew. The probability of relatedness is based on the number of shared alleles between tested individuals. The probabilities can be increased if other known relatives, such as a parent of the niece or nephew, are available for testing. Determination of first- and second-degree relationships is important for genetic studies because linkage mapping of disease genes in populations can be affected by undetected familial relationships.
Types of Polymorphisms
Polymorphisms range from a single bp to thousands of bp. SNPs are superior to STR and RFLP for mapping and association studies. a) SNPs: It is estimated that genome sequences differ by at least one nt every 1,000 to 1,500 bases. These single-nt differences, or SNPs, may occur in gene-coding regions or in intergenic sequences. Polymorphisms are more frequent in some areas of the genome than in others. The human leukocyte antigen (HLA) locus is a highly polymorphic region of human DNA. The variable nt sequences in this locus code for peptides that establish self-identity of the immune system. The extent of similarity or compatibility between the immune systems of transplant recipients and potential donors can thus be determined by comparing DNA sequences. b) Long interspersed nucleotide elements (LINEs). Large blocks of repeated sequences may be inverted, deleted, or duplicated from one individual to another. These are highly repeated sequences 6 to 8 kbp in length that contain RNA polymerase promoters and open reading frames related to the reverse transcriptase of retroviruses. There are more than 500,000 of these LINE-1 (L1) elements, making up more than 15% of the human genome. There are even more SINEs. c) Short interspersed nucleotide elements (SINEs). These are 0.3 kbp in size and are present in over 1,000,000 copies per genome. SINEs include Alu elements, named for harboring recognition sites for the AluI restriction enzyme. Alu elements account for 11% of the human genome. The majority of transcribed genes contain Alu elements in their introns. Alu elements have cryptic splice and polyadenylation sites, which can become activated through accumulation of mutations and lead to alternative splicing of RNA or premature termination of translation. LINEs and SINEs are known as mobile elements or transposable elements.
They are copied and spread by recombination and reverse transcription and may be responsible for the formation of pseudogenes (intronless, nonfunctional copies of active genes) throughout the human genome. d) Short tandem repeats (STRs). Shorter blocks of repeated sequences also undergo expansion or shrinkage through generations. Examples of the latter are STRs and variable-number tandem repeats (VNTRs).
RFLP Typing (RFLP on study guide)
RFLPs were the original DNA targets used for gene mapping, human identification, and parentage testing. RFLPs are differences in the sizes and number of fragments generated by restriction enzyme digestion of DNA. Fragment sizes vary as a result of changes in the nt sequence in or between the recognition sites of a restriction enzyme. Nt changes may also destroy, change, or create restriction sites, altering the number of fragments. The first step in using RFLPs is to construct a restriction enzyme map of the DNA region under investigation. Once the restriction map is known, the number and sizes of the restriction fragments of a test DNA region cut with RE are compared with the number and sizes of fragments expected based on the restriction map. Polymorphisms are detected by observing fragment numbers and sizes diff from those expected from the reference restriction map. The presence or absence of the polymorphic sites is evident from the number and size of the fragments after cutting with RE. RFLP typing in humans required the use of the Southern blot technique. DNA was cut with RE, resolved by gel electrophoresis, and blotted on a membrane. Probes to specific regions of DNA containing potential RFLPs were then hybridized to the DNA membrane to determine the size of the resulting bands.
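How a single nt change alters the fragment pattern can be shown with a toy digest in Python. The sequences are invented 20-nt strings, the function name is made up, and the sketch assumes a linear molecule cut on one strand at a fixed offset inside each recognition site (HaeIII cuts GGCC in the middle, between GG and CC).

```python
import re


def digest(sequence: str, site: str = "GGCC", cut_offset: int = 2) -> list:
    """Return fragment lengths (bp) after cutting a linear sequence at every occurrence of site."""
    cut_positions = [m.start() + cut_offset for m in re.finditer(f"(?={site})", sequence)]
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - start for start, end in zip(boundaries, boundaries[1:])]


reference = "ATATGGCCTTTTTTGGCCAA"  # two HaeIII sites
variant = "ATATGGCCTTTTTTGGTCAA"    # point mutation destroys the second site

print(digest(reference))  # three fragments
print(digest(variant))    # two fragments: the polymorphism changes fragment number and size
```

On a Southern blot only the fragments that hybridize to the probe would be visible, so the observed band pattern is a subset of this list.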
PSTR Testing
Screening loci for informative STR alleles: Donor and recipient DNA for allele screening prior to transplant can be isolated from blood or buccal cells. A lower limit of 1 ng of DNA is reportedly sufficient for the screening of multiple loci; however, 10 ng is a more practical lower limit. Multiple loci can be screened simultaneously using multiplex PCR. Although not validated for engraftment testing, several systems designed for human identification are used for this purpose. Primer sets that specifically amplify Y-STRs may also be useful for sex-mismatched donor-recipient pairs. In multiplex primer systems, not all loci amplify with equal efficiency. Less efficient amplification lowers the sensitivity of subsequent post-engraftment testing. Although the capillary electrophoresis used for this method is the same as that used for sequence analysis, measuring peak sizes and peak areas is distinguished from sequence analysis as fragment analysis and sometimes requires adjustment of the instrument or capillary polymer. Automatic detection will generate an electropherogram. Informative and noninformative loci will appear as nonmatching or matching donor and recipient peaks, and many combinations of donor and recipient peaks are possible. Optimal loci for analysis should give clean peaks without stutter (especially stutter peaks that co-migrate with informative peaks), nonspecific amplified peaks (mis-primes), or other technical artifacts. Ideally, the chosen locus should have at least one recipient-informative allele. This is to assure direct detection of minimal amounts of residual recipient cells. If the recipient is male and the donor is female, the amelogenin locus supplies a recipient-informative locus. Good separation (ideally, but not necessarily, by two repeat units) of the recipient and donor alleles is desirable for ease of discrimination in the transplant testing.
The choices of informative alleles are more limited in related donor-recipient pairs, as they are likely to share alleles. Unrelated pairs will yield more options. After-transplant testing: With nonmyeloablative or reduced-intensity pretransplant protocols, an example schedule would be testing at 1, 3, 6, and 12 months. Because early patterns of engraftment may predict GVHD or graft failure after nonmyeloablative treatments, even more frequent blood testing may be necessary, such as 1, 2, and 3 months after transplant. Bone marrow specimens can most conveniently be taken at the time of bone marrow biopsy following the transplant, with blood specimens taken in intervening periods. Usually, 3 to 5 mL of bone marrow or 5 mL of blood is collected. However, specimens collected soon after the transplant may be hypocellular, so larger volumes (5 to 7 mL bone marrow, 10 to 20 mL blood) may be required.
Linkage Analysis
TLDR: a) Because the locations of many STRs in the genome are known, these structures can be used to map genes, especially those genes associated with disease. b) Three basic approaches are used to map genes: family histories, population studies, and sibling analysis. Details: Family Histories: Family history and analysis of generations of a single family for the presence of a particular STR allele in affected individuals is one way to show association. Family members are tested for several STRs, and the alleles of affected and unaffected members of the family are compared. Assuming normal Mendelian inheritance, if a particular allele of a particular locus is always present in affected family members, that locus must be closely linked to the gene responsible for the phenotype in those individuals (linkage disequilibrium). If the linkage is close enough to the gene (no recombination between the STR and the disease gene), the STR may serve as a convenient marker for disease testing. Instead of testing for mutations in the disease gene, the marker allele is determined. It is easier, for ex, to look for a linked STR allele than to screen a large gene for point mutations. The presence of the "indicator" STR allele serves as a genetic marker for the disease. Population Studies: Another approach to linkage analysis is to look for gene associations in large numbers of unrelated individuals in population studies. Just as with family history studies, close linkage to specific STR alleles supports the genetic proximity of the disease gene with the STR. In this case, however, large numbers of unrelated people are tested for linkage rather than a limited number of related individuals in a family. The results are expressed in probability terms that an individual with the linked STR allele is likely to have the disease gene.
With the accumulation of genomic data produced by massive parallel sequencing methods, however, this type of population study is currently done with higher resolution using SNPs rather than STR. Either type of marker is informative if a linkage is found. Sibling studies are another type of linkage analysis. Monozygotic (identical) and dizygotic (Fraternal) twins serve as controls for genetic and environmental studies. Monozygotic twins will always have the same genetic alleles, including disease genes. There should be 100% recurrence risk (likelihood) that if one twin has a genetic disease, the other twin has it, and both should have the same linked STR alleles. Fraternal twins have the same likelihood of sharing a gene allele as any sibling pair. Investigation of adoptive families may also distinguish genetic from environmental or somatic effects. Identical twins (And clones) have identical nuclear DNA profile. They would have positive identification for autosomal STR. Their autosomal STR would match.
Gender Identification
The amelogenin locus is a very useful marker often analyzed along with STRs. The amelogenin gene, which is not an STR, is located on the X and Y chromosomes. Its encoded protein is required for embryonic development and tooth maturation. A polymorphism is located in the second intron of the amelogenin gene. The Y allele of the gene is 6 bp larger in this region than the X allele. Amplification and electrophoretic resolution reveal two bands or peaks for males (XY) and one band or peak for females (XX). Males are heterozygous for the amelogenin locus and females are homozygous for this locus. Amplification yields a male-specific 218 bp product from the Y allele and a 212 bp product from the X allele, so the electropherogram shows two peaks for males and one peak at 212 bp for females. ■The amelogenin locus is not an STR. ■The HUMAMEL gene codes for amelogenin-like protein. ■The gene is located at Xp22.1-22.3 and Y. ■X allele = 212 bp ■Y allele = 218 bp ■Females (X, X): homozygous ■Males (X, Y): heterozygous
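The peak logic above can be sketched in a few lines of Python. This is only an illustration: the 212 bp and 218 bp sizes come from the notes, while the function name `call_sex` and the 1 bp sizing tolerance are assumptions made for the example, not values from a validated protocol.

```python
# Amplicon sizes from the notes; the tolerance is an assumed, illustrative value.
X_ALLELE_BP = 212
Y_ALLELE_BP = 218
TOLERANCE_BP = 1.0  # hypothetical sizing tolerance


def call_sex(peak_sizes_bp):
    """Return 'male', 'female', or 'inconclusive' from amelogenin peak sizes (bp)."""
    has_x = any(abs(p - X_ALLELE_BP) <= TOLERANCE_BP for p in peak_sizes_bp)
    has_y = any(abs(p - Y_ALLELE_BP) <= TOLERANCE_BP for p in peak_sizes_bp)
    if has_x and has_y:
        return "male"          # two peaks: X and Y alleles (heterozygous)
    if has_x:
        return "female"        # single 212 bp peak (homozygous)
    return "inconclusive"      # no X peak detected, so the result is not interpretable here


print(call_sex([212.1, 217.8]))
print(call_sex([211.9]))
```

A single peak near 212 bp is called female only because the notes describe females as showing one peak; a real workflow would also check signal quality before making the call.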
Receiving specimens after engraftment with no pre-engraftment info.
The blood or bone marrow of the recipient is not acceptable for determination of recipient-specific alleles because it is unknown whether the alleles present represent the donor, the recipient, or both. Options: a) The specimen can be processed using the amelogenin locus or Y-STR markers if the donor and recipient are of different genders, preferably a female donor and a male recipient. b) Another option is to use an alternate source of recipient DNA, such as buccal cells, a skin biopsy sample, or stored specimens or DNA from previous testing. Because of the nature of lymphocyte migration, however, skin and buccal cells may also have donor alleles due to the presence of donor lymphocytes in these tissues.
Phases of Allogeneic Transplantation
The first phase of allogeneic transplantation is donor matching, in which potential donors are tested for immunological compatibility. This is performed by examining the HLA locus to determine which donor would be best tolerated by the recipient immune system. Donors may be known or related to the patient, or anonymous unrelated contributors (matched unrelated donor [MUD]). The National Marrow Donor Program (NMDP) maintains a database of people who have voluntarily submitted their HLA types and are willing to serve as potential donors. Stem cells may also be acquired from donated umbilical cord blood. After conditioning and infusion with the donor cells, the patient enters the engraftment phase, in which the donor cells reconstitute the recipient's bone marrow. Once a successful engraftment of donor cells is established, the recipient is a genetic chimera; that is, the recipient has tissue and blood cells of separate genetic origins. The engraftment of donor cells in the recipient must be monitored, especially in the first 90 days after the transplant. This requires distinguishing between donor cells and recipient cells. Historically, RBC phenotyping, immunoglobulin allotyping, HLA typing, karyotyping, and fluorescence in situ hybridization have been used for this purpose. Some of these require months before engraftment can be detected. Others are labor-intensive or restricted to sex-mismatched donor-recipient pairs. DNA typing has become the method of choice for engraftment monitoring. Because all individuals except identical twins have unique DNA polymorphisms, donor cells are monitored by following donor polymorphisms in the recipient blood and bone marrow. Although RFLP can effectively distinguish donor and recipient cells, the detection of RFLP requires the use of the Southern blot method, which has limited sensitivity for this application.
In comparison, small VNTRs and STRs are easily detected by PCR amplification, which is preferable because of its greater speed and the 0.5% to 1% sensitivity achievable with PCR. Sensitivity can be raised to 0.01% using Y-STRs, but this approach is limited to transplants from sex-mismatched donor-recipient pairs, preferably from a female donor to a male recipient.
Product Rule
The frequency of a set of alleles or a genotype in a population is the product of the frequency of each allele separately (the product rule). It can be applied because of linkage equilibrium. Linkage equilibrium assumes that the observed frequencies of haplotypes in a population are the same as the haplotype frequencies predicted by multiplying together the frequencies of the individual genetic markers in each haplotype, and that loci are not genetically linked (located close to one another) in the genome. The overall frequency (OF) of a genotype (profile) consisting of n loci can be calculated as OF = F1 x F2 x F3 x ... x Fn, where F1...Fn represent the frequency of each individual allele in the population. Ex (pg. 274): at locus Penta D on chromosome 21, the 5 allele occurs in 1 in 10 people. At locus D7S829 on chromosome 7, the 8 allele is observed in 1/50 of the same population. The overall frequency of the profile containing the Penta D 5 allele and the D7S829 8 allele would be 1/10 x 1/50 = 1/500. This genotype is expected to occur in 1 out of every 500 randomly chosen members of that population. As the number of loci increases, the overall frequency of the profile decreases and the certainty that the profile is unique to a single individual in that population increases.
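The product rule is simple enough to check by hand, but a short Python sketch makes the multiplication explicit. The function name `overall_frequency` is made up for this example; the allele frequencies are the ones quoted in the notes, and exact fractions are used so that no rounding creeps in.

```python
from fractions import Fraction
from math import prod


def overall_frequency(allele_frequencies):
    """Multiply individual allele frequencies (product rule, assumes linkage equilibrium)."""
    return prod(allele_frequencies, start=Fraction(1))


# Frequencies quoted in the notes for the two-locus example.
profile = {
    "Penta D 5": Fraction(1, 10),
    "D7S829 8": Fraction(1, 50),
}

of = overall_frequency(profile.values())
print(of)  # 1/500
```

Adding more loci to `profile` only ever multiplies in values below 1, which is why the overall frequency drops as loci are added.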
Matching of Profiles
The more loci analyzed, the higher the probability that the genotype positively identifies an individual (match probability). Problem: Degraded, compromised, or mixed samples will affect the match probability because not all loci may yield clear, informative results. Criteria for interpretation of results and determination of a true allele are established by each lab. The criteria are based on validation studies and results reported from other labs. Periodic external proficiency testing is performed to confirm the accuracy of test performance. Results from the analysis of polymorphisms are used to determine the probability of identity or inheritance of genetic markers or to match a particular marker or marker pattern. To establish the identity of an individual by an allele of a locus, the chance that the same allele could arise in the population randomly is taken into account.
Engraftment/Chimerism DNA testing
Two parts to testing: First part of testing: Pretransplant analysis: Before the transplant, several polymorphic loci in the donor and recipient cells are screened to find at least one informative locus, that is, one locus in which donor alleles differ from the recipient alleles. Noninformative loci: those in which the donor and the recipient have the same alleles. Donor-informative loci: the donor and the recipient share one allele, and the donor is heterozygous with a unique allele. Recipient-informative loci: the unique allele is in the recipient. ■There are different degrees of informativity. ■With the most informative loci, recipient bands or peaks do not overlap stutter in donor bands or peaks. ■Stutter is a technical artifact of the PCR reaction in which a minor product of n-1 repeat units is produced. Second part of testing: Engraftment analysis: performed at specified intervals after the transplant. The recipient blood and bone marrow are tested to determine the presence of donor cells using the donor-informative and/or recipient-informative loci. Using informative loci, peak areas are determined in fluorescence units or from densitometry scans of gel bands. In early molecular studies, pretransplant analysis and engraftment were measured by amplification of small VNTRs and resolution of the amplified fragments on polyacrylamide gels with silver-stain detection. Before the transplant, the screen for informative loci was based on band patterns of the PCR products. After the transplant, analysis of the gel band pattern from the blood or bone marrow of the recipient revealed one of three different states. Full chimerism - only donor alleles detected in the recipient (2 bands on gel). Mixed chimerism - mixture of donor and recipient alleles present (4 bands on gel, as 4 different alleles). Graft failure - only recipient alleles detectable (2 bands on gel).
Currently, the preferred method is PCR amplification of STRs, resolution by capillary electrophoresis, and fluorescent detection. This provides accurate quantification of the percentage of donor/recipient cells and high sensitivity with minimal sample requirements. An alternate, even more sensitive method using qPCR of SNPs has been proposed, with higher throughput and lower sample requirements. It can be performed in a 96-well plate format as a sequence-specific qPCR with no gel resolution required. The limited number of informative SNPs and the requirement to include pretransplant donor and recipient DNA at each monitoring time point have slowed the adoption of this method.
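The locus categories and the three post-transplant states can be sketched in Python as below. Several things here are assumptions for illustration and are not stated in the notes: the function names, the extra "donor- and recipient-informative" label for loci where both sides have a unique allele, and the percent-donor formula, which simply divides the peak area of donor-unique alleles by the combined area of donor-unique and recipient-unique alleles. Real labs apply validated formulas, stutter corrections, and detection limits.

```python
def classify_locus(donor_alleles, recipient_alleles):
    """Classify one locus from donor and recipient genotypes, e.g. (7, 9) and (7, 7)."""
    donor, recipient = set(donor_alleles), set(recipient_alleles)
    donor_unique = donor - recipient
    recipient_unique = recipient - donor
    if not donor_unique and not recipient_unique:
        return "noninformative"
    if donor_unique and recipient_unique:
        # Label chosen for this sketch; the notes only say informativity has degrees.
        return "donor- and recipient-informative"
    return "donor-informative" if donor_unique else "recipient-informative"


def percent_donor(donor_peak_area, recipient_peak_area):
    """Percent donor signal from peak areas of unshared alleles (assumed formula)."""
    total = donor_peak_area + recipient_peak_area
    if total <= 0:
        raise ValueError("no signal from informative alleles")
    return 100.0 * donor_peak_area / total


def chimerism_state(pct_donor):
    """Map percent donor onto the three states in the notes (no detection limit applied)."""
    if pct_donor >= 100.0:
        return "full chimerism"
    if pct_donor <= 0.0:
        return "graft failure"
    return "mixed chimerism"


print(classify_locus((7, 9), (7, 7)))
pct = percent_donor(donor_peak_area=900, recipient_peak_area=100)
print(pct, chimerism_state(pct))
```

The peak areas in the usage lines are invented numbers, chosen only to show a mixed result.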
Likelihood ratio and linkage equilibrium
When comparing the genotype of a sample from a crime scene with a genotype from a database, the determination that the two genotypes match (are from the same person) is expressed in terms of a likelihood ratio. It is the comparison of the probability that the two genotypes came from the same person with the probability that the two genotypes came from different persons, taking into account allele frequencies and linkage equilibrium in the population. A high likelihood ratio indicates that the two genotypes more likely came from the same person, whereas a likelihood ratio of less than 1 indicates that this is less likely. Ex: If the likelihood ratio is 1/(1/1,000) = 1,000, the tested genotypes are 1,000 times more likely to have come from the same person than from two randomly chosen members of a population where the profile occurs in 1/1,000 people. In a random sampling of 100,000 members of a population, 100 people with the same genotype might be found. Ex: Suppose the Penta D 5 (1/10 allele frequency) and D7S829 8 (1/50 allele frequency) profiles were discovered in a specimen from a source. The likelihood that the profile came from the tested individual is 1, having been directly determined. The likelihood that the same profile could come from someone else in the population is 1/500. The likelihood ratio is 1/(1/500), or 500. The specimen material is 500 times more likely to have come from the tested individual than from some other person in the population.
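Both worked examples can be reproduced with a short Python sketch. The function name `likelihood_ratio` and its default argument are choices made for this illustration; the numbers are the ones given in the notes.

```python
from fractions import Fraction


def likelihood_ratio(profile_frequency, p_same_source=Fraction(1)):
    """Probability of the match if same source, divided by the random-match probability."""
    return p_same_source / profile_frequency


# Penta D 5 (1/10) and D7S829 8 (1/50), combined with the product rule.
profile_frequency = Fraction(1, 10) * Fraction(1, 50)
print(likelihood_ratio(profile_frequency))  # 500

# Profile seen in 1/1,000 people: expected matches in a sample of 100,000.
print(100_000 * Fraction(1, 1000))  # 100
```

The second line of output shows why a likelihood ratio of 1,000 is strong but not conclusive: in a large enough population, other people with the same profile are expected to exist.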
Types of Polymorphic DNA Sequences (good notes)
■RFLP: restriction fragment length polymorphisms. Visualized by gels and Southern blots with probes. These are inherited. ■VNTR: variable number tandem repeats (8 to >50 base pairs) ■STR: short tandem repeats (1-8 base pairs). Different alleles contain different numbers of repeats. Visualized by fragment size on Southern blot. STR alleles can also be analyzed by amplicon size (PCR). STR genotypes are analyzed using gel or capillary gel electrophoresis. If allele 1 from mom has 7 repeats and allele 2 from dad has 8 repeats, the genotype is 7/8. Multiple loci are genotyped in the same reaction using multiplex PCR. These are inherited. ■One allele is inherited from each parent. ■Mini-STRs are STRs on smaller amplicons. Recommended for degraded specimens. Used to identify remains from mass graves and disaster areas. ■SNP: single-nucleotide polymorphisms. ■Single-nucleotide differences between DNA sequences. ■One SNP occurs approximately every 1250 base pairs in human DNA. ■SNPs are detected by sequencing, melt curve analysis, or other methods. ■99% have no biological effect; 60,000 are within genes. ■SNPs are inherited in blocks or haplotypes. ■SNPs can be used for mapping genes, human identification, chimerism analysis, and many other applications. ■The Human Haplotype Mapping (HapMap) Project is aimed at identifying SNP haplotypes throughout the human genome. SNP Haplotype: SNPs that are inherited together in a block, without recombination between them, comprise a SNP haplotype. SNP haplotypes are identifiable by two or three representative SNPs (tag SNPs) within the haplotype.
Mitochondrial DNA Polymorphisms
■Sequence differences in the hypervariable regions (HV) of the mitochondrial genome ■There are an average of 8.5 base differences in the mitochondrial HV sequences of unrelated individuals. ■All maternal relatives will have the same mitochondrial sequences. ■Mitochondrial typing can be used for legal exclusion of individuals or confirmation of maternal lineage. Mitochondria contain a circular genome of 16,569 bp. The two strands of the circular mitochondrial DNA (mtDNA) chromosome have an asymmetric distribution of Gs and Cs, generating a G-rich heavy (H) strand and a C-rich light (L) strand. Each strand is transcribed from a control region starting at one predominant promoter, PL on the L strand and PH on the H strand, located in sequences of the mitochondrial circle called the displacement (D)-loop. The D-loop forms a triple-stranded region with a short piece of H-strand DNA, the 7S DNA, synthesized from the H strand. Bidirectional transcription starts from PL on the L strand and PH1 and PH2 on the H strand. RNA synthesis proceeds around the circle in both directions. A bidirectional attenuator sequence limits L-strand synthesis and, in doing so, maintains a high ratio of rRNA to mRNA transcripts from the H strand. Mature mtRNAs, 1 to 17, are generated by cleavage of the polycistronic (multiple gene) transcript at the location of the tRNA genes. Genes encoded on the mtDNA include 22 tRNA genes, 2 rRNA genes, and 13 genes coding for components of the oxidation-phosphorylation system. Mutations in these genes are responsible for neuropathies and myopathies. In addition to coding sequences, the mitochondrial genome has two noncoding regions that vary in DNA sequence, hypervariable region 1 and hypervariable region 2, or HV1 and HV2. The reference mtDNA hypervariable region is the sequence published initially by Anderson, called the Cambridge reference sequence, the Oxford sequence, or the Anderson reference.
Polymorphisms are denoted as variations from the reference sequence. Nucleotide sequencing of the mtDNA control region has been validated for the genetic characterization of forensic specimens and disease states and for genealogy studies. In contrast to nuclear DNA, including the Y chromosome, mtDNA follows maternal clonal inheritance patterns. With few exceptions, mtDNA types (sequences) are inherited maternally. These characteristics make possible the collection of reference material for forensic analysis, even in cases in which generations are skipped. For forensic purposes, the quality of a match between two mtDNA sources is determined by counting the number of times the mtDNA profile occurs in DNA collections of unrelated individuals, so the estimate of the uniqueness of a particular mtDNA type depends on the size of the reference database. The more mitochondrial DNA sequences are entered into the database, the more powerful identification by mtDNA will become.
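The two ideas above, denoting polymorphisms as differences from a reference and counting how often a profile appears in a database, can be sketched in Python. Everything in this example is a toy: the 8-base sequences are invented (they are not the Cambridge reference sequence), the function names are hypothetical, the sequences are assumed to be pre-aligned and of equal length, and only substitutions are handled.

```python
def variants_from_reference(reference, sample, start_position=1):
    """List substitutions as (position, reference_base, sample_base) tuples."""
    if len(reference) != len(sample):
        raise ValueError("sketch assumes pre-aligned sequences of equal length")
    return [
        (start_position + i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


def count_in_database(profile, database):
    """Count how many database entries carry exactly the same variant profile."""
    return sum(1 for entry in database if entry == profile)


# Invented toy sequences, for illustration only.
reference = "ACGTACGT"
sample = "ACGTATGT"

profile = tuple(variants_from_reference(reference, sample))
print(profile)  # ((6, 'C', 'T'),)

# Toy database of variant profiles from unrelated individuals (made-up entries).
database = [profile, (), profile, ((3, "G", "A"),)]
print(count_in_database(profile, database), "of", len(database))
```

With only four entries the count says very little, which is the point made in the notes: the estimate of how rare an mtDNA type is only becomes useful as the reference database grows.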