(11/13/23) - Lecture 2: QTL mapping and sequencing

molecular markers EXAMPLE: zea mays - 10 chromosomes --> 72 microsatellites --> 40 RFLPs (slides+notes)

*don't worry about the specific numbers or types, JUST THAT THERE ARE MANY AND THEY ARE SPREAD THROUGHOUT THE GENOME - molecular markers do not contribute to the QT of interest, they are just "street signs" that help us narrow the location of the genes that DO play a role in these traits (so markers are NOT THE GENE, they just tell us where genes are) --> 112 markers all across 10 chromosomes

P and L produced results... ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- ...that encouraged other scientists to start research in other things --> QTLs made for other crops --> interest in new marker technology development (facilitated more accurate analyses of increasingly complex organisms (like wheat and sugar cane))

QTL map for corn and spikelets (slides+notes)

- 4 QTL found, or 4 regions of the genome that appear to control spikelet number --> these are the regions that will be targeted in future research - found that the ILP allele at marker P200075 is associated with more spikelets (it is a correlation) - a high LOD score means the peak is above the threshold/cutoff line; where the peaks rise above this line, there is a difference in trait value that is associated with the molecular marker --> the regions surpassing the cutoff are the QTLs (a gene contributing to the number of spikelets lies somewhere in this region) - QTLs tell us to do research in the area around these molecular markers (the genes are somewhere on that street/neighborhood)

genetic markers ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- DNA polymorphisms in the cutting sites of common restriction enzymes, known as restriction fragment length polymorphisms (RFLPs) --> markers do NOT influence complex traits themselves, but the frequency at which a particular marker's genotype corresponds to a change in the complex phenotype indicates the distance between the marker and the real QTL

what about QTLs? what is the issue? (slides+notes)

- QTL = "Quantitative Trait Loci" - these are the genes that contribute to our additive quantitative traits --> no single gene is responsible for the trait *need to take knowledge from gene maps and apply it to something new

QTLs (background)

- Quantitative Trait Loci --> the physical locations in the genome of the genes that contribute to quantitative traits

recently ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- SNP markers made it possible to genotype changes in a single nucleotide and GWAS has made it possible to adapt QTL analysis to higher order mammals --> both GWAS and QTL are being used to understand genetic factors which underpin human disease

review: genetic map for Drosophila (slides+notes)

- THM's group was able to map the locations of dozens of Drosophila mutations (made maps based on the order and position of many different mutations) --> recombination frequency is related to distance between loci

Sanger sequencing and COVID? (slides+notes)

- a Sanger sequencing protocol for SARS-CoV-2-S-gene --> used to sequence the spike proteins on the new COVID variants

independent assortment made it impossible to study what? ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- a single gene for a complex trait in isolation --> it is like piloting a plane when you can only change the dials randomly (so geneticists needed to figure out how the controls work)

people scanning the river for gold flakes (example) ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- can't determine the gold deposits location directly but with many indirect measurements they can make predictions and greatly narrow down the search --> the more information you have the greater the accuracy

review: quantitative traits (slides+notes)

- continuous phenotype distributions (bell-shaped curve) - many contributing loci (not just one gene anymore; many loci, many genes contributing) --> ex) height is influenced by roughly 12,000 loci - environmental influence --> ex) human body mass is affected by what you eat (the quantity and the type of diet)

Sanger Sequencing (slides+notes)

- developed by Frederick Sanger and colleagues in 1977 and commercialized in 1986 --> Sanger was a 2 x Nobel Prize winner, both times for determining sequences: 1) the amino acid sequences of insulin and other proteins 2) the first DNA sequencing technique --> uses dideoxynucleotides (ddNTPs) to randomly terminate DNA replication, then runs those samples on a high-resolution gel *the modified nucleotides cause DNA replication to halt, which produces fragments of different lengths; these are run through a gel to figure out the order of the bases

molecular markers (slides+notes)

- features/fragments of DNA that are associated with known locations within the genome --> NOT A QUANTITATIVE TRAIT, we can't see it just by looking at an individual (not what is causing the phenotype)

key advancements: how do we automate repetitive tasks? (slides+notes)

- first, radioactive labels were replaced with fluorescent labels --> can start automating reads with sensors - then make one big mix and run it through a single tube with a laser and sensor to read the fluorescence as the "bands" run off the "gel" (capillary electrophoresis) --> chromatogram

P and L's RFLP map ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- had 70 markers with an average spacing of 14.3 cM across all 12 chromosomes --> researchers selected parents with very different phenotypes to maximize the number of detectable QTLs 1) Lycopersicon chmielewskii (CL) 2) Lycopersicon esculentum (E) - these two were bred through a backcross designed to produce 237 plants

what genes control tassel traits? - the spikelets on the central spike of corn (slides+notes)

- had maize recombinant lines made from two parent lines, ILP and B73 --> ILP average spikelets = 177.4 --> B73 average spikelets = 128.6 - then many recombination events later, in the 8th generation for example, the line averages ranged from 100.1 to 267.6 spikelets (transgressive segregation: the offspring phenotypes are more extreme than those of the parents) * the spikelets varied in shape, size, and count, but they decided to look at the count

your research group has just discovered a new fly mutation that causes the fly's blood to be replaced with Nickelodeon slime; you want to figure out where this gene is in the fly genome...based on what you've learned so far, how might you do this? (worksheet)

- look for correlated or "linked" mutants; those are in the same linkage group, so they are close on the chromosome; to get a more refined picture, you would then do test crosses and calculate recombination frequencies to make a gene map that includes your new mutant --> test cross with a gene that you know the location of, calculate the recombination frequency to see how close you are to the known gene ====> maybe run the heterozygous and homozygous cross and see how well expected phenotypes match the actual ones --> QTL uses molecular markers instead of morphological ones

the general idea of QTL mapping (slides+notes)

- look for correlations between our quantitative trait of interest (encoded by QTLs) and known molecular markers --> similar to what we suggested with the flies, take our new trait and look for correlations between it and known mutations - the higher the correlation = the closer the location of the QTL is to the molecular marker

imagine an allele from parent 2 in the very top region of the blue allele contributes to increased plant height (slides+notes)

- molecular markers in this region will be more correlated with plant height because they are closer to the plant height gene than markers elsewhere *several lines kept the blue tip, so you would expect those individuals to be taller and would look for correlations between height and molecular markers in this top region - those markers would be correlated with height because they are close to the gene that increases height --> lines with red at the tip would be shorter: they do not carry the blue marker allele near the gene, so you would not see an association between increased height and the red marker allele at the top

mapping the genes controlling quantitative traits (slides+notes)

- need a set of offspring that is very variable at the different loci --> start with inbred lines that differ in the quantitative trait of interest and have different alleles at the molecular markers ===> let recombination occur; the resulting recombinant inbred lines make up a RIL population (one kind of mapping population), and in this RIL population you measure the phenotype of interest - take 2 inbred lines that differ in molecular markers along the DNA and are homozygous at all loci, then mate them for many generations (maybe 8) - measuring the phenotype of interest means looking at all the different progeny with different combinations of alleles (from crossing over, shuffling of genetic material, etc.) --> want to figure out which alleles they have at the markers and then look at correlations (do tall ones have the blue molecular marker at the top? would blue alleles influence height?)

molecular markers for the zea mays spikelet experiment (slides+notes)

- need to genotype all the lines at all the markers to identify which parent allele is associated with more spikelets on average --> at a specific marker (e.g. at the top of the right chromosome) there are only two possible alleles, ILP and B73, and you need to ask: does our trait (TS) differ between inbred lines with ILP vs B73 alleles? if it DOES, then something nearby matters for the trait *inbred lines, different alleles at markers - genotyped all the individuals: restriction cutting run on a gel, and PCR with primers on both sides of the repeated regions, then run on a gel to see how far the fragments migrated and compare banding patterns
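
The question on this card ("does our trait differ between lines with ILP vs B73 alleles?") can be sketched as a comparison of group means at one marker. This is only an illustration: the spikelet counts below are invented, and `mean_trait_by_allele` is a hypothetical helper, not something from the lecture or the study.

```python
from statistics import mean

# Hypothetical data: (allele at one marker, spikelet count) for eight RILs.
# These numbers are invented for illustration, not taken from the study.
rils = [
    ("ILP", 190), ("ILP", 175), ("ILP", 201), ("ILP", 182),
    ("B73", 131), ("B73", 125), ("B73", 140), ("B73", 120),
]

def mean_trait_by_allele(lines):
    """Return {allele: mean trait value} for lines genotyped at one marker."""
    groups = {}
    for allele, trait in lines:
        groups.setdefault(allele, []).append(trait)
    return {allele: mean(values) for allele, values in groups.items()}

means = mean_trait_by_allele(rils)
difference = means["ILP"] - means["B73"]
print(means, difference)
```

A large difference between the two group means suggests that something near this marker matters for the trait; a real analysis would also test whether the difference is statistically significant.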

do molecular markers code for proteins that directly influence the trait of interest? (worksheet)

- no, molecular markers do not code for proteins, they are features of the DNA that we can use like street signs to LOCATE the QTLs

why map QTL? (slides+notes)

- one reason is marker-assisted breeding! --> on a strand of DNA, a previously identified marker signals a desirable trait --> seedlings with multiple desired traits are cultivated into adult plants ====> chipping seeds: to speed the process further, scientists can now chip and test a seed before germination ====> sampling seedlings: a tiny snip from the leaf of a sprout gives a geneticist a clear picture of the plant's potential as a prospect; chemical markers highlight genes associated with desirable traits, such as yield, quality and disease resistance; markers are used in selecting the parents for the initial cross as well as in several successive generations * can improve plant breeding if you know where the genes for certain traits are or might be (and can make it more efficient to breed plants with desirable traits)

P and L's approach. ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- positioned quantitative trait loci (QTLs) within the genome by association with genetic markers

quantitative traits (QTs) and QTLs (background)

- present a unique challenge when compared with discrete traits

Sanger Sequencing uses ddNTPs to halt strand synthesis (slides+notes)

- regular dNTPs can be added to the growing strand because they have a 3' hydroxyl group, so DNA polymerase can add another nucleotide onto them and the strand continues to elongate - ddNTPs have only a hydrogen at the 3' position, so DNA polymerase cannot attach anything to them; no elongation occurs and synthesis terminates

reading the sequence (Sanger sequencing video notes)

- sequence is read bottom up, and this whole process results in the complementary sequence of the DNA sample

an example of QTL mapping from the scientific literature: - what genes control tassel traits? (slides+notes)

- tassels are a group of male flowers - spikelets are male flowers (make pollen) --> exploring the variation in tassel shape and size (specifically spikelets) in zea mays (corn)

what components are necessary to sequence DNA using Sanger Sequencing? (worksheet)

- template DNA - dNTPs for all 4 bases (A, C, T, G) - primers - ddNTPs (one type in each tube) - DNA polymerase - polyacrylamide gel

components for Sanger sequencing (slides+notes)

- template DNA (what we are sequencing) - primers - DNA polymerase - dNTPs for all 4 bases (G, C, A, T) (normal nucleotides) - ddNTPs (modified nucleotides; they can be added, but because they don't have the 3' OH (hydroxyl), there is nothing for the next base to attach to, so synthesis is halted and the DNA polymerase falls off) --> a version of dNTPs without a hydroxyl group on the 3' carbon (halts DNA synthesis) --> labeled (type depends on version of Sanger Sequencing) - initially radio-labeled (all black) but now a fluorescent marker is used so a laser can read it and some of the process is automated --> how many at a time depends on the version of Sanger Sequencing (also, where the ddNTPs are added is random) - polyacrylamide gel --> more resolution than an agarose gel --> single base differences in length are visible

what did they find with the B73 and ILP graph marker thing? (slides+notes)

- that the genotype at marker P200075 (for example a SNP such as an A to G base switch, or the length of a repeated element) is associated with spikelet number --> lines with the B73 allele at this marker had few total spikelets, while lines with the ILP allele had many - looking for statistical association between markers and traits of interest --> there is a difference in trait value between lines with the ILP vs B73 allele - the plot produces a high LOD score at this marker, which surpasses the significance cutoff or threshold, showing that the association is significant and that it is a QTL * found that the 2 inbred lines have different alleles at this locus, P200075, so at this locus there are only two options: ILP or B73

review: it is less likely that crossing over will occur between two loci that are close to each other than those that are far away (slides+notes)

- the further apart two places on a chromosome are, the more likely they will experience recombination - recombination frequency (RF) can be related to the distance between loci --> the closer to 0% = the closer the loci are --> if RF is 50%, the genes assort independently and are not part of the same linkage group - this information can be used to create genetic maps
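
As a worked example of the relationship on this card, here is a minimal sketch that computes recombination frequency from test-cross offspring counts. The counts are hypothetical, and `recombination_frequency` is a name made up for this illustration.

```python
def recombination_frequency(parental, recombinant):
    """RF = recombinant offspring / total offspring, as a percentage."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("need at least one offspring")
    return 100.0 * recombinant / total

# Hypothetical test-cross counts (invented for illustration).
rf_close = recombination_frequency(parental=920, recombinant=80)  # tightly linked
rf_far = recombination_frequency(parental=510, recombinant=490)   # nearly unlinked
print(rf_close, rf_far)
```

The first pair of loci gives an RF of 8%, so they sit close together; the second gives 49%, which is near the 50% expected for independent assortment. For short distances, 1% RF corresponds to roughly 1 cM on a genetic map.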

as briefly as possible, how did THM and colleagues create a genetic map for Drosophila? (worksheet)

- they generated many mutations in Drosophila, made test crosses, calculated the recombination frequencies, and determined the order of and the relative distances between their genes

(recap) what is the relationship between recombination frequency and distance between gene loci? (worksheet)

- the lower the recombination frequency (RF), the closer the gene loci are located

P and L described? ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- the position and effect of 14 QTLs spread across three traits

what are the potential problems related to finding the locations of these additive quantitative trait genes - especially compared to something like THM's Drosophila gene map? (slides+notes)

- there are often multiple genes involved --> a mutation in one gene might not have a big impact on the phenotype (a tiny effect) - the magnitude of the mutation's effect might be small and hard to see

what is the challenge with applying this approach to map the locations of genes that contribute to additive quantitative traits (Quantitative Trait Loci = QTL's) (worksheet)

- there are typically many loci that contribute to quantitative traits; a mutation in only one of these loci may not be easily detected

reading a Sanger sequence (slides+notes)

- there is DNA to be sequenced; you add a primer to it, run the sequencing reactions, and then do PAGE gel electrophoresis --> need to read from the bottom up!

when we create a mutant allele at one of these QTLs, how do we detect a change in phenotype? how do we tell if the reason an offspring is 1% shorter than its parents is the result of a mutation or variation in the developmental environment? (background)

- these potentially subtle effects make QTLs challenging to find using the exact methods described previously --> however, if we apply our knowledge of linkage mapping and molecular markers to the whole genome and look for correlations between QTs and known molecular markers, we can narrow down where the QTLs are (which is the idea behind QTL mapping)

what are complex phenotypes controlled by? ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- they are controlled by a finite number of genes, each with a discrete influence on the overall phenotype, but the number of genes is so large that the impact of any 1 gene is overshadowed by environmental variation, which gives the illusion of continuity

P and L overcame the limitation ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- they broke through the barrier of blind spots in the genome with the RFLP markers --> because the markers were just nucleotide polymorphisms, they were more ubiquitous than morphological markers, which required functional changes to a gene - information from RFLPs made interval mapping and Maximum Likelihood methods feasible for the 1st time --> QTL mapping reached a level of detail and elegance comparable to Thomas Hunt Morgan's map of Drosophila

why or how do ddNTPs halt DNA elongation? (worksheet)

- they lack a hydroxyl group on the 3' carbon meaning that DNA polymerase cannot add any more bases (in the form of dNTPs)

in your own words, what is the general idea behind QTL mapping? (worksheet)

- to look for associations between molecular markers and quantitative trait phenotypes - a high association means that the molecular marker and the QTL are in the same region of the genome; for this to work, we need to establish a list of well-spread molecular markers that have multiple alleles (which we can do by starting with inbred lines and allowing them to mate and for recombination to occur during the production of each new generation - this mixes the QTL and molecular markers so we can see which are "linked" and which are not)

QTL mapping (slides+notes)

- use linkage mapping to find QTLs; the topics build off one another

polyacrylamide gel electrophoresis (Sanger sequencing video notes)

- used to separate the DNA fragments by length for sequencing (the gel is denser than agarose) - DNA migrates from the negative to the positive pole, due to the negative charge imparted by the DNA's phosphate backbone --> the smaller and lighter lengths of DNA migrate further toward the bottom of the electrophoresis plate - polyacrylamide is used instead of agarose gel because of its high resolving power: it can separate DNA strands that differ in length by just 1 base

1988 Paterson and Lander ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- wanted to work out how complex traits are genetically controlled, using 3 complex traits in tomatoes as a test case

early experiments in complex analysis ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

- were able to prove the existence of QTLs but couldn't accurately locate them or quantify their effect --> limitation arose because early papers relied on morphological markers, which could be genotyped by observing an organism (but they were scarce and unevenly distributed) ====> nearly impossible to find a set of these markers which covered every 50 cM linkage region - meaning there were blind spots in the genome

steps of Sanger (Sanger sequencing video notes)

1) DNA to be sequenced has to be amplified 2) heat then denatures the DNA into two single strands, the complementary strand and the template strand used for sequencing 3) a primer is then annealed to the 3' end of the template strand (so the new strand can be built 5' to 3') 4) the primed DNA is then divided equally among the 4 reaction vessels 5) DNA polymerase is added to all 4 reaction vessels 6) all 4 dNTPs are added to each reaction vessel 7) specially modified ddNTPs are then added to the reaction vessels (only 1 type of ddNTP is added to each reaction vessel) 8) the polymerase attaches dNTPs opposite the template strand, starting at the primer, until a ddNTP is base paired 9) once the ddNTP is base paired, synthesis is terminated because the ddNTP lacks a hydroxyl group at the 3' carbon 10) as a result of the chain termination, DNA fragments of different lengths are formed across all the reaction vessels
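
The chain-termination steps above can be sketched as a small simulation. It is a simplification under stated assumptions: instead of terminating at random, it lists every position where a ddNTP could have been incorporated in each vessel, and the template sequence and all function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesized_strand(template_3_to_5):
    """Strand the polymerase builds 5'->3' when copying the template read 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

def terminated_fragment_lengths(new_strand, dd_base):
    """Lengths of fragments that end in the given ddNTP (one reaction vessel)."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]

def read_gel(new_strand):
    """Collect fragments from all four vessels and read shortest to longest."""
    bands = []
    for dd_base in "GATC":
        for length in terminated_fragment_lengths(new_strand, dd_base):
            bands.append((length, dd_base))
    # Shortest fragments migrate furthest, so sorting by length is "bottom up".
    return "".join(base for _, base in sorted(bands))

template = "TACGGTCA"  # hypothetical template, written 3'->5'
new_strand = synthesized_strand(template)
print(new_strand, read_gel(new_strand))
```

Reading the bands from shortest to longest recovers the newly synthesized strand, which is the complement of the template. This is why the gel is read from the bottom up.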

3 complex traits in tomatoes ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

1) fruit mass 2) pH 3) soluble solid concentration

CL and E breeding and process steps ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

1) genotyped the plants into E and CL type subsets for each of the 70 markers 2) partitioned the interval between each pair of adjacent markers into several points and estimated the probability of the data if a QTL existed at each point 3) these probabilities were divided by the probability of the data if no QTL existed at the same point 4) the base-10 logarithm of these ratios gave the set of LOD scores - a LOD score of 3 was considered significant (meaning 1000 to 1 odds in favor of the proposed QTL) 5) estimated the effect of the most likely putative QTL using the Maximum Likelihood Method adapted by Lander himself
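
Steps 3 and 4 amount to taking the base-10 logarithm of a likelihood ratio. A minimal sketch, with invented likelihood values and hypothetical function names, shows why a LOD score of 3 corresponds to 1000 to 1 odds:

```python
import math

def lod_from_likelihoods(likelihood_qtl, likelihood_no_qtl):
    """LOD = log10 of the ratio of the two likelihoods."""
    return math.log10(likelihood_qtl / likelihood_no_qtl)

def odds_from_lod(lod):
    """Convert a LOD score back to an odds ratio."""
    return 10 ** lod

# Hypothetical likelihood values (invented for illustration).
lod = lod_from_likelihoods(likelihood_qtl=2.0e-5, likelihood_no_qtl=2.0e-8)
print(lod, odds_from_lod(3))
```

Because the scale is logarithmic, each extra LOD unit means the data favor the QTL model by another factor of 10.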

types of molecular markers (slides+notes)

1) microsatellites: - regions of repeated DNA sequences in which each repeat unit is at most a few base pairs long (short tandem repeats) - vary in length between several "alleles" --> during mitosis (when replicated) or meiosis (when being made into gametes), an extra repeat is put in or left out - so there are different lengths (they do not code for anything functional but are just sections of DNA that might be longer (more repeats) or shorter (deletions)) 2) restriction fragment length polymorphism (RFLP): - caused by differences in how restriction enzymes interact with the different "alleles" --> restriction enzymes from bacteria cut the DNA at specific sites and produce different banding patterns (researchers cut up the DNA and compare the banding patterns; each potential cut site can have different alleles)
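
To illustrate how an RFLP produces different banding patterns, here is a minimal sketch that "digests" two alleles and compares fragment lengths. The two sequences are invented; `GAATTC` is the EcoRI recognition sequence, but cutting at the start of the site is a simplification (the real enzyme cuts within it), and `fragment_lengths` is a hypothetical helper.

```python
def fragment_lengths(sequence, site):
    """Cut a linear sequence at the start of every occurrence of `site`
    and return the resulting fragment lengths."""
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    lengths = [b - a for a, b in zip(boundaries, boundaries[1:])]
    return [length for length in lengths if length > 0]

SITE = "GAATTC"  # EcoRI recognition sequence
# Hypothetical alleles: allele_2 has one base changed inside the second site.
allele_1 = "ATATGAATTCGGCCGGCCTTGAATTCATAT"
allele_2 = "ATATGAATTCGGCCGGCCTTGAGTTCATAT"
print(fragment_lengths(allele_1, SITE), fragment_lengths(allele_2, SITE))
```

Allele 1 is cut twice and gives three fragments, while allele 2 has lost one site and gives two. On a gel these show up as different banding patterns, which is what makes the site usable as a marker.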

genetic traits have 2 broad categories ("Understanding Quantitative Trait Loci with the help of tomatoes" pre-lecture article/video notes)

1) simple (discrete) phenotypes 2) complex (continuous) phenotypes

pre-lecture quiz question 4 which of the following is a benefit of using RFLP markers to map traits over morphological markers? A - more densely and evenly spread through the genome B - they are easily visible in the offspring of a cross C - they are more scarce and unevenly distributed through the genome D - A and B E - all of the above

A - more densely and evenly spread through the genome

pre-lecture quiz question 1 it is just as easy to study individual gene loci for complex or quantitative traits as it is for single simple or qualitative traits A - true B - false

B - false

pre-lecture quiz question 3 molecular markers, such as RFLPs, influence the phenotypic trait of interest A - true B - false

B - false

iClick: allele B is from the specific gene that increases bristle number A - true B - false (slides+notes)

B - false --> the molecular marker COULD be in the same region but it ISN'T the gene; molecular markers are NOT the genes that cause the trait, they are physical characteristics of the DNA, the house or street of the gene * markers are close to the allele/gene but are not the allele/gene

post-lecture quiz question 2 which of the following best describes the role of molecular markers during QTL mapping? A - molecular markers are the specific genes that contribute to QT phenotypes B - molecular markers help us determine which regions of the genome our QTLs are located C - molecular markers regulate the expression of QTLs D - molecular markers change the outward appearance of the study organism E - molecular markers are special, tiny markers that bacteria use to create masterpieces of art

B - molecular markers help us determine which regions of the genome our QTLs are located

pre-lecture quiz question 2 what did Paterson and Lander use to map quantitative trait loci (QTLs) in the genome? A - radar B - molecular markers such as RFLPs C - bacterial conjugation and time maps D - sonar E - full genome sequencing

B - molecular markers such as RFLPs

post-lecture quiz question 3 if on a GWAS, there is a QTL that passed the significance cutoff or threshold, what does this mean, what does this QTL mean? A - we know which gene controls the trait B - some gene linked to this marker affects the phenotype C - the marker gene itself affects the phenotype

B - some gene linked to this marker affects the phenotype

post-lecture quiz question 5 interpret the DNA sequence on the polyacrylamide gel that was made using Sanger sequencing; the ddNTPs were radio-labeled, hence why they are black G_A_T_C ---x----- x-------- --------x --------x -----x--- x-------- x-------- ---x----- -----x--- -----x--- -----x--- ---x----- --------x x-------- --------x -----x--- x-------- ---x----- ---x----- --------x ---x----- x-------- -----x--- A - GGGGGGAAAAAATTTTTTCCCCC B - AGCCTGGATTTACGCTGAACAGT C - TGACAAGTCGCATTTAGGTCCGA D - TGACAAGTCCAGTTATGGTCCGA E - GACCTGGATTATCGCTAGAACGT

C - TGACAAGTCGCATTTAGGTCCGA
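
The bottom-up reading for this question can be checked with a short sketch. The rows are copied from the gel in the question; the mapping from column position to lane (G, A, T, C) is an assumption about how the flattened picture lines up, and `read_sanger_gel` is a hypothetical helper.

```python
# Column index of each lane in the text rendering of the gel (an assumption
# about how the flattened picture lines up with the G, A, T, C lanes).
LANE_AT_COLUMN = {0: "G", 3: "A", 5: "T", 8: "C"}

GEL_ROWS_TOP_TO_BOTTOM = """
---x----- x-------- --------x --------x -----x--- x-------- x--------
---x----- -----x--- -----x--- -----x--- ---x----- --------x x--------
--------x -----x--- x-------- ---x----- ---x----- --------x ---x-----
x-------- -----x---
""".split()

def read_sanger_gel(rows_top_to_bottom, lane_at_column):
    """Read one band per row, starting from the bottom (shortest fragment)."""
    bases = []
    for row in reversed(rows_top_to_bottom):
        bases.append(lane_at_column[row.index("x")])
    return "".join(bases)

sequence = read_sanger_gel(GEL_ROWS_TOP_TO_BOTTOM, LANE_AT_COLUMN)
print(sequence)
```

Reading from the bottom gives answer C. Reading the same rows from the top down would give answer B, which is the common mistake this question is designed to catch.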

iClick: you've made a RIL/QTL mapping population starting with the lowest and highest bristle numbers on fly wings, and here are phenotypes and marker data for just two marker loci (A and B) for nine of your RILs - phenotype, most bristles to the left (1), least bristles to the right (9) phenotype:__1____2___3___4___5___6___7___8___9 locus A_____:_aa_AA_aa_Aa_aa_Aa_AA_aa_Aa locus B_____:_BB_BB_BB_Bb_Bb_Bb_bb_bb_bb - given your results, what can you conclude? A - allele A is linked to a QTL that increases bristle number B - allele a is linked to a QTL that increases bristle number C - allele B is linked to a QTL that increases bristle number D - allele b is linked to a QTL that increases bristle number (slides+notes)

C - allele B is linked to a QTL that increases bristle number --> where there are more B's = more bristles --> where there are fewer B's = fewer bristles (correlated)
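
The reasoning behind this answer can be sketched as a correlation between marker genotype and phenotype. The genotypes are the ones from the question; using `10 - rank` as a stand-in bristle score is an assumption (the question only gives the ordering), and the helper names are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Genotypes for RILs 1..9, ordered from most bristles (1) to fewest (9).
locus_a = ["aa", "AA", "aa", "Aa", "aa", "Aa", "AA", "aa", "Aa"]
locus_b = ["BB", "BB", "BB", "Bb", "Bb", "Bb", "bb", "bb", "bb"]

# Assumption: use 10 - rank as a stand-in bristle score (9 = most bristles).
bristle_score = [10 - rank for rank in range(1, 10)]

def uppercase_allele_count(genotype):
    """Number of uppercase (A or B) alleles in a two-letter genotype."""
    return sum(1 for allele in genotype if allele.isupper())

r_a = pearson([uppercase_allele_count(g) for g in locus_a], bristle_score)
r_b = pearson([uppercase_allele_count(g) for g in locus_b], bristle_score)
print(round(r_a, 2), round(r_b, 2))
```

Locus B shows a strong positive correlation with the bristle score (about 0.95), while locus A shows essentially none (about -0.11). That pattern is what points to allele B being linked to a QTL that increases bristle number.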

iClick: which of the following statements is true? A - genes that are closer together on a chromosome are more often inherited together B - the distance between genes is positively correlated with the probability of recombination C - all pairs of genes are equally likely to have recombination happen between them D - A and B E - all of the above (slides+notes)

D - A and B --> further apart = more likely to recombine

post-lecture quiz question 1 which of the following is an example of a molecular marker? A - microsatellites (regions of repeated DNA sequences) B - restriction fragment length polymorphism (RFLP) C - morphological mutations D - A&B E - all of the above

D - A&B

post-lecture quiz question 4 which of the following components is necessary for Sanger sequencing? A - ddNTPs B - dNTPs C - template DNA D - DNA polymerase E - primers F - all of the above

F - all of the above

what is the sequence of DNA based on this Sanger Sequencing gel? GATC -----x -----x x----- --x--- ----x- x----- ----x- x----- (worksheet)

GTGTAGCC --> need to read this bottom to top not top to bottom

an oldie, but a goodie: pros and cons of Sanger sequencing (slides+notes)

pros: - widely available - great for smaller-scale projects (less expensive) - relatively long DNA sequences (>500 nucleotides) - very low error rate (99.99% accurate) cons: - very slow and expensive for full genomes * it is still used today to sequence more challenging regions of genomes, to perform spot checks, and for targeted sequencing --> old: still separate lanes, but a laser and a reader at the bottom figure out which fragments come off in what order --> new/now: mix all of it together into 1 big vial and run it through a capillary tube (1 lane); as the fragments run off, a laser hits them, they fluoresce, and that is where we get these chromatograms

