LS 3-2


Method 1: nuclear transfer to an oocyte

Dolly. You take an oocyte, remove its nucleus, and import a different nucleus from a different, differentiated cell (for Dolly, a somatic cell from an adult sheep). The oocyte reprograms the implanted nucleus back to an undifferentiated state. This is how we've cloned many animals: by implanting a nucleus.

How can the concept of DNA replication be used in DNA studies?

DNA polymerase is the enzyme that synthesizes DNA molecules from deoxyribonucleotides.

The future: using induced pluripotent stem cells for medical research and novel therapies

Being able to control how a stem cell differentiates, and being able to take a person's somatic cells and make stem cells from them, has medical applications that could eventually reduce or even eliminate the need for donor organ transplants.

Nobel prize in physiology or medicine 2012

"For the discovery that mature cells can be reprogrammed to become Pluripotent" John Gurdon and Shinya Yamanaka Know the concept of reversing heterochromatin and making more euchromatin

Summary: Assembly of the Pre-Initiation Complex

1. Binding of TFIID (TBP plus TBP-associated factors, TAFs)
2. Binding of TFIIB (and sometimes TFIIA)
3. Binding of Pol 2, TFIIF, TFIIE (ignore), and TFIIH
4. Phosphorylation of the CTD (carboxy-terminal domain) tail of Pol 2 and unwinding of DNA
5. Formation of the first phosphodiester bonds

First cycle of a PCR reaction

1. Heat to separate the strands.
2. Add synthetic oligonucleotide primers; cool. The primers anneal so that you have a forward primer and a reverse primer, each complementary to the sequence at one edge of the region to be amplified in the original DNA. Each template strand has a 3' and a 5' end, and each primer anneals antiparallel to its template, so the primer's 5' end becomes the 5' end of the new strand. A primer can be designed to sit just outside the gene, or so that it corresponds to the very edge of the gene. Either way, design the primers so that the whole sequence of your gene is amplified; if part of it is missing, that may disrupt your downstream applications. The primers become part of the new DNA.
3. Add thermostable DNA polymerase to catalyze 5' -> 3' DNA synthesis. In the elongation phase both strands are copied, so the amount of target DNA is now 2x. The new strands begin at the primers, so they lack the sequence outside the primers. If you repeat the cycle enough times, you end up with almost exclusively the region around your gene.
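The primer logic and the doubling per cycle can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the example sequence, the 6-nt primer length, and the function names `design_primers` and `copies_after` are made up for the sketch, and real PCR never doubles perfectly in every cycle.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def design_primers(top_strand: str, start: int, end: int, length: int = 6):
    """Return (forward, reverse) primers, both written 5'->3'.

    The target is top_strand[start:end]. The primers sit at the very
    edges of the target, so the whole target is amplified.
    """
    target = top_strand[start:end]
    forward = target[:length]                       # same sequence as the top strand; anneals to the bottom strand
    reverse = reverse_complement(target[-length:])  # anneals to the top strand
    return forward, reverse


def copies_after(cycles: int, starting_copies: int = 1) -> int:
    """Idealized PCR: the amount of target doubles every cycle."""
    return starting_copies * 2 ** cycles


top = "GGATGCCATTGACCGTAAGG"          # hypothetical template, 5'->3'
print(design_primers(top, 2, 18))     # primers flanking positions 2..17
print(copies_after(1), copies_after(30))
```

After one cycle the count is 2x, which is the statement on the card; the exponential term is why about 30 cycles is enough to go from one molecule to roughly a billion.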

Trinucleotide expansion: caused by unequal crossover during meiosis

1. Homologs pair up.
2. Repeats misalign. Crossing over and recombination occur.
3. The products are unique: one chromosome gains repeats and the other loses them.
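A small sketch of the bookkeeping behind unequal crossover, assuming each homolog is reduced to a count of repeat units and the crossover happens after repeat `i` on one homolog and after repeat `j` on the other. The function name and the numbers are hypothetical.

```python
def unequal_crossover(n_a: int, n_b: int, i: int, j: int):
    """Return the repeat counts of the two recombinant products.

    n_a, n_b: repeat units on homolog A and homolog B.
    i, j:     number of repeats to the left of the crossover point
              on A and on B. Misalignment means i != j.
    """
    if not (0 <= i <= n_a and 0 <= j <= n_b):
        raise ValueError("crossover point must lie inside the repeat tract")
    product_1 = i + (n_b - j)   # left part of A joined to right part of B
    product_2 = j + (n_a - i)   # left part of B joined to right part of A
    return product_1, product_2


print(unequal_crossover(20, 20, 10, 10))  # aligned: counts unchanged
print(unequal_crossover(20, 20, 15, 10))  # misaligned: one expands, one contracts
```

The total number of repeats is conserved; misalignment only redistributes them between the two products.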

General Overview Eukaryotic Transcription

1. Pol 2 recruited to DNA by transcription factors
2. Formation of transcription bubble
3. Phosphorylation of CTD during initiation
4. Elongation
5. Dephosphorylation of CTD; transcription terminates

Sigma factor guides RNA polymerase binding to specific sequences and facilitates unwinding of the DNA double strand

1. Polymerase bound nonspecifically to DNA -> 2. Specific binding of sigma to the -35 and -10 promoter sequences -> 3. Closed-promoter complex -> 4. Unwinding of DNA around the initiation site -> 5. Open-promoter complex -> 6. Initiation of transcription
We start with a polymerase that's bound nonspecifically to DNA. The sigma factor scans the DNA for the promoter region; particular sites within sigma recognize the -35 and -10 sequences. Imagine it carrying RNA polymerase along the DNA. Once it finds a promoter region, it stops, because it recognizes that sequence and forms very strong protein-DNA interactions there. It won't stop anywhere else, because sigma recognizes only those specific sequences. So the polymerase is guided to the region of the DNA where transcription needs to initiate, and the sigma factor physically unwinds the DNA at that point. This is done without an energy input. In EK a helicase is used, but in PK the way sigma factor sits on the DNA at the promoter makes it energetically favorable for the two DNA strands to come apart. Before the DNA is unwound, it's called a closed promoter complex. Once it is unwound, that's an open promoter complex, or an open transcription bubble. These are the initiation steps: find the promoter region (still a closed promoter complex), then sigma factor unwinds the DNA and makes the transcription bubble (the open promoter complex).

Review: What substrates does DNA polymerase require?

1. Primer 2. Template strand 3. Deoxynucleoside triphosphates
(slide) We have a single-stranded piece of DNA with a primer annealed to part of it, so in total we have a small piece of double-stranded DNA. The 3' hydroxyl group belongs to the sugar of the last nucleotide of the primer (a thymine nucleotide on the slide). A simplified way of showing that: a row of nucleotides with a 3' hydroxyl group hanging off the end. DNA is always synthesized in a 5' -> 3' direction. It can't go the other way around; remember that the 3' hydroxyl is the nucleophile. The polymerase can now catalyze the addition of another nucleotide (guanine on the slide) onto the 3' hydroxyl group. The direction of synthesis comes from the primer: it has a 5' and a 3' end, and as you add nucleotides to its 3' end you synthesize in a 5' to 3' direction.
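The three requirements (primer, template, dNTPs) and the fixed 5' -> 3' direction can be illustrated with a toy primer-extension function. This is a sketch, not a model of the enzyme: the template is written 3' -> 5' so that it lines up position by position with the new strand, the primer is written as DNA for simplicity, and the name `extend_primer` is hypothetical.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5: str, primer_5to3: str) -> str:
    """Extend a primer along a template, adding bases only at the 3' end.

    Raises ValueError if the primer is not complementary to the start
    of the template, because then there is no annealed 3' hydroxyl
    for the polymerase to build on.
    """
    expected = "".join(PAIR[b] for b in template_3to5[:len(primer_5to3)])
    if primer_5to3 != expected:
        raise ValueError("primer does not anneal to the template")
    new_strand = primer_5to3
    for base in template_3to5[len(primer_5to3):]:
        new_strand += PAIR[base]   # each incoming dNTP is added to the 3' end
    return new_strand              # written 5'->3'


print(extend_primer("TACGGCAT", "ATG"))
```

The strand only ever grows on the right-hand (3') side of the string, which mirrors the rule that synthesis proceeds 5' to 3' from the primer.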

The snRNA is complementary to splice sites

1. U1 binds to the 5' splice site; U2 binds to the branch point
The snRNA component of each snRNP is complementary to the splice sites. U1 snRNP binds the 5' splice site, and U2 snRNP binds the region around the branch point adenine. It's the U2 snRNP that bulges that adenine out, bending the RNA so that its 2' hydroxyl sticks out and is available to act as a nucleophile. The snRNPs then interact with each other to catalyze the nucleophilic transesterification reactions.

Combined DNA Index System (CODIS)

13 CODIS Core STR Loci with Chromosomal Positions
All 13 STR regions are used in comparisons. The odds that two unrelated people match at all 13 STR regions are lower than 1 in a billion. As of October 2015, CODIS contained 12.5 million arrestee and criminal DNA profiles that have assisted in more than 285,000 investigations.
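Why 13 loci give such small odds can be sketched with the product rule. The per-locus frequency used below (0.1) is a placeholder chosen only for illustration, not real population data, and the calculation assumes the loci are independent; `random_match_probability` is a hypothetical helper name.

```python
import math


def random_match_probability(genotype_frequencies):
    """Multiply per-locus genotype frequencies (assumes independent loci)."""
    return math.prod(genotype_frequencies)


# Placeholder value: pretend each of the 13 genotypes is shared by 10% of people.
hypothetical_frequencies = [0.1] * 13
p = random_match_probability(hypothetical_frequencies)
print(f"match probability: {p:.1e}")
print(f"about 1 in {1 / p:.1e}")
```

Even with a fairly common genotype at every locus, multiplying 13 independent frequencies pushes the match probability far below 1 in a billion.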

The leading and lagging strand

On the lagging strand, 2 of the 3 core enzymes work together. One core stays on the DNA synthesizing the continuous (leading) piece; the other 2 take turns synthesizing off of the RNA primers as they become available. Imagine the polymerase that just finished an Okazaki fragment jumping back to the next primer and starting again while the other one is still synthesizing. (Watch the video to see the polymerases alternating which Okazaki fragment they are synthesizing.)

Sigma factor unwinds the DNA at the promoter region to initiate transcription

2. Initiation continues. Sigma opens the DNA helix; transcription begins. NO energy (e.g., in the form of ATP) is needed for unwinding.
The sigma factor helps unwind the DNA locally, where the polymerase is sitting. Different RNA nucleotides are essentially flowing through the polymerase, and only the ones that base pair correctly with the DNA template get incorporated.

The nucleosome — 11nm fiber

4 core histones: H2A, H2B, H3, H4
A nucleosome consists of 2 of each of the 4 core histones (the histone octamer) and about 147 bp of DNA, which is wrapped around the protein core nearly twice. There are 2 sides that are sandwiched together, built from dimers of two histones each, and that makes 8.

Method 2: forcing the expression of stem cell specific genes through transcription factors

4 transcription factors
There are soluble factors you can treat cells with in culture that give them the signals to express different genes. Some of those genes may belong to a different lineage, and some factors help reverse and break up heterochromatin. The main stem cell factors are known as the Yamanaka factors. These factors are useful for maintaining "stemness". By testing different combinations of factors, researchers found that this is the minimal requirement (the smallest number of factors) you can use to bring a cell back to a more stem-cell-like state. Now say you wanted to differentiate a population of stem cells into other types of cells. You can isolate bone marrow stem cells, keep them in an incubator, and give them specific growth factors that trigger an expression profile consistent with a macrophage. By treating stem cells with different factors, you can guide the cells toward a different cell lineage. You can do the same thing to induce B cell formation from bone marrow; you just alter the types of factors that you add.

RNA is being processed as it is being transcribed

5' cap
3' polyA addition
Splicing
Several types of modifications are made to the RNA as it is being made. The first modification it receives is the 5' cap. Capping adds a backwards guanine nucleotide (from GTP) onto the 5' end of the RNA, oriented so that 5' exonucleases no longer recognize that end as a substrate. That protects the RNA on that end. You also add a string of adenines onto the 3' end of the mRNA, and that has a protective function as well: a 3' exonuclease would have to chew through a number of adenines before it started disrupting the coding region. Both the cap and the polyA tail also function in translation initiation, so they protect the RNA and they help regulate levels of translation. Introns are non-coding sequences that we have to splice out of our mRNA. They are part of the transcription unit that gets transcribed, but introns are never translated. RNA splicing cuts them out, so the end effect is that all of the introns are removed and just the exons are left behind. The exons are the part that actually gets made into protein. We don't fully understand the function of introns, but part of it is thought to be a buffer zone for mutations. If you intersperse exons with sequences that are not important for translation (introns), then a mutation that lands in an intron won't cause loss of gene function. Across a whole transcript, the exons are a lot shorter than the intronic sequences, so many mutations can be tolerated in introns.
Once the full RNA has been synthesized and the RNA molecule has been cleaved, a string of adenines called a polyA tail is added onto its 3' end. Both the cap and the tail are required to give the RNA enough stability that it is not degraded from either end, and they also serve a purpose in translation. Within the RNA molecule we have exons and introns; in the mature mRNA we only have the exon sequences. So in addition to capping and tailing the mRNA, we also have to splice out the intron sequences. The introns are much larger than the exons: short pieces that are going to be exons are flanked by longer sequences, which are introns. We have to remove those because they are not supposed to become part of the protein. So we have a pre-mRNA that has exons and introns, and our mature mRNA has only exon sequences, so that it can be translated into the functional protein that the gene codes for.

mRNA transcripts get a 5'-7-methylguanine cap as they are being transcribed

5' cap is important for:
Protection from degradation
Transport out of the nucleus
Recognition by the translation machinery
The capping enzymes rest on the CTD tail, where they are recruited by serine 5 phosphorylation.

The splicing mechanism: 3 critical sites for splicing

5' splice site: GU or AU
Branch site: A
3' splice site: AG or AC
There are 3 critical sites in any given intron. An intron is flanked by a 5' exon and a 3' exon; every intron has an exon on either side of it. The very beginning of the intron is the 5' splice site, and the end of the intron is the 3' splice site. These tend to be very conserved residues. There is a little variability in the surrounding sequence, so just know that there are Gs at either end: the 5' and 3' splice sites have guanines at the boundaries, on both the intron and the exon side. Somewhere in the middle of the intron we have the branch point A. This is a special adenine: a splicing factor bends the intron so that the adenine sticks out. Each ribose in RNA has 2 hydroxyl groups (2' and 3'). The 3' hydroxyl is used in the phosphodiester backbone of the RNA molecule, and the 2' hydroxyl is free to interact with its environment. That additional hydroxyl group is why RNA can self-catalyze and take on so many different functions. The branch point A is very important in initiating the first step of splicing.
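The three sites can be turned into a small check-and-splice sketch. It is deliberately simplified: it only handles the common GU...AG intron class (not the AU...AC class listed on the card), it treats any internal A as a candidate branch point, and the intron coordinates are supplied by the caller. The sequence and the function name `splice` are hypothetical.

```python
def splice(pre_mrna: str, introns):
    """Remove introns given as (start, end) half-open index pairs.

    Each intron must begin with GU (5' splice site), end with AG
    (3' splice site), and contain at least one internal A that could
    serve as the branch point.
    """
    mature = []
    position = 0
    for start, end in sorted(introns):
        intron = pre_mrna[start:end]
        if not (intron.startswith("GU") and intron.endswith("AG")):
            raise ValueError(f"intron {start}-{end} lacks GU...AG boundaries")
        if "A" not in intron[2:-2]:
            raise ValueError(f"intron {start}-{end} has no candidate branch point A")
        mature.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end
    mature.append(pre_mrna[position:])           # keep the final exon
    return "".join(mature)


pre = "AUGGCG" + "GUAAGUCUAACUUUCAG" + "GCUUAA"   # exon + intron + exon
print(splice(pre, [(6, 23)]))
```

The output contains only the two exons joined together, which is the "mature mRNA has only exon sequences" statement in code form.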

Rho-dependent termination

A DNA sequence causes the polymerase to stall
Rho catches up when the polymerase stalls
Helicase activity of Rho causes dissociation
Rho-dependent termination involves the Rho helicase. All helicases can unwind nucleic acids, and they use ATP energy to do this unwinding. As the mRNA is being transcribed, it contains a sequence called a rut site. The rut site recruits the Rho helicase, which loads itself onto the mRNA there and then, with a spinning action, follows the RNA polymerase. The RNA polymerase transcribes new RNA faster than the Rho helicase can move along that RNA. It's as if the helicase is chasing the RNA polymerase during transcription, but it doesn't catch up unless something causes the polymerase to stall. For Rho-dependent termination, a couple of sequences are important, just like in Rho-independent termination. One sequence allows the Rho helicase to identify the mRNA, load itself onto it, and start chasing the RNA polymerase. The other sequence is on the DNA and causes the polymerase to slow down a little. Some sequences are harder for polymerases, whether DNA or RNA polymerases, to replicate or transcribe, so the RNA polymerase pauses at the end of the mRNA transcript, about at the end of the region we want to transcribe. When the polymerase pauses at that point, the Rho helicase immediately catches up to the end of the RNA that's coming out of the RNA exit channel and rips it out of the polymerase. That's how we terminate transcription with an enzyme-dependent mechanism. Most genes are regulated by one of the two methods, not necessarily both. If asked about mutations in the rut site or in the stem-loop formation, those mutations make it so that you can't terminate transcription at the right place.
The whole idea is that you want to make a piece of mRNA that has the coding region for a single protein. If we were to continue and make a longer mRNA, it's possible we could make some sort of mutant protein with additional sequence that should not have been there. It's really important to regulate how much RNA you transcribe. One of those two termination mechanisms will happen; both involve sequence, but only one involves the activity of an enzyme.

Chromatin at a transcription start site

A transcription start site (the end of the promoter region) is where we go from regulatory sequence to the actual gene that needs to be transcribed. We need to unwind the DNA to have all of that region exposed. When we're activating gene expression, a HAT comes in and loosens the interaction between the DNA and the nucleosome, and we can have the opposite effect with an HDAC. The deacetylases recruit different proteins; in this case you have a histone deacetylase associated on the DNA with a corepressor. In heterochromatin, the DNA is very tightly wrapped around the nucleosomes. When we acetylate the lysines, we loosen up that interaction, the region becomes exposed, and transcription can occur once the DNA is repositioned off of the nucleosomes.

Experiment: 2 year old mice received gene therapy with a virus that leads to increased telomerase expression

AAV9 virus expressing mouse telomerase (mTERT)
AAV9 virus expressing eGFP (no effect on cells; control)
Why do the researchers use an eGFP virus control and not just no virus? They were looking at the effect of inducing telomerase expression in all tissues of a mouse. We are using the part of the virus that allows it to infect a cell, but instead of delivering the viral genome it delivers a gene. This is called making a transgenic animal (referred to as gene therapy), where we use viral integration of a gene. We can't do this type of therapy in humans yet, because it's hard to control where the genes from the virus integrate. We've seen in preclinical trials that, at a certain frequency, integration lands in a region of the genome where it either disrupts a very important gene or interferes with the regulation of other genes. Only a very small amount of our genome is protein coding, so there is a very good chance of integration into repetitive sequence. But because there is a possibility of integrating into a region that could cause a problem, we are not at the stage where we are doing this in humans. Here we have a virus that induces expression of the enzyme telomerase in the mouse. AAV is a virus we commonly use. We need controls, and GFP is a useful one: you inject one group of mice with something benign, and you can monitor the infection by seeing how green the cells are. The other mice you inject with the telomerase gene. Did we achieve elevated levels of telomerase in the mouse's tissue? It worked well.

DNA pol 3 is the replicative polymerase in E.coli

All of the examples and figures for DNA replication are based on the prokaryotic replication fork because it is simpler; eukaryotes have many more factors involved. We say "replicative polymerase" because there are lots of different DNA polymerases in our cells and in prokaryotic cells. They all polymerize nucleotides into a growing DNA strand. Some have the big task of replicating all of the DNA; others synthesize smaller pieces of DNA for other purposes. In E. coli, DNA pol 3 does the majority of the work. DNA pol 1 has a very important but smaller role in DNA replication (it is not the official replicative polymerase). DNA pol 3 has a complicated structure. The yellow pieces on the slide are its enzymatic cores. The entire complex has 3 different polymerase cores, each of which is able to synthesize DNA. They are held onto the DNA by a factor called the beta sliding clamp, and the whole apparatus they are attached to is the clamp loader. DNA polymerases have a tendency to slip off the DNA, so the beta sliding clamp is there to keep them in place. Synthesis occurs very rapidly, in two different directions. It's bad if you can't keep polymerases on the DNA.

Only one strand of DNA is transcribed at a time

RNA is always synthesized 5' to 3'
The non-template strand is also called the coding strand
Only one strand is used as a template at a time; however, either DNA strand can be used as a template. Similar to DNA replication, we have to open up the DNA duplex and create a bubble. In replication it is a replication bubble; here it is called a transcription bubble. On the slide, the red strand is the template. The first RNA nucleotide is placed at the 3' side of the template region, and that nucleotide becomes the 5' end of our RNA. Transcription then continues through the end of the coding region of the gene. The transcription machinery is able to know what part of the DNA is intended to make 1 mRNA, and it only transcribes that region. The red strand is used as the template to make RNA, and the opposite strand is called the coding strand. So we have a template strand vs a coding strand. The way we can differentiate them is that the template strand is complementary to the RNA molecule; they can base pair with one another. That's how we synthesize the RNA: temporary base pairs are made between incoming RNA nucleotides and the DNA they're being transcribed from, and that's how the RNA polymerase ensures that it's incorporating the correct nucleotides. If there's an A on the DNA, the polymerase incorporates a uracil. The coding strand, on the other hand, has a sequence identical to the RNA molecule, except that thymines in DNA are uracils in RNA. We can line them up: from our template strand we make an AUG, which is complementary to the template it was transcribed from, and it is the same as the ATG in the coding DNA. By looking at a DNA sequence and an RNA sequence, you should be able to tell which is the template strand and which is the coding strand.
Use the template to work out the complementary bases; the result must match the coding strand, except that it has uracil instead of thymine. Our direction of RNA synthesis is 5' to 3'. The polymerase moves in one direction, and we show the RNA molecule exiting from the polymerase. There is a region where base pairing occurs between the DNA and the RNA; that is where the polymerase is currently riding along the DNA. It matches the nucleotides, but as it moves forward, those hydrogen bonds between the RNA and DNA nucleotides break. The 5' end of the RNA strand exits through the RNA exit channel in our RNA polymerase.
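The template/coding relationship is easy to check in code. In this sketch the template is written 3' -> 5' so it lines up with the RNA written 5' -> 3'; the example sequence and the function names are made up for illustration.

```python
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_TO_DNA = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """RNA (5'->3') complementary to a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3to5)


def coding_strand(template_3to5: str) -> str:
    """Coding strand (5'->3') complementary to the same template."""
    return "".join(DNA_TO_DNA[base] for base in template_3to5)


template = "TACGGT"            # hypothetical template, 3'->5'
rna = transcribe(template)
coding = coding_strand(template)
print(rna, coding)
print(rna == coding.replace("T", "U"))   # coding strand matches RNA except T -> U
```

Given a DNA strand and an RNA, the strand that is complementary to the RNA is the template, and the strand that matches it after swapping T for U is the coding strand.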

What can we conclude from this data?

Are these mice going to develop cancer in the future? This is the main concern with telomerase therapies: if you induce telomerase expression in other cells, is that going to make them cancerous? This is an important step toward finding out whether we can influence telomerase expression in certain diseases. The waterfall plot is commonly seen in longevity studies, which ask whether lifespan can be extended. We saw a 20% increase in the lifespan of the mice that were injected with the telomerase gene. Could we extend lifespan? Would these mice be more prone to cancer? There was no significant difference in the level of cancer.

Telomerase is active primarily in early development

As an individual ages, cell division takes place in the absence of telomerase, resulting in a decrease in the length of telomeres.
Once telomeres are below a certain length, DNA is degraded and the cell dies.
There is a correlation between age and telomere length.
Telomerase activity is elevated in cancerous cells.
Development is when our cells are dividing a lot. As they go from being a stem cell to being a fully differentiated cell, they need protection during the period when they're replicating a lot. That's why we see exceptionally high telomerase expression in developing tissues. We would expect to see sustained levels of telomerase activity in cells that have to replicate over and over again.
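A toy model of telomere shortening makes the role of telomerase concrete. All numbers here (starting length, loss per division, critical length) are placeholders for illustration, not measured values, and `divisions_until_critical` is a hypothetical helper.

```python
def divisions_until_critical(start_bp, loss_bp, critical_bp, telomerase_gain_bp=0):
    """Count divisions until the telomere falls below the critical length.

    Returns None when telomerase adds back at least as much as is lost
    per division, because the telomere then never reaches the limit.
    """
    net_loss = loss_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None
    divisions = 0
    length = start_bp
    while length >= critical_bp:
        length -= net_loss
        divisions += 1
    return divisions


print(divisions_until_critical(10_000, 100, 5_000))                          # no telomerase
print(divisions_until_critical(10_000, 100, 5_000, telomerase_gain_bp=50))   # partial compensation
print(divisions_until_critical(10_000, 100, 5_000, telomerase_gain_bp=100))  # full compensation
```

With no telomerase the cell reaches the limit soonest; partial compensation delays it; full compensation removes the limit, which is the situation in cells with sustained telomerase activity.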

HD - Expanded repeat

Autosomal dominant neurodegenerative disorder
Gene found on chromosome 4
Trinucleotide expansion: error in replication caused by polymerase slippage
The slide shows a Southern blot of the repetitive sequences that we can identify in Huntington's disease patients. A normal person has a low number of repeats; in a patient with Huntington's disease, that number is expanded. The huntingtin gene contains these repetitive sequences, CAG repeats. Most people have 10-30 CAG repeats, which is consistent with normal function of the huntingtin protein. Once you break the threshold of 40 repeats, it starts to disrupt the function of the protein. The protein normally has a certain number of repeats, but if polymerase errors and slippage keep adding repeats onto the DNA, then once it is translated the protein is no longer functional, hence the onset of the disease. It is worth finding out whether you carry the expanded allele. Typical onset is at 40 years of age.
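Counting the repeats is a simple pattern-matching exercise. The sketch below finds the longest uninterrupted run of CAG in a sequence and compares it with the 40-repeat threshold from the card; the sequences, the function names, and the choice to treat exactly 40 as expanded are assumptions for illustration.

```python
import re


def longest_cag_run(sequence: str) -> int:
    """Number of repeats in the longest uninterrupted CAG tract."""
    runs = re.finditer(r"(?:CAG)+", sequence.upper())
    return max((len(m.group(0)) // 3 for m in runs), default=0)


def classify(repeats: int, threshold: int = 40) -> str:
    """Label a repeat count relative to the disease threshold."""
    return "expanded" if repeats >= threshold else "below threshold"


typical = "ATG" + "CAG" * 20 + "CCG"    # hypothetical allele with 20 repeats
expanded = "ATG" + "CAG" * 42 + "CCG"   # hypothetical allele with 42 repeats
for allele in (typical, expanded):
    n = longest_cag_run(allele)
    print(n, classify(n))
```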

The PK Promoter Region

Bases 5' of the start site (+1) are counted with negative numbers -> -10, -35 regions (there is no position 0)
+1 marks the start of the transcript
Conserved sequences (i.e., these are consistent between organisms)
The +1 site is the first nucleotide synthesized into RNA. Everything downstream of the +1 site is where we transcribe DNA nucleotides into RNA nucleotides. Upstream of it, toward the 5' end of the promoter region, we have specific sequences that guide sigma factor and the RNA polymerase to this particular place. This is a very typical prokaryotic promoter region, and it has several different aspects to it. You have two regions, the -35 sequence and the -10 sequence. -10, -35, and +1 are all designations on the DNA. There is no zero point: counting upstream, we go directly from +1 to -1, the last nucleotide of the promoter region. The -10 and -35 sites are designated by how many nucleotides upstream they are from the +1 site. The -35 and -10 regions have conserved consensus sequences, and there's a difference between those two terms. Something that's conserved is a region that we see retained throughout evolutionary history. The -35 sequence, for example, may be retained in its entirety in different PK organisms. It's conserved in much the same way that amino acids that are useful for a particular active site or domain are conserved. Consensus sequence is used in a different way: it is the most perfect sequence for sigma factor to recognize. So you have a sequence that is conserved throughout history, and it serves as a consensus sequence for binding of a factor. The closer a promoter is to the perfect consensus sequence, the better we can initiate transcription.
There are certain nucleotides in the -10 and -35 regions that, if altered even slightly, dramatically decrease the chances of sigma factor binding and guiding the RNA polymerase to that site. A promoter with a perfect consensus sequence is considered a strong promoter, meaning it has the maximum ability to recruit sigma factor and the RNA polymerase. If we start changing the nucleotides in these sequences, we get a less perfect match, and the further we mutate it, the weaker that promoter is at recruiting RNA polymerase and sigma factor. Strong and weak promoters are terms used to describe the frequency with which transcription initiation occurs there. This is one way that PK can regulate the expression of their genes. Imagine a gene whose mRNA is necessary at all times, say for an enzyme that the bacterium is always using. You would expect the promoter controlling transcription of that essential gene to be a strong promoter, so that transcription initiates with relative ease. A gene that is transcribed less often, or that we really want to limit and only turn on under certain circumstances, is more likely to have a weaker promoter. Very weak promoters can still be used, but this involves additional factors that help recruit sigma factor to them. That helps control gene expression levels so that you're not wasting energy transcribing genes you don't need. Energy conservation.
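Two ideas from this card lend themselves to a short sketch: the numbering with no position 0, and promoter strength as closeness to the consensus. The consensus hexamers used below (TTGACA at -35, TATAAT at -10) are the standard E. coli sigma-70 textbook values and are not given in these notes; counting matches is only a crude stand-in for real binding strength, and the function names are hypothetical.

```python
CONSENSUS_35 = "TTGACA"   # assumed: E. coli sigma-70 -35 consensus
CONSENSUS_10 = "TATAAT"   # assumed: E. coli sigma-70 -10 consensus


def position_label(index: int, tss_index: int) -> int:
    """Convert a 0-based string index to promoter numbering (no position 0)."""
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset


def matches(observed: str, consensus: str) -> int:
    """Count positions where the observed hexamer matches the consensus."""
    return sum(a == b for a, b in zip(observed, consensus))


def promoter_score(minus35: str, minus10: str) -> int:
    """Crude strength score: 12 means a perfect match at both elements."""
    return matches(minus35, CONSENSUS_35) + matches(minus10, CONSENSUS_10)


print(position_label(50, 50), position_label(49, 50), position_label(40, 50))
print(promoter_score("TTGACA", "TATAAT"))   # perfect consensus: strong promoter
print(promoter_score("TTGACA", "TATGTT"))   # two mismatches in -10: weaker promoter
```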

Chromatin treated with limiting micrococcal nuclease reveals a repeating unit: the nucleosome.

Because a limited amount is used, you get fragments of various sizes (if each arrow is a cut point, not every arrow gets cut every time). When we study the DNA that is wrapped around nucleosomes, we can identify how much DNA gets wrapped around a single nucleosome. Think of the beads-on-a-string model of DNA loosely wrapped around nucleosomes: there is a stretch of DNA in between the nucleosomes. An enzyme called micrococcal nuclease (MNase) cleaves DNA, but only the DNA that isn't wrapped around a nucleosome. The DNA is protected when it's wrapped around the proteins; the naked DNA in between can be cleaved. When DNA wrapping was first studied, this experiment was conducted to see how much DNA was wrapped around each nucleosome. Nucleosomes are made up of histones: 8 histones make up a nucleosome, and DNA wraps around it. In a limited digestion, we add enough enzyme to cleave the DNA but not enough for all of the unwrapped DNA to get cleaved. That gives us a footprint of the length of DNA wrapped around a nucleosome. By chance, some molecules in the sample get cleaved a lot, and you get units containing one nucleosome, 2, 3, and so on. The digestion breaks the DNA down into the units in which it wraps around the nucleosomes. By chance, one longer piece was never cut by the enzyme, so we get a fragment with 8 nucleosomes; by chance, another had 1 nucleosome cleaved off, and so on.
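The ladder of fragment sizes can be reproduced with a small simulation in which each linker between neighboring nucleosomes is cut with some probability. The cut probability, the array length of 8, and the function name are illustrative assumptions; fragment sizes are reported in nucleosome units rather than base pairs.

```python
import random


def limited_digest(n_nucleosomes: int, cut_probability: float, rng: random.Random):
    """Cut each linker independently; return fragment sizes in nucleosomes."""
    fragments = []
    current = 1
    for _ in range(n_nucleosomes - 1):        # one linker between each adjacent pair
        if rng.random() < cut_probability:
            fragments.append(current)         # linker cut: close off this fragment
            current = 1
        else:
            current += 1                      # linker survived: fragment grows
    fragments.append(current)
    return fragments


rng = random.Random(0)
for _ in range(3):
    print(limited_digest(8, 0.4, rng))        # different molecules, different cuts
print(limited_digest(8, 1.0, rng))            # complete digestion: all mononucleosomes
print(limited_digest(8, 0.0, rng))            # no digestion: one 8-nucleosome piece
```

Every fragment is a whole number of nucleosomes, so on a gel the sizes fall on a regular ladder; the spacing of that ladder is what reveals the repeating unit.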

Summary: Experiment by Meselson & Stahl

Before N14 transfer (generation 0): only HH DNA
~1 generation (0.7-1.5): only HL DNA
~2 generations (1.9-3.0): both HL and LL DNA
~4 generations: mostly LL DNA, with a small amount of HL DNA remaining
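The band pattern follows directly from semiconservative replication, and the expected fractions can be computed. This sketch assumes ideal, synchronized whole generations after the switch to N14; the function name is hypothetical.

```python
from fractions import Fraction


def density_fractions(generations: int):
    """Expected fractions of HH, HL and LL duplexes after the switch to N14."""
    if generations == 0:
        return {"HH": Fraction(1), "HL": Fraction(0), "LL": Fraction(0)}
    # The two original heavy strands persist, one per hybrid duplex,
    # while the total number of duplexes doubles each generation.
    hybrid = Fraction(2, 2 ** generations)
    return {"HH": Fraction(0), "HL": hybrid, "LL": 1 - hybrid}


for g in (0, 1, 2, 4):
    print(g, {band: str(value) for band, value in density_fractions(g).items()})
```

After one generation everything is hybrid, after two it is half hybrid and half light, and the hybrid fraction keeps halving but never reaches exactly zero.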

Similarities between prokaryotic and eukaryotic transcription

Both synthesize RNA from 5' to 3' (while reading the template strand in the 3' to 5' direction)
Transcription initiates at a promoter region in both PK and EK
Both utilize RNA polymerases that share similar structures
Regulation of transcription initiation is the most common mechanism for control of gene expression. Initiation is the rate-limiting step that requires the most time; elongation is the fastest step; and there are different ways of terminating transcription in PK and EK. There are a lot more transcription factors in EK than in PK. Both involve the initial setup for initiation of transcription, and EK have a number of other regulators. Both involve transcription activator and repressor proteins that bind to specific DNA sequences and influence the transcription rate of individual genes.

Modification of tails provides a binding site for specific proteins

Bromodomain is a motif that binds to acetylated lysines
Chromodomain is a motif that binds to methylated lysines
The sites that we create by modifying our histone tails are binding sites for other proteins. Acetylated lysines are recognized by proteins that contain a bromodomain. Many different proteins can have a bromodomain within them, and it's the bromodomain that seeks out and binds to acetylated lysines. Bromodomain-containing proteins favor decondensation of our DNA. Changing the charge of the histone tail weakens the interaction between the histone tails and the DNA, but that only loosens things up; it is not enough to physically unwind the DNA from the nucleosome. We have to recruit other factors to do that, and bromodomain-containing proteins help physically remove the DNA from the nucleosome so that it is open for gene expression. Methylated lysines, or other methylated residues on histone tails, are binding sites for proteins that contain chromodomains. Chromodomain-containing proteins favor keeping the DNA condensed. Methylation is usually associated with turning gene expression off, whereas acetylation is associated with turning gene expression on. By methylating our histones, we recruit chromodomain-containing proteins, which are associated with higher levels of DNA condensation and favor heterochromatin formation, whereas bromodomain-containing proteins favor euchromatin formation.

The mechanism of DNA replication

Catalyzed by DNA polymerases (different in prokaryotes and eukaryotes). None of the DNA polymerases (from any organism ever found) can initiate DNA chains de novo (out of nowhere) in either direction; they require a primer, that is, a previously incorporated nucleotide with a free 3' hydroxyl group at its end (why is this?). Direction of synthesis is 5' to 3'. This is why we need an RNA polymerase to get DNA synthesis started. We know a primer is a sequence that is complementary to a sequence of DNA; we have seen this with probes for Northern and Southern blots and with primers for PCR. In our actual cells, when we replicate our DNA, we synthesize RNA-based primers. They are made by primase, which is an RNA polymerase. For example, if the template base at the end of a chromosome were C, primase could add a G and go on to synthesize a short segment of nucleotides complementary to the template, which is the part of the DNA we actually want to replicate. We eventually have to remove the primers, because RNA has no business being part of DNA except for that brief moment in time.

Intercalating agents

Cause insertions or deletions. We use EtBr (ethidium bromide) because it wedges itself (intercalates) between the stacked base pairs of the DNA, and if we excite it with a certain wavelength of light it fluoresces, so it lets us visualize our nucleic acids (DNA or RNA) when we run gels. It will also intercalate into OUR DNA.

Base analogs

Cause point mutations. 5-bromouracil is a thymine analog (CH3 replaced by Br). 5-bromouracil can pair with guanine; the overall effect is an adenine-to-guanine change after replication. 5-bromouracil can switch between its keto and enol tautomeric forms. In its keto form it pairs with adenine, as thymine does; in its enol form it can base pair with guanine. If you imagine the polymerase synthesizing DNA, different nucleotides flow into the active site, and whichever one base pairs gets added and the polymerase moves ahead. 5-bromouracil can get incorporated in place of a T opposite an A, or in place of a C because it can also base pair with guanine. Say you have a normal AT base pair, and the T that should have been paired with the A is now a 5-bromouracil. When we go through DNA synthesis again and separate the strands, that 5-bromouracil may switch to its other form. If it is in its enol tautomer when the polymerase is looking for something to base pair with it, the polymerase will add a G. So we started with an AT base pair that we wanted to replicate normally, but because a base analog that can switch forms and pair with guanine was incorporated, we go from an AT to a GC base pair in one of the daughter molecules.
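The AT-to-GC change described above can be traced pair by pair. This is only an illustrative sketch: the function names, the "BU" label, and the assumption that the analog is in its enol form during exactly one round of copying are ours, not from the lecture.

```python
# Normal Watson-Crick partners.
NORMAL_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def partner(template_base: str, bu_is_enol: bool = False) -> str:
    """Base a polymerase would place opposite `template_base`.

    "BU" stands for 5-bromouracil: its keto form pairs like T (with A),
    its enol form pairs with G.
    """
    if template_base == "BU":
        return "G" if bu_is_enol else "A"
    return NORMAL_PAIR[template_base]


def trace_at_to_gc() -> list:
    """Follow one lineage from an A:T pair to a G:C pair."""
    pairs = []
    # Round 1 (assumed): BU is incorporated opposite A in place of T.
    pairs.append(("A", "BU"))
    # Round 2: the BU strand is the template while BU is in its enol form.
    g = partner("BU", bu_is_enol=True)
    pairs.append(("BU", g))
    # Round 3: the new G-containing strand is copied normally.
    pairs.append((g, partner(g)))
    return pairs


print(trace_at_to_gc())
```

If the analog stays in its keto form in round 2, `partner("BU")` returns "A" and the lineage keeps its original AT information, which is why the mutation depends on the tautomer switch.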

What can we conclude from these data?

Cells from donors of vein tissue. The blot is a Southern blot analysis of telomeric DNA. Darker staining correlates with a higher amount of DNA. Length of DNA fragments decreases from top to bottom.

Summary on accessory proteins

Clamp ensures processivity. Helicase unwinds dsDNA using ATP. Topoisomerase removes supercoils in front of the fork (it nicks the DNA with its endonuclease activity, relieves the tension, and reseals the DNA with its ligase activity). Primase synthesizes primers. Ligase seals nicks. Single-strand binding proteins (SSB) protect and organize ssDNA. DNA pol 3 does the bulk of synthesis. DNA pol 1 fills in the gaps left by RNA primers once they are removed.

Application of PCR

Cloning genes: taking the DNA sequence that encodes a protein and manipulating it in some way, such as adding a GFP or epitope tag, then ligating it into a synthetic piece of DNA called a plasmid. We don't want just 1 plasmid containing that coding sequence, we want millions of them. PCR can make many copies of the coding region of a piece of DNA that we want to use for cloning. We can also use PCR for diagnosis: detecting diseases, forensics, old pieces of DNA, etc. Other applications: diagnosis of inherited diseases; detection of viruses (HIV); studies of gene expression during development; forensics (DNA fingerprinting); evolution (amplification of DNA of extinct species).

Eukaryotic transcription occurs in the context of nucleosomes/chromatin

Condensed heterochromatin. Decondensed euchromatin. Decondensation leads to accessibility for transcription factors and RNA polymerases. For transcription in EK, we have to physically unwind our DNA from being wound around nucleosomes and from higher orders of chromatin structure, such as the scaffold proteins that hold very condensed chromatin. We have to decondense that in order to transcribe anything in that region. Heterochromatin is tightly wound (not expressed). Euchromatin is more loosely wound DNA and is more easily expressed. We're going to have different histone modifications to promote either condensed or decondensed chromatin. The main players are HATs, which mask the positive charge of histone tails and loosen the interaction between DNA and histone tails (phosphorylation of histone tails does the same by imparting a negative charge that helps them repel the DNA), and HDACs, which restore the natural positive charge of our histone tails and facilitate a tighter interaction between DNA and the histone tails. We also have methylation of our histone tails. That doesn't change the charge, but it creates a binding site for chromodomain-containing proteins, whereas acetylated lysines are the binding site for bromodomain-containing proteins. All of these are in play when we talk about how we initiate transcription in EK.

Sigma factor binding sites

Consensus sequence is the sequence of nucleotides found most frequently at each position. The Lac promoter is an example of a weak promoter: it differs significantly from the consensus sequence. The conserved sequences that we typically find in promoter regions contain a lot of AT base pairs. An E. coli cell can have different promoter regions specific for different genes, and all of them have some level of AT richness. A promoter with a perfect consensus sequence would be the best at initiating transcription; if a promoter differs from that consensus, which normally means a GC in place of an AT, that decreases the frequency of successfully initiating transcription there. A strong promoter matches the consensus very well. The Lac promoter doesn't have a very strong promoter region, but it has a lot of other elements that help boost transcription of the Lac genes, which give the cell the ability to break down lactose. It has some Gs and Cs that make it a weaker promoter, whereas the stronger promoter it is compared with has more ATs, so it more closely matches a perfect sequence for the sigma factor to bind.
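The definition of a consensus sequence and the idea of a promoter "matching" it can be made concrete with a small sketch. The aligned example boxes below are hypothetical, and the -35/-10 sequences used for the consensus and for the lac promoter are the commonly cited textbook values rather than anything given in these notes, so treat them as illustrative assumptions.

```python
from collections import Counter


def consensus(aligned: list) -> str:
    """Most frequent nucleotide at each position of equal-length sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal length")
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))


def matches(seq: str, ref: str) -> int:
    """Number of positions where `seq` agrees with `ref`."""
    if len(seq) != len(ref):
        raise ValueError("sequences must have equal length")
    return sum(a == b for a, b in zip(seq.upper(), ref.upper()))


# Hypothetical aligned -10 boxes from five promoters.
boxes = ["TATAAT", "TATGAT", "TACAAT", "TATAAT", "GATAAT"]
print(consensus(boxes))

# Assumed textbook values: consensus -35 TTGACA and -10 TATAAT,
# lac promoter -35 TTTACA and -10 TATGTT.
lac_score = matches("TTTACA", "TTGACA") + matches("TATGTT", "TATAAT")
print(lac_score, "of 12 positions match")
```

A lower match count stands in for a weaker promoter here; real promoter strength also depends on spacing and on the other elements mentioned above.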

Continuous Replication

Continuous replication is replication that naturally proceeds in the 5' to 3' direction as the fork opens. We don't have to synthesize the DNA in short fragments, as we do when synthesizing backwards on the other strand. For each ORI we have bidirectional synthesis: you open it up like a zipper and replication goes in both directions. Each ORI (replication bubble) has two replication forks. If you draw an invisible line down the middle of a replication bubble, you can see each Y shape as its own replication fork, and DNA is synthesized in each direction. Looking at the polarity of the DNA, we know we can only synthesize in a 5' to 3' direction, so on one strand of each fork we have to synthesize backwards, away from the fork. The leading strand is the strand that is synthesized continuously: you need just 1 RNA primer and you keep replicating in that direction as the replication fork proceeds (the solid blue line represents continuous replication in that direction). The DNA on the other template strand also has to be synthesized from 5' to 3', but the replication fork is moving away from it. There we need multiple RNA primers: we synthesize a short segment of DNA in that local region, then lay down another primer and synthesize the following region, and so on. The polymerases are all attached to each other: one synthesizes continuously in one direction, and two take turns on the part of the lagging strand being synthesized, one making one fragment while the other makes the next, switching back and forth (watch launchpad video). On the leading strand, as long as synthesis is going 5' to 3', there's nothing to stop replication. As long as we've unwound the DNA in front of it, that one primer is enough to synthesize all the DNA on that side continuously.
As long as we keep unwinding the DNA and the replication fork moves forward, we can continue synthesizing that strand continuously.

The lagging strand would need to be synthesized discontinuously for replication forks to proceed in one direction

Continuous synthesis on the leading strand. Discontinuous synthesis on the lagging strand. The individual pieces that we synthesize on the lagging strand are called Okazaki fragments.

How can one distinguish between the original strands of DNA and the newly synthesized strands?

CsCl equilibrium centrifugation: DNA and CsCl are mixed evenly; after a long centrifugation time, CsCl forms a concentration gradient; DNA accumulates at the point of equal density. When we use CsCl as our matrix, it is very good at distinguishing between DNA molecules that are heavier or lighter. CsCl is simply what we use in equilibrium centrifugation to separate DNA fragments by density; for proteins and other components we might use sucrose. These are just dense solutions. Over time in the centrifuge, the DNA migrates and then stabilizes where its density matches the density of the gradient. When you first mix the DNA and the CsCl, before centrifugation, everything is evenly mixed. After centrifuging for maybe 24 hours at very high speed, you have a gradient of CsCl that is denser at the bottom of the tube and less dense at the top. Our DNA settles somewhere in the middle, where its density matches that of the gradient. We can take a population of DNAs, some heavier than others, resolve them in a tube, and compare them. ***the experiment: The way we do that is to make DNAs of different density. Normally we have 14N; we can grow bacteria in heavy nitrogen, 15N (this is the Meselson-Stahl experiment). When you grow E. coli in the presence of something they use for biosynthetic reactions, it gets incorporated. DNA is very nitrogen rich, so if we grow bacteria in heavy nitrogen and allow them to undergo a few rounds of replication, all of their DNA is heavier than that of E. coli not exposed to it. You grow them until essentially all of their DNA contains heavy nitrogen, then suddenly switch them to regular nitrogen. With every subsequent round of replication you can test how much light nitrogen is being incorporated and how much of the heavy-nitrogen DNA is being retained.
Using heavy nitrogen, you can trace where the newly synthesized DNA ends up under the conservative, semiconservative, and dispersive models. You start with all heavy DNA and see what happens after different rounds of replication, where everything that's new is going to be light. We just want to differentiate between the parental strands and our newly synthesized DNA. You isolate the DNA from cells after one round of replication, two rounds, etc., centrifuge it, and look at the location of the DNA in the tube. In the visual, lighter DNA did not migrate as far down the CsCl gradient, and heavier, 15N-containing DNA sits toward the bottom. If we see something like this in our result, we have two distinct populations: one heavy and one light. If we had a hybrid molecule, one that is half heavy and half light, the band would be in the middle. If heavy and light DNA stayed perfectly separated, there would be no band in the middle.***
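To see why the band positions distinguish the three models, here is a minimal sketch that predicts the bands after each generation. It assumes an idealized experiment (every cell divides in step, the switch to 14N is instantaneous); the representation of a duplex as a pair of "fraction heavy" values and all function names are our own.

```python
from collections import Counter
from fractions import Fraction

HEAVY, LIGHT = Fraction(1), Fraction(0)  # fraction of 15N in a strand


def replicate(duplex, model):
    """Return the two daughter duplexes of `duplex` made in 14N medium."""
    a, b = duplex
    if model == "semiconservative":
        # Each old strand is paired with a brand new light strand.
        return [(a, LIGHT), (b, LIGHT)]
    if model == "conservative":
        # The parental duplex stays intact; the copy is entirely new.
        return [(a, b), (LIGHT, LIGHT)]
    if model == "dispersive":
        # Old material is spread evenly over all four daughter strands.
        share = (a + b) / 4
        return [(share, share), (share, share)]
    raise ValueError(f"unknown model: {model}")


def bands(model, generations):
    """Map band position (fraction heavy per duplex) to duplex count."""
    population = [(HEAVY, HEAVY)]
    for _ in range(generations):
        population = [d for dup in population for d in replicate(dup, model)]
    return Counter((a + b) / 2 for a, b in population)


for model in ("conservative", "semiconservative", "dispersive"):
    for gen in (1, 2):
        result = {str(k): v for k, v in sorted(bands(model, gen).items())}
        print(model, gen, result)
```

After one generation only the conservative model gives two bands (all heavy and all light); semiconservative and dispersive both give a single middle band and are only told apart in generation 2, where semiconservative shows a middle band plus a light band.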

Step 3. Analyze and compare amplified DNA portions

DNA can be visualized using gel electrophoresis. DNA is loaded at the top of the gel and an electric current is applied.

How does MMR know which was the original strand / base?

DNA is methylated AFTER replication: the old strand has methylation, the new one does not. The new (unmethylated) strand is cut and replaced. In prokaryotes specifically, the DNA is methylated after synthesis. It is common for adenines to be methylated in prokaryotes, whereas in eukaryotes methylation is mostly on cytosines. Bacteria do this so they can tell which is the old strand and which is the new one. With newly synthesized DNA, there is a moment in time before it becomes methylated, and during that window we can differentiate between the new strand and the old strand: the old strand has methyl groups on it but the new strand doesn't. Eventually the new strand will get methylated, but for a brief moment it isn't. This tells us which strand carries the incorrectly incorporated base (the lesion). The MMR enzymes identify the methylated strand, so when there is a mismatch they know which strand to cut. If we have a mismatch, we make a nick in the DNA strand that is not methylated, excise DNA from that region, fill it in, then ligate it. If MMR, scanning along newly synthesized DNA, is not fast enough to catch the lesion and the new strand gets methylated, the strands can no longer be told apart. Then it comes down to choosing one strand or the other at random, with a 50% chance of picking the wrong strand and keeping the wrong base. That is why it's important for MMR to work efficiently and rapidly to identify lesions in newly synthesized DNA before it becomes methylated.
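The strand-choice logic above can be sketched as a small decision function plus a simulation of the "too late" case. The function names, the "old"/"new" labels, and the use of a seeded random generator are illustrative assumptions, not a model of the actual enzymes.

```python
import random


def strand_to_nick(old_methylated: bool, new_methylated: bool,
                   rng: random.Random) -> str:
    """Pick the strand MMR cuts: 'new' (has the error) or 'old' (template)."""
    if old_methylated and not new_methylated:
        # Inside the window: the unmethylated strand must be the new one.
        return "new"
    # The strands look the same, so the choice is effectively a coin flip.
    return rng.choice(["new", "old"])


def wrong_repair_rate(trials: int, rng: random.Random) -> float:
    """Fraction of late repairs that cut the correct (old) strand by mistake."""
    wrong = sum(
        strand_to_nick(True, True, rng) == "old" for _ in range(trials)
    )
    return wrong / trials


rng = random.Random(0)
print(strand_to_nick(True, False, rng))   # always the new strand
print(wrong_repair_rate(10_000, rng))     # close to one half
```

The first call never errs, while the second settles near 0.5, matching the 50% chance described above once the new strand has been methylated.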

Okazaki fragments

DNA is synthesized as small fragments: -PK: 1000-2000 bases -EK: 100-400 bases. Longer labeling times give rise to larger fragments, but a peak still appears at the top of the gradient. You start out with shorter labeling times. The longer you label, the more you see much larger fragments being generated; you don't just keep creating more and more short fragments over time. At 120 seconds, you have some short fragments that were generated at that moment, but everything generated before then has been ligated together. You see a fraction of what was in the short pieces of DNA being converted into longer pieces of DNA over time, depending on the labeling period. The labeling period is just how much DNA replication you allow to go forward before stopping the experiment. Typically we draw a replication bubble with two forks, or you'll see a single replication fork. You have to look at the polarity (the 3' end and 5' end of each strand) to find where a primer can be laid down so that we can synthesize 5' to 3' continuously, moving in the same direction as the replication fork. On the lagging side, you have the short segments and the RNA primers needed to complete that region. Try to visualize how the replication fork moves in one direction: one strand can be synthesized as a continuous piece, and on the other strand, where the direction of DNA synthesis is not the same as the direction of the fork, we must synthesize backwards in pieces.

Transcription is DNA dependent RNA synthesis

DNA serves as the template (information storage). All RNA is made by transcription. mRNA is a short-lived copy of DNA: an information carrier, a copy of the instructions carried in DNA. The definition of transcription is DNA-dependent RNA synthesis. We take our primary DNA sequence and transcribe it into RNA. Messenger RNAs (mRNAs) are the RNAs that get translated into protein. Not all RNAs are translated: rRNA and tRNA are also transcribed but are not translated, so they fall into a different category of RNAs. mRNAs are governed by a particular set of properties. All mRNAs share properties that make sure they are eventually exported from the nucleus and translated into protein in the cytoplasm.
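As a minimal sketch of DNA-dependent RNA synthesis, the function below builds an RNA 5' to 3' from a template strand written 3' to 5'. The function name and the example template are made up for illustration.

```python
# RNA base added opposite each DNA template base.
RNA_PARTNER = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5' to 3') for a template written 3' to 5'.

    Reading the template left to right in the 3' to 5' direction means the
    RNA grows left to right in the 5' to 3' direction.
    """
    return "".join(RNA_PARTNER[base] for base in template_3_to_5.upper())


print(transcribe("TACGGT"))  # hypothetical template strand
```

The RNA has the same sequence as the non-template (coding) DNA strand, with U in place of T.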

PCR reaction - continued

DNA synthesis (step 3) is catalyzed by the thermostable DNA polymerase (still present). Keep repeating the cycle of denaturation, primer hybridization, and elongation. 25th cycle: amount of target DNA is 2^25 = 33,554,432 times the starting amount (!!!). When we go through subsequent rounds of replication, we denature the strands and cool the reaction down, but rather than the strands reannealing to each other, a primer anneals there. Then we go through another elongation phase. After around 25 cycles or so, we get an incredible amount of DNA, and the majority of it contains only the sequence we want. After a couple of rounds, we're only amplifying what's within the gene. The primer sequences are shown in yellow only to help you visualize that the primer is actually part of the DNA. We don't degrade it, and there is no difference between it and the original DNA sample. It's there to show you that, in the end, the absolute edge of the sequence is the primer. You will have a very small amount of the extra sequence in the reaction tube, but overwhelmingly you have the target gene. Say we were amplifying the target gene because we wanted to ligate it into a plasmid. If we incubate it with a linearized plasmid that has complementary ends, we can ligate the gene into that plasmid. We still follow up and check that it is our actual gene of interest that got ligated. In molecular biology you always have to convince yourself that you're working with what you think you're working with. A primer is essentially a probe with no label; unlike a probe, the primer is used to amplify.
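The doubling arithmetic can be checked with a few lines. This sketch assumes ideal amplification by default; the `efficiency` parameter (fraction of templates copied per cycle) is our own addition to show how a less-than-perfect reaction would fall short of 2^n.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies of target after `cycles` rounds of PCR.

    Each cycle multiplies the amount by (1 + efficiency); efficiency 1.0
    is perfect doubling.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (1, 2, 3, 10, 25):
    print(n, int(pcr_copies(1, n)))
```

With one starting molecule and perfect doubling, cycle 25 gives the 33,554,432 quoted above.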

Review- organization of eukaryotic chromatin

DNA: 2 nm
Nucleosome: 11 nm
30 nm filament: 30 nm (linker histones H1 help compact to the 30 nm fiber)
Extended form of chromosome: 300 nm (scaffold proteins)
Condensed section of chromosome: 700 nm (cohesins and other scaffold proteins)
Mitotic chromosome: 1400 nm

Spontaneous base damage by hydrolysis (most common source of mutations)

Deamination (cytosine to uracil). Depurination (loss of guanine). Deamination (5-methylcytosine to thymine). By far the most common spontaneous mutation in cells is the deamination of cytosine to uracil. The two bases are very similar in structure. If a cytosine already incorporated into a strand of DNA is spontaneously hydrolyzed, you now have a uracil in your DNA. That can interfere with DNA replication because the DNA pol will not necessarily know what to do with a uracil base. Most commonly an adenine is added to base pair with that uracil. Over rounds of replication, that converts a GC base pair into an AT base pair. We can also have depurination; the base most commonly lost is guanine. The nucleotide literally just loses its nitrogenous base: we still have the sugar and phosphate, but we're lacking the base. These are called abasic sites. We have mechanisms in place to fix those as well. The second most common deamination affects a slightly modified version of cytosine that's very frequent in the genome. Cytosine is often methylated, which is useful in bacteria and in eukaryotes for controlling gene expression. Deamination of 5-methylcytosine produces a thymine.

How did researchers discover the different polymerases?

Death cap mushroom and alpha-amanitin. Different concentrations had different effects: -Low concentration blocks Pol 2 -High concentration blocks Pol 3. If you treat cells with this compound at different concentrations, you inhibit different RNA polymerases. Pol 1, 2, and 3 have different sensitivities to it.

Xeroderma Pigmentosum

Deficient NER that cannot repair damage done by UV radiation. Exposure to sunlight can produce spots on the affected areas. Often results in fatal skin cancers. Every mole or freckle you have is a mutation. It's benign, but those cells are behaving a little differently from the other cells. A mole already has some mutation in it that allows it to not grow in a normal sheet. Moles are more likely to develop into skin cancer in people with fair skin, because they already have cells on their body that carry one mutation. Once patients with this condition have a minor lesion on their skin, it gets worse because they can't repair any thymine dimers.

What is a mutation?

Definition: any permanent change in DNA sequence. Causes: replication errors, spontaneous mutations, radiation, and mutagens. A mutagen is a chemical that increases the mutation rate. A replication error is when the DNA pol did not add the correct complementary base, for example an A on one strand was accidentally matched with a G. There are also a lot of spontaneous mutations: bases can spontaneously undergo hydrolysis and convert into other bases. Then we have external mutagens, such as radiation and the different chemical mutagens you might be exposed to.

Chromatin remodeling

Describes how we organize the DNA and its proteins (chromatin) and how we can shift the DNA around to open up regions for gene expression. In eukaryotes, we have higher levels of organization: DNA wrapping around nucleosomes (the beads-on-a-string model) and scaffolding proteins that tightly regulate how the DNA is wound. When we think of condensed chromosomes, we have this mixture of DNA and protein. What if we need to express genes in a region that's wrapped around a nucleosome? We can't do that directly. We need ways of maneuvering DNA around chromatin-associated proteins so that we can actually express the DNA and use it, rather than having it wrapped up and sequestered in the cell.

Three possible models for DNA replication (Meselson and Stahl, 1958)

Designed to test the different possibilities for how DNA is replicated. While Watson and Crick had their hypothesis, there were other possibilities, and they had to be tested. We now know that Watson and Crick were correct and that DNA replication is semiconservative. This means that after 1 round of replication, each daughter molecule contains 50% parental DNA (one parental strand) and 50% new DNA (one new strand). Another hypothesis was conservative replication, in which the DNA is copied de novo, somehow using the parental molecule as a template or not. After 1 round of replication, you still have your intact parental molecule and you've generated a second molecule that is completely newly synthesized. The parental strands aren't going anywhere; they stay together in their original molecule, and a brand new molecule is created out of thin air (not a likely hypothesis). Another possibility was dispersive replication: the parental DNA is distributed randomly between the daughter molecules, so that each daughter strand is a mixture of parental and newly synthesized segments rather than taking its parental DNA from just one strand. In this case, each daughter molecule is still 50% parental and 50% new after one round, the same as semiconservative. Three proposed models: Conservative: a double-stranded copy is generated and the parental DNA is conserved as a double strand. Semiconservative: each strand serves as a template and each new double strand contains one parental and one new strand. Dispersive: the parental strands are broken into segments and each new strand consists of parts of both parental and new DNA.

Eukaryotic transcription initiation

Differences from prokaryotes:
-3 polymerases
-Promoter often contains a TATA box at position -30
-TATA box is recognized by TBP (TATA binding protein)
-TBP recruits the RNA polymerase
-Many more transcription factors required for initiation
-More regulatory elements

DNA damage by chemotherapeutic agents

Double strand breaks (DSB). DNA intercalation and cross linking. Most chemotherapeutic agents target rapidly dividing cells. Most of our cells aren't rapidly dividing, but hair follicles and intestinal epithelium are. This is why major gastrointestinal issues and hair loss are the most common side effects of chemotherapeutic agents. While we may be more or less targeting specific cancer cells, we are also targeting our own cells that divide rapidly, and these types of chemotherapeutic agents are used with the understanding that they may actually cause secondary cancers. Many chemotherapeutic agents function by causing double strand breaks in the DNA. That's a very serious problem because now you have two ends: how will you connect them back together without losing genetic information? Double strand breaks are also caused by radiation, X-rays, and the like. Certain chemotherapeutic agents are based on DNA intercalating agents. Whether they intercalate or induce double strand breaks, both kinds work by targeting cells that are dividing rapidly.

Different phosphorylation positions have different functions in transcription and RNA processing

EK RNAs are modified during synthesis: capping and splicing. Phosphorylation of the CTD tail at serine 5 and serine 2 is very transient. During pre-initiation, when you're gathering all the different factors together and recruiting the RNA polymerase and mediator, the tail is not phosphorylated. Once we escape the promoter and move on to elongation, we see phosphorylation of the tail at the serine 5 residue. Further into elongation, we see a shift toward serine 2 phosphorylation: some serine 5 gets dephosphorylated while serine 2 is being phosphorylated, so there is a ratio of serine 5 to serine 2 phosphorylation that controls different layers of regulation in the elongation and termination phases. Serine 5 phosphorylation during promoter escape is the signal for a capping enzyme to bind the tail; the capping enzyme only binds the tail if serine 5 is phosphorylated. Its purpose is to put a protective cap on the new RNA coming out of the RNA exit channel, because the RNA can be degraded from its 5' end as soon as it exits that channel. Recruiting the capping enzyme to the tail right when elongation begins sets things up perfectly: by the time the RNA is exiting, the capping enzyme can hop off the tail and bind the mRNA. It's recruited to the tail at the time when its function is needed. You make the first phosphodiester bond, you go through the elongation phase, and the RNA starts coming out, but by then you've already phosphorylated the tail at the serine 5 positions and have a capping enzyme ready to go.
We have other factors that bind when serine 2 is phosphorylated. Some of those are components of the RNA processing that occurs at the end of transcription, such as cleaving the RNA at the end of the transcription unit. We initially phosphorylate serine 5 at the moment we escape the promoter. Promoter escape means you've gone from initiation to elongation and are actually transcribing the DNA into a full-length RNA; you won't have any abortive transcription from that point. You can have abortive transcription in EK as well, but once we cross the threshold of passing the promoter region, that's called promoter escape, and at that time we see phosphorylation of the serine 5s on the tail. The tail consists of a sequence of 7 residues repeated over and over again. Each serine 5 and each serine 2 within those repeats can be modified by kinases (which add a phosphate group) and by phosphatases (which cleave it off). The relative amounts of phosphorylation at serine 2 and serine 5 dictate which RNA processing elements are localized to the region where the RNA comes out of the polymerase. There are exonucleases hanging around waiting to find uncapped RNA to degrade, so as soon as the 5' end of our RNA molecule exits the RNA polymerase, we need capping enzymes poised and ready to go. The moment we go past the promoter region, serine 5 is phosphorylated and we see recruitment of capping enzymes. Those enzymes can then hop off the tail onto the RNA as soon as it exits and cap it. Once the RNA is capped, we have no reason to keep recruiting capping enzymes. That's when we start seeing the phosphorylation shift over to the serine 2 residue, which recruits some of the later factors that work on RNA.
These include the splicing machinery that splices out the introns, which acts while the RNA is still being transcribed; this is all very rapid. Serine 2 phosphorylation also helps recruit the factors that cleave the RNA at a certain point and the enzymes that add the polyA tail. Think of the CTD tail as a platform where we can recruit all the different enzymes to work on the RNA in a time-dependent and orderly fashion: first the capping enzymes, next the splicing machinery and our cleavage factors, then the polyadenylation factors, including the polymerase that adds all of the adenines onto the end of the RNA molecule. In summary, we go from no phosphorylation to serine 5 phosphorylation as we escape the promoter, and as we continue through elongation the overall phosphorylation shifts from serine 5 to serine 2. Each state recruits different RNA processing enzymes.
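As a study aid, the "platform" idea can be written down as a lookup from transcription phase to CTD state to recruited factors. The phase labels, state labels, and function name are our own shorthand for what the notes describe; the real tail carries a mixture of both marks rather than one clean state at a time.

```python
# CTD phosphorylation state -> factors recruited to the tail (per the notes).
CTD_RECRUITS = {
    "unphosphorylated": [],
    "Ser5-P": ["capping enzyme"],
    "Ser2-P": ["splicing machinery", "cleavage factors",
               "polyadenylation factors"],
}

# Simplified timeline: which state dominates in each phase.
PHASE_TO_STATE = {
    "pre-initiation": "unphosphorylated",
    "promoter escape": "Ser5-P",
    "later elongation": "Ser2-P",
}


def recruited(phase: str) -> list:
    """Factors expected on the CTD tail during `phase`."""
    return CTD_RECRUITS[PHASE_TO_STATE[phase]]


for phase in PHASE_TO_STATE:
    print(f"{phase}: {PHASE_TO_STATE[phase]} -> {recruited(phase)}")
```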

Elongation: RNA synthesis

For each nucleotide added, base pairing occurs and then the phosphodiester bond forms. Speed: ~50 nt/sec. We have an RNA exit channel, which is where the newly synthesized RNA exits the RNA polymerase. Elongation is the most rapid phase of transcription. A flood of nucleotides enters the RNA polymerase and is made available for base pairing, and they are added at roughly 50 nt/sec. Temporary RNA/DNA hybrids form as we add RNA nucleotides to the growing RNA strand: for a brief moment, the RNA polymerase base pairs an RNA nucleotide with the DNA nucleotide, using the DNA as the template to make sure it adds the correct RNA nucleotide. As the polymerase moves forward, those base pairs separate and the RNA strand exits through the RNA exit channel.
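The ~50 nt/sec figure makes it easy to estimate how long elongation takes. This is a back-of-the-envelope sketch that assumes a constant rate with no pausing; the 3000 nt transcript length is a made-up example.

```python
def elongation_seconds(length_nt: int, rate_nt_per_s: float = 50.0) -> float:
    """Time to elongate a transcript of `length_nt` at a constant rate."""
    if rate_nt_per_s <= 0:
        raise ValueError("rate must be positive")
    return length_nt / rate_nt_per_s


# Hypothetical 3000 nt transcript at the quoted ~50 nt/sec.
print(elongation_seconds(3000), "seconds")
```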

Summary: Steps in prokaryotic transcription initiation:

1. Formation of the "closed complex" 2. Unwinding of DNA to yield the "open complex" 3. Synthesis of 5-10 phosphodiester bonds (abortive transcription) 4. Release of sigma factor once the RNA begins to form: the growing RNA "outcompetes" sigma factor, which is displaced

EK transcription initiation overview

General transcription factors (GTFs) bind; pre-initiation complex formation (Pol 2 bound); unwinding; phosphorylation of Pol 2 CTD; promoter escape. GTFs are considered general because they are required at every promoter; they are necessary for transcription of all mRNAs in eukaryotes. This is the bare minimum. Different factors are present for different kinds of genes, but the general transcription factors are always present. A series of them comes through and assembles on the promoter region, starting with TATA binding protein. Eventually we get recruitment of RNA polymerase itself, then we go from a closed complex to an open complex with the help of a helicase, and then we can control when elongation occurs. Here that control does not come from the dissociation of a factor such as sigma factor; it depends on phosphorylation of the CTD tail of the RNA polymerase. The prokaryotic RNA polymerase has a C-terminal end that sticks out like a tail and can interact with the UP element sequence or with proteins bound there. In EK, the tail serves a regulatory function for RNA polymerase, controlling when it enters elongation and when it starts termination as well. The most important factors are TFIID and TFIIH: they are the first and last steps in setting up the transcription machinery, all the way through going from a closed to an open complex. The complex is closed up until the point that a helicase unwinds (melts) the promoter region. TFIID is important because it is a complex that contains TBP, so it localizes right on the TATA box.

Double-strand break repair

Generated by ionizing radiation (X-rays) or oxygen free radicals -non homologous end joining (NHEJ) -homologous recombination (HDR) NHEJ mostly during G1 phase NHEJ joins two unprotected DNA ends NHEJ can lead to short insertions or deletions HDR mostly during and after S phase HDR uses a second, homologous copy of the DNA (typically the sister chromatid) as a template for repair NHEJ is considered an error-prone mechanism of DNA repair, meaning that you repair the double strand break but are likely to introduce errors at the same time. Essentially you are patching the two ends together. When there is a break in the DNA, the ends get degraded, so the ends you patch together are not necessarily at the exact point where the break happened. Enzymes bring the two pieces back together and stick them to each other, which creates insertions and deletions at those points. It is very likely that you will introduce a frameshift mutation. We have a shocking number of double strand breaks happening in our DNA all the time and we use this method a lot, so we introduce permanent mutations into our DNA when we do this. The reason we don't see an overwhelming problem is that the majority of our DNA is not coding. Most of the time the break occurs in a place that does not contain genes, but it can happen in a region that does. This is one mechanism of how cancer arises. If you constantly bombard yourself with pollution and X-ray exposure, you have a higher incidence of cancer because you get these double strand breaks, and when you fix them you introduce mutations most of the time. The other type is homology directed repair (HDR). As the name suggests, we use a homologous sequence to fix the problem.
The overview is that you use a homologous copy of the DNA as a template, so that when you patch the double strand break together you do it with high fidelity and haven't missed, deleted, or added anything. This can only occur at points in the cell cycle during or after S phase, once you have synthesized and copied your DNA and a homologous copy (the sister chromatid) is available. A series of enzymes recognizes the break, binds the regions around it, and chews away more of the DNA to make large overhangs. Those overhangs then invade the homologous copy and find a region of homology on the sister chromatid; whatever is on the opposite strand there is exactly what is needed to fill in the missing region. Once the overhang strand is localized on its match, we know what to build to fix and fill in the region. In NHEJ (error prone) we are just patching ends together, not putting things exactly where they need to be. In HDR we make sure we have not skipped or added a base: we make the double strand break a little worse by chewing away to make an overhang, but the overhang is large enough to find a region of homology, and then we know exactly which nucleotides to put on the other side to completely repair the break without introducing any mutations. HDR is the preferred mechanism. The end result is a complete fix without any errors, although it is more involved. But you can't always use HDR, and then the cell is forced to do NHEJ. People in families that carry loss-of-function mutations in the BRCA2 gene can't use HDR effectively. They rely on NHEJ, so they introduce mutations at a much higher rate than you would normally see. Depending on what tissue this is occurring in, people with BRCA mutations get cancer at a much earlier onset.
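Why a small NHEJ indel so often causes a frameshift can be shown with a short sketch. The sequence, the deletion position, and the helper names here are made up for illustration; the only rule being used is that codons are read three bases at a time.

```python
def is_frameshift(inserted: int, deleted: int) -> bool:
    """An indel shifts the reading frame unless the net change is a multiple of 3."""
    return (inserted - deleted) % 3 != 0


def apply_deletion(seq: str, start: int, length: int) -> str:
    """Remove `length` bases beginning at 0-based index `start`."""
    return seq[:start] + seq[start + length:]


def codons(seq: str) -> list[str]:
    """Split a sequence into complete triplets, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical coding sequence: ATG GCC GAA ACC TGA
coding = "ATGGCCGAAACCTGA"
# Pretend NHEJ lost 2 bases while patching the ends back together.
repaired = apply_deletion(coding, 4, 2)

print(codons(coding))    # reading frame before the break
print(codons(repaired))  # every codon after the deletion is changed
```

A 3-base loss would remove one codon but leave the rest of the frame intact, which is why `is_frameshift(0, 3)` is False.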

mRNA is shorter than the DNA coding for it

Genes and RNA transcripts differ in length Intervening sequences (introns) are removed Exons are the sequences that remain in the mRNA We can hybridize the RNA transcript to the single stranded DNA template from which it was transcribed. Wherever there is an exon in the RNA, it can hybridize to the template strand of the DNA, because the two are complementary to each other. In these DNA/RNA hybrids, loops form where the introns are. If you follow where the RNA hybridizes, it hybridizes up to a point, then a loop of DNA forms, and then the hybridization continues. Each loop is the sequence of an intron that was spliced out. All the exonic sequences can pair with the DNA, but the RNA molecule lacks the intronic sequences, so you get a loop wherever an intron was removed.
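The length difference can be sketched by cutting introns out of a toy transcript. The sequence and the intron coordinates are invented for illustration; real introns are recognized by splice site sequences, not by hard-coded positions.

```python
def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as 0-based, half-open (start, end) intervals."""
    mature = []
    pos = 0
    for start, end in sorted(introns):
        mature.append(pre_mrna[pos:start])  # keep the exon before this intron
        pos = end                           # skip over the intron
    mature.append(pre_mrna[pos:])           # keep the final exon
    return "".join(mature)


# Hypothetical transcript: exon 1 + intron + exon 2
pre = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
mature = splice(pre, [(6, 18)])

print(len(pre), len(mature))  # the mature RNA is shorter
print(mature)
```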

DNA replication is not perfect

Geometry of mismatched bases Correct base pairs have a geometry that fits the double helix, but the polymerase does not always add the correct base. A mismatched pair has a different geometry and distorts the helix.

Post translational modifications (PTMs) help regulate chromatin compaction and transcriptional activity

Heterochromatin: repressed gene expression Euchromatin: active gene expression Just as proteins get phosphorylated to alter their function, modifying histone tails is a post translational modification of the histone protein. We will spend most of our time on acetylation and how it impacts gene expression. Heterochromatin is very tightly packed DNA. It is wrapped around nucleosomes, the nucleosomes are packed in together, and scaffolding proteins condense the chromatin further, so all of this DNA is tightly wrapped around different proteins. Those areas of the DNA are not accessible to be expressed. DNA tightly wrapped around proteins is inaccessible to the machinery that transcribes those regions into RNA and then protein. Euchromatin holds our actively expressed genes. This is DNA that is not very tightly wound around the nucleosomes. There are still nucleosomes, but there is space between them. Anywhere we see space where DNA is not wrapped around a nucleosome, we can transcribe genes in that region. Acetylation helps us go back and forth between condensing chromatin into heterochromatin and decondensing it into euchromatin. When a histone acetyl transferase (HAT) acetylates the lysines on histone tails, that loosens the interactions of the histones and nucleosomes with the DNA. There are loose regions of DNA in between those nucleosomes where we could potentially express genes. When we deacetylate those histones and make them as positively charged as they can be, they interact tightly with the very negatively charged DNA. It is a back and forth action: HAT reduces the charge on histone tails, which reduces the interaction between histones and DNA and favors a loose organization. HDAC removes acetyl groups and restores the positive charge, causing the DNA to be more condensed.

Example: Human hemoglobin consists of several similar globin proteins

Historically, hemoglobin actually arose in organisms that had no use for it as an oxygen carrier. It had to evolve into an oxygen carrying molecule. Genes can duplicate and then evolve differently over time, which is how we end up with several similar globin genes.

What will happen at the other strand?

How can this be synthesized?

If all else fails: Translesion synthesis

If you have a thymine dimer or another large bulky lesion, DNA polymerase cannot synthesize past it, so it just stops and falls off the DNA. If we don't do anything at that point, the replication fork collapses and causes double strand breaks, we can't synthesize our DNA, and our unicellular organism is going to die. We want a way to force replication forward when we hit bulky lesions. There is a polymerase that doesn't care too much about the lesions. It doesn't get nervous like pol 3 and fall off. It synthesizes directly through lesions, but it almost always introduces a mutation by doing so. In short: the normal replicative polymerase doesn't know what to do at the lesion and falls off the DNA, the special translesion polymerase comes in and replicates through the lesion, and then pol 3 comes back and finishes the job. This almost always introduces a permanent mutation, but it's better than dying.

Base damage by oxidizing agents

Intrinsic/extrinsic sources of oxidative damage Free radicals in our cells come from many different sources. Metabolism occurring in our mitochondria is one of the major ones. Inflammation causes the formation of reactive oxygen species. Outside sources such as smoking, radiation, UV, and pollution also induce formation of free radicals in our cells, and all of these are very good at damaging our DNA.

PCR - It all began with an idea

Kary Mullis thought about it on a car ride up the coast Earned Nobel Prize in Chemistry in 1993 Based on in vitro DNA synthesis Has many different uses Revolutionized molecular biology His idea was to utilize DNA polymerase to replicate DNA in vitro. Why would you do that? We need to amplify a particular sequence so that we can actually use it in the lab. We don't want just one copy of DNA, we want millions. This is particularly useful when we are trying to sequence DNA from a mummy or other very old samples, as well as trace amounts of DNA at crime scenes. Suppose you have 1 drop of blood from a crime scene: we can take the DNA from that blood sample and make many copies of it, so that you're not limited to doing just one test on that DNA.

DNA repair mechanisms summary

MMR: repairs wrong nucleotide incorporated (polymerase errors) NER: repairs bulky adducts (T dimers, ethidium bromide intercalation) (large scale) BER: repairs modified bases (deaminated C) (hydrolysis, oxidized bases) NHEJ/HDR: repairs double stranded breaks

Fragile X syndrome

Length of a repeat sequence in the FMR-1 gene is associated with severity of the condition Maternally inherited Another disease marked by the expansion of repetitive sequences. The most common inherited form of intellectual disability in males. It is caused by the expansion of a repetitive sequence. Instead of being in the coding region of the gene, the part that becomes a protein, the expansion happens in the regulatory region of that gene. In transcription, there are regulatory regions upstream of coding regions that help control gene expression levels. Picture the promoter region and the first exon of the coding region. Ahead of the coding region there is a stretch with a lot of cytosines adjacent to guanines. These are called CpG islands or repeats. C next to G is a common occurrence in our genome, and these sites are commonly methylated. When you see large regions of Cs and Gs next to each other, that is a region where the cell can methylate the DNA as a signal not to express the downstream gene. In fragile X syndrome, you get an expansion of these Cs and Gs in the regulatory region, and once there are enough repeats, that region gets methylated and expression of the FMR-1 gene product (the fragile X protein) is shut down. Comparable to Huntington's disease, which is also caused by a repeat expansion. Expansion of repetitive sequences can happen in the coding region and produce a malformed protein, or it can happen in the regulatory region of a gene. In this case, when the region is expanded and becomes methylated, we turn off expression of FMR-1, and this is how you end up with Fragile X syndrome.
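Measuring the length of a repeat tract can be sketched as counting the longest uninterrupted run of a repeat unit. The repeat unit "CGG" and the example sequences are assumptions for illustration (the card only says "Cs and Gs"), and no clinical cutoff is implied.

```python
import re


def longest_repeat_run(seq: str, unit: str = "CGG") -> int:
    """Return the number of copies in the longest back-to-back run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


# Hypothetical alleles: a short tract and an expanded tract.
short_allele = "AT" + "CGG" * 5 + "TT" + "CGG" * 2
expanded_allele = "AT" + "CGG" * 40 + "TT"

print(longest_repeat_run(short_allele))
print(longest_repeat_run(expanded_allele))
```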

Reduced charge on histone tails results in a decondensation of chromatin

Less charged (euchromatin) - acetylated No change in charge (heterochromatin) - methylated

There are many DNA repair pathways to protect the genome

Mismatch repair - Replication (polymerase) errors, where pol adds in the wrong nucleotide. Base excision repair - Damaged bases, such as ones that have gone through hydrolysis or oxidation; the pathway pinpoints where the damaged base is and removes it. *know DNA glycosylase* Nucleotide excision repair - Pyrimidine dimer; bulky adduct on base. Bulky lesions include alkylating and intercalating agents bound to DNA and thymine dimers, which cause a large distortion of the DNA backbone. Double-strand break repair - Double-strand breaks Translesion DNA synthesis - Pyrimidine dimer, apurinic site, or bulky adduct on base. When there is damage in your DNA and DNA polymerase doesn't know which bases to add, an additional polymerase can come in that doesn't mind adding in wrong bases, so it's kind of a last resort. You know you're going to introduce mutations, but at least you're replicating your DNA and surviving.

Histone tails are sites of post-translational modification

Modifications can change tail charge and how the tails interact with other proteins. Histone code: the pattern of which residues are methylated, acetylated, or phosphorylated. The specific pattern on the histone tail signals to different proteins. You might need a combination of different modifications in one region of one histone tail to serve as the proper signal.

Stem cells differ from somatic cells in the overall condensation and accessibility of chromatin

Stem cell: looser chromatin structure (euchromatin) allows for potential activation of any lineage specific genes during differentiation to somatic cells Somatic cell: confined to its lineage specific gene expression pattern A stem cell that is not differentiated is mostly euchromatin, which makes sense because it doesn't know what its final destination is. It keeps all of the genes necessary for all the cell lineages open and available for expression. As a stem cell becomes more differentiated, it packages DNA that it doesn't need into heterochromatin, permanently shutting off different parts of the genome. Let's say this is a cell that becomes a neuron: it packages muscle and fat specific genes into heterochromatin because it won't need to express those genes at any point. We have tissue specific gene expression. We have the SAME DNA, but the way we choose to express it makes the difference.

DNA mismatch repair (MMR)

1. MutS recognizes and binds mismatch 2. MutL and MutH bind 3. Nick is created in DNA by MutH 4. DNA is removed by an exonuclease and filled back in by DNA Pol, DNA ligase The first enzyme involved just recognizes that something is wrong. A mispaired base causes a small distortion in the helix. For example, a T-G base pair is not normal, so it makes a small distortion that the enzyme recognizes. It scans the DNA, and when it finds a distortion it stops and recruits other factors. A nick is then created in the DNA backbone, which means cleaving a phosphodiester bond on one strand of the DNA, not both. (Topoisomerase also nicks the DNA, to let it unwind.) When we have a nick between two adjacent nucleotides, say an A next to a T that isn't connected, ligase allows us to recreate that bond. Every time you make a nick in the backbone or excise some DNA from one strand only, you have an endonuclease activity that creates the nick and a ligase activity that seals it up. So: an enzyme recognizes the lesion and recruits an endonuclease, which cuts a nick into the backbone. Then an exonuclease chews away nucleotides on that one strand. Exonucleases remove nucleotides from an end, working on just one of the strands. They can start at a 5' phosphate or a 3' hydroxyl, and different exonucleases favor one or the other. The nick made by the endonuclease is what allows the exonuclease to remove the nucleotides in the local region around the lesion. Then DNA polymerase comes in and fills in the gap, which leaves a nick. No DNA polymerase is able to seal a strand by connecting a 3' hydroxyl to an adjacent 5' monophosphate; polymerases need an incoming 5' triphosphate. Ligase comes in and uses ATP to activate the 5' phosphate at the nick by attaching AMP to it. The 3' hydroxyl then attacks that activated phosphate, AMP leaves, and the backbone is sealed.
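The recognition step can be sketched as scanning a duplex for positions that are not Watson-Crick pairs. This is only a toy model of "scan until you find a distortion"; the sequences and function name are hypothetical, and both strands are written so that position i on one pairs with position i on the other.

```python
WATSON_CRICK = {"A": "T", "T": "A", "G": "C", "C": "G"}


def find_mismatches(template: str, new_strand: str) -> list[int]:
    """Return 0-based positions where the new strand does not pair with the template."""
    return [
        i
        for i, (t, n) in enumerate(zip(template, new_strand))
        if WATSON_CRICK[t] != n
    ]


template = "ATGCGT"
new_strand = "TACGTA"  # position 4 pairs G with T, the mismatch from the card

print(find_mismatches(template, new_strand))
```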

Failure to correct lesions leads to mutations

Mutations in cells that form gametes will be passed on to progeny (diseases or phenotypic differences) Mutations in cells that do not form gametes (somatic cells) can interfere with gene expression or replication, lead to formation of tumors and cancers, or speed up aging A mutation is a permanent change. Before it becomes permanent, we call it a lesion. When you don't correct a lesion but undergo another round of replication, that's when it becomes a mutation.

DNA ligase

No triphosphate at nick: no energy to form the bond Energy from ATP or NAD+ required (coupled reaction) It is an enzyme that seals nicks in DNA. The RNA primer is removed and degraded by an RNase, and DNA pol 1 fills in the gaps that the RNA primers leave. DNA pol 3 does the bulk of DNA synthesis; DNA pol 1 helps remove the RNA primers along with an RNase and synthesizes DNA to fill in those gaps. The problem the polymerase runs into is that it can't fix the last nick in the backbone. At the nick there are 2 nucleotides, one with a 3' OH hanging off and one with a 5' phosphate hanging off (a monophosphate). DNA polymerases need a 3' hydroxyl plus an incoming nucleotide that carries a triphosphate, so pol 1 can only fill so far and can't seal the last bond. DNA ligase comes in and uses ATP energy to join the two strands together. The two DNA nucleotides sit next to each other but can't be joined without an energy input. DNA ligase transiently attaches AMP (from ATP, or from NAD+ in some bacteria) to the 5' monophosphate, which activates it so that the 3' OH can attack that phosphate, release AMP, and form the phosphodiester bond.

Mutagens come from intrinsic and extrinsic sources

Normal cell metabolism (intrinsic) Chemicals (intrinsic) Hydrolysis (intrinsic) UV light (extrinsic) Within a cell, our bases are subject to many different types of problems. A major byproduct of normal cellular metabolism is reactive oxygen species, which can oxidize our bases. We can also have spontaneous hydrolysis of our bases. Chemicals can alkylate the nucleotides, adding carbon groups onto them. A bulky group on the side of a nucleotide makes it hard for DNA polymerase to determine what the base is, and that interferes with the ability to synthesize DNA during replication. UV light is one of the major extrinsic mutagens. What UV light does is cause adjacent thymines to dimerize (fuse together). When they fuse, they form a cyclobutane ring, which becomes a very large distortion in the DNA. We have ways of repairing all these different types of damage.
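Since dimers form between adjacent thymines on the same strand, the places where UV could cause one can be sketched as a search for "TT" steps. The sequence below is made up, and this only lists candidate sites; it says nothing about how likely each one is to actually dimerize.

```python
def potential_thymine_dimer_sites(strand: str) -> list[int]:
    """Return 0-based indices i where strand[i] and strand[i+1] are both T."""
    return [i for i in range(len(strand) - 1) if strand[i:i + 2] == "TT"]


strand = "ATTGCTTTA"
print(potential_thymine_dimer_sites(strand))
```

Note that a run of three Ts contains two overlapping candidate sites.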

Splicing requires two transesterification reactions

Notice that the 2' OH is the nucleophile Introns are excised and later degraded Exons are joined The first reaction is an attack by the 2' hydroxyl group of the branch point A on the phosphate at the 5' splice site. The A attaches itself to that phosphate, which forms a loop and releases the 5' exon. The looped intermediate is called a lariat (loop structure). To get rid of it, there is a second transesterification reaction, where the 3' hydroxyl group of the exon that was just released attacks the phosphate at the 3' splice site. That effectively joins together the 2 exonic sequences. In other words, the branch point A bonds to the guanine at the 5' end of the intron, then the 3' hydroxyl group of the last nucleotide of the first exon attacks the phosphate of the very first nucleotide of the 2nd exon, joining the two exons together. The product that gets released and degraded is called an intron lariat. Every last nucleotide of the exons is kept; we don't lose nucleotides, and the intronic sequences are spliced out precisely.

Discontinuous Replication

Occurs on the lagging strand. Know how to draw. Within each replication bubble there are two replication forks. Within each replication fork, we have a lagging strand and a leading strand. You can see on the figure that the strand that is lagging on one side of the bubble is the leading strand on the other side of the bubble, because the two replication forks move away from each other in opposite directions. The way lagging strand (discontinuous) replication occurs is that a primer is laid down by primase, DNA polymerase comes in and synthesizes the DNA in that region, and then primase jumps back toward the fork and synthesizes another primer. Overall the strand grows in a backwards direction relative to the fork, but each piece is still synthesized forward in a 5' to 3' direction. The leading strand replicates in one continuous fashion and the lagging strand replicates in multiple pieces. As the fork continues to move and unwind, more single stranded DNA is revealed and primase can lay out another RNA primer. The attached DNA polymerase complex (it has 3 polymerases in the entire complex) synthesizes off of those lagging strand primers while one polymerase synthesizes off the primer on the leading strand.

PCR

PCR - polymerase chain reaction A method to make copies of specific pieces of DNA and amplify them Uses DNA polymerase as the enzyme We have a double stranded piece of DNA. The darker region is the one we want to amplify. Remember that when we know what's on one strand, we know what's on the other strand. The idea is that we can denature the double stranded DNA molecule, fill in what's on the other side of each strand, and after one round of replication have two copies of it. We use primers to do this. Primers are like probes except they don't have a detection mechanism associated with them. They aren't fluorescent or labelled. They are complementary to your DNA sequence. Much like a probe anneals to a single strand of DNA, a primer does the same thing. We need a primer because DNA polymerase cannot start synthesizing DNA de novo; it has to have an existing 3' hydroxyl group on an end. The 3' hydroxyl group on the most recently incorporated nucleotide is the nucleophile that attacks the alpha phosphate (the one attached to the 5' carbon) of the incoming nucleotide triphosphate, and pyrophosphate is the leaving group. DNA polymerase does not just add the first nucleotide on its own. It needs an end that has that nucleophile on it. Each primer is around 22 bases in length. The primers anneal in the regions of DNA flanking your gene of interest. Once the two strands are separated, one primer anneals on each strand at the outside edges of the region. DNA is then synthesized in both directions, and this copies the region of DNA that we are interested in.
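Because each cycle at best doubles the target, the number of copies grows exponentially. A minimal sketch, assuming every template is copied in every cycle unless a lower efficiency is given; the `efficiency` parameter is an illustrative assumption, not something from the lecture.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` rounds.

    efficiency = 1.0 means perfect doubling each cycle;
    efficiency = 0.0 means no amplification at all.
    """
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (1, 2, 10, 30):
    print(n, pcr_copies(1, n))
```

With perfect doubling, a single starting molecule passes a billion copies at 30 cycles, which is why trace samples become workable.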

Eukaryotes have three RNA polymerases with specialized functions

Pol 1 - large rRNA Pol 2 - mRNA Pol 3 - tRNA, small rRNA and other small RNAs All of these eukaryotic RNA polymerases contain two large subunits and 12-15 smaller subunits There are 3 different RNA polymerases in EK, each of which is responsible for transcribing certain types of RNAs. RNA pol 2 is the one we mean when we talk about transcription in EK. It makes all the mRNAs that get translated into proteins. RNA pol 1 transcribes the large ribosomal RNAs, which don't get translated; instead they are incorporated into the ribosome together with the ribosomal proteins. RNA pol 3 is responsible for synthesizing some of our smaller RNAs, such as tRNAs. There are also lots of other small RNAs, such as small interfering RNAs, that are not translated into a protein but serve some sort of regulatory function.

mRNA transcripts get a poly-A tail on their 3' end

Poly A tail is important for: -Protection from degradation -Transport out of the nucleus The polyA tail is added once the RNA is cleaved. Within the DNA, there is a sequence called a polyA signal. Once the polyA signal is transcribed into the RNA, there are enzymes that recognize it. These enzymes were already recruited by serine 2 phosphorylation, so they're on the CTD tail, ready to go. The factors don't recognize the sequence within the DNA, but as soon as it is transcribed into RNA they hop off of the CTD tail and onto the polyA site. To the cleavage factors, the polyA signal marks the end of the transcript that we want to make into an mRNA and then a protein. Anything after it should not be included in this RNA, so they cleave the RNA at that point. After the cleavage factors cleave the RNA at the polyA site, polyA polymerase (PAP) adds a string of adenines there. There are also proteins called PABPs, polyA binding proteins, which specifically bind the polyA tail and help protect it from exonucleases that target the 3' end of the RNA molecule. So we have protection at the 5' end from the cap, and protection at the 3' end because we've added a sequence that can be degraded without losing the message. We would still like to keep most of the polyA tail, because it also serves as a signal for initiating translation, so proteins associate onto it to protect it further. Think of SSBPs, which nonspecifically bind single stranded DNA: PABPs are proteins that bind any long sequence of As on a new RNA molecule. Once we have a cap and a tail and we have spliced out our introns, that's considered a mature mRNA. Before that, it's considered a pre-mRNA. Pre-mRNA is sometimes called hnRNA (heterogeneous nuclear RNA), because the RNAs being synthesized initially contain introns and vary widely in size. As the introns are cut out in the RNA splicing process, the RNA molecule becomes shorter. The RNA starts out long in the nucleus, and once it's fully processed it's a much shorter molecule, a mature mRNA.
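The cleave-then-add-As order of events can be sketched on a toy transcript. The hexamer "AAUAAA" used as the polyA signal, the downstream offset, and the tail length are all illustrative assumptions; the card itself does not give a sequence or distances.

```python
def cleave_and_polyadenylate(
    transcript: str,
    signal: str = "AAUAAA",   # assumed polyA signal sequence
    offset: int = 15,          # assumed distance from signal to cleavage site
    tail_length: int = 20,     # assumed number of As added by PAP
) -> str:
    """Cut the RNA downstream of the polyA signal, then append a polyA tail."""
    idx = transcript.find(signal)
    if idx == -1:
        return transcript  # no signal transcribed yet: nothing to cleave
    cut = min(idx + len(signal) + offset, len(transcript))
    return transcript[:cut] + "A" * tail_length


# Hypothetical transcript: some message, the signal, then extra downstream RNA.
transcript = "GGG" + "AAUAAA" + "C" * 30
processed = cleave_and_polyadenylate(transcript, tail_length=5)
print(processed)
```

Everything past the cleavage site is discarded, matching the idea that anything after the signal should not be included in the mRNA.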

TBP (TATA binding protein) binds to the minor groove of the DNA molecule and causes a sharp bend, making it easier to unwind/melt the DNA

Positively charged lysine and arginine residues of TBP interact with negatively charged phosphates of DNA backbone Insertion of TBP phenylalanine residues between bases produces kink in DNA The TATA binding protein binds to the minor groove. It physically bends the DNA, which facilitates the ability of TFIIH to unwind it. We are not entirely sure why TBP doesn't bind within the major groove; it's just an exception to the rule. Even from the minor groove it is still able to recognize its site with a bit of specificity, because for those TA and AT base pairs it doesn't need to discern which strand each base is on.

PCR Reaction - Overview

Primers: Oligonucleotide primers which bound the region to be amplified are made Polymerase: Taq polymerase, isolated from Thermus aquaticus, is resistant to heat denaturation Reaction: Repeated many times to amplify a specific region from as little as one copy of DNA How do you determine appropriate temperatures for hybridization? The first step is heating up your DNA sample, the denaturing step (usually 94 C). After denaturing (allowing all of the DNA to become single stranded), as we start to cool it down, the primers, which are already single stranded and present in abundance, start annealing. Once a primer is annealed, the original complementary strand can't come back. For every original double stranded DNA molecule that gets separated by denaturation and has primers annealed to it, the elongation phase is when DNA synthesis occurs. The time we allow for elongation depends on the size of the DNA we are trying to synthesize. If you are trying to synthesize a large piece of DNA, you have to increase the elongation time. What kind of polymerase would function in a setting where you're constantly heating it up to 94 degrees? Mammalian proteins would be completely denatured at that temperature. The idea was to use DNA polymerases from organisms that naturally live at very high temperatures. The polymerase used in PCR is Taq polymerase. Any polymerase you use in PCR has to be able to withstand high temperatures and repeated heating and cooling and keep its function. Taq is isolated from Thermus aquaticus, a heat-loving bacterium first found in hot springs. There are different kinds of organisms that thrive at really high temperatures, and we can purify the DNA polymerase from them. You might use immunoprecipitation or affinity chromatography to isolate a DNA polymerase from a sample. We can purify it, but it's an expensive necessity for these reactions. The denaturing temperature isn't the best temperature for Taq polymerase to function at. It functions best at 72 C. That's when it has its maximum activity and does its synthesis, so it is a perfect system: you heat the DNA up so that it separates, and the polymerase doesn't like that temperature but is not destroyed. It isn't really working until you have it at 72 C. It's not happy at 55 C either. While you go back and forth between denaturing and annealing, the DNA polymerase is not functional. It is only functional during the elongation step. The elongation time is specific for how long the DNA segment is that you are amplifying, and the annealing temperature is specific for the G-C content of the primer. G-C pairs make three hydrogen bonds and A-T pairs make two, so a primer with a higher G-C content anneals at a higher temperature. This is also true for denaturing: the higher the G-C content, the higher the melting temperature. These are all considerations we have to make when setting up PCR reactions. They are conducted in expensive and precise heaters (thermocyclers). You tell the machine how many steps and how many cycles you want it to go through, set it, and forget it. A run can take several hours depending on your elongation phase. When the reaction is done, you take the sample out and you have many more copies of that DNA. In the reaction mixture, you have the original sample, the primers, nucleotides (dNTPs), the DNA polymerase, and a specific buffer with Mg.
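How G-C content feeds into the annealing temperature can be sketched with the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in degrees C. This rule is a rough estimate meant for short oligos and is not from the lecture; the primer sequences are made up, and the "5 degrees below Tm" starting point for annealing is a common convention assumed here, not a fixed requirement.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    primer = primer.upper()
    return (primer.count("G") + primer.count("C")) / len(primer)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc


def suggested_annealing_temp(primer: str, margin: int = 5) -> int:
    """Assumed convention: start a few degrees below the estimated Tm."""
    return wallace_tm(primer) - margin


at_rich = "ATATTAATATTAATATATTAAT"   # hypothetical 22-mer, 0% GC
gc_rich = "GCGGCCGCGGCCGCGCGGCCGC"   # hypothetical 22-mer, 100% GC

for p in (at_rich, gc_rich):
    print(p, gc_fraction(p), wallace_tm(p), suggested_annealing_temp(p))
```

The two extreme primers show the trend from the card: same length, but the G-C rich one has a much higher estimated melting temperature.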

Primase can synthesize a new primer on the template generated by Telomerase

1. Priming by primase, gap filled by polymerase, and nick sealed by ligase 2. RNA primer removal 3. Protection of single-stranded 3' end by telomere DNA-binding proteins This is where an end would have been lost, but telomerase extended the 3' end by reverse transcription. Then primase lays an RNA primer near the end, complementary to the extended 3' end of the chromosome. DNA synthesis can extend from that primer to fill in the gap, but we still have to remove the primer at the end. So we still have the same problem: an overhang that is unprotected and will most likely be degraded. Proteins bind on the end, so the ends of our chromosomes are a mixture of DNA and proteins holding everything together. We are essentially going through the same process as in DNA replication, synthesizing from an RNA primer: we lay the primer and remove it, and we still have an end that isn't particularly stable, but at least now it is farther away from our precious genetic information. So every time our cell divides, we extend the end and then protect it, so that the piece that gets degraded is as far away from our genetic information, our key to survival, as possible.

DNA replication is initiated at 'origins of replication' (ORI) where the double strand is unwound

Prokaryotes: typically one ORI. Eukaryotes: typically multiple ORIs. Denser label indicates areas most recently replicated: the replication forks. DNA replication is BI-DIRECTIONAL: two forks at every origin. Continuous synthesis on leading strand. We know that prokaryotic DNA is a circular chromosome. Replication begins at one origin, proceeds around the circle, and at the end you have two double-stranded rings (the two daughter duplexes). That is efficient for E. coli, which has a smaller genome that's circular. In eukaryotes we have long linear chromosomes. If we had just one origin of replication, it would take forever to replicate our DNA. What we see in eukaryotes is multiple ORIs whose replication bubbles grow and fuse together; they help each other out. Between all of them, we are able to synthesize eukaryotic chromosomes more rapidly. If you provide radiolabeled nucleotides partway through synthesis, you end up with pieces of DNA that are half radioactive and half non-radioactive. The denser label marks the DNA that was replicated most recently. An autoradiogram shows where the radioactivity is in the chromosome: each replicating region has a space in the middle that marks the part of the DNA that is not radioactive. So we can label DNA and trace how it is actually replicated.
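To get a feel for why multiple origins matter, here is a back-of-the-envelope sketch. It assumes all origins fire at the same time, are evenly spaced, and that every fork moves at the same constant speed, which is a simplification. The chromosome size and fork speed are illustrative round numbers, not measured values.

```python
def replication_time_s(chromosome_bp: int, origins: int, fork_rate_bp_per_s: float) -> float:
    """Idealized time to copy a chromosome when all origins fire together."""
    forks = 2 * origins  # replication is bidirectional: two forks per origin
    return chromosome_bp / (forks * fork_rate_bp_per_s)


# Illustrative numbers only: a 1,000,000 bp chromosome, forks moving 500 bp/s.
one_origin = replication_time_s(1_000_000, origins=1, fork_rate_bp_per_s=500)
ten_origins = replication_time_s(1_000_000, origins=10, fork_rate_bp_per_s=500)

print(one_origin, ten_origins)
```

Ten origins cut the idealized time by a factor of ten, which is the whole point of having many ORIs on a long linear chromosome.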

Different chromatin remodelers can move, eject, or replace nucleosomes

Promoter can only be used when accessible (not nucleosome bound). Chromatin remodeling happens through the actions of many different protein factors. HATs loosen the histone-DNA interaction (and HDACs reverse that), but neither physically removes the nucleosome. Say we need to express a gene that's wrapped around a histone: we can reposition the DNA by sliding the nucleosomes around, we can eject a nucleosome and release the DNA, and we also have histone variants that can be swapped in, which control different types of gene expression and recruit different proteins for different purposes. The promoter is the regulatory region that controls gene expression. It is where all of the transcription machinery docks, and here it's wrapped around the nucleosome. First the DNA interaction is loosened through acetylation of lysine, then bromodomain-containing proteins are recruited to those acetylated lysines, and they act further on the DNA by repositioning it outside of the nucleosome. The end result is that you've exposed the promoter region of a gene and you can start to transcribe the DNA in that region.

Protein coding genes represent a small part of the genome and the majority is made of repetitive sequences

Protein coding genes (1.5%), LINEs (20.4%), SINEs (13.1%), LTR retrotransposons (8.3%), DNA transposons (2.9%), simple-sequence repeats (3%), segment duplications (5%), long repetitive sequences (e.g. centromeres, telomeres) (8%), miscellaneous unique sequences (11.6%), introns (25.9%). Protein coding genes (1.5%) are what we want to unravel and expose so that the transcription machinery can sit on the DNA and cause gene expression to occur (making an mRNA). Introns (25.9%) are regions inside a gene that get spliced out to make a mature mRNA, so they don't become part of the final protein. We have a lot of these sequences in our DNA. In a gene sequence you'll see a short piece that's an exon, which gets incorporated into the mRNA, and then long sequences called introns that get spliced out. Most of the length of our genes is made of introns. We also have a lot of different repetitive sequences, such as LINEs and SINEs, as well as the long repeats found at centromeres.

Rho-independent termination

RNA polymerase pauses. The hairpin moves into the exit groove of the polymerase, halts transcription, and dissociation occurs. 1. A rho-independent terminator contains an inverted repeat followed by a string of approximately six adenine nucleotides. 2. The inverted repeats are transcribed into RNA. 3. The inverted repeats in the RNA fold into a hairpin loop, which causes RNA polymerase to pause. 4. The hydrogen bonds in the A-U base pairs break. 5. The RNA transcript separates from the template, terminating transcription. The inverted repeat sequences base pair with each other, and the intervening sequence becomes the loop. Picture the structure forming as part of the mRNA: as soon as that region is transcribed from DNA to RNA and the inverted repeats exit the RNA polymerase, they base pair and form the looped structure, which puts strain on the polymerase. Whatever comes out of the exit channel folds into a secondary structure that pushes up against the polymerase. The second feature of the terminator is the nucleotides that immediately follow the hairpin: a string of U's in the RNA. In the DNA, the sequence is inverted repeats followed by a string of A's. The A-U base pairs of the temporary RNA/DNA hybrid are a lot weaker than G-C base pairs would be. So two things happen at once: the RNA forms a large, bulky structure pressing up against the RNA polymerase, and the only RNA/DNA hybrid left is made up of weak U-A base pairs. If you were to mutate that sequence so that there was a GC in the middle of it, we would not get proper termination at that point. It is the dual requirement of inverted repeats that form a hairpin, followed by a string of A-U base pairs, that causes the RNA molecule to fall off the DNA and separate entirely from the RNA polymerase. 
The hairpin isn't the whole RNA molecule, just the 3' end of it. The entire RNA dislodges at that point, and that is the completion of transcription.
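The dual requirement (hairpin plus a run of U's) can be sketched as a simple scan over an RNA sequence. This is a toy under stated assumptions: it only accepts perfectly complementary stems of a fixed length, the stem length, loop sizes, and U-run length are arbitrary defaults, and the test sequence is invented. Real terminators are less tidy.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Sequence that would base pair with rna, written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def find_terminator(rna: str, stem: int = 6, loop_range=(3, 8), u_run: int = 6) -> int:
    """Return the start index of a perfect hairpin followed by a U-run, or -1."""
    for i in range(len(rna) - 2 * stem - loop_range[0] - u_run + 1):
        left = rna[i:i + stem]
        for loop in range(loop_range[0], loop_range[1] + 1):
            j = i + stem + loop                      # start of the second repeat
            right = rna[j:j + stem]
            tail = rna[j + stem:j + stem + u_run]    # bases right after the hairpin
            if right == reverse_complement(left) and tail == "U" * u_run:
                return i
    return -1


# Invented example: GC-rich inverted repeat, 4-nt loop, then six U's.
wild_type = "AAGG" + "GCCGCC" + "UUCG" + "GGCGGC" + "UUUUUU"
# Same sequence with a G dropped into the middle of the U-run.
mutant = wild_type[:-4] + "G" + wild_type[-3:]

print(find_terminator(wild_type), find_terminator(mutant))
```

The mutant still has the hairpin, but with a G interrupting the U-run the scan no longer reports a terminator, mirroring the point that both features are needed.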

What does this experiment suggest?

RNA becomes shorter as it matures. Distinct intervening sequences are removed in the maturation process, which is shown by the discrete bands on the Northern blot instead of a smear. Northern blot with a probe for ovomucoid mRNA (mRNAom) in chicken (the probe base pairs with everything complementary to it, so it detects every RNA that contains the mRNAom sequence).

The end replication problem in linear (eukaryotic) chromosomes

The RNA primer will be removed, but DNA pol 1 can't synthesize new DNA there because there is no upstream primer (no free 3' end) to extend from. Result: chromosomes get shorter with each cycle of replication. In eukaryotic chromosomes we have a problem called the end replication problem. On a linear chromosome, some primers are laid at the 3' end of a template strand (the very end of the chromosome). We need an RNA primer to initiate synthesis on that strand, and we know that RNA primers have to be removed and the little space filled in by DNA pol 1. At the very end, though, DNA pol 1 can't start synthesizing to fill in the gap left by the last lagging-strand primer. The region is left as an overhang, and single-stranded DNA isn't particularly stable: it is open to degradation, and an exonuclease could start chewing away at the end. Through subsequent cell divisions, every time your cell divides, you lose information on the ends of your chromosomes. So we have a system in place that elongates the chromosomes at every round of replication, so that even if you lose a little bit of sequence, it's essentially junk and not precious genetic information. You can see in the figure that after subsequent rounds you lose pieces: every time you replicate, you lose a piece of DNA equivalent to the size of the RNA primer that was placed.
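The shortening can be written out as simple arithmetic. This sketch assumes a fixed loss per division equal to the primer length and no telomerase activity; the starting length and primer size below are hypothetical round numbers chosen only to make the trend visible.

```python
def telomere_lengths(start_bp: int, loss_per_division_bp: int, divisions: int) -> list:
    """Length of the chromosome end after each division, with no telomerase."""
    lengths = [start_bp]
    for _ in range(divisions):
        lengths.append(max(0, lengths[-1] - loss_per_division_bp))
    return lengths


def divisions_until_exhausted(start_bp: int, loss_per_division_bp: int) -> int:
    """Number of divisions before the non-coding buffer is used up."""
    return -(-start_bp // loss_per_division_bp)  # ceiling division


# Hypothetical values: 1000 bp of buffer, 10 nt lost per division.
print(telomere_lengths(1000, 10, 3))
print(divisions_until_exhausted(1000, 10))
```

Once the buffer runs out, further loss would start eating into real genetic information, which is the problem telomerase addresses.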

DNA repair: Nucleotide Excision Repair (NER)

Recognizes and removes bulky lesions. Similar repair pathways are found in all organisms, from bacteria to humans. UvrAB recognizes and binds a 'bulky' distortion of DNA (e.g. caused by a thymine dimer or a bulky chemical adduct). UvrC nicks the DNA on either side of the distortion, and UvrD helicase removes the piece of DNA. DNA Pol and ligase repair the lesion. Bulky lesions can be caused by UV light, which creates thymine dimers: a lesion that sticks out and distorts the DNA double helix, as opposed to a mismatch, which is smaller. NER also repairs damage from DNA intercalating agents, which wedge themselves into the DNA, and alkylating agents, which add bulky carbon chains onto nucleotides. These are categorized as large-scale problems. First an enzyme recognizes the bulky lesion, then a nick is made in that local region. An endonuclease creates the nick in the backbone, but instead of an exonuclease, NER uses a helicase to rip out a piece of the DNA. Helicases are good at unwinding DNA, and they use ATP energy to do so. Methylation isn't a factor in deciding which strand to cut in NER, because the lesion is huge. In MMR the lesion is small and looks the same from either strand: the distorted base pair causes slight angle changes that stick out on both sides. Intercalating agents and thymine dimers, by contrast, are really big and sit on one side of the DNA, so it's obvious to the enzymes what they need to do and there's no need to rely on methylation. In summary: recognition of the lesion, an endonuclease makes a nick, a helicase rips out the piece of DNA that contains the lesion, DNA polymerase fills in the gap, and ligase seals it. This is analogous to MMR, but MMR fixes small things with an exonuclease and NER uses a helicase for larger lesions.

DNA repair: Base Excision Repair (BER)

Removal of chemically modified bases. Many different enzymes that recognize specific common modifications have been identified: -deaminated C to uracil -deaminated 5-methyl C to thymine -8-hydroxyguanine -3-methyladenine. Example: Uracil DNA Glycosylase (UDG) repairs cytosine deamination to uracil. Glycosylases scan the minor groove and facilitate base-flipping. How glycosylases find altered bases is still unclear. After the U is cleaved off of the deoxyribose, the phosphodiester backbone is cleaved. DNA polymerase and ligase repair and ligate the lesion. MMR is only for polymerase errors. If it's just 1 funky base that wasn't added by the polymerase, it's going to be fixed by BER. All the base hydrolysis and deamination damage is repaired by BER. Modified bases find their way into DNA and need to be repaired (see the list above). Uracil is a very big problem because uracil doesn't belong in DNA. We have a host of enzymes that recognize these deamination and hydrolysis products, identify a base that shouldn't be there, and take it out. In many cases we don't know how these enzymes do this. They work by removing the nitrogenous base. BER involves a class of enzymes called DNA glycosylases. A glycosylase cleaves the beta-glycosidic linkage: it leaves the sugar and the phosphate in the DNA and removes just the base. One of the most common is Uracil DNA Glycosylase (UDG), which repairs the common deamination of C to uracil. UDG scans DNA and, if it finds a uracil, removes it. Removing a base leaves an abasic site: the sugar and phosphate are still there, so nothing has been done to the backbone of the DNA, but the base has been cleaved out. 
After the base is removed, an endonuclease severs the phosphodiester backbone at the abasic site. A DNA polymerase then removes the leftover sugar-phosphate and fills in the gap with the correct base. The DNA pol that functions in BER only adds one base back. Once the DNA pol does that, a ligase ligates the backbone back together.

Let's look at what is actually happening

Replication fork: double stranded DNA that is separated (like a zipper). We separate the strands to synthesize the complementary strand. There are some key players in DNA replication, a bunch of enzymes; know their functions, not the actual names. DNA polymerase does the synthesis. When the DNA strands are separated, binding proteins protect the DNA. Those are single stranded binding proteins. Any DNA that's separated could be subject to degradation, so single stranded binding proteins coat any single stranded DNA and prevent it from being degraded. DNA is replicated in two different ways. We always synthesize DNA in a 5' to 3' direction. Because the two strands are antiparallel and have opposite polarity, we cannot synthesize continuously toward the fork on both of them. On one strand, the direction of travel is 5' to 3', so the DNA polymerase can move right along and synthesize. On the other strand, we have to synthesize backwards. In bacteria, the replication machinery actually has 3 polymerases all attached to each other; we don't show the physical attachment between the polymerases. The lagging strand (the one that is not in the correct direction for synthesis) is synthesized in small pieces that are ligated back together. The replication fork is where DNA is replicated for the purpose of cell division. Every cell does this before going through mitosis (S phase is when you are synthesizing new DNA to make copies, so we can split genetic information equally between two daughter cells).

Can we reverse long term repression of stem cell genes in somatic cells?

Repression of cell lineage specific genes with potential for their activation. Long term repression of non-cell type specific genes through histone modifications and DNA methylation. The idea is that we will be able to reprogram these differentiated cells to become more like stem cells. Out of the proteins and factors we have learned, which would you overexpress in a differentiated cell to make it a stem cell? You would overexpress HAT. We could target our heterochromatin to become more like euchromatin and open up regions of the DNA that have been sequestered and completely turned off. The idea is to remove the signals that drive heterochromatin formation so that we have a more undifferentiated cell. If we go back to square 1, where the cell is undifferentiated, we can drive it towards becoming a different type of cell for different therapies.

Transcription Termination

Rho-independent: GC rich stem-loop followed by a run of U's. Rho-dependent: pause sites become termination sites in the presence of the protein factor rho. Transcription termination in PK takes 2 different forms: Rho-independent and Rho-dependent. Rho is a factor (rho helicase). We either depend on Rho to terminate transcription, or we use a version that can terminate transcription without Rho. The rho-independent version involves sequences in the DNA and RNA: the RNA forms a looped structure that interferes with any further transcription. Rho-dependent termination relies on the physical action of a helicase enzyme to stop transcription at a certain point.

Short Tandem Repeats

STR = Short Tandem Repeat -Stretches of DNA that are repetitive. STRs are usually highly variable in length between different (unrelated) people -Someone might have 10 AGATs in a row, another could have 17 AGATs in a row. -These repeats occur at the same place in the genome. The repeat units aren't long; our example has 4 nucleotides that just repeat over and over. When DNA polymerase in our body replicates these highly repetitive sequences, it sometimes slips, which is called polymerase slippage. It slides off the DNA, and where it picks up again it may skip a repeat, so sometimes it adds or deletes repeats. DNA polymerase doesn't like repetitive sequences. That's why we end up with individuals who have different numbers of repeats. By chance two people can have the same number. We can look at repetitive sequences throughout the genome, and by doing this analysis at many different locations we can match DNA samples with odds on the order of 1,000,000 to 1. There are 7-17 STR regions that are commonly analyzed, as well as the X and Y chromosomes. A given STR is located in the same region of the genome in everyone, but its length differs slightly. What we can do is design PCR primers that amplify a region containing the repetitive sequence and see how large that fragment is. The more repeats someone has, the larger their fragment will be from the PCR reaction.
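Counting repeats and predicting the PCR fragment size is easy to sketch. The AGAT unit and the 10 vs 17 repeat counts come from the notes above; the flanking sequences are invented placeholders standing in for whatever the primers would bracket.

```python
import re


def max_tandem_repeats(seq: str, unit: str = "AGAT") -> int:
    """Longest run of back-to-back copies of unit found in seq."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq)
    return max((len(run) // len(unit) for run in runs), default=0)


def amplicon_length(left_flank_bp: int, right_flank_bp: int, repeats: int, unit: str = "AGAT") -> int:
    """Expected PCR fragment size: both flanks plus the repeat block."""
    return left_flank_bp + repeats * len(unit) + right_flank_bp


# Made-up flanks (4 nt each) around 10 and 17 AGAT repeats.
person_a = "CCTG" + "AGAT" * 10 + "GGTC"
person_b = "CCTG" + "AGAT" * 17 + "GGTC"

print(max_tandem_repeats(person_a), max_tandem_repeats(person_b))
print(amplicon_length(4, 4, 10), amplicon_length(4, 4, 17))
```

Seven extra repeats of a 4-nt unit make the second fragment 28 bp longer, which is the size difference a gel would resolve.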

If replication errors are not corrected, they become permanent mutations

Say we start with a TA base pair, and one strand ends up with an unusual base that you weren't expecting at that position. The parental strands are in blue. When we separate them for replication, on one side we get a TA, which is what we expect. On the other side, the polymerase doesn't recognize the base, so it can just add whatever nucleotide fits best. After another round of replication we get a GC: two rounds of replication take us from an AT to a GC. Or say we have an AT base pair and, after replication, one duplex has an A paired with a G. At this point we could still repair it. We have to figure out which strand is new and which is old, because whatever is in the new strand is the side with the error. We need to excise the G and put in a T. If we don't catch this lesion and the DNA is replicated for another round, it becomes a mutation: the A gets paired with a T and the G gets paired with a C. That looks perfectly normal to the DNA polymerase and to everything else that binds and interacts with DNA. There's no distortion of the backbone, and the change will persist in this DNA population through further rounds of replication. In some of the molecules we are propagating, an AT base pair has become a GC base pair. That change can fall in a very important region of the protein that the gene encodes.

Using PCR for criminal investigations

Set up a PCR reaction using primers that 'bracket' the VNTR (STR) locus. PCR produces a pair of bands of amplified DNA, one maternal and one paternal. The length of the amplified DNA depends on the exact number of repeats at the locus. 3 VNTR loci analyzed (3 PCR rxns), producing 6 bands (6 PCR products) after gel electrophoresis. More loci examined = more confidence. When examining the variability at 5-10 different VNTR loci, the odds that two random individuals would share the same "fingerprint" by chance are ~1/10 billion. We have to remember that we have homologous chromosomes. For each STR region that we perform a PCR on, we expect to see two bands unless they're exactly the same length. Often what you inherit from mom and dad will be different, so you have a paternal band and a maternal band for each PCR. VNTR is used here as another term for STR. We can take the PCR products from all of the STR regions we're looking at, combine them into one sample, and run that in a single well on the gel. This makes it easier to compare samples from different people. This example is 3 different individuals being compared to a forensic sample. You get 6 bands for each individual, one lane per person. Individual B matches the forensic sample. The bands can be very close in length, but we can resolve them to see individual bands and compare them to the samples of interest. We just have to look and see which ones match. The more repetitive sequences you look at, the more convinced you can be about the match: 1 in 10 billion if we look at up to 10 different STR regions.
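The lane-by-lane comparison can be sketched as data. Everything here is hypothetical: the locus names, the fragment sizes in bp, and the three individuals are invented to mirror the 3-loci, 6-band example above. A locus where both alleles are the same length contributes one band instead of two.

```python
def bands(profile: dict) -> list:
    """Distinct fragment sizes that would show up in one gel lane."""
    return sorted({size for pair in profile.values() for size in pair})


def matching_individuals(sample: dict, individuals: dict) -> list:
    """Names of individuals whose lane has exactly the same bands as the sample."""
    return [name for name, profile in individuals.items() if bands(profile) == bands(sample)]


# Hypothetical fragment sizes (bp): (maternal, paternal) at three loci.
forensic = {"locus1": (112, 128), "locus2": (201, 213), "locus3": (305, 321)}
individuals = {
    "A": {"locus1": (108, 128), "locus2": (201, 209), "locus3": (305, 317)},
    "B": {"locus1": (112, 128), "locus2": (201, 213), "locus3": (305, 321)},
    "C": {"locus1": (112, 112), "locus2": (205, 213), "locus3": (309, 321)},
}

print(bands(forensic))
print(matching_individuals(forensic, individuals))
```

Individual C is the same-length case at locus1, so that lane shows 5 bands rather than 6.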

Pulse-chase experiments (how Okazaki discovered the fragments)

Short 'pulse': treatment with a (radio) labelled compound. Measure the labelled compound concentration in the molecule of interest at different time intervals (chase). There are many applications, but essentially you give a short labelling time (pulse) with a radiolabelled or fluorescently labelled compound; in this case we are talking about nucleotides. We expose growing cells to a labelled nucleotide triphosphate that gets incorporated into the DNA. Before we expose the cells to it, we don't see radioactivity in the cells. As soon as we introduce the radiolabelled nucleotide, we see a spike in the amount of radioactive material incorporated into the DNA. The way you generate a curve like that is with different time points. The pulse is whatever you are giving to the cells. The chase is examining what happens to the label over time. We had a time point before the introduction of radioactivity, then others at 3, 4, 5, and 6 hours. You run them in parallel: 5 time points, 5 batches of cells. The batch for the 6 h time point is exposed to the molecule for 6 hours, and we look at what's happening at 6 hours of labelling. We could stop a time point earlier; we would just need multiple samples. They aren't the same cells sampled at multiple time points, because to take a reading you have to isolate the DNA and detect how much radioactivity is in it, and that is the end of the experiment for that batch. So you set up a series of samples in parallel, give them the same treatment, and examine what that treatment is doing at different time points. Pulse: grow cells in radiolabelled nucleotides. Chase: density gradient centrifugation of the DNA after the indicated periods of time.

The sigma factor mediates binding of Polymerase to promoter

Sigma regions interact with different elements of the promoter. The UP-segment is recognized by the alphaCTD (carboxy-terminal domain of the polymerase alpha subunit). There are other regions within a promoter that are very common. One, in PK, is called an UP element or upstream element. It sits further upstream from the -35 and -10 boxes. The -35 and -10 boxes are the minimal requirements to initiate transcription, but not all promoter regions are very strong at initiating transcription. Adding another element that the RNA polymerase interacts with can help boost transcription levels, particularly at the weaker promoters. The region of the RNA polymerase that interacts with the UP element is the alpha-CTD, the C-terminal end of the polymerase alpha subunit. That's what is sticking off like a tail. It can interact with nucleotides in the UP element to help drive transcription initiation forward. There are 2 different sites within the sigma factor: one interacts with the -35 region and one interacts with the -10 region, and a very distinct part of the sigma factor interacts with each. If an UP element is present in our promoter region, it's the alpha-CTD tail that interacts with that element.

Transcription can be terminated using multiple mechanisms, e.g. the torpedo model

Similar to rho-dependent transcription termination, where a factor acts on the RNA. The RNA is cleaved at the polyA site. After the RNA is cleaved off, the RNA polymerase keeps going: it keeps transcribing even though the region we need for a complete mRNA has already been synthesized and cleaved off. The mRNA was processed early on by adding a 5' cap, which protects it from exonucleases. Once the capped RNA has been cleaved away, RNA polymerase continues synthesizing RNA, but the 5' end of the new RNA coming out of the RNA exit channel has no cap and is subject to degradation by an exonuclease. This is why it is critical to cap our RNA: exonucleases are hanging around, poised to find RNA that is not capped and degrade it. An exonuclease latches onto the uncapped RNA and degrades it until it reaches the polymerase, then pulls the RNA out, and you no longer get transcription. That disrupts the DNA/RNA hybrid inside the polymerase, and the polymerase dislodges from the DNA at that point. It's like a factor that follows the polymerase, chewing up the RNA until it catches up and pulls the RNA out. In PK, rho helicase follows along behind the polymerase until the polymerase pauses; then rho can catch up and essentially rip the RNA out and stop transcription. Know that an exonuclease is responsible here for degrading RNA that shouldn't be transcribed, because our RNA transcript has already been made.

Types of mutations

Single base/small changes -Point mutations (should be an A but is a G) -Insertions -Deletions. Inserting or deleting one base is actually a big mutation because it interrupts the frame of our amino acids. We haven't talked about translation yet, but the coding region that goes from DNA to RNA to protein is read in a very specific frame, 3 nucleotides at a time. By adding or deleting 1 or 2 nucleotides, you shift the entire reading frame of that protein, and every subsequent amino acid after that point is changed. Large-scale (chromosomal) changes -Translocations (like the Philadelphia chromosome) -Duplications/insertions -Deletions. As we saw in the BCR-ABL translocation, pieces can switch between chromosomes. A region of a chromosome can also be duplicated, inverted (flipped into the wrong orientation), or deleted, which shortens the chromosome.
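Here is a small sketch of why a one-base insertion is so much worse than a point mutation. The 15-nt sequence is made up, and the code only splits the sequence into triplets; it does not translate them.

```python
def codons(seq: str) -> list:
    """Split a coding sequence into complete triplets, dropping any leftover bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def changed_codons(original: str, mutated: str) -> int:
    """Count positions where the triplets differ between the two sequences."""
    return sum(a != b for a, b in zip(codons(original), codons(mutated)))


original = "ATGGCCTTTAAAGGG"                      # made-up coding sequence
point = original[:4] + "A" + original[5:]         # one base swapped (C to A)
insertion = original[:4] + "T" + original[4:]     # one extra base added

print(codons(original))
print(codons(point), changed_codons(original, point))
print(codons(insertion), changed_codons(original, insertion))
```

The point mutation changes one triplet; the insertion changes every triplet downstream of it, which is the frameshift.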

Pluripotent Stem Cells

Stem cells can become any cell type in the body, but differentiated cells are limited to their lineage. How? Can we reverse this process? We know that stem cells have the ability to become any type of cell in the body. As you go from a stem cell to a fully differentiated cell, you go through different steps where you become more and more committed to one type of lineage. A major focus of study is creating stem cells from differentiated cells by making them go back in time, so that they can become any other cell.

The environment can affect a person's epigenome and cause changes in gene expression

Stressors and exposure to environmental toxins can have an effect on our DNA that can then be passed on to other generations.

Transcription initiation starts with sequential GTF binding (1 of 2)

TFIID -Binds first to TATA box -Has many components, including TBP. TFIIA and TFIIB -Bind to both DNA and TFIID -TFIIB recruits RNAP II and TFIIF -TFIIA prevents inhibitor binding. The first factor to bind is TFIID, which contains the TBP and binds the TATA box. Next come TFIIA and TFIIB. TFIIB recruits the RNA polymerase itself. TFIIA functions as a blockade against any inhibitory factors that are present in that region. So for the first few steps: TFIID (containing TBP) binds at the TATA box, then factors A and B come in, A blocks the effects of inhibitors, and B helps recruit RNA polymerase.

Transcription initiation starts with sequential GTF binding (2 of 2)

TFIIF and RNA polymerase bind to TFIID and TFIIB and form the pre-initiation complex (closed). TFIIH -Has enzymatic activity (helicase) -Uses ATP hydrolysis to initiate "melting" = unwinding. RNA polymerase joins the complex along with the factor TFIIF. TFIIF is a chaperone that helps carry the RNA polymerase to the promoter region. Lastly, we go from a closed complex to an open complex through the function of TFIIH. Think of the H as standing for helicase. In PK, sigma factor unwinds the DNA without the use of ATP energy. In EK, it's taken care of by an actual helicase using ATP energy. The last step is for the helicase to take the complex from closed to open, ready for transcription, by melting (separating) the DNA at the promoter region, which is a TA rich region.

Liz Blackburn (UCSF) found out how the repeats are created:

Telomerase is composed of RNA and protein and adds repeats to the ends of telomeres (the sequence varies between organisms; in humans it is TTAGGG). This enzyme is a type of reverse transcriptase -RNA polymerase: DNA -> RNA -Reverse transcriptase: RNA -> DNA. It is a protein with a significant RNA component, like the ribosome. The function of telomerase is to add repeat sequences; TTAGGG is the repeat in humans, and many eukaryotes have different variations of that repeat. Telomerase is an example of an exception to the central dogma: it allows us to go backwards and synthesize DNA from RNA.
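Going "backwards" from RNA to DNA can be shown with base pairing alone. In this sketch the 6-nt RNA template is a simplified stand-in chosen so that its DNA copy is the human TTAGGG repeat; the real telomerase RNA is much longer, and the chromosome-end sequence is made up.

```python
RNA_TO_DNA = str.maketrans("ACGU", "TGCA")


def reverse_transcribe(rna_template: str) -> str:
    """DNA strand (5' to 3') that base pairs with the RNA template (5' to 3')."""
    return rna_template.translate(RNA_TO_DNA)[::-1]


def extend_end(chromosome_end: str, rna_template: str, rounds: int) -> str:
    """Add one DNA copy of the template to the 3' end per round."""
    return chromosome_end + reverse_transcribe(rna_template) * rounds


template = "CCCUAA"          # simplified stand-in for the telomerase RNA template
end = "ACGTTAGGG"            # made-up 3' end of a chromosome

print(reverse_transcribe(template))
print(extend_end(end, template, 3))
```

Each round adds one more TTAGGG, so the end grows by 6 nt per round in this toy version.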

Telomeres form a loop structure that protects the chromosome end

Telomere binding proteins stabilize this structure further and regulate the length of the telomere

Telomeres are repeated sequences found at the ends of linear chromosomes

Telomeres protect the ends of chromosomes. The telomeres form caps at the ends of chromosomes. They contain a unique DNA sequence which is repeated several times. The DNA sequence varies slightly between species; the one shown here is from Tetrahymena. Telomeres are the structural parts at the ends of the chromosomes that we sometimes refer to as the caps of the chromosomes. They consist of a repetitive sequence: the same sequence repeating over and over again. An enzyme named telomerase extends the telomeres during replication to avoid the problem of losing actual genetic information (genes that encode a protein, something essential to the cell). Like the caps on the ends of your shoelaces, telomeres are the caps that help prevent the loss of genetic information.

Replication in PK terminates at a terminus region

Terminus region is about 180 degrees opposite the origin. Topoisomerase separates the two chromosomes. At the origin, the double stranded region becomes single stranded through the action of helicase, and topoisomerase prevents supercoiling. As replication proceeds, the two circles stay linked to each other until the very end, when they separate once they are fully replicated on both sides.

The prokaryotic RNA polymerase core enzyme associates with sigma factor

The PK RNA polymerase functions as a holoenzyme, meaning that the core part of the enzyme is not enough for it to function; it requires another factor. This is an example of something that needs to function as a complex, so it has a quaternary structure that it must be able to form in order to function. The core enzyme of RNA polymerase does not do a very good job of locating the specific DNA sequences from which it should initiate transcription. It can bind DNA, but the binding is very transient and nothing guides it to the proper place in the genome to begin transcription. That is the purpose of the sigma factor. It helps guide the RNA polymerase to the particular region of the DNA where we want to initiate transcription. It's actually sigma factor that is most closely associated with the DNA.

Splicing is catalyzed by enzymes composed of a complex of RNA and proteins (snRNPs)

The RNA components are abundant, uridine-rich RNAs referred to as U1, U2, U4, U5, and U6. Each snRNA is associated with 10 or more proteins. Example: the U1 snRNP is a complex of U1 snRNA and multiple proteins. Splicing doesn't occur spontaneously; it is facilitated by the spliceosome. The spliceosome contains snRNPs (small nuclear ribonucleoproteins). Each one is an RNA present in the nucleus that is complexed with protein, so it's an RNA-protein complex that functions together in splicing. This is very similar to telomerase and our ribosomes: protein elements that have RNA components to them. Five snRNPs function together to splice out every intron. Each snRNP has several different protein components. All of the snRNPs together are called the spliceosome.

Sensitivity of RNA polymerases 1, 2, and 3 to alpha-amanitin distinguishes the three polymerases

The plot shows the activity of each polymerase from 0-100% against the amount of alpha-amanitin. Pol 1 is not affected by alpha-amanitin. Pol 2 is sensitive, and Pol 3 is less sensitive than Pol 2.

Many different proteins advance the replication fork

The beta clamp helps make DNA replication fast and efficient. Helicase unwinds DNA. Topoisomerase prevents DNA supercoiling in front of the replication fork. Primase synthesizes RNA primers. DNA ligase seals remaining nicks in the DNA. Single-stranded binding proteins protect single-stranded DNA. The beta clamp helps keep the polymerase locked onto the DNA. If the polymerase slips off while we are synthesizing DNA, the clamp repositions it on the DNA, and this is how we increase the processivity of DNA replication. A helicase is a ring of subunits that come together and, by definition, use ATP energy to perform their function. Helicases are very good at unwinding double-stranded DNA. In order to synthesize DNA we have to separate the two complementary strands and then build a new complementary strand on each. The helicase works ahead of the polymerase so that the DNA is unwound and there is room to synthesize. Imagine having a bunch of tangled cords: you pull on them, and you get a knot. That occurs in DNA, and the knot we make by pulling on the DNA is called a supercoil. Relieving that tension and strain so we can continue to separate the two strands is the job of a class of proteins known as topoisomerases. A topoisomerase can cleave within the phosphodiester backbone of DNA, the backbone between two nucleotides. Cutting within the backbone is known as endonuclease activity. Cleaving nucleotides from the 3' or 5' end of a DNA is called exonuclease activity (3' exonuclease or 5' exonuclease). If you have a tangled knot of double-stranded DNA and you nick the phosphodiester backbone of just one of the strands, the DNA can unwind. The purpose of the topoisomerase is to relieve supercoiling: it makes a nick in the DNA, everything unwinds, and all that tension is released.
Then it seals that nick again once everything is unwound. A nick is a cut in the phosphodiester backbone on one strand of the DNA. If you were to cut through both strands, you'd be making a double-stranded break in our DNA. Joining the phosphodiester backbone back together is ligase activity. Most topoisomerases have multiple enzymatic activities associated with them: an endonuclease function, and a ligase function that allows them to seal the cut they made in the backbone. Primase is the RNA polymerase that synthesizes the RNA primers. The ligase activity associated with topoisomerase serves the purpose of unwinding supercoils; other ligases function on the DNA in other regions. Single-stranded binding proteins bind single-stranded DNA without regard to its sequence. The only thing they are specific for is that the DNA is single stranded. When they sense that, they bind the DNA transiently to protect it from degradation, because DNA is less stable in its single-stranded form than in the double helix.

RNA polymerase 2 C-terminal domain (CTD) is modified during transcription initiation

The carboxy-terminal domain (CTD) contains 7 amino acids that are repeated multiple times. -Yeast: 26 repeats -Humans: 52 repeats. Ser is modified by phosphorylation. Protein kinases catalyze phosphorylation. The CTD tail consists of numerous repeats of these same amino acid residues. We're going to look at the phosphorylation of 2 of those residues in particular: the serine in the 2 position and the serine in the 5 position. Because the CTD tail has many repeats of this exact sequence, we're going to have many serine 2s and 5s available for phosphorylation. Lysine is commonly acetylated or methylated. Serines are commonly phosphorylated. By phosphorylating these serines, we give different signals to the transcription machinery. They become binding sites for other factors. Much like an acetylated lysine is a binding site for proteins that contain bromodomains, the phosphorylated serine 2 or serine 5 on the CTD tail is a binding site for other factors that are involved in controlling transcription and RNA processing. This occurs simultaneously with transcription: we process the RNA at the same time that we are making it. The reverse action can be carried out by a phosphatase, which removes the phosphates.

Mediator is additionally required for transcription activation of eukaryotic RNA polymerase II in vivo

Proteins that bind enhancer regions are generally called activators. The activator is the protein; the enhancer is the DNA sequence. We also have a very large clump of proteins called the mediator. It is a giant complex, called mediator because it mediates the effects of different transcription factors, such as activators or repressors, by connecting them to the RNA polymerase. The mediator isn't actually binding the DNA. On one side it interacts with the RNA polymerase, and on the other side it interacts with activator proteins that are binding the enhancer region. You could have repressor proteins that disrupt the interaction between mediator and the activator, or between mediator and the RNA polymerase. When we look at DNA regions that are far upstream from where our gene is, we have to physically bend the DNA on top of itself to connect something that's far away to the promoter region. We generally summarize the proteins that do this as DNA bending proteins. In addition to the general transcription factors, most EK genes also require mediator to achieve sufficient levels of transcription. All of these different factors, in one way or another, are directly or indirectly touching the RNA pol 2, giving it signals to go from initiation to our elongation phase and actually perform transcription.

The effect of small mutations on protein sequence

The genetic code is triplets of nucleotides in a reading frame. Each triplet codes for a particular amino acid. Disrupting the reading frame results in frameshift mutations. Point mutations: silent mutation, missense mutation, nonsense mutation. Insertions and deletions: frameshift mutation. The RNA sequence shows you which amino acid each set of 3 nucleotides encodes, and we can see that if for some reason we were to change the reading frame, every subsequent grouping of 3 would be different. The sequence is read 3 nucleotides at a time. Each triplet encodes an amino acid. The first one is always going to be a methionine, so AUG (ATG in the DNA) is our start codon, and our stop codon is at the end. If we were to shift this frame, it would be a frameshift mutation. If we were to delete a base or two, the figure shows how the subsequent amino acids change. Some of them might stay the same because multiple codons can code for the same amino acid, but the chance of that happening across the entire sequence is essentially zero. Convince yourself that if you add or delete 1 or 2 base pairs, you're not going to be able to synthesize the protein you intended to synthesize initially. We classify point mutations as silent, missense, or nonsense. Take a point mutation (1 change) from a U to a C: UAU -> UAC. Both encode tyrosine. While it's still a mutation, it has no effect on the protein, so we call it silent. Say we had a different 1-nucleotide change: UAU -> UCU. We go from encoding tyrosine to encoding serine, so this is a missense mutation. A nonsense mutation changes a triplet from encoding an amino acid to encoding a stop codon. That stops translation. You can have a nonsense mutation in the middle or even at the beginning of the protein, and the result is a truncated version of the protein it was supposed to be. You are stopping translation before the appropriate time.
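The classification above can be sketched in a few lines of Python. This is only an illustration, assuming the standard genetic code; the function names and the short example mRNA are made up for this card and are not from the lecture.

```python
from itertools import product

# Standard genetic code, built in U, C, A, G order ("*" marks a stop codon).
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read the mRNA 3 nucleotides at a time from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def classify_point_mutation(original_codon, mutant_codon):
    """Label a single-codon change as silent, missense, or nonsense."""
    before = CODON_TABLE[original_codon]
    after = CODON_TABLE[mutant_codon]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"


# Hypothetical mRNA: AUG UAU GCU UGG UAA -> Met-Tyr-Ala-Trp-stop
normal = "AUGUAUGCUUGGUAA"
# Deleting one base after the start codon shifts every later triplet.
frameshifted = normal[:3] + normal[4:]

print(classify_point_mutation("UAU", "UAC"))  # Tyr -> Tyr
print(classify_point_mutation("UAU", "UCU"))  # Tyr -> Ser
print(classify_point_mutation("UAU", "UAA"))  # Tyr -> stop
print(translate(normal), translate(frameshifted))
```

The single-letter amino acid codes are used only to keep the table short (Y is tyrosine, S is serine, M is methionine).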

PK and EK replication forks look very similar and function in the same way

The replication forks are going to look very similar when we compare PK to EK but EK are going to be much more complex.

Mechanism proposed by Watson & Crick

The two DNA strands are complementary to each other, so each strand could serve as a template for a new strand. Is this true? What would such a mechanism look like? -> Need to test this hypothesis! If you know what's on one strand, you know what's on the other. If you separate the two strands and build the complementary sequence on each, you would then have two identical molecules, each having 50% of the parental DNA. Blue indicates newly synthesized DNA.

Telomerase has an RNA template associated with it and extends the DNA strand

Telomerase does this by using its RNA component as a template for extending the chromosomes. Without telomerase activity, the very end of the chromosome would be lost: the last primer sits there, and once it is removed we lose that information. Telomerase extends the 3' overhang by a number of repeats. It takes that truncated end and extends it in the 5' to 3' direction. You can see how the RNA molecule is oriented within the telomerase protein component (TERT). We take the RNA template and reverse transcribe it into DNA. You can see that the repetitive sequence is complementary to this RNA component of telomerase. We extend the ends that would have been lost from the removal of the RNA primer, and we do this for all the ends of the chromosomes to help maintain their stability. Once we've extended the overhang all the way out, primase can lay down a primer and a DNA polymerase can extend that primer all the way to the end. Through those steps we fill in an additional piece of DNA.
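The reverse transcription step can be sketched as below. This is a simplified illustration, not the real enzyme mechanism: the six-nucleotide template is an assumed, shortened stand-in for the telomerase RNA template, chosen so that it yields the Tetrahymena repeat TTGGGG, and the function names are hypothetical.

```python
# Base pairing when copying RNA into DNA (reverse transcription).
RNA_TO_DNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(rna_5to3):
    """Return the DNA strand (written 5'->3') complementary to an RNA template."""
    return "".join(RNA_TO_DNA[base] for base in reversed(rna_5to3))


def extend_overhang(overhang_5to3, rna_template_5to3, repeats):
    """Add `repeats` copies of the templated DNA onto the 3' end of the overhang."""
    return overhang_5to3 + reverse_transcribe(rna_template_5to3) * repeats


# Assumed simplified template (5'->3'); the real template region is longer.
template = "CCCCAA"
print(reverse_transcribe(template))
print(extend_overhang("TTGGGG", template, 3))
```

Each pass over the template adds one more repeat to the 3' end, which is why the telomere repeat is complementary to the RNA component.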

The Rho-independent termination sequence forms a hairpin followed by a string of Us

A number of nucleotides in the stem portion are the most conserved. If you mutate the nucleotides that form that stem, it's possible that the stem would not form at all. If you were instead to change the string of U's so that it contains a G or C, you would have a tighter interaction between the RNA and the DNA as the RNA is being synthesized. The polymerase might still pause when the stem-loop structure forms, but it won't actually terminate transcription because the RNA is not going to fall off there. All these sequences are important for rho-independent termination.

Eukaryotic genes vary enormously in both their size and their complexity

There is no correlation between the size of the pre-mRNA and the size of the mature RNA. For example, 200,000 base pairs can become 2000.

Histone modifications can be inherited from one cell generation to the next

Histone modifications can be passed from one cell to its daughter cells. We can maintain open regions of euchromatin in genes, and that gets transmitted into daughter cells, or we can maintain heterochromatin so that it is passed on. Patterns that are established in one cell can be passed on to daughter cells.

Histone tails can be modified

Histones have a tail sticking off that happens to be very enriched in positively charged amino acid residues (lysine and arginine). All of those positive charges mean that the histone core of the nucleosome is very positively charged, so it interacts very tightly with the negatively charged DNA. That's what is holding this together: they are attracted by charge. We can alter the charge of histone tails, making them less positive and therefore less closely associated with DNA, by acetylating the positive residues. An enzyme called a histone acetyltransferase (HAT) adds the acetyl group onto lysine. The acetyl group can be removed by an enzyme called a histone deacetylase (HDAC). When you add the acetyl group onto lysine, the lysine no longer has its positive charge. By acetylating lysines, we mask the positive charge on the histone tails, making them less positive and less closely associated with DNA. Then you can have the reverse action, where the HDAC removes the acetyl group, restoring the positive charge on the lysine and on the histone tails. Other modifications are also very common on histones, such as methylation. Histone methyltransferases add methyl groups to the tails, and separate demethylase enzymes remove them. These chemical signatures that we add onto the amino acids of the histone tail are then recognized by other proteins that modify the chromatin itself. An acetylated lysine becomes a binding site for certain kinds of proteins, and methylated amino acids on the histone tails also become binding sites for different proteins. Proteins bind very specifically. There are proteins that won't recognize a regular lysine but will recognize an acetylated lysine. That is what they specifically bind to.
By modifying the residues on the histone tails, we are not only changing their charge, we are also giving signals to other proteins to associate and perform another action on the histone tails. The main modifications are acetylation and methylation.

Transcription: Overview

Three steps: 1. Initiation 2. Elongation 3. Termination. In more detail: 1. Binding of RNA polymerase core to the DNA promoter 2. Formation of transcription bubble 3. Initiation 4. Elongation (promoter clearance) 5. Termination and recycling. Transcription takes place in 3 main phases. First is initiation of transcription, where we assemble all of the transcription machinery onto the promoter region. This is the region just upstream (towards the 5' end) of the coding region of the DNA that's going to become an RNA molecule. The promoter region is where we have a lot of sequences that can dictate the amount of expression of that gene: how many mRNAs we're going to make and under what conditions we're going to make them. Control of gene expression is extremely complicated, and eukaryotes are far more complicated than prokaryotes. Prokaryotes are often not complex and respond to basic stimuli like a lack of nutrients or an unfavorable temperature. They often control genes as a block: if sugar is lacking, turn on all the genes that help produce sugar. In eukaryotes, it's a lot more complicated, because many different things control gene expression. It's not a matter of just turning different pathways on and off. We're going to see a lot of feedback in our gene expression. One factor gets made, then it can work as a transcription factor to make sure that other factors are synthesized, and so on. Initiation is when we set up everything we need to begin transcription. The moment we "escape the promoter", once the RNA polymerase is no longer sitting on the promoter region and has physically moved to the region where our template strand is going to be used, we move into the elongation phase. The elongation phase is simply the phase where the RNA polymerase is synthesizing the RNA molecule.
Then we have to terminate transcription at the proper point, once we've synthesized all the RNA nucleotides we need for a particular mRNA. There are different signals that tell the RNA polymerase to stop at those points. We're going to go through how the 3 steps are controlled in PK and in EK, because they're different but share many similarities.

UV Damage - Thymine dimers

Thymine dimers are considered BULKY lesions. When two thymines are dimerized together, they make a kink in the DNA that distorts the natural double helix. This is considered a very bulky lesion, and it is taken care of by a repair pathway that is specific for large distortions in the backbone. The spontaneous hydrolysis events are considered not bulky because they only affect one base pair, which helps you recognize that they are smaller lesions than a thymine dimer. We have different repair pathways to take care of all the different levels of DNA damage.

DNA replication summary

To synthesize both strands efficiently while the fork moves in one direction, a DNA polymerase synthesizes one strand as one continuous fragment (the leading strand). The other strand (the lagging strand) is still synthesized in the correct 5' to 3' direction, but that means going backwards, away from the fork. 1. Elongation (Okazaki fragment synthesis) 2. RNA priming 3. Clamp loading 4. Okazaki fragment maturation 5. Polymerase dissociation

Step 2. Make copies of specific portions of DNA

To make more DNA, use the Polymerase Chain Reaction (PCR). PCR amplifies specific regions of DNA for comparison between individuals. -For DNA profiling, Short Tandem Repeats (STRs) are used (more on this coming up). A large fraction of our DNA is made up of highly repetitive sequences.
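Because each PCR cycle can at most double the target, the amount of DNA grows exponentially with the number of cycles. A minimal sketch of that arithmetic follows; the function name and the `efficiency` parameter are assumptions added for illustration (1.0 means perfect doubling every cycle, which real reactions only approximate).

```python
def pcr_copies(initial_copies, cycles, efficiency=1.0):
    """Estimate target copies after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (assumed constant).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1 + efficiency) ** cycles


# A single template after 30 ideal cycles: about a billion copies.
print(pcr_copies(1, 30))
# The same reaction at an assumed 90% efficiency per cycle.
print(pcr_copies(1, 30, efficiency=0.9))
```

This is why a few cells from a sample can be enough starting material for a profile.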

Point mutations are classified as either transitions or transversions (single nucleotide changes)

Transition: Purine-Purine or Pyrimidine-Pyrimidine (# of aromatic rings stays the same). Transversion: Purine-Pyrimidine (# of aromatic rings changes). Transversions are more likely to cause a change in the amino acid sequence of a protein. Swapping between the two types of base is a bigger problem for the cell.
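The rule above reduces to checking whether both bases belong to the same class. A small sketch, with a hypothetical function name, might look like this:

```python
PURINES = {"A", "G"}            # two-ring bases
PYRIMIDINES = {"C", "T", "U"}   # one-ring bases


def classify_substitution(ref, alt):
    """Classify a single-nucleotide change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("bases must be one of A, G, C, T, U")
    if ref == alt:
        raise ValueError("ref and alt are the same base, so nothing changed")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("A", "G"))  # purine -> purine
print(classify_substitution("C", "T"))  # pyrimidine -> pyrimidine
print(classify_substitution("A", "C"))  # purine -> pyrimidine
```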

Transposable elements can jump and create sequence duplications

Transposons are mobile genetic elements that can move from one area in the genome to another. How transposons create duplications: - Some transposons copy themselves when moving, creating duplications - Sometimes, transposons can carry non-transposon sequence with them when they jump. Transposons are mobile DNA elements. They can cut themselves out of the DNA and then splice themselves into another part of the DNA. This can take place across different chromosomes, so a mobile element can move from one chromosome to another. Transposons do this by encoding transposase, an enzyme that is able to cut out the transposon and move it to another part of the genome. A transposon can splice itself into a gene that is necessary for life, making that gene nonfunctional. You don't want to interrupt a sequence that's encoding a protein. Because of the extra sequence inside the gene, when it gets translated you won't make the protein you were supposed to.

Although mutation has a bad connotation, it isn't all bad!

Ultimate source of genetic variation (essential to evolution). Can have either deleterious or advantageous outcomes for an organism. Intentional mutations are a powerful tool in molecular biology. Over time you will see beneficial mutations increase in frequency in a population, so eventually the entire population contains that advantageous mutation, and that's how we evolve. We can also use mutations as a very powerful tool in the lab. If we had a protein and wanted to identify its most important part, say the active site, we could make systematic point mutations, changing a single amino acid in the protein sequence and testing the function afterwards. We can induce mutations at certain points to study their effects.

A little bit of Science history...

Up until the 1970's DNA was actually one of the hardest things to analyze and characterize. Currently it is probably one of the easiest things to work with and characterize. Today, we can sequence DNA, engineer DNA, and manipulate DNA in a variety of ways. You can leave DNA out overnight. It might get denatured a little bit but it will ultimately be okay. It is very stable. We can heat and denature (separating of the two strands) then re-anneal back together if you just return the DNA to physiological conditions. This renatures the DNA, and you can do this many times.

RNA Pol 2 transcription unit

Usually only one gene per promoter in EK. Additional regulatory elements. EK transcription units have elements analogous to the PK -35, -10, and +1 sites, but they are structured differently. Yeast are the simplest EK organisms, so they have the simplest EK transcription unit. Their promoter region contains the TATA box, which sits roughly where the -10 and -35 sites were in PK: a TA-rich region just upstream of our +1 site. The -35 and -10 sites were TA rich for the same reasons, so the core promoter region of an EK transcription unit has TA-rich regions as well. In PK we had an UP element. We see the same thing in yeast, but we call it an upstream activating sequence. Then comes the +1 site, which is where the transcribed part begins. This is simpler than what is in most EK cells. EK add further layers of gene regulation by having not only a promoter but also additional sequences that are important for regulating transcription. Some of those are very close to the promoter region. We also see sequences that are very far away from the promoter region, multiple kb upstream, that still influence transcription. An enhancer region is something that a protein that activates transcription could bind. An insulator is a region where some sort of repressor protein could bind. We're going to have a series of different regulatory sequences present in the DNA. Those are cis factors, which are within the DNA, and the cis factors can be bound by trans factors, which are the actual proteins that interact with the DNA. Cis is within the DNA, and trans is something that can bind the DNA (a protein).

The process of intron removal is called SPLICING

We are cutting out the intronic sequences and splicing the exonic sequences together. The visual shows a promoter region where we have the +1 site just past the core promoter, along with proximal promoter elements and distal promoter elements. The distal ones are enhancer or insulator sequences, and there are also sequences adjacent to the core promoter or TATA box that other transcription activators or inhibitors can bind. After the +1 site comes a series of sequences that correspond to what will become exons or introns in the RNA they are transcribed into. Also in that DNA sequence is the polyA signal. This marks the end of the RNA molecule, and once it is transcribed into RNA, cleavage factors cut the RNA at that point. Our pre-mRNA or hnRNA that has just been transcribed contains the intronic sequences, and it has a 5' cap and a polyA tail once it is fully processed. We remove the introns, and the result is what we consider to be our mature RNA. Within that, there's a specific segment of the RNA that we call the coding segment. That is the region that is actually going to be translated into a protein. It's not the entire RNA molecule from the 5' to 3' end; that would be problematic if you lost sequence on either side. Buffer zones called UTRs (untranslated regions) flank the actual coding region of the RNA. You have a 5' UTR and a 3' UTR. These sequences not only serve as a buffer zone to make sure we have the whole coding segment of the RNA intact, they also have a regulatory function in controlling translation. Certain sequences present in UTRs can have significant effects on the stability of that RNA, on targeting that RNA to particular regions, and on how it is translated. The coding segment starts with a start codon and ends with a stop codon. Codons are how we designate our genetic code. Each combination of 3 nucleotides in the RNA encodes a particular amino acid.
The start codon encodes the first amino acid that is incorporated into the protein, and the stop codon is 3 nucleotides that signal the end of that protein and help end translation.
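To make the layout of a mature mRNA concrete, here is a small sketch that splits a sequence into 5' UTR, coding segment, and 3' UTR. It assumes, as a simplification, that translation starts at the first AUG and ends at the first in-frame stop codon; the function name and the example sequence are made up for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}


def split_mrna(mrna):
    """Return (five_prime_utr, coding_segment, three_prime_utr) for a mature mRNA.

    Simplifying assumption: the first AUG is the start codon.
    """
    start = mrna.find("AUG")
    if start == -1:
        raise ValueError("no start codon found")
    # Walk codon by codon, staying in the reading frame set by the start codon.
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            end = i + 3  # the stop codon is part of the coding segment
            return mrna[:start], mrna[start:end], mrna[end:]
    raise ValueError("no in-frame stop codon found")


# Hypothetical mRNA: 5' UTR + (AUG GCU UAU UGA) + 3' UTR
example = "GGACC" + "AUGGCUUAUUGA" + "CCUUAAA"
utr5, coding, utr3 = split_mrna(example)
print(utr5, coding, utr3)
```

Note that the UAA inside the 3' UTR of the example is never reached, because reading stops at the first in-frame stop codon.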

Huntington's disease

We can have insertions of DNA, and errors where polymerases have trouble replicating DNA that is highly repetitive. In DNA that has repeats, like STRs, the polymerase cannot always replicate the repeats faithfully, so repeats can be added or deleted. This occurs because the polymerase slips on the DNA and loses its place among all the repetitive sequences, so it adds extra repeats. Huntington's disease is characterized by the expansion of repetitive sequences. It is a disease that you have a 50% chance of inheriting if you have 1 parent with it. It typically manifests around the 40s.

Double-strand break repair: HDR

We identify the double-strand break, a series of factors binds the break, and nucleotides are excised to form 3' overhangs. These overhangs undergo strand invasion of a homologous chromosome. Here we are showing the broken chromosome lined up with a homologous chromosome, with an overhang invading it. We separate the two strands of the homologous chromosome, then base pair our existing chromosome to it to find homology. That tells us exactly which nucleotides we need to complete the repair, with high fidelity, without introducing new nucleotides or deleting others.

The actual result after two rounds of replication

We now know that replication is semiconservative. After 1 round, semiconservative and dispersive look the same. After the second round, the fact that you get two distinct bands shows that DNA is synthesized by separating the two strands and building the new sequence on each, so that our parental DNA is gradually diluted out. The more rounds of replication we do, we still have a couple of DNA molecules that contain 1 of the parental strands, and everything else is newly synthesized. Over time, if we keep doing this, almost everything would be very light, and only a very small fraction would be hybrid (one heavy strand and one light strand). The DNA separates into bands by density, and from the bands we can decide how it is replicated.
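The band pattern expected under semiconservative replication can be worked out with simple counting. The sketch below assumes the classic setup of this kind of experiment (start from one fully heavy duplex, then replicate only with light nucleotides); the function name is hypothetical.

```python
from fractions import Fraction


def band_fractions(rounds):
    """Fraction of duplexes in each density band after `rounds` of replication.

    Assumes semiconservative replication starting from one heavy/heavy duplex.
    """
    if rounds < 0:
        raise ValueError("rounds must be non-negative")
    if rounds == 0:
        return {"heavy": Fraction(1), "hybrid": Fraction(0), "light": Fraction(0)}
    total_duplexes = 2 ** rounds
    # The two original heavy strands are never destroyed, so exactly two
    # duplexes always contain one heavy strand and one light strand.
    hybrid = Fraction(2, total_duplexes)
    return {"heavy": Fraction(0), "hybrid": hybrid, "light": 1 - hybrid}


for n in range(4):
    print(n, band_fractions(n))
```

After one round everything is hybrid, after two rounds half is hybrid and half is light, and the hybrid fraction keeps halving after that without ever reaching zero.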

STR Analysis

What are the two main properties of STRs? -STRs vary in length between people -We know where they are in the genome How could this information be used to compare people's DNA? What methods could you use? -Isolate DNA -Use PCR to amplify STRs -Compare sizes of STRs using gel electrophoresis
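The comparison step can be illustrated by counting how many times a repeat unit occurs in tandem in each person's sequence. This is only a sketch: the sequences and flanking bases below are invented, GATA is used as an example repeat unit, and in the lab the comparison is done by fragment size on a gel rather than by reading the sequence.

```python
import re


def longest_repeat_run(sequence, unit):
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(match.group(0)) // len(unit) for match in pattern.finditer(sequence)]
    return max(runs, default=0)


# Made-up amplified regions: same flanking sequence, different repeat counts.
person_a = "CCGT" + "GATA" * 7 + "TTAC"
person_b = "CCGT" + "GATA" * 11 + "TTAC"

count_a = longest_repeat_run(person_a, "GATA")
count_b = longest_repeat_run(person_b, "GATA")
print(count_a, count_b)
# The PCR products differ in length by (difference in repeats) x (unit length).
print(len(person_b) - len(person_a))
```

That length difference is what shows up as different band positions in gel electrophoresis.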

Step 1. Isolate DNA samples

What biological specimens would yield DNA at a crime scene? White blood cells, semen, skin, lip prints, saliva. If you find just a few cells, that would not be enough DNA. We need to make more of it!

Evolution of the globin genes through duplication events

When you duplicate an entire gene, the second copy is no longer needed for life, so it's free to mutate. One functional copy is necessary. If you duplicate the gene, there is no evolutionary pressure to keep the extra copy in its proper sequence, so it is free to mutate without altering the fitness of the organism it is a part of. If you get advantageous mutations in the second copy of the gene, then you can evolve different gene families. That is how the globin genes evolved over time and produced the different types of globin proteins that are used in higher organisms. We actually have several different forms of hemoglobin: a fetal form that is expressed until shortly after birth, and multiple others that function at different points in life. This is possible because of duplication events over time, with DNA being shuffled around to create new types of proteins that were advantageous as we evolved.

Transposons can cause unequal crossing over and gene duplications

When you move different elements of the DNA and attempt to cross over, you can end up with a mixture of pieces that do not line up properly anymore, because you've lengthened the DNA on one chromosome and not the other. That can cause further problems with insertions and deletions in your chromosomes.

EK transcription initiation requires additional proteins

Why? -Binding of the RNA polymerases is not sufficient for initiation/activation of transcription in EK -DNA template is packaged into chromatin -Gene regulation. What are these proteins? -Transcriptional regulatory proteins -The Mediator complex -Nucleosome-modifying enzymes. Transcriptional regulatory proteins are the proteins that bind proximal promoter regions, enhancers, and insulators: the trans factors that interact with all of the cis elements in the promoter region. They also involve our nucleosome-modifying proteins. We have a cascade where HATs help unwind the DNA from a nucleosome and create binding sites for bromodomain-containing proteins, which help further remodel our chromatin. Part of that is moving nucleosomes or shifting DNA around a nucleosome. All of these things have to occur simultaneously to ensure we actually get transcription.

Histone modifications regulate transcription and other cellular processes: "The histone code"

In combination, the effects of those modifications are to control gene expression. When you have an acetylated lysine, the function of the protein that comes in is to alter transcription. We can have different modifications that function at different parts of the cell cycle. There are histones located near the centromere of chromosomes that recruit the kinetochore proteins to the centromere during cell division. All these histones and their modifications have broad effects on our DNA and how we use it at different parts of the cell cycle, such as mitosis.

Without an enzyme maintaining the telomeres (telomerase), they would shorten during each cell division due to the end replication problem

Without telomerase present, the chromosome is shortened each time the cell divides. Finally the telomere DNA is eroded and the chromosome is damaged. Telomerase maintains the telomeres at the ends of the DNA thread. This makes it possible to copy the entire chromosome to its very end each time the cell divides. Without telomerase, we would lose sequence after every division, and eventually we would start losing sequence beyond the telomere repeats and getting into information that we do not want to lose. Most somatic cells that divide have a limit on how many times they can divide, called the Hayflick limit. Even though we have telomerase in our cells, there is still a limit on how well it can correct this problem. In aging cells, we actually start to lose expression of telomerase. An old cell won't make as much telomerase, won't be able to extend its telomeres as efficiently, and will still lose genetic information at that point. In cancer cells, an unregulated or upregulated version of telomerase ensures that the cell can continue to divide over and over again.

Some diseases are associated with what we've been talking about

Xeroderma pigmentosum, Huntington's disease, Cockayne syndrome, Fragile X syndrome

The chemistry of DNA synthesis

An incoming deoxynucleoside triphosphate (dNTP) gets incorporated as a monophosphate, with pyrophosphate as the leaving group. Breaking that high-energy bond is what drives the reaction inside the polymerase.
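
Written as a net reaction, the step above looks like this, where $(\mathrm{dNMP})_n$ is shorthand (introduced here, not from the slides) for a DNA strand of $n$ nucleotides and $\mathrm{PP_i}$ is the pyrophosphate leaving group:

```latex
\[
(\mathrm{dNMP})_n + \mathrm{dNTP} \longrightarrow (\mathrm{dNMP})_{n+1} + \mathrm{PP_i}
\]
```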

The study of changes in gene expression that are NOT due to the DNA sequence is called Epigenetics

a) Closed chromatin: transcriptional repression b) Open chromatin: transcriptional activation Includes: histone modifications, DNA methylation, RNA-mediated processes

Transcription initiation — open complex formation

a) In initiation, the RNA polymerase holoenzyme first recognizes the promoter at the -35 region and binds to the full promoter.
b) As initiation continues, RNA polymerase binds more tightly to the promoter at the -10 region, accompanied by a local untwisting of the DNA in that region. At this point, the RNA polymerase is correctly oriented to begin transcription at +1.
The open complex, meaning the DNA strands have been separated, forms in the region where the RNA polymerase is sitting on the DNA. The sigma factor is what causes that to happen. We go from a closed complex to an open complex. The +1 site is shown by the change in color of the DNA at that point: green is the promoter region, pink is the coding region. The RNA polymerase is always going to assemble just before that +1 site, and it's always going to be at the -35 and -10 sites.

A closer look at XP...

a) UV-induced thymine dimers cause DNA to kink b) Vulnerability of cells to UV light damage c) Ability of cells to repair damage

Post translational modifications (PTMs) help regulate chromatin compaction and transcriptional activity

acetylation, methylation, ubiquitination, sumoylation, phosphorylation
Within a histone octamer, there are many different points at which we can modify residues on the tails and give different signals to other proteins.

Elongation begins when sigma factor falls off the DNA. Transcription initiation is complete with the release of sigma factor after 8-9 nt

c) After 8 to 9 nucleotides have been polymerized, the sigma factor dissociates from the core enzyme.
d) As the RNA polymerase elongates the new RNA chain, the enzyme untwists the DNA ahead of it, keeping a single-stranded transcription bubble spanning about 25 bp. About 9 bases of the new RNA are bound to the single-stranded DNA bubble, with the remainder exiting the enzyme in a single-stranded form.
Does the formation of the open complex always result in the formation of an mRNA? No. Just because you've created an open complex and opened the DNA doesn't mean you'll actually make an mRNA. You can be stopped at that point. One way is the sigma factor still being associated with the RNA polymerase and not allowing it to move forward. You go through an open complex, you synthesize some short transcripts, and the entire system can just fall off the DNA at that point.
Elongation occurs once the polymerase has moved into the region that we are actually transcribing. This is called escaping the promoter. At that point, sigma factor falls off and dissociates. That allows the RNA polymerase to move freely and transcribe the RNA, so it moves downstream of where it was originally assembled.
Before that, we see something called abortive transcription: the synthesis of short pieces of RNA that correspond to what the polymerase was sitting on top of, the +1 site and just a few nucleotides forward. The polymerase hasn't been able to move freely. We think this occurs because sigma factor is blocking the RNA exit channel of the RNA polymerase. The RNA being synthesized has to exit the rear of the polymerase, and with sigma factor there, there isn't any room to extend the RNA molecule.
The synthesis of these very short transcripts, once enough of them have been made, helps dislodge the sigma factor. This is a feature we see at the beginning of elongation: before we are fully transcribing our RNA, we go through a period of short transcripts that stop because the RNA polymerase can't move forward while still bound to the sigma factor. Filling up the RNA polymerase with short pieces of RNA builds pressure that helps dislodge the sigma factor. Once sigma factor is dislodged, we are able to fully escape the promoter region and synthesize RNA based on the coding region of the DNA.

Sanger DNA sequencing with dideoxynucleotides (ddNTPs)

ddATP terminates synthesis because the 3' H cannot act as a nucleophile in phosphodiester bond formation.
-A mix of ddGTP:dGTP of ~1:100 (repeat for each of the bases in separate reactions)
-Separate fragments by gel electrophoresis
-Fragments ending where a ddGTP was incorporated are generated
-Radioactive primer = all synthesized sequences are radioactive
-Repeat in separate reactions for each base
-Run together on gel
-Read sequence
What could be the radioactive component? The radiolabel should be in the dideoxy and/or the primer. If the primer itself is radioactive, and we're building off of that primer, the fragment that is generated is going to be radioactive. You want something labeled that is going into the fragment, and the things in the fragment are the primer, the regular dNTPs, and our dideoxy.
Dye-labeled segments of DNA, copied from template with unknown sequence: today, Sanger sequencing is done in one reaction, using dye-labeled ddNTPs. There were a lot of issues with maintaining accuracy and deciphering a sequence with the old method. With radioactivity we can't distinguish whether a signal came from an A or a G, so we needed 4 different reactions run side by side. If we instead label the dideoxys with fluorescence, we can run them in a gel with a single well. That is what is being indicated on the slide: it is one well in a gel, and we can use a laser to detect the sequence. The laser reads from the bottom up. For a sequence that is 20 base pairs long, we're going to have 20 different lengths of fragment, each ending in one of 4 possible fluorescent dyes, so you can read that sequence from the bottom up. The sequencer calculates it for you and presents it in a graph where you can read your sequence. The peaks indicate a strong signal.
If you have a population of DNA that came from homologous chromosomes, and one chromosome has a mutation while the other doesn't, then a single sequencing reaction will give two peaks at that one base position. Say you had an A (green) on one chromosome that was mutated to G (yellow) on the other chromosome. You would get a peak that is a lot smaller and discolored because it's detecting both of those. When we do sequencing reactions we don't have just one copy of our DNA: we use a primer to anneal to a DNA sample, and we sequence that. Depending on the application, you could be sequencing DNA from homologous chromosomes that don't have the exact same sequence. There could be a small change.
-Dye-labeled segments applied to a capillary gel and subjected to electrophoresis
-Computer-generated result after bands migrate past detector
-Signals are read and analyzed by the sequencer
Sanger sequencing was the original method to determine the order of bases in a DNA molecule (its sequence). The key with Sanger sequencing is using something called a dideoxy NTP. The slide shows two nucleotides: one has a deoxyribose sugar with its 3' hydroxyl group, and the other (the dideoxy) has no 3' hydroxyl group. When the dideoxy is incorporated into a growing strand, the 3' end has no hydroxyl group, which terminates DNA synthesis at that point. To build a nucleotide strand, DNA polymerase requires a 3' hydroxyl hanging off the end (the nucleophile). Without that nucleophile, we can't add additional nucleotides.
Sanger sequencing has a single-stranded template and a primer that binds part of that template. Sometimes there are universal primers that bind specific sequences in DNA. Once the primer is bound, you add a mixture of all 4 dNTPs plus a small amount of one dideoxy NTP. You are going to run 4 separate sequencing reactions.
In each of those sequencing reactions, you vary which nucleotide is the dideoxy, the one that lacks the 3' hydroxyl group. The way this works is we have a template strand and a primer that anneals, and the first thing added after the primer would be thymine. We're going to build the complementary strand. We have a mixture of all 4 regular dNTPs, and we add one dideoxy NTP at a ratio of about 1:100 to the regular nucleotide. So we have abundant regular nucleotides and a handful of dideoxys, which are incorporated in the sequencing reaction at a very low rate.
Imagine a sequence that is 100 base pairs long and has 20 guanines. We want to sequence it (that is, build the complementary strand) in a way that tracks what's being incorporated. If we don't know the sequence, we can figure it out from where DNA synthesis stops and from which reaction tube has which dideoxy. In this case, we have dideoxyGTP. There are 3 cytosines in the template on the slide, so there are 3 places where a G is going to be incorporated to build the complementary strand. By chance, we are occasionally going to get a dideoxyGTP base pairing with one of those cytosines. We will generate fragments of different sizes based on where that dideoxy guanine was incorporated. The slide shows the 3 possible fragments that could be generated from that template strand. We could have a primer bind and all regular nucleotides incorporated until the cytosine near the end of the sequence, where ddGTP is added and synthesis stops. If ddGTP gets incorporated in the middle, the fragment will be shorter. Wherever ddGTP is incorporated, the fragment ends at that length. You can have 4 reactions in tandem where we vary the dideoxynucleotide, then run the sequencing products in a gel using 4 separate wells.
This is an example of a well where we have added the fragments generated with dideoxyGTP, so we have the longest fragment, the medium-sized fragment, and the smallest fragment. We will visualize them on an autoradiogram. What are the possibilities for how you could label those fragments?
(Next slide) A better visual that shows the primer, the region we will be sequencing, and 4 different reactions where we vary which nucleotide is the dideoxy. You can see you get a banding pattern side by side. For a sequence that is about 12 base pairs long, the Sanger sequencing reactions will give you 12 fragments, one corresponding to every possible length, from the incorporation of 1 nucleotide all the way to 12. The shortest fragment corresponds to the very first nucleotide after the primer in the part you are sequencing. The primer has a 3' hydroxyl. Here the first thing that needs to be incorporated is a guanine. If by chance we incorporated a dideoxy guanine there, we would generate a fragment that is 1 nucleotide longer than the primer. When we read the gels, we read them from the bottom up.
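
The four-reaction logic above can be sketched in a few lines of Python. This is a simplified model rather than a real sequencing pipeline: the function names and the example template are hypothetical, and it assumes every possible termination point is represented at least once in each reaction.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5, primer_len=0):
    """Return the fragment lengths produced in each of the 4 ddNTP reactions.

    template_3to5 is the template read 3'->5', so the new strand is built
    5'->3' from left to right. Each lane (A, C, G, T) holds the lengths of
    the fragments that end in that dideoxy base.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    lanes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # A ddNTP incorporated here stops synthesis at this length.
        lanes[base].append(primer_len + position)
    return lanes


def read_gel(lanes):
    """Read the gel from the bottom up: shortest fragment first."""
    base_at_length = {length: base
                      for base, lengths in lanes.items()
                      for length in lengths}
    return "".join(base_at_length[length] for length in sorted(base_at_length))


lanes = sanger_fragments("CATGGC")   # hypothetical 6-nt template
print(lanes)            # fragment lengths per dideoxy reaction
print(read_gel(lanes))  # sequence of the newly synthesized strand
```

Reading from the shortest fragment up gives the new strand 5' to 3', and the template is its complement. Setting `primer_len` shifts every fragment length by the primer, which matches the note that the shortest fragment is one nucleotide longer than the primer.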

Conclusions

-mRNA is derived from hnRNA (heterogeneous nuclear RNA, or pre-mRNA)
-mRNAs are shorter than hnRNAs, so something must have happened to them
-The size difference between hnRNA and mRNA is primarily due to removal of introns

