Principles and Analysis of Gene Function
PRE (small RE)
"Repressor establishment" promoter. Activated by CII (CII and CIII are encoded by the delayed early genes cII and cIII). Initiates transcription that runs through the antisense strand of cro and on through the cI gene.
mRNA processing
-5'Cap formation -PolyA addition -Splicing -mRNA export. All these steps are coupled to transcription.
Mechanisms of regulation
1. Regulation of initiation and termination via direct effects on RNA polymerase. -Sigma factors. -T phages ADP-ribosylate RNA polymerase to allow expression of their early genes. -Phage T7 encodes its own RNA polymerase that recognises its own promoters. -Initiation at some promoters is sensitive to supercoiling. -Many phages encode anti-terminator proteins that allow read-through into viral genes. 2. Regulation by transcription factors. -Repressible systems (tryptophan operon). -Inducible systems (lactose operon). -Activation (CAP in lactose operon). 3. Premature termination, or attenuation, of transcription via formation of stem loops. 4. Small non-coding and antisense RNAs.
Helix turn helix DNA binding domains
3 alpha helices. The 3rd helix helps to stabilise the bundle. An arginine residue forms a salt bridge with the negatively charged DNA. The homeodomain (HOX) proteins are helix-turn-helix proteins discovered in Drosophila melanogaster, where they control body pattern during development of the embryo. They perform this function from flies to humans, reflecting descent from a common ancestor.
Nucleotide
5-carbon sugar. Phosphate group at C5 (OH group at C3). Nitrogen-containing base at C1.
Nucleoside
5-carbon sugar with a nitrogen-containing base at C1. Lacks the phosphate group (nucleoside + phosphate = nucleotide).
Transcription factor DNA binding domains classification
5 superclasses: 1. Basic domains. 2. Zinc-coordinating DNA binding domains. 3. Helix-turn-helix. 4. Beta-scaffold with minor groove contacts. 5. Other transcription factors. Each superclass is divided into subfamilies; for example, superclass 2 includes the nuclear receptors and thyroid hormone receptor-like factors. The scaffold holds the DNA contacts the correct distance apart so that the domain fits the DNA properly and hydrogen bonds with the bases can form.
5'Cap formation
7-methylguanosine, added to the 5' end of the primary transcript via a triphosphate bridge. Provides stability and is needed for translation by the ribosome. The C-terminal domain of RNA polymerase II binds various factors depending on its phosphorylation state. These include capping factors, which add the 5' cap to the nascent RNA.
Chromatid
One of the two identical copies of a chromosome produced by replication. The pair of sister chromatids is held together at the centromere.
Telomeres
A region of repetitive nucleotide sequences at the end of each chromosome, protecting the ends from deterioration or fusion with neighbouring chromosomes. In yeast, a protein complex caps the telomere, whereas in mammals the telomere is folded back on itself to form two loops. This prevents the ends of the chromosomes from being recognised by the double-strand break repair machinery. The break repair machinery joins unprotected telomeres, producing dicentric chromosomes that cannot be segregated accurately, leading to genetic instability. If the telomeres become too short, the protective complex cannot form. "End replication problem": DNA polymerase does not run right to the end of the leading-strand template in vivo, leaving the 5' end incompletely replicated, and the lagging strand is even more of a problem. Consequently, chromosomes get shorter with each round of replication. This shortening is linked to ageing, and smoking and sunbathing are thought to accelerate it. Some bacterial plasmids and chromosomes are linear and therefore have telomeres. Lyme disease is caused by a bacterium with linear chromosomes.
Nuclear receptors
All ligands are hydrophobic and partially planar. They can cross the cell membrane and reach the receptor in the cytoplasm. The glucocorticoid receptor is a nuclear receptor held in the cytoplasm bound to the chaperone hsp90. Hormone binding displaces hsp90, allowing the receptor to dimerise and enter the nucleus, where it binds its enhancer and activates transcription. Net effect of nuclear receptor activation: stimulation of DNA and protein synthesis.
TATA binding protein (TBP)
All three RNA polymerases depend on TBP to initiate transcription. In each case TBP is a subunit of a larger complex required for transcription initiation: SL1, TFIID and TFIIIB, which are required by RNA polymerase I, II and III respectively. Helps RNA polymerase to recognise the core promoter.
Splicing
Almost all genes in higher organisms have introns that need to be removed. For example, the dystrophin transcription unit is 2.5 Mb but the mature mRNA is 14 kb; the rest is removed by splicing. Exon-intron boundaries often correspond to protein domain boundaries. The correspondence between exons and the boundaries of globular domains facilitates exon shuffling: DNA recombination has the potential to join previously separate domains into a single new protein. This increases the rate of evolution, as it allows recombination of independently folding domains and does not depend on slow accumulation of point mutations. Splicing involves two transesterification reactions and occurs in the spliceosome complex. First a bond is transferred from one location to another, then the intron is removed and the two neighbouring exons are joined together. The branch-point A residue plays a key role in the reaction. 1. First nucleophilic attack: the 2' hydroxyl group of the conserved adenosine within the branch site attacks the conserved guanine of the 5' splice site at the exon1-intron junction. 2. A 2'-5' phosphodiester bond is made between the two residues and the exon1-intron junction is cleaved. 3. The products are a 2'-5' phosphodiester RNA lariat structure and a free 3'-OH (leaving group) on the upstream exon. 4. A rearrangement of spliceosomal components must occur to allow the second transesterification reaction. 5. Second nucleophilic attack: the 3'-OH end of the released exon 1 attacks the scissile phosphodiester bond at the conserved guanine of the 3' splice site at the intron-exon2 junction. 6. This reaction releases the 3'-OH of the intron, giving a free lariat and spliced exons. 7. The two exons are joined together and the intron sequence is released. Intron RNA is degraded in the nucleus and snRNPs are recycled. The spliceosomal proteins hold the RNA in a specific conformation.
Base pairs between spliceosomal RNA and pre-mRNA hold the intron-exon boundaries in a specific configuration that promotes splicing chemistry. U1 snRNP (small nuclear ribonucleoprotein) forms base pairs with the 5' splice junction. BBP (branch-point binding protein) and U2AF (U2 auxiliary factor) recognise the branch-point site; U2 snRNP then displaces BBP and U2AF and base pairs with the branch-point site. U4, U5 and U6 snRNPs are rearranged to create the active site of the spliceosome and to position the pre-mRNA portions correctly for the transesterification reactions. U6 displaces U1 at the 5' splice junction, and further rearrangements give the right configuration for the second transesterification reaction, completing the splice. The spliceosome leaves some of its subunits bound at the mature splice junctions, allowing for interaction with the nuclear pore.
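The net outcome of the two transesterification reactions (intron out, exons joined) can be sketched in a few lines. This is only an illustration: the function name `splice`, the toy sequences and the assumption that intron coordinates are already known are all hypothetical, and nothing here models splice-site recognition by the snRNPs.

```python
def splice(pre_mrna, introns):
    """Remove introns, given as (start, end) half-open index pairs, and join the exons."""
    exons = []
    pos = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[pos:start])  # exon upstream of this intron
        pos = end                          # skip over the intron
    exons.append(pre_mrna[pos:])           # final exon
    return "".join(exons)

# Toy transcript (hypothetical): exon 1 + intron (starts GU, ends AG) + exon 2
exon1, intron, exon2 = "AUGGCC", "GUAAGUACUAACUUUCAG", "UUAUGA"
pre = exon1 + intron + exon2
mature = splice(pre, [(len(exon1), len(exon1) + len(intron))])
print(mature)
```

The intron is simply dropped here; in the cell it leaves as a lariat and is degraded in the nucleus.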
Transposons
Also referred to as jumping genes. Discovered by Barbara McClintock. Mobile genetic elements that can increase in number as they jump from one location to another. A large fraction of the genome of higher organisms is derived from these elements. Two main classes: -DNA transposons, which have a DNA intermediate. -Retrotransposons, which have an RNA intermediate. These are more numerous in higher organisms than DNA transposons.
pQ
Another mechanism for anti-termination. Acts when the sigma factor is still attached to RNA polymerase before it has escaped the promoter. It partially displaces the sigma factor and promotes reinitiation.
Lysogen
Bacterial cell in which a phage exists as DNA in its dormant state.
Polyadenylation
The C-terminal domain (CTD) of RNA polymerase II binds various factors depending on its phosphorylation state. These include splicing and polyadenylation proteins, which are handed off to the nascent RNA to perform splicing and polyadenylation. Polyadenylation leads to transcription termination and release of the RNA from RNA polymerase. Produces mature messenger RNA for translation. Cleavage and polyadenylation specificity factor (CPSF) and cleavage stimulation factor (CstF) are transferred from the CTD of RNA polymerase II. 1. The cleavage site on the RNA is flanked by two binding sites recognised by protein factors. 2. Further proteins are recruited which cleave the RNA. 3. Poly(A) polymerase (not RNA polymerase II) adds a string of A residues, which are bound and protected by another protein factor. 4. This produces mature RNA for translation. The polyA tail eventually interacts with the cap, bringing the two ends of the mRNA together, so the ribosome does not have to travel far in translation.
SWI/SNF complex
Chromatin remodelling machine. Binds to chromatin and uses the energy of ATP hydrolysis to mobilise nucleosomes. This chromatin remodelling regulates the accessibility of DNA to the transcription machinery and can occur by octamer sliding along the same strand of DNA or by transfer of the histone octamers to adjacent regions of exposed DNA. The complex is also involved in transcriptional repression of a subset of its targets, by moving nucleosomes into areas where they interfere with transcription.
Centromere
Centromeres are required for chromosome segregation. Metacentric centromeres lie in the centre of the chromosome. Submetacentric centromeres give arms that are slightly unequal in length. Acrocentric centromeres give one arm much longer than the other. Telocentric centromeres lie at the end of the chromosome (one arm). Bacterial chromosomes are circular and have no centromeres; how they segregate remains a mystery. Some steps may involve toroidal motor proteins that pump DNA to opposite ends of the cell. S. cerevisiae (baker's yeast) has a 'point' centromere of 125 bp. S. pombe (fission yeast) has a 'regional' centromere of 40-100 kb that contains a block of repeated units. Higher eukaryotes have a 'regional' centromere of 400-5000 kb.
Chromosome condensation
Chromosomes are packed into chromatin by wrapping around histones and other architectural proteins into higher-order structures. The most condensed form of the chromosome occurs during metaphase in mitosis and meiosis. The least condensed form is in highly expressed genes, where the nucleosomes are continually displaced by passage of the RNA polymerase II complex. There are also intermediate levels of compaction, which can dictate whether a gene is accessible. Gene activation is accompanied by chromatin de-condensation and vice versa. Early experimental approaches revealed that active genes are packaged in a different chromatin conformation compared to inactive genes. A elements are nuclear scaffold attachment sites (they attach DNA to the nuclear scaffold). DNase I can be used to determine whether a region of DNA is accessible: it cuts randomly in exposed DNA. De-condensation can be seen in polytene chromosomes from Drosophila salivary glands, which form well-aligned bundles; gene expression opens up whole domains. Gene activation (via activators) promotes de-condensation, while condensation (via repressors) promotes gene inactivation. There is a balance between opposing forces. Condensation is mediated by interactions between the N-terminal 'tails' of the histones, which regulate the formation of 30 nm fibres. Interactions between tails are modulated by covalent modifications. N-terminal tails of core histones: H3 (Lys 9, 14, 18, 23), H4 (Lys 5, 8, 12, 16), H2A, H2B. Specific lysines are acetylated, reducing the positive charge. Histone tails are chemically modified in a very large number of possible combinations, which constitute the "histone code". By reading this histone code we can gain insight into the probable level of expression of particular regions. However, the histone code does not dictate expression directly. Rather, it reflects the action of site-specific transcription factors, which are the primary determinants.
Histone modifications cause a cascade of further modifications. For example, phosphorylation causes acetylation, which causes activation. And vice versa: deacetylation causes methylation, which causes repression. These histone modifications are an example of epigenetics. If a transcription factor is overexpressed, it will cause de-condensation of its target region. This reveals that TFs can penetrate condensed chromatin to some degree.
Sigma factor
Subunit of bacterial RNA polymerase. An adapter subunit that directs interactions with different classes of promoters, such as those needed during exponential growth, stationary phase or nitrogen starvation. RNA polymerase interacts with a promoter via the sigma factor. Sigma factor blocks the RNA exit channel. The growing chain packs into the pocket of RNA polymerase and causes strain. The strain is relieved either by aborting transcription or by releasing the sigma factor that blocks the pocket, which breaks the contact between RNA polymerase and the promoter. After escape, the polymerase begins elongation.
Sigma 70
Controls transcription by regulating initiation and termination via its effects on RNA polymerase. Sigma 70 is the standard sigma factor in the bacterium E. coli. Every cell has the same genome, but only a small fraction of the genes are expressed in any particular cell type; this is due to tightly regulated gene expression. Genes are regulated by the availability of alternative sigma factors that recognise different promoters. There are various sigma factors and promoter structures. Some sigma factors are encoded by the host, others by a virus. -T phages ADP-ribosylate RNA polymerase to allow expression of their early genes. -Phage T7 encodes its own RNA polymerase that recognises its own promoters. -Initiation at some promoters is sensitive to supercoiling because this affects the bending and strain in the spacer region. Supercoiling reflects the metabolic state of the cell. Many phages encode an anti-terminator protein that allows read-through into viral genes.
Immediate early genes
N and cro, transcribed from the promoters PL and PR respectively. N is an anti-terminator (it halts termination, allowing lytic growth to proceed) that allows transcription of the delayed early genes. N therefore allows the first step of the cascade, immediate early to delayed early. Immediate early transcription is initiated from the outward-facing promoters flanking cI and ends at the terminators tL and tR. Once N is synthesised, it binds to nut sites. The nut site is in the RNA: N is an RNA binding protein, and once the nut site has been transcribed N binds it, modifying RNA polymerase as it passes. This allows RNA polymerase to pass through the terminators and begin transcription of the delayed early genes. A second anti-terminator, Q, allows expression of late genes from the third major promoter, PR'.
Cro
Cro binding establishes the lytic cycle (the opposite of CI, which maintains lysogeny by repressing the lytic genes). Binds to the same operators as CI, OR (small R). OR3 has the greatest affinity, whereas OR1 and OR2 have the same affinity for Cro binding. The lytic cascade requires Cro protein, which directly prevents repressor maintenance via PRM (lower RM), as well as turning off delayed early gene expression, indirectly preventing repressor establishment via PRE (lower RE). Cro shuts down PRM.
Chromatin
DNA complexed with proteins such as histones (which condense the DNA). Negative supercoiling (introduced by DNA gyrase) also condenses DNA.
DNA encoding
DNA encodes information in the sequence of the bases. The sequence of bases is converted into protein sequences via an RNA intermediate. DNA also encodes information in the configuration of H-bond donors and acceptors in the major groove. This allows transcription factors to recognise specific DNA sequences and so control the expression of genetic information. Proteins (TFs) bind to DNA in two ways: 1. Sequence-specific binding, via H-bonds between amino acid side chains of specific proteins and the DNA bases in the major groove. 2. Non-specific binding, mediated by ionic interactions between positively charged amino acid side chains and the negatively charged phosphodiester backbone. Protein binding is also mediated by the shape of distortions in the helix, which are dictated by the sequence of bases (A-tracts). Proteins bend DNA when they bind. The recognition sequence is often naturally bent when present in naked DNA. Example: the Tn10/IS10 transposon has a sequence with 2 known protein binding sites and a possible third site. Site 2 binds the E. coli protein integration host factor (IHF) at the minor groove (unusual!), resulting in a 180° bend. A-tracts: present in the most intrinsically bent DNA. An extra H-bond may be present due to the large propeller twist. Mechanism remains unknown.
Minisatellite repeats
DNA repeat class. DNA repeats, often 10-20 bp long. Useful for DNA fingerprinting. Arise from replication slippage.
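Fingerprinting works because individuals differ in how many tandem copies of a repeat unit they carry at a locus. A minimal sketch of counting copies, assuming the repeat unit is already known; the function name, the 10 bp unit and the flanking sequences are made up for illustration.

```python
def count_tandem_repeats(seq, unit):
    """Return the longest run of back-to-back copies of `unit` found in `seq`."""
    if not unit:
        raise ValueError("repeat unit must be non-empty")
    n = len(unit)
    best = 0
    for i in range(len(seq)):
        run = 0
        j = i
        while seq[j:j + n] == unit:  # extend the run one copy at a time
            run += 1
            j += n
        best = max(best, run)
    return best

UNIT = "ACGTACGTAC"                    # hypothetical 10 bp repeat unit
allele_a = "TTT" + UNIT * 3 + "CCC"    # 3 copies
allele_b = "TTT" + UNIT * 5 + "CCC"    # 5 copies
print(count_tandem_repeats(allele_a, UNIT), count_tandem_repeats(allele_b, UNIT))
```

Two alleles with different copy numbers give fragments of different length, which is what a fingerprint gel resolves.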
Moderately repeated DNA sequences
DNA repeat class. Interspersed with single-copy DNA. Not usually clustered. The main example is the Alu repeat, a 280 bp transposon repeat integrated into the genome. There are more than 300,000 copies in human DNA! It is now thought that up to 10% of the human genome is derived from Alu elements, although many are unrecognisable due to point mutations. Pseudogenes: non-functional, often the product of reverse transcription of cellular RNA into DNA and integration into the genome. They do not code for proteins and are often found in introns!
Satellite DNA
DNA repeat class. Simple repeats. The simplest is the guinea pig tandem cluster of the 6 bp unit CCCTAA or GGGATT. Human alpha DNA is a 170 bp repeat, present at centromeres.
The sliding clamp
DNA replication is highly processive and can go halfway round the genome without needing to restart. The sliding clamp is a ring-shaped grommet that encircles the DNA and binds the polymerase, ensuring that it does not fall off the DNA. It has low affinity for any particular sequence and therefore moves easily. Each Okazaki fragment has its own sliding clamp. The sliding clamp increases processivity from fewer than 10 bp to more than 5 kb, and speed from less than 20 nucleotides/second (nt/s) to 1,000 nt/s.
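A rough back-of-envelope calculation shows what the speed difference means for a whole genome. The genome size (about 4.6 Mb for E. coli) and the assumption of two forks each copying half the chromosome are not from this card; the rates are the 20 nt/s and 1,000 nt/s figures quoted above.

```python
GENOME_BP = 4_600_000  # assumed approximate E. coli genome size

def fork_time_seconds(genome_bp, rate_nt_per_s, forks=2):
    """Time for each fork to copy its share of the genome at a constant rate."""
    return genome_bp / forks / rate_nt_per_s

with_clamp = fork_time_seconds(GENOME_BP, 1000)   # clamp-bound polymerase
without_clamp = fork_time_seconds(GENOME_BP, 20)  # upper bound without the clamp

print(f"with clamp:    {with_clamp / 60:.0f} min")
print(f"without clamp: {without_clamp / 3600:.0f} h")
```

Under these assumptions the clamp brings replication from more than a day down to well under an hour, and this ignores the time lost to constant rebinding when processivity is below 10 bp.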
Transcription
DNA to RNA.
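As a small illustration of the copying rule, the sketch below builds the RNA complementary to one DNA template strand. The function name and example sequence are hypothetical; the template is written 3' to 5' so the RNA reads 5' to 3'.

```python
# Base-pairing rule for transcription: RNA uses U where DNA would use T
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Return the RNA (5'->3') complementary to a DNA template given 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)

print(transcribe("TACGGT"))
```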
RNA polymerase
DNA-dependent. Catalyses transcription. Only one strand of DNA is copied. Synthesis starts de novo and does not require a primer. Synthesis occurs in the 5' to 3' direction. One PPi (pyrophosphate) is released for each nucleotide polymerised. The reaction always leaves a free 3'-OH to which nucleotides (NTPs) are added. In eukaryotes, RNA polymerases I, II and III transcribe different classes of genes, whereas in bacteria all these functions are performed by a single RNA polymerase. Furthermore, the bacterial RNA polymerase has an adapter subunit called a sigma factor that directs its interactions with different classes of promoters, such as those needed during exponential growth, stationary phase or nitrogen starvation. RNA polymerase has two forms, as identified by fractionation: the core complex (smaller) and the holoenzyme (complete enzyme). 5 subunits (copies in core / copies in holoenzyme): α (37 kDa): 2 / 2. β (150 kDa): 1 / 1. β' (156 kDa): 1 / 1. ω (11 kDa): 1 / 1. σ (70 kDa): 0 / 1. CTD: C-terminal domain of the alpha subunit. The polymerase active site is between the β and β' subunits.
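The subunit list can be turned into approximate masses for the two forms of the enzyme. This is plain arithmetic on the figures in the card; the dictionary layout and names are just one convenient way to write it, and the 156 kDa subunit is taken to be β'.

```python
# subunit: (mass in kDa, copies in core, copies in holoenzyme)
SUBUNITS = {
    "alpha":      (37, 2, 2),
    "beta":       (150, 1, 1),
    "beta_prime": (156, 1, 1),
    "omega":      (11, 1, 1),
    "sigma":      (70, 0, 1),
}

core_kda = sum(mass * core for mass, core, _ in SUBUNITS.values())
holo_kda = sum(mass * holo for mass, _, holo in SUBUNITS.values())

print(core_kda, holo_kda)  # the difference is the single sigma subunit
```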
Tau (τ)
Dimer that acts as a scaffold on which other components are mounted.
DNA polymerase III
Dimerisation of polymerase III prevents the leading strand from running off ahead of the lagging strand in E.coli. The lagging strand is forced to loop round. However in eukaryotes, the leading and lagging strand are replicated by different polymerases. Hence the eukaryotic replicative polymerase is a heterodimer, unlike the bacterial homodimer.
Negative supercoiling
Drawn from bottom left to top right, single strands! Toroidal: one-start left-handed helix. Can be established by wrapping around a protein spool. Plectonemic: two-start right-handed helix.
Premature termination (bacteria)
Due to base pairing that forms two stem-and-loop structures in the mRNA. The second loop causes termination. Termination can be prevented if a protein binds the mRNA before the first loop, allowing alternative sites to pair and preventing formation of the second loop; this is known as read-through. Important in regulation of the trp operon and other amino acid biosynthesis pathways.
Drosophila gene eyeless (ey)
Encodes a transcription factor with both a paired domain and a homeodomain. It is homologous to the mouse small eye (Pax-6) gene and to the aniridia gene in humans. The ey gene functions in eye morphogenesis. By targeted expression of ey in Drosophila, ectopic eyes can be induced on the wings, legs and antennae. The eyes appear morphologically normal, with fully differentiated ommatidia and a complete set of photoreceptor cells. If the mouse protein is expressed in the fly it also directs the development of an eye: not a mouse eye but a fly eye. However, there is no neuronal link to the brain, so the ectopic eyes cannot see.
TFs
Eukaryotic TFs often have separate DNA binding and transcriptional activating domains. Eukaryotic TFs are like beads on a string: independently folding globular domains joined by flexible linkers. DNA binding domains are often basic (net positive charge). Activation domains are often acidic (net negative charge). When activators are present (C/EBP, HNF1, HNF3, HNF4, AP1), transcription is initiated at the promoter. 3 domains in eukaryotic TFs: -DNA binding domain -Activation domain -Flexible protein domain. In bacteria, the naked promoter is usually active unless repressed. The opposite holds for eukaryotes, in which the naked promoter is off unless activated. In eukaryotes, fewer genes are controlled by repressors.
Lactose operon
Example of an inducible system in bacteria. Regulates transcription via transcription factors (DNA binding proteins). Catabolism. The repressor binds inducers such as lactose or IPTG, causing it to fall off the DNA and allowing RNA polymerase to escape the promoter, switching transcription on. There is a binding pocket for the inducer in the hinge region. When the inducer binds, the hinges close, moving the HTH motifs so that they can no longer bind the operator. This lowers the affinity for the operator from X to the square root of X. The repressor therefore changes conformation when the inducer binds and falls off the DNA. Transcription is activated further by an additional binding site for an activator called CAP (catabolite activator protein), which activates transcription in the presence of cAMP (cAMP binds to CAP). The lac operator is downstream of the promoter, meaning the polymerase can bind to the promoter but cannot initiate until the repressor comes off. If the inducer-binding site of the repressor remains empty, transcription stays switched off.
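The two controls described above (repressor released by inducer, extra activation by cAMP-bound CAP) combine like a small truth table. A minimal sketch; the function name and the output labels "off", "low" and "high" are assumptions chosen for illustration, not measured expression levels.

```python
def lac_transcription(inducer_present: bool, camp_high: bool) -> str:
    """Toy logic model of lac operon output."""
    # Without inducer the repressor stays on the operator and blocks initiation
    repressor_on_operator = not inducer_present
    if repressor_on_operator:
        return "off"
    # CAP only boosts transcription when cAMP is bound to it
    return "high" if camp_high else "low"

for inducer in (False, True):
    for camp in (False, True):
        print(f"inducer={inducer!s:5} cAMP={camp!s:5} -> {lac_transcription(inducer, camp)}")
```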
Tryptophan operon
Example of a repressible system in bacteria. Regulates transcription via transcription factors (DNA binding proteins). Anabolism. Binding of a corepressor such as tryptophan to the aporepressor results in a conformational change, switching transcription off. If tryptophan is not bound to the aporepressor, transcription occurs. The operator is between -35 and -10 of the promoter. The repressor often has a helix-turn-helix motif (HTH), which interacts with DNA. The carboxyl-terminal helix is the recognition helix that fits into the major groove, where the amino acid side chains recognise specific DNA sequences. HTH proteins also often bind as symmetric dimers. This increases the specificity and affinity of binding: if one hand lets go, the other prevents the protein from diffusing away. A scaffold holds the helix-turn-helix in the correct orientation. Transcription also depends on the availability of Trp through attenuation. The ribosome must be able to catch up with RNA polymerase and therefore must be able to move faster. Whether the ribosome reaches the attenuator in time to cause termination depends on its progress past the 2 Trp codons in the leader peptide trpL. The ribosome is a negative regulator. When Trp is plentiful, the ribosome catches up with RNA polymerase and gets in the way of base pairings, allowing a second stem and loop to form, creating a transcriptional terminator (same method as premature termination via stem loops).
β-like globin cluster in rabbits
Example of interspersion of genes and repeated sequences. β4 and β3 are embryonic; β1 is adult globin. However, β2 is a pseudogene that has lost an intron and undergone reverse transcription. Repeated sequences are interspersed between the β genes. Repeat sequences are 100-1400 bp long. Some sequences are present several times in the cluster, others once in the cluster with other copies elsewhere in the genome. For example, 6 copies of a 400 bp repeat are present in a 65 kb human beta globin locus.
Domain swapping
Facilitates evolution. If an estrogen receptor DNA binding domain is replaced by the glucocorticoid receptor DNA binding domain, estrogen treatment would activate the glucocorticoid-responsive genes. Explains the diversity of nuclear receptor types.
Repeated DNA
Fall into several classes. -Satellite DNA -Minisatellite DNA -Microsatellite DNA -Moderately repeated DNA sequences/ interspersed repeats. -Transposons (DNA and retro)
Other transcription factors
For example; HMG domain of TF SOX-9. HMG domains bind DNA with little or no sequence-specificity. However like many DNA binding proteins, their affinity for a site is increased if the naked DNA is naturally distorted.
Phage lambda
Found in certain strains of the intestinal bacterium Escherichia coli. When the bacteria are irradiated with UV light, they stop growing. After 90 minutes they lyse (burst), expelling large quantities of a virus called lambda into the culture medium. This is a key example of switching genes off and on: the virus switches from the dormant form in the dividing bacterium to the activated form in the irradiated bacterium. The viruses are called bacteriophages, bacteria eaters. Adding a small amount of lambda to a lawn of bacteria on an agar plate will infect cells, and the infected cells will in turn infect adjacent cells. This process results in a clear area on the agar plate. This is called the lytic phase of the phage cycle. However, some bacteria survive: those in which the phage has integrated its genome into the host chromosome. It then expresses only one protein, the repressor. This turns off all the other phage genes and provides immunity from other phage of the same type. This is called the lysogenic phase of the life cycle. Good example of the 4 mechanisms of regulation! Important in the development of the concept of operons.
Rho
Hexameric RNA-dependent 5' to 3' helicase. Displaces RNA polymerase, terminating transcription. Inhibited by strong secondary structure and therefore does not terminate transcription of untranslated rRNA or tRNA. Polysomes protect mRNA from termination by Rho.
DNA helicase
Hexameric donut that encircles the leading strand in eukaryotes and the lagging strand in prokaryotes. Separates the strands which are immediately bound by a protein that prevents the single strand from self-annealing and forming secondary structure.
Epigenetics
Histone modifications. Means heritable traits or conditions not defined by the actual sequence of DNA bases (heritable gene control). Controversial: cause and effect! Histone modifications cause a cascade of further modifications. For example, phosphorylation causes acetylation, which causes activation. And vice versa: deacetylation causes methylation, which causes repression.
DNA replication in bacteria
Initiation of replication takes place at the origin of replication. In E. coli, the origin of replication is 245 bp long and contains two repeated sequence motifs. DnaA recognises the origin of replication and loads on together with accessory proteins that help bend the DNA. ATP is hydrolysed and the strands separate, aided by negative supercoiling. This provides access for the helicase DnaB, which is loaded onto the strands by DnaC to prepare the way for the polymerase. This forms the replication fork. Two replication forks set off in opposite directions from the origin. Eukaryotes use many origins, speeding up replication, whereas bacterial circular chromosomes require only one origin of replication. The separation of the two strands by DNA helicase results in positive supercoils in front of the fork; topoisomerases untangle the DNA in front of the advancing fork. First, the strands are separated and an RNA primer is synthesised by primase. The 3' end of the primer is used to start DNA synthesis, as nucleotides can only be added onto an existing 3'-OH group. The leading strand is synthesised continuously, whereas the lagging strand is made in Okazaki fragments (discontinuous). DNA primase synthesises RNA primers at regular intervals to allow DNA polymerase loading for replication of each fragment. Replication of both strands is coordinated by dimerisation of polymerase III. The sliding clamp is a ring-shaped grommet that encircles the DNA and binds the polymerase, ensuring it does not fall off the DNA. Once the Okazaki fragment is fully synthesised, the RNA primer is removed by DNA polymerase I, which synthesises a short stretch of DNA to replace it (gap-filling). Finally DNA ligase joins the ends of the Okazaki fragments together. Prokaryotic and eukaryotic DNA replication are broadly the same; one difference is the helicase. The helicase is a hexameric donut that encircles the leading strand in eukaryotes and the lagging strand in prokaryotes.
The SV40 virus T antigen serves the dual function of origin recognition and helicase. It is the equivalent of the bacterial DnaA, DnaB and DnaC proteins.
Lytic cycle
Involves a cascade of events that must progress in a fixed sequence. Lytic development takes place by producing phage genomes and protein particles that are assembled into progeny phages. 1. Immediate early: phage attaches to bacterium, DNA is injected into bacterium. 2. Delayed early: enzymes for DNA synthesis are made, replication begins. 3. Late: genomes, heads and tails are made. DNA is packaged into heads, tails are attached. 4. Lysis: cell is broken to release progeny phages. The developmental decision between lysogeny and lysis is taken during the delayed early phase. The choice depends on the control of transcription. After injection the ends of the phage DNA are joined, so the DNA becomes circular, stabilising it. The lytic cascade requires Cro protein, which directly prevents repressor maintenance via PRM, as well as turning off delayed early gene expression, indirectly preventing repressor establishment via PRE. Transcriptional cascade: 1. Immediate early: N and cro are transcribed. 2. Delayed early: pN antiterminates, cII and cIII are transcribed. 3. Delayed early continuation: Cro binds to OL and OR. 4. Late expression: Cro represses cI and all early genes. pQ activates late expression.
Genome size
Large variation in genome size between organisms. Simple organisms may have huge genomes, while complex organisms may have small genomes. An amoeba has a genome size of 10¹¹ bp, whereas the fruit fly Drosophila has a genome size of 1.65 x 10⁸ bp. Humans have a genome size of 2.9 x 10⁹ bp. However, only a very small proportion of the genome is protein coding. The rest is non-functional DNA, DNA involved in spatial and temporal control of expression, and repeated sequences.
Lysogeny
Lysogeny is characterised by integration of the bacteriophage nucleic acid into the host bacterium's genome or formation of a circular replicon in the bacterial cytoplasm. Cascade needed to establish lysogeny: 1. Immediate early: N and cro are transcribed. 2. Delayed early: N antiterminates. cII and cIII are transcribed. 3. Lysogenic establishment: CII acts at PRE (lower RE). cI is transcribed. 4. Lysogenic maintenance: repressor binds at OL and OR. cI is transcribed from PRM.
Clamp loader
Machine that assembles and loads the new sliding clamp onto the lagging strand that is needed for synthesis of each Okazaki fragment.
Nut site
The nut (N utilisation) site functions as RNA: once transcribed, it is bound by transcription anti-termination factors on the surface of RNA polymerase. N binds, allowing RNA polymerase to read through the terminator sites tR and tL and so transcribe the delayed early genes.
Renaturation kinetics
Measures the speed at which dissociated (single-stranded) sequences re-anneal. The higher the concentration of a sequence (number of copies), the faster it renatures. If there is only a single copy, it is harder for a strand to find its complementary partner. Hence, the more complex the DNA of a species, the longer a sample takes to reach Cot¹/₂, the Cot value at which half of the DNA has reannealed. The bigger the genome, the longer it takes to reanneal.
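The link between complexity and Cot¹/₂ can be made explicit with the standard second-order rate law for reassociation. The symbols C (concentration of single-stranded DNA at time t) and k (reassociation rate constant) are introduced here for illustration and are not used elsewhere in these notes.

```latex
\frac{C}{C_0} = \frac{1}{1 + k\,C_0 t}
\qquad\Longrightarrow\qquad
\frac{C}{C_0} = \frac{1}{2}
\quad\text{when}\quad
C_0 t_{1/2} = \frac{1}{k}
```

A sequence present in few copies has a small k, so its Cot¹/₂ is large; a highly repeated sequence has a large k and a small Cot¹/₂.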
Non-specific DNA-protein binding
Mediated by ionic interactions between positively charged amino acid side chains of the protein and the negatively charged phosphodiester backbone of DNA.
rRNA
Most RNA in cells is ribosomal RNA. Transcribed by RNA polymerase I. The rDNA repeat in nuclear DNA is transcribed into 45S pre-rRNA. This is cleaved to form 18S rRNA (via a 20S intermediate), which is incorporated into the small ribosomal subunit, and 5.8S and 28S rRNA (via a 32S intermediate), which are incorporated into the large ribosomal subunit.
Beta-scaffold factors with minor groove contacts
NF-κB p65 homodimer. Example: the bacterial integration host factor (IHF) protein used by lambda.
Toroidal
Negative supercoiling. One-start left handed helix. Can be established by wrapping around a protein spool
ncRNAs
Non-coding RNAs. The mRNA encoding the stationary phase sigma factor sigmaS folds back on itself, inhibiting translation. Inhibition is relieved by either of two ncRNAs. ncRNAs have a variety of mechanisms that alter translation rate, termination and degradation of RNA.
Pseudogenes
Non-functional "gene-like" sequences. Often the product of reverse transcription of cellular RNA into DNA and integration into the genome. Two kinds are seen. 1. Copies carrying defects such as frame shifts, nonsense mutations or lack of a promoter signal; all three are found on the same chromosome as, and in close proximity to, the functional gene. 2. Intronless copies (introns precisely removed) on a separate chromosome, probably generated by reverse transcription of mRNA and integration into a different chromosome, possibly following retroviral infection during evolution.
Twist
Occurs at base pairs. Refers to the twist of a helix on its axis (axis twist), and the propeller twist between paired bases (propeller twist). Propeller twist motion distorts the hydrogen bond at a 20° angle.
DNA topology
Applies to circular plasmids, bacterial chromosomes and DNA anchored at both ends. Important during replication and transcription. Almost all bacteria have DNA gyrase, which uses the energy from ATP hydrolysis to add negative supercoils. Very few bacteria have reverse gyrase, which adds positive supercoils. Thermophiles (which live at high temperatures) have reverse gyrases to help hold the strands of the double helix together and prevent thermal denaturation. In relaxed DNA, the linking number (L) equals the number of turns in the helix. Supercoiling is introduced if turns are added (positive) or removed (negative). Positive supercoiling adds a twist in the same direction as the double helix. Negative supercoiling removes a twist and therefore tends to separate the strands, which helps with protein binding, transcription and replication. Negative supercoiling results in the formation of writhes, and negative supercoils help condense the DNA so that it fits inside the cell. Any protein that wraps DNA will change its topology. Unwinding DNA for replication introduces supercoiling; these topological problems are resolved by topoisomerase enzymes.
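A minimal sketch of the linking-number bookkeeping described above. The helical repeat of 10.4 bp per turn is the B-DNA figure quoted in these notes; the superhelical density formula (sigma = (Lk - Lk0) / Lk0) is a standard definition added here for illustration, and the plasmid size and number of removed turns are hypothetical.

```python
BP_PER_TURN = 10.4  # helical repeat of relaxed B-DNA, as quoted in these notes


def relaxed_linking_number(n_bp):
    """Lk0: number of helical turns in a relaxed molecule of n_bp base pairs."""
    return n_bp / BP_PER_TURN


def superhelical_density(lk, n_bp):
    """sigma = (Lk - Lk0) / Lk0; negative when turns have been removed."""
    lk0 = relaxed_linking_number(n_bp)
    return (lk - lk0) / lk0


n_bp = 5200                                    # hypothetical plasmid size
lk0 = relaxed_linking_number(n_bp)             # turns in the relaxed plasmid
sigma = superhelical_density(lk0 - 30, n_bp)   # 30 turns removed (assumed)
print(round(lk0, 1), round(sigma, 3))
```

A negative sigma corresponds to the underwound, negatively supercoiled state that gyrase produces.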
A-DNA
Occurs in crystalline structures where the water concentration is reduced. The structure is distorted and the bases are no longer co-planar.
Z-DNA
Occurs occasionally in the middle of a B-form molecule. Left handed. Longer and thinner than B-DNA. Bases are co planar.
OR (Small R)
Operator that includes the major promoter PR (small R) and the repressor maintenance promoter PRM (small RM), which directs the expression of the CI repressor during lysogeny. PRM is a poor promoter that requires the repressor itself for activation. OR contains three imperfect palindromes where the repressor CI binds: OR1 has the highest affinity for CI and OR3 the lowest. Repressor binding at OR1 turns off rightward transcription from the promoter PR. Repressor binding at OR2 activates transcription from PRM to make more repressor. Repressor binding at OR3 would in principle turn off PRM; however, because of its poor binding affinity, the repressor does not normally reach a high enough concentration to occupy it.
Plectonemic
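Interwound supercoiling. Two-start right handed helix when the DNA is negatively supercoiled (the same negative supercoiling appears as a one-start left handed helix when it is toroidal).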
Promoter structure (bacteria) and transcription initiation
Promoter sequences are degenerate (the sequence varies between promoters that serve the same function) and are described by a consensus sequence that represents an average. Promoter regions indicate the transcription start site. Transcription initiation in bacteria, summary: 1) RNA polymerase binds to DNA. 2) Locates the promoter. 3) Separates the strands of DNA to gain access to the template (open complex formation). 4) Initiates synthesis of RNA. 5) Elongates the RNA (promoter escape). Four binding stages occur before transcription is initiated. Complex 1: RNA polymerase binds DNA non-specifically. Complex 2 (RPc1, RNA polymerase complex 1): RNA polymerase interacts with the -35 region of the promoter; characterized by a bend in the DNA. Complex 3 (RNA polymerase complex 2): RNA polymerase interacts with both the -35 and -10 regions of the promoter. This results in isomerisation, relieving the stress in the bend. Complex 4 (open complex): the strands of DNA separate from -12 to +3, providing access to the template strand from which the RNA is synthesised. Abortive transcriptional starts are followed by promoter escape and elongation of the nascent RNA. After initiation of transcription, the growing chain packs into a pocket in the enzyme, causing strain. The strain can be relieved by aborting transcription and releasing the RNA, or by releasing the sigma factor. Once a 9 nucleotide chain has formed, the sigma factor is ejected, breaking the connection between the RNA polymerase/DNA complex and the promoter. This allows RNA polymerase to begin elongation. The purpose of this ritual is to test the strength of the promoter.
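To illustrate how a consensus sequence "represents an average" of degenerate promoters, here is a minimal sketch that takes the most common base at each position of a set of aligned sites. The five hexamers are invented for illustration and are not real promoter data; the function name `consensus` is likewise hypothetical.

```python
from collections import Counter


def consensus(aligned_sites):
    """Return the most common base at each column of equal-length sequences."""
    if len({len(s) for s in aligned_sites}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_sites)
    )


# Hypothetical aligned -10 hexamers, invented for illustration only.
sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATACT"]
print(consensus(sites))
```

Most individual sites differ from the consensus at some position, which is why promoters vary in strength.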
Specific DNA- protein binding
Protein (TF) binds to DNA via H-bonds between amino acid side chains of specific proteins and the DNA bases in the major groove.
CII and CIII
Proteins encoded by the delayed early genes cII and cIII. CII activates the repressor establishment promoter PRE (small RE), which in turn results in transcription of cI. Plays a crucial role in deciding whether a bacterial cell undergoes lysogeny or the lytic cycle. If the concentration of CII protein is sufficient to stimulate enough repressor synthesis from PRE to overcome Cro binding at the right operator, lysogeny occurs; if not, the lytic cycle occurs. The concentration of CII depends on the activity of host proteases. In essence, it is a competition between CI and Cro at OR (small R).
RNA Polymerases
RNA polymerases I and III rely on distinct sets of proteins to initiate transcription. All three polymerases are very similar, with several of the core enzyme subunits identical in all three eukaryotic RNA polymerases. RNA polymerases I and III recognise distinct promoter sequences and have unique TFs. RNA polymerases I and III transcribe the same subset of genes in all cell types. RNA polymerase II transcribes a different set of genes in different cells at different times. It therefore has a very sophisticated set of accessory factors. Spatial and temporal expression of genes determines the type of cell. RNA polymerase II does not contain a subunit similar to the prokaryotic sigma factor, which recognises the promoter and unwinds the DNA double helix. RNA polymerase II has 6 general transcription factors to carry out these two functions.
Translation
Rate of translation and mRNA stability control gene expression in bacteria. In turn, the rate of translation depends on the strength of the ribosome binding site. Translation continues until the mRNA is degraded. In bacteria, the ribosome interacts with the ribosome binding site (RBS) and uses the AUG start codon (ATG in the DNA) located about 6 nucleotides in the 3' direction. In eukaryotes, the ribosome interacts with the cap and then initiates at the first AUG it finds.
Roll
Refers to the displacement of one base pair relative to the base pair above it or below it.
DNA libraries
Represents a complete cross-section of genes from a given tissue or cell type. Libraries can represent the genes expressed or the total genome. DNA is not present within the library as naked DNA. The physical size of a library is surprisingly small: 200μl usually contains 2x10⁹ (2 billion) clones. 1x10⁶ clones represent more than 95% of the total DNA in a cell, so 1μl is enough to cover a complete genome. DNA libraries fall into two main categories: cDNA libraries and genomic libraries. Different information is available from each type.
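How many clones are needed for a given coverage can be estimated with the Clarke-Carbon formula, N = ln(1 - P) / ln(1 - f), where P is the desired probability of including any given sequence and f is the insert size divided by the genome size. This formula is not stated in these notes; it is added here as a standard estimate. The 20 kb insert size is an assumed value, and the function name is hypothetical.

```python
import math


def clones_needed(genome_bp, insert_bp, probability=0.95):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - f), with f = insert/genome."""
    f = insert_bp / genome_bp
    return math.ceil(math.log(1 - probability) / math.log(1 - f))


# Human-sized genome (2.9e9 bp, the figure used in these notes) with
# hypothetical 20 kb inserts.
n_95 = clones_needed(2.9e9, 20_000, 0.95)
n_99 = clones_needed(2.9e9, 20_000, 0.99)
print(n_95, n_99)
```

Raising the required probability increases the number of clones, but only logarithmically, which is why a library of modest physical size can cover a whole genome.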
PRM
Repressor maintenance promoter that directs the expression of the CI repressor during lysogeny. PRM is a poor promoter that requires the repressor itself for activation. CI has a dual function at OR, shutting down PR and activating its own promoter PRM. Self-activation is referred to as an autogenous regulatory circuit. The repressor stimulates isomerization of RNA polymerase to form the open complex required to initiate transcription.
Repressor gene (CI)
Encodes the repressor protein, which prevents the lytic cycle. cI is located between PL and PR. It is not transcribed with the immediate early genes. It is at the heart of the control region and of the switch. Occupancy of the operators OL and OR by the repressor protein CI prevents the lytic cycle. In a lysogen, CI is the only phage protein being expressed; it shuts down everything except its own production. It activates its own production by activating RNA polymerase at its own promoter, an example of a positive feedback loop. The region where CI binds constitutes an immunity region: no phage that has this region can infect the lysogen, because it is inhibited by the repressor already present, so the cell is immune to infection. CI is a dimeric helix-turn-helix DNA binding protein that binds to palindromic recognition sites. This provides specificity and cooperativity, and a greater affinity than that of monomers. PRM is the repressor maintenance promoter that directs the expression of the CI repressor during lysogeny. PRM is a poor promoter that requires the repressor itself for activation. Therefore CI shuts down PR and activates its own promoter PRM. The repressor stimulates isomerization of RNA polymerase to form the open complex required to initiate transcription. Once repressor synthesis is on, it stays on because the repressor stimulates its own production. However, if there is no repressor present, no repressor is made (spring-loaded mechanism). As the concentration of the repressor rises, cro is shut down first by binding at OR1, and repressor transcription is stimulated by binding at OR2. Amplification of the switch is achieved by cooperativity between repressor dimers bound at OR1 and OR2. However, when the concentration of repressor decreases, repression is relieved and the lysogen is induced (the phage is no longer dormant).
Telomerase
Reverse transcriptase (RT) present only in the fetus and in immortal cells such as germ cells and cancer cells. An RNA template of about 160 bases is used to synthesize a 6bp repeat that is added to the 3' end of the chromosome. Absent from the soma (the somatic, non-germline cells of the body), which is given as a reason why we age.
Rho dependent termination of transcription (bacteria)
Rho is a hexameric RNA helicase that binds single stranded RNA and moves towards the 3' end. It moves and rapidly displaces RNA polymerase, terminating transcription. It has a low affinity for RNA that is being translated and therefore tends to bind 3' untranslated region of RNA. If there are no ribosomes translating the mRNA, Rho loads on and chases RNA polymerase, knocking it off when they meet.
Spliceosome
Ribonucleoprotein complex containing numerous RNAs and proteins. It is where introns are spliced out of the RNA.
Chromatin remodelling machines
SWI/SNF (SWItch/Sucrose Non-Fermentable). Uses ATP hydrolysis to slide nucleosomes along the DNA helix and expose the DNA to transcription factors. Histone acetyltransferases (HATs) are an example of complexes that covalently modify histones and lead to relaxation of chromatin structure. Often, these two kinds of complex work together to regulate transcription.
B-DNA structure
Standard form in biological systems. Right handed (the front strands slope upward to the right, like /). Rungs-on-a-ladder structure with a phosphodiester backbone. Bases are co-planar. Strands are antiparallel: each runs 5' to 3' in the opposite direction. Has a major groove and a minor groove. Almost all sequence specific information is in the major groove. Different atoms can be accessed in each groove. DNA groove dimensions (width and depth) can be distorted by the sequence context, influencing protein binding. 10.4bp per turn. One helical turn is 3.4nm and 360 degrees. 3.4 angstrom axial rise. 34 degrees twist angle (rotation per residue).
Basic domains
Superclass 1. Includes the leucine zipper and basic helix-loop-helix domains. In the leucine zipper, leucine residues interact through hydrophobic interactions, holding the recognition helices in the correct spatial orientation. The basic helix-loop-helix also holds the helices in the correct spatial orientation; however, it uses a slightly different scaffold to hold the recognition helices in the correct position relative to each other and to the DNA binding site.
Zinc-coordinating domains
Superclass 2. Two major types of zinc finger motif: C₂H₂ (TFIIIA) and C₂C₂ (nuclear receptors), where C = cysteine and H = histidine. TFIIIA controls transcription of the 5S RNA gene. It contains 9 C₂H₂ zinc fingers, 6 of which bind in the major groove (1-3, 7-9). This gives a long recognition site. The compact size of the zinc finger gives it high affinity and specificity. Zinc stabilizes the scaffold and does not itself recognise the DNA. The glucocorticoid nuclear receptor is a transcription factor in which zinc fingers contribute to DNA binding and to the dimer interface.
TAF
TBP (TATA box-binding protein) associated factors (TAFs). Subunits of TFIID.
TFIID
TATA box is recognised by TFIID. TFIID is a multimeric protein complex composed of TATA box-binding protein and many TBP-associated factors
Cot experiment
Technique used to measure how repetitive the DNA is in a DNA sample such as a genome. Log10C₀t is plotted on the X axis and the percentage of DNA remaining single stranded on the Y axis. Area above the curve corresponds to repetitive DNA, whereas area under the curve corresponds to single copy sequences. The rate at which a sequence reassociates is proportional to the number of copies of that sequence in the DNA sample. A sample with a highly repetitive sequence will renature rapidly, while more complex sequences will renature more slowly. "Cot" is the product of C₀, the initial concentration of DNA, and t, the time in seconds. Repetitive DNA will renature at low Cot values, whereas complex DNA will renature at high Cot values. The fast renaturation of repetitive sequences is due to the availability of numerous complementary sequences. 1. Shear the DNA to a size of about 400bp. 2. Denature the DNA by heating to 100°C. This breaks the H-bonds and renders the DNA single stranded. 3. Cool to a point just below the temperature required for denaturation of half the sample (Tm). At this temperature, only perfectly paired duplexes will be stable; others will be in dynamic equilibrium, testing potential matched pairs. The DNA is cooled slowly so that complementary sequences are allowed to base pair again. 4. Determine the percentage of single stranded DNA at each time point as the incubation progresses. This is done by rapidly diluting the sample, which slows reassociation, and then binding the DNA to a hydroxylapatite column. The column is first washed with a low concentration of sodium phosphate buffer, which elutes ss-DNA, and then with a higher concentration of phosphate, which elutes ds-DNA. The amount of DNA in these two solutions is measured with a spectrophotometer. The Cot experiment demonstrated that the majority of eukaryotic genomic DNA is composed of repetitive, non-coding elements.
In organisms with large genomes, the curve is not smooth, but is broken into different phases. The phases correspond to various types of repetitive DNA (highly repetitive components, moderately repetitive components and single/low copy components). Analysis of a Cot curve allows us to determine: -Genome size -Relative proportions of single copy and repetitive sequences. -Fraction of genome occupied by each component. -Mean kinetic complexity.
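A minimal sketch of why a genome with several repeat classes gives a multi-phase Cot curve. It assumes ideal second-order kinetics, with the fraction remaining single stranded equal to 1 / (1 + k × Cot); the rate constants k and the three component fractions are hypothetical values chosen only to separate the phases, not measurements.

```python
def fraction_single_stranded(cot, k):
    """Ideal second-order reassociation: C/C0 = 1 / (1 + k * Cot)."""
    return 1.0 / (1.0 + k * cot)


def cot_half(k):
    """Cot value at which half of a single component has reannealed."""
    return 1.0 / k


def mixed_curve(cot, components):
    """components: (fraction_of_genome, k) pairs whose fractions sum to 1."""
    return sum(frac * fraction_single_stranded(cot, k) for frac, k in components)


# Hypothetical genome: highly repetitive, moderately repetitive, single copy.
components = [(0.25, 1e2), (0.30, 1e0), (0.45, 1e-3)]
for exponent in range(-4, 6):
    cot = 10.0 ** exponent
    value = mixed_curve(cot, components)
    print(f"log10(Cot) = {exponent:>2}  single-stranded fraction = {value:.3f}")
```

Each component contributes its own transition centred on its Cot¹/₂, and the height of each step gives the fraction of the genome in that component.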
Primase
The leading strand advances continuously because there is always a 3' end available onto which the next nucleotide is added. However, the lagging strand is synthesised in short Okazaki fragments of 500-800bp in length, and therefore needs RNA primers synthesised at regular intervals so that DNA polymerase can load onto the strand. Primase synthesises this RNA primer.
Topoisomerase
The separation of the two strands by helicase pushes positive supercoils in front of the fork. Topoisomerases are required to untangle the DNA in front of the advancing fork. One supercoil needs to be removed for every 10bp the fork moves forward because each strand is wound round the other every 10bp.
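The arithmetic behind "one supercoil for every 10bp" can be sketched as follows. The fork speed of 1000 bp per second is an assumed order of magnitude for a bacterial fork, not a figure from these notes, and the function name is hypothetical.

```python
BP_PER_TURN = 10  # rounded helical repeat used in the card above


def turns_to_remove(bp_unwound, bp_per_turn=BP_PER_TURN):
    """Helical turns that must be removed to unwind bp_unwound base pairs."""
    return bp_unwound / bp_per_turn


fork_speed_bp_per_s = 1000  # assumed value, for illustration only
print(turns_to_remove(fork_speed_bp_per_s), "turns to remove per second")
```

Under these assumptions the topoisomerases ahead of a single fork would have to remove on the order of a hundred turns every second.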
Rho independent termination of transcription (bacteria)
The sequence of the DNA allows the single stranded RNA to fold back on itself to form a stem-loop structure. The stem is rich in G/C bases to ensure strong pairing (3 hydrogen bonds per pair) and is followed by a sequence rich in U residues. The stem loop binds to the polymerase complex and stalls elongation, leading to termination of transcription.
DNA polymerase
The template strand is being replicated. DNA polymerase moves from the 3' hydroxyl end of the template to the 5' phosphate end, so the new strand grows 5' to 3'. Nucleotides are added to the growing chain and joined together by phosphodiester bonds. The phosphodiester bond is a bridge built of one phosphate and two oxygens, "O-P-O".
Template strand
The strand being copied: the polymerase reads it to direct synthesis of the complementary strand.
RNA polymerase II
Transcribes protein coding genes into pre-mRNA in eukaryotes. Pre-mRNA contains both introns and exons. The mRNA has high turnover (rapid synthesis and degradation), which allows it to be replaced. Very similar to polymerases I and III: several of the core enzyme subunits are identical in all three eukaryotic RNA polymerases. Transcribes a different set of genes in different cells at different times, and therefore has a sophisticated set of accessory proteins. RNA polymerase II does not contain a subunit similar to the prokaryotic sigma factor, which recognises the promoter and unwinds the DNA double helix. RNA polymerase II has 6 general transcription factors to carry out these two functions: TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH (TF = transcription factor, II = RNA polymerase II). The TATA box is recognised by TFIID. Recognition is followed by an ordered series of steps during which the general transcription factors enter and exit, and which culminates in the elongation phase. The C-terminal domain of RNA polymerase II binds various factors depending on its phosphorylation state. It hands them off to the mRNA to perform various functions such as splicing and polyadenylation. The factors it binds include capping factors and splicing factors, so the 5' cap is added to the transcript by enzymes recruited by RNA polymerase II.
RNA polymerase I
Transcribes ribosomal DNA (rDNA) into rRNA in eukaryotes. The product is stable (low turnover). Relies on a distinct set of proteins to initiate transcription. Recognizes quite distinct promoter sequences and has unique general TFs. Binds to a promoter containing a core promoter element and an upstream control element (UCE). Relies on TATA box-binding protein (TBP) to initiate transcription. TBP is part of a larger complex called SL1, which helps the enzyme recognise the core promoter.
RNA polymerase III
Transcribes tRNA genes and the 5S rRNA gene into tRNA and the 5S rRNA component of the ribosome in eukaryotes. The products are stable (low turnover). Relies on a distinct set of proteins to initiate transcription. Recognizes quite distinct promoter sequences and has unique general TFs. Relies on TATA box-binding protein (TBP) to initiate transcription. Has an unusual promoter structure: some of the promoter elements are located downstream of the transcription start site. Promoter recognition is mediated by TBP, which in this case is a subunit of TFIIIB, mirroring promoter recognition by polymerases I and II.
Lambda promoters
Two major promoters in lambda: PL and PR (small L and R). The operators OL and OR control the promoters. PL drives leftward transcription, whereas PR drives rightward transcription. They direct the expression of the immediate early proteins Cro and N. It is then decided whether development is lytic or lysogenic. PL makes N. PR makes Cro.
Intron-less genes
Ubiquitin, histones, cytochrome c.
Tetrahymena
Used to study telomeres. Unicellular organism with huge genome and small chromosomes, hence plentiful telomeres.
Promoter specific TF binding sites
Usually 5' (upstream) of the promoter, but may also be located in introns, which are 3' (downstream) of the transcriptional start site. Activators include C/EBP, HNF1, HNF3, HNF4 and AP1. When the activators are expressed, transcription from the promoter is initiated.
Microsatellite repeats
Very simple tandem repeats, often less than 10bp, for example (AC)n and (TAG)n. Often used as genetic markers for disease detection.
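A minimal sketch of how (AC)n and (TAG)n style repeats could be located in a sequence with a regular expression. The thresholds (unit length 2 to 6, at least 4 copies) and the test sequence are arbitrary choices for illustration, and the function name is hypothetical.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=4):
    """Return (start, unit, copies) for each tandem repeat found in seq."""
    # A short unit captured lazily, followed by further copies of itself.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Invented sequence containing (AC)6 and (TAG)5.
seq = "GGCAT" + "AC" * 6 + "GTT" + "TAG" * 5 + "CC"
print(find_microsatellites(seq))
```

The number of copies at a given locus varies between individuals, which is what makes these repeats useful as markers.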
Delayed early genes
cII and cIII encode the proteins CII and CIII. CII is very sensitive to proteolysis; CIII provides partial protection for CII. CII activates the promoter PRE (small RE) by binding to the -35 region, where it stimulates isomerization of RNA polymerase to form the open complex. Four complexes during initiation of transcription: 1. Non-specific 2. -35 interactions 3. -10 interactions 4. Isomerization to open complex.