Bio 319 Exam 3 Eukaryotic Genomes
Histone
In biology, histones are highly alkaline proteins found in eukaryotic cell nuclei that package and order the DNA into structural units called nucleosomes. They are the chief protein components of chromatin, acting as spools around which DNA winds, and play a role in gene regulation. Without histones, the unwound DNA in chromosomes would be very long (a length to width ratio of more than 10 million to 1 in human DNA). For example, each human cell has about 1.8 meters of DNA, but wound on the histones it has about 90 micrometers (0.09 mm) of chromatin, which, when duplicated and condensed during mitosis, results in about 120 micrometers of chromosomes.
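The compaction described above can be checked with a quick calculation. This is only a sketch using the two lengths quoted in the card; the variable names are made up for illustration.

```python
# Figures quoted in the card above.
dna_length_m = 1.8          # unwound DNA per human cell, in meters
chromatin_length_m = 90e-6  # length once wound on histones (90 micrometers)

# How many times shorter the DNA becomes when packaged as chromatin.
compaction_fold = dna_length_m / chromatin_length_m
print(f"Packing onto histones shortens the DNA about {compaction_fold:,.0f}-fold")
```

With these numbers the packaging works out to roughly a 20,000-fold reduction in length.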
Endogenous genes
Endogenous substances are those that originate from within an organism, tissue, or cell. Endogenous viral elements (EVEs) are DNA sequences derived from viruses that are ancestrally inserted into the genomes of germ cells. These sequences, which may be fragments of viruses, or entire viral genomes (proviruses), can persist in the germline, being passed on from one generation to the next as host alleles.
Promoters
DNA regions that facilitate the transcription of a particular gene; they are typically located a short distance upstream of the coding region.
Endogenous viral element
A DNA sequence derived from a virus, and present within the germline of a non-viral organism. EVEs may be entire viral genomes (proviruses), or fragments of viral genomes. They arise when a viral DNA sequence becomes integrated into the genome of a germ cell that goes on to produce a viable organism. The newly established EVE can be inherited from one generation to the next as an allele in the host species, and may even reach fixation. Endogenous retroviruses and other EVEs that occur as proviruses can potentially remain capable of producing infectious virus in their endogenous state. Replication of such 'active' endogenous viruses can lead to the proliferation of viral insertions in the germline. For most non-retroviral viruses, germline integration appears to be a rare, anomalous event, and the resulting EVEs are often only fragments of the parent virus genome. Such fragments are usually not capable of producing infectious virus, but may express protein or RNA.
DNA polymerase (general)
A cellular or viral polymerase enzyme that synthesizes DNA molecules from their nucleotide building blocks. DNA polymerases are essential for DNA replication, and usually function in pairs while copying one double-stranded DNA molecule into two double-stranded DNAs in a process termed semiconservative DNA replication. DNA polymerases also play key roles in other processes within cells, including DNA repair, genetic recombination, reverse transcription, and the generation of antibody diversity via a specialized DNA polymerase. DNA polymerases are widely used in molecular biology laboratories, notably for the polymerase chain reaction (PCR), DNA sequencing, and molecular cloning.
Histone deacetylase (Histone Deacetylation) (gene silencing)
A class of enzymes that remove acetyl groups (O=C-CH3) from an ε-N-acetyl lysine amino acid on a histone, allowing the histones to wrap the DNA more tightly. This is important because DNA is wrapped around histones, and DNA expression is regulated by acetylation and de-acetylation. Histone deacetylation decreases gene expression.
Small nuclear RNA molecules (snRNA) (U-RNA)
A class of small RNA molecules that are found within the nucleus of eukaryotic cells. The length of an average snRNA is approximately 150 nucleotides. They are transcribed by either RNA polymerase II or RNA polymerase III, and studies have shown that their primary function is in the processing of pre-mRNA (hnRNA) in the nucleus.
Spliceosome
A complex of snRNA and protein subunits that removes introns from a transcribed pre-mRNA (hnRNA) segment. This process is generally referred to as splicing. Each spliceosome is composed of five small nuclear RNAs (snRNAs) and a range of associated protein factors. When these snRNAs are combined with the protein factors, they make RNA-protein complexes called snRNPs. The canonical assembly of the spliceosome occurs anew on each hnRNA (pre-mRNA). The hnRNA contains specific sequence elements that are recognized and utilized during spliceosome assembly.
Isodityrosine linkages
An amino acid formed from two tyrosine residues joined by a diphenyl ether bridge; it can form linkages within or between proteins. Part of SbPRP gene family structure.
Heterogeneous nuclear RNA (hnRNA)
A diverse group of long primary transcripts formed in the eukaryotic nucleus, many of which will be processed to mRNA molecules by splicing (pre-mRNA). While pre-mRNA and hnRNA are typically used interchangeably, pre-mRNA is a subset of hnRNA.
Transgenes
A gene or genetic material that has been transferred naturally or by any of a number of genetic engineering techniques from one organism to another.
Highly repeated noncoding DNA sequences
Noncoding DNA sequences that are present in many copies in the genome. They include tandemly repeated sequences (satellite DNA, minisatellites, and microsatellites) and interspersed repetitive DNA (such as SINEs and LINEs).
Epigenetic
Changes in gene expression or cellular phenotype, caused by mechanisms other than changes in the underlying DNA sequence, some of which are heritable. It refers to functionally relevant modifications to the genome that do not involve a change in the nucleotide sequence. Examples of such modifications are DNA methylation and histone modification, both of which serve to regulate gene expression without altering the underlying DNA sequence. These changes may remain through cell divisions for the remainder of the cell's life and may also last for multiple generations. However, there is no change in the underlying DNA sequence of the organism; instead, non-genetic factors cause the organism's genes to behave (or "express themselves") differently.
Poly-A tail
Consists of multiple adenosine monophosphates; in other words, it is a stretch of RNA that has only adenine bases.
α-globin gene cluster
Contains the genes for the different forms of α-globin, which help form hemoglobin, the oxygen-transporting structure inside red blood cells. Defective α-globin genes cause α-thalassemia, resulting in mild to severe anemia and even death. Located on chromosome 16.
Trans-regulatory elements
Control the transcription of a distant gene. Typically encode transcription factors that act on remote regions of the chromatin.
Interspersed repetitive DNA
Found in all eukaryotic genomes. Includes retrotransposons and transposons as well as shorter SINE repetitive DNA (such as Alu elements) and longer LINE repetitive DNA. Its function is to disconnect DNA sequences from the homogenizing force of gene conversion events. In effect, this allows individual gene sequences to evolve independently of one another without being homogenized by gene conversion events.
Segmental duplications
Segments of DNA with near-identical sequence. Segmental duplications give rise to low copy repeats and are believed to have played a role in creating new primate genes as reflected in human genetic variation. In humans, chromosomes Y and 22 have the greatest proportion of SDs: 50.4% and 11.9% respectively.
snRNPs (small nuclear ribonucleoproteins)
RNA-protein complexes that combine with unmodified pre-mRNA and other proteins to form a spliceosome, a large RNA-protein molecular complex upon which splicing of pre-mRNA occurs. The action of snRNPs is essential to the removal of introns from pre-mRNA.
Locus Control Region (LCR)
Regions of DNA that are defined by their ability to enhance expression of linked genes in specific tissues in a copy number-dependent manner (expression scales with the number of linked gene copies). The most studied example is the β-globin locus control region, which acts only in bone marrow, where the genes it controls produce hemoglobin. The concept derives from the idea that developmental and cell lineage-specific regulation of gene expression relies not only on gene-proximal elements such as promoters, enhancers, and silencers, but also on long-range interactions of various cis-regulatory elements and dynamic chromatin alterations. The LCR not only enhances transcription but also has dominant chromatin opening activity. The LCR appears to stimulate transcription at an event downstream of promoter remodeling, although the molecular mechanism of LCR function has not been determined.
Distal
Remote from the point of attachment or origin; as, the distal end of a bone or muscle.
Microsatellite - Short Tandem Repeats - VNTR
Repeated sequences of 2-6 base pairs (bp) of DNA.
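A minimal sketch of how such repeats could be spotted in a sequence, using a regular expression with a backreference. The function name, the example sequences, and the threshold of at least 3 tandem copies are assumptions for illustration, not part of the definition.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for tandem repeats of a 2-6 bp unit.

    The lazy quantifier makes the regex try the shortest repeat unit first.
    Note: a run of a single base (e.g. AAAAAA) is reported with unit 'AA'.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Toy sequences (hypothetical): an AC repeat and a GATA repeat.
print(find_microsatellites("TTGACACACACAGG"))
print(find_microsatellites("GATAGATAGATA"))
```

The first call reports four copies of AC starting at position 3; the second reports three copies of GATA starting at position 0.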
Noncoding spacer DNA
Separates genes from each other with long gaps, so that a mutation in one gene or part of a chromosome, such as a deletion or insertion, does not cause a frameshift that affects neighboring genes. When genome complexity is relatively high, as in the human genome, gaps exist not only between different genes but also within a gene, where introns separate the coding segments and help minimize the changes caused by mutation.
Simple-sequence repeats (SSR)
Tandem arrays of short sequences, from 1 to 500 nucleotides. Similar to micro- and minisatellites, have a different specification for base pair length.
Polyadenylation
The addition of a poly(A) tail to an RNA molecule. The poly(A) tail consists of multiple adenosine monophosphates; in other words, it is a stretch of RNA that has only adenine bases. In eukaryotes, polyadenylation is part of the process that produces mature messenger RNA (mRNA) for translation. It, therefore, forms part of the larger process of gene expression.
Histone methylation and demethylation
The addition or removal of methyl groups on a histone substrate. Typically 1, 2, or 3 methyl groups are added or removed at a time. Methylation has different effects on different genes, and the effect can also depend on the level of methylation (for example, 1 additional group versus 2).
DNA polymerase III
The primary enzyme complex involved in prokaryotic DNA replication. It was discovered by Thomas Kornberg (son of Arthur Kornberg) and Malcolm Gefter in 1970. The complex has high processivity (i.e. the number of nucleotides added per binding event) and, specifically referring to the replication of the E. coli genome, works in conjunction with four other DNA polymerases (Pol I, Pol II, Pol IV, and Pol V).
Trans-acting element
Usually a DNA sequence that contains a gene. This gene codes for a protein (or microRNA or other diffusible molecule) that will be used in the regulation of another target gene. The trans-acting gene may be on the same chromosome as the target gene, but the activity is via the intermediary protein or RNA that it encodes. Cis-acting elements, on the other hand, do not code for protein or RNA.
Tandem repeats - Satellite DNA
Very large arrays of tandemly repeating, non-coding DNA. Satellite DNA is the main component of functional centromeres, and forms the main structural constituent of heterochromatin. The name "satellite DNA" refers to how repetitions of a short DNA sequence tend to produce a different frequency of the nucleotides adenine, cytosine, guanine and thymine, and thus have a different density from bulk DNA - such that they form a second or 'satellite' band when genomic DNA is separated on a density gradient.
Minisatellite - Short Tandem Repeats - VNTR
When between 10 and 60 nucleotides are repeated, it is called a minisatellite.
α, β subunit hemoglobin
α and β subunits of hemoglobin are encoded by gene families; different members of these families are expressed in embryonic, fetal, and adult tissues. Gene families are thought to have arisen by duplication of an original ancestral gene, followed by mutation and divergence of different family members. The result is proteins optimized for different functions. Hemoglobin consists of two α and two β subunits, non-covalently bound; each α subunit has 141 amino acid residues and each β subunit has 146.
Gene silencing
A general term describing epigenetic processes of gene regulation. The term gene silencing is generally used to describe the "switching off" of a gene by a mechanism other than genetic modification. That is, a gene which would be expressed ("turned on") under normal circumstances is switched off by machinery in the cell. Silencing can act at the transcriptional level or post-transcriptionally, when an mRNA is prevented from being translated into protein.
LINE (Long Interspersed Nuclear Elements)
A group of genetic elements that are found in large numbers in eukaryotic genomes. They are transcribed (or are the evolutionary remains of what was once transcribed) to an RNA using an RNA polymerase II promoter that resides inside the LINE. LINEs code for the enzyme reverse transcriptase, and many LINEs also code for an endonuclease. Part of the interspersed repetitive DNA system that allows individual gene sequences to evolve independent from one another without being homogenized by gene conversion events.
Histone Modification
A huge catalogue of histone modifications has been described, but a functional understanding of most is still lacking. Collectively, it is thought that histone modifications may underlie a histone code, whereby combinations of histone modifications have specific meanings. However, most functional data concerns individual prominent histone modifications that are biochemically amenable to detailed study. (A quick review indicates that modifications can be beneficial, harmful, or neutral.)
mRNA
A large family of RNA molecules that convey genetic information from DNA to the ribosome, where they specify the amino acid sequence of the protein products of gene expression. Following transcription of mRNA by RNA polymerase, the mRNA is translated into a polymer of amino acids: a protein, as summarized in the central dogma of molecular biology. This process of translation of codons into amino acids requires two other types of RNA: Transfer RNA (tRNA), that mediates recognition of the codon and provides the corresponding amino acid, and ribosomal RNA (rRNA), that is the central component of the ribosome's protein-manufacturing machinery.
Variable number tandem repeat (VNTR)
A location in a genome where a short nucleotide sequence is organized as a tandem repeat. VNTRs are the class of clustered tandem repeats that exhibit allelic variation in their lengths. Two main forms: minisatellites and microsatellites.
Genomic duplication
A major mechanism through which new genetic material is generated during molecular evolution. It can be defined as any duplication of a region of DNA that contains a gene; it may occur as an error in homologous recombination, a retrotransposition event, or duplication of an entire chromosome.
Exon shuffling
A molecular mechanism for the formation of new genes. It is a process through which two or more exons from different genes can be brought together ectopically, or the same exon can be duplicated, to create a new exon-intron structure. There are different mechanisms through which exon shuffling occurs: transposon mediated exon shuffling, crossover during sexual recombination of parental genomes and illegitimate recombination.
Ribonucleoproteins
A nucleoprotein that contains RNA, i.e. it is an association that combines ribonucleic acid and protein together. A few known examples include the ribosome, the enzyme telomerase, vault ribonucleoproteins, RNase P, hnRNP and small nuclear RNPs (snRNPs), which are implicated in pre-mRNA splicing (spliceosome) and are among the main components of the nucleolus. RNP in snRNPs has an RNA-binding motif in its RNA-binding protein. Aromatic amino acid residues in this motif result in stacking interactions with RNA. Lysine residues in the helical portion of RNA-binding proteins help to stabilize interactions with nucleic acids. This nucleic acid binding is strengthened by electrostatic attraction between the positive lysine side chains and the negative nucleic acid phosphate backbones.
ENCODE (Encyclopedia Of DNA Elements)
A project to identify all functional elements in the human genome sequence. Pilot and technological phases have been a success.
Cis-regulatory elements
A region of DNA or RNA that regulates the expression of genes located on that same molecule of DNA. Cis-elements may be located in 5' or 3' untranslated regions or within introns.
Telomeres
A region of repetitive nucleotide sequences at each end of a chromatid, which protects the end of the chromosome from deterioration or from fusion with neighboring chromosomes. Telomeres are required for the replication of linear DNA molecules. The sequences are similar among eukaryotes, with repeats containing clusters of G residues on one strand. They are repeated hundreds or thousands of times and end with a 3′ overhang of single-stranded DNA. Telomere regions deter the degradation of genes near the ends of chromosomes by allowing chromosome ends to shorten, which necessarily occurs during chromosome replication. Without telomeres, the genomes would progressively lose information and be truncated after cell division because the synthesis of Okazaki fragments requires RNA primers attaching ahead on the lagging strand. Over time, with each cell division, the telomere ends become shorter.
Alternative splicing
A regulated process during gene expression that results in a single gene coding for multiple proteins. In this process, particular exons of a gene may be included within, or excluded from, the final, processed mRNA produced from that gene. Consequently the proteins translated from alternatively spliced mRNAs will contain differences in their amino acid sequence and, often, in their biological functions. Notably, alternative splicing allows the human genome to direct the synthesis of many more proteins than would be expected from its 20,000 protein-coding genes. Alternative splicing is sometimes termed differential splicing, but it does not increase gene expression.
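To illustrate how one gene can yield several mRNAs, the sketch below enumerates the exon combinations that arise when some exons may be skipped. It is a simplified model: the exon names and the choice of which exons are optional are hypothetical, and real splicing is regulated rather than exhaustive.

```python
from itertools import product

def isoforms(exons, optional):
    """List every exon combination when the exons in `optional` may be skipped."""
    results = []
    for choice in product([True, False], repeat=len(optional)):
        # Exons whose flag is False are left out of this isoform.
        skipped = {e for e, keep in zip(optional, choice) if not keep}
        results.append([e for e in exons if e not in skipped])
    return results

# A toy gene with four exons, two of which may be included or excluded.
for isoform in isoforms(["E1", "E2", "E3", "E4"], optional=["E2", "E3"]):
    print("-".join(isoform))
```

Two optional exons give four possible mRNAs from the same gene, which is why the protein count can exceed the gene count.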
Ectopic Chromatin
A section of chromosome (chromatin) appearing in a locus other than its typical location.
Hemoglobin gene family
A set of gene clusters that encode the different forms of hemoglobin (these forms are expressed through different phases of development). Consists of alpha and beta gene clusters containing the alpha, beta, gamma, and delta gene forms of hemoglobin. (Alpha cluster contains alpha, beta cluster contains rest.)
Cotyledon
A significant part of the embryo within the seed of a plant. Upon germination, the cotyledon may become the embryonic first leaves of a seedling. The number of cotyledons present is one characteristic used by botanists to classify the flowering plants (angiosperms). Species with one cotyledon are called monocotyledonous ("monocots"). Plants with two embryonic leaves are termed dicotyledonous ("dicots").
microRNA (miRNA)
A small non-coding RNA molecule (~22 nucleotides) found in plants and animals, which functions in transcriptional and post-transcriptional regulation of gene expression. Encoded by eukaryotic nuclear DNA, miRNAs function via base-pairing with complementary sequences within mRNA molecules, usually resulting in gene silencing via translational repression or target degradation. miRNAs are predicted to control the translational activity of approximately 30% of all protein-coding genes in mammals and may be vital components in the progression or treatment of various diseases including cancer, cardiovascular disease, and the immune system response to infection.
Heterochromatin
A tightly packed form of DNA, which comes in different varieties. These varieties lie on a continuum between the two extremes of constitutive and facultative heterochromatin. Both play a role in the expression of genes, where constitutive heterochromatin can affect the genes near it (position-effect variegation) and where facultative heterochromatin is the result of genes that are silenced through a mechanism such as histone methylation or siRNA through RNAi. Constitutive heterochromatin is usually repetitive and serves structural functions such as centromeres or telomeres, in addition to acting as an attractor for other gene-expression or repression signals. Facultative heterochromatin is not repetitive and although it shares the compact structure of constitutive heterochromatin, facultative heterochromatin can, under specific developmental or environmental signaling cues, lose its condensed structure and become transcriptionally active.
Frameshift mutation
A type of gene mutation in which the addition or deletion of one or more nucleotides (in a number not divisible by three) shifts the reading frame of the codons in the mRNA, and thus may alter the amino acid sequence of the translated protein.
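The shift in reading frame is easy to see by splitting a sequence into triplets before and after a deletion. This is a minimal sketch; the mRNA sequence and the deleted position are made up for illustration.

```python
def codons(seq):
    """Split a sequence into consecutive triplets, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

original = "AUGGCCUUAGGAUAA"          # hypothetical mRNA
mutant = original[:4] + original[5:]  # delete the single nucleotide at index 4

print(codons(original))  # ['AUG', 'GCC', 'UUA', 'GGA', 'UAA']
print(codons(mutant))    # ['AUG', 'GCU', 'UAG', 'GAU']
```

Every codon downstream of the deletion changes, not just the one that lost a base.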
Provirus
A virus genome that is integrated into the DNA of a host cell. This state can be a stage of virus replication, or a state that persists over longer periods of time as either inactive viral infections or an endogenous retrovirus. In inactive viral infections the virus does not replicate itself directly but is copied through replication of its host cell. This state can last over many host cell generations.
tRNA (transfer RNA)
An adaptor molecule composed of RNA, typically 73 to 94 nucleotides in length, that serves as the physical link between the nucleotide sequence of nucleic acids (DNA and RNA) and the amino acid sequence of proteins. It does this by carrying an amino acid to the protein synthetic machinery of a cell (ribosome) as directed by a three-nucleotide sequence (codon) in a messenger RNA (mRNA).
Primase
An enzyme involved in the replication of DNA. Primase catalyzes the synthesis of a short RNA (or DNA in some organisms) segment called a primer complementary to a ssDNA template. Primase is of key importance in DNA replication because no known DNA polymerases can initiate the synthesis of a DNA strand without an initial RNA or DNA primer (for temporary DNA elongation).
DNA polymerase I
An enzyme that participates in the process of DNA replication. It was the first known DNA polymerase (and, indeed, the first known of any kind of polymerase). It was initially characterized in E. coli, although it is ubiquitous in prokaryotes. In E. coli and many other bacteria, the gene that encodes Pol I is known as polA. The E. coli form of the enzyme is composed of 928 amino acids, and is an example of a processive enzyme - it can sequentially catalyze multiple polymerisations.
RNA polymerase
An enzyme that produces RNA. In cells, RNAP is necessary for constructing RNA chains using DNA genes as templates, a process called transcription. RNA polymerase enzymes are essential to life and are found in all organisms and many viruses. In chemical terms, RNAP is a nucleotidyl transferase that polymerizes ribonucleotides at the 3' end of an RNA transcript.
DNA methylation
An epigenetic change that is heritable but does not change the actual sequence of nucleotides. Therefore, methylation can be reversed. Cell type and tissue specific methylation patterns are established during early development. Methylation can either be normal or abnormal depending on the cell type. For example, the sperm genome has plenty of methylation, whereas the oocyte genome does not. After fertilization, genes are de-methylated only to be re-methylated later before implantation. Problems arise, however, when methylation becomes a genome-wide phenomenon, an important feature of cancer, or even when it has a regional effect on critical genes such as the methylation of tumor suppressor genes like p21. In situations such as this, DNA methylation can lead to genetic instability. Typically, methylation down-regulates the expression of a gene. The negative effect this can have on the body is compounded especially when the aberrant methylation of CpG islands occurs in an important gene such as p21. If p21 protein loses its function, then it becomes increasingly difficult to keep the body's proliferating cells in check. In fact, it is CpG hypermethylation that is associated with gene silencing in cancer. A methyl group may be added to cytosine to form 5-methylcytosine. This process is known as DNA methylation and occurs mainly at cytosines that are followed by a guanine (5' CG 3'). Many genes in the human genome have upstream CG-rich regions called CpG islands. DNA methylation of a gene's CpG island represses gene expression.
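The 5' CG 3' sites mentioned above can be located with a simple scan. This sketch only finds candidate positions in a toy sequence; it says nothing about whether a given site is actually methylated, and the function name is an assumption.

```python
def cpg_sites(seq):
    """Return 0-based positions of every cytosine followed by a guanine (5'-CG-3')."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

# Toy sequence (hypothetical): three CpG dinucleotides.
print(cpg_sites("ACGTCGGCATCG"))  # [1, 4, 10]
```

A region where such sites are unusually dense is the kind of CG-rich stretch the card calls a CpG island.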
Exons
An exon is any nucleotide sequence encoded by a gene that remains present within the final mature RNA product of that gene after introns have been removed by RNA splicing. The term exon refers to both the DNA sequence within a gene and to the corresponding sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently joined to one another as part of generating the mature mRNA or noncoding RNA product of a gene.
Pre-mRNA (hnRNA)
An immature single strand of mRNA. Pre-mRNA is synthesized from a DNA template in the cell nucleus by transcription. The next step in the creation of mature mRNA is splicing, which removes introns from the hnRNA.
Introns
An intron is any nucleotide sequence within a gene that is removed by RNA splicing while the final mature RNA product of a gene is being generated. The term intron refers to both the DNA sequence within a gene and the corresponding sequence in RNA transcripts. Sequences that are joined together in the final mature RNA after RNA splicing are exons.
Annealing
Annealing, in genetics, means for complementary sequences of single-stranded DNA or RNA to pair by hydrogen bonds to form a double-stranded polynucleotide.
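Whether two DNA strands can anneal along their full length comes down to one being the reverse complement of the other. The sketch below assumes both strands are written 5' to 3' and requires a perfect match; the function names are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the strand that would base-pair with `seq`, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))

def can_anneal(strand_a, strand_b):
    """True if the two strands (both written 5'->3') are fully complementary."""
    return strand_b.upper() == reverse_complement(strand_a)

print(reverse_complement("ATGC"))   # GCAT
print(can_anneal("ATGC", "GCAT"))   # True
print(can_anneal("ATGC", "ATGC"))   # False
```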
Transgenic crops
Plants whose DNA has been modified using genetic engineering techniques to resist pests and other agents that harm plants, and to improve growth, increasing farmers' efficiency.
β-globin gene cluster
Beta globin (HBB, β-globin protein), along with alpha globin (HBA), makes up the most common form of hemoglobin in adult humans. The normal adult hemoglobin tetramer consists of two alpha chains and two beta chains. The gene is located in the β-globin locus. Expression of beta globin and the neighboring globins in the β-globin locus is controlled by a single locus control region (LCR).
Gene Conversion
Can be viewed as the force acting to create sequence identity within the gene pool of a species. This is a cohesive force acting to match up DNA sequences of individual organisms that comprise a species. In effect the gene conversion causes the DNA sequences to clump together within a species and by doing so creates the natural boundaries between species. A form of genetic recombination, in which one version of a gene (allele) is observed to replace a different version. At the level of DNA, gene conversion differs from typical homologous recombination in that a duplicate copy of the donor allele appears in the recipient DNA, while the recipient allele is lost.
Proximal
Closer to the point of attachment or observation.
Human β-globin locus
Composed of five genes located on a short region of chromosome 11, responsible for the creation of the beta parts (roughly half) of the oxygen transport protein hemoglobin. This locus contains not only the beta globin gene but also delta, gamma-A, gamma-G, and epsilon globin. Expression of all of these genes is controlled by a single locus control region (LCR), and the genes are differentially expressed throughout development.
cDNA
DNA synthesized from a messenger RNA (mRNA) template in a reaction catalyzed by the enzymes reverse transcriptase and DNA polymerase. cDNA is often used to clone eukaryotic genes in prokaryotes. When scientists want to express a specific protein in a cell that does not normally express that protein (i.e., heterologous expression), they will transfer the cDNA that codes for the protein to the recipient cell. cDNA is also produced by retroviruses (such as HIV-1, HIV-2, Simian Immunodeficiency Virus, etc.); this cDNA is integrated into the host's genome, where it forms a provirus.
Acetylation
Describes a reaction that introduces an acetyl functional group into a chemical compound. (Deacetylation is the removal of the acetyl group.) Specifically, acetylation refers to the process of introducing an acetyl group (resulting in an acetoxy group) into a compound, namely, the substitution of an acetyl group for an active hydrogen atom. A reaction involving the replacement of the hydrogen atom of a hydroxyl group with an acetyl group (CH3CO) yields a specific ester, the acetate.
Enhancer sequences
Short regions of DNA that can be bound by proteins (trans-acting factors, such as a set of transcription factors) to enhance transcription levels of genes in a gene cluster. Enhancer sequences may exert their effects on genes at a considerable distance.
Histone acetylase (histone acetylation) (transcriptional activation)
Enzymes that acetylate conserved lysine amino acids on histone proteins by transferring an acetyl group from acetyl CoA to form ε-N-acetyllysine. DNA is wrapped around histones, and by transferring an acetyl group to the histones, genes can be turned on and off. Histone acetylation increases gene expression. In general, histone acetylation is linked to transcriptional activation and associated with euchromatin. Research has emerged to show that lysine acetylation and other posttranslational modifications of histones generate binding sites for specific protein-protein interaction domains, such as the acetyllysine-binding bromodomain.
DNA methyltransferase (DNMT)
Enzymes that catalyze the transfer of a methyl group to DNA. DNA methylation occurs mainly at the C5 position of CpG dinucleotides and is carried out by two general classes of enzymatic activities - maintenance methylation and de novo methylation. Maintenance methylation activity is necessary to preserve DNA methylation after every cellular DNA replication cycle. Without the DNA methyltransferase (DNMT), the replication machinery itself would produce daughter strands that are unmethylated and, over time, would lead to passive demethylation. Since many tumor suppressor genes are silenced by DNA methylation during carcinogenesis, there have been attempts to re-express these genes by inhibiting the DNMTs.
Euchromatin
Euchromatin is a lightly packed form of chromatin (DNA, RNA and protein) that is rich in gene concentration, and is often (but not always) under active transcription. Euchromatin comprises the most active portion of the genome within the cell nucleus. 92% of the human genome is euchromatic. (As opposed to heterochromatin.) The structure of euchromatin is reminiscent of an unfolded set of beads along a string, wherein those beads represent nucleosomes. Nucleosomes consist of eight proteins known as histones, with approximately 147 base pairs of DNA wound around them; in euchromatin, this wrapping is loose so that the raw DNA may be accessed. Each core histone possesses a 'tail' structure, which can vary in several ways; it is thought that these variations act as "master control switches," which determine the overall arrangement of the chromatin. In particular, it is believed that the presence of methylated lysine 4 on the histone tails acts as a general marker for euchromatin.
Extensin
Extensins are a family of flexuous, rodlike, hydroxyproline-rich glycoproteins (HRGPs) of the plant cell wall. They are highly abundant proteins. There are around 20 extensins in Arabidopsis thaliana. They form crosslinked networks in the young cell wall. Typically they have two major diagnostic repetitive peptide motifs, one hydrophilic and the other hydrophobic, with potential for crosslinking.
Retrotransposons (also called transposons via RNA intermediates)
Genetic elements that can amplify themselves in a genome and are ubiquitous components of the DNA of many eukaryotic organisms. They are a subclass of transposon.
Hydrophobicity
Hydrophobicity is the physical property of a molecule (known as a hydrophobe) that is repelled from a mass of water.
Noncoding sequences in DNA
In genomics and related disciplines, noncoding DNA sequences are components of an organism's DNA that do not encode protein sequences. Some noncoding DNA is transcribed into functional noncoding RNA molecules (e.g. transfer RNA, ribosomal RNA, and regulatory RNAs), while others are not transcribed or give rise to RNA transcripts of unknown function. The amount of noncoding DNA varies greatly among species. For example, over 98% of the human genome is noncoding DNA, while only about 2% of a typical bacterial genome is noncoding DNA. Some sequences may have no biological function for the organism, such as endogenous retroviruses. However, many types of noncoding DNA sequences do have important biological functions, including the transcriptional and translational regulation of protein-coding sequences. Other noncoding sequences have likely, but as-yet undetermined, functions.
(gene activity) Splicing
In molecular biology and genetics, splicing is a modification of the nascent pre-mRNA taking place after or concurrently with its transcription, in which introns are removed and exons are joined. This is needed for the typical eukaryotic mRNA before it can be used to produce a correct protein through translation. For many eukaryotic introns, splicing is done in a series of reactions which are catalyzed by the spliceosome, a complex of small nuclear ribonucleoproteins (snRNPs), but there are also self-splicing introns.
DNA ligase
In molecular biology, DNA ligase is a specific type of enzyme, a ligase, that facilitates the joining of DNA strands together by catalyzing the formation of a phosphodiester bond. It plays a role in repairing single-strand breaks in duplex DNA in living organisms, but some forms (such as DNA ligase IV) may specifically repair double-strand breaks (i.e. a break in both complementary strands of DNA). Single-strand breaks are repaired by DNA ligase using the complementary strand of the double helix as a template, with DNA ligase creating the final phosphodiester bond to fully repair the DNA. DNA ligase has applications in both DNA repair and DNA replication. In addition, DNA ligase has extensive use in molecular biology laboratories for genetic recombination experiments. Purified DNA ligase is used in gene cloning to join DNA molecules together to form recombinant DNA.
5' cap (modified guanine nucleotide)
In molecular biology, the 5′ cap is a specially altered nucleotide on the 5′ end of precursor messenger RNA and some other primary RNA transcripts as found in eukaryotes. The process of 5′ capping is vital to creating mature mRNA, which is then able to undergo translation. Capping ensures the mRNA's stability while it undergoes translation in the process of protein synthesis, and is a highly regulated process that occurs in the cell nucleus. Because this only occurs in the nucleus, mitochondrial and chloroplast mRNA are not capped.
UTR UnTranslated Region
In molecular genetics, an untranslated region (or UTR) refers to either of two sections on each side of a coding sequence on a strand of mRNA. If it is found on the 5' side, it is called the 5' UTR (or leader sequence), or if it is found on the 3' side, it is called the 3' UTR (or trailer sequence).
Acetyl
In organic chemistry, acetyl is a functional group, the acyl with chemical formula COCH3. The acetyl group contains a methyl group single-bonded to a carbonyl (a carbon double bonded to a single oxygen atom). The carbonyl center of an acyl radical has one nonbonded electron with which it forms a chemical bond to the remainder R of the molecule.
Processed pseudogene
Inactive copies of a gene that lack introns and the normal sequences that direct transcription. They typically arise when an mRNA is reverse-transcribed and the resulting DNA copy is inserted back into the genome (in humans, sometimes as the result of an Alu sequence or other SINE insertion).
Gene family
Members of a group of related genes may be transcribed in different tissues or at different stages of development. Gene families are thought to have arisen by duplication of an original ancestral gene, followed by mutation and divergence of different family members. The result is proteins optimized for different functions. E.g., fetal globins have a higher affinity for O2 than do adult globins.
Alu sequences
Short stretches of DNA originally characterized by the action of the Alu (Arthrobacter luteus) restriction endonuclease, which cleaves this particular sequence. Modern Alu elements are about 300 base pairs long and are therefore classified as short interspersed elements (SINEs). They are the most abundant transposable elements in the human genome. Alu insertions have been implicated in several inherited human diseases and in various forms of cancer. Alu elements are retrotransposons and look like DNA copies made from RNA polymerase III-transcribed RNAs. Alu elements do not encode protein products and depend on LINE retrotransposons for their replication.
Pseudogenes
Nonfunctional gene copies. There are more than 20,000 pseudogenes in the human genome. DNA sequences, related to known genes, that have lost their protein-coding ability or are otherwise no longer expressed in the cell. Pseudogenes arise from retrotransposition or genomic duplication of functional genes, and become "genomic fossils" that are nonfunctional due to mutations that prevent the transcription of the gene, such as within the gene promoter region, or fatally alter the translation of the gene, such as premature stop codons or frameshifts.
SINE (Short Interspersed Nuclear Element)
Short DNA sequences (<500 bases) that represent reverse-transcribed copies of RNA molecules originally transcribed by RNA polymerase III (tRNA, 5S ribosomal RNA, and other small nuclear RNAs). These copies have been integrated into the stable genome of a species as part of the interspersed repetitive DNA system, which allows individual gene sequences to evolve independently of one another (within the same species) without being homogenized by gene conversion events. SINEs do not encode a functional reverse transcriptase protein and rely on other mobile elements for transposition. The most common SINEs in primates are called Alu sequences.
snRNA (Small nuclear ribonucleic acid) (U-RNA)
Small nuclear ribonucleic acid (snRNA), also commonly referred to as U-RNA, is a class of small RNA molecules that are found within the nucleus of eukaryotic cells. The length of an average snRNA is approximately 150 nucleotides. Their primary function is in the processing of pre-mRNA (hnRNA) in the nucleus, e.g. snRNA in the spliceosome. Introns can contain genes for small nuclear RNA.
Spacer sequences
Spacer DNA consists of regions of non-transcribed DNA between tandemly repeated genes, such as ribosomal RNA genes in eukaryotes. Its function most likely involves ensuring the high rates of transcription associated with these genes. In bacteria, spacer DNA sequences are only a few nucleotides long. In eukaryotes, they can be extensive and include repetitive DNA, comprising the majority of the DNA of the genome.
C-value
The amount, in picograms, of DNA contained within a haploid nucleus (e.g. a gamete) or one half the amount in a diploid somatic cell of a eukaryotic organism. 1 pg = 978 Mb (megabases), so DNA content (pg) = genome size (bp) / (0.978 × 10^9). In some cases (notably among diploid organisms), the terms C-value and genome size are used interchangeably; in polyploids, however, the C-value may represent two or more genomes contained within the same nucleus. The current estimates for human female and male diploid genome sizes are 6.406 × 10^9 bp and 6.294 × 10^9 bp, respectively (female diploid genomes are larger than male ones because females have two X chromosomes, whereas males have one X and one Y chromosome, and the Y chromosome is much smaller than the X chromosome). By this conversion formula, diploid human female and male nuclei in G1 phase of the cell cycle should contain 6.550 and 6.436 pg of DNA, respectively.
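The conversion on this card can be checked with a few lines of Python. This is only a sketch of the arithmetic; the function names bp_to_pg and pg_to_bp are made up for illustration, and the conversion factor is the one quoted on the card (1 pg = 978 Mb).

```python
# Conversion factor quoted on the card: 1 pg of DNA = 978 Mb = 0.978 x 10^9 bp.
BP_PER_PG = 0.978e9


def bp_to_pg(genome_size_bp):
    """DNA content (pg) = genome size (bp) / (0.978 x 10^9)."""
    return genome_size_bp / BP_PER_PG


def pg_to_bp(dna_content_pg):
    """Inverse conversion: genome size (bp) = DNA content (pg) x 0.978 x 10^9."""
    return dna_content_pg * BP_PER_PG


# Diploid human genome sizes quoted on the card.
female_pg = bp_to_pg(6.406e9)
male_pg = bp_to_pg(6.294e9)
print(f"female: {female_pg:.3f} pg, male: {male_pg:.3f} pg")
# prints "female: 6.550 pg, male: 6.436 pg"
```

Halving either diploid figure gives the corresponding C-value (haploid DNA content) for that nucleus.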
rRNA (ribosomal RNA)
The RNA component of the ribosome and the central component of its protein-manufacturing machinery; it is essential for protein synthesis in all living organisms. Ribosomes contain two major rRNAs and 50 or more proteins. The LSU and SSU rRNAs are found within the large and small ribosomal subunits, respectively. The LSU rRNA acts as a ribozyme, catalyzing peptide bond formation. rRNA sequences are widely used for working out evolutionary relationships among organisms, since they are of ancient origin and are found in all known forms of life.
RNA Processing and splicing
The conversion of precursor messenger RNA (pre-mRNA) into mature messenger RNA (mRNA), which includes splicing and occurs prior to protein synthesis. This process is vital for the correct expression of eukaryotic genes because the primary RNA transcript produced by transcription contains both exons, which are the coding sections of the transcript, and introns, which are the non-coding sections. Three steps: 1. Capping - the addition of 7-methylguanosine (m7G) to the 5' end of the pre-mRNA. 2. Polyadenylation - the addition of a poly(A) tail to an RNA molecule. The poly(A) tail consists of multiple adenosine monophosphates; in other words, it is a stretch of RNA that has only adenine bases. In eukaryotes, polyadenylation is part of the process that produces mature mRNA for translation and therefore forms part of the larger process of gene expression. 3. Splicing - the process by which introns, regions of RNA that do not code for protein, are removed from the pre-mRNA and the remaining exons are connected to re-form a single continuous molecule. Although most RNA splicing occurs after the complete synthesis and end-capping of the pre-mRNA, transcripts with many exons can be spliced co-transcriptionally. The splicing reaction is catalyzed by a large complex called the spliceosome, assembled from proteins and small nuclear RNA molecules (snRNA) that recognize splice sites in the pre-mRNA sequence. Many pre-mRNAs, including those encoding antibodies, can be spliced in multiple ways to produce different mature mRNAs that encode different protein sequences. This process is known as alternative splicing, and it allows production of a large variety of proteins from a limited amount of DNA.
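As a study aid, splicing and alternative splicing can be pictured as string operations. The sketch below is a toy model only: the transcript, the exon coordinates, and the function name splice are all invented for illustration, and real splice-site recognition by the spliceosome is far more involved. Exons are written in uppercase and introns in lowercase, with each toy intron starting with "gu" and ending with "ag".

```python
def splice(pre_mrna, exons):
    """Join the exon segments in order, dropping everything between them.

    exons is a list of (start, end) pairs, 0-based, end-exclusive.
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy pre-mRNA: three exons separated by two introns.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "GAUUAC" + "guaagcccccag" + "UGGUAA"
exons = [(0, 6), (18, 24), (36, 42)]

# Constitutive splicing keeps all three exons.
full_mrna = splice(pre_mrna, exons)

# Alternative splicing (exon skipping): the middle exon is left out,
# so the same pre-mRNA yields a different mature mRNA.
skipped_mrna = splice(pre_mrna, [exons[0], exons[2]])

print(full_mrna)     # AUGGCCGAUUACUGGUAA
print(skipped_mrna)  # AUGGCCUGGUAA
```

The two outputs differ only in which exons were joined, which is the sense in which one gene can encode several proteins.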
Piwi-interacting RNA
The largest class of small non-coding RNA molecules expressed in animal cells. piRNAs form RNA-protein complexes through interactions with piwi proteins. These piRNA complexes have been linked to both epigenetic and post-transcriptional gene silencing of retrotransposons and other genetic elements in germ line cells, particularly those in spermatogenesis. They are distinct from microRNA (miRNA) in size (26-31 nt rather than 21-24 nt), lack of sequence conservation, and increased complexity.
Chicken ovalbumin gene
The main protein found in egg white, making up 60-65% of the total protein. Ovalbumin displays sequence and three-dimensional homology to the serpin superfamily, but unlike most serpins it is not a serine protease inhibitor. The function of ovalbumin is unknown, although it is presumed to be a storage protein.
Replication fork
The replication fork is a structure that forms within the nucleus during DNA replication. It is created by helicases, which break the hydrogen bonds holding the two DNA strands together. The resulting structure has two branching "prongs", each one made up of a single strand of DNA. These two strands serve as the templates for the leading and lagging strands, which will be created as DNA polymerase matches complementary nucleotides to the templates; the templates may be properly referred to as the leading strand template and the lagging strand template.
Plumule
The rudimentary terminal bud of a plant embryo situated at the end of the hypocotyl, consisting of the epicotyl and often of immature leaves.
Polyadenylation signal
The sequence motif recognized by the RNA cleavage complex - varies between groups of eukaryotes. Most human polyadenylation sites contain the AAUAAA sequence, but this sequence is less common in plants and fungi.
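A quick way to get a feel for this card is to scan a sequence for the AAUAAA motif. The sketch below is a plain substring search, not a model of how the cleavage complex actually chooses a site; the function name find_polya_signals and the example 3' UTR fragment are made up for illustration.

```python
def find_polya_signals(sequence, motif="AAUAAA"):
    """Return the 0-based start positions of every occurrence of motif.

    DNA input is accepted too: T is converted to U before searching.
    """
    rna = sequence.upper().replace("T", "U")
    positions = []
    start = rna.find(motif)
    while start != -1:
        positions.append(start)
        # Resume one base further on so overlapping matches are not missed.
        start = rna.find(motif, start + 1)
    return positions


# Hypothetical 3' UTR fragment containing one canonical signal.
utr = "GCUUGCAAUAAAGUCUGAGUGGGCGGCA"
print(find_polya_signals(utr))  # [6]
```

Passing a different motif argument lets the same function look for the variant signals that are more common in plants and fungi.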
Hypocotyl
The stem of a germinating seedling, found below the cotyledons (seed leaves) and above the radicle (root).
Genetic diversity
The total number of genetic characteristics in the genetic makeup of a species. It is distinguished from genetic variability, which describes the tendency of genetic characteristics to vary.
Cross-linking
The use of a small chemical agent with two reactive groups to covalently bond molecules that are in close proximity (e.g., isodityrosine linkages).
Transposons
Transposons are mobile genetic elements that change their physical location through a cut-and-paste transposition mechanism. Unlike retrotransposons, they do not use an RNA intermediate. The transpositions are catalyzed by several transposase enzymes. Some transposases bind non-specifically to any target site in DNA, whereas others bind to specific DNA sequence targets. The transposase makes a staggered cut at the target site, resulting in single-strand 5' or 3' DNA overhangs (sticky ends). This step cuts out the DNA transposon, which is then ligated into a new target site; this process involves the activity of a DNA polymerase that fills in gaps and of a DNA ligase that closes the sugar-phosphate backbone. This results in duplication of the target site. The insertion sites of DNA transposons may be identified by short direct repeats (created by the staggered cut in the target DNA and filling in by DNA polymerase) followed by a series of inverted repeats important for the TE excision by transposase.
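The target-site duplication described on this card can be sketched on a single strand. This is a simplified toy model: the sequences, the 4-base stagger, and the function name insert_transposon are all assumptions made for illustration, and only the top strand is tracked. The bases between the two staggered nicks end up on both sides of the inserted element once the gaps are filled in, which gives the short direct repeats.

```python
def insert_transposon(target, transposon, cut_site, stagger):
    """Insert transposon into target (top strand only).

    The stagger bases starting at cut_site lie between the two nicks;
    after gap filling they flank the element as direct repeats.
    """
    left = target[:cut_site]
    duplicated = target[cut_site:cut_site + stagger]
    right = target[cut_site + stagger:]
    return left + duplicated + transposon + duplicated + right


# Hypothetical sequences. The toy transposon has inverted repeats at its
# ends: CAGG at the left, and CCTG (the reverse complement) at the right.
target = "GGATCCTTAAGC"
transposon = "CAGG" + "ACGTAC" + "CCTG"

result = insert_transposon(target, transposon, cut_site=4, stagger=4)
print(result)  # GGATCCTTCAGGACGTACCCTGCCTTAAGC
```

In the output, CCTT appears immediately before and after the element; these are the direct repeats, and the CAGG...CCTG ends just inside them are the inverted repeats.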