Chapter 5-Genomes, Proteomics, and Systems Biology

Since then, tremendous advances in the technology of

DNA sequencing have been made, and new sequencing methodologies allow rapid and economical sequencing of individual genomes or transcribed RNAs.

The synthetic genome was propagated as

a plasmid in yeast and then introduced into a different mycoplasma subspecies, M. capricolum, by gene transfer techniques

Proteins to be analyzed are digested with

a protease to cleave them into small fragments (peptides) in the range of approximately 20 amino acid residues.

Figure 5.1 The genome of Haemophilus influenzae

Predicted protein-coding regions are designated by colored bars. Numbers indicate base pairs of DNA. (From R. D. Fleischmann et al., 1995. Science 269: 496.)

Mass spectrometry can also be used to

analyze mixtures of proteins, not just single isolated proteins. - In this approach, called "shotgun mass spectrometry," a mixture of cell proteins is digested with a protease, and the complex mixture of peptides is subjected to sequencing by tandem mass spectrometry.

Immunoprecipitated protein complexes can be

analyzed by mass spectrometry to identify not only the protein against which the antibody was directed, but also the other proteins with which it was associated in the cell extract.

Draft sequences of the human genome were produced by two teams of researchers, each using a different approach. Both of these sequences were initially incomplete, in that

approximately 90% of the genome had been sequenced and assembled. - Continued efforts closed the gaps and improved the accuracy of the draft sequences, leading to a more refined, complete human genome sequence in 2004

Individual peptides from the initial mass spectrum are

automatically selected to enter a "collision cell" in which they are partially degraded by random breakage of peptide bonds

One approach is

comparative analysis of the genome sequences of related organisms

Additional methods have been developed to

compare the amounts of proteins in two different samples, allowing quantitative analysis of protein levels in different types of cells or in cells that have been subjected to different treatments

The sensitivity of RNA-seq is

high enough to allow analysis at the single cell level, so the transcriptomes of individual cells can be determined.

Yeast are transformed with

hybrid cDNA clones to test for interactions between the two proteins

One commonly used method for global expression analysis is

hybridization to DNA microarrays, which allow expression of tens of thousands of genes to be analyzed simultaneously.

One approach to analysis of protein complexes is to

isolate a protein from cells under gentle conditions, so that it remains associated with the proteins it normally interacts with inside the cell

An antibody against a protein of interest is used to

isolate that protein from a cell extract by immunoprecipitation

The resulting antigen-antibody complexes are

isolated, and interacting proteins are present together with the target protein in the immunoprecipitates

Metabolic or signaling pathways do not operate in

isolation; rather, there is extensive crosstalk between different pathways, so that multiple pathways interact with one another to form networks within the cell

Once the complete DNA sequence was obtained,

it was analyzed to identify the genes encoding rRNAs, tRNAs and proteins.

In the yeast two-hybrid system, two different cDNAs are

joined to two distinct domains of a protein (the DNA-binding domain and the activation domain of the GAL4 transcription factor) that stimulates expression of a target gene in yeast

The terminator nucleotides are

labelled, each with a different fluorophore.

Proteomics is ________

large-scale analysis of cellular proteins

The identification of all of the genes in an organism opens the possibility for

large-scale systematic analysis of gene function

The Human Genome Project became

the largest collaborative work in biology and yielded an initial draft sequence in 2001, followed by a more refined, complete sequence of the human genome in 2004

Understanding our unique genetic makeup as individuals is expected to

lead to the development of new tailor-made strategies for disease prevention and treatment (precision medicine).

The reversible terminator method with

libraries immobilized on a slide support generates relatively short sequence reads, with a maximum length of 300 bp.

With the availability of complete genome sequences,

libraries of double-stranded RNAs can be designed and used in genome-wide screens to identify all of the genes involved in any biological process that can be assayed in a high-throughput manner.

Computational modeling of such networks is currently

a major challenge in systems biology, and it will be necessary for understanding the dynamic response of cells to their environment

The frequency with which individual sequences are detected in RNA-seq is

proportional to the quantity of RNA in the cell, so this analysis determines the abundance as well as the identity of all transcribed sequences
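
A rough sketch of the counting idea behind this card: if each sequenced read has already been assigned to a gene, the fraction of reads per gene estimates that RNA's relative abundance. The gene names and read list are invented for illustration, and real analyses also correct for factors such as transcript length, which this sketch ignores.

```python
from collections import Counter


def relative_abundance(read_assignments: list[str]) -> dict[str, float]:
    """Return each gene's share of the total reads (hypothetical helper)."""
    counts = Counter(read_assignments)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


# Invented example: 10 reads already assigned to three made-up gene labels.
reads = ["actin"] * 6 + ["gapdh"] * 3 + ["myc"]
abundance = relative_abundance(reads)
for gene, share in sorted(abundance.items()):
    print(f"{gene}: {share:.0%} of reads")
```

The same read list therefore gives both the identity of the transcripts (the keys) and their relative quantity (the values).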

These advances have changed the way scientists think about

structure and function of our genomes, as well as allowing new approaches to disease diagnosis and treatment based on personal genome sequencing.

Alternative approaches to systematic analyses of protein complexes include

screens for protein interactions in vitro as well as genetic screens that detect interactions between pairs of proteins that are introduced into yeast cells.

Genome-wide screens using the CRISPR/Cas9 system have been applied to

systematically identify sets of genes in human cells that are responsible for properties such as survival or resistance to anticancer drugs

A large-scale international project to

systematically knock out all genes in the mouse is under way.

More detailed amino acid sequence information than the mass of the peptides can be obtained by

tandem mass spectrometry

Genome sequencing will allow

therapies to be specifically tailored to the needs of individual patients, both with respect to disease prevention and treatment.

Because of alternative splicing and protein modifications, it is estimated that

these genes can give rise to more than 100,000 different proteins.

A commonly used protease is

trypsin, which cleaves proteins at the carboxy-terminal side of lysine (K) and arginine (R) residues.
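
The cleavage rule on this card can be tried out on a sequence string. This is a minimal sketch, assuming cleavage after every K and R with no exceptions (real trypsin is usually blocked when the next residue is proline, which is ignored here); the function name and the test sequence are made up.

```python
def tryptic_digest(protein: str) -> list[str]:
    """Split a one-letter protein sequence after every K or R (simplified rule)."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in "KR":
            # Trypsin cuts on the carboxy-terminal side, so K/R ends the peptide.
            peptides.append(current)
            current = ""
    if current:
        # Whatever follows the last K/R is the C-terminal peptide.
        peptides.append(current)
    return peptides


print(tryptic_digest("MAGKTLRSPEKVV"))  # ['MAGK', 'TLR', 'SPEK', 'VV']
```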

Reversible terminator sequencing

a type of next-generation sequencing (NGS)

The number of distinct species of proteins in eukaryotic cells is

typically far greater than the number of genes

Not only can the sequences of complete genomes be obtained and analyzed, but it is also now possible to

undertake large-scale analyses of all of the RNAs and proteins expressed in a cell.

A major surprise from the human genome sequence was

the unexpectedly low number of human genes.

Next-generation sequencing

(also called massively parallel sequencing) refers to several different methods in which millions of templates are sequenced in a single reaction.

One approach is to systematically inactivate

(or knockout) each gene in the genome by homologous recombination with an inactive mutant allele

Figure 5.9 Tandem mass spectrometry

- A mixture of peptides is separated in a first mass spectrometer (1). - A randomly selected peptide is then fragmented by collision-induced breakage of peptide bonds. - The fragments, which differ by single amino acids, are then separated in a second mass spectrometer (2). - Since the fragments differ by single amino acids, the amino acid sequence of the peptide can be deduced

Figure 5.16 Genome-wide RNAi screen for cell growth and viability

- Each microwell contains siRNA corresponding to an individual gene. - Tissue culture cells are added to each well and incubated to allow cell growth. - Those wells in which cells fail to grow identify genes required for cell growth or viability

Figure 5.21 A genetic toggle switch

- The circuit includes genes encoding two repressors (A and B) that regulate each other and a reporter controlled by repressor B. - Inactivation of repressor B leads to a stable state in which the reporter is expressed, whereas inactivation of repressor A leads to a stable state in which the reporter is repressed.
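
The two stable states described in this caption can be illustrated with a toy simulation. This is only a sketch under assumptions that are not in the card: the repressors follow a simple dimensionless mutual-repression model, da/dt = alpha/(1 + b^n) - a and db/dt = alpha/(1 + a^n) - b, integrated with plain Euler steps, and alpha, n, dt, and the step count are arbitrary illustrative values.

```python
def simulate_toggle(a0: float, b0: float, alpha: float = 10.0, n: int = 2,
                    dt: float = 0.01, steps: int = 5000) -> tuple[float, float]:
    """Euler-integrate two mutually repressing genes (assumed model, see lead-in)."""
    a, b = a0, b0
    for _ in range(steps):
        da = alpha / (1 + b ** n) - a  # synthesis of A is repressed by B
        db = alpha / (1 + a ** n) - b  # synthesis of B is repressed by A
        a, b = a + dt * da, b + dt * db
    return a, b


# Start with slightly more A than B: the circuit settles with A high and B low,
# so a reporter repressed by B would be expressed.
a, b = simulate_toggle(2.0, 1.0)
print(f"A high state: A={a:.2f}, B={b:.2f}")

# Start with slightly more B than A: the circuit flips to the opposite state.
a, b = simulate_toggle(1.0, 2.0)
print(f"B high state: A={a:.2f}, B={b:.2f}")
```

Whichever repressor starts ahead ends up dominating, which is the "toggle" behavior: the circuit holds one of two states until something pushes it across to the other.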

Figure 5.13 The yeast two-hybrid system

- cDNAs of two human proteins are cloned as fusions with two domains (designated 1 and 2) of a yeast protein that stimulates transcription of a target gene. - The two recombinant cDNAs are introduced into a yeast cell. If the two human proteins interact with each other, they bring the two domains of the yeast protein together. Domain 1 binds DNA sequences at a site upstream of the target gene, and domain 2 stimulates target gene transcription. - The interaction between the two human proteins can thus be detected by expression of the target gene in transformed yeast.

The Arabidopsis genome, approximately

125x10^6 base pairs of DNA, contains approximately 26,000 protein-coding genes - significantly more genes than were found in either C. elegans or Drosophila.

An extensive recent analysis has used almost

14,000 immunofluorescent antibodies to determine the subcellular locations of 12,003 human proteins. - This analysis defined the proteomes of 30 subcellular structures and 13 organelles. (1000 proteins in mitochondria/1500 proteins in the plasma membrane)

The analysis identified

1743 potential protein-coding regions in the H. influenzae genome as well as six copies of rRNA genes and 54 different tRNA genes.

This method is so massively parallel that up to

2,000 Mb of sequence can be obtained per run.

The human genome contains approximately

20,000 different protein-coding genes, and the number of these genes expressed in any given cell is around 10,000.

However, the chemical group that has been attached at

3' portion of the terminator nucleotide can be removed, converting the position to a 3'- OH group, which allows further extension to occur.

The human genome is about

3x10^9 base pairs of DNA

Minimal genome required to support a viable cell encodes only

438 proteins and 35 RNAs

Mice, rats and humans have

90% of their genes in common, so the mouse and rat genome sequences provide essential databases for research.

Genome sequences of humans and chimpanzees are about

99% identical. - Surprisingly, sequence differences between humans and chimpanzees frequently alter the coding sequences of genes, leading to changes in the amino acid sequences of most proteins. - Although many of these amino acid changes may not affect protein function, it appears that there are changes in the structure and expression of thousands of genes between chimpanzees and humans, so identifying the differences that are key to the origin of humans is not a simple task

The C. elegans genome is

97x10^6 base pairs and contains about 19,000 predicted protein-coding sequences - approximately eight times the amount of DNA but only three times the number of genes in yeast.

Figure 5.12 Analysis of protein complexes

A known protein (blue) is isolated from cells as a complex with other interacting proteins (orange and red). The entire complex can be analyzed by mass spectrometry to identify the interacting proteins.

Figure 5.11 Immunoprecipitation

A mixture of cell proteins is incubated with an antibody bound to beads. The antibody forms complexes with the protein (green) against which it is directed (the antigen). These antigen-antibody complexes are collected on the beads and the target protein is isolated

Figure 5.8 Identification of proteins by mass spectrometry

A protein is digested with a protease that cleaves it into small peptides. -The peptides are then ionized and analyzed in a mass spectrometer, which determines the mass-to-charge ratio of each peptide. -The results are displayed as a mass spectrum, which is compared to a database of theoretical mass spectra of all known proteins for protein identification.

Fig. 1. The assembly of a synthetic M. mycoides genome in yeast.

A synthetic M. mycoides genome was assembled from 1078 overlapping DNA cassettes in three steps. - In the first step, 1080-bp cassettes (orange arrows), produced from overlapping synthetic oligonucleotides, were recombined in sets of 10 to produce 109 ~10-kb assemblies (blue arrows). - These were then recombined in sets of 10 to produce 11 ~100-kb assemblies (green arrows). - In the final stage of assembly, these 11 fragments were recombined into the complete genome (red circle). - With the exception of two constructs that were enzymatically pieced together in vitro (white arrows), assemblies were carried out by in vivo homologous recombination in yeast. - Major variations from the natural genome are shown as yellow circles. - These include four watermarked regions (WM1 to WM4), a 4-kb region that was intentionally deleted (94D), and elements for growth in yeast and genome transplantation. - In addition, there are 20 locations with nucleotide polymorphisms (asterisks). - Coordinates of the genome are relative to the first nucleotide of the natural M. mycoides sequence. - The designed sequence is 1,077,947 bp. - The locations of the Asc I and BssH II restriction sites are shown. - Cassettes 1 and 800-810 were unnecessary and removed from the assembly strategy. - Cassette 2 overlaps cassette 1104, and cassette 799 overlaps cassette 811

DNA microarrays

An example of comparative analysis of gene expression in cancer cells and normal cells. mRNAs extracted from cancer cells and normal cells are used as templates for synthesis of cDNAs labeled with a fluorescent dye. The labeled cDNAs are then hybridized to a DNA microarray containing spots of oligonucleotides corresponding to 20,000 or more distinct human genes. The relative level of expression of each gene is indicated by the intensity of fluorescence at each position on the microarray, and the levels of expression in cancer cells and normal cells can be compared. Examples of genes expressed at higher levels in cancer cells are indicated by arrows.
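
The comparison in this caption is often summarized as a log2 ratio of the two fluorescence intensities for each spot. A minimal sketch, assuming intensities are already available as numbers per gene; the gene labels, the values, and the twofold cutoff are invented for illustration.

```python
import math


def log2_fold_changes(cancer: dict[str, float], normal: dict[str, float]) -> dict[str, float]:
    """log2(cancer / normal) for every gene measured with a positive signal in both."""
    return {
        gene: math.log2(cancer[gene] / normal[gene])
        for gene in cancer
        if gene in normal and cancer[gene] > 0 and normal[gene] > 0
    }


def upregulated(cancer: dict[str, float], normal: dict[str, float],
                threshold: float = 1.0) -> list[str]:
    """Genes whose log2 ratio meets the threshold (1.0 means at least twofold higher)."""
    changes = log2_fold_changes(cancer, normal)
    return sorted(gene for gene, value in changes.items() if value >= threshold)


# Invented spot intensities for three made-up genes.
cancer_cells = {"gene_1": 800.0, "gene_2": 100.0, "gene_3": 50.0}
normal_cells = {"gene_1": 200.0, "gene_2": 100.0, "gene_3": 200.0}

print(log2_fold_changes(cancer_cells, normal_cells))
print(upregulated(cancer_cells, normal_cells))  # ['gene_1']
```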

Figure 5.5 Next-generation sequencing

Cellular DNA is fragmented and adapters are ligated to the ends of each fragment. Single molecules are then anchored to a solid surface and amplified by PCR, forming millions of clusters of molecules. Four color-labeled reversible chain terminating nucleotides are added together with DNA polymerase and a primer that recognizes the adapter sequence. Incorporation of a labeled nucleotide into each cluster of DNA molecules is detected by a laser. Unincorporated nucleotides are removed, chain termination is reversed, and the cycle is repeated to obtain the sequences of millions of clusters simultaneously

Figure 5.7 RNA-seq

Cellular mRNAs are reverse transcribed to cDNAs, which are subjected to next-generation sequencing. -The results yield the sequences of all mRNAs in a cell. -The relative amount of each mRNA is indicated by the frequency at which its sequence is represented in the total number of sequences read

The creation of a fully synthetic cell was performed by

Craig Venter and his colleagues in 2010

Figure 5.10 Analysis of subcellular organelle proteomes

Examples of immunofluorescence images used to localize human proteins to subcellular organelles. The number of proteins localized to each organelle is indicated below the image

_________________________ was the first system designed by synthetic biologists, in 2000

A gene regulatory circuit in E. coli

Figure 5.17 Conservation of functional gene regulatory elements

Human, mouse, rat, and dog sequences near the transcription start site of a gene contain a functional regulatory element that binds the transcriptional regulatory protein Err-α. These sequences (highlighted in yellow) are conserved in all four genomes, whereas the surrounding sequences are not.
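
One simple way to look for a conserved stretch like the one in this figure is to ask which short subsequences (k-mers) occur in every species. The four sequences below are invented for illustration and are not the real promoter sequences; the function name is hypothetical.

```python
def shared_kmers(sequences: list[str], k: int) -> set[str]:
    """Return the k-mers that occur in every one of the given sequences."""
    if not sequences:
        return set()
    kmer_sets = [
        {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for seq in sequences
    ]
    return set.intersection(*kmer_sets)


# Made-up upstream sequences: the flanks differ, the middle 9 bases are shared.
SEQUENCES = {
    "human": "ACGTTCAAGGTCAGGA",
    "mouse": "TTGCTCAAGGTCACAT",
    "rat": "GACATCAAGGTCATTG",
    "dog": "CCTATCAAGGTCAAGC",
}

print(shared_kmers(list(SEQUENCES.values()), 9))  # {'TCAAGGTCA'}
```

A shared k-mer is only a candidate: short matches also arise by chance, so conservation narrows the search but does not by itself show that an element is functional.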

This technology is usually referred to as

Illumina sequencing, named after the company that markets the necessary equipment.

______________/______________ and ________________ screens are commonly used.

Immunoprecipitation/mass spectrometry and yeast two-hybrid

Figure 5.19 Elements of signaling networks

In feedback loops, a downstream element of a pathway either inhibits (negative feedback) or stimulates (positive feedback) an upstream element. - In feedforward relays, an upstream element of a pathway stimulates both its immediate target and another element further downstream. - Crosstalk occurs when an element of one pathway either stimulates or inhibits an element of a second pathway

Figure 5.14 A protein interaction map of Drosophila

Interactions among 2346 proteins are depicted, with each protein represented as a circle placed according to its subcellular localization. (From L. Giot et al., 2003. Science 302: 1727.)

To solve this problem and develop a non-botanical source of artemisinin,

Jay Keasling and his collaborators engineered strains of yeast that produced high yields of artemisinic acid, which could then be efficiently converted to artemisinin by a chemical process

Figure 5.22 Structure of artemisinin

P. falciparum in a blood smear.

Large-scale screens based on

RNA interference (RNAi) are being used to systematically dissect gene function in a variety of organisms, including Drosophila, C. elegans, and mammalian cells in culture

Figure 5.23 First cell with a synthetic genome

Scanning electron micrograph of M. mycoides with a synthetic genome.

Key Experiment, Ch. 5, p. 163 (2)

Strategy for genome sequencing using bacterial artificial chromosome (BAC) clones that had been organized into overlapping clusters (contigs) and mapped to human chromosomes.

Figure 5.18 Example of a signaling pathway

The binding of epinephrine (adrenaline) to its cell surface receptor triggers a signaling pathway that leads to the breakdown of glycogen to glucose-1-phosphate

Figure 5.4 Progress in DNA sequencing

The cost of sequencing a human genome has dropped from approximately $100 million in 2001 to about $1000 in 2015. (Data from the National Human Genome Research Institute.)

Figure 5.2 Evolution of sequenced vertebrates

The estimated times (millions of years ago) when species diverged are indicated at branch points in the diagram

The mouse is ..... while the rat

The mouse is the key model system for experimental studies of mammalian genetics and development, while the rat is an important model for human physiology and medicine.

Figure 5.20 A gene regulatory network

The network includes all regulatory genes required for development of the embryonic cells that differentiate into skeletal cells of the sea urchin.

Figure 5.3 Comparison of vertebrate genomes

The number of genes shared between human, mouse, chicken, and zebrafish genomes is indicated

Figure 5.15 Systems biology

Traditional biological experiments study individual molecules and pathways. Systems biology uses global experimental data for quantitative modeling of integrated systems and processes.

The genome of H. influenzae is

a circular molecule containing approximately 1.8x10^6 base pairs, more than 1000 times smaller than the human genome

A DNA microarray consists of

a glass or silicon chip onto which oligonucleotides are printed by a robotic system in small spots at a high density.

Each spot on the array consists of

a single oligonucleotide representing a specific gene in cellular genomes

Because each amino acid has ______________, ______________________________

a unique molecular weight, the amino acid sequence of the peptide can be deduced from these data.
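
The deduction on this card can be sketched as a lookup: sort the fragment masses, take the difference between neighbors, and match each difference to a residue mass. The table holds approximate monoisotopic residue masses for only a few amino acids, and the fragment ladder is invented. One caveat to the card's wording: leucine and isoleucine have the same mass, so this approach cannot tell them apart (only L is listed).

```python
# Approximate monoisotopic residue masses in daltons (partial table).
RESIDUE_MASS = {
    "G": 57.021, "A": 71.037, "S": 87.032, "V": 99.068, "L": 113.084,
    "D": 115.027, "K": 128.095, "E": 129.043, "F": 147.068,
}


def sequence_from_ladder(fragment_masses: list[float], tolerance: float = 0.01) -> str:
    """Read residues off the mass differences between successive fragments."""
    ordered = sorted(fragment_masses)
    residues = []
    for lighter, heavier in zip(ordered, ordered[1:]):
        difference = heavier - lighter
        # Pick the residue whose mass matches the gap; "?" marks an unexplained gap.
        match = next(
            (aa for aa, mass in RESIDUE_MASS.items() if abs(mass - difference) <= tolerance),
            "?",
        )
        residues.append(match)
    return "".join(residues)


# Invented ladder: each fragment is one residue heavier than the previous one.
print(sequence_from_ladder([0.0, 57.021, 128.058, 275.126]))  # GAF
```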

The complete genome sequences of a wide variety of organisms, including many individual humans, provide

a wealth of information that forms a new framework for studies of cell and molecular biology and opens new possibilities in medical practice

In addition, these proteins can be expressed at

a wide range of levels.

The human genome consists of only

about 20,000 protein-coding genes, which is not much larger than the number of genes in simple animals like C. elegans or Drosophila and fewer than in Arabidopsis or other plants

A comparison of the human, mouse, chicken and zebrafish genomes indicates that

about half of the protein-coding genes are common to all vertebrates, whereas approximately 3000 genes are unique to each of these four species.

It is now possible to analyze

all of the RNAs that are transcribed in a cell (the transcriptome), rather than analyzing the expression of one gene at a time.

Large-scale analysis by immunofluorescence is

an alternative approach to determining the proteomes of subcellular organelles.

The first complete sequence of a cellular genome, reported in 1995 by a team of researchers led by Craig Venter, was

that of the bacterium Haemophilus influenzae, a common inhabitant of the human respiratory tract

The systematic analysis of protein complexes and interactions has

become an important goal of proteomics

The field of bioinformatics lies at the interface between ________________ and is focused on _______________________

biology and computer science and is focused on developing the computational methods needed to analyze and extract useful biological information from the raw data.

The protein composition of a variety of organelles has been determined by

combining classical cell biology methods with mass spectrometry

The best example is the development of new drugs for

cancer treatment, which are specifically targeted against mutations that can be identified by sequencing cancer genomes of individual patients.

In RNA-seq,

cellular mRNAs are isolated, converted to cDNAs by reverse transcription, and subjected to NGS

Sequences resembling regulatory elements occur frequently by

chance in genomic DNA, so physiologically significant elements cannot be identified from DNA sequence alone

Sequences of genomes of other primates, including

chimpanzee, bonobo, orangutan, and rhesus macaque, may help pinpoint unique features of our genome that distinguish humans from other primates

The whole-genome shotgun method starts with

cloning and sequencing of DNA fragments from randomly cut DNA derived from the entire genome

Bioinformatic analysis of

clusters of transcription factor binding sites in genomic DNA is very useful in identifying sequences that regulate gene expression.

Computer algorithms can be used to

compare the experimentally determined mass spectrum with a database of theoretical mass spectra representing tryptic peptides of all known proteins, allowing identification of the unknown protein.
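
The database comparison on this card can be sketched in a few lines: digest each known protein in silico, compute the mass of every peptide, and count how many observed masses each protein explains. Everything here is illustrative: the two-entry "database", the protein names, the tolerance, and the partial table of approximate monoisotopic masses are assumptions, and real search software uses far more elaborate scoring.

```python
WATER = 18.011  # approximate mass of H2O added to a free peptide
RESIDUE_MASS = {
    "G": 57.021, "A": 71.037, "S": 87.032, "V": 99.068,
    "L": 113.084, "K": 128.095, "E": 129.043, "R": 156.101,
}


def digest(protein: str) -> list[str]:
    """Simplified tryptic digest: cut after every K or R."""
    peptides, current = [], ""
    for residue in protein:
        current += residue
        if residue in "KR":
            peptides.append(current)
            current = ""
    if current:
        peptides.append(current)
    return peptides


def peptide_mass(peptide: str) -> float:
    """Sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def best_match(observed: list[float], database: dict[str, str],
               tolerance: float = 0.05) -> tuple[str, dict[str, int]]:
    """Score each protein by how many observed masses match its theoretical peptides."""
    scores = {}
    for name, sequence in database.items():
        theoretical = [peptide_mass(p) for p in digest(sequence)]
        scores[name] = sum(
            any(abs(mass - t) <= tolerance for t in theoretical) for mass in observed
        )
    return max(scores, key=scores.get), scores


# Hypothetical two-protein database and two observed peptide masses.
DATABASE = {"protein_A": "GASKLLER", "protein_B": "VVKGGR"}
name, scores = best_match([361.20, 529.32], DATABASE)
print(name, scores)
```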

Potential protein-coding regions were identified by

computer analysis of the DNA sequence to detect open-reading frames - long stretches of nucleotide sequence that can encode polypeptides, extending from an initiation codon to a stop codon.

RNA-seq has been made possible by

the continuing development of next-generation sequencing.

The number of protein-coding genes does not

correlate with biological complexity

The goal of synthetic biology is to

design and create new (unnatural or synthetic) systems, rather than studying natural biological systems.

RNA sequencing (RNA-seq)

determine and quantify all of the RNAs expressed in a cell

A model of a gene regulatory network responsible for

development of an embryonic cell lineage in sea urchins provides a graphical representation of this complexity.

In the future, we may expect genome sequencing of healthy people to play an important role in

disease prevention by identifying genes that confer susceptibility to disease, followed by taking appropriate measures to intervene.

Metabolic or signaling pathways are connected in

diverse ways, resulting in networks within the cell

The identification of functional regulatory elements and

elucidation of the signaling networks that control gene expression represent major challenges in bioinformatics and systems biology

Synthetic biology is

an engineering approach to understanding and manipulating biological systems

The practical applications of synthetic biology include

engineering of metabolic pathways to efficiently produce therapeutic drugs. - A good example is provided by the production of the antimalarial drug artemisinin. - Malaria is a major global health problem and is caused by infection with parasites belonging to the genus Plasmodium.

Compared with Drosophila and C. elegans, the human genome contains

expanded numbers of genes in functions related to the greater complexity of vertebrates, such as the immune response, the nervous system, and blood clotting, as well as increased numbers of genes involved in development, cell signaling, and the regulation of gene expression

This arises because many genes can be

expressed to yield several distinct mRNAs, which encode different polypeptides as a result of alternative splicing

If the two proteins tested interact,

expression of the target gene can be easily detected by growth of yeast in a specific medium or by production of an enzyme that produces a blue yeast colony

The sequences of the C. elegans and Drosophila genomes were major steps forward, which

extended genome sequencing from unicellular bacteria and yeasts to multicellular organisms

In order to maintain associations between target protein and the proteins with which it normally interacts inside the cell,

extracts are prepared under gentle conditions and adjacent proteins are sometimes chemically cross-linked

The Drosophila genome contains

fewer genes than the number of genes in C. elegans, even though Drosophila is a more complex organism.

Individual double-stranded RNAs from

a genome-wide library are tested in microwells in a high-throughput format to identify those that interfere with the growth of cultured cells, thereby characterizing the entire set of genes that are required for cell growth or survival under particular sets of conditions

In addition to the human genome, a large number of vertebrate genomes have been sequenced, including

genomes of fish, frogs, chickens, dogs, rodents and primates.

Large-scale biological experimentation, including

global analysis of gene expression and proteomics, similarly yields vast amounts of data, far beyond the scope of traditional biological experimentation

Cells with the synthetic genome were found to

grow normally and show the morphology of normal M. mycoides

Proteomics has the goal of [3]

identifying and quantitating all of the proteins expressed in a given cell (proteome), as well as establishing the localization of these proteins to different subcellular organelles and elucidating the networks of interactions between proteins that govern cell activities.

Elucidating the interactions between proteins provides

important clues about the function of novel proteins, and helps to understand the complex networks of protein interactions that govern cell behavior.

This generates a mass spectrum in which

individual peptides are indicated by a peak corresponding to the mass-to-charge ratio

In RNAi screens, double-stranded RNAs are used to

induce degradation of homologous mRNAs in cells.

Proteins generally function by

interacting with other proteins in protein complexes and networks

The peptides are ionized by

irradiation with a laser or by passage through a field of high electrical potential and introduced into a mass spectrometer, which measures the mass-to-charge ratio of each peptide.

The genome sequencing projects have led to a fundamental change in the way in which

many problems in biology are being approached, with large-scale experimental approaches that generate vast amounts of data now in common use.

The major tool used in proteomics is

mass spectrometry (peptide mass fingerprinting, matrix-assisted laser desorption ionization time-of-flight; MALDI-TOF), which was developed in the 1990s as a powerful method of protein identification.

Mapping software then assembles

millions of overlapping short sequences into a single, continuous sequence for every chromosome
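
The idea of stitching overlapping reads into one continuous sequence can be shown with a toy greedy merge: repeatedly join the two sequences with the longest suffix-to-prefix overlap. This is a teaching sketch, not how production software works; it ignores sequencing errors, repeats, the reverse strand, and reads contained inside other reads, and the three reads are invented.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b (0 if too short)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Greedily merge the best-overlapping pair until no overlaps remain."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(contigs) > 1:
        best_n, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best_n:
                        best_n, best_i, best_j = n, i, j
        if best_n == 0:
            break  # nothing left to join
        merged = contigs[best_i] + contigs[best_j][best_n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)
    return contigs


# Three invented reads that overlap by four bases each.
print(assemble(["ATGGCGT", "GCGTACC", "TACCTGA"]))  # ['ATGGCGTACCTGA']
```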

This method makes use of

modified nucleotides which block further strand synthesis once one has been incorporated at the end of the growing polynucleotide.

These large-scale experimental approaches form the basis of

the new field of systems biology, which seeks a quantitative understanding of the integrated dynamic behavior of complex biological systems and processes

These global experimental approaches form the basis of

the new field of systems biology, which seeks a quantitative understanding of the integrated behavior of complex biological systems.

A number of new sequencing methods, collectively called

next-generation sequencing (NGS), have been developed that have substantially increased the speed and lowered the cost of genome sequencing.

However, some smaller organelles, such as

nucleoli, contained more proteins than previously recognized (more than 1000)

In human cells, interaction maps have been

obtained for about 25% of protein-coding genes

The protein-coding genes represent

only about 1% of the human genome.

The proteomes of a variety of

organelles (such as mitochondria and the plasma membrane) and large subcellular structures such as nucleoli have been characterized by this approach. - (More than 700 different proteins in mitochondria; 1000-2000 different proteins in the plasma membrane)

Starting from the known nucleotide sequence of the 1.08Mb Mycoplasma mycoides genome, they synthesized

overlapping oligonucleotides corresponding to the complete genome sequence. - These synthetic oligonucleotides were assembled in several steps to yield a complete synthetic genome of 1,077,947 bases that also contained sequences required for propagation as a plasmid in yeast

A second mass spectrum of

partial degradation products of each peptide is determined

Protein modifications, such as

phosphorylation, can be identified because they alter the mass of the modified amino acid.

Artemisinin is produced by

a plant (sweet wormwood) that takes about eight months to grow to full size, and its supply has been unstable.

Unexpectedly, more than half of

the proteins analyzed were localized to more than one compartment, suggesting the possibility that they may have different functions in different locations

Over 40% of the predicted human proteins are related to

proteins in simpler sequenced eukaryotes, including yeast, Drosophila and C. elegans. -Many of these conserved proteins function in basic cellular processes, such as metabolism, DNA replication and repair, transcription, translation, and protein trafficking

High-throughput analysis by both mass spectrometry and the yeast two-hybrid system has been applied to

proteome-scale studies of the interactions between proteins of higher eukaryotes, including Drosophila, C. elegans, and humans.

Since these chain-terminating codons occur

randomly once in every 21 codons (three chain-terminating codons out of 64 total), open-reading frames that extend for more than 100 codons usually represent functional genes
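
The arithmetic behind this card, plus a bare-bones open-reading-frame scan, can be sketched as follows. The scan assumes a DNA string read on one strand only, treats ATG as the start codon, and uses the DNA forms of the three stop codons; the function names and the short test sequence are made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def chance_of_no_stop(codons: int) -> float:
    """Probability that a random run of codons contains no stop codon (61 of 64 are not stops)."""
    return (61 / 64) ** codons


def find_orfs(dna: str, min_codons: int = 100) -> list[tuple[int, int]]:
    """Return (start, end) indices of ATG-to-stop stretches in the three forward frames."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                # Count codons from ATG up to, but not including, the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs


print(f"average spacing between random stops: {64 / 3:.1f} codons")
print(f"chance that 100 random codons lack a stop: {chance_of_no_stop(100):.4f}")

# Tiny invented sequence: an ATG followed by 5 codons and a stop, in frame 2.
toy = "CC" + "ATG" + "GCT" * 5 + "TAA"
print(find_orfs(toy, min_codons=5))  # [(2, 23)]
```

Because a stop-free run of 100 codons arises by chance less than 1% of the time, long open-reading frames are unlikely to be accidents, which is why the length cutoff works as a first filter.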

The availability of complete genome sequences has enabled

researchers to study gene expression on a genome-wide global level.

The dramatic changes in sequencing technology have opened the door to

sequencing the complete genomes of large numbers of different individuals, allowing new approaches to understand the genetic basis for many human diseases, including cancer, heart disease, and degenerative diseases of the nervous system such as Parkinson's and Alzheimer's disease.

Complete collections of strains with mutations in all known genes are available for

several model organisms, including E. coli, yeast, Drosophila, C. elegans, and Arabidopsis thaliana. - These collections of mutant strains can be analyzed to determine which genes are involved in any biological property of interest.

Most regulatory elements are

short sequences of DNA, typically spanning only about 10 base pairs.

Since genes that are coordinately regulated within a cell may be controlled by

similar mechanisms, analyzing changes in the expression of multiple genes can help pinpoint shared regulatory elements

Expression of cloned genes in yeast is particularly useful because

simple methods of yeast genetics can be employed to identify proteins that interact with one another.

The genomes of these vertebrates are similar in

size to the human genome and contain a similar number of genes.

Although several problems with the sensitivity and accuracy of these methods remain to be

solved, analysis of complex mixtures of proteins by "shotgun" mass spectrometry provides a powerful approach to the systematic analysis of cell proteins

Cells containing the synthetic genome were selected by

tetracycline resistance and propagated in culture

The result of genome sequencing in multicellular organisms indicates that

they contained fewer protein-coding genes than expected relative to bacteria or yeast genomes

Sequencing of genomes of tetracycline-resistant cells indicated that

they were entirely derived from the synthetic M. mycoides DNA.

The changes in gene expression that occur over

time can reveal networks of gene expression

In contrast to traditional approaches, systems biology is characterized by

use of large-scale datasets for quantitative experimental analysis and modeling

Sequences of individual peptides are then

used for database searching to identify proteins present in starting mixture.

In addition, proteins can be modified in

a variety of different ways, including the addition of phosphate groups, carbohydrates, and lipid molecules

Even though automation of chain-termination sequencing by the dideoxynucleotide technique contributed to

the sequencing of the human and other genomes, genome sequencing by this approach was slow and expensive.

Handling the enormous amounts of data generated by

whole genome sequencing required sophisticated computational analysis and launched the new field of bioinformatics

C. elegans and Drosophila are

widely used for studies of animal development, and Drosophila has been especially well analyzed genetically.

