Chapter 24
If there is a 100% match between the query sequence used for searching and a sequence in the database, the E value would be
approximately 0 (for a sufficiently long query)
paralogs
2 homologous genes found in a single organism
orthologs
2 homologous genes found in different species
gene family
2 or more copies of homologous genes within genome of single organism
tandem mass spectrometry
2 spectrometers used; first measures mass of given peptide generated from protein digestion; second analyzes peptide after digestion into smaller fragments
assay levels of protein
2D gel electrophoresis
number of reading frames possible in a newly discovered sequence
6
Compare DNA and Protein sequences
BLAST
find DNA sequences bound by protein
ChIP-chip assay
involves chemical cross linking of DNA to protein
Chromatin immunoprecipitation
Determine changes in RNA expression
DNA microarray
slide with many DNA sequences spotted on it
DNA microarray
can be used to look for insertions or deletions if labeled genomic DNA is used
DNA microarrays
The better the match the lower the
E value
to identify the DNA bound by the protein, researchers can use
PCR, DNA microarray
any given cell of a multicellular organism will produce only
a subset of proteins in its proteome
If the files in the database include additional information such as the name of the organism from which the sequence was obtained the database is
annotated
protein microarray that can determine protein expression levels
antibody microarray
2 common approaches to protein microarray analysis
antibody microarrays and functional microarrays
homology-based modeling
approach to predicting protein structure based on one protein's similarity to another protein with a known structure
To use a microarray, mRNA from the cells of interest is first converted to
cDNA
resulting DNA when mRNA is used to direct the synthesis of DNA
cDNA
chromatin immunoprecipitation
can determine if proteins can bind to a particular region of DNA in chromatin of living cells
subset of proteins produced depends primarily on
cell type, stage of development, environmental conditions
spots present in only given circumstances
cells exposed to a hormone versus those that are not
ORF in eukaryotes
chromosomal coding sequences may be interrupted by introns
database
collection of computer files stored in one place
Computer programs have been developed to predict RNA and protein structures based on
comparison to RNA and protein molecules of similar sequence and function
antibody microarrays
consist of a collection of antibodies that recognize short peptide sequences; used to assess the level of protein expression
functional microarrays
consist of many different cellular proteins; used to probe function of proteins
steps of tandem mass spectrometry
digest protein to fragments with protease; determine mass of these fragments with mass spectrometer; analyze fragment individually with second spectrometer
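The digestion step can be sketched in code. This is a minimal illustration that assumes the protease is trypsin (the card only says "protease"), which cuts after lysine (K) or arginine (R) unless the next residue is proline (P). The function name and the example peptide are hypothetical.

```python
import re

def tryptic_digest(protein: str) -> list[str]:
    """Split a protein sequence into peptides using the usual trypsin rule.

    Assumption: the protease is trypsin, which cleaves on the C-terminal
    side of K or R, but not when the following residue is P.
    """
    # Zero-width split: position is preceded by K/R and not followed by P.
    pieces = re.split(r"(?<=[KR])(?!P)", protein.upper())
    # A cut at the very end of the sequence leaves an empty string; drop it.
    return [p for p in pieces if p]

peptides = tryptic_digest("MKRPAKGR")
print(peptides)
```

Each resulting peptide would then be weighed in the first spectrometer and fragmented further for the second.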
RNA-seq involves directly sequencing the RNA
false
SDS coats the proteins to give them a net positive charge
false
Two dimensional gel electrophoresis separates proteins based on size alone
false
2D gel electrophoresis involves 2 different gel electrophoresis experiments
first separates by isoelectric point (the pH at which a protein has no net charge); second separates by size (molecular mass)
Programs that are used to predict protein structure frequently rely on
frequency of amino acids found within structures that have already been crystallized
protein microarray that can demonstrate protein-to-protein interactions
functional protein microarray
DNA microarray
gene chip/DNA chip/biochip; collection of microscopic DNA spots attached to a solid surface
Two or more paralogs in a single organism
gene family
gene knockout
gene has been altered in a way that inactivates its function
may replace microarrays
high-throughput RNA sequencing
will allow comparison of genome-wide sequence information between cancer and normal cells of the same individual and of different individuals
high-throughput RNA sequencing
Genes derived from the same ancestral gene
homologous genes
In mice many knockouts are created by
homologous recombination
The BLAST program starts with a protein or DNA sequence and then locates
homologous sequences within a database
To fully study the consequences of a knockout, the animal should be
homozygous for the knockout
cDNA from two cell types can be differentially labeled and
hybridized to the microarray
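Once both labeled cDNA samples have hybridized, each spot gives two fluorescence intensities, and a common way to compare them is a log2 ratio. The sketch below assumes intensities are already background-corrected; the gene names, the numbers, and the pseudocount are made up for illustration.

```python
import math

def log2_ratios(sample_a: dict, sample_b: dict, pseudocount: float = 1.0) -> dict:
    """Return log2(sample_a / sample_b) for every spot present in both samples.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for spots with no signal in one channel.
    """
    return {
        gene: math.log2((sample_a[gene] + pseudocount) / (sample_b[gene] + pseudocount))
        for gene in sample_a
        if gene in sample_b
    }

# Hypothetical spot intensities for two cell types.
cell_type_1 = {"geneX": 7, "geneY": 1}
cell_type_2 = {"geneX": 1, "geneY": 7}
ratios = log2_ratios(cell_type_1, cell_type_2)
print(ratios)
```

A positive value means the spot is brighter in the first cell type, a negative value means it is brighter in the second, and a value near zero means similar expression.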
examine translational reading frames
in a DNA sequence, the reading of codons could begin with the first, second, or third nucleotides; reading frame 1, 2, and 3
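Three frames on one strand plus three on the complementary strand give the six possible reading frames of a newly discovered sequence. A minimal sketch, with hypothetical function names and a short example sequence:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def split_codons(seq: str, offset: int) -> list[str]:
    """Read complete codons starting at the given offset (0, 1, or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

def six_frames(seq: str) -> dict[str, list[str]]:
    """Map frame labels (+1..+3, -1..-3) to the codons read in that frame."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = split_codons(seq, offset)
        frames[f"-{offset + 1}"] = split_codons(rc, offset)
    return frames

for label, codons in six_frames("ATGGCCTAA").items():
    print(label, codons)
```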
Gene knockouts may be created by using transposable elements to
insert into a gene
RNA-Seq technique
isolate RNA from a sample of cells; break the RNAs into small fragments; attach short oligonucleotide linkers to the ends of the RNAs; synthesize cDNAs via reverse transcriptase PCR, using the RNAs as templates; PCR primers are complementary to linkers; sequence cDNAs using a next-gen sequencing technology; using computer technology, align the cDNA sequences along the genomic sequence
ORF in prokaryotes
long ORFs are contained within the chromosomal gene sequences
Some programs that predict RNA structure may also use calculations to determine the form with
lowest energy state
to compare gene expression between two cells,
mRNA is isolated from each cell type
proteins which are very abundant in a cell type
may be important for that cell's structure or function
DNA microarrays can be used to
measure expression levels of large numbers of genes simultaneously or genotype multiple regions of genome
Gene knockouts are useful because many diseases and syndromes are the result of
mutations that inactivate genes
GenBank
nucleotide database
open reading frame
nucleotide sequence that, read in a given reading frame, does not contain any stop codons
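Using that definition, a program can scan one reading frame and report the longest stretch without a stop codon. This is only a sketch: real gene finders also look for a start codon and other signals, and the function name and test sequence here are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_stop_free_run(seq: str, offset: int = 0) -> str:
    """Return the longest run of codons with no stop codon in one frame.

    offset is 0, 1, or 2 and selects the reading frame on the given strand.
    """
    seq = seq.upper()
    best = ""
    current = []
    for i in range(offset, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in STOP_CODONS:
            current = []  # a stop codon ends the current open stretch
        else:
            current.append(codon)
            if len(current) * 3 > len(best):
                best = "".join(current)
    return best

print(longest_stop_free_run("ATGAAATGACCCGGGTTTTAA"))
```

In prokaryotes a long run found this way is likely to lie within a gene; in eukaryotes introns can interrupt the coding sequence, so the same scan on chromosomal DNA is less informative.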
Two homologous genes found in different species
orthologs
Two or more homologous genes found in the same organism
paralogs
sequence recognition searches require a researcher to input a sequence of interest but
pattern recognition search does not
sample subjected to laser beam
peptides become ejected as an ionized gas in which peptide contains one or more positive charges
tandem mass spectrometry protocol
peptides mixed with organic acid and dried onto metal slide; sample subjected to laser beam; charged peptides accelerated via electric field and fly toward detector
gene prediction
process of identifying regions of genomic DNA that encode genes, including protein-encoding genes and genes for non-coding RNAs
search by content
program identifies sequences that differ significantly from random distribution due to codon bias within structural genes; some codons used more frequently than others for same amino acid
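Codon bias can be made visible by counting how often each codon appears in a frame. The sketch below only tallies frequencies; a real search-by-content program would compare them with the organism's known codon usage. The function name and the example sequence are hypothetical.

```python
from collections import Counter

def codon_frequencies(seq: str, offset: int = 0) -> dict[str, float]:
    """Return the fraction of codons of each type in one reading frame."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: n / total for codon, n in counts.items()}

# CTG and CTA both encode leucine; here CTG is used three times as often.
print(codon_frequencies("CTGCTGCTGCTA"))
```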
search by signal
program tries to locate an organization of known sequence elements normally found within a gene (promoter, start/stop codons)
application of protein microarrays
protein expression, protein function, protein-protein interactions, pharmacology
technology to make DNA microarrays is being applied to make
protein microarrays
PROSITE
protein motif database
development of protein microarrays is more challenging
proteins are much more easily damaged by manipulations that occur during microarray formation; synthesis and purification of proteins is more time-consuming than DNA
ChIP protocol
proteins in living cells cross-linked to DNA they are bound to with formaldehyde; cells lysed and DNA fragmented; antibody used to precipitate protein of interest; DNA chemically freed from cross-links; DNA PCR amplified; sequence of DNA identified directly or by using it as a probe on microarray (ChIP-chip assay)
protein microarrays
proteins rather than DNA are spotted onto a slide
specific spots may be of special interest
proteins which are very abundant in a cell type; spots present in only given circumstances; spots present only in abnormal cells
computer programs can employ different strategies to locate genes
search by signal; search by content; examine translational reading frames
2-dimensional gel electrophoresis
separation technique that can distinguish hundreds or even thousands of different proteins in a cell extract
DNA microarrays contain
single-stranded DNA
what molecule is labeled with a fluorescent tag in a microarray
single-stranded cDNA
reason proteome is larger than genome
some mRNAs are alternatively spliced
how to correlate a given spot on a 2D gel with a protein
spot is cut out from gel; protein purified from it
In a functional protein array the proteins whose function is to be tested are
spotted onto the chip
In an antibody microarray antibodies to specific proteins are
spotted onto the chip
in mass spectrometry, the amino acid sequence of the protein is revealed via
tandem mass spectrometry
charged peptides accelerated via an electric field and fly toward a detector
time they spend in flight is determined by mass and net charge and reveals mass of peptide
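The relationship between flight time, mass, and charge follows from energy conservation: an ion of charge z accelerated through a voltage V gains kinetic energy zeV = ½mv². The sketch below assumes an idealized linear time-of-flight tube; the 20 kV voltage and 1 m drift length are illustrative values, not figures from the source.

```python
import math

ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs
DALTON = 1.66053906660e-27           # kilograms per dalton

def flight_time(mass_da: float, charge: int,
                voltage: float = 20000.0, drift_length: float = 1.0) -> float:
    """Return the time in seconds for an ion to cross the drift tube.

    Assumptions of this sketch: the ion gets all of its energy before
    entering the tube and then drifts at constant velocity.
    """
    mass_kg = mass_da * DALTON
    kinetic_energy = charge * ELEMENTARY_CHARGE * voltage  # z * e * V
    velocity = math.sqrt(2.0 * kinetic_energy / mass_kg)
    return drift_length / velocity

# Heavier peptides arrive later; extra charge makes them arrive sooner.
for mass, z in [(1000, 1), (2000, 1), (2000, 2)]:
    print(mass, z, flight_time(mass, z))
```

Because flight time scales with the square root of mass divided by charge, the detector's timing can be converted back into the peptide's mass once the charge is known.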
DNA microarrays are used to study patterns of gene expression
true
DNA microarrays may be used to study how a cell responds to specific environmental conditions
true
During isoelectric focusing proteins are separated by the pH at which they have a net neutral charge
true
SDS allows proteins to separate based on molecular mass
true
Two dimensional electrophoresis allows proteins that differ by one charged amino acid to be separated
true
common technique in field of proteomics is
two-dimensional gel electrophoresis
E value
value that represents the number of times that the match or a better match would be expected to be found by random chance in an entire database
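One standard way to express this is E = m × n × 2^(−S'), where m is the query length, n is the total database length, and S' is the alignment's bit score. The sketch below uses that formula to show why a better match gives a lower E value; the function name and the example numbers are illustrative only.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Return the E value for a hit with the given bit score.

    Uses E = m * n * 2**(-S'), where m is the query length, n is the
    database length, and S' is the normalized (bit) score.
    """
    return query_len * db_len * 2.0 ** (-bit_score)

# Same query and database, increasingly good matches.
for score in (10, 40, 200):
    print(score, expected_hits(score, query_len=100, db_len=10240))
```

Each additional bit of score halves the E value, so very strong matches give values so small that they are reported as 0.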
spots present only in abnormal cells
very common in cancer cells