Proteomics Midterm
intermediate protein-protein interactions
(antibody-antigen, TCR-MHC-peptide, signal transduction PPI), Kd μM-nM
strong protein-protein interactions
(require a molecular trigger to shift the oligomeric equilibrium) Kd nM-fM
sensitivity
-100 genes account for 50% of the protein -Moderately abundant proteins - 100,000 copies/cell -Rare proteins - fewer than 1000 copies/cell -2 sensitivity problems: 1) Difficulty in detecting the rare proteins 2) Spots for abundant proteins mask those of scarce ones
beta bend
-180° flip of the peptide bond between residues 2 & 3 -Pro and Gly occur frequently in β bends due to steric requirements.
Collision-Induced Dissociation (CID)
-1st quadrupole separates the peptide mixture by mass and passes individual peptides into the 2nd one, in which collisions with Ar occur -3rd quadrupole separates the fragments generated from a single original peptide to give the CID spectrum -Then the process is repeated with the next MW peptide
Why Does Phosphorylation Cause Binding of Target Proteins?
-Activated tyrosine kinases phosphorylate the src protein and turn on its kinase activity -The SH2 domain interacts with the phosphorylated C-terminal region of the receptor tyrosine kinase
Multidimensional Chromatography in Proteomics
-Advantage is that low abundance proteins are easier to detect because they are concentrated on the column by loading large samples -Disadvantage is that the visual aspect of 2D gels, including the pI and MW information, is lost
protein arrays
-Antibody-based or bait-based arrays -High-throughput assays ; screening and detection of specific interactions of proteins from complex mixtures -Protein expression profiling, protein-protein interaction and enzyme activity
Massively parallel signature sequencing (MPSS)
-Attach cDNA to bead & cleave with Dpn II -Ligate with adapter with fluorescent label (F) & Bbv I restriction site (black) -Bbv I cleaves a specific number of base pairs before the site & leaves a 4 base overhang -These 4 bases are part of the original cDNA sequence -There are 4^4 = 256 possible 4 base sequences
dye binding assays
-Bradford assay - binding of Coomassie brilliant blue -Fluorescent dyes are more sensitive -o-phthaldialdehyde (OPA), fluorescamine & NanoOrange
automation
-Capturing images: Computer-aided comparison of gels -Isolating specific spots for MS and other analysis: 1) Manual VS robotic spot picking 2) Automated digestion, clean-up and application to MS
discovery
-Collect as much data as possible -Use computer-aided analysis to determine nature's rules -Reverse genetics - determine gene sequence and mutate it to determine effect on phenotype
metabolic regulation
-Control synthesis by induction and repression -Need to control degradation also as an off-switch
TOCSY - total correlation spectroscopy
-Detects groups of protons interacting through a coupled network -Identifies protons associated with a specific residue -Identify the proton resonances
COSY - correlation spectroscopy
-Detects protons interacting through bonds -Allows identification of protons linked by bonded atoms -Identify the proton resonances
analytical approach
-Far Western analysis -Protein arrays
intact peptide ions
-Get masses of each peptide -Compare masses to database generated from genome to identify protein
fragment ions
-Get masses of peptide fragments -Can be used with data base as before -Can be used to derive de novo sequences -De novo sequence can be used to BLAST genomes
Stable Isotope Labeling by Amino acids in Cell culture (SILAC)
-Grow cells with an amino acid labeled differently in the control and test cultures. -C-13 Arg in the control and C-12 Arg in the test culture -N-15 Arg in the control and N-14 Arg in the test culture -Mix cultures before analyzing by MS
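As a rough illustration of how the labeled and unlabeled forms show up as paired peaks, the sketch below computes the expected m/z of the 13C6-Arg partner of a peptide and a simple intensity ratio. The peptide m/z, charge and intensities are hypothetical values chosen for illustration, and full 13C labeling of all six arginine carbons is an assumption.

```python
# Mass difference between 13C and 12C, in Da.
C13_MINUS_C12 = 1.003355

def heavy_mz(light_mz, charge, n_arg, carbons_per_arg=6):
    """Expected m/z of the 13C-Arg partner of a 12C peptide ion."""
    shift = C13_MINUS_C12 * carbons_per_arg * n_arg
    return light_mz + shift / charge

def silac_ratio(intensity_test, intensity_control):
    """Relative abundance (test over control) from paired peak intensities."""
    return intensity_test / intensity_control

# Hypothetical doubly charged peptide containing one arginine.
light = 650.3200
heavy = heavy_mz(light, charge=2, n_arg=1)
print(f"light {light:.4f}  heavy {heavy:.4f}")
print("test/control ratio:", silac_ratio(2.4e6, 1.2e6))
```

Because the cultures are mixed before MS, both forms of each peptide are measured in the same spectrum, which is why a simple intensity ratio is meaningful.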
SCOP - Structural Classification of Proteins
-Hierarchy - class to fold to superfamily to family -Manually derived
CATH - Class Architecture Topology Homology
-Hierarchy - in its name -Semi-automatic
Co-immunoprecipitation
-Immunoprecipitation (IP) experiment-immune response & precipitation -Affinity purify a bait protein antigen together with its binding partner using a specific antibody -Capturing of immune complex by solid support -Elution from the support and analysis by SDS-PAGE and detection by western blot
De novo Sequencing by MS
-Interpret the CID spectra for the b- (or y-) series to identify the amino acid differences at each step -Alternatively, MALDI-TOF analysis of peptide "ladders" generated by chemical or enzymatic digestion
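A minimal sketch of reading a b-ion series: the mass difference between consecutive b-ions is matched against monoisotopic residue masses. The ladder here is generated from a hypothetical peptide rather than a real spectrum, and the tolerance is an assumed value. Leu and Ile have identical masses, so they are reported together.

```python
# Monoisotopic residue masses in Da (L listed before I; they are isobaric).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276

def b_ions(peptide):
    """Singly charged b-ion m/z values for a peptide (used to fake a spectrum)."""
    total, ions = PROTON, []
    for aa in peptide:
        total += RESIDUE_MASS[aa]
        ions.append(total)
    return ions

def read_ladder(mz_values, tol=0.01):
    """Assign a residue to each step between consecutive ions in the series."""
    calls = []
    for lo, hi in zip(mz_values, mz_values[1:]):
        diff = hi - lo
        matches = [aa for aa, m in RESIDUE_MASS.items() if abs(m - diff) <= tol]
        calls.append("/".join(matches) if matches else "?")
    return calls

ladder = b_ions("GASLK")      # hypothetical peptide
print(read_ladder(ladder))    # residues 2..5; b1 alone does not give a difference
```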
ion exchange chromatography
-Ion exchange resins contain charged groups (acidic and basic) -pH is neutral under low salt conditions
Pb 2+ binding calmodulin
-MD simulations of Pb2+ ion binding to calmodulin were able to predict about 50% of the known Pb2+ binding sites with no pre-knowledge of their positions -The mechanism of binding involved initial interaction with one carboxylate and then recruitment of other carboxylates and carbonyls to the metal ion
MALDI/MS
-Matrix absorbs light from laser and is vaporized -Matrix vaporization carries peptide or protein into gas phase -Peptide is usually a positive ion because matrix is often an acid -Electric field accelerates ions -Mass analyzer is often time-of-flight
representation
-Membrane proteins are underrepresented -Histones, chromatin proteins and ribosomal proteins are hard to separate on 2D gels -You may need special techniques to get these proteins to show up on your gel -extraction buffers: Chaotropic agent to disrupt H-bonds, a non-ionic detergent to solubilize hydrophobic proteins, a reducing agent to break disulfide bonds
FSSP - Families of Structurally Similar Proteins
-Not hierarchy - does pairwise structural comparisons to divide non-homologous proteins into sets -Fully automatic
NOESY - NOE spectroscopy
-Nuclear Overhauser effect -Identifies proton close to one another in space but not associated through bonds -Put restraints of distances for non-bonded protons
the Russian doll effect
-Occurs when smaller substructures are contained within larger structures -Leads to increasing levels of complexity in protein structure
spot detection
-Once the gel images have been matched, the program automatically detects spots -Algorithms are generally based on Gaussian statistics
ESI-MS
-Peptides are dissolved and forced through a narrow needle, which is at a high voltage -Solvent droplets evaporate in vacuum leaving gas phase peptides -Gas phase ions are accelerated toward analyzer by electric field
protein chip
-Protein:protein -Protein:drug -Enzyme:substrate
gel electrophoresis
-Purity of the protein sample -Denatured molecular weight (SDS) -Isoelectric point pI = pH at which the protein has zero net charge -2D electrophoresis
hypothesis
-Put forward a hypothesis -Design experiments to prove or disprove it -Forward genetics - move from phenotype to gene
Tandem Affinity Purification-Mass Spectrometry (TAP-MS)
-Rapid purification of complexes without prior knowledge of the complex composition, activity, or function -Ability to purify low abundant proteins/protein complexes -Fusion of the TAP tag to the target protein -Complex retrieval from tissue culture
Size Exclusion Chromatography in Proteomics
-Separates by size and shape of the molecule -For the same shape, the larger molecules come out first -Different media have different size pores -Sepharose and sephadex are the most popular resins
far western analysis
-Similar strategy to Western blotting -To determine receptor-ligand interactions and to screen libraries for interacting proteins
bootstrapping: initial refinement
-Solvent flattening -Non-crystallographic symmetry averaging -Including a partial atomic model -But initial phases must be good enough to make the map interpretable
isoelectric focusing
-The electric field causes the ampholytes to move towards the oppositely charged pole until they establish a pH gradient in which each ampholyte is at its own pI -Using immobilized pH gradient (IPG) gels in which the buffering groups are attached to the gel allows the establishment of a pH gradient before electrophoresis
spot quantitation
-The positions of detected spots are calibrated to give a pI / MW pair for each protein. -A value for the expression level of the protein can be calculated from the overall spot intensity. -Some programs do not quantitate each gel separately, but calculate relative intensity pixel by pixel
tandem mass spec
-Two mass spectrometers linked end to end -Ions are passed through the first MS, then again through the second MS -Increased resolution -Decay between MS sectors
RNA interference
-Ubiquitous defense mechanism against viruses -RDE-1 binding to dsRNA recruits Dicer which generates short duplexes -Small interfering siRNA directs the RNA-induced silencing complex (RISC) to the corresponding mRNA -RISC destroys the mRNA and no protein is produced
Isotope-coded affinity tag (ICAT)
-Uses isotopic labeling to give quantitative proteomics -Isotopic labeling = IAA to modify Cys -Originally deuterium was used but now it is often C-13 -Tag = biotin for affinity isolation of labeled proteins or peptides
radioactive identification
-X-ray film: CCD camera or densitometer -Phosphoimaging
genetic approach
-Yeast 2-hybrid -Phage display
SAGE (Serial Analysis of Gene Expression)
-a method for comprehensive analysis of gene expression patterns -three principles
GPI-anchors
-added to C-terminal after removal of C-terminal signal sequence
N-linked glycosylation
-added to the amide nitrogen of asparagine side chains -only in eukaryotes -species specific -occurs only in ER -dolichol phosphate
O-linked glycosylation
-added to the hydroxy oxygen of serine and threonine side chains -no consensus sequence -depends on secondary and tertiary structure
glycosylation
-characteristic for many cell surface proteins and secreted eukaryotic proteins -functions: required for some proteins to fold properly, acts as an address label for protein sorting, and improves stability -3 types: 1) N-linked, 2) O-linked, 3) GPI-anchors
conjugate gradients (CG)
-compensates for the curvature of the potential energy function by using the "memory" of the previous step -requires convergence along each line search before going in a new direction
methods of identifying PPI
-experimental (Protein-protein arrays, Y2H assay, TAP assay) -computational/inferential (Interolog analysis, Colocalization, co-expression, Correlated mutations, Text-mining)
x-ray crystallography
-finding a 3D structure
transcriptomics
-measures gene expression -Expression of mRNA does not always mean expression of the protein
mass spec
-measures the mass/charge ratio (m/z), or molar mass, of ions in vacuum -types of analysis: intact peptide ions and fragment ions
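A short sketch of how m/z relates to the neutral mass for a positively charged ion, assuming each charge comes from one added proton (as in typical peptide ESI or MALDI). The 1500 Da mass is a hypothetical example.

```python
PROTON = 1.007276  # mass of a proton in Da

def mz(neutral_mass, charge):
    """m/z of an ion carrying `charge` extra protons."""
    return (neutral_mass + charge * PROTON) / charge

# The same hypothetical 1500 Da peptide at three charge states.
for z in (1, 2, 3):
    print(z, round(mz(1500.0, z), 4))
```

The same molecule appears at different m/z values depending on its charge state, which is why the charge has to be known before a molar mass can be read from a spectrum.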
protein-protein interactions (PPI)
-obligate -non-obligate -intermediate -strong -transient
phosphoproteomics
-recognize and identify phosphoproteins -determine phosphorylation sites -quantify the degree of phosphorylation
guanine-nucleotide binding protein
-regulate a variety of processes, including sensory perception, protein synthesis, various transport processes, and cell growth and differentiation -act as molecular switches and timers that cycle between inactive guanosine diphosphate (GDP)-bound and active guanosine triphosphate (GTP)-bound states
limitations of 2D gels
-resolution -sensitivity -representation -automation
glycoproteomics
-simultaneous analysis of protein and carbohydrate is difficult
antibodies
-used to isolate a specific protein or peptide -recognize specific structural features called epitopes -Some recognize a specific protein or a class of proteins with a similar feature
crosslinking
-used to map interactome networks within the context of living cells -rely on use of metabolic engineering or genetic code expansion to incorporate photocrosslinking analogs of amino acids or sugars into cellular biomolecules
resolution
-we can resolve 2500 spots but there are 10 times that many proteins -Use multiple pH ranges and run several gels instead of one -Use nonlinear pH gradients -Prefractionate the sample before running the gel
domain fusion
-when two proteins from different domains fuse -Genetic mechanisms that influence the layout of multidomain proteins include gross rearrangements such as inversions, translocations, deletions and duplications, homologous recombination, and slippage of DNA polymerase during replication
Polymerase Chain Reaction (PCR)
1) A pair of primers that hybridize to the flanking sequences 2) All 4 dNTPs 3) A heat stable DNA polymerase
three principles of SAGE
1) A short sequence tag (10-14bp) contains sufficient information to uniquely identify a transcript provided that the tag is obtained from a unique position within each transcript 2) Sequence tags can be linked together to form long serial molecules that can be cloned and sequenced 3) Quantitation of the number of times a particular tag is observed provides the expression level of the corresponding transcript
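The second and third principles can be sketched as splitting a sequenced concatemer back into tags and counting them. This is a toy model: it assumes 10 bp tags punctuated by a CATG anchoring-enzyme site, ignores ditag structure, and uses hypothetical tag sequences and gene names.

```python
from collections import Counter

ANCHOR = "CATG"  # assumed anchoring-enzyme site separating tags

def count_tags(concatemer, tag_len=10):
    """Split a concatemer at the anchor site and count tags of the expected length."""
    pieces = [p for p in concatemer.split(ANCHOR) if p]
    return Counter(p for p in pieces if len(p) == tag_len)

# Hypothetical tag-to-transcript lookup table.
TAG_TO_GENE = {"GTTACGGATC": "geneA", "TTGACCAGTA": "geneB"}

reads = ["GTTACGGATC", "TTGACCAGTA", "GTTACGGATC", "GTTACGGATC"]
concatemer = ANCHOR + ANCHOR.join(reads) + ANCHOR
counts = count_tags(concatemer)

for tag, n in counts.most_common():
    print(TAG_TO_GENE.get(tag, "unknown"), tag, n)
```

The tag count stands in for the expression level of the matching transcript, so geneA would be read as three times more abundant than geneB in this toy data.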
triple quadrupole mass spec-2 methods for running it
1) Analysis of intact peptides - 1st quadrupole in scanning mode & others in RF mode 2) Analysis of peptide fragmentation - 1st one in scanning mode, 2nd one in RF mode as a collision cell and 3rd in scanning mode to separate fragments
sequence determination-3 ways
1) Edman degradation 2) mass spec 3) sequencing the corresponding DNA in the gene
classification systems
1) SCOP 2) CATH 3) FSSP
Super folds
1) TIM barrel fold 2) alpha/beta hydrolase fold 3) NAD binding domain 4) P-loop NTP hydrolase fold 5) ferredoxin like fold
Advantages of Synchrotron Radiation
1) brightness 2) continuous spectrum 3) better optics
CATH structural characterization
1) class 2) architecture 3) topology 4) homologous superfamily
SCOP structural characterization
1) class 2) fold 3) superfamily 4) family
motion in proteins
1) localized side chain-motions (atomic fluctuation and side-chain movements) 2) Medium-sized structural transitions (Loop motion, terminal-arm motion, rigid-body motion) 3) large scale motions (Domain motion, subunit motion) 4) Global structural transitions (Helix-coil transition, folding/unfolding, subunit association)
components of an x-ray experiment
1) x-ray source 2) x-ray beam 3) crystal 4) diffracted beams 5) detector
Förster Resonance Energy Transfer (FRET)
1. Donor and acceptor must be in close proximity (<10 nm) 2. Absorption spectrum of the acceptor must overlap fluorescence emission spectrum of the donor 3. Donor and acceptor transition dipole orientations must be in a favorable mutual orientation(for optimal energy transfer)
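The proximity requirement follows from the steep distance dependence of the transfer efficiency, E = 1 / (1 + (r/R0)^6), where R0 is the Förster distance at which E is 50%. The sketch below evaluates it for a hypothetical donor-acceptor pair; the R0 value of 5 nm is an assumption for illustration.

```python
def fret_efficiency(r_nm, r0_nm):
    """Transfer efficiency at donor-acceptor distance r for Foerster distance R0."""
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)

R0 = 5.0  # hypothetical Foerster distance in nm
for r in (2.5, 5.0, 7.5, 10.0):
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, R0):.3f}")
```

At twice R0 the efficiency has already dropped below 2%, which is why FRET only reports on pairs within roughly 10 nm.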
8 steps of determining sequence
1. If more than one polypeptide chain, separate. 2. Cleave (reduce) disulfide bridges 3. Determine composition of each chain 4. Determine N- and C-terminal residues 5. Cleave each chain into smaller fragments and determine the sequence of each fragment 6. Repeat step 5, using a different cleavage procedure to generate a different set of fragments 7. Reconstruct the sequence of the protein from the sequences of overlapping fragments 8. Determine the positions of the disulfide crosslinks
antibody based protein array
1. Incubate soluble sample on plate or membrane-immobilized antibody array 2. Add a mix of labeled soluble antibody against the same set of antigens 3. Incubate with developing system and quantitate signal
ubiquitinomics
1. Insert hexa-His tag onto yeast ubiquitin gene 2. Culture modified cells normally 3. Recover ubiquitin-tagged proteins using a Ni column 4. Digest with trypsin 5. Fractionate using multidimensional chromatography 6. Analyze using MS/MS
isomorphous replacement
1. Is usually the method of choice. 2. This method requires the incorporation of a heavy atom into the crystal structure, which leads to small but significant changes in the diffracted intensities. 3. The differences between native and derivative data allow the positions of the heavy atoms to be determined, from which heavy atom phases can be calculated. 4. With more than one derivative, or with additional anomalous differences, the heavy atom phases are sufficient to determine the full phase information
peptide mass fingerprinting
1. Sample is a single protein or simple mixture digested with a specific enzyme (trypsin) 2. Determine masses of peptides - MALDI-TOF 3. Select sequence database and do a virtual digest 4. Correlate experimental peptide masses with theoretical ones 5. Rank proteins in database in order of best correlation
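Steps 3 to 5 can be sketched as a virtual tryptic digest followed by mass matching and ranking. This is a simplified illustration, not a production search engine: the cleavage rule (after K or R, not before P), the 0.02 Da tolerance, the use of neutral monoisotopic masses and the two-protein database are all assumptions, and the protein sequences are hypothetical.

```python
# Monoisotopic residue masses in Da.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

def tryptic_digest(seq):
    """Cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        nxt = seq[i + 1] if i + 1 < len(seq) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER

def score(observed, protein_seq, tol=0.02):
    """Number of observed masses explained by the protein's virtual digest."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(protein_seq)]
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)

def rank(observed, database, tol=0.02):
    """Database entries sorted from best to worst match."""
    scores = {name: score(observed, seq, tol) for name, seq in database.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical database; "observed" masses are simulated from protA.
database = {"protA": "MAGKLLSERPTWK", "protB": "GGHKAAYRDDEK"}
observed = [peptide_mass(p) for p in tryptic_digest(database["protA"])]
print(rank(observed, database))
```

Real MALDI-TOF peaks are usually [M+H]+ ions, so measured values would need the proton mass subtracted before being compared with these neutral masses.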
biochemical methods
1. analytical approach 2. physical approach 3. genetic approach 4. cell-biology
energy minimization
1. move down the gradient 2. Move down the gradient again, but increase the step size because the energy is lower 3. Continue to move down the gradient, which is now in the opposite direction from the first two steps 4a. Continue moving down the gradient with a larger stepsize because the energy is less 4b. Stop if the gradient (dE/dx) is less than an input value
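The steps above can be sketched as a one-dimensional steepest-descent loop with an adaptive step size. The quadratic energy function, the growth and shrink factors (1.2 and 0.5) and the gradient threshold are assumptions for illustration; a real force field would have many coordinates.

```python
def minimize(energy, grad, x, step=0.1, gtol=1e-6, max_iter=10000):
    """Steepest descent: grow the step after a downhill move, shrink it otherwise."""
    e = energy(x)
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < gtol:          # step 4b: stop when the gradient is small enough
            break
        x_new = x - step * g       # move down the gradient
        e_new = energy(x_new)
        if e_new < e:              # energy dropped: accept and take a bigger step
            x, e = x_new, e_new
            step *= 1.2
        else:                      # overshoot: reject and take a smaller step
            step *= 0.5
    return x, e

# Hypothetical energy surface with a single minimum at x = 2.
energy = lambda x: (x - 2.0) ** 2
grad = lambda x: 2.0 * (x - 2.0)

x_min, e_min = minimize(energy, grad, x=0.0)
print(x_min, e_min)
```

Because each move only follows the local gradient, a loop like this ends in whichever minimum is downhill from the starting point.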
three most studied and widespread types of PTM
1. phosphorylation (addition and removal of phosphate groups) 2. glycosylation (addition of short-chain carbohydrates, oligosaccharides or glycans) 3. ubiquitinylation (addition of ubiquitin, tag for localization and degradation)
main steps of molecular dynamics simulation
1. preparation of data 2. heating 3. equilibration 4. production 5. analysis
molecular replacement
1. probably the easiest and fastest way to solve a structure 2. It requires that someone else solved a structure of a similar protein earlier 3. The correct placement of the search model allows the calculation of initial phases, which are usually sufficient to solve the new structure, provided the search model is similar enough and the known fragment (it doesn't need to be the whole protein) large enough
tertiary protein structure
3D structure of a polypeptide chain
fluorophore
A small molecule, or a part of a larger molecule, that can be excited by light to emit fluorescence
conservation of linear and angular momenta
After equilibrium is reached the momenta should be conserved also
fluorochrome
Any of various fluorescent substances used in fluorescence microscopy to stain specimens
london attraction
Attraction caused by fluctuations in electronic motions, which create an atomic dipole
stains
CCD camera or densitometer
antibody chip
Detect Ag-Ab interactions
cell-biology
FRET - fluorescence resonance energy transfer
alpha-helical bundle
Ferritin - an iron storage protein
Fluorescent tag or stain
Fluorescence imager & recorded on CCD camera
time reversibility
For a simulation to have this property it should be able to retrace its path back to the original state when the sign of the time step is changed
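A small check of this property, using the velocity Verlet integrator on a one-dimensional harmonic oscillator. The choice of integrator, the unit mass and force constant, and the step count are assumptions for illustration; the point is only that running with the negated time step retraces the trajectory.

```python
def velocity_verlet(x, v, dt, n_steps, k=1.0, m=1.0):
    """Integrate a harmonic oscillator (force = -k x) for n_steps of size dt."""
    a = -k * x / m
    for _ in range(n_steps):
        x += v * dt + 0.5 * a * dt * dt
        a_new = -k * x / m
        v += 0.5 * (a + a_new) * dt
        a = a_new
    return x, v

def total_energy(x, v, k=1.0, m=1.0):
    """Kinetic plus potential energy of the oscillator."""
    return 0.5 * m * v * v + 0.5 * k * x * x

x0, v0 = 1.0, 0.0
x1, v1 = velocity_verlet(x0, v0, dt=0.01, n_steps=1000)    # forward in time
x2, v2 = velocity_verlet(x1, v1, dt=-0.01, n_steps=1000)   # sign of dt flipped

print("returned to start:", abs(x2 - x0), abs(v2 - v0))
print("energy drift:", abs(total_energy(x1, v1) - total_energy(x0, v0)))
```

The same run also shows the total energy staying close to its starting value, up to a small error that depends on the step size.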
interologs
If A and B interact in organism X, then if organism Y has a homolog of A (A') and a homolog of B (B') then A' and B' should interact too. Requires list of known interacting partners
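The inference rule can be sketched as a lookup: for each known interacting pair in organism X, predict an interaction in organism Y when both partners have a homolog there. The protein names and the homolog table are hypothetical.

```python
def predict_interologs(known_pairs, homolog_map):
    """Transfer interactions from organism X to organism Y through homologs."""
    predicted = set()
    for a, b in known_pairs:
        if a in homolog_map and b in homolog_map:
            pair = tuple(sorted((homolog_map[a], homolog_map[b])))
            predicted.add(pair)
    return predicted

# Hypothetical known interactions in organism X.
known_in_x = [("A", "B"), ("A", "C"), ("D", "E")]
# Hypothetical homologs in organism Y; C and E have none.
x_to_y = {"A": "A'", "B": "B'", "D": "D'"}

print(predict_interologs(known_in_x, x_to_y))
```

Only the A-B pair transfers here, because the other pairs each have a partner with no homolog in organism Y.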
phylogenetic profiling
If two proteins, P1 and P2 function/interact together, they must co-evolve. So every organism that has a homolog of P1 must also have a homolog of P2
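A minimal sketch of the comparison: each protein gets a presence/absence profile across a set of genomes, and proteins with matching profiles are flagged as candidate functional partners. The genome names, protein names and the zero-mismatch cutoff are hypothetical choices for illustration.

```python
from itertools import combinations

# Hypothetical data: which genomes contain a homolog of each protein.
GENOMES = ["org1", "org2", "org3", "org4", "org5"]
HOMOLOGS = {
    "P1": {"org1", "org3", "org4"},
    "P2": {"org1", "org3", "org4"},
    "P3": {"org2", "org5"},
}

def profile(protein):
    """Presence (1) or absence (0) of a homolog in each genome."""
    return tuple(int(g in HOMOLOGS[protein]) for g in GENOMES)

def mismatches(p, q):
    """Number of genomes in which exactly one of the two proteins is present."""
    return sum(a != b for a, b in zip(profile(p), profile(q)))

def linked_pairs(max_mismatches=0):
    """Protein pairs whose profiles differ in at most max_mismatches genomes."""
    return [(p, q) for p, q in combinations(sorted(HOMOLOGS), 2)
            if mismatches(p, q) <= max_mismatches]

print(profile("P1"), profile("P3"))
print(linked_pairs())
```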
colorimetric assays
Lowry assay based on reduction of Cu2+
phage display
Molecular technique by which foreign proteins are expressed at the surface of phage particles
conservation of energy
Newton's equation of motion conserves the total energy
non-obligate protein-protein interactions
Non-obligate permanent heterodimer like Thrombin and rodniin inhibitor
transient protein-protein interactions
Non-obligate transient homodimer, Sperm lysin (interaction is broken and formed continuously)
differential centrifugation
Particles of different densities or sizes in a suspension will sediment at different rates, with the larger and denser particles sedimenting faster. These sedimentation rates can be increased by using centrifugal force. A suspension of cells subjected to a series of increasing centrifugal force cycles will yield a series of pellets containing cells of decreasing sedimentation rate.
physical approach
Photoactivable crosslinkers
expressional proteomics
Protein Identification and Qualitative Analysis
salting out
Proteins precipitate in high concentrations of ammonium sulfate
Gel filtration chromatography process
Some proteins are small enough to also enter the molecular holes of the gel bead. Other proteins are too large to enter the holes and pass by the gel bead. The concept is of Reverse Sieve, since a normal sieve retains large and passes small particles. In gel filtration the larger proteins elute first, medium sized ones next and finally the smallest elute last
lectin
Specific Carbohydrate-Binding Proteins that Promote Cell-Cell Interactions
gel filtration chromatography
The gel bead has molecular size holes so that small molecules like water and buffer enter them completely
yeast two-hybrid system
The key to the two-hybrid screen is that in most eukaryotic transcription factors, the activating and binding domains are modular and can function in close proximity to each other without direct binding
affinity chromatography
To remove the protein of interest from the column, you can elute with a solution of a compound with higher affinity than the ligand (competitive elution). Alternatively, you can change the pH, ionic strength and/or temperature so that the protein-ligand complex is no longer stable
exclusion limit
a function of molecular shape, since elongated molecules are less likely to penetrate a gel pore than other shapes
architecture
a large-scale grouping of topologies which share particular structural features
differential expression
a protein expressed only in the second sample is circled in red. The yellow circles show proteins which are differentially expressed.
circular dichroism
a spectroscopic technique where the CD of molecules is measured over a range of wavelengths
radio-frequency (RF) mode
allows ions of any m/z to pass through
Primary Protein Structure
amino acid sequence
PCR medical
amplification of pathogen DNA for early detection
dynamic average
an average over a single point in phase space at all times
thermodynamic average
an average over all points in phase space at a single time
orbitrap
an ion trap mass analyzer consisting of an outer barrel-like electrode and a coaxial inner spindle-like electrode that traps ions in an orbital motion around the spindle
analytical arrays
capture agents
lysozyme
cleaves bacterial cell walls
domain
compact folding regions within a single chain
same gene
compensating mutation that returns protein to active state
scanning mode
constant voltage is applied and it acts as a mass filter -> only ions of a specific m/z pass
PCR forensic
establish guilt in criminal cases - compare defendant (D) and victim (V) DNA to that on jeans and shirt
PCR scientific
evolution studies - amplify ancient DNA molecules
in vivo
express fusion protein in vivo, Purify complexes from the cell
quadrupole
four parallel metal rods with opposite pairs connected electrically -> voltage across space between them
class-SCOP
general "structural architecture" of the domain
topology
high structural similarity but no evidence of homology. Equivalent to a fold in SCOP
Reversed Phase Chromatography in Proteomics
Compared with hydrophobic chromatography (HC): 1) Both use a hydrophobic resin 2) RP elutes by increasing the amount of organic component of the elution buffer 3) HC elutes by decreasing the salt concentration in the elution buffer
homologous superfamily
indicative of a demonstrable evolutionary relationship. Equivalent to the superfamily level of SCOP
affinity chromatography in proteomics
isolates proteins that bind to a specific ligand
alpha-helical coiled coil
keratin
secondary protein structure
local, repetitive folding of backbone (alpha helix, beta sheet)
protein crystallization
myoglobin in dilute buffer --(addition of (NH4)2SO4)-> myoglobin in 3 M (NH4)2SO4, pH 7 --(several days)-> myoglobin crystals
Bragg's law
nλ = AB + BC, where AB = BC = d sinθ, so nλ = 2 AB = 2 d sinθ
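As a quick worked example, the relation nλ = 2 d sinθ can be rearranged to give the lattice spacing d from a measured diffraction angle. The wavelength below is the commonly quoted Cu Kα value in ångströms, and the 30° angle is a hypothetical measurement.

```python
import math

def d_spacing(wavelength, theta_deg, n=1):
    """Lattice spacing d from n * wavelength = 2 * d * sin(theta)."""
    return n * wavelength / (2.0 * math.sin(math.radians(theta_deg)))

wavelength = 1.5418   # Cu K-alpha, in angstroms
theta = 30.0          # hypothetical Bragg angle in degrees

print(f"d = {d_spacing(wavelength, theta):.4f} angstroms")
```

At θ = 30°, sinθ is 0.5, so for n = 1 the spacing equals the wavelength.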
in vitro
over express protein in vitro, Bind fusion protein to a column and run whole cell lysate through the column. Identify proteins that "stick" to the fusion protein
functional arrays
proteins with ligands in solution
different gene
re-establish protein-protein interaction or suppressor t-RNA
gene knockouts
replacing a gene with a non-functional analog that results in a loss of function
fold
similar arrangement of regular secondary structures but without evidence of evolutionary relatedness
family
some sequence similarity can be detected
quaternary protein structure
spatial arrangements of chains
alpha-Lactalbumin
substrate specifier protein
superfamily
sufficient structural and functional similarity to infer a divergent evolutionary relationship but not necessarily detectable sequence homology
suppressor mutations
suppress the change due to mutation at site A by producing an additional genetic change at site B
relative elution volume
the behavior of a particular solute on a given gel that is independent of the size of the column
synthetic lethality
the combination of mutations in two genes causes cell death or reduced fitness
protein turnover
the degradation & re-synthesis of proteins that occurs constantly
zonal centrifugation
the faster sedimenting particles are not contaminated by the slower particles as occurs in differential centrifugation. However, the narrow load zone limits the volume of sample (typically 10%) that can be accommodated on the density gradient. The gradient stabilizes the bands and provides a medium of increasing density and viscosity
epistasis
the interaction of genes that are not alleles, in particular the suppression of the effect of one such gene by another
class-CATH
the overall secondary-structure content of the domain
elution volume
the volume of a solvent required to elute a given solute from the column after it has first contacted the gel
super secondary protein structure
thermodynamically favorable combinations of secondary structure
basic charged groups
they interact with negatively charged molecules and are called anion exchangers
acidic charged groups
they interact with positively charged proteins and are called cation exchangers
goal of energy minimization
to calculate the conformation with the lowest energy - the global minimum and to find the local minimum of the potential energy function
central dogma traditional vs. contemporary
traditional: gene -> mRNA -> protein contemporary: genome -> transcriptome -> proteome
direct methods
try to derive the phase information from phase relationships and plausible constraints
UV absorbance
used for quantitation and to follow elution from chromatography
Immobilized metal affinity chromatography (IMAC)
used to isolate phosphoproteins or proteins with His-tags
glutathione beads
used to isolate proteins fused to glutathione-S-transferase
homology modeling
using a structure that is similar in sequence to predict the structure of your protein of interest
obligate protein-protein interactions
usually permanent the protomers are not found as stable structures on their own in vivo
genome-wide mutagenesis
random mutagenesis across the genome using mutagens or insertion sequences