BIL 255 Topic 6: Proteins
objectives of this section
Review protein structure: primary, secondary, tertiary, quaternary. Protein folding mechanisms: chaperones. Degradation/turnover: proteasomes. Look at molecular motors. Analyze enzyme kinetics.
secondary structure - alpha helix
Rigid, rod-like cylinder around a long-axis core; R groups radiate outward. 3.6 aa per 360° turn; rise of 1.5 Å per residue, so a single repeat turn of the helix (360°) = 5.4 Å = 0.54 nm. Right-handed helix (forms counterclockwise). Helix is formed by H bond interactions between the H of N (of one aa) and the C=O (of the 4th aa away).
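The helix numbers on this card are related by simple arithmetic, which the sketch below works through. It assumes an ideal, straight alpha helix and uses only the two constants quoted above (1.5 Å rise per residue, 3.6 residues per turn); the function names and the 20-residue example are hypothetical, chosen for illustration.

```python
# Constants taken from the card above; an ideal helix is assumed.
RISE_PER_RESIDUE_A = 1.5   # axial rise per residue, in angstroms
RESIDUES_PER_TURN = 3.6    # residues per 360-degree turn


def helix_pitch_angstrom() -> float:
    """Axial length of one full turn (the pitch), in angstroms."""
    return RISE_PER_RESIDUE_A * RESIDUES_PER_TURN


def rotation_per_residue_deg() -> float:
    """Angle swept around the helix axis by each residue, in degrees."""
    return 360.0 / RESIDUES_PER_TURN


def helix_length_nm(n_residues: int) -> float:
    """Approximate axial length of an ideal helix, in nanometers."""
    return n_residues * RISE_PER_RESIDUE_A / 10.0  # 10 angstroms per nm


if __name__ == "__main__":
    print(f"pitch: about {helix_pitch_angstrom():.1f} angstroms per turn")
    print(f"rotation: about {rotation_per_residue_deg():.0f} degrees per residue")
    # Hypothetical 20-residue helix, used only as an example.
    print(f"20 residues: about {helix_length_nm(20):.1f} nm long")
```

The pitch works out to about 5.4 Å (0.54 nm), matching the value on the card, and each residue turns the chain by about 100°.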
oligopeptide
short chain of amino acids 2-20
two classes of proteins
simple and complex
globulins
soluble in dilute aqueous solutions (with ions); insoluble in pure distilled water
albumins
soluble in pure water (distilled); are globular in shape; includes many enzymes
visualization
some common ways primary sequences are depicted
quaternary
spatial relationships between 2 or more different polypeptides or subunits
storage protein
stores small molecules or ions. Iron is stored in the liver by binding to the small protein ferritin; ovalbumin in egg white is used as a source of amino acids for the developing bird embryo; casein in milk is a source of amino acids for baby mammals.
protein function is derived from
the 3D structure (conformation), which is specified by the primary amino acid sequence and its interactions with the local environment
invariants ex. ubiquitin
Ubiquitin (proteasomes; 96% sequence universality among eukaryotes) and histones (chromosomes; few sequence differences among eukaryotes). Ubiquitin is found in all eukaryotic tissues, has a MW of 8.5 kDa, and is composed of 76 amino acid residues; when tagged to a protein it can signal that protein's degradation via proteasomes. For example, the human and yeast ubiquitins differ at only three amino acid residues, and the amino acid sequences of sea slug (Aplysia) and human ubiquitin are 100% identical. Ubiquitin is found in all eukaryotes but not in prokaryotes. UBA is short for ubiquitin-associated domain. Ubiquitin is a signal added to an incorrectly folded protein, which allows it to be degraded by the proteasome so that the amino acid constituents can be recycled.
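The "96%" figure can be checked against the residue counts on this card (76 residues, 3 differences between human and yeast). The sketch below is a minimal percent-identity calculation that assumes the two sequences are already aligned and of equal length. The function names are hypothetical, and the short strings in the demo are made-up toy sequences, not real ubiquitin.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def identity_from_counts(length: int, differences: int) -> float:
    """Percent identity when only the length and number of differences are known."""
    return 100.0 * (length - differences) / length


if __name__ == "__main__":
    # Counts from the card: 76 residues, 3 differing between human and yeast.
    print(f"human vs yeast ubiquitin: {identity_from_counts(76, 3):.1f}% identical")
    # Toy 10-letter sequences (NOT real proteins) differing at one position.
    print(f"toy example: {percent_identity('ACDEFGHIKL', 'ACDEFGHIKV'):.1f}% identical")
```

With 73 of 76 positions unchanged, the identity comes to roughly 96%, in line with the value quoted for ubiquitin.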
serine proteases
(trypsin, chymotrypsin, elastase) Digestive enzymes that all have SER at the active site, resulting in nucleophilic hydrolysis of peptide bonds; serine directly participates in the protein cleavage reaction. Figure: backbone models of two serine proteases, elastase and chymotrypsin. Green shaded areas are the same in the two proteins, the two conformations are very similar, and the active site of each enzyme is circled in red.
zinc finger
1 α helix and 2 β strands (β-α-β) with antiparallel orientations form "fingers" bound by a Zn ion that often bind to DNA (or RNA). Zinc fingers are so named because their structure resembles a hand with a pointed finger. They are able to recognize and bind to different 3-letter nucleotide sequences. Zinc finger nucleases are designed to precisely target, find, and cut specific sequences of DNA. Transcription factors and nucleases also bind to DNA by means of zinc finger motifs.
dipeptide
2 amino acids
insulin
2 polypeptides of 21 and 30 aa's: A chain = 21 amino acids, B chain = 30 amino acids; MW = 5808 Da. Structure, variances, analogs, and ADA. High degree of invariance of insulin amino acid sequences. Lantus and Basaglar are insulin analogs made via recombinant DNA in nonpathogenic E. coli and used to treat type 1 and type 2 diabetes.
tripeptide
3 amino acids
peptide
< 50 aa
tertiary
complete 3D shape of a peptide due to weak electrostatic forces
enzyme in egg whites and human tears
An enzyme found in egg whites & human tears with 129 aa's and 8 CYS residues, which form 4 S-S bonds; it hydrolyses peptidoglycans in bacterial cell walls thus functioning as a bactericidal agent. MW = 14,600.
first protein sequenced was
Beef insulin, by Fred Sanger (1958 Nobel Prize winner)
family of peroxidase
EC 1.11.1.x (Enzyme Commission number). Substrates: hydrogen peroxide, organic hydroperoxides, or lipid peroxides. Members: cytochrome-c peroxidase EC 1.11.1.5; catalase EC 1.11.1.6; peroxidase EC 1.11.1.7; thyroid peroxidase EC 1.11.1.8; glutathione peroxidase EC 1.11.1.9; lignin peroxidase EC 1.11.1.14. Each enzyme has a unique amino acid sequence, yet they all catalyze the same chemical reaction.
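EC numbers are hierarchical, so family membership can be read off the first three fields. The sketch below stores the six entries from this card in a dictionary and checks that each falls under EC 1.11.1. The helper names are hypothetical, and the lysozyme number used as a counterexample comes from the EC 3.2.1.17 card in this set.

```python
from typing import Dict, Tuple

# Enzyme names and EC numbers exactly as listed on the card above.
PEROXIDASE_FAMILY: Dict[str, str] = {
    "cytochrome-c peroxidase": "1.11.1.5",
    "catalase": "1.11.1.6",
    "peroxidase": "1.11.1.7",
    "thyroid peroxidase": "1.11.1.8",
    "glutathione peroxidase": "1.11.1.9",
    "lignin peroxidase": "1.11.1.14",
}


def parse_ec(ec_number: str) -> Tuple[int, ...]:
    """Split an EC number such as '1.11.1.6' into its integer fields."""
    return tuple(int(field) for field in ec_number.split("."))


def in_family(ec_number: str, prefix: Tuple[int, ...] = (1, 11, 1)) -> bool:
    """True if the EC number starts with the given class/subclass prefix."""
    return parse_ec(ec_number)[: len(prefix)] == prefix


if __name__ == "__main__":
    for name, ec in PEROXIDASE_FAMILY.items():
        print(f"{name}: EC {ec} in 1.11.1.x? {in_family(ec)}")
    # Lysozyme (EC 3.2.1.17) is a hydrolase, so it falls outside the family.
    print(f"lysozyme in 1.11.1.x? {in_family('3.2.1.17')}")
```

Comparing integer fields rather than raw strings avoids mistakes such as treating "1.11.1.14" as a prefix match for "1.11.1.1".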
alpha helix and beta sheet motifs often found within a protein domain
EF hand, zinc finger
polymorphism ex. peroxidase family
2 H2O2 → 2 H2O + O2. Interspecific: between species (each has a different aa sequence). Intraspecific: within a species (liver vs kidney). Peroxidases or peroxide reductases (EC number 1.11.1.x) are a large group of enzymes that play a role in various biological processes. They are named after the fact that they commonly break up peroxides.
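A quick atom count shows why the peroxide breakdown needs two H2O2 on the left: 2 H2O2 → 2 H2O + O2. The sketch below is a minimal balance check; the function names and the (coefficient, formula) representation are hypothetical conveniences for illustration.

```python
from collections import Counter
from typing import List, Tuple

# Atom counts for each molecule in the reaction.
H2O2 = Counter({"H": 2, "O": 2})
H2O = Counter({"H": 2, "O": 1})
O2 = Counter({"O": 2})

Side = List[Tuple[int, Counter]]  # list of (coefficient, molecule) pairs


def total_atoms(side: Side) -> Counter:
    """Sum the atoms of every molecule on one side of the equation."""
    total: Counter = Counter()
    for coefficient, molecule in side:
        for element, count in molecule.items():
            total[element] += coefficient * count
    return total


def is_balanced(reactants: Side, products: Side) -> bool:
    """True if both sides contain the same number of each element."""
    return total_atoms(reactants) == total_atoms(products)


if __name__ == "__main__":
    products = [(2, H2O), (1, O2)]
    print("1 H2O2 -> 2 H2O + O2 balanced?", is_balanced([(1, H2O2)], products))
    print("2 H2O2 -> 2 H2O + O2 balanced?", is_balanced([(2, H2O2)], products))
```

Only the version with a coefficient of 2 on hydrogen peroxide has four H and four O on each side.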
primary sequence linear chain
Linear sequence of amino acids in a polypeptide Repeated peptide bonds form the backbone of the polypeptide chain R side groups project outward on alternate sides along a zig zag backbone
site specificity ex. signal sequence for protein targeting, prosthetic binding site
Protein targeting or protein sorting is the biological mechanism by which proteins are transported to their appropriate destinations in the cell or outside it. Proteins can be targeted to the inner space of an organelle, different intracellular membranes, the plasma membrane, or the exterior of the cell via secretion. This delivery process is carried out based on information contained in the protein itself. Correct sorting is crucial for the cell; errors can lead to diseases. Components of ER targeting: mRNA, ribosome, growing polypeptide chain, ER signal sequence, signal recognition particle (SRP), SRP receptor, protein translocator. Steps: (1) the SRP binds to both the exposed ER signal sequence and the ribosome; (2) the SRP-ribosome complex then binds to an SRP receptor in the ER membrane; (3) the SRP is released (displaced and freed for reuse), passing the ribosome from the SRP receptor to a protein translocator in the ER membrane; (4) protein synthesis resumes, and the translocator starts to transfer the growing polypeptide across the lipid bilayer into the ER.
secondary structure - beta sheet
Short segments (3-10 residues) connect laterally by H bonds to form pleated sheets, i.e. linear, extended zig-zag strands; H bonding can be intra- or inter-chain. About ⅓ of typical protein structure is in beta sheets. Can be parallel or antiparallel. Resist pulling (tensile) forces, which gives silk fibers their strength; model = fibroin.
variety of protein structures - infinite
The variety of protein structures may be infinite. The average protein has 300-500 amino acids and a MW of 35 kDa to 55 kDa. A protein of 300 aa's can have 20^300 different linear arrays of aa's.
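To get a feel for how large 20^300 is, the sketch below counts the possible sequences for a chain of a given length, assuming 20 amino acid types are equally allowed at every position. Python integers are arbitrary precision, so the value is exact. The function names are hypothetical.

```python
AMINO_ACID_TYPES = 20  # the 20 standard amino acids


def sequence_count(chain_length: int) -> int:
    """Number of distinct linear sequences for a chain of the given length."""
    return AMINO_ACID_TYPES ** chain_length


def decimal_digits(value: int) -> int:
    """Number of decimal digits needed to write a positive integer."""
    return len(str(value))


if __name__ == "__main__":
    print("dipeptides:", sequence_count(2))
    print("tripeptides:", sequence_count(3))
    print("digits in 20^300:", decimal_digits(sequence_count(300)))
```

A dipeptide already has 400 possible sequences and a tripeptide 8,000; for 300 residues the count is a 391-digit number, roughly 2 × 10^390.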
contractile (motor)
contract, change shape, elements of cytoskeleton (actin, myosin, tubulin)
secondary structure
Well defined periodic structure making up some 30% of a protein's structure
bovine insulin structure
Bovine insulin is composed of 2 separate peptide chains: the A chain contains 21 amino acid residues and the B chain contains 30 amino acid residues. Figure labels: each chain runs from an N terminal (R-NH) to a C terminal (COOH).
lysozyme
a family of enzymes
primary sequence size
A protein's size is specified by its mass (MW in daltons; 1 Da = 1 amu). The average MW of the amino acids is about 113 Da, so a protein with a measured mass of 33,900 Da has about 300 amino acids. The average yeast protein is 52,728 Da (52.7 kDa), with about 466 amino acids.
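The mass-to-length rule of thumb on this card is a single division. The sketch below applies it in both directions, assuming the card's average of about 113 Da per amino acid; the function names are hypothetical, and real proteins deviate from the estimate depending on composition.

```python
AVERAGE_RESIDUE_MASS_DA = 113.0  # average mass per amino acid, from the card


def estimate_residue_count(mass_da: float) -> float:
    """Rough number of amino acids in a protein of the given mass."""
    return mass_da / AVERAGE_RESIDUE_MASS_DA


def estimate_mass_da(residue_count: int) -> float:
    """Rough mass in daltons for a protein with the given number of residues."""
    return residue_count * AVERAGE_RESIDUE_MASS_DA


if __name__ == "__main__":
    print(f"33,900 Da: about {estimate_residue_count(33900):.0f} residues")
    print(f"52,728 Da: about {estimate_residue_count(52728):.1f} residues")
    # Cross-check with the lysozyme card (129 aa, MW = 14,600).
    print(f"129 residues: about {estimate_mass_da(129):.0f} Da")
```

The 33,900 Da example gives 300 residues, and the average yeast protein comes out near the "about 466" quoted on the card. For lysozyme's 129 residues the estimate is close to the listed MW of 14,600.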
b turns
a region of 3 or 4 amino acids that redirects the backbone. Involves 4 residues: 1st and 2nd = PRO in cis, 3rd = GLY, and a 4th.
beta sheet parallel and antiparallel
a ribbon view of the 3D structure of Synechococcus and a schematic diagram of the antiparallel beta sheet
secondary structure summary
A stylized, folded polypeptide chain showing both alpha helical and beta sheet segments; these modular segments are linked by reverse turns and regions of random coil.
alpha and beta regions
α/β regions combine to help establish the initial shape of proteins (ribbons and sheets). Non-α/β regions include hinges, turns, loops, etc., which provide flexibility.
a complete motif example: αβ barrel motif of methylmalonyl-CoA mutase
Alpha helices are red and beta strands are blue. The inside of the barrel is lined by small hydrophilic side chains (SER and THR), which allows space for the substrate coenzyme A to bind along the axis of the barrel. Methylmalonyl-CoA mutase converts methylmalonyl-CoA to succinyl-CoA, a key intermediate in the Krebs cycle.
beta sheet
Figure labels: amino acid side chain, hydrogen bond, hydrogen, carbon, nitrogen, peptide bond, oxygen.
defensive (protect)
antibodies (IgG), fibrinogen and thrombin, snake venoms, bacterial toxins
glycoproteins (and carbohydrates)
antibodies, cell surface proteins
fibrous structure
architectural proteins of cartilage and connective tissue
polypeptide folding MOTIFs
are conserved super-secondary structures. Motifs are 3D combinations of secondary structure that appear in a variety of proteins and enzymes that may have dissimilar functions, i.e. recurring arrangements of alpha helices and/or beta sheets (αβ motifs) that can occur in different proteins with or without similar functions.
proline turns
are due to either a cis or trans configuration of proline ring
inter specific
between species
transport proteins
bind and carry ligands
gene regulatory protein
binds to DNA to switch genes on or off. The lactose repressor in bacteria silences the gene for the enzymes that degrade the sugar lactose; many different homeodomain proteins act as genetic switches to control development in multicellular organisms, including humans.
lipoproteins (and lipids)
blood, transport and membrane proteins
signal protein
carries signals from cell to cell. Many of the hormones and growth factors that coordinate physiological function in animals are proteins; insulin, for example, is a small protein that controls glucose levels in the blood; netrin attracts growing nerve cells in a specific direction in a developing embryo; nerve growth factor (NGF) stimulates some types of nerve cells to grow axons; epidermal growth factor (EGF) stimulates the growth and division of epithelial cells.
transport protein
carries small molecules or ions. In the bloodstream, serum albumin carries lipids, hemoglobin carries oxygen, and transferrin carries iron. Many proteins embedded in cell membranes transport ions or small molecules across the membrane. For example, the bacterial protein bacteriorhodopsin is a light-activated proton pump that transports H+ ions out of the cell; the glucose carrier shuttles glucose into and out of liver cells; and a Ca2+ pump in muscle cells pumps the calcium ions needed to trigger muscle contraction into the endoplasmic reticulum, where they are stored.
enzymes
catalytic activity and function
trypsin
Figure labels: catalytic site, bound substrate peptide, peptide bond to be cleaved, oxyanion hole, arginine side chain in substrate (guanidinium group), side-chain specificity binding pocket.
enzyme
catalyzes covalent bond breakage or formation. Living cells contain thousands of different enzymes, each of which catalyzes (speeds up) one particular reaction. Examples include: tryptophan synthetase, which makes the amino acid tryptophan; pepsin, which degrades dietary proteins in the stomach; ribulose bisphosphate carboxylase, which helps convert carbon dioxide into sugars in plants; DNA polymerase, which copies DNA; protein kinase, which adds a phosphate group to a protein molecule.
mutation
change in primary amino acid sequence = defective protein - sickle cell trait
structural (support)
collagen of tendons and cartilage, elastin of ligaments (tropoelastin), keratin of hair, feathers, and nails, fibroin of silk and webs
EC 3.2.1.17
damage bacterial cell walls by hydrolyzing the glycosidic bonds (1,4-beta -linkages) between N-acetylmuramic acid and N-acetyl-D-glucosamine residues in peptidoglycans.
alpha helix
Described by Linus Pauling (1954 Nobel Prize) using x-ray diffraction: a right- or left-handed coiled conformation (like a spring) in which each NH donates an H bond to the carbonyl (C=O) 4 residues earlier. Figure labels (x-ray diffraction): x-ray source, incident x-ray beam, crystal, diffracted rays, photographic film; the x-ray diffraction pattern is obtained from the protein crystal.
receptor protein
detects signals and transmits them to the cell's response machinery. Rhodopsin in the retina detects light; the acetylcholine receptor in the membrane of a muscle cell receives chemical signals released from a nerve ending; the insulin receptor allows a liver cell to respond to the hormone insulin by taking up glucose; the adrenergic receptor on heart muscle increases the rate of heartbeat when it binds to adrenaline.
families of proteins
different structures but with related functions; having evolved from a single ancestral protein and may have up to 30%+ commonality of sequence
invariants
don't vary significantly in aa sequence
proteome
entire set of proteins expressed by a genome either a cell's or a whole organism's
polypeptide
few to many amino acids (up to 300) MW is about 10,000
proteins are classified
functionally
motor protein
generates movement in cells and tissues. Myosin in skeletal muscle cells provides the motive force for humans to move; kinesin interacts with microtubules to move organelles around the cell; dynein enables eukaryotic cilia and flagella to beat.
coiled coil
Helices in which the hydrophobic amino acids of one helix wind together with those of others, forming a coil; also called leucine zippers due to their high leucine (Leu) content; common in transcription factors. The leucine zipper consists of two alpha helices that have hydrophobic zones and basic ends. The helices bind to each other by their hydrophobic regions and to DNA by their basic regions; the basic end regions fit into the major groove of the DNA. Because the basic regions are roughly parallel and open up around the DNA, the two helical segments resemble a zipper. Figure labels: hydrophobic surfaces bind together; basic region binds DNA.
collagen
high glycine, proline and no cysteine, when boiled makes gelatin
special purpose protein
highly variable. Organisms make many proteins with highly specialized properties. These molecules illustrate the amazing range of functions that proteins can perform. The antifreeze proteins of Arctic and Antarctic fishes protect their blood against freezing; green fluorescent protein from jellyfish emits a green light; monellin, a protein found in an African plant, has an intensely sweet taste; mussels and other marine organisms secrete glue proteins that attach them firmly to rocks, even when immersed in seawater.
histone proteins
Histone proteins are among the most highly conserved proteins in eukaryotes: H4 differs by only two residues between cows and peas. They are basic, positively charged proteins that associate with DNA, forming the chromatin nucleosome "bead".
scleroproteins
insoluble in most solvents fibrous structure collagen keratins
glutelins
insoluble in most solvents; but soluble in dilute acids/bases
prolamins
insoluble in water; but soluble in 50% to 90% simple alcoholic solutions
kinesin
is a eukaryotic ATPase motor protein that "walks" (moves) along MT's, powered by ATP hydrolysis; a dimeric protein whose globular head (motor domain) binds both ATP and MT's
receptors (detect stimuli)
light and rhodopsin, receptor proteins and acetylcholine or insulin
primary
linear sequence of aa's
protamines
not based upon solubility; small MW proteins with 80% arginine and no cysteine
simple proteins
On acid hydrolysis yield only alpha-L amino acids. Early naming of proteins was historically based on solubility, via the early chemical analysis of isolated proteins; later classification was based upon structural content.
complex proteins
on hydrolysis yield amino acids + other molecules
primary sequence chain
One end of the polypeptide chain has a free (unlinked) amine group: the N terminus. The other end has a free (unlinked) carboxyl group: the C terminus. The chain is written N → C.
how much of a typical protein is in alpha helix?
Only about one third of the structure of a typical protein is alpha helix. Figure: dimer of the CAP protein, formed by interaction between a single identical binding site on each monomer.
storage proteins
ovalbumin, gluten, casein, ferritin
proline bends
Peptide bonds are nearly always in a trans configuration, as it is more stable, with less steric hindrance of the R groups than in a cis configuration. The cyclic nature of proline allows a cis configuration with less hindrance; proline induces a bend, with the nitrogen being restrained by a ring structure. The direction of a polypeptide bend is determined by whether the proline is in a cis or trans configuration. Beta turns are made of 4 amino acids and often include glycine, which lacks a side chain, and proline, with its built-in bend, allowing a protein to fold into a tight U shape, most often toward the interior of a protein. Figure labels: alpha carbons below the plane of the peptide bond (pb); alpha carbons on opposite sides of the pb.
protein
polypeptide with well defined 3D structure
levels of protein structure
primary secondary tertiary quaternary
molecular structure of proteins
Levels: primary (sequence), secondary (local folding), tertiary (long-range folding), quaternary (multimeric organization), supramolecular (large-scale assemblies). Functions: regulation, structure, movement, catalysis, transport, signaling.
polymorphism
proteins may vary in their primary amino acid sequence, but still exhibit the same catalytic activity
keratins
proteins of skin and hair, high basic aa's (lys, arg, his) but with some cys
structural protein
provide mechanical support to tissues and cells. Outside cells, collagen and elastin are common constituents of extracellular matrix and form fibers in tendons and ligaments. Inside cells, tubulin forms long, stiff microtubules and actin forms filaments that underlie and support the plasma membrane; α-keratin forms fibers that reinforce epithelial cells and is the major protein in hair and horn.
lactic dehydrogenase
Lactic acid dehydrogenase (LDH) interconverts pyruvate and lactate (lactic acid). LDH isozymes: each LDH enzyme is made of 4 polypeptide subunits drawn from 2 different kinds of polypeptide, M (muscle) and H (heart). The turnover number (number of reactions per enzyme molecule per second) of LDH-M is much higher than that of LDH-H.
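A tetramer built from two kinds of subunit can have only a handful of distinct compositions, which is where the LDH isozymes come from. The sketch below enumerates them, assuming subunit order within the tetramer does not matter and that every H/M mixture can assemble. The function name is hypothetical.

```python
from itertools import combinations_with_replacement
from typing import List, Sequence


def tetramer_compositions(
    subunit_kinds: Sequence[str] = ("H", "M"), size: int = 4
) -> List[str]:
    """List every unordered subunit composition for an oligomer of `size`."""
    return ["".join(combo) for combo in combinations_with_replacement(subunit_kinds, size)]


if __name__ == "__main__":
    isozymes = tetramer_compositions()
    print(f"{len(isozymes)} possible LDH compositions:")
    for composition in isozymes:
        h_count = composition.count("H")
        m_count = composition.count("M")
        print(f"  {composition}: {h_count} H + {m_count} M")
```

With two subunit kinds and four positions there are five compositions, from all-H (HHHH) to all-M (MMMM).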
protein primary sequence is determined by
reading the genome sequence
secondary
regular, recurring orientation of aa in a peptide chain due to H bond
regulatory (signal)
regulate metabolic processes, hormones, transcription factors and enhancers, growth factor proteins
nucleoproteins (and nucleic acids)
ribosomes and organelles
sickle cell anemia
The change in amino acid sequence causes hemoglobin molecules to crystallize when oxygen levels in the blood are low; as a result, red blood cells sickle and get stuck in small blood vessels.
interactome insights
the expanding interactome has helped researchers show that pathogen effectors target highly connected plant proteins, most likely to control the host's cellular machinery
EF hand
Two short α helices connected by a short loop that binds a Ca2+ ion; found in Ca-binding regulatory proteins. The helix-loop-helix motif is also known as the EF hand. Helix-loop-helix motifs, as in homeodomains, are a common motif in many transcription factors.
site specificity
unique sequences determine intra-cellular location and function
histones
unique structure, high # basic aa's - 90% lys, arg, and his: complex w DNA
intra specific
within a species
flexible alpha helix
Wool is stretchable: stretching breaks H bonds. Figure labels: amino acid side chain, oxygen, hydrogen, nitrogen, hydrogen bond between the O of C=O and the H of NH 4 aa's away.
proteins
work horses of cell metabolism
how are protein structures identified?
x-ray crystallography; structures are available from the PDB (RCSB)