Exam 3
Sanger Sequencing
- aka chain termination method - applied PCR: synthesize & sequence 1 strand of DNA w PCR, fluorescent ddNTPs (dideoxynucleotides) mixed w dNTPs *ddNTP stops DNA synthesis (can't form phosphodiester bond bc no 3' hydroxyl group), fluorescence emits a color specific to each base type (red = T, green = A, blue = C, yellow = G); can determine what was the last nt added based on color *end up w DNA fragments of diff sizes & colors - separate DNA based on size w gel electrophoresis, laser beam & detectors detect color of each size - cons: can only sequence 1 piece of DNA at a time, sequence fragment limited to ~700 bp (errors for longer fragments)
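A minimal sketch of the read-out step described above, assuming each fragment is recorded as a (length, dye color) pair. The function name and the example fragments are hypothetical; the color-to-base mapping is the one listed in the notes.

```python
# Dye colors as listed in the notes: red = T, green = A, blue = C, yellow = G.
COLOR_TO_BASE = {"red": "T", "green": "A", "blue": "C", "yellow": "G"}

def read_sanger(fragments):
    """Return the synthesized strand (5'->3') from (length, color) pairs.

    Electrophoresis orders fragments by size; the dye on each fragment
    tells us which base terminated it, i.e. the base at that position.
    """
    ordered = sorted(fragments, key=lambda frag: frag[0])
    return "".join(COLOR_TO_BASE[color] for _, color in ordered)

# Hypothetical detector output, in arbitrary order.
fragments = [(3, "blue"), (1, "green"), (4, "yellow"), (2, "red")]
print(read_sanger(fragments))  # ATCG
```

The shortest fragment gives the first base added and the longest gives the last, which is why sorting by size is enough to recover the sequence.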
Somatic Mutation Theory
- cancer is derived from a single somatic cell that has accumulated multiple mutations in oncogenes or TS; accumulates all phenotypes involved in promoting cancer - usually requires a carcinogen - default state of animal cell is quiescence (resting); cancer is a disease of cell proliferation caused by mutations in genes that control cell cycle & cell proliferation *caused by deregulation of cell cycle
Tissue Organization Field Theory
- cancer is not necessarily genetic; arises from a disruption of interactions w adjacent tissues *can be mediated by chemical signals, mechanical forces, bioelectric charges - usually caused by tissue developmental error - mutations not needed for carcinogenesis; mutations/genetic instability = byproduct of cancer
Exome Aggregation Consortium (ExAC)
- compare protein coding region of diseased and non-diseased individuals to determine disease causing alleles - collection of exomes (subset of genome composed of exons - protein coding regions) - identified disease causing alleles & fixed several false positives from GWAS
Multifactorial/Complex Traits
- continuous trait: w out clear-cut boundaries btwn phenotypes; gradual change of phenotypes *ex) height, weight, BP - categorical trait: phenotype corresponds to 1 of a number of different possible categories; ex) # of ridges in fingerprint, eye color - threshold trait: only a few phenotypes, but multiple underlying genes together w environ. determine phenotype, phenotype shows once a certain threshold is hit (affected by phenotypic variants & epigenetics); many human diseases fall in this category (phenotypic classes are affected & unaffected) *ex) schizophrenia, type 2 diabetes - caused by a combination of genetic & environmental (including lifestyle) factors (difficult to treat) *incidence dependent on balance of risks; too many negative genetic & environmental factors favor disease - no clear pattern of inheritance (partially due to linkage disequilibrium), difficult to predict risk of inheriting or passing on the disease; can occur in isolation or run in the family - ex) asthma, epilepsy, hypertension, etc. - studied using linkage analysis & statistical analysis (doesn't identify epigenetic disorders); correlation not causation (positive association may be due to genetics or rooted in environmental factors), has to be proven using biochemistry, molecular biology, etc. - Punnett squares can't be used for complex diseases/traits; they don't follow Mendelian genetics
Restriction Enzymes
- endonucleases capable of recognizing & cutting specific sites (restriction sites, usually palindromic) of DNA - bacterial in origin; used to cut foreign DNA (phages, F plasmids) - can be used to fragment genome, cut PCR product, cut plasmid DNA - creates 5' or 3' overhangs (sticky ends) that are complementary & used to stick the insert DNA into the vector DNA; sticky ends interact via WC bp until DNA ligase ligates the pieces of DNA *blunt ends (no overhangs) also possible; less efficient than sticky ends (doesn't require complementary seq, has to stay together long enough for ligase to work)
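A minimal sketch of the two ideas above: a restriction site is "palindromic" when it equals its own reverse complement, and cutting inside the site fragments the DNA. EcoRI (site GAATTC, cut after the first G, giving a 5' overhang) is used as a well-known example; the function names and the input sequence are hypothetical, and only the top strand is modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """A restriction site is palindromic if it reads the same on both strands."""
    return site == reverse_complement(site)

def digest(seq, site, cut_offset):
    """Cut the top strand of seq at every occurrence of site.

    cut_offset is the number of bases into the site where the cut falls
    (1 for EcoRI: G^AATTC).
    """
    fragments = []
    start = 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

print(is_palindromic("GAATTC"))            # True
print(digest("AAGAATTCGG", "GAATTC", 1))   # ['AAG', 'AATTCGG']
```

The second fragment starts with AATTC; on the real double-stranded molecule the AATT portion is the single-stranded sticky end that base-pairs with a matching overhang on the vector.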
Identifying SNPs Involved in Drug Metabolism
genomic approach: - DNA-seq: identifies polymorphisms; compares genomes of patients w no adverse effects to those of patients w adverse effects or no effect *looks for SNPs found only in those individuals & confirms their role in drug metabolism via biochemistry - DNA microarray: diagnoses patient; checks for known variants btwn individuals; probes for a variety of common SNPs (SNP arrays) *doesn't require sequencing, just the DNA sample from patient *SNP array: uses genotype specific regions of genome & finds known SNPs involved in drug metabolism, can compare microarray profile from patient to that of known SNPs *fairly inexpensive 1. warfarin: VKORC1 SNP causes resistance to both enantiomers; CYP2C9 SNPs (e.g. *2, *3) slow metabolism of S-warfarin, so the drug accumulates & bleeding risk rises (lower dose needed) 2. abacavir: those w the HLA-B SNP (HLA-B*57:01) are at risk of a hypersensitivity reaction to the drug 3. tyrosine kinase inhibitors: work better in tumor cells w a specific mutation in epidermal growth factor receptor TK 4. EGFR inhibitors: resistance in some colon cancer patients due to mutations in KRAS (oncogene) 5. 6-mercaptopurine: inactivated by thiopurine-S-methyltransferase (TPMT); SNPs in TPMT result in severe toxicity 6. irinotecan: UGT1A1 SNP results in drug being stuck in active form, blocks topo action in all blood cells, specifically WBCs (will die) 7. pharmacogenomics & depression: cytochrome P450 genes shown to be associated w accelerated & delayed clearance of psychotropic medications; play a role in citalopram metabolism
Recombinant DNA
- 2 or more strands of DNA that have been artificially combined - gene cloning: isolating a gene from an organism & making many copies of it (usually by a diff. organism) - isolate plasmid DNA (vector DNA) from a cell via miniprep & cut plasmid DNA via restriction enzymes - isolate gene of interest (insert DNA) via PCR or restriction enzymes; interacts w vector DNA to form recombinant DNA - uses: express a foreign gene in a host organism, a modified gene in a host organism, or a protein in a test tube; applications in disease treatment (insulin: E. coli is transformed w the recombinant plasmid & cultured, insulin chains A & B are made as fusion proteins & then combined to make insulin; human hormones in animal milk: hormone gene fused to β-lactoglobulin promoter, plasmid injected into animal oocyte & integrated into genome, resulting in transgenic animal that produces milk w the hormone)
Polygenic Inheritance - Ear Lobes
- 49 genetic loci associated w ear lobe shape; spectrum of phenotypes (free to attached) - large-scale genetic studies of morphological traits provide insight into pathways involved in development of ears & face; these pathways may be affected in the presence of malformations - formation of external ear requires precise timing & spatial coordination; disruption can cause birth defects - jaw muscles derived from same cells that make ear; pathways identified may be relevant for craniofacial syndromes (characterized by external ear abnormalities)
Autism
- a developmental disorder diagnosed by several notable symptoms (inappropriate laughing or crying, lack of awareness, etc.) - considered a complex disease bc multiple symptoms used in diagnosis, phenotypic variance of each symptom, multiple potential causes (single gene mutation w broad effects, multigene interactions, environmental & epigenetic factors) - studied using pedigree analysis & genomic studies *pedigree analysis of twins & siblings; established autism to be mostly genetic *genomic studies used to identify genetic factors of autism; inconclusive so far (low sample size, low statistical correlation of mutations to autism) - multiple candidate genes identified through genetic linkage & pedigree studies (most encode proteins involved in neural development & function), but usually can't be traced back to a single gene *usually non-mendelian, but rare mutations exist where 1 mutation has broad consequences (affects multiple developmental pathways) - can't usually be traced to 1 single chromosomal abnormality, but some are linked to autism - 30% of individuals w autism have spontaneous de novo mutations (acquired in meiosis, not present in parent cells); disrupt genes important for brain development - difficult to study bc there are multiple symptoms & not all are present in all cases - autism is mostly genetic, but autism phenotypes are mostly not genetic - not every autistic individual is intellectually disabled; severity depends on how important affected gene is to brain development - caused by single gene mutations (rare; involved in brain & neuron development & signaling), copy number variation (usually acquired via germ line production; duplications or deletions of genes or large fragments of chromosome), multiple mutated genes working together, epigenetic disorders - "opposite phenotype" to schizophrenia; opposite sides of spectrum (growth rate, sensory processing, cancer risk, etc.) *imprinted brain theory: genomic imprinting, evolutionarily father wants greater fitness (increased autism risk), mother wants to conserve her own well being (increased schizophrenia risk) *mother's diet also affects rates of disorder (starvation = increased schizophrenia risk)
Next Generation Sequencing
- aka high throughput sequencing - pros: can sequence many DNA templates in parallel, lower cost per DNA fragment relative to Sanger, can also sequence RNA (converted to cDNA first w reverse transcriptase) - cons: requires construction of genomic library, requires PCR (may show bias against low abundance DNA fragments), can't replicate modified bases (epigenome can't be studied; have to modify genome through bisulfite, etc.) - can be used to study the entire genome, sub-fractions of the genome (DNA bound to a specific protein, DNA bound to any proteins), transcriptome (all RNA in an organism, RNA being transcribed, translated, RNA bound to a specific protein, RNA bound to any proteins) - if part of the genome or transcriptome can be isolated from the rest, NGS can analyze it on a global basis
Transcriptome
- all RNA in a cell/tissue type - comparing transcriptomes: compare different cell types, healthy vs diseased cells, tumor profiling, transcripts in cells from diff stages of development, metabolic pathways (genes in same pathway often expressed together), changes in response to diff environmental agents like hormones or toxic chemicals
Evolution
- all polymorphisms started out as mutations - some mutations confer a selective advantage (passed down from generation to generation, become polymorphisms) - polymorphisms can be used to study population & migration patterns *SNPs useful in inferring population specific disease risk - phylogenetic trees can be used to compare genetic similarities
Oncogene
- altered gene whose product can act in a dominant (only 1 bad copy necessary) fashion to help make a cell cancerous; promotes cell growth & proliferation *mutant form of a normal gene (proto-oncogene, involved in controlling cell growth), gain of function mutation results in abnormal levels of expression or abnormally high protein activity or both - each retroviral oncogene has a corresponding cellular proto-oncogene, not all oncogenes result from a virus - types of cancer-causing retroviruses: 1. acute transforming virus: carry oncogenes in virus RNA genome (oncogene expressed from viral genome), not known to induce cancer in humans; ex) src 2. non-acute (chronic) transforming virus: don't carry oncogenes in genome, but insert themselves upstream of oncogene or proto-oncogene to cause disruptions *oncogene expressed via insertional mutagenesis; transform cells (ex: myc, int-1, int-2) *can cause tumors in animals, but over 1-2 years - non-viral oncogenes: genetic or epigenetic mutation that results in overactive or overexpressed oncoprotein (gain of function mutation) - classes of oncogenes: 1. growth signal mimics: oncogenes that mimic GFs to induce cell proliferation; rare 2. receptors: mutations of cell-surface receptors that usually result in an overactive or constitutive protein-tyrosine kinase (PTK); will be active even w out its GF 3. intracellular transducers: mutations of genes involved in intracellular signaling pathways 4. TFs: mutations in TFs leading to constitutive expression of genes by that TF
Vector DNA
- carrier of insert DNA that is to be cloned - usually derived from natural sources (plasmids, viruses) & modified to fit needs of research/medicine - can replicate independently of host chromosome; insert DNA replicated at the same time - usually have antibiotic resistance which allows selection of cells w plasmid - some have color selection that allows screening of cells with insert; usually used w antibiotic resistance - not compatible w all hosts; seq of origin of replication determines whether a vector can replicate or not in a particular host cell; need multiple origins of replication to be compatible w multiple hosts - transferred into host via horizontal gene transfer: transformation, conjugation, or transduction
Genome Engineering
- changing the DNA code of a living organism through selective breeding, recombinant DNA tech, artificial restriction enzymes, CRISPR/cas9 system - DNA cut w DNA scissors (genetically engineered nucleases) at a specific site *DNA can be inserted into the deleted region using HR of an external piece of DNA or NHEJ w external piece of DNA for a small insertion *DNA can be repaired via dsDNA break repair (treatment for heterozygotes w a dominant disorder) or via NHEJ w out external piece of DNA (treatment for disorders involving insertion mutations, repair of trinucleotide repeat disorders; results in deletion) - prevalent in research, emerging in medicine; some ethical concerns (mostly w off target effects that may result in cancer or other disease formation)
Melanins
- class of compounds that serve predominantly as pigment - responsible for skin, eye, hair pigmentation - derivatives of tyrosine - at least 3 types of naturally occurring melanins (eumelanin, pheomelanin, neuromelanin) *eu & pheo responsible for all coloration except for parts of brain *chemical composition & physical properties differ; chemical & biological responses may behave differently when exposed to UV - skin tone depends on type & amount of melanin produced, volume of keratinocytes (more/denser = darker) *also controlled non-genetically (UV exposure increases melanin production, epigenetics can control production) - form an amorphous mass when separated from melanocytes - heterogeneous; made up of different types of melanin - have some optical properties (depends on type & concentration of melanin; absorb & scatter diff wavelengths of light) & semiconductor properties (unknown purpose) - concentration depends on TFs (expressed differently in each individual), melanin biosynthesis proteins, melanin signaling (MC1R) & transport, sunlight exposure (environmental), epigenetic events (genetic & environmental), etc. 1. eumelanin: dark brown to black optical properties; responsible for darker skin tones - 2 polymers: 5,6-dihydroxyindole-2-carboxylic acid (brown), 5,6-dihydroxyindole (black; more condensed melanin, darker pigment) 2. pheomelanin: red to yellow optical properties (comes across as pink in high concentrations), responsible for lighter skin tones - polymer: benzothiazines - eu & pheo play a role in eye, hair, skin color; coloration independent of skin & hair type *heterogeneous; both present, but concentrations of each decide how light or dark the coloration is 3. neuromelanin: colors certain distinctive regions of the brain, highly concentrated in humans, unknown physiological role (may have a role in detoxification/apoptosis) - abnormalities in concentrations correlate to neurodegenerative diseases; don't know if it's a cause or consequence - polymer of 5,6-dihydroxyindole 4. albinism: caused by a lack of melanin - due to several gene mutations impacting melanin synthesis, melanosome upkeep
Genome Wide Association Studies (GWAS)
- compares DNA of participants having varying phenotypes for a particular trait or disease in order to determine disease causing alleles; diseased (case) vs. non-diseased (controls) - phenotype 1st approach: started w symptoms, then looked at genotype - used chip-based arrays; not sequence based, but can identify SNPs (can tell which polymorphisms present in which people) - if genomic variant is more common in diseased individuals than in controls, it was annotated as disease risk (disease-associated variant); can't specify which genes actually cause the disease - identified several disease causing alleles: type 2 diabetes, RA, osteoporosis, etc.; several drugs designed based on GWAS - poor study design (limited sample size, little significance testing), many false positives (SNPs linked to causal SNPs/mutations; said to be disease causing, but SNP was just common in the population), massive dataset, but limited analysis capabilities, lack of diversity in population based studies - overtaken by NGS tech; current landscape of sequencing technology (sequenced genomes stored in public databases) *integrated to study genomics (DNA), transcriptomics (RNA), proteomics (proteins), epigenomics, metagenomics; can study disease at these levels
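A minimal sketch of the case vs. control comparison for a single SNP, assuming allele counts have already been tallied from the array. The odds ratio and the 2x2 chi-square statistic are one common choice of association measure, not necessarily the one used in any particular study; the function names and counts are hypothetical.

```python
def odds_ratio(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Odds of carrying the alt allele in cases relative to controls."""
    return (case_alt * ctrl_ref) / (case_ref * ctrl_alt)

def chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    numerator = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2
    denominator = ((case_alt + case_ref) * (ctrl_alt + ctrl_ref)
                   * (case_alt + ctrl_alt) * (case_ref + ctrl_ref))
    return numerator / denominator

# Hypothetical allele counts: 30 alt / 70 ref in cases, 10 alt / 90 ref in controls.
print(round(odds_ratio(30, 70, 10, 90), 2))  # 3.86
print(chi_square_2x2(30, 70, 10, 90))        # 12.5
```

A large statistic only says the SNP is associated with case status. As the notes point out, the SNP may simply be linked to the causal variant, and testing many SNPs at once requires a much stricter significance cutoff than a single test.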
Melanocortin 1 Receptor (MC1R)
- controls which type of melanin is produced by melanocytes - when active: triggers chemical rxns inside melanocytes that stimulate eu production - when blocked or inactive: melanocytes make pheo - melanocytes produce eu & pheo at the same time; genetic differences in MC1R result in varying amounts of eu & pheo production (partially responsible for skin tone spectrum) - very little variation in MC1R gene btwn dark skinned individuals; wt *variation controlled by 4 known genomic regions: SLC24A5, MFSD12, DDB1, OCA2/HERC2 *large gene variation btwn non-dark skinned individuals; even exists within the same population of people, likely due to UV light from sun - low sunlight:eu ratio → eu blocks UV light; no vitamin D synthesis or DNA damage - good sunlight:eu ratio → most DNA damage prevented, vitamin D synthesis occurs - high sunlight:eu ratio → vitamin D synthesis occurs, but too much sun results in DNA damage, may result in skin cancer 1. SLC24A5 region: works in golgi of melanocytes, associated w european pigmentation, introduced into east africa probably from middle east 2. MFSD12 region: cysteine transporter in melanocytes, lysosomal protein that modifies melanocyte pigmentation; decreased MFSD12 associated w darker pigmentation, variants found in melanesians, australian aboriginals, some indians 3. DDB1 region: ubiquitin ligase associated w DNA damage repair (associated w UV light); increased expression associated w lighter pigmentation (better mechanism to cope w damage) 4. OCA2 & HERC2 regions: involved in blue eyes, previously uncharacterized variants arose independently from pigmentation associated w non-Africans; actually arose in africa
Induced Pluripotent Stem (iPS) Cells
- conversion of tissue-specific (somatic) cells into pluripotent stem cells; engineered in lab *have similar characteristics to embryonic stem cells - used to study development & diseases, useful for developing & testing new drugs & treatments, may be beneficial for therapeutic use (ethical concerns), used in disease modeling & pharmacogenomics 1. disease - somatic cells removed from patient, reprogrammed to become iPS cells & differentiated into specialized cells - specialized cells can be used in disease modeling to determine molecular mechanism of disease to help find treatments, used for drug screening & discovery (drug toxicity, preclinical drug trials in test tube); involved in designing new medicine 2. therapy: specialized differentiated cells may be used for ex vivo cell therapy (can potentially put the cells back in patient) 3. issues - unpredictable phenotypic variability btwn iPS cell lines; environ. has a profound effect too - requires diseased & non-diseased cell lines that differ only at the disease causing allele - increased issues in complex diseases - only effective for monogenic (mendelian) diseases
Chimeric Antigen Receptor (CAR) T Cells
- ex vivo modified T cells recognize & destroy cancer cells - chimeric: antigen binding & T cell activating function on a single receptor - T cells derived from patient or healthy donors to reduce immune response; ex vivo cell therapy (remove T cells, modify them, grow, put them back in patient) - works well to kill leukemia cells, but destruction of cancer cells releases toxic molecules that may be detrimental to organism - doesn't work well against solid tumors: difficult to find suitable targets; not enough antigens, not enough mutated cell surface proteins (don't express proteins at high enough levels) *tumor microenvironment hostile to T cell activity
Gene Therapy
- experimental technique that uses genes to treat or prevent disease - introduction of cloned genes to treat disorders caused by a single allele (not good w complex diseases) - new gene may be maintained outside genome (plasmid or virus) or randomly integrated into genome (can cause other issues) 1. non-viral approach: macromolecules (DNA or RNA) placed in a liposome (spherical sac of phospholipid molecules) 2. viral approach: gene of interest placed in virus - choice of virus determined by: *size of gene: viruses are limited to how much foreign DNA they can package *our immune system: need a virus that will elicit a low immune response if it needs to be around for a long time; modified viruses can exhibit an even lower immunogenic response *gene expression: long term (needs to be integrated into genome w lower immunogenic response) or short term/transient (requires extrachromosomal expression w higher immunogenic response) *expression level: transiently expressed genes can be expressed higher than integrated genes *etc.
Genomics
- field of science focused on several aspects of the genome: structure, function, evolution, mapping, editing *genome = organism's complete set of DNA (or RNA if retrovirus) - interdisciplinary field of science used to study genetic variance on a genome-wide basis; biochem, bio, pharm/tox, ecology, math & stats & CS (bioinformatics), engineering - functional genomics: focuses on transcription, translation, and protein-protein interactions; most of gene expression, not epigenetics - epigenomics: study of epigenetic events on a genome wide basis (or epigenome) like DNA methylation, histone modifications, etc.; can also be used to study gene expression - metagenomics: study of all genomes recovered from an ecological system or environment - structural genomics: 3D structure of every protein within a genome; doesn't look at 3D structure of genome itself - made possible by DNA sequencing tech, specifically sanger sequencing (used in genetics) & next generation sequencing (used in genomics, genetics sometimes) - genomics vs genetics: collective characterization & quantification of genes (looks at all genes at once), involves sequencing & analysis of genomes (genetics involves DNA sequencing at individual gene level), bioinformatics (to assemble & analyze function & structure of entire genomes); can be used to study genetics (study of individual genes & their role in inheritance) - applications in research, medicines, synthetic biology & bioengineering, conservation genomics (study of ecological conservation; genetic diversity within an ecosystem, potentially allows us to save a near extinct species or ecosystem); used to predict genes (sequence of encoded protein, structure & function of encoded protein), alignment (locating sequence homology, finding highly repetitive seqs, finding transposable elements, identifying evolutionary relationship btwn 2 or more genetic seqs aka conserved sites), assembly (building genome/partial genome from nucleic acids, including large genes), identify specific DNA sequences (or lack thereof), agriculture - used to identify mutations that play a role in traits & disease (linking genomics & genetics), identifying & characterizing single nt differences (SNPs, mutations), studying polygenic inheritance, complex diseases (autism, cancer), pharmacogenetics
Polymorphisms
- genetic variation within a population that occurs in at least 1% of the population (< 1% = mutation) - explains human variation - single nucleotide polymorphisms (SNPs) are the most common human variation - insertion-deletions (indels) - copy number variants (CNVs): major cause of structural variation, can be duplications or deletions; present in everyone, not usually bad *contribute to human variation
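A minimal sketch of the 1% rule above, assuming the cutoff is applied to the frequency of the less common allele at a biallelic site in a diploid population. The function names and genotype counts are hypothetical.

```python
def minor_allele_frequency(hom_ref, het, hom_alt):
    """Frequency of the less common allele from diploid genotype counts."""
    total_alleles = 2 * (hom_ref + het + hom_alt)
    alt_alleles = het + 2 * hom_alt
    freq = alt_alleles / total_alleles
    return min(freq, 1 - freq)

def classify_variant(freq, threshold=0.01):
    """Apply the 1% cutoff from the notes: at or above it is a polymorphism."""
    return "polymorphism" if freq >= threshold else "mutation"

# Hypothetical sample of 100 people: 90 AA, 9 Aa, 1 aa.
maf = minor_allele_frequency(90, 9, 1)
print(maf, classify_variant(maf))  # 0.055 polymorphism
```

Each heterozygote contributes one copy of the variant allele and each homozygote two, so the count is divided by twice the number of people sampled.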
Transgenic Animals
- genetically modified animals used to study gene function & human diseases - plasmid based tech that utilizes HR - knock-outs: deletion of gene from genome; regions of homology found on each side of gene of interest & gene replaced - knock-ins: insertion of gene into genome; gene of interest found on plasmid in btwn 2 areas of homology - conditional knock-outs: allows gene expression to be controlled *when and/or where recombination event occurs controlled by cre recombinase; forces HR to occur at certain locations
Artificial Restriction Enzymes
- in vivo gene editing technique 1. DNA recognition - zinc finger nucleases (ZFN): DNA recognized by protein regions; protein fold stabilized by zinc, each zf domain can recognize 3 nt (weak interaction), but tighter binding & recognition of longer seqs. can occur w multiple zf domains together *diff. zf domains recognize diff. DNA seqs. - transcription activator-like effector nucleases (TALEN): specific regions of DNA recognized by proteins; TALE proteins found to be secreted by Xanthomonas bacteria *TALE proteins composed of central domain responsible for DNA binding, a nuclear localization signal, a transactivation domain *each TALE domain recognizes only 1 nt; minimum 18 TALE domains needed 2. DNA cleavage w FokI - zf domain recognizes DNA, FokI dimerizes when both DNA-binding proteins bind DNA, cleaves DNA in btwn recognition sites - TALE domain recognizes DNA, FokI dimerizes when both DNA-binding proteins bind DNA, cleaves in btwn recognition sites 3. used to treat some diseases - ZFNs & malaria, TALEN & ALL
CRISPR
- in vivo genome editing technique - possible in primates, not ethical for use in humans 1. DNA recognition: gRNA (synthetic version of CRISPR RNAs) that finds a specific target - gRNA recognizes DNA by complementary bp 2. DNA cleavage: cas9 protein (nuclease); cuts DNA at the site recognized by gRNA to allow point mutations, deletions, or insertions - or activator or repressor can be fused to cas9 lacking nuclease activity to allow regulation of gene expression - or fluorescent molecule fused to inactivated cas9 protein; allows us to identify location of DNA sequence, including genes - protospacer adjacent motif (PAM) sequence (5'-NGG-3') allows cas9 to recognize DNA *located on DNA immediately downstream of gRNA binding site, recognized by endonuclease (not the site of cleavage) - DNA repair pathways like HR (inserts foreign DNA) or NHEJ (small insertion/deletion) used to insert DNA in genomes - nicking both strands using modified cas9 (cas9 nickase; only cuts 1 DNA strand) can be used to replace large fragments of genome w foreign DNA (HR based) 3. basis of some gene drives: genetic method to ensure a specific allele will be passed down to all subsequent generations (defies Mendel's law of segregation)
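A minimal sketch of how candidate cas9 target sites could be located from the PAM rule above: a stretch of DNA followed immediately downstream by 5'-NGG-3'. The 20 nt guide length is an assumption (it is not stated in the notes), the function name and input sequence are hypothetical, and only the strand passed in is scanned.

```python
import re

def find_targets(dna, guide_len=20):
    """Return (start, protospacer, pam) for every site followed by an NGG PAM.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

# Hypothetical sequence: 2 nt of flank, a 20 nt protospacer, an AGG PAM, 2 nt of flank.
dna = "CC" + "GATTACAGATTACAGATTAC" + "AGG" + "TT"
for start, protospacer, pam in find_targets(dna):
    print(start, protospacer, pam)  # 2 GATTACAGATTACAGATTAC AGG
```

The gRNA would be designed to match the protospacer, not the PAM; the PAM only has to be present in the genomic DNA next to the target. Scanning the reverse complement as well would pick up sites on the other strand.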
Genomic Instability
- increased tendency of genomic alteration during cell division; can be genetic or epigenetic *the more often a cell replicates, the more likely it is to acquire mutations that affect genome or epigenome 1. G1/S (DNA damage checkpt.): stops unwanted cell proliferation (p53 pathway) - unwanted entry into cell cycle results from failure of checkpt. 2. G2/M (DNA damage repair checkpt.): unchecked DNA damage can lead to cancer due to increased rate of damage & mutations - DNA mutagenesis from failure of checkpt.; mutations accumulate (drives cancer formation) 3. M checkpt. (btwn metaphase & anaphase): prevents nuclear division if cell isn't ready - aneuploidy results from failure of checkpt.; can result in aberrant gene expression (cancer)
Familial Cancers
- inherited through germline - predisposition to cancer - usually results from a mutation in DNA repair or TS genes (mutator hypothesis) 1. retinoblastoma: recessive mutation of Rb gene (acts as a dominant mutation bc almost every person who inherits 1 mutant allele will undergo 2nd mutation in at least 1 retinal cell; eventually has 2 bad copies of allele) - loss of heterozygosity: functional copy lost, carry 2 mutated copies 2. lynch syndrome: hereditary cancer syndrome, loss of MMR pathway - increased mutation rates - autosomal dominant inheritance pattern w 80% penetrance - loss of heterozygosity; MMR genes have microsatellites (involved in repairing themselves too)
Complex Diseases
- involves multiple genomic variances (SNPs, indels, etc.), environ., lifestyle, phenotypic variance (various levels of disease penetrance & expressivity; affected by environ. & lifestyle too) - depend on various parts of genome - association studies used to test for association btwn polymorphisms (SNPs) & phenotypes *difficult to validate; usually only account for a small % of indiv.'s overall risk & small % of indiv. who have disease - difficult to study due to large sample size requirement (most complex diseases are rare), multiple genes involved in phenotype, environmental factors may mimic disease phenotypes (produce same health problems as altered gene), diseased allele may not result in disease phenotype
SNPs
- locations in the genome where more than 1 allele occupies a genetic locus within a population *common variant in human genome at a certain point - come from mutations that have been held in the population over time; viable mutations, possibly beneficial 1. causative aka coding: occur within coding or regulatory region of a gene; may affect gene expression - synonymous: don't affect protein sequence - non-synonymous: affect coding protein sequence (missense, nonsense) 2. linked aka non-coding: occur in non-coding regions of a gene - may reside in regulatory sequences (enhancers, promoters) or intron; doesn't occur in exon
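A minimal sketch of how a coding SNP could be sorted into the synonymous / non-synonymous classes above by translating the codon before and after the change. The codon table is the standard genetic code; the function name is hypothetical, and the reference codon is assumed not to be a stop codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (last base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_coding_snp(ref_codon, alt_codon):
    """Classify a single-codon change (assumes ref_codon is not a stop codon)."""
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "synonymous"
    if alt_aa == "*":
        return "nonsense"   # change creates a premature stop codon
    return "missense"       # change swaps one amino acid for another

print(classify_coding_snp("GAA", "GAG"))  # synonymous (Glu -> Glu)
print(classify_coding_snp("GAG", "GTG"))  # missense (Glu -> Val)
print(classify_coding_snp("TGG", "TGA"))  # nonsense (Trp -> stop)
```

The same protein sequence can come from different codons because the code is redundant, which is why a SNP inside an exon does not always change the protein.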
Issue with Genome-Wide Analysis
- more tests needed to differentiate btwn germline & somatic mutations - germline: appear in both healthy & diseased tissue; have to study somatic cells too to determine where disease came from - somatic: appears only in diseased tissue as a result of DNA damage *becomes germline if passed onto next generation *can be identified by comparing diseased & healthy tissue
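A minimal sketch of the comparison described above, assuming variants have already been called separately in healthy and diseased tissue from the same person. Each variant is represented as a (chromosome, position, allele) tuple; the function name and the variant calls are hypothetical.

```python
def split_variants(healthy, diseased):
    """Separate variant calls into germline and somatic sets.

    germline: present in both healthy and diseased tissue
    somatic:  present only in diseased tissue
    """
    germline = healthy & diseased
    somatic = diseased - healthy
    return germline, somatic

# Hypothetical variant calls.
healthy = {("chr2", 1000, "A"), ("chr7", 5500, "T")}
diseased = {("chr2", 1000, "A"), ("chr7", 5500, "T"), ("chr17", 7400, "G")}

germline, somatic = split_variants(healthy, diseased)
print(sorted(germline))  # [('chr2', 1000, 'A'), ('chr7', 5500, 'T')]
print(sorted(somatic))   # [('chr17', 7400, 'G')]
```

Sequencing only the diseased tissue would report all three variants without showing which one arose in the tissue itself, which is the issue the notes describe.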
Polygenic Inheritance
- multiple genes (loci) contribute to a phenotype *effect can be quantified in some way *each gene product has an additive effect - non-mendelian; effect of genes is cumulative, no dominant or recessive *some genes more important than others; environment & epigenetics may affect phenotype - results in a continuous "spectrum" of phenotypes; wide range of phenotypes, ideally fit a bell curve but not always - must be studied in populations; able to observe all possible phenotypes & alleles involved - observation of multiple distinct phenotypic classes in offspring indicates that multiple factors are responsible for & contribute to final phenotype; multiple genes necessary to contribute to diseased & non-diseased traits - polygenic disorders are called complex diseases; several genes involved w disease penetrance & expressivity
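A minimal sketch of why additive genes give a bell-shaped spectrum, under simplifying assumptions that go beyond the notes: n unlinked loci, 2 alleles each, every "contributing" allele adds one equal unit, both parents heterozygous at every locus, and no environmental effect. The function name is hypothetical.

```python
from math import comb

def additive_class_counts(n_loci):
    """Expected offspring counts (out of 4**n_loci) for each phenotype class.

    Class k holds offspring carrying k contributing alleles, 0 <= k <= 2*n_loci.
    """
    return [comb(2 * n_loci, k) for k in range(2 * n_loci + 1)]

counts = additive_class_counts(3)
print(counts)       # [1, 6, 15, 20, 15, 6, 1]
print(sum(counts))  # 64
```

With 3 loci there are 7 phenotype classes and the middle class is the most common. Adding loci, unequal gene effects, and environment smooths the classes into the continuous distribution the notes describe.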
Genetically Modified Organisms (GMOs)
- organism that has had its genome modified through genetic engineering 1. animals - ex) AquAdvantage salmon grow quicker than non GMO salmon; modified to express GH from another fish 2. plants: selective breeding or genetic modification to be larger, disease resistant, and/or produce high quality food - ex) Agrobacterium tumefaciens: bacterium that naturally infects plants & causes crown gall tumors; modified to alter genome of some plants *T DNA from Ti plasmid transferred to wounded plant cell via conjugation, T DNA randomly integrated into plant chromosome & replicates w cell; genes on T DNA are oncogenic & cause uncontrolled cell growth (tumor formation) *Ti plasmid modified: gene of interest inserted into T DNA via restriction enzymes; becomes a recombinant T DNA vector, often includes antibiotic marker, plasmid transformed back into bacterium, conjugated into wounded plant cell & becomes transgenic plant - biolistic gene transfer (biological ballistics): DNA gun shoots DNA-coated microprojectiles into nucleus of cells - microinjection: microscopic sized needles used to inject DNA into nucleus - electroporation: electric current used to create transient pores in plasma membrane through which DNA can enter
Genetic Testing
- over 900 gene tests in use - prognostic: identifies likelihood of developing disorder, as well as how disease may develop *chance of acquiring disease symptoms in the future (for those w/out symptoms) & if symptoms will worsen, improve, or remain stable (for those w/symptoms) *predicts penetrance/expressivity - diagnostic: identifies known mutations that cause disease or condition; determine if disease-causing mutation is present *can't distinguish btwn germline & somatic mutations, doesn't say how disease will progress types: 1. cytogenetic: checks quality & quantity of chromosomes - ex) G-banding karyotype: checks # & appearance of chromosomes to identify chromosomal disorders; most useful to identify trisomy or other large chromosomal disorders 2. molecular: identifies presence of disease causing alleles - genomic sequencing (usually partial sequencing) to identify known, common mutations (presence of pathogenic/disease-causing variants); ex) CF 3. biochemical: synthesis, structure, and/or function of protein; ex) Tay-Sachs 4. prenatal & newborn: several cytogenetic, molecular, and biochemical tests in 1st (blood test, ultrasound, chorionic villus sampling, cell-free DNA testing) & 2nd (quad screen) trimesters, at birth (blood test) 5. carrier testing: identifies if person is carrier of a specific autosomal recessive disorder; chances of passing a disease-causing mutation to offspring - usually molecular and/or biochemical tests - CF, fragile X syndrome, sickle cell, Tay-Sachs, SMA
Polygenic Inheritance - Skin Color/Tone
- polygenic inheritance partially results in a continuous spectrum of human skin tone *range of phenotypes means multiple genes affect final phenotype (MC1R, SLC24A5, MFSD12, DDB1, OCA2, HERC2, etc.) - 378 genetic loci thought to be involved in determining skin color & tone - skin tone determined by genetic & non-genetic factors (environment)
Heterozygote Advantage
- potential benefit to being carriers of recessive disorders; beneficial to have 1 copy of mutant allele (not all diseases) - ex) sickle cell trait: resistance to malaria; malaria parasite survives poorly in red blood cells carrying sickle hemoglobin, selective pressure to be sickle cell carrier - ex) cystic fibrosis: resistance to cholera & typhoid; no longer an advantage, genetic drift is the most likely cause of the prevalence of these polymorphisms *nothing to select against it, remains in bloodlines
Tumor Suppressor Genes
- protein-coding genes involved in suppression of tumor growth *can stop cell growth (prevent spread of compromised cells due to oncogenes) or can kill the cell (apoptosis) - loss of function mutations resulting in decreased activity or expression promote cancer; cell can't stop replication or induce apoptosis - DNA tumor viruses like adenoviruses & papovaviruses encode proteins that inhibit tumor suppression function - non-viral mutations (genetic or epigenetic mutations that result in inactive proteins or no expression of proteins) are common amongst some cancers
Retrovirus Infections
- some can cause cancer by inserting viral genome into host genome *100% transmission rate; 100% success at invading cells - retrovirus enters cell, viral RNA genome released from virus - reverse transcriptase converts ssRNA to ssDNA to dsDNA; retroviral DNA moves into nucleus & can be inserted into genome - host transcribes retrovirus genome along with its own; host translates retroviral proteins, retrovirus rebuilt & exits cell - viral oncogene hypothesis: genes specifically programmed to trigger cancer had been deposited into vertebrate genomes by viruses; would remain latent (in heterochromatin) unless triggered by some environmental carcinogen (opens up to euchromatin) *ex) src is a normal gene that becomes oncogenic in the presence of RSV (Rous sarcoma virus)
Insert DNA
- source of DNA segment of interest (gene we want to express; DNA we want to transfer to new organism); can be isolated from chromosome or synthesized - sources: chromosomal DNA (mechanically sheared or cut w restriction enzymes), PCR (generates specific DNA fragments), RNA (requires RT), synthetic DNA (synthesized w out template)
Stem Cell Therapy
- stem cells = self-renewing non-differentiated cells *may be pluripotent (embryonic stem cells, can generate almost all of the body's cell types except support structures; differentiate into multipotent) or multipotent (can generate limited cell types that eventually generate tissue or organ specific cell types) *differentiation: pluripotent → multipotent (→ multipotent) → tissue specific cell type - embryonic stem cells primarily derived from blastocysts (hollow ball of cells that forms 3-5 days after fertilization); can be extracted & grown in lab - tissue specific stem cells are multipotent or differentiated; difficult to find in human body, don't self-renew easily in cultures - ex vivo gene therapy: tissue specific stem cells from bone marrow removed from body & modified in test tube (normal gene transduced into stem cell via retrovirus & inserted randomly in genome; defective gene stays in the genome too); placed back in body *wouldn't work w dominant disorders; works for homozygous recessive disorders (genotype of patient's modified cells is now effectively heterozygous)
Linkage Analysis
- study aimed at establishing linkage btwn 2 or more genes; usually associated w diseases, especially rare diseases - don't need to know which gene is affected to do a linkage analysis - gene hunting: identifying genes of interest related to phenotype (diseased state) - usually identifies genetic locus involved in phenotype, not necessarily the gene or allele (identified by comparing diseased & non-diseased family members) - genetic testing: medical tests to identify changes in chromosomes, genes, or proteins - can be used to identify disease causing variants; don't require a known biochemical defect, a known gene, known protein, knowing if 1 or multiple genes involved in phenotype
Pharmacogenomics
- study of how genes affect a person's response to drugs - combines pharmacology (science of drugs) & genomics (study of genes & their functions) - not used to design drugs; goal is to assign medications & dosage based on genes *goal: develop safe, effective medications & dosages tailored to a person's genetic makeup; part of personalized medicine - offers benefits compared to trial and error approach; pharmacogenomic approach asks if the patient's genotype predicts an adverse reaction to the drug & if it's safe to give to the patient (asks both questions at the same time, unlike trial and error approach) *can check how quickly patient will metabolize drug to adjust dose prior to prescribing, can check how drug may interact w other medications
Intragenomics
- study of a subset of genomics; "within the genome" - looks at smaller, specific fractions of the genome - includes epistasis, pleiotropy, heterosis, interactions btwn loci & alleles (how they're used to express phenotypes in disease)
Phenotypic Variance
- sum of genotypic variance (VG) and environmental variance (VE): VP = VG + VE *genotype-environment association: certain genotypes preferentially found in a particular environ. (dark skin, high sunlight); common in human genetics, may contribute to lower risk of health issues associated w environ. (sickle cell & malaria) - phenotype may depend mostly on genetics (high VG, low VE) or on epigenetics/environ. (high VE, low VG; environ. causes much of the phenotypic variation) *both are prominent in phenotype if VG & VE are similar - heritability: a measure of how much phenotypic variation is due to genetic variation (VG/VP), effect of genotype can be masked by environ. (mostly dependent on environ, ex: nutrient content of soil) *heritability = 1 (VP only due to VG; rare), heritability = 0 (VP only due to VE; no genetic effect, phenotype due to environ.); usually lies btwn 0 & 1, can vary btwn & w in populations
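The relationship btwn VP, VG, VE & heritability can be checked w a short calculation. A minimal sketch, assuming the simple model where VP is just VG + VE (no genotype-environment interaction or covariance term) & heritability is taken as VG/VP; the function name & the example numbers are made up for illustration.

```python
def heritability(v_g, v_e):
    """Fraction of phenotypic variance (VP) explained by genotypic
    variance (VG), assuming VP = VG + VE with no interaction term."""
    if v_g < 0 or v_e < 0:
        raise ValueError("variances can't be negative")
    v_p = v_g + v_e
    if v_p == 0:
        raise ValueError("no phenotypic variance to partition")
    return v_g / v_p

# Hypothetical values, not measured data
print(heritability(3.0, 1.0))  # mostly genetic
print(heritability(0.0, 4.0))  # phenotype due to environ. only
print(heritability(4.0, 0.0))  # VP only due to VG (rare)
```

The two extreme calls reproduce the heritability = 0 & heritability = 1 cases from the notes; anything else lands btwn 0 & 1.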
Proteome
- the entire set of proteins expressed by a cell/tissue type - larger than the genome; size more difficult to determine & study bc lots of proteins (20 a.a. instead of 4 nt), a lot of diff proteins being expressed at diff levels (compared to 1 genome) - done using mass spec - can tell us if: *a single pre-mRNA is spliced into more than 1 version (alternative splicing); splicing is often cell specific or related to environmental conditions *RNA editing: less common, leads to changes in coding seq of mRNA; can also be studied by transcriptome *post-translational covalent modification: irreversible changes necessary to produce a functional protein (proteolytic processing; attachment of prosthetic groups, sugars (glycosylation), lipids), reversible changes that transiently affect protein function (phosphorylation, methylation, acetylation)
Cancer
- uncontrolled cell growth - genetic & epigenetic disorders that cause changes in gene expression which may affect quantity and/or quality of protein - failure to perform checkpoints (prevent uncontrolled cell growth) or undergo apoptosis can lead to cancer development - each person's cancer is unique bc it's a unique combination of genetic changes *as more mutations develop, cancer is more likely to develop - not all cancers are the same: ~28 different types of tumors identified *same "type" of cancer may be due to different mutations; ex) not all breast cancer cells look like other breast cancer cells on a molecular level
Genome Mapping
1. cytogenetic: relies on microscopy; commonly used w euks, which have much larger chromosomes - maps relative to bands on chromosome - chromosomes stained w giemsa to reveal banding pattern that can be used for mapping - euk chromosomes distinguished by size, centromeric locations, banding patterns 2. linkage: helps identify genetic linkage by looking at meiotic crossover events; spatially close DNA seqs that are likely inherited together - relies on freq of recombinant offspring to map genes - genetic markers = DNA segment found at a specific site in all indiv of a population & can be uniquely recognized - maps are relative to genes & genetic markers; distance measured from chromosomal crossover events (map units or mu) - quantitative trait locus (QTL): region of DNA associated w a particular trait; trait can be polygenic (multiple genes contribute to phenotype) *QTLs for same trait often found on diff chromosomes *QTL mapping: fine tune locus using molecular markers; find where gene is in QTL - can be used to determine probability a gene is important for a disease 3. physical: genes mapped relative to e/o using a physical distance (distance measured in bp) - genomic seq required; can be done from scratch (shotgun seq), can be aligned to a similar known seq *shotgun seq: has to be built from contiguous seqs (contigs, collection of overlapping DNA fragments); break DNA, seq each fragment, align DNA fragments to e/o, use contigs as a scaffold, build scaffolds based on known seqs, build genome map *genome seq using NGS, align sequenced genome to a reference or known genome in the bank - can identify new genetic markers; used to update & improve linkage maps
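For the linkage map, the distance in map units comes from the freq of recombinant offspring. A minimal sketch of that arithmetic, assuming the usual convention that 1 mu = 1% recombinant offspring; the function name & the offspring counts are hypothetical, not from the lecture.

```python
def map_distance(recombinant_offspring, total_offspring):
    """Map distance in map units (mu) between two loci, taken as the
    percentage of offspring that are recombinant (1 mu = 1% recombinants)."""
    if total_offspring <= 0:
        raise ValueError("need at least 1 offspring")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinants must be between 0 and total")
    return 100.0 * recombinant_offspring / total_offspring

# Hypothetical testcross: 120 recombinant offspring out of 1000 scored
print(map_distance(120, 1000), "mu")
```

This estimate is only reliable for loci that are fairly close together: double crossovers go undetected, so observed recombination freq levels off at 50% for distant or unlinked loci.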
Mutator Hypothesis
1. driver mutations: mutations that drive tumor development, typically confer a growth advantage to those cells - alterations in oncogenes, TS, DNA repair genes are drivers - found in cancer cells, but not the only type of mutations - lead to mutator phenotype, which leads to chemotherapy resistant cells - identified via genomic approaches, requires bioinformatics 2. passenger mutations: mutations present in cancer cells that don't contribute to formation of cancer (non-cancer causing) - not a driving force, but may decide type of therapy
Linkage Equilibrium/Disequilibrium
1. equilibrium: alleles are randomly assorted, follows mendel's laws (independent assortment) - depends on how meiotic recombination plays out; generally a 50-50 chance for an allele to be passed down 2. disequilibrium: non-random association of alleles at diff loci in a given population, doesn't follow law of independent assortment - alleles inherited together more or less frequently than chance would predict *genetic linkage; tendency of genes on same chromosome in close proximity to be inherited together - may offer benefits to the offspring - influenced by selection, rate of genetic recombination (more recombination = less likely to be in diseq.), mutation rate, genetic drift, system of mating (inbreeding lowers effective recombination = more diseq.), population structure, genetic linkage
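"Inherited together more or less frequently than chance would predict" can be put into numbers. A minimal sketch using the standard disequilibrium coefficient D = freq(AB) - freq(A) x freq(B), which is not named in these notes & is brought in here as an assumption; the allele labels, function name, & haplotype counts are all hypothetical.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Disequilibrium coefficient D for two biallelic loci (A/a and B/b),
    computed from counts of the four haplotypes in a population sample.
    D = 0 means alleles are randomly associated (linkage equilibrium)."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total <= 0:
        raise ValueError("need at least 1 haplotype")
    freq_AB = n_AB / total
    freq_A = (n_AB + n_Ab) / total  # allele A at the first locus
    freq_B = (n_AB + n_aB) / total  # allele B at the second locus
    # Observed AB frequency minus the frequency expected by chance
    return freq_AB - freq_A * freq_B

print(linkage_disequilibrium(25, 25, 25, 25))  # random association
print(linkage_disequilibrium(50, 0, 0, 50))    # A always travels w B
```

D > 0 means A & B show up together more often than chance predicts, D < 0 means less often.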
Haplotypes
1. familial: DNA variations (polymorphisms) that tend to be inherited together from a single parent; usually survives many generations of reproduction w little or no genetic change - ex) Y chromosome: all males share the same Y chromosome as their father, grandfather, etc. give or take some mutations; Y chromosome looks very similar *X chromosome doesn't have a specific haplotype; changes through the generations - ex) mitochondrial DNA: always passed down from mother; looks the same give or take some mutations during replication 2. non-familial: set of SNPs or other polymorphisms that tend to occur together; associated statistically - people within a certain region share some characteristics; often found in genomes of people living in the same region - used to investigate common diseases, basis of hapmap project 3. hapmap project: aimed to identify & develop haplotype map of human genome to describe common patterns of human genetic variation - looks at small, pre-specified regions of genome (those w known SNPs/polymorphisms) - used to find genetic variants affecting health, disease, and responses to drugs & environments - now outdated
Hereditary vs. Non-Hereditary Cancers
1. hereditary: - usually involve defects in DNA repair pathways - mutator hypothesis: inability to repair DNA results in increased rates of mutagenesis; driving force behind cancer development from a tumor 2. non-hereditary (sporadic): majority of cancers - mutations are acquired, doesn't follow mutator hypothesis on genome-wide analysis *follows mutator hypothesis on epigenome-wide analysis; epigenetic disorders may result in lowered expression of DNA repair genes
Disease Penetrance vs. Expressivity
1. penetrance: ratio of individuals showing clinical symptoms to individuals carrying the diseased allele; likelihood of showing disease symptoms - proportion of individuals who carry a disease causing allele & display clinical symptoms of a disease - mendelian diseases have high penetrance, complex diseases have variable penetrance - no penetrance = 0% chance of showing symptoms - incomplete: non-0 & non-100% chance of showing symptoms; not all people who have variation have disorder *cancer, retinoblastoma, Emery-Dreifuss muscular dystrophy - complete: all people who have variation have disorder, often dominant disorders *familial adenomatous polyposis, neurofibromatosis, multiple endocrine neoplasia *exception is Huntington's: penetrance is based on expansion within gene; can be incomplete or complete *depends on # of CAG repeats in protein coding region of HTT gene *disease caused by too many CAG repeats (trinucleotide repeat disorder); as # of repeats increases, likelihood of acquiring disease & passing it down grows too 2. expressivity: extent to which a genotype shows phenotypic expression; extent/severity of clinical symptoms 3. both affected by polygenic effects (SNPs, mutations), environment, epigenetics - 40% of retinoblastoma is hereditary; 90% of carriers develop retinoblastoma (high penetrance), may affect both eyes (high expressivity) - homozygous individuals w diseased CFTR alleles exhibit complete penetrance, may show diverse expressivity *compound heterozygotes (2 different disease causing alleles) will still get the disease (complete penetrance), but w variable expressivity (more or less severe) - Lynch syndrome results in predisposition to many cancers (colon, endometrial); results from MMR deficiency *complex: several genes can be mutated to cause disease; high penetrance (80% w allele will develop cancer), variable expressivity - women w 1 diseased allele of BRCA1 or BRCA2 exhibit high penetrance for breast cancer by age 70
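Penetrance as a proportion (carriers w symptoms out of all carriers) is easy to compute & classify. A minimal sketch; the function names are made up, & the counts are hypothetical numbers chosen only to mirror the 90% retinoblastoma figure in the notes, not real cohort data.

```python
def penetrance(affected_carriers, total_carriers):
    """Proportion of carriers of a disease causing allele who
    display clinical symptoms."""
    if total_carriers <= 0:
        raise ValueError("need at least 1 carrier")
    if not 0 <= affected_carriers <= total_carriers:
        raise ValueError("affected carriers must be between 0 and total")
    return affected_carriers / total_carriers

def classify_penetrance(p):
    """Label a penetrance value using the categories from the notes."""
    if p == 0:
        return "none"
    if p == 1:
        return "complete"
    return "incomplete"

# Hypothetical cohort: 90 of 100 carriers develop the disease
p = penetrance(90, 100)
print(p, classify_penetrance(p))
```

Note that "high penetrance" (like 90%) still counts as incomplete under these categories; only 100% is complete. Expressivity (severity in those who are affected) is a separate measure & isn't captured by this ratio.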
Cloning
1. reproductive cloning: creation of genetically identical (not 100%; mitochondrial DNA comes from the egg) animal to donor animal; nuclear genome from donor implanted into enucleated egg & implanted in uterus where it grows full term 2. therapeutic cloning: creation of genetically identical cells to donor animal; nuclear genome from donor implanted into enucleated egg & grown in vitro (not implanted in uterus or grown to full term; used to generate stem cells) - somatic cell nuclear transfer (SCNT) = most common cloning technique *nucleus removed from oocyte (leaving cytoplasm & mitochondria), somatic cell taken from adult donor to be cloned (nuclear genome extracted & inserted in enucleated oocyte), new cell induced (chemical or electrical stimulation) to divide & develop into an embryo, embryo placed in recipient uterus to develop to term *result is a clone (not daughter, son, or twin); reproductive cloning not demonstrated to work w humans (unethical), but SCNT can be used to derive human embryonic stem cells 3. human embryonic stem cells: generated via SCNT from terminally differentiated somatic cells - histone methylation prevents SCNT reprogramming; requires removal, but there are no human enzymes to remove it *mouse mRNA that encodes H3K9me3 demethylase injected before SCNT to remove histone methylation - used to treat age-related macular degeneration 4. embryonic twinning: embryonic cells removed & artificially split on petri dish & placed into surrogate mother