GEN3051
Tag SNPs
"relatively small set of variants that capture most of the common patterns of variation in the genome"
Tag SNPs can be used as tools for an indirect association approach (testing for association with the tag SNP that is associated with the disease allele, not with the disease allele itself) for:
- Any candidate gene in the genome
- Any region arising from family-based linkage analysis
- Whole genome scans for disease risk factors
Epigenetic inheritance
'Reversible, heritable changes in gene function or cell phenotype that occur without a change in DNA sequence' - Spontaneous OR in response to environmental factors Inheritance: a) from one cell generation to the next (mitotic cell division) b) through the production of germ cells Epigenetic mechanisms include: - DNA methylation - Histone modification - Gene silencing (RNA interference) - Maternal effects (RNA and protein inherited from the mother)
2. RNA modification
- Aims to inactivate a pathogenic mutant mRNA (sometimes aims to correct the mutation instead, which is more difficult and won't be covered)
- Also in specific cells or tissues only
- Usually for dominant gain of function or dominant negative mutations, or activated oncogenes
- In dominant conditions, can't supply a functioning copy of the gene and expect something to happen = so have to stop the dominant mutant allele from being expressed
CF Diagnosis
- Altered Cl- concentration, which changes the salt concentration in the body
- Very high amounts of sodium and chloride ions are excreted from the skin
- Used for very early diagnosis of the disease
1. Single Gene Disorders
- Any human gene can occur in mutant form.
- We don't see 25,000 single gene disorders because:
  § Some mutations won't cause an effect
  § The same phenotype can be caused by mutations in many different genes, because they are used in the same pathways
  § Miscarriages occur, so the mutation is never seen = function of certain genes is essential, so will never see someone alive with these genes mutated
- Mutations in many genes are very deleterious, with strong selection pressure against them, and are thus very rare.
- For others, mutations have milder consequences, but also occur sporadically.
- There are many single gene disorders, but most are very rare (< 1 in 10,000).
- Total incidence among newborns is ~1 in 100.
- Specific single gene disorders may occur more frequently in certain populations. For example:
  o haemoglobin diseases up to 1 in 100 in Central and West Africa
  o cystic fibrosis up to 1 in 2,000 in Western Europe
- Thought to be associated with heterozygote advantage (e.g. malaria) or founder effect.
- Most disorders associated with single genes show recessive inheritance.
  o one working copy is as good as two (e.g. most enzyme-encoding genes).
  o important to understand why this is.
- But some show dominant inheritance (e.g. regulatory and structural protein genes).
  o important to revise why some mutations are dominant.
How effective is WES/WGS?
- Once whole exome and whole genome sequencing became more widespread in research, they helped the discovery of disease genes
- Not all findings are new disease genes. Many are novel diseases due to mutations in previously known disease genes.
- Associated gene identified by WES in up to 50% of cases
Why not more?
- Causative variant not covered in the exome sequence
- Not obviously pathogenic, e.g. non-coding sequence
- Copy number variation
To support a candidate gene:
- Need to find a second independent family
- Generate an animal model to support the association
3. Multifactorial disorders
- By far the most common disorders do not have a simple genetic or chromosomal origin o instead they have a part genetic, part environmental origin, called multifactorial - The genetic component is usually polygenic (> 1 gene).
2.1 Ribozymes
- Catalytic RNAs, some of which cleave RNA in a sequence-specific manner
- Can couple the catalytic component to a sequence complementary to the mutant RNA (but not the wild-type copy), so it will pair with and cleave the mutant RNA
- Binds at a GUC sequence and then cleaves at that sequence
- Can design the flanking sequences so it will cut the RNA from the mutant allele but not the RNA produced by the wild-type allele
1. Case Study: Vibrio cholerae Epidemic
- Cholera is an infection of the small intestine that is caused by the bacterium Vibrio cholerae.
- Symptoms range from none to severe, and the disease is characterised by large amounts of watery diarrhoea. = Produces a toxin that binds to the epithelial cells and stops them from taking up water. Cells sense an infection and push out water, flooding the intestines to try to wash the bacteria out, which causes diarrhoea. But since the water can't be absorbed, the person becomes dehydrated, and that is what a person would die from, not the bacteria itself
Vibrio cholerae & CTX phage
- The disease cholera is actually caused (in part) by a bacteriophage. Native V. cholerae does not cause disease.
- Infection of V. cholerae with the temperate bacteriophage CTX gives the bacterium its toxigenicity in a process called lysogenic conversion.
Lysogenic Conversion
- Lysogenic phage carry genes within their phage genome which provide their bacterial hosts with a new phenotype.
- Different mechanism to transduction! = Transduction transfers bacterial genes, whereas these are phage genes integrated into a bacterial host
- CTX phage is famous because it was the first filamentous phage found to transfer toxin genes - phages can mediate pathogenic disease!
Zebrafish Example: Nemaline myopathy and Screening in Zebrafish
- Congenital muscle disease
- Range of severity and onset
- Formation of characteristic protein aggregates
• Can rapidly screen drugs in an animal model (especially small animals)
• Identify promising leads
- For approved drugs: potential rapid translation to patients
- For novel drugs: mouse data, then clinical safety trials - a long process
Screening IVF embryos
- FISH analysis for aneuploidy and reciprocal or Robertsonian translocations
- PCR analysis for CF, MD, Huntington disease, β-thalassemia, SMA, Fragile X
- Gender selection, e.g. against male embryos for serious X-linked diseases with no single gene analysis available (otherwise selecting for sex is not considered ethical)
- HLA typing: looking for an embryo with a tissue-type match. The new child can then be a donor for a previous child affected by e.g. Fanconi anemia
What do we know about causative mutation we can use to filter variants? = Recessive disease
- Family linkages: o Expect people in same family to share the same mutations o Present in all affected individuals o Only one or no alleles in unaffected individuals o Variable penetrance and variable expressivity can happen too - Person has to have two mutations in the same gene - Not mutated in healthy individuals - Not common in general population o If we are looking at rare diseases, they should not be present in general population (1%, 1/10000 people) - Gene has role in affected tissue/pathway - Variant predicted to affect function - In consanguineous families - homozygous for the same variant (very powerful at filtering out other changes)
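The filtering criteria above can be sketched in code. This is a minimal illustration only, not a real variant-calling pipeline: the `Variant` record, its field names, the 1% frequency cutoff and the example genes are all hypothetical, and it handles only the homozygous case (as in consanguineous families). Compound heterozygotes would need variants to be grouped by gene first.

```python
from dataclasses import dataclass, field
from typing import List

MAX_POP_FREQ = 0.01  # assumed cutoff: variant must be rare in the general population


@dataclass
class Variant:
    gene: str                 # hypothetical gene label
    pop_freq: float           # allele frequency in the general population
    predicted_damaging: bool  # variant predicted to affect function
    # number of mutant copies (0, 1 or 2) carried by each family member
    affected_alt_copies: List[int] = field(default_factory=list)
    unaffected_alt_copies: List[int] = field(default_factory=list)


def passes_recessive_filter(v: Variant) -> bool:
    """Keep variants consistent with a rare, homozygous recessive cause."""
    rare = v.pop_freq <= MAX_POP_FREQ
    # present on both chromosomes in every affected individual
    all_affected_homozygous = all(c == 2 for c in v.affected_alt_copies)
    # only one or no mutant alleles in unaffected individuals
    no_unaffected_homozygous = all(c < 2 for c in v.unaffected_alt_copies)
    return (rare and v.predicted_damaging
            and all_affected_homozygous and no_unaffected_homozygous)


variants = [
    Variant("GENE_A", 0.0001, True, [2, 2], [1, 0]),  # fits every criterion
    Variant("GENE_B", 0.20, True, [2, 2], [1, 1]),    # too common
    Variant("GENE_C", 0.0005, True, [2, 1], [0, 0]),  # one affected is only a carrier
    Variant("GENE_D", 0.0005, True, [2, 2], [2, 0]),  # homozygous in a healthy relative
]
candidates = [v.gene for v in variants if passes_recessive_filter(v)]
print(candidates)
```

Variable penetrance would break the "not homozygous in unaffected relatives" rule, so in practice that check might be relaxed.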
Cl- loss causes dehydration of mucus
- Hair-like cilia of the lung 'beat' to remove the lung mucus and prevent it from entering the lung
- CF mucus has reduced ionic concentrations (Cl- and Na+), which causes osmotic movement of water away from the mucus
- Thick and dehydrated mucus collapses the lung cilia, and mucus is not removed from the lung
Increased hydration: mucus is less viscous and cilia can't remove watery mucus from the lungs
Normal state: cilia working optimally
Dehydrated state: no water, so mucus becomes thick and puts more pressure on the cilia, which can't remove the mucus as well
Complications in inheritance patterns 2. New mutations
- In some cases, sporadic occurrence of a phenotype may depend on a new mutation.
- Mutation could have arisen de novo in the germ line of one of the parents
- Especially relevant to dominant disorders, e.g. achondroplastic dwarfism.
Complications in inheritance patterns 1. Expressivity and penetrance
- In some genetic disorders, phenotype is variable in its degree of expression, shows variable expressivity - Different degrees of severity o May be difficult to assess phenotype at the extreme closest to normal, especially for behavioural phenotypes, late onset disorders. - Another related complication is variable penetrance o some individuals with affected genotype do not show the phenotype. o Phenotype is variable in whether it is expressed or not. - This complicates segregation analysis. - Assuming single gene pattern, can deduce obligate genotypes from inheritance patterns, even if inconsistent with phenotype. - E.g. Polydactyly = variable expressivity and reduced penetrance
1.1 Congenital abnormalities
- Incidence of congenital abnormalities among newborn is high o heart and vessels 10 per 1,000 o central nervous system 10 per 1,000 o gastrointestinal tract 4 per 1,000 o limb 2 per 1,000 o urogenital system 4 per 1,000 - Examples - neural tube defects such as spina bifida (as high as 1% in some countries), cleft lip and cleft palate - Most congenital abnormalities of genetic origin have a multifactorial basis
1.2 Late onset disorders
- Incidence of later acquired disorders with some genetic basis is relatively high
- These diseases cause morbidity and premature mortality in ~60% of individuals during their lifetime (Nussbaum Table 8.1).
  o Thus have a high impact in medicine.
- How can we study the genetic basis of such multifactorial disorders?
  o Population studies, family studies, twin studies, adoption studies, polymorphism associations, etc.
- In some cases both the genetic and environmental components are known:
  o e.g. type 1 diabetes, venous thrombosis
- But for most complex disorders we don't understand the underlying mechanisms of the gene-gene and gene-environment interactions.
1.1.1. Transgenic mice
- Inject foreign DNA into pronucleus of fertilised egg (one cell stage). - Then implant embryo into oviduct of pseudopregnant female. - 25-50% of time DNA will integrate at random into genome. - If integrates at 1 cell stage whole animal transgenic, if later then mosaic. - Can use marker (eg. coat colour) to detect transgene incorporation, or screen animals with PCR. - Uses: overexpression, reporter genes, models of dominant disease alleles.
How does CRISPR work? - Phase 2c
- Mature CRISPR-Cas9 complexes begin surveillance of the cell for complementary incoming viral DNAs. = A mature CRISPR-Cas9 complex has a Cas9 protein, a spacer (RNA transcribed from previously captured viral DNA) and a repeat unit bound to the structural tracrRNA
- When the CRISPR-RNA spacer recognises its complementary target sequence, it binds and forms a DNA:crRNA duplex. = The complex recognises foreign DNA, unwinds it and checks whether its spacer has perfect complementarity with the invading viral DNA. If it is complementary, the Cas9 protein recognises that and makes a dsDNA cut, which makes the viral DNA ineffective
- This duplex is cleaved by the endonuclease activity of the Cas9 protein, creating a dsDNA cut.
- The invading viral DNA is partially destroyed, so the invading virus cannot proceed to replicate and its infection is thwarted.
2. Chromosomal disorders
- Most chromosomal disorders are caused by aneuploidy. o one chromosome present more or less than normal. o occurs in at least 5% pregnancies. o severity different for different chromosomes. Autosomal more severe than sex chromosomal. - Among spontaneous miscarriages aneuploids are very frequent, 40-50% up to the end of the first trimester. o kinds differ from those seen in live births. - At birth, incidence is much lower (~1 in 200). - Chromosome aberrations also occur - translocations, deletions, duplications. o E.g. DiGeorge syndrome, deletion of 3Mb in 22q11.2 (Nussbaum Fig 6.9), removes ~30 genes, 1 in 2000-4000 live births, has role in 5% congenital heart defects.
1.1 Recessive loss of function mutations
- Most loss of function mutations are recessive, 50% enough in heterozygote. - Clinical effect in homozygote can vary depending on severity of the mutation. - For most diseases multiple types of mutations in gene occur. So many people with recessive disorders are compound heterozygotes - different mutant allele on each chromosome. = but sometimes see mutational homogeneity: more likely in gain of function, but also can see it in some other situations, eg. Founder effect. - Two examples of variation in severity of recessive loss of function mutations in single gene 1. cystic fibrosis 2. Duchenne Muscular Dystrophy and Becker Muscular Dystrophy (examples in lect 4)
1.1 Dominant mutant alleles (Three ways)
- Much less common than recessive mutations. Three ways a mutation can be dominant are:
a) Haploinsufficiency
  • Loss of function mutation, but 50% of normal protein is not enough for the wild-type phenotype. Not common = need two functioning copies of the gene for normal function
  • E.g. Tailless mutation in mice
b) Dominant-negative mutations
  • Loss of function mutation: protein is made but is non-functional, and inhibits the function of the normal protein in heterozygotes. Often in structural proteins which form dimers (or multimers).
    § If one protein in the dimer is mutated, then the function of the other protein is affected even though it is normal
  • E.g. Kinky tail mutation in mice, Marfan syndrome
c) Gain of function mutations = RAREST
  • New function of the gene product, or protein always active, or increased levels of expression, or inappropriate expression
  • E.g. Antennapedia mutation in Drosophila, achondroplasia, Huntington disease
Complications in inheritance patterns 3. Locus heterogeneity
- Mutations in several different genes may show the same phenotype - common problem that makes analysis difficult - Especially relevant for mutants of genes in biochemical pathways e.g. recessive loss of function of any of 7 genes in a DNA repair pathway for UV damage leads to xeroderma pigmentosum. - Also relevant to disorders of functions controlled by many genes e.g. recessive sensorineural hearing loss. If two affected individuals have offspring with normal hearing, does not negate recessive inheritance, two different recessive mutations involved.
Gonorrhoea
- N. gonorrhoeae is easily transmitted and causes severe reproductive complications and infertility.
- Quickly becoming a top MDR priority pathogen - few antibiotics left to treat it
- Gonorrhoea is a sexually transmitted disease (STD) caused by the bacterium Neisseria gonorrhoeae, which infects mucous membranes of the reproductive tract.
- Most patients are asymptomatic; when symptoms are present they range from mild inflammation through to mucosal discharge and bleeding.
- Left untreated, can cause complications
1.1. Attachment
A major factor in host specificity of a phage is attachment to a host. The phage has one or more proteins on its capsid (tail fibers, base plate) that interact with specific host cell components called receptors. - Anything on the bacterial cell wall has the potential to be a receptor for the phages - T4 phage tail fibers recognize E. coli membrane receptors - All tail fibers bind and attach to the membrane
Linkage Phase
- Note that the arrangement or phase of the markers on the parental chromosomes is not important for calculating map distance.
  o BUT we do need to know which phase is present in order to determine if progeny are recombinant.
- The phase can be either dominant alleles together (A and B) and recessive alleles together (a and b), called the coupling phase,
- or one dominant and one recessive allele together (A with b, and a with B), called the repulsion phase.
- Coupling: AB/ab and Repulsion: Ab/aB
- The basis of genetic mapping is the fact that the chance of a crossover occurring between two loci is proportional to their distance apart.
- The distance, in map units (or centimorgans, cM), can be estimated as the percentage of products of meiosis that show recombination between the two loci.
- Multiple crossovers can occur between two loci, and a second crossover can cancel out the effect of the first.
- The maximum possible percent recombination is thus 50%.
- Larger map distances can be obtained by mapping many closely spaced loci and adding up the intervals.
- The rate of crossing over in human females is about 1.4 times that in males, so maps are usually made in one or other sex.
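A short sketch of why the phase matters when scoring recombinants. The progeny counts are invented for illustration, and the function name is an assumption, not part of the notes. Gametes from the doubly heterozygous parent are labelled by haplotype (AB, ab, Ab, aB).

```python
# Parental (non-recombinant) haplotypes for each linkage phase
PARENTAL = {
    "coupling": ("AB", "ab"),   # AB/ab parent
    "repulsion": ("Ab", "aB"),  # Ab/aB parent
}


def recombination_percent(gamete_counts, phase):
    """Percent of meiotic products that are recombinant, given the phase."""
    parental = PARENTAL[phase]
    total = sum(gamete_counts.values())
    recombinant = sum(n for hap, n in gamete_counts.items() if hap not in parental)
    return 100.0 * recombinant / total


# Invented counts of gametes transmitted by the heterozygous parent
counts = {"AB": 45, "ab": 45, "Ab": 5, "aB": 5}

print(recombination_percent(counts, "coupling"))   # 10.0 -> about 10 cM
print(recombination_percent(counts, "repulsion"))  # 90.0 -> impossible (> 50%)
```

Scoring the same progeny with the wrong phase gives a value above the 50% maximum, which is a sign that the parental classes were misassigned.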
1.2. Penetration
- Once attached to the cell, the T4 phage's tail fibres retract and squeeze the phage's tail sheath down onto the cell.
- Very complex step, because the bacterial wall is very stable
- Causes the phage tail to penetrate the cell and inject its DNA genome into the bacterium. = Uses an enzyme to degrade part of the cell wall
- Works much like a hypodermic syringe.
- Phage DNA is stored under very high pressure in the capsid, which helps with injection: upon penetration it is rapidly ejected into the bacterial cell.
- The attached phage remains on the bacterial surface; only the genome reaches the cytoplasm
How does CRISPR work? - Phase 2b
- Processing of the pre-crRNA involves the tracrRNA binding with perfect complementarity to each repeat sequence in the pre-crRNA.
- The resulting tracrRNA and pre-crRNA duplex is recognised and cut by the RNase III enzyme at the repeat regions. = Breaks up the multiple repeat and spacer units
- Releases individual CRISPR-RNA complexes, each containing one spacer and one repeat.
- The Cas9 protein then associates with each CRISPR-RNA complex to form a mature CRISPR-Cas9 complex
Human diseases associated with imprinting: Prader-Willi
- Rare autosomal disorder, affects ~1 in 10,000
- ~75% of cases: associated with deletion of 15q12 on one copy of chromosome 15
- In 15q12 deletions, it is always the paternal chromosome that carries the deletion, i.e. the only copy of the gene is from the mother.
- Suggests that the disease gene is imprinted (inactivated) in the mother, but not in the father
Prader-Willi syndrome: child has two inactive copies of the gene = one is a deleted copy from the father and one is an imprinted (inactive) copy from the mother
Normal: child has one active copy, because the paternal copy is active
- ~20% of cases: no deletion, but both copies of chromosome 15 are derived from the mother, i.e. uniparental disomy
How could this arise?
- Embryo starts with 3 copies of this chromosome (trisomy)
- One of these chromosomes is discarded because it has nothing to pair with; if the father's chromosome is the one discarded, the two remaining copies are both from the mother = UNIPARENTAL ISODISOMY
- Both copies of the Prader-Willi gene will then be inactive
RNAseq Analysis
- Sequence RNA from the affected tissue as well as the genome, so can look at functional consequences as well as the variants
- Some of the limitations of WES/WGS can be overcome by examining RNA in disease tissue
- Identify:
  o changes in expression
  o changes in splicing
  o message stability/degradation
- Examines the consequence of a mutation alongside mutation detection
- Need to be able to extract disease-relevant tissue
- Need to be able to distinguish the disease-causing change from normal variation - requires many controls
- Achieved diagnosis in 35% of cases
Shigella
- Shigella is closely related to E. coli - it is a strict pathogen that causes dysentery and diarrhoea
- Has a very low infective dose: fewer than 100 cells ingested can cause disease.
- Invades epithelial cells, causing inflammation and cell death; toxin producing
Shigella infection (Shigellosis)
• Shigella is shed in stool and can be passed on via direct contact with the bacterium (e.g. not washing hands, changing an infant's diaper, contaminated water/food sources).
• Infection takes 1-2 days to develop and can persist for up to a week. Causes bloody diarrhoea, abdominal pains/cramps, fever.
• Dehydration is a major cause of death; in rare cases can cause bloodstream infection.
Data on antibiotic resistance: Shigella
• In 1950 only 0.2% of Shigella were resistant to antibiotics
• By 1965, 58% were resistant to sulfanilamide, streptomycin, chloramphenicol and tetracycline
Establishing Mendelian Inheritance
- Single gene disorders/traits may show various patterns of transmission in pedigrees: will assume you know these general patterns
- If a phenotype is inherited in a simple Mendelian pattern, collect segregation data from all possible parent-offspring cases in pedigrees.
- If the pattern is autosomal recessive, expect 1⁄4 of progeny from unaffected carrier (heterozygous) parents to show the phenotype.
  o But need to compensate for ascertainment bias - some carrier couples by chance will have no affected children, so they are never ascertained.
- Very hard to "prove" autosomal recessive inheritance.
  o Unless have molecular markers or biochemical methods for detecting carriers.
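A worked sketch of the ascertainment bias mentioned above. It assumes complete ascertainment (every carrier couple with at least one affected child is found, and couples with none are missed entirely) and a fixed sibship size. Both are simplifying assumptions, and the function names are illustrative.

```python
def prob_no_affected(sibship_size, p=0.25):
    """Chance that a carrier x carrier couple has no affected children."""
    return (1 - p) ** sibship_size


def observed_affected_fraction(sibship_size, p=0.25):
    """Expected fraction of affected children among ascertained families only.

    Families with zero affected children are never seen, so the observed
    fraction is p / (1 - (1 - p)^s), which is larger than the true p.
    """
    return p / (1 - prob_no_affected(sibship_size, p))


for s in (1, 2, 3, 4):
    print(s, round(prob_no_affected(s), 4), round(observed_affected_fraction(s), 4))
```

With two children per family, over half of carrier couples (56.25%) have no affected child and go unseen, and the affected fraction in the remaining families is 4/7 rather than 1/4.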
Self-renewal of stem cells
- Stem cell divides and produces two daughter cells. - One will go back and self renew and the other will go and differentiate. This is important so new stem cells can be formed and we also have a constant stream of stem cells (pool of stem cells) - For adult stem cells, one daughter is a stem cell, other differentiates. - For ES cells both daughters remain stem cells.
WGA by Multiple displacement amplification (MDA)
- Strand-displacement synthesis at a constant temperature using B. subtilis bacteriophage Phi29 DNA polymerase, which forms a network of hyperbranched DNA structures
- Can amplify small amounts of DNA, even from a single cell
- Compared to PCR-based methods, generates higher molecular weight DNA (up to 12 kb) with better genome coverage = large amount of DNA from initially a very small amount of DNA
- Generates 1-2 micrograms in 3 hr
- Amplified DNA can be used for STR and SNP genotyping, array comparative genome hybridization (aCGH), even whole genome sequencing
- Uses of single cell WGA: pre-implantation genetic diagnosis, forensic materials, dissected tissues (e.g. tumours)
Uses random hexamers and Phi29 DNA polymerase
- In the figure, hexamers are blue and the blue dot is the polymerase
- When a polymerase from one reaction catches up to the next strand, it does not fall off but displaces that strand (pushes under it), so the polymerase can continue. This gives a new strand to which additional polymerase and hexamers can bind and create a copy of that new strand = branched structure
Active versus Inactive genes
- Strong correlation between methylated genes and inactivity
  o Many inactive genes have their DNA methylated
  o Many active genes have their DNA unmethylated
- Correlation suggests a causal effect but does not prove it
- Methylation is critically important in mammals - knockout of the DNMT1 methylase in mice causes embryonic lethality
Effect of methylation on mutation rates
- C and Cm are both prone to spontaneous deamination (spontaneous loss of the amine group)
- For Cm, deamination produces T, which causes CpG -> TpG mutations
1. Multifactorial Disease
- The majority of medically important human diseases are complex and multifactorial
- These diseases often "run in families" but do not follow Mendelian inheritance patterns
- They are thought to result from complex interactions between several genetic and environmental factors
Can be either:
1. Qualitative: trait (e.g. disease) is either present or absent, e.g. a person either HAS Crohn's disease or DOES NOT have the disease (underlying cause may be polygenic)
2. Quantitative: trait can be measured (continuous scale) = arbitrarily set a threshold value, and if you are over that value you have the disease (e.g. blood pressure)
In either case, estimates can be made of the genetic contribution to the disease
"Two - hit" Hypothesis
- To inactivate tumour suppressor genes, both copies of the gene must be inactivated = if one copy is normal it can still work and suppress growth etc. of cancer cells
- Genetic abnormalities (chromosomal rearrangements, deletions, mutations) do not provide the complete picture of genomic alterations found in cancers.
- DNA methylation and histone modification also contribute
1. Possible causes of association
- Type 1 error: at p = 0.05, 5% of results will be significant even without any true effect - Direct causation (e.g. if HLA-DR4 CAUSED rheumatoid arthritis) =Very unlikely that marker examined is the functional variant responsible for association = Marker is much more likely to be in LD with associated variant - Linkage disequilibrium (LD) - Population stratification
Y chromosome DNA
- Y chromosome exists as a block of DNA that does not recombine (except for a small pseudo-autosomal region on the short arm that pairs with the X) - The Y chromosome is transmitted solely down the male lineage - patrilineal - Polymorphic sequences have arisen over time by mutation - different haplotypes = 153 Y haplotypes have been defined, used to assemble a phylogeny = Looking at relatedness of diff ethnic groups - Closely-related males have the same Y haplotype - Differences in Y chromosome sequence can exclude a close relationship - Studies of Y chromosome polymorphisms have been useful in deducing patterns of relationship between different populations - Example 1: Ancestral haplotypes occur predominantly in African populations suggesting that Africa is the site of origin of human populations
PCR-based Whole Genome Amplification (WGA) methods
- degenerate oligonucleotide-primed PCR* - primer extension PCR - ligation-mediated PCR Limitations: - non-specific amplification artefacts - incomplete coverage of loci - small size of the DNA products
What is revealed by a person's biogeographic ancestry from DNA info?
- going back 10 generations, we all have 1,024 pedigree ancestors, many of whom are related and shared among different individuals - DNA-based biogeographic ancestry information is usually performed with genetic markers not involved in appearance traits, = Just bc a marker says a person is from Africa doesn't tell you anything about their appearance - Therefore should not be used to make statements about a person's EVCs - likely to be error-prone as almost no appearance trait is restricted to a certain geographic region
DNA Profiling using SNPs
1) SNPs would ideally have similarly high degrees of allelic diversity in worldwide populations - important to reach similarly high matching probabilities among people of different biogeographic ancestries 2) Crucial to avoid the effects of population substructure when estimating match probabilities Limitations: All current databases are STR-based and often legislation prevents storing of DNA Disaster victim identification (DVI) / Missing persons: - Victim and reference samples (relatives) are collected de novo - therefore suitable for converting to SNPs Kinship testing: SNPs have mutation rate 100,000 x lower than STRs A set of 45 SNPs could be used for forensic genotyping The SNPs: 1. do not detect human pop substructure 2. have high heterozygosity 3. show good match probabilities
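A rough sketch of how a match probability for a SNP panel might be estimated. It assumes Hardy-Weinberg proportions, independent (unlinked) SNPs, no population substructure, and an allele frequency of 0.5 at every SNP. All of these are illustrative assumptions rather than properties of any particular 45-SNP set.

```python
def snp_match_probability(p):
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Under Hardy-Weinberg proportions the genotype frequencies are
    p^2, 2pq and q^2; the match probability is the sum of their squares.
    """
    q = 1.0 - p
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(g * g for g in genotype_freqs)


def panel_match_probability(allele_freqs):
    """Multiply per-SNP match probabilities (assumes independent SNPs)."""
    result = 1.0
    for p in allele_freqs:
        result *= snp_match_probability(p)
    return result


per_snp = snp_match_probability(0.5)          # 0.375
panel = panel_match_probability([0.5] * 45)   # 0.375 ** 45
print(per_snp, panel)
```

A single SNP is a weak discriminator (0.375 at best), which is why many SNPs with high heterozygosity are needed to approach the match probabilities of an STR profile.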
Technical issues in DNA profiling
1) Stutter
- When PCR products of loci with 2 or 3 bp repeats are examined, each of the major products is accompanied by several shadow bands as well
- These are usually 1 and 2 repeats shorter than the major band
- Presumed to arise from replication slippage by the Taq DNA polymerase
- May interfere with interpretation of alleles that differ by only 1 or 2 repeats
- Much less apparent when 4 bp repeats are amplified, so these are now used universally
2) Mutation
- Rate of mutation of STRs is relatively high: estimated directly by pedigree analysis, or from samples of sperm cells, to be 1 in 1,000 - 10,000 / locus / generation
- If 10 loci are tested in 100 individuals, expect one new mutation on average
Mutation properties (consistent with replication slippage in vivo):
- most changes add or subtract a single repeat
- rate of addition of a repeat is the same regardless of the number of repeats
- rate of subtraction of a repeat is higher if more repeats are present (leads to an equilibrium number of repeats)
- rate for 2 bp repeats is higher than for 3 bp and 4 bp repeats
- mutations occur in the germ line (not in somatic cells)
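The "one new mutation" figure above can be checked with simple arithmetic. This sketch follows the notes' own counting, one transmission per locus per individual, which is an assumption. The function name is illustrative.

```python
def expected_new_mutations(loci, individuals, one_in_n):
    """Expected new STR mutations for a rate of 1 in `one_in_n` per locus per generation."""
    return loci * individuals / one_in_n


upper = expected_new_mutations(10, 100, 1_000)    # high end of the rate range
lower = expected_new_mutations(10, 100, 10_000)   # low end of the rate range
print(upper, lower)
```

The figure of one expected mutation corresponds to the upper end of the quoted range (1 in 1,000); at 1 in 10,000 the expectation drops to about 0.1.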
1.1 Transgenic techniques for genetic analysis in mice
1. Adding genes to the mouse genome = transgenic mice - can model dominant gain of function mutations 2. Mouse targeted gene knockouts- can model loss of function mutations Applications of genetic techniques: • Understanding gene function. • Characterisation of gene expression pattern and regulation. • Link mutant phenotypes and genes, "rescue" experiments. • Create mouse models of human diseases or developmental abnormalities.
1. Lytic Phage Life Cycle - T4 Phage
1. Attachment (adsorption) of the phage to the host cell 2. Penetration (entry, injection) of the phage nucleic acid into the host cell 3. Replication of phage nucleic acid and protein by host cell machinery = Takes over host cell 4. Packaging of genomes into capsids and assembly of new phages 5. Lysis of cell that releases new phages = The whole T4 phage lifecycle takes less than 25min to complete, and produces ~140 newly infective phages upon lysis.
Possible Limitations of the HapMap project
1. Don't know how well other genetic variants (indels, repetitive elements) are represented by SNPs
2. Don't know how well the chosen populations represent other populations
3. Disease-causing variants must not be the result of recurrent mutational events
4. Identification of genetic risk factors does not necessarily lead to improvements in health, e.g. the ApoE Alzheimer's susceptibility gene
5. Disease-causing variants must be common, because only the most common haplotypes are defined by tag SNPs in the HapMap - use of common SNPs and large spacing could lose important information
CD - CV hypothesis: "common disease - common variant"
'Reprogramming' of methylation patterns
1. During germ cell differentiation - Progressive demethylation occurs during development of the primordial germ cells - After gonadal differentiation, de novo methylation occurs - Sperm genome is more heavily methylated than egg genome - Sex-specific differences in methylation patterns are seen 2. During later embryogenesis - Very early embryo is highly methylated - Genome-wide demethylation occurs during later embryogenesis - Later still, widespread de novo methylation occurs again - Methylation pattern varies in different cell lineages - tissue specific patterns
Traits of a good STR
1. High degree of polymorphism is advantageous. = Higher variability gives more power to distinguish between people, especially closely related individuals
  - However, less polymorphic loci have lower mutation rates, which can make them more useful in some parentage testing situations
2. Simple repeat loci are desirable over highly complex loci
3. STR loci with a small allele span are preferred
  - large spans use a lot of electrophoretic real estate in multiplexes = e.g. if the smallest allele is 9 repeats and the largest is 26, that span of band width is the real estate. If the band width is too broad, it reduces the number of loci we can have in a single multiplex, because it would be hard to distinguish between markers
4. Relatively small difference in allele sizes.
  - HMW alleles are not amplified as well as companion LMW alleles
  - Allele imbalance could even result in allele dropout
5. Smaller PCR products
  - DNA types can be recovered more effectively from degraded DNA
Ideally, loci should be sufficiently polymorphic but with a small allele size range
ALSO...
- Amount of PCR product decreases with size of PCR product
- Higher molecular weight alleles, e.g. 300 bp, may not amplify
- So will only see one allele and not know if the individual is heterozygous or homozygous
3.1. Techniques of genetic analysis in Drosophila
1. Large scale mutagenesis screens eg. Loss of function, enhancers/suppressors of a phenotype = Very easy due to ease of breeding and regeneration times 2. Using transgenic flies to study gene function - loss of function with knockouts or RNAi, misexpression = Rna expressed in transgene of fly 3. Genetic mosaics - Individuals composed of cells of more than one genotype. Allow us to address developmental fate of various cells, and the functions of particular genes. - Arise due to mitotic recombination, can control this with specialised techniques. If occurs in a heterozygote for a mutation, results in patches of tissue that are homozygous mutant. Called mitotic clones. - Very useful if homozygous mutant is lethal (most cancer genes) can study its function in a mutant patch of cells.
Difficulties in identifying true DSLs
1. No single gene mutation is necessary or sufficient to cause the disease - true susceptibility allele will be found in some controls - true susceptibility allele will be absent in some patients 2. The major DSLs may be different in different populations 3. Genetic variants causing susceptibility may not be obvious mutations 4. Heterogeneity - susceptibility to most complex disease must be determined by many minor loci, not a few major loci
Methylated DNA can be readily identified e.g. Bisulfite sequencing
1. Restriction digest genomic DNA 2. Treat digested DNA with sodium bisulfite - bisulfite treatment converts unmethylated C to U (read as T after PCR) while leaving methylated C (Cm) unchanged 3. Amplify region with PCR 4. Sequence amplified region = See if there are cytosines remaining: any C still present in the sequence was originally methylated
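The conversion and read-out steps above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy sequence and the choice of methylated position are made up for the example and are not from the lecture.

```python
def bisulfite_convert(seq, methylated_positions):
    """Return the sequence as it would be read after bisulfite treatment + PCR."""
    out = []
    for i, base in enumerate(seq):
        if base == "C" and i not in methylated_positions:
            out.append("T")   # unmethylated C -> U, which is read as T after PCR
        else:
            out.append(base)  # methylated C (and every other base) is unchanged
    return "".join(out)


def infer_methylated(original, converted):
    """Positions where a C survived treatment, i.e. was originally methylated."""
    return [i for i, (a, b) in enumerate(zip(original, converted))
            if a == "C" and b == "C"]


original = "ACGTCCGA"                          # hypothetical region
converted = bisulfite_convert(original, {1})   # assume only the C at index 1 is methylated
methylated = infer_methylated(original, converted)
```

Comparing the treated read against the untreated reference is what identifies the methylated sites: here only the C at index 1 remains a C.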
Stem Cell Research
1. Stem cell therapies - restoring tissues that have been damaged by injury or disease. - might be accomplished by transplanting stem cells into the damaged area and directing them to grow new, healthy tissue. E.g. skin grafts, hair transplants, bone marrow transplants - May also be possible to coax stem cells already in the body to work overtime and produce new tissue. - To date, more success in animal models with the first method, stem cell transplants. = In a disease case, we would need to correct the gene of that individual, then put the cells back to develop new tissue formation 2. Screening new drugs and toxins. = In vitro, in stem cells grown in culture dishes 3. Derivation of cell lines from humans with genetic disease to use as in vitro models, e.g. Huntington's disease. = Extract cells from someone with Huntington's disease etc., then grow these cells in culture so you can test possible cures etc. So using these cells as a research tool
MDA of whole genomes: limitations
1. high allele drop-out (ADO): random non-amplification of one of the alleles present in a heterozygous sample - found in ~25% of the loci amplified from single-cell MDA samples - decreases the accuracy of the genotyping of a sample and ...... - can lead to misdiagnosis in pre-implantation and prenatal diagnosis - need to increase the number of loci that are studied - analyse several replicates of the same sample 2. preferential amplification (PA): relative over-amplification of one allele in comparison to the other - random, affects small stretches of genomic DNA - big problem when identifying STR alleles
1.4.3. Adeno-associated viral vectors 1.4.4. Lentiviruses
1.4.3. Adeno-associated viral vectors = Dominating the current treatments - Integrate but always at same position (19q13.3) that does not interfere with endogenous genes (+). = Not random integration - Can only take 5kb DNA, can be activated by adenovirus infection (-). = Space limited = Potential for this virus to become activated if the patient is also infected with an adenovirus (e.g. one causing a common cold) 1.4.4 Lentiviruses (type of retrovirus) - Infect nondividing cells (+) - Integrate (+/-) - Based on HIV, safety concerns (-) - More complex virus, has been more difficult to disable (-) - Used in vivo to correct mouse models of several CNS disorders.
1.5 Nonviral methods 1.5.1 Liposomes 1.5.2. Direct injection
1.5.1. Liposomes - synthetic vesicles that form spontaneously when certain lipids are mixed. Can carry DNA, cationic ones on outside, anionic ones on inside. Lipid coating allows DNA to survive in vivo, the liposomes bind to cells and are endocytosed. = Challenge to get these inside the cell bc cells have a natural mechanism to avoid foreign dna - easy to make and no limit to DNA size = Super easy to assemble due to charge, just mix the lipids and they form - but efficiency low and expression transient 1.5.2 Direct injection or particle bombardment - DNA injected directly into target tissue - efficiency low and expression transient - may be OK in muscle where cells not dividing, eg. DMD, trialled in mice
Allele-specific PCR amplification
2 PCR reactions One common primer and a 2nd primer that differs: - one specific for normal seq - one specific for mutant seq The last (3') base of the allele-specific primer sits on the base where the mutant allele differs, so we can design primers that will only amplify the wt or only the mutant allele - Can have a common reverse primer that is used in both reactions
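A toy model of the two-reaction logic, assuming (as a simplification) that a primer only gives product when it matches the template exactly, including its last 3' base. The primer sequences and function names are hypothetical, and templates are written in the same orientation as the primer to keep the comparison simple.

```python
WT_PRIMER = "GATCCTGAAC"    # hypothetical primer ending on the wild-type base
MUT_PRIMER = "GATCCTGAAT"   # same primer, but the last (3') base matches the mutant


def amplifies(primer, allele_site):
    """Simplified rule: product only if body and 3' terminal base both match."""
    return primer[:-1] == allele_site[:-1] and primer[-1] == allele_site[-1]


def genotype(allele_sites):
    """Call a genotype from the outcome of the two allele-specific reactions."""
    wt = any(amplifies(WT_PRIMER, a) for a in allele_sites)
    mut = any(amplifies(MUT_PRIMER, a) for a in allele_sites)
    if wt and mut:
        return "heterozygous"
    if mut:
        return "homozygous mutant"
    if wt:
        return "homozygous wild type"
    return "no amplification"


carrier = genotype(["GATCCTGAAC", "GATCCTGAAT"])   # one wt and one mutant allele
```

A product in both reactions means heterozygous; a product in only one reaction means homozygous for that allele.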
CRISPR Targeting of incoming DNA
= DNA bound to the CRISPR-Cas9 complex is cleaved through a dsDNA break The guide RNA binds with perfect complementarity to the repeat regions of our CRISPR array. This repeat region is joined to a spacer (blue) The spacer will then bind to invading dsDNA from a virus if perfectly complementary. The complex unwinds the DNA and, if there is a perfect match, there is a structural change to the Cas9 protein and it makes a ds cut 3 bp upstream of the PAM in the target DNA PAM sequence is used to help identify foreign DNA Any sequence that is NGG, where N is any nucleotide, is recognised as a PAM by Cas9 CRISPR-Cas9 complex rapidly dissociates from non-target DNA, but binds complementary sequences for longer times. = If there is no PAM sequence it will not even look at that DNA If there is a PAM sequence the Cas9 protein will use its spacer sequence to see if the dsDNA is complementary The CRISPR-Cas9 complex searches for incoming DNA through random three-dimensional collisions in the cell. Once bound the CRISPR-Cas9 forms an RNA-DNA heteroduplex and propagates unwinding of DNA. = Cas9 protein is responsible for searching the dsDNA from a virus. It unwinds the DNA, allowing the spacer RNA to scan for complementarity
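The "PAM first, then spacer match, then cut 3 bp upstream of the PAM" logic can be sketched as a simple string search. This is a rough illustration only: it scans one strand, requires a perfect match, writes the spacer in DNA letters, and the sequences and function name are invented for the example.

```python
def find_targets(dna, spacer):
    """Return (protospacer_start, cut_index) for each spacer match followed by NGG.

    cut_index is the position of the blunt cut, 3 bp upstream of the PAM
    (the cut falls between dna[cut_index - 1] and dna[cut_index]).
    """
    hits = []
    n = len(spacer)
    for i in range(len(dna) - n - 2):
        pam = dna[i + n:i + n + 3]
        if pam[1:] != "GG":           # no NGG PAM: the site is not examined further
            continue
        if dna[i:i + n] == spacer:    # PAM present: check spacer complementarity
            hits.append((i, i + n - 3))
    return hits


spacer = "ACGTACGTACGTACGTACGT"           # hypothetical 20 nt spacer
viral_dna = "TT" + spacer + "AGG" + "CC"  # target followed by an NGG PAM
no_pam_dna = "TT" + spacer + "ATT" + "CC" # same target, but no PAM
```

With these toy inputs the first sequence is cut and the second is ignored, even though the spacer matches both.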
Genomic Imprinting
= Related but not identical to methylation - Imprinted genes are monoallelically expressed depending on the parental origin of the allele - Mouse experiments showed both maternal AND paternal chromosome sets are necessary for normal embryonic development = Maternal disomy for some (not all) chromosome regions gave a different phenotype to paternal disomy for the same region = Some genes were being imprinted in either the mother's or the father's germline - Gene A imprinted (inactivated) when passing through mother's germline - Gene B imprinted (inactivated) when passing through father's germline
1.5.3. Receptor-mediated endocytosis
= To help the molecules escape being degraded once entering the cell • DNA is coupled to a targeting molecule that can bind to a specific cell surface receptor, inducing endocytosis and transfer of DNA into cell eg. DNA complexed to glycoprotein containing galactose, will be recognised by receptors on cell surface of liver cells and endocytosed • Problem is endocytosis delivers molecules to lysosomes, need to disrupt this so DNA can get into nucleus. Can add some viral gene products to assist this. = Need to make sure it doesn't go to the lysosome and be degraded • Gene transfer efficiency is high
1.2. Quantitative trait variation
A graph of the # of individuals in a population having a particular quantitative value produces a normal distribution The curve has its peak at the mean (μ) value The breadth of the curve (spread of values) relates to the variance (s²) or standard deviation (s) - Variance is called the total phenotypic variance - includes both genetic and environmental variance Normal range of a physiological measurement is very important in clinical medicine From your stats: the "normal" distribution sets the limits of the "normal" range Only ~5% of the population will be more than 2s above or below the population mean.
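A quick check of the "about 5% lie beyond 2 standard deviations" rule, using only the standard library. The function name is just for this sketch.

```python
import math


def fraction_beyond(k):
    """Two-tailed fraction of a normal distribution more than k SDs from the mean."""
    return 1 - math.erf(k / math.sqrt(2))


two_sd = fraction_beyond(2)        # about 0.046
exact_5pct = fraction_beyond(1.96) # about 0.050
```

So "2s" is a rounded rule of thumb: the exact 5% cut-off is at about 1.96 standard deviations.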
"Out of Africa" single origin model of human evolution
A recent common ancestor for all modern humans - a single woman - Mitochondrial Eve - lived in Africa ~170,000 years ago. - a single man - Y-chromosomal Adam - who lived ~100,000 years ago. Y Chromosome Adam Patrilineal human most recent common ancestor (Y-MRCA) - all Y chromosomes in living men are descended from him Male counterpart of mitochondrial Eve (the mtDNA-MRCA) - all mitochondrial DNA in living humans is descended from her NOT the only living male of his time - Many men alive at the same time have descendants alive today Only Y-chromosomal Adam produced an unbroken line of male descendants carrying his Y chromosome (Y-DNA) that persists today Y-chromosomal Adam probably lived between 60,000 and 90,000 years ago - separated from mitochondrial Eve by at least 30,000 years As male lines die out, a more recent individual becomes the new Y-MRCA
3. Drosophila melanogaster as a model organism (examples, end of lect 6)
Advantages • Rapid life cycle and many progeny - useful for large genetic screens (particularly enhancer/suppressor screens). • Cheap. • Genome sequenced (also 11 other Drosophila species). • Low chromosome number (4), easier to control genetic background. • Sophisticated genetic and molecular genetic techniques for studying gene function. • Most importantly, conservation of many genes and nearly all cell biological processes - 75% of human disease genes have homologue in fly. • No animal ethics requirements. • very useful for large scale chemical and drug screening, several large biotech companies using. Disadvantages • Not all aspects of human disease will have parallels in flies. • Not all human genes have homologues in flies.
Adult stem cell comparison to ES cells
Advantages: - No ethical concerns = Harvest from patient themselves - Grow in a controlled fashion (ES cells can cause tumours) - Phenotypically stable, once they differentiate they remain in that state, ES cells have been known to revert - No risk of immune rejection as uses patient's own cells Disadvantages: - Rare = hard to find in body - Difficult to culture compared to ES cells - Less potency than ES cells = Can only form a narrow range of cell types
1. The mouse (Mus musculus) as a model organism
Advantages: • Availability of many single gene mutations, can do classical genetic analysis. • Can create genetically modified animals easily (transgenic or knockout) - can generate tissue-specific or conditional knockouts or over-expressing strains. = Can target individual genes and knock them out etc. • Genome sequenced (most genes have human homologue). • Short generation time (8-9 weeks). • Small enough to house easily, have large litters and breed readily in captivity, docile and easy to handle. • Can trial therapies, vaccines etc with very good control of environmental factors. Disadvantages: • not a primate - not all genes/disease mechanisms shared with humans • genetic background of inbred lab strains = Very homogeneous - could be advantage but often disadvantage • ethical concerns • expensive, makes large screens very costly
2. The zebrafish (Danio rerio) as a model organism
Advantages: • Develop rapidly • Many progeny and small size - large scale mutagenesis and chemical or drug screens possible. • Cheaper than mice. • Genome sequenced. • Transparent - in vivo imaging - can watch development and progression of pathology. • Genetic tools - Random mutations/transgenesis - Targeted mutation and transgenesis through CRISPR/Cas9 genome editing - In addition to genetic tools = Morpholinos = small antisense oligonucleotides that can block splicing or translation Disadvantages: • Not placental mammal, fewer genes conserved with human, but still have ~75% of human disease genes.
Therapeutic cloning
Aim is to produce stem cells that are genetically identical to recipient. • Same technique as cloning of animals (somatic cell nuclear transfer), but endpoint is different. • Nucleus from somatic cell from patient is inserted into donor enucleated oocyte (obtaining these is an ethical obstacle, have to be harvested from women). • Resulting embryo grown in culture until ICM develops. Cells from ICM (inner cell mass) removed and cultured as in normal ES cell production. = Correct mutation in this process as well • Will produce ES cell line that is genetically identical to patient. • Works well in mice, but not yet in humans (some reports but no peer reviewed publications). • Reports of problems in primates, stem cells frequently aneuploid. • For genetic disease, possibility of using therapeutic cloning to make transgenic ES cells that are identical to patient but also have genetic defect corrected - Trials underway for several disorders.
Occurrence of Methylation
All methylated cytosines are followed by guanine (G) in the DNA sequence i.e. 5' CpG 3' (p refers to the phosphate bridge) Normally the majority of CpGs in the genome are methylated Methylation occurs after DNA replication Pattern of methylation on the parental strand acts as template for methylation of daughter strand A specific enzyme, methyltransferase (methylase), is required DNMT1 is needed for maintenance methylation DNMT3a and DNMT3b are de novo methyltransferases
Human diseases associated with imprinting: Angelman syndrome
Always the maternal chromosome that is deleted. Suggests gene is imprinted (inactivated) in the father. - Opposite to Prader-Willi syndrome
Cystic Fibrosis (CF)
An inherited life-threatening disorder that affects the lungs and digestive system. - CF is the most common autosomal recessive genetic disorder in Caucasians that leads to a decreased lifespan • 1 in 2500 live births • Estimates that 1 in 23 Caucasian individuals are silent carriers! 3-4% carrier rate Caused by mutations in the gene encoding the Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) protein
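The carrier rate can be roughly reproduced from the incidence figure if we assume Hardy-Weinberg equilibrium (an assumption added here, not stated in the notes). The function name is made up for the sketch.

```python
import math


def carrier_frequency(incidence):
    """Carrier (heterozygote) frequency 2pq for a recessive disease,
    assuming Hardy-Weinberg equilibrium and incidence = q**2."""
    q = math.sqrt(incidence)   # frequency of the disease allele
    p = 1 - q                  # frequency of the normal allele
    return 2 * p * q


cf_carriers = carrier_frequency(1 / 2500)   # q = 0.02, so 2pq = 0.0392
```

This gives about 3.9%, i.e. roughly 1 in 25, which sits within the 3-4% carrier rate quoted above and close to the 1 in 23 estimate.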
DNA Profiling using SNPs rather than STR loci
Analysis of low amounts of highly degraded DNA requires very small (<50 bp) amplicons (PCR products) - impossible with STRs - SNPs reflect single bp changes; detected with small amplicons - Because SNP variation does not involve repetitive sequences, SNP profiling avoids stutter artefacts that complicate STR profile interpretation = Low stutter, fewer mutations, small amplicons: makes SNPs much more useful However bi-allelic SNPs are less polymorphic than multi-allelic STRs - Therefore SNPs are less informative in the analysis of mixtures of DNA from multiple individuals Reduced variation per locus can be compensated for with more SNPs relative to STRs, and/or the use of tri-allelic SNPs 20-50 autosomal SNPs could achieve match probabilities similar to those obtained with 10-15 STRs = More genotyping but advantages of small amplicons and low mutation rate
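A back-of-envelope sketch of why more SNPs are needed than STRs. It assumes Hardy-Weinberg proportions, independent loci, and made-up allele frequencies (a SNP with two equally common alleles, an STR with ten equally common alleles); real loci will differ, so the numbers are illustrative only.

```python
import math


def genotype_match_probability(allele_freqs):
    """Chance that two unrelated people share a genotype at one locus (HWE assumed)."""
    n = len(allele_freqs)
    genotype_freqs = []
    for i in range(n):
        for j in range(i, n):
            if i == j:
                genotype_freqs.append(allele_freqs[i] ** 2)                   # homozygote
            else:
                genotype_freqs.append(2 * allele_freqs[i] * allele_freqs[j])  # heterozygote
    return sum(g * g for g in genotype_freqs)


snp = genotype_match_probability([0.5, 0.5])   # 0.375 per SNP
str_locus = genotype_match_probability([0.1] * 10)   # 0.019 per STR

# Number of such SNPs needed to reach the match probability of 13 such STRs
snps_needed = math.ceil(13 * math.log(str_locus) / math.log(snp))
```

Under these toy assumptions about 50 SNPs do the work of 13 STRs, the same order of magnitude as the 20-50 quoted above; the exact figure depends heavily on the allele frequencies chosen.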
Phylogenies of Y chromosome and mtDNA haplogroups
Ancestral lineages - Rest of the haplogroups are subsets of original African haplogroups - Strong evidence for out of Africa evolutionary theory
Genome Wide association studies
Association studies rely on finding a particular allele associated with a disease. The frequency of a particular allele is compared among affected and unaffected individuals in a population (Case - Control study) = Can use all individuals in the population not just families Looking at tens of thousands of individuals so can be very hard to get that amount of people Advantages: Only need samples from affected individuals and controls. No pedigrees / family studies required Disadvantages: Need a lot of individuals to have statistical power e.g. allele A is associated with disease D IF people with D also have A significantly more often (or possibly less often) than predicted from known frequencies of A and D in the population = We are just looking for association with an allele not linkage Example: HLA-DR4 is found in 36% of the UK pop, and in 78% of people with rheumatoid arthritis.
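The HLA-DR4 example can be turned into an odds ratio. This sketch treats the 36% UK population frequency as if it were the control frequency, which is an approximation made for illustration; the function name is invented.

```python
def odds_ratio(freq_in_cases, freq_in_controls):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = freq_in_cases / (1 - freq_in_cases)
    odds_controls = freq_in_controls / (1 - freq_in_controls)
    return odds_cases / odds_controls


dr4_or = odds_ratio(0.78, 0.36)   # (0.78/0.22) / (0.36/0.64), about 6.3
```

An odds ratio well above 1 means the allele is over-represented in patients, which is the signal an association study looks for. It says nothing by itself about linkage or causation.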
Genetic Basis of Resistances: Horizontal Gene Transfer
At a scale exceeding the genome, new functionalities can be acquired by import of genetic systems to the bacterial cell through Horizontal gene transfer (HGT). Genomic recombination: integrates new genomic elements through recombination events (transformation & transduction) into the genome. Plasmid acquisition: maintains genomic elements as extrachromosomal plasmids (conjugation). HGT is ubiquitous in the natural environment and is often the culprit in acquirement of resistance in the clinic
Genetic Basis of Resistances: Point Mutation
At the finest genotypic scale, de novo point mutations can affect expression or alter RNA or protein structures of drug targets or resistance conferring genes. De novo point mutations appear frequently in laboratory evolution experiments Upstream SNP: Single-nucleotide polymorphisms (SNP) appearing upstream or within promoter regions can affect expression ORF SNP: Open reading frame (ORF) SNPs can alter RNA or protein structures of drug targets or resistance conferring genes. = SNP or mutation in the gene itself so the protein structure changes
Genetic Basis of Resistances: Genetic Scale
At the genomic scale, genetic elements can be shuffled across the genome to assemble new combinations, which can affect expression of relevant genes. Structural rearrangement of the genes and chromosome Promoter acquisition: by introducing a strong promoter upstream of a previously unexpressed or lowly expressed gene. Gene amplification: by generating multiple copies of specific genes or gene cassettes. Allows for increased expression, and a level of redundancy for point mutations to act upon. = If we mutate the ribosome most mutations are going to be deleterious and stop it functioning, so with only one copy of a ribosomal gene the cell can't risk changing it in case it no longer works. Duplicating the gene gives multiple copies: only one copy needs to function correctly, and the other copies are free to accumulate mutations that may evolve a new working function.
Bacteriophage
Bacteriophage (or phage for short) are bacterial viruses - can only infect bacteria. - Is a virus - completely dependent on host metabolism for growth and replication. - Highly specific for a bacterial host, usually strain level specificity. - Is a biological organism that grows, degrades and evolves.
Lysogenic Phage Life Cycle - Lambda Phage
Bacteriophage lambda infects E. coli and has a linear, dsDNA genome. At each 5' end of the dsDNA genome is a single stranded 12 bp region. - These single stranded "cohesive" ends are complementary in sequence. - They form the "cos" sites in the lambda genome - Once inside the host cell, the lambda genome circularizes by base-pairing the single stranded "cohesive" ends to form a circular genome with a cos site - Lambda expresses a protein called "lambda integrase" that recognises a specific attachment site in the host chromosome - Integrase creates a staggered end cut and inserts the lambda genome at this cut site.
Affected sib pair (ASP) analysis
By random segregation, sib pairs share: 0 parental haplotypes= 1⁄4 of the time 1 parental haplotype = 1⁄2 the time 2 parental haplotypes = 1⁄4 of the time - On average they have one allele (from 2) in common at any locus i.e. they share alleles 50% of the time = Not looking at recombination, just variation from the 1:2:1 ratio In a fully penetrant recessive Mendelian disease affected siblings share 2 alleles at the disease locus i.e. they share alleles 100% of the time In complex disease, one locus is not sufficient to cause the disease, therefore an intermediate level of allele sharing is expected 50% < Complex disease allele sharing < 100% In ASP analysis, affected sibling pairs are analyzed to determine if there are loci at which these sibpairs share alleles more frequently than the 50% expected by chance Looking for a deviation from the normal 1:2:1 allele sharing ratio Genome Scan: ASPs are genotyped for hundreds of polymorphic markers across ALL chromosomes - Looking for sharing of chromosomal regions above the 1 : 2 : 1 ratio expected from random segregation. This excess allele sharing is a sign of linkage between marker and disease gene The more sib pairs used (typically hundreds), the more powerful the analysis model-free linkage : nonparametric LOD (NPL) score [c.f. model-based linkage : LOD score]
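The 1:2:1 expectation and a test for excess sharing can be sketched as below. The observed counts are hypothetical, and a plain chi-square goodness-of-fit test is used here only as a simple stand-in; real ASP studies typically use dedicated statistics such as the NPL score mentioned above.

```python
from fractions import Fraction

# Probability that a sib pair shares 0, 1 or 2 parental haplotypes by chance
IBD_PROBS = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}


def expected_sharing(probs):
    """Mean proportion of alleles shared (out of 2) at a locus."""
    return sum(k * p for k, p in probs.items()) / 2


def chi_square(observed):
    """Goodness-of-fit statistic of observed sharing counts against 1:2:1."""
    total = sum(observed.values())
    stat = 0.0
    for k, p in IBD_PROBS.items():
        expected = float(p) * total
        stat += (observed[k] - expected) ** 2 / expected
    return stat


null_sharing = expected_sharing(IBD_PROBS)      # 1/2, i.e. 50% by chance
observed = {0: 15, 1: 50, 2: 35}                # hypothetical counts for 100 affected sib pairs
stat = chi_square(observed)                     # compare with 5.99 (2 df, p = 0.05)
```

With these made-up counts the statistic is 8.0, above the 5.99 threshold, so the marker would show more sharing than random segregation predicts.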
Microbiome of CF lungs
CF patients have diverse and unique lung microbiomes that are typified by Pseudomonas aeruginosa infections - Chronic colonization of CF patients' lungs - Microbiome is highly diverse - Infections cause lung damage and loss of function - Microbiome varies b/w patients but also within a patient depending on the stage of life and of the disease = Makes CF disease highly diverse
Demonstrating Pluripotency
Chimeras - Taking the cells, mixing them with the cells of an embryo, can the cells we have added in (the potentially pluripotent stem cells), create all the cell types - Most stringent assay - Do cells contribute to all cell types - Inject cells into blastocyst and examine contribution to offspring • Frequency of contribution • Cell types contributed to Not possible with human cell lines Teratomas - Inject cells into immune compromised mouse = Inject the human cells into an immune compromised mouse. If you get cells from the three major lineages, that is a good indication that these cells have pluripotent potential. Not as good as a chimera where you can see a viable animal with all the cells formed correctly, but still a good indication. - Examine cell types in the teratomas that form • Endoderm • Mesoderm • Ectoderm
What is CF disease?
Chronic, progressive and life limiting disease that is characterised by: - chronic respiratory disease - pancreatic insufficiency - elevation of sweat electrolytes - male infertility - among others
Six general classes of CFTR dysfunction
Class 1 - Null mutations - no CFTR polypeptides are produced. Includes mutations that introduce premature stop codons or that generate highly unstable RNAs Class 2 - Mutations impair the folding of CFTR, thereby arresting its maturation - ∆F508 is typical example, this misfolded protein cannot exit from the ER Class 3 - Mutations allow delivery of CFTR protein to cell surface, but its function is disrupted - important due to the drug ivacaftor, a potentiator that helps the channel open Class 4 - Mutations located in membrane-spanning domains and have defective or reduced function Class 5 - Mutations reduce number of CFTR transcripts and subsequently amount of protein produced Class 6 - Mutations produce CFTR that is synthesized normally, but is unstable at cell surface
Testing for SPECIFIC sequence changes
Common mutations e.g. ∆F508 mutation in CFTR Within a family with a history of a genetic disease Initial mutation scanning. Once mutation is identified, need only test for that particular mutation Mutation testing 1) RFLPs: Change introduces/removes a restriction site 2) Allele specific oligonucleotide hybridisation 3) Allele-specific PCR amplification 4) Oligonucleotide arrays
1.1.1 Relative Risk with respect to Qualitative traits
Compares the frequency of a disease in the relatives of an affected proband with the frequency in the general population (prevalence) relative risk ratio (λr) = freq of disease in relatives of an affected person / freq of disease in general population The greater the value of λr, the greater the familial aggregation
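The ratio is simple enough to compute by hand, but a small sketch makes the definition concrete. The frequencies below are hypothetical and the function name is only for this example.

```python
def relative_risk_ratio(freq_in_relatives, population_prevalence):
    """Relative risk ratio: disease frequency in relatives of a proband
    divided by the frequency in the general population."""
    return freq_in_relatives / population_prevalence


# Hypothetical disease: 4% of sibs of affected people are affected, 1% of the population
lambda_sibs = relative_risk_ratio(0.04, 0.01)
```

A value of about 4 would mean sibs of an affected person are four times more likely to be affected than a random member of the population; a value of 1 would mean no familial aggregation.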
1.3 Gene therapy vectors
Constructs can be designed either to integrate into host chromosome or to remain as episomes: = Take gene of interest and introduce into the patient • Integration (Integrating vectors incorporate into the host genome) - long term gene expression (+) = Will be in progeny of those cells - occurs randomly so different in different cells, can have large differences in expression levels (-) - integration may cause mutations in endogenous genes, eg, could activate an oncogene (eg. SCID gene therapy trial) (-) • Episomes (DNA retained but doesn't integrate) = Episomal vectors carry into the nucleus and allow expressing but doesn't integrate - limited duration of gene expression, diluted out as cells divide (-) = Might require multiple treatments - patients can develop an immune response with multiple treatments so efficacy becomes less and less - self limiting if something goes wrong (+) = Won't insert or cause a mutation
Lytic or Lysogenic? The 'genetic switch'
Control of these alternative life cycles - lytic or lysogenic - by lambda phage has been likened to a "genetic switch" where a defined series of events must occur to favour one pathway over the other. This decision depends in part on two key repressor proteins that accumulate in the E. coli cell following infection: The lambda repressor called "cI protein" and a second repressor called "Cro". = Whichever protein accumulates first will determine which cycle the phage goes into cI protein represses lytic genes - If cI protein accumulates inside the cell following infection - lambda enters lysogenic cycle - cI protein is a global repressor of phage genes! - The cI proteins actively repress ALL other phage genes, including Cro = Lambda can't express the genes it needs to enter the lytic cycle so is forced to go into the lysogenic cycle - When this happens the lambda phage enters lysogenic cycle Cro represses cI and activates lytic genes - If Cro accumulates inside the cell following infection lambda enters the lytic cycle. - Cro blocks the expression of cI proteins (along with two other regulators, cII and cIII). - If cI proteins are blocked, they cannot globally repress ALL phage genes. - Leads to lytic gene activation
CFTR gene and protein
Cystic Fibrosis Transmembrane Conductance Regulator (CFTR) is a regulated chloride channel located in the apical membrane of epithelial cells - CFTR transports Cl- ions to the outside of the cell - Loss of CFTR function disrupts Cl- transport which leads to abnormal fluid and mucus secretions The CFTR gene encodes a large integral membrane protein (~170 kDa), has 27 exons and spans 190 kb of DNA. CFTR has 5 domains: Two membrane spanning domains each with 6 transmembrane sections (MSD1&2) that form the pore of the chloride channel Two nucleotide binding domains (NBD1&2) that bind and hydrolyze ATP to open and close channel One regulatory domain with multiple phosphorylation sites
DNA fingerprinting cont.
DNA fingerprinting allows comparison of patterns in different individuals Can clearly demonstrate non-identity (different bands) But cannot prove identity - All bands may be shared, but this could occur by chance DNA fingerprinting using mini-satellites has three disadvantages: 1. Southern blot procedure requires ~1 μg DNA, a significant amount of tissue (>10⁶ cells) = needs lots of DNA 2. Difficult to define the limits of a 'band', and to define a 'match' 3. Difficult to calculate the probability of obtaining an exact match Soon superseded by DNA profiling using microsatellites
Ancient DNA
DNA is a stable molecule and can survive many years However, outside the body, DNA damage repair mechanisms not active Damage accumulates over time: - cleavage of the sugar phosphate backbone - loss of bases - chemical modification of bases (oxidised C and T, hydantoins) - cross-linking of backbone(s) Very unlikely that DNA can survive >1,000,000 years (e.g. Jurassic Park) Survival best in cold conditions, and in bone Short fragments (~100-200 bp) may be amplified, and overlapping sequences assembled Usually easier to amplify mitochondrial DNA (only 16,500 bp of sequence) Criteria: - Same sequence obtained from independent replicates - Same sequence obtained by independent labs - Need to be certain that amplified DNA is from specimen and not contamination
Non-recombining, uni-parentally transmitted DNA
DNA transmitted en bloc (non-recombining) and from only one parent Mitochondrial DNA Present in all mitochondria - can have many mitochondria in each cell (esp. in muscle) - many more copies than nuclear DNA - Small circular chromosome (16,569 bp, compared with 3 × 10⁹ bp in nucleus) - Mitochondria are strictly maternally transmitted - matrilineal inheritance - Limited genetic variation exists between mitochondria in different lineages - Maintained down the generations (no genetic exchange) - Useful marker for sibs, and more distantly related individuals in ethnic groups
1.1.3.1 MZ versus DZ twins
DZ twins share intrauterine environment, are reared in the same conditions but share only 50% of alleles If MZ twin concordance > DZ twin concordance, this is strong evidence of a genetic component to the disease: concordance ratio Twins reared apart MZ twins separated at birth - ideal chance to observe disease concordance in people with identical genotypes reared in different environments Limitations of twin studies 1. Genetic differences in MZ twins do exist - somatic rearrangements in immunoglobulin and T cell receptor loci - random X inactivation in females 2. Possible differences in intrauterine blood supply, development
Population screening: Clinical Utility
Degree to which results will change / improve medical care Must have a useful outcome e.g. 1 - Newborn testing for PKU - treatable by diet = So very useful to know, which is why we test at birth e.g. 2 - Huntington disease - not treatable - usually know the disease is in the family = Not useful to screen everyone, more useful to screen within a family instead e.g. 3 - Screening for CF carrier status in adults - Possibility of avoiding the birth of an affected child = common disease Ethical Principles in Population screening - Voluntary - Privacy - No pressure on course of action = Decision on what to do if affected or carriers is up to the parents - Confidentiality
PCR amplification of micro-satellite repeats
Design PCR primers that bind to the unique sequence flanking the repeat sequences, so not in the repeat sequence itself Single locus profiling using PCR and assay of fluorescent products after electrophoresis: Called a DNA 'profile' Can assign individual bands to individual alleles of the locus PCR primers labelled with fluorescent dye (not in the repeat) 'Multiplex' profiling of PCR products after electrophoresis means >1 locus can be profiled per PCR reaction: Each locus: Different PCR primers labelled with different fluorescent dyes (not in the repeat) Many different polymorphic STR loci potentially useful Standard sets of ~10 different loci used (discovered by trial and error) - STRs with 4 bp repeats selected
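A small sketch of why primers in the unique flanking sequence let us read the allele: the amplicon length between the flanks depends only on the number of repeats. The flanking sequences, the GATA repeat unit and the function name are hypothetical, chosen just for the illustration.

```python
def count_repeats(amplicon, left_flank, right_flank, unit):
    """Count copies of `unit` lying between two unique flanking sequences."""
    start = amplicon.index(left_flank) + len(left_flank)
    end = amplicon.index(right_flank, start)
    region = amplicon[start:end]
    copies = len(region) // len(unit)
    if region != unit * copies:
        raise ValueError("region between the flanks is not a pure repeat")
    return copies


LEFT, RIGHT, UNIT = "GGTCA", "TTCGA", "GATA"   # hypothetical flanks and 4 bp repeat
allele_a = LEFT + UNIT * 9 + RIGHT             # allele with 9 repeats
allele_b = LEFT + UNIT * 12 + RIGHT            # allele with 12 repeats

repeats_a = count_repeats(allele_a, LEFT, RIGHT, UNIT)
repeats_b = count_repeats(allele_b, LEFT, RIGHT, UNIT)
size_difference = len(allele_b) - len(allele_a)   # what electrophoresis separates
```

The two alleles differ by 3 repeats, i.e. 12 bp of product length, which is what shows up as two separate bands or peaks for a heterozygote.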
Dynamic mutations & the trinucleotide repeat disorders
Dynamic mutations (unstable mutations): - Mutations of simple sequence repeats (micro-satellites) that variably lead to an abnormal phenotype and that can increase (or decrease) in copy number between generations. - Most micro-satellites (short tandem repeats) occur within non-genic DNA and have no effect on phenotype. - A few occur within transcribed regions of genes. - The number of repeats in some may vary much more frequently than usual for micro-satellites. - Biochemical mechanism proposed for expansion of repeats is slipped mispairing = Leading hypothesis is that the repeats change due to slipped mispairing: during replication the strands detach within the repeats and, because the repeats are identical, they can reattach shifted in position, so extra copies are made when synthesis continues. Hard to test this but it is the leading theory at the moment
2.1 Heteroduplex analysis by capillary / gel electrophoresis
Heteroduplex vs homoduplex: With only single bp difference can anneal like this bc so similar, there will just be a slight change in the structure where these two bases don't bind, so when run through gel they will run at different speeds than the homoduplexes
Creating patient-specific (pluripotent) stem cells
•Aim to generate stem cells specific to patient, genetically identical thus no problem of immune response. Could also use genetically corrected stem cells that are otherwise genetically identical to patient. Two main approaches under investigation: 1. Therapeutic cloning (uses ES cells) 2. Reprogramming of differentiated adult cells- induced pluripotent stem cells (uses adult cells)
Male identification using Y chromosome
Especially useful in mixed samples - small amounts of male DNA difficult to separate from female. - preferential amplification of female component Y chromosome is unique and non-recombining Y-STRs - Haplotypes from sets of non-recombining, male-specific Y-STRs are used for male identification - Commercial kits with up to 17 well-defined Y-STRs are available - Y-STR haplotype databases must be larger than autosomal STR databases because non-recombining DNA changes less frequently - Allow the identification of groups of paternally-related men in most human populations with a high level of resolution Using Y haplotypes to distinguish between related men low mutation rates of the currently used Y-STRs ~ 1 mutation per marker every 1,000 generations - unlikely to occur between related men 13 rapidly mutating (RM) Y-STRs, Have ~ 1 mutation per marker every 100 generations - possible to differentiate >70% of close and distantly related males c.f to 13% with current Y-STRs
Antibiotic Resistances: Target Modification
Even if drugs do accumulate unmodified in the bacterial cytoplasm, binding and inhibiting their target can be hampered by a change in the target (Target modification ⊣ Binding).
Residue substitution: Chemical modification of the target itself to bind or block the target site.
= Typically lock and key, so change the lock (change an aa etc, change the fold of the structure)
Target protection: Modification of the target site that changes structure or activity to evade antibiotic.
Expression level: Change in target abundance, produce more target to effectively quench the antibiotic.
= One antibiotic molecule can only inhibit a limited number of target molecules. Say an antibiotic targets the ribosome: if the bacterium makes 50x more ribosomes, 50x more antibiotic is needed to be effective
Should we sequence exome/genome?
Exome capture (Hybridisation of exons to the array) - 1.5% of genome: Greatly reduces amount of genome that needs to be sequenced - Most known mutations in coding sequence - variable coverage across exome - Much cheaper than whole genome ... 1/2-1/3 of cost Whole genome - Even coverage, can examine copy number variation (CNVs) (No areas hybridised more than others) - Includes regulatory sequences - Far greater analysis required = Analysis is more expensive When whole genome is done, first step is to restrict it down to known coding sequences
Population screening: Clinical validity
Extent to which test result is predictive for a disease Predictive value - rate of false positives (specificity) - indicate disease when actually NOT present - low predictive value results in unnecessary anxiety - false positives rare in DNA tests Test Sensitivity - rate of false negatives - indicate NO disease allele when actually IS present - depends on degree of allelic heterogeneity - normally only test for a subset of more common mutations - Geographical considerations must be taken into account in choosing mutations to be tested The alleles to be screened for should be determined based on allele frequencies in the local/Victorian population (any variation on this that makes clear it should be adjusted for those common in the population to be examined, rather than most common globally)
Gene Tracking
Following molecular markers linked to a disease gene in a family
Necessary when the disease gene is not known
- Historically used for Huntington disease, cystic fibrosis, myotonic dystrophy
- Still useful for very large genes where scanning the entire gene is impractical
Prerequisites:
- Good mapping data (markers closely linked to disease locus)
- Pedigree and sample availability
- No uncertainty regarding diagnoses or map locations
3 steps required:
1. Distinguish the chromosomes in the parent(s): closely linked heterozygous marker(s)
2. Determine phase, i.e. which chromosome carries the disease allele
3. Determine which chromosome was received by the consultand
Examine population frequency of variant
For a rare disease the causative variant will have a low allele frequency in the population. For example, if we consider rare to be <1 in 10,000:
For recessive disease we would expect an allele frequency of <1% (actually much less, as not all disease is due to the same mutation)
= Under Hardy-Weinberg, q² < 1/10,000 gives q < 1%, so fewer than about 2% of people are carriers, and the frequency of any one variant is lower still because carriers will be carrying different variants
For dominant disease we would not expect the variant to be present in healthy individuals
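The allele-frequency reasoning above can be checked with a short calculation. This is a minimal sketch that assumes Hardy-Weinberg equilibrium and a single disease allele; the function names are illustrative, not from the lecture.

```python
import math

def recessive_allele_frequency(incidence):
    """Disease allele frequency q, assuming incidence = q**2 (Hardy-Weinberg)."""
    return math.sqrt(incidence)

def carrier_frequency(q):
    """Heterozygote (carrier) frequency 2pq, with p = 1 - q."""
    return 2 * (1 - q) * q

# The notes' threshold for a "rare" recessive disease: 1 in 10,000
incidence = 1 / 10_000
q = recessive_allele_frequency(incidence)
carriers = carrier_frequency(q)
print(f"allele frequency q = {q:.4f}, carrier frequency = {carriers:.4f}")
```

With allelic heterogeneity the frequency of any single causative variant is lower than q, which is why a candidate variant seen at around 1% in a reference population would usually be too common to explain a disease this rare.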
New genetic treatments for CF
Gene therapy approaches have been tried but with limited success.
- A new wave of small molecule drug treatments has recently launched and is changing the way we approach genetic diseases
Ivacaftor - Small Molecule Treatment
Ivacaftor is a CFTR potentiator that increases Cl- ion transport by potentiating the channel open probability (or gating) of the CFTR protein at the cell surface.
- ONLY works on patients that have Type III CFTR mutations (gating mutations), e.g. G551D
- Once Cl- transport is restored, mucus is slowly rehydrated and successfully cleared from the lung, and bacterial infections are reduced
Lumacaftor - Small Molecule Treatment
Lumacaftor improves the conformational stability of ∆F508-CFTR (in ∆F508 homozygous patients), resulting in increased processing and trafficking of CFTR protein to the cell surface.
- Still in clinical trials, but mechanism of action and early reports suggest it is effective in refolding ∆F508-CFTR.
- However, ∆F508-CFTR typically still has gating problems once at the cell surface.
ORKAMBI - Lumacaftor/Ivacaftor
The combined activity of lumacaftor and ivacaftor acts directly on CFTR protein to increase quantity, stability and function of ∆F508-CFTR protein, resulting in increased Cl- transport.
- Ivacaftor acts as a potentiator of CFTR at the cell surface
- Lumacaftor improves stability and processing of CFTR within the cell/ER
- Together they can treat ∆F508 mutations
- Potentially act on other mutations as well.
DNA Hypo-methylation in Cancer
Global DNA hypo-methylation has been reported in almost every human malignancy
= Overall decrease of methylation in cancer cells
- Measurement of Cm content by HPLC
- Verified by determining methylation at specific sequences
Majority of hypo-methylation occurs in repetitive elements: satellite sequences and centromeric regions. These are regions that are normally heavily methylated. The function of these repetitive elements is not well understood.
However! Hypo-methylation
1) Can also activate oncogenes
2) Can cause chromosome instability = Lose chromosomes or they break up
Detecting ANY mutation: 2) Heteroduplexes or mismatches
Heteroduplex analysis: allows us to detect differences
- Doesn't tell us what the mutation is, but tells us there is something different between this DNA and our control DNA.
- Heteroduplex: complex formed by two different strands of DNA. They need to be similar enough to bind but have differences
- Run on a PAGE gel; fragments run differently through the gel depending on shape.
Most mutations are heterozygous: AD is +/-; AR is -/- but these are usually compound heterozygotes
Heteroduplex DNA has different thermal and physical properties to homoduplex DNA
Abnormal mobility on non-denaturing polyacrylamide gels (PAGE)
- Simple
- restricted to short fragments
- detects indels, and most but not all single base substitutions
Other heteroduplex-based methods:
1) Denaturing HPLC: high throughput, widely used
2) Denaturing gradient gel electrophoresis
Detecting ANY mutation in a gene: 3) Deletions
Homozygous / hemizygous deletions
= Give no PCR product
= ~60% of DMD mutations have one or more deleted exons
= 'Multiplex' PCR can identify 98% of all DMD deletions
= Primers flank each exon, and primer pairs for multiple exons can be combined in one reaction
= The deletion needs to be homozygous, or hemizygous for X-linked genes
==> We prefer tests that give a positive result, but here we are looking for a missing band, which can give false positives if the PCR or gel didn't work correctly
Positional cloning (examples in lect 3)
Identifying a gene after mapping it (in any organism) is called positional cloning. This method has four main steps:
1. Obtain the sequence of all the DNA in the candidate region
- The old way: do a partial restriction digest on genomic DNA to generate large, overlapping fragments, then clone them into a vector (i.e. create a genomic library).
- Use chromosome walking to find all the clones necessary to join the markers together. This is done by identifying a clone that overlaps one of the markers in the candidate region, probing the library to find an overlapping clone, then using that clone to identify the next overlapping clone, repeating this process until the markers are joined together.
- The new way (post HGP): go to human genome databases and download the sequence! However, some regions of the human genome are not yet completely sequenced or assembled. In this case you can often order in clones from the region of interest (rather than making your own library), but may need to generate your own contig.
2. Identify all the genes in the region. Can be done with many methods:
- Gene prediction: looking for ORFs (exons).
- Zoo blots: sequences are more likely to be conserved across related species if they are part of genes.
- CpG islands: looking for clusters of unmethylated CpG dinucleotides, found near many transcription initiation sites. Islands can be identified because the HpaII restriction endonuclease only cuts unmethylated CCGG sites.
- Exon trapping: random fragments from the region of interest are cloned into special vectors that are engineered so a splicing reaction will occur if the cloned fragment contains an exon/intron boundary.
- New method, post HGP: look at the annotated genes. These have been determined via ESTs, gene finder programs, comparative genomics (zoo blot equivalent), and BLAST searches to identify homology to known genes in other organisms.
3. Prioritise the genes to obtain candidate genes. The greater the knowledge of the physiological effects of the gene, the better.
- A) Look for appropriate function: perform BLAST searches with predicted genes to (i) identify possible function via homology to proteins of known function and (ii) identify homologues in model organisms, which could have known mutant phenotypes.
- B) Look for appropriate expression by performing expression studies to find predicted genes whose expression fits the pattern of the disease. These may not help, depending on the gene.
4. Confirm a candidate gene
- mutation screening in affected individuals
- restoration of a normal phenotype in vitro (rescue)
- production of an animal model of the disease, e.g. a knock out for a loss of function
- use an animal model or cultured cells to understand the function of the gene
Interlocus alleles
If 1 locus has 2 alleles and the other only 1, the interlocus allele probably belongs to the 'homozygote' Heterozygosity levels can help predict which locus is more likely to be a heterozygote
Probability of obtaining a specific DNA profile
If an individual has been genotyped at 1 STR locus, we can calculate the probability of obtaining that profile in a population by chance
Requires an accurate database of allele frequencies for the specific population involved
e.g. Say the genotype of an individual at the D3S1358 locus was (15, 16)
Database frequencies of alleles at the D3S1358 locus: (allele freq table for each population)
So chance of obtaining this genotype
- in a US Caucasian population is 2 x 0.283 x 0.223 = 0.126
- in a US Afro-American population is 2 x 0.280 x 0.323 = 0.181
Second micro-sat locus FGA has been genotyped as (19, 26)
- Allele frequencies in US Caucasians are 19: 0.063 and 26: 0.002
- Allele frequencies in US Afro-Americans are 19: 0.046 and 26: 0.004
So, chance of FGA (19, 26)
- for Caucasians is 2 x 0.063 x 0.002 = 2.5 x 10^-4
- for Afro-Americans is 2 x 0.046 x 0.004 = 3.7 x 10^-4
So chance of D3S1358 (15, 16) and FGA (19, 26) together is
- (0.126) x (2.5 x 10^-4) = 3.2 x 10^-5 for Caucasians
- (0.181) x (3.7 x 10^-4) = 6.7 x 10^-5 for Afro-Americans
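The worked example above applies the product rule: 2pq for each heterozygous locus, then multiply across loci. A minimal sketch of that arithmetic follows, using the US Caucasian frequencies quoted in the notes. It assumes Hardy-Weinberg proportions within each locus and independence between loci; the function names are illustrative, and the p² homozygote case is included only for completeness.

```python
def heterozygote_probability(p, q):
    """Expected frequency of a heterozygous genotype with allele frequencies p and q."""
    return 2 * p * q

def homozygote_probability(p):
    """Expected frequency of a homozygous genotype (not needed for this example)."""
    return p * p

def profile_probability(genotype_probabilities):
    """Product rule: multiply genotype probabilities across independent loci."""
    result = 1.0
    for probability in genotype_probabilities:
        result *= probability
    return result

# US Caucasian allele frequencies quoted in the notes
d3s1358 = heterozygote_probability(0.283, 0.223)  # genotype (15, 16)
fga = heterozygote_probability(0.063, 0.002)      # genotype (19, 26)
combined = profile_probability([d3s1358, fga])
print(f"D3S1358: {d3s1358:.3f}  FGA: {fga:.2e}  combined: {combined:.1e}")
```

Each additional locus multiplies in another small number, which is why multi-locus profiles become so rare so quickly.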
Haplotype Mapping
If the LD island structure is defined across the entire genome, a set of markers could establish haplotypes at each island, then be tested for association with ANY disease Compare the haplotypes of individuals WITH a disease to the haplotypes of individuals WITHOUT the disease (the controls). If a particular haplotype occurs more frequently in affected individuals, a gene influencing the disease may be located within that haplotype
1.1. Linkage Disequilibrium (LD) cont.
In theory, we would expect a smooth gradient of LD until a maximum is reached - this is how the CFTR (cystic fibrosis) gene was found However smooth gradients of LD are the exception, not the norm Systematic studies of marker-marker LD have shown that chromosomes contain a series of islands of relatively long-range LD sharply separated from each other. Within islands: LD may extend 50 kb (All alleles in that block are in LD) Between islands: markers may show no LD even though they may be close Island boundaries are recombination hotspots Association studies look for marker alleles in LD with a DSL Traditionally used microsatellite markers However often insufficient coverage of the genome- If there is no microsat locus in LD with the DSL, no association will be detected Ideally, want markers for every LD island in the genome
Forensic DNA phenotyping (FDP)
Inferring information directly from a DNA sample on:
1) biogeographic ancestry
2) externally visible characteristics (EVCs)
- To be applied to unknown samples for intelligence work, similar to how eyewitness statements are used today
- Value of DNA-derived EVC and ancestry information can be statistically supported, whereas eyewitness statements have serious error rates
- Ancestry detection is mainly at the level of large geographic regions, e.g. continents
- Some subregional differentiation may also be possible
Predictions of Ethnicity
Information of probable ethnicity of unknown offender could help investigators Some STRs are likely to have drastic allele frequency differences between various population groups By comparing alleles present in the evidentiary profile with allele frequencies found in various population data sets, can calculate a likelihood ratio of competing hypotheses (i.e. individual is from ethnicity A vs ethnicity B) Assumptions: - population data sets are representative of individuals coming from a particular ethnic background - 'self-declared ethnicity' is accurate Problems - relatively high rate of mutation with STR loci (IBS v IBD) SNPs are more appropriate for this type of analysis: - lower mutation rate - higher likelihood that a particular allele becomes fixed
1.3.1. Prediction # 1 1.3.2. Prediction # 2
Prediction #1 1) the rate of a disorder will be increased among relatives of affected individuals, 2) this increase will be in proportion to their degree of relationship Example: Schizophrenia - looking at the relatives of the affected person Prediction #2 Incidence will be higher in relatives of more severely affected individuals - expectation that such individuals will carry more of the liability alleles E.g. cleft lip (CL) and cleft palate (CP)
Potential CF treatments
Inhaled Antibiotics
Inhaled antibiotics (Tobramycin) are administered whenever a patient has an exacerbation
- Reduces bacterial load in lungs
- Slows lung damage
- Cannot recover tissue scarring and damage
= Rate of bacterial reduction decreases with repeated treatment because bacteria adapt to the antibiotics (antibiotic resistance)
Inhaled Hypertonic Saline
CFTR mutations cause loss of Cl- transport and subsequent hyperabsorption of Na+, dehydrating the lung mucus
- Inhaling hypertonic saline acts to ionically 'rehydrate' the mucus
- Induces coughing and helps clear mucus out of lungs
All of these treatments are preventative
Treatment of CF has been directed towards controlling pulmonary infection and prophylactic treatments to reduce mucus buildup. These treatments are not 'cures'! They do not address the underlying CFTR mutation, and instead treat or prevent its subsequent symptoms (i.e., mucus buildup, bacterial infection, inflammation). Even so, they have greatly improved the quality of life for most patients and have increased the average life span of CF patients to over 30 years.
Methods of Prenatal Diagnosis and Screening
Invasive Testing
Only carried out if there is a reason to suspect the child would be affected, because taking a sample carries a 1-2% risk of inducing a miscarriage
- Amniocentesis
- Chorionic villus sampling
- Pre-implantation genetic diagnosis
Non-invasive Testing
- Maternal serum alpha-fetoprotein
- 1st / 2nd trimester maternal serum screen
- Ultrasonography
- Isolation of foetal cells from maternal circulation
Gene tracking in Duchenne Muscular Dystrophy
Issues in DMD: Huge gene (2.4 Mb, 79 exons) 2/3 mutations are deletions (98% can be detected) But detection in carriers is difficult 30-35% mutations are point mutations 5% intragenic recombination (need flanking markers) High new mutation rate: - mother of an isolated DMD boy has 2/3 chance of being a carrier
1.1. Linkage Disequilibrium (LD)
LD: Non-random association of alleles at 2 (or more) loci Different from linkage which describes associations between loci due to limited recombination. But LD can be caused by linkage. Alleles at linked loci that are in LD are found together in a haplotype ALL alleles in LD with an allele at a disease susceptibility locus (DSL) will have an association (+ve or -ve) with the disease Within a family, affected individuals are expected to share alleles from loci closely linked to the disease gene (IBD) = Markers for the disease allele are coinherited and can be used as a proxy when the disease allele can't be used However If two 'unrelated' people inherited their disease from a distant common ancestor (IBD), they will also tend to share ancestral alleles at loci very closely linked to the disease gene - alleles in LD Repeated recombination will reduce the shared chromosomal segment to a very small region
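"Non-random association" can be made concrete with the standard LD coefficient D = p(AB) - p(A)p(B) and its normalised form r². The sketch below is illustrative only: the haplotype frequencies are made-up toy values and the function name is not from the lecture.

```python
def ld_stats(haplotype_freqs):
    """Return (D, r_squared) for two biallelic loci.

    haplotype_freqs maps the four haplotypes 'AB', 'Ab', 'aB', 'ab'
    to their population frequencies (which should sum to 1).
    """
    p_a = haplotype_freqs["AB"] + haplotype_freqs["Ab"]  # frequency of allele A
    p_b = haplotype_freqs["AB"] + haplotype_freqs["aB"]  # frequency of allele B
    d = haplotype_freqs["AB"] - p_a * p_b                # deviation from independence
    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denominator if denominator > 0 else 0.0
    return d, r_squared

# Toy values: alleles always travel together (complete LD)
complete = ld_stats({"AB": 0.5, "Ab": 0.0, "aB": 0.0, "ab": 0.5})
# Toy values: haplotypes occur at the product of allele frequencies (no LD)
equilibrium = ld_stats({"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25})
print(complete, equilibrium)
```

Recombination between the two loci moves D towards zero each generation, which is the sense in which repeated recombination shrinks the shared ancestral segment.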
1. Loss of function mutations
Loss of function is much more common, and generally recessive (although sometimes get haploinsufficiency). - A genetic method to determine if a mutation is loss of function is to compare it with a deletion mutation. i.e. if a clinical phenotype results from loss of function, any change that inactivates the gene should give the same phenotype.
Potential Linkage of STRs to Disease Genes
Many STRs were originally used as markers close to potential disease genes
The STR markers currently used for profiling are not in a gene coding region or linked to disease
- We don't want to know about the disease history of an individual
STRs in disease gene mapping / tracking: e.g. Trisomy-21 is detected by the presence of 3 alleles at any Chr 21 marker
What is CRISPR?
Mechanism of adaptive immunity in bacteria and archaea that allows them to defend against foreign genetic material (i.e., phages) Several different types of CRISPR pathways in bacteria and archaea that mediate different molecular effects.
DNA hyper-methylation in cancer
Methods for establishing promoter methylation as a major mechanism of tumour suppressor gene inactivation:
- Large-scale sodium bisulfite treatment studies
- Rapid screening of many tumour samples for promoter methylation of candidate genes
The incidence of hyper-methylation varies with respect to the gene and tumour type involved.
Methylation events occur early in tumorigenesis
- ideal targets for early detection
- Really early detection of tumorigenesis
Using Methylation for better prognosis
E.g. glutathione S-transferase gene (GSTP1) is hypermethylated in 80-90% of patients with prostate cancer but not in benign hyperplastic prostate tissue
Methylase recognises "hemi-methylated" DNA
Methylase comes in
- Sees the old strand is methylated and uses that as a template to methylate the new one
- Same methylation pattern on both strands
Hemi-methylated: only the cytosine on the old (parental) strand is methylated and the new strand is not
NB.
1. Once one C is methylated, methylation is maintained in subsequent DNA replications in both strands by maintenance methylation
2. DNA methylation is reversible
3. Methylation is an epigenetic process, i.e. does not change DNA sequence (compared to a mutation) = Remains a cytosine but it is just modified
4. Parental pattern is not necessarily transmitted through meiosis/gametes to the next generation
DNA profiling using micro-satellites
Mini-satellites are tandem repeats of ~10-100 bp = The repeat region is bigger than micro-satellites, otherwise basically the same Micro-satellites are shorter tandem repeat sequence: ~2-10 bp - also known as Short Tandem Repeats (STR) - or Simple Sequence Repeats (SSR) Number of repeats at each locus / allele is highly polymorphic - Characteristically ranges from 5 to 50 repeats Micro-satellites can be easily detected by PCR 2 primers in the unique (single-copy) DNA sequence flanking the repeats Size of PCR products will depend on number of tandem repeats present Can label PCR products with fluorescent dye, and read out PCR product sizes from an electrophoretic gel or capillary very accurately Can detect even 1 repeat difference in products several hundred bp long DNA profiling using micro-satellites has three advantages: 1. PCR procedure requires <1 ng DNA, very small amount of tissue (buccal scrape, saliva, hair root, skin flake) 2. Easier to define a 'match' using fluorescent detection of peaks 3. Possible to calculate the probability of obtaining an exact match, as one locus is assessed at a time (but need to know the frequency of alleles) Microsatellite markers are inherited co-dominantly
Linkage analysis to identify complex disease susceptibility loci (DSL)
Model-free (nonparametric) linkage analysis does not assume any particular mode of inheritance, or the number of loci contributing to a trait Looks for alleles / chromosome segments shared by affected individuals 1) within families, or 2) in whole populations e.g. Affected sib pair (ASP) analysis = family analysis If both sibs are affected by a genetic disease, they are likely to share the chromosomal segments / loci that carry the disease alleles
1.1.3. Twin Studies
Monozygotic (MZ) twins: Cleavage of single fertilized embryo - identical genotypes at every locus Dizygotic (DZ) twins: Simultaneous fertilization of 2 eggs by 2 sperm - share on average 50% of alleles at all loci Concordance in MZ twins is a powerful indicator of genetic contribution e.g. sickle cell disease has 100% concordance = single gene e.g. type 1 diabetes has ~ 40% concordance = multifactorial <100% Concordance in MZ twins is strong evidence of non-genetic factors Environmental: e.g. infection, diet, Other: e.g. somatic mutation, X inactivation (in females) All genetic: concordance = 100% All environmental: concordance = 0% (if low rate of occurrence) - Concordance in MZ twins can give an approximate estimate of Heritability
Dynamic mutations & the trinucleotide repeat disorders = THREE CLASSES
More info lect 4
Diseases of unstable repeat expansions can be divided into three classes:
1. Diseases due to the expansion of non-coding repeats that cause a loss of protein function by impairing transcription. E.g. Fragile X Syndrome, Friedreich Ataxia
2. Diseases due to expansions of non-coding repeats that confer novel properties on the RNA. E.g. Myotonic Dystrophy 1 and 2
3. Diseases due to repeat expansions of a codon that confer novel properties on the affected protein. E.g. Huntington Disease, the spinocerebellar ataxias.
Single Nucleotide Polymorphisms (SNPs)
Most abundant form of DNA variation Individual SNPs are less informative than other markers (micro, mini satellites) - But are more abundant SNPs are not spread evenly, but clustered (greater susceptibility to mutation?) A cluster of SNPs inherited as a unit can be called a haplotype Tag SNPs can be used to represent a particular haplotype, greatly reducing the number of markers needed to cover the entire genome
Sex Tests
Most females are 46, XX and most males 46, XY
Could amplify a sequence specific to the Y chromosome, but if there is no band: either female OR the PCR reaction did not work
Need a sequence that is shared by the X and Y chromosomes but differs between them.
Only a few genes are shared: in the pseudo-autosomal region (X-Y pairing region) and elsewhere (e.g. amelogenin gene)
Amelogenin gene
- The two AMEL genes and proteins are almost the same
- But the first intron is 6 bp shorter in AMELX than in AMELY (not polymorphic)
- Widely used as an indicator of gender of the test individual
Sex tests: no single test is perfect
The test does not always determine whether the individual is male or female
Sex is determined by the action of the SRY gene on the Y chromosome
In very rare cases, SRY is found on the X
- Consequence is 46, XX males, and 46, XY females (these would not be predicted correctly by this test)
- In normal males, AMELX and AMELY have the same peak height
- In 47, XXY males (Klinefelter syndrome), X peak is twice the height of the Y peak
- In 47, XYY males (double Y), X peak is half the height of the Y peak
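The peak-height rules above amount to reading the X:Y dosage ratio. A rough sketch of that logic follows; the 25% tolerance and the function name are hypothetical choices for illustration, not validated forensic thresholds.

```python
def interpret_amelogenin(x_peak, y_peak, tolerance=0.25):
    """Suggest a sex-chromosome complement from AMELX and AMELY peak heights.

    tolerance is a hypothetical relative margin around each expected X:Y ratio.
    """
    if y_peak == 0:
        # No AMELY product: female, or the Y amplicon failed to amplify
        return "X only (consistent with 46,XX)"
    ratio = x_peak / y_peak
    expected_ratios = {"46,XY": 1.0, "47,XXY": 2.0, "47,XYY": 0.5}
    for karyotype, expected in expected_ratios.items():
        if abs(ratio - expected) <= tolerance * expected:
            return karyotype
    return "inconclusive"

print(interpret_amelogenin(1000, 1000))  # equal peaks
print(interpret_amelogenin(2000, 1000))  # X peak twice the Y peak
print(interpret_amelogenin(500, 1000))   # X peak half the Y peak
```

As the notes point out, this only reports X and Y dosage: 46,XX males and 46,XY females would still be misclassified.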
2. Gain of function mutations
Much rarer: it is easy for a random change to stop a gene working, hard to give it a novel function.
- Thus very rare in inherited disease (although common in cancer).
Mechanisms (there are other possibilities as well):
1. Can occur when a chromosomal rearrangement joins functional exons of two different genes (exon shuffling) to give a novel gene
= Entirely new protein with a combination of those properties: exons from one gene on one chromosome and exons from another gene come together
Example of a novel function: Pittsburgh variant of α1-antitrypsin
2. Overexpression: can be via large level of gene duplication, or transposition to a different chromatin environment.
= Too much of a gene product, or the gene being expressed somewhere it shouldn't be
Whole gene duplication (3-4 copies of a gene instead of 2 because of extra copies along the chromosome)
Example: Charcot-Marie-Tooth syndrome. Duplication in the chromosome via unequal crossing over; over-expression of myelin protein (lect 4)
3. Constitutively active signalling proteins, e.g. G protein coupled receptors. Receptor signals whether ligand is present or not.
Example: Achondroplasia, caused by one specific mutation in FGFR3 (G380R) (mutational homogeneity); other mutations in the gene give different clinical phenotypes.
Tumourigenesis
Multi-step process - Defects in various cancer genes accumulate Huge complexity in altered gene functions: - Activation of growth-promoting genes (oncogenes) - Silencing of tumour suppressor genes Much is known about inherited cancer syndromes Little is known about the genes responsible for sporadic cancers
Mechanisms of Imprinting?
Must:
1) occur before fertilization
2) confer transcriptional silencing = Genes being turned off
3) be stably transmitted through mitotic divisions
4) be reversible
Therefore possibly caused by methylation.
Evidence?
1) Methylation occurs in promoter regions of many inactive genes
2) Different methylation patterns in sperm and ova
3) Known imprinted genes have a hypermethylated imprinted allele.
BUT: not all active / inactive imprinted alleles have a clear methylation pattern. Therefore the mechanism of imprinting is still not completely clear
What do we know about causative mutation we can use to filter variants? = Dominant disease
Non synonymous Not present in healthy individuals De novo
Two major repair pathways for dsDNA breaks
Non-homologous end-joining (NHEJ)
NHEJ is a pathway that repairs double-stranded breaks in DNA in a 'non-homologous' way, as the break ends are directly ligated together without the need for a homologous template.
= Stitch together two blunt ends of DNA
- Does not need a template
- Repair function is imprecise and error prone
- Creates indels (insertion or deletion in the DNA)
- Occurs with high frequency; it is the primary dsDNA break repair mechanism in mammalian cells.
= Unprotected ends of DNA are very prone to degradation by other enzymes in the cell, so by the time they are stitched together there are usually bases missing and you end up with deletions
= These often cause frameshift mutations, which can very often inactivate that gene
Homology Directed Repair (HDR)
HDR is another pathway in cells to repair dsDNA breaks, most commonly through homologous recombination. The HDR mechanism can only be used when there is a homologous piece of DNA in the nucleus.
- Repair function is precise
- Requires a homologous template in the nucleus
- Inefficient, but more accurate than NHEJ. Used to insert genes.
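Why NHEJ deletions so often inactivate a gene comes down to the reading frame: a deletion whose length is not a multiple of 3 shifts every downstream codon. The sketch below illustrates this on a made-up 18 bp coding sequence; the sequence and helper names are hypothetical.

```python
def apply_deletion(sequence, start, length):
    """Remove `length` bases from `sequence` beginning at 0-based index `start`."""
    return sequence[:start] + sequence[start + length:]

def codons(sequence):
    """Split a sequence into complete codons, ignoring any trailing partial codon."""
    usable = len(sequence) - len(sequence) % 3
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return abs(indel_length) % 3 != 0

# Made-up coding sequence: ATG GCC AAG TTT GGC TAA
original = "ATGGCCAAGTTTGGCTAA"
one_base_deletion = apply_deletion(original, 4, 1)    # shifts the frame
three_base_deletion = apply_deletion(original, 3, 3)  # removes one whole codon

print(codons(original))
print(codons(one_base_deletion))
print(codons(three_base_deletion))
```

In the 1 bp case every codon after the deletion changes and the stop codon is lost from the frame, whereas the 3 bp deletion removes one codon and leaves the rest intact.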
Common Disease - Common Variant
Not all cases will have the snp associated with the disease and some controls will have the snp associated with the disease but not have the disease - Higher chance of having the disease if you have the associated snp = No variation originally until a snp is introduced
DNA methylation
Nucleotides in DNA can be covalently modified NOT a universal mechanism e.g. No methylation in S.cerevisiae (yeast), C.elegans (nematode) Methylation of cytosine (C) gives 5-methylcytosine (Cm) Appears to be important in identifying 'active' genes from 'inactive' genes Methylation of cytosine has no effect on base pairing - Cm still pairs with G = Still pairs like cytosine that's unmethylated would [Similar to relationship of thymine (T) to uracil (U) - both pair with adenine (A)]
Nomenclature for describing effects of a mutation on gene function (not all alleles fit)
Null allele or amorph - produces no product = No function remaining at all Hypomorph - produces a reduced amount or activity of product = Might be a missense mutation making it slightly less effective at what it does Hypermorph - produces increased amount or activity of product = RARE Neomorph - produces novel activity or product = Rare, cause protein to do something it wouldn't ordinarily do E.g. an enzyme can react on a substrate it didn't work on before. Similar to original substrate but different Antimorph - activity or product antagonises the activity of the normal product = Dominant negative
Population stratification
Occurs when a population is subdivided, with reduced mating between subpopulations Disease and 'associated' allele happen to be shared by a particular subdivision Example: - A significant association of the HLA-A1 allele with the ability to eat with chopsticks was found in a San Francisco population - Probably because HLA-A1 is more common in Chinese than Caucasians Can be avoided with carefully chosen controls Or, look for association within families - Test Trios of affected proband and two parents Single population could be a problem for our study if there are three independent sub populations within that don't mix with each other If we match our controls to this effect then we can eliminate this problem
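The chopsticks example can be reproduced with toy numbers: an allele and a trait that are unrelated within each subpopulation appear strongly associated once the subpopulations are pooled. All counts below are invented for illustration.

```python
def odds_ratio(allele_trait, allele_no_trait, no_allele_trait, no_allele_no_trait):
    """Odds ratio for a 2x2 table of allele carriage versus trait."""
    return (allele_trait * no_allele_no_trait) / (allele_no_trait * no_allele_trait)

# Invented counts, ordered as:
# (allele+ trait+, allele+ trait-, allele- trait+, allele- trait-)
subpopulation_1 = (72, 8, 18, 2)  # allele common, trait common
subpopulation_2 = (2, 18, 8, 72)  # allele rare, trait rare
pooled = tuple(a + b for a, b in zip(subpopulation_1, subpopulation_2))

or_1 = odds_ratio(*subpopulation_1)
or_2 = odds_ratio(*subpopulation_2)
or_pooled = odds_ratio(*pooled)
print(f"subpopulation 1: {or_1:.2f}, subpopulation 2: {or_2:.2f}, pooled: {or_pooled:.2f}")
```

Within each subpopulation the odds ratio is 1 (no association), but the pooled table gives an odds ratio above 8. Matching controls to cases by subpopulation, or testing within families, removes this artefact.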
1.5. Lysis
Once the phage life cycle is complete and the cell is filled with newly assembled infective phages, T4 phage expresses two proteins that lyse the cell.
Holin: forms tiny pores in the host cell's inner membrane.
Lysin: Diffuses through the holin pore and degrades the peptidoglycan cell wall.
= Needs holin in order to get through the inner membrane and degrade the peptidoglycan cell wall
Holin + Lysin = Lysis
Once these two proteins activate, the cell breaks open by osmotic lysis
Mapping from pedigree data using LOD scores
PRACTISE WORKED EXAMPLES
Based on calculating the chance of getting a sibship assuming the two loci show a recombination fraction of r (range 0 to 0.5; often written θ, theta).
- Divide this by the chance of getting the sibship assuming the two loci are unlinked (r = 0.5).
- This is the odds that the genes are linked at recombination fraction r.
- Take the log10 of this (Z), called the LOD score (log of the odds).
- Repeat for all sibships.
- Then add the individual LOD scores (because they are logs, they can be added).
- The higher the LOD score, the higher the chance that the genes are linked with a recombination fraction of r.
A LOD score of 3 is significant
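The steps above can be sketched in a few lines. This is a simplified illustration that assumes phase is known, so every child can be scored as recombinant or non-recombinant; the sibship counts are made up and the function name is not from the lecture.

```python
import math

def lod_score(n_recombinant, n_nonrecombinant, r):
    """LOD score Z for one phase-known sibship at recombination fraction r."""
    if not 0 <= r <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    n_children = n_recombinant + n_nonrecombinant
    likelihood_linked = (r ** n_recombinant) * ((1 - r) ** n_nonrecombinant)
    likelihood_unlinked = 0.5 ** n_children
    if likelihood_linked == 0:
        # e.g. r = 0 but a recombinant was observed
        return float("-inf")
    return math.log10(likelihood_linked / likelihood_unlinked)

# Made-up sibships scored at r = 0.1: (recombinants, non-recombinants)
sibships = [(1, 9), (0, 8)]
scores = [lod_score(rec, nonrec, 0.1) for rec, nonrec in sibships]
total = sum(scores)  # logs of odds add across independent sibships
print([round(z, 2) for z in scores], round(total, 2))
```

Here neither sibship reaches 3 on its own, but the summed score does, which is the point of adding LOD scores across families. In practice Z is calculated over a range of r values and the maximum is reported.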
Target capture and Sequencing
Panel based approaches for the many hundreds of childhood diseases - Very cost effective — Pre-conception screening — Selected 448 severe recessive childhood diseases — Captured 7717 genomic regions by hybridisation — Next generation sequencing - average 160X coverage = 95% sensitivity (proportion of mutations that can be detected) = 100% specificity (no false positives) — Less than $1 per disease — Combined screening, education, and counselling could dramatically reduce incidence of Mendelian inherited recessive disorders (accounts for 20% of infant deaths)
1.4. Packaging and Assembly
Phage structural proteins are late proteins synthesised using bacterial cell machinery, and they spontaneously self-assemble in the cytoplasm.
a) Phage proheads are assembled but remain empty.
b) Packaging motor is assembled at the opening to the prohead.
c) dsDNA is pumped into the prohead under pressure using ATP as the driving force.
d) The prohead expands when pressurised by the entering DNA.
e) Packaging motor is discarded and the capsid is sealed.
f) Tail and tail fibres self-assemble.
Headful packaging of T4 genome
T4 genome replication produces a single, long concatemer of phage DNA inside the host cell
- T4 packaging motor binds this long concatemer of DNA and starts pumping it into the heads until they are 'full'
- Once a head is 'full' the DNA concatemer is cut and the phage capsid is sealed.
- T4 heads contain enough room to package 103% of the genome, so each capsid has a different 3% of redundant DNA included
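Headful packaging can be pictured as cutting a repeating concatemer into pieces slightly longer than one genome, so each piece starts a little further along than the last. The sketch below uses a toy genome length of 1,000 bp rather than the real T4 genome size, and the function name is illustrative.

```python
def headful_packages(genome_length, n_heads, headful_fraction=1.03):
    """Return (start_position, redundant_bases) for successive headfuls.

    The concatemer is modelled as the genome repeated end to end, and each
    head takes headful_fraction x genome_length bases before the DNA is cut.
    """
    headful = round(genome_length * headful_fraction)
    redundant_bases = headful - genome_length
    packages = []
    for head_index in range(n_heads):
        # Each headful begins where the previous one was cut
        start_position = (head_index * headful) % genome_length
        packages.append((start_position, redundant_bases))
    return packages

# Toy genome of 1,000 bp, so a headful is 1,030 bp
packages = headful_packages(1000, 4)
for start_position, redundant_bases in packages:
    print(f"starts at {start_position}, repeats its first {redundant_bases} bp at the end")
```

Every capsid carries a complete genome, but the start point, and therefore the duplicated 3%, differs from one capsid to the next.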
Use of DNA profiling in paternity tests
Process: STR DNA profiles of child, mother and alleged biological father are obtained. If the child has one or more alleles not present in either the mother or the alleged father, then paternity can be excluded (assuming that a mutation has not occurred). If every allele not found in the mother is found in the alleged father, this is strong evidence that he is the real father (although this cannot be proven). Paternity index (PI): the probability of observing the child's profile assuming the alleged father is the real father, divided by the probability assuming another unrelated member of the population is the real father. Calculation of Paternity Index: If the tested man does not have the alleles that have been inherited from the biological father, then he is excluded. However, mutation could cause a false exclusion, therefore standard practice is to require an exclusion at 3 or more loci before the test is declared negative. If the tested man cannot be excluded, we need to indicate the significance of the non-exclusion using a likelihood ratio, which considers 2 competing, mutually exclusive hypotheses = Hypothesis 1 (Hp): tested man IS the biological father = Hypothesis 2 (Hd): tested man IS NOT the biological father. PI (paternity index) = Pr(Gc | Gm, Gtm, Hp) / Pr(Gc | Gm, Gtm, Hd), i.e. the probability of the child's genotype (Gc) given the mother's genotype (Gm) and the tested man's genotype (Gtm) under each hypothesis
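A rough Python sketch of the PI for the simplest case: one locus where the allele the child must have inherited from the father (the obligate paternal allele) is unambiguous. Under Hp the tested man transmits it with probability 1 (homozygous) or 0.5 (heterozygous); under Hd a random unrelated man transmits it with probability equal to its population frequency. The mother's contribution cancels from the ratio. Function names, genotypes and allele frequencies are hypothetical, and multiplying across loci assumes the loci are independent.

```python
def paternity_index(tested_man, paternal_allele, allele_freq):
    """Single-locus PI when the obligate paternal allele is unambiguous."""
    # Numerator (Hp): chance the tested man passes on the paternal allele
    transmit = tested_man.count(paternal_allele) / 2
    # Denominator (Hd): chance a random unrelated man passes it on
    return transmit / allele_freq

def combined_pi(indices):
    """Combined PI = product of single-locus PIs (assumes independent loci)."""
    result = 1.0
    for pi in indices:
        result *= pi
    return result

# Hypothetical example: three loci
pi_values = [
    paternity_index(("A1", "A3"), "A3", 0.10),  # heterozygous tested man
    paternity_index(("B2", "B2"), "B2", 0.20),  # homozygous tested man
    paternity_index(("C4", "C7"), "C7", 0.05),
]
print(pi_values, combined_pi(pi_values))
```

A PI of 0 at a locus means the tested man does not carry the paternal allele there, i.e. an exclusion at that locus (subject to the mutation caveat above).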
Methylation as an important 'second hit' in FAMILIAL tumours
Promoter methylation is a frequent cause of inactivation of the non-mutated copy of the tumour suppressor gene - Aberrant methylation NOT found on the mutated allele In SPORADIC cancers, hyper-methylation can be either the first or the second hit, or both A number of candidate suppressor genes are silenced by promoter methylation in certain cancers - these genes are predicted to be TSGs based on their function - but are not seen to be frequently mutated in cancers Methylation can be: 1) the second hit in familial cancers 2) the first, second, or both hits in sporadic cancers
Material for Testing / Screening
RNA: Very useful. Introns have been removed, can detect aberrant splicing But difficult to work with - Not stable, harder to work with than DNA Protein: Very useful. But specific to particular disease. - Assay is very specific to that disease DNA: Technology is generic - Approach and technique is identical for all diseases Sample sources: o Blood - most common DNA source from adults o Buccal cells - non-invasive, less DNA than blood samples o Chorionic Villus - fetal DNA o 8-cell stage embryo - pre implantation diagnosis o Hair, semen - common in forensic investigations o Pathological specimens - limited to short sequences o Guthrie cards - newborns
Preimplantation Genetic Diagnosis (PGD)
Requires In vitro fertilization (IVF) - Hyper ovulation is induced in the woman - Oocytes are harvested - Sperm are added in culture or oocytes are fertilized by Intracytoplasmic sperm injection (ICSI) - At 8-cell stage, 1-2 cells are taken for biopsy from several embryos - Embryo continues to develop - 2 healthy embryos are reimplanted in the mother's uterus
S. pneumoniae
S. pneumoniae is a commensal organism - naturally resides in nasal passages. Can become an opportunistic pathogen, especially in the elderly & children. Causes pneumonia, ear infections, bronchitis, sinusitis, blood infections. Infections: - S. pneumoniae can cause a range of infections, but typically presents in the lung before moving into other organs of the body. - Usually infections are secondary, following a viral infection where the immune system is weakened and lung secretions are increased. Young and elderly patients are particularly at risk - Can cause life threatening bacteraemia (bloodstream) and meningitis infections. Data on antibiotic resistance: S. pneumoniae - Rapid emergence of resistance - Strong relationship between amount of antibiotic used and % resistant isolates - Local selection matters!
Staphylococcus aureus: S. aureus (MRSA)
S. aureus is a commensal skin organism - found in the nose, respiratory tract and on the skin. It is a common cause of skin infections (including abscesses), respiratory infections, sinusitis and food poisoning. MRSA typically causes wound infections • The bacteria are normally found on the skin and don't cause infection • Cuts allow the bacteria to get into the body. Major concern in hospitals because it survives a long time on surfaces and easily colonises vulnerable patients (elderly, immune compromised) via intravenous lines or catheters. Data on antibiotic resistance: S. aureus - Steady increase in MRSA infections within the hospital setting - High risk pathogen, easily transmitted, resilient and can persist on surfaces
DNA Hyper-methylation in Cancer
Seen in specific genes / chromosome regions in tumours CpG islands are found in the promoters of 60-70% mammalian genes - normally unmethylated Hyper-methylation in CpG islands is the best-characterised epigenetic change in tumours - found in every type of human tumour - associated with inappropriate gene silencing Hyper-methylation is AT LEAST as common as disruption of classic tumour suppressor genes by mutation
Two imprinted domains on 11p15 also cause disease
Silver-Russell Syndrome (SRS): a congenital disease of growth retardation and asymmetry. Beckwith-Wiedemann Syndrome (BWS): a complex disorder of fetal overgrowth. Both involve epigenetic alterations at imprinting control regions (ICR) regulating H19 and IGF2 activity. Normal: H19 is active on the maternal copy only and IGF2 is active on the paternal copy only (where H19 is not active). BWS: Both copies of IGF2 are expressed - increased expression - No H19 is expressed. SRS: Both H19 copies are expressed
2.2 Single Strand Conformation Polymorphism (SSCP)
Single stranded DNA takes on a particular conformation, with particular physical properties e.g. mobility Single base pair substitutions alter mobility SSCP detected by non-denaturing polyacrylamide gel electrophoresis
1.3. Discontinuous multifactorial traits
Some diseases act as discontinuous multifactorial traits Members of the population differ in their genetic liability to developing a disorder (also influenced by environmental factors) Once the liability has passed a threshold, the trait will develop Among relatives of an affected child, the liability curve is shifted to right. Proportion beyond the threshold is the familial incidence
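The threshold idea can be sketched numerically in Python. This assumes liability is normally distributed (the usual textbook picture of the liability curve); the threshold, the size of the shift for relatives and the function name are all made-up numbers and names for illustration.

```python
from statistics import NormalDist

def proportion_affected(mean_liability, threshold, sd=1.0):
    """Fraction of a group whose liability lies beyond the threshold."""
    return 1 - NormalDist(mean_liability, sd).cdf(threshold)

threshold = 2.0                                     # hypothetical threshold
general = proportion_affected(0.0, threshold)       # general population
relatives = proportion_affected(1.0, threshold)     # curve shifted to the right
print(round(general, 4), round(relatives, 4))
```

With these made-up numbers about 2% of the general population but about 16% of relatives fall beyond the threshold; the second figure plays the role of the familial incidence.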
1.2 Dominant loss of function mutations
Sometimes loss of function mutations are dominant. Two main ways they can be dominant: (a) Haploinsufficiency • 50% of the normal amount of product is not enough in the heterozygote. Seen for relatively few genes. • These genes are dosage sensitive, such genes include: - gene products that compete with each other to determine a developmental or metabolic switch. - gene products that co-operate in interactions with fixed stoichiometry, e.g. subunits of structural proteins. • In each case the product is titrated against something else, so the relative levels of the interacting products are critical. Example - Adult haemoglobin (HbA) has 2 identical α chains and 2 identical β chains. The stoichiometric ratio is critical, so if one gene is mutated you get disease. = Forms a tetramer, so if you lose one type of chain you still get a tetramer, but made of all the same chains, which can't bind oxygen in the same way, so very deleterious (b) Dominant negative • mutant gene product interferes with the normal one in the heterozygote. • usually a more severe effect than a null mutation in the same gene. • structural proteins that build multimeric structures are particularly vulnerable. Example - collagen disorders
CF lung function
Spirometry tests are used to measure CF patients' lung function over time to assess disease progression. - Measured as FEV1 - Forced Expiratory Volume in one second - The volume of air a patient is able to expire within one second - In CF the FEV1 is reduced due to infections, obstruction and loss of functioning lung. Exacerbations reduce FEV1 - Every microbial infection causes more inflammation, which causes more lung damage, which reduces lung function (FEV1) - CF patients gradually lose their lung function - Exacerbations are treated, and lung function is partially restored - The process continues until the patient either dies of respiratory failure or receives a lung transplant
3 characteristics of stem cells
Stem cells have three important characteristics that distinguish them from other types of cells: 1. They are unspecialised cells that have not developed into cells that perform a specific function 2. They are self-renewing - they renew themselves for long periods through cell division 3. They can be induced to differentiate (to become cells with specialised functions) by external signals = Potential to differentiate when triggered by appropriate external triggers
CpG Islands
Stretches of unmethylated DNA 200 to 2000 bp in 5' regulatory regions of housekeeping genes - often extend into first exon and even first intron Function of CpG islands - Islands may be protected from methylation - less likely to mutate to TpG - Islands may surround promoter (regulatory region) = possibly important in retaining gene activity - Majority of CpG islands on the inactivated X chromosome in females are methylated Use of CpG islands in Gene Prediction - CpG islands found in promoter regions of 60-70% of human genes - CpG islands useful when scanning DNA sequences in search of genes = provide candidates for closer scrutiny
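A short Python sketch of how CpG islands can be flagged when scanning a sequence. The thresholds used (GC content above 50%, observed/expected CpG ratio above 0.6, length of at least 200 bp) are commonly quoted criteria and are an assumption here, not something stated in these notes; the function names are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c, g = seq.count("C"), seq.count("G")
    cpg = seq.count("CG")                      # observed CpG dinucleotides
    gc_content = (c + g) / n
    # Expected CpG count if C and G were placed independently: (C * G) / N
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp

def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed thresholds to one candidate window."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_length and gc_content > min_gc and obs_exp > min_obs_exp

print(cpg_stats("CCCCGGGG"))                    # GC-rich but only one CpG
print(looks_like_cpg_island("CG" * 100))        # toy 200 bp CpG-rich window
```

In bulk genomic DNA the observed/expected ratio is well below 1 because methylated CpG tends to mutate to TpG, which is why unmethylated islands stand out.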
Allele-specific Oligonucleotide (ASO) hybridisation
Synthetic probes hybridise only to a perfectly matched sequence. More sensitive than Southern blot - short probes can reliably detect single nucleotide changes. Make a short probe (15-20 bp) that perfectly matches the sequence you're looking for. Design a probe for each of the alleles you want to search for; because each probe only binds to an identical sequence, you can test for many alleles. Can test for the presence of sickle cell heterozygotes or homozygotes
Genetic Screening
Testing for Mendelian characters (not biochemical testing). Carrier testing: Cystic Fibrosis, β-thalassaemia. Testing for late onset disorders: Huntington's disease. Prenatal testing: Chromosome abnormalities, Duchenne Muscular Dystrophy (DMD), Myotonic dystrophy. Newborn (neonatal) testing: Cystic Fibrosis, Phenylketonuria (PKU)
Use of DNA profiles in criminal law
Testing innocence or guilt: If other evidence establishes that an individual is a suspect, the DNA profile of a sample associated with the crime may be relevant. If the suspect's profile does not match the evidence, he/she is excluded. If the suspect's profile does match the evidence, he/she is not excluded - But this does not prove that he/she is guilty. Need to inform the court of the probability that the match could have occurred by chance i.e. "what is the probability that the evidential DNA profile could have come from someone other than the suspect (and unrelated to him/her)?" This chance can be calculated from population allele frequencies (see earlier), often << 1 in 10^12. NB: DNA evidence does not apply in isolation. Other exonerating evidence may exist proving the suspect cannot be guilty. Also, even with DNA evidence, the court may not accept it
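A minimal Python sketch of how a chance-match probability can be worked out. It assumes Hardy-Weinberg proportions at each locus (homozygote p^2, heterozygote 2pq) and independence between loci so the per-locus genotype frequencies can be multiplied; these assumptions, the function names and the allele frequencies are illustrative and not taken from this card.

```python
def genotype_frequency(p, q=None):
    """Expected genotype frequency: p*p for a homozygote, 2*p*q for a heterozygote."""
    return p * p if q is None else 2 * p * q

def random_match_probability(profile):
    """Multiply genotype frequencies across loci (assumes independent loci)."""
    prob = 1.0
    for allele_freqs in profile:
        prob *= genotype_frequency(*allele_freqs)
    return prob

# Hypothetical 3-locus profile: allele frequencies of the alleles seen at each locus
profile = [(0.10, 0.20), (0.05,), (0.15, 0.10)]
print(random_match_probability(profile))
```

With only three loci the chance of a match is already about 3 in a million; real STR profiles use many more loci, which is how figures far below 1 in 10^12 arise.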
International HapMap Project (II)
The fossil record / genetic evidence indicate that all humans today are descended from modern ancestors in Africa ~ 150,000 years ago. We are a relatively young species Most current human variation comes from variation present in the ancestral human population. As humans migrated out of Africa, they took only part of the genetic variation present (bottleneck) - Haplotypes outside Africa tend to be subsets of those inside Africa. - Haplotypes in non-African populations tend to be longer - less time for recombination to break up haplotypes The frequency of haplotypes is different in different populations through random chance, natural selection etc. Mutations have created new haplotypes, and most of the recently arising haplotypes have not had enough time to spread widely
CRISPR Genome Engineering
The fundamental principles of CRISPR can be utilised to cut, insert and repair a genomic sequence, allowing us to rapidly, easily and cheaply engineer DNA. - Previous protein-based genome engineering techniques were complex and expensive. - CRISPR engineering is easy! Can do it in the lab in a couple of days.
Types of stem cells
Totipotent Capable of generating a complete organism, including extra-embryonic tissue such as amniotic sac and placenta = Totipotency: most potent form of stem cells. Pluripotent Able to give rise to all of the cell types of the body Multipotent A cell able to develop into more than one type of cell Unipotent Able to differentiate into an individual cell type
1.2.1. Familial aggregation of quantitative traits
The heritability (H2) of a trait is the proportion of its total phenotypic variance in a population that is under genetic control (aka 'broad' heritability): H2 = VG/VP. Range of heritability is 0 to 1 (0% to 100%). The higher the heritability, the greater the contribution of genetic differences to trait variability. How can we estimate the heritability of a continuous human variable? We can look at correlations between relatives: - Family members share a certain proportion of their genes in common - This will influence the level of correlation between them for a polygenic (multifactorial) trait - We expect stronger correlations between closer relatives. Usually we measure correlations between 1st degree relatives: offspring and parents, or sibs (brothers and sisters), share 50% of their genes. Correlation coefficient (r) for a fully genetic variable = 0.5. Correlation coefficient (r) for a fully environmental variable = 0.0. A trait with H2 of 100% (or 1) will show a parent/offspring, or a sib/sib, correlation coefficient of r = 0.5. Example: high blood pressure (5% of population affected): between parent & offspring r = 0.12-0.34, between sibs r = 0.12-0.37. Heritability estimated to be up to ~70%. Correlation between MZ twins for a variable controlled solely by genes is expected to be complete (correlation coefficient r = 1.0). Environmental influences will reduce this correlation coefficient. Example: Blood pressure MZ twins r = 0.55-0.72 [DZ twins r = 0.25-0.27]. Correlation coefficient r ranges from 0 to 1 for MZ twins, and 0 to 0.5 for 1st degree relatives...
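A crude Python sketch of how the blood pressure figures above lead to "up to ~70%": divide the observed correlation by the correlation expected for a fully genetic trait (0.5 for 1st degree relatives, 1.0 for MZ twins). This simple scaling ignores shared family environment, so it is only a rough upper guide; the function name is hypothetical.

```python
def heritability_from_correlation(r_observed, r_fully_genetic=0.5):
    """Rough H2 estimate: observed correlation / correlation if fully genetic."""
    return r_observed / r_fully_genetic

# Upper ends of the blood pressure ranges quoted in the notes
print(heritability_from_correlation(0.34))        # parent/offspring -> 0.68
print(heritability_from_correlation(0.37))        # sib/sib          -> 0.74
print(heritability_from_correlation(0.72, 1.0))   # MZ twins         -> 0.72
```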
Antibiotic Resistances: Bypass
The last line of defence can be avoiding the toxic effect of the antibiotic by circumventing the need for the chemical reaction in which the target is involved (Bypass ⊣ Toxicity). Metabolic shunt: Changing the chemical composition and functionality of the cell to produce a new enzyme or process that makes the previous target redundant. = Evolve an entirely new process to get around the process the antibiotic is targeting
CF Lung Disease
The loss of CFTR function triggers the so-called CF pathogenesis cascade that characterises CF lung disease: 1. Lack of CFTR - the major epithelial ion regulator - leads to reduced epithelial chloride secretion 2. Loss of Cl- secretion disrupts the transport of other ions (e.g. hyperabsorption of Na+) 3. Ionic dysfunction in turn leads to a reduction in the water content of airway surface liquid and thick mucus 4. Cycle of lung destruction initiates - blocked airways, bacterial infection, inflammation and lung damage = Exacerbation 5. Cascade leads to end-stage lung disease
Complex diseases - susceptibility genes
The majority of medically important diseases are complex and multifactorial Identifying genetic susceptibility factors is important for early diagnosis and designing possible treatments Research on complex diseases: - Family / twin / adoption studies to test whether there is a significant genetic component to the disease - Linkage analysis (e.g. affected sib pair analysis) to map susceptibility loci - Population and family-based association studies to map susceptibility loci - Identify DNA variants conferring susceptibility - Define the biochemical action of the susceptibility alleles
1.1.2. Relative contributions of genes and environment
The more closely related two individuals are, the more alleles they share e.g. average number of alleles one sibling is expected to share with another at any one locus = 0.25 (2 alleles) + 0.5 (1 allele) + 0.25 (0 alleles) = 1 allele --> 25% chance of sharing 2 alleles, 50% chance of sharing one allele and 25% chance of sharing no alleles. Relationship to proband: proportion of alleles in common with proband - Monozygotic twins: 100% - First degree relative: 50% - Second degree relative: 25% - Third degree relative: 12.5%
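The 0.25 / 0.5 / 0.25 figures for siblings can be checked by brute force. This Python sketch labels the parents' four alleles A, B (one parent) and C, D (the other) and lists every equally likely pair of sibling genotypes; the labels and function name are just for illustration.

```python
from collections import Counter
from itertools import product

def sib_sharing_distribution():
    """Probability that two sibs share 0, 1 or 2 parental alleles at a locus."""
    # Each child gets one allele from each parent: 4 equally likely genotypes
    genotypes = list(product("AB", "CD"))
    counts = Counter()
    for sib1, sib2 in product(genotypes, repeat=2):
        shared = sum(a == b for a, b in zip(sib1, sib2))
        counts[shared] += 1
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}

dist = sib_sharing_distribution()
expected_shared = sum(k * p for k, p in dist.items())
print(dist, expected_shared, expected_shared / 2)
```

The expected number shared is 1 allele out of 2, which is the 50% quoted for first degree relatives.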
∆F508 mutation
The most common CF mutation is a deletion of a phenylalanine residue at position 508 (∆F508) in the first ATP-binding fold (NBD1) - Class 2 CF mutation; misfolded protein that cannot exit the endoplasmic reticulum (ER) - Approx. 50% of CF patients are homozygous for ∆F508 - Additional 40% are compound heterozygotes for ∆F508 and another mutant allele - Approx. 70% of CF carriers have a single ∆F508 mutation. = The ∆F508 mutation is a class II CFTR defect, which means the mutation impairs the folding of the CFTR protein, so it cannot mature and exit the endoplasmic reticulum to reach the cell surface.
Antibiotic Resistances: Drug Modification
The next line of defence prevents drug accumulation by chemically targeting the drug (Drug modification ⊣ Accumulation). Substitution reaction: Designated enzymes modify drug molecules or hydrolyse them to inactivate or reduce their effects. = Most drugs bind their target with a lock-and-key fit, so if the bacterium adds a hydroxyl group etc. onto the drug and changes its shape then it can't attach to its target in the cell. Degradation: These reactions can either occur within the cell, or pre-emptively outside the cell if the enzymes are secreted (e.g. β-lactamases).
DNA fingerprinting using DNA 'Mini-satellites' or 'Variable Number of Tandem Repeats' (VNTR) loci
The number of repeats at each locus varies between different individuals - highly polymorphic - such loci are called mini-satellite loci. Most occur in non-transcribed DNA and have no known function. Repeat sequence is detected using the Southern blot procedure. There are a number of other repeats, not just Jeffreys' 4 - Very useful because of the variability between individuals; if you isolate a particular locus you can work out allele frequencies. Mini-satellite fingerprinting uses a cloned and labelled repeat as a probe to hybridise to a restriction digest of genomic DNA: - Multiple bands of a wide range of sizes and intensities present - Cannot assign individual bands to individual loci - Identical twins have identical DNA fingerprints. Examples of DNA fingerprints used to: (A) test paternity (B) compare a specimen with rape suspects. Mini-satellite alleles are inherited in a co-dominant manner: can follow them through a pedigree
Antibiotic Resistances: Entry
The outermost line of defence is preventing the drug from entering the cell at all (Spatial exclusion ⊣ Entry). Permeability: A change in the chemical composition or thickness of the bacterial cell envelope can impede the diffusion of antibiotics into the cell. = Changes to the bacterial cell wall change the diffusion of the antibiotics into the cell. Efflux pump: Cell membranes often contain drug-dedicated or general pumps that actively pump antibiotics out of the cell
Phage Life Cycles
There are two main life cycles for phages. Lytic life cycle: Phage replicate using bacterial cell machinery, lyse and kill the host, releasing newly infective phages. Example: T4 phage. Lysogenic life cycle: Phage infects a host cell and then integrates its genome into the host genome. Phage replicates in synchrony with the bacterial host at cell division. Example: lambda phage = Sits dormant in the bacterial cell until induced, then undergoes the lytic life cycle
Transduction - P22 phage
Transduction is the transfer of DNA from a donor to a recipient bacterium by means of a phage vector. P22 phage is a naturally transducing phage. - DNA derived from the host bacterial genome is accidentally packaged inside the phage capsid in place of the phage genome. - Allows for the transfer of bacterial genes from one host to another at low frequency - P22 phage infects a new host by the lytic cycle. - Degrades host bacterial DNA to replicate and assemble new P22 phages. - Occasionally, P22 phage accidentally packages bacterial host DNA, instead of phage DNA - called a "transducing phage". - The transducing phage is an infective phage particle - but it does not carry any phage DNA. - Upon infecting a new bacterial host, it injects donor bacterial DNA - This DNA can recombine with the recipient bacterium and transfer new genes and functions.
Sequence Family Members
Trio sequencing - proband + parents - Affected individual and both parents = trio = Especially good for dominant disorders. Affected/unaffected siblings - more family members and affected individuals can help filter variants - Siblings who are also affected also have the same disease-causing alleles, so look for variants that are the same between affected individuals
1.1. Two main methods of somatic cell gene therapy
Two main methods: • ex vivo, treat cells from the affected individual in culture, then reintroduce - limited to disorders where the relevant cell population can be removed from the affected individual, modified genetically, and then replaced, e.g. haematopoietic system cells, skin cells = Not useful for cells that can't be accessed such as cells of the nervous system etc. • in vivo, deliver DNA directly to cells in the patient - may be placed directly into target tissue, or into circulation but designed so it is taken up only by the desired cell type - can theoretically be used for many hereditary disorders = Can be used for a much wider range of disorders, but much more challenging
Measure the strength of association: odds ratio
Unassociated alleles will be found at similar frequencies in patients and controls. = The odds ratio measures how much more prevalent an allele is in affected vs non-affected individuals
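A small Python sketch of the odds ratio calculation from a 2 x 2 table of allele counts (allele present or absent, in cases and in controls). The counts and the function name are made up for illustration.

```python
def odds_ratio(case_with, case_without, control_with, control_without):
    """Odds of carrying the allele in cases divided by the odds in controls."""
    odds_cases = case_with / case_without
    odds_controls = control_with / control_without
    return odds_cases / odds_controls

# Hypothetical counts: allele seen in 60 of 100 cases and 30 of 100 controls
print(odds_ratio(60, 40, 30, 70))   # 3.5 -> allele is associated with disease
print(odds_ratio(30, 70, 30, 70))   # 1.0 -> same frequency, no association
```

An odds ratio near 1 is what is expected for an unassociated allele; values well above 1 suggest the allele (or something linked to it) raises risk, and values below 1 suggest it is protective.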
1.1. Qualitative disease traits
Unlike single gene disorders, multifactorial diseases do not follow a Mendelian inheritance pattern. But familial clustering does occur (many genes contribute so there is no clear pattern, but related individuals will share alleles. Family members also share an environment.) - relatives of a proband (affected individual) share a greater proportion of: 1) their genetic information ....and... 2) environmental exposures .. than individuals chosen at random. Familial aggregation DOES NOT necessarily mean a disease has a genetic contribution. Concordance: When two related individuals have the same disease. Discordance: When only one member of a pair of relatives is affected
STR (geno)typing
Usually uses standardized allelic ladders with most common alleles As more samples are run, new alleles are constantly being discovered - variants with more or fewer of the core repeat unit OR - variants with partial repeats or in/dels in flanking region Occasionally a variant allele occurs with a size between two loci in a multiplex STR electropherogram - difficult to assign allele to the appropriate locus - called an interlocus allele - Additional characterization required e.g. single locus amplification Multiplexing - using different dyes for the different sets of primers - one PCR reaction for each sample - read off final genotype at all loci from one electrophoretic separation An individual genotype is determined by reference to standard alleles
1.3. Case Study - Cholera
V. cholerae requires acquisition of two main virulence factors to cause disease: 3. Lysogenic conversion • Once inside the cell, the CTX phage integrates into the chromosome and becomes a prophage. - CTX causes V. cholerae to express the cholera toxin - lysogenic conversion - In cholera, all 3 of these steps must occur to cause diarrhoeal disease
1.1. Case Study - TCP island
V. cholerae requires acquisition of two main virulence factors to cause disease: 1. TCP (toxin co-regulated pilus) island - is transmitted through horizontal gene transfer (HGT) such as conjugation or transformation. - The TCP island encodes genes for a pilus that is necessary for adherence and colonization of the small intestine. - The TCP pilus also serves as the cell receptor for CTX phage.
1.2. Case Study - CTX phage
V. cholerae requires acquisition of two main virulence factors to cause disease: 2. CTX phage - is a filamentous, ssDNA, temperate phage that encodes an exotoxin (enterotoxin) called cholera toxin (ctx) - CTX phage recognizes the TCP pilus on the surface of the bacterium and uses it to attach and enter the cell - TCP pilus must be present on the surface of V. cholerae before CTX phage can infect.
How does CRISPR work? - Phase 1
When a bacteriophage attaches to a host cell it injects its DNA into the cell, and the CRISPR complex inserts a piece of this DNA as a spacer in the CRISPR array. - The Cas complex recognises foreign viral DNA. It preferentially binds to the first DNA sequence injected into the cell - for lambda phage this is the cos site, which is the ssDNA end of the viral genome. - Cas complex cleaves the viral DNA and inserts a short sequence (~30 bp) into the CRISPR locus, where it becomes a spacer. - The insertion is sequential (most recent infection at the front) and confers a genetic memory of viral attack. = How a naïve cell first protects itself from attack = Race to see if CRISPR takes over or viral DNA takes over = Builds up a memory of viral attacks, with the first spacer the most recent viral attack, all the way to the oldest at the end
How does CRISPR work? - Phase 2a
When an immunized cell encounters the same virus at a later date, the mature CRISPR-Cas9 complex quickly destroys the incoming DNA in an RNA-dependent process. - CRISPR region is under the control of its own promoter. The resulting transcript is called a pre-CRISPR RNA (pre-crRNA) and contains sequences complementary to both spacers and repeats. = This will transcribe the entire array, which can be quite long depending on the number of viruses the cell is immune to - Cas9 is transcribed and translated into protein - TracrRNA sequence is transcribed
Uses of linkage analysis
Why map genes? 1. Useful in identifying molecular nature of a gene previously known only from its phenotype. - Associate gene with a short region of the genome, and assess candidate genes in this region (Lecture 3) 2. Useful in following inheritance of closely linked genes in families for genetic counselling purposes (gene tracking).
Rare variant model
The tag SNPs will not associate more commonly with cases than with controls, because the high-effect allele is present on both haplotypes
1.3. Genome Replication
Within one minute of injection, the T4 phage genome halts the synthesis of all host cell DNA and RNA, and the expression of phage genes begins. - T4 phage encodes its own DNA polymerase. - T4 phage genome replicates as a linear chromosome = Unusual feature of T4 phages (replication is usually done on circular molecules). T4 phage replicates its DNA using both replication and recombination: a) Linear T4 DNA is injected into the host cell. b) DNA replication begins at an origin and proceeds bi-directionally to the ends; this leaves a single stranded overhang. c) The single stranded overhang can invade other dsDNA at any place where it has homology. = ssDNA is unstable so will hunt to become dsDNA again d) This replication strategy leads to very long concatemers of the T4 phage genome that are joined end-to-end. = Unique replication strategy, so the long concatemer may have 10-20 copies of the T4 phage genome
Types of single gene mutations
a. Base substitution: single base change => Common, seen in healthy individuals too, in non-coding regions and regions that don't affect phenotype - transition - pyrimidine replaced by another pyrimidine (C / T) or purine by purine (G / A) - transversion - purine replaced by pyrimidine (or vice versa) b. Insertions or deletions (indels): short DNA sequences may be deleted or added - We all have these, different to our parents - Particular example in humans - trinucleotide repeats: series of repeats of three nucleotides inserted. Important cause of inherited disease. - Also common are transposable element insertions.
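The transition / transversion rule can be written as a tiny Python check (function name is hypothetical): a change within the purines (A, G) or within the pyrimidines (C, T) is a transition, anything else is a transversion.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single base change as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref == alt:
        raise ValueError("ref and alt are the same base, not a substitution")
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine)
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"

print(classify_substitution("C", "T"))  # transition
print(classify_substitution("A", "T"))  # transversion
```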
Mixed DNA samples
e.g. Rape victims - sample has more of the victim's DNA than the assailant's DNA. PCR will preferentially amplify the woman's DNA. Preferential lysis: - used to degrade the woman's cells, leaving resilient sperm cells behind - does not work when the assailant produces no sperm (medical condition, vasectomy). Alternative method: Isolation of male cells by a combination of FISH and laser microdissection = Use the cut-out cells to make a DNA profile. Peak sizes may also help separate two individuals in a mixed DNA sample = 4 peaks, 4 alleles, so obviously more than one person because one person would have at most two peaks at a locus
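The "4 peaks means more than one person" reasoning can be sketched in Python: since one person contributes at most two alleles per locus, the locus with the most distinct alleles sets the minimum number of contributors. The allele calls and the function name are hypothetical.

```python
import math

def min_contributors(alleles_per_locus):
    """Smallest number of people that could explain the alleles seen."""
    # Each person carries at most 2 alleles at any one locus
    most = max(len(set(alleles)) for alleles in alleles_per_locus.values())
    return max(1, math.ceil(most / 2))

# Hypothetical allele calls (repeat numbers) from one electropherogram
mixed_sample = {"locus1": [15, 16, 17, 18], "locus2": [14, 16], "locus3": [8, 9, 11]}
single_source = {"locus1": [15, 16], "locus2": [14, 14], "locus3": [9, 11]}
print(min_contributors(mixed_sample))   # 2
print(min_contributors(single_source))  # 1
```

This only gives a lower bound: two people who happen to share alleles can still produce two or fewer peaks at every locus.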
WGA of fragmented DNA
e.g. fixed tissue - very common for pathological samples - formalin-fixed, paraffin-embedded (FFPE) [Really bad for DNA because it fragments the DNA]. The order of the fragments is different from the original genome, but the DNA is all together and can be amplified - Not perfect because the order of the genome is mixed up
Detecting ANY mutation in a gene: 4) Deletions in heterozygous carriers
o Cannot simply look for absence of a PCR product - wildtype chromosome will produce a product o Not detected by sequencing (looks like a homozygote) o Not detected by SSCP or heteroduplex (mutant allele gives no product) o FISH can only detect very large deletions o Quantitative PCR (qPCR) can detect the difference between 1 and 2 copies of DNA = Monitor the amount of DNA at every cycle and plot it. Can be more useful than looking at the amount of PCR product at the end. = Time consuming and generally expensive o Real Time PCR uses a special thermocycler that detects the amount of fluorescence-labelled PCR products in real time, after each PCR cycle = If starting from RNA, reverse transcribe it to cDNA and then do a real time PCR reaction
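A hedged Python sketch of how qPCR readings can be turned into a copy number. It uses the comparative Ct (delta-delta Ct) approach, which is a standard method but is an assumption here since the notes do not say which analysis is used. It also assumes the amount of product doubles every cycle (100% efficiency) and that a reference gene with 2 copies is measured alongside the target; all Ct values and names are made up.

```python
def relative_copy_number(ct_target_sample, ct_ref_sample,
                         ct_target_control, ct_ref_control,
                         control_copies=2):
    """Estimate target copy number in a sample relative to a normal control."""
    # Normalise the target to the reference gene within each DNA sample
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # One extra cycle to reach threshold = half as much starting template
    ratio = 2 ** -(delta_sample - delta_control)
    return control_copies * ratio

# Hypothetical Ct values: the target needs one extra cycle in the test sample
print(relative_copy_number(26.0, 24.0, 25.0, 24.0))  # 1.0 -> heterozygous deletion
print(relative_copy_number(25.0, 24.0, 25.0, 24.0))  # 2.0 -> normal, 2 copies
```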
Detecting ANY mutation in a gene: 1) Sequencing
o Gold standard approach o We can amplify up the region of interest in the genome and sequence it. Software reads the bases and records changes o Expensive but becoming cheaper o Detects any mutation - don't need to know the mutation o PCR amplify exons - Good quality DNA required for sequencing o 'Allele-calling' software can automate the process o Need to know if polymorphisms detected are pathogenic. Linking PCR products can reduce sequencing burden: - PCR products are joined using overlapping sequences so that we can reduce costs and sequencing burden - Maximises use of the ~800 bp of effective sequence per read - This approach is becoming much less utilised as costs come down
Effects of single gene mutations on gene products = Mutations within coding sequence
silent mutations - no alteration to amino acid sequence. missense mutation - amino acid change does occur e.g. sickle cell anaemia - Can be a neutral mutation. nonsense mutations - codon changes to a stop codon, results in truncated protein. frameshift mutation - insertion or deletion of a number of nucleotides that is not a multiple of 3 (e.g. 1-2 nucleotides) causes a shift in the reading frame, so a completely different amino acid sequence results downstream
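These categories can be checked mechanically. The Python sketch below builds the standard genetic code (DNA codons, '*' = stop) and classifies a single codon change; function names are hypothetical, and changes that turn a stop codon into an amino acid are not treated separately here (they would be reported as missense).

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_codon_change(ref_codon, alt_codon):
    """Return 'silent', 'nonsense' or 'missense' for a single codon change."""
    ref_aa = CODON_TABLE[ref_codon.upper()]
    alt_aa = CODON_TABLE[alt_codon.upper()]
    if ref_aa == alt_aa:
        return "silent"       # same amino acid
    if alt_aa == "*":
        return "nonsense"     # new stop codon, truncated protein
    return "missense"         # different amino acid

def is_frameshift(indel_length):
    """An indel shifts the reading frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

print(classify_codon_change("GAG", "GTG"))  # missense (Glu -> Val, sickle cell)
print(classify_codon_change("TGG", "TGA"))  # nonsense (Trp -> stop)
print(classify_codon_change("GAA", "GAG"))  # silent (Glu -> Glu)
print(is_frameshift(1), is_frameshift(3))   # True False
```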
1.1.2. Targeted gene knockouts
• Culture blastocyst stage embryos in conditions which cause cells to keep dividing without differentiating, to produce embryonic stem (ES) cells. • Transfect the ES cells: transform with foreign DNA, designed to select only homologous recombination (gene replacement) events rather than random integration. - Most commonly replace wildtype allele with a non-functional allele (often marker). = Could do this with antibiotic resistance gene, and use antibiotics to kill cells that didn't incorporate the dna • Inject transfected ES cells into ICM of normal blastocyst, will contribute to all of the tissues including germ line. Chimeric mice will transmit mutated gene to progeny. • Knockouts may be lethal, can study gene function in mosaic animals using cre-lox system.
2.2 RNA interference (RNAi)
• Currently a very exciting possibility • Involves getting a double stranded RNA hairpin construct into target cells. • Mechanism of RNAi - double stranded construct is processed by dicer complex into 20-25 nt siRNAs (small interfering RNAs) - these bind to the RISC complex (RNA-induced silencing complex), which contains a helicase that unwinds the double stranded siRNA - Antisense strand guides RISC to the endogenous mRNA and then RISC cleaves it • Needs to be specific for the mutant mRNA • Can be used to correct aberrant splicing. • Delivery, efficiency and specificity all under investigation • Vectors similar to those for gene therapy
1.4.2 Adenoviral vectors
• DNA viruses that cause benign infections of the upper respiratory tract (viruses such as those behind the common cold).
  = Can be good for some conditions such as CF
• Efficiently transduce both dividing and non-dividing cells.
  = Post-mitotic tissues can also be targeted
• Linear ds DNA genome does not integrate, remains as an episome in the nucleus - safer but lower expression.
  = Episomal vector - doesn't integrate, so less risk but lower expression after multiple doses
• Also disabled; if all viral genes are removed, can insert up to 35 kb of therapeutic gene.
• Major problem is immunogenicity: unwanted immune responses in several trials, including one death in 1999 in a trial for ornithine transcarbamylase deficiency.
• Also short-term expression - in a CF trial only expressed for 4 weeks, therefore need repeated treatments.
  = CF mucus makes it hard for the vector to get to the cells
  = Immune response after every treatment
1. Somatic cell gene therapy
• Different to germ-line gene therapy, which is:
  - permanent, transmissible genetic modification (will affect the offspring of that individual, because germ-line genes are changed)
  - would solve the problem once and for all, but more risky as effects are not contained to the patient being treated; also imposes the choice on future generations
  - illegal in most countries
• Aim of somatic cell gene therapy is to cure genetic disease by inserting a wild-type gene that is expressed at the right time and place.
  - has to get to the right cells, breach the plasma membrane, traffic through the cytoplasm, enter the nucleus
Embryonic Stem Cells
• ES cells are pluripotent, and can be cultured indefinitely
• Currently two main techniques for isolating:
  - From the inner cell mass of embryos left over from IVF
  - From germ cells of spontaneously aborted foetuses
  = Very limited supply
  = Serious ethical concerns
• Culture methods continually being developed to direct them along specific developmental paths.
Mouse example - sickle cell anemia (Lect 6)
• Early efforts replaced the mouse β-globin gene with the mutant human gene. Didn't replicate the disease.
• To produce a good model, had to knock out the mouse α- and β-globin genes and replace them with the human wild-type α gene and mutant β gene (sickle cell mutation).
• Mice use the same Hb throughout development, therefore also had to introduce the human γ-globin gene so they could make fetal Hb.
• Mice had severe anemia and infarction - tissue death from clogged blood vessels.
• Much of what is known about the molecular pathology is from the mouse model.
Problems with ES cell use in stem cell therapies
• In animal models, transplanted cells sometimes proliferate out of control and form tumours.
  = Taking a very dangerous pluripotent cell and putting it into the body
• Can revert to undifferentiated cells.
• Can be genetically unstable.
  = Only need the genes necessary to proliferate in the dish, so may lose portions of chromosomes
• Ethical concerns.
• Immune rejection, unless therapeutic cloning is used to create an ES cell line genetically identical to the patient.
ES cells have not yet been used to successfully treat human disease.
1.1.3. Conditional knockouts
• Knockouts may be lethal; can study gene function in mosaic animals using the Cre-lox system.
• Cre recombinase (from phage P1) excises segments of DNA flanked by its binding sequence, called loxP sites.
• Binary system, requires two lines of mice:
  1. Transgenic line with Cre driven by a tissue-specific promoter
  2. Line with target gene flanked by loxP sites (floxed)
• Mate the two together; the gene will only be excised where Cre is produced.
Process:
  - Gene of interest, gene A
  - Flanking seq, loxP site, selectable marker, loxP site and flanking seq
  - LoxP sites undergo recombination when you add Cre; get 3 outcomes:
    1. Recombination between the first two sites deletes the selectable marker
    2. Recombination deletes the gene of interest
    3. Recombination deletes everything we introduced
  - Selecting with the marker lets us find cells that have successfully undergone recombination, even if only 1/1000 have what we want.
  - Result: a mouse like our starting mouse but with loxP sites
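The three recombination outcomes can be enumerated with a toy model. This is a sketch under an assumption not spelled out in the notes: three loxP sites in the same orientation, arranged as loxP - marker - loxP - gene A - loxP, which is the layout that gives exactly three pairwise outcomes. The element labels and function names are illustrative only.

```python
from itertools import combinations

LOXP = "loxP"


def excise(construct, first, second):
    """Recombine the loxP sites at list indices first < second.

    The segment between them is removed and a single loxP site remains.
    """
    return construct[: first + 1] + construct[second + 1 :]


def cre_outcomes(construct):
    """Return every product of a single Cre-mediated excision, keyed by the site pair used."""
    sites = [i for i, element in enumerate(construct) if element == LOXP]
    return {pair: excise(construct, *pair) for pair in combinations(sites, 2)}


# Assumed layout of the targeted locus (labels are placeholders).
locus = ["5' flank", LOXP, "marker", LOXP, "gene A", LOXP, "3' flank"]

for pair, product in cre_outcomes(locus).items():
    print(pair, product)
```

In this model, the product that keeps gene A between two loxP sites and loses the marker is the floxed allele, i.e. a locus like the starting one but with loxP sites.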
Adult or "tissue-specific" Stem Cells
• Numerous adult organs contain multipotent stem cells - can generate a limited set of tissue types. E.g. hematopoietic stem cells give rise to all blood cell types (Nussbaum Fig 14.11).
• More difficult to culture and to isolate (e.g. testis stem cells, only 1/10,000 cells in testis).
• Can use patient's own cells (depending on tissue) - no immune response.
  = If you are going to use the patient's own cells, you have to take them out of the body, correct the genetic defect in those cells (using a gene therapy approach), then proliferate them and put the corrected cells back into the patient.
1.4 Viral vectors 1.4.1 Oncoretroviral vectors
• Retroviruses - RNA viruses; viral genome gets reverse transcribed in the host cell and the cDNA integrates into the host genome at a random location.
  = Very useful because this system does this naturally
• Can only infect dividing cells, as the virus only gets access to the chromosomes when the nuclear envelope dissolves. Limits target cells.
  = If you want to target a post-mitotic tissue (one no longer dividing), e.g. neural, can't use these vectors
• Can only insert up to 7 kb of DNA - a problem with big genes.
• Need to be genetically engineered to eliminate the ability to transform cells; key genes are replaced by the gene of interest. The construct is then packaged in special cells that supply the missing proteins for packaging the virus.
  = Needs to be able to infect the cell of interest, but must not be able to go on and transmit further to other cells
• Very efficient at transferring DNA into cells, used a lot early on.
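The constraints on vector choice can be sketched as a small lookup, using only the figures given in these notes for oncoretroviral vectors (up to 7 kb, dividing cells only, integrating) and adenoviral vectors (up to 35 kb, dividing and non-dividing cells, episomal). The class and function names are hypothetical, and real vector selection also weighs immunogenicity, safety and duration of expression.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vector:
    name: str
    max_insert_kb: float
    infects_non_dividing: bool
    integrates: bool


# Values taken from the notes; the data structure itself is illustrative.
VECTORS = [
    Vector("oncoretroviral", max_insert_kb=7, infects_non_dividing=False, integrates=True),
    Vector("adenoviral", max_insert_kb=35, infects_non_dividing=True, integrates=False),
]


def candidate_vectors(insert_kb, target_cells_dividing):
    """Return names of vectors that can carry the insert and enter the target cells."""
    return [
        v.name
        for v in VECTORS
        if insert_kb <= v.max_insert_kb
        and (target_cells_dividing or v.infects_non_dividing)
    ]


print(candidate_vectors(5, target_cells_dividing=True))
print(candidate_vectors(5, target_cells_dividing=False))  # e.g. a post-mitotic tissue
print(candidate_vectors(40, target_cells_dividing=True))
```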
Factors to consider in a model genetic organism
• ease and cost of breeding: animal models are expensive, even the cheaper animals, e.g. worms
• number of progeny: mouse litters can be 8-10, whereas zebrafish can give 200-300 embryos per week
• generation time
• genetic techniques: have improved for all organisms over time
• genetic background variation
• cell/organ culture techniques: don't address the physiological side as well as animals do
• genome sequence: available for every commonly used model organism
• similarity to human
• ethical concerns
Effects of single gene mutations on gene products = Mutations in Non-coding Regions
• more difficult to predict effects
• promoter regions - mutation may result in an increase or a decrease in transcription.
• splice recognition sites - pre-mRNA may not be spliced correctly
• 5'UTR / 3'UTR - alteration in ability of mRNA to be translated or in mRNA stability
1.2 Requirements for somatic cell gene therapy
• well characterised gene, including regulatory elements
• knowledge of the mutation and how much expression is needed to compensate
• accessible target cells/tissue and good delivery method
• vector system to introduce DNA efficiently in right place, safely, and avoiding immune response
• animal model of disease very helpful for trialling method
How antibiotic resistance happens
• Antibiotic resistance is a natural and unavoidable process.
• Resistance happens quickly!
• Bacteria replicate every ~45 min and have huge population sizes (10^8-10^9 per mL).
• At these numbers, even very rare events become common occurrences.
Classes of Antibiotics & their mechanisms
- Bacterial cell wall structures share common features - allow for broad spectrum antibiotics.
- Further Gram +ve and -ve differences
- DNA replication machinery, RNA polymerase, nucleic acid formation also targeted
- Bacterial ribosomes differ in structure from human ribosomes - makes them good targets for antibiotics
Bactericidal vs. Bacteriostatic
- Bactericidal: kill the bacteria outright
- Bacteriostatic: don't kill the bacteria outright, but slow them down and stop growth
  - Typically protein synthesis antibiotics
  - Not as strong
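The earlier point that rare events become common at these population sizes can be checked with some quick arithmetic. The sketch below uses the ~45 min replication time and 10^9 cells per mL from the notes; the per-cell probability of a resistance mutation (1e-9) is a purely hypothetical value chosen for illustration, and the function names are invented.

```python
import math


def generations_per_day(doubling_minutes):
    """Number of doublings that fit into 24 hours."""
    return 24 * 60 / doubling_minutes


def expected_rare_events(population, probability_per_cell):
    """Expected number of cells in which the rare event has occurred."""
    return population * probability_per_cell


def prob_at_least_one(population, probability_per_cell):
    """Probability that at least one cell in the population carries the event."""
    return 1 - math.exp(population * math.log1p(-probability_per_cell))


cells_per_ml = 10**9    # upper figure from the notes
mutation_prob = 1e-9    # ASSUMED per-cell probability, not from the notes

print(generations_per_day(45))
print(expected_rare_events(cells_per_ml, mutation_prob))
print(round(prob_at_least_one(cells_per_ml, mutation_prob), 3))
```

Under these assumptions, a one-in-a-billion event is expected about once in every millilitre of culture, and the population passes through roughly 32 doublings a day.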