gen3040
What is the 1000 Genomes Project? How is it different from the International HapMap Project?
- A project that catalogued human genetic variation by sequencing. Its pilot phase made use of low-coverage whole-genome sequencing of 179 individuals, high-coverage sequencing of 2 father-mother-child trios, and exon-targeted sequencing of 697 individuals from 7 populations. - It was observed that each person carries approximately 250 to 300 loss-of-function variants in annotated genes, and 50 to 100 variants previously implicated in inherited disorders. Difference: HapMap genotyped known common SNPs to map haplotype patterns, whereas the 1000 Genomes Project sequenced genomes and so also captured rarer variants. The eQTL TIMM2 was detected with a much stronger signal in the 1000 Genomes Project data.
How does repetitive DNA contribute to the generation of CNVs?
- CNVs are copy number variants: segments of the genome that are present in different numbers of copies in different individuals - Most humans carry their own individual set of CNVs throughout the genome - Repetitive DNA provides stretches of near-identical sequence at different positions, so chromosomes can misalign and recombine between non-allelic repeats (unequal crossing over), duplicating or deleting the sequence in between - Repeats also make errors during DNA replication and repair (e.g. slippage) more likely, which changes repeat copy number and creates CNV differences between individuals
You are interested in two closely related genes in yeast. How would you create a strain containing null alleles of both genes (e.g. a double mutant)?
- Can use homologous recombination to replace each gene with a piece of mutant DNA - The DNA can be introduced on a plasmid or as a linear fragment that carries a selectable marker flanked by sequences from the target gene; the marker enables selection of transformed cells - Two recombination events (one in each flanking region) replace the target gene with the marker → non-functional (null) allele with a selectable marker in the middle - If there is only one recombination event, the whole plasmid is INTEGRATED into the genome - For the double mutant, knock out the second gene with a different selectable marker, or cross the two single mutants and select spores carrying both markers
How can you identify which 'species' your protease sequences are from?
- Each species usually has its own rDNA sequence, which can be used for molecular phylogenetics - Isolate all the DNA and sequence it in bulk - If fragments are too small they will not contain any rDNA and can only be used to infer which species they may come from - Assemble the sequences as far as possible into contigs - Some species' genomes can be assembled if enough contigs from that species are available - BLAST the contigs against a database to find the closest species each contig could belong to
Why do we choose one particular gene over thousands of others?
- If the biological function of a gene is mostly unknown, or it appears to have different functions across model organisms, this may be a good gene to choose for a reverse genetics approach. - It might be an ortholog of a gene that has an interesting role in another organism. - It may also be because we are interested in a developmental function, so we choose genes expressed in the relevant tissue types or at the times when our biological process of interest is occurring.
Can you think of a reason why locating genes would be more complicated in a eukaryote versus a bacterium?
Eukaryotic genes are interrupted by introns, and eukaryotic genomes contain far more non-coding sequence than bacterial (i.e. prokaryotic) genomes, which are gene-dense and largely lack introns → coding sequence is harder to pick out in eukaryotic DNA. Promoter regions are also not as conserved in eukaryotes as in prokaryotes, so the start points of genes and transcription start sites are harder to define.
"Heritability is the proportion of phenotypic variation that can be attributed to genetic differences. Therefore, it is easily transferable between populations." True or false? Why?
False. Heritability is a property of a particular population in a particular environment. A value measured in one population does not transfer to another, as both the genetic variation and the environmental factors may differ between populations.
Which sequencing approach would you favour if you wanted to sequence your own genome and why?
I would use Illumina because it is the cheapest and it has a low error rate (though not as low as Sanger). Its disadvantage is the short read length, which may complicate assembly and mapping computationally. Use whole-genome shotgun sequencing with Illumina and compare the reads to the reference genome to see which disease-risk genotypes I carry.
What is the International HapMap Project and what is it aiming to achieve?
Identify common patterns of DNA sequence variation (haplotypes); identify sequence variants and catalogue them; make the data freely available to the public; allow tests for associations between the variants themselves as well as with disease state.
How does repetitive DNA contribute to the generation of CNVs?
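Repeated sequences at different positions in the genome can misalign, which allows non-allelic homologous recombination (NAHR) between them; the unequal exchange duplicates or deletes the intervening region, forming copy number variation. A duplication is then detected as an overabundance of reads mapped to that area of the genome.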
Describe the landscape of the human genome and your view on the biological role, if any, of "junk DNA"
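Landscape: only a small fraction of the human genome is protein-coding; much of the rest is introns, intergenic sequence and repetitive elements. "Junk DNA" refers to non-coding seqs, much of it repetitive elements, once thought to serve no real function for the organism; these elements can increase genome size via self-replication and insertion into different parts of the genome. However, these regions likely contain regulatory sequences and genes that produce functional RNAs, such as microRNA. The ENCODE project reported biochemical activity (e.g. transcription, protein binding) for most of the genome (about 80%), although how much of that activity is biologically functional is debated.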
Differentiate linkage and linkage disequilibrium and discuss how they are useful in mapping traits
Linkage: co-inheritance of genes due to physical proximity on a chromosome, which reduces recombination between them. Linkage disequilibrium (LD): non-random association of alleles at different loci in a population, due to shared history. Tightly linked loci tend to be in LD with each other, but LD decays over generations as recombination occurs. Usefulness in mapping traits: Linkage: can tell which genes or markers are near each other, using crosses or pedigrees. Linkage disequilibrium: useful when mapping populations can't be generated, for reasons such as ethics or long generation times (e.g. trees); LD with nearby markers is used in association mapping experiments to statistically test marker genotype to phenotype correlations.
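A minimal sketch of how linkage disequilibrium between two biallelic loci can be quantified, using the standard coefficient D = p(AB) - p(A)p(B) and its normalised form r². The function name and the haplotype counts below are made up for illustration, and phased haplotypes are assumed.

```python
def linkage_disequilibrium(haplotypes):
    """Return (D, r_squared) for two biallelic loci.

    haplotypes: list of 2-character strings such as "AB", "Ab", "aB", "ab";
    the first character is the allele at locus 1, the second at locus 2.
    """
    n = len(haplotypes)
    p_a = sum(h[0] == "A" for h in haplotypes) / n   # frequency of allele A
    p_b = sum(h[1] == "B" for h in haplotypes) / n   # frequency of allele B
    p_ab = sum(h == "AB" for h in haplotypes) / n    # frequency of haplotype AB
    d = p_ab - p_a * p_b                             # 0 when alleles associate at random
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, r_squared

# Hypothetical samples: complete LD (only AB and ab occur) versus no LD.
complete_ld = ["AB"] * 50 + ["ab"] * 50
no_ld = ["AB"] * 25 + ["Ab"] * 25 + ["aB"] * 25 + ["ab"] * 25
print(linkage_disequilibrium(complete_ld))
print(linkage_disequilibrium(no_ld))
```

With complete LD a marker at locus 1 predicts the allele at locus 2 perfectly, which is what lets association mapping use markers as proxies for nearby causal variants.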
Differentiate local and distant regulatory variations, which may be mapped through eQTL analysis.
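Local (cis) regulatory variation: the variant lies in or near the gene it regulates and acts in an allele-specific way, therefore it usually affects only that individual gene; the expression variation maps to the gene's own location. Distant (trans) regulatory variation: the variant lies elsewhere in the genome, e.g. in a transcription factor that can no longer regulate its targets properly, so it usually affects multiple genes; the expression variation maps away from the affected genes.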
Describe how telomeres are maintained in eukaryotic organisms
Telomeres are repetitive elements that protect chromosome ends from degradation and from loss of genes during replication. 1. The end-replication problem: the lagging strand is synthesised in the opposite direction to the leading strand, starting from RNA primers; when the last RNA primer is removed there is nowhere for a new primer to bind to fill the GAP .: a little DNA is LOST from the end after every replication cycle, leaving a single-stranded overhang 2. Telomerase binds to the overhang → uses its own RNA template to add nt's to the end of the DNA → extends the strand so the gap can be filled in .: compensates for the telomere length lost and keeps telomere length stable
What are the consequences of whole genome duplications on genome and gene evolution?
Results in polyploids: allotetraploids, where two or more different parental genomes are maintained in duplicated form, or autotetraploids, from duplication of the diploid genome within a species. Duplication gives functional redundancy as well as increased genome flexibility in heterogeneous conditions. Both forms of polyploidy buffer the organism against mutations, can accelerate evolution, increase allelic diversity and heterozygosity, and allow sub- or neofunctionalization of the duplicated genes.
How are retrovirus and retrotransposons related to one another?
Retrotransposons (Class I "copy and paste" elements) and retroviruses are evolutionarily related: both move via RNA intermediates. The DNA copy is transcribed into RNA, reverse transcriptase converts the RNA back into DNA, and this DNA is inserted back into the genome. LTR retrotransposons resemble retroviruses that lack a functional envelope gene, so they cannot leave the cell. Ancient exogenous retroviruses may have infected mammalian germline cells and integrated their viral genomes into the gametes .: can be inherited by the progeny
What is the definition of a gene?
Sequence of nucleotides that encodes a functional product, either a polypeptide or a functional RNA (eg proteins, rRNA, tRNA)
Describe the situations in which GWAS will be a useful approach and contrast this with situations in which GWAS is of limited use
Useful for identifying variants underlying common diseases when the causal alleles are themselves common: many individuals in a large sample carry them, so according to the common variant hypothesis GWAS has the power to detect their association. Whereas GWAS is less useful for diseases caused by rare alleles, as too few individuals carry each allele to give enough statistical evidence for detection, even if the disease itself is common.
Describe the process by which a polyploid species gradually returns to behaving like a diploid species
Polyploids begin to behave like diploids through the process of fractionation and gene loss. 1. Duplication results in two chromosomes with the same genes 2. Over time pseudogenes are formed (one copy lost) and sub/neofunctionalisation occurs (two copies retained) 3. Results in two chromosomes, neither of which is fully syntenic with the ancestor, but which are syntenic with each other 4. Behaves more like a diploid: not two copies of every gene, less genetic redundancy - Even if autopolyploid, the duplicated chromosomes won't be able to pair with each other at meiosis after differential gene loss
Mark sequenced the myoglobin gene of Kangaroos from two different parts of Australia and found a single nucleotide polymorphism. Based on your understanding of the concepts underlying RFLP marker and PCR, can you help Mark design an assay to genotype the myoglobin gene for the SNP in all Kangaroos in Australian Zoos
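Design primers flanking the SNP and amplify the region by PCR. If the SNP creates or destroys a restriction site, digest the PCR product with that enzyme: one allele is cut and the other is not, giving a restriction fragment length polymorphism. Run the digested product on a gel and determine genotype by fragment size (heterozygotes show both patterns). If the SNP is not at a restriction site, allele-specific primers can be designed so that a product is only produced in the presence/absence of the SNP.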
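The assay logic can be sketched in a few lines, assuming the SNP falls inside an EcoRI site (GAATTC, cut after the first G). The amplicon sequences and function names are invented for illustration; a real assay would use the actual myoglobin sequence and whichever enzyme matches the SNP.

```python
def digest(amplicon, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting at every occurrence of site."""
    cuts = []
    pos = amplicon.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)        # position of the cut within the amplicon
        pos = amplicon.find(site, pos + 1)
    edges = [0] + cuts + [len(amplicon)]
    return [b - a for a, b in zip(edges, edges[1:]) if b > a]

def gel_bands(allele_1, allele_2):
    """Band sizes expected on the gel for a diploid carrying two amplicon alleles."""
    return sorted(set(digest(allele_1) + digest(allele_2)))

# Hypothetical 46 bp amplicons: the SNP (A -> G) destroys the EcoRI site.
cut_allele = "ATGC" * 5 + "GAATTC" + "TTAGG" * 4
uncut_allele = "ATGC" * 5 + "GAGTTC" + "TTAGG" * 4

print(gel_bands(cut_allele, cut_allele))      # homozygous for the cut allele
print(gel_bands(cut_allele, uncut_allele))    # heterozygote: both patterns
print(gel_bands(uncut_allele, uncut_allele))  # homozygous for the uncut allele
```

Each of the three genotypes gives a distinct band pattern, which is what makes the digest usable for genotyping.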
What does genome annotation mean and what is the difference between structural and functional annotation?
Process where raw data (DNA seq's) is categorically analysed to make inferences about gene function and localisation (ie glean information from DNA sequences) Structural annotation: identifying specific elements in the genome (eg open reading frame, structure, coding regions, regulatory motifs) Functional annotation: attaching biological information to genomic elements → apply existing knowledge of genes and biology to determine biological (role in organism), biochemical (role in chem process), and cellular (role in cell) functions
What are some major differences when conducting structural annotation of genes for prokaryotic versus eukaryotic genomes?
Prokaryotes: usually circular DNA; no introns; operons, which are several genes grouped under one promoter and usually related in function; smaller, gene-dense genomes. Simpler to annotate: uninterrupted genes can be found by looking for open reading frames. Eukaryotes: linear chromosomes; have introns; usually much larger genomes. More complex, with lots of non-coding DNA and introns that interrupt open reading frames, therefore gene models must account for splicing.
If given a genome sequence, how would you figure out where the genes are?
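Prokaryotes: scan for long open reading frames (start codon to stop codon) and identify conserved promoter and ribosome-binding sequences upstream. Eukaryotes: use gene-prediction programs that identify consensus splice sites and start/stop codons, supported by aligning mRNA/cDNA sequences and by similarity to known genes in other species.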
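One simple way to locate candidate genes, mainly applicable to intron-free (prokaryotic) sequence, is to scan all six reading frames for open reading frames. The sketch below is illustrative only: the function names, the example sequence and the `min_codons` threshold are assumptions, and real gene finders use much richer models.

```python
STOPS = {"TAA", "TAG", "TGA"}

def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) for each ATG...stop ORF in six frames.

    start and end are 0-based positions on the strand that was scanned;
    the length threshold counts the stop codon.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first start codon opens an ORF
                elif start is not None and codon in STOPS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None                   # look for the next ORF in this frame
    return orfs

print(find_orfs("CCATGAAATTTGGGTAACC"))
```

In a eukaryotic genome the same scan would break at every intron, which is why splice-site models and transcript evidence are needed there.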
How would you perform a genetic screen to identify genes encoding a protein involved in the secretory pathway in yeast?
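1. Treat cells with a mutagen and grow them on complete media at a permissive temperature, then stamp (replica-plate) to a restrictive temperature. Secretion is essential, so look for conditional (temperature-sensitive) mutants: if there's no growth at the restrictive temperature then the mutation is affecting an important pathway. 2. Test the identified mutagenised colonies at the restrictive temperature for failure to secrete a marker protein that instead accumulates inside the cell, then examine where the protein accumulates to identify which point in the secretory pathway has been mutated.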
Compare association mapping and QTL analysis in understanding the genetic basis of complex traits.
QTL mapping: uses crosses between parents that contrast for the trait of interest (e.g. tolerant and susceptible), then phenotypes and genotypes (with molecular markers) the progeny populations to determine QTL, i.e. marker-trait associations (MTA). Association mapping (AM): uses linkage disequilibrium (LD) in existing populations rather than crosses. Associations are made between genotypes based on molecular markers and phenotypes of various traits in reference germplasm sets. Candidate-gene association mapping tests one or a few markers, whereas GWAS tests many markers across the whole genome. Therefore we are able to identify specific chromosomal regions associated with the trait in the population.
Differentiate between quantitative and qualitative traits
Quantitative: continuous variation in phenotype, which can be measured numerically with a unit of measure; polygenic, therefore controlled by many genes and their interactions; e.g. height. Qualitative: phenotypes that fall into discrete categories; simple traits controlled by one or a few genes; e.g. blood group.
What is the difference between quantitative Reverse Transcription PCR (qRT-PCR) analysis and the RNA-seq/Transcriptome analysis in gene expression studies?
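RNA-seq/transcriptome analysis sequences cDNA made from all the RNA in a sample, so it measures the expression of thousands of genes at once and needs no prior knowledge of the sequences. qRT-PCR measures the amount of RNA from one or a few chosen genes: the RNA is reverse transcribed to cDNA and amplified with gene-specific primers, and the fluorescence at each cycle indicates the starting amount.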
What problem does repetitive DNA present in assembling genomic sequences and what approaches can you take to circumvent this problem?
Repetitive DNA makes it difficult to assemble reads and place functional regions (eg coding regions, exons, introns, promoters) because repeat seq's can be V LONG and/or found in MANY regions in the genome, so reads from different copies look identical (.: can interfere w the order and orientation of contigs). It can be aided using 3rd gen sequencing (long reads that span the repeat). Alternatively, use Illumina to assemble unique regions of the genome, then use paired-end data to link contigs: if the two reads of a pair map to different contigs, the contigs are connected and we know roughly how far apart they are
What is a contig?
A contig is a series of overlapping DNA sequences used to make a physical map that reconstructs the original DNA sequence of a chromosome or a region of a chromosome.
How would your blast results change if you used a longer or shorter word length? Explain why this would happen
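BLAST searches for short matching words between the query and the database, and then extends the alignment out from each match. Longer word: fewer seed matches, so fewer hits that are more similar to the query and a faster search, but diverged matches can be missed. Shorter word: more seed matches, so more hits, including more distant or chance matches, and a slower but more sensitive search.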
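The effect of word length on the seeding step can be illustrated with a toy exact-match seeder. This is a simplified sketch, not BLAST itself (protein BLAST, for instance, also accepts similar "neighbourhood" words); the sequences and the function name are made up.

```python
def seed_hits(query, subject, word_size):
    """Return (query_pos, subject_pos) pairs where an exact word of length word_size matches."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)   # index subject words
    hits = []
    for i in range(len(query) - word_size + 1):
        for j in index.get(query[i:i + word_size], []):
            hits.append((i, j))
    return hits

# Two hypothetical 12 bp sequences that differ at a single central position.
query = "ACGTACGTTAGC"
subject = "ACGTACCTTAGC"

for w in (3, 7):
    print(w, len(seed_hits(query, subject, w)))
```

With the short word there are many seeds, including off-diagonal chance matches; with the long word the single mismatch removes every seed, so this diverged match would never be extended.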
Explain the difference between broad sense and narrow sense heritability
Broad sense heritability (H2) is the proportion of phenotypic variance that can be accounted for by total genetic variance (additive, dominance and interaction): H2 = VG/VP. - Specific to population: different in different populations for the same trait - Condition dependent: environmental conditions will affect H2. Narrow sense heritability (h2) is the proportion of additive genetic variance over the total phenotypic variance: h2 = VA/VP
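As a worked example of the two ratios, the sketch below assumes the usual partition of phenotypic variance into additive, dominance, interaction (epistatic) and environmental components with no genotype-environment covariance. The variance values are invented for illustration.

```python
def heritability(v_additive, v_dominance, v_interaction, v_environment):
    """Return (broad-sense H2, narrow-sense h2) from variance components."""
    v_genetic = v_additive + v_dominance + v_interaction   # VG = VA + VD + VI
    v_phenotypic = v_genetic + v_environment               # VP = VG + VE
    return v_genetic / v_phenotypic, v_additive / v_phenotypic

# Hypothetical variance components for one population in one environment.
broad, narrow = heritability(4.0, 1.0, 1.0, 4.0)
print(broad, narrow)

# Same genetic variances in a more variable environment: both values drop.
print(heritability(4.0, 1.0, 1.0, 14.0))
```

The second call shows why a heritability estimate cannot be carried over to another population or environment: the genetic variances are unchanged but the ratios are not.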
Cytoplasmic male sterility (CMS) in plants has been exploited to produce hybrid seeds. Specific CMS alleles in the mitochondrial genome can be suppressed by specific dominant alleles in the nuclear genome called Restorer of fertility alleles (Rf). Consider the following cross: Female CMS1 Rf1 Rf1 rf2 rf2 x male CMS2 rf1 rf1 Rf2 Rf2. What genotypes and phenotypes do you expect in the F1? If some of the F1 plants are male fertile, what genotypes and phenotypes do you expect in the F2?
CMS mutations in the organellar (mitochondrial) genome result in a misprocessing event and a toxic protein that causes male sterility. In the presence of the matching dominant Rf allele, fertility is restored. The cytoplasm is inherited from the female parent, so all progeny are CMS1 and only Rf1 matters for fertility. F1: CMS1 Rf1 rf1 Rf2 rf2, all fertile. F2 (from selfing fertile F1): 9 CMS1 Rf1_ Rf2_ fertile : 3 CMS1 Rf1_ rf2 rf2 fertile : 3 CMS1 rf1 rf1 Rf2_ male sterile : 1 CMS1 rf1 rf1 rf2 rf2 male sterile, i.e. 12 fertile : 4 male sterile
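The F2 ratio can be checked by enumerating the 16 gamete combinations from a selfed F1. This sketch assumes that the two Rf loci are unlinked, that Rf1 restores only CMS1 and Rf2 only CMS2, that a single dominant restorer allele in the plant is sufficient, and that the cytoplasm comes from the female parent; the function name is illustrative.

```python
from collections import Counter
from itertools import product

def f2_fertility(cytoplasm="CMS1"):
    """Count fertile and male-sterile F2 classes from selfing Rf1 rf1 Rf2 rf2."""
    restorer_locus = {"CMS1": 0, "CMS2": 1}[cytoplasm]   # which locus restores this cytoplasm
    gametes = list(product(["Rf1", "rf1"], ["Rf2", "rf2"]))   # 4 equally likely gametes
    counts = Counter()
    for egg, pollen in product(gametes, repeat=2):
        # A dominant restorer allele starts with an upper-case "R".
        restored = egg[restorer_locus][0] == "R" or pollen[restorer_locus][0] == "R"
        counts["fertile" if restored else "male sterile"] += 1
    return counts

print(f2_fertility("CMS1"))
```

Under these assumptions the dihybrid 9:3:3:1 genotype classes collapse to a 3:1 phenotype ratio, because only the Rf1 locus affects plants with CMS1 cytoplasm.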
"Tasty" genotype of flies exhibit a strong attraction towards the laboratory fly food while the "boring" genotype does not show a similar interest. Anna identified a QTL (approx 100Kb region) controlling this differential response in fruit flies. Anna has 15 Cosmid clones that cover the entire region of the QTL. Could you help out Anna to design experiments to find where the genes underlying the QTL might be?
Complementation cloning: transform each of the 15 clones separately into individuals showing the phenotype and ask whether it rescues the phenotype. Whichever cosmid is able to complement is the one that contains the gene causing the phenotype; if overlapping cosmids both rescue, the gene lies in the region they share.
How would you analyse your mutagenesis - i.e. once you have a collection of mutants - how do you determine how many genes they represent and how many alleles of each gene?
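Complementation tests: cross the recessive mutants to each other in all pairwise combinations. M x M = mutant progeny: the mutations fail to complement, so they are likely in 1 gene (different alleles). M x M = WT progeny: the mutations complement, so they are in different genes. The number of complementation groups corresponds to the number of genes, and the number of mutants in each group is the number of alleles of that gene.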
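Sorting mutants into complementation groups from pairwise cross results can be sketched as below. The mutant names and results are hypothetical, and the sketch assumes recessive mutations with no intragenic complementation, so that "fails to complement" behaves transitively.

```python
def complementation_groups(mutants, fails_to_complement):
    """Group mutants into genes.

    fails_to_complement: list of (m1, m2) pairs whose cross gave mutant progeny.
    """
    parent = {m: m for m in mutants}

    def find(m):
        # Follow links to the representative of m's group (with path halving).
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in fails_to_complement:
        parent[find(a)] = find(b)          # same gene: merge the two groups

    groups = {}
    for m in mutants:
        groups.setdefault(find(m), []).append(m)
    return sorted(sorted(g) for g in groups.values())

mutants = ["m1", "m2", "m3", "m4", "m5", "m6"]
failed = [("m1", "m2"), ("m2", "m3"), ("m4", "m5")]
groups = complementation_groups(mutants, failed)
print(len(groups), "genes:", groups)
```

Here the six mutants fall into three genes with three, two and one alleles respectively.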
Describe 3 different approaches researchers have used to annotate protein-coding genes when they finished assembling the genome of Marchantia polymorpha?
Computational: prediction based using algorithms designed to find genes/gene structure based on nucleotide sequence Experimental: analysis of mRNA sequences and proteins and comparison to the computational data Comparative: analysis of sequences from related species to determine highly conserved areas (more likely to be genes/functional regions)
What are CNS and how can they be identified?
Conserved non-coding sequences: non-coding sequences maintained across species over evolutionary time, which suggests a function (e.g. regulation). Phylogenetic footprinting: compare species separated by large evolutionary distances and search for non-coding regions that remain conserved. Phylogenetic shadowing: compare many closely related species and look for sequences that are conserved in every one of them, excluding the parts that vary in any species. Only works for closely related species
What is the evidence that the ancient mitochondrial and chloroplast endosymbionts are related to the alpha-proteobacteria and cyanobacteria respectively?
Double membrane is similar to that of bacteria. Organelles are a similar size to bacteria. DNA packaging is similar to the packaging of chromosomes in bacteria and dissimilar to that of DNA in the nuclear genome. Transcriptional and translational machinery closely resembles that of bacteria. Protein-coding sequences of organelle genes are more like those of bacteria than those of nuclear genes of eukaryotes or of archaea
How would you formulate hypotheses about biological role of a gene of unknown function in yeast? What genetic/genomic tools would you employ?
• Genetic interactions (Genome-wide synthetic lethal screens - can identify genes that are only essential if another gene function is missing, double mutants) • Loss of function phenotypes • Gain of function phenotypes • Interactome -- comparing gene against other similar proteins in the pathway and examining whether they interact with one another → using this, can create a hypothesis on the biological role of gene
How would you clone a gene that you have identified by a mutant phenotype?
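Cloning by complementation: use plasmids as recombinant DNA vectors to make a library of wild-type genomic DNA, and insert these into the mutant organism to create transgenic individuals. A plasmid that rescues the mutant phenotype carries the wild-type copy of the gene, which can then be recovered and sequenced. Alternatively, map the mutation genetically and sequence the candidate region (map-based cloning).
Describe how you would use the best reciprocal blast approach to find orthologs between the platypus and the koala? Describe a scenario where this approach would not give you orthologous genes when comparing transcriptomes of two species
1. BLAST the koala gene against the platypus sequences 2. BLAST the top platypus hit from this search back against the koala sequences. If the original koala gene is the top hit, the two genes are probably orthologs. Won't give orthologous genes if one species has had deletions which cause the ortholog to no longer exist (the best remaining hit is then a paralog), or, with transcriptomes, if the ortholog is not expressed in the tissue sampled and so is missing from the data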
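The reciprocal step can be sketched with BLAST results already parsed into score tables. The gene names and bit scores below are invented, and the dictionary layout is an assumption for illustration rather than any BLAST output format.

```python
def best_hit(scores):
    """scores: {query_gene: {subject_gene: bit_score}} -> {query_gene: top-scoring subject}."""
    return {q: max(hits, key=hits.get) for q, hits in scores.items() if hits}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hit(a_vs_b)
    best_ba = best_hit(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)

# Hypothetical scores: k2 and k3 are koala paralogs that both hit platypus p2.
koala_vs_platypus = {
    "k1": {"p1": 900, "p2": 300},
    "k2": {"p2": 250},
    "k3": {"p2": 400},
}
platypus_vs_koala = {
    "p1": {"k1": 880},
    "p2": {"k3": 410, "k2": 240},
}
print(reciprocal_best_hits(koala_vs_platypus, platypus_vs_koala))
```

k2's best hit is p2, but p2's best hit is k3, so k2 is not reported; this is how the reciprocal check filters out one-way hits to paralogs.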
Describe how CRISPR-CAS has been modified to create a genome editing tool. How would you create loss of function alleles using CRISPR-CAS? How would you edit a specific nucleotide in the genome?
Natural system: 1. The CRISPR array is transcribed and chopped up into crRNAs, each containing a unique sequence and a repeat sequence 2. crRNAs combine with the CAS protein to form a complex 3. CAS protein cuts at the site complementary to the crRNA. As an editing tool: we supply a designed guide RNA to determine the site of cleavage. For a loss of function allele, error-prone NHEJ repair of the cut will result in a small insertion/deletion that can disrupt the reading frame. To edit a specific nucleotide, a repair template can be included so that repair occurs by homologous recombination, resulting in the incorporation of the edited nucleotide in the sequence
How have transposable elements shaped the genomes of eukaryotes?
1. Can add complexity to the genome and act as regulatory factors and recognition sites → TF's to bind to + interact w enhancers .: enhance expression 2. Ectopic recombination (same seq found in multiple regions) → eventually leads to crossing over in opposite directions → can have no effect OR be beneficial (especially if gene has an important role in biological function) 3. Alternative splicing -- transposable elements interact w areas that are alternatively spliced → genome complexity bc they
You are interested in identifying proteases that work at low temperature. How would you survey microbial diversity in the cold waters?
1. Collect samples from cold waters 2. Extract DNA and sequence or RNA, convert to cDNA and sequence 3. Assemble the sequences of the organisms present in the samples 4. Compare sequences to construct phylogenetic trees 5. Identify protease sequences and compare between the organisms
East performed experiments with two pure lines of tobacco that differed in their corolla length and analysed the F1, F2 and F3 generation. Could you interpret the results of East?
1. Crossed two inbred varieties of tobacco that differ in corolla length (within/environmental and between/genetic population variation) 2. F1 will all have same genotype, therefore variation must be due to environment 3. F2 will have different genotypes (recombination and independent assortment) - environmental and genotypic variance 4. Select and self-mate short corollas and long corollas results in lines that resemble the P1 (within/environmental and between/genetic population variation)
Describe five mechanisms by which organisms acquire new genes.
1. Gene duplication: can be of a single gene, genome, portion of genome chromosome, chromosome segment 2. Gene duplication by unequal crossover: unequal crossover of misaligned homologous chromosomes in prophase I results in one or more gene duplicated 3. Lateral (horizontal) gene transfer: movement of genes of one species into the genome of another species 4. Gene fusion or fission: deletion of stop codon leads to two genes combining together, one gene splits into two 5. Exon shuffling: exons from two or more genes are combined
How would you identify sequences that encode proteases from the organisms that you cannot culture?
1. Collect a sample from the natural population (environmental sample) 2. Isolate and sequence the DNA directly, without culturing 3. Assemble contigs from the reads 4. BLAST the contigs against known protease sequences from other species to find similar sequences
In humans, Duchenne's muscular dystrophy is caused by a mutation in the dystrophin gene which resides on the X chromosome. How would you create a mouse model of this genetic disease?
1. Introduce the mutated dystrophin gene with an antibiotic resistance marker into embryonic stem cells via homologous recombination, replacing the endogenous gene 2. Select for transformed (resistant) cells 3. Inject cells into a blastocyst, implant the blastocyst into a surrogate mother to give rise to a chimeric pup 4. Mate chimeric pup to wildtype 5. Mate heterozygous (carrier) F1 females: because the gene is X-linked, half of their sons are hemizygous for the mutation and show the disease; homozygous affected females require an affected father
How would you perform a genetic screen to identify genes directing flower development in Arabidopsis? How would you change the protocol if you were interested in mutation affecting the development of the female gametophyte?
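1. Mutagenise progenitors (e.g. pollen or seeds) 2. Allow fertilisation = F1 heterozygous for new mutations 3. F1 self-fertilise = 1:2:1 F2 4. Identify recessive mutations in F2 offspring showing flower defects 5. Screen 1000s of independent families to be confident of identifying most genes associated with flower development. For female gametophyte: the gametophyte is haploid, so a mutation shows its effect in the gametophytes of a heterozygous plant; screen heterozygous F1 plants for about half of the ovules aborting (reduced seed set) and for reduced transmission of the mutation through the female parent, instead of screening F2 homozygotes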
What are the possible fates of duplicated genes?
1. Pseudogenes: one copy accumulates mutations and no longer encodes a functional product 2. Subfunctionalization: both duplicated genes are mutated so that each keeps part of the original function and they complement each other; composite function of the two genes = original gene function, e.g. alpha and beta globin genes 3. Neofunctionalization: one gene retains the same function as the original gene while the other acquires a new function
How are traces of ancient whole genome duplications recognised in diploid species?
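Analysis of syntelogs (duplicated genes that remain in the same order on two chromosomal regions derived from one ancestral chromosome following genome duplication): finding many such duplicated collinear blocks across the genome identifies instances of whole genome duplication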
Why is the characterisation of ancient DNA a metagenomic analysis?
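An ancient DNA sample contains DNA from a mixture of organisms (e.g. microbes that colonised the remains, plus modern contamination) because it is old and has not been kept in sterile conditions; thus when it is sequenced it is a metagenome, and the reads from the target species must be sorted from the rest. Metagenomics: recovery and sequencing of genetic material extracted directly from environmental samples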
Autotetraploid vs Allotetraploid
Autotetraploid - 4 copies of every chromosome due to genome doubling within one species (e.g. chromosome replication without cell division, or unreduced gametes); all chromosomes come from the same species originally. Allotetraploid - 2 different species (both diploid) interbreed to produce an F1 hybrid, followed by genome doubling; 2 different genomes from 2 different sources
For mutagenesis - How would you determine if the genes act in the same or different pathways, and if in the same pathway, how do you determine in which order they act?
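Combine mutations to create double mutants. If phenotype = additive effect of the single mutants, or more severe than either single mutant (synthetic enhancement), the genes likely act in different (parallel) pathways. If phenotype = the same as one of the single mutants (epistasis), the genes act in the same pathway. Order: the mutation whose phenotype is seen in the double mutant is epistatic; in a regulatory pathway where the two single mutants have opposite phenotypes, the epistatic gene acts downstream.
Describe the impact of common alleles vs rare alleles in GWAS
Common alleles: clear association signals, because many individuals in the sample carry them, which gives statistical power. Rare alleles: too few carriers to give enough statistical evidence, so no signal is detected in GWAS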
What are the major types of transposable elements? What are their modes of transposition?
DNA transposons: Class II seq that can excise themselves out of one part of the genome and insert into another ("cut and paste"). Use transposase enzyme as catalyst → recognises the terminal inverted repeats at the ends of the element as the signal for excision. Retrotransposons: Class I group related to retroviruses, also use enzymes to move but via an RNA intermediate ("copy and paste"): the element is transcribed to RNA, reverse transcriptase converts RNA → DNA, and the DNA copy is inserted back into the genome and incorporated as organismal DNA
Describe one scenario where you would use a de novo assembly approach and one where you would use a reference guided alignment. Justify your choice.
De novo - When there is NO reference genome used (.: need to CREATE one first) -- made from overlapping seq's Can be difficult bc of regions w repetitive areas of DNA in genome Reference guided alignment - Align reads to a reference genome → look for differences Used when there are many variants (eg due to polymorphisms) amongst individuals → read is taken and matched to an EXISTING assembly to identify differences
What types of dispersed and tandem repetitive DNA are found in eukaryotic genomes? What is their function, if any?
Dispersed: copies scattered across the genome. Tandem: many copies next to each other at a single location. Micro/minisatellites: sequences that are able to expand and contract at rapid evolutionary paces. Gene families: paralogous genes that have acquired new/different functions. rDNA: ribosomal DNA codes for rRNA, which contributes to the ribosome. Telomeres: prevent loss of genetic information at the ends of chromosomes during replication. Centromeres: site of the kinetochore, to which spindle fibres attach during cell division
How would you perform a genetic screen to identify genes directing Drosophila wing development?
Forward genetics: utilises observable mutations to infer the function of genes (phenotype → genotype) .: can use F3 screen: 1. Mutagenise sperm cells of fly homozygous for a marker 2. Mate with female fly carrying a balancer chromosome 3. Select male F1 progeny with balancer phenotype, mate with female flies carrying balancer chromosome 4. Select F2 progeny with marker and balancer phenotype and intercross Offspring that have the marker but no balancer chromosome will be homozygous for the mutation 5. Screen for abnormal wing development in homozygous mutants
What types of functional and non-functional DNA are found in eukaryotic genomes?
Functional: gene families, rDNA, telomeres, centromeres. Non-functional (no known function): pseudogenes, many tandem repeats, transposed elements
Differentiate between environmental effect and genotype x environment interaction
Genotype x environment interaction is when individuals of different genotypes react differently to the same change in environmental conditions. Environmental effect is the way the environment influences phenotype regardless of genotype; it can be seen as phenotypic differences among individuals with the same genotype
What is the difference between global and local alignment? Describe a case where you would want to use one over the other
Global - align protein or nucleotide sequences from end to end (Needleman-Wunsch algorithm; e.g. MUSCLE for multiple sequences). Used to compare two sequences of similar length that are related over their whole length (eg the same gene between species or organisms). Local - find regions of high similarity (not aligned over entire length), e.g. BLAST (Smith-Waterman gives the optimal local alignment). Compares two REGIONS of similarity (eg conserved promoters, coding regions). Local is better when one sequence is much longer than the other
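The difference between the two modes shows up clearly in the dynamic-programming scores. The sketch below computes only the optimal score (no traceback) for Needleman-Wunsch style global alignment and Smith-Waterman style local alignment; the scoring values (+1 match, -1 mismatch, -2 per gap position) and the sequences are arbitrary choices for illustration.

```python
def alignment_score(a, b, local=False, match=1, mismatch=-1, gap=-2):
    """Optimal global (default) or local alignment score with linear gap penalties."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    if not local:
        # Global alignment must pay for leading gaps.
        for i in range(1, rows):
            score[i][0] = i * gap
        for j in range(1, cols):
            score[0][j] = j * gap
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if local:
                cell = max(cell, 0)        # a local alignment may restart anywhere
                best = max(best, cell)
            score[i][j] = cell
    return best if local else score[-1][-1]

short, long_seq = "ACGT", "GGGGACGTGGGG"
print(alignment_score(short, long_seq, local=True))   # finds the embedded match
print(alignment_score(short, long_seq))               # penalised for 8 gap positions
```

The local score rewards the embedded ACGT match and ignores the flanks, whereas the global score is dragged down by the gaps needed to span the longer sequence, which is why local alignment suits sequences of very different lengths.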
What is a haplotype? What is a haplogroup?
Haplotype: a specific combination of alleles at linked loci that segregate together in a population, rather than individual alleles. Haplogroup: a group of similar haplotypes that share a common ancestor, defined by a set of polymorphisms that are typically in linkage disequilibrium with each other. A haplogroup may contain multiple haplotypes.
Define homolog, ortholog, paralog
Homolog: a gene related to a second gene by descent from a common ancestral DNA sequence Ortholog: a homolog that arose from a speciation event Paralog: a homolog that arose from gene duplication
Two models of how we left Africa
Multiregional model: modern humans emerged gradually and simultaneously from earlier Homo erectus migrations on different continents Recent African origin model: modern humans emerged from a small African population that migrated out of Africa, displacing earlier Homo erectus migrations
What is the evidence that transfer of DNA from the organelles to the nucleus continues to occur? Outline the steps required for a gene originally present in the endosymbiont genome to be transferred to the nuclear genome and be expressed and for its product to be targeted back to the organelle of origin
The nucleus contains genes that originate from the alpha-proteobacterium and cyanobacterium. Ongoing organelle DNA transfer can be detected as nuclear mitochondrial sequences (NUMTs) and nuclear plastid sequences (NUPTs): given the high level of sequence similarity between NUMTs or NUPTs and the respective organelle genome sequences, most are thought to represent evolutionarily recent transfers of organelle DNA to the nuclear genome. Steps: 1. Organelle DNA is transferred to and integrated into the nuclear genome 2. The transferred gene must acquire sequences for proper transcriptional regulation in the nucleus 3. For the protein to be transported back to the organelle, an amino-terminal signal sequence must be attached to it
You have isolated: 1. A streptomycin-resistant mutation (strR) that maps to the chloroplast genome 2. A hygromycin-resistant mutant (hygR) that maps to the mitochondrial genome What types of progeny do you expect from the following reciprocal crosses? mt+ str(R) hyg(S) x mt- str(S) hyg(R) mt+ str(S) hyg(R) x mt- str(R) hyg(S)
Only the mt+ parent's chloroplast genome is inherited, while the mt- chloroplast genome is selectively degraded; the opposite is true for mitochondria, which are inherited from the mt- parent. All the progeny inherit the mt+ chloroplast and the mt- mitochondria, therefore each organelle marker segregates 4:0. Mating type segregates independently with a ratio of 2:2, as is typical for a nuclear gene: 1st cross progeny = 2 mt+ str(R) hyg(R) : 2 mt- str(R) hyg(R); 2nd (reciprocal) cross progeny = 2 mt+ str(S) hyg(S) : 2 mt- str(S) hyg(S)
Reciprocal crosses of experimental animals or plants sometimes give different results in the F1. What are two possible genetic explanations? How would you distinguish between these two possibilities (i.e., what crosses would you perform, and what would the results tell you)?
Two explanations: organellar/maternal inheritance (e.g. mtDNA, cpDNA) or sex linkage (X-linked). - To determine whether it is sex-linked (assuming a recessive mutant allele): o Affected female x wild-type male: all males affected, no females affected but all are carriers o Affected male x wild-type female: no males affected, females are carriers - To determine whether it is mitochondrial (maternal) inheritance: o Affected female x wild-type male: all offspring affected o Affected male x wild-type female: no offspring affected
How do you determine whether genes are orthologous or paralogous? Which are more likely to have similar functions?
Orthologs are genes in different species that diverged through speciation, so they are useful for determining species relationships; paralogs arise through gene duplication within a genome. To distinguish them, build a gene tree and compare it with the species tree: genes separated by a speciation node are orthologs, genes separated by a duplication node are paralogs. Functional specificity is generally assumed to be conserved among orthologs, whereas paralogous proteins usually have different specificity because they act on different targets. Orthologs are therefore more likely to have similar functions.
Describe OLC. What type of sequence data (i.e. Sanger, Illumina, PacBio) is not frequently assembled with this approach, and why?
Overlap-layout-consensus (OLC) is a genome assembly technique. Overlap: align all reads against each other and build a graph in which reads are nodes and overlaps are edges; a longer overlap gives a higher score. Layout: find a path through the graph that orders the reads → there can be many paths or forks if there are many overlaps. Consensus: the reads are aligned along the path and the most common base at each position is taken to give the final sequence. Limitations: overlaps must be scored between all pairs of reads, which is difficult for Illumina data where there are MILLIONS of short reads .: Illumina reads are not frequently assembled with OLC (de Bruijn graph assemblers are generally used instead). It is also computationally challenging → the graph has one node per read .: it becomes very large and complicated to assemble.
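The three OLC steps can be sketched as a toy Python example. This is a minimal illustration, not how a production assembler works: it assumes error-free, distinct reads from one strand, uses exact suffix/prefix matching for overlaps, and uses a simple greedy walk for the layout. All function names and the `min_len` parameter are hypothetical.

```python
from collections import Counter
from itertools import permutations


def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if shorter than min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def build_overlap_graph(reads, min_len=3):
    """Overlap step: score every ordered pair of reads (quadratic in the number of reads)."""
    graph = {}
    for a, b in permutations(reads, 2):
        n = overlap(a, b, min_len)
        if n:
            graph[(a, b)] = n
    return graph


def layout(reads, graph):
    """Layout step: greedy walk that always follows the highest-scoring unused overlap."""
    targets = {b for (_, b) in graph}
    starts = [r for r in reads if r not in targets]   # reads with no incoming overlap
    current = starts[0] if starts else reads[0]
    path, offset, used = [(current, 0)], 0, {current}
    while True:
        candidates = [(n, b) for (a, b), n in graph.items() if a == current and b not in used]
        if not candidates:
            break
        n, nxt = max(candidates)
        offset += len(current) - n                    # where the next read starts in the layout
        path.append((nxt, offset))
        used.add(nxt)
        current = nxt
    return path


def consensus(path):
    """Consensus step: majority vote over the bases stacked at each layout position."""
    columns = {}
    for read, offset in path:
        for i, base in enumerate(read):
            columns.setdefault(offset + i, Counter())[base] += 1
    return "".join(columns[i].most_common(1)[0][0] for i in sorted(columns))


reads = ["GCGTACC", "ATGGCGT", "TACCTGA"]
graph = build_overlap_graph(reads)
contig = consensus(layout(reads, graph))
print(contig)
```

The all-against-all loop in `build_overlap_graph` is the step that scales poorly: the number of read pairs grows with the square of the number of reads, which is the limitation described above for very large short-read datasets.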
What are the major components of phenotypic variation? What are the major components of genetic variability?
Phenotypic variation: Vt = Vg + Ve (total = genetic + environmental). Genetic variation: Vg = Va (additive) + Vd (dominance) + Vi (epistasis)
What is the difference between a Southern and a Northern blot?
A Southern blot detects DNA; a Northern blot detects RNA
If you were to compare your genome sequence with another student's in the class, how would it differ? If you were to compare your genome sequence with that of your parents, how would it differ?
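Students: different SNPs and CNVs, largely reflecting different ancestry. Parents: you carry half of each parent's variants, so the differences come from which alleles were inherited plus a small number of de novo mutations, including de novo CNVs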
What is the c-value paradox? What is the answer to the C value paradox?
The C-value paradox states that genome size among eukaryotes does not correlate with gene number or organismal complexity. The answer is that genome size does not reflect gene number in eukaryotes, since most of their DNA is non-coding and therefore does not consist of genes
What is meant by synthetic enhancement? How are there limitations in interpretation with different types of alleles?
The effect of one mutation exacerbates that of another. Classic case: two pathways perform the same essential function; mutation of either alone (null alleles) is inconsequential, but mutation of both results in loss of the essential function. Caveat: if the mutations are not null, a double mutant may have a more severe phenotype simply because two hypomorphic alleles act in the same pathway, i.e. parallel pathways CANNOT be inferred with leaky alleles; the alleles MUST be NULL
What is a quantitative trait locus (QTL)?
A quantitative trait locus is a region of the genome that accounts for a significant proportion of the phenotypic variation in a quantitative trait.
"Heritability is always less than or equal to 1". True, or false? Why?
True. It is the proportion of phenotypic variation that can be accounted for by genetic variation. Given that it is a proportion, it must be less than or equal to 1.
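A short worked example shows why the proportion cannot exceed 1. It uses the variance components from the earlier card (Vt = Vg + Ve, Vg = Va + Vd + Vi); the numbers are made up for illustration, and the function name `heritability` and the labels `H2` (broad sense, Vg/Vt) and `h2` (narrow sense, Va/Vt) are conventions chosen for this sketch.

```python
def heritability(va, vd, vi, ve):
    """Return broad-sense (Vg/Vt) and narrow-sense (Va/Vt) heritability.

    All inputs are variances, so they are assumed to be non-negative.
    """
    if min(va, vd, vi, ve) < 0:
        raise ValueError("variance components cannot be negative")
    vg = va + vd + vi          # total genetic variance
    vt = vg + ve               # total phenotypic variance
    if vt == 0:
        raise ValueError("no phenotypic variance to partition")
    return {"H2": vg / vt, "h2": va / vt}


# Illustrative (made-up) variance components.
result = heritability(va=4.0, vd=1.0, vi=1.0, ve=4.0)
print(result)
```

Because Vg is one part of Vt and every component is non-negative, Vg/Vt reaches 1 only when Ve is zero, and Va/Vt can never exceed Vg/Vt.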
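Describe the evolution of tandemly repeated genes.
Tandemly repeated genes evolve by unequal crossing over during meiosis: misaligned repeats recombine, so one product gains copies and the other loses copies. Repeated rounds expand or contract the array.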
What are the major types of repetitive elements and how can they impact genome evolution?
Transposable elements: - Make up the greatest percentage of the genome - Source of mutations (gene insertion, changes in expression, exonisation, structural changes) - Main cause of differences in genome size among species - Insertions can interrupt functional sequences, changing gene function, expression and splicing; they can also promote ectopic recombination - Copying of retrotransposons expands the genome - All of these contribute to evolutionary diversity in the genome.
You have isolated two petite mutants, pet1 and pet2 in Saccharomyces cerevisiae. When pet1 is mated with wild-type yeast, the haploid products following meiosis segregate 2:2 (wild type: petite). In contrast, when pet2 is mated with wild type, all haploid products following meiosis are wild type. To what class of petite mutations does each of these petite mutants belong? What types of progeny do you expect from a pet1 x pet2 mating?
Wt x pet1 gives 2:2, so pet1 is a segregational petite: the mutation is in the nuclear genome. Wt x pet2 gives 4:0 all wild type, so pet2 is a neutral petite lacking most or all of its mitochondrial DNA. Pet1 x pet2: expect 2:2 (wild type : petite), because pet1 contributes normal mtDNA and pet2 contributes a wild-type nuclear allele, so only the nuclear pet1 mutation segregates.
Do you or your microbiome possess more genes?
Your microbiome