BIOL 3010: final study questions


Define "life," at least according to NASA.

"a self-sustaining chemical system capable of Darwinian evolution"

Besides their roles in instigating wars, famines, and a variety of other plagues on society, men contribute de novo point mutations to their offspring at significantly higher rates than do women. What evidence supports this conclusion and what is a reasonable explanation for why it might be so? What about major aneuploidies, rather than point mutations?

- 1 to 1.5 new mutations per 10^8 nt: every child is born with about 45 to 60 de novo mutations
- 3.9:1 ratio of de novo mutations on paternal alleles compared with maternal alleles
- male germ-line cells have undergone about 160 replications by age 20 and 610 by age 40
- male germ cells constantly undergo mitosis and meiosis, which allows errors to accumulate over time
- female gametes are already established at birth
- major aneuploidies are more common in female gametes because of the long meiotic arrest: meiosis starts during fetal development, but meiosis I doesn't finish until oocyte maturation and meiosis II not until fertilization

What is the incidence of red-green color blindness in humans and why is it most often observed in males? What is a common cause of this condition? Can affected individuals pass on the trait or must it arise de novo (i.e., new) each generation?

- about 8% of males have red-green color blindness
- commonly results from unequal crossing over on the X chromosome; it mostly affects males because they have only one copy of the X
- opsin proteins of the cone photoreceptors are sensitive to long, medium, or short wavelengths (L, M, S)
- L and M genes are almost identical due to a recent duplication
- opsin genes are adjacent to TEX28 and TEX28P, which are almost identical and were likely duplicated along with the opsin
- during crossing over, a strand in L opsin may pair with the matching sequence in M opsin, or TEX28 with TEX28P, leading to gene fusions or losses
- the trait does not have to arise de novo: affected males cannot pass it to sons (sons receive the Y), but they pass their X to every daughter, who will at least be a carrier
- females can pass the trait on to the next generation because their X chromosomes go to both sons and daughters

Distinguish ATP-dependent chromatin remodelers (e.g., SWI/SNF or chromodomain factors) from histone modifying enzymes (e.g., HDACs, HMTs). What are the shared features of ATP-dependent chromatin remodelers?

- ATP-dependent remodelers work by nucleosome sliding, displacement, and modification
- ATP-dependent chromatin remodelers share: affinity for the nucleosome, domains that recognize histone modifications, similar ATPase domains for overcoming nucleosome-DNA interactions, and domains for interactions with other proteins
- major families: SWI/SNF, CHD (chromodomain), ISWI
- ex. CHD1: a large class of proteins that, once recruited to a nucleosome, can remodel chromatin in that region
- histone-modifying enzymes often change affinity by changing charge (adding acetyl or methyl groups)
- ATP-dependent remodelers can also associate with the nucleosome and DNA and expose sites (specific cis-regulatory elements) for DNA-binding proteins; they can do this by site exposure (repositioning, ejection, or unwrapping) or by altered composition (exchanging dimers or histone variants, or ejecting dimers from nucleosomes)

What are some of the major technological advances that have facilitated genetic and genomic analyses since ~1970?

- DNA sequencing 1977 with the Sanger method
- invention of PCR by Kary Mullis in 1983
- human genome sequenced 1986-2000
- targeted genome editing 2010 (CRISPR)

In what direction is the DNA template read and in what direction is a new strand synthesized?

- DNA template is read from 3' to 5' and new strands are synthesized from 5' to 3'
- DNA pol can only add to the 3' end of the growing strand
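
The directionality rule can be sketched in a few lines of Python. This is only an illustration, not a model of the polymerase; the function name `synthesize_new_strand` and the example sequence are made up for this sketch.

```python
# Watson-Crick base pairing for DNA
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def synthesize_new_strand(template_3_to_5: str) -> str:
    """Return the new strand, written 5'->3', for a template written 3'->5'.

    The template is walked 3'->5' and each complementary nucleotide is
    appended to the 3' end of the growing strand, so the new strand
    grows 5'->3'.
    """
    new_strand = ""
    for base in template_3_to_5.upper():  # read the template 3'->5'
        new_strand += PAIR[base]          # add to the 3' end only
    return new_strand

template = "TACGGT"  # hypothetical template, written 3'->5'
print("template 3'-" + template + "-5'")
print("new      5'-" + synthesize_new_strand(template) + "-3'")
```

Because nucleotides are only ever appended, the first base added ends up at the 5' end of the product and the last base added at the 3' end.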

What are two examples of ribosomopathies and what are their genetic bases?

- Diamond-Blackfan anemia: 4-5 cases per million live births; requires bone marrow transplant; RPS19, RPS24, or RPS17 mutations impair 18S rRNA production, which leads to fewer 40S ribosomal subunits and depletion of mature ribosomes; RPL5, RPL11, or RPL35A mutations can instead deplete the 60S subunit and mature ribosomes; with too few functional ribosomes the result is anemia, because cells need vast amounts of hemoglobin and red and white blood cells turn over constantly, so cells that cannot make enough protein are the ones affected
- Treacher Collins syndrome: mandibulofacial dysostosis affecting 1 in 10,000 to 50,000 live births; problems with airway, swallowing, brain development, and hearing; affected individuals are heterozygous for any of about 120 mutations
- TCOF1 (treacle ribosome biogenesis factor, about 93%): treacle localizes to the nucleolus, where rRNAs associate with ribosomal proteins; it interacts with ribosomal DNA and recruits RNA Pol I to the nucleolus; without treacle, Pol I no longer localizes to the nucleolus and does not make the rRNAs that are packaged into ribosomes
- POLR1C or POLR1D (about 6%): RNA Pol I transcribes the rRNA genes other than 5S rRNA
- the issue in this disease is that there are not enough ribosomes because Pol I is not localizing; usually only one allele is mutated, because with both mutated there would be no functional ribosomes and the embryo would not survive; severity ranges with the gene affected and the nature of the allele

What is RNA Pol II pausing and how is it regulated? Speculate on why pausing might occur.

- RNA Pol II has an early elongation phase, called promoter escape, of 20-60 nucleotides and then pauses
- stopping here is another check on whether the gene is going to be transcribed
- pausing depends on interactions with the Mediator complex, NELF, DSIF, promoter elements, and stable RNA/DNA hybrids that may be hard to break (ex. C-G pairs have 3 H bonds)
- proteins and elements of the promoter itself are involved in the pausing
- productive elongation then continues rapidly through termination
- pausing may keep genes POISED for expression and allow additional regulation and signal integration while maintaining access to the PIC

Roberts syndrome (resulting from mutations in ESCO2) is just one example of many different congenital malformations that affect human populations. How does age-specific mortality from congenital malformations compare to other causes of death? What is meant by "collectively common but individually rare" and what are the implications for studying such disorders?

- Roberts syndrome: mutations in ESCO2, which is essential for acetylating cohesins and keeping them together; in this disease sister chromatids cannot stay together and dissociate
- congenital disorders are the number one cause of death in infants under 1 year
- 2 times as many deaths as cancers for young people overall
- substantial human suffering, worse in developing nations
- difficult to study because collectively common but individually rare (each disease is rare, but as a group they are very common)
- deaths from congenital malformations exceed deaths due to Alzheimer's, stroke, diabetes, flu, etc.
- not much funding because individually rare
- healthspan vs. lifespan

What are the major regions of human sex chromosomes? What evidence indicates roles for the SRY gene in specifying male characteristics?

- X chromosome has about 850 genes and Y chromosome about 50 genes
- SRY (sex-determining region of Y) is found only on the Y chromosome and activates testis development, determining maleness
- MSY (male-specific region of the Y) contains 3 male fertility genes; there are also 8 genes shared between X and Y
- PAR1 and PAR2 are pseudoautosomal regions, which have around 30 regions that are homologous between X and Y; these are near the tips but not at the telomeres

What are some of the general ways in which transcription factor activity can be regulated (for transcription factors acting as activators or repressors)? Explain one example in which a single transcription factor can have dual roles.

- activators can be regulated by: transcription, phosphorylation, ligand binding, cofactor interactions, and cleavage from inactive precursors
- repressors can act by: interacting directly with the PIC and preventing binding, or promoting closed chromatin by histone modification or by recruiting chromatin-closing enzymes; mechanisms to prevent transcription include competition, quenching, cytoplasmic sequestration, and heterodimerization
- specific example of dual roles, the T3 receptor: T3 binds to the TH receptors, which converts them from repressors to activators for many genes; when T3 is bound, other TFs bind to promote histone acetylation and opening of chromatin, allowing the PIC and RNA Pol II to bind and transcribe; when T3 is NOT bound, repressors bind at the TH receptor site in the DNA, which leads to histone deacetylation, so the PIC and RNA Pol II cannot bind and the genes are not transcribed

How does ionizing radiation damage genomes? Are there different outcomes that depend on dose received or type of radiation/radionuclide?

- acute radiation syndrome from gamma rays: direct and indirect DNA damage; death through loss of proliferative stem cells in the bone marrow, skin, and gut
- dispersal of radionuclides: I-131 is associated with increased papillary thyroid cancer
- damages genomes by breaking dsDNA, after which repair mechanisms have to work to fix it
- damage due to DS breaks is essentially random, but there are recurrent fusions of particular genes; certain fusions affect certain cells and cause problems, and only a few (RET-ELE1 and ETV6-NTRK3) were very common in thyroid cancer

Distinguish between additive and non-additive effects of alleles.

- additive: what happens at one locus adds to the effects of what happens at the other locus; each gene makes its own particular contribution to the phenotype; still a 9:3:3:1 phenotypic ratio
- non-additive: like epistasis, where an allele of one gene affects the outcome of another gene's alleles (dominant/recessive relationships are also non-additive); other epistatic interactions include redundancy (maize leaf width depends on genotype at more than one locus, 15:1 phenotypic ratio); phenotypic ratios vary, NOT 9:3:3:1
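
The two ratios mentioned above can be checked by enumerating the 16 equally likely gamete combinations of an AaBb x AaBb cross. This is a minimal sketch assuming two unlinked loci with complete dominance at each; the function name and phenotype labels are made up for illustration.

```python
from collections import Counter
from itertools import product

def f2_phenotype_counts(phenotype_of):
    """Count phenotypes among the 16 equally likely gamete pairings
    of an AaBb x AaBb cross (two unlinked loci)."""
    gametes = ["AB", "Ab", "aB", "ab"]
    counts = Counter()
    for g1, g2 in product(gametes, repeat=2):
        has_A = "A" in (g1[0], g2[0])  # at least one dominant allele at locus A
        has_B = "B" in (g1[1], g2[1])  # at least one dominant allele at locus B
        counts[phenotype_of(has_A, has_B)] += 1
    return counts

# Each locus contributes independently to the phenotype: 9:3:3:1
independent = f2_phenotype_counts(lambda a, b: (a, b))

# Redundant loci: one dominant allele at either locus is enough: 15:1
redundant = f2_phenotype_counts(lambda a, b: "normal" if (a or b) else "mutant")

print(sorted(independent.values(), reverse=True))
print(dict(redundant))
```

Only the rule mapping genotype to phenotype changes between the two calls; the underlying genotype frequencies are identical, which is why a modified ratio points to gene interaction rather than to unusual segregation.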

What are the phases of mitosis and their associated events?

- after S phase the cell is transiently 4N, with two essentially identical copies of each homolog
- the copies are joined at the centromere; while still together they are called sister chromatids
- eukaryotic somatic cells are 2N, with 2 complete sets of chromosomes, one from each haploid (N) gamete
- phases of mitosis:
- prophase: chromosomes condense, centrosomes move apart, microtubules appear, nucleoli disappear
- prometaphase: nuclear envelope breaks down, sister chromatids attach to microtubules from opposite centrosomes
- metaphase: chromosomes align on the metaphase plate with sister chromatids facing opposite poles
- anaphase: connection between the centromeres of each pair of sister chromatids is severed; the now separated sister chromatids move to opposite poles
- telophase: nuclear membranes and nucleoli reform, spindle fibers disappear, chromosomes uncoil and become tangled chromatin
- cytokinesis: cytoplasm divides, splitting the elongated parent cell into two daughter cells with identical nuclei

How can we understand aneuploidies in sex chromosomes based on regions that do not function in sex determination per se? What are these common aneuploidies?

- aneuploidies of autosomes are usually lethal but can be viable when they involve the sex chromosomes
- Turner syndrome (XO women): one X chromosome is missing; in somatic cells one X would be inactivated anyway, but both copies of the PAR containing SHOX are still needed, so one copy is missing; low SHOX dosage leads to short stature; gametes need both X copies, so it leads to infertility; also causes extra skin folds and heart defects; about 1 in 2,500 female births
- Klinefelter syndrome (XXY): cryptorchidism, hypospadias, increased breast cancer risk; men are usually tall and long-limbed; mild cognitive impairment; presence of SRY drives maleness; somatic cells have 3 copies of the pseudoautosomal regions, so genes like SHOX are overexpressed, which drives larger size; problems in the germ line because of the excess chromosome

How do fundamental aspects of genetic pathways contribute to genetic robustness (and the observation that mutant alleles are often recessive)?

- architecture of genetic pathways provides inherent buffering capability
- linear pathways of genes: if there are multiple enzymes/genes in the pathway and one component is nonfunctional or mutated, the rest of the pathway still provides buffering capacity; if there are 9 enzymes and one is less active than normal, you still get 90-95% of the product; a large effect on one component still leaves a lot of end product
- gene C depends on the product of gene B: if enzyme C is not working as efficiently, its substrate should accumulate, so more substrate is available to get through
- effects of a mutant allele: a heterozygous or homozygous gain-of-function allele in one enzyme may increase output slightly but has no real impact on the amount of product made, because the other genes in the pathway are unchanged; a heterozygous loss-of-function allele will also likely have a small effect, because there is less product but it is still readily observable (95%); it may only be when homozygous for loss of function that there is a drop in product significant enough to be considered mutant
- as long as a pathway has multiple components there will be some buffering
- pathway architecture: 240,000 randomly chosen values for 48 parameters were tested and 1 in 200 were normally functional; with a reasonable range for each parameter, 90% chance of the correct outcome even over several orders of magnitude; shows a huge amount of variation in parameter values that worked; a complex regulatory network has potential for lots of buffering; individual values are quite variable but can still work; feed-forward mechanisms might help promote this robustness

Studies of sea urchins and other species have allowed the elucidation of "gene regulatory networks" for development of specific traits (like the skeleton in sea urchin larvae). How are the links in a gene regulatory network tested experimentally? What do the different symbols mean in a gene regulatory diagram (e.g., arrows vs. "tacks")?

- arrows mean one gene promotes expression of another
- tacks mean it represses expression
- can follow the development of sea urchins from mesoderm to micromeres to PMCs to skeletogenic cells to see the impact of gene regulatory networks
- see the impact of the TF beta-catenin: need the right amount, not too much (no skeleton forms, only endoderm) and not too little (no skeleton, only ectoderm)
- can see how the development of the mesoderm is a double-negative gate (some TFs activated by beta-catenin inactivate repressors of genes needed for the micromeres to form)
- can see how feed-forward has multiple inputs from multiple TFs to allow differentiation of micromeres to PMCs; if one is totally knocked out, differentiation arrests

What are the major steps required for single cell RNA-sequencing? What kind of information comes out of the approach? What doesn't? Does scRNA-Seq tell you about gene function or cell lineage?

- assay the whole transcriptome of a cell and compare it to the whole transcriptome of any other cell
- transcription is now assayable across states of differentiation for single cells by scRNA-seq
- allows whole-organism reconstruction of expression changes during development, relationships among cells, and identification of new genes likely to function in specific cell types
- strategy: take tissue or embryos at a certain stage and dissociate the cells from one another
- make a cDNA library: use reverse transcriptase to make cDNA from RNA, then use the cDNA for next-generation sequencing; done with microfluidics
- each cDNA has a nucleotide tag attached (each cell has its own identifier); the tag shows which precise cell each molecule came from
- scSeq library: can pool everything back together and sequence on an Illumina sequencer
- gather cells at each time point from embryos at different stages of development (count the number of times any given mRNA appears in the sample)
- can order cells by time and lineage to get trajectories of differentiation
- trajectories can follow true lineages, but they come from looking at many different cells, not from following one single cell as it changes (called pseudotime)
- can also see expression of individual genes (how particular genes change over time)
- various ways to determine WHAT gene expression changes are occurring
- doesn't tell you WHY certain things turn on at certain points
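
The counting step (use the cell tag to assign each sequenced molecule to a cell, then count how often each mRNA appears) might be sketched as below. This is a simplified illustration: the tag sequences and gene names are made up, and handling of PCR duplicates, which real pipelines must deal with, is ignored.

```python
from collections import Counter, defaultdict

def count_by_cell(reads):
    """Build {cell_tag: Counter({gene: n_reads})} from (cell_tag, gene) pairs.

    Each read is assumed to have already been matched to a gene; the
    cell tag identifies which cell the cDNA molecule came from.
    """
    counts = defaultdict(Counter)
    for cell_tag, gene in reads:
        counts[cell_tag][gene] += 1
    return counts

# Hypothetical reads: two cells, two genes
reads = [
    ("AACG", "geneA"),
    ("AACG", "geneA"),
    ("AACG", "geneB"),
    ("TTGC", "geneB"),
]
counts = count_by_cell(reads)
for cell_tag, per_gene in counts.items():
    print(cell_tag, dict(per_gene))
```

The result is a cell-by-gene table of expression levels, which says what is expressed in each cell but nothing about why, and nothing direct about gene function.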

What explains the difference between the numbers of codons and the types of tRNAs found in eukaryotic cells?

- at least one tRNA for each amino acid, but not necessarily all 61 corresponding to each sense codon
- the wobble position accommodates degeneracy through promiscuous pairing of standard and modified bases
- a given Phe tRNA can recognize more than one Phe codon
- modified nucleotides can recognize several bases; for example, I (inosine) in the anticodon can pair with U, C, or A in the codon
- don't need 61 tRNA genes to recognize all 61 sense codons
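
The codon numbers in this answer can be verified by building the standard genetic code table and counting. This is a small sketch; the variable names are arbitrary, and the table is the standard nuclear code written in U, C, A, G order.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons in the order UUU, UUC, UUA, UUG, UCU, ..., GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Sense codons are all codons that are not stop codons ("*")
sense = {codon: aa for codon, aa in CODON_TABLE.items() if aa != "*"}

n_codons = len(CODON_TABLE)
n_sense = len(sense)
n_amino_acids = len(set(sense.values()))
phe_codons = sorted(codon for codon, aa in sense.items() if aa == "F")

print(n_codons, n_sense, n_amino_acids)
print(phe_codons)
```

The two Phe codons differ only at the third position, which is the position where wobble pairing lets a single tRNA read both.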

Once a critical mapping interval has been defined, how might one determine which gene is likely to be responsible for a given mutant phenotype? What are some approaches that could be used to ensure the correspondence of gene and phenotype is correct?

- can look at the sequence of the exons of genes within the critical interval and compare to the WT to see if there are any mismatches (ex. nt substitutions)
- once you have a candidate, you don't know for sure that it is what causes the mutant phenotype and have to check
- several methods:
1. find more alleles to see if mutation of this gene is always associated with the phenotype; can do a complementation test and see if other mutant alleles crossed with this mutant allele give a mutant phenotype or WT; if mutant, then it is likely that both affect the same gene
2. replace the lost gene product (by transgenesis) and see if the mutant phenotype is rescued
3. find or make a "revertant" allele that restores the WT sequence at the candidate lesion to see if the phenotype is restored to WT (with gene editing like CRISPR, can switch one nt back and see what happens)
- for jam3b, three different AA substitutions lead to mutant phenotypes (all in the same gene), so the pissarro phenotype must result from mutations in the jam3b gene

Define "chromatin" and its composition. What are some differences between heterochromatin and euchromatin?

- chromatin is the material of chromosomes, a complex of proteins and DNA found in eukaryotic cells
- chromatin is about 1/3 each: DNA, histone proteins, and non-histone proteins
- a nucleosome is made of two copies of each of four core histones, plus histone H1
- histone sequences are highly conserved
- euchromatin: open chromatin, more accessible to RNA polymerase and transcription factors
- heterochromatin: closed chromatin, less accessible to RNA polymerase and TFs, unlikely to be transcribed, often gene poor and repeat rich, dark staining and localized to the periphery of the nucleus

Explain the roles for splice site enhancer/suppressor sequences and protein in determining whether splicing occurs at a particular site.

- cis regulators are RNA motifs recognized by proteins that determine whether a splice site is used (ex. exonic/intronic splicing enhancers/suppressors, such as ESEs)
- trans regulators are proteins that bind the RNA cis motifs: serine/arginine-rich (SR) proteins promote splice site usage and compete with repressive heterogeneous nuclear ribonucleoproteins (hnRNPs) and tissue-specific regulators (these tend to repress splice sites)
- this occurs AT THE LEVEL OF RNA: the factors only recognize the RNA after it is transcribed; different combinations of factors at different times lead to different splice forms
- possible for introns to be retained; a premature stop codon causes the whole transcript to be targeted for degradation

Distinguish between cis- and trans-regulatory factors (relative to some gene of interest).

- cis-regulatory factors (enhancers) are regions of the DNA sequence that serve as binding sites for TFs
- trans-regulatory factors, or transcription factors, are proteins that bind to cis-regulatory elements and serve as either activators or repressors of transcription

What are the steps typically used when attempting to generate a high quality "reference" genome sequence? What is the meaning of sequencing "coverage" and why is it important to sequence more DNA than is found in an individual genome?

- clone-by-clone sequencing is the best method: the genome is fragmented into large pieces, which are put in bacterial artificial chromosomes as an intermediate; the clones are then assembled in order using the ends of the sequence or DNA fingerprinting; each clone is then broken into small pieces, one clone at a time, and sequenced; the fragments of each clone are ordered, and the sequences of overlapping clones are joined to get the reference sequence
- sequence coverage is important because the genome must be sequenced many times over to ensure everything is covered; some fragments are rarer than others; like "rain drops on a sidewalk," not every nucleotide is hit in the first pass; have to sequence the equivalent of the entire genome 50x to get good coverage
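
The "rain drops on a sidewalk" picture can be made quantitative under a simple assumption that is not stated in the notes: if reads land uniformly at random, the number of reads covering a given base is approximately Poisson, so the fraction of bases never hit is e^(-c) at mean coverage c. The read count, read length, and genome size below are illustrative numbers only.

```python
import math

def mean_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return n_reads * read_length / genome_size

def fraction_missed(coverage: float) -> float:
    """Expected fraction of bases hit by zero reads, assuming reads land
    uniformly at random (Poisson approximation)."""
    return math.exp(-coverage)

# Illustrative numbers: 1 million reads of 150 nt on a 3 Mb genome
c = mean_coverage(1_000_000, 150, 3_000_000)
print("mean coverage:", c)

for cov in (1, 5, 10, 50):
    print(f"{cov}x coverage -> fraction of bases missed: {fraction_missed(cov):.2e}")
```

At 1x coverage more than a third of the genome is expected to be missed, which is why sequencing "one genome's worth" of DNA is never enough; real genomes also have regions that are harder to clone or sequence, so actual gaps are worse than this idealized estimate.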

How do cohesins regulate sister chromatid pairing, and how are cohesin distributions and dynamics regulated across cell cycle phases?

- cohesin complexes contain several different proteins (SMC3, SMC1A, etc.) and form what appears to be a ring, which is labile when first made and has to be able to open up onto DNA
- cohesin loading occurs in G1
- the cohesin complex is acetylated by ESCO1/2 during S phase, which stabilizes the complex and allows cohesion of sister chromatids after DNA replication, keeping the sisters close and locked together
- cohesin is partially removed during prophase, EXCEPT at the centromeres: cohesins along the arms are phosphorylated, which causes dissociation of the complex from the sister chromatids, while cohesins at the centromere stay on and keep the chromatids bound to one another
- complete dissociation of cohesins and of the chromatids at anaphase, by separase (cohesins are recycled to the next cell cycle)

What is a copy number variant and how common are such variants? What evidence supports roles for CNVs in affecting phenotypes?

- the number of copies of a gene varies from one individual to the next
- CNVs are a major class of structural variation, typically regions larger than 50 bp (median 1 kb, mean 11 kb)
- 4.8%-9.5% of the whole genome
- some chromosomes have a higher percentage of CNV: about half of chromosome 19 has CNV (because it is a smaller chromosome, so not as much total DNA as chr1)
- about 12% of all genes have CNVs
- around 20% of noncoding genes have copy number variants
- ex. Ridgeback dogs: the ridge is a dominant trait involving duplication of a 33 kb region that encodes 3 fibroblast growth factor (FGF) genes; ridged dogs have twice as much of this DNA as those homozygous for no duplication
- dogs homozygous for the duplication have 12 FGF gene copies, which leads to dermoid sinus

How can recombination between phenotype and genotypes at marker loci be used to define a critical genetic interval? What does "genetic distance" refer to in this context and what are its units? Why are marker loci more likely to be homozygous the closer they are to the mutant locus? What about marker loci on other chromosomes?

- cross a mutant of one genetic background to WT of another genetic background
- the F1 will be heterozygous
- in the gametes from the F1 there are regions of crossing over and recombination
- in mutants the affected locus must be homozygous for the mutation, so marker loci at or near the mutation are also likely to be homozygous for alleles of the same background; marker loci farther away may have undergone crossing over and carry a recombination of alleles from the two genetic backgrounds; marker loci on other chromosomes assort independently of the mutation
- when F1 individuals are crossed with one another, should get around 25% mutants and 75% WT
- the mutants MUST be homozygous at the mutant gene, and the marker loci around it are likely to also be homozygous for one genetic background
- mismatches between the phenotype and the marker genotype expected from the parental background: if the marker is heterozygous or homozygous WT in a mutant, then the mutation cannot be at that locus
- genetic distance is in centimorgans (100 x number of recombinants / number of meioses)
- critical interval is the region with the fewest recombinants in the mutant progeny; this is the region with the greatest likelihood of containing the affected gene
- more individuals screened, more informative recombinants, narrower critical interval
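
The centimorgan formula above is simple enough to put in a short sketch. The marker names and recombinant counts below are invented for illustration; the only thing taken from the notes is the formula itself.

```python
def genetic_distance_cM(n_recombinants: int, n_meioses: int) -> float:
    """Genetic distance in centimorgans: 100 x recombinants / meioses."""
    if n_meioses <= 0:
        raise ValueError("n_meioses must be positive")
    return 100 * n_recombinants / n_meioses

# Hypothetical data: recombinants seen between each marker and the mutant
# phenotype, out of 400 meioses scored in mutant progeny
n_meioses = 400
recombinants = {"m1": 12, "m2": 3, "m3": 0, "m4": 5}

distances = {
    marker: genetic_distance_cM(n, n_meioses)
    for marker, n in recombinants.items()
}
closest = min(distances, key=distances.get)

for marker, d in distances.items():
    print(f"{marker}: {d:.2f} cM")
print("closest marker:", closest)
```

In this made-up example, m3 shows no recombinants, so the mutation lies nearest to it, and the flanking markers that still show recombinants (m2 and m4) bound the critical interval. Scoring more meioses is what makes small distances resolvable.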

Why is aneuploidy deleterious? Why might aneuploidies for chr1 be early-lethal but chr21 viable?

- deleterious because it causes dosage imbalance among genes
- trisomy of chr1 is almost always lethal because it is a larger chromosome encoding more genes; smaller chromosomes like chr18 contain fewer genes, so an extra copy unbalances fewer genes and is not as lethal

Explain how CRISPR/Cas9 systems can be used to target specific genes for inactivation or for introduction of new sequence.

- derived from studies of bacterial immune systems and how bacteria protect themselves from viruses
- nuclease-specific DSBs can be repaired by homology-directed repair (optimal strategy for repair, no error introduced, uses homology arms to build in the sequence to be inserted) or by non-homologous end joining (rejoins the two ends without a template, more error prone, a few nt deleted/inserted)
- CRISPR requires:
- CRISPR repeats and protospacers (segments of DNA that, when transcribed to RNA, match the target gene to be disrupted; each has a unique sequence, which gives target specificity), which come together to make the crRNA
- the crRNA associates with the tracrRNA, which is the bridge between the crRNA and the endonuclease (Cas9)
- the targeting region recognizes the target in the DNA and brings Cas9 to it, and a DSB is made
- people have made crRNA-tracrRNA fusions so that only one RNA is needed to do CRISPR (called gRNA)
- can use homologous recombination to insert a new sequence, or can let the break repair itself by NHEJ (may get a few nt differences)

What evidence suggests that mutations, even when homozygous, can sometimes lack consequences for the phenotype?

- example: normal versus skinny leaves in maize; plants homozygous for mutations at the A or the B locus alone have the normal phenotype; only plants homozygous mutant at both A and B have skinny leaves; as long as A or B has one normal allele the plant is fine, which shows that genes A and B are redundant
- humans also carry many mutations: whole-genome sequences screened for loss-of-function alleles show about 100 loss-of-function alleles, with about 20 genes completely inactivated
- Icelandic population: about 8% of the population is homozygous for at least one LOF allele
- 100K genomes from Asian populations, all protein-coding genes: 8,800 genes with protein-truncating variants (causing loss of function); of these, 858 are found homozygous, showing almost complete or complete loss of function
- loss-of-function alleles per individual: about 14 in the heterozygous state and about 0.094 in the homozygous state (about 1 person in ten), with rare mutations (PTC, mis-splicing, FS) all still leading to viable embryos
- we can have these and still be okay

What observations and experimental evidence suggest that self-replicating ribozymes might have existed in nature?

- extant ribozymes do not have the replicase activity needed for an RNA world, but RNA polymerase ribozymes have been selected in the lab that add 1 to 3 nucleotides at a time
- able to synthesize a 6-nucleotide segment but not one with hairpin folding structures
- after six rounds of selection, the ribozyme was able to get through 3 nucleotides at a time and through secondary structures
- RNA world:
1. RNA molecules with catalytic activities assembled from a nucleotide soup
2. RNA molecules evolve and diversify by self-replication, mutation, and recombination, providing material for selection
3. RNA molecules begin to synthesize proteins, first by making adapter RNAs that bind activated amino acids; the proteins then improve on ribozyme-only functions
4. DNA appears, with its double strand a more stable way to hold genetic information and allow error correction

Successive losses of individual genes after whole genome duplication means that only very recently duplicated genomes will actually have twice as many genes as their ancestral genome, or the genomes of sister species. What does it mean for a gene to be "lost"?

- genes are lost through rediploidization
- rediploidization: conversion from tetraploid to diploid pairing in meiosis

What are some exceptions to the general rule of genomic equivalence across cell types of an organism?

- genomic imprinting (DNA methylation) of gamete pronuclei: genes are not lost or gained but shut down and kept from being expressed over long time periods; chemical modifications to the genome, in some cases heritable across generations
- immunoglobulin gene variable regions in mammals
- rRNA gene amplification in frog eggs: frog eggs make many copies, so massive amounts of rRNA and ribosomes are made and enormous amounts of protein can be made at fertilization
- programmed genetic reduction: the embryo starts with complete genomes, but as cells differentiate they lose whole sets of the genome and cannot go back (ex. lampreys, hagfish, sciarid flies, roundworms, copepods); the genome is preserved entirely only in the germ cells; many TFs that function in stem cells and would be repressed by H3K27me3 are instead totally deleted from the cells, whereas in other vertebrates they would just be silenced or shut down

How can one use a complementation test to assess whether two independently isolated mutants arise from mutations in the same gene? Are there any caveats to using or interpreting complementation tests?

- have similar mutant phenotypes from each of two families
- can cross them to see if they represent mutant alleles of the same gene or mutations in different genes
- if alleles of the same gene: offspring will have a mutant phenotype because both copies of that gene are mutant
- if mutations in different genes: offspring will be wild type because they are heterozygous at each gene
- caveats: both mutations must be recessive; mutations in the same gene can cause different homozygous phenotypes; the trans-heterozygote could exhibit a more severe phenotype than the homozygotes; a mutation could affect more than one gene if it is in a complex locus

Define the term "heritability" and the various ways that heritability can be estimated. Why are heritability and additive genetic variance especially interesting to animal or plant breeders, as well as evolutionary ecologists?

- heritability describes the phenotypic resemblance between parents and offspring; this depends on the contribution of each allele to the phenotype (parents pass on alleles, not genotypes, so dominance and epistasis relationships depend on which alleles happen to be present)
- additive effects: the effect of each allele is seen; the additive variance (Va) as a proportion of the total phenotypic variance (Vp) serves as the estimate of heritability
- h^2 = Va/Vp
- predicts to what degree offspring resemble their parents
- h^2 and Va can predict the population's response to selection
- h^2 can be estimated by the slope of a regression line of offspring on parents
- if slope = 1, heritability is 1 and all the variation is additive
- if slope = 0, heritability is low and the variation must be due to dominance, epistasis, etc.
- can use this to see the impact of selection on a population: if there is selection for bigger beaks and high heritability, the mean of the offspring population will likely shift upward
- breeder's equation: R = h^2 S, equivalently h^2 = R/S or R = (Va/Vp) S
- R = distance of the offspring mean from the parent mean
- S = distance of the mean of the selected sample from the mean of the population
- if you know the proportion of phenotypic variation that is additive and the selection imposed, you can estimate the response to selection
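
Both estimates described above (h^2 from a parent-offspring regression, then the response to selection from R = h^2 S) can be worked through numerically. This is a hedged sketch: the beak measurements are invented, and it assumes the regression uses midparent values (the average of the two parents), in which case the slope estimates h^2 directly.

```python
def regression_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

def response_to_selection(h2: float, S: float) -> float:
    """Breeder's equation: R = h^2 * S."""
    return h2 * S

# Hypothetical beak depths (mm): midparent value and mean offspring value
midparent = [8.0, 9.0, 10.0, 11.0, 12.0]
offspring = [9.0, 9.5, 10.0, 10.5, 11.0]

h2 = regression_slope(midparent, offspring)
S = 2.0  # selected parents average 2 mm above the population mean
R = response_to_selection(h2, S)

print("estimated h^2:", h2)
print("predicted response R:", R)
```

With these numbers the slope is 0.5, so only half of the selection differential shows up in the next generation: parents chosen to be 2 mm above the mean are predicted to produce offspring about 1 mm above it.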

How might CRISPR/Cas9 approaches be used to "correct" disease causing mutations? What are some of the advantages to these approaches and what are some of their limitations and major challenges that need to be addressed?

- human sickle cell disease results from a single major gene mutation
- can take out hematopoietic stem cells, try to fix them in vitro, and then put them back into the patient
- could try to revert to the standard wild-type allele, but this is not the approach taken with CRISPR, because cells favor quick and easy NHEJ, which is not precise; supplying an oligonucleotide with homology arms can give homologous repair, but it is far less efficient than NHEJ (for every 1 HR event, 10s-100s of NHEJ events)
- real approach employed: reactivating the silenced fetal γ-globin genes HBG1/2; BCL11A represses fetal globin, so the genes are normally turned off and not expressed in adults; getting rid of BCL11A derepresses fetal globin (it is made again)
- could also try inserting a GATA site (cis-regulatory element to increase expression)
Tools:
- genome editing: insert, delete, or replace genes via a DSB; 1-1000 nt; permanent change to the genome of the cell; variable efficiency (larger is harder)
- base editing: Cas9 fused to other enzymes such as a deaminase (uses gRNA and Cas9 to target the activity of the other enzyme); can modify bases by single-nt repair; permanent
- gene regulation: gene repression, temporary or persistent, by modifying accessibility of DNA (fuse Cas9 to inhibitory proteins, ex. histone deacetylase); gene activation by fusing Cas9 to activators that facilitate Pol II
Challenges:
- CRISPR reagent delivery in vitro: how do you get it into cells
- efficiency of editing the target gene: much less efficient to make a single-nt edit; have to get to the correct cells; how many cells have the desired outcome
- risk of off-target effects: point mutations, rearrangements, modifications made at the target locus that are not the desired mutation; small 20-25 nt target regions could bind other places
- need for personalized medicine: most diseases involve many genes
- ethics: editing of the genome in germ-line cells

When performing a complementation test for two recessive mutants, one occasionally finds that progeny have a mutant phenotype despite different genes being affected (and therefore being heterozygous at two different loci). Why might this be the case? If you didn't know a priori whether you had one gene or two how would an additional in-cross or backcross resolve whether the mutants are truly alleles at one gene or two?

- if you cross two recessive mutants and the progeny have a mutant phenotype even though different genes are affected, the two genes interact somehow: being heterozygous at both loci is enough to give a mutant phenotype - ex. heterozygosity for jam3b sensitizes fish to heterozygosity at igsf11, meaning the two genes likely function in the same pathway or promote the same cell behavior - this is called "non-allelic non-complementation"

Explain why allelic variation at relatively few genes can yield a continuous or nearly continuous distribution of phenotypes.

- incomplete dominance of alleles at just two loci can greatly expand phenotypic variation, approaching a continuous distribution of phenotypes; ex. purple pigmentation (2 genes, each with 2 incompletely dominant alleles, allow for 9 different phenotypes) - if each allele contributes 1 or 0, adding them up at one locus gives 3 phenotype classes (2, 1, or 0) - when alleles are additive it doesn't take many genes/alleles to give a distribution that is continuous - height: contributions from 50 to 100 genes; allelic variation at each gives a nearly continuous distribution - environmental effects add variation around the values determined by the genotype, which smooths the distribution of phenotypes further (ex. birth weight and nutritional status of the mother, clutch size) - with some environmental effects and more genes or alleles, it doesn't take long to get a continuous distribution
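
A rough sketch of how quickly additive alleles fill in a distribution. It assumes a simplified model that is not in the notes: every locus has a "1" allele and a "0" allele of equal effect, allele frequency p, random mating, and no linkage, so the number of "1" alleles an individual carries is binomial. Under equal effects the classes collapse to 2n + 1 (the 9 phenotypes in the pigmentation example arise when the two loci have distinguishable effects). The function name is hypothetical.

```python
from math import comb


def additive_class_frequencies(n_loci: int, p: float = 0.5) -> list[float]:
    """Frequency of each phenotype class (number of '1' alleles carried).

    Assumes two equal-effect alleles per locus, frequency p for the '1'
    allele, random mating, and independent loci.
    """
    n_alleles = 2 * n_loci  # diploid: two alleles per locus
    return [
        comb(n_alleles, k) * p**k * (1 - p) ** (n_alleles - k)
        for k in range(n_alleles + 1)
    ]


for n in (1, 2, 5, 50):
    freqs = additive_class_frequencies(n)
    print(f"{n} loci -> {len(freqs)} phenotype classes, "
          f"most common class frequency {max(freqs):.3f}")
```

One locus gives the 3 classes mentioned above; 50 loci already give 101 closely spaced classes, before any environmental variation is added.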

What are the events of translation initiation, elongation and termination?

- initiation: mRNA has to associate with PABPC and eIFs; the small subunit recognizes the 5' methylated cap, attaches to the 5' UTR, and scans until it reaches the start codon (AUG), where the Met-tRNA is positioned; the small subunit carrying Met-tRNA causes the large subunit to come in and form the full ribosome; for most proteins the first codon (AUG) matters, but so does the sequence around it, called the Kozak sequence (position 3 upstream is usually A or G, other upstream positions are usually C or G, and the downstream position is usually G); these are favorable for the small ribosomal subunit, and without a Kozak sequence the mRNA is not translated as efficiently, which adds another level of control; eIFs help the 40S join the 60S - elongation: a charged tRNA is accepted into the A site, a peptide bond links the two amino acids and detaches the growing chain from the P-site tRNA, that tRNA moves to the E site, the tRNA from the A site moves to the P site, and the E-site tRNA leaves - termination: a release factor enters the A site when the ribosome reaches the termination codon, causing the ribosome to come off the mRNA

What is an approach that can be used to examine differential expression of small numbers of genes?

- in situ mRNA hybridization: with a gene in mind, make a riboprobe that hybridizes to that gene's mRNA and find where it is expressed based on the color; can see how the expression pattern changes over time - the spatial part is very important - can also use reverse transcriptase PCR: reverse transcribe the RNA target into DNA and use PCR to amplify it - tells us whether the transcript is present but not where it is expressed - downside: can only look at a few genes at a time on a cell-by-cell basis

What are the consequences of chromosomal inversions for gamete production within an individual and how does this affect fecundity? Why does the presence of an inversion suppress the production of recombinant genotypes and phenotypes for genes contained within the inversion? What is one potential consequence of inversions, if they persist over evolutionary time?

- inversions result in gametes with missing or duplicated DNA, effectively suppress recombinants, and reduce gamete viability - reduce fertility - pericentric: the inverted region includes the centromere - paracentric: the inverted region does NOT include the centromere - the centromere must be present because it is where chromatids attach to the spindle and are pulled apart - meiosis with a paracentric inversion: to pair, the homologs need an unusual configuration (B has to pair with b), creating an inversion loop; crossing over within the loop yields one chromatid with two centromeres (dicentric) and one with no centromere (acentric); in anaphase the dicentric is broken into two random fragments; the acentric cannot attach to the spindle, so it cannot participate in division and ends up being degraded in the cell; half of the gametes have all the genes (either the normal copy or the inverted copy) and 2 gametes have only fragments from the dicentric chromosome - affects the number of progeny because only 1 or 2 in four gametes can be viable - can allow a selective advantage for individuals carrying the products of inversions - ruff: in individuals heterozygous for an inversion, the lack of recombination means the alleles within it are always inherited together - the inversion is deleterious in decreasing body size but increases attractiveness for mating - the inversion itself did not cause the phenotype; mutations accumulated over time at the inversion site; inversions DON'T EFFECTIVELY PARTICIPATE in crossing over, so mutations within the inversion are transmitted together with other linked mutations - supergene: a combination of alleles in the same inverted region that stays together over time, transmitted without the loci being separated by recombination (crossing over would be deleterious); mutations accumulate in inversions over time

Describe the basic steps of making a mutant by homologous recombination. Why is it important to have a selectable marker? Why are embryonic stem cells needed?

- can introduce: deletions, modified sequence, a transcriptional reporter (GFP) - replaces endogenous DNA with introduced DNA - exploits the mechanisms of crossing over and homologous recombination (in cells that are not undergoing meiosis but still have the machinery present) - insert a cassette into an exon: the simplest examples involve inserting a stop codon and a gene allowing for selection, ex. a neomycin resistance gene that ends in a stop - get a blastocyst from a mouse and culture embryonic stem (ES) cells - transfect the cells with a DNA construct that can recombine with the target site to introduce the alteration - the construct must have homology arms identical to the target region of DNA on either side; this allows strand invasion and recombination (basically crossing over) to occur - select for cells that have the edit by giving neomycin: the cells that die failed to incorporate the construct - inject the ES cells into a blastocyst from another mouse and transfer it into a foster mother - offspring will have the mutation in some cells; want the ones with it in the germ line, which can then be bred to homozygosity - ES cells are used because they can be incorporated into the embryo and will differentiate into any cell type

What are some approaches for knocking down gene activity that do not depend on introducing heritable genetic changes? What are some caveats to these (or other) approaches for manipulating gene activity?

- in a knockdown the gene remains intact but production of the gene product is prevented using small antisense oligonucleotides - "morpholino" oligonucleotides (MOs) of around 25 nt are targeted to RNAs to interfere with translation or splicing (splice interference usually causes a frame shift and PTC) - two ways: - translation blocking: the MO binds the mRNA in the UTR upstream of the initiation codon and blocks the initiation complex and ribosome from translating; knocks down activity of the target gene because the protein cannot be made - splice blocking: the MO is targeted to a splice junction; a primary transcript with three exons would normally be spliced together, but the MO gets in the way of the spliceosome and one intron is retained; the intron almost always has a PTC, so the incorrectly spliced transcript is targeted for degradation or simply makes no functional protein - the morpholino group protects the oligo from degradation and keeps it in a strong, straight conformation for binding - problems: only 25 nt so it can bind other sites, and what you see may not reflect the gene you have in mind; it doesn't change the genome itself, so over time the MO effects go away (ex. 2 days for zebrafish) as the MO is diluted out while more cells are made or is degraded

How and why does meiosis contribute to genetic diversity?

- leads to genetic diversity through independent assortment (pairs of homologous chromosomes segregate independently of one another, allowing new combinations of alleles in gametes) - what happens to one chromosome pair has no effect on the others - each pair is equally likely to separate either of two ways, which leads to different haploid gametes - when genes are physically linked on the same chromosome, crossing over also leads to genetic diversity - in prophase I, chromosomes condense and homologous sets pair, followed by crossing over among all 4 chromatids - zygotene: onset of pairing between homologues at synaptonemal junctions - pachytene: completion of pairing and assembly of recombination nodules - diplotene: recombination between chromosomes at sites of crossing over; chiasmata form at a variety of locations along the lengths of non-sister chromatids; the centromere remains intact but telomeric portions can swap, so allelic differences on one homologous chromosome are transferred to the other; not every site with chiasmata will cross over but chiasmata are essential for crossing over; allows for new combinations of alleles for genes on a single chromosome
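
A small sketch of the counting behind independent assortment: if each homologous pair can orient either of two ways, n pairs give 2^n chromosome combinations in gametes. This ignores crossing over, which adds further diversity on top; the function name is made up for illustration.

```python
def assortment_combinations(n_chromosome_pairs: int) -> int:
    """Number of distinct maternal/paternal chromosome combinations in
    gametes from independent assortment alone (no crossing over)."""
    if n_chromosome_pairs < 0:
        raise ValueError("number of chromosome pairs cannot be negative")
    return 2 ** n_chromosome_pairs


for species, pairs in (("Drosophila", 4), ("human", 23)):
    print(f"{species}: {pairs} pairs -> "
          f"{assortment_combinations(pairs):,} combinations")
```

For 23 human chromosome pairs this is 8,388,608 combinations per parent from assortment alone.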

What is linkage disequilibrium, and why does it decay over time?

- linkage disequilibrium: non-random association of alleles at different loci - decays over time because of recombination, selection, genetic drift, mutation - when haplotype blocks become too small you can no longer see the correspondence between the marker loci and the phenotype
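
A minimal sketch of the recombination part of that decay, using the standard population-genetics result D_t = D_0 (1 - r)^t. It assumes recombination is the only force acting (large random-mating population, no selection, drift, or mutation), which is a simplification of the list above; the function name and the starting values are hypothetical.

```python
def ld_after(d0: float, r: float, generations: int) -> float:
    """Linkage disequilibrium D after a number of generations.

    d0: initial disequilibrium D_0
    r:  recombination fraction between the two loci (0 to 0.5)
    Assumes recombination alone erodes the association.
    """
    if not 0.0 <= r <= 0.5:
        raise ValueError("recombination fraction must be between 0 and 0.5")
    if generations < 0:
        raise ValueError("generations cannot be negative")
    return d0 * (1.0 - r) ** generations


# Tightly linked loci keep their association far longer than loose ones.
for r in (0.001, 0.01, 0.5):
    print(f"r = {r}: D after 100 generations = {ld_after(0.25, r, 100):.6f}")
```

This is why marker-phenotype associations persist only for markers very close to the causal locus.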

How does tumor suppressor copy number relate to cancer susceptibility in humans and other organisms?

- large, long-lived animals have an enormous number of cells and longer lifespans, so there is more room for mutations and errors in genes like p53 - would expect them to have a higher risk of cancer - but they DON'T have a higher risk - this is because they have multiple duplicate copies of p53 - elephants have 20 copies of p53 - large-bodied, long-lived species have compensated over evolutionary time by duplicating tumor suppressor genes

What does it mean for an allele to be semi-dominant, or haploinsufficient? When are alleles co-dominant?

- loss of function alleles can be dominant or incompletely dominant (semi-dominant) relative to the WT when the product of one WT allele is insufficient (the heterozygote is "haploinsufficient") - ex. semi-dominant: the heterozygote resembles neither parent but is a mixture of the two - ex. snapdragons: intermediate pigment from incomplete dominance among the alternative alleles, which represent an allelic series - alleles are additive: can see the effect of each allele in the genotype by looking at the phenotype - ex. piebaldism: one allele alone is insufficient to promote the proliferation, survival, and migration of melanocytes needed to fully colonize the skin; heterozygotes show the same kind of effect as homozygotes but less severe, enough to see some effect but not everything affected (incomplete dominance) - codominance: offspring have distinct features of both parents; each allele shows its own characteristics in the heterozygote (ex. blood types: two different proteins made from two different alleles)

What are some of the molecular changes that might lead to mutant alleles being dominant or recessive?

- loss of function (LOF) alleles reduce the amount of gene product or its activity (null: complete loss of product; hypomorphic: less product or less active product made) - LOF mutant alleles are often recessive because built-in genetic robustness lets the system function despite LOF alleles - ex. Mendel's peas: one copy of the Stay-green enzyme locus is enough to keep the peas yellow; cystic fibrosis: one functional copy makes enough product that there is no mutant phenotype - when only one functional copy is needed, the mutated copy is recessive - gain of function alleles increase the amount of product or its activity (overexpression: higher expression than the WT allele; hyperactivity: higher activity than WT; neomorphic: new expression domain or protein activity) - often are dominant because they add expression or activity of gene products - ex. achondroplastic dwarfism: FGFR3 is on regardless of whether ligand is present, causing early exit from the growth cycle because no cells are held in reserve to keep going, and premature ossification of cartilage - mutant alleles can also be dominant if they produce a defective product that actively interferes with normal cell function or with the product of the WT allele, ex. Huntington's disease: one copy of the mutated allele is enough to give the phenotype

What are the basic events of mRNA splicing and how do specific sequence regions delineate where these events happen? How specific and predictable are these sequences?

- mRNA splicing typically occurs at GU-AG splice sites with an intervening branch site - the splice donor lies just before a GT and the splice acceptor just after an AG; the branch site lies within the intron, about 30 nucleotides from the acceptor, and is usually CT and ACT - sites are functionally equivalent across introns, so correct splicing depends on simultaneous recognition of the corresponding GU and AG; if one AG or GT is missed, an exon could accidentally be cut out, etc. - spliceosome: small nuclear ribonucleoproteins (snRNPs), each comprising a small RNA with about 20 proteins - several snRNPs work together, and this is done while the RNA is still being transcribed - a cut is made at the splice donor site, then the free end joins the branch site through a 2' to 5' bond, creating a loop called a lariat - then a cut is made at the splice acceptor and the two exons are joined - there is lots of room for error in splicing, and mis-splicing can lead to disease - there is much variability, and alternative splicing provides many different kinds of gene regulation - 20,000 protein-coding genes yield over 100,000 protein-coding mRNAs - variable features: core promoters and TSS (a gene may have multiple promoters, and which one is used determines which RNA is made), transcriptional termination or polyadenylation, and splicing that retains different exons and introns - alternative splicing may retain (or not) certain exons or introns - 95-100 percent of genes with more than one exon yield multiple mRNAs

What are some ways polyploidy can occur in somatic cells, and what are some examples of such cells? Why might polyploidy be "useful" for some cell types?

- many cells have more than the usual 2n copies of the genome - whole genomes can be duplicated through abbreviated versions of the cell cycle - polyploid cells are common and developmentally programmed in various tissues (hepatocytes, trophoblast giant cells of the placenta, megakaryocytes, keratinocytes of mammals, many larval cells and some adult cells of Drosophila) - polyploid cells can range from 4N-512N (even 200kN in a giant neuron of the sea slug) - can occur by the endocycle: the cell looks like it will go through mitosis but skips it and goes from G2 to G1 (repeats G to S to G to S) - or by endomitosis: mitosis starts but either stops at anaphase, giving one nucleus with 2x DNA, or the nucleus divides without cytokinesis, giving two nuclei in one cell - Drosophila: in polytene chromosomes, thousands of chromosome copies all stay together; some cells have different numbers of chromosome copies - normal and developmentally programmed; unclear why some cells do this - this happens in addition to other polyploidy in which independent cells fuse together, as in muscle and cardiac tissue - may provide a source of new mutations for evolution and promote evolutionary diversification

What are marker loci and how can individuals be genotyped for alleles at such loci? Are marker loci genes? Do they affect phenotypes?

- marker loci are sites where individuals from different genetic backgrounds differ, and they can be used to distinguish heterozygotes from homozygotes - lots of places in the genome where you can tell whether a chromosome came from one genetic background or another - a site where genetic backgrounds differ = marker locus - have no effect on phenotypes; not a functionally significant difference - examples include CA microsatellites (some genetic backgrounds have larger alleles than others) and restriction fragment length polymorphisms (one background has a sequence cut by a restriction enzyme and the other doesn't) - can use with PCR to separate homozygotes and heterozygotes

How do microRNAs contribute to regulating transcript and protein abundance?

- microRNAs: 21-24 nt noncoding miRNA guides bind to mRNAs; key regulators of mRNA abundance (60% of genes have miRNA target sites); 1900 miRNA genes in the human genome, each with multiple target mRNAs; miRNAs form a miRNA-induced silencing complex (miRISC) with the Argonaute protein, with the miRNA acting as the guide - the miRISC is sometimes perfectly complementary to the coding sequence of an exon: it binds, proteins in the RISC cut on either side, and the mRNA is then chewed up by exonucleases - also common: miRNAs recognize an mRNA even though they are not perfectly complementary; the miRNA bulges so that nucleotides align in a few places, and despite the mismatches this binding is efficient enough to determine whether the mRNA is translated; it does this by blocking formation of the mRNA topology needed to assemble the ribosome (doesn't CUT but BLOCKS), blocks ELG and assembly of the ribosome - phylogenetic variation: different organisms carry out this process in different ways and differ in where miRNA targets are found; in vertebrates the 3' UTR is often the target of miRNAs, in others exons are targeted

What are the major differences between mitosis and meiosis? What roles does the synaptonemal complex play?

- mitosis: homologous chromosomes line up on the metaphase plate but NOT paired with each other - meiosis: homologous chromosomes are duplicated and then associate with each other, giving 4 copies (two sister chromatids for each homologous chromosome) that line up at the metaphase plate; when paired, crossing over can occur between homologous chromosomes; in meiosis I the two homologs of each pair are split into separate cells so each cell is haploid (but has two sister chromatid copies of each chromosome); in meiosis II the sister chromatids are separated from one another, giving four daughter cells - in meiosis crossing over occurs, which leads to genetic diversity - the synaptonemal complex is a protein scaffold that holds the paired homologous chromosomes together during prophase I, when crossing over occurs

What does "modularity" refer to, and what does a gene-phenotype mapping look like when there is high modularity vs. low modularity? What about a gene-regulatory network?

- modularity is the relative degree of connectivity in a system - a module is a unit that is tightly integrated internally but relatively independent of other modules - gene-phenotype map: each gene contributes to different traits; if linkages are assigned at random, modularity is low, around 0.16-0.17 - high modularity is around 0.40 - genes are somewhat pleiotropic, but there is also structure because the traits themselves are somewhat modular

Why is it necessary to perform multiple crosses when intending to isolate and analyze recessive mutant alleles? Why is each cross made?

- must do multiple crosses to ensure you get homozygotes for the mutation, because recessive mutations need to be homozygous to show a phenotype - first mutagenize a male: mutations will exist randomly in the genome and are mosaic (different cells have different mutations); only care about the ones in the germ line cells - cross to WT females to get F1 that are heterozygous for mutations and non-mosaic - expand the mutation by crossing an F1 to another WT: the F2 will be 50% WT and 50% heterozygous - cross F2 siblings together to get F3: when both siblings are heterozygous, 25% WT, 50% heterozygous, and 25% homozygous mutant - screen for unique phenotypes; if interesting, can continue to propagate further; if lethal, can breed heterozygotes to conserve the mutation - once this is done can try to find the gene responsible
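
A short sketch of the expected genotype ratios at each step of that scheme, treating a single locus with "+" for the wild-type allele and "m" for the induced recessive mutation. The `cross` helper is hypothetical and just enumerates a Punnett square.

```python
from collections import Counter
from itertools import product


def cross(parent1: str, parent2: str) -> dict[str, float]:
    """Expected offspring genotype frequencies for a one-locus cross.

    Each parent is a two-character genotype such as "+m"; every gamete
    pairing is assumed equally likely.
    """
    offspring = Counter(
        "".join(sorted(a + b)) for a, b in product(parent1, parent2)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


# F1 heterozygote outcrossed to wild type: expands the line, no mutants seen.
print(cross("+m", "++"))
# Two heterozygous F2 siblings in-crossed: a quarter of the F3 are homozygous.
print(cross("+m", "+m"))
```

Only the in-cross of two heterozygous siblings yields the 25% homozygous mutants in which a recessive phenotype can be scored.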

What is transcriptional adaptation and how is it thought to work? How might this explain observations in both model organisms and human populations?

- mutants of only 5-20% of genes targeted by reverse genetics have observable phenotypes - 8 or 9 times out of ten nothing happens to the phenotype even if we know the gene functions in that pathway - this is because of transcriptional adaptation - ex. zebrafish: knockdown of the RNA with a morpholino led to a mutant phenotype, but the mutant created by CRISPR had a normal phenotype due to transcriptional adaptation, where genes very similar to the mutated gene get turned on to a higher level to compensate for the lost gene (doesn't happen with an RNA knockdown, only when the actual gene is mutated) - paralogous genes or just similar genes can respond to one another - this response only occurs when the mutated gene makes a transcript that undergoes NMD (PTC) - in cases where the transcript is never made (if the entire gene is removed, or the upstream regulatory region and first exon with the promoter are removed), no RNA transcript is made and transcriptional adaptation will NOT occur - through an unknown mechanism, NMD breakdown allows transcriptional adaptation, which can compensate for mutated genes - in humans, transcriptional adaptation of other genes likely leads to robustness and normal function

What is the 5' cap on mRNA, how is it synthesized and what does it do?

- the new transcript starts with a pppG or pppA, but the mature mRNA has a methyl-G cap in reverse orientation, added early in transcription - the G is added by guanylyltransferase - the methyl is added by guanine-7-methyltransferase - functions: protects the transcript from degradation; recruits cap-binding complex proteins, which are important for transport from the nucleus; in the cytoplasm it recruits eukaryotic initiation factors (eIF4G), RNA helicases, and other proteins necessary for translation; it also interacts with the poly-A binding protein to generate a pseudo-circular structure important for efficient translation

What are the consequences of DNA replication for chromatin state?

- old nucleosomes are removed and histones dissociate; after replication they have to re-form - euchromatin modifications are not maintained - heterochromatin modifications are maintained because these regions have no active transcription; enzymes can recognize the presence of these modifications and add them to the new nucleosomes, called "read-write" methylation; this is how epigenetic inheritance occurs, and highly silenced regions remain silenced in daughter cells - doubling the DNA means doubling the histones, nucleosomes, and associated proteins (is replication conservative, with all 8 histones staying together, or semiconservative, 4 and 4?); histones do come apart to a certain extent in replication: some histones tend to stay together when the nucleosome is broken apart and some do not

What evidence suggests that "sub-optimal" affinities between transcription factors and cis regulatory elements contribute to specificity of gene expression, and what are the implications of these observations for the dynamics of transcription factor binding and cis regulatory site occupancy?

- otx encodes a TF required for neural development in the ascidian - expression depends on regulatory control from ETS and GATA - regulatory region is only 69 bp (3 GATA sites and 2 ETS sites) - made a library with 165K variants of the 69 bp regulatory region and screened for the consequences of variation using GFP reporter expression - 22K gave the same or higher expression than the native element - WT mosaic: reporter is seen in only some cells because the construct only gets into some cells - one site is not sufficient for expression; without the flanking regions there is no expression - optimal binding sites = too much expression - showed that TF binding to target sites is not optimal and that weak interactions are needed for expression at the right times; evolutionarily tuned to be sub-optimal - cis regulatory elements are very densely packed, and sub-optimal binding may allow dynamic interactions of many factors at a single locus

How does "Sanger sequencing" work and what modifications to the original method contributed to increased efficiency (and safety)?

- a radioactive 32P label was put on the end of each primer for extension and then visualization after electrophoresis - ddNTPs were added for one of the nucleotides at a time; because a ddNTP terminates extension at that base, this created fragments of different sizes - fragments were separated by size on a gel and exposed to film to detect the radioactive 32P - this was done for all four nucleotides with the four ddNTPs: multiple reactions with radioactive label, each read about 100-200 nucleotides long - advances: fluorescent ddNTPs, so all four could be added to the same reaction and radiation was avoided; products are run through a capillary with a matrix to separate by size and pass a detector - gives an electropherogram in which bases are distinguished by color

How do p53 and CDK inhibitors like p21 prevent cells from entering S phase when they have extensive DNA damage?

- p53 is a transcription factor that serves as one of the 'quality control' checkpoints - p53 prevents S phase by inducing CDK inhibitors like p21; p21 inhibits the activity of CyclinD-CDK4 and CyclinE-CDK2 complexes, which can no longer phosphorylate Rb, so Rb stays bound to E2F and E2F cannot promote transcription of genes needed for DNA replication - p53 is also able to initiate programmed cell death when damage is severe - mutations in p53 lead to cancer susceptibility; most cancers have p53 mutations

How can recombinant phenotypes be used to estimate genetic distance between genes?

- parental phenotypes are those directly present in the parents - recombinant phenotypes are those with a mixture of traits from the parents - an excess of parental phenotypes usually indicates the two genes are located on the same chromosome, so some combinations of alleles are inherited together more often than others - increasing distance between genes means more crossing over and so a higher percentage of recombinant phenotypes - recombination frequency between two loci defines the genetic distance in centiMorgans between the two genes (ex. recombination frequency of 0.107 x 100 = 10.7 centiMorgans)
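
A minimal sketch of the calculation in the example above, going from progeny counts to map distance. The counts are hypothetical, picked to reproduce the 0.107 recombination frequency; the function name is made up for illustration.

```python
def map_distance_cm(recombinant_progeny: int, total_progeny: int) -> float:
    """Genetic distance in centiMorgans = recombination frequency x 100."""
    if total_progeny <= 0:
        raise ValueError("need at least one scored offspring")
    if not 0 <= recombinant_progeny <= total_progeny:
        raise ValueError("recombinants must be between 0 and the total")
    return 100.0 * recombinant_progeny / total_progeny


# Hypothetical test cross: 1000 progeny scored, 107 with recombinant phenotypes.
distance = map_distance_cm(recombinant_progeny=107, total_progeny=1000)
print(f"{distance:.1f} cM")
```

Recombination frequency tops out near 50% for unlinked genes, so this direct conversion is only informative for genes that are reasonably close together.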

What are some of the ways in which DNA damage can lead to point mutations? How does DNA damage compare with polymerase error as a source of DNA alteration?

- polymerase errors most common - damage can occur by hydrolysis, oxidation, and methylation: many nucleotides are damaged per day during these normal activities, about 20,000 damage events per cell per day - hydrolysis: depurination (G lost, leaving an apurinic site with no base), deamination (loss of the amino group from C makes it U, which is then read as a T so an A is added; goes from C-G to T-A) - oxidation: if G is oxidized it is read as T, so A is paired instead of C; goes from G-C to T-A; these occur because the base is damaged and not repaired before replication - UV-induced thymine dimerization: cross-linking occurs between adjacent thymines on the same strand

How has an ability to visualize single RNA molecules contributed to our understanding of transcription dynamics and the roles of core promoters and enhancers (cis-regulatory elements) in regulating the rate and intensity of transcription?

- single-molecule RNA imaging has revealed discontinuous transcription - allows us to track when mRNA is made - transcription occurs in bursts - the size of the burst depends on the promoter and how many RNA Pol II are recruited at once - the enhancers (cis regulatory elements) control the frequency of the bursts

What are some of the ways that the early patterning screen of Nüsslein-Volhard and Wieschaus contributed to our broader understanding of animal development and pathology? What was novel about their approach?

- sought to identify genes needed for segmentation, anterior-posterior differences, dorsal-ventral differences, and other features - used larval exoskeleton cuticle preps of Drosophila as a readout of earlier events in embryonic development - mutants with defects showed that the affected gene usually plays a role in developing a particular feature - yielded 600 mutants and around 100 genes - could be correlated to some important developmental genes in humans - first time this was done at this scale to dissect all the genes needed for a process - understanding of embryonic patterning in vertebrates comes from this study - used a forward genetic screen, where you see the phenotypes first and then determine the genes later

What are "spontaneous" mutants? How have geneticists been able to increase the frequency of mutations for screens in the laboratory?

- spontaneous heritable mutations arise from normal errors in replication and are rare, about 2-12 x 10^-6 per gene per gamete - can increase the number of mutants in the laboratory by inducing mutations randomly with mutagenic agents, including ENU and EMS, which cause base changes randomly throughout the genome: they add an ethyl group to G so it can only form two H bonds, is read as an A, and pairs with T - this causes incorrect pairing and substitution during DNA replication - 1 in 500 gametes will have a mutation in any particular gene - can do this to get mutations in random genes and then view the phenotypes that result

What are some of the strengths and weaknesses of Drosophila as a developmental genetic model system? Can you imagine how these might compare with other species that are currently used as well (e.g., mouse, zebrafish, frog, nematodes)? What might be some of the advantages and disadvantages of other model systems you might imagine (e.g., tardigrades or thousands of other species).

- strengths: 10 day generation time, many progeny, exoskeleton has many traits that can be evaluated under a simple microscope, only 4 chromosomes, which can be visualized as giant polytene chromosomes (up to 1000 copies of each chromosome in some cells) - weaknesses: cannot be frozen down, must be continually maintained - share many similarities in genes and gene functions with vertebrates despite the split in lineages ~700 MYA - other species: advantages: some may be able to be frozen; disadvantages: some have longer generation times and may produce fewer progeny

How do tardigrades and humans compare in their abilities to withstand extreme environments? What evidence suggests a role for the Dsup gene product?

- tardigrades can withstand desiccation, freezing temperatures, toxic chemicals, and vacuum - are extremely radiation resistant: median lethal dose is 5000-6500 Gy (1 Gy = 1 joule per kg of body weight), while the human lethal dose is 0.8-16 Gy - have a Damage suppressor protein encoded by the Dsup gene, expressed in all tardigrade cells, localized to the nucleus, where it binds nucleosomes - studies show more Dsup is bound when nucleosomes are present than when there is just DNA - can be transfected into mammalian cells: Dsup tagged with GFP is localized to the nucleus, while normal GFP is everywhere in the cell - protects against ionizing radiation damage including SSBs, DSBs, and free radicals - mammalian cells with the Dsup gene were subjected to 5 Gy (0 hr): cells without Dsup are more fragmented (comet tail) than those with Dsup; non-irradiated cells are oval - can look at damage in the nucleus using the H2AX histone variant: the non-irradiated control shows no damage; 1 hr after 1 Gy, control and shDsup cells have more damage than those with normal Dsup; shDsup is a hairpin that eliminates Dsup protein, to make sure it is not another factor leading to less damage

What are the Pre-initiation and Mediator Complexes and how do they promote transcription? How do the proximal promoter and "distal" cis-regulatory elements (occupied or unoccupied) fit into the process?

- the PIC is an assembly of around 50 proteins in several different complexes (RNA pol, TFIIB, TFIIA, E, D, F, H); these TFs have to assemble at the promoter to stabilize RNA pol; the PIC has to form to recruit RNA pol to the promoter, and construction of the PIC is essential for RNA Pol II to be recruited and stay on the DNA to transcribe - the Mediator complex includes around 30 protein subunits; promotes PIC assembly and Pol II localization; stabilizes the PIC in vitro but is short-lived in vivo; integrates and communicates with TFs bound to cis regulatory (enhancer) elements; promotes looping of DNA so that activating TFs are spatially clustered near the transcription start site; also regulates Pol II pausing after 30-60 nucleotides have been transcribed; Mediator bridges the PIC to cis regulatory sites and the factors bound there - the large complex formed between Mediator and the PIC gives the potential for many interactions with proteins that determine whether a gene is transcribed - due to the similarity of promoter and enhancer sequences, both can be transcribed bidirectionally, making eRNAs at the enhancer and some uaRNAs at the promoter

When making a mutant by homologous recombination or by CRISPR/Cas9 mutagenesis, the "F0" animal is genetically mosaic and one must breed F0s to isolate alleles and generate individuals that are genetically homogeneous across all of their cells. Why are the F0s mosaic in each of the two different approaches?

- the gene edit may be made in different cells at different times, will not be incorporated into all the cells, and repair will occur in many different ways - with CRISPR/Cas9 the injected animals are mosaic, with many different mutations in different cells - have to breed the mosaic animals to propagate particular alleles and mutations through the germ line - in HR: have to breed the chimeras to get ones with the edit in the germ line so it can be passed on, and these then have to be bred to homozygosity

Why is it important to sample a lot of chromosomes (by genotyping a lot of individuals) for mapping a mutant locus at high resolution?

- sampling more chromosomes by genotyping more individuals yields more informative recombinants and a narrower critical interval in which the mutant gene is located

What is the variance of a distribution and why is it a useful metric for understanding phenotypic variation? What are the components of variance? How do the different genetic components related to the allelic effects within and between loci that we discussed previously?

- the variance is the average squared deviation from the mean, and its square root is the standard deviation - the variance is useful because it is additive and tells you how spread out values are around the mean - phenotypic variance = genetic variance (additive, dominance, epistasis) + environmental variance + genotype-by-environment variance - Va = additive effects of alleles across loci: each allele contributes directly or adds to the trait expression - Vd = dominance effects due to the dominant-recessive relationships among alleles within each locus (e.g., if the heterozygote looks like a homozygote) - VI = interaction effects due to epistasis between loci (when phenotypic effects at one locus depend on the alleles at another locus, masking)
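
A minimal sketch of how additive and dominance variance add up at one locus. It assumes the textbook single-locus model (genotypic values +a, d, -a with Hardy-Weinberg frequencies), which is not spelled out in these notes; the function name and parameters are hypothetical.

```python
def genetic_variance_components(p, a, d):
    """Return (total genetic variance, Va, Vd) for one biallelic locus.

    Assumed model (not from the notes): genotypes A1A1, A1A2, A2A2 have
    values +a, d, -a and Hardy-Weinberg frequencies p^2, 2pq, q^2.
    """
    q = 1.0 - p
    freqs = [p * p, 2 * p * q, q * q]
    values = [a, d, -a]
    # Total genetic variance: average squared deviation from the mean value.
    mean = sum(f * v for f, v in zip(freqs, values))
    total = sum(f * (v - mean) ** 2 for f, v in zip(freqs, values))
    # Average effect of an allele substitution, then the two components.
    alpha = a + d * (q - p)
    va = 2 * p * q * alpha ** 2
    vd = (2 * p * q * d) ** 2
    return total, va, vd


total, va, vd = genetic_variance_components(p=0.5, a=1.0, d=1.0)
print(total, va, vd)  # complete dominance at p = 0.5
```

With complete dominance (d = a) part of the variance is still additive, which is why Va and Vd are properties of a population (allele frequencies), not only of the alleles themselves.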

Early estimates of the gene number varied widely. What do whole genome sequences that have been annotated for genes tell us about actual numbers of genes, and what kinds of genes they are?

- total size: 3.6 Gb - protein coding genes: 20,376 - total genes: 40,000 - gene body length: average 26,288 bp - noncoding genes: 22,000 - exons per transcript: 7 - introns per transcript: 6 - mRNA: 2787 bp - pseudogenes: about 13,000, used to be functional but aren't currently - genes show much variation in size and composition, and are not only protein coding genes

Sanger sequencing is sometimes referred to as "first generation" sequencing, whereas several methods of "second generation" (sometimes referred to as "next generation") sequencing have been devised. Currently the most commonly used of these is Illumina Corporation's Solexa sequencing. In what ways does this method differ from Sanger sequencing?

- uses a flow cell: the genome is fragmented and primers (adapters) are added to the ends of the DNA strands, which causes them to stick to the flow cell - unlabeled nucleotides and polymerase are added so the DNA forms a bridge and the complementary strand is made, then the two strands are denatured to separate them - nucleotides labeled with a fluorophore that also acts as a stop are added along with polymerase and primers - each complementary strand is sequenced one nucleotide at a time: a nucleotide is added and synthesis stops, the fluorescence shows which nucleotide was added, and because termination is reversible the next nucleotide can then be added - can do this for reads 300-400 nt long and for millions of fragments at a time on one chip

What are the cellular and phenotypic consequences of prenatal Zika virus infection?

- virus affects the neural stem cells in the brain, impacts the ability to undergo normal mitosis, DNA may be replicated but not aliquoted correctly to daughter cells, chromosome segregation defects, sometimes multiple copies of one chromosome and when incorrect chromosome complement cell undergoes death and nucleus breaks down - phenotypic consequences: severe microcephaly, decreased brain tissue and specific brain damage, seizures, auditory defects, scarring at back of eye, limited mobility, infects neural system cells

What are the fundamental properties of the genetic code and how were they "deciphered"?

- was deciphered by making synthetic RNAs with only certain base compositions: started with only U and only Phe was synthesized, tried repeating dinucleotides and got two alternating aa, tried repeating trinucleotides and got chains of one repeated aa, tried repeating tetranucleotides and got four different aa or a stop codon - each aa is encoded by a codon of three nucleotides - degeneracy: different codons can code for the same aa - three stop ("nonsense") codons - initiation codon (AUG) marks the start of the reading frame - mutations can alter the message (missense: changes an aa, nonsense: creates a stop, silent: doesn't change the aa, frameshift: changes the entire downstream sequence of the protein) - DNA: of the two strands, the template is antisense and noncoding, and the RNA-like strand is sense and coding - polarities are the same: the 5' end of the RNA encodes the N-terminal end of the protein
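
The synthetic-RNA experiments above can be replayed in a short sketch, assuming the standard genetic code. The `translate` helper and its behavior (read from the first base, stop at the first stop codon) are illustrative choices, not something from the notes.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ... GGG.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA string codon by codon from its first base.

    Stops at the first stop codon ('*') or when fewer than 3 bases remain.
    """
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE[rna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


print(translate("UUUUUUUUU"))     # poly-U gives only Phe: FFF
print(translate("UCUCUCUCUCUC"))  # repeating dinucleotide alternates: SLSL
```

Shifting the start by one base in the dinucleotide repeat gives the same two amino acids in the other order, which is part of why these repeats were informative about codon length.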

A GWAS necessarily relies on sampling of a population in an environment or set of environments. What might be the consequences of missing subsets of a population or environments in one's sampling when trying to identify genes associated with a particular traits, or the heritability of a trait?

- when subsets of the population or environments are missing, one could see a correlation between a gene and a phenotype that isn't fully significant - estimated heritability is likely to be minuscule because, with a limited sample, any given allele is unlikely to be found segregating with the trait across the population, and there is also the possibility of epistasis - each gene contributes a very small amount in a highly polygenic scenario

How is quantitative trait locus (QTL) mapping similar, or different, from genetic mapping of discrete traits for which variation depends on a single "major effect" locus. What are some particular advantages and challenges of QTL mapping?

- QTL mapping asks which gene or genes contribute to the expression of a trait, and is used in animals or plants where you have control over the breeding - start with two parents that differ in the trait of interest (trichomes and no trichomes), breed to get an intermediate F1, then incross or backcross the F1s to get an F2 with a variety of phenotypes (similar to mutant mapping, uses recombination) - assuming the parents have two different genetic backgrounds, can look at the segregation of the alleles and what phenotypes they express (locus A with two alternative alleles: AA leads to lots of trichomes, Aa an intermediate amount, and aa no trichomes) - can do a regression analysis of trichome density versus the genotype at the locus - this doesn't mean that the marker locus is the cause of the phenotype, but it may be very close to the causal locus because it segregates alongside it - get a result with representations of each chromosome with marker loci and a test statistic showing the magnitude of association of each locus with the phenotype; the peak indicates the greatest probability of correspondence between the phenotype and the underlying genotype at locus A; can extend this to a case with multiple genes - tomato example: what causes the variation in fruit size and shape?
30 QTL were identified, with most of the variation (67%) arising from alleles at 6 different QTL; large and small were crossed to get F1 with intermediate phenotypes, the F1 were intercrossed and meiosis was used to make recombinants in which you could assess the differences; the vertical lines show quantitative trait loci where the peaks would be - difference from lab mutant studies: in this case you do not get down to the one gene associated, you get a critical interval involving a portion of a chromosome with several genes; the only way to narrow this interval is to add more individuals (this is hard because with multiple alleles and multiple genes they will segregate at the same time) - advantage: can also see the epistatic relationships between genes and which genes mask each other (ex. the locule gene has an interaction effect and its phenotype depends on the alleles at two other loci) - genetic variant identification: can take this further by fine mapping with new crosses and interbreeding to get consistent phenotypes with a very small percentage of variation, which allows narrowing the critical interval and isolating only a small portion of the allelic variation, ex. identified two sites with SNPs, the issue is that these are in an intergenic sequence and their function is unknown - problematic: can narrow down to two SNPs and know they contribute to the phenotype, but you don't know HOW they affect the phenotype; can find the variation in the genotype but then it is hard to tell what it actually does if its function is unknown

What evidence suggests that whole-genome duplication is relatively common over evolutionary time?

- whole genome duplications (WGD) have occurred repeatedly in plants, vertebrates, protists, and fungi - vertebrates 2x WGD, teleosts 3x WGD, some others 4x - after WGD some genes are retained but most are lost ("rediploidization") - evidence: zebrafish have 30% more protein coding genes than humans, humans 20% more than ascidians - evident from multiple regions of conserved synteny - Hox genes specify body regions and are found as one or more clusters of multiple genes with shared synteny: tetrapods (2x WGD) have 4 clusters with orthologies to the Hox cluster of hemichordates, teleosts doubled again so get 8 copies, other teleosts with 4x WGD get 16 copies but some disappear

GWAS is a valuable approach when trying to find molecular markers (SNPs) associated with common diseases that have polygenic bases. But if a specific and very rare pathology is likely due to a single gene mutation, it is often preferable to use second-generation short-read sequencing of all exons ("whole exome sequencing") or of the entire genome to find a causal variant, especially when it is possible to sample both parents and their affected children (so-called, "trio sequencing"). Why would this be the case?

- would be easier to see the relationship of the heritability between the parents and the offspring -GWAS is better if need to examine the interactions between many SNPs - if looking at only one gene could track the differences between parents and offspring and their phenotypes to determine the gene responsible and the nature of the alleles

How do "basal" or "general" transcription factors compare with tissue- or cell-type specific transcription factors?

-"general" transcription factors are involved in the formation of the PIC for the transcription process (ex. TFIID, which contains the TATA-binding protein, plus TFIIA, -B, -E, -F, -H) - cell type specific transcription factors serve as either activators or repressors

What are origins of replication and what does it mean for one to be "licensed"? Are there particular characteristics that make a site likely to be licensed? Are all sites that are licensed actually used?

-30,000 to 50,000 origins activated per cell division - origins recognized by a 12 bp autonomous replication sequence in yeast, AT-rich islands in another yeast, but unknown in metazoans (maybe CpG islands and promoter sites?), may depend on higher-order topology of the chromosome - assembly at the potential origins of a pre-replication complex (pre-RC) with the heterohexameric origin recognition complex (ORC), mini chromosome maintenance proteins, and cell division cycle 6 protein - pre-RCs license origins for use, making them potential sites where replication could commence, but not all licensed origins are used - constitutive replication sites: origin used in every single cell - inactive or dormant sites: origin never used in any cell, DNA replication never starts at that site - flexible cluster: site may get used or not depending on the cell (varies with cell type) - unknown why some are not used

How can one infer the distribution of histone modifications (sometimes referred to as histone "marks") and non-histone proteins in chromatin using methods of second generation sequencing?

-ChIP-Seq (chromatin immunoprecipitation sequencing): chromatin with its bound proteins is isolated from the whole genome, fragmented into small pieces, and incubated with antibodies to the protein of interest; antibody-chromatin complexes are precipitated (pulled down with the antibody), the collected DNA fragments are sequenced, and the sequences are aligned to the genome to detect regions enriched in the protein of interest - differential enrichment of histone 'marks' is correlated with transcriptional activity: higher peaks mean the modification is more likely in that region of the DNA, and more reads mean it is more likely that the target protein was bound to that DNA or that the histone there has that modification

How does DNA become methylated and what are the downstream consequences of such methylation? Name a couple examples of methylation during development.

-CpG islands are GC-rich regions near transcription factor binding sites - the cytosine of the CpG dinucleotide can be methylated by DNA methyltransferase (DNMT) - when the promoter is occupied by TFs this prevents the C from being methylated, but unoccupied promoters can be targeted - when TFs are not bound DNMT can methylate the C, which prevents TFs from binding - methylated Cs are recognized by methyl-CpG binding proteins like HP1, which then drive histone methylation and chromatin condensation - HP1 proteins form large complexes which shut down large sections of sequence to prevent transcription - these are called epigenetic modifications - examples: hemoglobin switching (one form is methylated at one time so only the other is transcribed, then this changes and only the other form is made) and X-chromosome inactivation (cats can have different markings depending on which X chromosome is shut down) - methylation can be passed to daughter cells and in some cases to offspring

What are examples of chemicals that interact with specific features of the DNA helix?

-DAPI stain -- affinity for DNA and intercalates into minor groove, fluoresces and allows detection of where DNA is - furanocoumarins (psoralen) --- chemical from giant hogweed, contacts skin and when exposed to UV will cause cross linking, this causes issues in DNA replication and leads to cell death, causes wounds in epidermis

What is a "pioneer" transcription factor and what distinguishes it from other types of transcription factors? What kinds of interactions can pioneer TFs have, and in what contexts have they been shown to be especially important?

-DNA binding proteins that are able to bind target DNA sequences even in closed chromatin - can initiate chromatin remodeling and permit binding of other TFs, histone variants, and chromatin remodelers - stabilize the open chromatin state - play roles in cell programming and reprogramming (stem cells) - include FoxA, FOXO, and GATA factors - go through three states: silent state (chromatin scanning, initial targeting), competent state (enable other factors to access), active state (cooperative stable binding with secondary TFs, acquisition of active histone mods)

Why do chromosomes shorten with repeated replication and what are the mechanisms used by cells to deal with this problem? Are there cell types that are especially susceptible?

-DNA polymerase requires a primer to work off of - there is a short primer at the 5' end of each newly synthesized strand - when this primer is removed there is a gap - the gap cannot be filled because it is at the 5' end and DNA polymerase only adds to the 3' end - the single-stranded DNA left after each round ends up getting chewed up by exonucleases - with each round of replication the chromosome gets smaller and smaller; if nothing else is done, shortening can eventually reach the gene bodies - cells prevent this by stopping replication: after a certain number of rounds of replication the ends are too short and the cells go into "replicative senescence" - telomeres: 300-3000 repeats of the 5'-TTAGGG-3' sequence with 2-48 kb of dsDNA and 150-300 bases of ssDNA called a "G-overhang" - telomeres function in: protecting chromosome ends from degradation or fusion with other chromosomes, allowing their own "rejuvenation", and pairing and recombination of homologous chromosomes during meiosis - normal shortening is associated with age and dysfunction - telomeres protect the chromosome ends: the "t-loop" configuration prevents attack of the G-overhang by nucleases and DNA repair enzymes - the G-overhang invades the dsDNA and pairs with the complementary strand, resulting in a "D-loop"; this provides protection because the single strand cannot be degraded by nucleases - t-loop formation and maintenance require the Shelterin complex; TRF1 and TRF2 (telomeric repeat binding factors 1 and 2) and POT1 (protection of telomeres protein 1) are subunit proteins specific to the D-loop - telomere rejuvenation: telomerase is a large ribonucleoprotein complex including telomerase reverse transcriptase (TERT) and an RNA (TERC) whose repeats serve as a template - the enzyme uses the TERC templating RNA to recognize the end of the DNA, which lets TERT add new DNA bases to extend the end of the telomere, and then DNA polymerase can fill in the ss gap to make dsDNA - the enzyme adds length to the chromosome to compensate for what is lost - only the terminal end of the telomere is ss, the bulk is ds - cancer cells have unregulated telomerase activity, which allows escape from replicative senescence - rapidly cycling cells are at risk (ex. hematopoietic and epidermal stem cells)

Explain how PCR works.

-DNA strands are separated by denaturing at high temperature, which breaks the H bonds holding the strands together - the reaction is cooled so that primers bind to the 3' end of the target region on each strand, called annealing - the temperature is increased slightly and extension occurs as DNA polymerase adds nucleotides to the 3' end of each primer, copying the template strand - this cycle is repeated many times to get many copies of the same DNA
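
Because each cycle can copy every template once, the copy number grows exponentially. The sketch below shows that arithmetic; the `efficiency` parameter (fraction of templates copied per cycle) is a hypothetical simplification, and real reactions plateau as reagents run out.

```python
def pcr_copies(initial_templates, cycles, efficiency=1.0):
    """Expected number of copies after a given number of PCR cycles.

    efficiency is the assumed fraction of templates copied in each cycle:
    1.0 means perfect doubling, 0.0 means no amplification.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_templates * (1.0 + efficiency) ** cycles


# One starting molecule, 30 cycles of perfect doubling: 2**30 copies.
print(pcr_copies(1, 30))
# The same run at 90% efficiency per cycle yields noticeably fewer copies.
print(pcr_copies(1, 30, efficiency=0.9))
```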

What are double strand breaks and how can cells repair them? What types of damage can occur after double strand breaks have occurred?

-DSB: can occur in mitosis or meiosis or from ionizing radiation, can lead to small or large deletions and chromosomal rearrangements - rearrangement occurs following a DSB and during illegitimate recombination - rearrangement is especially common in regions with repeated sequences (microsatellites) - rearrangements can break genes or bring regulatory elements of one gene to another - types of rearrangements: deletion (part is cut out when fused back), inversion (added back in opposite orientation), deletion and duplication between sister chromatids (one gets two copies, one gets none), translocation between nonhomologous chromosomes - can occur in somatic cells or in gametes, with impacts on the germ line - DSBs can occur naturally or by ionizing radiation - cells can repair DSBs by: nonhomologous end joining (proteins recognize the free ends of dsDNA and connect them back together, usually only a 2-4 nt difference, accurate repairs, does not depend on recognition of sequence identity), homologous recombination (broken DNA ends are recognized by proteins and chewed back by an exonuclease so they are ss, the ss end then invades the double strand of the homolog and is repaired based on the template in the homologous chromosome, can lead to perfect repair), SSA (single strand annealing, can lead to large deletions, no insertions), alt-EJ (alternative end joining, can lead to mutagenic rearrangements, insertions, and deletions) - the fusion may be perfect or bad depending on which mechanism is used, which depends on the cell and luck

What evidence suggests an important role for H3K27 mutations in pediatric glioma?

-H3 mutations are common in pediatric cancers (gliomas, chondroblastomas, giant cell tumors of bone) - specific mutations are associated with particular tumor types and locations - diffuse intrinsic pontine glioma (DIPG): midline structures, median survival 9-12 months, H3 mutations in 80% - H3K27: Lys is usually mutated to Met, and Met cannot undergo the normal Lys modifications (leads to dysregulation, genes normally repressed are now active) - an H3 mutation was present in every one of several cases of cancer in children

How is Hsp90 thought to contribute to buffering of the phenotype in Drosophila? What empirical evidence supports this idea?

-HSP act as chaperones to help proteins and keep them correctly folded under stress -if one copy is mutated (heterozygous) then there are different defects depending on genetic background -defects were genetically based because could be selected for penetrance even when there were two copies of HSP90 -shows there is reservoir of stuff that can have defects but you only see the phenotype in a certain environment, Hsp 90 as chaperone helps even mutant proteins fold so they appear functional -if the assistance of Hsp90 is gone, then no longer works -this is how genetic robustness can deal with underlying variation

What experimental observations first supported a semi-conservative mode of DNA replication? How might these findings have differed were another mode involved?

-Meselson and Stahl used E. coli grown in two different media (14N and 15N) - grew E. coli in 15N so that the DNA contained the heavier nitrogen, then moved the cells into 14N - after one round of replication there was one band, 15N14N (in between the weights of 15N and 14N) - after two rounds of replication there were two bands, one light (14N14N) and one intermediate (15N14N) - this supported semiconservative replication because the two strands separated and each served as a template for the next replication; after each round every DNA molecule has one old strand and one newly synthesized strand - if replication were conservative, the 1st generation would have shown one 15N band (heavy) and one 14N band (light), because the parental molecule would stay intact (all heavy) and the new molecule would be all light - if replication were dispersive, the first generation would be a single intermediate band as recorded, but the second generation would still show only one band (shifted toward the light density) rather than two, because old and new DNA would be interspersed in all of the daughters
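
The band patterns predicted by the semiconservative and conservative models can be checked with a small simulation. This is only an illustrative sketch: each duplex is a pair of strands labeled "H" (15N) or "L" (14N), and the function names are hypothetical.

```python
from collections import Counter


def replicate_semiconservative(duplexes):
    """Each old strand templates a new light strand and stays paired with it."""
    return [tuple(sorted((strand, "L"))) for duplex in duplexes for strand in duplex]


def replicate_conservative(duplexes):
    """Each old duplex stays intact and an all-new light duplex is made."""
    return [d for duplex in duplexes for d in (duplex, ("L", "L"))]


def band_pattern(duplexes):
    """Fraction of DNA in each density band (HH heavy, HL intermediate, LL light)."""
    counts = Counter("".join(duplex) for duplex in duplexes)
    return {band: n / len(duplexes) for band, n in counts.items()}


start = [("H", "H")]  # fully 15N-labeled DNA moved into 14N medium
gen1 = replicate_semiconservative(start)
gen2 = replicate_semiconservative(gen1)
print(band_pattern(gen1))  # {'HL': 1.0}
print(band_pattern(gen2))  # {'HL': 0.5, 'LL': 0.5}
print(band_pattern(replicate_conservative(start)))  # {'HH': 0.5, 'LL': 0.5}
```

Only the semiconservative model gives a single intermediate band after one round and an even split of intermediate and light bands after two, which matches the observed result.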

How does a single nucleotide mutation in a Duffy antigen gene confer resistance to malaria?

-Plasmodium parasite gets into red blood cells by exploiting cell surface chemokine receptors ("Duffy antigens") - GATA transcription factors regulate the production of Duffy antigens by binding cis-regulatory elements and acting as activators - human resistance is due to a single mutation, T-46C, in the GATA DNA binding site; this reduces the ability of GATA to bind and reduces transcription of the Duffy antigen/chemokine receptor (DARC) by 96% - individuals with the T to C change at the GATA binding site are resistant to malaria - C mutation: much less transcription of the Duffy antigen RNA - it comes down to one base determining whether the gene is transcribed

Several DNA motifs in proximal promoters are essential for the commencement of transcription from particular genes. What are two of these motifs and how do they function to promote transcription?

-RNA poly 2 recruitment to promoters requires interactions with other proteins (general TFs)

In genome wide association studies of normal human trait variation and disease, an "odds ratio" (OR) or its log ( log10 odds ratio; LOD) is frequently calculated. What does an odds ratio represent?

-The OR is the measure of how much higher the odds of a specific variant SNP are in one group compared to another - OR = (count of one nt in cases / count of that nt in controls) / (count of other nt in cases / count of other nt in controls) - OR > 1 indicates the odds of that variant are greater in the case group than in the control group - OR magnitude indicates the strength of association - odds ratios and the numbers of SNPs are used to determine the statistical significance of associations - significance is judged from the p value of the association (usually shown on a log10 scale), and it is only considered significant when smaller than 10^-8 because of the number of comparisons (thousands) being made
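
A worked example of the odds ratio formula above, as a minimal sketch. The allele counts are made up for illustration and the function name is hypothetical.

```python
def odds_ratio(variant_cases, variant_controls, other_cases, other_controls):
    """Odds ratio for a variant allele, following the formula in the notes:

    (variant in cases / variant in controls) / (other in cases / other in controls)
    """
    if 0 in (variant_controls, other_cases, other_controls):
        raise ValueError("counts used as denominators must be nonzero")
    return (variant_cases / variant_controls) / (other_cases / other_controls)


# Hypothetical counts: the variant allele is seen 60 times in cases and 40 times
# in controls; the other allele is seen 40 times in cases and 60 in controls.
print(odds_ratio(60, 40, 40, 60))  # 2.25 (up to floating point), variant enriched in cases
```

An OR of 1 would mean the variant is equally common relative to the other allele in both groups, that is, no association.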

miRNA

-a class of functional RNA that regulates the amount of protein produced by a eukaryotic gene - microRNAs: key regulators of mRNA abundance - 1900 miRNA genes in the human genome, each with multiple target mRNAs - forms the miRISC, which can target mRNAs for degradation or block translation

Histone modifications are essential for regulating chromatin state. What are major types of modifications and the enzymes that participate in them? What are their consequences for transcription?

-acetylation: histone acetyl transferase (HAT) acetylates histones on Lys residues, which makes them less attracted to DNA because the NH3+ group is no longer charged; this makes the chromatin more open and more readily transcribed - histone deacetylase (HDAC) removes the acetyl group from the Lys residue, which restores the positively charged NH3+ so the histones are attracted to DNA; this means the DNA is held more tightly and has less transcription - methylation: protein arginine methyltransferase (PRMT) adds up to 2 methyl groups to Arg, histone methyltransferase (HMT) adds 1, 2, or 3 methyl groups to Lys; the consequences of methylation are not as predictable as those of acetylation

What are the ways in which aneuploidies can arise and what might explain the increased risk of transmitting aneuploidies with increasing maternal age?

-aneuploidy can arise during meiosis 1 or meiosis 2 in gametes, and can also occur in somatic cells in mitosis - if nondisjunction occurs in meiosis 1 you end up with two gametes carrying an extra copy (trisomy after fertilization) and two gametes missing a copy (monosomy after fertilization) - if nondisjunction occurs in meiosis 2 you end up with one gamete carrying an extra copy, one missing a copy, and two normal haploid gametes - increased incidence with maternal age: oocytes form at the embryonic stage and there is a long time between formation, maturation, and finally fertilization; they don't finish meiosis 1 until mature and don't finish meiosis 2 until fertilized
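
The gamete outcomes for the two kinds of nondisjunction can be tallied with a toy model that follows a single chromosome pair through meiosis. It is a sketch under simplifying assumptions (one chromosome, four products, an MII error in only one of the two secondary cells), and the function name is hypothetical.

```python
def meiosis_copy_numbers(error=None):
    """Copies of one chromosome in each of the four meiotic products.

    error: None for normal meiosis, "MI" for nondisjunction of homologs in
    meiosis 1, "MII" for nondisjunction of sister chromatids in meiosis 2
    (assumed to happen in only one of the two secondary cells).
    """
    homologs = ["maternal", "paternal"]  # each consists of two sister chromatids
    # Meiosis 1: homologs separate, or both go to the same cell on an MI error.
    if error == "MI":
        secondary_cells = [homologs, []]
    else:
        secondary_cells = [[homologs[0]], [homologs[1]]]
    gametes = []
    for i, cell in enumerate(secondary_cells):
        # Meiosis 2: sister chromatids separate, one to each gamete,
        # unless both sisters go to the same gamete on an MII error.
        if error == "MII" and i == 0:
            gametes += [2 * len(cell), 0]
        else:
            gametes += [len(cell), len(cell)]
    return gametes


print(meiosis_copy_numbers())       # [1, 1, 1, 1]
print(meiosis_copy_numbers("MI"))   # [2, 2, 0, 0]
print(meiosis_copy_numbers("MII"))  # [2, 0, 1, 1]
```

Fertilization by a normal gamete adds one copy to each product, so a 2 becomes a trisomy, a 0 becomes a monosomy, and a 1 gives the normal two copies.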

What is aneuploidy and how does the incidence of aneuploidy compare between humans and other organisms and between humans at different stages of development?

-aneuploidy is an incorrect number of entire chromosomes or chunks of chromosomes - results in incorrect dosage balance among genes - rare in many species (0.01% in yeast), less than 2% in other mammals, far more common in humans (5-25%) - due to failure of normal chromosome segregation in meiosis 1 and 2 - most aneuploidy is related to oocytes - 20% aneuploidy from oocytes - causes 35% of spontaneous abortions and 4% of still births - unknown why humans are so bad at this

What proteins function in DNA replication besides DNA polymerase and what are their activities?

-at potential origins of replication the pre-replication complex (pre-RC) forms, containing the heterohexameric origin recognition complex (ORC), mini chromosome maintenance proteins, and cell division cycle 6 protein; pre-RCs license potential origins for use - helicase (mini chromosome maintenance proteins): unwinds the DNA (ONLY in multicellular organisms), separating the strands of DNA - single strand binding proteins (replication protein A): bind to the single strands and keep them in their open configuration - primase (pol alpha): adds the RNA primer needed by pol delta to replicate the lagging strand; primers have to be made along the open strands of DNA (they are actually RNA primers, short oligonucleotides of RNA placed on DNA), and the RNA is chewed out later by exonucleases - sliding clamp (PCNA): holds the polymerase on the template so the new strand and the template stay together - DNA polymerases: pol epsilon for the leading strand and pol delta for the lagging strand

How do growth factors promote the G1 to S transition in normal cells?

-cell cycle reentry depends on growth factors, ligands that activate the receptor tyrosine kinase (RTK) pathway; the ligand is a dimer and the receptors dimerize too, and once bound they transphosphorylate each other and trigger a series of phosphorylations that activates transcription of genes including cyclins - cell cycle progression requires cyclins and cyclin dependent kinases (CDKs) - when the cyclin-CDK complex forms phosphorylation occurs, with many targets that move the cell from one phase of the cell cycle to the next - about 10 cyclins that allow CDKs to function - about 20 CDKs, each with conserved structures - CDK-cyclin complexes phosphorylate hundreds of target proteins at Ser/Thr residues - phosphorylation enables specific steps of the cell cycle - example: RTK signaling drives transcription of cyclins D and E, which form CDK4-cyclin D and CDK2-cyclin E complexes; these phosphorylate Rb, which is bound to E2F; E2F is inhibited when bound to Rb but is released when Rb is phosphorylated; E2F is a transcription factor and promotes expression of genes needed for DNA replication - Rb, the product of RB, was first identified as a tumor suppressor: when Rb is nonfunctional E2F is always free and drives too much replication, leading to cancers

Define the different elements comprising a eukaryotic "gene body" as defined in class. How is this different from prokaryotes?

-cis regulatory elements (enhancers): locations where transcription factors bind, can be upstream, downstream, or in the middle of the gene body - core promoter: where RNA pol attaches and forms its complex - TSS: transcription start site, tells where to start making RNA - 5' UTR: transcribed sequence between the TSS and the start codon that is not translated - start codon: where translation begins and the ribosome starts making protein - exons: sequence that is actually in the final mRNA - introns: noncoding sequence spliced out of the mature mRNA - stop codon: tells the ribosome to stop synthesizing protein - 3' UTR and transcription termination signal: untranslated sequence after the stop codon, and the signal that tells the machinery to stop making RNA - "gene body" is composed of introns and exons and sometimes cis regulatory elements - prokaryotes have no cis regulatory elements or 5' UTR or 3' UTR, also have no introns, and have a leader sequence instead of a 5' UTR

Distinguish between discrete and continuous traits and give some examples of each.

-continuous traits: trait expression depends on variant alleles segregating at multiple genes, environmental influence, or both; many traits vary continuously (height, age at sexual maturation, skin color, heart disease, skull metrics); each has a graded spectrum of how the phenotype is expressed; many genes often contribute alleles with additive effects; as the number of genes or the number of alleles per gene increases, variation becomes continuous - discrete traits: allelic effects are seen one gene at a time, comparing just two alleles; phenotypic classes are discrete; controlled by a small number of genes, often one (Mendel's peas, lab mutants, major effect disease genes)

What is Fanconi anemia and how does it arise mechanistically?

-defects in cross-link repair mechanisms lead to Fanconi anemia - cross-links are inappropriate covalent bonds within or between DNA strands, which can block DNA replication and transcription - intra-strand cross-links are fixed by base excision repair - inter-strand cross-links require a complex of around 15 proteins that act during DNA replication - genes can be lost, translocated, inappropriately paired, etc. because of breaks in chromosomes and last minute efforts to fix them - genomic instability and inability to turn over RBCs and WBCs as needed - leads to growth retardation, hyperpigmentation, renal and skeletal abnormalities, mental impairment, and heart defects - progressive bone marrow failure: onset at 8 years - avg life expectancy: 16 years - cancer risk is 15000x baseline for myelodysplasia and acute non-lymphocytic leukemia, also hepatocarcinoma, squamous cell carcinoma, and others by 13-16 years

What does it mean for an allele at one gene to be epistatic to alleles at another gene? What is an example of such epistasis? How might epistasis between loci influence phenotypic ratios obtained in "dihbyrid" crosses?

-epistasis: gene interaction in which an allele of one gene masks the effects of another gene's alleles - similar to dominant/recessive relationships but extended between genes - example is Labrador coat color: if the E locus is homozygous recessive (ee) then it doesn't matter what the eumelanin density is, the phenotype will be yellow (ee masks what is occurring at the B locus); coat color depends critically on the combination of alleles you get - in a dihybrid cross you end up with the 9:3:3:1 ratio of genotypic classes, but the phenotypes are 9:3:4 because all ee homozygotes share one phenotype
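
The 9:3:4 ratio can be confirmed by enumerating the 16 gamete combinations of a BbEe x BbEe cross. This is a minimal sketch: the rule that ee gives yellow comes from the answer above, while the "black" (B_) and "chocolate" (bb) labels are the usual Labrador names and are assumed here.

```python
from collections import Counter
from itertools import product


def phenotype(b_alleles, e_alleles):
    """Coat color from the two alleles at the B locus and at the E locus."""
    if "E" not in e_alleles:
        return "yellow"  # ee is epistatic: it masks whatever is at B
    return "black" if "B" in b_alleles else "chocolate"


def dihybrid_phenotype_counts():
    """Count phenotypes among the 16 equally likely offspring of BbEe x BbEe."""
    gametes = list(product("Bb", "Ee"))  # BE, Be, bE, be
    counts = Counter()
    for (b1, e1), (b2, e2) in product(gametes, repeat=2):
        counts[phenotype(b1 + b2, e1 + e2)] += 1
    return dict(counts)


print(dihybrid_phenotype_counts())  # 9 black, 3 chocolate, 4 yellow
```

The 4 in the ratio is the merged 3 (B_ ee) and 1 (bb ee) classes, which look identical because ee hides the B locus.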

Compare exons and introns. Which have coding sequence and which have untranslated regions? Which appear in mature mRNAs?

-exons: sequences retained in the mature mRNA, include the coding sequence plus the 5' and 3' UTRs -introns: noncoding sequences, transcribed but spliced out during RNA processing, so absent from mature mRNA

What early evidence showed that genetic material could be transferred between organisms, and that such material is DNA?

-Griffith's experiments with Streptococcus bacteria in mice - two forms: S form with a polysaccharide coat and R form without it - S form = mouse dies and R form = mouse lives - spontaneous change between the two forms is possible, S can become R - if S was heat killed and then introduced with live R, the mice died, R had been converted to the S form - called TRANSFORMATION: genetic info from dead cells transferred to live cells - Avery, MacLeod, and McCarty later showed that purified DNA from S cells was the transforming material

What are differences between "forward" and "reverse" genetic approaches?

-forward genetics: identify mutant phenotypes, find the gene that is mutated, infer the function of the gene -reverse genetics: choose a gene to study, inactivate that specific gene (homologous recombination, knockdown, CRISPR), infer the function of the gene from the resulting phenotype

Describe the possible steps by which a gene in a region of closed chromatin begins to be transcribed.

-a gene can begin to be transcribed when histones are acetylated or otherwise modified to open up chromatin - coactivators can have histone acetyltransferase or histone methyltransferase activity to promote nucleosome displacement and the opening of the chromatin, coactivators can also bridge activators to the sites of transcription -activators themselves can be switched on in several ways: transcription (activator is produced from one gene and then binds a cis-regulatory element to activate another gene), phosphorylation (activator is phosphorylated and becomes able to bind its cis-regulatory element), cofactor interactions (activator is bound by another factor until a different partner interacts and replaces it), ligand binding (a hormone binds the activator, which activates it to bind the cis-regulatory element), cleavage from inactive precursor (activator is held in the cytoplasm as part of an inactive precursor and once detached can go and bind the cis-regulatory element)

mRNA abundance is relatively easy to measure and this can even be done at large scale by single cell RNA-sequencing, in which transcripts for thousands of genes are isolated and counted for every cell separately from a tissue or embryo. By contrast, protein quantification is much more challenging, and especially so at single-cell level. What are the risks of inferring gene function from mRNA abundance data alone? How predictive are such data for estimating protein abundance and why?

-gene expression and activities of gene products are regulated at many levels: abundance at one level often does not predict abundance or activity at another level - many steps from gene to protein where regulation can occur: genome (whether the gene is expressed), transcriptome (transcriptional bursts, splicing, transport, and how long transcripts persist, controlled by poly(A) tail and miRNAs), translation (how efficient and for how long), proteome (folding and PTMs) - risky because the mRNA can be changed in various ways after it is transcribed (splicing, bursty transcription, poly(A) length, miRNAs, etc.) - predicting the abundance of a protein from the abundance of its mRNA is NOT easy (NOT 1 to 1), many places in between where modification can occur, so mRNA is a poor predictor of protein content

What are some of the ways genes can be differentially expressed?

-genes can be expressed in different locations and at different times -each cell type already expresses a set of TFs that allows expression of some genes rather than others -genes currently expressed determine what is expressed next -varies temporally (expressed at some stages and then turned off) due to changes in TFs that either promote or repress -which TFs are present depends on the signals between cells -hypotheses: differential gene loss: germ line has all genes and some are lost when cells differentiate - selective gene amplification: copy number of some genes is amplified in different cells - genomic equivalence with differential gene expression: genes don't change, just whether they are on or off, somatic genome is the same as germ line, it just depends what's on/off

A more realistic gene-phenotype mapping includes a variety of effects (arrows) that connect an underlying genotype to a phenotype. What do some of these effects represent for an actual organism?

-genetic pathways are interactive and nonlinear - individual genes influence single traits or multiple traits, directly or indirectly - effects of mutations reveal how organisms respond to a change in the function of the affected gene, not necessarily the function of the gene itself - environmental factors also affect final phenotypes - for a mutation in one gene, whether or not you see an effect depends on the middle ground of pleiotropy, modularity, environmental effects, cell-cell interactions

Why is sequence read assembly complicated by sequences that are identical or nearly identical at different places in the genome?

-hard to tell whether reads come from one copy or from two copies of the same sequence in different places, so repeats can be collapsed or misplaced in the assembly, a read may also align almost equally well to slightly different locations and you don't know which placement is correct
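
A small sketch of the placement problem, assuming exact matching only. The toy genome, the reads, and the function name are made up for illustration; real assemblers work from overlaps between reads rather than a known genome.

```python
def placements(genome, read):
    """Return every start position where `read` matches `genome` exactly."""
    hits = []
    start = genome.find(read)
    while start != -1:
        hits.append(start)
        start = genome.find(read, start + 1)
    return hits

# Toy genome containing the repeat GATTACA at two different places.
genome = "AAC" + "GATTACA" + "TTGCC" + "GATTACA" + "GG"

short_read = "GATTACA"    # lies entirely inside the repeat
spanning_read = "ACATTG"  # extends from the repeat into unique flanking sequence

ambiguous = placements(genome, short_read)   # two possible positions
unique = placements(genome, spanning_read)   # one possible position
```

A read that reaches into unique flanking sequence can be placed unambiguously, which is one reason longer reads make repeats easier to assemble.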

What is implied about genes when they are referred to as "homologous," "orthologous," or "paralogous"? What does it mean for genes to be "ohnologs" of one another?

-homologous: genes have shared ancestry - orthologous: genes of shared ancestry in different species - paralogous: homologous genes arising by duplication in the same species - ohnologs: gene copies that arise from Whole genome duplications

Nüsslein-Volhard and Wieschaus were able to define several different phenotypic classes. How were these classes defined? For mutants of different genes that fell into the same class, what might you infer about the normal functions of such genes?

-identified by the arrangement and pattern of bristles on the larvae -phenotypic classes: segment polarity - anterior or posterior part of every segment is disrupted, pair rule - alternating segments are lost, gap gene - large missing regions - mutants of different genes in the same class may have related functions, could influence each other -found a hierarchy of genes establishing the body plan - some required earlier and some required later in development -affected genes encoded transcription factors and cell signaling molecules - genes placed in pathways (crosses to make mutants for multiple genes, experiments manipulating gene activity) - genes encoding TFs could repress or activate others etc. -some genes set the stage for other genes to be turned on

"Third generation" sequencing was not discussed in class but allows much longer sequencing reads than Illumina sequencing (e.g., 10 kb vs 150 bp per read), though at lower "throughput" and sometimes with less precision than Illumina. ("PacBio" is one example of third generation sequencing.) These newer technologies have eliminated the need to make bacterial artificial chromosome (BAC) libraries, as was done for the first draft of the human genome, and "new" genomes are often now sequenced and assembled computationally with a combination of second and third generation strategies. Why is it valuable to have an intermediate level of structure to one's sequencing effort, whether provided by long-read third generation sequencing, or second generation sequencing using BACs as the starting material? Why might third generation approaches now be preferred to BACs?

-intermediate levels of structure constrain where in the genome you are, you at least know how close together the sequence fragments are - third gen may be preferred because it eliminates the BAC cloning step, which could introduce errors, it is also quicker because you don't have to first order the BACs before breaking each one apart to sequence it individually

What is a karyotype and how can karyotyping be accomplished? What can a karyotype show?

-karyotype: an individual's complete set of chromosomes - banding pattern: mitotic chromosomes stained with dyes, AT-rich regions are gene poor and heterochromatic (dark staining), GC-rich regions are gene rich and euchromatic (light staining) - can use Giemsa staining (G-banding) - fluorescence in situ hybridization (FISH): can see chromosomes at higher resolution, DNA or RNA probe labeled with fluorescent dye is hybridized to chromosomes, sticks where complementary, can see the two copies - spectral karyotyping: 24 different probes with different colors, each probe has affinity for different regions so you get a multicolored karyotype - karyotypes reveal chromosome abnormalities: translocations, inversions, intrachromosomal deletions, aneuploidy (abnormal chromosome number)

Distinguish between lagging and leading strand replication and the proteins required for them.

-leading strand: new strand made as the polymerase reads the template 3' to 5', toward the fork, one RNA primer is laid down at the 5' end of the new strand and then DNA Pol ε can synthesize continuously, requires MCM (helicase), Pol α-primase (for the RNA primer), DNA Pol ε, and PCNA (sliding clamp), the RNA primer also has to be chewed up later by exonucleases -lagging strand: template runs 5' to 3' toward the fork, so the new strand cannot be synthesized continuously, instead forms short segments called Okazaki fragments, each started by a new RNA primer and extended 5' to 3', DNA is synthesized in the direction opposite to fork movement, requires MCM, DNA Pol δ, PCNA (sliding clamp), Pol α-primase (for RNA primers), ligase to join the separate fragments together, and a DNA polymerase to fill in the gaps after exonucleases chew out the RNA

Explain how cells remove transcripts that have premature termination codons or other kinds of damage that might result in defective proteins.

-mRNA ORF dictates the peptide sequence, but nuclear export, localization in the cytoplasm, engagement with translational machinery, rate of degradation, etc. depend on bound proteins (cap-binding proteins, TIF, SR proteins, EJC) - EJC (exon junction complex) core proteins and EJC-interacting proteins regulate mRNA function, each EJC has four core proteins and is loaded onto the RNA by the spliceosome as splicing occurs, deposited about 24 nucleotides upstream of each exon-exon junction, EJCs cover the open reading frame but are absent from the 3' UTR, as splicing continues one EJC is added per exon-exon junction, this recruits other proteins to the RNA (EJC-interacting proteins) which influence remaining splicing, facilitate export from the nucleus, promote initial translation, and are required for quality control - nonsense-mediated RNA decay (NMD): normal transcript has EJCs and accessory proteins, as the transcript is translated the first ribosome moves down and pops off each EJC, so by the end of the transcript no EJC is left, if there is a PTC upstream of an EJC site then the first ribosome stops at this PTC and detaches from the mRNA, if an EJC is still on then more proteins are recruited which target the transcript for degradation -WHERE THE STOP CODON IS AND WHERE THE LAST EXON-EXON JUNCTION IS ARE CRUCIAL -NMD discovered in β-thalassemia (premature stop codons cause transcript loss and reduced hemoglobin) -viruses interfere with NMD mechanisms to promote their own replication, including HIV, hepatitis C, Zika, a virus can interfere with assembly of the degradation complex or assembly of the EJC, this makes the process fail so its own RNA can code for multiple proteins on one RNA without being degraded
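
The positional logic of NMD described above can be sketched as a simple rule: a transcript is flagged if any EJC still sits downstream of the stop codon where the first ribosome terminates. This is a deliberately simplified model. The 24 nt offset comes from the notes; the coordinate convention, the example exon lengths, and the function names are assumptions for illustration.

```python
EJC_OFFSET = 24  # nt upstream of each exon-exon junction, as stated in the notes

def ejc_positions(exon_lengths):
    """mRNA coordinates of EJCs, one per exon-exon junction."""
    positions = []
    junction = 0
    for length in exon_lengths[:-1]:  # the last exon has no downstream junction
        junction += length
        positions.append(junction - EJC_OFFSET)
    return positions

def triggers_nmd(exon_lengths, stop_codon_pos):
    """True if an EJC remains downstream of the stop codon.

    stop_codon_pos is the 0-based mRNA coordinate of the first stop codon
    the ribosome encounters (hypothetical convention for this sketch).
    """
    return any(pos > stop_codon_pos for pos in ejc_positions(exon_lengths))

exons = [200, 150, 300]                    # junctions at 200 and 350
normal = triggers_nmd(exons, 500)          # stop in last exon: False
premature = triggers_nmd(exons, 100)       # PTC in first exon: True
```

In this model a stop codon in the last exon never triggers decay, because every EJC has already been displaced by the time the ribosome reaches it.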

What are some of the ways to classify mutations and effects?

-mechanistic consequences of a mutation depend on the region affected: coding - deletion or insertion, nonsense, missense, or synonymous substitution, exon-intron junction - disrupts normal splicing, mis-spliced RNA, regulatory regions - altered transcriptional efficiency - also depend on how gene function is affected: loss-of-function alleles reduce the amount of gene product or its activity: null: complete loss of product (no mRNA is made, no protein, happens if the gene is deleted), hypomorphic: less product or less active product (ex. mutation at an enzyme active site or not enough transcript made) - gain-of-function alleles increase the amount of product or its activity: overexpression: higher expression than WT (ex. mutation in a uORF, more protein than should be made), hyperactivity: higher activity than WT (binds too strongly, receptor always on whether or not ligand is present), neomorphic: new expression domain or protein activity (product interacts with a protein in a way it shouldn't, expressed in locations it shouldn't be, unpredicted impact on phenotype)

Describe the various ways that ATP-dependent chromatin remodelers can affect DNA binding site accessibility.

-modifiers can change not only affinity but also nucleosome positions -ATP-dependent remodelers associate with the nucleosome and DNA and can expose sites (specific cis-regulatory elements) for DNA-binding proteins, can do this by site exposure (repositioning, ejection, or unwrapping) or by altered composition (exchanging dimers, incorporating histone variants, or ejecting dimers from nucleosomes)

What is meant by "additivity," across alleles of a gene or across genes?

-most phenotypes require multiple genes, each of which has multiple alleles - phenotypes can involve interactions among loci and alleles: the phenotypic contribution of one locus and its alleles can depend on the genotype at other loci - genes interact additively if each makes its own contribution to the phenotype - example is lentil color: the dominant allele of each gene makes a different enzyme active, producing a certain color (with both dominant alleles the colors mix to brown, with one or the other only that color appears, and with neither the seed remains green) - 9:3:3:1 ratio of genotypic classes and of the four different phenotypes

What are the components and functions of nucleosomes?

-nucleosomes are made of two copies of each core histone (H2A, H2B, H3, H4) with DNA wrapped around them, plus linker histone H1, basic structural unit of chromatin -function to tightly pack DNA and to change the state of chromatin based on modifications to them -histone tails are exposed beyond the nucleosome and can carry PTMs: acetylation, methylation, phosphorylation, and ubiquitination of certain residues - ex. histones can be acetylated so they are not as tightly bound to DNA, which is then more open to be transcribed - PTMs can also alter recognition by cofactors, which influences what other proteins interact with the histones, the effect on transcription depends on which residue is modified

What evidence supports the idea that genes are often pleiotropic (and to a particular degree)? Are there caveats to consider about such evidence?

-one gene KO may affect one module but also be connected to and impact another module -example is Bardet-Biedl syndrome: 19 known genes each affect 5-19 traits, leading to phenotypic changes in the limbs, craniofacial skeleton, kidneys, retina, genitals, these mutations affect the primary cilium, which is used for signaling by most vertebrates, the genes are required for development so mutations influence many different traits, mutations in a single gene make problems in several different areas -caveat: doesn't mean gene function only involves the areas with phenotypic changes, just that it is essential in these areas

What is the meaning of "pleiotropy"? How might this differ when referring to genes as opposed to alleles of genes?

-pleiotropy means that a single gene or allele affects more than one trait - for genes this means one gene could act at many important points of a developmental pathway and impact many traits coming out of it

What are the consequences of repeat expansion (or less frequently contraction) with respect to human health and disease? Why might simple sequence repeats be prone to accumulating errors during DNA replication?

-polymerase errors during DNA replication can lead to expansions and contractions of nucleotide repeats, sometimes affecting expression or protein function - template looks the same at every repeat unit, so polymerase can slide back and add more repeats, repeat length can expand and contract over time, repeat lengths can be used in genotyping to determine parentage - ex. CAG repeats can affect cis-regulatory elements: change the distance between two cis-regulatory elements or between a cis-regulatory element and transcription initiation - if the repeat is in coding sequence it can lead to Huntington disease (polyQ expansion): glutamine repeat expansion in exon 1 of the huntingtin protein leads to toxicity, protein accumulates in a nonfunctional state at around 36-120 CAG repeats and poisons cells -single nt additions, deletions, or substitutions can also result from DNA pol errors (one error every 10^4 to 10^6 nt) -error rates: Pol ε < Pol δ < Pol α - C-to-T transitions are the most common - can also have transversions between purines and pyrimidines
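
A short sketch of how one might count uninterrupted CAG repeats in a sequence and compare the count with the roughly 36-repeat lower bound mentioned above. The threshold constant is taken from the notes; the function names and the example sequences are hypothetical.

```python
import re

EXPANDED_THRESHOLD = 36  # lower end of the 36-120 repeat range in the notes

def longest_cag_run(seq):
    """Length, in repeat units, of the longest uninterrupted CAG run."""
    runs = re.findall(r"(?:CAG)+", seq)
    return max((len(run) // 3 for run in runs), default=0)

def is_expanded(seq):
    """True if the longest CAG run reaches the expansion threshold."""
    return longest_cag_run(seq) >= EXPANDED_THRESHOLD

typical_allele = "ATG" + "CAG" * 20 + "CCGCCA"
expanded_allele = "ATG" + "CAG" * 40 + "CCGCCA"
```

Because every repeat unit looks identical, a slipped polymerase simply changes the number returned by `longest_cag_run`, which is why repeat length varies between individuals and generations.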

What is the origin of the poly(A) tail found on mature mRNA? Why is having a poly(A) tail important, and what proteins contribute to functional roles of the poly(A) tail?

-pre-mRNAs are polyadenylated at the 3' end when transcription terminates - RNA-binding proteins associate with Pol II and recognize polyadenylation signals (AAUAAA) in the 3' UTR, several may be present and the choice is regulated -RNA Pol II transcribes the signal sequence, which is recognized by factors including poly(A) polymerase -endonuclease binds and cuts at the G/U-rich target site 11-30 nt beyond the poly(A) signal -poly(A) polymerase adds a poly(A) tail of 200-250 adenosines -functions of the poly(A) tail: provides binding sites for poly(A)-binding proteins in the nucleus (PABPN) and in the cytosol (PABPC) - PABPN: promotes poly(A) polymerase activity (adds more A nucleotides), interacts with cleavage and polyadenylation factor to influence poly(A) length and cut the transcript, creates circular structure, functions in RNA export to cytosol - PABPC: interacts with eukaryotic initiation factors needed for ribosome function (eIF4G, eIF4E), maintains the loop structure, which is critical because it protects the mRNA from being chewed up by exonucleases (they cannot access the ends of the RNA), promotes ribosome release (eRF3, eRF1), blocks degradation of mRNA at 3' and 5' ends -a shorter poly(A) tail may limit the opportunity for PABPC binding, translational efficiency, and protection from nucleases

What evidence points to constancy of the genome during development and across cell types?

-principle of genomic equivalence -polytene chromosomes: in insect larvae chromosomes replicate over and over, the whole genome is replicated across cell types, pretty consistent cell to cell, which shows genes are not being lost or amplified between types - best evidence for genomic equivalence is cloning: shows that a differentiated cell has everything needed to make a new organism, so differentiation is really about changes in expression

In trying to determine where a mutant gene lies relative to marker loci, one seeks to minimize the number of recombinants as well as double recombinants. Why would double recombinants within an individual be suspect?

-recombinants are rare and double recombinants even more rare - in particular very rare within a very small region of the chromosome -want to find gene with minimum number of recombinants because since mutant is homozygous it is likely that regions close to it (marker loci) are also homozygous -places that appear like double crossovers may be PCR errors

What is meant by "robustness" and "canalization" of a phenotype?

-robustness: invariance of a phenotype in the face of genetic or environmental perturbation -canalization: robustness evolved under natural selection to stabilize phenotype and decrease variability - ex. ball rolling on the Waddington landscape, can go up the side and take a slightly different path but is still heading towards differentiation, a complex set of interactions underlies the process of developing a phenotype

What is meant by a "saturation" screen?

-screen is saturated when new alleles fall at loci for which mutants have already been found -steady accumulation of new mutant phenotypes, but the number of mutants doesn't directly correspond to the number of genes identified - more families looked at = more mutants found - can figure out by crossing whether two mutants identified at different times affect the same gene or different genes - over time you hit the same genes over and over again and just get new alleles for these genes - number of genes levels off around 3000-5000 chromosomes scored, not getting many more loci represented by mutants -when it levels off, you have probably found all the genes that can be found using this screen, the point where you can no longer get more

Can defects in alternative splicing lead to human pathologies? What is one example?

-significance of most normal splice variants and protein isoforms is NOT known - titin is a unique case: largest protein known, regulates length and elasticity of striated muscle, alternative splicing between fetus and adult regulates cardiac muscle elasticity, depends on splicing factor RBM20 (RNA binding motif 20), which is a trans regulator of splicing, in the fetus 2 splice forms of titin have a large middle region which keeps elasticity, in adults splicing switches so the protein lacks the middle region, with mutant RBM20 this change in splicing cannot occur as you become an adult, leading to fibrosis in cardiac tissue and cardiomyopathy, specific phenotype ONLY due to a change in the splicing pattern, mutant individuals can only make the two larger, elastic forms of titin because the gene that produces the RBM20 protein is defective

What is a genetic x environmental interaction? Can you think of such interactions outside the explicit examples from class and why they might be relevant?

-some genotypes perform better or worse in some environments compared to others - ex. spadefoot toad: lives in desert and breeds in ponds that dry up quickly, some genotypes are larger when the pond dries up quickly and others are larger when the pond does not dry up as fast - ex. colorectal cancer: genotype-specific treatment benefits, people with an abnormal enzyme for breaking down aspirin do better - ex. MS: risk is higher in genetically predisposed people who have had Epstein-Barr virus, and even higher in those who smoke - relevant environmental factors include air pollution, diet, smoking, social environment, etc.

What is "synteny"? What do synteny comparisons suggest about chromosome evolution across Muntjac deer and other ungulates? What about comparisons of human with other species?

-synteny: regions of shared gene order, often preserved over 10s-100s of millions of years - homologous genes: shared ancestry - orthologous genes: homologous genes in different species - paralogous genes: homologous genes resulting from duplication within one species - chromosomes can undergo fission or fusion within individuals, changing the total number of chromosomes (the karyotype), rarely passed to offspring, but fissions and fusions do accumulate over evolutionary time -muntjac deer have only four pairs of homologues whereas Reeves's muntjac (reevesi) has many pairs, so muntjac have undergone more fusions than reevesi -fusions, fissions, translocations, inversions all contribute to chromosome evolution - corresponding regions of chromosomes encoding particular genes can be mapped between distant species: in human vs mouse, many genes that share a human chromosome are found on different chromosomes in mouse, whereas human and gar show strong conservation of chromosome content (genes on chromosome 4 are on the same chromosome in gar) - homologous genes all share ancestry, and some undergo duplications to give paralogous genes

How does organismal cloning work?

-take mammary cells from a sheep and place them in culture -take an oocyte from a different sheep and remove its nucleus - combine with a cell from the mammary gland - cells are fused and the mammary gland cell gives its nucleus to the egg - this egg is placed in a host sheep -mammary cell is diploid so no sperm is needed (already 2N) -renucleated egg develops into a normal sheep

Explain the consequences of normal and pathological telomere shortening.

-telomerase activity is tightly regulated and low overall compared to telomere numbers - telomeres shrink with age and in rapidly cycling cells - as few as 5 (of 46) telomeres that are too short can trigger senescence or cell death - mutations in 8 genes encoding telomere proteins or RNA can cause syndromic diseases: dyskeratosis congenita and aplastic anemia (very short telomeres), idiopathic pulmonary fibrosis (slightly shorter) -disease can be hereditary because one can inherit telomeres that are already too short, a consequence of inadequate telomere maintenance, a mutation can have multigenerational consequences if the germ line is affected

Which strand of DNA is transcribed by RNA Pol II and in what direction? In what direction is the RNA molecule synthesized? What is a "transcription bubble" and over what region are RNA bases hydrogen bonded to DNA bases in an "RNA-DNA hybrid"?

-the template strand is read in the 3' to 5' direction to make an RNA that matches the RNA-like (sense) strand, the RNA is synthesized in the 5' to 3' direction, with nucleotides added to the 3' end of the RNA being made - the transcription bubble is the region within the RNA Pol II complex where the two strands of DNA are melted apart and the RNA is being transcribed -the RNA-DNA hybrid is about 8 nucleotides long
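
The strand relationships can be sketched in a few lines. The function name and the example sequence are illustrative only; the template is written 3' to 5', so the RNA produced reads 5' to 3' from left to right.

```python
# Base-pairing rules for copying a DNA template into RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5):
    """Build the RNA (5'->3') from a template strand written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)

template = "TACGGT"         # 3'-TACGGT-5'
rna_like_strand = "ATGCCA"  # 5'-ATGCCA-3', the sense strand

rna = transcribe(template)  # 5'-AUGCCA-3'
```

The RNA matches the RNA-like strand base for base, with U in place of T.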

How might the speed of RNA pol II be regulated and what is a proposed mechanism by which this could result in alternative splicing to generate different mRNAs from a single gene?

-transcription rate differentially exposes exons to splicing factors - histone methylation/acetylation influences RNA Pol II speed -faster transcription results in larger loops that are more vulnerable to being spliced out - precise histone modifications and chromatin state may favor some splice variants over others - splicing and transcription occur simultaneously: when RNA Pol II is moving really fast there is a mismatch between the speed of the spliceosome and RNA Pol II, so the spliceosome is associated with a larger loop, sometimes this excess loop contains an exon and the spliceosome removes two introns and an exon instead of one intron, if Pol II moves slowly exons are less likely to be cut out - in any given case we cannot determine what specifically affects splicing, we know some factors but not EXACTLY how splicing choices occur

How can chromosome-scale rearrangements or other changes lead to changes in cancer susceptibility or progression? Why would particular alterations occur recurrently for particular cancers?

-translocations can affect open reading frames and protein function: two genes with DS breaks are fused together during repair, this makes a novel protein that is actually functional and can be favorable for cancers, ex. breast cancer: transcription factor ETV6 and NTRK3 (receptor tyrosine kinase) are joined, and the fusion is able to phosphorylate targets and increase cell cycling -translocations can affect regulatory elements and gene expression: cis-regulatory elements can be brought next to genes where they were not supposed to be, the enhancer of one gene is brought to another and can drive expression at inappropriate times and places

What are "tumor suppressors" and what kinds of roles do they play during normal development or homeostasis?

-tumor suppressor genes encode proteins that regulate cell division -example is RB (identified as a tumor suppressor), the gene that codes for Rb, Rb binds to E2F (a transcription factor that promotes transcription of genes related to DNA replication), Rb inhibits E2F, negative control - example is p53, a transcription factor that can induce cell death when DNA is damaged, can also induce p21, which inhibits CDK-cyclins from phosphorylating Rb

Many (but not most) genes have mRNAs with upstream open reading frames in their 5' UTR. What is a commonly accepted function of these upstream ORFs and what observations support this idea? What kinds of genes seem to have these upstream ORFs?

-uORFs reduce translational output for mRNAs that encode very potent proteins (growth factors), this reduction in protein is normal and important for regulating differentiation in some contexts (nervous system) -evidence: 5' UTR of PTCH1 hooked to a fluorescent reporter gene, when 3 uORFs were removed 5-fold more fluorescent protein was made, when 1 uORF was mutated fluorescence intensity doubled - shows that uORFs in the 5' UTR put the brakes on translation, make translation LESS efficient -genes whose protein products are very potent often have uORFs (easy for too much protein to be made, which is deleterious) -with neuronal cells: WT cells differentiate correctly into the Foxa2 state, when one uORF is missing upstream of the PTCH1 coding sequence much more PTCH1 protein is made, which leads to much less differentiation to the Foxa2 state, too much PTCH1 poisons cells and doesn't let them differentiate -why do uORFs lower translational efficiency? multiple AUGs mean multiple places a ribosome can start, and a ribosome there prevents other ribosomes from binding at the same time, where the ribosome recognizes the first AUG has a huge impact, there are multiple opportunities to start on a small uORF and detach before reaching the protein-coding sequence, only a fraction of ribosomes would get past the uORFs and actually translate the main ORF, evolutionarily more favorable to make small peptides than too much of a potent protein, uORFs don't have good Kozak sequences so they are recognized in some cases but not in others

What are potential fates of duplicated genes, and what are some ways that gene function might be affected for paralogous copies that both persist after duplication?

-vast majority of duplicated genes get lost, even when lost there is still a possibility of introducing new variation and gene functions -around 85% of duplicate genes are lost after WGD, likelihood of retention depends on dosage sensitivity of the pathway, vulnerability to dominant negative mutations - if retained, mutations in coding sequence may lead to new protein activity -mutations in noncoding regions may lead to altered domains of expression, function in new cell types -new functions for genes arising from WGD are associated with increases in organismal complexity - cis-regulatory elements influence whether a gene is expressed in a tissue, gene duplication can lead to modified old functions, new functions, etc. - gene duplication (shown for a haploid genome): as evolution continues new mutations arise - nonfunctionalization: function of one copy of the gene is lost (ex. due to a PTC), this copy is not functional but there is still one fully functional copy of the gene, happens to most duplicated genes, one copy is usually inactivated, redundancy is built into the system, will work even if one paralog is lost - neofunctionalization: one paralog has some mutation in a cis-regulatory element which leads to new function, a copy is still expressed in each original expression domain but there is now also expression in a novel place due to duplication - subfunctionalization: each cis-regulatory element drives expression in a certain location, copies accumulate mutations, both genes stay functional and in aggregate cover all regions, but the expression domains have been divided, original functions of the gene are divided between two separate genes, could continue to separate in evolution, in reality not this simple and additive, can allow more of the gene to be expressed

What factors besides ribosomes are necessary for translation initiation?

1. 40S ribosomal subunit must associate with eukaryotic initiation factors (eIFs) and Met-tRNA, mRNA associates with PABPC and eIFs 2. mRNA and 40S subunit associate together 3. 40S subunit scans to find the AUG start codon 4. eIFs help the 60S subunit join the 40S 5. mature 80S ribosome starts translation

There are several types of repeated sequence in human and other genomes. What are their broad categories (recognizing that we will discuss in more detail later in the semester)?

1. Simple sequence repeats (SSR) - example is CA microsatellites, one SSR per 2 kb, many people have different sizes so they can be used to determine paternity, sizes differ because on repetitive sequence DNA polymerase can jump ahead and miss a repeat or slip back and add one, might find several sizes of a repeat by PCR 2. transposable elements: when functional can encode proteins that allow them to move within the genome and copy themselves, abundance varies across species, humans 44%

Darwin described the process of natural selection. What is needed for this process to occur?

1. variation - individuals among a population differ in some trait 2. fitness differences: consistent relationship between value of a trait and reproductive success 3. inheritance - consistent relationship for value of a trait between parent and offspring

40S/60S/80S ribosome/subunits

40S - small subunit; 60S - large subunit; 80S - entire ribosome

additivity

A mechanism of quantitative inheritance such that the combined effects of genetic alleles at two or more gene loci are equal to the sum of their individual effects.
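The definition above can be sketched numerically. This is a minimal illustration, assuming made-up effect sizes and locus names (`locusA`, `locusB`, and the 160 cm baseline are hypothetical, not course data): each copy of a "+" allele adds a fixed amount, and the combined effect is just the sum.

```python
def additive_value(genotype, effects, baseline=0.0):
    """Sum per-locus allele effects onto a baseline phenotype.

    genotype: number of '+' allele copies (0, 1, or 2) at each locus
    effects:  phenotypic effect of one '+' copy at each locus
    """
    return baseline + sum(effects[locus] * copies for locus, copies in genotype.items())


# Hypothetical effect sizes: cm of height added per '+' allele copy
effects = {"locusA": 2.0, "locusB": 1.0}

only_a = additive_value({"locusA": 2, "locusB": 0}, effects, baseline=160.0)
only_b = additive_value({"locusA": 0, "locusB": 1}, effects, baseline=160.0)
both = additive_value({"locusA": 2, "locusB": 1}, effects, baseline=160.0)

# Under additivity the joint effect equals the sum of the separate effects
print(both - 160.0, (only_a - 160.0) + (only_b - 160.0))
```

If the loci interacted (epistasis), the two printed numbers would differ.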

What observations suggested that particular bases pair in DNA?

Chargaff: the amounts of A and T are equal, and the amounts of G and C are equal (A:T and G:C ratios of about 1), across many organisms
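A quick way to see why base pairing produces these ratios: a minimal sketch, assuming a short made-up top strand (`AAGAGGATAA` is hypothetical). A single strand can have any composition, but once its complementary strand is included, A must equal T and G must equal C.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def base_fractions(seq):
    """Fraction of each base (A, C, G, T) in a DNA sequence."""
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    return {b: counts[b] / total for b in "ACGT"}


def both_strands(top):
    """Concatenate a strand with its reverse complement (a double-stranded sample)."""
    bottom = "".join(COMPLEMENT[b] for b in reversed(top.upper()))
    return top.upper() + bottom


top = "AAGAGGATAA"                     # hypothetical, deliberately A-rich strand
single = base_fractions(top)
double = base_fractions(both_strands(top))

print(single)   # A and T differ on one strand
print(double)   # A == T and G == C once both strands are counted
```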

start codon

AUG in mRNA (ATG in the DNA coding strand)

Explain the experimental results that led to the "one gene, one enzyme" hypothesis. What modern genetic and genomic data suggest the original hypothesis was framed too narrowly?

Beadle and Tatum in the 1940s - screened the bread mold Neurospora for mutants unable to synthesize their own amino acids - used X-rays to damage DNA; repair introduced mutations - wild type can synthesize all of its own amino acids - irradiated spores, crossed to wild type, and obtained haploid spores - grew these on complete medium, then on minimal medium - the ones that died on minimal medium were then grown on minimal medium supplemented with individual amino acids to see which allowed survival - for example, some survived only when given arg, so were arg deficient - found four different mutants blocked at different steps of arg synthesis (missing certain enzymes, etc.) - implies that genes code for proteins with distinct enzymatic activities that function in linear pathways - NEW DATA: SHOWS GENES ARE COLINEAR - high-res mapping showed how induced mutations altered amino acids in the corresponding portions of proteins; missense mutants change an AA, nonsense mutants stop translation; each nucleotide is associated with the identity of one AA, and multiple nucleotides specify one AA because adjacent mutations could affect the same AA; mapped how genes correspond to proteins

How can one use transcriptional reporters to assay gene expression and its dependence on cis-regulatory regions?

Can add a reporter to a gene of interest by: 1. cloning candidate regulatory regions by PCR and joining them to reporter gene cDNA in vitro (endogenous gene unaffected) 2. expressing the reporter in the embryo and comparing its expression domains to the endogenous transcript 3. if reporter expression faithfully reproduces native expression, it can be used in experiments 4. testing for differences in reporter expression when cis regulatory regions are deleted - if the reporter is not expressed where it should be, may not have amplified all the cis regulatory elements, or some may lie downstream or in introns and still have effects - if the reporter matches expression, can use it as a proxy in living animals to see how expression changes over time - can use to find which regions are essential to drive expression: if a cis regulatory region is deleted and the reporter is still expressed normally, that site probably didn't affect the gene much; if the time or place of expression changes, that cis reg element did affect expression

Describe the basic structure of nucleotides and DNA or RNA.

DNA: nucleotides A, G, C, T; deoxyribose sugar; phosphate group - RNA: nucleotides A, G, C, U; ribose sugar; phosphate group - phosphodiester bonds link nucleotides - DNA structure: double helix with

EJC

Exon junction complex: proteins deposited on mRNA by the spliceosome near exon-exon junctions as splicing occurs; knocked off by the ribosome during translation; an EJC left on the mRNA after the ribosome stops signals a PTC

Distinguish between the approaches to mapping used in genome wide association study (GWAS) and a QTL study. Under what circumstances might one choose to apply one approach or another to identifying the genetic bases of quantitative phenotypic variation?

GWAS: used for humans or any case where you cannot control breeding; takes advantage of natural (historical) recombination; requires LD between mappable variants (SNPs) and the variants affecting the trait of interest; good for polygenic traits (ex. heart disease, IQ, personality); is just a correlation study on a population of random individuals; uses genome-wide SNP genotypes or whole-genome sequences to find SNPs associated with a particular phenotype (condition); tests SNPs across an entire population one at a time, so does not by itself account for interactions among loci - QTL: you can actually breed and cross; can add more individuals or interbreed to narrow down the critical interval; shows which regions are contributing to the phenotype and can map epistatic interactions; biparental population; genotypes marker loci and uses linkage to find associations between markers and the phenotype; does not require genome-wide SNPs

reporter gene

Gene encoding a protein whose activity is easy to monitor experimentally; used to study the expression pattern of a target gene or the localization of its protein product, used fluorescent reporter genes to study uORFs in PATCH 5'UTR

What are the essential features of a genetic material and what makes RNA a particularly good candidate for being the first one?

Genetic material must be able to: store information, express information, replicate, and accommodate introduction of new variation - RNA is able to: - encode information - form complex folding patterns --- a single strand can fold on itself to make double-stranded areas, suggesting function due to structure - highly conserved across all forms of life --- some individual types are found across all forms of life, some rRNAs in ALL kingdoms of life - act as an enzyme (ribozyme) --- ribosomal RNA; ribonuclease P cleaves phosphodiester bonds on tRNA and other small RNAs and can be isolated and work without proteins!; self-splicing introns are noncoding regions spliced out without any proteins - self replicating --- not currently in living organisms but in viruses

What are Mendel's "Laws" of Segregation and Independent Assortment and what analyses led him to these laws?

Law of segregation: the two alleles of a gene (one from each parent) separate from each other during gamete formation, so each gamete receives only one - Law of independent assortment: different pairs of alleles (for different genes) segregate independently of one another during gamete formation - observed this by crossing garden peas and counting offspring phenotypes; saw that seed color and seed shape assorted independently of one another
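The two laws can be checked by brute-force enumeration. This is a minimal sketch, assuming a dihybrid self-cross with hypothetical allele symbols (`Y`/`y` for seed color, `R`/`r` for seed shape, uppercase dominant): segregation gives each gamete one allele per gene, and independent assortment makes all four allele combinations equally likely.

```python
from collections import Counter
from itertools import product


def gametes(genotype):
    """All gametes from a two-gene genotype such as 'YyRr' (one allele per gene)."""
    gene1, gene2 = genotype[:2], genotype[2:]
    return [a + b for a, b in product(gene1, gene2)]


def phenotype(gamete1, gamete2):
    """Phenotype of the offspring; an uppercase (dominant) allele masks lowercase."""
    return "".join(
        a.upper() if (a.isupper() or b.isupper()) else a
        for a, b in zip(gamete1, gamete2)
    )


# Self-cross of a double heterozygote: 4 x 4 = 16 equally likely gamete pairings
parent_gametes = gametes("YyRr")
ratio = Counter(phenotype(g1, g2) for g1, g2 in product(parent_gametes, repeat=2))

print(ratio)   # counts for YR, Yr, yR, yr phenotype classes
```

The counts come out 9:3:3:1, the ratio expected when the two genes assort independently.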

Define "locus" and "allele." What do these terms mean when applied to non-protein coding DNA (like microsatellites)?

Locus: the site or location of a gene on a chromosome - allele: one of two or more alternative forms of a gene - for non-coding DNA, a locus is the position of a microsatellite and its alleles are the different SSR lengths (variants) found at that locus, which can vary from person to person

What are some major conclusions from studies of the Drosophila even skipped (eve) cis regulatory region, using transgenic reporters and mutant lines?

Made reporter constructs for the even skipped gene - 9.2 kb of sequence was sufficient to drive even skipped expression; made reporters with deletions to show which cis regulatory elements drive which stripes - tested different regions and lengths to see what pattern each made when attached to a reporter - can see which regions drive expression of certain stripes and map the critical reg elements driving expression of the gene - deletion analysis found regions controlling stripes 3+7 and stripe 2: each contains a series of binding sites for TFs higher up in the hierarchy; some TFs are repressors and some are activators - suggests that if one region controls expression of stripes 3 and 7, then those cells do not express its repressors, or the stripe would be shut down - stripe 2: even skipped is expressed because the activating TFs are present in those cells and the repressors are NOT - can attach a reporter to the stripe 2 element and compare WT embryos to mutants for the repressor Gt, or to a reporter with mutated Gt binding sites - shows that for each stripe there are multiple positive and negative inputs from TFs that make sure the gene is on but refine where and when it is on - general for all genes: reg regions have portions controlling gene expression through a mix of positive and negative factors, depending on which TFs are present and the sum of positive and negative effects - whether or not a gene is expressed depends on the TFs present and their additive effects (what happened before)

Is it possible to predict from sequence data alone the impact of a mutation on splicing or phenotype? Why or why not?

No, it is not possible to predict from sequence data alone the impact of a mutation on phenotype - cannot determine what specifically affects splicing only know factors that affect it like (cis/trans splicing regulators, transcriptional rate) - these factors occur only at the level of the mRNA and not the DNA sequence - cannot be determined by the DNA sequence - diff combos of factors at diff times leads to diff splice forms coming out

NMD

Nonsense-mediated RNA decay: surveillance system in eukaryotes that recognizes and eliminates mRNAs containing nonsense (premature termination) codons within their protein coding regions; recognizes an EJC still present after the ribosome is knocked off at a PTC

Are there any obstacles to the RNA World hypothesis, and, if so, how might they be accommodated?

Problems: - each nucleotide has three chemical moieties - contemporary nucleotides won't couple without chemical activation - phosphates very limited - adenine may have been prevalent but not the other bases - different cations - non-biotic synthesis of ribose unlikely and ribose unstable - solvent properties: nucleotides don't self-assemble with H bonds in H2O, polymeric RNA unstable in H2O - natural RNA polymerase ribozymes unknown - Accommodations: - could have had reasonable substitutes for bases and other materials needed (other sugars, linkers, solvents, and cofactors) - likely RNA didn't look as it does today

What recent data suggest that an RNA-Peptide World might have occurred sooner than originally imagined?

RNA synthesis of peptides - tRNAs have modified bases; AAs with modified side chains can be added; one of the modified bases interacts with an AA, which can be coupled, cleaved, and released (goes from a 1 AA peptide to a 2 AA peptide) - able to occur in vitro with only RNA chains - modified nucleoside bases in contemporary tRNAs are thought to be relics of an ancient RNA world - modified bases can allow transfer and coupling of AAs into a peptide chain

RPL/RPS genes

RPL - genes coding for large ribosome subunit proteins RPS - genes coding for small ribosome subunit proteins

What are some of general explanations for pleiotropy, when it is found? What are some examples of pleiotropic effects that we have seen?

Reasons for pleiotropy: - gene acts early, so it is high in the hierarchy and everything below is affected - same gene is used at different times and places; the same transcription factors and other factors are often employed in different cells and organ systems; examples are Bbs2, Hoxd11, Fgf8, Tgfbr2 - gene is expressed in one tissue and that tissue has to interact with another (if the first tissue is disrupted, the second is too) - gene makes a bifunctional protein: β-catenin KO affects both gene regulation and cell adhesion

stop codons

TAA, TAG, TGA
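The start and stop codons together define an open reading frame. A minimal sketch, assuming a DNA coding-strand sequence and a deliberately simplified rule (take the first `ATG`, read in-frame triplets until the first stop); real start-site choice also depends on sequence context, which this ignores. The example sequence is made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_orf(dna):
    """Return the ORF from the first ATG through the first in-frame stop codon.

    Returns None if there is no ATG or no in-frame stop after it.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    # Step through the sequence one codon (3 nt) at a time, in frame with the ATG
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return dna[start:i + 3]
    return None


orf = first_orf("ccATGGCTTGAtt")   # hypothetical sequence: ATG GCT TGA
print(orf)
```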

Analyses of multiple gene regulatory networks have identified specific types of regulatory logic that are used repeatedly across organisms and developmental traits. What are two examples, the kinds of situations in which they are used, and the properties they are thought to confer?

Two types: - double negative gating: one gene, when turned on, represses a second gene whose role when active is to repress downstream genes; this allows expression to be controlled in a specific location; example is sea urchins: β-catenin turns on pmar1, which represses HesC, which inhibits downstream genes needed for micromeres to develop; important to have some repression so there are not too many micromeres, but without pmar1 there would be no micromeres; makes sure something only happens in one localized region - feed forward: one gene turns on two different genes; a series of genes are turned on and contribute to one another's expression; if any one is knocked out differentiation will arrest, but with minor variations the system is able to work around this; feed forward loops incorporate robustness into the system; in sea urchins, once double negative gating occurs, feed forward is used to continue the differentiation of the cells into PMCs; not just linear pathways; redundancy leads to genetic robustness - many interactions allow buffering of phenotypes (recessivity) but also leave critical links sensitive to perturbation (dominance, additivity)
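The double negative gate reads naturally as Boolean logic. A minimal sketch, assuming each gene is simply on or off (real regulation is graded and time-dependent) and using a hypothetical function name: two repressions in series mean the downstream genes are on only where the first activator is present.

```python
def micromere_genes_on(beta_catenin: bool) -> bool:
    """Toy on/off model of the sea urchin double negative gate."""
    pmar1 = beta_catenin      # beta-catenin turns on pmar1
    hesc = not pmar1          # Pmar1 represses HesC
    return not hesc           # HesC represses the downstream micromere genes


for signal in (True, False):
    print(f"beta-catenin={signal} -> micromere genes on: {micromere_genes_on(signal)}")
```

Everywhere beta-catenin is absent, HesC stays on and keeps the downstream genes off, which is what confines the fate to one region.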

How common is alternative splicing and what are some examples of the phenomenon? What is an overall consequence of alternative splicing (e.g., in the context of the "one-gene / one enzyme" hypothesis) and how much do we know about the specific functions (if any) of most splice forms?

alternative splicing is pervasive - 95-100% of pre-mRNAs with more than one exon yield multiple mRNAs (2 to thousands of variants per gene) - splice acceptor and donor sites are invariant (how does the cell know where to do alt splicing?) - consequence of alt splicing is that many mRNAs and proteins are made from one pre-mRNA, so the one gene one enzyme hypothesis is too narrow - we only know general factors affecting it (cis and trans splicing regulators, transcriptional rate due to histone mods) but we do not know EXACTLY how splice choices are made

Genome browsers like Ensembl , NCBI or UCSC provide a wealth of information. What are some of the features that can be readily accessed using such a browser? Do these browsers indicate every functionally important part of a gene?

can show 5'UTR and 3'UTR, exons and introns, the ID (accession) number used to look up the sequence, and the band (e.g., q number) identifying position on the chromosome - also show areas of the chromosome with no genes (gene deserts), areas where multiple genes are encoded, and which direction each gene runs - hash marks -- repeated sequences (10 to 1000 base pairs long) also shown - can also show enhancers and promoters - ?

enhancers

cis regulator bound by activators, help to increase transcription of target genes

What is a cis-regulatory element? Can these elements always be identified computationally? Where do they reside in the DNA?

cis regulatory elements: short stretches of DNA that serve as binding sites for transcription factor proteins, around 4 to 15 nucleotides long - can be upstream, within introns, or downstream of the gene; DNA can fold so they are close to the gene at the time of transcription - cannot always be identified computationally: we often don't know where they are or what their significance is

continuous vs discrete traits

continuous variation shows an unbroken range of phenotypes of a particular character in the population whereas discontinuous variation shows two or more separate forms of a character in the population, A discrete trait is a phenotype that manifests as clear and separable differences in a population, while a continuous trait is a phenotype that manifests as a continuum along a spectrum. Examples of discrete traits are dimples and albinism. Examples of continuous traits are height and eye color.

5' UTR

critical for recruitment of the ribosome to mRNA for translation; contributes to post-transcriptional regulation of the mRNA, including transport, stability, and localization

dNTP vs ddNTP

dNTP - building blocks of DNA, have an OH on the 3' end ddNTP - used to stop DNA polymerase, chain terminating, missing the OH on 3', used in Sanger sequencing
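Chain termination is what makes Sanger sequencing readable. A minimal sketch, assuming an idealized reaction in which every possible termination product is recovered (function names and the example strand are hypothetical): each ddNTP stops synthesis wherever its base is incorporated, so sorting the terminated fragments by length spells out the sequence.

```python
def terminated_lengths(new_strand, dd_base):
    """Lengths of chains that end where a ddNTP of the given base was incorporated."""
    return [i + 1 for i, base in enumerate(new_strand) if base == dd_base]


def read_by_size(new_strand):
    """Pool fragments from all four ddNTPs and read terminal bases shortest-first."""
    fragments = []
    for base in "ACGT":
        fragments += [(length, base) for length in terminated_lengths(new_strand, base)]
    return "".join(base for _, base in sorted(fragments))


strand = "GATTACA"                     # hypothetical newly synthesized strand
dda_lengths = terminated_lengths(strand, "A")
print(dda_lengths)                     # fragment sizes ending in ddA
print(read_by_size(strand))            # sequence recovered from fragment sizes
```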

ribosomopathy

diseases caused by abnormalities in the structure or function of ribosomal component proteins or rRNA genes

eIF

eukaryotic initiation factor -associates with mRNA and PABPC -associates with 40s ribosome subunit -helps join 60s subunit to 40s subunit

What is a gene? What is an allele?

gene: segments of DNA found on chromosomes allele: different forms of a gene

Protein coding sequence comprises only about 1% of the genome. What is the rest?

introns, 5'UTR and 3'UTR, promoters, regulatory elements, repeated sequences (about 50% of the genome), including simple sequence repeats and transposable elements (encode proteins when functional which allow them to move in the genome and to copy themselves; some may acquire stop codons and lose function; about 44% of the human genome)

TSS

location where the first DNA nucleotide is transcribed into RNA

pre-miRNA

made as a primary miRNA transcript, NOT protein coding but has exons and introns; Drosha clips out the hairpin (the pre-miRNA), which goes to the cytoplasm; Dicer then trims the hairpin to a short duplex, which is loaded into RISC

Explain the events and mechanisms of crossing over.

mechanisms of crossing over 1. Spo11 cleaves the phosphodiester bonds of both strands of one chromatid to make a double-stranded break 2. exonuclease degrades the 5' ends to expose 3' single-stranded tails 3. strand invasion of a non-sister chromatid leads to a heteroduplex (X shape called a Holliday junction) - Dmc1 promotes the strand invasion and opens up the helix for more invasion, allowing H bonds with the invading strand; RPA protein stabilizes the D loop 4. reciprocal second strand invasion of the original chromatid; new DNA is synthesized and ligated to the original (ligase joins the new DNA to the backbone of each strand) 5. branch migration lengthens the heteroduplex region 6. resolution of the Holliday junctions

Why are microsatellites often variable in length? Are there usually consequences of such length variation for the organism? Why or why not?

microsatellites vary in length because they are repetitive, so errors in DNA replication may occur; DNA polymerase may miss a repeat or add one to the sequence; usually noncoding sequences, so mutations typically have no effect on the organism; many people have different-length microsatellites, so there are many different alleles (variants of a locus) for SSR microsatellites
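Length alleles can be scored by counting repeat units. A minimal sketch, assuming a CA microsatellite with made-up flanking sequence (the alleles and the helper name are hypothetical): two alleles at the same locus differ only in how many times the unit repeats.

```python
import re


def longest_repeat_run(seq, unit="CA"):
    """Number of repeat units in the longest uninterrupted run of `unit`."""
    runs = re.findall(f"(?:{re.escape(unit)})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Two hypothetical alleles of one locus: same flanks, different repeat number
allele_1 = "GGT" + "CA" * 4 + "TT"
allele_2 = "GGT" + "CA" * 6 + "TT"

print(longest_repeat_run(allele_1), longest_repeat_run(allele_2))
```

In practice the same difference shows up as PCR products of different sizes.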

upstream ORF

reduce the translational output of mRNAs coding for very potent proteins

What traits do cancer cells typically evolve that are different from normal cells? Are there particular aspects of cancer cells that facilitate the evolution of increasingly malignant phenotypes?

traits of cancer cells: - resist cell death - sustain proliferative signaling - evade growth suppressors - activate invasion and metastasis - avoid immune destruction - deregulate cellular energetics - enable replicative immortality - induce angiogenesis - exhibit grossly dysregulated gene expression: tumor suppressors down and oncogenes up

dye terminator

used in illumina flow cell method, after each dye labeled nucleotide is added the DNA synthesis is stopped and can be detected which was added, then is reversed so can continue

electropherogram

output trace from the Sanger method with fluorescent dyes; peaks are distinguished based on the color of the labeled nucleotides

capillary sequencer

used with Sanger sequencing with fluorescent dyes; fragments are separated by size and a detector reads the fluorescence

core promoter

where transcription is initiated and RNA polymerase assembles

Does a gene consist of more than its gene body?

yes, eukaryotic genes have a core promoter, and cis regulatory elements lie upstream and downstream of the gene body

