GN 311 Exam 5

High Incidence of Cancer Correlates with Tissues that Undergo More Cell Divisions

There is a high correlation between the number of cell divisions in a tissue and the incidence of cancer in that tissue. This is thought to be because many of the mutations that lead to cancer are due to errors in replication. Take a look at the number of new cases per year for various cancer types and compare those numbers with the estimated cell division rates of the tissues.
Tissues with frequent cell divisions: colorectal (estimated to divide every 5 days), hematopoietic stem cells (every 30 days), epidermis (every 48 days), and melanocytes (every 147 days).
Tissues with slower rates of division: liver (every 300-500 days), gall bladder (every 625 days), bone (every 15 years), and neural cells (little to no cell division after birth).

Example profile of an STR assay

16 STR loci are amplified in a single reaction using 0.5 ng of human genomic DNA. Capillary electrophoresis is used in forensic labs across the country. It relies on the same principle of separating DNA fragments by size, but it is a more automated process that can test multiple loci in one automated run. Results such as these are obtained. Note that this is a 16 STR loci profile and not the 20 CODIS loci.

Oncogenes: Burkitt's Lymphoma

B cells normally make antibodies. Burkitt's lymphoma is a cancerous condition in which abnormal B cell function results in tumors. Normally, the c-myc gene (an oncogene) is located near the end of chromosome 8, and an antibody-producing gene is located near the bottom of chromosome 14. If a reciprocal translocation occurs between chromosomes 8 and 14, it can place the c-myc gene next to the enhancer for the antibody-producing gene. The enhancer is thought to drive abnormally high activity of the c-myc gene, resulting in the tumors of Burkitt's lymphoma. In this case, it is an altered level of gene expression that causes the cell proliferation.

Using info from P0, F1 and F2 to estimate H2

Let's use the variance calculated from each generation to estimate heritability. All variation in the parental and F1 generations is VE, since each of these generations has only one genotype (this assumes the parental generation consists of pure-bred lines). The best estimate of the variance due to environmental effects is the average of the variances in the two parent lines and the F1:
VE = [V(parent1) + V(parent2) + V(F1)]/3
Phenotypic variance is estimated by the variance in the F2 generation, since the F2 has all possible genotypes and its phenotypes are still affected by the environment:
V(F2) = VP = VG + VE
We assume that the environmental variance in the F2 is the same as in the previous generations. This may not be a perfect assumption, but it is what we have to use. Using these values for VE and V(F2), we can rearrange the equation to solve for VG:
VG = V(F2) - VE
Now we can calculate an estimate of broad-sense heritability:
H2 = VG/VP
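The calculation can be sketched in a few lines of Python. This is only an illustration: the function name and the variance values in the example are made up for the sketch and do not come from the lecture data.

```python
def broad_sense_heritability(v_parent1, v_parent2, v_f1, v_f2):
    """Estimate H2 from the variances of the P0, F1 and F2 generations."""
    # Environmental variance: average of the three single-genotype generations.
    v_e = (v_parent1 + v_parent2 + v_f1) / 3
    # Phenotypic variance is estimated by the F2 variance.
    v_p = v_f2
    # Genetic variance is what remains after removing the environmental part.
    v_g = v_p - v_e
    return v_g / v_p


# Hypothetical variances, chosen only to make the arithmetic easy to follow.
h2_broad = broad_sense_heritability(1.0, 2.0, 1.5, 6.0)
print(h2_broad)
```

With these made-up numbers, VE = 1.5, VG = 6.0 - 1.5 = 4.5, and H2 = 4.5/6.0.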

Sympatric Speciation: Hybridization that Leads to Allopolyploidy is Another Mechanism for Sympatric Speciation

Another type of sympatric speciation is the hybridization that leads to allopolyploidy. You may remember allopolyploidy from our earlier lectures. In making an allopolyploid, two species mate. The product is a sterile F1. Chromosome doubling then occurs to produce a plant that is fertile with two copies of each chromosome from both species. In this example, the European native small cordgrass Spartina maritima hybridized with the introduced American smooth cordgrass Spartina alterniflora. Chromosome doubling occurred resulting in the production of the allotetraploid, Spartina anglica, the Common Cord-grass. This is a species of cordgrass that originated in southern England in about 1870.

Molecular Clock

Based on the assumption of a constant mutation rate in DNA or amino acid sequences, the differences in sequence between present-day organisms can be used to date past evolutionary events. We can relate these changes to time by assuming that mutations accumulate at a linear rate. This relationship is referred to as the molecular clock.
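A minimal sketch of how a linear clock turns sequence differences into a date. The lecture does not give a formula, so this assumes the common simplification that two lineages each accumulate changes at the same rate after they split, giving distance = 2 x rate x time. The function name, the rate, and the counts are all hypothetical, and no correction is made for multiple changes at the same site.

```python
def divergence_time(differences, sites_compared, rate_per_site_per_year):
    """Estimate years since two sequences shared a common ancestor.

    Assumes a constant rate on both lineages (distance = 2 * rate * time).
    """
    distance = differences / sites_compared
    return distance / (2 * rate_per_site_per_year)


# Hypothetical example: 20 differences in 1000 aligned sites,
# with an assumed rate of 1e-9 changes per site per year.
years = divergence_time(20, 1000, 1e-9)
print(f"{years:.3g} years")
```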

New Species Arise through Reproductive Isolation

Biological Species Concept: members of the same species are capable of mating with each other and producing fertile progeny. Based on this concept, new species arise through reproductive isolation. It can be difficult to determine whether two organisms belong to different species when we look at fossil data, and it can be difficult to determine species in asexually reproducing organisms. This table provides a list of some prezygotic and postzygotic mechanisms of reproductive isolation.

Chromosome Segregation

Cancer cells often have abnormal karyotypes. Look at this karyotype. It was produced with a staining procedure in which each homologous pair is stained a different color, which makes the abnormalities easy to see. At several chromosome positions there are three copies of a chromosome or only one copy. There are also numerous translocations. You can identify these because a translocated chromosome shows more than one color.

Bioinformatics

The rapidly growing field of bioinformatics combines computer science with biology. As genomes from more organisms are sequenced, the volume of data is growing and new methods are being developed to analyze these data. Goals of bioinformatics include:
• Maintaining and analyzing databases of sequence data
• Developing software to mine information from big data sets
• Comparing structural and functional features of DNA sequences and the resulting proteins
• Determining evolutionary relationships among genomes

Contig Map

Contig - a set of overlapping clones that gives a physical map of part of a chromosome. DNA is often sequenced in pieces, with each clone carrying one piece. The pieces then have to be combined into a longer map. Here you can see 14 overlapping clones that were arranged in order to show the contig. In this case, the black hash marks used to align the clones are restriction enzyme recognition sites, but STSs can also be used.

Sample Problem: Making Recombinant DNA

Construct a recombinant plasmid by putting a portion of the mosquito DNA fragment into a plasmid vector so that the recombinant products can be identified using blue-white screening. The maps of the vector and the mosquito DNA are given below.
• You do NOT need to insert the entire mosquito DNA segment.
• You DO need to use enzymes in the MCS (polylinker) region.
• You CANNOT use an enzyme that cuts the vector more than 1 time.
Let's try a problem: you want to put part of the mosquito DNA (purple) into the vector so that you can detect cells that have the recombinant plasmid using blue-white screening. In this experiment, we use two different enzymes to make sure the insert goes into the vector in the proper orientation for expression. (For our purposes, it will only work one way, but this is important if we want our gene to be transcribed.) You cannot use an enzyme that cuts the vector more than once because it will cut the vector into pieces, and you won't be able to get them all to come back together properly in the recombinant plasmid (the odds are WAY too far against you!). You have to use enzymes that are found in both the vector and the mosquito DNA. Because we use 2 different enzymes, we will lose a small piece of vector DNA and add a big piece of mosquito DNA when we make the recombinant plasmid. Note the number 1 at the top of the vector. This is always assumed to be at the top of the vector and represents base pair #1. We count the base pairs in the vector going around clockwise. The enzymes are shown on the map, and the number next to each enzyme name is the base pair at which that enzyme cuts the plasmid. These numbers are important because they define the map of this plasmid vector. The total number of base pairs in the vector is the number in the center of the plasmid: 2260 bp here.
The next slide has a video that works through this problem. Get out your calculator and work through the problem with the video. Script: Let's make recombinant DNA! You want to put part of the mosquito DNA (in purple) into the plasmid vector so that you can detect cells that have the recombinant plasmid using blue-white screening. You need to use two different enzymes to make sure the insert goes into the vector in the proper orientation for expression. You do not need to insert the entire mosquito DNA. You do need to use enzymes that are found in the polylinker (MCS). You need to give the map of the recombinant plasmid at the end of the problem. First, we need to decide which enzymes to use. The enzyme must cut exactly one time in the MCS region and nowhere else on the plasmid, and it must also cut the mosquito DNA. Let's go through the enzymes:
• DrdI cuts one time in the vector, but it does not cut the mosquito DNA. We cannot use DrdI.
• MorI cuts one time in the vector AND cuts the mosquito DNA. We CAN use MorI.
• SpeI cuts more than 1 time in the plasmid. We cannot use SpeI.
• EcoRI cuts 1 time in the vector AND cuts the mosquito DNA. We CAN use EcoRI.
• RsaI cuts one time in the vector, but it does not cut the mosquito DNA. We cannot use RsaI.
• BamHI cuts multiple times in the vector and is not located in the polylinker. We cannot use BamHI.
• ClaI does not cut the vector. We cannot use ClaI.
We DO use MorI and EcoRI. We cut the mosquito DNA with MorI and EcoRI, and we cut the plasmid vector with the same two enzymes. We mix these DNA molecules together. The MorI sticky ends from the mosquito DNA can bind to the MorI sticky ends of the vector, and the EcoRI sticky ends on the mosquito DNA can bind to the EcoRI sticky ends on the vector. We add DNA ligase to seal the nicks in the sugar-phosphate backbone. We lost the 40 base pairs of DNA between the cut sites on the original plasmid vector. We added 575 base pairs of mosquito DNA.
Our net gain is 575 - 40 = 535 bp. To get the total size of the recombinant plasmid, we add the net gain to the original vector size. The original vector was 2260 base pairs, so 2260 + 535 = a total of 2795 bp in the recombinant plasmid. Now we have the proper sites on our map of the recombinant plasmid. Note that the lacZ gene has been inactivated by the mosquito DNA, so there is a line through the gene name indicating that it does not work. We need to put the cut site location for each enzyme on the map. Start at the top and go around clockwise. Base pair #1 is always at the top. Nothing was done to alter the original plasmid from base pair 1 through the MorI cut site, so the numbers of 75 for the SpeI site, 390 for the DrdI site, and 420 for the MorI site remain the same. Our next cut site is ClaI. If you look at the mosquito DNA, you can see that this site is 275 bp from the MorI site. We add 275 to the number at MorI, which is 420. Since 420 + 275 = 695, that is the location of the cut site for the ClaI enzyme. Looking at the mosquito map, we see that the distance between the ClaI and EcoRI cut sites is 300 bp. We add 300 to the 695 that is already on our map for the ClaI site to get the position of the EcoRI site, which is 995. This is the end of the inserted piece. From here and the rest of the way around the circle, we just add the net gain (535) to the position of each cut site on the original vector. These calculations are shown for you. For the RsaI site: 490 + 535 = 1025. For the next BamHI site: 970 + 535 = 1505. For the BamHI site located in the ampicillin resistance gene: 1825 + 535 = 2360. For the final SpeI site: 2100 + 535 = 2635. This is the map of the recombinant plasmid. The inserted segment of mosquito DNA is shown in purple. The correct total size of the recombinant plasmid, 2795 bp, is given in the center of the map. L38/19
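The bookkeeping in this problem can be checked with a short Python sketch. The function name and data layout are invented for illustration. The site positions come from the worked problem, except the original EcoRI position of 460, which is not stated in the script and is inferred here from the 40 bp lost between the MorI site (420) and the EcoRI site.

```python
def build_recombinant_map(vector_size, vector_sites, left_cut, right_cut,
                          insert_length, insert_internal_sites):
    """Return (new_size, new_sites) after swapping in an insert.

    vector_sites: list of (enzyme, position) on the original vector.
    left_cut / right_cut: positions of the two enzymes used on the vector.
    insert_internal_sites: (enzyme, distance from the left cut) inside the insert.
    """
    removed = right_cut - left_cut        # vector DNA lost between the cuts
    net_gain = insert_length - removed
    new_sites = []
    for enzyme, pos in vector_sites:
        if pos <= left_cut:
            new_sites.append((enzyme, pos))             # unchanged
        elif pos >= right_cut:
            new_sites.append((enzyme, pos + net_gain))  # shifted by net gain
        # Sites strictly between the two cuts are lost with the excised piece.
    for enzyme, offset in insert_internal_sites:
        new_sites.append((enzyme, left_cut + offset))
    new_sites.sort(key=lambda site: site[1])
    return vector_size + net_gain, new_sites


vector_sites = [("SpeI", 75), ("DrdI", 390), ("MorI", 420),
                ("EcoRI", 460),  # inferred: 420 + 40 bp lost, not given in the script
                ("RsaI", 490), ("BamHI", 970), ("BamHI", 1825), ("SpeI", 2100)]

size, sites = build_recombinant_map(
    vector_size=2260, vector_sites=vector_sites,
    left_cut=420, right_cut=460,
    insert_length=575, insert_internal_sites=[("ClaI", 275)])

print(size)
for enzyme, pos in sites:
    print(enzyme, pos)
```

Running the sketch reproduces the map positions worked out by hand, including 2635 for the final SpeI site and a total size of 2795 bp.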

Electrophoresis

• DNA is loaded at the negative pole and migrates to the positive pole ("Run to Red")
• Small fragments migrate faster than large fragments
• Sizes can be determined by comparing fragment migration to that of a control of known size
• Pieces of the same size migrate the same distance
The other technique that is really important in using DNA for identification is electrophoresis. I won't explain it again, but here is the slide we had earlier so you can review it. The idea is that pieces of the same size will migrate the same distance. This allows us to compare DNA profiles from different individuals.

Reverse Genetics Knock-Out Mice To Study Gene Function

Knock-out mice are mice in which a gene has been turned off. This allows the function of the gene to be assayed - since we can look at the difference in phenotype between the mice with the functional gene versus the nonfunctional gene. This diagram shows the production of a knock-out. The neo gene was inserted into the target gene in order to inactivate the target gene. The tk gene was used as a marker and cells with the altered gene sequence are obtained. Cells with the altered gene are then put into mouse embryos and mice that have the incorporated gene are selected. This gene will not be present in all cells of the mouse, but matings can ultimately result in homozygous knock-outs where the function of the gene is missing and can be characterized.

Is the sequence below a palindrome? 5' - GATTAG - 3' 3' - CTAATC - 5'

Nope - this sequence is not a palindrome. If we write out the second strand, we notice that it does not read the same as the first strand when both are written 5' to 3'.
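The same check can be written as a small Python sketch: a sequence is a palindrome in this sense when it equals its own reverse complement. The function names are illustrative only.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the second strand, written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def is_palindrome(seq):
    """True if both strands read the same 5' to 3'."""
    return seq == reverse_complement(seq)


print(reverse_complement("GATTAG"), is_palindrome("GATTAG"))
print(reverse_complement("GAATTC"), is_palindrome("GAATTC"))
```

The second call uses the EcoRI recognition sequence as a contrasting case that is a palindrome.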

Sample using UPGMA

OTUs are Operational Taxonomic Units (not necessarily different species). We start with aligned sequences from the OTUs; you will not need to align the sequences yourself. We then compute a relative distance between each pair of OTUs based on the alignment. The numbers in the table reflect these relative distances, which can be calculated from differences in the DNA sequences. For example, if 10 out of 100 nucleotides differ, the distance is 10/100 = 0.1. We multiply this by 1000 to give numbers that are easier to work with, so in this case the value entered in the table would be 100. These distances are called the percentage distances times 1000.
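As a rough sketch of how one table entry could be computed from two aligned sequences, the Python below counts mismatched positions and scales by 1000. The function name and the two short sequences are hypothetical, and gaps are not handled.

```python
def scaled_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, times 1000."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences * 1000 / len(seq_a)


# Hypothetical aligned sequences that differ at 1 of 10 positions.
otu_1 = "ACGTACGTAC"
otu_2 = "ACGTACGTAA"
print(scaled_distance(otu_1, otu_2))
```

One difference in 10 sites is the same proportion as the 10-in-100 example above.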

MZ:DZ ratios are given for 4 traits. According to this data, which trait has the highest heritability? Trait A: 84:14 Trait B: 95:82 Trait C: 51:25 Trait D: 81:65

See if you can apply those trends in this problem. MZ:DZ concordance ratios are given for 4 traits. According to these data, which trait has the highest heritability? The correct answer is trait A, in which 84% of the MZ twins shared the same phenotype but only 14% of the DZ twins shared the same phenotype.

Restriction Enzymes cut DNA

Some enzymes, such as EcoRI and HindIII, produce sticky ends when they cut DNA. Other enzymes cut in the center of the sequence and produce blunt ends. The sticky ends are helpful in making recombinant DNA.

If you do the Punnett square for the cross, you have 9 genotypes. I have listed them here, but I grouped them by the number of dominant alleles present. This results in only 5 phenotypes. If we had more loci, we would have greater complexity.

Environmental effects allow one genotype to produce multiple phenotypes. Since each individual's phenotype is affected by the environment, there is some variation in phenotype for each genotype, and the phenotypes for the different genotypes overlap as shown here. If we look at the position of the arrow on the X axis, we can see that a plant of that height could have either the AA or the Aa genotype. This gets more complex with additional genetic loci. L36/ slide 13

Oncogenes vs. Tumor Suppressor Genes

This diagram compares the dominant-acting nature of oncogenes with the recessive-acting nature of tumor suppressor genes. You have seen both of these diagrams before, but the comparison and the gene action of these two types of genes are important. L37/33 There are causes of cancer other than oncogenes and tumor suppressor genes. Viruses are responsible for many cancers in animals. Here is a list of some viruses and the cancers associated with them. The one you are probably most familiar with is HPV, the human papillomavirus, since a vaccine is available for this virus. L37/ 34

Map Based Approach

This slide describes the map based approach to DNA sequencing. Think of this as aligning fragments along the chromosome and then mapping each and combining the maps to give the map of the chromosome. L39/ 34

Joining Two DNA Molecules

To make recombinant DNA, we cut DNA from two sources with the same restriction enzyme. This results in complementary sticky ends. We then mix the products of the enzymatic digests together. Some pieces from one source will join with pieces from the other source. This restores the complementary pairing, but we have to add DNA ligase to seal the nicks in the sugar-phosphate backbone. The theory is easy, but the reaction is not very efficient, so the original molecules often go back together instead of forming the recombinant product. We have to design a system that allows us to easily recognize the recombinant molecule and the cells that take it up.

True or False: Migration between two populations increases diversity between the two populations.

false

True or False: Genetic Drift increases divergence between two populations.

true

Gene Editing: CRISPR-Cas9 System

crRNA:
• Naturally occurring in bacteria
• Helps protect bacteria from bacteriophages
• Can be modified to alter gene sequence
You have probably heard a lot about CRISPR and gene editing. We studied crRNA earlier in the semester. It occurs naturally in bacteria, where it helps protect the cell against invading bacteriophages. We know that CRISPR can be used to alter gene sequences, but can it be used purposefully to repair a mutant DNA sequence? Or to add a missing gene? Or to inactivate a bad gene? These are all areas of research that are currently underway. The next slide shows a video about gene editing using CRISPR.

The average egg production of a population of chickens is 229 eggs per year. The average egg production of individuals selected to be parents of the next generation is 293 eggs per year. The average egg production of the offspring generation is 253 eggs per year. Estimate narrow-sense heritability for egg production in this population.

Mean of the original population (X0) = 229
Mean of the selected parents (Xs) = 293
Mean of the offspring generation (X1) = 253
S = Xs - X0 = 293 - 229 = 64
R = X1 - X0 = 253 - 229 = 24
h2 = R/S = 24/64 = 0.375
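The same arithmetic as a short Python sketch, using the numbers from the problem. The function name is illustrative.

```python
def narrow_sense_heritability(mean_population, mean_selected, mean_offspring):
    """Estimate h2 as response to selection divided by selection differential."""
    s = mean_selected - mean_population    # selection differential (S)
    r = mean_offspring - mean_population   # response to selection (R)
    return r / s


h2 = narrow_sense_heritability(229, 293, 253)
print(h2)  # 24 / 64
```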

Electrophoresis

• DNA is loaded at the negative pole and migrates to the positive pole ("Run to Red")
• Small fragments migrate faster than large fragments
• Distance migrated is inversely proportional to the log of the fragment size
• Sizes can be determined by comparing fragment migration to that of a control of known size
We can isolate DNA from each of our plants and cut the DNA with a restriction enzyme. Then we can separate the fragments on a gel using electrophoresis. You may remember that in electrophoresis, the DNA samples are loaded at one end of the gel and an electrical current is passed through the gel. DNA has a negative charge, so it moves toward the positive pole. We make sure to load the DNA at the negative pole so that it can "run to red", the positive pole. The smaller pieces of DNA migrate more rapidly than the larger pieces. The migration distance is inversely proportional to the log of the fragment size, which just means that we can determine the sizes of the pieces of DNA by comparing their migration to that of control fragments of known length.
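One common way to turn this relationship into numbers is to fit a straight line of log10(size) against distance for the control fragments and then read unknown bands off that line. The sketch below does this with a plain least-squares fit. It assumes the relationship is linear over the range of the ladder, and the ladder sizes and distances are invented for illustration.

```python
import math


def fit_standard_curve(ladder):
    """Fit log10(size) = intercept + slope * distance by least squares.

    ladder: list of (size_bp, distance_mm) for control fragments.
    """
    xs = [distance for _, distance in ladder]
    ys = [math.log10(size) for size, _ in ladder]
    n = len(ladder)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_size(distance_mm, slope, intercept):
    """Estimate fragment size (bp) from how far the band migrated."""
    return 10 ** (intercept + slope * distance_mm)


# Hypothetical ladder: larger fragments travel a shorter distance.
ladder = [(10000, 10.0), (1000, 20.0), (100, 30.0)]
slope, intercept = fit_standard_curve(ladder)
unknown_bp = estimate_size(25.0, slope, intercept)
print(round(unknown_bp))
```

The slope comes out negative, which matches the statement that smaller fragments migrate farther.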

Dideoxy Sequencing Reaction

• 1 type of dideoxyribonucleoside triphosphate is added to each tube
• Fragments in one tube all end after the same type of nucleotide was added
• Fragments are separated on a polyacrylamide gel, and the order of bases added is determined by reading across all 4 lanes
Here you can see a template strand that we want to sequence. We have to add a primer, the DNA polymerase enzyme, and all four deoxyribonucleoside triphosphates. As mentioned on the previous slide, we run 4 reaction tubes simultaneously, each containing a different dideoxyribonucleoside triphosphate. Replication is allowed to occur. Each tube accumulates a set of fragments that end where its dideoxy nucleotide was added. Then we separate the fragments on a polyacrylamide gel, which can distinguish adjacent bands that differ by 1 bp in size. The 4 reaction tubes are loaded into adjacent lanes so that we can read across all 4 lanes to determine the order of the bases added during the reaction. Remember that replication goes from 5' to 3'. The smallest pieces migrate the furthest in the gel. That means the smallest piece must have stopped closest to the 5' end of the sequence that was just produced during the sequencing reaction. We can read the sequence produced during the reaction (listed as the sequence of the complementary strand in this diagram) from bottom to top, and this reads the sequence from 5' to 3'.
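To make the gel-reading logic concrete, here is a simplified Python sketch. It ignores the primer, treats fragment length as the number of bases added, and uses a made-up six-base template, so it illustrates the idea and is not a model of a real sequencing run.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def read_sanger_gel(template_3_to_5):
    """Simulate the four lanes and read the new strand 5' to 3'.

    template_3_to_5: template strand written 3' to 5', so the new strand
    pairs with it base by base as it grows 5' to 3'.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3_to_5)
    # Each lane holds the lengths of fragments that ended in that dideoxy base.
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    # Shortest fragments run furthest, so sorting by length reads bottom to top.
    bands = sorted((length, base)
                   for base, lengths in lanes.items()
                   for length in lengths)
    sequence = "".join(base for _, base in bands)
    return lanes, sequence


# Hypothetical template, written 3' to 5'.
lanes, sequence = read_sanger_gel("TACGGT")
print(lanes)
print(sequence)
```

Reading the bands from shortest to longest recovers the newly made strand, which is the complement of the template.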

Human Genome Project

1988 - Congress funds NIH and DOE
1991 - Project begins
2001 - First draft published (90% complete)
2003 - Second draft complete (99% of euchromatin)
The project still continues with a focus on variation. Key figures: Craig Venter and Francis Collins.
The Human Genome Project was an incredible effort for its time. Plans got underway when Congress set up funding for the project in 1988. This was a multinational effort. In the USA, Congress provided about 3 billion dollars toward the effort. Some of the first to get involved were the Department of Energy (DOE) and the National Institutes of Health (NIH). Most of the wet lab research completed through federal funding was funded by NIH. James Watson was initially put in charge of the NIH efforts. That position was later taken over by Francis Collins, who was in charge when the first draft of the sequence was released. The DOE had already been conducting research on DNA to understand radiation exposure and damage in nuclear power plant employees, and it contributed additional information as well as expertise in computer organization and banking of the sequencing data. In addition to public funds, some private companies got involved. Celera Genomics, run by Craig Venter, was the main private contributor. Celera was also important in coming up with a new way to conduct DNA sequencing. The first draft was published in 2001 and the second in 2003. The project still continues with a focus on genetic variation in euchromatic regions.

Dideoxy (Sanger) Sequencing Reaction

• A 3'-OH is required by DNA polymerase to form the phosphodiester bond.
• The DNA replication reaction proceeds until a dideoxy nucleotide is incorporated. No further extension occurs.
• We detect the position at which the dideoxy nucleotide was incorporated.
The type of DNA sequencing reaction we will be using is the dideoxy or Sanger sequencing reaction. A normal deoxyribonucleoside triphosphate has a 3'-OH that is very important because it is used in forming the phosphodiester bond. Remember that in replication the triphosphate form of the nucleotide is required, and PPi is released as the nucleotide is added. There is another triphosphate form of the nucleotides: the dideoxyribonucleoside triphosphate. These do not have the 3'-OH; they have a 3'-H instead. The dideoxy form can still be incorporated into the DNA during replication. If it is, replication stops, since no further nucleotides can be added to the 3'-H. There are dideoxy nucleotides of all four base types. The theory behind dideoxy sequencing is that we allow replication to occur in the presence of all four normal deoxyribonucleoside triphosphates and 1 dideoxy form. We run 4 reaction tubes, each with a different dideoxy form (A, T, G, and C). In the tube with ddGTP, the reaction will always stop after the addition of a dideoxy guanine. The same idea applies to the other tubes, so we have one tube that stops after the addition of each base type. Then we just have to detect the position at which each dideoxy nucleotide was incorporated.

Recombinant DNA: EcoRI recognition sequence

Recognition sequence:
5'-GAATTC-3'
3'-CTTAAG-5'
After cutting:
5'-G AATTC-3'
3'-CTTAA G-5'
This is the recognition sequence for the enzyme EcoRI. Restriction enzymes are named for the bacterial source they come from. The first letter is the first letter of the genus, followed by the first 2 letters of the species, sometimes a letter for the strain name, and then Roman numerals indicating the order in which the enzymes were found within that source. EcoRI was the first enzyme isolated from E. coli strain RY13. The sequence is cut at the two arrows. In this case it is not cut straight across the sequence. The enzyme cuts the sugar-phosphate backbone of each strand, which allows the molecule to break into two pieces. When EcoRI cuts the DNA, some bases on each piece are left unpaired. These are called overhangs or "sticky ends" because these bases will tend to base pair, or stick, to complementary sequences.
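A small Python sketch of the cut on the top strand, assuming EcoRI cuts between the G and the first A of each GAATTC site as drawn above. The function name and the example sequence are hypothetical, and only the top strand is tracked.

```python
def digest_ecori(top_strand):
    """Split the top strand (5' to 3') at every EcoRI site: G^AATTC."""
    site = "GAATTC"
    cut_offset = 1  # the cut falls after the G
    fragments = []
    start = 0
    position = top_strand.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(top_strand[start:cut])
        start = cut
        position = top_strand.find(site, position + 1)
    fragments.append(top_strand[start:])
    return fragments


# Hypothetical sequence containing one EcoRI site.
print(digest_ecori("AAGAATTCGG"))
```

The piece to the right of the cut begins with AATTC on the top strand; the first four of those bases (AATT) are left unpaired and form the overhang described above.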

GWAS: Genome-Wide Association Studies

We also mentioned GWAS, or genome-wide association studies, early in the course. Advantages:
1. Associates a trait and gene markers in a biological population (random mating population) as opposed to controlled crosses.
2. Uses molecular genetics techniques to scan the entire genome for regions that show statistical importance for a trait. These regions can then be sequenced and analyzed in more detail to identify genes of importance in the trait.

Haploinsufficiency

Although both copies of a tumor suppressor gene typically have to be altered to result in cancer, there are some cases where a higher susceptibility to cancer occurs when only one copy of the normal allele is present. These are cases of haploinsufficiency. Haploinsufficiency occurs when a diploid organism loses the function of one copy of a gene and the single remaining functional copy does not produce enough of the gene product to give the wild-type phenotype. One example of this is Bloom Syndrome.

Use Anchor Markers to Correlate Different Maps

We need some anchor markers to help us relate these maps to each other. Anchor markers are mapped both genetically and physically, so they can be used to correlate the genetic and physical maps. Two commonly used markers are STSs and ESTs.
STS: sequence-tagged sites - short, unique DNA sequences (200 - 500 bp) used to link physical and genetic maps. These sequences are hybridized to overlapping clones on physical maps and are hybridized to cytological maps using in situ hybridization to anchor the map types. (Remember the FISH technique.)
EST: expressed-sequence tags - short cDNA sequences that are used to link genetic and physical maps. These cDNA sequences are used as hybridization probes to anchor maps. Since they are produced from cDNA, they contain only actively transcribed areas of the chromosome, whereas STSs may not be transcribed.

Twin Studies Can be Used to Estimate Genetic and Environmental Influences on Traits

Compare MZ twins reared together with DZ twins reared together to estimate genetic effects. Let's think about what monozygotic and dizygotic twins have in common. Monozygotic twins (abbreviated MZ) raised together have the same genotype and similar environments (we will treat them as the same here). Monozygotic twins raised apart have the same genes but different environments. Dizygotic twins (abbreviated DZ) have different genes but share the same environment if they are raised together and have different environments if raised apart. If we want to estimate the genetic effects for a trait, we want to keep the environment constant between the individuals we evaluate and vary the genetics. If we compare MZ twins raised together with DZ twins raised together, we can estimate genetic effects.

High Incidence of Occurrence is NOT the same as a High Rate of Death

A high incidence of occurrence of a particular cancer type is not the same as a high probability of death from that type of cancer. Look at the number of deaths per year and compare that to the number of new cases per year. Some of the higher values according to this chart are for pancreatic, esophageal, liver, and ovarian cancers. The high death rate for these types of cancer may be due to the importance of the organ's function (especially for pancreatic and liver cancers) and/or the difficulty of diagnosing the cancer before it reaches the later stages. It is always more difficult to treat later-stage cancers where metastasis has occurred.

Concordance from Twin Studies

Compare pairs of monozygotic and dizygotic twins for a trait. Concordant = twins are the same for the trait. Discordant = twins differ for the trait. Can be expressed as: • % concordance, the percentage of that twin group that showed the same phenotype • Ratio of MZ:DZ (both reared together) for a particular trait, where the numbers in the ratio are the % concordance for each twin group • Can be used to estimate heritability. Twin studies that estimate heritability involve evaluating many pairs of twins for a trait. Both monozygotic and dizygotic twin pairs are used, with each twin raised together with their co-twin. We say that the twins are concordant for a trait if they are the same for that trait. They are discordant if they differ for that trait. Concordance can be expressed as the % concordance, which is the percentage of that twin group that showed the same phenotype. We can also calculate the ratio of MZ:DZ (both reared together) for a particular trait, where the numbers in the ratio are the % concordance for each twin group. These values can be used to estimate heritability. • High MZ : low DZ indicates a significant role of the variance of genetic effects in the phenotypic variance • Similar MZ and DZ indicates that a lot of the variation is due to variance in environmental effects • Low MZ but still much higher than DZ indicates a genetic predisposition, but variance due to environmental factors is important in the variance of the phenotypes for this trait. We are not going to calculate a specific heritability value from these data. We are just going to look at trends. If there is a high MZ : low DZ value, like the red colored traits here, this indicates a significant role of genetic effects - which is a high heritability. If the MZ and DZ values are similar, like the green trait here, most of the variation in phenotype is due to variance in environmental effects, which indicates a low heritability.
If there is a low MZ value but it is still much higher than the DZ value, like the purple colored traits here, there is a genetic predisposition, but a low to moderate heritability for the trait.
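As a rough illustration of the % concordance and MZ:DZ comparison described above, here is a minimal Python sketch. The pair counts are made up for the example and do not come from any study; the function name is also just an assumption for illustration.

```python
def percent_concordance(concordant_pairs, total_pairs):
    """Percentage of twin pairs in a group that share the same phenotype."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    return 100.0 * concordant_pairs / total_pairs


# Hypothetical counts (not real data): 60 MZ pairs and 60 DZ pairs, all reared together.
mz_concordance = percent_concordance(45, 60)  # MZ group
dz_concordance = percent_concordance(12, 60)  # DZ group

print(f"MZ: {mz_concordance:.1f}%  DZ: {dz_concordance:.1f}%")
print(f"MZ:DZ = {mz_concordance:.0f}:{dz_concordance:.0f}")
```

With these invented counts the MZ value is much higher than the DZ value, which is the pattern the card associates with a significant role for genetic effects.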

Practice Making Recombinant DNA

Create a functional recombinant plasmid that will confer tetracycline resistance but not ampicillin resistance to the bacterial cells that contain this plasmid by inserting a piece of pig DNA into the vector below. Give the map of the recombinant plasmid indicating the location of all restriction enzyme sites and the base number at which each enzyme cuts the recombinant plasmid. The vector size is 2200 bp. Fill in the statement below to indicate the size of your recombinant plasmid. The size of the recombinant plasmid is ________________ bp. You should expect a problem in which you have to make recombinant DNA. Try this. Identify the enzymes to be used. We want the final recombinant plasmid to be tet resistant so do not break the tet gene. We do not want the product to be amp resistant so you insert part of the pig DNA into the amp resistance gene. Use the enzymes EcoRI and KpnI Lose 225 bp from vector and gain 1600 bp from pig DNA: net gain = 1375 Recombinant size = 2200 - 225 + 1600 = 3575 bp Use the enzymes EcoRI and KpnI. This causes you to lose 225 bp from the vector and allows you to add 1600 bp from the pig DNA - a net gain of 1375 bp. The final size of the recombinant plasmid will be 2200 (what you started with) - 225 + 1600 = 3575 bp The size of the recombinant plasmid is 3575 bp Check your work! The blue in my diagram is the inserted pig DNA. You should also be able to make recombinant DNA using the lacZ gene as a marker gene and using the blue-white screening to detect cells with the recombinant plasmid.
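The size arithmetic in this problem can be checked with a short Python sketch. The numbers are the ones given above (2200 bp vector, 225 bp lost, 1600 bp gained); the function name is only an illustrative assumption.

```python
def recombinant_size(vector_bp, removed_bp, insert_bp):
    """Size of a recombinant plasmid after a vector fragment is replaced by an insert."""
    if removed_bp > vector_bp:
        raise ValueError("cannot remove more base pairs than the vector contains")
    return vector_bp - removed_bp + insert_bp


vector_bp = 2200   # vector size given in the problem
removed_bp = 225   # fragment lost from the vector when cut with EcoRI and KpnI
insert_bp = 1600   # pig DNA fragment gained

net_gain = insert_bp - removed_bp
size = recombinant_size(vector_bp, removed_bp, insert_bp)
print(f"net gain = {net_gain} bp, recombinant size = {size} bp")
```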

The Role of DNA Repair in Cancer

Defective Nucleotide Excision Repair • Xeroderma pigmentosum • High risk of skin cancer since many pyrimidine dimers do not get repaired. Defective Mismatch Repair • Colorectal, endometrial, stomach cancers. Defective Double Strand Break Repair • BRCA1, BRCA2. Defective DNA repair systems have been implicated in cancer. We have already talked about the defective nucleotide excision repair mechanism that is found in people with Xeroderma pigmentosum. These individuals have a high risk of skin cancer since many pyrimidine dimers do not get repaired. A defective mismatch repair mechanism has been implicated in some forms of colorectal, endometrial and stomach cancers. A defective double strand break repair mechanism has been implicated in situations where there are BRCA1 or BRCA2 mutations.

DGRP

Drosophila Genetic Reference Panel • >200 fully sequenced inbred lines from a natural population • Used to identify QTLs. There are a couple of experimental populations that have been developed in our model organisms just for this purpose. In Drosophila melanogaster, the DGRP (the Drosophila Genetic Reference Panel) has more than 200 fully sequenced inbred lines from a natural Drosophila population. The DGRP was developed in the laboratory of Dr. Trudy Mackay at NC State University. Flies were collected at the local farmers' market. They were then inbred for many generations to result in the inbred lines that were sequenced to form the DGRP. Selection experiments using these lines have helped to identify genes that are important in quantitative traits such as longevity, aggression, locomotion, alcohol sensitivity, starvation resistance and others.

Most Cancers are not Due to One Gene, but Accumulations of Mutations in Several Genes

E.g. colon cancer - tumor suppressor genes and oncogenes defective. Progression: benign adenomas to malignant tumors to metastasis. Very small polyps ..... Larger polyp. Most cancers are not due to one gene, but are due to the accumulation of mutations in several genes. In some cancers, we have learned quite a bit about the genes involved. For example, colon cancer involves mutations in the APC gene, a tumor suppressor gene. Mutations in another tumor suppressor gene, p53, and the oncogene ras then follow, along with mutations in other genes. There is a progression to colon cancer - from benign adenomas, to malignant tumors, to metastasis. Genetic testing for mutations in the APC gene is available. If the test shows a mutation, the patient may have colonoscopies regularly starting much earlier than the average population. Recommendations for screenings change, so check with your physician, but most people without a history of colon cancer get their first colonoscopy at about age 50. If there is a family history, the recommendation will probably be to have the first colonoscopy earlier, depending on family history and genetic tests. Ask your doctor about your specific case! The images at the bottom of the page are from colonoscopies. Colonoscopies allow the doctor to see the inner lining of the colon. Small polyps can be removed during the exam. The image on the left shows 2 small polyps. The image on the right shows one larger polyp that was identified as an adenoma. All of these were removed during the colonoscopy.

Estimating Heritability

Estimating H2 with P0, F1, and F2 data. Assume that both parent populations are pure lines (inbred). Let's use some simple crosses to estimate heritability. We can start with two inbred lines - one with short ears of corn and one with long ears of corn. There is some variation among the members of each line. Since each line is inbred (1 genotype), all of this variation is due to environmental effects. We mate the lines together to get the F1. In the F1 all the genotypes are the same, so the variance seen is all environmental variance. Now we make the F2 generation by intermating the F1. All possible genotypes are present in the F2. There are still effects of the environment, so the variance calculated from the F2 plant values equals the phenotypic variance: VP = VG + VE. L36 slide 36
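The reasoning above can be sketched in Python. This is a minimal sketch under stated assumptions: VE is estimated here as the average of the two parental-line variances and the F1 variance (one common convention, not something the card specifies), VG is then the F2 variance minus VE, and H2 = VG / VP. The variance values are invented for illustration.

```python
def broad_sense_heritability(var_p1, var_p2, var_f1, var_f2):
    """Estimate VE, VG and H2 from the variances of two inbred parents, the F1 and the F2."""
    # Assumption: average the three genetically uniform groups to estimate VE.
    v_e = (var_p1 + var_p2 + var_f1) / 3
    v_p = var_f2        # F2 variance is the phenotypic variance, VP = VG + VE
    v_g = v_p - v_e     # what remains after removing the environmental variance
    return v_e, v_g, v_g / v_p


# Hypothetical ear-length variances (not real data).
v_e, v_g, h2_broad = broad_sense_heritability(1.0, 2.0, 1.5, 6.0)
print(f"VE = {v_e}, VG = {v_g}, H2 = {h2_broad}")
```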

How to increase grain yield?

Farmer Bill notices that some types of corn produce more grain than others. Why? • Is it genetic? If so, the farmer can begin selective breeding • Is it environmental? If so, the farmer can modify the environmental factors. Quantitative genetics has been used in plant and animal breeding for years. Let's look at a basic plant breeding problem. Farmer Bill (named after my father-in-law, who is a farmer) would like to increase grain yield for corn in his crop. He notices that some types of corn produce more grain than others. Why is this? If the variation that he sees in the crop is genetic, then the farmer can do selective breeding to increase yield. If the variation is only due to environmental effects, the farmer can modify the environment (add more water, fertilizer, etc.) to increase yield.

Oncogenes Chronic Myelogenous Leukemia

Fatal, uncontrolled replication of myeloid stem cells. 90% of patients have the Philadelphia chromosome. Reciprocal translocation involving chromosomes 9 and 22 places 2 oncogenes near each other. The fusion protein affects cell cycle control, which leads to a cancer associated with white blood cells. Chronic myelogenous leukemia is a fatal, uncontrolled replication of the myeloid stem cells. This also results from a reciprocal translocation. The c-ABL gene normally resides on chromosome 9. The BCR gene normally resides on chromosome 22. The reciprocal translocation moves the c-ABL gene to chromosome 22, adjacent to the BCR gene. This translocated chromosome is called the Philadelphia chromosome. About 90% of patients with chronic myelogenous leukemia have the Philadelphia chromosome. In this case, the two genes are transcribed as one unit and are translated into a fusion protein. The fusion protein affects cell cycle control in an abnormal manner, which leads to this cancer that is associated with white blood cells.

FISH to locate specific DNA sequence

Fluorescent in situ hybridization (FISH) Can we visualize the location of a DNA sequence on a spread of chromosomes? Yes - one technique for this is FISH - which stands for fluorescent in situ hybridization. This is a process in which a sequence that you want to locate is labeled with a fluorescent tag. The DNA is denatured (so it is single stranded) and then allowed to hybridize to the chromosomes. The fluorescent tag will appear wherever that sequence is found. The image at the lower left shows yellow dots at the ends of the chromosomes because the sequence used was a telomere sequence. Hence the telomeres of the chromosomes are fluorescing.

Phenotype = Genetic + Environment (P = G + E)

G = A + D Genetic = Additive + Dominance Effects A = Average effect of substituting A for a in genotype. D = Dominance effects due to specific combinations of alleles at a locus D = 0 if the value of Aa is exactly between the values of AA and aa. Variation due to genetic + environment We can think of the phenotype as being a combination of the genetic and environmental effects. Let's abbreviate phenotype as P, genetic as G and environment as E. This gives us P = G+E The genetic effect can be further subdivided into additive and dominance effects (A and D) giving us G = A+D The additive effect is the average effect of substituting A for a in the genotype The dominance effect is due to the fact that sometimes the heterozygote is not halfway in value between the two homozygotes - that is sometimes specific combinations of alleles do not add exactly. We can go back to East's work with the tobacco flowers and think about how to partition out the variation that is observed. Look at the F2 generation where all possible genotypes are present. The variation here is due to both genetic and environmental effects.

Phenotypic Variance: Phenotype = Genetic + Environment (P = G + E)

G = A + D. Genetic = Additive + Dominance Effects. A = Average effect of substituting A for a in the genotype (additive effects of genes on the phenotype). D = Dominance effects due to specific combinations of alleles at a locus. If Aa is exactly between AA and aa, there are no dominance effects (D = 0). We cannot look at an individual and determine the values for G, E, A, D. Again, think of the phenotype as a combination of the genetic and the environmental effects. Think of the genotypic value as a combination of the additive and dominance effects. If the heterozygote is not directly in the center of the two homozygotes, you have dominance effects. However, we cannot look at an individual and determine the values of G, E, A, and D. We must look at a population and partition the variance we see in the phenotypes of the individuals in the population into genetic and environmental effects, and then further break down the genetic effects into additive and dominance effects. Going back to Edward East's graphs: this is the F2 generation, where we have all possible genotypes present and we are still having environmental effects. We can look at the variation we observe. This would be the phenotypic variance. That variation is due to genetic effects and to environmental effects. We can further partition the genetic variance into additive genetic variance and dominance genetic variance.

Retinoblastoma - RB Gene

G1 to S transition. RB normally prevents E2F from activating replication. Late in G1, RB releases E2F and replication can begin. Let's take a closer look at the mechanism of action of the RB protein. You do not need to know the mechanism in this slide. It is provided for those who want a bit more detail. RB is a tumor suppressor gene. E2F is a transcription factor. When everything works properly, the RB protein binds to E2F. This keeps E2F from activating genes that cause replication. Late in G1, RB releases E2F so that replication can begin. When the RB protein is not correctly produced, the regulation of transcription is altered and cells can replicate at the incorrect time.

Allopatric Speciation

Geographic Barrier initiates speciation by blocking gene flow Think about one species and the ways that divergence can allow speciation to occur. One of the easiest things to envision is a geographical barrier that initiates speciation by blocking gene flow. The population is split into two groups. For example, the population is split by a mountain range or a river so that part is on one side and the other part is on the other side of the geographical barrier. Over time, the two groups accumulate different mutations and face different selection pressures. Genetic drift operates differently in the two populations. Eventually, they can accumulate enough differences so that when the population reconnects, they cannot interbreed. This is called allopatric speciation and is the type of speciation that involves a physical barrier to reproduction.

GFP as a Reporter Gene to Detect Gene Expression

GFP linked to a protein allows us to visualize when the protein is expressed in living cells over time. GFP = green fluorescent protein (from a jellyfish). The GFP gene is joined to the protein gene. The gene is put into the organism. The organism produces a fusion protein. Shine UV light on the organism. GFP fluoresces. This allows expression of the protein to be seen over time in a living organism. Note: GFP is a small protein, so as part of the fusion protein it doesn't interfere much with the activity of other components of the cell or their functions. GFP can also be used as a reporter gene and is probably one of the most frequently used reporter genes. GFP stands for green fluorescent protein. This is a protein from a jellyfish that will fluoresce under UV light. In many cases, the GFP gene is joined to the gene for the protein that we want to study. The combination is put into our organism. The organism produces a fusion protein. If you shine UV light on the organism, the GFP fluoresces. This allows us to see when our protein is expressed over time in a living organism.

Genetic and Environmental Causes

Genetic Causes • Single gene • Polygenic (more than 1 gene) • Chromosome aberration • Mutation(s) in somatic cell or in gamete producing cell • Viruses. Environmental agents (carcinogens) • Can cause mutations • Can alter gene expression. There are both genetic and environmental causes for cancer. From a genetic point of view, we have identified single genes, polygenic inheritance, chromosome aberrations and viruses that can lead to cancers. These mutations can be in either the somatic cells or in gamete producing cells, but we typically think of them as occurring in somatic cells. There are lots of environmental agents that are cancer causing, or carcinogenic. They can act by causing mutations or they can alter the expression of genes. You can probably think of some environmental agents that can cause cancer, such as asbestos, lead, x-rays, UV light, and chemicals in tobacco.

GINA

Genetic Information Nondiscrimination Act (2008) • Health insurance companies cannot decrease coverage or increase rates based on results of genetic tests. • Employers cannot discriminate (hire/fire, etc.) based on results of genetic tests. • Neither health insurance companies nor employers can require genetic testing.

Correlation of the genetic, cytological, and physical maps of a chromosome

Genetic Map: constructed from recombination frequencies in units of centimorgans. Cytological Maps: Banding patterns based on chromosome staining Physical Maps: molecular distances in base pairs (bp), kilobases (kb), megabases (mb) 1000 bp = 1 kb 1 million bp = 1 mb Physical maps often show restriction enzyme recognition sites, locations of particular clones or sequence-tagged sites (STS) In humans, 1 cM is about 1 mb Let's look at the three types of maps that need to be correlated: Genetic, Cytological, and Physical maps of a chromosome. The genetic map is constructed from recombination frequencies in units of centimorgans. The cytological maps involve the banding patterns on the chromosome and are based on chromosome staining. The physical maps are molecular distances in base pairs (bp), kilobases (kb), megabases (mb). Note that 1000 bp = 1 kb 1 million bp = 1 megabase pair (mb) Physical maps often show restriction enzyme recognition sites, locations of particular clones or sequence-tagged sites (STS) In humans, 1 cM is about 1 mb , but note that the markers that map to more than 1 map do not map straight across. While the gene/marker position is in the same order, the different maps result in different distances between the sites. This is partly due to a non-uniform amount of recombination along the length of the chromosome and due to a non-uniform amount of coiling and uncoiling along the length of the chromosome.

G x E (VGE) Interaction is also part of VP

Genetic by Environment Interaction The effect of the gene depends on the environment in which it is found Here, AA performs better in dry environments, but aa performs better in wet environments We will ignore this in our calculations, but realize that it exists and is important! VP = VG + VE + VGE There is another component of phenotypic variance: The interaction between genotype and environment. The effect of the gene depends on the environment in which it is found Here, AA performs better in dry environments, but aa performs better in wet environments We will ignore this in our calculations, but realize that it exists and is important! So more precisely, VP = VG + VE + VGE

Evolution

Genetic changes in the composition of a Population including •emergence of species •divergence of species •extinction of species • Involves variation, heredity and selection If variation is not heritable, then it cannot be passed to progeny Selection works on entire organism's phenotype so many loci as well as environmental factors are important Study diversity that exists in a population and between populations and the factors that can cause diversity. Evolution is defined as changes in the composition of a population including the emergence of new species, the divergence of species and the extinction of species. It involves variation, heredity and selection. If the variation is not heritable, it cannot be passed to the progeny. Selection works on the entire organism's phenotype so there are many genes that are important and environmental factors are also important. Please note that the change in the population must be genetic. The change must occur in a population, not an individual. The gene pool evolves, the individual does not evolve. Remember when we studied population genetics that we looked at factors that influenced diversity within a population and factors that influenced diversity between populations. These same factors are important in evolution

Another long term selection experiment is for an increased number of bristles on the back of the thorax in Drosophila.

Here one line was allowed to randomly mate without selection. That is the control line. The other line was selected for increased bristles at each generation. It appears that there was an increase for about 20 generations and then the progress from selection leveled off - probably due to approaching a biologically limiting factor (such as room for the bristles and accompanying anatomical features). Figure 24.22 The response of a population to selection often levels off at some point in time. In a response-to-selection experiment for increased abdominal chaetae bristle number in female fruit flies, the number of bristles increased steadily for about 20 generations and then leveled off.

How can we use DNA Sequence Data?

Homologous sequences are evolutionarily related Orthologs: homologous sequences found in different species Paralogs: homologous genes in the same species and arrive through gene duplication How can we use DNA sequence data? One way we can use the sequence data is to compare related species and related genes within a species. Homologous sequences are evolutionarily related - that is, they will have similarities in their sequences. You may remember these terms: Orthologs are homologous sequences found in different species. Paralogs are homologous sequences found within a single species and arrive through gene duplication. There are programs in which a sequence can be typed and then databases can be searched to find similar sequences in different species. These can be analyzed in evolutionary studies and the information on similar genes can be used to help learn about gene function and regulation.

Homologous Sequences are Related Evolutionarily

Homologs • Paralogs: homologous sequences in the same species that arise through gene duplication • Orthologs: homologous sequences found in different species. Homologous sequences are evolutionarily related. Homologous sequences can be within the same species or in different species. Let's start with the A gene. A gene duplication occurs in one species. Now this species has 2 copies of the A gene. Time goes by and each copy gains different mutations. Let's call the gene in the original position gene A1, since it is somewhat different from the original due to mutation and selection through anagenesis. The duplicate also changed (through anagenesis) and we will call it A2. Over time, this population may split into two groups (cladogenesis) with different changes occurring in the two groups. Now changes can continue to accumulate. In one of the populations we will say that the A1 gene changed over time through anagenesis to the B1 gene. Similarly, the A2 gene changed to the B2 gene. We define paralogs as homologous sequences in the same species that arise through gene duplication. A1 and A2 are paralogs. B1 and B2 are also paralogs. Orthologs are homologous sequences found in different species. A1 and B1 are orthologs. A2 and B2 are also orthologs. We can compare orthologous and paralogous sequences to evaluate the amount of change over time between and within species.

Example 1: Mass Selection as a Percentage

I have a population of thyme in my garden with a mean yield per plant of 60 mg and a standard deviation of 2.3 mg. I want to select the top 5% of my population to be parents of the next generation. I know that the narrow-sense heritability for yield in this population is 0.45. What is the expected mean of the progeny generation? Values of Z and I will be negative if we select to decrease a trait (left side of curve). Truncation point = XT = Initial mean + Zσ = 60 + (1.645)(2.3) = 63.7835 mg. Selection Differential = S = Iσ = (2.063)(2.3) = 4.7449 mg. Response to Selection = R = h2S = (0.45)(4.7449) = 2.1352 mg. Mean of Progeny Generation = X1 = X0 + R = 60 + 2.1352 = 62.1352 mg. The table provided here is the Z and I table you will get on the test. We are selecting the right tail of the normal distribution with p = 0.05 for 5%, so we are using the highlighted values on the table. The truncation point, selection differential, and response to selection are calculated as shown here. Note that I rearranged the heritability formula h2 = R/S to solve for the response to selection. I rearranged the formula for the response, R = X1 - X0, to solve for the mean of the progeny generation. You may find it helpful to draw out the initial population and progeny curves and label them to help you relate the problem to the previous slide. If you know your formulas, this is simply a matter of using the proper formula and rearranging formulas algebraically as needed. Big hint for success: Know your formulas!
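The same calculation can be written as a short Python sketch. The inputs are the values from the problem (mean 60 mg, σ = 2.3 mg, h2 = 0.45) and the tabulated Z = 1.645 and I = 2.063 for p = 0.05; the function and variable names are only assumptions for illustration.

```python
def mass_selection(mean0, sd, h2, z, i):
    """Truncation point, selection differential, response, and progeny mean."""
    truncation = mean0 + z * sd   # XT = X0 + Z * sigma
    s = i * sd                    # S = I * sigma
    r = h2 * s                    # R = h2 * S
    progeny_mean = mean0 + r      # X1 = X0 + R
    return truncation, s, r, progeny_mean


# Values from the thyme problem; Z and I are read from the table for p = 0.05.
xt, s, r, x1 = mass_selection(mean0=60, sd=2.3, h2=0.45, z=1.645, i=2.063)
print(f"XT = {xt:.4f} mg, S = {s:.4f} mg, R = {r:.4f} mg, X1 = {x1:.4f} mg")
```

To select for a decrease in the trait, the same function can be called with negative Z and I values, as noted in the card.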

FBI Database

If you are detained for some federal offenses, your DNA sample can be taken and put in the DNA database! • They do not need a warrant (in some states) to collect DNA from suspects for some crimes. If you are detained for some federal offenses, your DNA sample can be taken and put into the DNA database. In NC, they do not need a warrant to take your DNA sample if you are arrested on suspicion of some violent crimes. If you are exonerated, you can request to have your sample purged from the database. If you are convicted, your sample will remain in the database.

Oncogenes (Dominant-Acting Mutations)

If an individual has two normal alleles for the locus, normal cell division will occur. If one mutant allele is present, excessive cell division occurs resulting in tumor formation. We say that oncogenes are dominant acting mutations. Some common oncogenes are listed here. We will talk specifically about the myc gene and its implication in a specific type of cancer. L37/ 29

Mapping, Sequencing and Analyzing the Structure and Function of Whole Genomes

Involves •Collecting sequence data •Correlating genetic, cytological and physical chromosome maps •Identifying DNA sequences that contain genes of interest in the genome and analyzing the function of those gene products Studying mapping, sequencing and analyzing whole genome structure and function involves collecting sequence data, correlating genetic, cytological and physical chromosome maps, and identifying DNA sequences that contain genes of interest in the genome and then analyzing the function of those gene products

Can we express PSY and CRTI in rice endosperm?

Isolate PSY from Narcissus pseudonarcissus (daffodil) Can we turn the mRNA into DNA? Can we then insert that DNA into rice? It is known that daffodils (Narcissus pseudonarcissus) can produce the PSY enzyme. We can identify the mRNA that is produced for this gene. Remember that the primary transcript of the mRNA has introns that have to be removed. We do not know how many introns and exons are present, and getting the rice to process the primary transcript correctly would probably be a very difficult thing to do. Maybe we can isolate the mature mRNA, turn it into DNA and then put that DNA into the rice - then the correct processing would already be done! Can we turn the mRNA into DNA? Can we then insert that DNA into rice? These are questions that need to be answered

Figure 26.16 Different parts of genes evolve at different rates.

L35 slide 25 This diagram shows similar information to that in the chart on the previous page. This shows a diagram of the eukaryotic gene and the relative frequency of mutation for the different regions. You should be able to discuss the relationship between the frequency of accumulated mutation in the different regions with the functions of the regions and be able to describe what each segment of the DNA is used for. Note that the highest rates of nucleotide substitution are in sequences that have the least effect on protein function.

PCR - Polymerase Chain Reaction

Kary Mullis - developed PCR protocol. Denature DNA by heating to 95°C (allows strands to separate). Each strand serves as a template for replication. Primers anneal (50-65°C) to identify the target that will be amplified. Taq polymerase adds nucleotides to the 3' end of the primer (72°C). Repeat many times. Typically a small amount of DNA is isolated at the crime scene or obtained from an individual. Specific regions of the DNA need to be amplified (that is, copied many times) in order to be visible for analysis. Kary Mullis developed the Polymerase Chain Reaction (abbreviated PCR). He received a Nobel Prize for his work. The idea in PCR is to get many copies of a specific DNA sequence produced automatically. Everything is put into 1 reaction tube. The tube is placed into a thermocycler - a machine that cycles through the necessary temperatures multiple times. There are three steps in PCR: The DNA is heated to about 95°C to denature the DNA. Each strand serves as a template for replication. Primers anneal at about 50-65°C to identify the target that will be amplified. The primers are short DNA sequences that have to be included in the reaction tube. One primer has to bind to each of the template strands just upstream from the sequence you want to amplify. One of the tricks of doing this well is to identify primers that are specific for the region you want to amplify. Then the reaction is heated to 72°C so that Taq polymerase can add nucleotides to the 3' end of the primers. Taq polymerase is a DNA polymerase that was isolated from a bacterium that lives in hot springs. Remember, we are trying to do everything automatically in 1 tube. That means all of our reaction materials must be added at the beginning of the process and they must all be compatible with each other and with the range of temperatures in the reaction. The DNA polymerase from us and from E. coli would be denatured at 95°C. However, the Taq polymerase is stable at this temperature.
We use 72°C for the extension part of the reaction since that is the optimum temperature for Taq activity. This process of denaturing, annealing and extending is repeated about 35 - 45 times so that we produce a lot of copies of the particular DNA sequence that we are interested in. L 38/47
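To get a feel for why repeating the cycle produces so many copies, here is a minimal Python sketch. It assumes an idealized reaction in which every target molecule is copied exactly once per cycle (perfect doubling), which real reactions only approximate; the function name and starting copy number are illustrative assumptions.

```python
def ideal_pcr_copies(starting_copies, cycles):
    """Copies of the target after a number of cycles, assuming perfect doubling each cycle."""
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return starting_copies * 2 ** cycles


# Idealized growth from a single starting molecule over a few cycle counts.
for cycles in (1, 10, 35):
    print(f"{cycles:>2} cycles: {ideal_pcr_copies(1, cycles):,} copies")
```

Actual yields are lower than this ideal because efficiency drops as primers and nucleotides are used up, but the sketch shows why a small starting sample becomes enough DNA to analyze.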

Tumor Suppressor: Retinoblastoma - RB Gene

Knudson's Two Hit Hypothesis - both copies have to be defective in the same cell to allow a tumor to develop. 40% of cases are inherited (1 mutant allele in zygote). Normal protein responsible for regulation at the G1/S checkpoint. Let's take a closer look at the retinoblastoma (RB) gene. Retinoblastoma is a tumor in the eye due to a defective RB gene. Knudson came up with his Two-Hit Hypothesis for the action of tumor suppressor genes based on studies of the RB gene. Simplistically stated, Knudson's Two-Hit Hypothesis proposes that both copies of the gene need to be defective in the same cell in order to allow the tumor to develop. Let's see what this implies from an inheritance point of view. If an individual starts off with 2 normal copies of the gene in every cell, a cell has to accumulate 2 mutations - one in each copy of the gene - in order to lead to tumor formation. If an individual has only normal copies of the gene to start with, it is unlikely that both copies will be mutated in the same cell. However, if that happens, they end up with a tumor in their eye. It is highly unlikely that they will end up with tumors in both eyes, as this would require the same thing to happen in both eyes. We say this is a sporadic case of retinoblastoma. About 40% of the cases of retinoblastoma are inherited, meaning that one copy of the gene is already mutant in the zygote and therefore in every cell of the body. If an individual starts off with a mutant allele in each cell (the inherited form), only one more mutation is required in 1 cell in order to cause tumor formation. These individuals often have multiple tumors in their eyes and typically have tumors in both eyes. By the way, the normal RB protein is responsible for regulation at the G1/S checkpoint and regulates the action of a transcription factor.

This graph shows the results of almost 80 generations of selection for high and low oil content in maize.

L36/60. You can see that the line selected for high oil is still on an upward trajectory. However, the line selected for low oil content is approaching a limit. This is because a certain amount of oil must be present in order for the seed to germinate. Approximately 20 genetic loci that affect the percentage of oil in corn have been identified. Notes: approaching limit (unhealthy level); ~20 loci identified that affect oil %. Figure 24.21: In a long-term response-to-selection experiment, selection for oil content in corn increased oil content in one line to about 20%, whereas it almost eliminated it in another line.

Mass Selection as a Percentage

L36/slide 52. There is another way to do a mass selection experiment. Instead of selecting particular organisms and measuring their mean to get the $\bar{X}_s$ value, we can select the top (or bottom) percentage of the population. In our previous formulas, we calculated the selection differential as the difference between the mean of the selected population and the mean of the original population. In this method, we don't know the mean of the selected population, so we have to use other parameters to estimate heritability. We use the properties of the normal distribution, since we assume that our trait is normally distributed. The proportion selected is abbreviated p. We still have the mean of our population, which is officially mu ($\mu$), but we can use $\bar{X}_0$ as well. We have to know the variance of the original population; remember that this has to do with the spread of the curve. The standard deviation is abbreviated sigma ($\sigma$) and is the square root of the variance. The truncation point is the value above which all individuals in the population are selected to be parents of the next generation. S still stands for the selection differential. R still stands for the response to selection. We use tabulated values of I and Z. I is the selection intensity and is based on the proportion selected. Z is the standardized selection point and is also based on the proportion selected; it is the truncation point on the normal curve expressed as a multiple of the standard deviation. The table of Z and I values will always be given to you.
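As a rough sketch of where the tabulated values come from, the standardized truncation point and the selection intensity can be computed from the proportion selected using the standard normal distribution. The relations S = I × sigma and R = h^2 × S used here are the standard quantitative-genetics forms and are an assumption on my part, since the card itself only says the table will be provided. The function names and the example numbers (top 20% selected, sigma = 10, h^2 = 0.5) are hypothetical.

```python
from statistics import NormalDist

def selection_intensity(p):
    """Return (z, i) when the top fraction p of a normally distributed trait is selected."""
    std_normal = NormalDist()          # mean 0, standard deviation 1
    z = std_normal.inv_cdf(1 - p)      # standardized truncation point
    i = std_normal.pdf(z) / p          # mean of the selected tail, in SD units
    return z, i

def predicted_response(p, sigma, h2):
    """Assumed standard relations: S = i * sigma and R = h2 * S."""
    _, i = selection_intensity(p)
    S = i * sigma
    R = h2 * S
    return S, R

# Hypothetical example: select the top 20%, sigma = 10 units, h2 = 0.5
z, i = selection_intensity(0.20)
S, R = predicted_response(0.20, sigma=10, h2=0.5)
print(round(z, 2), round(i, 2), round(S, 1), round(R, 1))
```

On an exam, the values of Z and I would be read from the table rather than computed.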

Construct for Insertion into Rice Genome

L38/32. Need: 1. PSY gene 2. crtI gene 3. Promoters - endosperm specific 4. Poly A signals 5. Transgenic marker 6. LB and RB for insertion into plant genome. Take a look at this construct - that is the term given for the DNA sequence that we will put into the vector. Remember, we want to have the PSY and the crtI gene products produced. You can see that these genes are present in the construct. Each of these genes has its own promoter. Also, each has to have the appropriate cleavage site for addition of the poly A tail (the poly A sites). There is a transgenic marker included so that we can see if the plant cell has taken up the construct. The left and right borders (LB and RB) from the Ti plasmid are required for transfer. I stuck BamHI restriction enzyme sites on both ends of the construct. These are just to indicate that there are restriction enzyme recognition sites there, but I do not know which enzymes were actually used. We allow this to be inserted into the rice plants. The next step is to identify specific rice plants that contain the genes. Credits: http://www.sciencemag.org/content/287/5451/303.long **BamHI sites are hypothetical - I don't know what enzyme was used** Structures of the T-DNA region of pB19hpc used in single transformations, and of pZPsC and pZLcyH used in co-transformations. Representative Southern blots of independent transgenic T0-plants are given below the respective Agrobacterium vectors. LB, left border; RB, right border; "!", polyadenylation signals; p, promoters; psy, phytoene synthase; crtI, bacterial phytoene desaturase; lcy, lycopene β-cyclase; tp, transit peptide.

Paternity Testing

L38/54 Eg. Let each box indicate a repeating unit that is 10 bp long Child #2's father could be either Sam or the Milkman What can we do to determine the correct father? Child #2's father could be either. If Sue gives the 20 bp allele to the child, then the 30 must come from the father. Sam has the 30 bp allele. However, Sue could give the 30 bp allele to the child. Then the 20 bp allele would come from the father. The milkman has the 20 bp allele. This data is inconclusive since either man could be the father based on the information we have. What can we do to determine the correct father? We can test additional genetic loci. The correct father must give 1 allele at every locus to the child so if we test enough genes, we should be able to determine who the child's father is.
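The exclusion logic described above can be sketched in a few lines: at each locus, one of the child's alleles must match the mother and the other must match the candidate father, and a man stays on the list only if he fits at every locus tested. The card gives only the 20 bp and 30 bp alleles at the first locus; the men's second alleles and the whole second locus below are hypothetical values made up for illustration, as are the function names.

```python
def could_be_father(child, mother, man):
    """Each argument is a pair of allele sizes (bp) at one locus."""
    a, b = child
    # Either child allele may be the maternal one; the other must come from the man.
    return (a in mother and b in man) or (b in mother and a in man)

def possible_fathers(child, mother, candidates):
    """Profiles are dicts mapping locus name -> (allele, allele).
    A man is kept only if he is compatible at every locus tested."""
    return [name for name, profile in candidates.items()
            if all(could_be_father(child[locus], mother[locus], profile[locus])
                   for locus in child)]

# Locus 1 follows the card; the 40 and 50 bp alleles are hypothetical fillers.
one_locus_child = {"locus1": (20, 30)}
one_locus_mother = {"locus1": (20, 30)}
one_locus_men = {"Sam": {"locus1": (30, 40)}, "Milkman": {"locus1": (20, 50)}}
print(possible_fathers(one_locus_child, one_locus_mother, one_locus_men))

# Adding a second, entirely hypothetical locus resolves the ambiguity.
child = {"locus1": (20, 30), "locus2": (100, 120)}
mother = {"locus1": (20, 30), "locus2": (100, 110)}
men = {"Sam": {"locus1": (30, 40), "locus2": (120, 130)},
       "Milkman": {"locus1": (20, 50), "locus2": (90, 140)}}
print(possible_fathers(child, mother, men))
```

With one locus both men remain possible; with the extra made-up locus only one does, which is the reason for testing additional loci.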

What technique is used to separate the DNA fragments?

L39/55 Who Had Puppies? Give this a try: I have 4 parent dogs (2 males and 2 females) and 3 puppies. Who are the parents of each puppy? Pause and work this out and then go to the next slide to check your answers. Puppy 1's parents are Mom 1 and Dad 1. Puppy 2's parents are Mom 2 and Dad 1. Puppy 3's parents are Mom 2 and Dad 2. Remember, each puppy must get 1 allele from each parent!

Practice Problem with Paternity Testing: Jane wants to collect child support for her two children, Reba and Kris. The information below was gathered by amplifying two STR regions in each individual and separating the fragments using electrophoresis. Determine who (if any) of these men could be the father for each child. Write the name(s) of all possible fathers for each child in the spaces provided. Remember that each parent should contribute one allele at each locus to each child. L38/ 59-59

Let's look at Reba first. Jane has to give one band to Reba. The only band Jane has in common with Reba at locus 1 is Reba's upper band. This means Reba's lower band had to come from her father. Dan does not have this band, so he is not Reba's father no matter what he has at the other locus. Both Joe and Bill have the lower band in common with Reba so they might be the father, but we still have to look at the second locus. At the second locus, the upper band in Reba's lane must come from the father. Even though Dan has this band, we already know he is not the father since he could not give the father's band at the first locus. Joe does not have this band, so Joe is not Reba's father. Bill has Reba's upper band, so Bill could be Reba's father. Now let's look at Kris. At locus 1, the upper band in Kris's profile came from her mother. The lower band came from the father. This rules out Bill. At locus 2, Jane gave the upper band to Kris. The lower band must come from the father. This rules out Joe. Dan has the correct band at both loci, so Dan could be Kris's father.

Additive Effects of Genes

Let's look at a simple example with 2 genes to illustrate what I mean about the additive effects of genes. Suppose that there are 2 genes controlling plant height. The genotype aabb is 10" tall. Each A or B allele adds 5" to the height. How tall is a plant of genotype AABB? The AABB plant is 30 " tall. Since there are 4 dominant alleles we add 4 x 5 to the 10 inch base height to get 30". Now cross AABB x aabb. What is the genotype of the F1? How tall is the F1? The F1 genotype is AaBb. It is 20" tall: Since there are 2 dominant alleles, we multiply 2 x 5 = 10 and add that to the base height of 10" to get a 20" height. Now, cross two F1 together. What genotypes will be present in the F2 and how tall is each genotype? If you do the Punnett square for the cross, you have 9 genotypes. I have listed them here, but I grouped them by the number of dominant alleles present. This results in only 5 phenotypes. If we had more loci, we would have greater complexity.
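The F2 of this cross can be enumerated directly as a check on the Punnett square. This is a minimal sketch using the numbers from the card (10" base height, 5" per A or B allele) and assuming the two genes assort independently; the variable and function names are my own.

```python
from collections import Counter
from itertools import product

BASE_HEIGHT = 10   # inches, genotype aabb
PER_ALLELE = 5     # inches added by each A or B allele

def height(genotype):
    """Height in inches: base plus 5 for every uppercase (A or B) allele."""
    return BASE_HEIGHT + PER_ALLELE * sum(allele.isupper() for allele in genotype)

def genotype_of(gamete1, gamete2):
    """Combine two gametes such as 'Ab' and 'aB' into a genotype such as 'AaBb'."""
    locus_a = "".join(sorted(gamete1[0] + gamete2[0]))
    locus_b = "".join(sorted(gamete1[1] + gamete2[1]))
    return locus_a + locus_b

# An AaBb F1 makes four gamete types in equal proportions (independent assortment assumed).
gametes = ["AB", "Ab", "aB", "ab"]
f2_genotypes = Counter(genotype_of(g1, g2) for g1, g2 in product(gametes, repeat=2))
f2_heights = Counter()
for genotype, count in f2_genotypes.items():
    f2_heights[height(genotype)] += count

print(len(f2_genotypes), "genotypes")
for h in sorted(f2_heights):
    print(f'{h}": {f2_heights[h]}/16')
```

The 9 genotypes collapse into 5 height classes because only the number of A and B alleles matters, not which locus carries them.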

Making Recombinant DNA: Selectable and Insertional Markers

Let's see if we can put a piece of the blue DNA into the pBR322 plasmid. We notice that both have cut sites for the enzyme BamHI. Since there are 2 BamHI sites on the blue DNA, we will have sticky ends on both sides of the piece we want to insert. When we cut the circular plasmid DNA we have sticky ends for BamHI on both sides of the cut. We cut both the plasmid and blue DNA with BamHI, which cuts at the places where the scissors are. We mix them together and the sticky ends anneal. Then we add DNA ligase. The ligase serves as molecular glue to restore the sugar phosphate bond after the sticky ends come together. We end up with several products from this ligation experiment: The sticky ends from the insert can stick together making a small circular piece of the blue DNA molecule. This is not what we want. This does not have any resistance genes present and cells that contain this will be ampicillin and tetracycline sensitive. We can also recreate the original pBR322 plasmid if the sticky ends close back together in the original manner. This is also not what we want. Note that this product will have the resistance genes to both ampicillin and tetracycline. The final product is the recombinant molecule and is what we are trying to make. The insert disrupted the tetr gene so the two pieces of the resistance gene on the map of the recombinant plasmid now say tets. These are circled in for you in red. The recombinant plasmid still has the ampicillin resistance gene. All of these molecules are in the same test tube. We allow competent cells to take up these products in a transformation experiment. Now we have to identify the cells that have the recombinant plasmid

Genetic Variance: G = A + D

Let's take a simple example with two alleles at one locus to explain the idea of additive and dominance effects. The bb genotype has a height of 2 inches. If we substitute a B for one of the b's, we get the Bb genotype. Let's say its height is 4 inches. If we do another substitution of B for b, we get genotype BB, which has a height of 6 inches. We did 2 substitutions. Our total change in value was 4 inches. The average change per substitution was 2 inches. And the heterozygote has a value exactly between the homozygotes. This model shows only additive effects. There are no dominance effects. Now let's make the BB and Bb genotypes both have a height of 6 inches. The bb still has the 2 inch height. Overall, going from bb to BB, we changed 4 inches and made two allele substitutions, so the average effect is still 4 divided by 2 = 2 inches. However, there is complete dominance here since the heterozygote and the BB genotype have the same value. One more: this time let the Bb genotype equal 5 inches. This is a case of partial dominance since the heterozygote is between the two homozygotes, but not exactly between them. The average effect is still 2 since going from bb to BB we changed 4 inches with 2 changes (4 divided by 2 = 2). There is some dominance here since the heterozygote is not exactly between the homozygotes, but it is not complete dominance since the heterozygote is less than the BB genotype. Additive genetic effects primarily determine the resemblance between parents and offspring. Since a parent only gives a child one allele at each locus, most children will not have the same genotype for both alleles at a locus as a parent has.
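The three cases above can be checked with a short sketch that computes the average effect per substitution and how far the heterozygote sits from the midpoint of the two homozygotes. The function name and the labels are my own; the "overdominance" branch covers heterozygotes outside the homozygote range, a case the notes do not discuss and which is included only so the function handles every input.

```python
def describe_locus(bb, Bb, BB):
    """Return (average effect per substitution, dominance deviation, label)."""
    average_effect = (BB - bb) / 2     # total change divided by 2 substitutions
    midpoint = (BB + bb) / 2           # where a purely additive heterozygote would sit
    deviation = Bb - midpoint          # departure of the heterozygote from the midpoint
    if deviation == 0:
        label = "additive only"
    elif abs(deviation) == abs(average_effect):
        label = "complete dominance"
    elif abs(deviation) < abs(average_effect):
        label = "partial dominance"
    else:
        label = "overdominance (not covered in the notes)"
    return average_effect, deviation, label

for heights in [(2, 4, 6), (2, 6, 6), (2, 5, 6)]:
    print(heights, describe_locus(*heights))
```

In all three examples the average effect stays at 2 inches; only the position of the heterozygote changes.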

Multi Factor Hypothesis: Edward East - flower tube length in Tobacco

Look at the different graphs. One of the first to investigate the factors important in quantitative traits was Edward East. He studied flower tube length in tobacco. East had two inbred lines: one with a short flower length and one with a long flower length. He mated them together to make an F1 and then mated the F1 together to get the F2 generation. Note that the F1 generation mean is between that of the two parent lines. The means of the F2 and the F1 are very similar. Now look at the VARIATION that is present in each generation. In the inbred lines, all of the variation must be due to the environment since there is only one genotype in an inbred line. In the F1 generation, all of the variation in the plants is also due to the environment: each inbred line should produce only 1 type of gamete, so the F1 plants will all have the same genotype. In the F2, we have all possible genotypes and we also have environmental effects, so the variation seen is due to both the genotype and the environment. L36/slide 7

Can we see when a gene is expressed?

Luciferase gene from firefly - glowing color. Attempted to put into tobacco plant. T-DNA + insert inserted into 1 plant chromosome - heterozygous plant that glows. Mendelian inheritance pattern of 3 glowing : 1 non-glowing. Why do this? Use as reporter gene. If we place regulatory sequences upstream from the luciferase gene, regulation of luciferase can be monitored during development. Is there a way we can see when a gene is expressed? One way to do this is with a reporter gene. The luciferase gene from a firefly can be used as a reporter gene. Researchers attempted to put this gene into a tobacco plant using T-DNA as the vector. As you can see in the image at the bottom of the slide, it worked! Another thing that was cool was that they were able to show that the inserted gene segregated in a Mendelian inheritance pattern consistent with the insertion originally occurring in 1 copy of the chromosome. When that plant made gametes, half had the insert and half did not. Self-pollinating the plant resulted in ¼ of the progeny not having any copies of the insert, ½ of the progeny being heterozygous, and ¼ of the progeny being homozygous for the inserted gene. Glowing tobacco plants are an interesting thing to see, but what can they be used for? Since we can easily see when this gene is expressed, we can study different regulatory sequences such as promoters and enhancers and see the effect of altering these sequences on gene expression by monitoring luciferase production.

Two Approaches to Genome Sequencing

Map-based sequencing (Classic) •Make genomic library •Fingerprint library and orient clones relative to each other •Screen the library with markers that allow you to relate it to the recombinant map •Map-based sequencing relies on detailed genetic and physical maps to align sequence fragments • Whole Genome Shotgun Approach (Modern) •Small-insert clones (plasmids) prepared directly from genomic DNA and sequenced •Computer programs assemble by examining overlap •Advantage of being highly automated •Variations of this approach are used in most modern high-throughput genome sequencing. There were two main approaches to sequencing the human genome. The first is the traditional approach which researchers had been using for years. It is known as map-based sequencing. A genomic library is made, which means that fragments of the genome are cloned into the same type of vector and stored in the same type of organism such that the entire genome is represented if you look collectively at all the organisms in the library. The library is then fingerprinted and the clones are oriented relative to each other. This involves locating restriction sites and aligning clones to form contigs similar to what you saw on the last slide. The fragments inserted in each clone are sequenced and the sequences are aligned. This relies on genetic and physical maps to align the sequence fragments. The Whole Genome Shotgun Approach is the process that was used by Celera Genomics and is the more modern process used today. The genome is broken into pieces which are sequenced. Computer programs assemble these fragments by examining the overlaps between them. Think of this as sequence first and then align, as opposed to the opposite order in map-based sequencing. This has the advantage of being highly automated. Scientists using both methods combined their information to assist each other in completing the human genome sequence.
Variations of the shotgun approach are now used in most modern high-throughput genome sequencing.

Can you think of things we can measure to evaluate evolutionary divergence?

Morphology. Chromosome Structure. Protein Sequences. DNA Sequences. One of the most obvious is morphology, and various morphological characteristics can be measured and evaluated. Chromosome structure is another thing we can evaluate. The order of the genes along the length of chromosomes differs between related species. This is due to inversions occurring over time. Remember that inversions suppress crossover since gametes produced through recombination within the inversion are not viable. This means that inversion heterozygotes have decreased fertility and decreased reproductive fitness. Selection tends to favor the individuals who are homozygous for either arrangement and does not favor heterozygotes. We can study the pattern of genes along the chromosomes. We can study protein sequences. Many species have enzymes that do the same or similar functions. The amino acid sequences of these proteins can be compared. Much of the time these days, we are looking at DNA sequence differences between species. We will do some examples of this in a bit.

Environmental Factors in Cancer

Most cancers are NOT inherited Incidence rates around the world show that migrant populations take on cancer rates of their host country Most cancers are not inherited: Incidence rates around the world show that migrant populations take on cancer rates of their host country. Environmental factors such as tobacco use, diet, obesity, alcohol, chemical exposure, and UV radiation all play an important role in cancer.

Retroviruses can cause cancer by ...

Mutating and Rearranging proto-oncogenes Inserting strong promoter near proto-oncogenes The viruses that are associated with cancer tend to be retroviruses. Retroviruses have RNA which they can convert to DNA. The DNA then integrates into the host cell's DNA. Viruses often have strong promoters which cause genes near the site of virus integration to be overexpressed. If this integration is near a proto-oncogene, it can stimulate over-expression of the gene causing it to function as an abnormal oncogene. Alternatively, the virus can integrate within a gene and inactivate the gene. In humans, HPV produces proteins that inactivate p53 and RB, allowing the cells to over proliferate without normal controls on cell division.

Mutation

Mutations at the Protein Level •Loss of Function •Complete or partial absence of protein function •Recessive-acting mutations •Gain of Function •Cell produces protein that is not normally present •Either new gene product or gene product in new location or at an inappropriate time in development. •Dominant-acting mutations (usually). •Example: mutation in a gene that encodes a receptor for a growth factor might cause the mutated receptor to stimulate growth in the absence of the growth factor. Let's talk a bit about mutation terminology. At the protein level, a loss of function means a complete or partial absence of protein function. This is a recessive-acting mutation in that it takes mutations in both copies of the gene to knock out the protein function. A gain of function mutation occurs when the cell produces a protein that is not normally present. This can be a new gene product or a gene product that is produced in a new location or at an inappropriate time in development. These are typically dominant mutations since usually one bad copy of the gene can cause the gain of function. For example, a mutation in a gene that encodes a receptor for a growth factor might cause the mutated receptor to stimulate growth in the absence of the growth factor, leading to growth at the wrong time or place.

Tumor Suppressor Genes

Normal gene prevents uncontrolled growth Abnormal gene - no inhibition - results in tumor if no normal allele present Must disrupt both copies of the gene to lose cell cycle regulation (recessive action) Tumor suppressor genes work to prevent uncontrolled growth. Tumor suppressor genes often work at the cell cycle checkpoints to prevent bad cells from completing cell division. If both copies of the gene are abnormal, there is no inhibition on the cell division so bad cells divide, resulting in tumor formation. In order to knock out a tumor suppressor gene's function, both copies of the gene need to be mutated. These genes show a recessive action.

Mutations in Two Types of Normal Genes Can Cause Cancer Development

Normally: •Tumor suppressor genes prevent "bad" cells from dividing • Proto-oncogenes allow "good" cells to divide. We do know that there are classes of genes that can cause cancer development. More than 350 genes have been identified that contribute to cancer, but it is estimated that over 2000 genes can be important! There are two main classes of genes that we know more about which are important in cell division and its regulation. Mutations in these have been implicated in cancer. These classes are tumor suppressor genes and proto-oncogenes. Tumor suppressor genes normally prevent bad cells from dividing. If we have a LOSS OF FUNCTION in a tumor suppressor gene that turns its action off in the cell, then it can no longer prevent the bad cell from dividing and a tumor can result. Proto-oncogenes allow good cells to divide. However, a GAIN OF FUNCTION mutation in a proto-oncogene converts it to an oncogene that is always on, leading to inappropriate cell division and tumor development. Note that the proto-oncogene is the normally functioning gene. The mutated form is said to be an oncogene. These two gene classes are often compared to parts of a car: tumor suppressors are the brakes, oncogenes are the gas. If you do not stop at a stop sign because your brakes give out, you may go into the intersection and have an accident. This is like the loss of function of a tumor suppressor gene. If you floor the gas pedal as you round a bend in the road, you may wipe out and crash into a tree. This is like the over-action of the oncogene.

The farmer is going to attempt to increase grain yield by selecting the highest yielding plants from his field to be parents of the next generation. Average grain yield in the original population is 450 grams per plant ($\bar{X}_0 = 450$). He selects plants to intermate with an average yield of 710 grams/plant ($\bar{X}_s = 710$). The average grain yield of the offspring is 580 g/plant ($\bar{X}_1 = 580$).

Now he has to relate these values to heritability. The selection differential is the difference between the average of the initial population and the average of those selected to be parents. Think of this as the potential for change. The response is the difference between the mean of the initial population and the mean of the progeny generation. Think of this as the actual change. We calculate these as: Selection differential = $S = \bar{X}_s - \bar{X}_0 = 710 - 450 = 260$. Response to selection = $R = \bar{X}_1 - \bar{X}_0 = 580 - 450 = 130$.
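The arithmetic can be sketched as below. The last step, dividing R by S to get a realized narrow-sense heritability, is the standard breeder's-equation relationship (R = h^2 × S) and is my assumption about where the card is heading, since the card itself stops at S and R. The function name is hypothetical.

```python
def selection_summary(mean_original, mean_selected, mean_offspring):
    """Return (S, R, realized h2) for one generation of mass selection."""
    S = mean_selected - mean_original     # selection differential: potential change
    R = mean_offspring - mean_original    # response to selection: actual change
    h2 = R / S                            # assumed relation: R = h2 * S
    return S, R, h2

S, R, h2 = selection_summary(mean_original=450, mean_selected=710, mean_offspring=580)
print(S, R, h2)   # 260 130 0.5
```

Under that assumption, half of the potential change was realized in the offspring.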

Assume 3 genes are responsible and a dominant allele at each locus adds 5 units of risk, but a recessive allele at a locus only adds 2 units of risk and individuals with 25 or more units of risk develop the disorder.

Now, let's look at a cross with the same risk unit situation. Let's look at genotypes AABbCc and AABbcc. We will mate them together in a bit, but for now, count up the risk units in each. Does either of these individuals have the disorder? Neither of these individuals has the disorder. AABbCc has 24 risk units (4 dominant alleles x 5 + 2 recessive alleles x 2). AABbcc has 21 risk units (3 x 5 + 3 x 2).
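The risk-unit count can be sketched as a small function using the values from the card (5 units per dominant allele, 2 per recessive allele, disorder at 25 or more). The second function enumerates the offspring of the cross the card says is coming next; it assumes the three genes assort independently, and both function names are my own.

```python
from fractions import Fraction
from itertools import product

THRESHOLD = 25   # individuals with 25 or more risk units develop the disorder

def risk_units(genotype, dominant=5, recessive=2):
    """Sum risk units over all alleles; uppercase letters are dominant alleles."""
    return sum(dominant if allele.isupper() else recessive for allele in genotype)

def affected_fraction(parent1, parent2):
    """Fraction of offspring at or above the threshold (independent assortment assumed)."""
    loci1 = [parent1[i:i + 2] for i in range(0, len(parent1), 2)]
    loci2 = [parent2[i:i + 2] for i in range(0, len(parent2), 2)]
    # For each locus, every pairing of one allele from each parent is equally likely.
    per_locus = [[a + b for a in l1 for b in l2] for l1, l2 in zip(loci1, loci2)]
    children = ["".join(combo) for combo in product(*per_locus)]
    affected = sum(risk_units(child) >= THRESHOLD for child in children)
    return Fraction(affected, len(children))

print(risk_units("AABbCc"), risk_units("AABbcc"))   # 24 21
print(affected_fraction("AABbCc", "AABbcc"))        # 1/8
```

Only AABBCc offspring (27 units) cross the threshold, so two unaffected parents can still have affected children.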

Clonal Evolution

Over time, tumor cells acquire more mutations that allow them to be progressively more aggressive in proliferation. These mutations can affect •Cell cycle regulation •Signal transduction •DNA repair •Telomere length •Chromosome segregation (many tumor cells are aneuploid) •Vascularization Clonal evolution refers to the fact that over time, tumor cells acquire more mutations that allow them to be progressively more aggressive in proliferation. These mutations can affect Cell cycle regulation Signal transduction DNA repair Telomere length Chromosome segregation (many tumor cells are aneuploid) Vascularization If the mutations lead to a cell type dividing more rapidly or behaving more aggressively, that cell type will tend to take over the tumor development.

About 2/3 of the Mutations Leading to Cancer are the Result of Errors in Replication

Overall, it is thought that about 2/3 of the mutations leading to cancer are the results of errors in replication. This is illustrated in the figures here. These pictures illustrate the proportion of mutations that are inherited (on the left), due to replication errors (center), and due to environmental factors (on the right). 18 tissues are illustrated with the proportions in each category broken out by tissue type. The sum of these three proportions is 100% for each tissue. The color codes for hereditary, replicative, and environmental factors go from white (0%) to brightest red (100%). Based on results like this, we can say that most cancers are genetic (replication errors are genetic) but are rarely inherited (most of these are somatic mutations so are not passed to our children). Abbreviations: B, brain; Bl, bladder; Br, breast; C, cervical; CR, colorectal; E, esophagus; HN, head and neck; K, kidney; Li, liver; Lk, leukemia; Lu, lung; M, melanoma; NHL, non-Hodgkin lymphoma; O, ovarian; P, pancreas; S, stomach; Th, thyroid; U, uterus. [Image: The Johns Hopkins University]

Can he change the grain yield through breeding? How much of the variation is due to genotypic variation?

Phenotypic variance in his corn strain = VP = VG + VE + VGE And the Genetic variance = VG = VA + VD + VI VG genetic variance VE environmental variance VGE genetic-environmental interaction variance VA additive genetic variance VD dominance genetic variance VI genetic interaction variance But let's go back to Farmer Bill. Remember, he wants to increase grain yield in corn. With our new equations, he has to ask a couple of questions: Can he change the grain yield through breeding? In order to do this, he has to have variation present and the variation has to be genetic. So another question is How much of the variation in his corn population is due to genetic variation as opposed to variation due to environmental effects.

Plasmid Vectors

Plasmid vectors have specific features useful for cloning. Plasmids are double stranded circular DNA molecules that replicate independently of the bacterial chromosome. They have features that are useful for cloning. This plasmid is pBR322. It has two marker regions: the ampr and tetr genes. There is a multiple cloning site (MCS) in the tetr gene. This site is a segment of DNA that has a lot of recognition sites for restriction enzymes so that there is flexibility of enzyme choice depending on what enzyme recognition sites are present in the foreign DNA. The multiple cloning site is sometimes called the polylinker - you will probably hear me call it that since that is the way I learned it. If we insert a gene into the polylinker of the tetr gene, the tet resistance gene will no longer be active. This is called insertional inactivation. Another region that is helpful for cloning is the selectable marker, the ampr gene for ampicillin resistance. All cells that take up a plasmid in the transformation experiment, whether recombinant or original, are ampicillin resistant. The ori site allows the plasmid to replicate in the cell independently of the bacterial chromosome. After ligation and transformation, we want cells that are ampr, tets. We need the ampr to indicate that the cell took up a plasmid, i.e. the cell transformed. We need the tets to show that the recombinant plasmid, not the original vector, was taken up by the cell. It is important to have two distinct markers for these purposes. L38/11
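The two-marker logic can be summarized as a lookup table, sketched below. The product names follow the ligation outcomes described in these notes for pBR322 cut with BamHI (self-ligated insert, re-closed vector, recombinant); the dictionary, the function names, and the "no DNA taken up" entry are my own framing for illustration.

```python
# Which resistance genes are functional in a cell, by what it took up.
RESISTANCE = {
    "no DNA taken up":          {"amp": False, "tet": False},
    "self-ligated insert only": {"amp": False, "tet": False},
    "original pBR322":          {"amp": True,  "tet": True},
    "recombinant plasmid":      {"amp": True,  "tet": False},  # insert disrupts tetr
}

def grows_on(cell_contents, antibiotics):
    """A cell grows only if it resists every antibiotic in the plate."""
    return all(RESISTANCE[cell_contents][drug] for drug in antibiotics)

def looks_recombinant(cell_contents):
    """Wanted phenotype: ampicillin resistant but tetracycline sensitive."""
    return grows_on(cell_contents, ["amp"]) and not grows_on(cell_contents, ["tet"])

for contents in RESISTANCE:
    print(f"{contents}: amp plate={grows_on(contents, ['amp'])}, "
          f"tet plate={grows_on(contents, ['tet'])}, "
          f"recombinant phenotype={looks_recombinant(contents)}")
```

Only one row gives growth on ampicillin together with no growth on tetracycline, which is why both markers are needed.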

Heritability

Proportion of the phenotypic variance due to genetic effects. Used to predict rate and amount of selection response in a breeding program. 2 types of heritability: Broad-sense = H^2 = VG/VP = proportion of phenotypic variance due to some sort of genetic effect. Narrow-sense = h^2 = VA/VP = proportion of phenotypic variance due to additive genetic effects. Fortunately for our farmer, there is a measure called heritability that is fairly simple to estimate and can give him an idea of how much of the variation he sees is genetic. Heritability estimates the proportion of the phenotypic variance that is due to genetic effects, and it is used to predict the rate and amount of selection response in a breeding program. Heritability ranges from 0 to 1. Think of the phenotypic variance as a pie that is split into genetic and environmental variance. If heritability = 0, all VP is due to environmental variance. If heritability = 1, all VP is due to genetic variance. The higher the heritability, the more progress we can expect from a selection/breeding program. CAUTION: Heritability values are based on a population, not an individual. For example, suppose heritability = 0.9 for plant height in a population and you have a plant that is 100" tall. This does NOT mean that 90" are due to genes and 10" are due to the environment. It DOES mean that 90% of the phenotypic variation in the population is due to genetic variance and 10% is due to environmental variance.
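A minimal sketch of the two ratios, using the variance decomposition from the Farmer Bill card (VP = VG + VE + VGE and VG = VA + VD + VI). The variance values below are hypothetical numbers chosen only to make the arithmetic easy to follow, and the function name is my own.

```python
def heritabilities(VA, VD, VI, VE, VGE):
    """Return (broad-sense H^2, narrow-sense h^2) from variance components."""
    VG = VA + VD + VI        # total genetic variance
    VP = VG + VE + VGE       # total phenotypic variance
    broad = VG / VP          # H^2 = VG / VP
    narrow = VA / VP         # h^2 = VA / VP
    return broad, narrow

# Hypothetical variance components
H2, h2 = heritabilities(VA=30, VD=10, VI=5, VE=50, VGE=5)
print(round(H2, 2), round(h2, 2))   # 0.45 0.3
```

Narrow-sense heritability can never exceed broad-sense heritability, since VA is only one part of VG.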

Oncogenes

Proto-oncogenes - normally promote cell division, but must be regulated properly. Mutation in a proto-oncogene results in an oncogene that allows uncontrolled cell division. Only need a mutation in 1 copy of a proto-oncogene to get a tumor (dominant mutation), whereas in a tumor-suppressor gene, both copies must be mutated to get a tumor (recessive mutation). Now let's move to oncogenes. Proto-oncogenes normally promote cell division, but they must be regulated properly in order to function properly. Mutation in a proto-oncogene results in an oncogene, which allows uncontrolled cell division. A mutation in only 1 copy of a proto-oncogene is needed to get a tumor. This is considered to be a dominant mutation, whereas in a tumor-suppressor gene we have a recessive mutation, where both copies must be mutated to get the tumor.

Quantitative vs Qualitative traits

Qualitative Traits -Discrete phenotypic classes -Discontinuous variation in F2 •Quantitative Traits: -Continuous variation -No distinct F2 classes Now contrast this with the quantitative trait. Ear length in corn is a quantitative trait. Instead of discrete phenotypes, we have a range of phenotypes and the trait is controlled by more than one gene. We can still start with purebreeding parents of opposite phenotype. The F1 will have a relatively narrow range of phenotypes and will have an average ear length that is intermediate to that of the two parent lines. When we mate the two F1 together, we do not see discrete classes as we would with a qualitative trait. Instead, we see a broad range of phenotypes that show continuous variation in the F2 generation. The average of the F2 is the same as the average of the F1 generation

Regulation of Cell Cycle at Checkpoints

Regulation of Cell Cycle at Checkpoints •G1/S checkpoint - monitors for proper cell size and undamaged DNA •G2/M checkpoint - holds up cycle until replication and DNA repair are complete •M checkpoint - proper spindle formation and attachment •2 classes of proteins: -Protein kinases (phosphorylate proteins) and cyclins (structural protein) interact to guide progression through cell cycle. This diagram shows the regulation checkpoints along the cell cycle. There are three: The G1/S checkpoint - where the cell is monitored for proper cell size and undamaged DNA The G2/M checkpoint - where the cell cycle is paused if needed until replication and DNA repair are complete and the M checkpoint - in which the cell is monitored for proper spindle formation and attachment There are 2 classes of proteins that work at these checkpoints. Protein kinases which phosphorylate proteins allowing changes in their activity and cyclins which are structural proteins that interact to guide the progression through the cell cycle. There are mutations that can result in errors in the cell division process. These are called cdc mutations (cdc for cell division cycle)

Insertional Marker: LacZ gene

Remember the lacZ gene that makes β-galactosidase, which breaks down lactose into galactose and glucose? We can use this gene as one of our marker genes. We aren't going to have the enzyme break down lactose. Instead, we want it to break down a compound that makes a color. β-galactosidase can break down a compound called X-gal into a galactose derivative and a blue pigment. If this happens in a cell, the cell grows as a blue colony as shown in this plate. You may remember that the lac operon had to be induced to produce β-galactosidase. We add IPTG, which is a synthetic inducer, to turn on the lacZ gene. Therefore, in the presence of both IPTG and X-gal, a cell that has a functional lacZ gene grows as a blue colony on the plate. E. coli bacteria that do not produce β-galactosidase are transformed with a plasmid, some copies of which contain an insert in the lacZ open reading frame. For bacteria harboring plasmids with the insert in lacZ, this gene is disrupted and they are unable to make β-galactosidase, resulting in a white colony. For bacteria harboring plasmids without the insert, β-galactosidase is produced, resulting in a blue colony. The identification of recombinant colonies with the use of the lacZ gene in this manner is called blue-white screening.

Ti Plasmid: Goals:

Remove tumor causing genes Leave transfer functions intact Harness the transfer functions so that the plant will be "infected" with a gene of interest Allow inserted gene to function in plant We want to use the Ti plasmid as a vector, but we don't want to cause crown gall disease. We remove the tumor causing genes, leaving the transfer functions intact. We can harness the transfer functions so that the plant will be "infected" with the gene of interest, hopefully, allowing the inserted gene to function in the plant.

Video: The PCR Song

The PCR song is a really cute video that is an advertisement for a particular brand of thermocycler - take a break and listen to it! I am not trying to advertise a brand of thermocycler, but the song may help you remember the steps in PCR! You will need to click to start the video.

Making cDNA

Reverse transcriptase: an RNA-dependent DNA polymerase. We can isolate mRNA for the PSY gene. Then we need to make DNA from the mRNA. We say that cDNA is the DNA that is produced using mRNA as the template. It requires the enzyme reverse transcriptase, which is an RNA-dependent DNA polymerase. An RNA-dependent DNA polymerase makes DNA from an RNA template. It needs a primer, but we are in luck here. Our mature mRNA has a 3' poly A tail. We can make a synthetic poly T primer which can base pair to the poly A tail and serve as the primer for reverse transcriptase. We allow the first strand of the DNA to be produced by reverse transcriptase, but now we have an RNA:DNA hybrid. We need a double-stranded DNA molecule. There are various methods that are used to remove the RNA and replace it with DNA, and we can end up with double-stranded DNA of the PSY gene that is already in the properly processed form for expression.

Amplification of DNA for Identification

STR: Short Tandem Repeats. The allele is based on the length of the DNA segment, with different alleles having different lengths because they have different numbers of copies of a short repeated DNA sequence. Useful loci for paternity/forensics purposes must be polymorphic (many forms). E.g., let each box be 10 bp long. How long are Sue's alleles? Let's look at how PCR and electrophoresis can be used for identification. As mentioned before, the alleles that are used for identification are areas of DNA that are highly polymorphic, but do not identify the physical features of the individual. STR stands for short tandem repeat. These are short segments of DNA that are repeated multiple times along the length of the chromosome. Different alleles have different numbers of copies of the repeat. It is important that these regions are highly polymorphic so that we will see differences between individuals for these regions. Let's look at an example. Suppose the STR length is 10 base pairs. This is the length of 1 copy of the repeat. Sue has 2 copies of the chromosome since she is diploid. One chromosome has 3 copies of the STR. The other has 2 copies of the STR. When these loci are amplified, the primers used are such that the entire series of repeats is amplified. Think of the location of the primers as being where the red dots are on the diagram. One of Sue's alleles will be 30 bp long and the other will be 20 bp long.
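
The arithmetic behind Sue's alleles can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the amplified length is just the repeat copies times the 10 bp repeat unit, ignoring any flanking sequence between the primers and the repeats.

```python
# Illustrative sketch: STR allele length = number of repeat copies x repeat unit length.
# Assumption: flanking DNA between the primers and the repeats is ignored.
REPEAT_UNIT_BP = 10  # each box in the example is 10 bp long


def allele_length_bp(copies, unit_bp=REPEAT_UNIT_BP):
    """Return the length in bp of an allele carrying `copies` repeats."""
    return copies * unit_bp


sue_copies = [3, 2]  # one chromosome has 3 copies, the other has 2
sue_alleles = [allele_length_bp(c) for c in sue_copies]
print(sue_alleles)  # [30, 20]
```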

Amplification of DNA for Identification

STR: Short Tandem Repeats. VNTR: Variable Number of Tandem Repeats. Microsatellite Regions. Let's switch gears and talk about using DNA for identification purposes. You have probably heard about this on TV shows like CSI, NCIS and Forensic Files. The regions that are amplified for this purpose are NOT genes for morphological traits like height, weight, eye color and skin color. Instead, we use highly polymorphic regions where a short segment is repeated different numbers of times in different alleles. These are called by various names: STR (Short Tandem Repeats), VNTR (Variable Number of Tandem Repeats), and microsatellite regions. The name STR, for short tandem repeat, is the one most commonly used.

Mass Selection to Estimate Heritability Heritability = h2 = R/S

Selection differential = S = Xs - X0. Response = R = X1 - X0. We also looked at using mass selection to estimate heritability and used this graph and these formulas with that type of scenario. Note that I have added the response (R) and selection differential (S) values to the graph to help you visualize how these are calculated. The formulas on this slide are also formulas you need to memorize for the test.

Comparison of DNA Sequence Data

Sequencing techniques have become more automated. Each dideoxy form is now labeled with a different fluorescent dye. All 4 dideoxy forms are now put into the same tube as the 4 normal deoxyribonucleoside triphosphates. The sequencing reaction occurs and the fragments are separated on the gel. Now there is only 1 lane. The computer reads out the sequence and prints it out. Here you can see a 4 lane sequence next to the same one lane sequence. The graphical image shows the type of printout that is actually obtained. The high peaks are the bases that are most likely to be present at each position. The small peaks are thought to be experimental "noise". Sometimes the data are not as clear as shown in this picture. In all cases, experiments are replicated to make sure that good data are obtained. From our golden rice point of view, we can look for sequences similar to the PSY gene and determine whether those plant sources produce a large amount of the gene product, to try to find a better gene for our purposes.

Ras Signal-Transduction Pathway Stimulates the Cell Cycle

Signal transduction: External signal triggers a cascade of intracellular reactions to produce a specific response Signal transduction occurs when an external signal triggers a cascade of intracellular reactions to produce a specific response. The Ras signal-transduction path is an important path in cells. The ras protein can be inactive or active to either allow or not allow external stimuli to start the cascade of events in the cell. If ras is activated, it starts a chain reaction that activates MAP kinase which activates transcription factors that stimulate genes taking part in the cell cycle to promote cell division (activates G1 phase of cell cycle). Genes that code for Ras proteins are often oncogenes and mutations in such genes have been often found in cancer cells. Mutations usually cause ras to be "stuck" in its active form thereby constantly stimulating cell division.

Accumulation of genetic "risk" factors

Simple case: 3 genes, assuming no environmental effects. Assume a dominant allele at each locus adds 5 units of risk, but a recessive allele at a locus only adds 2 units of risk, and individuals with 25 or more units of risk develop the disorder. What is the greatest number of risk units with this model? What is the lowest number of risk units possible? Let's work a little problem to see how the accumulation of risk factors works. For simplicity here, we will assume that there are not any environmental effects. This will allow you to see the additive effects of alleles clearly. The genotype with all of the dominant alleles will have the greatest number of risk units. This is AABBCC since there are 3 genes involved. Each dominant allele has 5 units of risk and there are 6 dominant alleles, so 6 x 5 = 30 units of risk. The genotype with the lowest number of risk units is the aabbcc genotype. It has 6 alleles that each have 2 units of risk: 6 x 2 = 12 risk units.
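
The additive model in this problem can be checked with a short Python sketch. The function names are hypothetical and the convention that uppercase letters mark dominant alleles is an assumption made for illustration; the risk values and the threshold come from the problem itself.

```python
# Illustrative sketch of the additive risk model from the problem.
# Assumption: uppercase letter = dominant allele, lowercase = recessive allele.
DOMINANT_RISK = 5   # units of risk added by each dominant allele
RECESSIVE_RISK = 2  # units of risk added by each recessive allele
THRESHOLD = 25      # individuals at or above this develop the disorder


def risk_units(genotype):
    """Sum the risk contributed by every allele in a genotype string."""
    return sum(DOMINANT_RISK if allele.isupper() else RECESSIVE_RISK
               for allele in genotype)


def develops_disorder(genotype):
    """True if the genotype reaches the risk threshold."""
    return risk_units(genotype) >= THRESHOLD


highest = risk_units("AABBCC")  # 6 dominant alleles
lowest = risk_units("aabbcc")   # 6 recessive alleles
print(highest, lowest)  # 30 12
```

Under this model a genotype such as AaBBCC carries 27 units and crosses the threshold, while AaBbCC carries 24 units and does not.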

Ti Plasmid

We mentioned the Ti plasmid when we talked about transposable elements. The Ti plasmid is a tumor-inducing plasmid from Agrobacterium tumefaciens. It causes crown gall disease when the bacterium infects the plant. In order to cause the disease, it inserts its DNA into the plant's chromosome.

Southern Blot

Single-stranded DNA fragments from a gel are transferred to a nylon membrane using capillary action in the Southern blot. The nylon membrane is incubated with labeled, single-stranded probe DNA. Probe: single-stranded DNA that is a sequence we are interested in. The probe binds to complementary DNA fragments on the nylon. The position of the probe is identified. Since we have isolated our DNA and separated the fragments on a gel, we will be using a Southern blot. We want to see which plants have the PSY and crtI genes, so we will use these genes as the probes for our experiment. DNA from 12 different possibly transgenic strains was tested to see if the strains had the PSY and crtI genes incorporated into the genome. Wild type: no PSY or crtI. Transgenic: does have the PSY and crtI genes. Here are 12 possible transgenic plant strains. The top blot used PSY as the probe. The lower blot used crtI as the probe. You can see that both of the genes are missing in the wild type plant (indicated as WT). You can see that both genes are present in the h13 plant.

Can you think of any human traits that would be under quantitative control?

Some examples are height, weight, skin color, and hair color. You can probably think of others as well. I like this set of pictures: A school lined up their students based on height. In the upper picture, taken in 1920, the photo seems to include only male students. In the lower photo (taken 77 years later), the student population seems to include females, and the school was smart to color code the students (males in blue and females in white). They were also smart to put the males closer to the front so that it is easy to compare the male population in the two pictures. The range is broader in the 1997 picture partly due to the inclusion of the coeds. The tallest in 1920 were 5'9". In 1997, the tallest were 6'5". There are a lot of genes that are important in determining height in humans. There are also environmental factors such as diet and nutrition that are important.

Phylogenetics

Phylogenetics is the study of the relationships among and between species, individuals, or genes/alleles based on their characteristics. Phylogenies are constructed by inferring evolutionary relationships among present day organisms.

Sympatric Speciation

Sympatric speciation arises within a single interbreeding population without geographical barriers to gene flow. E.g., races of the apple maggot fly, where resource use is linked to mating preference. The original fly fed on hawthorn tree fruit. A mutation allowed feeding on apples. Those with the mutation mated together more on apple trees (reproductive isolation). Speciation is not complete yet. Gene flow is only 2% between hawthorn and apple flies. Another type of speciation is sympatric speciation. The races of the apple maggot fly are a good example. Here there is strong disruptive selection depending on the resource environment. That is, the selection depends on a different food source and plant host. Once reproductive isolation began, changes started happening that further isolated the groups. This species of insect lays eggs in ripening fruit. Apples ripen sooner than hawthorns, so the apple race has a mating period about 3 weeks sooner than the hawthorn race. These are not yet completely different species, but the gene flow is only about 2% between hawthorn and apple flies. This is a difference in resource use, not a geographical barrier, since there is nothing to prevent both races from being on both types of trees.

Example of an Annotated, sequence-based map

Take a look at this figure. It shows an annotated, sequence-based map of an 8-Mb segment of DNA at the tip of human chromosome 1 as assembled by researchers at Celera Genomics. The top line gives distances in Mb. The next three panels show predicted transcripts from one strand of DNA (the "forward strand"), whereas the bottom three panels show transcripts specified by the other strand of DNA (the "reverse strand"). The middle three panels give the G:C content, the positions of CpG islands, which occur upstream of genes, and the density of single nucleotide polymorphisms (SNPs), respectively. The annotation key below the map of chromosome 1 shows the components of the map, the color code for gene product functions, and the color codes for G:C content and SNP density. As you can see, it is much more than just giving the ATGC sequence of the DNA!

Mass Selection Example #2

The current average petal size in our population is 5.5 cm. The plants with the largest petals were selected to interbreed for the next generation. The mean of these selected plants was 6.4 cm. You already know that narrow-sense heritability for petal length in this population is 0.74. What is the response to selection in this experiment? R = h2 x S = (0.74)(6.4 - 5.5) = (0.74)(0.9) = 0.666 cm. What is the mean of the offspring generation with regard to petal length? X1 = X0 + R = 5.5 + 0.666 = 6.166, or about 6.2 cm. Remember h2 = R/S.
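
The same calculation can be written as a small Python sketch. The function name is hypothetical; the numbers are the ones given in this problem, and the only formulas used are S = Xs - X0, R = h2 x S and X1 = X0 + R.

```python
# Illustrative sketch of the petal-size problem: R = h2 * S, then X1 = X0 + R.
def response_to_selection(h2, mean_selected, mean_original):
    """Return R, the expected change in the mean after one generation."""
    selection_differential = mean_selected - mean_original  # S = Xs - X0
    return h2 * selection_differential                      # R = h2 * S


x0 = 5.5   # mean petal size of the original population (cm)
xs = 6.4   # mean petal size of the selected parents (cm)
h2 = 0.74  # narrow-sense heritability

r = response_to_selection(h2, xs, x0)
x1 = x0 + r  # expected mean of the offspring generation
print(round(r, 3), round(x1, 1))  # 0.666 6.2
```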

Response to selection Extent to which the characteristic changes over a generation

The farmer is going to attempt to increase grain yield by selecting the highest yielding plants from his field to be parents of the next generation. Average grain yield in the original population is 450 grams per plant: X0 = 450. He selects plants to intermate with an average yield of 710 grams/plant: Xs = 710. The average grain yield of the offspring is 580 g/plant: X1 = 580. Farmer Bill thought that this seemed like a lot of crosses to make. He wanted to find a way to estimate heritability based on the plants he would plant without making so many crosses. He wanted to grow his plants, select the best for the next generation, and see if he made progress. He wondered if there was a way to estimate heritability to predict future progress based on this type of experiment. And there is. He measured the average grain yield in his original population as 450 g/plant. We will call this mean X0. He selected the plants to intermate and measured their average yield as 710 g/plant. We will call this mean Xs. Then he intermated the plants and grew out the progeny generation. The average grain yield in the progeny was 580 g/plant, which we will call X1. I have drawn a graph to illustrate this. The solid line is the original population. We assume that the population has a normal distribution, and you can see the mean of that population on the graph. The pink shaded tail is the group of plants that were selected to be intermated. The dashed line is the progeny generation, and you can see that their mean is the mean of the dashed normal curve.
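
Plugging the farmer's three means into the mass-selection formulas (S = Xs - X0, R = X1 - X0, h2 = R/S) gives a heritability estimate. The sketch below is only an illustration with a hypothetical function name; the three yields are the ones stated above.

```python
# Illustrative sketch: estimate narrow-sense heritability from mass selection.
def realized_heritability(mean_original, mean_selected, mean_offspring):
    """Return (S, R, h2) using S = Xs - X0, R = X1 - X0, h2 = R / S."""
    s = mean_selected - mean_original   # selection differential
    r = mean_offspring - mean_original  # response to selection
    return s, r, r / s


s, r, h2 = realized_heritability(450, 710, 580)  # grams per plant
print(s, r, h2)  # 260 130 0.5
```

With these numbers the offspring recover half of the selection differential, so the estimated heritability for grain yield is 0.5.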

Forensics

The suspect is excluded from leaving the piece of evidence at the crime scene if he/she does not match EXACTLY at all loci. Who is excluded here? Let's try a forensic analysis. Here we have some evidence collected at the crime scene, a victim, and two suspects. The alleles at two loci were amplified from each source and the pieces were separated out on the gel here. In a forensic situation, we need a perfect match between the evidence and the suspect at all loci. If a suspect does not match perfectly at all loci, we exclude that suspect. We include the victim in the analysis since it is anticipated that some DNA samples from a crime scene could belong to the victim. Take a look at these data. Who is excluded here? The victim and Suspect 1. The victim is excluded since they do not match at locus B. Suspect 1 is excluded since they do not match at locus A. Suspect 2 is not excluded since they match perfectly at both loci.
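
The exclusion rule can be sketched in Python. The gel itself is not reproduced here, so the allele sizes below are hypothetical values chosen only so that they give the same outcome as the slide (victim mismatches at locus B, Suspect 1 at locus A, Suspect 2 at neither); the function name is also an assumption.

```python
# Illustrative sketch of forensic exclusion: any mismatch at any locus excludes.
# Assumption: allele sizes (bp) are made up to mirror the outcome on the slide.
profiles = {
    "evidence":  {"A": {120, 150}, "B": {200, 230}},
    "victim":    {"A": {120, 150}, "B": {210, 240}},
    "suspect 1": {"A": {130, 160}, "B": {200, 230}},
    "suspect 2": {"A": {120, 150}, "B": {200, 230}},
}


def mismatched_loci(sample, evidence):
    """Return the loci where the sample's alleles differ from the evidence."""
    return sorted(locus for locus in evidence
                  if sample[locus] != evidence[locus])


evidence = profiles["evidence"]
excluded = {}
for name, profile in profiles.items():
    if name == "evidence":
        continue
    mismatches = mismatched_loci(profile, evidence)
    if mismatches:  # a single mismatching locus is enough to exclude
        excluded[name] = mismatches

print(excluded)  # {'victim': ['B'], 'suspect 1': ['A']}
```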

Blue-White Screening

There are several commercially available plasmids that have the lacZ gene as one of the marker genes. pUC19 is one of these. pUC19 has the ampicillin resistance gene as its other marker gene. These plasmids are engineered to have a polylinker within the lacZ gene. It does not disrupt the protein function, but it gives us several enzyme sites that we can use for the insertion. The plasmid is cut with the enzyme, the insert is cut with the same enzyme, and some recombinant plasmids are produced. Note that the recombinant plasmid has a line through the lacZ gene since that gene has been split and is not functional in the recombinant plasmid. We then do the transformation experiment. Some cells do not take up any DNA. Some take up miscellaneous pieces, but not complete plasmids. Some take up the original pUC19 and some take up the recombinant plasmid. We plate the cells on media containing ampicillin, IPTG and X-gal. The cells that do not take up DNA do not grow since they are not resistant to ampicillin. The cells with miscellaneous pieces do not grow since they do not have the ampr gene. The cells that take up pUC19 grow blue. They grow since they have the ampr gene, and they are blue since the lacZ gene is intact. The cells with the recombinant plasmid grow white. Again, they grow because they have the ampr gene. They are white since they do not have a functional lacZ gene. We just select the white colonies since they should have the desired recombinant plasmid. The identification of recombinant colonies with the use of the lacZ gene in this manner is called blue-white screening.
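
The four outcomes on the ampicillin + IPTG + X-gal plate can be summarized as a small decision rule. This is a sketch for study purposes only; the function and dictionary names are hypothetical.

```python
# Illustrative sketch of blue-white screening on ampicillin + IPTG + X-gal media.
def colony_outcome(has_amp_resistance, lacz_intact):
    """Predict what a transformed cell does on the selective plate."""
    if not has_amp_resistance:
        return "no growth"  # killed by ampicillin
    return "blue" if lacz_intact else "white"


# (has ampr gene, has functional lacZ) for each kind of cell after transformation
cells = {
    "no DNA taken up": (False, False),
    "miscellaneous DNA pieces": (False, False),
    "original pUC19": (True, True),
    "recombinant plasmid": (True, False),
}

outcomes = {name: colony_outcome(*traits) for name, traits in cells.items()}
for name, outcome in outcomes.items():
    print(f"{name}: {outcome}")
```

Only the recombinant plasmid gives a white colony, which is why the white colonies are the ones picked.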

Reverse Genetics: CRISPR Mediated Mutations

There is an excellent site that describes many potential uses of CRISPR. I have included the link on this slide. You are not responsible for the material on this slide. I do want you to be able to describe some potential uses of CRISPR, but you can get those from the video on the previous slide. This is for students who want a bit more information.

Tumor Suppressor Genes (Recessive-Acting Mutations)

This diagram shows the recessive nature of a tumor suppressor gene. The normal alleles produce factors that inhibit cell division. If both alleles are mutant, the cells are not prevented from dividing. Even though tumor suppressor gene action is recessive, you are at a greater risk of getting a cancer if you inherit one mutant allele, because then only the one remaining normal copy needs to be knocked out in one cell for a tumor to develop.

Tumor Suppressor Genes -This slide lists several tumor suppressor genes

This slide lists several tumor suppressor genes. APC is implicated in colorectal cancer. BRCA1 is important in some cases of breast and ovarian cancer. p53 is implicated in many types of cancer. RB is implicated in various cancers, but gets its name because it was first correlated with retinoblastoma.

Shotgun Approach

This slide shows the shotgun sequencing approach used by Celera Genomics. Think of this as "blasting the DNA into pieces", sequencing the pieces and then combining the sequences using powerful computer analysis.

Phenotypes of transgenic rice seeds

Transgenic rice seeds make β-carotene. Seeds from the transgenic plants were compared. Photo #1 is the wild type (untransformed rice). Photos 2 - 4 are transformants. The graphs show that β-carotene is being produced in the transgenic seeds (graphs B, C, and D) but not in the control seeds (graph A).

Selecting a Percentage of the Population

Truncation point = XT = µ + Zσ. Selection differential = S = Iσ. Find Z and I from tabulated values based on the proportion of the population selected (p). Note that these values will be negative if you are selecting the left tail of the normal distribution. Use the narrow-sense heritability formulas to help you finish answering the question. Note that µ is the same value as X0. Let's see how these work. We start off with a problem and are told that we want to select a certain proportion of the population. We find the Z and I values for that proportion in our table of values. This drawing shows that we are selecting the upper range of the original population (the pink shaded tail of the population). When we do this, the numbers in our table of Z and I values are the numbers we use. If we select the left tail (the lower numerical values of the distribution), we need to use negative values of Z and I: just put a negative sign in front of the tabulated value. We can use the Z and I values to calculate the truncation point and the selection differential. Then we use the other narrow-sense heritability formulas to finish answering the question.

Monozygotic Twins are Genetically Identical, While Dizygotic Twins Share only 50%

Twin studies are often used. Monozygotic twins are identical twins. They are genetically identical and arise through a single egg-sperm fertilization event with the embryo splitting into two early in development. Dizygotic twins share on average half of their genes with each other - just like regular siblings. They arise through 2 eggs each being fertilized with a different sperm and resulting in two fetuses. Although dizygotic twins are not genetically identical, they do share a similar environment with each other - particularly in utero and in early life.

Cancer Cells have...

Uncontrolled cell division - alteration of the cell cycle. Ability to metastasize - spread to other locations. The cell cycle: G1, S, G2, M. You probably remember the cell cycle diagram from early in the semester. The cell cycle shows the life of a somatic cell: G1 - general growth and metabolism, S - the replication of DNA, G2 - more growth and metabolism, and M - mitosis, the cell division by which one parent cell divides to produce two identical daughter cells. This cell cycle diagram is a bit more complex. When we think of cancer cells, we think of uncontrolled cell division (an alteration of the cell cycle leading to tumor development) and the ability to metastasize.

Just a quick re-cap: Here are some of the important formulas we were looking at for quantitative genetics. You may want to make a list of these and memorize them. You will need to know them for the test.

VP = VG + VE; VG = VA + VD; H2 = VG/VP; h2 = VA/VP; VE = (1/3)(V inbred parent 1 + V inbred parent 2 + V F1). The variance in the F2 generation estimates VP.

VP = VG + VE VG = VA + VD

VP = s2 = phenotypic variance = variance calculated from measurements of the phenotypes of the individuals in the population. VG = genetic variance = variance due to genetic effects. VE = environmental variance = variance due to the effects of the environment. VA = additive genetic variance = amount of variation due to additive effects between alleles. VD = dominance genetic variance = variation due to dominance genetic effects. This gives us the equations: phenotypic variance (VP) = genetic variance (VG) + environmental variance (VE), and genetic variance (VG) = additive genetic variance (VA) + dominance genetic variance (VD).

Phylogenetic Trees Show Relationships between OTUs

Vocabulary: terminal nodes, branches, internal nodes. E.g., Quagga and Grant's zebra are more closely related to each other than they are to Grevy's zebra. Take a look at this phylogenetic tree. Equus is a genus of mammals in the family Equidae, which includes horses, asses, and zebras. Within Equidae, Equus is the only recognized extant genus, comprising seven living species. (Wikipedia) This tree is rooted because there is a common ancestor for all of the OTUs in the tree. In this tree, horses are the outgroup that allows the tree to be rooted. The nodes represent the common ancestors that were present prior to divergence of the two groups. Branches are the evolutionary connections between the organisms. Terminal nodes are the nodes at the end of the tree closest to the pictures. Internal nodes are those nodes located within the middle of the tree (not at either end). We can tell the degree of relationship based on the length of the lines that connect the two OTUs. For example, the Quagga and Grant's zebra are more closely related to each other than they are to Grevy's zebra.

Microarray Analysis of RNA from Cancer and Noncancerous Cells

We can use microarray analysis to compare expression of a set of genes from different cell types. In this case, researchers wanted to determine if they could identify genes that might be used to predict recurrence of breast cancer. The microarray chip contains DNA probes for many genes. mRNA was isolated from cancerous and noncancerous cells from women with breast cancer. The mRNA was converted to cDNA and labeled as red (cancerous) or green (normal). It was incubated with the microarray chip, which allowed it to bind to complementary sequences on the gene chip. The computer could then analyze which genes were producing mRNA in the cancerous cells and compare that to those produced by the noncancerous cells. They were able to identify a set of genes that had an over expression in the cancer cells and others had an under expression in the cancer cells compared to the expression levels in normal cells. This helps to identify genes that might be important genetic factors involved in this type of cancer. These genes can now be studied in more detail.

Discrete Phenotypes due to a Threshold for Trait Presentation e.g. Disease Trait

We have been talking about continuous variation for quantitative traits. There are some situations where there are quantitative traits - i.e. controlled by many genes - where there are only two phenotypes. These are called threshold traits. There is an accumulation of factors (genetic and environmental) that pushes the phenotype over the limit. One example could be susceptibility to disease. In this graph, you can see that most of the population is healthy. However if the value of the phenotype passes a threshold, the individual has the disease. These are still quantitative traits as long as they are controlled by multiple genes, each with a small additive effect and the trait is affected by the environment.

Sample Problem: Organize the clones based on the presence/absence of STSs

We mentioned that we get sequences of DNA in pieces or clones. We mentioned that STS stands for sequence-tagged sites, which are short, unique DNA sequences. We can use the presence of the same STSs in multiple clones to line up the clones relative to each other to form a contig. See if you can organize these three clones based on the presence and absence of the STSs. A + means that the STS was present in that clone. A - means the STS was not present in that clone. Pause the show and see if you can line up the clones as shown in the diagram and indicate the relative locations of the sequence-tagged sites. You can start with any clone. I started with clone X since it was first on the chart. I drew a line to indicate the DNA, put tick marks to represent the STSs, and labeled the STSs. At this point, you can put the STSs in either order since you don't have any additional information to go by. Then you build up your contig by adding information from the other clones. Next, add a clone that overlaps with clone X. Note that STS3 is present in both clones X and Z, and I have lined these up. Finally, add the last clone. STS8 is found in both clones Z and Y. Now we can draw the contig map indicating the position of all of the STS markers.

Which Colonies Contain Recombinant DNA?

We plate the products on media containing ampicillin and then replica plate onto a plate that contains tetracycline. How would you identify clones with the recombinant plasmid that contains the fragment of interest? Note that there are three colonies on the ampicillin plate. Cells that did not take up DNA did not grow at all. Cells that took up only the circular insert DNA do not grow because they are not resistant to ampicillin. When we replica plated onto tetracycline, only colony #2 grew. Colony #2 must be resistant to both amp and tet, so colony #2 contains the original pBR322 plasmid. Colonies 1 and 3 are resistant to ampicillin, but not tetracycline. They took up the recombinant plasmid. Remember that the insert DNA inactivated the tetracycline resistance gene in producing the recombinant plasmid. In summary, colonies 1 and 3 contain recombinant DNA and colony 2 contains the pBR322 plasmid. This is one way to identify cells that contain the recombinant plasmid, but it requires growing colonies on 2 plates of media. It seems like there should be a way to simplify this.
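
The replica-plating logic amounts to comparing which colonies grow on each plate. A minimal Python sketch, using the colony numbers from this example and hypothetical variable names:

```python
# Illustrative sketch of replica plating with pBR322.
# Colonies that grow on ampicillin but NOT on tetracycline carry the insert,
# because the insert inactivated the tetracycline resistance gene.
amp_plate = {1, 2, 3}  # colonies that grew on ampicillin
tet_plate = {2}        # colonies that also grew after replica plating onto tetracycline

recombinant = sorted(amp_plate - tet_plate)       # amp resistant, tet sensitive
original_plasmid = sorted(amp_plate & tet_plate)  # resistant to both

print(recombinant)       # [1, 3]
print(original_plasmid)  # [2]
```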

Example 2: Mass Selection as a Percentage

You wish to increase running speed in a population of lab rats. Your current population averages 87.9 seconds on the race course and has a standard deviation of 5.8 seconds. You plan to inter-mate the fastest 5% of the population to make the next generation. Find the truncation point. What is the mean of the selected parents? Show these on a graph. Read this problem carefully. You want to increase speed, but speed is measured in the number of seconds to complete the race course. If I run the 50 yd dash in 10 seconds and you run it in 7 seconds, who is faster? You are! The faster time has the smaller numerical value. This means we need to select the left tail of the distribution. We can draw a graph for this experiment as shown here. Since we are selecting the left tail of the distribution, we need to use the negative values from the Z and I table. Find the truncation point: $X_T = \mu + Z\sigma = 87.9 + (-1.645)(5.8) = 78.359$ sec. What is the mean of the selected parents? $\bar{X}_S = \mu + S = \mu + I\sigma = 87.9 + (-2.063)(5.8) = 75.935$ sec. Show these on a graph. Why did we use negative numbers for Z and I? We selected the left tail of the population! Z and I are negative since the fastest rats will have the smallest times, so we are actually selecting the left tail of the graph. Remember: selecting the left tail means a negative Z and I, while selecting the right tail of the graph uses positive values.
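If you want to check the table lookups, the same numbers can be approximated from the standard normal distribution. This is a sketch under the assumption that the trait is normally distributed; the function name `truncation_selection` is invented here, and the intensity I is computed as (height of the curve at the truncation point) / (fraction selected) rather than read from the course's Z and I table, so the results differ from the card in the third decimal place.

```python
from statistics import NormalDist

def truncation_selection(mean, sd, fraction, select_low=True):
    """Return (truncation point, mean of selected parents) for a normal trait.

    fraction   -- proportion of the population kept as parents (e.g. 0.05)
    select_low -- True to keep the left tail (small values), False for the right tail
    """
    std_normal = NormalDist()                 # mean 0, standard deviation 1
    z = std_normal.inv_cdf(fraction)          # negative Z that cuts off the left tail
    intensity = std_normal.pdf(z) / fraction  # magnitude of I for this fraction
    if select_low:
        return mean + z * sd, mean - intensity * sd
    # Right tail: same magnitudes, opposite signs (the curve is symmetric).
    return mean - z * sd, mean + intensity * sd

# The rat example: fastest 5% means the smallest times, so the left tail.
truncation_point, selected_mean = truncation_selection(87.9, 5.8, 0.05)
print(round(truncation_point, 2), round(selected_mean, 2))
```

Both values land within about 0.01 seconds of the 78.359 and 75.935 worked out with the table.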

Palladin Gene and Pancreatic Cancer

•Pancreatic cancer is the 4th leading cause of cancer death even though only about 37,000 new cases occur each year. •The palladin gene codes for a cytoskeleton protein that is important in maintaining cell shape. •Cells that metastasize generally have poor cytoskeleton structure, causing them to detach from the tumor mass easily. Cells that metastasize generally have poor cytoskeleton structure, which causes them to detach from the tumor mass easily. Patients with metastatic cancer are difficult to treat, but few genes that play a role in metastasis have been identified. One gene that is implicated in metastasis is the palladin gene. This gene was identified from a family that has a high frequency of pancreatic cancer. Pancreatic cancer is the 4th leading cause of cancer death even though there are only about 37,000 new cases diagnosed each year. Look at this pedigree. Note that there are individuals in each generation that have cancer and precancerous growth. The palladin gene was identified from research on this family's disorder and has not been associated with other familial pancreatic cases. The mutation is interesting, though, since we do not know of many genes that are implicated in metastasis. This gene has been shown to have an altered sequence and to be over-expressed in family members. It codes for a cytoskeleton protein that is important in maintaining cell shape, meaning that it is likely to have a function in metastasis. Further research is being done on the mechanism of action of the normal and mutated forms of this gene. Palladin mutation causes familial pancreatic cancer and suggests a new cancer mechanism. PLoS Med 2006 Dec; 3(12):e516.

Improving the Nutritional Value of Golden Rice

•Tried using the PSY gene from maize, pepper, tomato, Arabidopsis, and carrot •Expressed the different PSY genes in rice using a similar strategy as before •Verified that the protein is expressed Back to the golden rice. The PSY gene was identified in several other plants. Separate experiments were conducted to integrate the PSY gene from maize, pepper, tomato, Arabidopsis, and carrot into rice. The different PSY genes were expressed using a similar strategy to that mentioned earlier. In each case, a western blot was used to verify that the protein was produced. You can see a western blot that shows the protein expression here. Note that the wild type (WT) did not have the gene. The middle lane was the line created using the daffodil PSY gene. The lane on the right was created using the PSY gene from Zea mays (maize, or corn). •Rice with the maize PSY gene has much higher β-carotene than rice with the daffodil gene •Golden Rice 2 (maize PSY) makes 30 µg/g of β-carotene -72 g golden rice/day would potentially be enough to satisfy a child's RDA -A typical child's portion of rice is 60 g -Often children consume more than one portion per day When the PSY gene from maize was used, the plant had much higher β-carotene production than the transgenic plant with the daffodil gene. Golden Rice 2, the rice with the maize PSY gene, produces 30 µg of β-carotene per gram of rice. 72 g of golden rice per day would potentially be enough to satisfy a child's recommended daily amount of vitamin A. A typical child's portion of rice is 60 g, and children often consume more than one portion per day, so this should provide plenty of vitamin A!
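Here is the arithmetic behind those numbers, written out as a quick sketch. The variable names are made up for illustration; the only inputs are the three figures given on the card (30 µg of β-carotene per gram, 72 g per day, and a 60 g portion).

```python
# Figures taken from the card
beta_carotene_per_gram_ug = 30   # µg of β-carotene per gram of Golden Rice 2
rice_needed_g = 72               # grams per day said to potentially meet a child's RDA
portion_g = 60                   # typical child's portion of rice, in grams

# β-carotene contained in the 72 g daily amount
beta_carotene_ug = beta_carotene_per_gram_ug * rice_needed_g

# How many typical portions that daily amount represents
portions = rice_needed_g / portion_g

print(beta_carotene_ug, "µg of β-carotene in", rice_needed_g, "g of rice")
print(round(portions, 2), "portions per day")
```

So 72 g of rice carries 2160 µg of β-carotene, and it takes a little more than one typical portion (1.2 portions) to reach that amount.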

Allopatric Speciation Darwin's Finches

•14 species that evolved from a single ancestral species •Ancestral species migrated to the Galapagos Islands and underwent repeated allopatric speciation Darwin's finches are a good example of allopatric speciation. These are 14 species that evolved from a single ancestral species. The ancestral species migrated to the Galapagos Islands off the coast of Ecuador. As new islands were produced, birds migrated to occupy them. The geographic barrier in this case was the ocean. Over time, behavioral isolation occurred. Most of these species are separated by differences in song type. The number of species of Darwin's finches strongly corresponds with the number of Galapagos Islands. The picture on the left shows the Galapagos Islands. New islands were formed from a stationary volcano when the geological plate moved over it. The plate moved to the east, so the oldest islands are in the east. New volcanoes formed, creating even more islands. The chart on the right shows the parallel increase in the number of finch species as the number of islands increased. There are 19 Galapagos Islands. There are 14 species of finches on these islands. (L35, slide 14)

Current Progress of Golden Rice

•2012 study showed that a bowl of rice (50 g dry weight) provides 60% of the Chinese RDA to children ages 6-8 •PR campaign in many countries •Golden rice blessed by the pope, but not "officially endorsed" by the church 11/7/13 -80% of the Philippines is Roman Catholic •Golden rice could be released commercially in Bangladesh as early as 2018 A 2012 study showed that a bowl of rice (50 g dry weight) provides 60% of the Chinese RDA to children ages 6-8. This is very encouraging! However, people weren't enthusiastic about eating golden rice. It was not the same color as traditional rice, so it was difficult to market. Public relations campaigns were started in several countries. Golden rice was blessed by the pope, but not "officially endorsed" by the church. Since 80% of the Philippines is Roman Catholic, it was thought that this blessing might encourage more acceptance in the communities. Golden rice could be released commercially in Bangladesh as early as 2018. There is still hope that golden rice usage will "catch on" since it does appear to provide the recommended daily allowance of vitamin A and could be very beneficial to populations where rice is the primary food source.

DNA Sequence Analysis Different Parts of the Genome Evolve at Different Rates

•5' flanking region has the promoter, so some sequences will be important for transcription •Leader and trailer (UTRs) are transcribed but not translated, and may contain signals for RNA processing and ribosome attachment •Introns - removed •Pseudogenes - don't code for protein Different parts of the genome evolve at different rates. This chart shows the rates of nucleotide substitutions per site per year for different areas of a eukaryotic gene. The relative values of these numbers correlate negatively with functional importance: mutations that greatly affect the sequence or function of the protein are less common than those that do not have such a large effect on protein function. For example, the lowest number on the chart is for the nonsynonymous changes in the coding region. These mutations are rare, presumably because they alter the protein sequence and probably the function of the protein. Note that the rate for synonymous mutations in the coding region is one of the highest numbers on the table. Think about what important sequences are found in each of the regions and how altering these would affect protein function. The 5' flanking region has the promoter, so some sequences will be important for transcription, but other areas may not be so critical to producing the correct protein. The leader and trailer (UTRs) are transcribed, but not translated. These areas may contain signals for RNA processing and ribosome attachment. The introns are removed. Some areas are important parts of the consensus sequences for proper splicing, but most of the intron does not have a function in determining the final protein. Hence it is not surprising that mutations accumulate at a high rate in intron sequences. Pseudogenes are areas of the DNA that do not code for functional genes, but have similar sequences to the functional genes. They can be produced by gene duplication and can accumulate mutations without causing a disruption in normal function.
Can you label a diagram of a eukaryotic gene with these regions of the DNA?

Mouse Collaborative Cross

•8 inbred mouse lines: 3 wild + 5 lab lines •Mated to randomize genes and break up linkage groups •Now have about 1000 inbred lines that are each mixtures of the 8 original lines A different approach was taken in mice. 8 inbred lines were mated to randomize genes and break up linkage groups. Then inbred lines were developed. There are now about 1000 inbred lines that each have a mixture of genes from all 8 original lines on each chromosome. This was a collaborative effort between scientists from many universities. This picture shows images of the original 8 inbred lines used to form the collaborative cross.

SNP - Single Nucleotide Polymorphism

•A SNP is a specific site in the genome where the DNA base varies in at least 1% of the population. •There are about 10 million SNPs in the human genome. •SNPs located near each other in the genome are often inherited together and can often be grouped as a haplotype (a sequence of SNP patterns along the length of a chromosome). •tagSNPs = the few SNPs used to identify a haplotype •A haplotype of 1000s of SNPs can be identified by only a few SNPs. •Thought that about 100,000 SNPs can be used to identify most haplotypes in humans (remember, there are about 10 million SNPs in humans) The markers that are typically used in GWAS are single nucleotide polymorphisms, or SNPs. A SNP is a particular site in the genome where the DNA base varies in at least 1% of the population. There are about 3 billion base pairs of DNA in the human genome. There are about 10 million SNPs in the human genome. SNPs located near each other in the genome are often inherited together and can often be grouped as a haplotype (a sequence of SNP patterns along the length of a chromosome). tagSNPs are the few SNPs used to identify a haplotype. A haplotype of 1000s of SNPs can be identified by only a few SNPs. It is thought that about 100,000 tagSNPs can be used to identify most haplotypes in humans (remember, there are about 10 million SNPs in humans). We have 3 billion base pairs of DNA - 3 billion is a lot of items to look at. We have about 10 million SNPs. 10 million is still a lot of things to look at. If there are only about 100,000 tagSNPs that can identify most haplotypes in humans, we finally have a small enough number of things to look at that we can manage the data. 100,000 is not nearly as big as 10 million or 3 billion!

Genetic Correlation between Traits

•A change in value for one trait is accompanied by a change in value for another trait because a gene that affects one trait also affects the other trait. •This is pleiotropy (one gene affects multiple traits). •Correlation can be positive or negative Sometimes the correlation between traits is due to a situation where a change in value for one trait is accompanied by a change in value for another trait because the same gene affects multiple traits. This is called pleiotropy. The correlation can be positive or negative. Here are some traits in food animals and model organisms that are correlated. Not all of these are due to pleiotropy, but pleiotropic effects probably are important in some cases. Negative correlations between two desirable traits, such as milk yield and percentage butterfat, can provide challenges to plant and animal breeders who want to increase both to maximize profit in agriculture. Sometimes farmers just have to figure out a way around the issue that does not involve breeding. My father-in-law (yep - Farmer Bill) did this with milk production and butterfat percentage when he ran a dairy. In dairy cows, there is a negative correlation of -0.38 between milk yield and butterfat, so increasing one tends to decrease the other. Dairy farmers want a high volume of milk but are required to have a certain percentage of butterfat in their milk. My father-in-law always raised Holsteins, which produce a high volume of milk, and he selected for increased milk production in his herd. However, he kept a few Jersey cows as well, since Jerseys have a high butterfat component. This allowed the overall milk production to increase, and he still could provide his customers with the appropriate butterfat component. It doesn't solve the genetics conundrum, but it worked for him!

Calculating Distance from DNA Sequences

•After sequences are aligned, the percentage distance x 1000 values between each pair of species can be calculated as (# differences/# nucleotides) * 1000 Calculate the percentage distance x 1000 between each pair of species and place the numbers in the table above. Let's practice getting those values: After sequences are aligned, the percentage distance x 1000 between each pair of species can be calculated as (# differences/# nucleotides) * 1000. Calculate the percentage distance x 1000 values between each pair of species and place them in the proper places on the table. (The values below imply that the aligned sequences in this example are 10 nucleotides long.) Look at Species 1 and 2: There are 3 nucleotide differences. Plugging into the formula, we have a value of 300. Now take a look at Species 1 and 3: There are 2 nucleotide differences. Plugging into the formula, we have a value of 200. Finally, let's look at Species 2 and 3: There are 5 nucleotide differences. Plugging into the formula, we have a value of 500. (L35, slide 35)
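The formula is easy to try out on a computer. The sketch below uses three made-up 10-nucleotide sequences (they are NOT the sequences from the slide) that were chosen only so that the pairwise differences come out to 3, 2, and 5, matching the counts above. The function name `distance_x1000` is also just an illustration.

```python
from itertools import combinations

def distance_x1000(seq_a: str, seq_b: str) -> float:
    """(# differences / # nucleotides) * 1000 for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences * 1000 / len(seq_a)

# Hypothetical aligned sequences, 10 nucleotides each
species = {
    "Species 1": "ACGTACGTAC",
    "Species 2": "TGCTACGTAC",  # differs from Species 1 at 3 sites
    "Species 3": "ACGATCGTAC",  # differs from Species 1 at 2 sites
}

distances = {
    (a, b): distance_x1000(species[a], species[b])
    for a, b in combinations(species, 2)
}
for pair, value in distances.items():
    print(pair, value)
# ('Species 1', 'Species 2') 300.0
# ('Species 1', 'Species 3') 200.0
# ('Species 2', 'Species 3') 500.0
```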

Expression Vectors

•Allow the inserted gene product to be produced •Must contain sequences required for transcription and translation of the gene in addition to other vector characteristics To get the gene to function and have the gene product produced, we have to include the proper sequences for transcription and translation of the gene in addition to the other vector characteristics. Remember that we will need a marker gene and an origin of replication. We also need to make sure that we have a promoter and that the promoter is in the proper orientation relative to the gene we want to express.

Types of Evolutionary Change

•Anagenesis - evolution within a lineage over time • •Cladogenesis - splitting of one lineage into two -Once cladogenesis occurs, the branches evolve separately from each other. -Leads to biological diversity since more species exist at the same time Populations change (or evolve) over time. There are two types of evolution that occur within a group of organisms that are able to reproduce. Anagenesis is the change within a single lineage that occurs over time. This is indicated by a straight line. Cladogenesis occurs when the population splits into two lineages. Once cladogenesis occurs, the two branches will evolve separately from each other. Look at the diagram at the bottom of the page. The furthest point back in time is at the bottom of the diagram. As we move to the top of the diagram, there is one line indicating that the population is changing and anagenesis occurs. Then the population splits into two groups. The splitting is cladogenesis. The point of the split is the location of the common ancestor for the two groups. Anagenesis then occurs in each of the two groups after the split as we continue to read the diagram from bottom to top. This leads to biological diversity since more species exist at the same time.

Vascularization

•Angiogenesis (growth of new blood vessels) is important to tumor progression • •Growth factors and other proteins involved in angiogenesis are often overexpressed in tumor cells • •Angiogenesis inhibitors may be inactivated or underexpressed • •Is it possible to fight cancer by preventing angiogenesis? Metastasis is the cause of death in 90% of human cancer cases! Vascularization - providing an adequate blood supply is important to the ability of a tissue to survive. Angiogenesis is the growth of new blood vessels and is important to tumor progression. Growth factors and other proteins involved in angiogenesis are often overexpressed in tumor cells. Angiogenesis inhibitors may be inactivated or underexpressed. Is it possible to fight cancer by preventing angiogenesis? Would this decrease metastasis? Metastasis is the cause of death in 90% of human cancer cases.

Tumor Suppressors BRCA1 and BRCA2

•BRCA1 and BRCA2 are used to repair double-strand breaks Some other tumor suppressor genes that you have probably heard of are the BRCA1 and BRCA2 genes, which are implicated in breast cancer. These genes are normally used to repair double-strand breaks in the DNA. When either of these genes is not functioning properly, broken DNA does not get repaired faithfully, leading to mutations. These mutant cells then divide and can cause tumors. You do not need to know the mechanism, but it is illustrated here if you want to follow the diagrams. •BRCA1 and BRCA2 account for about 5-10% of all breast cancers •BRCA1 and BRCA2 account for 20-25% of hereditary breast cancers •55-65% of women who inherit a BRCA1 mutation will develop breast cancer •45% of women who inherit a BRCA2 mutation will develop breast cancer •Women in the general population have a 7% chance of developing breast cancer •Strong family history of breast or ovarian cancer may indicate a mutation in one of these genes. •Men with these mutations •Have an increased risk of breast cancer (esp. BRCA2) •Have increased risk of prostate cancer. •Genetic Testing •Myriad Genetics, Ambry Genetics, GeneDx •Cost varies depending on insurance The BRCA1 and BRCA2 genes only account for about 5-10% of all breast cancers. However, they account for about 20-25% of hereditary breast cancers. 55-65% of women who inherit one bad copy of the BRCA1 gene will develop breast cancer. 45% of women who inherit one bad copy of the BRCA2 gene will develop breast cancer. Women in the general population have only about a 7% chance of developing breast cancer. A strong family history of breast or ovarian cancer may indicate a mutation in one of these genes. 
The increased risk of cancer is thought to arise because a person who inherits one mutated copy only has to acquire a mutation in the remaining normal copy for cancer to develop. However, indications are that BRCA1 is haploinsufficient and heterozygous cells are not completely normal, so it may be that a mutation in only one copy of the gene is sufficient to result in abnormal cells. Men with these mutations, especially mutations in the BRCA2 gene, have an increased risk of breast cancer. They also have an increased risk of prostate cancer. Genetic testing can be done for these genes. There are several companies that conduct these tests. Typically a sample of blood is drawn in the doctor's office and sent off to the testing agency. The cost varies from $300 to $3000 depending on whether it is a limited test (only a few areas of the gene are tested) or a full test that covers hundreds of areas of the gene. If you are the first in your family to get tested, you must test more of the gene. If you are comparing your sequence to a known familial mutation, you test less. The test may be covered by insurance, but some prefer anonymous testing even with the protection of GINA (the Genetic Information Nondiscrimination Act).

Types of Blots

•Blotting: Process of transferring molecules that were previously separated (on gel, etc.) to a membrane that is better able to support additional testing. •Southern Blot - DNA fragments separated based on length. •Northern Blot - RNA fragments separated based on length. •Western Blot - Proteins separated based on molecular weight, isoelectric point, electric charge, etc. After we separate out the DNA fragments, we can do a blot. Let's talk about the different blots and then look at the type of data we can expect in our golden rice experiment. Blotting is the process of transferring molecules that were previously separated (on gel, etc.) to a membrane that is better able to support additional testing. In a Southern blot, DNA fragments are separated based on length by electrophoresis and then transferred to a membrane. This process was named for Edwin Southern, who worked out the technique. The names of the other blots are a play on Southern's name. In a Northern blot, RNA fragments are separated based on length by electrophoresis and then transferred to a membrane. In a Western blot, proteins are separated based on molecular weight, isoelectric point, electric charge, etc. and then transferred to a membrane.

Response to selection

•Can use the response to selection to predict the heritability •Response to selection (R): R = narrow sense heritability x selection differential, $R = h^2 \times S$ •Narrow sense heritability: $h^2 = R / S$ Narrow sense heritability is estimated as the response divided by the selection differential. Think of this as the proportion of the possible change that was actually realized in the response to selection. From the farmer's point of view, if heritability is high, the parents and offspring will be more similar to each other than if heritability is low. With a high narrow sense heritability value, a higher proportion of the variation seen is additive genetic variance than with a low heritability value. So Farmer Bill will make more progress from selection if he has a high heritability value for the trait he is measuring than if the value is low. Selection differential: $S = \bar{X}_S - \bar{X}_0 = 710 - 450 = 260$. Response to selection: $R = \bar{X}_1 - \bar{X}_0 = 580 - 450 = 130$. Heritability: $h^2 = R / S = 130 / 260 = 0.5$.
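The same calculation as a short sketch, in case you want to plug in other numbers. The function name `realized_heritability` and its argument names are invented for this example; the three means are the ones from the worked problem above (original population 450, selected parents 710, offspring 580).

```python
def realized_heritability(original_mean, selected_mean, offspring_mean):
    """Return (S, R, h2) from the three population means."""
    S = selected_mean - original_mean    # selection differential
    R = offspring_mean - original_mean   # response to selection
    h2 = R / S                           # narrow sense heritability
    return S, R, h2

S, R, h2 = realized_heritability(original_mean=450,
                                 selected_mean=710,
                                 offspring_mean=580)
print(S, R, h2)  # 260 130 0.5
```

A value of 0.5 means that half of the difference between the selected parents and the original population showed up in the offspring.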

Genetic Change in Population: 2 step Process

•Change occurs (must be genetic) -E.g., mutation causes new alleles -E.g., recombination causes new combinations of alleles •Then different alleles (or combinations) must increase or decrease in frequency in the gene pool (selection and other factors) Genetic change in a population is a 2 step process. First, the change must occur. It must be a genetic change, such as mutation that causes new alleles or recombination that causes new combinations of alleles to be grouped together on a chromosome. Then the different alleles or combinations must increase or decrease in frequency in the gene pool due to selection or other factors.

Cancer Genome Projects

•International Cancer Genome Consortium •http://icgc.org/ •"ICGC Goal: To obtain a comprehensive description of genomic, transcriptomic and epigenomic changes in 50 different tumor types and/or subtypes which are of clinical and societal importance across the globe." •Cancer Genome Atlas •https://cancergenome.nih.gov/ •The Cancer Genome Atlas (TCGA), a collaboration between the National Cancer Institute (NCI) and National Human Genome Research Institute (NHGRI), has generated comprehensive, multi-dimensional maps of the key genomic changes in 33 types of cancer. There are several cancer genome projects designed to identify genetic and epigenetic factors that contribute to different tumor types. This slide gives you a couple of web sites that you can look at for more information.

p53

•Chromosome 17 •Tumor suppressor gene •Functions at G1 checkpoint •Mutated form seen in diverse cancer types (colon, lung, breast, brain) and is found in altered form in 50% of human tumors •The fork in the road: if DNA is damaged, p53 delays cell division until damage is repaired or programs the cell to die -Apoptosis - programmed cell death •If p53 is not working properly, cell division occurs even though DNA is damaged and occurs in an unregulated manner p53 is a tumor suppressor gene that is located on chromosome #17. It functions at the G1 checkpoint. Mutations in p53 have been seen in a wide range of cancer types including colon, lung, breast, and brain. It is found in an altered form in about 50% of human tumors. I have a picture of a fork in the road because the action of p53 determines whether a damaged cell is paused until repairs can be made or whether the cell is too badly damaged to repair and programmed cell death (or apoptosis) occurs. If p53 is not working properly, cell division occurs even though the DNA is damaged, and the cell division occurs in an unregulated manner. These diagrams show the two paths of action that p53 can take. This gene normally functions to determine if a defective cell can be repaired or if the cell should be sent down the path of apoptosis. In response to DNA damage, p53 is activated. It is a DNA binding protein and can stimulate transcription of p21, which stops the cell cycle. In response to extensive DNA damage, p53 stimulates transcription of pro-apoptotic genes of the Bcl-2 family (such as BAX), which leads to programmed cell death, or apoptosis. (L37, slide 24)

How much Heterozygosity is Good?

•Classical hypothesis: Organisms need low levels of heterozygosity so that they will be well adapted to their environment. Selection favors genotypes that are well-adapted to a specific environment so each organism in a specific environment should have the favorable genotype and there should be little variation in the population. • •Balance hypothesis: Organisms need high levels of heterozygosity so that they will have the necessary variability to respond to changes in the environment. A successful population would have lots of variability so that it can produce a variety of phenotypes and can allow the population to adapt to a changing environment. Therefore there should be a lot of variability in the population. There is a debate on how much heterozygosity is good. This is an argument about the importance of the survival of the population vs the optimal condition for an individual. According to the classical hypothesis, organisms need low levels of heterozygosity so that they will be well adapted to their environment. Selection favors genotypes that are well-adapted to a specific environment so each organism in a specific environment should have the favorable genotype and there should be little variation in the population. On the other hand, the balance hypothesis maintains that organisms need high levels of heterozygosity so that they will have the necessary variability to respond to changes in the environment. A successful population would have lots of variability so that it can produce a variety of phenotypes and can allow the population to adapt to a changing environment. Therefore there should be a lot of variability in the population.

Production of Eukaryotic Proteins

•Coding sequences are put under a bacterial promoter and introduced into bacteria via plasmid vectors so that the cells synthesize therapeutic proteins (cultured mammalian cells can also be used) -Human growth hormone, insulin, blood clotting factors •Proteins with industrial applications -Rennin in cheese manufacturing -Enzymes such as bacterial proteases in detergents and meat tenderizers, amylases to degrade complex sugars We can make some proteins in the lab for therapeutic or industrial use. The genes for the proteins are put into a bacterium or a cultured mammalian cell along with an effective promoter. The cells then produce the gene products in large quantities, and these products are harvested and can be used in industry and as medical treatments. There are several gene products that are now manufactured by bacteria and are used as therapeutic proteins. Examples include insulin, blood clotting factors, and human growth hormone. There are other proteins that are important in industrial applications that can be produced in cell culture.

CODIS

•Combined DNA Index System •DNA database funded by FBI •Uses 20 polymorphic regions for forensic identification •These loci assort independently, so the product rule applies regarding the probability of a particular combination of alleles •However, accuracy of probabilities does depend on accurate probability estimates for particular alleles in the population •Typically estimates are about 1 in 1 billion •Data for one population is not necessarily correct for another population •Extended to 20 loci in Feb 2017 The FBI maintains a DNA database of information for some STR loci. CODIS stands for Combined DNA Index System. It is the DNA database funded by the FBI and uses 20 polymorphic STR regions for forensic identification. STR analysis is a very powerful forensic technique. Only a small amount of DNA is required for the test, and the odds are very low that two people will be exactly the same for all of the 20 loci currently used for discrimination. The loci assort independently, so probabilities can be calculated for a particular combination of alleles being present in multiple people. Typically the odds are about 1 in 1 billion! Please note that this does vary between populations. The image here is for the 13 core CODIS loci that were used for years. The number of loci was extended to 20 in February 2017. Updated info, no pic: https://www.fbi.gov/services/laboratory/biometric-analysis/codis/codis-and-ndis-fact-sheet
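Because the loci assort independently, the product rule just means multiplying the genotype frequencies from each locus. Here is a minimal sketch of that idea. The frequencies below are invented for illustration (they are not real CODIS population data), only three loci are shown instead of 20, and the function name `profile_probability` is made up.

```python
from math import prod

def profile_probability(genotype_frequencies):
    """Product rule: multiply the genotype frequency at each independent locus."""
    return prod(genotype_frequencies)

# Hypothetical genotype frequencies at three independently assorting STR loci
example_frequencies = [0.10, 0.05, 0.20]

probability = profile_probability(example_frequencies)
print(f"about 1 in {round(1 / probability):,}")
```

With only three loci the combined frequency is already about 1 in 1,000. Each additional locus multiplies in another small number, which is how a full profile reaches odds on the order of 1 in 1 billion. The answer is only as good as the allele frequency estimates for the population being considered.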

Using SNPs in Association Studies

•Correlate presence/absence of specific SNP or SNP haplotype with presence/absence of genetic disorder • •Advantage: SNP association studies identify DNA sequence differences between individuals more quickly/ more easily than sequencing entire genomes. The goal of using SNPs in association studies is to Correlate presence/absence of specific SNP or SNP haplotype with presence/absence of genetic disorder. The advantage to this method is that SNP association studies identify DNA sequence differences between individuals more quickly and more easily than sequencing entire genomes. •Problem: •Correlation is not equal to causation! (but is a starting point for research) •It takes a lot of research to identify if a particular SNP is meaningful for a specific disorder The problem is that correlation is not equal to causation. Take a look at this graph: Over a period of several years, it appears from the graph that the number of letters in the winning word of the Scripps National Spelling Bee correlates very well with the number of people killed by venomous spiders. Logic tells you that one of these did not cause the other to occur! It takes a lot of research to determine whether or not a particular SNP is meaningful for a specific disorder.

Recombinant DNA

•Creating new DNA molecules combining DNA from different sources -Often uses restriction enzymes to cut DNA strands that can then be ligated together •Restriction enzyme - endonuclease that recognizes a specific DNA sequence and cleaves dsDNA at that sequence •Palindrome - reads the same 5' to 3' on either strand for a segment of DNA -Eg. 5'-GAATTC-3' 3'-CTTAAG-5' Recombinant DNA is the product of joining DNA sequences from different sources together. The ability to reproducibly cut DNA at a specific site, so that work could have predictable results and be replicated, was a key factor in the creation of recombinant DNA. This is accomplished by restriction enzymes. Restriction enzymes occur naturally in bacteria and serve to protect the bacteria from invading foreign DNA. These enzymes are endonucleases that cut double-stranded DNA at a specific site. Enzymes isolated from different bacterial strains recognize different sites. The recognition sites tend to be palindromes. A palindrome in words reads the same backwards and forwards - Mom (my favorite), racecar, and Hannah are examples. In DNA, a palindrome reads the same when going from 5' to 3' on either strand for a segment of DNA.
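One way to check the DNA definition of a palindrome is to compare a sequence with its reverse complement. The sketch below does that and also lists where a recognition site occurs in a longer sequence; the function names and the test sequence are illustrative only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_dna_palindrome(seq):
    """True if the sequence reads the same 5' to 3' on both strands."""
    return seq.upper() == reverse_complement(seq)

def find_sites(dna, site="GAATTC"):
    """Return the 0-based start positions of every occurrence of the site."""
    dna = dna.upper()
    positions = []
    start = dna.find(site)
    while start != -1:
        positions.append(start)
        start = dna.find(site, start + 1)
    return positions

print(is_dna_palindrome("GAATTC"))       # the example from the card
print(find_sites("AAGAATTCGGGAATTCTT"))  # made-up sequence with two sites
```

Note that a DNA palindrome is not the same as a word palindrome: GAATTC does not read the same backwards on one strand, but its complementary strand reads GAATTC when read 5' to 3'.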

Constructing Phylogenetic Trees

•Distance Approach •Computing differences to infer relationships based on overall similarity of organisms, typically by using multiple phenotypic characteristics or gene sequences. •We will be using this approach. • •Parsimony Approach •Infers phylogenetic relationships based on the minimum number of evolutionary changes in the sequence that must have taken place since the organisms had a common ancestor • • •Maximum Likelihood/ Bayesian Approach •Infers relationships based on which gives the maximum probability of obtaining the set of characteristics in the organisms There are several approaches to constructing the tree after the sequences are aligned. Three are listed here: Distance Approach, Parsimony Approach, and Maximum Likelihood or Bayesian Approach. You will not need to know these definitions. We will be using the distance approach. You will need to use this approach to construct phylogenetic trees.
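As a rough sketch of the first step of the distance approach, the code below computes the proportion of differing sites for every pair of aligned sequences and reports the most similar pair, which would be joined first. The OTU names and sequences are made up, and the simple proportion-of-differences measure is an assumption of this sketch rather than the specific method used in the course.

```python
from itertools import combinations

def p_distance(a, b):
    """Proportion of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def distance_table(otus):
    """Pairwise distances for a dict of {OTU name: aligned sequence}."""
    return {(m, n): p_distance(otus[m], otus[n])
            for m, n in combinations(otus, 2)}

# Hypothetical aligned sequences for three OTUs
otus = {"A": "ACGTACGT", "B": "ACGTACGA", "C": "TCGAACTA"}
table = distance_table(otus)
closest = min(table, key=table.get)
print(table)
print("Most similar pair:", closest)
```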

Diversity

•Diversity means that different alleles are present at a locus in a population. • •Is it better for an individual in the population to be heterozygous for these alleles or homozygous for one form? When there are different alleles present at a locus, we say that there is diversity at that locus in the population. There are advantages for a population to have diversity in that there is more ability for some individuals of the population to have suitable alleles to allow survival of the population if the environment changes. However, is it better for an individual in the population to be heterozygous for these alleles or to be homozygous for one allele?

Bloom Syndrome

•Due to a defective DNA helicase enzyme that is important in repairing double stranded breaks of DNA. •Individuals homozygous for mutated BLM gene have a very high rate of cancer. •Heterozygous individuals have an elevated risk for colorectal cancer •Haploinsufficiency: Individuals can still be affected even if they have one normal allele. A single normal allele is insufficient. •Similar pattern in mice: Those heterozygous for the BLM locus were more than twice as likely to develop intestinal tumors as those with two normal alleles. Bloom Syndrome is due to a defective DNA helicase enzyme that is important in repairing double-stranded breaks of DNA. Individuals homozygous for the mutated BLM gene have a very high rate of cancer. Heterozygous individuals have an elevated risk for colorectal cancer. This is haploinsufficiency, since individuals can still be affected even if they have one normal allele. A single normal allele is insufficient to allow normal function. The second DNA helicase allele may never get mutated, but having only one functional copy still increases the risk of accumulating other mutations. We see a similar pattern in mice: Those heterozygous for the BLM locus were more than twice as likely to develop intestinal tumors as those with two normal alleles.

Reverse Genetics: Knockdown Expression using RNAi

•Excess ApoB protein leads to high levels of cholesterol. •Inject lipid coated ApoB synthetic siRNA into Cynomolgus monkeys to 'knockdown' expression. In some cases, you cannot knock out a gene and get the animal to survive. However, altering the level of expression may work well to allow gene function to be studied. RNAi can be used to decrease (knock down) the expression of a gene since RNAi can decrease translation of mRNA or cause mRNA degradation. In this example, siRNA for the ApoB gene was put into the monkeys to knock down expression of the ApoB gene. Excess product of this gene leads to high levels of cholesterol. Take a look at the graph. Note that the level of cholesterol decreased as the dosage of siRNA increased, showing that the expression of the ApoB gene was decreased using RNAi. Do you think this or something similar could ever be developed into a therapy to decrease expression of a gene when too much gene product is produced? That is the direction this type of research is heading, but much more work needs to be done before this is a real medical therapy.

Multiple Factor Hypothesis

•Expression depends on the additive effects of a number of genes. •The effect of each gene is small. •Environment plays an important role in expression of trait. •Smooths curve Based on his observations, East came up with the following Multiple Factor Hypothesis to explain the expression of quantitative traits. First, the expression of the trait depends on the additive effects of a number of genes. Second, the effect of each gene is small. Finally, the environment plays an important role in the expression of a trait. Look at the diagram. The top picture shows the additive effects with one gene. The second with 2 loci, and more loci as we go to the lowest picture. As you add more genes, with effects of alleles adding together, the boxes on the histogram get closer and closer together and the difference in height between neighboring boxes is smaller and smaller until there is very little difference in height between adjacent boxes. You can think of the effects of the environment as smoothing the curve to make a normal distribution.
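The way the histogram fills in as loci are added can be sketched with a simple calculation. The assumptions here are mine, not from the card: every locus has two alleles with equal additive effects, the parents are heterozygous at every locus, and there is no environmental effect. Under those assumptions the number of contributing alleles in the offspring follows a binomial distribution.

```python
from math import comb

def additive_class_frequencies(n_loci):
    """Expected frequency of offspring carrying 0, 1, ..., 2n contributing
    alleles when both parents are heterozygous at n additive loci."""
    n_alleles = 2 * n_loci
    return [comb(n_alleles, k) / 2 ** n_alleles for k in range(n_alleles + 1)]

for n in (1, 2, 3):
    freqs = additive_class_frequencies(n)
    print(f"{n} loci: {len(freqs)} phenotype classes")
    print("  " + "  ".join(f"{f:.3f}" for f in freqs))
```

One locus gives 3 classes (1:2:1), two loci give 5 classes, and three loci give 7. With more loci the classes crowd together, and environmental variation then blurs the boundaries between them.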

Forward vs. Reverse Genetics

•Forward Genetics: Start with a mutant phenotype and seek out the gene that causes that phenotype. -Use chromosome mapping to identify the gene. •Reverse Genetics: Start with a DNA sequence (a genotype), alter its function or prevent its expression and observe the effects on the phenotype. Here are a couple of terms that are frequently used so you should be familiar with them. They really are just different approaches to studying genetics. In Forward Genetics, researchers start with a mutant phenotype and seek out the gene that causes that phenotype. They can use chromosome mapping to identify the gene and then can move to molecular analysis. In reverse genetics researchers start with a DNA sequence (a genotype), alter its function or prevent its expression and observe the effects on the phenotype.

NIH Requirements for Gene Therapy

•Gene must be cloned, well characterized and available in pure form • •An effective method must be available for delivering gene to desired cells and/or tissues • •The risks of gene therapy to the patient must have been carefully evaluated and shown to be minimal • •The disease cannot be treatable by other methods • •Data must be available from preliminary experiments with animal models or human cells and must indicate that the proposed therapy should be effective NIH does have some requirements for using gene therapy: The gene must be cloned, well characterized and available in pure form An effective method must be available for delivering gene to desired cells and/or tissues The risks of gene therapy to the patient must have been carefully evaluated and shown to be minimal The disease cannot be treatable by other methods Data must be available from preliminary experiments with animal models or human cells and must indicate that the proposed therapy should be effective

Genetics vs. Genomics

•Genetics: Study of heredity and the variation of inherited characteristics. • •Genomics: Field of genetics that attempts to understand the content, organization, function, and evolution of genetic information contained within and between whole genomes. Genetics and genomics blend together these days as many researchers are interested in both and in using techniques that span both fields. Genetics is the study of heredity and the variation of inherited characteristics. Genomics is the field of genetics that attempts to understand the content, organization, function, and evolution of genetic information contained within and between whole genomes.

Are there steps you can take to decrease your chance of getting cancer?

•HPV vaccine •Avoid environmental factors that can lead to cancer •Get recommended cancer screenings What are some steps you can take to decrease your chance of getting cancer? One thing would be to get the HPV vaccine You can also avoid environmental factors that can lead to cancer And you can get recommended cancer screenings

Gene Therapy Trial Examples

•Hemophilia B •1 in 30,000 men •Clinical trial in which 10 men were treated with gene therapy, eliminating need for standard treatment (weekly infusions $100,000 to $500,000 per year) •Gene therapy - 1 dose into liver •Gene is an engineered clotting factor that is 8 to 10 times stronger than normal •All 10 benefitted •Unfortunately, 1/3 of hemophilia B patients have immunity to the virus used in the gene therapy, making them ineligible to receive it •https://medlineplus.gov/news/fullstory_170256.html •12/6/17 You may remember that hemophilia is an X-linked recessive disorder and that hemophilia B affects about 1 in 30,000 men. The standard treatment is to infuse the missing blood factor, and this is costly! A clinical trial of 10 men used targeted gene therapy, which was 1 dose of the gene for an engineered clotting factor delivered to the liver. All 10 benefitted from the treatment. Unfortunately, about a third of hemophilia B patients cannot use this therapy because they have an immunity to the virus that is used to deliver the gene. •Adrenoleukodystrophy •Spinal Muscular Atrophy

Limitations of Heritability

•Heritability DOES NOT say how much genes affect the trait...it DOES say how much genes affect the variation in the trait. •An individual DOES NOT have heritability -Eg. A heritability of 0.8 for height DOES NOT mean that 80% of an individual's height is due to their genes. -It DOES mean that 80% of the variation we see for height in that population is determined by genetic variation. •The value of heritability is specific for a particular population in a particular environment and is not expected to be the same in a different population and/or different environment •Also, heritability estimates assume that the environment for related individuals is not more similar than the environment for unrelated individuals. DIFFICULT TO MEET THIS ASSUMPTION WITH PEOPLE! We talked quite a bit about heritability and how to calculate it. There are limitations to the use of heritability. Heritability DOES NOT say how much genes affect the trait...it DOES say how much genes affect the variation in the trait. An individual DOES NOT have heritability. Eg. A heritability of 0.8 for height DOES NOT mean that 80% of an individual's height is due to their genes. It DOES mean that 80% of the variation we see for height in that population is determined by genetic variation. The value of heritability is specific for a particular population in a particular environment and is not expected to be the same in a different population and/or different environment. Also, heritability estimates assume that the environment for related individuals is not more similar than the environment for unrelated individuals. DIFFICULT TO MEET THIS ASSUMPTION WITH PEOPLE! There are other types of studies that can be used to estimate heritability for human traits.
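A small numerical sketch can make the "specific to a population and environment" point concrete. It assumes the simple model in which phenotypic variance is the sum of genetic and environmental variance, with no interaction term; the variance values are invented for illustration.

```python
def broad_sense_heritability(v_genetic, v_environment):
    """Heritability as the share of phenotypic variance due to genetic variance.
    Assumes V_P = V_G + V_E with no genotype-by-environment interaction."""
    v_phenotypic = v_genetic + v_environment
    return v_genetic / v_phenotypic

# Same hypothetical genetic variance, two different environments
uniform_environment = broad_sense_heritability(v_genetic=8.0, v_environment=2.0)
variable_environment = broad_sense_heritability(v_genetic=8.0, v_environment=8.0)
print(uniform_environment, variable_environment)
```

The genetic variance is identical in both cases, yet heritability falls from 0.8 to 0.5 when the environment becomes more variable. Nothing about any individual's genes changed, which is why heritability describes variation in a population and not an individual.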

Ethical Questions Regarding Genomics

•How do you control/regulate the spread of use of your genetic information? • •Is the information going to make life simpler or more complex? • •Who has access to your genetic information? With the increased volume of DNA testing and the increased knowledge about medical conditions associated with DNA sequences, how do you control/regulate the spread of use of your genetic information? Is the information going to make life simpler or more complex? While it may be helpful medically, could there be unfair discrimination based on someone's DNA sequence? Who has access to your genetic information? Do there need to be legal regulations regarding the use of DNA information for medical conditions?

Direct to Consumer Testing

•How much do they cost? •How do you do it? •What do they provide? •How accurate are they? •Is there any physician oversight? Counseling? • •Ancestry.com ~ $ 79 - $ 99 www.ancestry.com •23 and Me www.23andme.com •$99 ancestry only $199 ancestry and limited health information (risk and carrier status) We see ads for Direct to Consumer DNA tests. Some companies such as Ancestry.com test for countries of origin. Some companies such as 23 and me test for ancestry and for various health conditions such as carrier status and determining a risk for particular conditions. I am not trying to promote any direct to consumer DNA tests, and I caution you to read the fine print carefully prior to submitting a sample so that you are aware of what information will be provided, who has access to your information, and how your DNA information can be used

Classic Cancer Model

•Unregulated growth is due to a serial acquisition of genetic events leading to the expression of genes that promote cell proliferation while silencing growth-inhibitory genes and blunting cell death •Cancer is a proliferative disease The classic cancer model is that unregulated growth is due to a serial acquisition of genetic events leading to the expression of genes that promote cell proliferation while silencing growth-inhibitory genes and blunting cell death. Cancer is a proliferative disease. Therapy focuses on trying to shrink or destroy the tumors.

Major Characteristics of Cancer

•Hyperplasia Uncontrolled cell division Immortal and Invasive •Anaplasia Structure/function of cell is undifferentiated •Metastasis Ability to move to and establish tumors at other sites in body Cancer cells/tissues are characterized by three properties: Hyperplasia - uncontrolled cell division with cells that are immortal and invasive. Cancer cells will stack on top of other cells and will grow downward into other tissues. This is called the loss of bottom shelf boundary. The cancer cells can invade body systems (such as blood, lymph, bone, organs, etc.) and ultimately kill good cells by robbing healthy tissues of nutrients. Anaplasia - the structure and function of the cancer cells are undifferentiated compared to normal cells within the same tissue. Metastasis - the ability to move to and establish tumors at remote sites in the body.

DNA Sequence Alignment

•Identification of homologous genes and properly aligning their sequences is critical in determining an accurate tree •Typically performed by computers to minimize the number of evolutionary steps We are going to use DNA sequence data to construct phylogenetic trees. There are lots of different ways to construct the tree. Two important things that have to be done are identifying homologous genes in the OTUs and then aligning their sequences properly. This proper alignment is critical in determining an accurate tree. There are computer programs which conduct the alignments. These are done so that the number of evolutionary steps is minimized between the two sequences. Take a look at this example. Look at the top alignment (the top table) The first 8 nucleotides are given for the first OTU and are labeled as Sequence 1. These have been aligned with the first 8 from the second OTU (labeled sequence 2). Note that there are 4 areas in red that differ between these sequences. Now look at the lower alignment. If we shift the sequence 2 data one base, there are only 2 places that these sequences do not align. This is like saying there was either an insertion or a deletion at that point of the sequence in one of these OTUs.
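The effect of shifting one sequence can be sketched by counting the positions that differ before and after a gap is introduced. The sequences below are made up for illustration and are not the ones in the slide; a gap character "-" is counted as a single difference.

```python
def count_differences(seq1, seq2):
    """Count aligned positions where the two sequences differ.
    A gap ('-') opposite a base counts as one difference."""
    if len(seq1) != len(seq2):
        raise ValueError("aligned sequences must have the same length")
    return sum(a != b for a, b in zip(seq1, seq2))

sequence_1 = "ACGTACGT"
sequence_2_unshifted = "AGTACGTA"  # hypothetical: missing the C at position 2
sequence_2_gapped = "A-GTACGT"     # same data with one gap inserted

print(count_differences(sequence_1, sequence_2_unshifted))
print(count_differences(sequence_1, sequence_2_gapped))
```

Without the gap, almost every position appears to differ; with one gap, a single insertion or deletion event explains the data. Alignment programs search for the arrangement that needs the fewest such evolutionary steps.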

Gene Therapy Trial Examples

•Immune deficiencies •Blood cells can be removed and targeted by retroviruses to deliver wt copy of gene •Severe Combined Immune Deficiency (SCID) •Adenosine deaminase (ADA) deficiency •Hereditary blindness •Eye is accessible and partially protected from the immune system. •Leber congenital amaurosis (6/9 improved vision) •Cancer •For leukemia, remove immune cells and modify them to attack cancerous cells. (26/59 in remission) Here are a few examples of gene therapy trials: Immune deficiency disorders are one target for gene therapy studies. Blood cells can be removed and targeted by retroviruses to deliver a normal copy of the gene. Studies for this use are being done for Severe Combined Immune Deficiency (SCID) and Adenosine deaminase (ADA) deficiency. Hereditary blindness is another target for gene therapy. The eye is accessible and partially protected from the immune system. A study involving 9 patients with Leber congenital amaurosis showed improved vision in 6 out of the 9 patients in the trial. Cancer: There was a clinical trial in leukemia patients where their immune cells were removed and modified to attack cancerous cells. This study resulted in 26 out of 59 in remission.

Golden Rice

•Discussed in 1984 by scientists at the International Rice Research Institute •Lack of vitamin A during early development (after weaning) has life long consequences -250 Million preschool children affected by Vitamin A deficiency (WHO 2012) -Including: impaired vision, blindness, reduced immune response, impaired skeletal growth -Contributes to childhood mortality •No rice variety contains provitamin A (β-carotene) •Could scientists add provitamin A (which people convert to vitamin A) to rice? This is based on a true story. In 1984, a group of scientists at the International Rice Research Institute brainstormed about how to improve rice. They agreed that it does not provide enough vitamin A for children in early development. The lack of vitamin A has life-long consequences. Many children have a vitamin A deficiency, which can lead to loss of vision, impaired growth and reduced immune response. In 2012 the World Health Organization reported that about 250 million preschool children are affected by vitamin A deficiency, and that providing those children with vitamin A could prevent about a third of all under-five deaths. This amounts to up to 2.7 million children who could be saved from dying unnecessarily. Vitamin A deficiency compromises the immune systems of approximately 40 percent of children under five in the developing world, greatly increasing the severity of common childhood infections, often leading to deadly outcomes. Vitamin A deficiency is most severe in Southeast Asia and Africa. None of the known rice varieties contains provitamin A, also known as beta-carotene. People have the ability to convert provitamin A to vitamin A. Could scientists add provitamin A to rice to help prevent the consequences of vitamin A deficiency?

Improving the Nutritional Value of Golden Rice

•Initial Golden rice: 1.6 µg/g total carotenoids •US Recommended daily allowance of vitamin A is 300 µg •12 µg β-carotene makes 1 µg vitamin A •Need the rice to make more carotenoids •Would have to eat kilogram quantities. •Look for PSY genes in other plants that may be similar. How can we tell if the genes are similar? Look at the sequence! Unfortunately, the initial Golden rice produced only 1.6 µg/g total carotenoids. The US Recommended daily allowance of vitamin A is 300 µg, and 12 µg of β-carotene makes 1 µg of vitamin A. At this production level, an individual would need to eat kilograms of rice to get the recommended daily amount of vitamin A. Hmmm. Looks like we need the rice to make more carotenoids. One way to do this is to look at other sources of the genes and see if they will work better. Let's look for PSY genes in other plants that may be similar. How can we tell if the genes are similar? We can look at the DNA sequence!
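The "kilograms of rice" claim follows directly from the three numbers on the card. Here is the arithmetic as a short sketch; it assumes, as a simplification, that all of the rice's carotenoids count as β-carotene.

```python
def rice_needed_grams(rda_vitamin_a_ug, ug_carotene_per_ug_vitamin_a, carotenoid_ug_per_g):
    """Grams of rice needed per day to meet the vitamin A allowance."""
    carotene_needed_ug = rda_vitamin_a_ug * ug_carotene_per_ug_vitamin_a
    return carotene_needed_ug / carotenoid_ug_per_g

grams = rice_needed_grams(
    rda_vitamin_a_ug=300,             # daily allowance from the card
    ug_carotene_per_ug_vitamin_a=12,  # 12 ug beta-carotene -> 1 ug vitamin A
    carotenoid_ug_per_g=1.6,          # initial Golden Rice content
)
print(f"{grams:.0f} g of rice per day (about {grams / 1000:.2f} kg)")
```

300 µg × 12 = 3,600 µg of β-carotene, and 3,600 ÷ 1.6 = 2,250 g, so a person would need to eat more than 2 kg of the initial Golden Rice every day.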

Reverse Genetics: Transgenic Mice - Gene Function Studies

•Inject gene of interest into fertilized egg •Implant in female •Test progeny for presence of gene •Mate to obtain mice homozygous for gene •Study gene function Transgenic organisms are frequently used in reverse genetics studies. This diagram shows the production of a transgenic mouse. You do not need to know the steps, but you do need to know that it is possible to produce a transgenic mouse. The function of the inserted gene can be studied by altering the gene itself or by altering the controlling regions for the gene.

Unrooted Trees do Not Have a Common Ancestor

•Unrooted Tree •Only the distance between OTUs is known, but not the order of divergence throughout evolutionary time. Some phylogenetic trees are unrooted such as the tree shown here. We do not have a common ancestor for all OTUs in the tree. The distance between OTUs is known, but the order of divergence throughout evolutionary time is NOT known.

Gene Therapy

•Introducing functional copies of a gene into individuals who have only defective copies of that gene •Transgene: introduced copy of the gene -In successful gene therapy, the transgene will make the missing gene product and restore normal phenotype. •Somatic Cell (non-heritable): treats, but does not cure the disease. All current gene therapies are somatic cell therapies. •Germ-line (heritable): Major moral and ethical considerations •Introducing normal genes into cells often requires the use of viruses •Retroviral vectors integrate into the DNA of the host cell •Transgene is transmitted to all progeny cells in the cell lineage •Transgene may integrate so that it disrupts function of another gene •Other non-viral mechanisms include direct injection, lipid capsules, nanoparticles etc. •In addition to delivery of DNA or RNA, these can deliver genome editing machinery like CRISPR components. Let's look at some approaches to gene therapy. How do we get the gene into someone? Introducing normal genes into cells often requires the use of viruses. Retroviral vectors can be used to put the transgene into the DNA of the host cell. After the transgene gets into a cell, it is transmitted to all progeny cells in the cell lineage by mitosis. There are potential problems: The transgene may integrate so that it disrupts function of another gene. There are other mechanisms for introducing a gene into a cell such as direct injection, lipid capsules, nanoparticles etc. In addition to delivery of DNA or RNA, these can deliver genome editing machinery like CRISPR components. It is an exciting field, but it is very early on in the discovery and application process.

Gene Therapy costs

•Kymriah (leukemia treatment using engineered immune cells) $475,000 •Luxturna (treatment for inherited blindness), expected to be approved by the FDA this spring Unfortunately, gene therapy is not cheap! For example, Kymriah is a leukemia treatment that uses engineered immune cells and costs about $475,000. Another gene therapy treatment, Luxturna, a treatment for inherited blindness, is expected to be approved by the FDA soon. Most of the gene therapy approaches are at the clinical trials stage and are not available to the public at this time.

Making Recombinant DNA and Identifying Cells Containing Recombinant DNA

•Ligation Experiment conducted to join foreign DNA to vector. The foreign DNA and the vector are both cut with the same restriction enzymes. The DNA's are mixed and DNA ligase is added. Some recombinant molecules should form. • •Insertional Inactivation: The inserted DNA inactivates a gene in the vector by inserting into that gene. This allows cells which contain the recombinant DNA molecule to be identified. • •Transformation Experiment conducted to allow cells to take up products from ligation experiment. • •Identification of Different Cell Types: cells with no uptake, cells that took up the original vector and cells that took up the recombinant plasmid. Let's go over a few terms that are frequently used in making recombinant DNA and identifying cells that contain the recombinant molecule: A Ligation Experiment is conducted to join foreign DNA to vector. The foreign DNA and the vector are both cut with the same restriction enzymes. The DNA's are mixed and DNA ligase is added. Some recombinant molecules should form. Insertional Inactivation refers to the inserted DNA's inactivation of a gene in the vector when it inserts into that gene. This allows cells which contain the recombinant DNA molecule to be identified. After ligation occurs, a transformation Experiment is conducted to allow cells to take up products from ligation experiment. It is important that the ligation and transformation experiments are constructed so that the Identification of different cell types can occur. These types are: cells with no uptake, cells that took up the original vector and cells that took up the recombinant plasmid

Mapping Using Genome Wide Association Studies (GWAS)

•Looks for associations between a trait and various markers scattered across the genome • •Studies populations of individuals, not pedigrees of a particular family This slide illustrates the idea of mapping using GWAS. Markers are identified in patients with a disorder and are distinguished from markers in patients without the disorder. The idea is to associate a trait with specific molecular markers. Even though molecular markers are changes in the DNA, they are not necessarily associated with a gene or phenotype. GWAS looks across a population, not within a family, to try to find associated markers. After markers are found that correlate with the disorder, additional work is needed to find the appropriate genes and validate these genes.

Cancer

•Mass of tissue/cells with unlimited potential to divide/grow and serving no useful function in the body •Error occurs in cell cycle in 1 cell and increases number of affected cells through mitosis •More than 1.6 million new cases per year in the US •Cancer was the leading cause of death in 21 states in 2016 •The cancer death rate has dropped by 23% since 1991, translating to more than 1.7 million deaths averted through 2012 •Among children and adolescents (aged birth-19 years), brain cancer has surpassed leukemia as the leading cause of cancer death because of the dramatic therapeutic advances against leukemia. •© 2016 American Cancer Society. •Cancer is genetic, but cancer is rarely heritable Cancer can be defined as a mass of tissue or cells that has an unlimited potential to divide and serves no useful purpose in the body. 1 cell goes bad and divides by mitosis, making 2 bad cells. The number of bad cells increases through more mitotic divisions. There are more than 1.6 million new cases per year in the US. Cancer was the leading cause of death in 21 states in 2016. The cancer death rate has dropped by 23% since 1991, translating to more than 1.7 million deaths averted through 2012. Among children and adolescents (aged birth-19 years), brain cancer has surpassed leukemia as the leading cause of cancer death because of the dramatic therapeutic advances against leukemia. We say that cancer is genetic, but cancer is rarely heritable - I will come back to this statement later on. https://www.ncbi.nlm.nih.gov/pubmed/26742998 Cancer statistics, 2016. Siegel RL, Miller KD, Jemal A.

Factors Affecting Diversity Within a Population

•Migration and Mutation introduce variability within populations by introducing new alleles. •Genetic Drift decreases diversity within populations as alleles are fixed and lost. •Inbreeding increases homozygous types with decrease of heterozygous types. (No change in allele frequency) •Natural Selection can increase or decrease variability within population depending on type of selection. •Recombination increases variability. Let's review some of the factors that affect diversity within a population. We studied these when we studied population genetics and they are important here as well. Migration and Mutation introduce variability within populations by introducing new alleles to the population. Genetic Drift decreases diversity within populations as alleles are fixed and lost. Inbreeding increases homozygous types with a decrease of heterozygous types but does not result in a change in allele frequency. Natural Selection can increase or decrease variability within a population depending on the type of selection. Recombination increases variability within a population.
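The inbreeding point (more homozygotes, fewer heterozygotes, same allele frequency) can be checked numerically. The sketch below uses the standard inbreeding coefficient F, which is not named on this card and is introduced here as an assumption; the allele frequency is made up.

```python
def genotype_frequencies(p, f=0.0):
    """Genotype frequencies for allele frequency p and inbreeding coefficient f.
    With f = 0 these reduce to Hardy-Weinberg proportions."""
    q = 1 - p
    return {
        "AA": p * p + f * p * q,
        "Aa": 2 * p * q * (1 - f),
        "aa": q * q + f * p * q,
    }

def allele_frequency_A(genotypes):
    """Frequency of allele A recovered from the genotype frequencies."""
    return genotypes["AA"] + genotypes["Aa"] / 2

random_mating = genotype_frequencies(0.6)
inbred = genotype_frequencies(0.6, f=0.5)
print(random_mating, allele_frequency_A(random_mating))
print(inbred, allele_frequency_A(inbred))
```

With F = 0.5 the heterozygote frequency is halved (0.48 to 0.24) and both homozygous classes grow, but the frequency of allele A is still 0.6.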

CODIS •3 tiered system: separate federal, state and local databases

•Where do the DNA profiles come from? -More than 9.4 million from offenders -About 360,000 from crime scenes -DNA from missing persons, relatives of missing persons and unidentified human remains The CODIS system is a three-tiered system with separate federal, state and local databases. The DNA profiles come from individuals who have been convicted of crimes, from crime scenes, from missing persons, relatives of missing persons and unidentified human remains.

Cancer Genetic Testing

•More than 50 hereditary cancer syndromes have been described. •Below are some of the inherited mutations for which genetic testing is available •BRCA1, BRCA2 •Female breast, ovarian, and other cancers, including prostate, pancreatic, and male breast cancer •P53 • Breast cancer, soft tissue sarcoma, osteosarcoma (bone cancer), leukemia, brain tumors, adrenocortical carcinoma (cancer of the adrenal glands), and other cancers •PTEN •Breast, thyroid, endometrial (uterine lining), and other cancers •MSH2, MLH1, MSH6, PMS2 EPCAM •Colorectal, endometrial, ovarian, renal pelvis, pancreatic, small intestine, liver and biliary tract, stomach, brain, and breast cancers •APC •Colorectal cancer, multiple non-malignant colon polyps, and both non-cancerous (benign) and cancerous tumors in the small intestine, brain, stomach, bone, skin, and other tissues •RB •Eye cancer (cancer of the retina), pinealoma (cancer of the pineal gland), osteosarcoma, melanoma, and soft tissue sarcoma •MEN1 •Pancreatic endocrine tumors and (usually benign) parathyroid and pituitary gland tumors •RET •Medullary thyroid cancer and pheochromocytoma (benign adrenal gland tumor) •VHL •Kidney cancer and multiple noncancerous tumors, including pheochromocytoma Genetic testing for hereditary factors implicated in cancer is becoming more and more common. Here are some of the genes for which genetic testing is available. Note that several of the genes we mentioned earlier are on this list: BRCA1, BRCA2, p53, APC and RB

Cancer is a "Multi-Hit" Disease (usually)

•Most cancers are sporadic and influenced by environment (usually) -Siblings are rarely affected by the same cancer -Populations that migrate to new regions tend to get cancer rates typical of that region •Cancers develop over time (usually) -Changes in cancer rates due to new environment (eg. smoking) tend to take decades -Incidence of cancer rises with age •This is consistent with a multi-hit model where cancer arises over time with multiple genetic changes. Usually, cancer is a multi-hit disease. This means that several things go wrong over time and eventually a cancerous condition occurs. We have some evidence for the multi-hit nature of cancer: Most cancers are sporadic and are influenced by the environment. Siblings rarely are affected by the same type of cancer, and populations that migrate to new regions tend to get cancer rates typical of that new region. Cancers usually develop over time. Changes in cancer rates due to a new environmental factor such as smoking tend to take decades. The incidence of cancer increases with age. All of these are consistent with a multi-hit model where cancer arises over time with multiple genetic changes. We do know of some gene mutations that significantly increase a person's risk of cancer. If one or more of the necessary mutations is inherited, then fewer additional mutations are needed, which can account for an increase in risk. There are some genes that have a big effect on the likelihood of tumor development.

Gene Duplication Human Globin Genes

•Multi-gene family •Evolved by successive gene duplications Gene duplication allows genes to evolve different functions. This is just one example where successive gene duplications occurred, resulting in the genes for the different globin chains of hemoglobin and the myoglobin gene.

Limitations of PCR

•Must know something about sequence surrounding gene of interest in order to use PCR to clone a gene •PCR reactions are easily contaminated from other DNA in the lab •Taq polymerase does not proofread and correct errors (error rate about 1 in 20,000 bp) •Fragments amplified by PCR are relatively small (2000 bp standard, modified reactions up to 50,000 bp) There are some limitations to PCR: You must know something about the sequence surrounding your gene of interest in order to use PCR to clone a gene, because you have to design primers that will allow your gene of interest to be amplified. PCR reactions are easily contaminated by other DNA in the lab, so it is important that proper precautions are taken and that experiments are replicated with the same results. Taq polymerase does not proofread and correct errors very well. The error rate is about 1 in 20,000 bp, which is higher than the rate for some other polymerases. If errors are made in the replication process, the amplified sequence will not be correct. Fragments amplified by PCR are relatively small (2000 bp standard, modified reactions up to 50,000 bp). Sometimes we would like to amplify longer segments.

How can we get the daffodil PSY gene to be expressed in rice?

•Need lots of DNA •Need a way to get it into the rice genome •Need a promoter Now that we have the DNA, what else will we need to do? We need to make a lot of the PSY DNA and get the gene into the rice genome. We also need the gene to be expressed in rice, so we need a promoter in the proper position for expression.

What Could Genomics Mean for You?

•Personalized medicine that takes into account the molecular events underlying a disease and your "genotype" to determine the best treatment option for you. •We know that not everyone responds the same to a particular medicine or therapy. •Tailoring medicine to individuals should provide better medical care and prevent harmful side effects •Eg. Tamoxifen •Eg. Some anesthetics So what could genomics mean for you? The field of personalized medicine or precision medicine is growing rapidly. Personalized medicine takes into account the molecular events underlying a disease and your "genotype" to determine the best treatment option for you. We know that not everyone responds the same way to a particular medicine or therapy. Tailoring medicine to individuals should provide better medical care and prevent harmful side effects. One example is tamoxifen. It is a medication that is used to prevent recurrence of breast cancer. Its metabolized products mimic estrogen: they bind the estrogen receptor and block the estrogen response that the cancer often needs. Only some people respond. Those who metabolize the drug faster respond best to it. Fortunately, genetic testing can be done to determine if this is expected to be an effective medication in specific individuals. Another example has to do with anesthetics. Not everyone responds the same way to anesthesia, and we can respond differently to different anesthetics. Sometimes genetic testing is done to determine an appropriate anesthetic prior to surgery, particularly if the patient has had difficulty with anesthetic responses in previous surgeries.

Mapping QTL

•QTL are Quantitative Trait Loci •QTL are identified by linkage analysis between the trait and molecular markers. •If the inheritance of a genetic marker is associated with inheritance of a quantitative trait, then the marker must be linked to a QTL involved in that trait. •SNPs (single nucleotide polymorphisms) are common markers used to identify QTL. We mentioned QTLs early in the semester. QTL stands for Quantitative Trait Loci. The increased ability to identify molecular markers has allowed us to identify the number and location of many genes that are important in quantitative traits. QTLs are identified by linkage analysis between the trait and molecular markers. If the inheritance of a genetic marker is associated with inheritance of a quantitative trait, then the marker must be linked to a QTL involved in that trait. SNPs (single nucleotide polymorphisms) are common markers used to identify QTL.

Tests/screenings for Cancer

•Regular breast exams •Mammograms •Colonoscopies •PSA Test (prostate cancer) •Recommendations for these tests change •Ask questions so you can make informed decisions! Here are a few tests and screenings that are used for cancer: regular breast exams and mammograms for breast cancer, colonoscopies for colon cancer, and the PSA test for prostate cancer. Please note that recommendations for these tests change. You should always talk to your physician about your personal circumstances and history. Ask questions so that you can make informed decisions!

UPGMA Unweighted Pair Group Method with Arithmetic Mean

•Relatively simple method of constructing a phylogenetic tree based on computing differences in DNA sequences The particular approach we will use is the Unweighted Pair Group Method with Arithmetic Mean or UPGMA approach. It is a relatively simple method of constructing a tree based on difference in DNA sequences.
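The UPGMA procedure can be sketched in a few lines of Python. This is a minimal illustration, not the course's worked method: the function name `upgma`, the input layout (a dictionary keyed by pairs of OTU labels), and the example distances are all assumptions made for the sketch. At each step the two closest clusters are joined, the new node is placed at half the distance between them, and distances to the new cluster are averaged, weighted by how many OTUs each joined cluster contains.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree with UPGMA.

    names: list of OTU labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns (tree, root_height), where tree is a nested tuple of labels.
    """
    # Each cluster maps to the number of OTUs it contains
    sizes = {name: 1 for name in names}
    d = dict(dist)  # work on a copy so the caller's table is unchanged
    height = 0.0
    while len(sizes) > 1:
        # Join the pair of clusters with the smallest distance
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a = sizes.pop(a)
        size_b = sizes.pop(b)
        # The new node sits halfway between the two joined clusters
        height = d[frozenset((a, b))] / 2
        merged = (a, b)
        # Distance from the new cluster to every remaining cluster is the
        # average of the old distances, weighted by cluster size
        for other in sizes:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    (tree,) = sizes  # only the root cluster is left
    return tree, height


# Hypothetical pairwise differences between four OTUs
example = {
    frozenset(("A", "B")): 2, frozenset(("A", "C")): 6,
    frozenset(("B", "C")): 6, frozenset(("A", "D")): 10,
    frozenset(("B", "D")): 10, frozenset(("C", "D")): 10,
}
print(upgma(["A", "B", "C", "D"], example))
```

With these made-up distances, A and B are joined first, then C, and D joins last.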

How does a plant make β-carotene?

•Rice is missing enzymes that convert GGPP to phytoene and phytoene to lycopene Some plants make β-carotene. What biochemical path do they use? You can see the pathway here. GGPP is present in rice. But rice is missing the enzymes at the two steps with the red X's. Rice cannot produce phytoene or lycopene. Rice is capable of converting lycopene to β-carotene. •In other plants -PSY is used to convert GGPP to phytoene -PDS, ZDS, and CRTISO are used to convert phytoene to lycopene Some plants use the gene PSY to convert GGPP to phytoene. PSY stands for phytoene synthase. Some plants use the genes PDS, ZDS, and CRTISO to convert phytoene to lycopene. •In bacteria -CRTI is used to convert phytoene to lycopene Bacteria use CRTI to convert phytoene to lycopene. There is not a bacterial enzyme that converts GGPP to phytoene. We want to find a source of the PSY enzyme from a plant that can produce it and then add the gene for this enzyme to the rice. Since bacterial DNA is easier to manipulate than plant DNA, we want to take the CRTI gene from bacteria and add it to the rice to produce the enzyme for the second missing step.

Rooted Trees Have a Common Ancestor for All Terminal Nodes

•Rooted Tree •The distance between OTUs is known and the order of divergence is inferred by comparing to an OTU that is considered an outgroup. •Outgroup is an OTU that is known to have diverged earlier than all of the other OTUs. The outgroup roots the tree such that all OTUs share a common ancestor. Some trees are rooted trees. These have a common ancestor for all terminal nodes. There are 4 nodes shown in this tree. The tree is read from the bottom to top with the bottom being back further in time and the top being in the present time. The common ancestor is the point that is in common with all of the OTUs in the tree. Here our OTUs are labeled A, B, C and D. The distance between OTUs is known and the order of divergence is inferred by comparing to an OTU that is considered an outgroup. An Outgroup is an OTU that is known to have diverged earlier than all of the other OTUs. The outgroup roots the tree such that all OTUs share a common ancestor. The outgroup here is the OTU labeled A.

Normal Distribution Often

•Sample Mean = X̄ = ΣXi/n •Sample Variance = s^2 = Σ(Xi - X̄)^2/(n-1) •Standard deviation = s = square root of s^2 We often assume that the phenotypes of a quantitative trait are distributed according to the normal distribution. The normal distribution is a bell-shaped curve. The mean (indicated by x bar here) is the value along the X axis where the curve is at its highest point. We calculate the mean as the sum of all of the observations divided by the number of observations. The variance is the average squared deviation from the mean, and you can see a calculation formula for the variance. You do not need to memorize this formula. The width of the curve indicates the variance: a wider curve means more variation. Let's say we were measuring height. We could record our height values in inches. Look at the formula for the sample variance. The units for this would be inches squared, which is hard to relate to height. The standard deviation is the square root of the variance. The square root of inches squared is inches, so the units for the standard deviation are the same as those for the original measure of the trait. We can move along the x axis so that the value is X̄ plus the standard deviation (indicated by 1s on the graph), or X̄ plus 2 times the standard deviation (indicated by the 2s position). With a normal distribution, approximately 2/3 of the population (about 68%) lies within one standard deviation of the mean. About 95% of the population lies within 2 standard deviations of the mean. More than 99% of the values in the population (about 99.7%) will be within 3 standard deviations of the mean.
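The three formulas can be checked with a short Python sketch. The function names and the five height values are made up for illustration; the point is that the variance comes out in inches squared while the standard deviation is back in inches.

```python
import math


def sample_mean(values):
    # Sum of all observations divided by the number of observations
    return sum(values) / len(values)


def sample_variance(values):
    # Sum of squared deviations from the mean, divided by n - 1
    mean = sample_mean(values)
    return sum((x - mean) ** 2 for x in values) / (len(values) - 1)


def sample_sd(values):
    # Square root of the variance, in the same units as the data
    return math.sqrt(sample_variance(values))


heights_in = [64, 66, 68, 70, 72]   # hypothetical heights in inches
print(sample_mean(heights_in))      # mean, in inches
print(sample_variance(heights_in))  # variance, in inches squared
print(sample_sd(heights_in))        # standard deviation, in inches
```

For these values the mean is 68 inches and the variance is 10 inches squared, so the standard deviation is the square root of 10, a little over 3 inches.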

Role of Telomere Length in Cancer

•Typically telomeres shorten as a cell ages and this ultimately contributes to the death of the cell • •Normally, telomerase works in germline cells, but not in somatic cells: allows somatic cells to die, be replaced, etc. • •Tumor cells often have telomerase expression, which is thought to contribute to the "immortality" of cancer cells. • •Mutations in genes that regulate telomerase activity may be important in cancer, but role of telomerase is not clear at this time • •Possible Cancer Therapy: Block telomerase activity? Tumor cells typically have longer than expected telomeres. Typically telomeres shorten as a cell ages and this ultimately contributes to the death of the cell. Normally, telomerase works in germline cells, but not in somatic cells: allows somatic cells to die, be replaced, etc. Tumor cells often have telomerase expression, which is thought to contribute to the "immortality" of cancer cells. Mutations in genes that regulate telomerase activity may be important in cancer, but role of telomerase is not clear at this time. Some researchers are looking at blocking telomerase activity as a possible cancer therapy. Can you see any problems with blocking telomerase activity? It would probably be important to make sure there was a cell/tissue specificity to the blocking or normal cells could be irreparably damaged.

Limits to Selection Response

•Selection response may decline after selecting for a particular characteristic in a population for a long period of time. Possible Reasons: •Genetic variation is being lost as more favorable homozygous genotypes become fixed in the population. •The extreme types may not be healthy. •Selection for small body size in mice leveled off since the smallest mice were sterile and did not pass on their genes. •Two desirable traits may be negatively correlated. •Selection for rapid growth and increased body size in turkeys is correlated with decreased fertility, decreased egg production, and decreased egg hatchability. •Most commercial turkeys are artificially inseminated now. There have been several long-term selection experiments. It has been noticed that selection response can decline after selecting for a particular trait in a population for a long period of time. There are several reasons for this. First, there may not be as much genetic variation left in the population as homozygous genotypes become fixed. Another explanation is that the extreme phenotypes may not be healthy. This is the case for selection for small body size in mice. It was noticed that the response leveled off because the smallest mice were sterile, so they could not pass on their genes. It may also be that two desirable traits are negatively correlated, so it may be difficult to select for both. In turkeys, rapid growth and increased body size (both desirable for turkey producers) are correlated with decreased fertility, decreased egg production, and decreased egg hatchability. Most commercial turkeys are artificially inseminated to help get around some of the problem.

Speciation

•Species often defined as a group of individuals that actually or potentially interbreed in nature. •Species become distinct when they no longer exchange genes: Reproductive Isolation •Reproductive isolation can occur because -They don't choose to mate with each other or cannot mate with each other (Prezygotic) -Or their progeny are sterile or inviable (Postzygotic) Part of the definition of evolution has to do with speciation. There are various definitions of a species. We will use the definition that a species is a group of individuals that actually or potentially interbreed in nature. Species become distinct when they no longer exchange genes. This is called reproductive isolation. Reproductive isolation can occur because organisms don't choose to mate with each other or cannot mate with each other. These are prezygotic barriers to reproduction, since they act before a zygote is formed. Or the progeny of a mating are sterile or inviable. These are postzygotic barriers to reproduction.

Genome-Wide Association Studies (GWAS)

•Studies that look for non-random association between the presence of a trait (phenotype) and alleles at many different loci scattered across the genome. •Especially useful for identifying QTLs - those loci that are important in quantitative traits •QTL: Quantitative Trait Loci •Find associations between molecular markers and a trait, then look for candidate genes located near the marker that can affect the trait. As genome sequencing became less expensive, additional variation was characterized by looking at multiple individuals or populations of a species. Genome-wide association studies (GWAS) look for non-random association between the presence of a trait (phenotype) and alleles at many different loci scattered across the genome. This is especially useful for identifying QTLs - those loci that are important in quantitative traits. Remember, QTL stands for Quantitative Trait Loci. We can find associations between molecular markers and a phenotype. Then we look for candidate genes near the marker that can affect that trait. Then we have to validate the biological information, showing that there is a true causative relationship and not just a correlation.

Epistatic Interactions VI

•Takes into account that epistatic interactions may occur between some genes so it is part of genetic variance • •We will ignore this in our calculations, but you should realize that it exists and can be important. • VP = VG + VE + VGE VG = VA + VD + VI Remember that we talked about epistasis early in the semester. This is when the gene products from one locus mask or modify the expression of the gene products from a second locus. Epistasis can occur with quantitative traits since the gene products can interact with each other. The variance due to epistatic interactions is abbreviated VI (I for interaction). This is a part of the genetic variance. You should realize that it is present, but we will ignore it in our calculations for simplicity. Epistatic interaction variance just takes into account that there can be epistatic interactions between the genes that are responsible for quantitative traits. Since epistatic interactions would be a part of genetic effects, this would make VG = VA + VD + VI
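The two variance equations can be written out as a small Python sketch. The function names and the variance values are hypothetical, chosen only to show how the components add up; leaving `v_i` at zero matches the simplification of ignoring epistatic variance in calculations.

```python
def genetic_variance(v_a, v_d, v_i=0.0):
    # VG = VA + VD + VI (additive + dominance + epistatic interaction)
    return v_a + v_d + v_i


def phenotypic_variance(v_g, v_e, v_ge=0.0):
    # VP = VG + VE + VGE (genetic + environmental + their interaction)
    return v_g + v_e + v_ge


# Hypothetical variance components for a quantitative trait
v_g_full = genetic_variance(v_a=4.0, v_d=2.0, v_i=1.0)  # includes VI
v_g_simple = genetic_variance(v_a=4.0, v_d=2.0)         # VI ignored
print(v_g_full, v_g_simple)
print(phenotypic_variance(v_g_full, v_e=3.0))
```

Comparing the two `genetic_variance` calls shows how much of the genetic variance would be overlooked when the epistatic term is dropped.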

Transgenic Plants and Animals

•Transgenic plant or animal is one in which a foreign or exogenous gene has been introduced into its genome, thereby altering its genetic constitution Transgenic plants and animals are those in which a foreign gene has been introduced thereby altering its genetic constitution. Genetically modified organisms are common in our crop plants such as cotton, soybeans and corn.

Transgenic Plants and Animals

•Transgenic plant or animal is one in which a foreign or exogenous gene has been introduced into its genome, thereby altering its genetic constitution. •Transgenic organisms can also contain a transgene which is derived from its own genome, but is altered in some way. Transgenic plants and animals are those in which a foreign or exogenous gene has been added, with the inserted gene altering the genetic constitution of that organism. Organisms are also said to be transgenic when the gene is derived from their own genome, but is altered in some way. We need to make a transgenic rice plant!

Cancer Results From Uncontrolled Cell Division

•Tumor - distinct mass of abnormal cells that do not have normal controls on cell division •Benign - abnormal cells remain localized and do not invade surrounding tissue •Malignant - cancer cells invade surrounding tissue •Metastatic - cancer cells spread and establish secondary tumors in other sites in the body A tumor is a distinct mass of abnormal cells that do not have normal controls on cell division. Tumors are classified as benign if the abnormal cells remain localized and do not invade surrounding tissue. Even benign tumors can have medical consequences because they may "push" on neighboring cells or organs and prevent their normal function. Malignant tumors are made up of cancer cells that invade surrounding tissue. Metastatic tumors occur when cancer cells spread and establish secondary tumors in other sites in the body.

Ethical Issues with GN testing

•Use of Partial Match data: Search databases for partial matches to help identify suspects -Familial DNA Testing -Conflict between solving crimes and protecting privacy A Not so Perfect Match There are ethical issues with using genetic testing in criminal cases. Sometimes a complete match is not obtained, but a very close match is observed in the database. This is called a partial match. Sometimes these matches are so close that it is suspected a family member of the partial match could be the perfect match. One ethical question is should family members be asked to submit their DNA for testing in order to find a perfect match - a process called familial DNA testing. But is this like an illegal search? The video A Not So Perfect Match describes this process and discusses some of the ethical concerns. You may need to rewind the video - it seems to start part way through! •There have been ~350 people who were falsely convicted released from prison and ~149 alternative perpetrators identified The Innocence Project In addition to helping convict perpetrators of crimes, DNA evidence has been used to exonerate people who were falsely imprisoned. Take a look at the Innocence Project web page for more information!

Same mean and different variances

•Variance is related to the width of the curve •i.e. the range of values The variance has to do with the width of the curve. This graph shows three populations each with the same mean and each with a normal distribution. The narrowest curve, indicated in red, has the smallest variance. The line in blue has the broadest range of values for the trait so it has the greatest variance.

Cloning Vectors

•Vector is a carrier DNA molecule that is capable of independent replication into which a DNA fragment can be cloned. Its purpose is to carry foreign DNA into the cell. Ideal cloning vector has 1.Origin of replication that allows it to replicate in the host cell 2.Selectable/Insertional markers that allow cells containing the vector and the recombinant molecule to be identified 3.A single cleavage site for each restriction enzyme used in producing the recombinant DNA molecule Vectors are molecules of DNA that are capable of independent replication into which a DNA fragment can be cloned. Its purpose is to get the foreign DNA into the cell and allow it to replicate in the cell. Some vectors also allow the gene product to be produced in the cell. Plasmids, lambda phage, and retroviruses can all be used as vectors. When we say cloning vector, we are talking about getting the foreign DNA into the cell and allowing it to replicate. When we say expression vector, we are talking about a situation in which the inserted foreign DNA can also be expressed (transcribed) in the cell. An ideal cloning vector has an origin of replication that allows it to replicate in the host cell, selectable/insertional markers that allow cells containing the vector and the recombinant molecule to be identified, and a single cleavage site for each restriction enzyme used in producing the recombinant DNA molecule. Expression vectors will also have the appropriate factors so that transcription and translation can occur.

Why does Variation Exist?

•We see polymorphisms (multiple allelic forms) maintained in populations. • •Why are those polymorphisms maintained from generation to generation? •Neutralist Theory: Many mutations are neutral. This causes polymorphisms to occur in population. Polymorphisms are maintained in the population since neither form has an advantage and the mutant types are not affected by selection. -eg. Two proteins with slightly different amino acid sequences both have proper level of function - •Selectionist Theory: many polymorphisms are maintained in the population due to selection. -eg. Two forms of a protein may allow for optimum performance over a range of cellular conditions - •Probably both of these are true sometimes - which is true more of the time? Nearly Neutral Model Is most mutation good or neutral? The neutralist theory maintains that many mutations are neutral. This causes polymorphisms to occur in population. Polymorphisms are maintained in the population since neither form has an advantage and the mutant types are not affected by selection. One example would be a situation where there are two proteins with slightly different amino acid sequences that both have the proper level of function. The Selectionist Theory maintains that many polymorphisms are maintained in the population due to selection. One example of this would be two forms of a protein may allow for optimum performance over a range of cellular conditions. Another example is the case of the sickle cell allele being maintained in the population since individuals with the heterozygous genotype have protection against malaria, but do not get sickle cell anemia. Probably both of these are true sometimes - which is true more of the time? The current thought is that although we know of some where a heterozygous condition is advantageous, most mutations are neutral. This has led to the current thought - the Nearly Neutral Model.

Phylogenetic Trees are Used to Show Degrees of Similarity between OTUs

•We will be looking at differences in Operational Taxonomic Units (OTU's) based on differences in their DNA sequences. •OTU can be a species or a strain of a virus or even different alleles within a species Researchers often use phylogenetic trees to show the degree of similarity between groups of organisms. These groups can be different species, but other groups can be compared as well (different races, breeds of animals, proteins, etc). Phylogenetic trees don't always compare two species. We are going to say that we will look at differences in Operational Taxonomic Units (OTU's). OTU can be a species or a strain of a virus or even different alleles within a species. We will be constructing our trees based on DNA sequences between these OTUs.

Assume 3 genes are responsible and a dominant allele at each locus adds 5 units of risk, but a recessive allele at a locus only adds 2 units of risk and individuals with 25 or more units of risk develop the disorder.

•What is the largest number of risk units possible in the children? •What is the genotype for this individual? Will this individual have the disorder? Let's determine the genotype with the most risk factors. Since each parent can give an "A" allele to the child, the child can be genotype AA, which is the genotype with the most risk factors possible from this cross at the A locus. Since each parent can give a B allele to the child, the child can be BB. For the C locus, only 1 parent has a dominant allele, so the genotype with the most risk factors in the progeny at the C locus will be Cc. Putting this together, the genotype with the most risk factors in the progeny of this cross is AABBCc. We can add up the risk factors: five dominant alleles at 5 units each plus one recessive allele at 2 units gives 27 units of risk for this child. Since the threshold is 25, this child will have the disorder. This is a situation where 2 parents who do not have the disorder have a child who accumulates enough risk factors to have the disorder.
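The risk-unit arithmetic can be sketched in Python. The function and constant names are made up for this illustration, and the sketch assumes genotypes are written with uppercase letters for dominant alleles and lowercase letters for recessive alleles. The second genotype in the example, AaBbCc, is a hypothetical comparison and is not taken from the cross.

```python
DOMINANT_UNITS = 5    # risk units added by each dominant allele
RECESSIVE_UNITS = 2   # risk units added by each recessive allele
THRESHOLD = 25        # individuals at or above this develop the disorder


def risk_units(genotype):
    # Uppercase letters are dominant alleles, lowercase are recessive
    return sum(
        DOMINANT_UNITS if allele.isupper() else RECESSIVE_UNITS
        for allele in genotype
    )


def has_disorder(genotype):
    return risk_units(genotype) >= THRESHOLD


for genotype in ("AABBCc", "AaBbCc"):
    print(genotype, risk_units(genotype), has_disorder(genotype))
```

AABBCc totals 27 units and crosses the threshold, while the hypothetical AaBbCc totals 21 units and does not.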

Do all parts of the human genome accumulate mutations at the same rate?

•What might cause one region to accumulate more or fewer mutations than another region? DNA Sequence Analysis -Different Parts of the Genome Evolve at Different Rates

Example: Warfarin

•Widely prescribed anti-coagulant (Coumadin) •Greater than 10-fold inter-individual variability in the dose required to attain a therapeutic response •Required dose also varies depending on individual's diet and is not the same for each person •2 genes influence effective dose: •CYP2C9 (warfarin metabolic enzyme) •VKORC1 (Vitamin K epoxide reductase complex 1) Have you ever heard of warfarin? It is a widely prescribed anticoagulant, sold as Coumadin. The dose of Coumadin has historically been difficult to manage since there is a greater than 10-fold inter-individual variability in the dose required to attain a therapeutic response. The required dose also varies depending on the individual's diet and is not the same for each person. It is important that the dose be correct since blood clotting can occur if the dose is too low, and severe bleeding similar to that of a hemophiliac can occur if the dose is too high. While patients' levels are still monitored, two genes that are influential in determining the effective dose have been identified. Patients are genotyped for these genes in order to provide a better and more effective dose of the medication.

