Chi Square Bio340 ASU
Degrees of freedom equation
# categories - # of estimated parameter - 1
AA= 5 Aa=8 aa=3
p = (2×5 + 8)/32 = 18/32 ≈ 0.56 (frequency of A); q = (2×3 + 8)/32 = 14/32 ≈ 0.44 (frequency of a)
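The allele-counting arithmetic above can be sketched in Python. The function name `allele_frequencies` is made up for illustration; it just assumes a diploid locus with two alleles, so every individual contributes two allele copies.

```python
def allele_frequencies(n_AA, n_Aa, n_aa):
    """Return (p, q) from genotype counts at a diploid, two-allele locus."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    # Homozygotes carry two copies of an allele, heterozygotes carry one.
    p = (2 * n_AA + n_Aa) / total_alleles
    q = (2 * n_aa + n_Aa) / total_alleles
    return p, q


p, q = allele_frequencies(5, 8, 3)
print(p, q)  # 0.5625 0.4375
```

The same function gives q = 0.0675 for the earwax counts (880, 105, 15).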
Balance between mutation and selection
Allelic frequencies can reach equilibrium if the introduction of new alleles by continuous mutation is balanced by their removal via natural selection. New deleterious mutations, which could be completely recessive or partly dominant, are constantly arising spontaneously. Natural selection removes them from the population, but there is an equilibrium between their appearance and removal. For a completely recessive deleterious allele: w(AA) = 1, w(Aa) = 1, w(aa) = 1 - s
Alternative hypothesis: H1
An alternative hypothesis (H1) is the rival to the null and defines the rejection region • One-tailed: the region of rejection is in one tail of the distribution • Example: our company's new drug works better than the competitor's • Two-tailed: the region of rejection is in both tails of the distribution • Example: our company's new drug works differently than the competitor's • If we can reject the null hypothesis, we say there is evidence for the alternative hypothesis
Most populations are not at Hardy- Weinberg equilibrium
Assumptions are often not met: • Population has structure • Mating is often non-random • Many populations are quite small • Not all genotypes are equally viable
Assumptions under which genotype frequencies can be estimated from allele frequencies (5)
1. All genotypes have equal survival and reproduction (no selection). 2. Mating is random with respect to genotype. 3. Effectively infinite (i.e. large) population size. 4. Mendelian segregation. 5. No mutation, migration, or population subdivision
Steps in X2 test
1. State the null and alternative hypotheses. 2. Count observed offspring numbers. 3. Calculate expected numbers under the null hypothesis. 4. Calculate the test statistic. 5. Determine the number of degrees of freedom. 6. Compare the X2 statistic with the critical value for the chosen significance level (or find its P value) to determine whether the null hypothesis should be rejected.
The initial frequency of a new mutation in the gene pool is:
1/(2N). Example: N = 10000, so 1/(2 × 10000) = 1/20000 = 5 × 10^-5 = 0.00005
There are two forms of human earwax: wet and dry. W is a dominant allele that produces wet earwax. In a sample of Spaniards, you measure the following genotypes. What is the allele frequency of w? WW = 880, Ww = 105, ww = 15
(2(ww) + Ww) / (2 × total individuals) = (2(15) + 105) / (2 × 1000) = 135/2000 = 0.0675 = 6.75%
The number of distinct phenotype categories for a polygenic trait produced by the segregation of additive alleles of a given number of genes (n) is calculated as:
2n+1
Genetic bottlenecks
A change in allele frequency following a dramatic reduction in the size of a population. The survivors of the bottleneck are likely to have low levels of genetic diversity. Allele frequencies differ from those of the original population, and an increase in the level of inbreeding may be observed.
genetic drift
A change in the allele frequency of a population as a result of chance events rather than natural selection.
natural selection
A process in which individuals that have certain inherited traits tend to survive and reproduce at higher rates than other individuals because of those traits.
Hardy-Weinberg equilibrium
Genotype frequencies: AA = p^2, Aa = 2pq, aa = q^2
Positive assortative mating reduces the frequency of heterozygotes Three kinds of matings occur:
AA × AA: all offspring are AA • aa × aa: all offspring are aa • Aa × Aa: only half of the offspring are Aa; the rest are homozygotes
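A minimal sketch of why these three mating types reduce heterozygosity. It assumes the extreme case in which every genotype mates only with its own genotype (complete positive assortative mating); the function name is hypothetical.

```python
def assortative_generation(f_AA, f_Aa, f_aa):
    """One generation in which each genotype mates only with its own genotype."""
    # AA x AA -> all AA; aa x aa -> all aa;
    # Aa x Aa -> 1/4 AA, 1/2 Aa, 1/4 aa (Mendelian segregation).
    return (f_AA + 0.25 * f_Aa, 0.5 * f_Aa, f_aa + 0.25 * f_Aa)


freqs = (0.25, 0.5, 0.25)  # start at Hardy-Weinberg proportions with p = 0.5
for generation in range(1, 4):
    freqs = assortative_generation(*freqs)
    p = freqs[0] + 0.5 * freqs[1]  # allele frequency of A
    print(generation, freqs, p)
```

The heterozygote frequency halves each generation while the allele frequency p stays at 0.5, which matches the point that non-random mating changes genotype frequencies but not allele frequencies.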
variance
A measure of how far a set of numbers is spread out from its mean
Calculate F of IBD
Add the IBD probabilities for A(1) and A(2): F = (1/2)^n (pop 1) + (1/2)^n (pop 2)
Evolution by Natural Selection
Change in the heritable characteristics of a population due to selection
coefficient of inbreeding (F)
The coefficient of inbreeding of an individual is the probability that its two homologous alleles at a locus are identical by descent • If two individuals are unrelated: Fij = 0 • Their child is also not inbred: F = 0
Inbreeding
Continued breeding of individuals with similar characteristics. Inbreeding is a form of non-random mating • Individuals preferentially mate with relatives, e.g. first cousins, second cousins, uncle-niece
Why doesn't selection eliminate all deleterious alleles?
Deleterious recessive alleles end up at a frequency where gain by mutation balances loss by selection: a BALANCE
Neutrality
Despite the high mutation rates, most mutations are neutral • When calculating the probability of different outcomes under genetic drift, we assume that the A and a alleles do not confer differences in fitness
What determines how much polymorphism a population has?
The balance between evolutionary forces creating diversity and evolutionary forces destroying diversity
Fixation index in the Subpopulation relative to the Total population.
FST = (2pq - H) / (2pq)
Degrees of freedom (k): for simple null hypotheses
k = number of categories - 1 • You lose a degree of freedom because the total number of observations is fixed • Some null hypotheses require expectations to be estimated from the data • Each time you estimate a parameter you lose a degree of freedom
Identity by descent (IBD)
Gametes will share the same allele as a consequence of descending from the same ancestor. This is caused by inbreeding.
heterozygote advantage
Greater reproductive success of heterozygous individuals compared to homozygotes; tends to preserve variation in gene pools.
deltaH
deltaH = H - H' = (1/(2N)) × H
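A short sketch of the loss of heterozygosity to drift, applying H' = (1 - 1/(2N))H once per generation. The function name and the example values (H = 0.5, N = 50) are made up for illustration.

```python
def heterozygosity_after_drift(h0, n, generations):
    """Apply H' = (1 - 1/(2N)) * H for the given number of generations."""
    h = h0
    for _ in range(generations):
        h = (1 - 1 / (2 * n)) * h  # each generation loses H/(2N)
    return h


for t in (0, 1, 10, 100):
    print(t, heterozygosity_after_drift(0.5, 50, t))
```

Smaller N gives a larger per-generation loss, which is why drift matters most in small populations.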
The heterozygosity (H) in the overall population is given by the mean of the two:
H = (2p(1)q(1) + 2p(2)q(2)) / 2
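A sketch that combines the mean heterozygosity above with the fixation index FST = (2pq - H)/(2pq). It assumes two subpopulations of equal size, so the overall allele frequency is the simple average of the two; the function name and example frequencies are hypothetical.

```python
def fst_two_subpops(p1, p2):
    """FST for two equal-sized subpopulations with frequencies p1 and p2 of allele A."""
    # Mean heterozygosity within subpopulations: H = (2*p1*q1 + 2*p2*q2) / 2
    h_sub = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # Expected heterozygosity if the total population mated at random: 2pq
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    # Undefined (division by zero) if the total population is fixed for one allele.
    return (h_total - h_sub) / h_total


print(fst_two_subpops(0.2, 0.8))  # roughly 0.36
print(fst_two_subpops(0.5, 0.5))  # 0.0, no differentiation
```

The gap between 2pq and H is the Wahlund effect: subdivision lowers heterozygosity relative to a single randomly mating population.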
Modeling selection assumptions
HW assumptions • One locus, two alleles • Discrete generations • Variation in genotype fitness, relative fitness (w): represent AA as w(AA) ...etc
Testing Hardy Weinberg
Hardy-Weinberg equilibrium is the standard null hypothesis of population genetics • Rejecting the null suggests that some evolutionary force is acting on the gene pool • As before, we use a chi-squared goodness-of-fit test to determine if the null is rejected
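A hedged sketch of the Hardy-Weinberg goodness-of-fit test, using the earwax genotype counts from these notes (WW = 880, Ww = 105, ww = 15). The function name is made up; 3.841 is the standard 5% critical value for 1 degree of freedom.

```python
def hardy_weinberg_chi_square(n_AA, n_Aa, n_aa):
    """Return (X2, degrees of freedom) for a test of Hardy-Weinberg proportions."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # allele frequency estimated from the data
    q = 1 - p
    observed = (n_AA, n_Aa, n_aa)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    x2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # 3 categories - 1 estimated parameter (p) - 1
    df = 3 - 1 - 1
    return x2, df


CRITICAL_5PCT_DF1 = 3.841
x2, df = hardy_weinberg_chi_square(880, 105, 15)
print(x2, df, x2 > CRITICAL_5PCT_DF1)  # X2 is about 27.5, so H0 is rejected
```

Here there are fewer heterozygotes and more ww homozygotes than expected, so the sample departs from Hardy-Weinberg proportions.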
True null hypothesis
If the null hypothesis is true, the statistic will exceed the 5% critical value only 5% of the time
Allele fitness
In the current generation, let p be the allele frequency of A and q = 1 - p
Non-random mating can affect genotype frequencies because
Mates can be limited by geography • Mates can be chosen by their traits • Mates can be more closely related to one another
Evolutionary Forces creating diversity
Mutation • Migration
__________are the ultimate source of polymorphism
Mutations
Non-random mating
Non-random mating affects genotype frequencies, not allele frequencies
___________________ measures how well the selection differential can be passed on to progeny
Response to Selection (R)
Pearson's chi-squared test (test statistic)
Pearson's X2 allows you to summarize your observed data into a single value • Pearson's X2 statistic allows us to summarize how well our observed offspring numbers fit the expected numbers • Important: the sum uses counts, not ratios, AND is taken over all offspring types
            GgWw  Ggww  ggWw  ggww
Observed       5    45    45     5
Obs. ratio     1     9     9     1
Expected      25    25    25    25
Exp. ratio     1     1     1     1
How are we going to test the null hypothesis with four categories?
Pearson's chi-squared test (test statistic)
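A minimal sketch of Pearson's statistic applied to the four test-cross categories above (observed 5, 45, 45, 5; expected 25 each). The helper name is hypothetical; 7.815 is the standard 5% critical value for 3 degrees of freedom.

```python
def pearson_chi_square(observed, expected):
    """X2 = sum over categories of (Obs - Exp)^2 / Exp, using counts not ratios."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


observed = [5, 45, 45, 5]          # GgWw, Ggww, ggWw, ggww
total = sum(observed)
expected = [total / 4] * 4         # 1:1:1:1 under the null hypothesis
x2 = pearson_chi_square(observed, expected)
df = len(observed) - 1             # simple null: categories - 1
print(x2, df, x2 > 7.815)          # 64.0 3 True
```

Because 64 is far above 7.815, the 1:1:1:1 null hypothesis is rejected for these counts.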
non-random mating
Positive assortative mating: Bias toward phenotypically similar mates • Negative assortative mating: Bias toward phenotypically different mates • Inbreeding: Bias toward mating with relatives • Outbreeding: Bias against mating with relatives
Calculating IBD from pedigree
Probability of IBD = (1/2)^n, where n = the number of transmission events required to produce IBD and 1/2 = the probability of transmission of an allele
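A sketch of the (1/2)^n rule, summed over every ancestral allele that could become identical by descent. The function name is made up, and the two pedigrees (offspring of half-sibs, offspring of first cousins) with their transmission counts are textbook examples assumed here, not taken from these notes.

```python
def f_ibd(n_transmissions, n_ancestral_alleles):
    """Sum (1/2)^n over all ancestral alleles that can become IBD."""
    return n_ancestral_alleles * 0.5 ** n_transmissions


# Offspring of half-sibs: one common ancestor (2 alleles),
# 4 transmission events per allele (2 down each side of the pedigree).
print(f_ibd(4, 2))  # 0.125

# Offspring of first cousins: two common ancestors (4 alleles),
# 6 transmission events per allele (3 down each side).
print(f_ibd(6, 4))  # 0.0625
```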
Wahlund effect
Reduction of heterozygosity in a population caused by subpopulation structure
____________________measures the difference between the population mean and the individuals selected for mating
Selection Differential (S)
Evolutionary Forces destroying diversity
Selection • Drift
Examples of mutations causing neutral polymorphism
Synonymous changes in protein coding sequences • Nonsynonymous changes that replace one amino acid with a chemically similar one.
selection coefficient (s)
measure of the relative intensity of selection against a genotype; equals 1 minus fitness
Fisher's fundamental theorem
The rate of evolution via natural selection in a population is proportional to the additive genetic variance in the population • Thus narrow-sense heritability (h^2) measures how well a population will respond to selection
Goodness-of-fit test What does the distribution of X2 test statistics follow if you did an infinite number of simulations?
Theorem: The X2 test statistic follows a chi-squared distribution with k degrees of freedom if the null hypothesis is true.
Discrete ratios. 1:4:6:4:1
Two genes determine color variation: A and B • Each gene has two alleles • A and B are additive (they contribute equally to red coloration) • a and b are non-additive (they contribute nothing to red coloration)
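A sketch of where the 1:4:6:4:1 ratio and the 2n + 1 rule come from. It assumes an F2 from a cross of multiple heterozygotes (AaBb × AaBb for n = 2), independent assortment, and equal additive effects, so the phenotype depends only on the number of additive alleles (0 to 2n). Function names are hypothetical.

```python
from math import comb


def phenotype_classes(n_genes):
    """Number of distinct phenotype categories for n additive genes: 2n + 1."""
    return 2 * n_genes + 1


def f2_phenotype_ratio(n_genes):
    """Binomial coefficients C(2n, k), k = number of additive alleles carried."""
    return [comb(2 * n_genes, k) for k in range(2 * n_genes + 1)]


print(phenotype_classes(2), f2_phenotype_ratio(2))  # 5 [1, 4, 6, 4, 1]
```

With one gene the same functions give 3 classes in a 1:2:1 ratio.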
trait variance
V(x) = V(g) + V(e)
Selection
Variation in average reproductive success among phenotypes
Balance between mutation and drift
When mutation and drift are in balance, a population can reach an equilibrium. We use heterozygosity (H) as a measure of variation. H will be near 0 when a population is near fixation for a single allele (low variation) and near 1 when there are many alleles that have equal frequency (high variation)
Histogram of results
With a significance level of 0.05, you have a 5% chance of rejecting the null hypothesis when it is true. 5% of your experiments will give you data that are significantly different, just by chance
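A rough simulation sketch of that 5% false-positive rate. It assumes a null hypothesis of four equally likely offspring types, 100 offspring per experiment, and the standard 5% critical value of 7.815 for 3 degrees of freedom; the function names, seed, and number of experiments are arbitrary choices for illustration.

```python
import random


def chi_square(observed, expected):
    """Pearson's X2 computed from counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def false_positive_rate(n_experiments=2000, n_offspring=100, seed=1):
    """Fraction of experiments that reject a TRUE 1:1:1:1 null at the 5% level."""
    rng = random.Random(seed)
    critical = 7.815  # 5% critical value, 3 degrees of freedom
    expected = [n_offspring / 4] * 4
    rejections = 0
    for _ in range(n_experiments):
        counts = [0, 0, 0, 0]
        for _ in range(n_offspring):
            counts[rng.randrange(4)] += 1  # each offspring type equally likely
        if chi_square(counts, expected) > critical:
            rejections += 1
    return rejections / n_experiments


print(false_positive_rate())  # should land near 0.05
```

A histogram of the simulated X2 values would approximate the chi-squared distribution with 3 degrees of freedom.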
chi-squared test (test statistic)
X2 = Σ(Obs - Exp)^2/ Exp
X (mean)
X = Xbar + g (genetic factors) + e (environmental factors), where Xbar is the population mean
Life cycle stages of a sexually reproducing organism
Zygote -> Adult (viability selection) • Adult -> Parent (sexual selection & survival selection) • Parent -> Gametes (fecundity selection) • Gametes -> Zygote (compatibility selection)
selection coefficient (s)
a measure of the relative intensity of selection against a given genotype
Partial dominant deleterious allele
An allele with some deleterious effect in heterozygotes as well as homozygotes. h = the degree of dominance of the deleterious allele. w(AA) = 1, w(Aa) = 1 - hs, w(aa) = 1 - s, where a is the partially dominant deleterious allele
founder effect
A change in allele frequencies that results when a small subgroup of a population migrates and founds a new population, producing significant genetic drift
Hardy-Weinberg equilibrium
Condition that occurs when the frequencies of alleles in a particular gene pool remain constant over time. Genotype frequencies can then be predicted from allele frequencies
A _________ can take any conceivable value within an observed range. Examples: tree height, seed weight, milk yield, blood pressure, enzyme activity, human height
continuous trait
Change in H between generations due to mutation
deltaH = 2u(1 - H)
A ___________ can take only certain fixed values. Examples: pea flower color, pea seed shape, hemophilia, polydactyly, fly eye color
discrete trait
Degrees of freedom (k): • For more complex null hypotheses:
k = number of categories - number of estimated parameters - 1
Allele Segregation
East used the multiple-gene hypothesis to investigate the patterns of inheritance of corolla length in Nicotiana longiflora • East designed his experiment using true-breeding parental lines (short and long) • There is a small amount of variation in corolla length in each true-breeding strain, which suggests that gene-gene interactions or multifactorial effects produce some variability
Directional selection
Occurs when natural selection favors one of the extreme variations of a trait; eliminates genetic diversity • Under directional selection, the advantageous allele increases as a consequence of differences in survival and reproduction among different phenotypes • The increases are independent of the dominance of the allele, and even if the allele is recessive, it will eventually become fixed
Average fitness of the population
wbar = p^2 w(AA) + 2pq w(Aa) + q^2 w(aa)
freq
p + q = 1 (100%)
frequency of allele in gametes equation
p' = p w(A) / wbar = (p^2 w(AA) + pq w(Aa)) / wbar
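A sketch of one generation of selection, using the allele fitness w(A) = p w(AA) + q w(Aa) and the mean fitness wbar = p^2 w(AA) + 2pq w(Aa) + q^2 w(aa) from these notes. The function name and example fitness values are made up for illustration.

```python
def next_allele_frequency(p, w_AA, w_Aa, w_aa):
    """Return (p', wbar) after one generation of selection at a two-allele locus."""
    q = 1 - p
    w_A = p * w_AA + q * w_Aa  # average fitness of allele A
    w_bar = p * p * w_AA + 2 * p * q * w_Aa + q * q * w_aa  # mean fitness
    return p * w_A / w_bar, w_bar


# Recessive deleterious allele a with s = 0.5: w(AA) = 1, w(Aa) = 1, w(aa) = 0.5
p_next, w_bar = next_allele_frequency(0.5, 1.0, 1.0, 0.5)
print(p_next, w_bar)  # p rises from 0.5 to about 0.571; wbar = 0.875
```

If all three fitnesses are equal, p' = p, which is the no-selection Hardy-Weinberg case.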
Hardy-Weinberg equilibrium
p^2 + 2pq + q^2 = 1
Dihybrid test cross: what is our null hypothesis H0 for genotype ratios? GgWw x ggww
Phenotype                                             Genotype  Probability
Dominant for both traits                              GgWw      1/4
Dominant for the first and recessive for the second   Ggww      1/4
Recessive for the first and dominant for the second   ggWw      1/4
Recessive for both                                    ggww      1/4
Narrow sense heritability (h^2)
proportion of phenotypic variance due to additive genotypic variance alone
equation equilibrium freq. of deleterious recessive allele is:
qhat=sqrt (u/s)
partially dominant deleterious allele equation
qhat = u/(hs)
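A small sketch of the two mutation-selection balance formulas, qhat = sqrt(u/s) for a completely recessive deleterious allele and qhat = u/(hs) for a partially dominant one. Function names and the example values of u, h, and s are hypothetical.

```python
from math import sqrt


def q_hat_recessive(u, s):
    """Equilibrium frequency of a completely recessive deleterious allele."""
    return sqrt(u / s)


def q_hat_partial_dominant(u, h, s):
    """Equilibrium frequency when heterozygotes have fitness 1 - hs."""
    return u / (h * s)


u, s, h = 1e-6, 1.0, 0.1
print(q_hat_recessive(u, s))            # about 1e-3
print(q_hat_partial_dominant(u, h, s))  # about 1e-5
```

Even slight dominance lowers the equilibrium frequency sharply, because selection can then act on the allele in heterozygotes.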
Finding Hhat
Hhat represents the equilibrium value of H. Under drift alone, H' = (1 - 1/(2N))H
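A hedged sketch of the mutation-drift equilibrium. It is not stated in these notes but follows from setting the gain from mutation, 2u(1 - H), equal to the loss from drift, H/(2N), which gives Hhat = 4Nu/(1 + 4Nu). The function name and the values of N and u are made up for illustration.

```python
def equilibrium_heterozygosity(n, u):
    """Solve 2u(1 - H) = H/(2N) for H, giving H = 4Nu / (1 + 4Nu)."""
    theta = 4 * n * u
    return theta / (1 + theta)


n, u = 10_000, 1e-5
h_hat = equilibrium_heterozygosity(n, u)
gain = 2 * u * (1 - h_hat)  # deltaH due to mutation
loss = h_hat / (2 * n)      # deltaH due to drift
print(h_hat, gain, loss)    # gain and loss match at equilibrium
```

Larger populations or higher mutation rates push Hhat toward 1; small populations push it toward 0.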
w=1-s
selection coefficient equation
Fitness
the average number of offspring produced by a phenotype
relative fitness (w)
the contribution an individual makes to the gene pool of the next generation relative to the contributions of other individuals
Broad sense heritability (H^2)
The proportion of the total phenotypic variance of a trait that is attributable to genetic variance, where genetic variance is represented in its entirety as a single value. If there is no genetic variation, H^2 = 0
stable equilibrium
The state of an object balanced so that any small displacement or rotation raises its center of gravity. Equilibrium allele frequencies: p(E) = t/(s + t), q(E) = s/(s + t)
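A sketch showing that p(E) = t/(s + t) behaves as a stable equilibrium. It assumes the usual heterozygote-advantage fitnesses, w(AA) = 1 - s, w(Aa) = 1, w(aa) = 1 - t, which these notes do not spell out; function names and the values of s and t are hypothetical.

```python
def equilibrium_p(s, t):
    """Predicted stable equilibrium frequency of A under heterozygote advantage."""
    return t / (s + t)


def next_p(p, s, t):
    """One generation of selection with w(AA) = 1 - s, w(Aa) = 1, w(aa) = 1 - t."""
    q = 1 - p
    w_bar = p * p * (1 - s) + 2 * p * q + q * q * (1 - t)
    return (p * p * (1 - s) + p * q) / w_bar


s, t = 0.1, 0.3
for start in (0.1, 0.9):  # start below and above the equilibrium
    p = start
    for _ in range(500):
        p = next_p(p, s, t)
    print(start, p, equilibrium_p(s, t))
```

Both starting points drift back toward the same frequency, which is what "stable" means here: small displacements are corrected by selection.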
additive
they contribute equally
non-additive genetic variance
they contribute nothing
complex traits
traits controlled by multiple genes, the interaction of genes with each other, and with environmental factors where the contributions of genes and environment are undefined
fitness of alleles
w(A) = p w(AA) + q w(Aa); w(a) = p w(Aa) + q w(aa)
Null hypothesis: H0
• there is no association between two phenomena • there is no effect by a drug on a disease • there is no linkage between two genes • there is no evolution occurring in the gene pool
Response to selection equation
R = h^2 × S
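A minimal worked example of R = h^2 × S, taking S as the mean of the individuals selected for mating minus the population mean. The function name and the trait values are made up for illustration.

```python
def response_to_selection(h2, population_mean, selected_mean):
    """R = h^2 * S, where S = selected mean - population mean."""
    s = selected_mean - population_mean  # selection differential
    return h2 * s


r = response_to_selection(0.5, 10.0, 14.0)
print(r, 10.0 + r)  # 2.0 12.0
```

With h^2 = 0.5, only half of the selection differential is passed on, so the offspring mean is expected to move from 10 to 12 rather than to 14.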