
Probability of all alleles represented in a sample

I’m trying to wrap my head around some formulas presented in the 1992 paper by Chakraborty, "Sample Size Requirements for Addressing the Population Genetic Issues of Forensic Use of DNA Typing" (https://www.jstor.org/stable/41464266?seq=1), but I have not been able to.

Specifically, I am confused by the right-hand side of formula (16) and its relation to formula (13).

Formula 13 indicates the probability that all alleles are represented in a given sample of size n

The right-hand side of formula 16 indicates the probability that r alleles are represented in a given sample of size n

Formula 13 indicates the probability, for a locus with k segregating alleles whose frequencies are contained in the vector p, that all alleles are represented in a given sample of size n, and the right-hand side of formula 16 indicates the probability that r alleles are represented in a given sample of size n.

First of all, why, based on (13), does the expression inside the summation give the probability that an allele of frequency p remains unobserved in a sample of size n?

I tried to understand this from the Hardy-Weinberg equation but did not have any success.
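To make the first question concrete, here is a minimal sketch of one possible reading, not taken from the paper itself. It assumes the sample consists of m gene copies drawn independently from the population (m = 2n for n diploid individuals would be one such assumption), so that an allele of frequency p is missed on a single draw with probability 1 - p and on all m draws with probability (1 - p)^m. The function names are hypothetical, and the inclusion-exclusion sum is the standard way to combine such "missing" probabilities, which may or may not match the exact form of formula (13).

```python
from itertools import combinations, product


def prob_unobserved(p, m):
    """Probability that an allele of frequency p is absent from all m
    independently drawn gene copies (assumption: independent draws)."""
    return (1.0 - p) ** m


def prob_all_represented(freqs, m):
    """Probability that every allele in freqs appears at least once among
    m independent draws, computed by inclusion-exclusion."""
    total = 0.0
    for size in range(len(freqs) + 1):
        for subset in combinations(freqs, size):
            # (1 - sum(subset)) ** m is the probability that every allele
            # in this subset is missing from all m draws.
            total += (-1) ** size * (1.0 - sum(subset)) ** m
    return total


def prob_all_represented_brute(freqs, m):
    """Same probability by enumerating every ordered sample of m draws.
    Only feasible for small m; used here as a sanity check."""
    k = len(freqs)
    total = 0.0
    for draw in product(range(k), repeat=m):
        if len(set(draw)) == k:
            prob = 1.0
            for allele in draw:
                prob *= freqs[allele]
            total += prob
    return total


if __name__ == "__main__":
    freqs = [0.5, 0.3, 0.2]  # illustrative frequencies, not from the paper
    print(prob_unobserved(0.2, 4))
    print(prob_all_represented(freqs, 4))
    print(prob_all_represented_brute(freqs, 4))
```

Under this reading, Hardy-Weinberg proportions only enter as the justification for treating the gene copies within individuals as independent draws; the "unobserved" term itself is just a repeated-miss probability.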

Second, why is the expression in (16) raised to the r-th power?
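One way to probe the r-th power numerically is sketched below. It rests on assumptions that are not confirmed by the paper: that the r alleles each have the same frequency p, that the sample is m independently drawn gene copies, and that the presence of each allele is treated as approximately independent of the others, which would give [1 - (1 - p)^m]^r. The function names are hypothetical. The exact inclusion-exclusion value is computed alongside for comparison.

```python
from math import comb


def prob_all_r_independent(p, r, m):
    """Approximation: each of r alleles (frequency p) is seen with
    probability 1 - (1 - p)^m, and the r events are treated as independent."""
    return (1.0 - (1.0 - p) ** m) ** r


def prob_all_r_exact(p, r, m):
    """Exact probability that all r alleles of frequency p each appear
    among m independent draws, by inclusion-exclusion (requires r * p <= 1)."""
    return sum(
        (-1) ** j * comb(r, j) * (1.0 - j * p) ** m
        for j in range(r + 1)
    )


if __name__ == "__main__":
    p, r, m = 0.05, 3, 100  # illustrative values, not from the paper
    print(prob_all_r_independent(p, r, m))
    print(prob_all_r_exact(p, r, m))
```

For r = 1 the two functions agree exactly. For small p and a reasonably large m they stay close, which would be consistent with reading the r-th power as an independence approximation rather than as a separate biological concept.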

Which biological concepts am I missing?

Thank you in advance for your time.
