Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 3:52 am
by Ares Land
I've got a fairly hairy problem, and maybe some of you who are better than me at statistics and probability can help me figure it out.

So, let's consider a fictional triploid species (that is, with three copies of each chromosome instead of two).
Females of that species can either produce haploid eggs, like we do, or diploid eggs.
Males only produce haploid sperm.

So we have something like this:

Male	Female
XYZ	ABC
Sperm	Egg
X	A
Y	B
Z	C
	AB
	BC
	CA
So the species produces two kinds of offspring:
  • diploid offspring (haploid sperm + haploid egg): XA, XB, XC, YA, YB and so on.
  • triploid offspring (haploid sperm + diploid egg): XAB, XBC, YAB, and so on.
My question is: how much of the genome of the diploid individual XA is present in their triploid siblings XAB, XBC, YAB, etc.?

Now, I've got a solution using simple brute force and checking all possible combinations... But I wonder if there is a more elegant solution to that problem? (Ideally, I'd like to figure out a formula that I could generalize -- it'd be less error prone and I could try and check different scenarios with different numbers of chromosomes)
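For reference, the brute-force check could be sketched roughly like this in Python. The function name `mean_shared_fraction` and its parameters are made up for illustration. It assumes every gamete is equally likely, that the parents share no chromosome labels, and that "shared" means the fraction of the diploid's two chromosomes that also appear in the triploid sibling.

```python
from fractions import Fraction
from itertools import combinations, product


def mean_shared_fraction(father, mother, egg_size=2):
    """Average fraction of a diploid child's markers found in a triploid sibling.

    father, mother: strings of distinct chromosome labels, e.g. "XYZ" and "ABC".
    egg_size: number of maternal chromosomes carried by the larger egg (assumed 2).
    """
    # Diploid children: one paternal plus one maternal chromosome.
    diploids = [{s, e} for s, e in product(father, mother)]
    # Triploid siblings: one paternal chromosome plus egg_size maternal ones.
    triploids = [{s, *egg} for s in father for egg in combinations(mother, egg_size)]

    total = Fraction(0)
    for child in diploids:
        for sibling in triploids:
            # Fraction of the child's markers that the sibling also carries.
            total += Fraction(len(child & sibling), len(child))
    return total / (len(diploids) * len(triploids))


print(mean_shared_fraction("XYZ", "ABC"))
```

Changing the label strings or `egg_size` is enough to try other chromosome counts.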

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 3:57 am
by alice
Rather than get bogged down in the maths, you could always run a few simulations instead.

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 4:32 am
by zompist
I assume it works like this:

One dude has XA for a particular gene. If a sibling has XAB, they have 100% of his markers for that gene. That's also true for XAC.
For XBC YAB YAC ZAB ZAC, they have 50% of his markers.
And YBC and ZBC have none.

A random triploid sibling has 50% of his markers.
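The same number falls out of a short expectation argument, which may be the generalizable formula. The symbols are my own, not from the thread: m is the number of distinct paternal chromosomes, f the number of distinct maternal ones, and k the number of maternal chromosomes in the larger egg, with every gamete assumed equally likely. The sibling carries the diploid's paternal marker with probability 1/m and his maternal marker with probability k/f, and each marker is half of his pair:

```latex
\[
\mathbb{E}[\text{shared fraction}]
  = \frac{1}{2}\left(\frac{1}{m} + \frac{k}{f}\right)
\]
\[
m = 3,\; f = 3,\; k = 2:\qquad
\frac{1}{2}\left(\frac{1}{3} + \frac{2}{3}\right) = \frac{1}{2}
\]
```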

If you're talking about genomes, it's like running this one-gene scenario randomly as many times as you have genes. Say it's 10,000. Then you'd expect the percentage to be... roughly 50%. (Running a 50%-probability test n times generates a binomial distribution; see that article for some nice graphs.) The peak gets narrower and narrower as you add repetitions (here genes), so you're going to land near 50%, though in principle no value is impossible.
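A small simulation can show the narrowing. This is only a sketch: `simulate_genome_share` is a hypothetical helper, and it assumes every gene is inherited independently with all gametes equally likely, which ignores the fact that genes on the same chromosome travel together.

```python
import random
from itertools import combinations


def simulate_genome_share(n_genes, father="XYZ", mother="ABC", rng=None):
    """Fraction of a diploid's markers found in one random triploid sibling."""
    rng = rng or random.Random()
    eggs = list(combinations(mother, 2))  # the two-copy eggs: AB, AC, BC
    shared = 0
    for _ in range(n_genes):
        # The diploid child draws one paternal and one maternal copy of this gene.
        child = {rng.choice(father), rng.choice(mother)}
        # The triploid sibling draws one paternal copy and a two-copy egg.
        sibling = {rng.choice(father), *rng.choice(eggs)}
        shared += len(child & sibling)
    return shared / (2 * n_genes)


rng = random.Random(2019)  # fixed seed so reruns give the same numbers
for n in (10, 100, 10_000):
    runs = [simulate_genome_share(n, rng=rng) for _ in range(50)]
    # Print the genome size and the lowest and highest share seen over 50 siblings.
    print(n, round(min(runs), 3), round(max(runs), 3))
```

The gap between the lowest and highest share should shrink as the number of genes grows.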

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 6:05 am
by Ares Land
Thanks!
May I ask for a quick sanity check on a related scenario?

This time we're assuming that males are diploid and females triploid.

Male	Female
XY	ABC
Sperm	Egg
X	A
Y	B
	C
	AB
	BC
	CA
This time, if a dude has XA:
XAB, XAC: 100%
XBC, YAB, YAC: 50%
YBC: 0%
So a random triploid sibling has 7/12, or about 58% of his markers.

On a complete genome, I'm a little unclear about what the result of running this scenario n times would be, but intuitively I'd assume the percentage to stay close to 58%? (But then again, about the only thing I know about statistics is that I shouldn't trust my intuitions...)
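A rough check of that intuition, assuming each of the n genes is inherited independently (a simplification, since genes on one chromosome are inherited together, so the real spread would be wider). The symbol S for the per-gene share is my own. From the tally above, S is 1 with probability 2/6, 1/2 with probability 3/6, and 0 with probability 1/6:

```latex
\[
\mathbb{E}[S] = \frac{2}{6}\cdot 1 + \frac{3}{6}\cdot\frac{1}{2} = \frac{7}{12},
\qquad
\mathbb{E}[S^2] = \frac{2}{6}\cdot 1 + \frac{3}{6}\cdot\frac{1}{4} = \frac{11}{24}
\]
\[
\operatorname{Var}(S) = \frac{11}{24} - \left(\frac{7}{12}\right)^2 = \frac{17}{144},
\qquad
\operatorname{sd}\!\left(\bar{S}_n\right) = \frac{\sqrt{17}}{12\sqrt{n}} \approx \frac{0.344}{\sqrt{n}}
\]
```

With n = 10,000 that is a standard deviation of about 0.0034, so under these assumptions the genome-wide percentage would stay within roughly a point of 58%.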

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 9:32 am
by Ares Land
Actually, forget my last question! I found a better solution.

What I was trying to do was figure out a way for eusociality to evolve, besides haplodiploidy. And lo and behold, I found a solution that is both plausible and arguably even better than haplodiploidy. The gory details are here: http://verduria.org/viewtopic.php?f=3&t ... 512#p20512

(I spent way too much time on this, but still, I'm kinda proud of that system.)

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 1:07 pm
by Salmoneus
I acknowledge it may no longer be directly relevant, but I just wanted to point out an error/assumption: the above calculations of how many genes are shared assume that each chromosome contains the same number of genes.

In reality, the human X chromosome contains around 800 protein-encoding genes, while the Y chromosome (which is both much smaller and mostly junk) contains only around 70 protein-encoding genes. And of course, in theory the difference could be far larger - the Y chromosome only actually 'needs' one gene (SRY, the sex-determining gene).

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 1:12 pm
by Ares Land
Salmoneus wrote: Fri Oct 25, 2019 1:07 pm I acknowledge it may no longer be directly relevant, but I just wanted to point out an error/assumption: the above calculations of how many genes are shared assumes that each chromosome contains the same number of genes.

In reality, the human X chromosome contains around 800 protein-encoding genes, while the Y chromosome (which is both much smaller and mostly junk) contains only around 70 protein-encoding genes. And of course, in theory the difference could be far larger - the Y chromosome only actually 'needs' one gene (SRY, the sex-determining gene).
Quite true. (I toyed with an X0 system as well, in which the Y chromosome gets so small as to be eliminated entirely).

Re: Genetics, probabilities and statistics problem

Posted: Fri Oct 25, 2019 11:10 pm
by zompist
Salmoneus wrote: Fri Oct 25, 2019 1:07 pm In reality, the human X chromosome contains around 800 protein-encoding genes, while the Y chromosome (which is both much smaller and mostly junk) contains only around 70 protein-encoding genes. And of course, in theory the difference could be far larger - the Y chromosome only actually 'needs' one gene (SRY, the sex-determining gene).
True, but he asked about the genome, not the sex chromosomes.

Perhaps more strangely, the number of genes per chromosome varies widely outside the sex chromosomes. The max is 2058 genes, on chromosome 1; the min besides Y is 234, on chromosome 21.