How Many Ancestors Share Our DNA?

This post was written four years ago, using a quick-and-dirty model of recombination to answer the question in the title. Since then a more detailed and rigorously tested model has been developed by Graham Coop and colleagues to answer this same question. You can read more about the results of this model on the Coop Lab blog here and here. Graham’s model is based on more accurate data, more careful tracking of multiple ancestors and a more realistic model of per-chromosome recombination, and thus his results should be considered to have superseded mine.

Over at the Genetic Genealogist, Blaine Bettinger has a Q&A post up about the difference between a genetic tree and a genealogical tree. The distinction is that your genealogical tree is the family tree of all your ancestors, but your genetic tree only contains those ancestors that actually left DNA to you. Just by chance, an individual may not leave any DNA to a distant descendant (like a great-great-great-grandchild), and as a result they would not appear on their descendant’s genetic tree, even though they are definitely their genealogical ancestor.

At the end of his post, Blaine asks a couple of questions that he would like to be able to answer in the future:

  • At 10 generations, I have approximately 1024 ancestors (although I know there is some overlap). How many of these ancestors are part of my Genetic Tree? Is it a very small number? A surprisingly large number?
  • What percentage, on average, of an individual’s genealogical tree at X generations is part of their genetic tree?

I think that I can answer those questions, or at least predict what the answers will be, using what we already know about sexual reproduction.

Simulating Sex with Recombination

We can give a simple answer by assuming that each chromosome is passed on intact, with a 50% chance of getting either one from a pair. This gives us a maximum of 44 genetic ancestors, and means that the probability of being related to any particular ancestor N generations ago is $1 - (1 - 0.5^{N-1})^{22}$. We’d have about 43 genetic ancestors out of 1024 genealogical ancestors after 10 generations.
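As a rough check on these figures, the no-recombination formula can be evaluated directly. This is a minimal Python sketch (the original simulation was written in R); the function names are made up for illustration, and it assumes 2^N distinct genealogical ancestors at generation N, i.e. no overlap in the tree.

```python
def p_genetic_ancestor(n_generations, n_pairs=22):
    """Probability that an ancestor n generations back passed on at least
    one whole chromosome, using the post's no-recombination formula."""
    return 1 - (1 - 0.5 ** (n_generations - 1)) ** n_pairs


def expected_genetic_ancestors(n_generations, n_pairs=22):
    """Expected number of genetic ancestors among the 2**n genealogical ones
    (assumes no overlap in the genealogical tree)."""
    return 2 ** n_generations * p_genetic_ancestor(n_generations, n_pairs)


# Print generation, per-ancestor probability, expected genetic ancestors
for n in (1, 5, 10):
    print(n, round(p_genetic_ancestor(n), 4), round(expected_genetic_ancestors(n), 1))
```

At 10 generations the expected count comes out at roughly 43, in line with the figure quoted above.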

This is an underestimate; outside of the Y chromosome and the mitochondria, DNA is not passed down as whole chromosomes. Instead, recombination occurs, in which chromosomes come together and swap DNA. This mixes up the DNA and stops it getting lost, and lets you have DNA from more than 44 genetic ancestors.

To answer Blaine’s two questions, we need to take recombination into account. To do this, I put together a computer simulation of recombination, using data from Chowdhury et al.’s large population study of recombination; they found that recombination rates vary from person to person, and especially between genders, with significantly more recombination in women than in men (see this graph for the data I used). I simulated individuals with the 22 non-sex chromosomes, and each person had their own recombination rate, chosen at random, dependent on their gender. I simulated sexual reproduction with unrelated individuals, and checked whether their DNA was present in their descendants N generations in the future.
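To make the idea concrete, here is a minimal Python sketch of this kind of simulation. It is not the R code used for this post, and it makes several simplifying assumptions that are mine rather than the post's: all 22 autosomes have equal length, crossovers are Poisson-distributed and uniformly placed, and each sex has a single fixed mean crossover rate (illustrative round numbers, not the Chowdhury et al. data, and not per-person rates). All function and constant names are hypothetical. Because of these simplifications, its output should not be expected to reproduce the graphs below.

```python
import math
import random

N_AUTOSOMES = 22
# ASSUMPTION: illustrative mean crossovers per chromosome per meiosis, spread
# evenly over 22 equal-length chromosomes. Not the Chowdhury et al. data.
MEAN_CROSSOVERS = {"F": 42 / N_AUTOSOMES, "M": 26 / N_AUTOSOMES}


def poisson(rng, lam):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def transmit(segments, sex, rng):
    """Pass one chromosome through one meiosis.

    `segments` are (start, end) intervals of ancestor DNA on the haplotype the
    parent inherited along the line of descent; the parent's other haplotype
    comes from an unrelated partner and carries none.
    """
    n_cross = poisson(rng, MEAN_CROSSOVERS[sex])
    bounds = [0.0] + sorted(rng.random() for _ in range(n_cross)) + [1.0]
    copying_lineage = rng.random() < 0.5  # which haplotype the gamete starts on
    kept = []
    for lo, hi in zip(bounds, bounds[1:]):
        if copying_lineage:
            for start, end in segments:
                a, b = max(start, lo), min(end, hi)
                if a < b:
                    kept.append((a, b))
        copying_lineage = not copying_lineage  # switch at each crossover
    return kept


def is_genetic_ancestor(generations, rng, sexes=None):
    """True if any ancestor DNA survives `generations` steps down one line."""
    # The ancestor's child carries one full haploid copy of ancestor DNA.
    genome = [[(0.0, 1.0)] for _ in range(N_AUTOSOMES)]
    for g in range(generations - 1):
        sex = sexes[g] if sexes else rng.choice("FM")
        genome = [transmit(chrom, sex, rng) for chrom in genome]
        if not any(genome):  # every chromosome has lost its ancestor DNA
            return False
    return True


def sharing_probability(generations, trials=2000, seed=0, sexes=None):
    """Monte Carlo estimate of P(genealogical ancestor is a genetic ancestor)."""
    rng = random.Random(seed)
    hits = sum(is_genetic_ancestor(generations, rng, sexes) for _ in range(trials))
    return hits / trials


for n in (2, 6, 10):
    print(n, sharing_probability(n, trials=500))
```

Passing `sexes="F" * (n - 1)` or `sexes="M" * (n - 1)` fixes the sex of every transmitting parent, which is one way to compare an all-female line of descent with an all-male one under these assumptions.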

The results

Here are the results of the simulation:

[Figure: sharingprob]

This graph shows the probability that an ancestor of mine from N generations ago will pass DNA on to me, i.e. the probability that a genealogical ancestor is also a genetic ancestor. The black line is the simulation with recombination; the red line is our prediction without recombination. As we guessed, recombination increases your chance of being genetically related to your ancestors, though it still drops off pretty dramatically; after 10 generations, only about 12% of your genealogical tree is in your genetic tree.

The probability of having DNA from all of your genealogical ancestors at a particular generation becomes vanishingly small very rapidly; there is a 99.6% chance that you will have DNA from all of your 16 great-great grandparents, only a 54% chance of sharing DNA with all 32 of your G-G-G grandparents, and a 0.01% chance for your 64 G-G-G-G grandparents. You only have to go back 5 generations for genealogical relatives to start dropping off your DNA tree.

We also care about how many genetic ancestors we have after a certain number of generations, shown below:

[Figure: Nancestors]

The number of genetic ancestors starts off growing exponentially, but eventually flattens out to around 125 (at 10 generations, 120 of your 1024 genealogical ancestors are genetic ancestors).
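The two graphs are linked by a simple identity: when there is no overlap in the tree (as in the simulation, where all partners are unrelated), the expected number of genetic ancestors at generation N is 2^N times the per-ancestor sharing probability. A small Python sketch of that arithmetic, with a hypothetical function name and using only the rounded figures quoted in this post:

```python
def expected_count(generations, sharing_prob):
    """Expected genetic ancestors = genealogical ancestors x sharing probability."""
    return 2 ** generations * sharing_prob


# Using the rounded "about 12%" figure quoted for 10 generations
print(expected_count(10, 0.12))

# Conversely, 120 genetic ancestors out of 1024 implies this sharing probability
print(120 / 2 ** 10)
```

The first value is close to 123 and the second is about 11.7%, so the small gap between 123 and the quoted 120 is just the rounding of 11.7% up to "about 12%".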

As a final note, there is an interesting effect of the larger recombination rate in women; you are, on average, slightly more closely related to your maternal line (your maternal grandmother, your mother’s maternal grandmother, etc) than you are to your paternal line (your father’s paternal grandfather, etc). The sharing probabilities for the maternal line and the paternal line are shown below:

[Figure: mvflines]

You are about 30% more likely to be genetically related to your maternal-line ancestor 10 generations ago than you are to the corresponding paternal-line ancestor (14% vs 11%).

My not-very-well-documented R code for this simulation can be found here.


24 Responses to How Many Ancestors Share Our DNA?

  1. How do the simulations change when there are 2, 3, 4…. common ancestors present within specific generations? What you provide is the theoretical minimum.

  2. Great, thanks! Your first chart above addresses one side of the issue: Given recombination, what is the probability you will inherit NO DNA from a given ancestor x generations back?
    The other side: Given recombination, and given that you KNOW you inherited y cM of autosomal DNA from a particular line of ancestors, what is the probability you will inherit AT LEAST y cM of DNA from an ancestor x generations back? This plot should look quite different, and I guess it should depend on the length of the chromosome involved, right?

    If so, can you at least give a simplified answer, assuming an average autosome length?

  3. Doh, can you tell it has been a while since my last genetics class? I guess the answer to my question above (which describes the situation of many 23andMe /DecodeMe customers) basically comes directly from the definition of a centimorgan. Feel free to delete the question.

  4. “….you are, on average, slightly more closely related to your maternal line (your maternal grandmother, your mother’s maternal grandmother, etc) than you are to you paternal line (your father’s paternal grandfather, etc).”

    Cool.

    But, aren’t most people going to be more like their fathers, then, than their mothers since they should inherit more directly *groups* or *packages* of genes that did not recombine (i.e. that remain linked)?

  5. Was pleased to read about girls being closer to the maternal line. . .

  6. I am a female and more closely related to my paternal line, 53.13%, according to Relative Finder. Almost every one of my matches comes from my father’s side.
    Some of his relatives I am even more related to because of the recombination. On average, it looks as though I may lose a few cM with most, although many of our matches have remained exactly the same.
    One particular high/distant cousin match for him, totally disappears for me.
    All very interesting!

  7. Pingback: Using Genome-Wide SNP Scans to Explore Your Genetic Heritage

  8. You are showing a point estimate but it would be nice to see the range of values. Can you repost the plots with 95% prediction intervals?

  9. Does it mean that ethnic affinities derived from autosomal DNA data are also that recent? (10-12 generations)

  10. This information really revolutionizes our thinking about our ancestors, and it has been fun to study what you have done with this simulation.

    It is known that recombination is not equal across the genome, and there are known recombination hotspots. It would be interesting to see how this analysis would look if you incorporated recombination rates across the genome. All of the data is accessible from HapMap’s website http://hapmap.ncbi.nlm.nih.gov/downloads/recombination/latest/rates/

    Also, it is strange to me that you see such a leveling off in number of ancestors at 125, it even looks like the number of ancestors drops a little at 15 generations. Does your code include a minimum threshold for being identified as a relative? Otherwise, I don’t think that is possible.

    Nice work, I hope you could generate a simulation with the recombination rate data, because it is way out of my league to try something like that.

  11. We have about 3.08 billion base pairs. Therefore, we have about 6.16 billion bases and over 30 generations before we have to make a choice between one ancestor and another, one allele and another, at a particular base pair.

    About 30 generations at about 30 years per generation is about 900 years. 900 years ago our world population was between 310 million and 790 million. It is likely no more than 2/3rds of those have descendants to the present, given the effect of the plagues that happened between 900 years ago and now.

    Without inbreeding, however, at 30 generations you have 1,073,741,824 ancestors. If the whole world were in your ancestry, then you would still have many cases of “inbreeding” with sixth cousins, seven times removed.

    However, even if there were only 30 ethnic groups in relative reproductive isolation from each other, until the last 200 years there has been very little genetic sharing between groups. There was not much interbreeding, for instance, between native Americans and Mongolians 200 years ago. Genealogical ancestry does not extend much beyond 400 years ago.

    Therefore, we can say that your genetic ancestors and your genealogical ancestors are in the same set and equivalent.

  12. @John Lloyd Scharf

    “We have about 3.08 billion base pairs. Therefore, we have about 6.16 billion bases and over 30 generations before we have to make a choice between one ancestor and another, one allele and another, at a particular base pair.”

    No, that’s not how it works. You don’t have 3.08bn independent base pairs; you have 22 independent autosomes, the bases within which have a correlation structure dependent on recombination.

  13. SO THIS MEANS IF A WOMAN WERE ABLE TO TRAVEL BACK IN TIME AND HAVE SEX WITH A DIRECT ANCESTOR AND BECOME IMPREGNATED BY HIM SHE COULD THEORETICALLY HAVE A COMPLETELY NORMAL CHILD?

  14. @Luke: This to me seems one of the main blog entries about genetics. I think a work-up of the issue in a scientific paper would be worthwhile. So many write about admixture between populations and do PCA etc, but the basic knowledge about the statistical correlations in autosomal DNA inheritance still seems rudimentary to me.
    A similar study related to the X chromosome (X-DNA) would be very interesting. The number of potential ancestors indeed follows an interesting pattern and only 2/3 recombine: 1 (male), 2 (female), 3, 5, 8, 13, 21, 34, 55, 89, 144

    @Hanna: Already by the known simplified inheritance models you share ca. 12.5% of your DNA with first cousins, the same as with your great-grandparents. So I guess there would be the same probability of normal children if time travel, or the use of ancestors’ gametes, were possible.

  15. Martha Lyle

    I halfway think I understand this! I’m wondering how the genetic inheritance and Y-DNA are related. My brother has been tested for Y-DNA, and matches 67 markers, confirming that our ancestor, who came to America in 1656, was from Sweden (95% accuracy). Does this high marker match have any relationship to the percentage of inherited genetics? As I read the material above I think I am seeing that my brother has only 11% of the ancestor’s DNA inheritance. I guess my question is, would this percentage be any higher based on the Y-DNA?

  16. Helena Oksanen

    Hi, this article makes sense. I have a match with one father but not with their children, and we don’t share any X chromosome at all! So it seems to be just as your results say.

  17. There are four Canadian Algonquin Indian Natives in my ancestry from the 1600′s to 1700′s. Is it possible that a DNA could register an ‘Indian’ presence in my genome from that long-ago time? Not that I can justify the cost of DNA however, the data I have comes from reliable archival journals written and kept by Jesuits in Church records and libraries and in Franco-American Society records in cities where there is a strong number of Franco-American residents. Thank you. AliceMary

  18. Genes are the basis of life and contain the information. Is it possible to know who the first human race was (meaning African, American, Indian or Egyptian)?

  19. I want to know: after the origin of life, algae arose first, and from there all the species originated, but why did only our ancestors evolve and not the other species, especially at the mental level? Only humans, a single species, evolved at the mental level, even though many species arose from the origin of life both before and after us. Our ancestors the apes developed only at that time, when they were animals too. What was special about the apes?

  20. I want to ask, if my great grandfather is Portuguese, will I still be Portuguese if their son and granddaughter married a Chinese?

  21. I was wondering, if someone is directly of royal descent from their paternal grandfather and paternal grandmother, by way of their father (3000 years ago), is the king’s DNA still inside of them?

  22. I did the family finder test. I got over 150 matches, including 4 women who “supposedly” are 2nd cousins!!! Well, I know who my 2nd cousins are on both sides, and these 4 women are not amongst them. Furthermore, our connection must be way further back than 2nd cousins, because their ancestors are from places that are nowhere near my ancestral locations, and my tree goes back a long way. Does anyone have an explanation for these 4 matches? When I called the company I was assured their results are accurate and if not 2nd cousins, they should at least be 3rd cousins – not a chance! Of course, since I am 100% Jewish, the matches were about 95% all Jewish, and 5% with some Jewish ancestry.

  23. What percent of my great great grandfather’s genes do I have?

  24. My maternal side, the results are B2 with a direct line from myself (male) going straight to my 3rd great grandmother (Native American) born in 1833. There are no males in the line until myself. Therefore, how much of this Native American connects to me and how far back does it go? The various DNA sites have me with matches Eastern/Southern Asia and the entire Americas. How would this look on a line graph? I have many 1st thru 5th cousins plus two of my mother’s sisters on my relative list. Some of these I have never met but we know our common ancestors on my mother’s side.

    I am now beginning to work on the paternal of which I have many 3rd thru 5th cousins who I chat with. I will likely need a half sibling, aunt or uncle to confirm both my father and common ancestor with those on this line.
