How much information can we get from a genome scan? Many companies, such as 23andMe and deCODE Genetics, sell genetic tests that allow you to determine parts of your DNA sequence; one selling point is that the results can tell you how susceptible you are to various diseases. But how much can a genome really tell you?
In general, people say ‘not much’, and cite the importance of environmental, social and cultural factors, and our lack of knowledge of disease genetics: these are all valid and important points. But can we put some figures on exactly how much a genome scan can tell us? Can we calculate exactly how much the average person’s predicted probability of getting a disease will change after they get their DNA scanned?
In this post, we will take three important diseases of decreasing rarity, and take all the genetic variants that are known to influence them. We will see exactly how much we expect this information to change someone’s likelihood of getting the disease.
To do this, we will look at three measures. Firstly, the mean absolute probability difference, which measures how far the average prediction using the genetic information will be from the prediction based only on how common the disease is in the population; alongside it we will use the mean relative risk, which measures how many times bigger or smaller the genetic prediction is on average. Secondly, we will look at a ‘low-risk’ and a ‘high-risk’ profile. These are genetic profiles that are lucky or unlucky, but not hugely unlikely; specifically, 10% of people will have genetic profiles at least as ‘extreme’ (in either direction) as these. Finally, we will look at the distribution of genetic risk in the population as a whole.
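To make these measures concrete, here is a minimal sketch of how they could be computed from simulated genotypes. It is not the calculation behind the figures in this post: it assumes a simple multiplicative model in which each copy of a risk allele multiplies your risk by a fixed factor, it reads ‘mean relative risk’ as the average fold-change in either direction, and it takes the low-risk and high-risk profiles to be the 5th and 95th percentiles. The function names and the example variants are hypothetical.

```python
import random

def simulate_risks(prevalence, variants, n_people=20000, seed=0):
    """Simulate per-person disease probabilities (assumed multiplicative model).

    variants: list of (risk_allele_frequency, per_allele_relative_risk),
    hypothetical values, not taken from any of the studies cited here.
    """
    rng = random.Random(seed)
    raw = []
    for _ in range(n_people):
        score = 1.0
        for freq, rr in variants:
            # Each person carries 0, 1 or 2 copies of the risk allele.
            copies = (rng.random() < freq) + (rng.random() < freq)
            score *= rr ** copies
        raw.append(score)
    # Rescale so the average risk equals the population prevalence.
    mean_raw = sum(raw) / len(raw)
    return [prevalence * s / mean_raw for s in raw]

def summarise(risks, prevalence):
    """Return (mean absolute difference, mean fold-change, low, high)."""
    n = len(risks)
    mean_abs_diff = sum(abs(r - prevalence) for r in risks) / n
    # Fold-change in either direction: 2-fold lower counts the same as 2-fold higher.
    mean_fold = sum(max(r / prevalence, prevalence / r) for r in risks) / n
    ordered = sorted(risks)
    low = ordered[int(0.05 * n)]        # 5% of people are at or below this
    high = ordered[int(0.95 * n) - 1]   # 5% of people are at or above this
    return mean_abs_diff, mean_fold, low, high

example_variants = [(0.3, 1.2), (0.1, 1.5), (0.5, 1.1)]  # made-up numbers
risks = simulate_risks(0.0004, example_variants)
print(summarise(risks, 0.0004))
```

Multiplying risks directly like this is only reasonable for rare diseases, where the individual probabilities stay well below 1.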
Crohn’s Disease
Crohn’s disease is a type of Inflammatory Bowel Disease; it is a rare autoimmune disorder in which the immune system overreacts to microbes in the intestine wall, attacking tissue and causing inflamed, sore patches to appear throughout the gut. Around 1 in 2500 people suffer from Crohn’s disease in the UK, or around 0.04% of the population.
I have taken data for Crohn’s risk variants from Barrett et al, a recent meta-analysis that replicated or discovered a total of 30 variants associated with Crohn’s. In the absolute worst genetic case, having 2 copies of each of the 30 risk variants, you would have a 13.5% chance of developing the disease, and in the best case, having no copies of any of the risk variants, you’d have essentially zero chance (though you’d have to be exceptionally unlucky or lucky, respectively, for either of these to be the case).
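As a rough illustration of where best-case and worst-case figures like these come from, here is a sketch under the same kind of multiplicative model. This is an assumption on my part, not necessarily the exact calculation used for the numbers above. If a risk allele has frequency f and per-copy relative risk r, the average multiplier over 0, 1 or 2 copies is (1 - f + f·r)², so the zero-copy risk is the prevalence divided by the product of these terms, and the two-copies-of-everything risk multiplies that by every r². The function name and the example values are hypothetical.

```python
def extreme_risks(prevalence, variants):
    """Best-case and worst-case risk under an assumed multiplicative model.

    variants: list of (risk_allele_frequency, per_allele_relative_risk).
    Assumes the variants are independent of one another.
    """
    mean_multiplier = 1.0
    max_multiplier = 1.0
    for freq, rr in variants:
        # Average of rr**copies when copies ~ Binomial(2, freq).
        mean_multiplier *= (1 - freq + freq * rr) ** 2
        # Two copies of the risk allele at this variant.
        max_multiplier *= rr ** 2
    best = prevalence / mean_multiplier       # no risk alleles at all
    worst = min(1.0, best * max_multiplier)   # capped, since it is a probability
    return best, worst

# One made-up variant: frequency 0.5, relative risk 2 per copy.
best, worst = extreme_risks(0.0045, [(0.5, 2.0)])
print(best, worst)
```

With many variants the two extremes sit very far apart, but almost nobody carries all or none of the risk alleles, which is why the percentile-based profiles are more informative.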
The average person will change their probability of getting the disease by around 0.02%, or an average proportional change of around 2.16-fold. The sort of low-risk profile you might expect would have a 0.01% chance of developing the disease, and a high-risk one might have a 0.1% chance. The distribution of risk profiles looks like this:
You lie somewhere within this range, most likely towards the peak in the middle, among the mass of people who are neither particularly susceptible nor particularly resistant to Crohn’s disease. However, you may lie within the long tail of people who have a higher-than-normal chance of developing the disease. But even these numbers never really get large enough to be that useful; you may have a 1 in 500, rather than 1 in 2500, chance of getting the disease, but what does that really tell you?
Type I Diabetes
Type I Diabetes, or insulin-dependent diabetes, is a relatively common metabolic autoimmune disorder in which the immune system attacks the pancreas and destroys its ability to produce insulin, meaning that the body is unable to properly regulate blood sugar. In the UK, around 1 in 230 people have the disease (0.44%).
A lot of work has been done on the genetics of Type I Diabetes, and we know of a large number of variants that affect your disease risk. I have taken data from three T1D studies, Todd et al, Cooper et al and Barrett et al. In total, this gives 44 risk alleles: in the worst genetic case, having 2 copies of each of the 44 risk variants, you would have a 31% chance of developing the disease, and in the best case, you’d have a 1 in 35 000 chance (though, once again, these would be extremely unlikely).
The mean absolute probability difference is 0.3% (mean relative risk of 2.4-fold). If you have a risk profile towards the disease-resistant end, you could expect to have a 1 in 1250 chance of developing the disease (0.08%), and if you are towards the high end, this can go up to 1 in 77 (1.3%). The distribution of risk profiles is shown below:
In this case, the long tail starts to hit relatively significant figures, going up towards 1 in 50.
Type II Diabetes
Type II Diabetes, or non-insulin-dependent diabetes, is another metabolic disease, in which cells of the body lose the ability to respond fully to insulin, causing misregulation of blood sugar. The incidence in the UK is around 1 in 25 (4%).
The most up-to-date study that I know of is Zeggini et al, who reported 11 Type II Diabetes-associated variants (there may well be newer, fancier studies out there). In the unrealistic worst case, these variants could cause a 28% chance of developing the disease, and in the best case, a 0.64% chance.
The mean absolute probability difference is 1%, and the mean relative risk is 1.3-fold. A reasonably lucky profile would have a 1 in 40 (2.5%) chance of developing the disease, and an unlucky one would have a 1 in 12 (8.5%) chance. The risk distribution looks like this:
While the information still isn’t hugely predictive, there is potentially useful information here. An individual with a high-risk profile has a relatively high chance of developing the disease, and if I find out I have such a profile, and I eat a diet rich in sugar, I know that doing so is riskier than I thought.
Does This Mean Anything?
The probability that I will have one of the high-risk or low-risk profiles that I have mentioned above for a particular disease is 10%. The probability that I will have an extreme profile in at least one of the three diseases is 27% (that is, 1 minus 0.9 cubed, treating the three profiles as independent). There is therefore a substantial chance that my risk for something is noticeably different from what I would otherwise expect. Obviously, the numbers are still small (a change from 4% to 8% is hardly life-shattering), but they are non-trivial.
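The 27% figure is simple arithmetic, sketched here on the assumption that the extreme profiles for the three diseases are independent of one another (the helper name is just for illustration):

```python
def prob_at_least_one_extreme(p_extreme, n_diseases):
    """Chance of an extreme profile in at least one of n independent diseases."""
    # Complement of having a non-extreme profile in every disease.
    return 1 - (1 - p_extreme) ** n_diseases

print(round(prob_at_least_one_extreme(0.10, 3), 3))  # 0.271
```

The same function shows how quickly this grows as more diseases are included in a scan: with ten independent diseases it is already about 65%.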
One big thing that I have left out here is that people already have a good idea of what their genetic risk for disease is, by knowing their family history. Do any members of your family have Crohn’s disease? If not, then you are almost certainly not in the long tail of susceptible individuals. The genetic information can still tell you additional things, especially if your family is small, but it reduces the amount that the genetic data changes your predictions.
Finally, the probability changes that I have given above are larger than could have been achieved this time last year, and will continue to improve as our understanding of genetics increases. It will be interesting to re-do this once we have started to do sequencing studies on human disease.