ASHG: Rare Variants, and the 1000 Genomes Project

Hello all (it is taking every bone in my body not to say ‘Aloha’ here).

So, today was the first real day of the ASHG Annual Meeting; after accidentally falling asleep for basically all of yesterday, it was good to finally see some familiar faces and sink my teeth into some real science.

I’m going to write a little about the first couple of sessions I’ve seen, and say what sort of themes are being shouted loud enough to get into my jetlagged mind. I have also been tweeting the conference at quite a high frequency (about 30 tweets so far), and in more detail than I have given here; follow me on @lukejostins if you are interested. To see all the ASHG twittering, check out #ASHG2009.

The blog posts over the next few days will be aimed mostly at those who are, at least vaguely, In The Know about genomics. However, if there are people who would like a less jargonistic lowdown of the conference, please leave a comment and I’ll see what I can do.

Rare Variants

There were no out-and-out wonders in this section; no batch of new disease genes located using rare variants, no decisive answer to the relationship between allele frequency and disease risk, no perfect answer to how to perform association studies with rare variants. But there was some nice work presented.

Carlos Bustamante talked about modelling both demography and selection, and fitting the model using the 3D spectrum of allele frequencies across three populations. The model doesn’t need selection to explain the frequency of synonymous SNPs, but negative selection is required to explain the frequency of non-synonymous SNPs. His model predicts that 50% of new coding mutations should be mildly or seriously deleterious; this compares to around 20% for observed rare SNPs (<1%), and basically zero for common SNPs. This hints that most deleterious mutations are purged very quickly.
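For anyone unsure what a "3D spectrum" looks like in practice, here is a minimal sketch in Python. It only tallies the joint spectrum (how many SNPs show each combination of derived-allele counts across three populations), split by SNP class. The data, the population order and the function name `joint_spectrum` are all made up for illustration; none of this is code from the talk, and the model-fitting step is not shown.

```python
from collections import Counter

# Hypothetical toy data: (SNP class, derived-allele counts in populations A, B, C).
snps = [
    ("synonymous", (1, 0, 0)),
    ("synonymous", (3, 4, 2)),
    ("synonymous", (3, 4, 2)),
    ("nonsynonymous", (1, 0, 0)),
    ("nonsynonymous", (1, 0, 0)),
    ("nonsynonymous", (0, 1, 0)),
]

def joint_spectrum(records, snp_class):
    """Count SNPs of one class in each cell of the three-population spectrum."""
    return Counter(counts for cls, counts in records if cls == snp_class)

syn_sfs = joint_spectrum(snps, "synonymous")
nonsyn_sfs = joint_spectrum(snps, "nonsynonymous")

for cell, n_snps in sorted(nonsyn_sfs.items()):
    print(cell, n_snps)
```

Each key is one cell of the 3D spectrum. A demographic or selection model would be judged by how well its expected cell counts match tables like these, with the synonymous and non-synonymous tables compared separately.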

Suzanne Leal proposed a new method of looking for association in rare variants, and did a power study of this and other association methods in simulated data. She found that her algorithm (KBAC) had the highest power, followed closely by the WSS method. It is good that someone finally did this, but her analysis was marred by a) some slightly odd assumptions in her simulations (50% of rare variants in an associated gene are causative? really?) and b) not looking at the number of false positives, which makes the power scores pretty meaningless unless all the tests have the same false positive rate.
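To make the false positive point concrete, here is a rough Python sketch. It is not KBAC or WSS; it uses a simple stand-in test (a two-proportion z-test on the number of rare-variant carriers in cases versus controls), and all the sample sizes and carrier frequencies are invented. The same test is run at two significance thresholds on the same simulated data, standing in for two methods with different false positive rates.

```python
import math
import random

def carrier_count(n, carrier_prob, rng):
    """Number of individuals (out of n) carrying at least one rare variant."""
    return sum(rng.random() < carrier_prob for _ in range(n))

def burden_p_value(case_carriers, control_carriers, n_cases, n_controls):
    """Two-sided p-value from a two-proportion z-test (normal approximation)."""
    pooled = (case_carriers + control_carriers) / (n_cases + n_controls)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    if se == 0:
        return 1.0
    z = (case_carriers / n_cases - control_carriers / n_controls) / se
    return math.erfc(abs(z) / math.sqrt(2))

def rejection_rates(case_prob, control_prob, thresholds, n=500, reps=1000, seed=1):
    """Fraction of simulated studies rejected at each p-value threshold."""
    rng = random.Random(seed)
    hits = {t: 0 for t in thresholds}
    for _ in range(reps):
        cases = carrier_count(n, case_prob, rng)
        controls = carrier_count(n, control_prob, rng)
        p = burden_p_value(cases, controls, n, n)
        for t in thresholds:
            if p < t:
                hits[t] += 1
    return {t: hits[t] / reps for t in thresholds}

thresholds = (0.05, 0.20)
# Null: same carrier frequency in cases and controls, so rejections are false positives.
false_positive = rejection_rates(0.05, 0.05, thresholds)
# Alternative: carriers are twice as common in cases, so rejections count as power.
power = rejection_rates(0.10, 0.05, thresholds)

print("false positive rate:", false_positive)
print("power:", power)
```

The looser threshold always "wins" on power, because anything rejected at 0.05 is also rejected at 0.20, but it pays for that with more false positives. A power ranking only means something once every method has been calibrated to the same false positive rate.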

A common theme that came up across all the talks was that population demographic effects become more powerful for rare variants. Jon Cohen showed that most large, rare risk variants in the Dallas Heart Study were restricted to a particular ethnic group. Carlos mentioned that his demographic studies showed the importance of properly matching cases and controls for rare variants, and Suzanne expressed doubt that our existing methods of controlling for population structure will work for rare variants.

The session showed some nice results, but mostly served to highlight how far we still have to go.

The 1000 Genomes Project

The ASHG conference had an entire session dedicated to the 1000 Genomes Project. Gil McVean gave a nice explanation and progress update, with some heartening figures for the proportion of common variants discovered (95% for MAF > 3%), some nice news on format agreements, and some not-bad-but-not-great results for imputation. Richard Durbin presented some simulation work showing that their choice of sample number and depth (400 individuals at 4X per population) was optimal to maximise the number of variants discovered, and Pardis Sabeti presented some beautiful work on tracking down selection in HapMap samples that was only tangentially related to the 1000 Genomes Project.
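The trade-off between sample number and depth can be illustrated with a toy calculation in Python. This is emphatically not the simulation from the talk; it is a back-of-envelope model built on assumptions chosen here: a fixed budget of 1600X per population (400 individuals times 4X), reads per chromosome copy drawn from a Poisson distribution with mean depth/2, each chromosome copy treated independently, and a variant counted as discovered only if at least one carrier chromosome is covered by two or more reads. Sequencing error and mapping are ignored.

```python
import math

def prob_two_or_more_reads(depth):
    """P(a given chromosome copy is covered by >= 2 reads), Poisson with mean depth/2."""
    lam = depth / 2.0
    return 1.0 - math.exp(-lam) * (1.0 + lam)

def discovery_probability(allele_freq, n_individuals, depth):
    """P(at least one of the 2n chromosomes carries the variant and has >= 2 reads)."""
    per_chromosome = allele_freq * prob_two_or_more_reads(depth)
    return 1.0 - (1.0 - per_chromosome) ** (2 * n_individuals)

BUDGET = 1600          # assumed total depth per population (individuals x depth)
ALLELE_FREQ = 0.01     # assumed frequency of the variant we hope to discover

# Spend the same budget on many shallow genomes or few deep ones.
designs = {depth: BUDGET // depth for depth in (1, 2, 4, 8, 16)}
results = {
    depth: discovery_probability(ALLELE_FREQ, n, depth)
    for depth, n in designs.items()
}
best_depth = max(results, key=results.get)

for depth, n in designs.items():
    print(f"{n:5d} individuals at {depth:2d}X: P(discover) = {results[depth]:.3f}")
print("best depth among those tried:", best_depth)
```

Among the depths tried, this toy model also prefers 4X: very shallow genomes rarely give two reads on the same chromosome, and very deep ones waste reads on chromosomes that are already well covered. That agreement depends heavily on the assumed two-read rule, so treat it as intuition for why an intermediate depth can be optimal, not as a check on the actual figures.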

The talks were all nice enough: good ideas, good progress, and a general impression that everything is working well, but no solid results. We have yet to see a unified set of uses of the 1000 Genomes data that gives real, significant biological findings. For the moment, the 1000 Genomes Project seems to be brimming with potential, but we can’t wait forever for this to crystallize into science. Hopefully, after the final Pilot release in the next month or so, things will start to come together.
