The second day (or the first day, depending on whether you count yesterday’s pre-sessions) of AGBT is nearly done. There have been a lot of things going on today, but I’m only going to cover one; once again, you can get more detail on all the talks I’ve seen on my Twitter feed (@lukejostins).
Other things that I’ve done: I had a very interesting talk with Geoff Nilsen at Complete Genomics, in which I got to ask various questions. “Why don’t they use color-space?” “It confuses customers, and the error model is good enough already.” “In what sense is Complete ‘3rd Gen’?” “Because it’s cheaper.” I also saw a set of presentations from 454 on de novo assembly, and on the new Titanium 1k kit, which actually produces virtually no 1kb reads: mean read length is about 800bp, but beyond 600bp the error rates get very high.
There has been some other blog coverage of AGBT from our army of bloggers: MassGenomics has some first impressions, and Anthony Fejes is uploading his detailed notes about all the talks. You can also follow a virtual rain of tweets on the #AGBT hashtag.
Fun with Exome Sequencing
Debbie Nickerson (again!) gave a talk about sequencing genomes to hunt down the genes underlying Mendelian disorders. The process is very simple: you sequence 4-10 exomes of sufferers, look for non-synonymous mutations shared between them, and then apply filters (such as presence in HapMap exomes) to find SNPs that are likely to be causal. Debbie is in the process of sequencing 200 exomes for 20 diseases, and has a real success story under her belt in tracking down the genes for 2 disorders. She raised the interesting question of how to validate the discovered genes, given that Mendelian disorders tend to have a large number of independent mutations.
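To make the filtering idea concrete, here is a minimal sketch in Python. It is not Debbie's actual pipeline: the `Variant` record, the `candidate_variants` function, the gene names and the toy data are all invented for illustration, and I am assuming that "filtering on HapMap exomes" means discarding any variant that also appears in a set of control exomes.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical variant record; the fields are assumptions for this sketch."""
    gene: str
    position: int
    alt: str
    nonsynonymous: bool


def candidate_variants(affected_exomes, control_variants):
    """Return non-synonymous variants found in every affected exome
    and absent from the control set (assumed stand-in for HapMap exomes)."""
    if not affected_exomes:
        return set()
    # Step 1: keep only variants shared by all affected individuals.
    shared = set.intersection(*(set(exome) for exome in affected_exomes))
    # Step 2: keep non-synonymous variants that are not seen in controls.
    return {v for v in shared if v.nonsynonymous and v not in control_variants}


# Toy data: three affected exomes and one control variant (all made up).
a = Variant("GENE_A", 101, "T", True)
b = Variant("GENE_B", 202, "G", True)   # shared, but also seen in controls
c = Variant("GENE_C", 303, "A", False)  # shared, but synonymous
d = Variant("GENE_D", 404, "C", True)   # only present in one sufferer

affected = [{a, b, c}, {a, b, c, d}, {a, b, c}]
controls = {b}

candidates = candidate_variants(affected, controls)
print(sorted(v.gene for v in candidates))  # prints ['GENE_A']
```

Note that this sketch requires the exact same variant in every sufferer, which is the strictest possible reading of "shared"; it would miss a gene hit by a different mutation in each individual.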
Stacey Gabriel gave a related talk on exome sequencing, focusing on using the method Debbie described to track rare variants for complex traits. To do that, you ‘Mendelianise’ the trait by only picking extreme individuals; she did this for high and low LDL cholesterol, giving some candidate genes, but no smoking gun.
Let’s look slightly closer at this: you sequence a number of individuals with extreme traits, look for genes with shared non-synonymous mutations, and look for functional effects. This is a linkage study! A very small and underpowered linkage study, with a variant-to-gene collapse method (like a poor man’s lasso), and some sort of manual pathway/functional analysis (a poor man’s GRAIL), but linkage all the same. This is really re-inventing the wheel, without learning any of the lessons that the first round of linkage analysis taught us (or even stopping to ask whether, if such variants existed, they would have been picked up by linkage in the first place).
It is not that Stacey Gabriel is doing anything wrong; it is just that she is failing to consider that she is attempting to solve non-statistically a problem that statisticians have worked on for decades. In short, she is risking taking the statistics out of statistical genetics.