The second day of the Biology of Genomes conference is complete. In the morning we had the Genetics of Complex Traits session, followed by the poster session and the always exciting “Wine and Cheese Party”.
As usual, there were a few talks in the first session describing large consortium analyses of human complex disease. For instance, I presented work (on behalf of the International IBD Genetics Consortium) on the Inflammatory Bowel Disease immunochip project. We have genotyped tens of thousands of cases from 15 different countries, and discovered a host of new common loci for IBD. I’ll be writing about this project on Genomes Unzipped soon. On the other end of the allele frequency spectrum, Mark McCarthy reported on some of the next-generation sequencing projects that are going on in Type 2 Diabetes; these have fewer samples (something like 7K cases in total), but generated high-quality calls for a large number of rare variants. Mark reported on a few interesting hints of rare T2D associations, but his overall conclusion was that we will need tens of thousands of samples to be well powered to find rare variants that underlie common disease. We will need to go beyond just sequencing a few thousand samples, and start designing well-powered replication studies to follow up what we find.
But I wanted to talk about some less standard, and slightly cleverer, studies of non-human phenotypes that I found interesting. Three speakers described pretty nifty studies that used the particular properties of non-human sequence data to do well-powered sequencing experiments that wouldn’t be possible with humans.
Magnus Nordborg presented various studies of wild Arabidopsis plants, including sequencing more than a thousand lines and imputing a thousand more as part of an international (1000 Genomes Project style) collaboration, as well as a smaller study of 200 Swedish plant lines. These plant lines are naturally inbred, and you can thus grow up a large number of clones to get accurate phenotype measurements and test the effect of environmental changes. As well as genome sequencing, Magnus looked at methylation and gene expression, in each case doing the assay on plants grown at 10 and 16 degrees Celsius. Interestingly, he found that only 2% of gene expression is determined purely by environment, though nearly a quarter was determined by interactions between the environment, genetics and methylation.
Amelie Baud managed to magic up a >1400-sample case-control study of rat phenotypes by sequencing 8 individuals. The entire sample set descends from 8 rat lines, so using sparse genotype data, and sequence from the 8 parent lines, you are able to impute the entire genome, as well as trace which haplotypes descend from which founder rats. By looking at the difference between haplotypes and variants, you can also find cases of allelic heterogeneity (which seems to be surprisingly common).
In my personal favorite talk, Ran Blekhman described a large genome-wide association study of the relationship between the microbiota and their human hosts, put together without doing any new sequencing of the humans. The study involved taking microbiota sequencing reads and picking out “contaminating” reads that come from the human host; by combining multiple samples from the same individuals, it was possible to put together an average of 10X coverage of the human hosts. Ran showed 51 genetic variants that seemed to correlate with the levels of different bacterial species. Satisfyingly, a canonical pathway analysis of this data showed very similar pathways to my analysis of inflammatory bowel disease loci, highlighting the relationship between commensal bacteria and IBD.
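To get a feel for how pooling “contaminating” reads across samples adds up to usable coverage, here is a back-of-the-envelope sketch. It is not Ran's pipeline: the function name, the genome length, and the read counts are all illustrative assumptions of mine, and it ignores duplicates, mapping failures and uneven coverage.

```python
# Rough haploid human genome length in base pairs (an approximation, assumed here).
HUMAN_GENOME_BP = 3_100_000_000


def pooled_coverage(samples, genome_bp=HUMAN_GENOME_BP):
    """Estimate average depth from several samples of the same individual.

    `samples` is a list of (human_read_count, read_length_bp) pairs, one per
    microbiome sample. Coverage is total sequenced bases divided by genome length.
    """
    total_bases = sum(n_reads * read_len for n_reads, read_len in samples)
    return total_bases / genome_bp


# Hypothetical example: three samples from one person, each contributing
# 100 million host-derived reads of 100 bp.
example = [(100_000_000, 100)] * 3
print(f"{pooled_coverage(example):.1f}X")
```

The point is simply that no single sample needs to carry much human sequence; the depth accumulates linearly as samples from the same person are combined.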
Thank you to Mark McCarthy, Magnus Nordborg, Amelie Baud and Ran Blekhman for giving me permission to write about their talks. The image of the plant pot full of rats above is taken from Flickr.