Alright, it’s time to address the meat of the matter at AGBT: the state of play of sequencing technology. I’ll go through each of the major companies in turn, and talk about what they’ve brought to the table and what the future holds for them.
I covered Illumina on day zero. Basically, the GAIIx can now generate 7Gb/day, with 2x150bp, and error rates universally under 2%. The HiSeq generates 31Gb/day, 2x100bp, with error rates under 1%; this will soon be pushed to 43Gb/day with a slight decrease in accuracy. For sheer volume of sequence, no-one can match Illumina.
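To put those daily yields in perspective, here is a rough back-of-the-envelope sketch of how long each instrument would need to run to cover one human genome. The 3.1Gb genome size, the 40X target and the function name are my own assumptions for illustration, not figures from the Illumina talks, and it pretends every base of raw yield is usable.

```python
# Rough estimate: days of sequencing needed to reach a target coverage.
# Assumptions (not from the talks): ~3.1 Gb human genome, 40X depth,
# and that all raw yield counts towards coverage.
HUMAN_GENOME_GB = 3.1

def days_for_coverage(yield_gb_per_day, coverage=40, genome_gb=HUMAN_GENOME_GB):
    """Return days of continuous running needed for `coverage`-fold depth."""
    return coverage * genome_gb / yield_gb_per_day

# Daily yields quoted above (Gb/day).
platforms = {"GAIIx": 7, "HiSeq": 31, "HiSeq (planned)": 43}

for name, gb_per_day in platforms.items():
    print(f"{name}: {days_for_coverage(gb_per_day):.1f} days for a 40X genome")
```

Real-world numbers will be worse than this, since some fraction of reads is always lost to filtering, duplicates and mapping.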
As I said yesterday, 454’s median read lengths are climbing into the 700-800bp range, but error rates get pretty high beyond 600bp or so. Not bad, but after all the fuss over 1000bp reads, also a little disappointing.
454 have been pushing their work on assembly; they’ve worked pretty hard to make an easy-to-follow recipe, involving both single-end and paired-end sequencing, and the program Newbler. Many interesting critters have had this treatment, including bonobo, panda and Desmond Tutu (in order of majesty).
I found the SOLiD content of this conference very cool. Focusing more on the medical genomics side of things, SOLiD is involved in various clinical trials to see whether genomic information can increase cancer survival times, and is emphasizing the importance of accuracy in a clinical setting.
Lots of cool new tech too. For instance, mixing 2-base and 1-base encoding apparently makes error rates of 1 in 10^6 possible. Library prep errors now seem to dominate, so SOLiD has been working on finding gentler enzymes for amplification. Particularly cool was a throw-away slide on running the ligase on single molecules and actually getting signal (though actual single-molecule sequencing probably isn’t economic).
Pacific Biosciences have produced an extremely interesting product; it is a game-changer, though exactly what it means for sequencing is not immediately obvious. I am going to hold back on writing about PacBio right now, because I have a more in-depth post in the works on the exact specs and implications of the PacBio, in comparison to its nearest equivalent, Oxford Nanopore.
Complete Genomics have gone from “interesting idea” to “thriving technology” in a very short period of time. They’re scaling up their sequencing centre as we speak; they’ll have 16 machines in the next few months, generating 500 40X genomes a month. Over the year, provided they get more orders, they’ll scale up to 96 machines, with a predicted 5X increase in capacity per machine as well. If this all goes well, in theory they are on target for their 5000 genomes by the end of the year.
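The scale-up arithmetic can be sketched as follows. This is a naive linear projection using only the numbers quoted above; it assumes (my assumption, not Complete’s) that output scales linearly with machine count and that the full 5X per-machine gain is realised. The function and parameter names are made up for illustration.

```python
# Naive linear projection of Complete Genomics' monthly output.
# Assumption: genomes/month scales linearly with the number of machines
# and with any per-machine capacity multiplier.

def projected_genomes_per_month(machines, base_machines=16,
                                base_genomes_per_month=500,
                                per_machine_gain=1.0):
    """Scale the quoted 16-machine, 500-genome baseline to a new setup."""
    per_machine = base_genomes_per_month / base_machines
    return machines * per_machine * per_machine_gain

def months_to_target(target_genomes, genomes_per_month):
    """Months of steady output needed to reach a genome count."""
    return target_genomes / genomes_per_month

current = projected_genomes_per_month(16)
scaled = projected_genomes_per_month(96)
scaled_5x = projected_genomes_per_month(96, per_machine_gain=5.0)

print(f"16 machines: {current:.0f} genomes/month")
print(f"96 machines: {scaled:.0f} genomes/month")
print(f"96 machines at 5X: {scaled_5x:.0f} genomes/month")
print(f"Months to 5000 at the initial rate: {months_to_target(5000, current):.0f}")
```

Under these assumptions the first 16 machines alone would need ten months to reach 5000 genomes, so the year-end target leans on at least part of the scale-up arriving on schedule.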
Complete also have some very interesting new technologies on the horizon, which they will be discussing tomorrow; check the Twitter feed for coverage. A lot of people underestimate Complete Genomics, but it is starting to become evident that they are as much game-changers as the flashier technologies.
Ion Torrent wins both my major awards this year: the “most surprising release” award and the “sounds most like a Soviet weapons project” award. Ion Torrent burst onto the scene with its tiny machine (GS Junior sized); the first major non-fluorescence-based method in a long time, using the emission of hydrogen ions from the DNA polymerase reaction to measure incorporation in a 454 stylee.
The machine can produce a rapid 150Mb or so in a single one-hour run, for about $500 in disposables. The machine itself costs a tiny $50k. From what I’ve heard, a lot of people are interested in a machine like this for fast library validation, though it also has applications in diagnostics and microbiology. Unfortunately, it looks like the error rates are currently high, though they claim these will drop by release time.
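For a feel of what those prices mean per base, here is a quick sketch using only the figures above. It counts disposables alone and ignores labour, library prep and the instrument itself; the function names are just for illustration.

```python
# Back-of-the-envelope running costs for the Ion Torrent figures quoted above.
# Assumption: only per-run disposables are counted; the ~150 Mb yield is
# taken at face value.

def cost_per_mb(disposables_usd, yield_mb):
    """Disposables cost per megabase of raw sequence."""
    return disposables_usd / yield_mb

def runs_to_match_instrument_cost(instrument_usd, disposables_usd):
    """Number of runs after which disposables spend equals the machine price."""
    return instrument_usd / disposables_usd

per_mb = cost_per_mb(500, 150)
print(f"About ${per_mb:.2f} per Mb, or ${per_mb * 1000:.0f} per Gb")
print(f"Disposables match the $50k machine price after "
      f"{runs_to_match_instrument_cost(50_000, 500):.0f} runs")
```

The per-base cost is not the selling point here; the appeal is the low entry price and the one-hour turnaround.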
Overall, we are starting to see a divergence in sequencing technologies, as each tech concentrates on having clearly defined advantages and potential applications that differ from all others. This means that the scientists themselves can more closely tailor their choice of tech to fit their situation. Are you a small lab that needs 10 high-quality genomes on a budget? Go to Complete. Want a cheap, fast machine for library validation? Use Ion Torrent. Setting up a pipeline for sequencing thousands of genomes? Go Illumina.
I suppose this was all driven by the fact that Illumina’s machine has such high yield that chasing them is a fool’s game, so everyone else is concentrating on what they can do that Illumina doesn’t. This is pretty good for science as a whole; we are moving away from the One-Size-Fits-All approach to high-throughput sequencing, and moving into a time of more mature, application-based methods.