The Future of Second Generation Sequencing

Illumina, the major player in high-throughput sequencing these days, have announced the newest version of their second generation sequencing platform, the HiSeq 2000. The machine can produce a lot more sequence, and at lower cost, than the previous Genome Analyzer II.

I’m not going into much detail about the machine: for that, see posts at Genomics Law Report, Genome Web, Genetic Future, Pathogenomics and PolITiGenomics. What I really care about is what this machine implies for the future of sequencing, and specifically what we can predict about the coming 2nd versus 3rd generation sequencing battles that will be kicking off later this year.

PacBio’s 3rd generation machine, which will be arriving later this year, will have an initial throughput of around 3 Gb a day, at a price of around $1.4 per Mb in consumable costs. I don’t know the specs for Oxford Nanopore’s machine; my guess is that it will be similar, but we’ll know soon.

Compare PacBio’s capacity to the HiSeq 2000, which will produce 25 Gb per day, at a claimed consumables price of $0.11 per Mb ($10 000 for a 30X genome). In short, the Illumina 2nd gen machine is going to be able to pump out much more sequence per day, and at a much lower cost per base, than PacBio. Both companies will rapidly increase the power of their machines after release, but we don’t know who will push faster (Dave Dooling thinks Illumina could push the HiSeq to 450 Gb per run with existing technology).
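
As a rough check on these figures, here is a minimal back-of-envelope sketch in Python. It assumes a human genome of about 3,000 Mb (a round number that is not stated above), applies the same 30X depth to both platforms purely for comparison, and takes the quoted throughput and consumables prices at face value. The function and variable names are illustrative only.

```python
# Back-of-envelope comparison using the figures quoted in the post.
# ASSUMPTION: a human genome is roughly 3,000 Mb; the post does not state this.
GENOME_MB = 3_000
# ASSUMPTION: the same 30X depth is used for both platforms, for comparison only.
COVERAGE = 30


def genome_cost_and_days(price_per_mb, gb_per_day,
                         genome_mb=GENOME_MB, coverage=COVERAGE):
    """Return (consumables cost in dollars, days of machine time) for one genome."""
    total_mb = genome_mb * coverage          # total sequence needed, in Mb
    cost = total_mb * price_per_mb           # consumables only
    days = (total_mb / 1_000) / gb_per_day   # convert Mb to Gb, then divide by daily output
    return cost, days


# (price in $ per Mb, throughput in Gb per day), as quoted in the post
platforms = {
    "HiSeq 2000": (0.11, 25),
    "PacBio": (1.4, 3),
}

for name, (price, rate) in platforms.items():
    cost, days = genome_cost_and_days(price, rate)
    print(f"{name}: ~${cost:,.0f} in consumables, ~{days:.1f} days")
```

Under these assumptions the HiSeq 2000 comes out at roughly $9,900 and 3.6 days per 30X genome, which matches the quoted $10 000, while PacBio at launch would be roughly $126,000 and 30 days for the same amount of sequence.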

Of course, the competition isn’t just based on pure throughput. Read length and error rates matter too; the 3rd gen machines will have much longer read lengths than Illumina and SOLiD, and we expect that the quality of sequence will be higher as well, giving the possibility of some real Gold Standard genomes being produced from these machines, rather than the somewhat messy genomes we get from Illumina.

This all ties in to the conversation I had with the Illumina people at ASHG; Illumina think that it’ll be a good few years before 3rd Gen sequencing can catch up with their current machines. I expect that, between now and 2014 (when PacBio release v2 of their machine), the major sequencing centres will keep a combination of 2nd and 3rd gen machines. The 2nd gen machines will be used when a very large amount of low-quality sequence is required, such as for Genome-Wide Association Studies or RNA-seq. The 3rd gen machines will be used for assembling genomes, looking for copy-number variations and studying the genetics and epigenetics of non-coding and repetitive regions.

I guess what I’m trying to say is that, as exciting and cool as the single-molecule technologies of PacBio and Oxford Nanopore are, it is far too soon to announce the death of Second Gen sequencing. If Illumina continues to push its throughput as hard as it is doing now, 2nd generation machines will be widely used for a long while yet.

The future will become a bit clearer at the AGBT conference, where we should see some big announcements from PacBio, Oxford Nanopore, Complete Genomics, ABI and Illumina. A host of other bloggers and I will be there to cover them.

9 Responses to The Future of Second Generation Sequencing

  1. Nicely put, Luke. I’m excited about the launch of third-gen technologies this year simply because I’m desperately craving the end of the informatic nightmare that is short-read sequencing; but clearly no-one will be retiring their GAs and HiSeqs in favour of third-gen platforms in the immediate future.

    It will certainly be interesting to see how quickly third-gen gets taken up as a supplement to second-gen for human WGS: imagine hybrid de novo assemblies combining low-coverage third-gen data with high-coverage second-gen sequence. Sweet.

  2. Pingback: Another Stop on the Road to the $1,000 Genome

  3. Helicos claims at least 24 Gbase per 24 hours and 36 Gbase per 1.5 days, which is exactly the same throughput claimed for the HiSeq 2000.

    Which platform do you suspect has the most room to grow?
    Which platform has the lowest error rate?

    HLCS needs to get their system cost down to $600K or so to be competitive (they are working on this).

  4. 24 Gb per day is the maximum scanning rate of the Helicos imaging system. According to the system specs (which are a year out of date), the throughput is 2.5-3.4 Gb per day (105-140 Mb/hour); i.e. slightly less than the GAII, and much less than HiSeq.

    The error rates of Helicos are a bit of a worry too – their whole-genome paper last year reported error rates of 3.6%, and most of these were deletions, making indel detection difficult.

    It is always possible that Helicos will make some kind of surprise breakthrough, but with such high error rates and such low read length, the machines suffer from 2nd-gen style problems even more than 2nd gen machines. Given the recent fall in share price, I don’t think the current Heliscope is really going to play much of a part in the future of sequencing.

    In terms of which system has the most room to grow, that really depends on what time scale you are talking about. In the short term, I expect Illumina still has a lot of development in store; the R&D guys have been doing this a long time, and have an awful lot of support from the company. By 2013/14, I expect the competition will be between Oxford Nanopore and PacBio for who can push their read length furthest into the tens of thousands.

  5. What about Complete Genomics? It looks like it’s midway between 2nd and 3rd gen sequencing techs. They talked about 5k human genomes delivered in 2010. And in their proof of principle paper they were talking about $8k for 87X coverage (compared to $10k for 30X with the new Illumina HiSeq 2000).
    What do you think about Complete Genomics? Their strategy is really new for this market since they are not going to sell the sequencers but only the sequencing service, and they seem to be ready to penetrate the market starting from 2011 (since the genomes for 2010 are already sold). Their error rate seems very good too: accuracy of 1 false variant per 100 kilobases.

  6. Yeah, Complete Genomics is definitely worth keeping an eye on. As well as the obvious fact that their business model is very different, I also know that they are working on some interesting chemistry-side (as opposed to software-side) technologies to make mapping and assembly much more accurate, which, if they get them working well, could overcome many of the disadvantages of the 2nd gen technology.

    If I was going to place a bet on who wins the Archon X prize, it’d be Complete Genomics.

  7. Pingback: Nibbles: Sequencing, Agricultural origins, Mating systems, Tomato shelf-life, Beer vs Tea, Soy, Carrot, Seed processing, Screw-pine, Yams, Salicornia

  8. Pingback: Science Report » Blog Archive » Nibbles: Sequencing, Agricultural origins, Mating systems, Tomato shelf-life, Beer vs Tea, Soy, Carrot, Seed processing, Screw-pine, Yams, Salicornia, Pollinators

  9. Like you state in your article, I suspect Next-gen sequencing will continue to be around after the release of Third-gen sequencing platforms. Despite these innovative developments, Sanger sequencing with capillary systems is still widely popular in research. Not everyone needs to sequence an entire genome when a 5 kb gene is what is being researched. Plus, the ability to sequence longer fragments also gives rise to more mistakes. At least for a while, Next-gen and Third-gen platforms will likely complement each other.
