Tag Archives: sequencing

Off to AGBT

Tomorrow morning I head off to the Advances in Genome Biology and Technology conference (AGBT for short) on Marco Island, Florida; as someone who loves the cold and hates warm places, this is not as exciting for me as you may think. One thing that is exciting, however, is the nice genomics blog presence at the conference; me, Daniel MacArthur, Anthony Fejes, Dan Koboldt and David Dooling will all be playing our parts as Ambassadors to the Blogosphere. Interestingly, assuming independence, there is a 68% chance that at least one of us will get eaten by an alligator; watch this space!

The high blog coverage is justified; we are expecting to get a feel for how the field of DNA sequencing tech will advance over the next year. I will be particularly interested in seeing what Complete Genomics have to report, as well as 3rd gen sequencing presentations from Pacific Biosciences and Life Technologies (ABI). One group notable for its lack of a presentation is Oxford Nanopore, which is a shame; I’m sure Nanopore will be talked about plenty anyway.

I have recently got a brand new laptop to replace the brand new laptop that I lost at ASHG last year, and I’m going to keep up the same schedule I did then; a daily blog post summing up the day’s highlights, and more detailed, up-to-the-minute coverage of every talk I see on my twitter feed (@lukejostins). For more AGBT twittering, I think people are going to be using the hashtag #AGBT.

As is traditional when I go away, I will also be sending a daily e-mail with amusing things that have occurred at the conference, but that is promised to my girlfriend Hannah, so you will, alas, not get to read it.

The Future of Second Generation Sequencing

Illumina, the major player in high-throughput sequencing these days, have announced the newest version of their second generation sequencing platform, the HiSeq 2000. The machine can produce a lot more sequence, and at lower cost, than the previous Genome Analyzer II.

I’m not going into much detail about the machine: for that, see posts at Genomics Law Report, Genome Web, Genetic Future, Pathogenomics and PolITiGenomics. What I really care about is what this machine implies for the future of sequencing, and specifically what we can predict about the coming 2nd versus 3rd generation sequencing battles that will be kicking off later this year.

PacBio’s 3rd generation machine, which will be arriving later this year, will have an initial throughput of around 3 Gb a day, at a price of around $1.4 per Mb in consumable costs. I don’t know the specs for Oxford Nanopore’s machine; my guess is that it will be similar, but we’ll know soon.

Compare PacBio’s capacity to the HiSeq 2000, which will produce 25 Gb per day, at a claimed consumables price of $0.11 per Mb ($10 000 for a 30X genome). In short, the Illumina 2nd gen machine is going to be able to pump out much more sequence, at a much lower price, than PacBio. Both companies will rapidly increase the power of their machines after release, but we don’t know who will push faster (Dave Dooling thinks Illumina could push the HiSeq to 450 Gb per run with existing technology).
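To make the comparison concrete, here is a rough back-of-the-envelope sketch in Python using only the throughput and price figures quoted above. The one thing I have to assume is the genome size: I take a human genome to be about 3 Gb, so a 30X genome is about 90 Gb of raw sequence. The function and variable names are just for illustration, and this ignores everything except consumables and raw daily throughput.

```python
# Back-of-the-envelope comparison using the figures quoted in the post.
# Assumption (not stated in the post): a human genome of ~3 Gb,
# so 30X coverage is ~90 Gb of raw sequence.

GENOME_GB = 3.0    # assumed human genome size, in Gb
COVERAGE = 30      # a 30X genome, as in the post
MB_PER_GB = 1000

# name: (throughput in Gb per day, consumables cost in $ per Mb)
PLATFORMS = {
    "HiSeq 2000": (25.0, 0.11),
    "PacBio (initial)": (3.0, 1.40),
}


def genome_estimate(gb_per_day, usd_per_mb, genome_gb=GENOME_GB, coverage=COVERAGE):
    """Return (days of machine time, consumables cost in $) for one genome."""
    total_gb = genome_gb * coverage
    days = total_gb / gb_per_day
    cost = total_gb * MB_PER_GB * usd_per_mb
    return days, cost


if __name__ == "__main__":
    for name, (gb_per_day, usd_per_mb) in PLATFORMS.items():
        days, cost = genome_estimate(gb_per_day, usd_per_mb)
        print(f"{name}: {days:.1f} days, ${cost:,.0f} in consumables")
```

Under that 3 Gb assumption, the HiSeq comes out at about 3.6 days and $9,900 per 30X genome, which matches the roughly $10 000 claimed, while PacBio at its initial specs would need about 30 days and $126,000.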

Of course, the competition isn’t just based on pure throughput. Read length and error rates are also important; the 3rd gen machines will also have much longer read lengths than Illumina and SOLiD, and we expect that the quality of sequence will be higher as well, giving the possibility of some real Gold Standard genomes being produced from these machines, rather than the somewhat messy genomes we get from Illumina.

This all ties in to the conversation I had with the Illumina people at ASHG; Illumina think that it’ll be a good few years before 3rd Gen sequencing can catch up with their current machines. I expect that, between now and 2014 (when PacBio release v2 of their machine), the major sequencing centres will keep a combination of 2nd and 3rd gen machines. The 2nd gen machines will be used when a very large amount of low-quality sequence is required, such as for Genome-Wide Association Studies or RNA-seq. The 3rd gen machines will be used for assembling genomes, looking for copy-number variations and studying the genetics and epigenetics of non-coding and repetitive regions.

I guess what I’m trying to say is that, as exciting and cool as the single-molecule technologies of PacBio and Oxford Nanopore are, it is far too soon to announce the death of Second Gen sequencing. If Illumina continues to push its throughput as hard as it is doing now, 2nd generation machines will be widely used for a long while yet.

The future will become a bit clearer at the AGBT conference, where we should see some big announcements from PacBio, Oxford Nanopore, Complete Genomics, ABI and Illumina. Me and a host of other bloggers will be there to cover them.

ASHG: Quantifying Relatedness and Active Subjects in Genome Research

Well, the American Society of Human Genetics Annual Meeting is coming to a close for another year. My talk is done and dusted, so I no longer have to lie awake at night worrying that I will forget everything other than the words to “Stand By Your Man” when confronted by the crowd. My white suit is now more of an off-white suit, with regions of very-off-white and pretty-much-entirely-out-of-sight-of-white. I’m looking forward to getting back home to catch up on my sleep.

For the last time, I’m going to give a little summary of talks today that I thought were interesting, or gave some indication of where genetics may be heading in the future. I will write up some more general thoughts about the meeting in the next few days, as soon as the traveling is out of the way and my mind has recharged.

If you would like some second opinions on the conference, GenomeWeb has a number of articles, including a couple of short summaries, as well as a nice mid-length article about the 1000 Genomes session; there are also a number of articles over at In The Field, the Nature network conference blog.

ASHG: Statistical Genomics and Beyond GWAS in Complex Disease

The second day of the American Society of Human Genetics Annual Meeting is drawing to a close; here’s a lowdown of what talks I’ve enjoyed today.

Remember, follow @lukejostins on Twitter if you want more up-to-the-minute details on the ASHG talks.

ASHG: Chatting with the Sequencing People

While I am here, I thought I’d take the chance to chat to the people at the booths for the three major Second Gen sequencing platforms (Illumina, SOLiD and 454). It was surprisingly fun; the guys I talked to all seemed enthusiastic, and it was nice to find out where the scientists in the companies think the technology is going.

In the interests of openness: the 454 booth gave me a cool T-shirt and poster, so this may well have biased my opinion of them.

Recombination in the X and Y Chromosomes

Rosser, Z., Balaresque, P., & Jobling, M. (2009). Gene Conversion between the X Chromosome and the Male-Specific Region of the Y Chromosome at a Translocation Hotspot. The American Journal of Human Genetics, 85(1), 130-134. DOI: 10.1016/j.ajhg.2009.06.009


There is a new paper out in the American Journal of Human Genetics about how the X and Y chromosomes might not be as separate as we think, and in fact might undergo regular recombination in certain regions (you can read a press release for the paper here).

Specifically, the paper is a resequencing study of the X and Y chromosome homologues PRKX and PRKY in around 60 individuals, looking for signatures of recombination. In summary, it is an interesting and well-supported paper as far as it goes, but it raises more questions about Y chromosome evolution than it answers.

Books for Bioinformatics Beginners

Olaf left a comment asking about what books a mathematically competent and generally informed non-geneticist can read to learn about modern genetics. As he notes, there tends to be a bit of a lack of books that assume you know the basics, but do not assume you have an undergrad degree. You tend to find things that are either of the form “this is Mr Gene, he makes proteins!”, or of the form “a non-Bayesian could infer with certainty an inversion-deletion event had caused this ribosomal disruption, so attached are they to their bootstrapped pseudo-statistics!”.

This sort of request also tends to come from the very large number of undergrads trained in genetics in some classical sense (a mixture of population and functional genetics) who want to get a general understanding of this whole Modern Genomics phenomenon that basically all of genetics is at least partly involved in these days.

Basics: Sequencing DNA, Part 2

This post follows on from my previous post on Sanger sequencing, and is part of an ongoing series that looks at how we take DNA, hidden away in our cell nuclei, and read the sequence of base pairs that make up our genetic code. In this post, we look at the Second Generation Sequencing machines that are currently sequencing thousands of genomes-worth of DNA per year throughout the world.

The Genome Campus is a Mac?

Catching up on my RSS feeds, I came across a post at PolITiGenomics, about the European Bioinformatics Institute’s Paul Flicek taking part in one of those ‘I am a person of significance, I use a Mac’ videos:

First the most important bits. At 0:06, THAT’S MY COLLEGE! And at 0:25, THAT’S THE BUILDING I WORK IN! And at 2:24, I EAT THERE! How exciting.

Basics: Sequencing DNA, Part 1

For an embarrassingly long time, I had very little idea how we read people’s DNA. We deal with DNA sequence so often, and use it for such a plethora of things, that if I thought about it at all, my thoughts would have been something along the lines of “er, well, you just, you know, sequence it, right? Run it out on a gel, or, er, something”. I remember years ago admitting this ignorance to a friend, who said “Oh, they have machines that do it”; this response is both reassuring and terrifying. Anyway, I finally rectified my ignorance (about the time I read the book Genomes 3, which filled in a lot of the blanks on the molecular side of my subject); it is actually a pretty fascinating topic, and also a pretty important one, since progress in sequencing technology drives progress in much of genetics generally. So, I thought I’d dedicate a series of posts to sequencing.

While it seems so simple, sequencing DNA is a pretty major challenge. If you ever hold DNA in your hands, it basically takes the form of a long-chained acid dissolved in water. If you want to know something about it you can dye it and run it out on a gel, to see how large the molecules are (large molecules run more slowly through a gel, so you can tell how big the molecule is by how far it moves), but doing much beyond that requires quite a bit of thinking.