On the UK’s DNA Database, Part 1

This is the first part of a double post on the UK National DNA Database.

The newspapers have been flaring up over the issue of the National DNA Database (NDNAD) over the last week. The NDNAD, which is the largest such database in the world, was denounced by the European Court of Human Rights as unjustified, as it holds information on innocent people and routinely uses it to investigate crimes. The government proposed certain changes, the most reported of which is the decision to hold innocent people’s DNA for only 6-12 years. Liberty’s Shami Chakrabarti denounced the policy, saying “wholly innocent people – including children – will have their most intimate details stockpiled for years”.

The blogosphere has also been making interesting noises about this: Iain Brassington at the BMJ’s Journal of Medical Ethics blog posts about the ethical problems with the database, and over at Liberal Conspiracy Denny de la Haye talks about how the government’s proposed policy changes fail to address the issues raised by the Court of Human Rights.

I thought it might be worth researching exactly what information is held on the NDNAD, and what this information could be used for. This post turned out to be pretty long, so I’ve split it up into two posts: this first one asks exactly what the genetic profiling involves, and what information is recorded. The second post, which I will put up on Friday, asks how the information is used currently, and what it could potentially be used for in the wrong hands.

What information do they hold?

The NDNAD currently holds information on anyone who has been arrested on suspicion of a crime; I expect a relatively large number of those crimes are filed as ‘loitering with an intention of being black’, since about 40% of black UK nationals are on the DNA register. As of 2006, over 4 million people were on the database - extrapolating, there are probably around 5 million entries by now. The other side of the equation is that samples from crime investigations (blood and hair from crime scenes, vaginal swabs from rape victims and so on) are taken, profiled and added to the database.

Each entry contains the individual’s name, sex, date of birth and ethnicity, as well as the associated genetic data. As an aside, I think that keeping this demographic data, for innocent people, on a criminal database is as invasive as keeping their DNA information. What is potentially more worrying is that the companies that do the testing keep the physical sample in storage, at the behest of the NDNAD, where it can be reanalysed at any time.

The new proposals limit the length of time that someone’s DNA can be stored if they are not convicted of a crime to 6 years for minor crimes and 12 for serious crimes (this approach of treating people who are innocent of a minor crime differently to those who are innocent of a major crime seems a little contradictory to legal due process). What I think is a more important decision, which the media and the blogosphere seem not to have picked up on, is that samples will now be destroyed after they have been profiled; this ensures that the police cannot just decide to do a deep genotyping scan on every suspected criminal on a whim.

What is the ‘Genetic Profile’?

Exactly what genetic information makes up the genetic profiles is not widely publicised, but according to a Parliamentary report from 2005 the information in the database comes from the SGM+ system. SGM+ stands for Second Generation Multiplex Plus, and refers to the AmpFlSTR SGM Plus PCR Amplification Kit, produced by Applied Biosystems (ABI). This kit works by looking at Short Tandem Repeats (STRs); these are regions of DNA that consist of a short pattern (like CAGC) repeated over and over again. Because this pattern confuses the DNA replication machinery, these sections are prone to growing and shrinking as bits are accidentally put in or chopped out; as a result, the length of the STR will vary a lot between people. Using PCR, we can amplify the bit of DNA and find out what size it is; if we do this for a number of different sites, we get an individual-specific profile: a set of STR lengths that is, for practical purposes, unique to that individual.
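To make the idea of an STR length concrete, here is a minimal Python sketch that counts how many times a motif is repeated back-to-back in a stretch of sequence. The sequences, the flanking bases and the function name are invented for illustration; as far as I understand it, the real kit measures the size of the amplified fragment rather than reading the sequence letter by letter, so treat this as a toy model and not a description of the lab process.

```python
import re


def longest_tandem_repeat(sequence: str, motif: str) -> int:
    """Return the largest number of back-to-back copies of `motif` in `sequence`."""
    # Find every unbroken run of the motif, e.g. CAGCCAGCCAGC.
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    # Convert each run's length in bases into a repeat count.
    return max((len(run) // len(motif) for run in runs), default=0)


# Invented sequences: the same flanking DNA, but a different number of repeats.
person_a = "TTGA" + "CAGC" * 7 + "GGAT"
person_b = "TTGA" + "CAGC" * 11 + "GGAT"

repeats_a = longest_tandem_repeat(person_a, "CAGC")
repeats_b = longest_tandem_repeat(person_b, "CAGC")
print(repeats_a, repeats_b)
```

The two made-up people differ only in how many copies of CAGC they carry, which is exactly the kind of difference an STR profile records.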

SGM+ uses 10 such sites, each one on a different chromosome, as well as looking at the X/Y chromosome gene Amelogenin to identify sex. The first six sites were developed by the UK’s Forensic Science Service, and were validated in a published journal paper. These consist of the sites D8S1179, D18S51, D21S11, FGA, TH01 and vWA. The additional four sites were developed by ABI, and consist of D2S1338, D3S1358, D16S539 and D19S433.
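As a rough sketch of what such a profile might look like as data, the Python below stores two repeat counts per site (one inherited from each parent) plus the Amelogenin result, and compares two profiles. The layout, the function names and all the allele values are my own assumptions for illustration; I don't know how the NDNAD actually stores or matches its records.

```python
# The ten SGM+ sites named above.
SGM_PLUS_LOCI = (
    "D8S1179", "D18S51", "D21S11", "FGA", "TH01", "vWA",
    "D2S1338", "D3S1358", "D16S539", "D19S433",
)


def make_profile(sex_marker, alleles):
    """Build a hypothetical profile: a sex marker plus two alleles per locus."""
    missing = set(SGM_PLUS_LOCI) - set(alleles)
    if missing:
        raise ValueError(f"missing loci: {sorted(missing)}")
    # Sort each pair so that (12, 10) and (10, 12) compare as equal.
    loci = {locus: tuple(sorted(alleles[locus])) for locus in SGM_PLUS_LOCI}
    return {"amelogenin": sex_marker, "loci": loci}


def full_match(a, b):
    """True if the sex marker and both alleles at every locus agree."""
    return a["amelogenin"] == b["amelogenin"] and a["loci"] == b["loci"]


def loci_sharing_an_allele(a, b):
    """Count loci where the two profiles have at least one allele in common."""
    return sum(
        1 for locus in SGM_PLUS_LOCI
        if set(a["loci"][locus]) & set(b["loci"][locus])
    )


# Invented allele values, purely for illustration.
suspect = make_profile("XY", {l: (10 + i, 12 + i) for i, l in enumerate(SGM_PLUS_LOCI)})
scene = make_profile("XY", {l: (12 + i, 10 + i) for i, l in enumerate(SGM_PLUS_LOCI)})
other = make_profile("XX", {l: (10 + i, 15 + i) for i, l in enumerate(SGM_PLUS_LOCI)})
```

Here `suspect` and `scene` are a full match, while `other` shares one allele at every site without matching; heavy allele sharing of that kind is the sort of pattern you would expect between close relatives.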

When all 10 sites are taken into account, there is virtually zero chance that two individuals will have the same length profile (the manufacturers make the claim of 1 in three trillion, though the working probability used by the police is 1 in 1 billion). Even for two siblings, there is only a 1 in 1,000 chance that they have the same profile. This information can also be used to tell if two individuals are related.
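Numbers this small usually come from multiplying together the chance of a coincidental match at each site, which is only valid if the sites are inherited independently (sitting on different chromosomes helps here). The sketch below shows the arithmetic in Python; the per-site figure of 0.06 is invented purely to show how quickly the product shrinks, and is not a real population frequency or the manufacturer's calculation.

```python
import math


def random_match_probability(per_locus_probabilities):
    """Multiply per-locus match probabilities, assuming the loci are independent."""
    return math.prod(per_locus_probabilities)


# Hypothetical: a 6% chance of a coincidental match at each of the 10 sites.
hypothetical_per_locus = [0.06] * 10
rmp = random_match_probability(hypothetical_per_locus)

print(f"about 1 in {1 / rmp:,.0f}")
```

With these made-up inputs the combined figure comes out at somewhere between one and two trillion to one, the same ballpark as the manufacturer's claim, which shows how ten individually unremarkable sites add up to a very discriminating profile.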

In a 2007 policy bulletin, the Royal Society of Chemistry reported that the statistical tests are performed in a way that minimises the chance of false positives, and as a result have a lower detection rate than they otherwise could. They stated that sometime in 2008/2009, the system would be upgraded to include 13-15 markers, which would decrease this effect.

Next Time

In the next post, which is slightly longer but less information-dense, I will talk about how the police use the NDNAD, what problems can arise, and what the more unpleasant uses of the NDNAD could be in the wrong hands.

2 Responses to On the UK’s DNA Database, Part 1

  1. One thing that piques my interest is how the database might handle chimeric individuals. I’m guessing trisomies and other irregularities wouldn’t have such a noticeable effect.

  2. So, for things like trisomy you’d probably just get too many peaks, and the reading would be dismissed (filed as a Partial Profile).

    Chimerism is more tricky; by default, they don’t test for it (as they only take one tissue type), so only one genotype would go on the database: that does leave the potential for false negatives. If they did get a confirmed chimera, I doubt they’d have a way of storing that on the database without just creating one entry for each profile.

    However, I’d have thought that chimerism is so rare that it isn’t worth having a specific plan for.