Just a quick note. Nick Loman notes that he intends to use material from my Basics: Sequencing series in his undergrad lectures. That is pretty awesome, and I feel an urge to reciprocate by using one of the things he’s blogged about, but given that I teach mathematics on a blackboard, I’m not entirely sure how to do so*.
To clarify, the images and material in those posts, and indeed everything written in this blog, can be used freely for any purpose. I would like it if you would provide a link back here, or credit me verbally, but that is by no means required.
* Ohh ooh I’ve got one, a question for my first year Elementary Mathematics for Biologists students:
Question 1
The Sanger Centre owns 42 sequencing machines, of which 2 are 454 and 40 are Illumina. Throughout the rest of the UK, there are 12 Illumina machines, 9 454s, and 3 SOLiDs (1). Perform a chi-squared test of independence to see whether the Sanger Centre has significantly different purchasing priorities than the rest of the UK. Is this test valid in this instance?
(1) According to data found at http://pathogenomics.bham.ac.uk/blog/2009/08/sequencing-in-the-u-k/
Answer 1:
The contingency table is:
|     | ILMN | 454 | SLD | TOT |
|-----|------|-----|-----|-----|
| SC  | 40   | 2   | 0   | 42  |
| UK  | 12   | 9   | 3   | 24  |
| TOT | 52   | 11  | 3   | 66  |
The expected values are thus:
|    | ILMN | 454 | SLD |
|----|------|-----|-----|
| SC | 33.1 | 7   | 1.9 |
| UK | 18.9 | 4   | 1.1 |
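For anyone who wants to check the arithmetic, here is a minimal Python sketch of how the expected values come about: each one is (row total × column total) / grand total. The variable names are my own, for illustration only.

```python
# Observed counts (rows: SC, UK; columns: ILMN, 454, SLD).
observed = [
    [40, 2, 0],
    [12, 9, 3],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected count under independence: row total * column total / grand total.
expected = [
    [r * c / grand_total for c in col_totals]
    for r in row_totals
]

for label, row in zip(["SC", "UK"], expected):
    print(label, [round(value, 1) for value in row])
```

Rounded to one decimal place, the printed rows should agree with the expected values given here.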
The chi-squared score is thus ~19.04, with df = (2 - 1)(3 - 1) = 2. This is larger than the 95% critical value of 5.99 for df = 2, so the test rejects independence at the 5% level.
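The score can be reproduced with a short Python sketch that sums (observed - expected)² / expected over all six cells. The shortcut used for the critical value and p-value is an assumption that only works here: for exactly 2 degrees of freedom the chi-squared upper tail is exp(-x/2). For any other df you would need a table or a statistics library.

```python
import math

# Observed counts (rows: SC, UK; columns: ILMN, 454, SLD).
observed = [
    [40, 2, 0],
    [12, 9, 3],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Sum (O - E)^2 / E over every cell, with E = row total * column total / grand total.
chi_squared = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_squared += (o - e) ** 2 / e

df = (len(row_totals) - 1) * (len(col_totals) - 1)

# Valid for df == 2 only: P(X > x) = exp(-x / 2).
critical_value = -2 * math.log(0.05)
p_value = math.exp(-chi_squared / 2)

print(f"chi-squared = {chi_squared:.2f}, df = {df}")
print(f"95% critical value = {critical_value:.2f}")
print(f"reject independence: {chi_squared > critical_value}")
```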
This test is not valid in this case, for two reasons. Firstly, the expected values are very low (three of the six are below 5), and thus the normal approximation behind the chi-squared test is unlikely to hold; we should instead use Fisher’s exact test. Secondly, each purchase of a sequencing machine is not independent of the result of the last purchase; you are more likely to buy the same machine again, since you have invested in equipment, software and training for that type of machine.
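As a rough sketch of what the exact test looks like for a 2×3 table, the Python below enumerates every table with the same row and column totals and adds up the probabilities of those no more likely than the observed one. This is the Freeman-Halton extension of Fisher's exact test, which is my assumption about the appropriate generalisation beyond 2×2. It addresses only the first objection; the non-independence of purchases is not fixed by any choice of test.

```python
from math import comb

# Observed counts (rows: SC, UK; columns: ILMN, 454, SLD).
observed = [
    [40, 2, 0],
    [12, 9, 3],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)


def table_probability(first_row):
    """Multivariate hypergeometric probability of a table with fixed margins.

    The first row determines the whole table, since the column totals are fixed.
    """
    numerator = 1
    for col_total, count in zip(col_totals, first_row):
        numerator *= comb(col_total, count)
    return numerator / comb(grand_total, row_totals[0])


p_observed = table_probability(observed[0])

p_value = 0.0
total_probability = 0.0
for n_454 in range(col_totals[1] + 1):
    for n_solid in range(col_totals[2] + 1):
        n_ilmn = row_totals[0] - n_454 - n_solid
        if not 0 <= n_ilmn <= col_totals[0]:
            continue  # not a valid table for these margins
        p = table_probability((n_ilmn, n_454, n_solid))
        total_probability += p
        # Count tables at least as extreme (no more probable) than the observed one;
        # the small tolerance guards against floating-point ties.
        if p <= p_observed * (1 + 1e-9):
            p_value += p

print(f"exact p-value = {p_value:.2g}")
```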