You may be know, for my Day Job I am a Ph.D. student at the Sanger Institute just outside Cambridge, doing lots of exciting things with genomes and the like. However, as you may not know (but, to be honest, are just as likely to as not), as my Thing On The Side I study evolutionary algorithms with my previous supervisor Yogi Jaeger. In my first Fancy Perk of being a full-time researcher, I will be going to Amsterdam later this month to give a talk at the University of Amsterdam on a comparison between two algorithms.
Now I think the details of the comparison are actually pretty interesting to a lay-person, especially because they lead to some idle speculation about the nature of evolutionary forces. Now, Yogi generally doesn’t like me speculating about such things, since he thinks it isn’t rigorous, and he points out that lots of people far smarter than me spend their time doing advanced theoretical studies about the constraints and capacities of the evolutionary process. However, I know that you, my most kind blogventurer, will not raise such objections. And thus, once again I will subject you to my idle musing.
Naturally-Inspired Computing
The problem I work on is inferring parameters in a model (do not fear, an explanation of what that means is coming just… about… NOW!). Imagine we have a model that makes predictions (for example, it may predict how a disease will spread through the population), which depends on things that we do not yet know, which we call parameters (such as the probability of a healthy person getting the disease if they meet an ill person). We have some data (like the number of people who got ill at various times), and the aim is to figure out the values of the parameters that make the model predict the data as well as possible. Now, sometimes you can calculate this efficiently, but if you have a lot of parameters you will often just have to try lots of different combinations of parameters until you figure out one that works. This can be a pain, and so we use ‘optimization algorithms’, which try and help us search for the best parameters as easily as possible.
The two optimization algorithms I am comparing are both Naturally Inspired. Naturally Inspired algorithms come about when we have a difficult problem, and instead of just sitting around Thinking Real Hard, we look to see whether nature ever has to solve that particular problem. If she has, we can copy her answer (scientists are good at looking at the world and figuring out what it is doing, and if we can substitute that process for actually thinking things up from scratch, we will).
I won’t bore you with too much detail for the first algorithm, which is called Simulated Annealing. As I am sure you could infer (given the mental capabilities common to all blogventurers), it mimics real annealing, a process that allows the molecules in a metal to find their most stable configuration by heating them up and then slowly cooling them. In our case, we define a more stable configuration of parameters as a set of values that give good predictions, then move them around in a similar way to the molecules in a metal.
The second algorithm is an Evolutionary Strategy, and it mimics (all together now) evolution. Specifically, we create a bunch of virtual individuals, each of which starts with a random bunch of parameters. We let these individuals mutate (by randomly changing the parameters) and breed to produce offspring (which have an average of their parents parameters). After putting them in a room together and politely waiting outside for them to do their thing (or turning down the lights and playing sleazy funk, as I must confess to, on occasion, doing), we come in and we rank the individuals according to how well their parameters make predictions; we select the best individuals, and use them to seed a new generation. We keep doing this until we find a solution that we think is good enough.
This is the point in the blog where I point out a potential confusion, in order to avoid promoting misunderstanding in my trusting reader. I do not, myself, sit down and oversee the love life of my virtual wards, and I do not loving rate and select the best amongst them (or equally lovingly cull the worst). I have produced computer code that will do this for me, such that in the best tradition of our modern world, I may let the machines tend my herds, while I sip wines of varying quality.
Evolutionary Flexibility
Where this gets interesting is when you attempt to run these algorithms on multiple computers (it is generally easier to buy lots of slower computers than one really fast one, so if possible we’d rather have programs run on multiple networked computers that talk to each other). The Simulated Annealing algorithm had to be entirely rewritten so that it will still run on multiple computers, and with clever ways of exchanging information between computers so that the different computers didn’t just each go off and do their own thing.
In contrast, the Evolutionary Strategy has a natural method of splitting up the algorithm. Real species of evolving organisms exist in many populations that only rarely mix, and this does not seem to prevent them from evolving. In fact, it is often helpful, and previous versions of the Evolutionary Strategy have used separate populations on one computer to keep up diversity, with individuals occasionally being moved randomly between populations. My masters thesis involved carefully splitting up such an algorithm so that a number of different computers, each with it’s own population, could act the same as lots of populations on one computer (which involved having faster computers wait for he slower ones, so as to avoid running off ahead).
As it happens, I needn’t have bothered being so careful. A few months ago I re-wrote the algorithm such that each computer ignored what the other computers were doing, and sent migrants out to other computers every few generations. This broke down the symmetry between populations, meaning that populations on different computers would act differently to each other, and would mean that populations on faster computers would send out more migrants. I did not know for sure that the algorithm would still work as well. As it happens, it worked just fine, and as faster computers didn’t have to wait for slower ones, it was quite a bit faster.
The point about that is that the evolutionary approach easily ran in parallel, despite it being so hard for the Simulated Annealing to do the same. The evolutionary algorithm was more flexible as well; it didn’t matter if you changed how the populations were set up, or how they interacted, it still worked fine. An interesting example of this is a cool program from the ’80s, called ASPARAGOS, which ran on dozens of little transputer chips connected together in a ring, with individuals migrating between adjacent chips. It worked a treat, despite it’s odd implementation.
This flexibility seems surprising to me at first glance, but you may well have already seen why it shouldn’t. Adaptive evolution has produced results throughout evolutionary time, regardless of the conditions it has been placed in. Clusters of island birds evolve perfectly well, as do tightly packed grasslands, widely-dispersed tropical trees and massive bacterial colonies. A significant amount of work goes into trying to stop evolution from finding new ways of getting on with the business of living (especially if that living is going on inside of us, c.f. antibiotic resistance), and even in these cases we know that out best efforts can only slow down the inevitable.
Evolution is not like any other natural process. It does not require specific conditions to work; all it needs is a method of generating new changes with a certain complexity (generally a combination of mutation and breeding, or other genetic transfer), and selection pressures. Stars might only form in galactic dust clouds with the right density, and liquid water only within a tiny temperature and pressure range, but life needs no such fine-tuning (at least not after it has got going). As long as evolution acts, life will find a way.








Interesting! I’d heard of evolution modelling algorithms before, but didn’t really have a huge idea of what they were doing. Have fun in Amsterdam.
It isn’t really an evolution modelling algorithm; it’s purpose isn’t really to model evolution, but to find parameters.
There are programs that evolve for the sake of studying evolution, rather than evolve in order to solve problems. The big ones are Tierra, in which individuals run as programs on a computer that compete for CPU time and memory, and Avida, in which virtual computers compete with each other.