Let us limit the task facing our monkey somewhat. Suppose that he has to produce, not the complete works of Shakespeare but just the short sentence 'Methinks it is like a weasel', and we shall make it relatively easy by giving him a typewriter with a restricted keyboard, one with just the 26 (capital) letters, and a space bar. How long will he take to write this one little sentence?
Hamlet: Do you see yonder cloud that's almost in shape of a camel?
Polonius: By the mass, and 'tis like a camel, indeed.
Hamlet: Methinks it is like a weasel.
The scenario is staged to produce a string of gibberish letters, assuming that the selection of each letter in a sequence of 28
characters will be random. The number of possible combinations in this random sequence is 27^28, or about 10^40,
so the probability that the monkey will produce a given sequence is extremely low.
A computer program could be written to carry out the actions of Dawkins's hypothetical monkey, continuously generating combinations
of 26 letters and spaces at high speed. Even at a rate of millions of combinations per second, the program would be unlikely to
produce the phrase "METHINKS IT IS LIKE A WEASEL", even if it ran for the entire lifetime of the universe.
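The arithmetic behind this estimate can be checked with a short calculation. The sketch below is illustrative only: the rate of one million attempts per second and the age of the universe (taken as roughly 13.8 billion years) are assumed figures, not values given in the text.

```python
ALPHABET_SIZE = 27   # 26 capital letters plus the space bar
PHRASE_LENGTH = 28   # length of "METHINKS IT IS LIKE A WEASEL"

# Number of distinct 28-character sequences over a 27-symbol alphabet.
total_sequences = ALPHABET_SIZE ** PHRASE_LENGTH

# Chance that a single random attempt matches one given sequence.
probability = 1 / total_sequences

# Assumed figures for illustration (not taken from the text).
ATTEMPTS_PER_SECOND = 10**6
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10

# Time needed to try every sequence once at the assumed rate.
years_to_try_all = total_sequences / ATTEMPTS_PER_SECOND / SECONDS_PER_YEAR

print(f"possible sequences:  {total_sequences:.3e}")
print(f"chance per attempt:  {probability:.3e}")
print(f"years to try all:    {years_to_try_all:.3e}")
print(f"in ages of universe: {years_to_try_all / AGE_OF_UNIVERSE_YEARS:.3e}")
```

Under these assumptions, trying every sequence once would take many orders of magnitude longer than the age of the universe.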
Dawkins intends this example to illustrate a common misunderstanding of evolutionary change, i.e. that DNA sequences or organic compounds such as proteins are the result of atoms randomly combining to form more complex structures. In these types of computations, any sequence of amino acids in a protein will be extraordinarily improbable (this is known as Hoyle's fallacy). Rather, evolution proceeds by hill climbing, as in adaptive landscapes. Dawkins then goes on to show that a process of cumulative selection can take far fewer steps to reach any given target.
We again use our computer monkey, but with a crucial difference in its program. It again begins by choosing a random sequence of 28 letters, just as before ... it duplicates it repeatedly, but with a certain chance of random error – 'mutation' – in the copying. The computer examines the mutant nonsense phrases, the 'progeny' of the original phrase, and chooses the one which, however slightly, most resembles the target phrase, METHINKS IT IS LIKE A WEASEL.
By repeating the procedure, a randomly generated sequence of 28 letters and spaces will be gradually changed each generation. The sequences progress through each generation:
Generation 01: WDLTMNLT DTJBKWIRZREZLMQCO P
Generation 02: WDLTMNLT DTJBSWIRZREZLMQCO P
Generation 10: MDLDMNLS ITJISWHRZREZ MECS P
Generation 20: MELDINLS IT ISWPRKE Z WECSEL
Generation 30: METHINGS IT ISWLIKE B WECSEL
Generation 40: METHINKS IT IS LIKE I WEASEL
Generation 43: METHINKS IT IS LIKE A WEASEL
The program aims to demonstrate that the preservation of small changes in an evolving string of characters (or genes) can produce
meaningful combinations in a relatively short time as long as there is some mechanism to select cumulative changes, whether it is
a person identifying which traits are desirable (in the case of artificial selection) or a criterion of survival ("fitness") imposed
by the environment (in the case of natural selection). Reproducing systems tend to preserve traits across generations, because the
offspring inherit a copy of the parent's traits. It is the differences between offspring, the variations in copying, which become
the basis for selection, allowing phrases closer to the target to survive, and the remaining variants to "die."
Although Dawkins did not provide the source code for his program, a "Weasel" style algorithm could run as follows.
1. Start with a random string of 28 characters.
2. Make 100 copies of this string, with a 5% chance per character of that character being replaced with a random character.
3. Compare each new string with the target "METHINKS IT IS LIKE A WEASEL", and give each a score (the number of letters in the string that are correct and in the correct position).
4. If any of the new strings has a perfect score (28), halt. Otherwise, take the highest scoring string, and go to step 2.
For these purposes, a "character" is any uppercase letter, or a space. The number of copies per generation and the chance of mutation per letter are not specified in Dawkins's book; 100 copies and a 5% mutation rate are examples. Correct letters are not "locked": each correct letter may become incorrect in subsequent generations. The terms of the program and the existence of the target phrase do, however, mean that such 'negative mutations' will quickly be 'corrected'.
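The steps above might be sketched in Python as follows. This is one possible reading of the description, not Dawkins's original program, whose source was not published. The function names (`score`, `mutate`, `weasel`), the `seed` parameter and the `max_generations` safety limit are assumptions introduced for illustration, and the 100 copies and 5% mutation rate are the example values given above.

```python
import random
import string

TARGET = "METHINKS IT IS LIKE A WEASEL"
ALPHABET = string.ascii_uppercase + " "  # 26 capital letters plus space
COPIES = 100          # example value; not specified in Dawkins's book
MUTATION_RATE = 0.05  # example value; not specified in Dawkins's book


def score(candidate, target=TARGET):
    """Count characters that are correct and in the correct position."""
    return sum(c == t for c, t in zip(candidate, target))


def mutate(parent, rng, rate=MUTATION_RATE):
    """Copy the parent, replacing each character with a random one at the given rate."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else ch for ch in parent
    )


def weasel(seed=None, max_generations=100_000):
    """Run the algorithm; return (generations taken, final string)."""
    rng = random.Random(seed)
    # Step 1: start with a random string of 28 characters.
    parent = "".join(rng.choice(ALPHABET) for _ in TARGET)
    for generation in range(1, max_generations + 1):
        # Step 2: make mutated copies of the current string.
        offspring = [mutate(parent, rng) for _ in range(COPIES)]
        # Step 3: score each copy and keep the best one.
        parent = max(offspring, key=score)
        # Step 4: halt on a perfect score, otherwise repeat.
        if parent == TARGET:
            return generation, parent
    raise RuntimeError("target not reached within max_generations")


if __name__ == "__main__":
    generations, result = weasel()
    print(f"Reached {result!r} after {generations} generations")
```

Because the parent is not carried over into the next generation, and every character of every copy can mutate, correct letters are not locked and the best score can drop from one generation to the next.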