Evolutionary algorithms in Python

What is context-free grammar?

A grammar is a set of production rules for a language. Context-free grammar is any grammar that is of the form G = (V, \Sigma, R, S) , where V is a finite set of nonterminal characters, \Sigma is a finite set of terminal characters, R is the relation from V to \Sigma, and S is the start symbol.

S\ \to\ I+I,I-I,I*I,I/I
I\ \to\ I+I,I-I,I*I,I/I,(I),T
T\ \to\ x
x\ \to\ 1,2,3,4,5,6,7,8,9

This is what it looks like in code form:

symbols = ['S','I','T','x']
grammar = {
    'S':['I+I','I-I','I*I','I/I'],
    'I':['I+I','I-I','I*I','I/I','(I)'] + ['T']*5,
    'T':['x'],
    'x':[str(float(x)) for x in range(1,10)]
}

I used multiple instances of the symbol T, which is a generic pointer to a terminal symbol. Using just one makes the nonterminal characters so much more likely to be chosen, that the expressions tend to never end.

We can think of an individual instance of resulting language to be the translation of the right-most character at any point. Then, an individual can be defined by a fixed-length list of numbers.

attribute_size = max([len(x) for x in grammar.values()])
individual = [random.randint(0,attribute_size-1) for _ in range(30)]
print(individual)

Running it results in the following output:

[6, 5, 3, 5, 1, 8, 3, 7, 9, 3, 9, 2, 6, 9, 1, 8, 3, 0, 6, 3, 4, 8, 8, 6, 8, 5, 4, 9, 7, 3]

Now we must define a method of converting this list to a string. I had to check for the case that it was done, by storing the string from the loop before and seeing if it has changed. Also note that we can disregard how large the rule numbers can be because we can use the modulo operation to index over the rules.

def express(individual, symbols, grammar):
    expression = symbols[0]
    for branch in individual:
        expression_old = expression
        for symbol in symbols:
            if symbol in expression:
                i = expression.rindex(symbol)
                rules = grammar[symbol]
                replacement = rules[branch % len(rules)]
                expression = expression[:i] + replacement + expression[i+1:]
                break
        if expression == expression_old:
            break
    return expression

Then, our individual from earlier becomes:

7.0/1.0-4.0/9.0*2.0

Evolutionary algorithms using DEAP

Evolutionary algorithms describe the class of algorithms that use simulated evolution for the purpose of optimization. Given a fixed length of attributes, individuals can “breed” with each other using crossover points. After many generations, with an optional mutation factor, we arrive at increasingly optimal populations.

DEAP is a library for Distributed Evolutionary Algorithms in Python. I used their onemax example to write the code below to maximize arithmetic operations. In this case, I only made the individual long enough to result in two numbers, so the answer should be 9\times9=81.

Tracking the population as it evolves, DEAP’s onemax example

optimizers.py

import random, math
random.seed(64)

from deap import base
from deap import creator
from deap import tools

class EvolutionaryCFG:
    '''
    Evolutionary Context-Free Grammar
    '''

    def __init__(self, symbols, grammar, evaluation_function, population_size=20, num_attributes=30):

        self.symbols = symbols
        self.grammar = grammar
        attribute_size = max([len(x) for x in grammar.values()])
        self.evaluation_function = evaluation_function
        
        creator.create('FitnessMax', base.Fitness, weights=(1.0,))
        creator.create('Individual', list, fitness=creator.FitnessMax)

        self.toolbox = base.Toolbox()
        self.toolbox.register('attr_int', random.randint, 0, attribute_size-1)
        self.toolbox.register('individual', tools.initRepeat, creator.Individual, self.toolbox.attr_int, num_attributes)
        self.toolbox.register('population', tools.initRepeat, list, self.toolbox.individual)
        self.toolbox.register('evaluate', self.evaluate)
        self.toolbox.register('mate', tools.cxOnePoint)
        self.toolbox.register('mutate', tools.mutUniformInt, low=0, up=255, indpb=0.05)
        self.toolbox.register('select', tools.selTournament, tournsize=3)

        self.population = self.toolbox.population(n=population_size)
        self.mate_prob = 0.5
        self.mutant_prob = 0.2
        
        # Prevents re-evaluation of the same individuals (assumes determinism)
        self.score_cache = {}

    def express(self, individual):
        expression = 'S'
        for branch in individual:
            expression_old = expression
            for symbol in self.symbols:
                if symbol in expression:
                    i = expression.rindex(symbol)
                    rules = self.grammar[symbol]
                    replacement = rules[branch % len(rules)]
                    expression = expression[:i] + replacement + expression[i+1:]
                    break
            if expression == expression_old:
                break
        return expression
        
    def evaluate(self, individual):
        expression = self.express(individual)
        if expression not in self.score_cache.keys():
            if any([symbol in expression for symbol in self.symbols]):
                score = float('-inf')
            else:
                try:
                    score = self.evaluation_function(expression)
                except:
                    score = float('-inf')
            self.score_cache[expression] = score
        else:
            score = self.score_cache[expression]
        return score,
    
    def evolve(self, generations):
        print('Start of evolution')
        
        # Evaluate the entire population
        print('\tGeneration 0')
        fitnesses = list(map(self.toolbox.evaluate, self.population))
        for ind, fit in zip(self.population, fitnesses):
            ind.fitness.values = fit
        
        print('\t\tEvaluated %i individuals' % len(self.population))
    
        # Gather all the fitnesses in one list and print the stats
        fits = [ind.fitness.values[0] for ind in self.population if ind.fitness.values[0] != float('-inf')]
        length = len(self.population)
        mean = sum(fits) / length
        sum2 = sum(x*x for x in fits)
        std = abs(sum2 / length - mean**2)**0.5
        print('\t\t\tMin %s' % min(fits))
        print('\t\t\tMax %s' % max(fits))
        print('\t\t\tAvg %s' % mean)
        print('\t\t\tStd %s' % std)
        
        for generation in xrange(1,generations+1):
            print('\tGeneration %i' % generation)
            
            # Select the next generation individuals
            offspring = self.toolbox.select(self.population, len(self.population))
            # Clone the selected individuals
            offspring = list(map(self.toolbox.clone, offspring))
        
            # Apply crossover and mutation on the offspring
            for child1, child2 in zip(offspring[::2], offspring[1::2]):
    
                # cross two individuals with probability self.mate_prob
                if random.random() < self.mate_prob:
                    self.toolbox.mate(child1, child2)
    
                    # fitness values of the children must be recalculated later
                    del child1.fitness.values
                    del child2.fitness.values

            # mutate an individual with probability self.mutant_prob
            for mutant in offspring:
                if random.random() < self.mutant_prob:
                    self.toolbox.mutate(mutant)
                    del mutant.fitness.values
        
            # Evaluate the individuals with an invalid fitness
            invalid_ind = [ind for ind in offspring if not ind.fitness.valid]
            fitnesses = map(self.toolbox.evaluate, invalid_ind)
            for ind, fit in zip(invalid_ind, fitnesses):
                ind.fitness.values = fit
            
            print('\t\tEvaluated %i individuals' % len(invalid_ind))
            
            # The population is entirely replaced by the offspring
            self.population[:] = offspring
            
            # Gather all the fitnesses in one list and print the stats
            fits = [ind.fitness.values[0] for ind in self.population if ind.fitness.values[0] != float('-inf')]
            length = len(self.population)
            mean = sum(fits) / length
            sum2 = sum(x*x for x in fits)
            std = abs(sum2 / length - mean**2)**0.5
            print('\t\t\tMin %s' % min(fits))
            print('\t\t\tMax %s' % max(fits))
            print('\t\t\tAvg %s' % mean)
            print('\t\t\tStd %s' % std)
        
        print('Done!')
        
        best_ind = tools.selBest(self.population, 1)[0]
        print('Best individual is\n\t%s\n\t%s\n\nResult:%s' % (best_ind, self.express(best_ind), best_ind.fitness.values))

main.py

import math

from optimizers import EvolutionaryCFG

def evaluate(expression):
    return math.fabs(eval(expression))

def main():
    symbols = ['x','y','z','T','I','S']
    grammar = {
        'x':[str(float(x)) for x in xrange(1,10)],
        'y':[str(float(x)) for x in xrange(1,10)],
        'z':[str(float(x)) for x in xrange(1,10)],
        'T':['x','y','z'],
        'I':['I+I','I-I','I*I','I/I','(I)'] + ['T']*5,
        'S':['I+I','I-I','I*I','I/I']
    }

    optimizer = EvolutionaryCFG(symbols, grammar, evaluate, population_size=300, num_attributes=7)
    optimizer.evolve(generations=4)

if __name__ == '__main__':
    main()

Results

Running main.py has the following result:

Start of evolution
	Generation 0
		Evaluated 300 individuals
			Min 0.0
			Max 54.0
			Avg 2.02304232804
			Std 6.04227568275
	Generation 1
		Evaluated 167 individuals
			Min 0.0
			Max 63.0
			Avg 5.24390740741
			Std 9.29000294167
	Generation 2
		Evaluated 172 individuals
			Min 0.0
			Max 72.0
			Avg 12.2283809524
			Std 14.7002320802
	Generation 3
		Evaluated 188 individuals
			Min 0.571428571429
			Max 81.0
			Avg 22.6207936508
			Std 19.0033049194
	Generation 4
		Evaluated 180 individuals
			Min 0.0
			Max 81.0
			Avg 37.4266666667
			Std 19.6186464591
Done!
Best individual is
	[6, 5, 4, 8, 6, 1, 8]
	9.0*9.0

Result:(81.0,)

About the author



Hi, I'm Nathan. I'm an electrical engineer in the Los Angeles area. Keep an eye out for more content being posted soon.


Leave a Reply

Your email address will not be published. Required fields are marked *