# Genetic Algorithm: An Overview

The genetic algorithm is part of a family of algorithms used for optimization problems first conceived of in the 1950s at the Institute for Advanced Study in Princeton, NJ.   The algorithm didn’t gain much commercial use until the late 1980s.  In this post, I will briefly discuss genetic algorithm and how it works, going over an example of its implementation.  I will also discuss what practical problems genetic algorithm can be used to solve.  Lastly, I will provide some links for more reading on the subject should you feel like learning more.

So, the Genetic Algorithm gets its name from the fact that it attempts to simulate biological evolution.    If you recall from high school biology, all life is composed of DNA, and that DNA is made up of chromosomes, which are made up of Nucleic Acids.  When an organism reproduces, the resulting child organism is a combination of the DNA from two parents. Now, in Biology, most organisms have pairs of chromosomes and there are dominant and recessive traits, and that’s all well and good, but for the purpose of the Algorithm, we’ll assume one long strand of DNA that takes some from Parent A, and some from Parent B.  Along the way, random mutations are added.  Sometimes these mutations are beneficial and increase the survivability of the organism, and other times not so much.  It’s with these basics in mind that we can the generate our algorithm.

All that DNA is really just encoded data.  In biology, codons and their order determine how proteins are made and what traits a creature has.  In our example, we will take a look at the well known N-Queens Problem.  In case you aren’t familiar with it, N-Queens proposes a problem space consisting of an N x N chessboard in which N queens must be placed where none can be attacking each other (n>3). For those of you unfamiliar with chess, the Queen can attack in a straight line or a diagonal.  As it turns out, the N-queens completion problem is NP-Complete, and while that is an interesting aspect of the problem, the variant of the puzzle we will be solving presumes we start with an empty board and try to find a solution, not all the solutions.

Now, in order to use the genetic algorithm, we need two things :

1. A way to encode our board
2. A fitness test or heuristic for evaluating a board position.

The first is pretty simple.  Using an array of integers, we can store all of the queen positions in the single dimensional array like so : Int = { 1,3,6,8,5,7,2,4}

That array acts as our chromosome, and finding the fitness becomes a simple matter of taking the number of attacks and subtracting it from the maximum number of conflicts which is defined as : (n2+ n)/2.  For an 8 x 8 board, the highest fitness would equal 36 and also be a solution.

```public int getCost(){
int cost = 0;

//check each columns queen to see what queens it threatens
for(int i=0; i<board.length; i++){
for (int j = i+1; j< board.length;j++){
if (board[i] == board[j])
cost++;
//now check diagonals , if the column has a value equal
//to the current column +/- the offset then it is being attacked
// on the diagonal
int offset = j-i;
if ((board[i] == board[j]-offset)||(board[i]== board[j]+offset))
cost++;
}
}
this.currentCost=cost;
this.fitness = maxConflict - cost;
return cost;
}
```

Now that we have a way of encoding the data and measuring the fitness of the solution we can talk about what the genetic algorithm actually does. The algorithm works by taking a population of boards (generated at random), calculates their fitness, then breeds the fittest boards to create a new generation of boards that will hopefully be more fit until eventually a solution is reached. So the order is :

1. Initialize a Population.
2. Select Parents to Breed.
3. Introduce Random Mutations.
4. Create new generation.
5. Repeat until the Stopping condition is reached.

I’ll go over each one of these steps since each one introduces a parameter to the algorithm that becomes important.  Here is an implementation of the genetic algorithm in Java for N-queens :

```public void solveWithGeneticAlgorithm(int populationSize, int boardSize, double mutationRate, double crossoverRate){
double generations =0;
int crossovers = 0;
int mutations = 0;
Random r = new Random();
//initialize a population of boards
Population population = new Population(populationSize, boardSize);
population.sortByFitness();
double start, stop, length;

start = System.currentTimeMillis();
while (!population.hasSolution()){
generations++;
Population newPopulation = new Population(0, boardSize); //create empty population
for(int i=0;i< population.size;i++){
//select new parents
Board father = population.randomSelect();
Board mother = population.randomSelect();
//check to crossover, if the probability threshold is met
if( r.nextDouble() < crossoverRate){
int crossoverPoint = r.nextInt(father.size);
//crossover and make a new child
Board child = population.reproduceCrossover(father, mother, crossoverPoint);
child = population.mutateChromosome(child, mutationRate);
crossovers++;
}
else{
//no crossover pick better parent to put back in the population
if(father.getFitness()>=mother.getFitness())
else
}

}
mutations += population.mutations;
population = newPopulation;
population.sortByFitness();
}
stop = System.currentTimeMillis();
length = stop-start;
population.printSolution();
System.out.println("\nFor Board Size : "+boardSize+ " Solution found in : " + length + " ms \n Number of generations : "+generations);
System.out.println("Number of offspring generated : " + crossovers + " # of Mutations : " +mutations);
}
```

When deciding on an optimum population size, playing with different numbers can be helpful.  There are of course programmatic ways to arrive at the best size (in a sense a meta-genetic algorithm), but to avoid additional complexity, we’ll stick to trial and error for now.  In the table below, you see for a 90 x 90 board, the optimum population is around a factor of 2 times the dimensionality.  That isn’t to say that will always hold to for other n values, but for this one, it seems a smaller population works and not much is gained by increasing the population.

 Population Size Board Size Time for Solution 90 90 30.6 seconds 180 90 9.3 seconds 270 90 22.1 seconds 450 90 72.1 seconds 560 90 122.6 seconds 900 90 127.2 seconds

Selection and Mutation

The next parameters we’ll look at are the selections and the mutations.  Both are important, and there are various ways to approach the problem.  The simplest way to select parents to create the next generation is to pick parents on the higher end of the fitness scale. Of course, there are a lot of things to consider here as well, and in the next installment of the series I’ll go a little into the different techniques for mutating the chromosomes and discuss a little more about what problems genetic algorithm can solve. 