Genetic Programming (GP) is an evolutionary algorithm in which
the evolutionary units are computer programs or functions described by tree
structures. The tree structures can easily be parsed into compilable code
consisting of conditional branches, mathematical operators, variables and
constants. The genetic programming algorithm used here is based
on the generational genetic algorithm.
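The generational loop described above can be sketched in Python as follows. This is an illustrative toy, not the paper's implementation: the tuple tree encoding, the arithmetic operator set and the regression target x^2 + x are all assumptions chosen to keep the sketch self-contained.

```python
import random

OPS = {'+': lambda a, b: a + b, '-': lambda a, b: a - b,
       '*': lambda a, b: a * b}
TERMINALS = ['x', 1.0, 2.0]

def random_tree(depth=3):
    # Grow a random expression tree over the operator and terminal sets.
    if depth == 0 or random.random() < 0.3:
        return random.choice(TERMINALS)
    op = random.choice(list(OPS))
    return (op, random_tree(depth - 1), random_tree(depth - 1))

def evaluate(tree, x):
    if tree == 'x':
        return x
    if not isinstance(tree, tuple):
        return tree
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))

def fitness(tree):
    # Squared error against an illustrative target, x^2 + x (0 = perfect).
    try:
        err = sum((evaluate(tree, x) - (x * x + x)) ** 2 for x in range(-5, 6))
    except OverflowError:
        return float('inf')
    return err if err == err else float('inf')  # guard against NaN

def random_subtree(tree):
    while isinstance(tree, tuple) and random.random() < 0.5:
        tree = random.choice(tree[1:])
    return tree

def crossover(a, b):
    # Replace a randomly chosen subtree of `a` with a random subtree of `b`.
    if not isinstance(a, tuple) or random.random() < 0.3:
        return random_subtree(b)
    op, left, right = a
    if random.random() < 0.5:
        return (op, crossover(left, b), right)
    return (op, left, crossover(right, b))

def mutate(tree):
    return crossover(tree, random_tree(2))

def evolve(pop_size=60, generations=40):
    population = [random_tree() for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness)
        # Generational replacement: breed an entirely new population from
        # the fitter half of the current one (truncation selection).
        parents = population[:pop_size // 2]
        population = [mutate(crossover(random.choice(parents),
                                       random.choice(parents)))
                      for _ in range(pop_size)]
    return min(population, key=fitness)

best = evolve()
```

Such tuple trees map directly onto the parseable tree structures mentioned above; a production system would compile them to executable code rather than interpret them recursively.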
We decided to split the evolution of the agent's behavior into two stages. In the first stage we evolve tactical behaviors, which control primitive actions such as eating food, exploring, attacking and multiplying. In the second stage we evolve game strategies by combining the behaviors from the first stage using Finite State Machine (FSM) structures.
Stage 1 - Evolving Tactical Behaviors. Four types of primitive behaviors
were created: Eat-food, Explore, Attack and Flee.
The Eat-food behavior was evolved using a fitness function that defines an ideal individual as an agent that never collides with obstacles and eats all the available food resources. The fitness is computed from the number of failed move sequences, the length of those sequences and the amount of food eaten; the ideal individual has a fitness value of 0. The best individual over the experiment had a fitness value of 0.01, the average fitness at the last generation was 0.3759, and the best individual of the last generation had a fitness of 0.05.
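A fitness function combining the three terms described above could look like the sketch below. The weights and the normalization constants are illustrative assumptions, not the values used in the experiments; only the property that the ideal individual scores 0 follows the text.

```python
def eat_food_fitness(failed_sequences, total_failed_length,
                     food_eaten, food_available,
                     w_fail=0.5, w_len=0.25, w_food=0.25):
    """Fitness for the Eat-food behavior (0 = ideal individual).

    Penalizes failed move sequences and their total length, and rewards
    eating food.  Weights and normalization caps are assumptions.
    """
    fail_term = w_fail * min(1.0, failed_sequences / 100.0)
    len_term = w_len * min(1.0, total_failed_length / 1000.0)
    food_term = w_food * (1.0 - food_eaten / food_available)
    return fail_term + len_term + food_term

# An agent with no failed moves that eats everything scores 0.
print(eat_food_fitness(0, 0, food_eaten=50, food_available=50))  # → 0.0
```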
Although the optimal solution was not found, the best individual was, in most cases, able to effectively avoid obstacles and consume the available food resources. The behaviors for Explore, Attack and Flee can be evolved similarly, or can be created from the Eat-food behavior by replacing the heuristics.
As we did not manage to evolve optimal primitive behaviors that reliably avoided getting stuck on obstacles, we decided to augment the evolved behaviors with helper heuristics. These heuristics use A* search to provide directions to the closest food, the closest opponent agent and unexplored areas.
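A helper heuristic of this kind can be sketched as an A* search over a grid map that returns the direction of the first step toward the nearest goal. The grid encoding ('#' for obstacles) and the four-way movement model are assumptions for the sketch.

```python
import heapq

def astar_direction(grid, start, goals):
    """First-step direction ('N'/'S'/'E'/'W') on a shortest path from
    `start` to the nearest cell in `goals`, avoiding '#' obstacles.
    Returns None when no goal is reachable."""
    if not goals:
        return None
    def h(p):  # Manhattan-distance heuristic to the nearest goal
        return min(abs(p[0] - g[0]) + abs(p[1] - g[1]) for g in goals)
    frontier = [(h(start), 0, start, None)]
    cost = {start: 0}
    while frontier:
        _, g_cost, pos, first = heapq.heappop(frontier)
        if pos in goals:
            return first
        for dr, dc, name in ((-1, 0, 'N'), (1, 0, 'S'),
                             (0, -1, 'W'), (0, 1, 'E')):
            nxt = (pos[0] + dr, pos[1] + dc)
            if not (0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0])):
                continue
            if grid[nxt[0]][nxt[1]] == '#':
                continue
            new_cost = g_cost + 1
            if nxt not in cost or new_cost < cost[nxt]:
                cost[nxt] = new_cost
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost,
                                          nxt, first if first else name))
    return None

grid = ["....",
        ".##.",
        "...."]
step = astar_direction(grid, (0, 0), {(2, 3)})  # 'E' or 'S': both start a shortest path
```

The evolved behavior trees would consume such direction hints as terminals, leaving the path planning itself outside the evolved code.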
Stage 2 - Evolving Game Strategies. With the tactical behaviors in place, the next challenge is to decide which tactic to apply at any given moment. We decided to arrange the tactical behaviors in a finite-state-machine structure and apply genetic programming to evolve the transition rules of these structures. Two types of game strategies were created: Balanced and Aggressive. The balanced strategy seeks to create an agent that does not specialize in any type of behavior. The aggressive strategy seeks to create an agent that specializes in attacking and killing other agents.
The game strategies were generated in a FFM game with 6 additional opponent
agents. The purpose of the opponent agents is to generate hostile and
conflicting situations from which strategies resolving these situations can be
evolved. Strategies are formed based on weights that depict the influence level of each tactical behavior from the first stage. The balanced game strategy was evolved using equal weights of 0.25 for each behavior. The aggressive strategy, on the other hand, was configured with higher weights of 0.40 for attack and flee, and the remaining weights set to 0.10.
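The weighted FSM arrangement can be sketched as follows. In the paper the transition rules are evolved with GP; here they are hand-written placeholder triggers, and only the weighting scheme (0.25 each for balanced; 0.40 for attack and flee, 0.10 otherwise, for aggressive) follows the text.

```python
BEHAVIORS = ['eat_food', 'explore', 'attack', 'flee']

class StrategyFSM:
    """FSM over the Stage-1 tactical behaviors.  The transition rules
    below are illustrative placeholders, not the evolved GP expressions."""

    def __init__(self, weights):
        assert abs(sum(weights.values()) - 1.0) < 1e-9
        self.weights = weights
        self.state = 'explore'

    def step(self, percepts):
        # Score each behavior by its strategy weight times a simple
        # situation trigger, then transition to the highest-scoring state.
        triggers = {
            'eat_food': 1.0 if percepts['food_visible'] else 0.1,
            'explore': 0.3 if percepts['food_visible'] else 1.0,
            'attack': 1.0 if percepts['enemy_near'] else 0.0,
            'flee': 1.0 if percepts['enemy_near'] and percepts['low_health'] else 0.0,
        }
        scores = {b: self.weights[b] * triggers[b] for b in BEHAVIORS}
        self.state = max(scores, key=scores.get)
        return self.state

balanced = StrategyFSM({'eat_food': 0.25, 'explore': 0.25,
                        'attack': 0.25, 'flee': 0.25})
aggressive = StrategyFSM({'eat_food': 0.10, 'explore': 0.10,
                          'attack': 0.40, 'flee': 0.40})
```

With these weights, an aggressive agent that sees both food and a nearby enemy transitions to attack, while a balanced agent in a peaceful situation favors eating; the evolved transition rules replace the hand-written triggers in the actual system.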
To validate the evolved game strategies, we ran them in the FFM game for 20 individual runs of 4000 simulation cycles each. Each run was set up with 5 balanced-strategy agents, 5 aggressive-strategy agents and 6 opponent agents.
Results show that the aggressive strategy performed on average 23.45 successful attacks, more than twice as many as the balanced strategy. The aggressive strategy involves no eating of food and few attempts to flee when in conflict. The balanced strategy achieved on average 154.04 successful eat attempts, a map coverage of 23% and 10.57 successful attacks. The balanced strategy is, as expected, more general than the aggressive strategy, as it involves eating, attacking and more extensive exploring.