|
Evolving the goal priorities of autonomous agents |
|
| These two images show the simulation environments that the agents learn in. The black round object centered at the top of the environments is where the agents start, the four grey regions near the corners are the areas of interest, and the smaller black squares are the obstacles. |
|
| This image illustrates the problem that occurs when vectors from the avoid obstacle and follow obstacle goals are summed together. The black area is the obstacle, the vector perpendicular to the obstacle is the one returned by the avoid obstacle goal function, and the parallel vector is the one returned by the follow obstacle goal function. The resulting vector is at a 45 degree angle to the obstacle. Without the comfort value, this would cause obstacles to never be followed. |
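The 45-degree result described in this caption can be checked with a short sketch. It assumes, purely for illustration, that both goal functions return unit-length vectors and that the obstacle edge lies along the x-axis. The names `avoid`, `follow`, and `angle_between_deg` are hypothetical and not taken from the original system.

```python
import math

def add(v, w):
    """Component-wise sum of two 2D vectors."""
    return (v[0] + w[0], v[1] + w[1])

def angle_between_deg(v, w):
    """Angle between two non-zero 2D vectors, in degrees."""
    dot = v[0] * w[0] + v[1] * w[1]
    norm = math.hypot(*v) * math.hypot(*w)
    # Clamp to [-1, 1] to guard against floating-point drift.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Assumption: the obstacle edge runs along the x-axis and both goal
# vectors have length 1.
avoid = (0.0, 1.0)   # perpendicular to the obstacle (avoid obstacle goal)
follow = (1.0, 0.0)  # parallel to the obstacle (follow obstacle goal)

combined = add(avoid, follow)
angle_to_obstacle = angle_between_deg(combined, follow)
print(f"combined = {combined}, angle to obstacle = {angle_to_obstacle:.1f} degrees")
```

Because the summed vector always has a component pointing away from the obstacle, an agent steering by it drifts off the obstacle instead of following it, which is the problem the caption describes.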
|
| These images show the effect of adding random vectors of different lengths to the final direction vector obtained from the goal functions. The random vector length is 0.0 in the top-left image and 0.01 in the top-right image. With no random vector, the agents bounce back and forth between the top and bottom of the environment. The top-right image shows that even a small amount of randomness can significantly change the overall behavior and break this repetitive pattern. The bottom-left agent has a randomness value of 0.02, and the bottom-right agent has a randomness value of 0.04. |
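One way to picture the perturbation is the sketch below. The caption states only that a random vector of a given length is added to the final direction vector. The uniformly random heading and the function name `add_random_vector` are assumptions made for illustration.

```python
import math
import random

def add_random_vector(direction, length, rng):
    """Return `direction` plus a random vector of the given length.

    Assumption: the random vector's heading is drawn uniformly from
    [0, 2*pi); the source only specifies its length.
    """
    theta = rng.uniform(0.0, 2.0 * math.pi)
    return (direction[0] + length * math.cos(theta),
            direction[1] + length * math.sin(theta))

rng = random.Random(0)   # seeded so the sketch is repeatable
direction = (0.0, -1.0)  # hypothetical summed goal vector

# The four lengths shown in the figure.
for length in (0.0, 0.01, 0.02, 0.04):
    print(length, add_random_vector(direction, length, rng))
```

With a length of 0.0 the direction is returned unchanged, so the agents stay fully deterministic. Any positive length nudges the heading slightly at each step.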
|
| These graphs show the average fitness per generation over the thirty runs in each environment. Fitness is computed by summing the number of agents alive and the number of areas of interest (AOIs) seen by each agent. The graphs are ordered to correspond to the environments above. Along with the average fitness, the graphs show the standard deviation of that average and the best fitness per generation. |
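The fitness sum and the per-generation statistics can be sketched as follows. The data layout (one `(alive, aois_seen)` pair per agent, one fitness list per run) is hypothetical. Counting AOIs seen by agents that later died and using the population standard deviation are both assumptions, since the caption does not settle either point.

```python
from statistics import mean, pstdev

def fitness(agents):
    """Agents alive plus the AOIs seen by each agent.

    `agents` is a list of (alive, aois_seen) pairs. Assumption: AOIs
    seen by agents that are no longer alive still count.
    """
    alive = sum(1 for is_alive, _ in agents if is_alive)
    aois = sum(seen for _, seen in agents)
    return alive + aois

def generation_stats(fitness_by_run):
    """Per-generation (mean, std dev, best) across runs.

    `fitness_by_run[r][g]` is the fitness of run r at generation g.
    Assumption: the population standard deviation is used.
    """
    return [(mean(values), pstdev(values), max(values))
            for values in zip(*fitness_by_run)]

# Two alive agents plus 2 + 1 + 0 AOIs seen gives a fitness of 5.
print(fitness([(True, 2), (False, 1), (True, 0)]))

# Two hypothetical runs of two generations each.
print(generation_stats([[1, 3], [3, 5]]))
```

In the setting the caption describes, `fitness_by_run` would hold thirty lists, one per run, for each environment.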
|
| These two figures plot the evolved values of each individual in the last generation, for all runs. If a goal function is not used by the agents in an environment (such as the avoid obstacle goal in Environment 1), its evolved values span a large range. If, on the other hand, the parameter is important to the agents' behavior, it typically evolves to within a specific range. |
|
| This image shows an extremely efficient evolved behavior. No "follow the leader" strategy is built into our system, but because the agents are deterministic and the evolved randomness value was 0.0, a "follow the leader" behavior emerged. A line extending from an agent to an obstacle indicates that the agent is sensing that obstacle. |