Acovea: Analysis of Compiler Options via Evolutionary Algorithm
Describing the Evolutionary Algorithm
by Scott Robert Ladd
30 May 2004
ACOVEA (Analysis of Compiler Options via Evolutionary Algorithm) implements a genetic algorithm to find the "best" options for compiling singular algorithms with the GNU Compiler Collection (GCC) C and C++ compilers. "Best", in this context, is defined as those options that produce the fastest resulting executable program from a given source code. Acovea is a C++ framework that can be extended to test other programming languages and non-GCC compilers.
I envision Acovea as an optimization tool, similar in purpose to profiling. Traditional function-level profiling can identify the algorithms most influential in a program's performance; Acovea can then be applied to those algorithms to find the compiler flags and options that generate the fastest code. Acovea is also useful for testing combinations of flags for pessimistic interactions, and for testing the reliability of the compiler.
This article details the mechanics of the Acovea algorithm;
the application of this technology to the GNU C and C++ compilers can be
found in An Evolutionary Analysis of GNU C Optimization.
I've been using genetic algorithms (GA) for over a decade in various ways, and even wrote a book about them way back in 1996. You'll find GAs being applied across the computing spectrum, from neural network optimization to the design of aircraft wings. For the most part, applications of GAs tend to be rather specialized; they are only rarely seen in more pedestrian, general-purpose applications.
A particular efficacy of genetic algorithms is their ability to search large solution spaces. In terms of analyzing GCC's optimization options, 27 quadrillion possibilities can certainly be considered "large"; to me, a GA was an obvious tool for satiating my curiosity. I've recently begun using GAs in experiments that test software reliability and performance; the gcc "option" question provided an excellent opportunity to put my theories to work on a practical problem.
I'm not going to provide a full description of genetic algorithms theory in this article, since I've done so elsewhere. In a nutshell, a genetic algorithm is an iterative process based loosely on biological evolution through natural selection. A population of "organisms" is tested against a given problem and assigned a "fitness" value based on a level of success; a new generation of "organisms" is then generated, with higher-fitness "organisms" having the best chance of "breeding." The process repeats until a specific number of generations are run, or a desired outcome is reached. If you want all the arcane details, read Melanie Mitchell's excellent book, An Introduction to Genetic Algorithms (MIT, 1996).
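The iterative process described above can be sketched in a few dozen lines of C++. The following is a toy illustration, not Evocosm code: it evolves an integer toward a target value, and every name in it (evolve, the population size, the mutation range) is a hypothetical choice made for the example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <random>
#include <vector>

// Toy genetic algorithm: evolve an integer toward a target value.
// All names and parameters here are hypothetical illustrations.
int evolve(int target, int generations, unsigned seed)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> init(0, 1023);

    // random initial population of "organisms"
    std::vector<int> pop(20);
    for (int &g : pop)
        g = init(rng);

    // fitness: negative distance from the target (higher is better)
    auto fitness = [target](int g) { return -std::abs(g - target); };

    for (int gen = 0; gen < generations; ++gen)
    {
        // rank by fitness, best first
        std::sort(pop.begin(), pop.end(),
                  [&](int a, int b) { return fitness(a) > fitness(b); });

        // "breed": the fitter half survives; the weaker half is
        // replaced by mutated copies of the fitter half
        std::uniform_int_distribution<int> delta(-4, 4);
        for (size_t i = pop.size() / 2; i < pop.size(); ++i)
            pop[i] = pop[i - pop.size() / 2] + delta(rng);
    }

    std::sort(pop.begin(), pop.end(),
              [&](int a, int b) { return fitness(a) > fitness(b); });
    return pop[0];
}
```

Note that selection here is deliberately crude (truncation, no crossover); real frameworks, Evocosm included, offer richer selection, scaling, and breeding operators.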
Classes & Initial Populations
The C++ code for this article is based on the latest version of my Evocosm class library. I've released a new version of Evocosm in conjunction with this article; it is included in the Acovea tar files listed at the beginning of the article.
The organism undergoing evolution is an acovea_organism, which I derived from libevocosm::organism<chromosome>; a chromosome, in turn, wraps a vector<option *>. An option is an abstract concept of a compiler flag, such as -fgcse or -mfpmath=387. A compiler object generates chromosomes from internal tables of options, based on the characteristics of the compiler's command line; a compiler also provides target-specific tools for mutation and command-line generation.
The gccacovea program implements a concrete compiler class for GCC 3.4, gcc34_compiler. That, in turn, is extended by the gcc34_pentium_compiler class to support Pentium-specific command-line options. These two classes encapsulate definitions of options specific to their realms, implementing methods that produce appropriate chromosomes and command lines.
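The relationship between options and chromosomes can be sketched as follows. This is a hypothetical reconstruction of the design described above, not the actual Acovea source; the real classes differ in names and detail.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Abstract compiler option; concrete options know how to render
// themselves on a command line and how to mutate.
class option
{
public:
    virtual ~option() = default;
    virtual std::string text() const = 0; // command-line form, "" if off
    virtual void mutate() = 0;            // option-specific mutation
};

// a simple on/off flag, such as -fgcse
class simple_option : public option
{
    std::string m_name;
    bool m_enabled;
public:
    simple_option(std::string name, bool enabled)
        : m_name(std::move(name)), m_enabled(enabled) {}
    std::string text() const override { return m_enabled ? m_name : ""; }
    void mutate() override { m_enabled = !m_enabled; }
};

// an enumerated option, such as -mfpmath=387 or -mfpmath=sse
class enum_option : public option
{
    std::string m_name;
    std::vector<std::string> m_values;
    size_t m_index;
public:
    enum_option(std::string name, std::vector<std::string> values)
        : m_name(std::move(name)), m_values(std::move(values)), m_index(0) {}
    std::string text() const override
    {
        return m_name + "=" + m_values[m_index];
    }
    void mutate() override { m_index = (m_index + 1) % m_values.size(); }
};

// a chromosome wraps a vector of options; a compiler class renders it
// into a command line
std::string command_line(const std::vector<std::unique_ptr<option>> &chromosome)
{
    std::string cmd = "gcc -o test test.c";
    for (const auto &opt : chromosome)
        if (!opt->text().empty())
            cmd += " " + opt->text();
    return cmd;
}
```

The virtue of this design over bit strings or XML is visible even in the sketch: adding a new target means adding option objects, not reworking an encoding.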
This application has undergone three distinct design stages. The initial version used pure binary strings, with each bit in a long long representing a single GCC flag. This did not work well, because some options (e.g., -mfpmath, -finline-limits) have more than one potential state. The next design (version 2) was based on XML descriptions of a compiler and its options. Again, I found that the added complexity of parsing XML did not make Acovea any easier to update or extend; changes in the ordering and structure of the XML were reflected in the C++ code, and I found myself maintaining and synchronizing two different object hierarchies.
Acovea version 3 --upon which I based these articles -- is built on a simple C++ object hierarchy, eliminating the coordination issues and simplifying program builds (no need for an XML-parsing library).
To analyze a non-Pentium architecture, I create a new gcc34_arch_compiler class with specific options, and modify the driver program for any platform-specific options that may be required. For example, I have a SPARC version of Acovea in testing at the moment; it took less than an hour to develop. I am also developing an iccacovea for testing Intel's C++ compiler.
Each initial population is a set of randomly-selected acovea_organisms. I also "seed" each population with predefined organisms that represent the "composite" optimization options -O1, -O2, -O3, and -Os. That way, I ensure that the default settings get a chance to compete with random option sets.
The fitness test follows this sequence for each organism:
- Generate a gcc command line, using the options specified by the organism's chromosome.
- Compile the target program (e.g., almabench.c) using fork/exec/wait.
- Run the resulting executable image, again using fork/exec/wait.
- Assign the organism's fitness from the run time reported by the generated program, via pipe.
My use of fork/exec/wait is standard POSIX practice. For a non-POSIX compiler, you'll need to use native techniques for executing an external program and waiting for its completion. You may also need to change the way in which gccacovea pipes information between executed programs and the main application, if your system does not support traditional Unix pipes.
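On a POSIX system, the pattern looks roughly like the sketch below. Error handling is trimmed, and run_command is a hypothetical helper written for this article, not the actual gccacovea code.

```cpp
#include <cassert>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Minimal fork/exec/wait sketch: run an external program and return
// its exit status, or -1 on failure.
int run_command(const char *path, char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;         // fork failed

    if (pid == 0)
    {
        execv(path, argv); // replace the child with the command
        _exit(127);        // reached only if exec failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The fitness test calls this pattern twice per organism: once with the gcc command line, and once with the freshly-built benchmark executable.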
Should a compile or program execution fail (because of an internal compiler error or the generation of erroneous code), the run is assigned an extremely high fitness, and it is not considered for statistical analysis. Fitness scaling does allow such "broken" chromosomes to breed, although their likelihood of reproduction is small. The program reports erroneous compilations to stderr.
Before reproduction, I apply windowed fitness scaling. Fitness tends to converge after a few generations; scaling helps natural selection by improving the reproductive chances of the fittest option sets. Command line options allow selection of other fitness scaling techniques.
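One common form of windowed scaling, adapted here for Acovea's minimization problem (fitness is a run time, so lower is better), subtracts each time from the worst time in the window. This is an illustrative sketch; the exact scaling Evocosm applies may differ in detail.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Windowed fitness scaling for a minimization problem: the slowest
// organism scores zero, and small differences among a converged
// population are amplified relative to the raw run times.
std::vector<double> window_scale(const std::vector<double> &times)
{
    double worst = *std::max_element(times.begin(), times.end());

    std::vector<double> scaled;
    scaled.reserve(times.size());
    for (double t : times)
        scaled.push_back(worst - t); // higher scaled value = faster run
    return scaled;
}
```

For a converged population with times of, say, 2.0, 2.1, and 2.2 seconds, raw times differ by only a few percent, but the scaled values 0.2, 0.1, and 0.0 give the fastest organism twice the reproductive weight of the runner-up.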
By default, all reproduction involves two-point crossover between two parent organisms. This can be adjusted to a probability between 0 and 1 with the gccacovea wrapper application. You can change the type of fitness scaling, or turn it off entirely, using command-line options.
Each option handles mutation according to its class. For a simple option, mutation toggles the option on or off in the command line. A more complex option, like -mfpmath, has several settings, any of which may be selected by its mutation function.
Most of my tests were run with mutation rates of 1 to 2%, although I did try higher mutation rates in verifying my algorithms and assumptions. Naive implementers of genetic algorithms tend to believe that more mutation is better, when, in fact, too much mutation masks any actual evolution. Evolution takes place among organisms over time; if a high mutation rate is causing major changes in the organisms, the genetic algorithm devolves into a random search.
Multiple Populations and Migration
Typical genetic algorithms evolve a single population of organisms. For the last year, I've been experimenting with multiple populations that share genetic information through the limited migration of organisms. The biological model I've visualized is that of African lions; the overall collection of all lions is divided into prides (groups); upon occasion, males will move from one pride to another. Each pride is a genetic pool; exchanging a few members allows genetic information to move between groups without severely diluting the unique nature of each pride.
Over time, isolated populations tend toward genetic uniformity; crossover swaps genetic information between parents, shuffling the genes in a population, but it does not produce any new information. In a traditional genetic algorithm, mutation is the primary source of "new" information. However, mutation is undirected, providing raw data that is valuable only by chance. Migration, on the other hand, exchanges individuals that have been "bred"; thus, an immigrant organism is likely to have useful genes based on natural selection within its original population. In gccacovea, I found that a small migration rate seemed to produce the most effective final results. The gccacovea command-line options allow these settings to be changed, so you can compare a single population of 200 organisms against five populations of 40 organisms (the default).
As always, I look forward to considered comments.