Re: Measuring the effectiveness of a GA implementation?



On Sun, 23 Jul 2006 16:08:06 -0700, Digital Puer wrote:
Hi, I'm a computer scientist and have been using genetic algorithms in
some of my work. One problem I've encountered is that when my papers are
being peer-reviewed, a common criticism is that there is no metric that
can be used to measure the effectiveness of my GA approach.

Note: my work is not on genetic algorithms themselves but on the use of
a GA to solve optimisation problems.

There are several issues involved in evaluating the "effectiveness" of a
GA - or of any optimisation algorithm, for that matter.

Firstly you have to state precisely the class of problem that your
algorithm is intended to solve and also how you intend to sample problems
from that class (eg. you want to solve 100-city Travelling Salesman
problems, which you construct by random placement of cities according to a
given distribution).

However, a typical criticism is that I do not show how close the GA's
solution is to the optimal answer. Now, I could easily run an exhausting
traversal through the permutations to find the optimal solution and
compare it the GA's solution, but that takes a long time and cannot be
done for all test cases and for a very large number of permutations.

Indeed; in many (if not most) scenarios to which GAs are likely to be
applied, there will not be an algorithm available that will enable you to
calculate optimal solutions (at least in not less time than the age of the
universe...); in which case, comparison with an optimal solution is
obviously out of the question and any criticism on the lack of such a
comparison simply invalid.

For example, suppose I am using a GA to find the best permutation of an
integer set. I usually compare my approach with alternative greedy
approaches and show that the GA produces better results (with respect to
the evaluation function).

Ah, but how do you compare? What do you mean by "better results"?

I'll assume that your problem doesn't admit practical calculation of
optimal solutions. The best you can then do in terms of "measuring
effectiveness" is comparison of your algorithm with other (preferably
previously researched) algorithms. There are two basic aspects to such
comparisons: (1) quality of solutions (since we're talking about GAs, for
"quality" read "fitness") and (2) time required to reach solutions (here
you might want to measure "time" in terms of "fitness evaluations" to
establish a level playing field, though one can envisage situations where
this might not be appropriate).

Then you could ask either or both of the following distinct questions:

(1) "Fitness-centric": What is the fitness of solutions achieved in a
given time?

(2) "Time-centric": What is the time required to achieve a solution of
given fitness?

Note that these questions are likely to be answered statistically in terms
of mean time (resp. mean fitness) plus, preferably, some measure of the
distribution of these quantities (eg. standard errors). An added subtlety
for a statistically meaningful answer to question (2) relates to that fact
that you cannot be prepared to wait forever and indeed have no guarantee
that a given fitness will *ever* be achieved in a trial run. The most
sensible option here is probably to establish a maximum "time out" time
and quote statistics for "failed trials" alongside mean time for
non-failed trials.

So for the rest of you computer scientists, in your papers how do
justify that a GA approach is a good way to optimise?

My preferred approach would be to present graphs measuring both fitness-
and time-centric comparisons; so for a fitness-centric comparison I would
graph mean fitness (+/- error bars) on the vertical axis against (given)
time on the horizontal. For time-centric comparison, I'd graph mean time
(+/- error bars) on the vertical axis against (given) fitness on the
horizontal.

The point is that time/quality questions really are distinct and may give
you different answers for a given comparison. And for either, the answer
may not be an unequivocal "better/worse" one... for instance, under a
fitness-centric comparison algorithm A may be better if you are only
prepared to wait for, say, 1000 fitness evaluations, while algorithm B may
be better if you are prepared to wait for 10,000 evaluations; that is, the
graphs may cross at some time value. You might also encounter situations
where an algorithm achieves good *mean* fitness (resp. time) but the
distribution has such high variance (or is so skewed) as to render the
algorithm unreliable in practice.

My impression is that in the optimisation (and particularly the GA)
literature, researchers tend to be somewhat cavalier (to put it politely)
about evaluating effectiveness of optimisation algorithms (I'm loath to
speculate on whether your peer-reviewers might be aware of some of the
issues involved...).

Also, am I correct in assuming that a GA's running time cannot be
evaluated by the typical big-O notation?

Why not (assuming you knew how to calculate running time)?

Regards,

--
Lionel B
.



Relevant Pages

  • Re: Fermi Paradox
    ... always generates a pair of possible mutations. ... What you have is a round of selection after you generate two ... using a different fitness function. ... There may be interesting algorithm consequences to how one separates ...
    (talk.origins)
  • Re: Needleman-Wunsch - producing more than the optimal solution -- please help
    ... What I want is the optimal solution, second optimal solution, third ... an issue I see is that it's not clear to me that the algorithm actually compares solutions to find the "optimal" one so much as it detects the best alignment of two inputs based on a pre-computed weighting matrix. ... But the moment you allow for the choice of something other than the maximal choice, you have to deal with every permutation possible, and evaluate those permutations in some way. ... Not only would you, after redoing the choice, have to recompute the rest of the "F" matrix that depended on that choice, there's no guarantee that when you're done, the score in the lower-right corner that results from that replacement would indeed be the next-lowest score possible in the matrix. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: Shortish paths
    ... Remember I'm not asking for an algorithm, just for advice on what to ... original graph. ... optimal solution is removed from the graph, ... I intend to begin by rereading the shortest path sections of Cormen ...
    (comp.theory)
  • Re: NORMSDIST() and NORMSINV()
    ... was thinking of substituting both functions with custom methods. ... and for NormSInvI was thinking of using P.J. Acklam's algorithm. ... and I would be evaluating this function approximately 6000 times. ...
    (microsoft.public.excel.programming)
  • Re: Human design and natural "design"
    ... >>usual for a genetic algorithm. ... >>This fitness function never varies. ... > weights for NNa to a file and let the program continue. ... with the fitness function used to evolve the NN's. ...
    (talk.origins)