Re: Scaling in fmincon()
- From: "John D'Errico" <woodchips@xxxxxxxxxxxxxxxx>
- Date: Fri, 6 Jan 2006 17:19:06 -0500
Fijoy George wrote:
>
>
> Hi all,
>
> I have been using the Matlab constrained optimization fucntion().
> Lately, I
> have been observing that the speed and the success of the
> optimization
> depends on the "scale" of the problem. For example, I have observed
> that
> when I divide the objective function by a constant so as to make
> the
> function values smaller, the optimization converges significantly
> faster.
>
> I see in the literature that scaling is an important factor in
> optimization,
> and optimization softwares normally have automatic procedures for
> transforming the user-specified problem to a well-scaled one.
>
> So, does anyone know if Matlab fmincon() does any scaling at all?
> Has anyone
> experienced scaling problems before with fmincon()?
Is it possible that you are kidding yourself?
Suppose your tolerance is 1e-5 ,with an objective
function on the order of 1.
Fmincon iterates for a while, until the change in
objective value is less than its tolerance. It
decides to terminate.
Suppose you scale the objective function so that
its value is roughly 1e-5, but leave tolfun and
all else unchanged. Fmincon will converge "faster".
Actually, fmincon will just terminate faster. Has
it converged? Not necessarily. This will happen
with an absolute tolerance, as I believe tolfun
is for fmincon. (Unverified, but I believe this
to be true.)
Lets do a little test. I'll form two objectives,
with exactly the same shape, but scale one of
them by a factor of 1000.
rosen = @(x) (1-x(1)).^2 + 105*(x(2)-x(1).^2).^2;
rosen2 = @(x) ((1-x(1)).^2 + 105*(x(2)-x(1).^2).^2)/1000;
x = fmincon(rosen,[2;],[],[],[],[],[0 0],[5 5], ...
[],optimset('disp','iter','tolfun',1.e-5))
Warning: Large-scale (trust region) method does not currently solve
this type of problem,
switching to medium-scale (line search).
> In fmincon at 260
max Directional
First-order
Iter F-count f(x) constraint Step-size derivative
optimality Procedure
0 3 106 -2
1 10 4.25391 -1.75 0.125 351
1.47e+03
2 16 0.360729 -1.554 0.25 -4.74
13.7
3 21 0.326586 -1.484 0.5 0.575
19.4
4 25 0.266214 -1.515 1 0.0162
2.86
5 29 0.260499 -1.508 1 -0.00502
4.32
6 33 0.237634 -1.462 1 -0.00342
10.3
7 37 0.219094 -1.441 1 -0.0167
10.2
8 41 0.131387 -1.344 1 -0.0549
6.94
9 45 0.0848583 -1.289 1 -0.0355
2.69
10 50 0.0720531 -1.215 0.5 0.0504
8.44
11 54 0.0377952 -1.191 1 -0.0177
2.15
12 58 0.0207713 -1.115 1 0.00167
4.2
13 62 0.00969768 -1.096 1 -0.00653
1.26
14 66 0.00441785 -1.039 1 0.00251
2.38
15 70 0.00117921 -1.034 1 -0.000426
0.124
16 74 0.000204715 -1.013 1 -0.000467
0.248
17 78 1.96633e-05 -1.002 1 -4.49e-05
0.167 Hessian modified
18 82 5.79256e-07 -1.001 1 1.74e-06
0.015 Hessian modified
Optimization terminated: magnitude of directional derivative in
search
direction less than 2*options.TolFun and maximum constraint
violation
is less than options.TolCon.
No active inequalities
x =
1.0006
1.0013
x = fmincon(rosen2,[2;3],[],[],[],[],[0 0],[5 5], ...
[],optimset('disp','iter','tolfun',1.e-5))
Warning: Large-scale (trust region) method does not currently solve
this type of problem,
switching to medium-scale (line search).
> In fmincon at 260
max Directional
First-order
Iter F-count f(x) constraint Step-size derivative
optimality Procedure
0 3 0.106 -2
1 8 0.0396314 -1.579 0.5 0.368
0.405
2 12 0.00204318 -1.712 1 -0.0127
0.0856
3 16 0.000586305 -1.748 1 0.000511
0.0131
4 20 0.000553544 -1.744 1 1.51e-06
0.000485
Optimization terminated: magnitude of directional derivative in
search
direction less than 2*options.TolFun and maximum constraint
violation
is less than options.TolCon.
No active inequalities
x =
1.7436
3.0426
The true minimizer is at [1 1]. Because I scaled
the objective so poorly with respect to the tolerance,
fmincon decided that it was close enough. Has the
second optimization truly converged faster? Has
it even converged? In the context of the tolerance
I provided, it apparently has. But are the two
results equally valuable?
The difference between the two was no more than a
scale factor. Had you chosen a different optimizer,
with a different tolerance style, I could likely
have as easily found an objective function to trip
it up. This is a characteristic of all such numerical
algorithms.
The moral? An optimizer needs to be treated with
care and understanding. The more understanding you
apply, the better you will do. Always apply thought
to your results. Ask if those results make sense at
every step of the way.
In the end, I just look at this as job security
for the numerical analysts of the world.
HTH,
John D'Errico
Numerical animalist
.
- Follow-Ups:
- Re: Scaling in fmincon()
- From: Fijoy George
- Re: Scaling in fmincon()
- References:
- Scaling in fmincon()
- From: Fijoy George
- Scaling in fmincon()
- Prev by Date: DSP Blockset user's guide!
- Next by Date: error prediction using newrb/sim (radial basis functions)
- Previous by thread: Scaling in fmincon()
- Next by thread: Re: Scaling in fmincon()
- Index(es):
Relevant Pages
|