Gradient Searches for local extrema
Worksheet by Mike May, S.J.- maymk@slu.edu
> restart;
One of the methods the book gives for trying to find a local maximum or minimum is by doing a gradient search. Since it is computationally intensive, it is hard to do without mechanical help. Let's work through a search.
We start by defining the function.
> func := (x,y) -> (x-2)^2+(y-3)^2+sin(x)*cos(y);
Since the function is a mixture of a polynomial and a trig function it will be hard to set the two partials equal to zero. Thus a gradient search is probably the easiest way to find the extrema.
Since Maple can do 3D plotting it is useful to plot the function and see what is going on. We are going to guess that the minimum is at (2, 3) and plot for a del neighborhood of size 1 around that point.
>
a := 2: b := 3: del := 1: evalf(func(a, b));
plot3d(func(x,y),x=a-del..a+del, y=b-del..b+del,
axes=boxed, style=patchcontour);
plots[contourplot](func(x,y),x=a-del..a+del, y=b-del..b+del);
It is clear that we are close to the minimum, but that we have missed it. (We are within .5 of the minimum.)
Now we want to start a gradient search. Visually the strategy of a gradient search is to pick the direction that is steepest uphill (or downhill) from your guess, and move in that direction until the graph levels out. The level spot will be the new guess. The first step is to find the gradient by finding the partials.
>
funcx := diff(func(x,y),x);
funcy := diff(func(x,y),y);
Next we want to evaluate the partials at our guess.
>
fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));
The gradient lets us define a cross section of the curve through the point (a,b) in the direction of the gradient.
(We are looking at the function on the parameterized curve [a+fxpoint*t, b+fypoint*t].) Since we are looking for a minimum point on the cross section, so we look at negative values of t, which correspond to going downhill.
> plot(func(a+fxpoint*t, b+fypoint*t), t=-1..0);
Since this is a function of one variable, we find the minimum by taking a derivative and setting it equal to zero.
>
pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t);
tmin := fsolve(pathdiff,t=0);
This gives us a new guess for the local minimum.
> a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);
Note that the value of func has decreased. We are closer to the minimum.
We can now repeat the process.
>
for i from 1 to 10 do
a := a1; b := b1; currentval := func(a, b);
print(`currrent a `=a, `current b `=b, `current val `= currentval);
fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));
print(`current gradient `=[fxpoint,fypoint]);
pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t):
tmin := fsolve(pathdiff,t=0);
a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);
od:
Notice how quickly this process converges to a solution. As we near the solution, the gradient approaches zero. (The gradient will be zero when we are at a critical point.)
For convenience we put everything in one lump of code. For variety's sake we will start out with a bad guess for the minimum.
>
func := (x,y) -> (x-2)^2+(y-3)^2+sin(x)*cos(y);
initx := 10; inity := 10;
funcx := diff(func(x,y),x);
funcy := diff(func(x,y),y);
a1 := initx: b1 := inity:
for i from 1 to 10 do
a := a1; b := b1; currentval := evalf(func(a, b));
print(`currrent a `=a, `current b `=b, `current val `= currentval);
fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));
print(`current gradient `=[fxpoint,fypoint]);
pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t):
tmin := fsolve(pathdiff,t=0);
a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);
od:
>
Exercises:
1) Use a gradient search to find the local minimum of f(x,y) = (x+1)^4 +(y-1)^4+1/(x^2*y^2+1).
>
2) Use a gradient search to find a local minimum of C(x,y) = .15*x+3*(.8*y)^(-4.87)+(.8*y)*(1+3/x).
>