Sec14.2GradSearch.mws

Gradient Searches for local extrema

Worksheet by Mike May, S.J.- maymk@slu.edu

> restart;

One of the methods the book gives for trying to find a local maximum or minimum is by doing a gradient search. Since it is computationally intensive, it is hard to do without mechanical help. Let's work through a search.

We start by defining the function.

> func := (x,y) -> (x-2)^2+(y-3)^2+sin(x)*cos(y);

func := proc (x, y) options operator, arrow; (x-2)^...

Since the function is a mixture of a polynomial and a trig function it will be hard to set the two partials equal to zero. Thus a gradient search is probably the easiest way to find the extrema.

Since Maple can do 3D plotting it is useful to plot the function and see what is going on. We are going to guess that the minimum is at (2, 3) and plot for a del neighborhood of size 1 around that point.

> a := 2: b := 3: del := 1: evalf(func(a, b));
plot3d(func(x,y),x=a-del..a+del, y=b-del..b+del,
axes=boxed, style=patchcontour);
plots[contourplot](func(x,y),x=a-del..a+del, y=b-del..b+del);

-.9001976297

[Maple Plot]

[Maple Plot]

It is clear that we are close to the minimum, but that we have missed it. (We are within .5 of the minimum.)

Now we want to start a gradient search. Visually the strategy of a gradient search is to pick the direction that is steepest uphill (or downhill) from your guess, and move in that direction until the graph levels out. The level spot will be the new guess. The first step is to find the gradient by finding the partials.

> funcx := diff(func(x,y),x);
funcy := diff(func(x,y),y);

funcx := 2*x-4+cos(x)*cos(y)

funcy := 2*y-6-sin(x)*sin(y)

Next we want to evaluate the partials at our guess.

> fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));

fxpoint := .4119822456

fypoint := -.1283200602

The gradient lets us define a cross section of the curve through the point (a,b) in the direction of the gradient.
(We are looking at the function on the parameterized curve [a+fxpoint*t, b+fypoint*t].) Since we are looking for a minimum point on the cross section, so we look at negative values of t, which correspond to going downhill.

> plot(func(a+fxpoint*t, b+fypoint*t), t=-1..0);

[Maple Plot]

Since this is a function of one variable, we find the minimum by taking a derivative and setting it equal to zero.

> pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t);
tmin := fsolve(pathdiff,t=0);

pathdiff := .3723908172*t+.4119822456*cos(2+.411982...
pathdiff := .3723908172*t+.4119822456*cos(2+.411982...

tmin := -.3442353195

This gives us a new guess for the local minimum.

> a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);

a1 := 1.858181160

b1 := 3.044172297

-.9323774855

Note that the value of func has decreased. We are closer to the minimum.

We can now repeat the process.

> for i from 1 to 10 do
a := a1; b := b1; currentval := func(a, b);
print(`currrent a `=a, `current b `=b, `current val `= currentval);
fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));
print(`current gradient `=[fxpoint,fypoint]);
pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t):
tmin := fsolve(pathdiff,t=0);
a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);
od:

`currrent a ` = 1.858181160, `current b ` = 3.04417...

`current gradient ` = [-.15363847e-2, -.493269054e-...

`currrent a ` = 1.858698465, `current b ` = 3.04583...

`current gradient ` = [.374042e-4, -.1165000e-4]

`currrent a ` = 1.858685738, `current b ` = 3.04583...

`current gradient ` = [-.895e-7, -.28406e-6]

`currrent a ` = 1.858685768, `current b ` = 3.04583...

`current gradient ` = [.18e-8, .38e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

Notice how quickly this process converges to a solution. As we near the solution, the gradient approaches zero. (The gradient will be zero when we are at a critical point.)

For convenience we put everything in one lump of code. For variety's sake we will start out with a bad guess for the minimum.

> func := (x,y) -> (x-2)^2+(y-3)^2+sin(x)*cos(y);
initx := 10; inity := 10;
funcx := diff(func(x,y),x);
funcy := diff(func(x,y),y);
a1 := initx: b1 := inity:
for i from 1 to 10 do
a := a1; b := b1; currentval := evalf(func(a, b));
print(`currrent a `=a, `current b `=b, `current val `= currentval);
fxpoint := evalf(subs({x=a, y=b}, funcx));
fypoint := evalf(subs({x=a, y=b}, funcy));
print(`current gradient `=[fxpoint,fypoint]);
pathdiff := diff(func(a+fxpoint*t, b+fypoint*t),t):
tmin := fsolve(pathdiff,t=0);
a1 := a + tmin*fxpoint; b1 := b + tmin*fypoint; func(a1, b1);
od:

func := proc (x, y) options operator, arrow; (x-2)^...

initx := 10

inity := 10

funcx := 2*x-4+cos(x)*cos(y)

funcy := 2*y-6-sin(x)*sin(y)

`currrent a ` = 10, `current b ` = 10, `current val...

`current gradient ` = [16.70404103, 13.70404103]

`currrent a ` = 1.723782166, `current b ` = 3.21016...

`current gradient ` = [-.4004040688, .4880579452]

`currrent a ` = 1.858634291, `current b ` = 3.04579...

`current gradient ` = [-.1532273e-3, -.12570899e-3]...

`currrent a ` = 1.858685691, `current b ` = 3.04583...

`current gradient ` = [-.2232e-6, .27305e-6]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

`currrent a ` = 1.858685767, `current b ` = 3.04583...

`current gradient ` = [-.12e-8, .36e-9]

>

Exercises:

1) Use a gradient search to find the local minimum of f(x,y) = (x+1)^4 +(y-1)^4+1/(x^2*y^2+1).

>

2) Use a gradient search to find a local minimum of C(x,y) = .15*x+3*(.8*y)^(-4.87)+(.8*y)*(1+3/x).

>