next up previous
Next: Some Comparisons Up: Conjugate Direction Minimization: An Previous: Limitations of Conjugate Gradient

Improvements in the Conjugate Gradient Method

The conjugate gradient method uses the steepest descent method to produce its first shift direction, or ``seed" direction. The rate of convergence of early cycles can be improved if a seed that incorporates as much information as practical about the function is used. We would like a direction which will include compensation for the differences in the eigenvalues of the normal matrix. Because in X-ray crystallography the diagonal terms of the normal matrix dominate, a diagonal approximation to the normal matrix provides a powerful and quick alternative to the steepest descent method of generating shift directions. In this procedure the search direction is calculated by

  equation57

where tex2html_wrap_inline412 is the diagonal approximation to the normal matrix for the parameters of cycle k. For the fastest rate of convergence, this shift vector should be used as a seed for conjugate direction searches. It is not clear, however, how one should calculate tex2html_wrap_inline416 of Equation (6).

The refinement problems that we address are the wide range of magnitude of the eigenvalues of the normal matrix and the existence of off-diagonal terms. If we could choose a different set of parameters, for which the normal matrix was simpler, the rate of convergence would improve. Ideally one would choose a system of parameters such that all the eigenvalues were equal and all the off-diagonal elements were zero; then one cycle of steepest descent minimization would suffice.

Let us assume that we have determined a matrix ( tex2html_wrap_inline418 ) which will transform the usual crystallographic parameters into such a set of parameters ( tex2html_wrap_inline420 ). The transformations between the familiar parameters and the new ones will be

equation65

We can perform Fletcher-Reeves conjugate gradient minimization on the function using this new parameter space. The equations will be as before, Equations (4) thru (6), but with primes added:

equation79

equation84

equation89

Instead of working with the tex2html_wrap_inline420 parameters we can substitute back to the original tex2html_wrap_inline348 parameters. The resulting equations are

equation96

equation102

equation111

In these equations the shift vectors, tex2html_wrap_inline426 , are all premultiplied by tex2html_wrap_inline428 . It would be simpler to eliminate this complication by simply defining tex2html_wrap_inline430 . The final equations for conjugate direction refinement, derived from recombined parameters, but operating on the ``native" parameters are

  equation122

  equation128

  equation136

At this point the matrix tex2html_wrap_inline418 is undefined. The optimal choice for tex2html_wrap_inline418 would require that tex2html_wrap_inline436 , the normal matrix for the new parameters, be equal to the identity matrix. To calculate the optimal tex2html_wrap_inline418 we need both the normal matrix and its inverse; thus, we made no gains in computational efficiency over the full matrix method. However, if we recognize that in crystallography tex2html_wrap_inline364 is almost diagonal we can set tex2html_wrap_inline442 . Then tex2html_wrap_inline444 in the Equations (15), (16) and (17) will be replaced by tex2html_wrap_inline446 . Making this substitution we obtain

equation155

  equation161

The seed direction ( tex2html_wrap_inline448 when tex2html_wrap_inline450 and tex2html_wrap_inline452 ) is now the shift calculated from the diagonal approximation to the normal matrix, as we desired. In addition, however, we have an equation for tex2html_wrap_inline416 .

In summary, we have a minimization method where the diagonal terms of the normal matrix are explicitly included and the off-diagonal elements are dealt with via a set of conjugate directions.

Agarwal (1978) suggested a similar method; however, his equation for tex2html_wrap_inline416 was incorrect. In the present nomenclature, his proposal for tex2html_wrap_inline416 was

equation170

In conjugate gradient refinement tex2html_wrap_inline416 is equal to ratio of the length of the gradient at point k divided by that length at point k-1. Because k should be closer to the minimum than point k-1, tex2html_wrap_inline416 should be less than unity. An estimate of Agarwal's tex2html_wrap_inline416 can be achieved by examining his tex2html_wrap_inline416 for cycle 2, which is

equation174

As before, the parameters after cycle 1 should be closer to the minimum than the starting parameters, resulting in tex2html_wrap_inline478 . This value results in the undesirable outcome that the previous cycle's direction is considered more important that the direction calculated from the current parameters. This now explains why Agarwal found it necessary to place an empirical upper limit of 0.4 on tex2html_wrap_inline416 . The value of tex2html_wrap_inline416 calculated with Equation (19) typically falls between 0.5 and 0.9. The empirical value of 0.4 falls closer to the typical value than either setting tex2html_wrap_inline416 to zero (and using the method of Equation (7)) or using the equation of Agarwal(1978).


next up previous
Next: Some Comparisons Up: Conjugate Direction Minimization: An Previous: Limitations of Conjugate Gradient

Dale Edwin Tronrud
Thu Nov 20 10:28:11 PST 1997