
Theoretical background

The goal is to minimize a suitable function of the observations in terms of a structural model specified by variables such as coordinates, thermal factors and occupancies. The function used in least-squares refinement is

  $$M = \sum_j W(j)\,[Y_o(j) - Y_c(j, \mathbf{p})]^2 \qquad (1)$$

where $Y_o(j)$ is the experimental value for the observation j, $Y_c(j, \mathbf{p})$ is the corresponding value calculated from the coordinate and thermal parameters $\mathbf{p}$ that specify the structural model, and W(j) is the desired weighting function. The sum in (1) is over all observations, but can be separated into different terms based, for example, on the crystallographic observations $\mathbf{s}$ and the stereochemical observations b (see Appendix B for additional details):

  $$M = \sum_{\mathbf{s}} W(\mathbf{s})\,[Y_o(\mathbf{s}) - Y_c(\mathbf{s}, \mathbf{p})]^2 + \sum_b W(b)\,[Y_o(b) - Y_c(b, \mathbf{p})]^2$$

More terms could be added if other classes of observation were available. The gradient of M can also be separated into similar terms. This means that the calculations for the crystallographic term can be kept completely separate from calculations for the other terms.
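A minimal sketch of this separation, assuming toy observations. The names `weighted_term` and `total`, and the two linear "calculated value" functions, are hypothetical illustrations and are not taken from the refinement package. Each term returns its own value and gradient, and the total is the plain sum of both.

```python
def weighted_term(observations):
    """Build one term: sum over j of W(j) * [Y_obs(j) - Y_calc(j, p)]^2.

    Each observation is a tuple (y_obs, weight, calc, calc_grad), where
    calc(p) gives the calculated value and calc_grad(p) its derivatives.
    """
    def term(p):
        value = 0.0
        grad = [0.0] * len(p)
        for y_obs, weight, calc, calc_grad in observations:
            r = y_obs - calc(p)
            value += weight * r * r
            for k, d in enumerate(calc_grad(p)):
                grad[k] += -2.0 * weight * r * d
        return value, grad
    return term


def total(terms, p):
    """Sum the values and gradients of independently computed terms."""
    value = 0.0
    grad = [0.0] * len(p)
    for term in terms:
        v, g = term(p)
        value += v
        grad = [a + b for a, b in zip(grad, g)]
    return value, grad


# Hypothetical stand-ins for the crystallographic and stereochemical terms.
xray = weighted_term([(3.0, 1.0, lambda p: p[0] + p[1], lambda p: [1.0, 1.0])])
geom = weighted_term([(1.0, 2.0, lambda p: p[0] - p[1], lambda p: [1.0, -1.0])])

print(total([xray, geom], [1.0, 1.0]))
```

Because `total` only adds results, a further class of observation could be supplied as one more entry in the list without touching the existing terms.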

The computational problem is to determine a set of parameters which minimizes M. There exist function minimization methods which use no derivatives, which use only first derivatives, and which use second derivatives, in order of increasing power of convergence and increasing computational cost. In the present case there are several reasons for using first-derivative methods.

  1. The radius of convergence of first-derivative methods is larger than that of second-derivative methods, and in these problems one often starts far from the minimum.
  2. The computational cost of first-derivative methods is proportional to N (the number of parameters) instead of $N^2$. For large N this is very important.
  3. Implementation of parameter constraints for holding variables constant, or for requiring variables to behave as rigid groups, is particularly simple for first-derivative methods (see below).

In order to hold a parameter constant, one simply sets the derivative of M with respect to this parameter equal to zero before calculating the parameter shifts. This prevents the corresponding parameter from changing. To treat a set of atoms $\{\mathbf{x}_i\}$ as a rigid group one redefines these atoms in terms of a chosen origin $\mathbf{x}_0$ and three orientation parameters $(\theta_1, \theta_2, \theta_3)$, i.e.

$$\{\mathbf{x}_i\} = \{\mathbf{x}_i(\mathbf{x}_0, \theta_1, \theta_2, \theta_3)\}$$

Similarly, the residual M is redefined as

$$M(\{\mathbf{x}_i\}) = f(\mathbf{x}_0, \theta_1, \theta_2, \theta_3)$$

Then by the chain rule

$$\frac{\partial M}{\partial \mathbf{x}_i} = \frac{\partial f}{\partial \mathbf{x}_0}\frac{\partial \mathbf{x}_0}{\partial \mathbf{x}_i} + \sum_{k=1}^{3} \frac{\partial f}{\partial \theta_k}\frac{\partial \theta_k}{\partial \mathbf{x}_i}$$

which gives an overdetermined system of equations for the derivatives of f. This system of equations can be solved by least squares and the solution used to calculate values of $\partial M / \partial \mathbf{x}_i$ consistent with the rigid-body constraint. This has the virtue of always operating in the original parameter space, but has the fault that nonlinearities can distort the rigid group. A more correct method is to perform the parameter shift steps in the $(\mathbf{x}_0, \theta_1, \theta_2, \theta_3)$ space and expand to the original space for all other calculations.
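The approach described in the last sentence above, shifting in the rigid-group parameter space and expanding to atomic coordinates for everything else, can be sketched as follows. This is an illustration under simplifying assumptions and is not the package's implementation: the group is two-dimensional with a single rotation angle instead of three orientation parameters, and the names `place_atoms` and `rigid_gradient` are hypothetical.

```python
import math


def place_atoms(origin, theta, local):
    """Expand rigid-group parameters (origin, theta) to atomic coordinates.

    `local` holds each atom's fixed position relative to the group origin.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [(origin[0] + c * u - s * v, origin[1] + s * u + c * v)
            for u, v in local]


def rigid_gradient(origin, theta, local, atom_grads):
    """Chain rule: turn per-atom derivatives of M into derivatives with
    respect to the group origin and the rotation angle."""
    atoms = place_atoms(origin, theta, local)
    # Moving the origin moves every atom by the same amount.
    g_ox = sum(gx for gx, _ in atom_grads)
    g_oy = sum(gy for _, gy in atom_grads)
    # Rotating moves each atom perpendicular to its offset from the origin.
    g_theta = sum(gx * -(y - origin[1]) + gy * (x - origin[0])
                  for (x, y), (gx, gy) in zip(atoms, atom_grads))
    return g_ox, g_oy, g_theta
```

A refinement cycle under this scheme would compute the per-atom gradient as usual, reduce it with `rigid_gradient`, shift the origin and angle, and call `place_atoms` to regenerate coordinates, so the group cannot be distorted by the shift.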

First-derivative methods all use the same general strategy, namely calculation of the shift direction followed by a line search for a minimum in the chosen direction. The present package uses the conjugate gradient method (Fletcher & Reeves, 1964). In this procedure the changes in the gradient vector from cycle to cycle are used to approximate the second derivative without actually having to compute this quantity.
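As an illustration of the shift-direction step, the sketch below applies the Fletcher-Reeves update, in which the new direction is the negative gradient plus a multiple of the previous direction. The function name `shift_direction` and the `fixed` argument are assumptions made for this example, not names from the package; `fixed` shows how zeroing a derivative holds that parameter constant.

```python
def shift_direction(grad, prev_grad=None, prev_dir=None, fixed=()):
    """Return the conjugate gradient shift direction (Fletcher-Reeves form).

    grad, prev_grad : gradient of M in the current and previous cycle
    prev_dir        : shift direction used in the previous cycle
    fixed           : indices of parameters to hold constant
    """
    # Holding a parameter constant: zero its derivative before anything else.
    g = [0.0 if k in fixed else gk for k, gk in enumerate(grad)]
    if prev_grad is None or prev_dir is None:
        # First cycle: plain steepest descent.
        return [-gk for gk in g]
    pg = [0.0 if k in fixed else gk for k, gk in enumerate(prev_grad)]
    denom = sum(x * x for x in pg)
    # The ratio of squared gradient norms carries the curvature information
    # from cycle to cycle without computing second derivatives.
    beta = sum(x * x for x in g) / denom if denom > 0.0 else 0.0
    return [-gk + beta * dk for gk, dk in zip(g, prev_dir)]
```

Only the direction is produced here; the magnitude of the shift still has to come from a line search along that direction.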

By using the method of Agarwal (1978), the computer time required to calculate the gradient of the crystallographic term is only slightly longer than that required to calculate an FFT of the structure. The time to calculate the gradient of the stereochemical term is, in comparison, minuscule. The stereochemical and crystallographic gradients are combined with the shift vector of the previous cycle to give the direction (but not the magnitude) of the shift for each parameter. The search along the shift vector for the optimum shift magnitude requires at least three calculations of M, i.e. three FFTs plus some additional calculations. Thus the overall computer time required for a single cycle of refinement is approximately four times that required for one FFT. It is apparent that space-group-specific FFTs can substantially reduce the required computer time per cycle. Included in the refinement package is a program (to be described elsewhere) that will calculate space-group-specific FFTs for most noncentrosymmetric space groups.
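One way a line search can get by with three calculations of M is to fit a parabola through three equally spaced points along the shift direction and take its minimum. The text does not say which line search the package uses, so the sketch below is an assumption for illustration only; `parabolic_shift` and `trial` are hypothetical names.

```python
def parabolic_shift(residual, p, direction, trial=0.01):
    """Estimate the optimum shift magnitude from three evaluations of M.

    residual  : function returning M for a parameter list
    p         : current parameters
    direction : shift direction (for example, from conjugate gradients)
    trial     : spacing of the three sample points along the direction
    """
    def at(t):
        return residual([pk + t * dk for pk, dk in zip(p, direction)])

    # Three calculations of M, at shifts 0, trial and 2 * trial.
    f0, f1, f2 = at(0.0), at(trial), at(2.0 * trial)
    curvature = f0 - 2.0 * f1 + f2
    if curvature <= 0.0:
        # The fitted parabola has no minimum: keep the best sampled shift.
        return min(((f0, 0.0), (f1, trial), (f2, 2.0 * trial)))[1]
    # Vertex of the parabola through the three sampled points.
    return trial * (3.0 * f0 - 4.0 * f1 + f2) / (2.0 * curvature)
```

In the crystallographic setting each call to `residual` would cost roughly one FFT, which together with the gradient calculation gives the figure of about four FFT-equivalents per cycle quoted above.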



Dale Edwin Tronrud
Thu Jan 22 14:07:35 PST 1998