The goal is to minimize a suitable function of the observations in terms of a structural model specified by variables such as coordinates, thermal factors and occupancies. The function used in least-squares refinement is

$$M = \sum_j W(j)\,[Q_o(j) - Q_c(j)]^2, \qquad (1)$$

where $Q_o(j)$ is the experimental value for the observation $j$, $Q_c(j)$ is the corresponding value calculated from the coordinate and thermal parameters that specify the structural model, and $W(j)$ is the desired weighting function. The sum in (1) is over all observations, but can be separated into different terms based, for example, on the crystallographic observations $a$ and the stereochemical observations $b$ (see Appendix B for additional details):

$$M = \sum_a W(a)\,[Q_o(a) - Q_c(a)]^2 + \sum_b W(b)\,[Q_o(b) - Q_c(b)]^2. \qquad (2)$$
More terms could be added if other classes of observation were available. The gradient of M can also be separated into similar terms. This means that the calculations for the crystallographic term can be kept completely separate from calculations for the other terms.
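The separability of M can be illustrated with a minimal Python sketch. The function names and the toy numbers below are hypothetical and are not taken from the refinement package; the sketch only shows that each class of observation contributes an independent weighted sum of squares, so each term can be computed by separate code.

```python
def weighted_residual(obs, calc, weights):
    """Return sum_j W(j) * (obs_j - calc_j)**2 for one class of observation."""
    return sum(w * (o - c) ** 2 for o, c, w in zip(obs, calc, weights))


def total_residual(terms):
    """Sum the residuals of several independent classes of observation.

    `terms` is a list of (obs, calc, weights) triples, one per class.
    """
    return sum(weighted_residual(o, c, w) for o, c, w in terms)


# Hypothetical toy data: (observed, calculated, weights) for each class.
xray = ([10.0, 4.0], [9.0, 5.0], [1.0, 2.0])
stereo = ([1.5, 1.3, 2.4], [1.4, 1.3, 2.6], [100.0, 100.0, 25.0])

m_xray = weighted_residual(*xray)
m_stereo = weighted_residual(*stereo)
m_total = total_residual([xray, stereo])
```

Because the total is a plain sum, the gradient of the total is likewise the sum of the gradients of the individual terms, and a further class of observation only adds another triple to the list.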
The computational problem is to determine a set of parameters that minimizes M. There exist function minimization methods which use no derivatives, which use only first derivatives, and which use second derivatives, in order of increasing power of convergence and increasing computational cost. In the present case there are several reasons for using first-derivative methods.
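In order to hold a parameter constant, one simply sets the derivative of M with respect to this parameter equal to zero before calculating the parameter shifts. This prevents the corresponding parameter from changing. To treat a set of atoms as a rigid group one redefines these atoms in terms of a chosen origin $\mathbf{t} = (t_1, t_2, t_3)$ and three orientation parameters $\boldsymbol{\theta} = (\theta_1, \theta_2, \theta_3)$, i.e.

$$\mathbf{x}_i = \mathbf{x}_i(\mathbf{t}, \boldsymbol{\theta}).$$

Similarly, the residual M is redefined as

$$M(\mathbf{x}_1, \ldots, \mathbf{x}_n) = f(\mathbf{t}, \boldsymbol{\theta}).$$

Then by the chain rule

$$\frac{\partial M}{\partial x_i} = \sum_{k=1}^{3} \frac{\partial f}{\partial t_k}\,\frac{\partial t_k}{\partial x_i} + \sum_{k=1}^{3} \frac{\partial f}{\partial \theta_k}\,\frac{\partial \theta_k}{\partial x_i},$$

which gives an overdetermined system of equations for the derivatives of $f$: there is one equation for each atomic coordinate $x_i$ in the group, but only six unknowns. This system of equations can be solved by least squares and the solution used to calculate values of $\partial M/\partial x_i$ consistent with the rigid-body constraint. This has the virtue of always operating in the original parameter space, but has the fault that nonlinearities can distort the rigid group. A more correct method is to perform the parameter shift steps in the $(\mathbf{t}, \boldsymbol{\theta})$ space and expand to the original space for all other calculations.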
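Both constraints can be sketched in a few lines of Python. The helper names are hypothetical, and the rotation parameterization (infinitesimal rotations about the three Cartesian axes through the chosen origin) is an assumption made for this illustration; the source does not specify how the orientation parameters are defined. Under that assumption the chain rule gives the translational gradient as the sum of the atomic gradients, and the rotational gradient as the sum of $\mathbf{r}_i \times \mathbf{g}_i$, where $\mathbf{r}_i$ is the atom position relative to the origin and $\mathbf{g}_i$ is the atomic gradient.

```python
def freeze(gradient, fixed):
    """Zero the gradient components of the parameters to be held constant.

    `fixed` is a set of parameter indices; the other components are unchanged.
    """
    return [0.0 if i in fixed else g for i, g in enumerate(gradient)]


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rigid_group_gradient(coords, grads, origin):
    """Gradient of M with respect to the six rigid-group parameters.

    coords: atomic positions (x, y, z) of the atoms in the group
    grads:  dM/d(x, y, z) for the same atoms
    origin: chosen origin of the group
    Returns (g_trans, g_rot): derivatives with respect to translation of the
    group and to small rotations about the Cartesian axes through `origin`.
    """
    g_trans = [0.0, 0.0, 0.0]
    g_rot = [0.0, 0.0, 0.0]
    for x, g in zip(coords, grads):
        r = tuple(xi - oi for xi, oi in zip(x, origin))
        torque = cross(r, g)  # r_i x g_i
        for k in range(3):
            g_trans[k] += g[k]
            g_rot[k] += torque[k]
    return g_trans, g_rot
```

For example, `freeze([1.0, 2.0, 3.0], {1})` leaves the second parameter without a shift, and `rigid_group_gradient` reduces the 3n atomic derivatives of a group to six numbers, so the shift is computed in the group parameter space and cannot distort the group.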
First-derivative methods all use the same general strategy, namely calculation of the shift direction followed by a line search for a minimum in the chosen direction. The present package uses the conjugate gradient method (Fletcher & Reeves, 1964). In this procedure the changes in the gradient vector from cycle to cycle are used to approximate the second derivative without actually having to compute this quantity.
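The general strategy can be sketched as follows. This is a minimal illustration of the Fletcher-Reeves update on a toy problem, not the code of the package: the function names, the `line_search` callable and the two-parameter quadratic are all hypothetical. The new direction is the negative gradient plus a multiple (beta) of the previous direction, where beta is the ratio of the squared lengths of the current and previous gradients.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fletcher_reeves(grad, line_search, x, cycles):
    """Minimize a function by Fletcher-Reeves conjugate gradients.

    grad(x)           -> gradient vector at x
    line_search(x, d) -> step length along direction d (hypothetical interface)
    """
    direction = None
    g_prev = None
    for _ in range(cycles):
        g = grad(x)
        if dot(g, g) < 1e-20:  # gradient has vanished: converged
            break
        if direction is None:
            direction = [-gi for gi in g]  # first cycle: steepest descent
        else:
            beta = dot(g, g) / dot(g_prev, g_prev)
            direction = [-gi + beta * di for gi, di in zip(g, direction)]
        step = line_search(x, direction)
        x = [xi + step * di for xi, di in zip(x, direction)]
        g_prev = g
    return x


# Toy problem (assumption): M = (x0 - 1)^2 + 10 * (x1 + 2)^2
def toy_grad(x):
    return [2.0 * (x[0] - 1.0), 20.0 * (x[1] + 2.0)]


def toy_line_search(x, d):
    # Exact minimum along d, available here only because M is quadratic
    # with known second derivatives (2 and 20).
    g = toy_grad(x)
    return -dot(g, d) / (2.0 * d[0] ** 2 + 20.0 * d[1] ** 2)


x_min = fletcher_reeves(toy_grad, toy_line_search, [0.0, 0.0], cycles=5)
```

On this quadratic the search reaches the minimum at (1, -2) in two cycles. In a real refinement the exact step is not available and the line search must be carried out by evaluating M at trial steps.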
By using the method of Agarwal (1978) the computer time required to calculate the gradient of the crystallographic term is only slightly greater than that required to calculate an FFT of the structure. The time to calculate the gradient of the stereochemical term is, in comparison, minuscule. The stereochemical and crystallographic gradients are combined with the shift vector of the previous cycle to give the direction (but not the magnitude) of the shift for each parameter. The search along the shift vector for the optimum shift magnitude requires at least three calculations of M, i.e. three FFT's plus some additional calculations. Thus the overall computer time required for a single cycle of refinement is approximately four times that required for one FFT: one for the gradient and three for the line search. It is apparent that space-group-specific FFT's can substantially reduce the required computer time per cycle. Included in the refinement package is a program (to be described elsewhere) that will calculate space-group-specific FFT's for most noncentrosymmetric space groups.
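One common way to obtain a shift magnitude from three calculations of M is to fit a parabola through the three points and take its vertex. The sketch below assumes this approach for illustration only; the source states that at least three evaluations are needed but does not say how the package interpolates, and the function name is hypothetical.

```python
def parabolic_minimum(steps, values):
    """Step length at the vertex of the parabola through three (step, M) points.

    steps:  three distinct trial shift magnitudes (s0, s1, s2)
    values: M evaluated at those shifts (m0, m1, m2)
    The caller should check that the points bracket a minimum; the vertex
    of a downward-opening parabola is a maximum.
    """
    s0, s1, s2 = steps
    m0, m1, m2 = values
    num = (s1 - s0) ** 2 * (m1 - m2) - (s1 - s2) ** 2 * (m1 - m0)
    den = (s1 - s0) * (m1 - m2) - (s1 - s2) * (m1 - m0)
    if den == 0.0:
        raise ValueError("the three points are collinear; no unique vertex")
    return s1 - 0.5 * num / den


# Toy example (assumption): M(s) = (s - 0.3)^2 + 1 sampled at s = 0, 0.5, 1
trial_steps = (0.0, 0.5, 1.0)
trial_values = tuple((s - 0.3) ** 2 + 1.0 for s in trial_steps)
best_step = parabolic_minimum(trial_steps, trial_values)
```

For an exactly quadratic M the vertex coincides with the true minimum along the search direction (0.3 in this toy case); for a real residual it is only an estimate, which is why more than three evaluations may sometimes be needed.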