Without complete knowledge of the normal matrix, minimization of a quadratic function would require repeated cycles. In each cycle a shift vector is chosen ( for cycle k) and the minimum along that direction is found with a line search. The minimization of a function along a direction reduces the problem to one parameter, named . The new values for the full set of parameters (for cycle k+1) are
The value of is set to that value which minimizes . It defines the minimum along the shift vector .
The particular set of directions searched determines the rate of convergence. For example, if one chooses to search along the axes of parameter space, first varying the x parameter of the first atom, then the y parameter and so on, the minimum can only be found after a number of cycles many times n.
Many cycles are required when one parameter at a time is varied because the parameters (and therefore the shift directions) are interdependent; thus, in subsequent cycles previously searched directions must be searched again. The number of cycles could be reduced if a series of directions could be identified which were independent. This independence is described mathematically as the direction vectors being conjugate to the normal matrix (Luenberger, 1973), which is defined explicitly as when .
A conjugate direction method is one in which a series of directions are devised which are conjugate with respect to the normal matrix but do not require the normal matrix in order for them to be determined. In the particular conjugate direction method called conjugate gradient, the direction vector for cycle k+1 is determined from
where is chosen to ensure that is conjugate to all previous directions. and are defined to be and 0, respectively, which results in the first cycle being a steepest descent cycle ( ).