Derivatives

Applications should provide partial first derivatives whenever possible, to make Knitro more efficient and more robust. If first derivatives cannot be supplied, then the application should instruct Knitro to calculate finite-difference approximations.

First derivatives are represented by the gradient of the objective function and the Jacobian matrix of the constraints. Second derivatives are represented by the Hessian matrix, a linear combination of the second derivatives of the objective function and the constraints.

First derivatives

The default version of Knitro assumes that the user can provide exact first derivatives to compute the objective function gradient and constraint gradients. It is highly recommended that the user provide exact first derivatives if at all possible, since using first derivative approximations may seriously degrade the performance of the code and the likelihood of converging to a solution. However, if this is not possible the following first derivative approximation options may be used.

• Forward finite-differences This option uses a forward finite-difference approximation of the objective and constraint gradients. The cost of computing this approximation is n function evaluations (where n is the number of variables). The option is invoked by choosing user option gradopt = 2.
• Centered finite-differences This option uses a centered finite-difference approximation of the objective and constraint gradients. The cost of computing this approximation is 2n function evaluations (where n is the number of variables). The option is invoked by choosing user option gradopt = 3. The centered finite-difference approximation is often more accurate than the forward finite-difference approximation; however, it is more expensive to compute if the cost of evaluating a function is high.

Note

When using finite-difference gradients, the sparsity patten for the constraint Jacobian should still be provided if possible to improve the efficiency of the finite-difference computation and decrease the memory requirements.

Although these finite-differences approximations should be avoided in general, they are useful to track errors: whenever the derivatives are provided by the user, it is useful to check that the differentiation (and the subsequent implementation of the derivatives) is correct. Indeed, providing derivatives that are not coherent with the function values is one of the most common errors when solving a nonlinear program. This check can be done automatically by comparing finite-differences approximations with user-provided derivatives. This is explained below (Checking derivatives).

Second derivatives

The default version of Knitro assumes that the application can provide exact second derivatives to compute the Hessian of the Lagrangian function. If the application is able to do so and the cost of computing the second derivatives is not overly expensive, it is highly recommended to provide exact second derivatives. However, Knitro also offers other options, which are described in detail below.

• (Dense) Quasi-Newton BFGS:

The quasi-Newton BFGS option uses gradient information to compute a symmetric, positive-definite approximation to the Hessian matrix. Typically this method requires more iterations to converge than the exact Hessian version. However, since it is only computing gradients rather than Hessians, this approach may be more efficient in some cases. This option stores a dense quasi-Newton Hessian approximation so it is only recommended for small to medium problems (e.g., n < 1000). The quasi-Newton BFGS option is chosen by setting user option hessopt = 2.

• (Dense) Quasi-Newton SR1:

As with the BFGS approach, the quasi-Newton SR1 approach builds an approximate Hessian using gradient information. However, unlike the BFGS approximation, the SR1 Hessian approximation is not restricted to be positive-definite. Therefore the quasi-Newton SR1 approximation may be a better approach, compared to the BFGS method, if there is a lot of negative curvature in the problem (i.e., the problem is not convex) since it may be able to maintain a better approximation to the true Hessian in this case. The quasi-Newton SR1 approximation maintains a dense Hessian approximation and so is only recommended for small to medium problems (e.g., n < 1000). The quasi-Newton SR1 option is chosen by setting user option hessopt = 3.

• Finite-difference Hessian-vector product option:

If the problem is large and gradient evaluations are not a dominant cost, then Knitro can internally compute Hessian-vector products using finite-differences. Each Hessian-vector product in this case requires one additional gradient evaluation. This option is chosen by setting user option hessopt = 4. The option is only recommended if the exact gradients are provided.

Note

This option may not be used when algorithm = 1 or 4 since the Interior/Direct and SQP algorithms need the full expression of the Hessian matrix (Hessian-vector products are not sufficient).

• Exact Hessian-vector products:

In some cases the application may prefer to provide exact Hessian-vector products, but not the full Hessian (for instance, if the problem has a large, dense Hessian). The application must provide a routine which, given a vector v (stored in hessVector), computes the Hessian-vector product, H*v, and returns the result (again in hessVector). This option is chosen by setting user option hessopt = 5.

Note

This option may not be used when algorithm = 1 or 4 since, as mentioned above, the Interior/Direct and SQP algorithms need the full expression of the Hessian matrix (Hessian-vector products are not sufficient).

• Limited-memory Quasi-Newton BFGS:

The limited-memory quasi-Newton BFGS option is similar to the dense quasi-Newton BFGS option described above. However, it is better suited for large-scale problems since, instead of storing a dense Hessian approximation, it stores only a limited number of gradient vectors used to approximate the Hessian. The number of gradient vectors used to approximate the Hessian is controlled by user option lmsize.

A larger value of lmsize may result in a more accurate, but also more expensive, Hessian approximation. A smaller value may give a less accurate, but faster, Hessian approximation. When using the limited memory BFGS approach it is recommended to experiment with different values of this parameter (e.g. between 5 and 15).

In general, the limited-memory BFGS option requires more iterations to converge than the dense quasi-Newton BFGS approach, but will be much more efficient on large-scale problems. The limited-memory quasi-Newton option is chosen by setting user option hessopt = 6.

Note

When using a Hessian approximation option (i.e. hessopt > 1), you do not need to provide any sparsity pattern for the Hessian matrix since it is not used in the Hessian approximations. The Hessian sparsity pattern is only used when providing the exact Hessian (hessopt = 1).

As with exact first derivatives, exact second derivatives often provide a substantial benefit to Knitro and it is advised to provide them whenever possible. If the exact second derivative (i.e. the Hessian) matrix is provided by the user, it can (and should) be checked against a finite-difference approximation for errors using the Knitro derivative checker. See (Checking derivatives) below.

Jacobian and Hessian derivative matrices

The Jacobian matrix of the constraints is defined as

and the Hessian matrix of the Lagrangian is defined as

where is the vector of Lagrange multipliers (dual variables), and is a scalar (either 0 or 1) for the objective component of the Hessian that was introduced in Knitro 8.0.

Note

For backwards compatibility with older versions of Knitro, the user can always assume that if the user option hessian_no_f=0 (which is the default setting). However, if hessian_no_f=1, then Knitro will provide a status flag to the user when it needs a Hessian evaluation indicating whether the Hessian should be evaluated with or . The user must then evaluate the Hessian with the proper value of based on this status flag. Setting hessian_no_f=1 and computing the Hessian with the requested value of may improve Knitro efficiency in some cases. Examples of how to do this can be found in the examples/C directory.

Example

Assume we want to use Knitro to solve the following problem:

Rewriting in the Knitro standard notation, we have

Note

For demonstration purposes we show how to compute the Jacobian and Hessian corresponding to all constraints and components of this model. However, with the new API introduced in Knitro 11.0 the quadratic constraint and the linear constraint should be loaded separately and should not be included in the sparse Jacobian or Hessian structures provided through the nonlinear callbacks. In addition, the linear term in the objective could also be provided separately if desired.

Note

Please see the examples provided in examples/C/ which demonstrate how to provide the derivatives (e.g. objective gradient, constraint Jacobian, and Hessian) for nonlinear terms via callbacks for different types of models, while loading linear and quadratic structures separately.

Computing the Sparse Jacobian Matrix

The gradients (first derivatives) of the objective and constraint functions are given by

The constraint Jacobian matrix is the matrix whose rows store the (transposed) constraint gradients, i.e.,

The values of depend on the value of and change during the solution process. The indices specifying the nonzero elements of this matrix remain constant and are set in KN_set_cb_grad() by the values of jacIndexCons and jacIndexVars.

Computing the Sparse Hessian Matrix

For the example above, the Hessians (second derivatives) of the objective function is given by

and the Hessians of constraints are given by

Scaling the objective matrix by , and the constraint matrices by their corresponding Lagrange multipliers and summing, we get

The values of depend on the value of and (and , which is either 0 or 1) and change during the solution process. The indices specifying the nonzero elements of this matrix remain constant and are set in KN_set_cb_hess() by the values of hessIndexVars1 and hessIndexVars2.

Inputing derivatives

MATLAB users can provide the Jacobian and Hessian matrices in standard MATLAB format, either dense or sparse. See the fmincon documentation, http://www.mathworks.com/help/optim/ug/writing-constraints.html#brhkghv-16, for more information. Users of the callable library must provide derivatives to Knitro in sparse format. In the above example, the number of nonzero elements nnzJ in J(x) is 6, and these arrays would be specified as follows (here in column-wise order, but the order is arbitrary) using the callable library.

jac[0] = -sin(x[0]);  jacIndexCons[0] = 0; jacIndexVars[0] = 0;
jac[1] = 2*x[0];      jacIndexCons[1] = 1; jacIndexVars[1] = 0;
jac[2] = 1;           jacIndexCons[2] = 2; jacIndexVars[2] = 0;
jac[3] = 2*x[1];      jacIndexCons[3] = 1; jacIndexVars[3] = 1;
jac[4] = 1;           jacIndexCons[4] = 2; jacIndexVars[4] = 1;
jac[5] = 1;           jacIndexCons[5] = 2; jacIndexVars[5] = 2;


In the object-oriented interface, these values are set in the user-defined problem class by implementing:

std::vector<int> KTRIProblem::getJacIndexCons();
std::vector<int> KTRIProblem::getJacIndexVars();


to return vectors with the constraint and variable indices in either column-wise or row-wise order. If using the KTRProblem class, setting the values with:

KTRProblem::setJacIndexCons(int id, int val);
KTRProblem::setJacIndexVars(int id, int val);


will store the values to be returned by the appropriate get functions.

Note

Using KTRPProblem class, by default the Jacobian is assumed to be dense and stored row-wise.

Note

Even if the application does not evaluate derivatives (i.e. finite-difference first derivatives are used), it must still provide a sparsity pattern for the constraint Jacobian matrix that specifies which partial derivatives are nonzero. Knitro uses the sparsity pattern to speed up linear algebra computations.

Note

When using finite-difference first derivatives (gradopt > 1), if the sparsity pattern is unknown, then the application should specify a fully dense pattern (i.e., assume all partial derivatives are nonzero). This can easily and automatically be done by setting nnzJ to either KN_DENSE_ROWMAJOR or KN_DENSE_ROWMAJOR in the callable library function KN_set_cb_grad() (and setting jacIndexCons and jacIndexVars to be NULL).

Since the Hessian matrix will always be a symmetric matrix, Knitro only stores the nonzero elements corresponding to the upper triangular part (including the diagonal). In the example here, the number of nonzero elements in the upper triangular part of the Hessian matrix nnzH is 4. The Knitro array hess stores the values of these elements, while the arrays hessIndexVars1 and hessIndexVars2 store the row and column indices respectively. The order in which these nonzero elements is stored is not important. If we store them column-wise, the arrays hess, hessIndexVars1 and hessIndexVars2 are as follows:

hess[0] = -lambda[0]*cos(x[0]) + 2*lambda[1];
hessIndexVars1[0] = 0;
hessIndexVars2[0] = 0;

hess[1] = 2*lambda[1];
hessIndexVars1[1] = 1;
hessIndexVars2[1] = 1;

hess[2] = sigma*3*x[2]*x[2];
hessIndexVars1[2] = 1;
hessIndexVars2[2] = 2;

hess[3] = sigma*6*x[1]*x[2];
hessIndexVars1[3] = 2;
hessIndexVars2[3] = 2;


In the object-oriented interface, the Hessian matrix column indices are set in the user-defined problem class by implementing:

std::vector<int> KTRProblem::getHessIndexRows();
std::vector<int> KTRProblem::getHessIndexCols();


and having them return vectors with the Hessian row and column indices, respectively. If using the KTRProblem class, setting the values with:

KTRProblem::setHessIndexRows(int id, int val);
KTRProblem::setHessIndexCols(int id, int val);


will store the values to be returned by the appropriate get functions.

Note

In Knitro, the array objGrad corresponding to , can be provided in dense or sparse form. The arrays jac, jacIndexCons, and jacIndexVars store information concerning only the nonzero (and typically nonlinear) elements of J(x). The array jac stores the nonzero values in J(x) evaluated at the current solution estimate x, jacIndexCons stores the constraint function (or row) indices corresponding to these values, and jacIndexVars stores the variable (or column) indices. There is no restriction on the order in which these elements are stored; however, it is common to store the nonzero elements of J(x) in either row-wise or column-wise fashion.

MATLAB example

Let us modify our example from Getting started with MATLAB so that the first derivatives are provided as well. In MATLAB, you only need to provide the derivatives for the nonlinear functions, whereas in the callable library API you need to provide the derivatives for both linear and nonlinear constraints in . In the example below, only the inequality constraint is nonlinear, so we only provide the derivative for this constraint.

function firstDer()

function [f, g] = obj(x)
f = 1000 - x(1)^2 - 2*x(2)^2 - x(3)^2 - x(1)*x(2) - x(1)*x(3);
if nargout == 2
g = [-2*x(1) - x(2) - x(3); - 4*x(2) - x(1); -2*x(3) - x(1)];
end
end

% nlcon should return [c, ceq, GC, GCeq]
% with c(x) <= 0 and ceq(x) = 0
function [c, ceq, GC, GCeq] = nlcon(x)
c = -(x(1)^2 + x(2)^2 + x(3)^2 - 25);
ceq = [];
if nargout==4
GC = -([2*x(1); 2*x(2); 2*x(3)]);
GCeq = [];
end
end

x0  = [2; 2; 2];
A = []; b = []; % no linear inequality constraints ("A*x <= b")
Aeq = [8 14 7]; beq = [56]; % linear equality constraints ("Aeq*x = beq")
lb = zeros(3,1); ub = []; % lower and upper bounds

knitromatlab(@obj, x0, A, b, Aeq, beq, lb, ub, @nlcon, [], options);

end


The only difference with the derivative-free case is that the code that computes the objective function and the constraints also returns the first derivatives along with function values. The output is as follows.

=======================================
Artelys Knitro 11.0.0
=======================================

Knitro presolve eliminated 0 variables and 0 constraints.

algorithm:            1
hessopt:              2
honorbnds:            1
maxit:                10000
outlev:               1
par_concurrent_evals: 0
Knitro changing bar_initpt from AUTO to 3.
Knitro changing bar_murule from AUTO to 4.
Knitro changing bar_penaltycons from AUTO to 1.
Knitro changing bar_penaltyrule from AUTO to 2.
Knitro changing bar_switchrule from AUTO to 2.
Knitro changing linsolver from AUTO to 2.

Problem Characteristics                                 (   Presolved)
-----------------------
Objective goal:  Minimize
Objective type:  general
Number of variables:                                  3 (           3)
bounded below only:                               0 (           0)
bounded above only:                               0 (           0)
bounded below and above:                          3 (           3)
fixed:                                            0 (           0)
free:                                             0 (           0)
Number of constraints:                                2 (           2)
linear equalities:                                1 (           1)
quadratic equalities:                             0 (           0)
gen. nonlinear equalities:                        0 (           0)
linear one-sided inequalities:                    0 (           0)
quadratic one-sided inequalities:                 0 (           0)
gen. nonlinear one-sided inequalities:            1 (           1)
linear two-sided inequalities:                    0 (           0)
quadratic two-sided inequalities:                 0 (           0)
gen. nonlinear two-sided inequalities:            0 (           0)
Number of nonzeros in Jacobian:                       6 (           6)
Number of nonzeros in Hessian:                        6 (           6)

EXIT: Locally optimal solution found.

Final Statistics
----------------
Final objective value               =   9.36000000000049e+02
Final feasibility error (abs / rel) =   0.00e+00 / 0.00e+00
Final optimality error  (abs / rel) =   4.78e-08 / 2.99e-09
# of iterations                     =          7
# of CG iterations                  =          0
# of function evaluations           =          8
# of gradient evaluations           =          8
Total program time (secs)           =       0.01651 (     0.015 CPU time)
Time spent in evaluations (secs)    =       0.00472

===============================================================================


The number of function evaluation was reduced to 8, simply by providing exact first derivatives. This small example shows the practical importance of being able to provide exact derivatives; since (unlike modeling environments like AMPL) MATLAB does not provide automatic differentiation, the user must compute these derivatives analytically and then code them manually as in the above example.

C/C++ example

Let us show how to provide derivatives through the callable library API. Here we look at the C example examples/C/exampleNLP2.c, which solves the model:

max   x0*x1*x2*x3         (obj)
s.t.  x0^3 + x1^2 = 1     (c0)
x0^2*x3 - x2 = 0    (c1)
x3^2 - x1 = 0       (c2)


Note that this problem has linear terms, quadratic terms and general nonlinear terms. We will show how to provide both first and second derivatives for the nonlinear structure through callbacks while separately loading the linear and quadratic structure.

#include <stdio.h>
#include <stdlib.h>
#include "knitro.h"

/** Callback for nonlinear function evaluations */
int callbackEvalFC (KN_context_ptr             kc,
CB_context_ptr             cb,
KN_eval_request_ptr const  evalRequest,
KN_eval_result_ptr  const  evalResult,
void              * const  userParams)
{
const double *x;
double *obj;
double *c;

if (evalRequest->type != KN_RC_EVALFC)
{
printf ("*** callbackEvalFC incorrectly called with eval type %d\n",
evalRequest->type);
return( -1 );
}
x = evalRequest->x;
obj = evalResult->obj;
c = evalResult->c;

/** Evaluate nonlinear term in objective */
*obj = x[0]*x[1]*x[2]*x[3];

/** Evaluate nonlinear terms in constraints */
c[0] = x[0]*x[0]*x[0];
c[1] = x[0]*x[0]*x[3];

return( 0 );
}

/** Callback for nonlinear gradient/Jacobian evaluations */
int callbackEvalGA (KN_context_ptr             kc,
CB_context_ptr             cb,
KN_eval_request_ptr const  evalRequest,
KN_eval_result_ptr  const  evalResult,
void              * const  userParams)
{
const double *x;
double *jac;

if (evalRequest->type != KN_RC_EVALGA)
{
printf ("*** callbackEvalGA incorrectly called with eval type %d\n",
evalRequest->type);
return( -1 );
}
x = evalRequest->x;
jac = evalResult->jac;

/** Evaluate nonlinear term in objective gradient */

/** Evaluate nonlinear terms in constraint gradients (Jacobian) */
jac[0] = 3.0*x[0]*x[0]; /* derivative of x0^3 term  wrt x0    */
jac[1] = 2.0*x[0]*x[3]; /* derivative of x0^2*x3 term  wrt x0 */
jac[2] = x[0]*x[0];     /* derivative of x0^2*x3 terms wrt x3 */

return( 0 );
}

/** Callback for nonlinear Hessian evaluation */
int callbackEvalH (KN_context_ptr             kc,
CB_context_ptr             cb,
KN_eval_request_ptr const  evalRequest,
KN_eval_result_ptr  const  evalResult,
void              * const  userParams)
{
const double *x;
const double *lambda;
double sigma;
double *hess;

if (   evalRequest->type != KN_RC_EVALH
&& evalRequest->type != KN_RC_EVALH_NO_F )
{
printf ("*** callbackEvalHess incorrectly called with eval type %d\n",
evalRequest->type);
return( -1 );
}

x = evalRequest->x;
lambda = evalRequest->lambda;
/** Scale objective component of the Hessian by sigma */
sigma = *(evalRequest->sigma);
hess = evalResult->hess;

/** Evaluate nonlinear term in the Hessian of the Lagrangian */
hess[0] = lambda[0]*6.0*x[0] + lambda[1]*2.0*x[3];
hess[1] = sigma*x[2]*x[3];
hess[2] = sigma*x[1]*x[3];
hess[3] = sigma*x[1]*x[2] + lambda[1]*2.0*x[0];
hess[4] = sigma*x[0]*x[3];
hess[5] = sigma*x[0]*x[2];
hess[6] = sigma*x[0]*x[1];

return( 0 );
}

/** main function */
int  main (int  argc, char  *argv[])
{
int  i, nStatus, error;

/** Declare variables. */
KN_context   *kc;
int    n, m;
double cEqBnds[3] = {1.0, 0.0, 0.0};
/** Used to define linear constraint structure */
int    lconIndexCons[2];
int    lconIndexVars[2];
double lconCoefs[2];
/** Used to define quadratic constraint structure */
int    qconIndexCons[2];
int    qconIndexVars1[2];
int    qconIndexVars2[2];
double qconCoefs[2];
/** Pointer to structure holding information for nonlinear
*  evaluation callback for terms:
*     x0*x1*x2*x3  in the objective
*     x0^3         in first constraint
*     x0^2*x3      in second constraint */
CB_context   *cb;
int    cIndices[2];
/** Used to define Jacobian structure for nonlinear terms
*  evaluated in the callback. */
int    cbjacIndexCons[3];
int    cbjacIndexVars[3];
double cbjacCoefs[3];
/** Used to define Hessian structure for nonlinear terms
*  evaluated in the callback. */
int    cbhessIndexVars1[7];
int    cbhessIndexVars2[7];
double cbhessCoefs[7];
/** For solution information */
double x[4];
double objSol;
double feasError, optError;

/** Create a new Knitro solver instance. */
error = KN_new(&kc);
if (error) exit(-1);
if (kc == NULL)
{
printf ("Failed to find a valid license.\n");
return( -1 );
}

/** Initialize Knitro with the problem definition. */

/** Add the variables and specify initial values for them.
*  Note: any unset lower bounds are assumed to be
*  unbounded below and any unset upper bounds are
*  assumed to be unbounded above. */
n = 4;
error = KN_add_vars(kc, n, NULL);
if (error) exit(-1);
for (i=0; i<n; i++) {
error = KN_set_var_primal_init_value(kc, i, 0.8);
if (error) exit(-1);
}

/** Add the constraints and set the rhs and coefficients */
m =3;
error = KN_add_cons(kc, m, NULL);
if (error) exit(-1);
error = KN_set_con_eqbnds_all(kc, cEqBnds);
if (error) exit(-1);

/** Coefficients for 2 linear terms */
lconIndexCons[0] = 1; lconIndexVars[0] = 2; lconCoefs[0] = -1.0;
lconIndexCons[1] = 2; lconIndexVars[1] = 1; lconCoefs[1] = -1.0;
error = KN_add_con_linear_struct (kc, 2,
lconIndexCons, lconIndexVars,
lconCoefs);
if (error) exit(-1);

/** Coefficients for 2 quadratic terms */

/* x1^2 term in c0 */
qconIndexCons[0] = 0; qconIndexVars1[0] = 1; qconIndexVars2[0] = 1;
qconCoefs[0] = 1.0;

/* x3^2 term in c2 */
qconIndexCons[1] = 2; qconIndexVars1[1] = 3; qconIndexVars2[1] = 3;
qconCoefs[1] = 1.0;

qconIndexVars1, qconIndexVars2,
qconCoefs);
if (error) exit(-1);

/** Add callback to evaluate nonlinear (non-quadratic) terms in the model:
*     x0*x1*x2*x3  in the objective
*     x0^3         in first constraint c0
*     x0^2*x3      in second constraint c1 */
cIndices[0] = 0; cIndices[1] = 1;
error = KN_add_eval_callback (kc, KNTRUE, 2, cIndices, callbackEvalFC, &cb);
if (error) exit(-1);

/** Set obj. gradient and nonlinear jac provided through callbacks.
*  Mark objective gradient as dense, and provide non-zero sparsity
*  structure for constraint Jacobian terms. */
cbjacIndexCons[0] = 0; cbjacIndexVars[0] = 0;
cbjacIndexCons[1] = 1; cbjacIndexVars[1] = 0;
cbjacIndexCons[2] = 1; cbjacIndexVars[2] = 3;
error = KN_set_cb_grad(kc, cb, KN_DENSE, NULL, 3, cbjacIndexCons,
cbjacIndexVars, callbackEvalGA);
if (error) exit(-1);

/* Set nonlinear Hessian provided through callbacks. Since the
* Hessian is symmetric, only the upper triangle is provided.
* The upper triangular Hessian for nonlinear callback structure is:
*
*  lambda0*6*x0 + lambda1*2*x3     x2*x3    x1*x3    x1*x2 + lambda1*2*x0
*               0                    0      x0*x3         x0*x2
*                                             0           x0*x1
*                                                          0
*  (7 nonzero elements)
*/
cbhessIndexVars1[0] = 0;  /* (row0,col0) element */
cbhessIndexVars2[0] = 0;
cbhessIndexVars1[1] = 0;  /* (row0,col1) element */
cbhessIndexVars2[1] = 1;
cbhessIndexVars1[2] = 0;  /* (row0,col2) element */
cbhessIndexVars2[2] = 2;
cbhessIndexVars1[3] = 0;  /* (row0,col3) element */
cbhessIndexVars2[3] = 3;
cbhessIndexVars1[4] = 1;  /* (row1,col2) element */
cbhessIndexVars2[4] = 2;
cbhessIndexVars1[5] = 1;  /* (row1,col3) element */
cbhessIndexVars2[5] = 3;
cbhessIndexVars1[6] = 2;  /* (row2,col3) element */
cbhessIndexVars2[6] = 3;
error = KN_set_cb_hess(kc, cb, 7, cbhessIndexVars1, cbhessIndexVars2, callbackEvalH);
if (error) exit(-1);

/** Set minimize or maximize (if not set, assumed minimize) */
error = KN_set_obj_goal(kc, KN_OBJGOAL_MAXIMIZE);
if (error) exit(-1);

/** Demonstrate setting a "newpt" callback.  the callback function
* "callbackNewPoint" passed here is invoked after Knitro computes
* a new estimate of the solution. */
error = KN_set_newpt_callback(kc, callbackNewPoint, NULL);
if (error) exit(-1);

/** Set option to print output after every iteration. */
error = KN_set_int_param (kc, KN_PARAM_OUTLEV, KN_OUTLEV_ITER);
if (error) exit(-1);

/** Solve the problem.
*
*  Return status codes are defined in "knitro.h" and described
*  in the Knitro manual. */
nStatus = KN_solve (kc);

printf ("\n\n");
printf ("Knitro converged with final status = %d\n",
nStatus);

/** An example of obtaining solution information. */
error = KN_get_solution(kc, &nStatus, &objSol, x, NULL);
if (!error) {
printf ("  optimal objective value  = %e\n", objSol);
printf ("  optimal primal values x  = (%e, %e, %e, %e)\n", x[0], x[1], x[2], x[3]);
}
error = KN_get_abs_feas_error (kc, &feasError);
if (!error)
printf ("  feasibility violation    = %e\n", feasError);
error = KN_get_abs_opt_error (kc, &optError);
if (!error)
printf ("  KKT optimality violation = %e\n", optError);

/** Delete the Knitro solver instance. */
KN_free (&kc);

return( 0 );
}


Running this code produces the following output.

=======================================
Artelys Knitro 11.0.0
=======================================

Knitro presolve eliminated 0 variables and 0 constraints.

outlev:                  1
Knitro changing algorithm from AUTO to 1.
Knitro changing bar_initpt from AUTO to 3.
Knitro changing bar_murule from AUTO to 4.
Knitro changing bar_penaltycons from AUTO to 1.
Knitro changing bar_penaltyrule from AUTO to 2.
Knitro changing bar_switchrule from AUTO to 1.
Knitro changing linesearch from AUTO to 1.
Knitro changing linsolver from AUTO to 2.

Problem Characteristics                                 (   Presolved)
-----------------------
Objective goal:  Maximize
Objective type:  general
Number of variables:                                  4 (           4)
bounded below only:                               0 (           0)
bounded above only:                               0 (           0)
bounded below and above:                          0 (           0)
fixed:                                            0 (           0)
free:                                             4 (           4)
Number of constraints:                                3 (           3)
linear equalities:                                0 (           0)
quadratic equalities:                             1 (           1)
gen. nonlinear equalities:                        2 (           2)
linear one-sided inequalities:                    0 (           0)
quadratic one-sided inequalities:                 0 (           0)
gen. nonlinear one-sided inequalities:            0 (           0)
linear two-sided inequalities:                    0 (           0)
quadratic two-sided inequalities:                 0 (           0)
gen. nonlinear two-sided inequalities:            0 (           0)
Number of nonzeros in Jacobian:                       7 (           7)
Number of nonzeros in Hessian:                        9 (           9)

EXIT: Locally optimal solution found.

Final Statistics
----------------
Final objective value               =   2.50000000082290e-01
Final feasibility error (abs / rel) =   1.86e-10 / 1.86e-10
Final optimality error  (abs / rel) =   2.66e-09 / 2.66e-09
# of iterations                     =          3
# of CG iterations                  =          0
# of function evaluations           =          4
# of gradient evaluations           =          4
# of Hessian evaluations            =          3
Total program time (secs)           =       0.00183 (     0.001 CPU time)
Time spent in evaluations (secs)    =       0.00001

===============================================================================

Knitro converged with final status = 0
optimal objective value  = 2.500000e-01
optimal primal values x  = (7.937005e-01, 7.071068e-01, 5.297315e-01, 8.408964e-01)
feasibility violation    = 1.863212e-10
KKT optimality violation = 2.660655e-09


Providing both first and second derivatives allows KNitro to solve this model in only 4 function evaluations.

Note

Automatic differentiation packages like ADOL-C and ADIFOR can help in generating code with derivatives. These codes are an alternative to differentiating the functions manually. Another option is to use symbolic differentiation software to compute an analytical formula for the derivatives.

Object-oriented C++ example

Let us now modify our C++ example from Getting started with the object-oriented interface, so as to provide first derivatives.

#include "KTRSolver.h"
#include "KTRProblem.h"
#include <iostream>

class ProblemExample : public KNITRO::KTRProblem {
// objective properties
void setObjectiveProperties() {
setObjType(KTR_OBJTYPE_GENERAL);
setObjGoal(KTR_OBJGOAL_MINIMIZE);
}

// constraint properties
void setConstraintProperties()
{
// set constraint types
setConTypes(0, KNITRO::KTREnums::ConstraintType::ConLinear);

// set constraint lower bounds
setConLoBnds(0.0);

// set constraint upper bounds
setConUpBnds(0, 0.0);
setConUpBnds(1, KTR_INFBOUND);
}

// Variable bounds. All variables 0 <= x.
void setVariableProperties() {
setVarLoBnds(0.0);
}

public:
// constructor: pass number of variables and constraints to base class
ProblemQCQP() : KTRProblem(3, 2) {
// set problem properties
setObjectiveProperties();
setVariableProperties();
setConstraintProperties();
}

// Objective and constraint evaluation function
// overrides KTRProblem class
double evaluateFC(const std::vector<double>& x,
std::vector<double>& c,
std::vector<double>& jac) {

// constraints
c[0] = 8.0e0*x[0] + 14.0e0*x[1] + 7.0e0*x[2] - 56.0e0;
c[1] = x[0] * x[0] + x[1] * x[1] + x[2] * x[2] - 25.0e0;

// return objective function value
return 1000 - x[0] * x[0] - 2.0e0*x[1] * x[1] - x[2] * x[2]
- x[0] * x[1] - x[0] * x[2];
}

// Gradient and Jacobian evaluation function
// overrides KTRProblem class
int evaluateGA(const std::vector<double>& x,
std::vector<double>& jac) override {

objGrad[0] = -2.0e0*x[0] - x[1] - x[2];
objGrad[1] = -4.0e0*x[1] - x[0];
objGrad[2] = -2.0e0*x[2] - x[0];

// gradient of the first constraint, c[0].
jac[0] = 8.0e0;
jac[1] = 14.0e0;
jac[2] = 7.0e0;

// gradient of the second constraint, c[1]. */
jac[3] = 2.0e0*x[0];
jac[4] = 2.0e0*x[1];
jac[5] = 2.0e0*x[2];
return 0;
}
};

int main(int argc, char *argv[]) {
// Create a problem instance.
ProblemExample problem = ProblemExample();

// Create a solver - optional arguments:
// exact first derivatives
// BFGS approximate second derivatives
KNITRO::KTRSolver solver(&instance, KTR_GRADOPT_EXACT, KTR_HESSOPT_BFGS);

int solveStatus = solver.solve();

if (solveStatus != 0) {
std::cout << std::endl;
std::cout << "KNITRO failed to solve the problem, final status = ";
std::cout << solveStatus << std::endl;
}
else {
std::cout << std::endl << "KNITRO successful, objective is = ";
std::cout << solver.getObj() << std::endl;
}

return 0;
}


Two changes were made to the previous example. This adds evaluateGA() function to the problem class, defining the derivatives, and the KTRSolver constructor is passed KTR_GRADOPT_EXACT instead of KTR_GRADOPT_FORWARD, since the exact gradient function is now defined. Running this example produces the same output as the callable library example.

Checking derivatives

One drawback of user-supplied derivatives is the risk of error in computing or implementing the derivatives, which would result in providing Knitro with (wrong and) incoherent information: the computed function values would not match the computed derivatives. Approximate derivatives computed by finite differences are useful to check whether user-supplied derivatives match user-supplied function evaluations.

Users of modeling languages such as AMPL need not be worried about this, since derivatives are computed automatically by the modeling software. However, for users of MATLAB and the callable library it is a good practice to check one’s exact derivatives against finite differences approximations. Note that small differences between exact and finite-difference approximations are to be expected.

Knitro offers the following user options to check for errors in the user-supplied first derivatives (i.e., the objective gradient and the Jacobian matrix) and second derivatives (i.e. the Hessian matrix).

Derivative Check Options

Option Meaning
derivcheck Specifies whether or not to enable the derivative checker, and which derivatives to check (first, second or both)
derivcheck_terminate Whether to terminate after the derivative check or continue to the optimization if successful
derivcheck_tol Specifies the relative tolerance used for identifying potential errors in the derivatives
derivcheck_type Specifies whether to use forward or central finite differences to compute the derivative check

Note that to use the derivative checker, you must set gradopt = 1 (to check the first derivatives) and/or hessopt=1 (to check the second derivatives/Hessian). You must also supply callback routines that compute the objective and constraint functions and analytic first derivatives (to check the first derivatives), and/or analytic second derivatives/Hessian (to check the second derivatives). By default, the derivative checker is turned off. To check first derivatives only, simply set derivcheck = 1; to check second derivatives/Hessian only set derivcheck = 2; and to check both first and second derivatives set derivcheck = 3. Additionally you can set derivcheck_type to specify what type of finite differencing to use for the derivative check, and derivcheck_tol to change the default relative tolerance used to detect derivative errors. Setting derivcheck_terminate will determine whether Knitro always stops after the derivative check is completed, or continues with the optimization (when the derivative check is successful).

It is best to check the derivatives at different points, and to avoid points where partial derivatives happen to equal zero. If an initial point was provided by the user then Knitro will perform the derivative check at this point. Otherwise, if no initial point is provided, Knitro will perform the derivative check at a randomly generated point that satisfies the variable bounds. To perform a derivative check at different points, simply feed different initial points to Knitro.

Using the example problem above, if the Knitro derivative checker runs, with value derivcheck = 1, and the relative differences between all the user-supplied first derivatives and finite-difference first derivatives satisfy the tolerance defined by derivcheck_tol, then you will see the following output:

-------------------------------------------------------------------------
Knitro Derivative Check Information

Checking 1st derivatives with forward finite differences.
Derivative check performed at user-supplied initial 'x' point.
Printing relative differences >   1.0000e-06.

Maximum relative difference in the objective gradient =   0.0000e+00.
Maximum relative difference in the Jacobian           =   0.0000e+00.
Derivative check passed.
-------------------------------------------------------------------------


before the optimization begins. Since the derivative check passed, Knitro will automatically proceed with the optimization using the user-supplied derivatives.

Now let us modify the objective gradient computation in the example problem above as follows:

/* gradient of objective */
/* objGrad[0] = -2*x[0] - x[1] - x[2]; */
objGrad[0] = -2*x[0] - x[1]; /* BUG HERE !!! */


Running the code again, we obtain:

-------------------------------------------------------------------------
Knitro Derivative Check Information

Checking 1st derivatives with forward finite differences.
Derivative check performed at user-supplied initial 'x' point.
Printing relative differences >   1.0000e-06.

WARNING: The discrepancy for objective gradient element objGrad[0]
exceeds the derivative check relative tolerance of 1.000000e-06.
analytic (user-supplied) value =  -6.000000000000e+00,
finite-difference value        =  -8.000000000000e+00,
|rel diff| =  3.3333e-01, |abs diff| =  2.0000e+00

Maximum relative difference in the objective gradient =   3.3333e-01.
Maximum relative difference in the Jacobian           =   0.0000e+00.
Derivative check failed.
-------------------------------------------------------------------------

EXIT: Derivative check failed.


Knitro is warning us that the finite difference approximation of the first coordinate of the gradient at the initial point is about -8, whereas its (supposedly) exact user-supplied value is about -6: there is a bug in our implementation of the gradient of the objective. Knitro prints a message indicating the derivative discrepancy it found and terminates immediately with a failure message.