SciPy, conditions optimization

8 min


SciPy (pronounced sai pay) is a numpy-based math package that also includes C and Fortran libraries. With SciPy, an interactive Python session turns into a fully functional processing environment like MATLAB, IDL, Octave, R, or SciLab.

In this article, we will look at the basic techniques of mathematical programming — solving conditional optimization problems for the scalar function of several variables using the scipy.optimize package. Algorithms for unconditional optimization have already been discussed in the previous article. More detailed and current help on scipy functions can always be obtained using the help (), Shift + Tab command or in the official documentation.

Introduction

A common interface for solving problems both conditional and unconditional optimization in the scipy.optimize package is provided by the function minimize (). However, it is known that there is no universal way to solve all problems, so the choice of an adequate method, as always, falls on the shoulders of the researcher.
The appropriate optimization algorithm is specified using the function argument. minimize (..., method = "").
For conditional optimization of the function of several variables, implementations of the following methods are available:

  • trust-constr – search for a local minimum in the confidence region. Article on wiki, article on Habré;
  • SLSQP – sequential quadratic programming with constraints, Newtonian method for solving the Lagrange system. Wiki article.
  • TNC – Truncated Newton Constrained, a limited number of iterations, is good for non-linear functions with a large number of independent variables. Article on the wiki.
  • L-BFGS-B – the Broyden – Fletcher – Goldfarb – Shanno method from the four, implemented with reduced memory consumption due to partial loading of vectors from the Hesse matrix. Article on wiki, article on Habré.
  • COBYLAMARE Constrained Optimization By Linear Approximation, limited optimization with linear approximation (without calculating the gradient). Article on the wiki.

Depending on the method chosen, the conditions and restrictions are set differently for solving the problem:

  • class object Bounds for methods L-BFGS-B, TNC, SLSQP, trust-constr;
  • list (min, max) for the same methods L-BFGS-B, TNC, SLSQP, trust-constr;
  • object or list of objects LinearConstraint, NonlinearConstraint for COBYLA, SLSQP, trust-constr;
  • dictionary or list of dictionaries {'type': str, 'fun': callable, 'jac': callable, opt, 'args': sequence, opt} for methods COBYLA, SLSQP.

Article layout:
1) Consider the use of the conditional optimization algorithm in the trusted domain (method = "trust-constr") with restrictions specified as objects Bounds, LinearConstraint, NonlinearConstraint ;
2) Consider sequential least squares programming (method = "SLSQP") with the restrictions specified in the form of a dictionary {'type', 'fun', 'jac', 'args'};
3) Disassemble an example of optimizing the output products on the example of a web studio.

Conditional optimization method = "trust-constr"

Method implementation trust-constr based on EQSQP for problems with constraints of the equality type and on TRIP for problems with constraints in the form of inequalities. Both methods are implemented by local minimum search algorithms in the confidence domain and are well suited for large-scale tasks.

The mathematical formulation of the minimum search problem in general:

$ min_x f (x) $

$ c ^ l leq c (x) leq c ^ u, $

$ x ^ l leq x leq x ^ u $

For strict equality constraints, the lower bound is set equal to the upper

$ c ^ l_j = c ^ u_j $

.
For one-sided restriction, the upper or lower limit is set np.inf with the appropriate sign.
Let it be necessary to find the minimum of the known Rosenbrock function of two variables:

$ min_ {x_0, x_1} 100 (x_0 - x_1 ^ 2) ^ 2 + (1-x_0) ^ 2 $

The following restrictions on its scope are set:

$ x_0 ^ 2 + x_1 leq 1 $

$ x_0 ^ 2 - x_1 leq 1 $

$ 2x_0 + x_1 = 1 $

$ x_0 + 2x_1 leq 1 $

$ 0 leq x_0 leq 1 $

$ -0.5 leq x_1 leq 2.0 $

In our case, there is only one solution at

$[x_0, x_1] = [0.4149, 0.1701]$

for which only the first and fourth constraints are valid.
Let's go through the restrictions from the bottom up and consider how you can write them in scipy.
Restrictions

$ 0 leq x_0 leq 1 $

and

$ -0.5 leq x_1 leq 2.0 $

we define using the Bounds object.

from scipy.optimize import Bounds
bounds = bounds ([0, -0.5], [1.0, 2.0])

Restrictions

$ x_0 + 2 x_1 leq 1 $

and

$ 2 x_0 + x_1 = 1 $

write in linear form:

$ begin {bmatrix} - infty \ 1 end {bmatrix} leq begin {bmatrix} 1 & 2 \ 2 & 1 end {bmatrix} begin {bmatrix} x_0 \ x_1 end {bmatrix } leq begin {bmatrix} 1 \ 1 end {bmatrix} $

We define these constraints as a LinearConstraint object:

import numpy as np
from scipy.optimize import LinearConstraint
linear_constraint = LinearConstraint ([[1, 2], [2, 1]], [-np.inf, 1], [1, 1])

Finally, the nonlinear constraint in matrix form:

$ c (x) = begin {bmatrix} x_0 ^ 2 + x_1 \ x_0 ^ 2 - x_1 end {bmatrix} leq begin {bmatrix} 1 \ 1 end {bmatrix} $

We define the Jacobi matrix for this restriction and a linear combination of the Hessian matrix with an arbitrary vector

$ v $

:

$ J (x) = begin {bmatrix} 2x_0 & 1 \ 2x_0 & -1 end {bmatrix} $

$ H (x, v) = sum limits_ {i = 0} ^ 1 v_i nabla ^ 2 c_i (x) = v_0 begin {bmatrix} 2 & 0 \ 2 & 0 end {bmatrix} + v_1 begin {bmatrix} 2 & 0 \ 2 & 0 end {bmatrix} $

Now we can define a nonlinear constraint as an object. NonlinearConstraint:

from scipy.optimize import NonlinearConstraint

def cons_f (x):
     return [x[0]** 2 + x[1], x[0]** 2 - x[1]]def cons_J (x):
     return[[2*x[[2*x[0], one],[2*x[2*x[0], -one]]def cons_H (x, v):
     return v[0]* np.array ([[2, 0], [0, 0]]) + v[1]* np.array ([[2, 0], [0, 0]])

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1, jac = cons_J, hess = cons_H)

If the size is large, the matrix can be set in a sparse form:

from scipy.sparse import csc_matrix

def cons_H_sparse (x, v):
     return v[0]* csc_matrix ([[2, 0], [0, 0]]) + v[1]* csc_matrix ([[2, 0], [0, 0]])

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1,
                                            jac = cons_J, hess = cons_H_sparse)

or as an object LinearOperator:

from scipy.sparse.linalg import LinearOperator

def cons_H_linear_operator (x, v):
    def matvec (p):
        return np.array ([p[p[0]* 2 * (v[0]+ v[1]), 0])
    return LinearOperator ((2, 2), matvec = matvec)

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1,
                                jac = cons_J, hess = cons_H_linear_operator)

When calculating the Hesse matrix

$ H (x, v) $

costly, you can use the class Hessianupdatestrategy. The following strategies are available: Bfgs and SR1.

from scipy.optimize import BFGS

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1, jac = cons_J, hess = BFGS ())

Hessian can also be calculated using finite differences:

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1, jac = cons_J, hess = '2-point')

The Jacobi matrix for constraints can also be calculated using finite differences. However, in this case, the Hessian matrix using finite differences is no longer calculated. Hessian must be defined as a function or using the HessianUpdateStrategy class.

nonlinear_constraint = NonlinearConstraint (cons_f, -np.inf, 1, jac = '2-point', hess = BFGS ())

The solution of the optimization problem is as follows:

from scipy.optimize import minimize
from scipy.optimize import rosen, rosen_der, rosen_hess, rosen_hess_prod

x0 = np.array ([0.5, 0])
res = minimize (rosen, x0, method = 'trust-constr', jac = rosen_der, hess = rosen_hess,
                constraints =[linear_constraint, nonlinear_constraint],
                options = {'verbose': 1}, bounds = bounds)
print (res.x)

`gtol` termination condition is satisfied.
Number of iterations: 12, function evaluations: 8, CG iterations: 7, optimality: 2.99e-09, constraint violation: 1.11e-16, execution time: 0.033 s.
[0.41494531 0.17010937]

If necessary, the Hessian calculation function can be defined using the LinearOperator class.

def rosen_hess_linop (x):
    def matvec (p):
        return rosen_hess_prod (x, p)
    return LinearOperator ((2, 2), matvec = matvec)

res = minimize (rosen, x0, method = 'trust-constr', jac = rosen_der, hess = rosen_hess_linop,
                 constraints =[linear_constraint, nonlinear_constraint],
                 options = {'verbose': 1}, bounds = bounds)

print (res.x)

or the product of the Hessian and an arbitrary vector through the parameter hessp:

res = minimize (rosen, x0, method = 'trust-constr', jac = rosen_der, hessp = rosen_hess_prod,
                constraints =[linear_constraint, nonlinear_constraint],
                options = {'verbose': 1}, bounds = bounds)
print (res.x)

Alternatively, the first and second derivatives of the optimized function can be calculated approximately. For example, a hessian can be approximated using the function SR1 (quasi-Newtonian approximation). The gradient can be approximated by finite differences.

from scipy.optimize import SR1
res = minimize (rosen, x0, method = 'trust-constr', jac = "2-point", hess = SR1 (),
               constraints =[linear_constraint, nonlinear_constraint],
               options = {'verbose': 1}, bounds = bounds)
print (res.x)

Conditional optimization method = "SLSQP"

The SLSQP method is designed to solve problems of minimizing a function in the form:

$ min_x f (x) $

$ c_j (x) = 0, j in mathcal {E} $

$ c_j (x) geq 0, j in mathcal {I} $

$ lb_i leq x_i leq ub_i, i = 1, ..., N $

Where

$ mathcal {E} $

and

$ mathcal {I} $

– sets of indexes of expressions describing restrictions in the form of equalities or inequalities.

$[lb_i, ub_i]$

– sets of lower and upper bounds for the domain of definition of the function.

Linear and nonlinear constraints are described as dictionaries with keys. type, fun and jac.

ineq_cons = {'type': 'ineq',
             'fun': lambda x: np.array ([1-x[1-x[0] - 2 * x [1],
                                          1 - x [0] ** 2 - x [1],
                                          1 - x [0] ** 2 + x [1]]),
             'jac': lambda x: np.array ([[- 1.0, -2.0],[-2 * x
                                          [-2*x[0], -1.0],[-2 * x
                                          [-2*x[0], 1.0]])
            }

eq_cons = {'type': 'eq',
           'fun': lambda x: np.array ([2*x[2*x[0] + x [1] - one]),
           'jac': lambda x: np.array ([2.0, 1.0])
          }

The search for the minimum is as follows:

x0 = np.array ([0.5, 0])
res = minimize (rosen, x0, method = 'SLSQP', jac = rosen_der,
               constraints =[eq_cons, ineq_cons], options = {'ftol': 1e-9, 'disp': True},
               bounds = bounds)

print (res.x)

Optimization terminated successfully. (Exit mode 0)
            Current function value: 0.34271757499419825
            Iterations: 4
            Function evaluations: 5
            Gradient evaluations: 4
[0.41494475 0.1701105 ]

Optimization example

In connection with the transition to the fifth technological mode, consider the optimization of production on the example of a web-studio, which brings us a small but steady income. Imagine yourself a director of a galley, which produces three types of products:

  • x0 – selling landings, from 10 tr.
  • x1 – corporate sites, from 20 tr.
  • x2 – online stores, from 30 tr.

Our friendly working team includes four juns, two midls and one seigneur. The fund of their working time for the month:

  • Juny: 4 * 150 = 600 people * hour,
  • Middleies: 2 * 150 = 300 people * hour,
  • Senor: 150 people * hour.

Let the first available junior should spend (10, 20, 30) hours on the development and deployment of one site of type (x0, x1, x2), middle (7, 15, 20), senor – (5, 10, 15) best hours time of your life.

Like any normal director, we want to maximize the monthly profit. The first step to success is writing the target function. value as the amount of income from products produced during the month:

def value (x):
    return - 10 * x[0] - 20 * x[1] - 30 * x[2]

This is not an error; when searching for a maximum, the objective function is minimized with the opposite sign.

The next step – we prohibit the processing of our employees and impose restrictions on the working time fund:

$ begin {bmatrix} 10 & 20 & 30 \ 7 & 15 & 20 \ 5 & 10 & 15 end {bmatrix} begin {bmatrix} x_0 \ x_1 \ x_2 end {bmatrix} leq begin {bmatrix} 600 \ 300 \ 150 end {bmatrix} $

Which is equivalent to:

$ begin {bmatrix} 600 \ 300 \ 150 end {bmatrix} - begin {bmatrix} 10 & 20 & 30 \ 7 & 15 & 20 \ 5 & 10 & 15 end {bmatrix} begin {bmatrix} x_0 \ x_1 \ x_2 end {bmatrix} geq 0 $

ineq_cons = {'type': 'ineq',
             'fun': lambda x: np.array ([600-10*x[600-10*x[0] - 20 * x [1] - 30 * x[2],
                                         300 - 7 * x [0] - 15 * x [1] - 20 * x[2],
                                         150 - 5 * x [0] - 10 * x [1] - 15 * x[2]])
            }

Formal restriction – output should be only positive:

bnds = bounds ([0, 0, 0], [np.inf, np.inf, np.inf])

And finally, the most optimistic assumption is that because of the low price and high quality, a queue of satisfied customers is constantly lining up with us. We are free to choose monthly production volumes based on the solution of the conditional optimization problem with scipy.optimize:

x0 = np.array ([10, 10, 10])
res = minimize (value, x0, method = 'SLSQP', constraints = ineq_cons, bounds = bnds)
print (res.x)

[7.85714286 5.71428571 3.57142857]

The nasty one is rounded up to the whole and we calculate the monthly load of the rowers with the optimal output of products x = (8, 6, 3) :

  • Juny: 8 * 10 + 6 * 20 + 3 * 30 = 290 people * hour;
  • Middleies: 8 * 7 + 6 * 15 + 3 * 20 = 206 people * hour;
  • Senor: 8 * 5 + 6 * 10 + 3 * 15 = 145 people * hour.

Conclusion: for the director to get his deserved maximum, it’s best to do 8 landing pages, 6 medium sites and 3 stores per month. The senor must plow without interrupting the machine, the load of mids will be about 2/3, dzunov less than half.

Conclusion

The article outlines the basic methods of working with the package. scipy.optimizeused to solve conditional minimization problems. I personally use scipy purely for academic purposes, so this example is of such a comic character.

A lot of theory and vinar examples can be found, for example, in the book of I. L. Akulich "Mathematical programming in examples and problems". More hardcore use scipy.optimize to build a 3D structure for a set of images (article on Habré) can be viewed in scipy-cookbook.

The main source of information is docs.scipy.org, wishing to contribute to the translation of this and other sections. scipy Welcome to github.

Thank you mephistopheies for participating in the preparation of the publication.


0 Comments

Leave a Reply