Cutting planes with Julia and JuMP

When you need to interact with a mathematical programming solver, you have a few possible choices. I’ll briefly discuss the pros and cons of the different approaches one can use, and talk about a new interface, JuMP, developed in the Julia language. Differently to other interfaces I’ve tried in the past, JuMP also supports the implementation of solver callbacks. I’ll show an example on the separation of cutting planes for a lot-sizing problem (code included).

The high road

In many cases, a good option is to use a high-level domain-specific modelling language, such as AMPL or GAMS (I prefer the former), which allows you to write something which is as close as possible to mathematical notation. This provides great flexibility in tinkering with the “static” model — you can reformulate it in different ways with almost no effort. Then, you can solve it with any solver you like, be it Cplex, Gurobi, SCIP, or any other. Modelling languages fall short when you want to do something dynamic, like generating cutting planes during the solution process, modifying the branching decisions, using custom heuristics, or maybe if you want to integrate your optimization process in a more complex system. There are workarounds to emulate some of these things, but it’s often kind of painful and quite inefficient.

Getting your hands dirty

The second choice is going all-in: you pick a solver, choose a (supported) programming language and write a real program, using the interface exposed by the solver callable library. This allows for a much finer control of the solution process, that you can typically interact with via callbacks. There are some drawbacks. First, writing code in a language like C++ or Java requires a bit more effort and time than just writing 20 lines of AMPL code. Second, the code you are writing is indissolubly linked with the solver you have chosen, and there is no way to switch to another one unless you want to learn a whole new library interface and rewrite basically everything from scratch.

The third way

A third option is using an interface in a language which is more general-purpose than AMPL or GAMS (and possibly open-source…), but that can seamlessly switch between solvers. There have been in the past several attempts at high-level interfaces for mathematical programming solvers, like Coin-or OSI, Pyomo or Pulp (all with more-or-less-friendly open-source licenses). The advantage of these interfaces is that you have the power of a real programming language, so you can do a lot more that you can with AMPL (trading off expressiveness!), while remaining solver-independent. This means that, for example, you can use an academically-licensed commercial solver when you are doing research, but you can switch to an open-source solver if you want to sell or use your code for consulting/commercial purpose.

In this direction (and somewhat further) goes the Julia interface for mathematical programming, JuMP, which recently caught my attention. The library is written in Julia, that is a new language claiming to bring together the features of high-level scientific and general-purpose programming languages (Python, R, Matlab) with the speed of C code.

Aside from that, what I really like about JuMP, which afaik is not in any of the predecessors, is that it provides interfaces for solver callbacks. So, it’s not merely an interface that lets you formulate a problem and pass some options to a solver, but it actually allows you to interact with the solution process. e.g., with cut, branch and heuristic callbacks, in a solver-independent way — and you can switch between solvers just changing a line of code, which is awesome. I’m going to show this with an example.

An example: generating cutting planes for a simple lot-sizing problem

Let’s consider an uncapacitated lot-sizing problem, with the following formulation:

\[\begin{align} \min\quad & \sum_{t\in T}\left(K_tz_t + c_t q_t + h_t s_t \right) & \\ \text{s.t.}\quad & s_{t-1} + q_t = d_t + s_t \qquad& \forall t \in T \\ & 0 \le q_t \le M z_t \qquad&\forall t \in T \\ & s_t \ge 0 \qquad&\forall t \in T \\ & z_t \in\{0,1\} \qquad&\forall t \in T \end{align}\]

where \(q_t\) is the production level, \(z_t\) is 1 iff \(q_t>0\), and \(s_t\) is the inventory level at the end of period \(t\). This simple problem is actually polynomially solvable, either via dynamic programming or with an extended formulation. However, let’s assume I want to solve it as a mixed-integer program, and use a solver callback to add some cutting planes to strengthen the formulation. Specifically, at each node of the search tree, we look for a violated inequality from the set of \((l,S)\) inequalities¹:

\[\begin{equation} \sum_{j\in S}q_i \leq \sum_{j\in S}d_{jl}z_{j} + s_{l},\qquad \forall l\in T, \forall S\subseteq\{1,\dots,l\} \end{equation}\]

that will be added to the formulation via a User Cut Callback. The Julia code that builds the model and calls the solver is the following:

function solveULS(path::String; solver=CplexSolver(), valid::Bool = true) 
   T, c, h, K, d = readULS(path)
   m = Model(solver = solver)
   @defVar(m, z[1:T], Bin)
   @defVar(m, q[i = 1:T] >= 0)
   @defVar(m, s[1:T] >= 0)

   @setObjective(m, Min, sum{c[t]*q[t] + K[t]*z[t] + h[t]*s[t], t=1:T})
   @addConstraint(m, activation[t = 1:T], q[t] <= sum(d[t:T])*z[t])
   @addConstraint(m, balance[t = 1:T], (t>1?s[t-1]:0) + q[t] == s[t] + d[t])

   separationtime = 0.
   separated = 0
   called = 0
   function lSgenerator(cb)
      tt = time()
      called += 1
      z_val = getValue(z)
      q_val = getValue(q)
      expr = separate(T, d, z_val, q_val, z, q)
      if expr != nothing
         @addUserCut(cb, expr >= 0)
         separated += 1
      end
      separationtime += time()-tt
   end 

   if valid
      addCutCallback(m, lSgenerator)
   end

   status = solve(m)
   println("Objective value: ", getObjectiveValue(m))
   println("Separation time: $separationtime seconds")
   println("Separated: $separations")
   status
end

After building the model (lines 3–10), we define the function lSgenerator that looks for a violated inequality and adds it to the model. The complete code, including the function separate() that actually implements the separation logic, can be found here.

I have run a few tests with Cplex and Gurobi, using their Julia interfaces CPLEX.jl and Gurobi.jl. The problem is far too easy for the solvers: they don’t need my cutting planes at all. So, to see the behavior and the impact of the cutting plane separation, I just disabled the default cuts of the solvers. In the Julia interactive shell, this means calling my function like this:

ls.solveULS("test.dat", solver=CplexSolver(CPX_PARAM_CUTSFACTOR=1), valid = true) 

and

ls.solveULS("test.dat", solver=GurobiSolver(Cuts=0), valid = true) 

to use either Cplex or Gurobi and disable their default cutting planes. Note that, in my code, only one valid inequality is generated per node, as it seems to be the best option with this setup.

The output of Cplex 12.5, for a randomly generated test instance with 200 time periods, is:

User cuts applied:  206

 Root node processing (before b&c):
   Real time             =    0.04 sec. (20.31 ticks)
 Sequential b&c:
   Real time             =    1.23 sec. (1095.59 ticks)
                           ------------
 Total (root+branch&cut) =    1.27 sec. (1115.89 ticks)

Objective value: 3636.00
Separation time: 85.73 ms
Separated: 206
Nodes: 380

while with Gurobi 5.6 you get:

Cutting planes:
  User: 201

Explored 368 nodes (11145 simplex iterations) in 0.91 seconds
Thread count was 1 (of 1 available processors)

Optimal solution found (tolerance 1.00e-04)
Best objective 3.636000000000e+03, best bound 3.636000000000e+03, gap 0.0%
Objective value: 3636.00
Separation time: 63.21 ms
Separated: 202
Nodes: 368

Gurobi and Cplex explore pretty much the same number of nodes and generate a similar number of cutting planes (nice to see they are actually doing something similar). It looks like Cplex used all the valid inequalities we added, while Gurobi discarded one. In the end, the results look comparable. Note that, not surprisingly, the results are extremely worse if the \((l,S)\) inequalities are disabled (valid=false): the formulation is very weak, and a pure branch-and-bound method takes ages, exploring millions of nodes (yay for cutting planes).

Under the hood: JuMP vs. C++ native interface

Out of curiosity, I’ve implemented exactly the same thing in C++, with Cplex’s Concert library. It took slightly more effort, C++ is clearly a bit more verbose, but in the end the code doesn’t look too different: the C++ version is here.

Now, this is the output of the run with the C++ implementation:

User cuts applied:  206

 Root node processing (before b&c):
   Real time             =    0.07 sec. (20.33 ticks)
 Sequential b&c:
   Real time             =    1.22 sec. (1099.97 ticks)
                           ------------
 Total (root+branch&cut) =    1.29 sec. (1120.30 ticks)

Separation time: 13 ms
Separated: 206
Called: 206
Nodes: 380

A couple of remarks. The separation appears to be more efficient in C++ (between 4 and 20 times faster, the measurements have a pretty high variance). The difference in reported time might be due to how the timing is computed, though Julia time() and C++ chrono should measure the same thing. Most importantly, my Julia code could probably be improved: I’m not always sure what is the best way to do something in Julia (it’s quite easy to gain/lose an order of magnitude in speed for the inexperienced coder), although I tried to keep it simple and stay as close as possible to the C++ version.

One thing I’ve observed is that the separated cuts are exactly the same (I checked), as well as the number of explored nodes, but the branch-and-cut runs —although extremely similar— are not quite identical in the last few nodes (slightly different LP results), maybe due to numerical error?

Another interesting thing is that to get similar results with the Julia and C++ interfaces, some reverse engineering was involved: the default behavior of a cut callback in the native C++ interface is not the same as the one in the Julia interface. Specifically, to get a similar behavior in my C++ code, I had to:

call abortCutLoop() whenever I add a cut (i.e., exit the cut loop and branch)
pass UseCutPurge as a parameter of the cut, meaning that Cplex can delete it, if convenient (while the default value in the C++ interface is UseCutForce, meaning that the added inequality can’t be removed).

I haven’t tried to do the same with Gurobi, as I’m not familiar with its interface. Looking at the similar results of CPLEX and Gurobi, my feeling is the developers of JuMP have tried to set default values for Cplex callbacks so they would behave similarly to Gurobi “out of the box”.

Overall, while still under development, I believe JuMP is definitely worth a try and possibly ready to be used for some work. The developers are very active and ready to give a hand, which is nice. Supporting solver callbacks is a killer feature from my point of view, although it could bring about some new types of headache: JuMP is still young, and not really battle-tested.

One final thing to keep in mind is that, though rather ambitious, this is still an interface. I’m 100% sure you will occasionally miss the full control than you have by using the solver-specific libraries: as I’ve shown above, switching solvers is easy, but understanding what exactly is happening is not trivial. The good part is that the interface code in Julia is pretty easy to read, and, should the need arise, you can modify it yourself: everything is on github, so fork away!

The \((l,S)\) inequalities are actually sufficient to describe the convex hull of the feasible region, but here we will only add some of them as valid inequalities to speed up the branch-and-bound search. ↩