linear.3

         Linear Programming - Frequently Asked Questions List
                       (linear-programming-faq)
                 Most recent update: September 1, 1993

-----------------------------------------------------------------------

"The best way to get information on Usenet isn't to ask a question, but
to post the wrong information."  -- aahz@netcom.com

Q0.  "What's in this FAQ?"

A:  Table of Contents
   Q0.  "What's in this FAQ?"   (Oh no, a recursion!)
   Q1.  "What is Linear Programming?"
   Q2.  "Where is there a good code, preferably public domain, to solve
         Linear Programming problems?"
   Q3.  "Oh, and we also want to solve it as an integer program. I
         think there will be only a few thousand variables or so."
   Q4.  "What software is there for nonlinear optimization?"
   Q5.  "I wrote an optimization code.  Where are some test models?"
   Q6.  "What is MPS format?"
   Q7.  "Just a quick question..."
   Q8.  "What references are there in this field?"
   Q9.  "How do I access this Netlib server I keep hearing about?"
   Q10.  "Who maintains this FAQ list?"

See also the related FAQ on Nonlinear Programming (NLP).

-----------------------------------------------------------------------

Q1.  "What is Linear Programming?"

A:  A Linear Program (LP) is a problem that can be put into the form

    minimize   cx
    subject to Ax=b
                x>=0

where x is the vector to be solved for, A is a matrix of known
coefficients, and c and b are vectors of known coefficients.  The
expression "cx" is called the objective function, and the equations
"Ax=b" are called the constraints.  All these entities must have
consistent dimensions, of course, and you can add "transpose" symbols
to taste.  The matrix A is generally not square, hence you don't solve
an LP by just inverting A.  Usually A has more columns than rows, and
Ax=b  is therefore under-determined, leaving great latitude in the
choice of x with which to minimize cx.

Other formulations can be used in this framework.  For instance, if you
want to maximize instead of minimize, multiply the c vector by -1.  If
you have constraints that are inequalities rather than equations, you
can think of introducing one new variable (a "slack") for each
inequality and treat the augmented row of the matrix as an equation.
If you have variables that shouldn't be represented as non-negative,
you might think of them as the difference of two non-negatives.  And
so on.  LP codes will usually take care of such "bookkeeping" for you.

LP problems are usually solved by a technique known as the Simplex
Method, developed in the 1940's and after.  Briefly stated, this method
works by taking a sequence of square submatrices of A and solving for
x, in such a way that successive solutions always improve, until a
point is reached where improvement is no longer possible.  A family of
LP algorithms known as Interior Point methods has been developed
starting in the 1980's, that can be faster for many (but so far not
all) problems.  Such methods are characterized by constructing a
sequence of trial solutions that go through the interior of the
solution space, in contrast to the Simplex Method which stays on the
boundary and examines only the corners (vertices).

LP has a variety of uses, in such areas as petroleum, finance,
forestry, transportation, and the military.

The word "Programming" is used here in the sense of "planning"; the
necessary relationship to computer programming was incidental to the
choice of name.  Hence the phrase "LP program" to refer to a piece of
software is not a redundancy, although I tend to use the term "code"
instead of "program" to avoid the possible ambiguity.

-----------------------------------------------------------------------

Q2.  "Where is there a good code, preferably public domain, to solve
Linear Programming problems?"

A:  It depends on the size and difficulty of your models.  LP
algorithms and computer technology have both made such great strides
that models that were previously considered "large" are now routinely
solved.  Nowadays, with good commercial software (i.e., code that you
pay for), models with a few thousand constraints and several thousand
variables can be tackled on a 386 PC.  Workstations can often handle
models with variables in the tens of thousands, or even greater, and
mainframes can go larger still.  Public domain (free) codes can be
relied on to solve models of somewhat smaller dimension.  The choice of
the code can make more difference than the choice of computer hardware.
It's hard to be specific about model sizes and speed, a priori, due to
the wide variation in things like model structure and variation in
factorizing the basis matrices.  Just because a given code has solved a
model of a certain dimension, it may not be able to solve *all* models
of the same size, or in the same amount of time.

There is a public domain code, written in C, called "lp_solve" that its
author (Michel Berkelaar, email at  michel@es.ele.tue.nl ) says has
solved models with up to 30,000 variables and 50,000 constraints.  The
author requests that people retrieve it by anonymous ftp from
"ftp.es.ele.tue.nl" in directory pub/lp_solve.  There is an older
version to be found in the Usenet archives, but it contains bugs that
have been fixed in the meantime, and hence is unsupported.  (As an
editorial opinion, I must state that difficult models will give this
code trouble.  It's not as good as a commercial code.  But for someone
who isn't sure just what kind of LP code is needed, it represents a
very reasonable first try, since it does solve non-trivial-sized
models, and it is free.)

For DOS/PC users, Prof. Timo Salmi at the University of Vaasa in
Finland offers a Linear Programming and Linear Goal Programming binary
called "tslin".  You should be able to access it by ftp at
garbo.uwasa.fi in directory /pc/ts (the current file name is
tslin33b.zip, apparently using ZIP compression), or else I suggest
contacting Prof. Salmi at ts@uwasa.fi .

Also on the garbo server is a file called lp261.zip, with a descriptor
of "Linear Programming Optimizer by ScanSoft".  It consists of PC
binaries, and is evidently some sort of shareware (i.e., not strictly
public domain).

There is an ACM TOMS routine for LP, #552, available from the netlib
server, in directory /netlib/toms.

For academic users only, on a limited variety of platforms, there is
a free version of LOQO available.  Here is a recent announcement:

  Binary executables for LOQO, a linear/quadratic program solver, have
  been installed in elib.  If you are interested in trying the code,
  use anonymous ftp to elib.zib-berlin.de.  Everything is found in the
  pub/opt-net/software/loqo/ directory.  There are executables for
  IBM workstations (RS6000's), Silicon Graphics workstations (SGI's),
  HP workstations (7xx) and Sun workstations.  The package includes
  a subroutine library (libloqo.a), an executable (loqo), the source
  for the main part of loqo (loqo.c), and associated documentation
  (loqo.dvi and *.eps).  The algorithm used is a one-phase
  primal-dual-symmetric interior-point method.  The code was designed
  to work well on all sizes of problems ranging from very small to very
  large scale.  Benchmarks are included showing results competitive
  to existing popular packages such as CPLEX.

  THE FREE CODE FROM ELIB IS INTENDED TO BE USED FOR ACADEMIC RESEARCH
  PURPOSES ONLY.  If you wish to purchase a commercial version, please
  contact me for details.

  Bob Vanderbei  (rvdb@Princeton.EDU)

There are books that contain source code for the simplex method.  See
the section on references.  The consensus is that the LP code published
in Numerical Recipes is not at all strong, and should be avoided for
heavy-duty use.  If your requirement is for a solver that can handle
100-variable models, it might be okay.  Codes found in most books are
usually similarly limited.

The following suggestions may represent a low-cost way of solving LP's
if you already have certain software available to you.  1) Some
spreadsheet programs have an embedded LP solver, or offer one as an
installable option.  2) A package called QSB (Quantitative Systems for
Business, from Prentice-Hall publishers) has an LP module among its
routines.  3) If you have access to a commercial math library, such as
IMSL or NAG, you may be able to use an LP routine from there.
4) Mathematical systems MATLAB (The Math Works, Inc., (508) 653-1415)
and MAPLE (reference?) also have LP solvers; an interface from MATLAB
to lp_solve is available from Jeff Kantor (Jeffrey.Kantor@nd.edu), and
there's also a simplex code written in the MATLAB language, available
from the netlib server, file netlib/matlab/optimization/simplex1.m.Z.
(There's a Usenet newsgroup on MATLAB: comp.soft-sys.matlab.)  If
speed matters to you, then according to a Usenet posting by Pascal
Koiran (koiran@ens-lyon.fr), on randomly generated LP models, MATLAB
was an order of magnitude faster than MAPLE on a 200x20 problem but an
order of magnitude slower than lp_solve on a 50x100 problem.  (I don't
intend to get into benchmarking in this document, but I mention these
results just to document why I choose to focus mostly on special
purpose LP software.)

If your models prove to be too difficult for free software to handle,
then you may have to consider acquiring a commercial LP code.  There
are dozens of such codes on the market.  At the end of this section is
a *very* condensed version of a survey of LP software published in the
June 1992 issue of "OR/MS Today", a joint publication of ORSA
(Operations Research Society of America) and TIMS (The Institute of
Management Science).  For further information I would suggest you read
the full article.  It's likely that you can find a copy, either
through a library, or by contacting a member of these two organizations
(most universities probably have several members among the faculty and
student body). This magazine also carries advertisements for many of
these products, which may give you additional information to help make
a decision.

The author of that survey, Ramesh Sharda, has updated and expanded it
for 1993 into a larger report called "Linear and Discrete Optimization
and Modeling Software: A Resource Handbook".  For information, contact
Lionheart Publishing Inc., 2555 Cumberland Parkway, Suite 299, Atlanta,
GA 30339. Phone: (404)-431-0867.  This book is fairly expensive, and
geared toward users whose needs for LP software are considerable.

There are many considerations in selecting an LP code.  Speed is
important, but LP is complex enough that different codes go faster on
different models; you won't find a "Consumer Reports" article to say
with certainty which code is THE fastest.  I usually suggest getting
benchmark results for your particular type of model if speed is
paramount to you.  Benchmarking may also help determine whether a given
code has sufficient numerical stability for your kind of models.

Other questions you should answer: Can you use a stand-alone code, or
do you need a code that can be used as a callable library, or do you
require source code?  Do you want the flexibility of a code that runs
on many platforms and/or operating systems, or do you want code that's
tuned to your particular hardware architecture (in which case your
hardware vendor may have suggestions)?  Is the choice of algorithm
(Simplex, Interior Point) important to you?  Do you need an interface
to a spreadsheet code?  Is the purchase price an overriding concern?
Is the software offered at an academic discount (assuming you are at a
university)?  How much hotline support do you think you'll need?

It may not always be true that "you get what you pay for," but it is
rare that you get more than you pay for.  8v)   There is usually a
large difference in LP codes, in performance (speed, numerical
stability, adaptability to computer architectures) and features, as you
climb the price scale.  If a code seems overpriced to you, you may not
yet understand all of its features.

In the table below, I give the name of the software, and the phone
number listed in the June 1992 survey.  To keep the table short
enough for inclusion here, I decided not to bother typing in all the
mailing addresses.  (I suppose that an "800" number will not be useful
to people outside the US; consult the full survey for more information
on contacting such vendors.  Also, for some companies there may exist
European or Asian contact points, but this is beyond the scope of this
document.)

Under the column marked "alg" is placed an S if the code makes use of
the Simplex method, or an I if it uses some form of Interior Point
method.  Under "H/W" is the minimum hardware needed to run the code;
generally there is a hierarchy where PC's (and Macintoshes) are the
least powerful, then workstations (WS) like Suns and RS-6000's, on up
to supercomputers, so by the symbol "^" is meant "and up", namely that
most commonly-encountered platforms are supported.  Under the column
"MIP?" is a yes or no depending on whether or not the code can solve
some form of Integer LP (see the section in this FAQ on the topic).
Under "Other algs" I mention other methods that the code is claimed
to provide, such as Quadratic Programming, other Nonlinear Programming,
or network algorithms.

The first part of the table consists of software I deem to be LP
solvers.  The second part is software that in some sense is a front end
to the solvers.  In some cases it becomes a hazy definition, but I hope
the distinction turns out to be useful to readers.

Even more so than usual, I must emphasize that you must use this
information at your own risk.  I provide it as a convenience to those
readers who have difficulty in locating the OR/MS Today survey, I take
no responsibility for errors either in the original article or by my
act of copying it by hand, though I will gladly correct any mistakes
that are pointed out to me.

Code Name          Phone        Alg  H/W            MIP?  Other algs
---------          -----        ---  -------        ----  ----------
AT&T KORBX         908-949-8966  I   WS ^           No    QP
Best Answer        510-943-7667  S   Mac Plus       No
CPLEX              702-831-7744  S   PC-386 ^       Yes   Network
Excel              206-882-8080  S   PC, Mac        Yes   NLP
FortLP             708-971-2337  S   PC ^           Yes
HS/LP Lin. Opt.    201-627-1424  S   PC-386, VAX    Yes
INCEPTA            416-896-0515  S   PC-386         Yes   visualization
LAMPS              413-584-1605  S   PC-386 ^       Yes
LINDO              800-441-2378  S   PC ^           Yes   QP
LOQO               609-258-0876  I   PC ^           No    QP
LP88               703-360-7600  S   PC             No
MathPro            202-887-0296  S   PC-286, WS     Yes   visualization
MILP88             703-360-7600  S   PC             Yes
MILP Lin. Opt.   (+361)149-7531  S   PC             No    visualization
MPS-III            703-558-8701  S   PC-386 ^       Yes   QP, Network
MSLP-PC            604-361-9778  S   PC             No
OB1                702-831-4967  I   PC-386 ^       No
OMP                919-847-9997  S   PC, VAX, WS    Yes
OSL                914-385-6034  S/I PC, WS, IBM    Yes   QP, Network
PC-PROG            919-847-9997  S   PC             Yes   QP
SAS/OR             919-677-8000  S   PC ^           Yes   Network
SCICONIC        (+44)908-585858  S   PC-386 ^       Yes
STORM              216-464-1209  S   PC             Yes   Network
Turbo-Simplex      703-351-5272  S   PC, Mac        No
What If            800-447-2226  S   PC             Yes   NLP
XA                 818-441-1565  S   PC ^           Yes
XPRESS-MP          202-887-0296  S   PC-286 ^       Yes
YLP                702-831-4967  S   PC ^           No


Code Name          Phone             Min H/W
---------          -----             -------
GAMS               415-583-8840      PC-286 ^
LINGO              800-441-2378      PC ^
MIMI/LP            908-464-8300      WS
MPL Model. Sys.    703-351-5272      PC
OMNI               202-627-1424      PC-386, VAX, WS, IBM
VMP                301-622-0694      PC-386, WS
What's Best!       800-441-2378      PC, Mac, WS

-----------------------------------------------------------------------

Q3.  "Oh, and we also want to solve it as an integer program. I
     think there will be only a few thousand variables or so."

A:  Hmmmm.  You want
    - Nontrivial model size
    - Integer solutions
    - Public domain code
Pick one or maybe two of the above.  You can't have all three.  8v)

Integer LP models are ones where the answers must not take fractional
values.  It may not be obvious that this is a VERY much harder problem
than ordinary LP, but it is nonetheless true.  The buzzword is "NP-
Completeness", the definition of which is beyond the scope of this
document but means in essence that, in the worst case, the amount of
time to solve a family of related problems goes up exponentially as the
size of the problem grows, for all known algorithms that solve such
problems to a proven answer.

Integer models may be ones where only some of the variables are to be
integer and others may be real-valued (termed "Mixed Integer LP" or
MILP, or "Mixed Integer Programming" or MIP); or they may be ones where
all the variables must be integral (termed "Integer LP" or ILP).  The
class of ILP is often further subdivided into problems where the only
legal values are {0,1} ("Binary" or "Zero-One" ILP), and general
integer problems.  For the sake of generality, the Integer LP problem
will be referred to here as MIP, since the other classes can be viewed
as special cases of MIP.  MIP, in turn, is a particular member of the
class of Discrete Optimization problems.

People are sometimes surprised to learn that MIP problems are solved
using floating point arithmetic.  Although various algorithms for MIP
have been studied, most if not all available general purpose large-
scale LP codes use a method called "Branch and Bound" to try to find an
optimal solution.  Briefly stated, B&B solves MIP by solving a sequence
of related LP models.  (As a point of interest, the Simplex Method
currently retains an advantage over the newer Interior Point methods
for solving these sequences of LP's.)  Good codes for MIP distinguish
themselves more by solving shorter sequences of LP's, than by solving
the individual LP's faster.  Even more so than with regular LP, a
costly commercial code may prove its value to you if your MIP model is
difficult.

You should be prepared to solve far smaller MIP models than the
corresponding LP model, given a certain amount of time you wish to
allow (unless you and your model happen to be very lucky). There exist
models that are considered challenging, with a few dozen variables.
Conversely, some models with tens of thousands of variables solve
readily.  The best explanations of "why" usually seem to happen after
the fact.  But a MIP model with hundreds or thousands of variables
should be approached with a certain amount of humility.

A major exception to this generally gloomy outlook is that there are
certain models whose LP solution always turns out to be integer.  Best
known of these is the so-called Network-Flow Problem.  Special cases of
this problem are the Transportation Problem, the Assignment Problem,
and the Shortest Path Problem.  The theory of unimodular matrices is
fundamental here.  It turns out that these particular problems are best
solved by specialized routines that take major shortcuts in the Simplex
Method, and as a result are relatively quick-running even compared to
ordinary LP.  Some commercial LP solvers also include a network solver.
See the section on references for a book by Kennington and Helgason,
which contains some source code for Netflo.  Netflo is available by
anonymous ftp at dimacs.rutgers.edu, in directory
    /pub/netflow/mincost/solver-1
but I don't know the copyright situation (I always thought you had to
buy the book to get the code).  Another text containing Fortran code is
"Linear Network Optimization: Algorithms and Codes" by D.P. Bertsekas,
though I am unaware of any place that has the source code online.
There is an ACM TOMS routine, #548, that solves the Assignment problem
using the Hungarian Method, available from the netlib server, in
directory /netlib/toms.

The public domain code "lp_solve", mentioned earlier, accepts MIP
models, as do a large number of the commercial LP codes in the OR/MS
Today survey.  I have seen mention made of algorithm 333 in the
Collected Algorithms from CACM, though I'd be surprised if it was
robust enough to solve large models.

In the book by Syslo et al. (see section on references) is code for 28
algorithms, most of which pertain to some aspect of Discrete
Optimization.

Mustafa Akgul (AKGUL@TRBILUN.BITNET) at Bilkent University maintains an
archive via anonymous ftp (firat.bcc.bilkent.edu.tr or 139.179.10.13).
In addition to a copy of lp_solve (though I would recommend using the
official source listed in the previous section), there is mip386.zip,
which is a zip-compressed code for PC's.  He also has a couple of
network codes and various other codes he has picked up.  All this is in
directory pub/IEOR/Opt and its further subdirectories LP, PC, and
Network.

The better commercial MIP codes have numerous parameters and options to
give the user control over the solution strategy.  Most have the
capability of stopping before an optimum is proved, printing the best
answer obtained so far.  For many MIP models, stopping early is a
practical necessity.  Fortunately, a solution that has been proved by
the algorithm to be within, say, 1% of optimality often turns out to be
the true optimum, and the bulk of the computation time is spent proving
the optimality. For many modeling situations, a near-optimal solution
is acceptable anyway.

Once one accepts that large MIP models are not typically solved to a
proved optimal solution, that opens up a broad area of approximate
methods, probabilistic methods and heuristics, as well as modifications
to B&B.  Genetic Algorithms and Simulated Annealing have been studied
heavily for difficult problems like these, though (IMHO) the successes
have been problem dependent and difficult to generalize.  There's a
(compressed) Postscript file available by anonymous ftp, containing a
forty-page introduction to the topic, that one can obtain from
beethoven.cs.colostate.edu:pub/TechReports/1993/tr-103.ps.Z

The Usenet newsgroup on genetic algorithms, comp.comp.ai.genetic, has
an FAQ on the topic, otherwise known as "The Hitch-Hiker's Guide to
Evolutionary Computation".  That FAQ can be obtained by anonymous ftp
at rtfm.mit.edu, in directory /pub/usenet/news.answers/ai-faq/genetic,
in files named part*,

Whatever the solution method you choose, when trying to solve a
difficult MIP model, it is usually crucial to understand the workings
of the physical system (or whatever) you are modeling, and try to find
some insight that will assist your chosen algorithm to work better.  A
related observation is that the way you formulate your model can be as
important as the actual choice of solver.  You should consider getting