Linear Programming - Frequently Asked Questions List (linear-programming-faq) Most recent update: September 1, 1993 ----------------------------------------------------------------------- "The best way to get information on Usenet isn't to ask a question, but to post the wrong information." -- aahz@netcom.com Q0. "What's in this FAQ?" A: Table of Contents Q0. "What's in this FAQ?" (Oh no, a recursion!) Q1. "What is Linear Programming?" Q2. "Where is there a good code, preferably public domain, to solve Linear Programming problems?" Q3. "Oh, and we also want to solve it as an integer program. I think there will be only a few thousand variables or so." Q4. "What software is there for nonlinear optimization?" Q5. "I wrote an optimization code. Where are some test models?" Q6. "What is MPS format?" Q7. "Just a quick question..." Q8. "What references are there in this field?" Q9. "How do I access this Netlib server I keep hearing about?" Q10. "Who maintains this FAQ list?" See also the related FAQ on Nonlinear Programming (NLP). ----------------------------------------------------------------------- Q1. "What is Linear Programming?" A: A Linear Program (LP) is a problem that can be put into the form minimize cx subject to Ax=b x>=0 where x is the vector to be solved for, A is a matrix of known coefficients, and c and b are vectors of known coefficients. The expression "cx" is called the objective function, and the equations "Ax=b" are called the constraints. All these entities must have consistent dimensions, of course, and you can add "transpose" symbols to taste. The matrix A is generally not square, hence you don't solve an LP by just inverting A. Usually A has more columns than rows, and Ax=b is therefore under-determined, leaving great latitude in the choice of x with which to minimize cx. Other formulations can be used in this framework. For instance, if you want to maximize instead of minimize, multiply the c vector by -1. If you have constraints that are inequalities rather than equations, you can think of introducing one new variable (a "slack") for each inequality and treat the augmented row of the matrix as an equation. If you have variables that shouldn't be represented as non-negative, you might think of them as the difference of two non-negatives. And so on. LP codes will usually take care of such "bookkeeping" for you. LP problems are usually solved by a technique known as the Simplex Method, developed in the 1940's and after. Briefly stated, this method works by taking a sequence of square submatrices of A and solving for x, in such a way that successive solutions always improve, until a point is reached where improvement is no longer possible. A family of LP algorithms known as Interior Point methods has been developed starting in the 1980's, that can be faster for many (but so far not all) problems. Such methods are characterized by constructing a sequence of trial solutions that go through the interior of the solution space, in contrast to the Simplex Method which stays on the boundary and examines only the corners (vertices). LP has a variety of uses, in such areas as petroleum, finance, forestry, transportation, and the military. The word "Programming" is used here in the sense of "planning"; the necessary relationship to computer programming was incidental to the choice of name. Hence the phrase "LP program" to refer to a piece of software is not a redundancy, although I tend to use the term "code" instead of "program" to avoid the possible ambiguity. ----------------------------------------------------------------------- Q2. "Where is there a good code, preferably public domain, to solve Linear Programming problems?" A: It depends on the size and difficulty of your models. LP algorithms and computer technology have both made such great strides that models that were previously considered "large" are now routinely solved. Nowadays, with good commercial software (i.e., code that you pay for), models with a few thousand constraints and several thousand variables can be tackled on a 386 PC. Workstations can often handle models with variables in the tens of thousands, or even greater, and mainframes can go larger still. Public domain (free) codes can be relied on to solve models of somewhat smaller dimension. The choice of the code can make more difference than the choice of computer hardware. It's hard to be specific about model sizes and speed, a priori, due to the wide variation in things like model structure and variation in factorizing the basis matrices. Just because a given code has solved a model of a certain dimension, it may not be able to solve *all* models of the same size, or in the same amount of time. There is a public domain code, written in C, called "lp_solve" that its author (Michel Berkelaar, email at michel@es.ele.tue.nl ) says has solved models with up to 30,000 variables and 50,000 constraints. The author requests that people retrieve it by anonymous ftp from "ftp.es.ele.tue.nl" in directory pub/lp_solve. There is an older version to be found in the Usenet archives, but it contains bugs that have been fixed in the meantime, and hence is unsupported. (As an editorial opinion, I must state that difficult models will give this code trouble. It's not as good as a commercial code. But for someone who isn't sure just what kind of LP code is needed, it represents a very reasonable first try, since it does solve non-trivial-sized models, and it is free.) For DOS/PC users, Prof. Timo Salmi at the University of Vaasa in Finland offers a Linear Programming and Linear Goal Programming binary called "tslin". You should be able to access it by ftp at garbo.uwasa.fi in directory /pc/ts (the current file name is tslin33b.zip, apparently using ZIP compression), or else I suggest contacting Prof. Salmi at ts@uwasa.fi . Also on the garbo server is a file called lp261.zip, with a descriptor of "Linear Programming Optimizer by ScanSoft". It consists of PC binaries, and is evidently some sort of shareware (i.e., not strictly public domain). There is an ACM TOMS routine for LP, #552, available from the netlib server, in directory /netlib/toms. For academic users only, on a limited variety of platforms, there is a free version of LOQO available. Here is a recent announcement: Binary executables for LOQO, a linear/quadratic program solver, have been installed in elib. If you are interested in trying the code, use anonymous ftp to elib.zib-berlin.de. Everything is found in the pub/opt-net/software/loqo/ directory. There are executables for IBM workstations (RS6000's), Silicon Graphics workstations (SGI's), HP workstations (7xx) and Sun workstations. The package includes a subroutine library (libloqo.a), an executable (loqo), the source for the main part of loqo (loqo.c), and associated documentation (loqo.dvi and *.eps). The algorithm used is a one-phase primal-dual-symmetric interior-point method. The code was designed to work well on all sizes of problems ranging from very small to very large scale. Benchmarks are included showing results competitive to existing popular packages such as CPLEX. THE FREE CODE FROM ELIB IS INTENDED TO BE USED FOR ACADEMIC RESEARCH PURPOSES ONLY. If you wish to purchase a commercial version, please contact me for details. Bob Vanderbei (rvdb@Princeton.EDU) There are books that contain source code for the simplex method. See the section on references. The consensus is that the LP code published in Numerical Recipes is not at all strong, and should be avoided for heavy-duty use. If your requirement is for a solver that can handle 100-variable models, it might be okay. Codes found in most books are usually similarly limited. The following suggestions may represent a low-cost way of solving LP's if you already have certain software available to you. 1) Some spreadsheet programs have an embedded LP solver, or offer one as an installable option. 2) A package called QSB (Quantitative Systems for Business, from Prentice-Hall publishers) has an LP module among its routines. 3) If you have access to a commercial math library, such as IMSL or NAG, you may be able to use an LP routine from there. 4) Mathematical systems MATLAB (The Math Works, Inc., (508) 653-1415) and MAPLE (reference?) also have LP solvers; an interface from MATLAB to lp_solve is available from Jeff Kantor (Jeffrey.Kantor@nd.edu), and there's also a simplex code written in the MATLAB language, available from the netlib server, file netlib/matlab/optimization/simplex1.m.Z. (There's a Usenet newsgroup on MATLAB: comp.soft-sys.matlab.) If speed matters to you, then according to a Usenet posting by Pascal Koiran (koiran@ens-lyon.fr), on randomly generated LP models, MATLAB was an order of magnitude faster than MAPLE on a 200x20 problem but an order of magnitude slower than lp_solve on a 50x100 problem. (I don't intend to get into benchmarking in this document, but I mention these results just to document why I choose to focus mostly on special purpose LP software.) If your models prove to be too difficult for free software to handle, then you may have to consider acquiring a commercial LP code. There are dozens of such codes on the market. At the end of this section is a *very* condensed version of a survey of LP software published in the June 1992 issue of "OR/MS Today", a joint publication of ORSA (Operations Research Society of America) and TIMS (The Institute of Management Science). For further information I would suggest you read the full article. It's likely that you can find a copy, either through a library, or by contacting a member of these two organizations (most universities probably have several members among the faculty and student body). This magazine also carries advertisements for many of these products, which may give you additional information to help make a decision. The author of that survey, Ramesh Sharda, has updated and expanded it for 1993 into a larger report called "Linear and Discrete Optimization and Modeling Software: A Resource Handbook". For information, contact Lionheart Publishing Inc., 2555 Cumberland Parkway, Suite 299, Atlanta, GA 30339. Phone: (404)-431-0867. This book is fairly expensive, and geared toward users whose needs for LP software are considerable. There are many considerations in selecting an LP code. Speed is important, but LP is complex enough that different codes go faster on different models; you won't find a "Consumer Reports" article to say with certainty which code is THE fastest. I usually suggest getting benchmark results for your particular type of model if speed is paramount to you. Benchmarking may also help determine whether a given code has sufficient numerical stability for your kind of models. Other questions you should answer: Can you use a stand-alone code, or do you need a code that can be used as a callable library, or do you require source code? Do you want the flexibility of a code that runs on many platforms and/or operating systems, or do you want code that's tuned to your particular hardware architecture (in which case your hardware vendor may have suggestions)? Is the choice of algorithm (Simplex, Interior Point) important to you? Do you need an interface to a spreadsheet code? Is the purchase price an overriding concern? Is the software offered at an academic discount (assuming you are at a university)? How much hotline support do you think you'll need? It may not always be true that "you get what you pay for," but it is rare that you get more than you pay for. 8v) There is usually a large difference in LP codes, in performance (speed, numerical stability, adaptability to computer architectures) and features, as you climb the price scale. If a code seems overpriced to you, you may not yet understand all of its features. In the table below, I give the name of the software, and the phone number listed in the June 1992 survey. To keep the table short enough for inclusion here, I decided not to bother typing in all the mailing addresses. (I suppose that an "800" number will not be useful to people outside the US; consult the full survey for more information on contacting such vendors. Also, for some companies there may exist European or Asian contact points, but this is beyond the scope of this document.) Under the column marked "alg" is placed an S if the code makes use of the Simplex method, or an I if it uses some form of Interior Point method. Under "H/W" is the minimum hardware needed to run the code; generally there is a hierarchy where PC's (and Macintoshes) are the least powerful, then workstations (WS) like Suns and RS-6000's, on up to supercomputers, so by the symbol "^" is meant "and up", namely that most commonly-encountered platforms are supported. Under the column "MIP?" is a yes or no depending on whether or not the code can solve some form of Integer LP (see the section in this FAQ on the topic). Under "Other algs" I mention other methods that the code is claimed to provide, such as Quadratic Programming, other Nonlinear Programming, or network algorithms. The first part of the table consists of software I deem to be LP solvers. The second part is software that in some sense is a front end to the solvers. In some cases it becomes a hazy definition, but I hope the distinction turns out to be useful to readers. Even more so than usual, I must emphasize that you must use this information at your own risk. I provide it as a convenience to those readers who have difficulty in locating the OR/MS Today survey, I take no responsibility for errors either in the original article or by my act of copying it by hand, though I will gladly correct any mistakes that are pointed out to me. Code Name Phone Alg H/W MIP? Other algs --------- ----- --- ------- ---- ---------- AT&T KORBX 908-949-8966 I WS ^ No QP Best Answer 510-943-7667 S Mac Plus No CPLEX 702-831-7744 S PC-386 ^ Yes Network Excel 206-882-8080 S PC, Mac Yes NLP FortLP 708-971-2337 S PC ^ Yes HS/LP Lin. Opt. 201-627-1424 S PC-386, VAX Yes INCEPTA 416-896-0515 S PC-386 Yes visualization LAMPS 413-584-1605 S PC-386 ^ Yes LINDO 800-441-2378 S PC ^ Yes QP LOQO 609-258-0876 I PC ^ No QP LP88 703-360-7600 S PC No MathPro 202-887-0296 S PC-286, WS Yes visualization MILP88 703-360-7600 S PC Yes MILP Lin. Opt. (+361)149-7531 S PC No visualization MPS-III 703-558-8701 S PC-386 ^ Yes QP, Network MSLP-PC 604-361-9778 S PC No OB1 702-831-4967 I PC-386 ^ No OMP 919-847-9997 S PC, VAX, WS Yes OSL 914-385-6034 S/I PC, WS, IBM Yes QP, Network PC-PROG 919-847-9997 S PC Yes QP SAS/OR 919-677-8000 S PC ^ Yes Network SCICONIC (+44)908-585858 S PC-386 ^ Yes STORM 216-464-1209 S PC Yes Network Turbo-Simplex 703-351-5272 S PC, Mac No What If 800-447-2226 S PC Yes NLP XA 818-441-1565 S PC ^ Yes XPRESS-MP 202-887-0296 S PC-286 ^ Yes YLP 702-831-4967 S PC ^ No Code Name Phone Min H/W --------- ----- ------- GAMS 415-583-8840 PC-286 ^ LINGO 800-441-2378 PC ^ MIMI/LP 908-464-8300 WS MPL Model. Sys. 703-351-5272 PC OMNI 202-627-1424 PC-386, VAX, WS, IBM VMP 301-622-0694 PC-386, WS What's Best! 800-441-2378 PC, Mac, WS ----------------------------------------------------------------------- Q3. "Oh, and we also want to solve it as an integer program. I think there will be only a few thousand variables or so." A: Hmmmm. You want - Nontrivial model size - Integer solutions - Public domain code Pick one or maybe two of the above. You can't have all three. 8v) Integer LP models are ones where the answers must not take fractional values. It may not be obvious that this is a VERY much harder problem than ordinary LP, but it is nonetheless true. The buzzword is "NP- Completeness", the definition of which is beyond the scope of this document but means in essence that, in the worst case, the amount of time to solve a family of related problems goes up exponentially as the size of the problem grows, for all known algorithms that solve such problems to a proven answer. Integer models may be ones where only some of the variables are to be integer and others may be real-valued (termed "Mixed Integer LP" or MILP, or "Mixed Integer Programming" or MIP); or they may be ones where all the variables must be integral (termed "Integer LP" or ILP). The class of ILP is often further subdivided into problems where the only legal values are {0,1} ("Binary" or "Zero-One" ILP), and general integer problems. For the sake of generality, the Integer LP problem will be referred to here as MIP, since the other classes can be viewed as special cases of MIP. MIP, in turn, is a particular member of the class of Discrete Optimization problems. People are sometimes surprised to learn that MIP problems are solved using floating point arithmetic. Although various algorithms for MIP have been studied, most if not all available general purpose large- scale LP codes use a method called "Branch and Bound" to try to find an optimal solution. Briefly stated, B&B solves MIP by solving a sequence of related LP models. (As a point of interest, the Simplex Method currently retains an advantage over the newer Interior Point methods for solving these sequences of LP's.) Good codes for MIP distinguish themselves more by solving shorter sequences of LP's, than by solving the individual LP's faster. Even more so than with regular LP, a costly commercial code may prove its value to you if your MIP model is difficult. You should be prepared to solve far smaller MIP models than the corresponding LP model, given a certain amount of time you wish to allow (unless you and your model happen to be very lucky). There exist models that are considered challenging, with a few dozen variables. Conversely, some models with tens of thousands of variables solve readily. The best explanations of "why" usually seem to happen after the fact. But a MIP model with hundreds or thousands of variables should be approached with a certain amount of humility. A major exception to this generally gloomy outlook is that there are certain models whose LP solution always turns out to be integer. Best known of these is the so-called Network-Flow Problem. Special cases of this problem are the Transportation Problem, the Assignment Problem, and the Shortest Path Problem. The theory of unimodular matrices is fundamental here. It turns out that these particular problems are best solved by specialized routines that take major shortcuts in the Simplex Method, and as a result are relatively quick-running even compared to ordinary LP. Some commercial LP solvers also include a network solver. See the section on references for a book by Kennington and Helgason, which contains some source code for Netflo. Netflo is available by anonymous ftp at dimacs.rutgers.edu, in directory /pub/netflow/mincost/solver-1 but I don't know the copyright situation (I always thought you had to buy the book to get the code). Another text containing Fortran code is "Linear Network Optimization: Algorithms and Codes" by D.P. Bertsekas, though I am unaware of any place that has the source code online. There is an ACM TOMS routine, #548, that solves the Assignment problem using the Hungarian Method, available from the netlib server, in directory /netlib/toms. The public domain code "lp_solve", mentioned earlier, accepts MIP models, as do a large number of the commercial LP codes in the OR/MS Today survey. I have seen mention made of algorithm 333 in the Collected Algorithms from CACM, though I'd be surprised if it was robust enough to solve large models. In the book by Syslo et al. (see section on references) is code for 28 algorithms, most of which pertain to some aspect of Discrete Optimization. Mustafa Akgul (AKGUL@TRBILUN.BITNET) at Bilkent University maintains an archive via anonymous ftp (firat.bcc.bilkent.edu.tr or 139.179.10.13). In addition to a copy of lp_solve (though I would recommend using the official source listed in the previous section), there is mip386.zip, which is a zip-compressed code for PC's. He also has a couple of network codes and various other codes he has picked up. All this is in directory pub/IEOR/Opt and its further subdirectories LP, PC, and Network. The better commercial MIP codes have numerous parameters and options to give the user control over the solution strategy. Most have the capability of stopping before an optimum is proved, printing the best answer obtained so far. For many MIP models, stopping early is a practical necessity. Fortunately, a solution that has been proved by the algorithm to be within, say, 1% of optimality often turns out to be the true optimum, and the bulk of the computation time is spent proving the optimality. For many modeling situations, a near-optimal solution is acceptable anyway. Once one accepts that large MIP models are not typically solved to a proved optimal solution, that opens up a broad area of approximate methods, probabilistic methods and heuristics, as well as modifications to B&B. Genetic Algorithms and Simulated Annealing have been studied heavily for difficult problems like these, though (IMHO) the successes have been problem dependent and difficult to generalize. There's a (compressed) Postscript file available by anonymous ftp, containing a forty-page introduction to the topic, that one can obtain from beethoven.cs.colostate.edu:pub/TechReports/1993/tr-103.ps.Z The Usenet newsgroup on genetic algorithms, comp.comp.ai.genetic, has an FAQ on the topic, otherwise known as "The Hitch-Hiker's Guide to Evolutionary Computation". That FAQ can be obtained by anonymous ftp at rtfm.mit.edu, in directory /pub/usenet/news.answers/ai-faq/genetic, in files named part*, Whatever the solution method you choose, when trying to solve a difficult MIP model, it is usually crucial to understand the workings of the physical system (or whatever) you are modeling, and try to find some insight that will assist your chosen algorithm to work better. A related observation is that the way you formulate your model can be as important as the actual choice of solver. You should consider getting