Survey Paper
Grammatical evolution for constraint synthesis for mixed-integer linear programming
Introduction
Mixed-Integer Linear Programming (MILP) models [1] are a common representation of a real-world object and consist of three parts: (1) the variables of the object, specified with domains (real or integer) and bounds on their values, (2) linear constraints representing the relationships between these variables, and (3) a linear objective function of these variables representing the outcome of the object. For instance, for a diet plan, the variables may represent the quantities of food items, the constraints may represent the lower bounds on nutrients delivered by the food items, and the objective function may represent the cost of the food. MILP models are popular in business and academia, e.g., the NEOS Solver Server [2] reports of the submitted models in 2019 were MILP. A solver is a software tool that solves the model by assigning values to the variables that minimize (or maximize) the objective function subject to the constraints. For example, it finds the minimum-cost diet plan that meets the nutritional constraints.
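The three parts of a model can be made concrete with a toy version of the diet plan. The sketch below is only an illustration: the food items, costs, nutrient values, and bounds are invented, and exhaustive enumeration stands in for a real MILP solver, which would use far more efficient techniques.

```python
from itertools import product

# Hypothetical data: two food items, two nutrients (all numbers are made up).
FOODS = ["oatmeal", "chicken"]
COST = {"oatmeal": 2, "chicken": 5}             # cost per serving
NUTRIENTS = {                                   # nutrient units per serving
    "oatmeal": {"protein": 4, "energy": 110},
    "chicken": {"protein": 32, "energy": 205},
}
MINIMUM = {"protein": 40, "energy": 500}        # lower bounds on nutrients
MAX_SERVINGS = 6                                # upper bound on each integer variable


def is_feasible(plan):
    """Linear constraints: every nutrient total must reach its lower bound."""
    return all(
        sum(NUTRIENTS[f][n] * plan[f] for f in FOODS) >= MINIMUM[n]
        for n in MINIMUM
    )


def cost(plan):
    """Linear objective function: total cost of the plan."""
    return sum(COST[f] * plan[f] for f in FOODS)


def solve():
    """Enumerate all integer assignments within bounds and keep the cheapest feasible one."""
    best = None
    for values in product(range(MAX_SERVINGS + 1), repeat=len(FOODS)):
        plan = dict(zip(FOODS, values))
        if is_feasible(plan) and (best is None or cost(plan) < cost(best)):
            best = plan
    return best


if __name__ == "__main__":
    plan = solve()
    print(plan, cost(plan))
```

Enumeration is viable here only because the toy model has two bounded integer variables; it is meant to show what "solving" means, not how solvers work.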
MILP models are typically handcrafted by a modeling expert in collaboration with domain experts, because a single expert rarely combines competence in modeling with knowledge of the object being modeled. The modeling expert gains information on the object by interviewing the domain experts. As factors such as personal feelings and incomplete knowledge of the domain experts may hide some details from the modeling expert, modeling often requires several iterations to bring the MILP model into satisfactory alignment with reality. To further complicate matters, many real-world objects are not linear, and the non-linear relationships need to be linearized or approximated to meet the requirements of the MILP model. These are advanced techniques, and implementing them is error-prone. The errors in MILP models often remain undetected until the optimal solution to the model turns out to be inapplicable in practice, requiring another iteration of modeling. All these challenges increase the cost of modeling and optimization services.
ZIMPL [3] is a high-level modeling language for MILP models that facilitates modeling by compactly representing common constructs, e.g., sums and quantifiers. The ZIMPL interpreter automatically linearizes common non-linear functions, e.g., absolute value, min, and max. It translates a ZIMPL model into the LP format [4], a low-level modeling language supported by all major solvers. Therefore, a MILP model specified in ZIMPL can be solved by virtually any solver.
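To illustrate what linearization means, the two simplest textbook cases for the absolute value are sketched below. The symbols $b$ (a constant bound) and $t$ (an auxiliary variable) are introduced here for illustration only, and the source does not state which reformulation the ZIMPL interpreter actually applies.

```latex
% Upper bound on an absolute value: two linear constraints, no new variable.
|x| \le b \quad\Longleftrightarrow\quad x \le b \ \text{ and } \ -x \le b

% Minimizing an absolute value: replace |x| with an auxiliary variable t.
\min\ |x| \quad\text{is equivalent to}\quad \min\ t \quad \text{s.t.}\quad t \ge x,\ \ t \ge -x
```

Other uses, such as a lower bound $|x| \ge b$, describe a non-convex set and generally require an auxiliary binary variable, which is one reason why handwritten linearizations are error-prone.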
ZIMPL, though helpful, does not remove all challenges in modeling, and the burden on the experts remains high. In this study, we propose to help the experts further. Rather than handcrafting the MILP model, we propose an approach that automates the synthesis of MILP models in ZIMPL from the underlying data about the problem. We assume that the dimension sets, the parameters, and the variables of the object are given. For instance, for the diet plan, one dimension is a set of food items and another is a set of nutrients, the parameters consist of the volumes of nutrients in food items, and the variables represent the quantities of food items in the diet plan. We also assume that a training set of examples of feasible solutions is available, e.g., a set of exemplary diet plans meeting all nutrition constraints. A diet advisor may easily collect such data during her service; however, transforming this data into a MILP model requires proper technical training.
Building a MILP model can be decomposed into two largely independent tasks: (1) the design of the objective function, and (2) the design of the constraints. The latter task is more demanding because the number of constraints is usually large, while a typical model has only one objective function. Hence, in this work, we focus on constraint synthesis.
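The assumed inputs can be pictured as plain data. The sketch below encodes hypothetical dimension sets, a parameter, and a small training set of feasible diet plans, and then measures how many examples a candidate set of constraints accepts. All names and numbers are invented for illustration, and the `coverage` measure is an assumption made here, not the fitness function defined in the paper.

```python
# Hypothetical inputs for the diet example (names and numbers are illustrative).
FOODS = ["oatmeal", "chicken"]                      # dimension set
NUTRIENTS = ["protein", "energy"]                   # dimension set
AMOUNT = {                                          # parameter: nutrient units per serving
    ("oatmeal", "protein"): 4, ("oatmeal", "energy"): 110,
    ("chicken", "protein"): 32, ("chicken", "energy"): 205,
}
# Training set: feasible diet plans, one value per variable x[food].
EXAMPLES = [
    {"oatmeal": 3, "chicken": 1},
    {"oatmeal": 1, "chicken": 2},
    {"oatmeal": 0, "chicken": 3},
]


def nutrient_lower_bounds(minimum):
    """Build one constraint per nutrient n: sum over f of AMOUNT[f, n] * x[f] >= minimum[n]."""
    def make(n):
        return lambda x: sum(AMOUNT[f, n] * x[f] for f in FOODS) >= minimum[n]
    return [make(n) for n in NUTRIENTS]


def coverage(constraints, examples):
    """Fraction of examples that satisfy every constraint of the candidate model."""
    satisfied = sum(all(c(x) for c in constraints) for x in examples)
    return satisfied / len(examples)


if __name__ == "__main__":
    candidate = nutrient_lower_bounds({"protein": 40, "energy": 500})
    print(coverage(candidate, EXAMPLES))
```

A candidate that rejects some training examples is too tight. One that accepts all of them may still be too loose, which is why feasible examples alone make constraint synthesis a one-class learning task.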
The primary contributions of this study relate to the verification of the main research hypothesis: the MILP constraints in ZIMPL can be synthesized from the underlying problem data using Grammatical Evolution (GE) [5].
More precisely, the contributions are:
- The formalization of the Constraint Synthesis Problem (CSP) in Section 2.3
- The proposition in Section 4 of the Grammatical Evolution for Constraint Synthesis (GECS) algorithm for CSP
- The empirical verification of the properties of GECS using fourteen real-world and four synthetic CSPs in Section 5.
GECS first generates a problem-specific context-free grammar from the input data, then runs GE to synthesize the constraints. GE is an evolutionary algorithm that uses integer vectors as genotypes and transforms them into code using the given grammar. GE has proved effective in many code synthesis problems [5], [6], [7].
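The genotype-to-code step can be sketched with the commonly used GE mapping, in which each integer (codon) selects a production of the leftmost non-terminal by taking the codon modulo the number of available productions. The toy grammar, the function name `map_genotype`, and the `max_wraps` parameter are assumptions made for this illustration. The grammar is not the one GECS generates, and it produces plain inequalities rather than ZIMPL code.

```python
# A toy grammar (hypothetical; not the problem-specific grammar built by GECS).
GRAMMAR = {
    "<constraint>": [["<expr>", " <= ", "<const>"], ["<expr>", " >= ", "<const>"]],
    "<expr>": [["<term>"], ["<term>", " + ", "<expr>"]],
    "<term>": [["x"], ["y"], ["2 * ", "x"], ["3 * ", "y"]],
    "<const>": [["0"], ["5"], ["10"]],
}


def map_genotype(genotype, start="<constraint>", max_wraps=2):
    """Map an integer vector to a string by expanding the leftmost non-terminal.

    Each expansion consumes one codon and picks production number
    codon % len(productions). The genotype is reused (wrapped) up to
    max_wraps times; if codons still run out, the mapping fails with None.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genotype) * (max_wraps + 1)
    while symbols:
        symbol = symbols.pop(0)
        if symbol not in GRAMMAR:          # terminal: emit as-is
            output.append(symbol)
            continue
        if used >= limit:                  # ran out of codons
            return None
        productions = GRAMMAR[symbol]
        codon = genotype[used % len(genotype)]
        used += 1
        symbols = list(productions[codon % len(productions)]) + symbols
    return "".join(output)


if __name__ == "__main__":
    print(map_genotype([1, 1, 2, 0, 3, 2]))
```

Because every integer vector maps either to a syntactically valid string or to a failure, the evolutionary search can apply ordinary mutation and crossover to genotypes without grammar-aware operators.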
GECS is not the first algorithm for CSP; however, to our knowledge, it is the first one that synthesizes MILP constraints in a high-level modeling language. The use of a high-level language allows for the generation of constraints that automatically adapt to the data and facilitates the synthesis of large sets of related constraints. This offers a great advantage over contemporary algorithms, most of which fine-tune the weights and produce independent constraints tied to the training examples. As empirical evidence shows, this also makes GECS resistant to the curse of dimensionality [8], from which all other referenced algorithms suffer. Section 3 discusses the variants of CSP and compares GECS to contemporary algorithms. Section 5.3 empirically confirms the superiority of GECS over two other algorithms in terms of test-set performance. Section 6 discusses the advantages and disadvantages of GECS in the context of other algorithms. Section 7 concludes this work and outlines possible extensions to GECS.
Appendix A shows the best models synthesized by GECS in this work. Appendix B lists the abbreviations and the symbols used in the text.
Section snippets
Terminology
We define several distinct formal objects that share common names in the literature. To make things clear, we use the term problem to refer to the Constraint Synthesis Problem (CSP), the term model to refer to the MILP model that in fact consists of the input and the output of the CSP, and the term solution to refer to the solution of the MILP model. We also use the terms model and set of constraints interchangeably, as the latter is an essential part of the former and we do not synthesize
Related work
In this section we first discuss the alternatives to ZIMPL, then review different formulations of a CSP, and finally survey the works on the synthesis of Mathematical Programming (MP) models.
Constraint synthesis algorithm
Grammatical Evolution for Constraint Synthesis (GECS), the main contribution of this study, is the algorithm solving the CSP posed in Section 2.3. The input to GECS is the ZIMPL snippet consisting of the definitions of the sets of parameters, dimension sets, and variables, together with the matrix of examples, which the reference implementation reads in the CSV format. GECS assumes that the ZIMPL snippet is complete and consists of all symbols available for use in the model. GECS yields a ready-to-use
Experiment
We seek the answers to four experimental questions:
- What is the best parameter setting for GECS?
- How well does GECS scale with the dimensions of CSPs?
- How well does GECS compare to its competitors?
- How well do the synthesized models work in optimization?
We use eighteen MILP models in ZIMPL as ground truth in eighteen benchmark CSPs. Table 2 shows the statistics of these models: types, numbers, and dimensionality of the involved symbols. The prefix in the name of the model denotes its source: ‘a’
Discussion
A MILP model in ZIMPL is general in the sense that it represents an entire class of real-world objects sharing the same constraints and the same objective function and differing only in the values of the parameters and dimensions. For instance, for the zdiet MILP model in ZIMPL, two diet plans with different food dimensions have the same constraints and the objective function and differ only in the food set. This is a qualitative difference w.r.t. the LP format [4] that effectively stores
Conclusions and future work
We formally posed the Constraint Synthesis Problem for MILP models in the ZIMPL high-level modeling language, proposed the GECS algorithm aimed at solving CSP, and experimentally verified its properties and performance w.r.t. the contemporary algorithms. GECS synthesizes MILP models guided by the grammar of ZIMPL and the exemplary solutions. This approach is qualitatively different from that of the majority of previous algorithms, which numerically optimize the weights in the constraints. This mode of
Funding
T.P. Pawlak acknowledges the support of National Science Centre Poland grant 2016/23/D/ST6/03735, and the National Centre for Research and Development Poland grant LIDER/14/0086/L-10/18/NCBR/2019. M. O’Neill acknowledges the support of Science Foundation Ireland grants 13/IA/1850 and 13/RC/2094.
CRediT authorship contribution statement
Tomasz P. Pawlak: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Resources, Writing - original draft, Visualization, Supervision, Project administration, Funding acquisition. Michael O’Neill: Validation, Resources, Writing - review & editing, Funding acquisition.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (72)
- The upper bound theorem for polytopes: an easy proof of its asymptotic version. Computational Geometry (1995)
- One-class synthesis of constraints for mixed-integer linear programming with C4.5 decision trees. Appl Soft Comput (2018)
- Synthesis of mathematical programming models with one-class evolutionary strategies. Swarm Evol Comput (2019)
- Automatic synthesis of constraints from examples using mixed integer linear programming. Eur J Oper Res (2017)
- Empirical decision model learning. Artif Intell (2017)
- Training a 3-node neural network is NP-complete. Neural Networks (1992)
- Ellipsoidal one-class constraint acquisition for quadratically constrained programming. Eur J Oper Res (2021)
- Generalization as search. Artif Intell (1982)
- Structured learning modulo theories. Artif Intell (2017)
- A survey of known results and research areas for n-queens. Discrete Math (2009)
- Improved differential evolution for noisy optimization. Swarm Evol Comput
- Noisy evolutionary optimization algorithms – a comprehensive survey. Swarm Evol Comput
- Model Building in Mathematical Programming
- Rapid Mathematical Programming
- Grammatical evolution: Evolutionary automatic programming in an arbitrary language
- Experiments in program synthesis with grammatical evolution: A focus on integer sorting
- The Elephant in the Room: Towards the Application of Genetic Programming to Automatic Programming
- Dynamic programming
- Machine learning: The art and science of algorithms that make sense of data
- Mixed-data classificatory programs I - agglomerative systems. Australian Computer Journal
- The hit-and-run sampler: A globally reaching Markov chain sampler for generating arbitrary multivariate distributions. Proceedings of the 28th Conference on Winter Simulation
- One-class classification: taxonomy of study and review of techniques. Knowl Eng Rev
- On the shape of a set of points in the plane. IEEE Trans. Inf. Theory
- AMPL: A modeling language for mathematical programming
- One-class constraint acquisition with local search. Proceedings of the Genetic and Evolutionary Computation Conference
- CMA-ES for one-class constraint synthesis. Proceedings of the 2020 Genetic and Evolutionary Computation Conference
- Synthesis of mathematical programming constraints with genetic programming
- Learning linear programs from data. 2019 IEEE 31st International Conference on Tools with Artificial Intelligence (ICTAI)
- On the complexity of polyhedral separability. Discrete & Computational Geometry
- Syntax-guided synthesis. 2013 Formal Methods in Computer-Aided Design
- Large-margin convex polytope machine. Proceedings of the 27th International Conference on Neural Information Processing Systems - Volume 2
- Model agnostic solution of CSPs via deep learning: A preliminary study
Cited by (6)
Optimization with constraint learning: A framework and survey
2024, European Journal of Operational Research
Formulation of non-local space-fractional plate model and validation for composite micro-plates
2023, International Journal of Engineering Science
Continuous discovery of Causal nets for non-stationary business processes using the Online Miner
2022, European Journal of Operational Research
Citation Excerpt: The produced QCQP models are guaranteed to be convex, and thus are solvable in polynomial time. GECS by Pawlak & O’Neill (2021) is a grammatical evolution-based algorithm for the synthesis of mixed-integer LP models. Contrary to the above works, GECS uses a high-level modeling language that facilitates the synthesis of far larger models.