A methodology to automatically optimize dynamic memory managers applying grammatical evolution
Introduction
Nowadays, multimedia applications are mostly developed in C++. Such programs tend to make intensive use of dynamic memory because of their inherent data management. In C++, dynamic memory is allocated via the operator new() and deallocated via the operator delete(), which most compilers map directly to the malloc() and free() functions of the standard C library. Therefore, the creation and destruction of objects is managed by a general-purpose memory allocator, which may provide good runtime and low memory usage for a wide range of applications (Johnstone and Wilson, 1999, Lea, 2010).
However, using specialized Dynamic Memory Managers (DMMs) that take advantage of application-specific behavior can dramatically improve application performance (Barrett and Zorn, 1993, Grunwald and Zorn, 1993). In this regard, three of the twelve integer benchmarks included in SPEC (parser, gcc, and vpr (SPEC, 2013)), as well as several server applications, use one or more custom DMMs (Berger et al., 2001).
On the one hand, studies have shown that dynamic memory management can consume up to 38% of the execution time in C++ applications (Calder et al., 1995). Thus, the performance of dynamic memory management can have a substantial effect on the overall performance of C++ applications. On the other hand, new multimedia devices must rely on dynamic memory for a very significant part of their functionality because of the inherent unpredictability of the input data. These devices also integrate multiple services, such as multimedia and wireless network communications, which compete for memory space. Consequently, dynamic memory management influences the global memory usage of the system (Atienza et al., 2006b). Finally, energy consumption has become a real issue in overall system design because of circuit reliability and packaging costs (Vijaykrishnan et al., 2003). However, it has recently been shown that the DMM typically accounts for only 1% of the total energy consumed by the memory subsystem during the execution of a given application (Díaz et al., 2011). Thus, the energy consumed by the DMM is not relevant in this case, and the optimization of the dynamic memory subsystem has two goals that cannot be considered independently: performance and memory usage. No single memory allocator can deliver the best performance and the lowest memory usage for all programs. However, a custom memory allocator that works best for a particular program can be developed using grammatical evolution (Risco-Martin et al., 2009).
To reach higher performance, programmers often write their own ad hoc custom memory allocators as macros or monolithic functions in order to avoid function call overhead. This approach, implemented to improve application performance, is enshrined in the best practices of skilled computer programmers (Meyers, 1995). Nonetheless, this kind of code is brittle and hard to maintain or reuse, and as the application evolves and its requirements vary, it can be difficult to adapt the memory allocator. Moreover, writing these memory allocators is both error-prone and difficult. Indeed, custom and efficient memory allocators are complicated pieces of software that require a substantial engineering effort.
In this work, we have developed a framework based on grammatical evolution to automatically design optimized DMMs for a target application, minimizing memory usage and maximizing performance. Fig. 1 depicts the optimization process. First, as Fig. 1(a) shows, we run the application under study together with an instrumentation tool, which logs all the required information into an external file: identification of the object created/deleted, operation (allocation or deallocation), object size in bytes, and memory address. Since the whole DMM exploration process is performed by simulating the generated DMMs against this profiling report, this task must be done just once. In the following phase, as Fig. 1(b) shows, we automatically examine all the information contained in the profiling report to obtain a specialized grammar for the target system. As a result, some incomplete rules in the original grammar (see Section 5), such as the different block sizes, are automatically defined according to the obtained profiling. To this end, we have developed a tool called Grammar Generator. The last phase is the optimization process. As Fig. 1(c) depicts, this phase consists of a Grammatical Evolution Algorithm (GEA) that takes the grammar generated in the previous phase and the profiling report of the application as inputs. The GEA is supported by a DMM simulator that tests the behavior of every DMM generated from the grammar when applied to the application. Our GEA continuously generates different DMM implementations from the grammar file. When a DMM is generated (DMM(j) in Fig. 1(c)), it is passed to the DMM simulator. Then, the simulator emulates the behavior of the application by processing every line in the profiling report. Such emulation does not allocate or deallocate memory from the computer as the real application does, but maintains useful information about how the structure of the selected DMM evolves over time.
After the profiling report has been simulated, the DMM simulator returns the fitness of the current DMM to the GEA. The fitness is computed as a weighted sum of the performance and the memory usage of the proposed DMM for the target device and application under study. Finally, the DMM with the lowest fitness is returned as the solution (optimized DMM).
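The replay step described above can be pictured with a small sketch. It is only an illustration under stated assumptions, not the authors' simulator: the trace format (one `id op size address` record per line), the names `Event`, `parse_trace` and `simulate`, and the toy policy (round each request up to a fixed block-size class) are all hypothetical. No real memory is allocated; the code only keeps bookkeeping about the simulated DMM.

```python
from dataclasses import dataclass


@dataclass
class Event:
    obj_id: int   # identifier of the object created/deleted
    op: str       # "alloc" or "free"
    size: int     # requested size in bytes
    address: int  # address recorded by the profiler


def parse_trace(text):
    """Parse a hypothetical profiling report: one 'id op size address' per line."""
    events = []
    for line in text.strip().splitlines():
        obj_id, op, size, address = line.split()
        events.append(Event(int(obj_id), op, int(size), int(address, 16)))
    return events


def simulate(events, block_sizes):
    """Replay the trace against a toy DMM that rounds requests up to a size class.

    Returns (peak_bytes, steps): the peak simulated memory in use and a crude
    operation count standing in for execution time.
    """
    classes = sorted(block_sizes)
    live = {}  # obj_id -> bytes reserved for that object
    in_use = peak = steps = 0
    for ev in events:
        if ev.op == "alloc":
            reserved = ev.size  # exact fit if no class is large enough
            for c in classes:   # each probed class costs one step
                steps += 1
                if c >= ev.size:
                    reserved = c
                    break
            live[ev.obj_id] = reserved
            in_use += reserved
            peak = max(peak, in_use)
        else:
            in_use -= live.pop(ev.obj_id)
            steps += 1
    return peak, steps


TRACE = """
1 alloc 24 0x1000
2 alloc 100 0x1020
1 free 24 0x1000
3 alloc 60 0x1000
"""

peak, steps = simulate(parse_trace(TRACE), block_sizes=[32, 64, 128])
print(peak, steps)  # 192 7
```

Because the trace is parsed once and replayed for each candidate, calling `simulate` with a different `block_sizes` list evaluates another candidate without rerunning the application.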
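The selection step can be sketched as follows. The weights, the normalization references and the candidate measurements are illustrative assumptions; the text only states that the fitness is a weighted sum of performance and memory usage and that the lowest value wins.

```python
def fitness(exec_time, mem_usage, w_time=0.5, w_mem=0.5,
            time_ref=1.0, mem_ref=1.0):
    """Weighted sum of normalized execution time and memory usage (lower is better).

    w_time, w_mem, time_ref and mem_ref are hypothetical parameters; the
    references scale both objectives to comparable magnitudes.
    """
    return w_time * (exec_time / time_ref) + w_mem * (mem_usage / mem_ref)


# Hypothetical simulator outputs: name -> (execution time, memory usage in bytes).
candidates = {
    "dmm_a": (120.0, 4096),
    "dmm_b": (90.0, 8192),
    "dmm_c": (100.0, 5120),
}

scores = {
    name: fitness(t, m, time_ref=120.0, mem_ref=8192)
    for name, (t, m) in candidates.items()
}
best = min(scores, key=scores.get)  # the DMM with the lowest fitness
print(best)  # dmm_c
```

Normalizing both terms before weighting keeps a quantity measured in bytes from dominating one measured in time units; how the two objectives are actually scaled in the framework is not specified here.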
The rest of the paper is organized as follows. First, Section 2 describes some recent advances in the area of DMMs. Next, Section 3 defines the design space of memory allocators. Then, Section 4 details the design and implementation of the DMM simulator, as well as some configuration examples. Section 5 details how grammatical evolution is applied to the DMM optimization. Section 6 presents our experimental methodology and the six benchmarks selected, whereas Section 7 shows the results for these benchmarks. Finally, Section 8 draws conclusions and outlines future work.
Related work
Several approaches have been presented in the last decade to design flexible and efficient infrastructures for building custom and general-purpose memory allocators (Berger et al., 2001, Atienza et al., 2006b, Atienza et al., 2006a). All the proposed methodologies are based on high-level programming where C++ templates and object-oriented programming techniques are used. They allow the software engineer to compose both general-purpose and custom memory allocator mechanisms. The aforementioned
Dynamic memory management
In this section, we summarize the main characteristics of dynamic memory management, as well as the new classification of memory allocators, which is later considered in the implementation of the simulator and the grammar definition.
DMM simulator design
The global DMM optimization process needs an external simulator to evaluate every candidate DMM generated by the GEA (see Fig. 1(c)). In this section, we motivate and describe our DMM simulator and outline its design goals.
As introduced in Section 1, several existing libraries allow the implementation of both general-purpose and custom DMMs. However, exploration techniques cannot be easily applied. Indeed, each custom design must be implemented, compiled and validated against a target
Grammatical evolution applied to DMM optimization
Grammatical evolution (GE) (e.g., O’Neill and Ryan, 2003, O’Neill and Ryan, 2001, Brabazon and O’Neill, 2006, Dempsey et al., 2007, Brabazon et al., 2008) is a grammar-based form of Genetic Programming (GP) (Poli et al., 2008). It combines principles from molecular biology with the representational power of formal grammars. GE has a rich modularity, which provides a unique flexibility. This makes it possible to use alternative search strategies (evolutionary, deterministic or some other approach),
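As an illustration of how GE turns an integer genotype into a grammar sentence, the sketch below applies the usual modulo rule: the leftmost non-terminal is expanded with the production whose index is the current codon modulo the number of available productions. The toy grammar, the function name `decode` and the wrapping limit are assumptions made for this example; they are not the grammar or the implementation used in the paper.

```python
# Toy grammar (an assumption for illustration only).
GRAMMAR = {
    "<dmm>": [["<policy>", " ", "<blocks>"]],
    "<policy>": [["first_fit"], ["best_fit"], ["exact_fit"]],
    "<blocks>": [["<size>"], ["<size>", ",", "<blocks>"]],
    "<size>": [["16"], ["32"], ["64"], ["128"]],
}


def decode(genotype, grammar=GRAMMAR, start="<dmm>", max_wraps=2):
    """Map a list of integer codons to a sentence of the grammar.

    A codon is consumed only when the non-terminal has more than one
    production. The genotype is reused ("wrapped") up to max_wraps times;
    None is returned if the derivation is still incomplete after that.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genotype) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)          # always expand the leftmost symbol
        if sym not in grammar:        # terminal: emit it
            output.append(sym)
            continue
        choices = grammar[sym]
        if len(choices) == 1:
            chosen = choices[0]
        else:
            if used >= limit:         # ran out of codons, invalid individual
                return None
            codon = genotype[used % len(genotype)]
            used += 1
            chosen = choices[codon % len(choices)]
        symbols = list(chosen) + symbols
    return "".join(output)


print(decode([7, 3, 5, 2, 10]))  # best_fit 32,64
```

Here codon 7 selects `best_fit` (7 mod 3 = 1), codon 3 selects the recursive `<blocks>` production, and the remaining codons choose the sizes 32 and 64 and end the list. A genotype such as `[1]` keeps selecting the recursive production and is rejected once the wrapping limit is reached.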
Experimental methodology
In this section, we describe the set of benchmarks used, the fitness function employed and the general configuration of our optimization algorithm.
Results
We simulated the benchmarks described in the previous section by considering five general-purpose allocators: the Kingsley allocator (labeled as KNG in the following figures), Doug Lea's allocator (labeled as LEA), a buddy system based on the Fibonacci allocation algorithm (labeled as FIB), a list of 10 segregated free-lists (S10), and an exact segregated free list allocator (EXA). Finally, we compared the results with the custom DMM obtained with our proposed automatic exploration process
Conclusions and future work
New multimedia devices have increased their capabilities and now complex applications can be ported to them. Such applications include intensive dynamic memory requirements that must be heavily optimized for an efficient mapping on these devices. To efficiently use dynamic memory in these applications, software engineers often write custom allocators from scratch, which is a difficult and error-prone process.
In this paper, we have presented a new multi-objective optimization method based on
References (31)
- et al. Efficient system-level prototyping of power-aware dynamic memory managers for embedded systems. Integr. VLSI J. (2006)
- et al. The design and analysis of a quantitative simulator for dynamic memory management. J. Syst. Softw. (2004)
- et al. A parallel evolutionary algorithm to optimize dynamic memory managers in embedded systems. Parallel Comput. (2010)
- et al. Simulation of high-performance memory allocators. Microprocess. Microsyst. (2011)
- et al. Systematic dynamic memory management design methodology for reduced memory footprint. ACM Trans. Des. Autom. Electron. Syst. (2006)
- et al. Using lifetime predictors to improve memory allocation performance. SIGPLAN Not. (1993)
- et al. Composing high-performance memory allocators
- et al. Reconsidering custom memory allocation
- et al. Biologically Inspired Algorithms for Financial Modelling (2006)
- et al. An introduction to evolutionary computation in finance. IEEE Comput. Intell. Magaz. (2008)
- Quantifying behavioral differences between C and C++ programs. J. Program. Lang.
- Quantifying the impact of dynamic memory managers into memory-intensive applications
- Constant creation in grammatical evolution. Int. J. Innov. Comput. Appl.
- CustoMalloc: efficient synthesized memory allocators. Softw. Pract. Exp.
- The memory fragmentation problem: solved? SIGPLAN Not.
José L. Risco-Martín is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM), Spain. His research interests focus on computational theory of modeling and simulation, with emphasis on Discrete Event Systems Specification (DEVS), dynamic memory management for embedded systems and evolutionary computation.
J. Manuel Colmenar is assistant professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM), Spain. His research interests focus on design methodologies for integrated systems and high-performance embedded systems, novel architectures for logic and memories in forthcoming nano-scale electronics, dynamic memory management and memory hierarchy optimizations for embedded systems, and low-power design of embedded systems.
J. Ignacio Hidalgo is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM). His research interests include processor design with special emphasis on the application of novel timing methodologies, memory hierarchy optimization and management, thermal-aware designs, and the application of bio-inspired optimization techniques in CAD problems.
Juan Lanchares is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM). His research interests focus on design methodologies for integrated systems and high-performance embedded systems, processor design with special emphasis on the application of novel timing methodologies, dynamic memory management and memory hierarchy optimizations for embedded systems, and the application of bio-inspired optimization techniques in CAD problems.
Josefa Díaz Álvarez obtained an M.S. degree in Computer Engineering in 2007 from the University of Extremadura (UNEX). She is currently pursuing her PhD at the Complutense University of Madrid (UCM) and is assistant professor of Computer Science at the Computer and Technology Department at the Merida campus of the UNEX. Her current research interests include evolutionary algorithms, optimization of hardware resources and microprocessors.