A methodology to automatically optimize dynamic memory managers applying grammatical evolution

https://doi.org/10.1016/j.jss.2013.12.044

Highlights

  • In this paper, we present a new multi-objective optimization method based on genetic programming that can be used to optimize complex DMM implementations for highly dynamic applications. This method greatly reduces the effort designers spend exploring multi-layered DMMs and enables DMM implementations to be refined in an automated way.

  • The proposed approach leads to important savings in overall system integration time for dynamic applications. In addition, the method obtains optimal implementations of DMM structures with respect to the designer's key metrics.

  • Our experimental results with six benchmarks and five general-purpose DMMs show that the presented optimization approach significantly reduces execution time and memory usage, by up to 59.27% on average in terms of global fitness.

  • The results obtained so far have outlined other interesting future research lines in the area of DMM implementation optimizations using grammatical evolution. Initially, the grammar can be extended in order to obtain more and more DMM candidates.

Abstract

Modern consumer devices must execute multimedia applications that exhibit high resource utilization. To execute these applications efficiently, the dynamic memory subsystem needs to be optimized. This complex task can be tackled in two complementary ways: optimizing the application source code or designing custom dynamic memory management mechanisms. The first approach is well established, and several automatic methodologies have been proposed. Regarding the second approach, software engineers often write custom dynamic memory managers from scratch, which is difficult and error-prone work. This paper presents a novel way to automatically generate custom dynamic memory managers that optimize both the performance and the memory usage of the target application. The design space is pruned using grammatical evolution, which converges to the best dynamic memory manager implementation for the target application. Our methodology achieves important improvements (62.55% and 30.62% better on average in performance and memory usage, respectively) when its results are compared to five different general-purpose dynamic memory managers.

Introduction

Nowadays, multimedia applications are mostly developed using C++. This kind of software tends to make intensive use of dynamic memory due to its inherent data management. In C++, dynamic memory is allocated via the operator new() and deallocated by the operator delete(), which are mapped directly to the malloc() and free() functions of the standard C library in most compilers. Therefore, the creation and destruction of objects is managed by a general-purpose memory allocator, which may provide good runtime and low memory usage for a wide range of applications (Johnstone and Wilson, 1999, Lea, 2010).

However, using specialized Dynamic Memory Managers (DMMs) that take advantage of application-specific behavior can dramatically improve application performance (Barrett and Zorn, 1993, Grunwald and Zorn, 1993). In this regard, three out of the twelve integer benchmarks included in SPEC (parser, gcc, and vpr (SPEC, 2013)) and several server applications use one or more custom DMMs (Berger et al., 2001).

On the one hand, studies have shown that dynamic memory management can consume up to 38% of the execution time in C++ applications (Calder et al., 1995). Thus, the performance of dynamic memory management can have a substantial effect on the overall performance of C++ applications. On the other hand, new multimedia devices must rely on dynamic memory for a very significant part of their functionality due to the inherent unpredictability of the input data. These devices also integrate multiple services, such as multimedia and wireless network communications, which compete for memory space. Dynamic memory management therefore influences the global memory usage of the system (Atienza et al., 2006b). Finally, energy consumption has become a real issue in overall system design due to circuit reliability and packaging costs (Vijaykrishnan et al., 2003). However, it has recently been shown that the DMM usually accounts for only 1% of the total energy consumed by the memory subsystem during the execution of a given application (Díaz et al., 2011). Thus, the energy consumption of the DMM is not relevant in this case, and the optimization of the dynamic memory subsystem has two goals that cannot be considered independently: performance and memory usage. No single memory allocator can deliver the best performance and the lowest memory usage for all programs. However, a custom memory allocator that works best for a particular program can be developed using grammatical evolution (Risco-Martin et al., 2009).

To reach higher performance, programmers often write their own ad hoc custom memory allocators as macros or monolithic functions in order to avoid function call overhead. This approach, implemented to improve application performance, is enshrined in the best practices of skilled computer programmers (Meyers, 1995). Nonetheless, this kind of code is brittle and hard to maintain or reuse, and it can be difficult to adapt the memory allocator as the application evolves and its requirements vary. Moreover, writing these memory allocators is both error-prone and difficult. Indeed, custom and efficient memory allocators are complicated pieces of software that require a substantial engineering effort.

In this work, we have developed a framework based on grammatical evolution to automatically design optimized DMMs for a target application, minimizing memory usage and maximizing performance. Fig. 1 depicts the optimization process. First, as Fig. 1(a) shows, we run the application under study together with an instrumentation tool, which logs all the required information into an external file: the identifier of the object created or deleted, the operation (allocation or deallocation), the object size in bytes, and the memory address. Since the whole DMM exploration process is performed by simulating the generated DMMs against the profiling report, this task must be done just once.

In the following phase, as Fig. 1(b) shows, we automatically examine all the information contained in the profiling report, obtaining a specialized grammar for the target system. As a result, some incomplete rules in the original grammar (see Section 5), such as the different block sizes, are automatically defined according to the obtained profiling. To this end, we have developed a tool called Grammar Generator.

The last phase is the optimization process. As Fig. 1(c) depicts, this phase consists of a Grammatical Evolution Algorithm (GEA) that takes the grammar generated in the previous phase and the profiling report of the application as inputs. The GEA is supported by a DMM simulator that tests the behavior of every DMM generated by the grammar when applied to the application. Our GEA constantly generates different DMM implementations from the grammar file. When a DMM is generated (DMM(j) in Fig. 1(c)), it is received by the DMM simulator. Then, the simulator emulates the behavior of the application, processing every line in the profiling report. Such emulation does not allocate or deallocate memory from the computer as the real application does, but it maintains useful information about how the structure of the selected DMM evolves over time. After the profiling report has been simulated, the DMM simulator returns the fitness of the current DMM to the GEA. The fitness is computed as a weighted sum of the performance and memory usage of the proposed DMM for the target device and application under study. Finally, the DMM with the lowest fitness is returned as the solution (optimized DMM).
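The simulate-then-score loop described above can be illustrated with a minimal Python sketch. It is not the authors' simulator: the `TraceEvent` record (the address field is omitted because this toy model does not use it), the size-class DMM model, the step-counting cost model, and the normalization against a reference DMM are all assumptions made for illustration. Only the overall shape, replaying a profiling report without touching real memory and then combining time and memory in a weighted sum where lower is better, follows the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceEvent:
    obj_id: int   # identifier of the object created or deleted
    op: str       # "alloc" or "free"
    size: int     # object size in bytes (ignored for "free")


def simulate(trace, size_classes):
    """Replay a trace against a toy DMM model; no real memory is touched.

    size_classes is an ascending list of block sizes (a hypothetical DMM
    parameter). Returns (steps, peak_bytes): steps is a crude proxy for
    execution time, peak_bytes is the peak memory held in blocks.
    """
    live = {}                      # obj_id -> block size currently held
    in_use = peak = steps = 0
    for ev in trace:
        if ev.op == "alloc":
            # Fallback when no class is large enough: exact-size block.
            block, cost = ev.size, len(size_classes) + 1
            # Linear search: cost grows with the number of classes inspected.
            for i, c in enumerate(size_classes):
                if c >= ev.size:
                    block, cost = c, i + 1
                    break
            live[ev.obj_id] = block
            in_use += block
            peak = max(peak, in_use)
            steps += cost
        elif ev.op == "free":
            in_use -= live.pop(ev.obj_id)
            steps += 1
        else:
            raise ValueError(f"unknown operation: {ev.op!r}")
    return steps, peak


def fitness(steps, peak, ref_steps, ref_peak, w_time=0.5, w_mem=0.5):
    """Weighted sum of normalized time and memory; lower is better."""
    return w_time * steps / ref_steps + w_mem * peak / ref_peak


# Hypothetical six-line profiling report.
TRACE = [
    TraceEvent(1, "alloc", 24),
    TraceEvent(2, "alloc", 100),
    TraceEvent(1, "free", 0),
    TraceEvent(3, "alloc", 60),
    TraceEvent(2, "free", 0),
    TraceEvent(3, "free", 0),
]

if __name__ == "__main__":
    ref = simulate(TRACE, [128])            # coarse candidate used as reference
    cand = simulate(TRACE, [32, 64, 128])   # finer-grained candidate
    print(ref, cand, fitness(*cand, *ref))
```

In this toy model the finer-grained candidate holds less memory but spends more search steps, so which candidate has the lower fitness depends on the weights, which is the trade-off the weighted sum is meant to expose.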

The rest of the paper is organized as follows. First, Section 2 describes some recent advances in the area of DMMs. Next, Section 3 defines the design space of memory allocators. Then, Section 4 details the design and implementation of the DMM simulator, as well as some configuration examples. Section 5 details how grammatical evolution is applied to the DMM optimization. Section 6 shows our experimental methodology, presenting the six benchmarks selected, whereas Section 7 shows the results for these benchmarks. Finally, Section 8 draws conclusions and future work.

Section snippets

Related work

Several approaches have been presented in the last decade to design flexible and efficient infrastructures for building custom and general-purpose memory allocators (Berger et al., 2001, Atienza et al., 2006b, Atienza et al., 2006a). All the proposed methodologies are based on high-level programming where C++ templates and object-oriented programming techniques are used. They allow the software engineer to compose both general-purpose and custom memory allocator mechanisms. The aforementioned

Dynamic memory management

In this section, we summarize the main characteristics of dynamic memory management, as well as the new classification of memory allocators, which is later considered in the implementation of the simulator and the grammar definition.

DMM simulator design

The global DMM optimization process needs an external simulator to evaluate every candidate DMM generated by the GEA (see Fig. 1(c)). In this section, we motivate and describe our DMM simulator and outline its design goals.

As introduced in Section 1, several existing libraries allow the implementation of both general-purpose and custom DMMs. However, exploration techniques cannot be easily applied. Indeed, each custom design must be implemented, compiled and validated against a target

Grammatical evolution applied to DMM optimization

Grammatical evolution (GE) (e.g., O’Neill and Ryan, 2003, O’Neill and Ryan, 2001, Brabazon and O’Neill, 2006, Dempsey et al., 2007, Brabazon et al., 2008) is a grammar-based form of Genetic Programming (GP) (Poli et al., 2008). It combines principles from molecular biology with the representational power of formal grammars. GE has a rich modularity, which provides a unique flexibility. This makes it possible to use alternative search strategies (evolutionary, deterministic or some other approach),
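As a rough illustration of how GE turns a genome into a program, the sketch below implements the standard genotype-to-phenotype mapping: the leftmost non-terminal is expanded, and the next integer codon, taken modulo the number of available productions, selects the rule. The toy grammar is hypothetical and is not the DMM grammar of the paper; the convention of consuming a codon only when a non-terminal has more than one production, and the `max_wraps` limit, are likewise assumptions of this sketch.

```python
# Hypothetical toy grammar: non-terminals map to lists of productions.
GRAMMAR = {
    "<dmm>": [["<fit>", "-", "<order>"]],
    "<fit>": [["first_fit"], ["best_fit"], ["exact_fit"]],
    "<order>": [["LIFO"], ["FIFO"]],
}


def decode(genome, grammar, start="<dmm>", max_wraps=2):
    """Map a list of integer codons to a phenotype string.

    Returns None when the genome (including wraps) is exhausted before
    the derivation finishes, i.e. the individual is invalid.
    """
    symbols = [start]      # pending symbols, leftmost first
    out = []               # terminals emitted so far
    used = 0               # number of codons consumed
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in grammar:
            out.append(sym)            # terminal: emit as-is
            continue
        choices = grammar[sym]
        if len(choices) == 1:
            chosen = choices[0]        # no choice to make, no codon consumed
        else:
            if used >= limit:
                return None
            codon = genome[used % len(genome)]   # wrap around the genome
            used += 1
            chosen = choices[codon % len(choices)]
        symbols = list(chosen) + symbols         # leftmost derivation
    return "".join(out)


if __name__ == "__main__":
    print(decode([7, 4], GRAMMAR))
```

Because the search operates on plain integer lists, the same decoder can sit behind a genetic algorithm or any other search strategy, which is the modularity the paragraph above refers to.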

Experimental methodology

In this section, we describe the set of benchmarks used, the fitness function employed and the general configuration of our optimization algorithm.

Results

We simulated the benchmarks described in the previous section by considering five general-purpose allocators: the Kingsley allocator (labeled as KNG in the following figures), Doug Lea's allocator (labeled as LEA), a buddy system based on the Fibonacci allocation algorithm (labeled as FIB), a list of 10 segregated free-lists (S10), and an exact segregated free list allocator (EXA). Finally, we compared the results with the custom DMM obtained with our proposed automatic exploration process
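To give a feel for why these baselines differ in memory usage, the following sketch compares the block size that a power-of-two policy (as used by Kingsley-style allocators) and a Fibonacci-sized buddy policy would assign to a request. It is a simplified illustration under stated assumptions: headers, minimum block sizes, splitting, and coalescing are ignored, and the function names are hypothetical rather than taken from any of the allocators above.

```python
def pow2_block(size):
    """Smallest power of two >= size (size >= 1)."""
    return 1 << (size - 1).bit_length()


def fib_block(size):
    """Smallest block in the sequence 1, 2, 3, 5, 8, ... that is >= size."""
    a, b = 1, 2
    while a < size:
        a, b = b, a + b
    return a


def internal_fragmentation(size, block):
    """Fraction of the block wasted by a request of the given size."""
    return (block - size) / block


if __name__ == "__main__":
    for request in (40, 70, 100):
        p, f = pow2_block(request), fib_block(request)
        print(request, p, internal_fragmentation(request, p),
              f, internal_fragmentation(request, f))
```

Neither policy wins for every request size, which is consistent with the paper's premise that the best allocator depends on the allocation pattern of the target application.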

Conclusions and future work

New multimedia devices have increased their capabilities and now complex applications can be ported to them. Such applications include intensive dynamic memory requirements that must be heavily optimized for an efficient mapping on these devices. To efficiently use dynamic memory in these applications, software engineers often write custom allocators from scratch, which is a difficult and error-prone process.

In this paper, we have presented a new multi-objective optimization method based on

José L. Risco-Martín is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM), Spain. His research interests focus on computational theory of modeling and simulation, with emphasis on Discrete Event Systems Specification (DEVS), dynamic memory management for embedded systems and evolutionary computation.

References (31)

  • B. Calder et al. Quantifying behavioral differences between C and C++ programs. J. Program. Lang. (1995)
  • J. Díaz et al. Quantifying the impact of dynamic memory managers into memory-intensive applications
  • I. Dempsey et al. Constant creation in grammatical evolution. Int. J. Innov. Comput. Appl. (2007)
  • D. Grunwald et al. CustoMalloc: efficient synthesized memory allocators. Softw. Pract. Exp. (1993)
  • M.S. Johnstone et al. The memory fragmentation problem: solved? SIGPLAN Not. (1999)
    J. Manuel Colmenar is assistant professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM), Spain. His research interests focus on design methodologies for integrated systems and high-performance embedded systems, novel architectures for logic and memories in forthcoming nano-scale electronics, dynamic memory management and memory hierarchy optimizations for embedded systems, and low-power design of embedded systems.

    J. Ignacio Hidalgo is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM). His research interests include processor design with special emphasis on the application of novel timing methodologies, memory hierarchy optimization and management, thermal-aware designs, and the application of bio-inspired optimization techniques in CAD problems.

    Juan Lanchares is associate professor at the Computer Architecture and Automation Department of Complutense University of Madrid (UCM). His research interests focus on design methodologies for integrated systems and high-performance embedded systems, processor design with special emphasis on the application of novel timing methodologies, dynamic memory management and memory hierarchy optimizations for embedded systems, and the application of bio-inspired optimization techniques in CAD problems.

    Josefa Díaz Álvarez obtained an M.S. degree in Computer Engineering in 2007 from the University of Extremadura (UNEX). She is currently doing her PhD at the Complutense University of Madrid (UCM) and is assistant professor of Computer Science at the Computer and Technology Department at the Merida campus of the UNEX. Her current research interests include evolutionary algorithms, optimization of hardware resources and microprocessors.
