A systematic review on search-based refactoring

https://doi.org/10.1016/j.infsof.2016.11.009

Abstract

Context: Finding the best sequence of refactorings to apply to a software artifact is an optimization problem that can be solved using search techniques, in the field called Search-Based Refactoring (SBR). In recent years, the field has gained importance, and many SBR approaches have appeared, attracting research interest.

Objective: The objective of this paper is to provide an overview of existing SBR approaches, by presenting their common characteristics, and to identify trends and research opportunities.

Method: A systematic review was conducted following a plan that includes the definition of research questions, selection criteria, a search string, and the selection of search engines. We selected 71 primary studies published in the last sixteen years and classified them along dimensions related to the main SBR elements, such as addressed artifacts, encoding, search technique, metrics used, available tools, and the evaluation conducted.

Results: The results show that code is the most frequently addressed artifact and that evolutionary algorithms are the most frequently employed search technique. In most cases, the generated solution is a sequence of refactorings, usually drawn from Fowler’s catalog. Trends and opportunities for future research include the use of models as artifacts, the use of many objectives, the study of the effect of bad smells, and the use of hyper-heuristics.

Conclusions: We found many SBR approaches, most of them published recently. The approaches are presented, analyzed, and grouped following a classification scheme. The paper contributes to the SBR field by identifying a range of possibilities that can motivate future research.

Introduction

Software products evolve frequently to accommodate new functionality. This evolution may make the software design more complex and cause it to drift from the original design, decreasing software quality. As a result, considerable effort is devoted to the software maintenance phase, where software refactoring can be used. Refactoring improves software quality by improving quality attributes such as understandability, maintainability, extensibility and performance [1]. It can also be used in the early software engineering phases, such as software development, design and re-engineering.

Software refactoring is performed by changing the software structure without modifying its external behavior. It applies a set of meaning-preserving restructurings, called refactorings [2], which are simple operations that change a software artifact, such as moving a method, moving a field or extracting a class. Refactorings were originally proposed in the context of object-oriented software [3], for which catalogs of refactorings exist [2], [3]. Since then, software refactoring has been applied in other contexts, such as aspect-oriented software and software product lines, and to distinct artifacts such as code, models, documentation and requirements.

Finding a good sequence of refactorings to apply to a software artifact is considered a hard task [P66], since there is a wide range of refactorings and the ideal sequence depends on the different quality attributes to be improved. In fact, this is an optimization problem that can be solved by search techniques, as studied in the field known as Search-Based Software Engineering (SBSE) [4]. Search algorithms allow several metrics to be combined to compute solution quality, which is one of the factors that makes search techniques attractive for software refactoring. Furthermore, these algorithms are capable of automatically finding, in a huge search space, solutions that a software engineer might not have thought of [4].
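As an illustration only, the sketch below frames refactoring as a search problem in the sense described above. It is not taken from any of the reviewed studies. The toy design, the call graph, the two metrics (cross-class coupling and a class-size penalty), their weights, and all function names are assumptions made for this example. A candidate solution is encoded as a sequence of Move Method refactorings, and a small genetic algorithm searches for a sequence that lowers the combined metric.

```python
import random

# Hypothetical toy design: each method is assigned to a class.
DESIGN = {"parse": "A", "lex": "B", "emit": "A", "fmt": "B", "log": "A"}
# Hypothetical call graph: pairs of methods that depend on each other.
CALLS = [("parse", "lex"), ("emit", "fmt"), ("parse", "log"), ("lex", "fmt")]
CLASSES = ["A", "B"]
MAX_LEN = 4          # maximum number of refactorings in one solution
MAX_CLASS_SIZE = 3   # classes larger than this are penalised


def apply_sequence(design, sequence):
    """Apply a sequence of (method, target_class) moves to a copy of the design."""
    result = dict(design)
    for method, target in sequence:
        result[method] = target
    return result


def fitness(sequence):
    """Lower is better: coupling + oversize penalty + a small cost per refactoring."""
    design = apply_sequence(DESIGN, sequence)
    coupling = sum(1 for a, b in CALLS if design[a] != design[b])
    sizes = [list(design.values()).count(c) for c in CLASSES]
    oversize = sum(max(0, s - MAX_CLASS_SIZE) for s in sizes)
    return coupling + oversize + 0.01 * len(sequence)


def random_gene(rng):
    """One randomly chosen Move Method refactoring."""
    return (rng.choice(sorted(DESIGN)), rng.choice(CLASSES))


def mutate(sequence, rng):
    """Remove, replace, or append one refactoring, respecting MAX_LEN."""
    seq = list(sequence)
    roll = rng.random()
    if seq and roll < 0.3:
        seq.pop(rng.randrange(len(seq)))
    elif seq and roll < 0.6:
        seq[rng.randrange(len(seq))] = random_gene(rng)
    elif len(seq) < MAX_LEN:
        seq.append(random_gene(rng))
    return seq


def crossover(a, b, rng):
    """Join a prefix of one parent with a suffix of the other."""
    child = a[:rng.randint(0, len(a))] + b[rng.randint(0, len(b)):]
    return child[:MAX_LEN]


def tournament(population, rng, k=3):
    """Pick the best of k randomly sampled individuals."""
    return min(rng.sample(population, k), key=fitness)


def search(generations=40, pop_size=20, seed=0):
    """Run the genetic algorithm and return the best sequence found."""
    rng = random.Random(seed)
    # The empty sequence (no refactoring) is included as a baseline.
    population = [[]] + [
        [random_gene(rng) for _ in range(rng.randint(1, MAX_LEN))]
        for _ in range(pop_size - 1)
    ]
    for _ in range(generations):
        children = [min(population, key=fitness)]  # elitism: keep the best
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            if rng.random() < 0.3:
                child = mutate(child, rng)
            children.append(child)
        population = children
    return min(population, key=fitness)


if __name__ == "__main__":
    best = search()
    print("refactorings:", best, "fitness:", fitness(best))
```

Because the baseline sequence is in the initial population and the best individual is always carried over, the returned solution is never worse than leaving the design unchanged. Additional metrics could be added to the fitness function as further weighted terms or, in a multi-objective setting, kept as separate objectives.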

Successful SBSE approaches show the applicability of search techniques to a wide variety of problems from diverse software engineering areas, using different search algorithms, such as Genetic Algorithms and others from the operations research area [5]. The area that applies search-based techniques to software refactoring is called Search-Based Refactoring (SBR), and it has been growing in recent years.

Reviews in the literature report a growing number of SBSE works. There are surveys addressing specific SBSE areas such as software testing [6], [7], [8], software design [9], and requirements [10], but none of them specifically addresses SBR. General SBSE surveys [4], [5], [11], [12] provide an overview of the SBSE field by discussing research directions and presenting the software engineering activities and search-based algorithms involved. Such works include software refactoring but do not explore specific SBR characteristics in depth, such as the addressed artifacts, encoding, available tools, metrics used and evaluations conducted. Existing surveys [1] and systematic reviews [13] on software refactoring do not consider search-based approaches.

To contribute to the SBR area, this paper presents the results of a systematic review aimed at finding specific details about existing SBR approaches. A systematic review is a study that identifies, evaluates and interprets the available research related to a particular research question or topic area [14]. Because it follows a systematic methodology, it is a suitable technique for extracting information from papers and identifying research trends.

To conduct this systematic review we followed Kitchenham’s guidelines [14]. We planned the review by defining the research questions, the search string, the sources to be searched and the selection criteria. We searched for primary sources and, after the final selection, extracted the data needed to answer the research questions. Detailed results about the extracted data are presented, such as the most used artifacts, metrics and refactorings. In this way, this paper adds to the contributions of existing SBSE surveys by focusing specifically on software refactoring and by: (i) offering a more complete and up-to-date list of works, obtained systematically and covering the last sixteen years; (ii) providing a classification scheme and grouping works according to specific SBR characteristics that are not addressed in related work; (iii) identifying, in the works found, the main contributions and best practices that point to trends in the area, as well as gaps and limitations that suggest the need for further study and research opportunities.

This paper is organized as follows. Section 2 reviews background on software refactoring and search-based techniques. Section 3 discusses related work. Section 4 describes how the review was conducted. Section 5 presents and analyses the obtained results. Section 6 describes trends and research opportunities. Section 7 contains the threats to validity of our results. Finally, Section 8 concludes the paper.

Background

This section introduces the fields explored in our research: software refactoring, search algorithms and Search-Based Refactoring (SBR).

Related work

In the literature, we can find works about software refactoring in general. The work of Mens [1] provides a review of software refactoring tasks, the artifacts to be used, the formalisms and techniques to be applied, and essential issues to be considered in the development of software refactoring tools. In a more recent paper, Abebe and Yoo [13] describe results from a systematic review conducted to identify the trends, opportunities and challenges of the field. The works are described and gaps

Research method

We performed this systematic review following the three phases presented in the guidelines of Kitchenham [14]. The first phase, planning the review, creates the research protocol to be followed. In the second one, conducting the review, the search is performed, the papers are selected, and their data are extracted and synthesized. The third phase, reporting the review, specifies the dissemination mechanisms and formats the main report. In this sense, such a phase resulted in the elaboration of

Results

In this section, we first give an overview of the selected studies with respect to some basic information about the authors, years and publication venues. Then, we answer each research question based on the extracted information.

Trends and research opportunities

Analyzing the studies found in this review, we identified some research gaps and limitations. Based on them, we discuss in this section the main research opportunities related to each investigated software refactoring element. We also present some trends in SBR by analyzing the frequency of such elements in the studies over the years (see Fig. 16).

Threats to validity

In this section, we identify possible threats to the validity of our review results, by using the taxonomy of Wohlin et al. [57].

Construct validity refers to the relation between theory and observation [57]. The main threats in this category are related to the research questions, the search engines used, and the search string. The research questions may not address all the aspects of the SBR field. To minimize this threat, we elaborated research questions covering the main ingredients that compose a

Concluding remarks

This paper presented the results of a systematic review on SBR, focusing on studies that propose search-based approaches to suggest or apply a sequence of refactorings in an artifact. As far as we know, this is the first review specifically addressing SBR approaches. In this sense, specific aspects of the approaches were revealed and some trends and opportunities were identified to guide future research in the field. That way, researchers can identify common characteristics of the approaches and

Acknowledgments

This work is supported by Brazilian funding agencies CAPES and CNPq [grant numbers 307762/2015-7, 473899/2013-2].

References (57)

  • P. McMinn. Search-based software test data generation: a survey: research articles. Softw. Test. Verif. Reliab. (2004)
  • M. Harman. The current state and future of search based software engineering. in: Proceedings of the Future of Software Engineering (2007)
  • M. Harman et al. Search Based Software Engineering: A Comprehensive Analysis and Review of Trends Techniques and Applications. Technical Report (2009)
  • M. Abebe et al. Trends, opportunities and challenges of software refactoring: A systematic literature review. Int. J. Softw. Eng. Appl. (2014)
  • B. Kitchenham et al. Guidelines for Performing Systematic Literature Reviews in Software Engineering. Technical Report (2007)
  • W. Griswold et al. The Birth of Refactoring: A Retrospective on the Nature of High-Impact Software Engineering Research. IEEE Softw. (2015)
  • W.F. Opdyke. Refactoring object-oriented frameworks (1992)
  • S. Hanenberg et al. Refactoring of aspect-oriented software. Proceedings of the International Conference on Object-Oriented and Internet-based Technologies, Concepts, and Applications for a Networked World (2003)
  • D. Roberts et al. A refactoring tool for Smalltalk. Theory Prac. Object Syst. (1997)
  • F. Trucchia et al. Pro PHP Refactoring (2010)
  • G. Sunyé et al. Refactoring UML Models. in: Proceedings of the International Conference on The Unified Modeling Language, Modeling Languages, Concepts, and Tools (2001)
  • V. Alves et al. Refactoring product lines. in: Proceedings of the International Conference on Generative Programming and Component Engineering (2006)
  • S.W. Ambler et al. Refactoring Databases: Evolutionary Database Design (2006)
  • E. Harold. Refactoring HTML: Improving the Design of Existing Web Applications (2008)
  • E.-G. Talbi. Metaheuristics: From Design to Implementation (2009)
  • C.A.C. Coello et al. Evolutionary Algorithms for Solving Multi-Objective Problems (2006)
  • K. Deb et al. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evolut. Comput. (2002)
  • E. Zitzler et al. SPEA2: Improving the Strength Pareto Evolutionary Algorithm. Technical Report (2001)