1. the complete title of one (or more) paper(s) published in the open literature describing the work that the author claims describes a human-competitive result; Automated Software Transplantation ------------------------------------------------------------------------------- 2. the name, complete physical mailing address, e-mail address, and phone number of EACH author of EACH paper(s); Earl T. Barr University College London Department of Computer Science Gower Street London WC1E 6BT United Kingdom e-mail: e.barr@ucl.ac.uk tel: +44 (0)20 7679 3570 Mark Harman University College London Department of Computer Science Gower Street London WC1E 6BT United Kingdom e-mail: mark.harman@ucl.ac.uk tel: +44 (0)20 7679 1305 Yue Jia University College London Department of Computer Science Gower Street London WC1E 6BT United Kingdom e-mail:yue.jia@ucl.ac.uk tel: +44 (0)20 7679 3673 Alexandru Marginean University College London Department of Computer Science Gower Street London WC1E 6BT United Kingdom e-mail:alexandru.marginean.13@ucl.ac.uk Tel: + 44 7497 648278 Justyna Petke University College London Department of Computer Science Gower Street London WC1E 6BT United Kingdom e-mail: j.petke@ucl.ac.uk tel: +44 (0)20 7679 7190 ------------------------------------------------------------------------------- 3. the name of the corresponding author (i.e., the author to whom notices will be sent concerning the competition); Alexandru Marginean ------------------------------------------------------------------------------- 4. the abstract of the paper(s); Automated transplantation would open many exciting avenues for software development: suppose we could autotransplant code from one system into another, entirely unrelated, system. This paper introduces a theory, an algorithm, and a tool that achieve this. Leveraging lightweight annotation, program analysis identifies an organ (interesting behavior to transplant); testing validates that the organ exhibits the desired behavior during its extraction and after its implantation into a host. While we do not claim automated transplantation is now a solved problem, our results are encouraging: we report that in 12 of 15 experiments, involving 5 donors and 3 hosts (all popular real-world systems), we successfully autotransplanted new functionality and passed all regression tests. Autotransplantation is also already useful: in 26 hours computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the VLC media player; compare this to upgrading x264 within VLC, a task that we estimate, from VLC's version history, took human programmers an average of 20 days of elapsed, as opposed to dedicated, time. ------------------------------------------------------------------------------- 5. a list containing one or more of the eight letters (A, B, C, D, E, F, G, or H) that correspond to the criteria (see above) that the author claims that the work satisfies; (C) The result is equal to or better than a result that was placed into a database or archive of results maintained by an internationally recognized panel of scientific experts. (E) The result is equal to or better than the most recent human-created solution to a long-standing problem for which there has been a succession of increasingly better human-created solutions. (F) The result is equal to or better than a result that was considered an achievement in its field at the time it was first discovered. (G) The result solves a problem of indisputable difficulty in its field. (H) The result holds its own or wins a regulated competition involving human contestants (in the form of either live human players or human-written computer programs). ------------------------------------------------------------------------------- 6. a statement stating why the result satisfies the criteria that the contestant claims (see examples of statements of human-competitiveness as a guide to aid in constructing this part of the submission); (C) Code reuse is a seminal problem in computer science. A perennial development practice to reuse code is to copy and paste source code from a donor program to a host program, then update it until it compiles, runs, and works in the host. We are the first to automate code transplantation to move nontrivial, useful functionality between two systems [12,13]. We used MuScalpel, our code transplantation tool, to transplant H264 [18], the most used media encoder, from x264 [19], an award winning donor program, into VLC [20], a very popular open source video player. As a side effect of transplantation, MuScalpel improved the encoding time of the H264 “organ”. H.264 is the most widely used media encoding format, because of its increased compression rate, speed, and quality. There are a lot of H.264 media encoders: Wikipedia reports about 11 H.264 encoders [17]. These encoders are regularly compared against each other, as numerous online encoder databases and archives show [1,2,15] . We transplanted the H.264 encoding functionality from x264, the tool recognised as the best H.264 encoding by these databases and archives. We evaluated our result on video clips with the same settings as the benchmarks used in the MSU Sixth MPEG-4 AVC/H.264 Video Codecs Comparison (since we were not able to access exactly the same video sequences as used in the competition) [3]. Our transplanted H.264 organ performs slightly better than the original implementation of H.264 encoder in the x264 donor program by reducing the execution time for encrypting each video by 2.4% on average. (E) Media encoding is a long-standing problem in computer science. Video quality is continually increasing and requires ever more space, straining disk capacity even in an era when disk space is cheap. Nowadays 1920x1080 quality is standard, although 4k videos are starting to be used more frequently. Unencoded, the raw stream of a 1 hour movie at 1920x1080 resolution and 60 frames per second (FPS), as standard for NTSC television, or high-end HDTV systems, consumes 1.53 TB hard disk space [4]. Few users have anywhere close to the needed HDD capacity. To solve this problem, many encoding standards have been created. For instance, an H.264 encoded version of the same video, requires only 86.9 GB [5], just 0.05% of the initial space requirement. Video media players are thus under continual development. Human software developers need to make sure that their video media players are able to handle new encoding optimisations and new media formats. We were able to automate this task. In 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the popular open source VLC media player. We estimate that upgrade of x264 within VLC took human programmers an average of 20 days of elapsed, as opposed to dedicated, time (based on VLC’s git version history). (F) H.264 encoding vastly improved on previous media encoders. x264 is an award winning implementation of H.264. It took part in three regulated competitions [6] evaluating various human-written H.264 encodings, and won: - first place at MSU Sixth MPEG-4 AVC/H.264 Video Codecs Comparison, with ~24% better encoding than second place; - first place at Doom9's 2005 codec shoot-out, passing Ateme by a hair; - tied for 1st place (with Ateme) in the second annual MSU MPEG-4 AVC/ H.264 codecs comparison; Our experiments show that the transplanted H.264 encoder, from x264, encrypts 2.4% faster on average than the x264 donor program, on our benchmark (more details about this in (H) ). (G) With the abundance of code available online software developers need not write their programs from scratch. If they require a particular feature, for example a video encoding functionality for their media player, they can choose to use an open source implementation and transplant it into their program. However, this is a non-trivial task that has thus far been performed manually. We present an automated approach and show several successful autotransplantation results. In particular, in 26 hours of computation time we successfully autotransplanted the H.264 video encoding functionality from the x264 system to the popular open source VLC media player. We estimate that upgrade of x264 within VLC took human programmers an average of 20 days of elapsed, as opposed to dedicated, time (based on VLC’s git version history). (H) For showing the human competitiveness of MuScalpel, we aimed to evaluate our result on the benchmark of MSU Sixth MPEG-4 AVC/H.264 Video Codecs Comparison [15]. Though the “x264 organ” has not officially entered into the MSU Sixth MPEG-4 AVC/H.264 Video Codecs Comparison competition, we faithfully replicated the criteria and settings used in that competition. We were not able to access exactly the same videos as used in the competition, but we generated 8 video clips with the same settings as each video that was used in that competition [3]. Our transplanted H.264 organ performs slightly better than the original implementation of H.264 encoder in the x264 donor program, by reducing the execution time for encrypting each video by 2.4% on average, while preserving the output for our benchmark. Our main purpose is transplanting useful functionality: this improvement is a side-effect of using genetic programming in the transplantation process. ------------------------------------------------------------------------------- 7. a full citation of the paper (that is, author names; publication date; name of journal, conference, technical report, thesis, book, or book chapter; name of editors, if applicable, of the journal or edited book; publisher name; publisher city; page numbers, if applicable); Barr, Earl T., Mark Harman, Yue Jia, Alexandru Marginean, and Justyna Petke. "Automated software transplantation." In Proceedings of the 2015 International Symposium on Software Testing and Analysis, pp. 257-269. ACM, 2015. ------------------------------------------------------------------------------- 8. a statement either that "any prize money, if any, is to be divided equally among the co-authors" OR a specific percentage breakdown as to how the prize money, if any, is to be divided among the co-authors; Any prize money, if any, is to be awarded to Alexandru Marginean. ------------------------------------------------------------------------------- 9. a statement stating why the authors expect that their entry would be the "best," and Genetic programming (GP) and genetic improvement (GI) have been successfully applied to fixing real-world problems: improving execution time, bug fixing, improving energy consumption, etc. We tackle the problem of improving the functional properties of a program: we are the first one to use code transplants for moving useful functionality between two unrelated programs. We implemented our approach in a practical tool (MuScalpel), and used it to automatically transplant the best implementation of H.264 encoder, into VLC, the most popular open source media player. Both systems are popular, substantial real-world systems. Our work received a best paper award and attracted media attention: - ACM SIGSOFT Distinguished Paper Award at ISSTA 2015; - Wired article [7]; - BBC Click [8] , between minutes 12.50–20.30; - Motherboard article [9]; Our work also attracted social media reactions. On the same day when MuScalpel was featured in Wired, Jeremy Clarkson left Top Gear (“the BBC’s biggest global brand with sales of the TV show, DVDs, books, live shows and other merchandise worth more than £50m a year” [14]). Surprisingly, the article on MuScalpel has more shares on twitter than the one about Jeremy Clarkson leaving Top Gear. Programmers and real-world developers consider autotransplantation as human competitive, and indicate real fear that they would be put out of business by our work. For example, a thread on an UK contractors forum [10] discusses this problem, and a twitter post [11] raises this issue. H.264 is the most widely used media encoding standard [16]. Implementing it from scratch is tedious, error prone, and difficult; and requires specialised multimedia and mathematical knowledge. We took the best implementation of H.264, from an award winning tool (x264), and automatically transplanted it into VLC. We are the first to use genetic improvement to automatically move useful functionality between two potentially unrelated systems. We know the functionality is useful, since the developers of VLC have used and maintained H.264 encoders for the last 14 years. We know this is not a trivial task, from the GIT History of VLC. We identified a total of 438 commits concerning the H.264 standard. The total size of the H264 “organ” that MuScalpel transplanted is 23000 source lines of code. ------------------------------------------------------------------------------- 10. an indication of the general type of genetic or evolutionary computation used, such as GA (genetic algorithms), GP (genetic programming), ES (evolution strategies), EP (evolutionary programming), LCS (learning classifier systems), GE (grammatical evolution), GEP (gene expression programming), DE (differential evolution), etc. Genetic Improvement (GI), Genetic Programming (GP) ------------------------------------------------------------------------------- References: [1] http://www.compression.ru/video/codec_comparison/h264_2012/ [2] http://www.streamingmedia.com/articles/editorial/featured-articles/first-look-h.264-and-vp8-compared-67266.aspx [3] http://www.compression.ru/video/codec_comparison/h264_2010/appendixes.html#Appendix_4 [4] http://studiopost.com/contact/tech-specs/calculating-disk-space-requirements?format=uncompressed_10_1080&frame_rate=f60&length=1&length_type=hours [5] http://studiopost.com/contact/tech-specs/calculating-disk-space-requirements?format=h264_1080&frame_rate=f60&length=1&length_type=hours [6] http://www.videolan.org/developers/x264.html [7] http://www.wired.co.uk/news/archive/2015-07/30/code-organ-transplant-software-myscalpel [8] http://www.bbc.co.uk/programmes/p02y78pp [9] http://motherboard.vice.com/read/muscalpel-is-an-algorithmic-code-transplantation [10] http://forums.contractoruk.com/general/108112-i-am-obsolete.html [11] https://twitter.com/anirbanbasu/status/627972057154912256 [12] Barr, E.T., Harman, M., Jia, Y., Marginean, A. and Petke, J., 2015, July. Automated software transplantation. In Proceedings of the 2015 International Symposium on Software Testing and Analysis (pp. 257-269). ACM. [13] Marginean, A., Barr, E.T., Harman, M. and Jia, Y., 2015. Automated Transplantation of Call Graph and Layout Features into Kate. In Search-Based Software Engineering (pp. 262-268). Springer International Publishing. [14] http://www.theguardian.com/media/2015/mar/11/top-gear-bbc-jeremy-clarkson [15] http://www.compression.ru/video/codec_comparison/h264_2010/ [16] http://www.streamingmedia.com/Articles/Editorial/What-Is-.../What-is-H.264-74735.aspx [17] https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC#Software_encoders [18] https://en.wikipedia.org/wiki/H.264/MPEG-4_AVC [19] http://www.videolan.org/developers/x264.html [20] http://www.videolan.org/