skip to main content
10.1145/1389095.1389333acmconferencesArticle/Chapter ViewAbstractPublication PagesgeccoConference Proceedingsconference-collections
research-article

Evolutionary lossless compression with GP-ZIP*

Published:12 July 2008Publication History

ABSTRACT

In recent research we proposed GP-zip, a system which uses evolution to find optimal ways to combine standard compression algorithms for the purpose of maximally losslessly compressing files and archives. The system divides files into blocks of predefined length. It then uses a linear, fixed-length representation where each primitive indicates what compression algorithm to use for a specific data block. GP-zip worked well with heterogonous data sets, providing significant improvements in compression ratio compared to some of the best standard compression algorithms. In this paper we propose a substantial improvement, called GP-zip*, which uses a new representation and intelligent crossover and mutation operators such that blocks of different sizes can be evolved. Like GP-zip, GP-zip* finds what the best compression technique to use for each block is. The compression algorithms available in the primitive set of GP-zip* are: Arithmetic coding (AC), Lempel-Ziv-Welch (LZW), Unbounded Prediction by Partial Matching (PPMD), Run Length Encoding (RLE), and Boolean Minimization. In addition, two transformation techniques are available: the Burrows-Wheeler Transformation (BWT) and Move to Front (MTF). Results show that GP-zip* provides improvements in compression ratio ranging from a fraction to several tens of percent over its predecessor.

References

  1. I. M. Pu, Fundamental Data Compression, HB, ISBN-13: 978-0-7506-6310-62006. Chapter 1.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  2. Ahmad Kattan and Riccardo Poli, Evolutionary Lossless Compression with GP-ZIP, Proceedings of the IEEE World Congress on Computational Intelligence, IEEE 2008.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  3. William H. Hsu and Emy E. Zwarico, Automatic Synthesis of Compression Techniques for Heterogeneous Files SOFTPREX: Software-Practice and Experience, Vol. 25, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  4. John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, Cambridge, MA, USA, 1992.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  5. Peter Nordin and Wolfgang Banzhaf. Programmatic compression of images and sound. In John R. Koza, David E. Goldberg, David B. Fogel, and Rick L. Riolo, editors, Genetic Programming 1996: Proceedings of the First Annual Conference, pages 345--350, Stanford University, CA, USA, 28-31 July 1996. MIT Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  6. Evelyne Lutton, Jacques Levy-Vehel, Guillaume Cretin, Philippe Glevarec, and Cidric Roll. Mixed IFS: Resolution of the inverse problem using genetic programming. Complex Systems, 9:375--398, 1995.]]Google ScholarGoogle Scholar
  7. Evelyne Lutton, Jacques Levy-Vehel, Guillaume Cretin, Philippe Glevarec, and Cidric Roll. Mixed IFS: Resolution of the inverse problem using genetic programming. Research Report No 2631, Inria, 1995.]]Google ScholarGoogle Scholar
  8. Anargyros Sarafopoulos. Automatic generation of affine IFS and strongly typed genetic programming. In Riccardo Poli, Peter Nordin, William B. Langdon, and Terence C. Fogarty, editors, Genetic Programming, Proceedings of EuroGP'99, volume 1598 of LNCS, pages 149--160, Goteborg, Sweden, 26-27 May 1999. Springer-Verlag.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  9. Andreas Klappenecker and Frank U. May. Evolving better wavelet compression schemes. In Andrew F. Laine, Michael A. Unser, and Mladen V. Wickerhauser, editors, Wavelet Applications in Signal and Image Processing III, volume 2569, San Diego, CA, USA, 9-14 July 1995. SPIE.]]Google ScholarGoogle ScholarCross RefCross Ref
  10. Alex Fukunaga and Andre Stechert. Evolving nonlinear predictive models for lossless image compression with genetic programming. In John R. Koza, Wolfgang Banzhaf, Kumar Chellapilla, Kalyanmoy Deb, Marco Dorigo, David B. Fogel, Max H. Garzon, David E. Goldberg, Hitoshi Iba, and Rick Riolo, editors, Genetic Programming 1998: Proceedings of the Third Annual Conference, pages 95--102, University of Wisconsin, Madison, Wisconsin, USA, 22-25 July 1998. Morgan Kaufmann.]]Google ScholarGoogle Scholar
  11. Johan Parent and Ann Nowe. Evolving compression preprocessors with genetic programming. In W. B. Langdon, E. Cantú-Paz, K. Mathias, R. Roy, D. Davis, R. Poli, K. Balakrishnan, V. Honavar, G. Rudolph, J. Wegener, L. Bull, M. A. Potter, A. C. Schultz, J. F. Miller, E. Burke, and N. Jonoska, editors, GECCO 2002: Proceedings of the Genetic and Evolutionary Computation Conference, pages 861--867, New York, 9-13 July 2002. Morgan Kaufmann Publishers.]]Google ScholarGoogle Scholar
  12. Bell, R.A.a.T.C. A corpus for the evaluation of lossless compression algorithms. in IEEE Data Compression Conference (DCC'97). March 25 1997. Los Alamitos, California.: IEEE Computer Society.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  13. Thomas Krantz, Oscar Lindberg, Gunnar Thorburn, and Peter Nordin. Programmatic compression of natural video. In Erick Cantú-Paz, editor, Late Breaking Papers at the Genetic and Evolutionary Computation Conference (GECCO-2002), pages 301--307, New York, NY, July 2002. AAAI.]]Google ScholarGoogle Scholar
  14. Jingsong He, Xufa Wang, Min Zhang, Jiying Wang, and Qiansheng Fang. New research on scalability of lossless image compression by GP engine. In Jason Lohn, David Gwaltney, Gregory Hornby, Ricardo Zebulum, Didier Keymeulen, and Adrian Stoica, editors, Proceedings of the 2005 NASA/DoD Conference on Evolvable Hardware, pages 160--164, Washington, DC, USA, 29 June-1 July 2005. IEEE Press.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  15. I. Witten and R. Neal and J. Cleary, Arithmetic coding for data compression, Communications of the ACM, Vol. 30, pp. 520--541, 1987.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  16. J. Ziv and A. Lempel, Compression of Individual Sequences via Variable--Rate Coding, IEEE Transactions on Information Theory, September 1978.]]Google ScholarGoogle Scholar
  17. J. G. Cleary and W. J. Teahan and Ian H. Witten, Unbounded Length Contexts for PPM, Data Compression Conference, pp. 52--61, 1995.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  18. S. W. Golomb, Run-length encodings, IEEE Trans. Inform. Theory, Vol. IT-12, pp. 399--401, 1966.]]Google ScholarGoogle Scholar
  19. A. Kattan, Universal Lossless Data Compression with built in Encryption. Master Thesis, University of Essex 2006.]]Google ScholarGoogle Scholar
  20. M. Burrows and D. J. Wheeler, A block-sorting lossless data compression algorithm, SRC, Number 124, 1994.]]Google ScholarGoogle Scholar
  21. Z. Arnavut, Move-to-Front and Inversion Coding, DCC: Data Compression Conference, IEEE Computer Society TCC, 2000.]] Google ScholarGoogle ScholarDigital LibraryDigital Library
  22. ACT Archive Compression Test {cited 2 December 2007} Available from: http://compression.ca/act/act-win.html]]Google ScholarGoogle Scholar

Recommendations

Comments

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Sign in
  • Published in

    cover image ACM Conferences
    GECCO '08: Proceedings of the 10th annual conference on Genetic and evolutionary computation
    July 2008
    1814 pages
    ISBN:9781605581309
    DOI:10.1145/1389095
    • Conference Chair:
    • Conor Ryan,
    • Editor:
    • Maarten Keijzer

    Copyright © 2008 ACM

    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    • Published: 12 July 2008

    Permissions

    Request permissions about this article.

    Request Permissions

    Check for updates

    Qualifiers

    • research-article

    Acceptance Rates

    Overall Acceptance Rate1,669of4,410submissions,38%

    Upcoming Conference

    GECCO '24
    Genetic and Evolutionary Computation Conference
    July 14 - 18, 2024
    Melbourne , VIC , Australia

PDF Format

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader