Designing Similarity Indexes with Parallel Genetic Programming

  • Conference paper
Similarity Search and Applications (SISAP 2013)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8199)

Abstract

The increasing diversity of unstructured databases has driven the development of advanced indexing techniques, because the metric indexing model does not fit general similarity models. When the most critical postulate, the triangle inequality, does not hold, the metric model produces notable errors during query evaluation. To overcome this and obtain higher-quality results, we aim to discover better indexing models for databases that use arbitrary similarity measures. Because each database is unique in some way, we outline an automatic way of exploring for the best indexing method. We introduce an exploration approach that applies parallel genetic programming principles in a multi-threaded environment built upon the recently introduced SIMDEX Framework. Furthermore, we introduce the smart pivot table, an intelligent indexing method capable of incorporating the obtained results. We supplement the theoretical background with experiments showing the improvements achieved in comparison to single-threaded evaluations.
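
The failure mode mentioned in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper and does not use the SIMDEX Framework; the function names, the one-dimensional toy data, and the choice of squared Euclidean distance as the non-metric measure are all assumptions made for illustration. The sketch runs a range query with a single pivot and the standard triangle-inequality lower bound |d(q,p) - d(o,p)| <= d(q,o). Under a metric the filter is safe; under a measure that violates the triangle inequality it can discard a true answer.

```python
from typing import Callable, List, Sequence

Point = Sequence[float]
Distance = Callable[[Point, Point], float]


def euclidean(a: Point, b: Point) -> float:
    """Euclidean distance: a metric, so the triangle inequality holds."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def squared_euclidean(a: Point, b: Point) -> float:
    """Squared Euclidean distance: violates the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sequential_range_query(dist: Distance, database: List[Point],
                           query: Point, radius: float) -> List[Point]:
    """Ground truth: compare the query against every object."""
    return [obj for obj in database if dist(query, obj) <= radius]


def pivot_range_query(dist: Distance, database: List[Point], pivot: Point,
                      query: Point, radius: float) -> List[Point]:
    """Range query with one pivot (hypothetical helper for illustration).

    Objects are discarded without computing d(query, obj) whenever the
    lower bound |d(q,p) - d(o,p)| exceeds the radius. This step is only
    correct if dist satisfies the triangle inequality.
    """
    d_query_pivot = dist(query, pivot)
    result = []
    for obj in database:
        lower_bound = abs(d_query_pivot - dist(obj, pivot))
        if lower_bound > radius:
            continue  # filtered out by the pivot
        if dist(query, obj) <= radius:
            result.append(obj)
    return result


# Toy one-dimensional data (assumed for illustration only).
database = [(1.0,), (3.0,)]
pivot = (2.0,)
query = (0.0,)
radius = 2.0

metric_exact = sequential_range_query(euclidean, database, query, radius)
metric_pivot = pivot_range_query(euclidean, database, pivot, query, radius)

nonmetric_exact = sequential_range_query(squared_euclidean, database, query, radius)
nonmetric_pivot = pivot_range_query(squared_euclidean, database, pivot, query, radius)

print("metric:     exact =", metric_exact, " pivot =", metric_pivot)
print("non-metric: exact =", nonmetric_exact, " pivot =", nonmetric_pivot)
```

With the Euclidean distance both queries return the same object. With the squared distance the pivot filter computes a lower bound of 3 for the object at 1.0, although its true distance to the query is 1, so the object is wrongly dismissed. Errors of this kind are what motivate searching for indexing models other than the metric one.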

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Bartoš, T., Skopal, T. (2013). Designing Similarity Indexes with Parallel Genetic Programming. In: Brisaboa, N., Pedreira, O., Zezula, P. (eds) Similarity Search and Applications. SISAP 2013. Lecture Notes in Computer Science, vol 8199. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41062-8_29

  • DOI: https://doi.org/10.1007/978-3-642-41062-8_29

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41061-1

  • Online ISBN: 978-3-642-41062-8

  • eBook Packages: Computer Science (R0)
