Dynamic Load Balancing Model: Preliminary Results for Parallel Pseudo-search Engine Indexers/Crawler Mechanisms Using MPI and Genetic Programming

  • Conference paper
Vector and Parallel Processing — VECPAR 2000 (VECPAR 2000)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 1981))

Abstract

Methodologies derived from Genetic Programming (GP) and Knowledge Discovery in Databases (KDD) were used in the parallel implementation of the indexer simulator to emulate the current World Wide Web (WWW) search engine indexers. This indexer followed the indexing strategies employed by AltaVista and Inktomi, which index each word in each Web document. The insights gained from the initial implementation of this simulator have resulted in the initial phase of the adaptation of a biological model. The biological model will offer a basis for future developments associated with an integrated Pseudo-Search Engine. The basic characteristics exhibited by the model will be translated so as to develop a model of an integrated search engine using GP. The evolutionary processes exhibited by this biological model will not only provide mechanisms for the storage, processing, and retrieval of valuable information but also for Web crawlers, as well as for an advanced communication system. The current Pseudo-Search Engine Indexer, capable of organizing limited subsets of Web documents, provides a foundation for the first simulator of this model. Adaptation of the model for the refinement of the Pseudo-Search Engine establishes order in the inherent interactions between the indexer, crawler and browser mechanisms by including the social (hierarchical) structure and simulated behavior of this complex system. The simulation of behavior will engender mechanisms that are controlled and coordinated in their various levels of complexity. This unique model will also provide a foundation for an evolutionary expansion of the search engine as WWW documents continue to grow. The simulator results were generated using Message Passing Interface (MPI) on a network of SUN workstations and an IBM SP2 computer system.
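
The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea of indexing each word in each document and splitting that work across several workers. It is not the paper's simulator: the function names, the sample documents, and the round-robin split are all assumptions made for illustration, the workers are simulated with an ordinary loop rather than MPI processes, and no Genetic Programming is involved.

```python
import re
from collections import defaultdict


def index_documents(docs):
    """Build an inverted index: each word maps to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Index every alphanumeric word, case-insensitively.
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(doc_id)
    return index


def partition(doc_ids, num_workers):
    """Assign document ids to workers round-robin (a static, hypothetical split)."""
    return [doc_ids[rank::num_workers] for rank in range(num_workers)]


def merge(partial_indexes):
    """Combine the per-worker indexes into one global index."""
    merged = defaultdict(set)
    for partial in partial_indexes:
        for word, doc_ids in partial.items():
            merged[word] |= doc_ids
    return merged


def parallel_style_index(docs, num_workers):
    """Index each worker's share of the documents separately, then merge the results."""
    chunks = partition(sorted(docs), num_workers)
    partials = [index_documents({d: docs[d] for d in chunk}) for chunk in chunks]
    return merge(partials)


# Hypothetical sample documents, used only to exercise the sketch.
SAMPLE_DOCS = {
    "doc1": "The Web grows daily",
    "doc2": "Search engines index words",
    "doc3": "Crawlers fetch Web documents",
}

if __name__ == "__main__":
    for word, ids in sorted(parallel_style_index(SAMPLE_DOCS, 2).items()):
        print(word, sorted(ids))
```

In a message-passing setting, each chunk would typically be handled by a separate process and the merge step would correspond to gathering the partial indexes on one process. A dynamic load balancer would replace the fixed round-robin split with assignments that respond to how busy each worker is.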

© 2001 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Walker, R.L. (2001). Dynamic Load Balancing Model: Preliminary Results for Parallel Pseudo-search Engine Indexers/Crawler Mechanisms Using MPI and Genetic Programming. In: Palma, J.M.L.M., Dongarra, J., Hernández, V. (eds) Vector and Parallel Processing — VECPAR 2000. VECPAR 2000. Lecture Notes in Computer Science, vol 1981. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44942-6_5

  • DOI: https://doi.org/10.1007/3-540-44942-6_5

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-41999-0

  • Online ISBN: 978-3-540-44942-3

  • eBook Packages: Springer Book Archive
