Elsevier

Applied Soft Computing

Volume 56, July 2017, Pages 173-181
Evolutionary design of hash function pairs for network filters

https://doi.org/10.1016/j.asoc.2017.03.009

Highlights

  • Pipelined reconfigurable hash function with parallel computation is proposed for IP address filtering in field-programmable gate arrays.

  • The evolutionary algorithm fine-tunes the reconfigurable hash function for the given set of Internet Protocol addresses.

  • The proposed hash function provides high-speed lookup and achieves a higher table-load factor in comparison with conventional hash functions.

Abstract

Network filtering is a challenging area in high-speed computer networks, mostly because many filtering rules are required and only a limited time is available for matching them. Therefore, network filters accelerated by field-programmable gate arrays (FPGAs) are becoming common, with the fast lookup of filtering rules achieved by the use of hash tables. It is desirable to fill these tables efficiently, i.e. to achieve a high table-load factor, in order to reduce the offline time of the network filter due to rehashing and/or table replacement. A parallel reconfigurable hash function tuned by an evolutionary algorithm (EA) is proposed in this paper for Internet Protocol (IP) address filtering in FPGAs. The EA fine-tunes the reconfigurable hash function for a given set of IP addresses. The experiments demonstrate that the proposed hash function provides high-speed lookup and achieves a higher table-load factor in comparison with conventional solutions.

Introduction

Computer networks are a potentially dangerous environment where the integrity and the security of shared data can be violated by an attacker. The monitoring and filtering of the communication can be a countermeasure against attacks and other unlawful activities such as the illegal sharing of copyrighted data. After a monitoring network node receives a data packet, the processing must be completed within the time dictated by the speed of the network. For example, in future 400 Gbps networks there are only a couple of nanoseconds available for performing all of the required operations on the packet. On the other hand, the processing of the packets requires time-consuming operations such as finding records in tables, updating data in other tables and external memory accesses. The performance of general-purpose processors is insufficient and, therefore, network monitors and filters are implemented and accelerated, for example, in field-programmable gate arrays (FPGAs) [1], [2].
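The "couple of nanoseconds" figure can be checked with a short calculation. The sketch below assumes an Ethernet link carrying back-to-back minimum-size frames (64-byte frame, 8-byte preamble and 12-byte inter-frame gap); the link type and the function name `packet_time_ns` are assumptions made for illustration and are not taken from the paper.

```python
# Worst-case per-packet time budget, assuming Ethernet with minimum-size frames.
MIN_FRAME_BYTES = 64   # minimum Ethernet frame, including the frame check sequence
PREAMBLE_BYTES = 8     # preamble plus start-of-frame delimiter
IFG_BYTES = 12         # minimum inter-frame gap


def packet_time_ns(link_gbps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Time one frame occupies the link, in nanoseconds."""
    bits_on_wire = (frame_bytes + PREAMBLE_BYTES + IFG_BYTES) * 8
    # bits divided by Gbit/s yields nanoseconds
    return bits_on_wire / link_gbps


if __name__ == "__main__":
    for speed in (10, 100, 400):
        print(f"{speed} Gbps: {packet_time_ns(speed):.2f} ns per minimum-size frame")
```

Under these assumptions a 400 Gbps link leaves about 1.68 ns per minimum-size frame, so every table lookup has to fit into a handful of clock cycles or be pipelined.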

A common way of identifying the attacker is by its Internet Protocol (IP) address. The network monitor needs to look up the source IP address of the packet in various tables, e.g. in a table of nodes for monitoring and blacklisting. These tables are usually implemented as hash tables with constant worst-case lookup time [3], [4]. Hash tables with linear worst-case lookup cannot be used because it is impossible to guarantee that the packet will be processed in time. Linear lookup is the result of mapping more than one IP address to the same table position, which means that all table records in the given position need to be compared during the lookup.

Constant worst-case lookup can be achieved by perfect hash functions [5], which map each IP address to a unique position in the table so that no additional search is required after identifying the table position. However, a perfect hash function has a relatively large memory overhead, requiring at least 2 bits per IP address [6]. Insertion of additional IP addresses into the table requires the rebuilding (rehash) of the table, which can take considerably longer than the time available for filtering; the filter must therefore be put offline for rehashing, or a switch to an alternative filter is required. The disadvantages of perfect hashing motivated researchers to consider cuckoo hashing [7] as an alternative for hashing in FPGAs [8], [9], [10], [11].

Cuckoo hashing uses two or more hash functions, each of which maps items to a different part of the hash table. The insertion of a new item into the table is performed as follows. The hash is computed for the item, which determines its position in the table. If the item is mapped to an occupied position, then it pushes out the previous occupant from that position, just as the offspring of the namesake European brood-parasitic cuckoo pushes the other eggs out of the nest. The pushed-out item is rehashed by another hash function into a different position of the hash table. Cuckoo hashing with two hash functions h and q is shown in Fig. 1, where item 1 is hashed by h into the table, and item 2 is pushed out and rehashed by q elsewhere into the table. The items are repeatedly pushed out and rehashed by using the available hash functions and, as a consequence, cuckoo hashing can rearrange the items in the table. It is possible that the same item is pushed out twice. In this case an unresolvable collision exists and the table must be rehashed with new hash functions, just like in the case of perfect hashing. However, the iterative rearrangement gives a higher probability for the insertion to be successful, whereas perfect hashing requires time-consuming rehashing more often [5], [7].
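The insertion procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the paper's FPGA implementation: the class name `CuckooTable`, the toy hash functions and the `max_kicks` bound are all assumptions. In particular, the sketch gives up after a fixed number of evictions, which is a common simplification, instead of detecting that the same item has been pushed out twice.

```python
class CuckooTable:
    """Two-way cuckoo hash table: hash h addresses part 0, hash q addresses part 1."""

    def __init__(self, size, h, q, max_kicks=32):
        self.size = size
        self.hashes = (h, q)
        self.parts = [[None] * size, [None] * size]
        self.max_kicks = max_kicks  # assumed bound on the eviction chain length

    def lookup(self, key):
        # Constant worst case: exactly one probe per hash function.
        return any(
            self.parts[i][self.hashes[i](key) % self.size] == key for i in (0, 1)
        )

    def insert(self, key):
        """Return True on success, False if the table would need a rehash."""
        if self.lookup(key):
            return True
        side = 0
        for _ in range(self.max_kicks):
            pos = self.hashes[side](key) % self.size
            # Place the item and pick up the previous occupant (None if free).
            key, self.parts[side][pos] = self.parts[side][pos], key
            if key is None:
                return True
            side ^= 1  # the evicted item is rehashed by the other function
        # The item currently held in `key` has no home; new hash functions are needed.
        return False

    def load_factor(self):
        used = sum(slot is not None for part in self.parts for slot in part)
        return used / (2 * self.size)


if __name__ == "__main__":
    table = CuckooTable(4, h=lambda k: k, q=lambda k: k // 4)
    for item in (1, 5, 9):          # all three collide under h
        print(item, table.insert(item))
    print("load factor:", table.load_factor())
```

With the toy functions used here, items 1, 5 and 9 all collide under h, yet all three fit because each evicted item finds a free position under q. Items 1, 17 and 33 collide under both functions, so the third insertion fails and signals that a rehash is required.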

The work presented in this paper uses cuckoo hashing for FPGA-based IP address filtering. A pipelined reconfigurable hash function with parallel computation is proposed. The proposed evolutionary algorithm (EA) fine-tunes the reconfigurable hash function for the given set of IP addresses selected for filtering/monitoring. The proposed hash function provides the lookup of IP addresses at a speed suitable for high-speed computer networks and achieves a higher table-load factor in comparison with conventional hash functions. As a consequence, the tables are filled up with more IP addresses and the filter will be offline less frequently due to rehashing or table replacement.
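To make the idea of tuning a hash function pair for a fixed set of addresses concrete, the following sketch runs a simple (1+λ) evolutionary loop. Everything in it is an assumption made for illustration: the multiply-shift hash in `make_hash` stands in for the reconfigurable hash component, whose structure is not given in this excerpt, and the fitness (number of addresses inserted before the first failed cuckoo insertion) is one plausible objective, not necessarily the one used by the authors.

```python
import random


def make_hash(cfg, bits):
    """Stand-in configurable hash: multiply by an odd 32-bit constant, keep the top bits."""
    mult = cfg | 1
    return lambda key: ((key * mult) & 0xFFFFFFFF) >> (32 - bits)


def fitness(cfg_pair, keys, bits, max_kicks=64):
    """Count the (distinct) keys inserted by cuckoo hashing before the first failure."""
    funcs = (make_hash(cfg_pair[0], bits), make_hash(cfg_pair[1], bits))
    parts = ([None] * (1 << bits), [None] * (1 << bits))
    inserted = 0
    for key in keys:
        side = 0
        for _ in range(max_kicks):
            pos = funcs[side](key)
            key, parts[side][pos] = parts[side][pos], key
            if key is None:
                break
            side ^= 1
        else:
            return inserted  # eviction chain too long: a rehash would be needed
        inserted += 1
    return inserted


def mutate(cfg_pair, rng, flips=2):
    """Flip a few random configuration bits in the pair."""
    pair = list(cfg_pair)
    for _ in range(flips):
        pair[rng.randrange(2)] ^= 1 << rng.randrange(32)
    return tuple(pair)


def evolve(keys, bits, generations=50, lam=4, seed=0):
    """(1+lambda) strategy: keep the best child if it is at least as good as the parent."""
    rng = random.Random(seed)
    parent = (rng.getrandbits(32), rng.getrandbits(32))
    best = fitness(parent, keys, bits)
    for _ in range(generations):
        children = [mutate(parent, rng) for _ in range(lam)]
        score, child = max(
            ((fitness(c, keys, bits), c) for c in children), key=lambda t: t[0]
        )
        if score >= best:
            parent, best = child, score
    return parent, best


if __name__ == "__main__":
    addresses = random.Random(1).sample(range(2**32), 100)
    config, stored = evolve(addresses, bits=6)
    print(f"stored {stored} of {len(addresses)} addresses in 2 x 64 slots")
```

Accepting children with equal fitness lets the search drift across configurations that perform equally well, a common choice in evolutionary design when the fitness landscape has large plateaus.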

The rest of the paper is organized as follows. Sections 2 and 3 introduce the state of the art in hashing in FPGAs and in the evolutionary design of hash functions, respectively. Section 4 deals with the proposed FPGA-based system for IP address filtering. The reconfigurable hash function is proposed in Section 5. The experimental results are discussed in Section 6 and the paper is concluded in Section 7.

Section snippets

Hashing in FPGA-based network filters

FPGA is a device consisting of universal, reconfigurable elements arranged into a two-dimensional array and reconfigurable interconnections between them. The desired functionality is mapped into various elements: Boolean functions usually into several lookup tables (LUTs), larger memory blocks into collections of block random-access memories (BRAMs) and complex arithmetic operations into digital signal processing (DSP) slices. These reconfigurable elements are interconnected in order to achieve

Evolutionary design of hash functions

Since there is no definition for the exact behavior of the hash function one cannot use a deterministic algorithm for development [12]. Characteristics such as output uniform distribution, table-load factor, collision rate and avalanche effect can be used to evaluate hash functions, but it is not known how to reverse this process and define the function which will have good characteristics for various inputs. General-purpose hash functions used today were developed by experts with years of

IP address filtering

The system for IP address filtering is shown in Fig. 2. It consists of three main parts: the software (SW), the FPGA and the external memory (ext). The evolutionary design of hash functions is implemented in SW. The hash function configurations are uploaded into the FPGA where the IP address filtering is performed at high-speed. The hash table containing the desired set of IP addresses is in the external memory (ext).

The system is designed in such a way that it can be integrated to commercially

Reconfigurable hash functions for FPGA-based IP filters

In general, a hash function is a Boolean function $h : B^x \rightarrow B^y$, where x is the number of input bits and y is the number of bits in the output hash.

The hash function for an IP address version 4 requires the processing of x = 32 inputs. The size of the output depends on the required capacity of the hash table but it can be assumed that it has at least y = 10 bits. Therefore, the problem of the hash function pair design can be classified as hard.
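One rough way to see the scale of the design problem is to count the candidate functions. This is an illustration only and not necessarily the authors' argument; it simply counts all mappings from x-bit inputs to y-bit outputs.

```latex
% Each of the 2^x possible inputs can be mapped to any of the 2^y outputs:
\left| \{\, h : B^x \rightarrow B^y \,\} \right| = \left(2^{y}\right)^{2^{x}} = 2^{\,y \cdot 2^{x}}
% For IPv4 addresses (x = 32) and a 10-bit hash (y = 10):
2^{\,10 \cdot 2^{32}} = 2^{\,42\,949\,672\,960}
% A pair (h, q) is drawn from the square of this set:
\left( 2^{\,10 \cdot 2^{32}} \right)^{2} = 2^{\,85\,899\,345\,920}
```

Exhaustive search over such a space is out of the question, which is consistent with turning to a heuristic search method.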

A pipelined reconfigurable hash function component with parallel

Experimental results

The proposed reconfigurable hash function component, together with the EA-based automated tuning for selected IP addresses, was implemented and evaluated. The FPGA-based fast lookup of IP addresses was investigated on an XC7Z020 Zynq all programmable (AP) system-on-chip (SoC) device and the SW-based evolution on an Intel Xeon E5-2630 processor. The IP addresses used in the experiments were extracted from a firewall in the Czech national research and education network (CESNET). Hash tables with

Conclusions

Optimization of cuckoo hashing for FPGA-based IP address filtering was presented in this paper. The proposed pipelined reconfigurable hash function component with parallel computation is fine-tuned by the EA for the given set of IP addresses selected for filtering/monitoring. The proposed hash function component provides the lookup of IP addresses at a speed suitable for high-speed computer networks because, after an initial latency, it is able to produce a hash in each clock cycle.
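The behaviour "one hash per clock cycle after an initial latency" can be illustrated with a toy register-level model of a pipeline. The function `run_pipeline` and the three stage functions in the usage example are hypothetical; they do not reproduce the stages of the proposed component, only the timing of a generic pipeline.

```python
def run_pipeline(keys, stage_fns):
    """Toy clocked pipeline: one key enters per cycle, each stage takes one cycle.

    Returns a list of (cycle, result) pairs, with cycles counted from 1.
    """
    depth = len(stage_fns)
    regs = [None] * depth           # regs[i] is the output register of stage i
    inputs = list(keys)
    results, cycle = [], 0
    while len(results) < len(inputs):
        nxt = inputs[cycle] if cycle < len(inputs) else None
        cycle += 1
        # Update back to front so each stage reads its predecessor's old value.
        for i in range(depth - 1, 0, -1):
            regs[i] = None if regs[i - 1] is None else stage_fns[i](regs[i - 1])
        regs[0] = None if nxt is None else stage_fns[0](nxt)
        if regs[-1] is not None:
            results.append((cycle, regs[-1]))
    return results


if __name__ == "__main__":
    stages = [lambda v: v + 1, lambda v: v * 2, lambda v: v ^ 0xFF]
    for cycle, value in run_pipeline([1, 2, 3, 4], stages):
        print(f"cycle {cycle}: hash {value}")
```

In this model n keys pushed through a pipeline of depth d finish after d + n - 1 cycles, so the initial latency is paid once and the sustained rate is one result per cycle.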

It is not

Acknowledgment

This work was supported by The Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science – LQ1602.

References (27)

  • M. Attig et al. 400 Gb/s programmable packet parsing on a single FPGA

  • L. Kekely et al. Software defined monitoring of application protocols. IEEE Trans. Comput. (2016)

  • L. Kekely et al. Fast lookup for dynamic packet filtering in FPGA

  • D. Tong et al. A memory efficient IPv6 lookup engine on FPGA

  • V. Pus et al. Fast and scalable packet classification using perfect hash

  • D. Belazzougui et al. Hash, displace, and compress

  • R. Pagh et al. Cuckoo hashing

  • S. Pontarelli et al. Parallel d-pipeline: a cuckoo hashing implementation for increased throughput. IEEE Trans. Comput. (2016)

  • T.N. Thinh et al. Massively parallel cuckoo pattern matching applied for NIDS/NIPS

  • L. Kekely et al. Software defined monitoring of application protocols

  • M. Dvorak et al. Low latency book handling in FPGA for high frequency trading

  • A.G. Konheim. Hashing in Computer Science: Fifty Years of Slicing and Dicing (2010)

  • D. Fotakis et al. Space efficient hash tables with worst case constant access time
