Evolutionary design of hash function pairs for network filters
Introduction
Computer networks are a potentially dangerous environment where the integrity and the security of shared data can be violated by an attacker. Monitoring and filtering the communication can serve as a countermeasure against attacks and other unlawful activities such as the illegal sharing of copyrighted data. After a monitoring network node receives a data packet, the processing must be completed within the time dictated by the speed of the network. For example, in future 400 Gbps networks there are only a couple of nanoseconds available for performing all of the required operations on the packet. On the other hand, packet processing requires time-consuming operations such as finding records in tables, updating data in other tables and accessing external memory. The performance of general-purpose processors is insufficient and, therefore, network monitors and filters are implemented and accelerated, for example, in field-programmable gate arrays (FPGAs) [1], [2].
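To make the time budget concrete, the following back-of-the-envelope sketch computes the arrival interval of back-to-back minimum-size Ethernet frames. It assumes the standard 64-byte minimum frame plus 8 bytes of preamble and a 12-byte inter-frame gap; the function name is illustrative and does not come from the paper.

```python
def packet_interval_ns(link_gbps: float, frame_bytes: int = 64) -> float:
    """Time between back-to-back frames on the wire, in nanoseconds."""
    preamble_bytes = 8   # preamble plus start-of-frame delimiter
    ifg_bytes = 12       # minimum inter-frame gap
    bits_on_wire = (frame_bytes + preamble_bytes + ifg_bytes) * 8
    # 1 Gbps corresponds to 1 bit per nanosecond.
    return bits_on_wire / link_gbps


for rate in (10, 100, 400):
    print(f"{rate:>3} Gbps: {packet_interval_ns(rate):.2f} ns per minimum-size frame")
```

Under these assumptions a 400 Gbps link delivers a minimum-size frame roughly every 1.7 ns, which matches the "couple of nanoseconds" budget mentioned above.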
Attackers are commonly identified by their Internet Protocol (IP) address. The network monitor needs to look up the source IP address of the packet in various tables, e.g. in a table of nodes for monitoring and blacklisting. These tables are usually implemented as hash tables with constant worst-case lookup time [3], [4]. Hash tables with linear worst-case lookup cannot be used because it is impossible to guarantee that the packet will be processed in time. Linear lookup results from mapping more than one IP address to the same table position, which means that all table records in the given position need to be compared during the lookup.
Constant worst-case lookup can be achieved by perfect hash functions [5], which map each IP address to a unique position in the table so that no additional search is required after identifying the table position. However, a perfect hash function has a relatively large memory overhead, requiring at least 2 bits per IP address [6]. Inserting additional IP addresses requires rebuilding (rehashing) the table, which can take considerably longer than the time available for filtering; the filter must therefore be taken offline during rehashing, or a switch to an alternative filter is required. The disadvantages of perfect hashing motivated researchers to consider cuckoo hashing [7] as an alternative for hashing in FPGAs [8], [9], [10], [11].
Cuckoo hashing uses two or more hash functions, each of which maps items to a different part of the hash table. A new item is inserted as follows. The hash of the item is computed, which determines its position in the table. If the item is mapped to an occupied position, it pushes out the previous occupant of that position, just as the offspring of the namesake brood-parasitic European cuckoo pushes the other eggs out of the nest. The pushed-out item is rehashed by another hash function into a different position of the hash table. Cuckoo hashing with two hash functions h and q is shown in Fig. 1, where item 1 is hashed by h into the table, and item 2 is pushed out and rehashed by q elsewhere into the table. Items are repeatedly pushed out and rehashed using the available hash functions and, as a consequence, cuckoo hashing can rearrange the items in the table. It is possible that the same item is pushed out twice. In this case an unresolvable collision exists and the table must be rehashed with new hash functions, just as in the case of perfect hashing. However, the iterative rearrangement gives insertion a higher probability of success, whereas perfect hashing requires time-consuming rehashing more often [5], [7].
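The insertion procedure described above can be sketched as follows. This is a minimal illustration, not the paper's hardware implementation: the class name, the toy hash functions h and q, and the `max_kicks` bound (used here in place of explicit detection of an item being pushed out twice) are all assumptions made for the example.

```python
class CuckooTable:
    """Two-table cuckoo hash; each hash function addresses its own table."""

    def __init__(self, size, h, q, max_kicks=32):
        self.size = size
        self.funcs = (h, q)
        self.tables = ([None] * size, [None] * size)
        self.max_kicks = max_kicks  # assumed bound standing in for cycle detection

    def lookup(self, key):
        # Constant worst case: at most one probe per table.
        return any(self.tables[i][self.funcs[i](key) % self.size] == key
                   for i in (0, 1))

    def insert(self, key):
        """Return True on success, False if a rehash with new functions is needed.

        On failure the item left without a slot is simply dropped here; a real
        implementation would rebuild the whole table with new hash functions.
        """
        if self.lookup(key):
            return True
        side = 0
        for _ in range(self.max_kicks):
            pos = self.funcs[side](key) % self.size
            # Place the key and pick up whatever occupied the slot before.
            key, self.tables[side][pos] = self.tables[side][pos], key
            if key is None:
                return True
            side = 1 - side  # the pushed-out item goes to the other table
        return False


h = lambda key: key        # toy hash for the first table (illustrative only)
q = lambda key: key // 8   # toy hash for the second table (illustrative only)
table = CuckooTable(8, h, q)
for item in (1, 9, 17):    # all three items collide under h
    print(item, table.insert(item))
print(table.lookup(9), table.lookup(25))
```

All three items map to the same slot under h, yet each insertion succeeds because the previous occupant is moved to the second table by q. A lookup never probes more than two positions, which is what gives the constant worst-case lookup time.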
The work presented in this paper uses cuckoo hashing for FPGA-based IP address filtering. A pipelined reconfigurable hash function with parallel computation is proposed. The proposed evolutionary algorithm (EA) fine-tunes the reconfigurable hash function for the given set of IP addresses selected for filtering/monitoring. The proposed hash function provides the lookup of IP addresses at a speed suitable for high-speed computer networks and achieves a higher table-load factor in comparison with conventional hash functions. As a consequence, the tables are filled up with more IP addresses and the filter will be offline less frequently due to rehashing or table replacement.
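The idea of tuning a reconfigurable hash function pair to a given key set can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the multiply-and-shift hash parametrization, the (1+λ)-style search loop, the mutation operator and the fitness (number of keys inserted before the first failure) are stand-ins, not the component or the EA proposed in the paper.

```python
import random


def make_hash(params, bits):
    """Toy reconfigurable hash: 32-bit multiply, then shift and mask."""
    mul, shift = params
    mask = (1 << bits) - 1
    return lambda key: (((key * mul) & 0xFFFFFFFF) >> shift) & mask


def inserted_count(keys, config, bits, max_kicks=32):
    """Fitness: distinct keys placed in a two-table cuckoo hash before the first failure."""
    funcs = (make_hash(config[0], bits), make_hash(config[1], bits))
    tables = ([None] * (1 << bits), [None] * (1 << bits))
    for count, key in enumerate(keys):
        side = 0
        for _ in range(max_kicks):
            pos = funcs[side](key)
            key, tables[side][pos] = tables[side][pos], key
            if key is None:
                break
            side = 1 - side
        else:
            return count  # insertion failed: a rehash would be needed here
    return len(keys)


def mutate(config, bits, rng):
    """Change one parameter of one of the two hash functions."""
    new = [list(p) for p in config]
    which = rng.randrange(2)
    if rng.random() < 0.5:
        new[which][0] = rng.getrandbits(32) | 1        # new odd multiplier
    else:
        new[which][1] = rng.randrange(0, 33 - bits)    # new shift amount
    return tuple(tuple(p) for p in new)


def evolve(keys, bits, generations=200, offspring=4, seed=0):
    """Keep the best configuration found; offspring of equal fitness replace the parent."""
    rng = random.Random(seed)
    parent = tuple((rng.getrandbits(32) | 1, rng.randrange(0, 33 - bits))
                   for _ in range(2))
    best = inserted_count(keys, parent, bits)
    for _ in range(generations):
        for _ in range(offspring):
            child = mutate(parent, bits, rng)
            score = inserted_count(keys, child, bits)
            if score >= best:
                parent, best = child, score
    return parent, best


keys = random.Random(1).sample(range(2**32), 100)  # stand-in for a set of IPv4 addresses
config, fitness = evolve(keys, bits=6)
print(f"inserted {fitness} of {len(keys)} keys into 2 x 64 slots")
```

The fitness rewards configurations that postpone the first unresolvable collision, which corresponds to the higher table-load factor mentioned above.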
The rest of the paper is organized as follows. Sections 2 and 3 introduce the state-of-the-art of hashing in FPGAs and the evolutionary design of hash functions, respectively. Section 4 deals with the proposed FPGA-based system for IP address filtering. The reconfigurable hash function is proposed in Section 5. The experimental results are discussed in Section 6 and the paper is concluded in Section 7.
Hashing in FPGA-based network filters
An FPGA is a device consisting of universal, reconfigurable elements arranged into a two-dimensional array and reconfigurable interconnections between them. The desired functionality is mapped into various elements: Boolean functions usually into several lookup tables (LUTs), larger memory blocks into collections of block random-access memories (BRAMs) and complex arithmetic operations into digital signal processing (DSP) slices. These reconfigurable elements are interconnected in order to achieve
Evolutionary design of hash functions
Since there is no definition for the exact behavior of the hash function one cannot use a deterministic algorithm for development [12]. Characteristics such as output uniform distribution, table-load factor, collision rate and avalanche effect can be used to evaluate hash functions, but it is not known how to reverse this process and define the function which will have good characteristics for various inputs. General-purpose hash functions used today were developed by experts with years of
IP address filtering
The system for IP address filtering is shown in Fig. 2. It consists of three main parts: the software (SW), the FPGA and the external memory (ext). The evolutionary design of hash functions is implemented in SW. The hash function configurations are uploaded into the FPGA, where the IP address filtering is performed at high speed. The hash table containing the desired set of IP addresses is stored in the external memory.
The system is designed in such a way that it can be integrated to commercially
Reconfigurable hash functions for FPGA-based IP filters
In general, a hash function is a Boolean function $h : B^x \rightarrow B^y$ where x is the number of input bits and y the number of bits in the output hash.
A hash function for IPv4 addresses must process x = 32 input bits. The size of the output depends on the required capacity of the hash table, but it can be assumed to be at least y = 10 bits. Therefore, the problem of the hash function pair design can be classified as hard.
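One way to see the scale of the problem is to count the candidate functions. The following counting argument is an illustration added here, not taken from the paper; it assumes every mapping from x-bit inputs to y-bit outputs is a candidate.

```latex
% Each of the 2^x inputs can be assigned any of the 2^y outputs:
\[
  \left|\{\, h : B^x \rightarrow B^y \,\}\right| = \left(2^{y}\right)^{2^{x}} = 2^{\,y \cdot 2^{x}}
\]
% For x = 32 and y = 10:
\[
  2^{\,10 \cdot 2^{32}} = 2^{\,42\,949\,672\,960} \approx 2^{\,4.3 \times 10^{10}}
\]
% A pair (h, q) squares the count:
\[
  \left(2^{\,10 \cdot 2^{32}}\right)^{2} = 2^{\,20 \cdot 2^{32}} \approx 2^{\,8.6 \times 10^{10}}
\]
```

Exhaustive enumeration of such a space is clearly infeasible, which is consistent with using a heuristic search such as an EA.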
A pipelined reconfigurable hash function component with parallel
Experimental results
The proposed reconfigurable hash function component, together with the EA-based automated tuning for selected IP addresses, was implemented and evaluated. The FPGA-based fast lookup of IP addresses was investigated in an XC7Z020 Zynq all programmable (AP) system-on-chip (SoC) device and the SW-based evolution on an Intel Xeon E5-2630 processor. The IP addresses used in the experiments were extracted from a firewall in the Czech national research and education network (CESNET). Hash tables with
Conclusions
Optimization of cuckoo hashing for FPGA-based IP address filtering was presented in this paper. The proposed pipelined reconfigurable hash function component with parallel computation is fine-tuned by the EA for the given set of IP addresses selected for filtering/monitoring. The proposed hash function component provides the lookup of IP addresses at a speed suitable for high-speed computer networks because, after an initial latency, it produces a hash in every clock cycle.
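The effect of pipelining on throughput can be sketched with a simple cycle count. The pipeline depth used below is a hypothetical value chosen for illustration; the paper's actual depth and clock frequency are not stated in this excerpt.

```python
def pipelined_cycles(n_items: int, depth: int) -> int:
    """Cycles until the last result leaves a pipeline with `depth` stages."""
    if n_items == 0:
        return 0
    # The first result appears after `depth` cycles, then one result per cycle.
    return depth + n_items - 1


def unpipelined_cycles(n_items: int, depth: int) -> int:
    """Cycles if each item had to finish before the next one could start."""
    return depth * n_items


depth = 4  # hypothetical number of pipeline stages
for n in (1, 10, 1000):
    print(n, pipelined_cycles(n, depth), unpipelined_cycles(n, depth))
```

For long packet streams the initial latency becomes negligible and the lookup rate approaches one hash per clock cycle.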
It is not
Acknowledgment
This work was supported by The Ministry of Education, Youth and Sports of the Czech Republic from the National Programme of Sustainability (NPU II); project IT4Innovations excellence in science – LQ1602.
References (27)
- et al. 400 Gb/s programmable packet parsing on a single FPGA
- et al. Software defined monitoring of application protocols. IEEE Trans. Comput. (2016)
- et al. Fast lookup for dynamic packet filtering in FPGA
- et al. A memory efficient IPv6 lookup engine on FPGA
- et al. Fast and scalable packet classification using perfect hash
- et al. Hash, displace, and compress
- et al. Cuckoo hashing
- et al. Parallel d-pipeline: a cuckoo hashing implementation for increased throughput. IEEE Trans. Comput. (2016)
- et al. Massively parallel cuckoo pattern matching applied for NIDS/NIPS
- et al. Software defined monitoring of application protocols
- Low latency book handling in FPGA for high frequency trading
- Hashing in Computer Science: Fifty Years of Slicing and Dicing
- Space efficient hash tables with worst case constant access time