Belazzougui D., Boldi P., Ottaviano G., Venturini R., Vigna S.
Perfect hash functions FOS: Computer and information sciences Cache-oblivious algorithms Computer Science - Data Structures and Algorithms Hypergraph algorithms Data Structures and Algorithms (cs.DS)
The computation of a peeling order in a randomly generated hypergraph is the most time- consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(n log n) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real- world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.
Source: DCC 2014 - Data Compression Conference, pp. 352–361, Snowbird, Utah, USA, 26-28 March 2014
@inproceedings{oai:it.cnr:prodotti:282742, title = {Cache-oblivious peeling of random hypergraphs}, author = {Belazzougui D. and Boldi P. and Ottaviano G. and Venturini R. and Vigna S.}, doi = {10.1109/dcc.2014.48 and 10.48550/arxiv.1312.0526}, booktitle = {DCC 2014 - Data Compression Conference, pp. 352–361, Snowbird, Utah, USA, 26-28 March 2014}, year = {2014} }