Conference object  Open Access

Fast and compact set intersection through recursive universe partitioning

Pibiri G. E.

Compressed Set Intersection  SIMD  Inverted Indexes  Efficiency 

We present a data structure that encodes a sorted integer sequence in small space while still allowing fast intersection operations. The data layout is carefully designed to exploit word-level parallelism and SIMD instructions, hence providing good practical performance. The core algorithmic idea is that of recursively partitioning the universe of representation: a markedly different paradigm from the widespread strategy of partitioning the sequence based on its length. Extensive experimentation and comparison against several competitive techniques show that the proposed solution embodies an improved space/time trade-off for the set intersection problem.
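
To make the idea of partitioning the universe (rather than the sequence) more concrete, the following is a minimal illustrative sketch in Python. It is not the paper's data structure or layout: the names `build`, `intersect`, `decode`, and the constants `LEAF_BITS` and `FANOUT_BITS` are hypothetical, the 8-bit split at every level is an assumption chosen for simplicity, and no compression or SIMD is involved. It only shows how splitting the universe by the high bits of each value lets an intersection skip whole regions that one of the sets does not populate, and how the lowest level can be intersected with a single bitwise AND.

```python
# Illustrative sketch only: a toy recursive universe partitioning.
# All names and the 8-bit split per level are assumptions, not the paper's layout.

LEAF_BITS = 8    # a leaf covers a universe of 2**8 values, stored as a bitmap
FANOUT_BITS = 8  # each internal node splits its universe into 2**8 children


def build(values, bits=32):
    """Build a nested structure for integers drawn from [0, 2**bits)."""
    if bits <= LEAF_BITS:
        # Leaf: one bit per possible value in this small universe.
        bitmap = 0
        for v in values:
            bitmap |= 1 << v
        return bitmap
    child_bits = bits - FANOUT_BITS
    mask = (1 << child_bits) - 1
    groups = {}
    for v in values:
        # The high bits select the child; the low bits are stored inside it.
        groups.setdefault(v >> child_bits, []).append(v & mask)
    return {hi: build(lows, child_bits) for hi, lows in groups.items()}


def intersect(a, b):
    """Intersect two structures built with the same `bits` value."""
    if isinstance(a, int):
        # Leaf level: a single AND intersects up to 2**LEAF_BITS values at once.
        return a & b
    out = {}
    # Only children present in both structures can contribute results.
    for hi in a.keys() & b.keys():
        child = intersect(a[hi], b[hi])
        if child:  # drop children whose intersection is empty
            out[hi] = child
    return out


def decode(node, bits=32, base=0):
    """List the integers stored in a structure, in increasing order."""
    if isinstance(node, int):
        return [base + i for i in range(1 << LEAF_BITS) if (node >> i) & 1]
    child_bits = bits - FANOUT_BITS
    out = []
    for hi in sorted(node):
        out.extend(decode(node[hi], child_bits, base + (hi << child_bits)))
    return out


a = build([1, 5, 300, 70000, 2**31])
b = build([5, 300, 65536, 2**31, 2**31 + 1])
common = decode(intersect(a, b))
```

In this toy version, `common` holds the values present in both inputs. The work done by `intersect` depends on how many universe regions the two sets share, not only on the lengths of the sequences, which is the contrast with length-based partitioning that the abstract draws.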

Source: IEEE Data Compression Conference, Online Conference, 23-26/03/2021

