7 result(s)
Page Size: 10, 20, 50
Export: bibtex, xml, json, csv
Order by:

CNR Author operator: and / or
more
Typology operator: and / or
Language operator: and / or
Date operator: and / or
more
Rights operator: and / or
2013 Conference article Restricted
Compressed static functions with applications
Belazzougui D., Venturini R.
Given a set of integer keys from a bounded universe along with associated data, the dictionary problem asks to answer two queries: membership and retrieval. Membership has to tell whether a given element is in the dictionary or not; Retrieval has to return the data associated with the searched key. In this paper we provide time and space optimal solutions for three well-established relaxations of this basic problem: (Compressed) Static functions, Approximate membership and Relative membership.Source: 24th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 229–240, New orleans, Louisiana, USA, 5 gennaio 2013
Project(s): MIDAS via OpenAIRE

See at: knowledgecenter.siam.org Restricted | CNR ExploRA


2013 Conference article Open Access OPEN
Dynamic compressed strings with random access
Grossi R., Raman R., Rao S., Venturini R.
We consider the problem of storing a string $S$ in dynamic compressed form, while permitting operations directly on the compressed representation of $S$: access a substring of $S$; replace, insert or delete a symbol in $S$; count how many occurrences of a given symbol appear in any given prefix of $S$ (called rank operation) and locate the position of the $i$th occurrence of a symbol inside $S$ (called select operation). We discuss the time complexity of several combinations of these operations along with the entropy space bounds of the corresponding compressed indexes. In this way, we extend or improve the bounds of previous work by Ferragina and Venturini [TCS,~2007], Jansson et al.mbox{} [ICALP,~2012], Nekrich and Navarro [SODA,~2013].Source: ICALP 2013 - Automata, Languages, and Programming. 40th International Colloquium, pp. 504–515, Riga, Latvia, 8-12 July 2013
DOI: 10.1007/978-3-642-39206-1_43
Metrics:


See at: figshare.com Open Access | doi.org Restricted | link.springer.com Restricted | CNR ExploRA


2013 Journal article Open Access OPEN
Distribution-aware compressed full-text indexes
Ferragina P., Sìren J., Venturini R.
In this paper we address the problem of building a compressed self-index that, given a distribution for the pattern queries and a bound on the space occupancy, minimizes the expected query time within that index space bound. We solve this problem by exploiting a reduction to the problem of finding a minimum weight $K$-link path in a properly designed Directed Acyclic Graph. Interestingly enough, our solution can be used with any compressed index based on the Burrows-Wheeler transform. Our experiments compare this optimal strategy with several other known approaches, showing its effectiveness in practice.Source: Algorithmica 67 (2013): 529–546. doi:10.1007/s00453-013-9782-3
DOI: 10.1007/s00453-013-9782-3
DOI: 10.1007/978-3-642-23719-5_64
Project(s): MIDAS via OpenAIRE
Metrics:


See at: HELDA - Digital Repository of the University of Helsinki Open Access | Algorithmica Open Access | doi.org Restricted | Algorithmica Restricted | link.springer.com Restricted | CNR ExploRA


2013 Conference article Restricted
Compressed cache-oblivious string B-tree
Ferragina P., Venturini R.
In this paper we address few variants of the well-known prefix-search problem in strings, and provide solutions for the cache-oblivious model which improve the best known results. In particular we close (asymptotically) the classic problem which asks for searching the set of strings sharing the longest common prefix with a queried pattern. Indeed, we provide an I/O-optimal solution matching the space lower bound for tries up to a constant multiplicative factor $(1+epsilon)$, for $epsilon>0$. Our solutions hinge upon few technicalities and, more interestingly, on a novel compressed storage scheme which adds to the elegant Locality Preserving Front-Coding (Bender {em et al.} 2006) the ability to decompress prefixes of the stored strings I/O-optimally still preserving its space bounds.Source: ESA 2013 - Algorithms. 21st Annual European Symposium, pp. 469–480, Sophia Antipolis, France, 2-4 September 2013
DOI: 10.1007/978-3-642-40450-4_40
DOI: 10.1145/2903141
Project(s): SoBigData via OpenAIRE
Metrics:


See at: doi.org Restricted | ACM Transactions on Algorithms Restricted | link.springer.com Restricted | CNR ExploRA


2013 Report Unknown
Cache-oblivious peeling of random hypergraphs
Belazzougui D., Boldi P., Ottaviano G., Venturini R., Vigna S.
The computation of a peeling order in a randomly generated hypergraph is the most time- consuming step in a number of constructions, such as perfect hashing schemes, random r-SAT solvers, error-correcting codes, and approximate set encodings. While there exists a straightforward linear time algorithm, its poor I/O performance makes it impractical for hypergraphs whose size exceeds the available internal memory. We show how to reduce the computation of a peeling order to a small number of sequential scans and sorts, and analyze its I/O complexity in the cache-oblivious model. The resulting algorithm requires O(sort(n)) I/Os and O(n log n) time to peel a random hypergraph with n edges. We experimentally evaluate the performance of our implementation of this algorithm in a real- world scenario by using the construction of minimal perfect hash functions (MPHF) as our test case: our algorithm builds a MPHF of 7.6 billion keys in less than 21 hours on a single machine. The resulting data structure is both more space-efficient and faster than that obtained with the current state-of-the-art MPHF construction for large-scale key sets.Source: ISTI Technical reports, 2013

See at: CNR ExploRA


2013 Journal article Open Access OPEN
On the bit-complexity of Lempel-Ziv compression
Ferragina P., Nitto I., Venturini R.
One of the most famous and investigated lossless data-compression schemes is the one introduced by Lempel and Ziv about 30 years ago [38]. This compression scheme is known as dictionary-based compressor and consists of squeezing an input string by replacing some of its substrings with (shorter) codewords which are actually pointers to a dictionary of phrases built as the string is processed. Surprisingly enough, although many fundamental results are nowadays known about the speed and e ffectiveness of this compression process, "we are not aware of any parsing scheme that achieves optimality [...] under any constraint on the codewords other than being of equal length" [29, pag. 159]. Here optimality means to achieve the minimum number of bits in compressing each individual input string, without any assumption on its generating source. In this paper we investigate some issues pertaining to the bit-complexity of LZ77-based compressors, the most powerful variant of the LZ-compression scheme, and we design algorithms which achieve bit-optimality in the compressed output size by taking efficient/optimal time and optimal space.Source: SIAM journal on computing (Print) 42 (2013): 1521–1541. doi:10.1137/120869511
DOI: 10.1137/120869511
DOI: 10.1137/1.9781611973068.84
Metrics:


See at: SIAM Journal on Computing Open Access | www.di.unipi.it Open Access | doi.org Restricted | SIAM Journal on Computing Restricted | epubs.siam.org Restricted | www.scopus.com Restricted | CNR ExploRA


2013 Contribution to book Restricted
Web search
Ferragina P., Venturini R.
Faced with the massive amount of information on the Web, which includes not only texts but nowadays any kind of file (audio, video, images, etc.), Web users tend to lose their way when browsing the Web, falling into what psychologists call "getting lost in hyperspace". Search engines alleviate this by presenting the most relevant pages that better match the user's information needs. Collecting a large part of the pages in the Web, extrapolating a user information need expressed by means of often ambiguous queries, establishing the importance of Web pages and their relevance for a query, are just a few examples of the difficult problems that search engines address every day to achieve their ambitious goal. In this chapter, we introduce the concepts and the algorithms that lie at the core of modern search engines by providing running examples that simplify understanding, and we comment on some recent and powerful tools and functionalities that should increase the ability of users to match in the Web their information needs.Source: The power of algorithms. Inspiration and Examples in Everyday Life, edited by Giorgio Ausiello, Rossella Petreschi, pp. 107–137. Berlin: Springer-Verlag, 2013
DOI: 10.1007/978-3-642-39652-6_5
Metrics:


See at: link.springer.com Restricted | CNR ExploRA