2007
Conference article
Restricted
A similarity approach on searching for digital rights
Allasia W, Falchi F, Gallo F
We present an innovative approach that treats the rights management metadata as metric objects, enabling similarity search on IPR attributes between digital items. We show how content-based similarity search can help both the user to deal with a huge amount of similar items with different licenses and the content providers to detect fake copies or illegal uses. Our aim is the management of the metadata related to the Digital Rights in centralized systems or networks with indexing capabilities for both text and similarity searches, providing the
Source: JOURNAL OF UNIVERSAL COMPUTER SCIENCE (PRINT), pp. 147-154. Graz, Austria, 5-7 September 2007
See at:
CNR IRIS
2008
Journal article
Open Access
Scalability comparison of Peer-to-Peer similarity-search structures
Batko M, Novak D, Falchi F, Zezula P
Due to the increasing complexity of current digital data, similarity search has become a fundamental computational task in many applications. Unfortunately, its costs are still high and grow linearly on single server structures, which prevents their efficient application to large data volumes. In this paper, we briefly describe four recent scalable distributed techniques for similarity search and study their performance in executing queries on three different datasets. Though all the methods employ parallelism to speed up query execution, different advantages for different objectives have been identified by experiments. The reported results would be helpful for choosing the best implementations for specific applications. They can also be used for designing new and better indexing structures in the future.
Source: FUTURE GENERATION COMPUTER SYSTEMS, vol. 24 (issue 8), pp. 834-848
See at:
CNR IRIS | ISTI Repository | www.sciencedirect.com
2006
Conference article
Restricted
On scalability of the similarity search in the world of peers
Batko M, Novak D, Falchi F, Zezula P
Due to the increasing complexity of current digital data, similarity search has become a fundamental computational task in many applications. Unfortunately, its costs are still high and the linear scalability of single server implementations prevents efficient searching in large data volumes. In this paper, we briefly describe four recent scalable distributed similarity search techniques and study their performance in executing queries on three different datasets. Though all the methods employ parallelism to speed up query execution, different advantages for different objectives have been identified by experiments. The reported results can be exploited for choosing the best implementations for specific applications. They can also be used for designing new and better indexing structures in the future.
See at:
CNR IRIS
2007
Conference article
Restricted
A digital rights aware similarity measure for multimedia documents
Allasia W, Falchi F, Gallo F, Orio N
This paper presents a novel approach to the retrieval of multimedia documents that considers Intellectual Property Rights (IPR) metadata as a multidimensional feature in a metric space. The approach allows us to perform similarity searches on the IPR attributes of digital items and to integrate these searches in a common query-by-example paradigm. We aim at managing the metadata related to the IPR in both centralized and Peer-to-Peer systems with metric indexing capabilities. Together with content-based similarity search, IPR similarity search can help the end user to deal with a huge amount of similar items with different licenses. Moreover, content providers may be able to detect fake copies or illegal uses. Two use cases, related to the retrieval of music and images respectively, are presented to describe the possible applications of the approach.
See at:
doi.acm.org | CNR IRIS
2007
Conference article
Restricted
An Innovative approach for Indexing and Searching Digital Rights
Allasia W, Chiariglione F, Gallo F, Falchi F
The objective of this paper is to demonstrate the reuse of digital content, such as video documents or PowerPoint presentations, by exploiting existing technologies for automatic extraction of metadata (OCR, speech recognition, cut detection, MPEG-7 visual descriptors, etc.). The multimedia documents and the extracted metadata are then indexed and managed by the Multimedia Content Management System (MCMS) MILOS, specifically developed to support the design and effective implementation of digital library applications. As a result, the indexed digital material can be retrieved by means of content-based retrieval on the extracted text and on the MPEG-7 visual descriptors (via similarity search), assisting the user of the e-Learning Library (student or teacher) in retrieving items not only by the basic bibliographic metadata (title, author, etc.).
See at:
doi.ieeecomputersociety.org | CNR IRIS
2008
Conference article
Restricted
Audio-visual content analysis in P2P networks: the SAPIR approach
Allasia W, Falchi F, Gallo F, Kacimi M, Kaplan A, Mamou J, Mass Y, Orio N
Content-based search in audio-visual collections requires media-specific analysis for extracting low-level features to be efficiently indexed and searched. We present the SAPIR media framework for analyzing digital content and representing the extracted features in a common schema, used to index and search content in a P2P network. The framework contains splitters of compound objects into simple objects to deal with complex media like videos, using image and speech analyzers. We report usage of this framework in the SAPIR demo.
See at:
CNR IRIS | ieeexplore.ieee.org
2007
Other
Restricted
SAPIR - D3.1 - Common Schema for Feature Extraction
Falchi F, Allasia W, Gallo F, Jonathan M, Yosi M, Miotto R, Orio N, Hagège C, Kaplan A
In this report we define a representation formalism for describing multimedia documents containing any combination of video, still images, music, speech, and text. A document description in this formalism includes metadata (author, title, etc.), as well as the results of automatic feature extraction for use in indexing, search, and browsing. By defining a single representation format that covers all media, we intend to support cross-media search; for example, an image similarity search might retrieve both videos and still images, and a keyword search on titles might retrieve documents of all media types. The representation is based on the MPEG-7 standard, with extensions to cover media, features, and metadata not covered by the standard. MPEG-7 provides a rich vocabulary for describing document structure and content, and its status as a standard means that SAPIR will be interoperable with other multimedia management systems. The SAPIR-specific extensions are defined in such a way as to preserve this interoperability. The report describes project activities undertaken as part of task T3.1.
See at:
CNR IRIS
2013
Software
Metadata Only Access
Visual information retrieval
Falchi F
VIR is a library for content-based image retrieval and classification based on global and local features. The library allows comparing images considering their global and/or local features. It includes local features matching, RANSAC, MPEG-7 global features comparisons, kNN classification, and the Bag-of-Words (or Bag-of-Features) approach. It is an ongoing project. At the moment it does not implement feature extraction.
See at:
github.com | CNR IRIS
2013
Other
Restricted
PRESTO4U - CoP Progress report year 1
Ligios L, Lindgaard P H, Teruggi D, Verbruggen E, Chakravarthy A, Laurenson P, Mcneil J, Christensen T, Bagnoli L, Falchi F, Snyders M
Presto4U has created nine Communities of Practice (CoPs), each based on a shared concern, a shared set of problems and a common pursuit of technological solutions related to the particular custodial practices and preservation challenges in a principal sub-sector of audiovisual media. The CoPs, collectively and individually, provide a crucial reference point and exchange environment for all the Presto4U activities. Each CoP is coordinated by a consortium Partner that is well respected and well connected to other CoP members, that is keen to help develop the CoP's practice, and is knowledgeable and passionate about the Presto4U topics. Now that the Communities of Practice have been created, a large part of the work in Presto4U consists of maintaining, nurturing and growing each CoP through online and offline activities, publications and tools. The project will manage the activities for each CoP expert working group through directing, facilitating, stimulating and maintaining interactions and feedback for providing detail on the preservation needs of each Community. This part of the project will also gather data about the various communities, provide analyses of gaps and challenges, define and share information with the supply side, and maintain feedback mechanisms and broker connections between the challenges of each CoP and the project. This first CoP Progress Report is a collated report describing the individual progress of each Community of Practice since their establishment at Project Month 7, focusing on the actual Community of Practice management progress as well as a first glimpse of the long-term digital preservation technological needs, barriers and suppliers. Not all Communities of Practice started at an equal level of "pre-development", nor have they developed at the same rate. Therefore, each chapter in this report should be considered as an individual overview of the situation in each Community of Practice. The report will be used further into the project to ensure the efficient mapping of Communities of Practice and the project requirements, and to stimulate combined efforts between two or more Communities of Practice where possible. It will also help set further priorities for work ongoing in Presto4U, notably the dissemination focus of WP5, the research in WP3, the specific challenges for the development of an online Market Place in WP4, and the measuring of impact of the project's efforts in WP6.
Project(s): PRESTO4U
See at:
CNR IRIS
2014
Other
Restricted
Presto4U - Longitudinal CoP impact analysis: simple longitudinal study of the potential impact of the CoP activities based on the CoP reports and their progress over time
Bagnoli L, Teruggi D, Verbruggen E, Lindgaard P H, Gram J N, Rendina M, Falchi F, Walland P, Schallaure P, Houpert J, Bauer C, Finoradin B, Oomen J, Oleksik P
The scope of this Deliverable is to study the impact of the Presto4U project on the CoP activities, based on CoP reports and their progress over time. The study includes a validation of the models from WP4T1 based on real uptake case studies, the influence of technology watch and brokering services on the CoPs' choices, and a satisfaction survey for every CoP on the role of the project in the understanding and adoption of new solutions. A brief introduction presents the economic environment in which the various communities face the challenges of preservation and how these are addressed with different approaches depending on the type of community. The second part presents the analysis of the interactions among content holders and producers, and service and technology providers. It also gathers input on the ability of the Communities to internally integrate a technology or a specific practice in terms of specialized and skilled staff, internally available technology or potential financial resources, and on their dependence on external technology, financing and service providers. The third part intends to draw some conclusions on the project status, through an analysis of the expectations and achievements from the point of view of all the nine communities involved in Presto4U. A satisfaction survey, in the guise of a common questionnaire, has been submitted to all the CoP leaders.
Project(s): PRESTO4U
See at:
CNR IRIS | www.prestocentre.org
2014
Other
Restricted
Presto4U - Recommendations on rights technology
Boch L, Di Carlo A, Gallo F, Pellegrino J, Laurenson P, Rendina M, Ligios L, Ortolani S, Christensen T, Bagnoli L, Teruggi D, Verbruggen E, Falchi F
The scope of the Presto4U project is technology for digital audiovisual media preservation. The long-term preservation of digital audiovisual media presents a range of complex technological, organisational, economic and rights-related issues. Presto4U focuses on useful technological solutions, for raising awareness and improving the adoption of audiovisual preservation research results, both by service providers and media owners. Presto4U has created a series of Communities of Practice in the main sub-sectors of audiovisual media preservation for developing knowledge about their practices and their unresolved issues in accessing and taking advantage of research output. The scope of this report is technology for handling audiovisual rights. Beyond considering how to prevent the loss of content, rights information has to be included upstream in the preservation process before the sources of information become uncertain. There are many areas of research into audiovisual rights, covering: definitions of formats and models or rights representations, such as rights expression languages and ontologies, development of innovative services for the management of rights information, and architectures for enforcing the appropriate use of rights. The context is provided by the legal framework, the media contracts, the technologies for content distribution and consumption, the technologies for information management, and the related standards and specifications. Topics in the scope of this report include how to ascertain rights ownership, how to represent the rights unambiguously in machine-readable form, and what the practical guidelines are for handling rights expressed in that way.
Project(s): PRESTO4U
See at:
CNR IRIS | www.prestocentre.org
2014
Other
Restricted
PRESTO4U - CoP Progress report year 2
Ligios L, Lindgaard P H, Teruggi D, Verbruggen E, Chakravarthy A, Laurenson P, Mcneil J, Christensen T, Bagnoli L, Falchi F, Snyders M
Since the nine Communities of Practice were created in year 1, a large part of the work in Presto4U during year 2 consisted of directing, facilitating, stimulating and maintaining interactions and feedback for providing detail on the preservation needs of each Community. This part of the project also gathered data about the various communities, providing mostly qualitative analyses of gaps and challenges, identifying relations with the supply side, and delivering feedback to the project for many of the challenges of each CoP. This second CoP Progress Report is a collated report describing the individual progress of each Community of Practice in year 2. Each chapter was written by the respective Community of Practice leader and responsible partner in the project. Many of the findings and recommendations in the task have been taken into account in other work of Presto4U and in the further strategies for the audiovisual preservation domain as documented in WP6.
Source: Project report, PRESTO4U, Deliverable D2.5, 2014
Project(s): PRESTO4U
See at:
www.prestocentre.org | CNR ExploRA
2007
Other
Open Access
A Content-Addressable Network for Similarity Search in Metric Spaces
Falchi F
Because of the ongoing digital data explosion, more advanced search paradigms than the traditional exact match are needed for content-based retrieval in huge and ever growing collections of data produced in application areas such as multimedia, molecular biology, marketing, computer-aided design and purchasing assistance. As the variety of data types is fast going towards creating a database utilized by people, computer systems must be able to model fundamental human reasoning paradigms, which are naturally based on similarity. The ability to perceive similarities is crucial for recognition, classification, and learning, and it plays an important role in scientific discovery and creativity. Recently, the mathematical notion of metric space has become a useful abstraction of similarity and many similarity search indexes have been developed.
In this thesis, we accept the metric space similarity paradigm and concentrate on the scalability issues. By exploiting computer networks and applying the Peer-to-Peer communication paradigms, we build a structured network of computers able to process similarity queries in parallel. Since no centralized entities are used, such architectures are fully scalable. Specifically, we propose a Peer-to-Peer system for similarity search in metric spaces called Metric Content-Addressable Network (MCAN), which is an extension of the well-known Content-Addressable Network (CAN) used for hash lookup. A prototype implementation of MCAN was tested on real-life datasets of image features, protein symbols, and text; the observed results are reported. We also compared the performance of MCAN with three other recently proposed distributed data structures for similarity search in metric spaces.
Project(s): Search on Audio-visual content using peer-to-peer Information Retrieval
See at:
etd.adm.unipi.it | CNR IRIS | ISTI Repository
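As an illustration of the metric space similarity paradigm that the thesis abstract above relies on, the following is a minimal single-machine sketch of a range query that uses the triangle inequality to skip distance computations. It is not the MCAN implementation and does not model any Peer-to-Peer behaviour; the pivot-based filtering, the function names (`build_pivot_table`, `range_query`), the use of Euclidean distance, and the sample points are all assumptions made for the example.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def build_pivot_table(data: Sequence[Point], pivots: Sequence[Point]) -> List[List[float]]:
    """Precompute the distance from every object to every pivot (hypothetical helper)."""
    return [[math.dist(obj, p) for p in pivots] for obj in data]


def range_query(
    query: Point,
    radius: float,
    data: Sequence[Point],
    pivots: Sequence[Point],
    table: Sequence[Sequence[float]],
) -> Tuple[List[Point], int]:
    """Return the objects within `radius` of `query` and the number of
    object distances that actually had to be computed."""
    query_to_pivots = [math.dist(query, p) for p in pivots]
    hits: List[Point] = []
    computed = 0
    for obj, obj_to_pivots in zip(data, table):
        # Triangle inequality: |d(q, p) - d(o, p)| <= d(q, o) for any pivot p,
        # so the largest such difference is a lower bound on d(q, o).
        lower_bound = max(
            abs(qp - op) for qp, op in zip(query_to_pivots, obj_to_pivots)
        )
        if lower_bound > radius:
            continue  # the object cannot be within the radius; skip it
        computed += 1
        if math.dist(query, obj) <= radius:
            hits.append(obj)
    return hits, computed


if __name__ == "__main__":
    # Illustrative data only; these points are not from any dataset in the thesis.
    data = [(0.0, 0.0), (1.0, 0.0), (5.0, 5.0), (10.0, 10.0)]
    pivots = [(0.0, 0.0)]
    table = build_pivot_table(data, pivots)
    hits, computed = range_query((0.5, 0.0), 1.0, data, pivots, table)
    print(hits, computed)
```

The filtering only requires that the distance function satisfies the metric postulates, so the same sketch would work with any other metric in place of `math.dist`.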
2014
Dataset
Metadata Only Access
YFCC100M-HNfc6
Falchi F
YFCC100M-HNfc6 contains deep features extracted from the 100 million images of the Creative Commons Initiative. In particular, we took the activation of the neurons in the fc6 layer of the Hybrid-CNN, whose model and weights are publicly available in the Caffe Model Zoo. The Hybrid-CNN was trained on 1,183 categories (205 scene categories from the Places Database and 978 object categories from the training data of ILSVRC2012 (ImageNet)) with ~3.6 million images. The architecture is the same as the Caffe reference network.
The Yahoo Flickr Creative Commons 100M (YFCC100M) dataset was created in 2014 as part of the Yahoo Webscope program. The dataset consists of approximately 99.2 million photos and 0.8 million videos, all uploaded to Flickr between 2004 and 2014 and published under a Creative Commons commercial or non-commercial license.
The deep features have been integrated in the corpus maintained by the Multimedia Commons initiative, which is an effort to develop and share sets of computed features and ground-truth annotations for the Yahoo Flickr Creative Commons 100 Million dataset (YFCC100M), which contains around 99.2 million images and nearly 800,000 videos from Flickr, all shared under Creative Commons licenses.
We also give Content-Based Image Retrieval results for various approaches and for various subsets of the dataset. While on the web page you can only see 100 results, 10,001 results for each query are available for download.
See at:
CNR IRIS | www.deepfeatures.org