Schall K., Bailer W., Barthel K. -U., Carrara F., Lokoč J., Peška L., Schoeffmann K., Vadicamo L., Vairo C.
Video browsing, Video content analysis, Content-based retrieval, Evaluations, Interactive video retrieval
CLIP-based text-to-image retrieval proved very effective at the interactive video retrieval competition Video Browser Showdown 2022, where all three top-scoring teams had implemented a variant of a CLIP model in their systems. Since the performance of these three systems was quite close, this post-evaluation was designed to gain better insight into the differences between the systems and to compare the CLIP-based text-query retrieval engines by introducing slight modifications to the original competition settings. An extended analysis of the overall results and the retrieval performance of all systems’ functionalities shows that a strong text retrieval model certainly helps, but has to be coupled with extensive browsing capabilities and other query modalities to consistently solve known-item search tasks in a large-scale video database.
Source: INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL, vol. 13 (issue 15)
@article{oai:iris.cnr.it:20.500.14243/485023,
title = {Interactive multimodal video search: an extended post-evaluation for the VBS 2022 competition},
author = {Schall K. and Bailer W. and Barthel K. -U. and Carrara F. and Lokoč J. and Peška L. and Schoeffmann K. and Vadicamo L. and Vairo C.},
doi = {10.1007/s13735-024-00325-9},
year = {2024}
}