Selecting sketches for similarity search

Mic V, Novak D, Vadicamo L, Zezula P

Hamming embedding techniques, which produce bit-string sketches, have recently been applied successfully to speed up similarity search. Sketches are usually compared by the Hamming distance and used to filter out non-relevant objects during query evaluation. Because several sketching techniques exist and each can produce sketches of different lengths, it is hard to select a proper configuration for a particular dataset. We assume that the (dis)similarity of objects is expressed by an arbitrary metric function, and we propose a way to efficiently estimate the quality of sketches using just a small sample of the data. Our approach is based on a probabilistic analysis of sketches that describes how well objects are separated after projection to the Hamming space.
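
To illustrate how bit-string sketches can filter out non-relevant objects before the original metric is evaluated, here is a minimal Python sketch. It is an assumption-laden illustration, not the authors' method: the abstract does not name a sketching technique, so the pivot-pair bit assignment used here, the Euclidean metric, the toy 2D data, and all function names (`make_sketch`, `hamming`, `filtered_knn`) are hypothetical choices.

```python
import random


def euclidean(a, b):
    """Example metric; any metric function could be substituted."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def make_sketch(obj, pivot_pairs, dist):
    """Build a bit-string sketch stored as an int (assumed technique).

    Bit i is set when obj is closer to the first pivot of pair i
    than to the second one.
    """
    bits = 0
    for i, (p1, p2) in enumerate(pivot_pairs):
        if dist(obj, p1) < dist(obj, p2):
            bits |= 1 << i
    return bits


def hamming(a, b):
    """Hamming distance between two sketches stored as ints."""
    return bin(a ^ b).count("1")


def filtered_knn(query, data, sketches, pivot_pairs, dist, k, candidates):
    """Filter by Hamming distance on sketches, then refine with the metric."""
    q_sketch = make_sketch(query, pivot_pairs, dist)
    # Filtering step: keep the objects whose sketches are nearest to the query sketch.
    order = sorted(range(len(data)), key=lambda i: hamming(q_sketch, sketches[i]))
    shortlist = order[:candidates]
    # Refinement step: evaluate the original metric on the shortlist only.
    shortlist.sort(key=lambda i: dist(query, data[i]))
    return shortlist[:k]


rng = random.Random(0)
data = [(rng.random(), rng.random()) for _ in range(200)]      # toy dataset
pivot_pairs = [tuple(rng.sample(data, 2)) for _ in range(16)]  # 16-bit sketches
sketches = [make_sketch(o, pivot_pairs, euclidean) for o in data]

# Indices of the 5 approximate nearest neighbours of data[0].
result = filtered_knn(data[0], data, sketches, pivot_pairs, euclidean,
                      k=5, candidates=40)
```

The sketch length (here 16 bits) and the size of the candidate set (here 40) are exactly the kind of configuration choices the abstract describes as hard to select; the values above are arbitrary.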