
Large Scale Image Search Benchmark

Description

Fast, accurate indexing of large image collections has become an important problem with many applications. Given a query image, the goal is to retrieve the matching images in the collection. We compare the structure and properties of seven methods, some of them new, that follow the two leading approaches: voting based on matches between local descriptors, and matching histograms of visual words. Specifically, we compare Kd-Trees, Hierarchical K-Means, Locality Sensitive Hashing (LSH) with three different hash functions (L2, Spherical Simplex, Spherical Orthoplex), Inverted File, and Min-Hash. We derive theoretical estimates of how memory and computational cost scale with the number of images in the database, and we evaluate these properties empirically on four real-world datasets with different statistics. We discuss the pros and cons of each method and suggest promising directions for future research.
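To make the visual-word approach concrete, here is a minimal Python sketch of an inverted file over bag-of-visual-words histograms. It is an illustration only and is not taken from the toolbox: the function names, the toy data, and the choice of histogram intersection as the similarity score are all assumptions made for this example.

```python
from collections import Counter, defaultdict


def build_inverted_file(database):
    """Map each visual word to the images containing it.

    `database` is a dict: image id -> list of visual word ids
    (hypothetical input format, chosen for this sketch).
    """
    index = defaultdict(dict)
    for image_id, words in database.items():
        for word, count in Counter(words).items():
            index[word][image_id] = count
    return index


def search(index, query_words, top_k=3):
    """Rank database images by histogram intersection with the query.

    Only images sharing at least one visual word with the query are
    touched, which is the point of using an inverted file.
    """
    scores = Counter()
    for word, query_count in Counter(query_words).items():
        for image_id, db_count in index.get(word, {}).items():
            scores[image_id] += min(query_count, db_count)
    return scores.most_common(top_k)


# Toy example: three images described by quantized visual word ids.
database = {"a": [1, 2, 2, 3], "b": [4, 5], "c": [2, 3, 3]}
index = build_inverted_file(database)
results = search(index, [2, 2, 3])
print(results)
```

Image "b" shares no visual words with the query, so it is never visited during the search. A real system would also weight words (for example with tf-idf) and normalize the histograms; both are omitted here for brevity.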

Software

Caltech Large Scale Image Search Toolbox: contains our implementations of the algorithms compared in this work.

References

  1. Mohamed Aly, Mario Munich, and Pietro Perona. Indexing in Large Scale Image Collections: Scaling Properties and Benchmark.
     IEEE Workshop on Applications of Computer Vision (WACV), Hawaii, January 2011. [pdf]
  2. Mohamed Aly, Mario Munich, and Pietro Perona. Indexing in Large Scale Image Collections: Scaling Properties, Parameter Tuning, and Benchmark.
     Technical Report, Caltech, USA, October 2010. [pdf]