Description
This Matlab package implements several algorithms used for large-scale
image search. The algorithms are implemented in C++ with large-scale
databases in mind: the package can handle millions of images and hundreds
of millions of local features. It provides MEX interfaces for Matlab, but
can also be used (with possible future modifications) from Python and
directly from C++. It can also be used for approximate nearest neighbor
search, especially through the KdTrees or LSH implementations.
The algorithms can be divided into two broad categories, depending on the
approach taken for image search:
- Bag of Words (BoW)
  The images are represented by histograms of visual words.
  It includes algorithms for computing dictionaries:
    - KMeans.
    - Approximate KMeans (AKM).
    - Hierarchical KMeans (HKM).
  It also includes algorithms for fast search:
    - Inverted File Index.
    - Inverted File Index with Extra Information (for example, for
      implementing Hamming Embedding).
    - MinHash.
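To make the Bag of Words pipeline concrete, here is a minimal sketch in plain Python. It is not the toolbox's API: the function names (`nearest_word`, `bow_histogram`, `build_inverted_file`, `search`), the toy dictionary, and the choice of cosine similarity for scoring are all assumptions made for illustration. The dictionary is given directly here; in practice it would come from one of the KMeans variants listed above.

```python
import math
from collections import Counter, defaultdict


def nearest_word(feature, dictionary):
    """Return the index of the closest visual word (squared Euclidean distance)."""
    def sq_dist(word):
        return sum((f - c) ** 2 for f, c in zip(feature, word))
    return min(range(len(dictionary)), key=lambda k: sq_dist(dictionary[k]))


def bow_histogram(features, dictionary):
    """Count how many local features are assigned to each visual word."""
    return Counter(nearest_word(f, dictionary) for f in features)


def build_inverted_file(histograms):
    """Map each visual word to the (image_id, count) pairs of images containing it."""
    index = defaultdict(list)
    for image_id, hist in enumerate(histograms):
        for word, count in hist.items():
            index[word].append((image_id, count))
    return index


def search(query_hist, index, norms):
    """Rank images by cosine similarity, visiting only words present in the query."""
    scores = defaultdict(float)
    for word, q_count in query_hist.items():
        for image_id, count in index.get(word, []):
            scores[image_id] += q_count * count
    q_norm = math.sqrt(sum(c * c for c in query_hist.values()))
    ranked = [(s / (q_norm * norms[i]), i) for i, s in scores.items()]
    return sorted(ranked, reverse=True)


# Hypothetical toy data: a 3-word dictionary and two database images (2-D features).
dictionary = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
database = [
    [(1.0, 0.0), (0.0, 1.0), (9.0, 1.0)],    # image 0: mostly word 0
    [(0.0, 9.0), (1.0, 10.0), (10.0, 1.0)],  # image 1: mostly word 2
]
histograms = [bow_histogram(feats, dictionary) for feats in database]
norms = [math.sqrt(sum(c * c for c in h.values())) for h in histograms]
index = build_inverted_file(histograms)

query = bow_histogram([(0.5, 0.5), (1.0, 1.0), (9.0, 0.0)], dictionary)
ranking = search(query, index, norms)  # list of (score, image_id), best first
```

The inverted file pays off because a query only visits the lists of the words it contains, so images that share no word with the query are never scored.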
- Full Representation (FR)
  The images are represented by the individual features.
  It includes algorithms for fast approximate nearest neighbor search:
    - KdTrees (Kdtree).
    - Hierarchical KMeans (Hkm).
    - Locality Sensitive Hashing (LSH), with several hash functions:
        - Hamming hash function (bit sampling, approximates Hamming
          distance), i.e. h = x[i]
        - Cosine hash function (random hyperplanes through the origin,
          approximates dot product), i.e. h = sign(<x,r>)
        - L1 hash function (approximates the L1 distance), i.e.
          h = floor((x[i] - b) / w)
        - L2 hash function (random hyperplanes with bias, approximates
          Euclidean distance, similar to E2LSH), i.e.
          h = floor((<x,r> - b) / w)
        - Spherical Simplex (approximates distances on the unit
          hypersphere)
        - Spherical Orthoplex (approximates distances on the unit
          hypersphere)
        - Spherical Hypercube (approximates distances on the unit
          hypersphere)
        - Binary Gaussian Kernels (approximates the Gaussian kernel)
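The four closed-form LSH hash functions listed above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the toolbox's code: the function names are hypothetical, the offset b is assumed to be subtracted in the L1 and L2 hashes, sign(0) is mapped to +1, and `make_cosine_key` (which concatenates several cosine hashes into one bucket key) is an added helper that reflects common LSH practice rather than anything stated in this document.

```python
import math
import random


def dot(x, r):
    """Inner product <x, r>."""
    return sum(a * b for a, b in zip(x, r))


def hamming_hash(x, i):
    """Bit sampling: h = x[i], for binary vectors."""
    return x[i]


def cosine_hash(x, r):
    """Random hyperplane through the origin: h = sign(<x, r>); 0 is mapped to +1."""
    return 1 if dot(x, r) >= 0 else -1


def l1_hash(x, i, b, w):
    """h = floor((x[i] - b) / w): quantize one coordinate with offset b and bin width w."""
    return math.floor((x[i] - b) / w)


def l2_hash(x, r, b, w):
    """h = floor((<x, r> - b) / w): quantize a random projection with offset b."""
    return math.floor((dot(x, r) - b) / w)


def make_cosine_key(dim, n_bits, seed=0):
    """Build a bucket-key function from n_bits random hyperplanes (assumed helper)."""
    rng = random.Random(seed)
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

    def key(x):
        return tuple(cosine_hash(x, r) for r in planes)

    return key


key = make_cosine_key(dim=3, n_bits=8)
x = [0.3, -1.2, 0.5]
x_scaled = [2 * v for v in x]  # same direction as x, twice the length
same_bucket = key(x) == key(x_scaled)
```

Because the cosine hash depends only on which side of each hyperplane a vector falls, `x` and `x_scaled` land in the same bucket. The L1 and L2 hashes, by contrast, are sensitive to magnitude through the bin width w.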
News
- November 5, 2010: Version 1.0.
Download
Caltech Large Scale Image Search: from Google Code (zip) or local copy
(zip), 144 KB.
Source Code
More Information
The Large Scale Image Search Benchmark Project page has more information.
Citation
If you find the toolbox useful, please cite the paper [1] below.
References

[1] Mohamed Aly, Mario Munich, and Pietro Perona.
    Indexing in Large Scale Image Collections: Scaling Properties and
    Benchmark.
    IEEE Workshop on Applications of Computer Vision (WACV), Hawaii,
    January 2011.
    [pdf]

[2] Mohamed Aly, Mario Munich, and Pietro Perona.
    Indexing in Large Scale Image Collections: Scaling Properties,
    Parameter Tuning, and Benchmark.
    Technical Report, Caltech, USA, October 2010.
    [pdf]
