Automatic Discovery of Image Families


Gathering large collections of images is easy nowadays thanks to image sharing websites. However, such collections contain duplicates and highly similar images, which we refer to as image families. Automatically discovering and cataloguing these families in large collections is important for many applications, e.g. image search, image collection visualization, and research.

This work investigates the problem by thoroughly comparing two broad approaches to measuring image similarity: global vs. local features. We assess their performance as the image collection scales up to over 11,000 images with over 6,300 families. We also present a new algorithm that automatically determines the number of families in the collection.


  • Mohamed Aly, Peter Welinder, Mario Munich, and Pietro Perona. Towards Automated Large Scale Discovery of Image Families.
    Second IEEE Workshop on Internet Vision, IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Miami, Florida, June 2009. [pdf]
  • Mohamed Aly, Peter Welinder, Mario Munich, and Pietro Perona. Automatic Discovery of Image Families: Global Vs. Local Features.
    IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, November 2009. [pdf]
  • Caltech Buildings Dataset
  • Caltech Game Covers Dataset


This is joint work with Peter Welinder, Mario Munich, and Pietro Perona.