Bull and the CEA, the French Atomic Energy Authority, announce that they have achieved a record performance for image searches in very large-scale databases. Finding one image among 22 million stored in a database now takes just six seconds, equivalent to about 3.7 million images per second, five times faster than previously. This record result was achieved on a supercomputer designed and supplied by Bull, using the multimedia search software specially developed by CEA LIST¹ as part of the FAME2² project. It opens the way to a vast field of applications ranging from business intelligence to comparison of medical images, from 'data mining' on the Internet to e-business and content management.
A revolutionary image search technology with multiple applications...
Today, Internet search engines carry out image searches using only textual descriptions as criteria (the image name or caption, for example). By basing the search on an analysis of the image content, the Piria³ search engine developed by the CEA provides a much more powerful solution, opening the way to a vast field of applications: from business intelligence to comparison of medical imagery, 'data mining' on the Internet, e-business and content management.
The CEA LIST leads research into multi-lingual multimedia knowledge engineering, and for several years now has been developing knowledge extraction techniques with the aim of improving the relevance of the results obtained. The principle underlying content-based image searches involves calculating a visual or coded signature for each image in a database, and classifying these signatures in an index. The query is submitted as an image, and the response takes the form of similar images. These content-based search techniques, which start with an analysis of pixel values, are intrinsically very demanding in terms of computing resources.
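The principle described above can be illustrated with a minimal sketch. It is not the Piria implementation: the signature (a coarse colour histogram), the index (a plain dictionary scanned in full) and every function name below are assumptions chosen for illustration, since this announcement does not describe the signatures or the index structure that Piria actually uses.

```python
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def signature(pixels: Sequence[Pixel], bins: int = 4) -> List[float]:
    """Hypothetical visual signature: a normalised colour histogram."""
    hist = [0.0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        # Map each pixel to one of bins**3 coarse colour cells.
        cell = (r // step) * bins * bins + (g // step) * bins + (b // step)
        hist[cell] += 1.0
    total = len(pixels) or 1
    return [count / total for count in hist]


def build_index(images: Dict[str, Sequence[Pixel]]) -> Dict[str, List[float]]:
    """Compute one signature per image; the 'index' here is just a dict."""
    return {name: signature(px) for name, px in images.items()}


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two signatures (0.0 means identical histograms)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def search(index: Dict[str, List[float]],
           query_pixels: Sequence[Pixel],
           top_k: int = 3) -> List[str]:
    """The query is an image; the answer is the names of the closest images."""
    q = signature(query_pixels)
    ranked = sorted((distance(q, sig), name) for name, sig in index.items())
    return [name for _, name in ranked[:top_k]]


# Toy database of three tiny 'images', each given as a flat list of pixels.
images = {
    "red": [(250, 10, 10)] * 16,
    "dark_red": [(200, 20, 20)] * 12 + [(30, 30, 30)] * 4,
    "blue": [(10, 10, 250)] * 16,
}
index = build_index(images)
print(search(index, [(240, 0, 0)] * 16, top_k=2))  # ['red', 'dark_red']
```

Even in this toy form, every pixel of every image has to be read once to build the index, and the query is compared against every stored signature, which gives an idea of why the approach becomes so demanding at the scale of millions of images.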
... made possible today by access to ultra-high computing capacity
In the FAME2 project, which the CEA is part of, researchers have had access to significant High-Performance Computing (HPC) resources for the purpose of testing the Piria image search application on an extremely large-scale database.
As part of the testing process, the Piria engine code had to be adapted for the parallel architecture of the supercomputer developed by Bull (consisting of 88 Intel® Itanium® processor cores and 50 Terabytes of disc space), enabling integration of the database of 22 million images, occupying some 2.9 Terabytes. This initiative was led by the CEA/DAM⁴, and involved close collaboration between the CEA LIST teams and Bull. The results of this development were presented during the summer of 2007: the 22 million images were indexed in less than one week of processing time, using 48 of the supercomputer's Intel® Itanium® processor cores; once the database was indexed, users could submit a query from their browser application and obtain almost instantaneous responses.
A world record performance
The Piria engine completes an image search among 22 million images in just six seconds, i.e. about 3.7 million images per second. This is five times faster than previously, the point of comparison being a search for an image among 11 million using the Cortina system, a content-based search engine accessible via the Internet and developed by the University of California at Santa Barbara (UCSB). This benchmark was one of the major challenges that the FAME2 project set itself.
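The throughput figure follows directly from the two numbers quoted above. The short calculation below is only a check of that arithmetic; the variable names are illustrative.

```python
# Figures quoted in the announcement.
images_searched = 22_000_000
elapsed_seconds = 6.0

# Throughput = images compared per second of search time.
throughput = images_searched / elapsed_seconds
print(f"{throughput / 1e6:.1f} million images per second")
# prints "3.7 million images per second"
```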
This success demonstrates the power of these image recognition technologies developed by the CEA LIST on very large databases occupying several terabytes. These technologies are marketed by the company NewPhenix⁵.
¹ CEA LIST: the CEA's systems and technology integration laboratory
² Within the SYSTEM@TIC PARIS-REGION competitiveness cluster, the FAME2 project aims to develop architectures for high-performance parallel processing and make them available to businesses so they can validate their future computing requirements. The project partners - Bull (project co-ordinator), the French petroleum institute (l'Institut Français du Pétrole), the Ecole Centrale Paris, INRIA, Dassault Aviation, ILOG, INT Evry, NewPhenix, the University of Versailles Saint-Quentin (UVSQ), CAPS and the CEA - are pursuing this collaboration as part of the POPS (Peta Operations per Second) project.
³ Piria web site
⁴ CEA/DAM: the CEA's military applications directorate