Name : Bertrand Delezoide

Institution : CEA LIST


Dr. Bertrand Delezoide received his MS and PhD degrees in Signal Processing and Informatics from the Université Pierre et Marie Curie, Paris, France, in 2001 and 2006, respectively. During 2004-2005 he was an assistant professor at the Université Pierre et Marie Curie. He is currently a research engineer at CEA LIST. His research interests are in heterogeneous multimedia document processing: automatic video and audio indexing, automatic video segmentation, speech-to-text, and video retrieval.



Publications :

Delezoide, B. Modèles d'indexation multimédia pour l'analyse automatique de films de cinéma. Ph.D. Thesis, Université Pierre et Marie Curie, Paris, France, 2006.

Delezoide, B., Le Borgne, H. SemanticVox: A multilingual video search engine. Proc. of the ACM International Conference on Image and Video Retrieval (CIVR 2007), Amsterdam, The Netherlands, July 9-11, 2007.




Title of Project : M.E., a multimedia indexing and search engine


Access to content on public or private networks (e.g., the Internet or ADSL television) has changed deeply and rapidly, driven by the massive efforts undertaken to digitize existing documents and by the emergence of a wide range of media. The amount of data (texts, photographs and videos) and the presence of highly heterogeneous multimedia documents on the Internet have led to the development of new systems for accessing large databases of documents drawn from these virtual worlds.


Multimedia Engine (ME) is a collection of software and resources for analyzing and exploring large multimedia corpora. It contains the following modules:


A heterogeneous multimedia document analyzer: from a multimedia document such as a Web page (which can contain texts, photos and videos), this tool creates a digital signature of the document describing its structure and content. Structure and content are extracted by applying processes specific to each medium. The signatures are then indexed in SQL databases and files.

A multimedia search engine: this tool retrieves documents that are similar to a query provided by the user (e.g., an image, a video, a sentence, or a Web page). The main novelty of ME is the combination and fusion of several retrieval techniques that take into account not only the text of the documents, but also the structure and content of the other media (image and video).
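
To illustrate what fusing several retrieval techniques can look like, the sketch below combines per-medium similarity scores with a weighted sum (often called late fusion). It is not ME's actual method, which is not detailed here: the function names `fuse_scores` and `rank`, the medium labels, the min-max normalization and the weights are all assumptions chosen for illustration.

```python
from typing import Dict, List


def fuse_scores(
    per_medium: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted late fusion (illustrative assumption, not ME's method).

    per_medium maps a medium name (e.g. "text", "image") to the
    similarity scores that medium's retrieval technique gave each document.
    Scores are min-max normalized per medium, then summed with weights.
    """
    fused: Dict[str, float] = {}
    for medium, scores in per_medium.items():
        if not scores:
            continue
        lowest = min(scores.values())
        span = max(scores.values()) - lowest
        weight = weights.get(medium, 0.0)  # unknown media contribute nothing
        for doc_id, score in scores.items():
            # A medium whose scores are all equal carries no ranking signal.
            normalized = (score - lowest) / span if span > 0 else 0.0
            fused[doc_id] = fused.get(doc_id, 0.0) + weight * normalized
    return fused


def rank(fused: Dict[str, float]) -> List[str]:
    """Return document ids by decreasing fused score (ties broken by id)."""
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))


if __name__ == "__main__":
    # Hypothetical scores for three documents from two retrieval techniques.
    scores = {
        "text": {"doc_a": 2.0, "doc_b": 1.0, "doc_c": 0.0},
        "image": {"doc_a": 1.0, "doc_b": 3.0},
    }
    print(rank(fuse_scores(scores, {"text": 0.5, "image": 0.5})))
```

A document missing from one medium's result list simply receives no contribution from that medium, which is one simple way to handle heterogeneous documents that lack a given medium.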

We will present three applications of ME. SemanticVox: a cross-lingual automatic video indexing and retrieval system, based on speech transcripts and video analysis. VIP: a celebrity video search engine. WikipediaMM: the results obtained by our algorithms in the international evaluation campaign on visual information retrieval from a collection of Wikipedia images.