Name : Hervé Bredin

Institution : CNRS / IRIT


Hervé BREDIN was born in Cholet (France) in 1981 and is a French citizen. In 2004, he received his engineering diploma from the "Grande Ecole" Telecom ParisTech, focusing mostly on signal and image processing, pattern recognition and human-computer interaction. He was then a PhD candidate at the Signal and Image Processing Department, under the supervision of Dr. Gérard Chollet, until 2007, when he successfully defended his PhD thesis on biometrics and, more precisely, on audio-visual identity verification based on talking faces and its robustness to high-effort forgery (such as replay attacks, face animation or voice transformation). With this thesis, he won the EBF European Biometric Industry Award 2007. In 2008, he was a postdoctoral researcher with the Center for Digital Video Processing at Dublin City University, where he investigated statistical methods for the automatic summarization of raw or user-generated video content. Since October 2008, he has been a permanent researcher (Chargé de Recherche) at the National Center for Scientific Research (CNRS, http://www.cnrs.fr/), working at the Institut de Recherche en Informatique de Toulouse (IRIT), France.


Publication :

H. Bredin and G. Chollet, Making Talking-Face Authentication Robust to Deliberate Imposture, in ICASSP 2008, IEEE International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, USA, 2008






Title of Project : Making Talking-Face Authentication Robust to Deliberate Imposture


Numerous studies have exposed the limits of biometric identity verification based on a single modality (such as fingerprint, iris, hand-written signature, voice or face). The talking-face modality, which includes both face recognition and speaker verification, is a natural choice for multimodal biometrics.


Talking faces provide richer opportunities for verification than ordinary multimodal fusion does. The signal contains not only voice and image but also a third source of information: the simultaneous dynamics of these features, since natural lip motion and the corresponding speech signal are synchronized.


However, this specificity is often overlooked, and most existing talking-face authentication systems are based on fusing the scores of two separate modules, one for face verification and one for speaker verification. Even though this prevalent paradigm may lead to the best performance in widespread evaluation frameworks based on random-impostor scenarios, we expose its weakness when confronted with realistic deliberate impostors.
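The prevalent score-fusion paradigm can be sketched as follows. This is a minimal illustration only, not the system described in the thesis; the weight `alpha` and the decision threshold are hypothetical parameters that would normally be tuned on development data:

```python
# Minimal sketch of score-level fusion for talking-face authentication.
# Hypothetical parameters: `alpha` (modality weight) and `threshold`
# would be tuned on a development set in a real system.

def fuse_scores(face_score: float, speaker_score: float,
                alpha: float = 0.5) -> float:
    """Weighted-sum fusion of two unimodal verification scores."""
    return alpha * face_score + (1.0 - alpha) * speaker_score

def accept(face_score: float, speaker_score: float,
           threshold: float = 0.5) -> bool:
    """Accept the identity claim when the fused score exceeds the threshold."""
    return fuse_scores(face_score, speaker_score) > threshold
```

Because the decision depends only on the two unimodal scores, a deliberate impostor who defeats each module separately (e.g. with a replayed face video and a transformed voice) also defeats the fused system.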


To deal with deliberate impostors, a client-dependent audiovisual synchrony measure is introduced. A new fusion strategy is then proposed, and its performance against both random and deliberate impostors is studied.
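The idea behind an audiovisual synchrony measure can be illustrated with a deliberately simple stand-in: the correlation between a per-frame audio-energy trajectory and a lip-opening trajectory. This is only a sketch of the principle, not the client-dependent measure of the thesis:

```python
import numpy as np

# Illustrative audiovisual synchrony score: Pearson correlation between
# a per-frame audio-energy track and a lip-opening track.  A simplified
# stand-in for the client-dependent measure of the thesis, shown only to
# illustrate why replay attacks break audiovisual synchrony.

def synchrony_score(audio_energy: np.ndarray, lip_opening: np.ndarray) -> float:
    """Correlation between time-aligned audio and visual feature tracks."""
    a = (audio_energy - audio_energy.mean()) / audio_energy.std()
    v = (lip_opening - lip_opening.mean()) / lip_opening.std()
    return float(np.mean(a * v))
```

A forgery that pairs a genuine face video with a different utterance (a typical replay attack) tends to destroy this synchrony and thus lowers the score, which is what makes such a measure useful against deliberate impostors.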