| Project title |
Dynamic Visual Features and Improved Audio-Visual Fusion for Automatic Speech Recognition |
| Summary |
Human speech is bimodal in nature. Incorporating visual features in Automatic Speech Recognition systems can improve performance in real environments. This work addresses core challenges in audio-visual speech recognition. It will develop new dynamic visual features that better capture the correlations in the key mouth movements humans use in lipreading, which is crucial to improving Hidden Markov Model performance. It will also explore a new audio-visual fusion strategy, motivated by the differing visibility of visemes, that allows the influence of the audio and video streams to change over time. |
| Funding Agency |
SFI |
| Programme |
RFP |
| Type of Project |
|
| Date from |
Oct. 2009 |
| Date to |
Sept. 2013 |
| Person Months |
|
|
|
| Project title |
Robust Speaker Verification |
| Summary |
Biometrics involves the use of intrinsic physical or behavioural traits of humans to verify their identity. Traits used in biometrics typically include face, fingerprints, hand geometry, handwriting, iris, retina, vein, and voice. Many are concerned that these technologies are potentially invasive and open to fraud. Speaker verification, using voice or voice and video, has been recognised as an important alternative in the world of biometrics. It is less invasive and requires less expensive installations than iris and fingerprint authentication systems.
The changes that occur in the human voice due to ageing have been well documented. The impact of these changes on speaker verification is less clear. In this work, we examine the effect of long-term vocal ageing on speaker verification systems. |
| Funding Agency |
IRCSET |
| Programme |
|
| Type of Project |
|
| Date from |
2009 |
| Date to |
2012 |
| Person Months |
36 |
|
|
| Project title |
Audio-Visual Fusion for Human Computer Interaction |
| Summary |
This project will focus on key challenges in audio-visual speech recognition:
• Given state-of-the-art audio and visual features, do early or late integration strategies work better?
• How well does such an integration scheme translate to less controlled situations, where the speech is less constrained, intonation or prosody is more natural, or the speech is emotionally influenced?
• Can these algorithms work on a real handheld device?
|
| Funding Agency |
IRCSET |
| Programme |
|
| Type of Project |
|
| Date from |
2011 |
| Date to |
2014 |
| Person Months |
|
|
|
| Project title |
Speech Quality for VoIP |
| Summary |
This project is developing new metrics to measure speech quality for VoIP applications, particularly Google Chrome WebRTC. |
| Funding Agency |
Google Inc |
| Programme |
|
| Type of Project |
Industrially sponsored research |
| Date from |
April 2011 |
| Date to |
April 2012 |
| Person Months |
12 |
|
|
Daire Lennon, Naomi Harte, Anil Kokaram, Rozenn Dahyot, Francois Pitie, Action Recognition in Multimedia Streams, in Petros Maragos, Alexandros Potamianos, Patrick Gros (eds.), Multimodal Processing and Interaction, Multimedia Systems and Applications (Book Series), Springer Verlag, 2008 |
David Corrigan, Naomi Harte, Anil Kokaram, Pathological Motion Detection for Robust Missing Data Treatment, EURASIP Journal on Advances in Signal Processing, 2008, Article ID 542436 |
David Corrigan, Naomi Harte, Anil Kokaram, Automated Segmentation of Torn Frames using the Graph Cuts Technique, IEEE International Conference on Image Processing (ICIP 2007), San Antonio, TX, USA, Sept. 16-Oct. 19, 2007, pp. 557-560 |
Naomi Harte, Andrew Rankin, Gary Baugh, Anil Kokaram, Detection of Illegal Dumping from CCTV at Recycling Centres, International Machine Vision and Image Processing Conference, Kildare, Ireland, 5-7 Sept. 2007, p. 204 |
David Corrigan, Naomi Harte, Anil Kokaram, Pathological Motion Detection for Robust Missing Data Treatment in Degraded Archived Media, IEEE International Conference on Image Processing 2006, Atlanta, GA, 8-11 Oct. 2006, pp. 621-624 |
Contact: helpdesk@tcd.ie Last Updated: 19-JUN-2013 |