The speech material
A description of the material can be found on the Description of the materials page.
- Training material (clean) (163 Mb)
- Test material (138 Mb)
- Development material (67 Mb)
- Test material as stereo (left=noise, right=speech) (198 Mb)
- Development material as stereo (left=noise, right=speech) (102 Mb)
The .zip files of both sets of test material and of the development material each contain seven directories, named testsetX and devsetX respectively, where X is the number of the noise type in the table on the Description of the materials page. In addition, test.zip contains seven directories named testsetXp. These hold a small number of practice stimuli, which can be used to give listeners a short practice session in perceptual experiments. If you are not running perceptual experiments, you can ignore these directories. Finally, test.zip and dev.zip also contain six files named testsetX_offsets.dat and devsetX_offsets.dat respectively. These are (MATLAB) files giving the offset of each VCV into the noise. People building ASR systems are not allowed to use these data.
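The naming scheme above can be used to sort archive members by condition. The following Python sketch is only an illustration: it assumes the members appear under paths that start with the directory names given above, and the function names and the example file names are hypothetical. It leaves out the offsets files (which ASR systems may not use) and, by default, the practice directories.

```python
import re
from collections import defaultdict

# Patterns follow the naming described in the text; the exact layout
# inside the archives is an assumption.
_DIR = re.compile(r"^(testset|devset)(\d+)(p?)$")
_OFFSETS = re.compile(r"^(testset|devset)(\d+)_offsets\.dat$")


def classify_entry(path):
    """Return (kind, set_number) for an archive member path, or None."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        return None
    # Offsets files are recognised by their file name, wherever they sit.
    m = _OFFSETS.match(parts[-1])
    if m:
        return ("offsets", int(m.group(2)))
    # Everything else is classified by its top-level directory.
    m = _DIR.match(parts[0])
    if m:
        if m.group(3):
            return ("practice", int(m.group(2)))
        kind = "test" if m.group(1) == "testset" else "dev"
        return (kind, int(m.group(2)))
    return None


def group_members(names, include_practice=False):
    """Group file paths by (kind, set_number), skipping offsets files."""
    groups = defaultdict(list)
    for name in names:
        if name.endswith("/"):
            continue  # skip bare directory entries
        info = classify_entry(name)
        if info is None:
            continue
        kind, _ = info
        if kind == "offsets":
            continue  # not to be used when building ASR systems
        if kind == "practice" and not include_practice:
            continue
        groups[info].append(name)
    return dict(groups)
```

The list of names could come from `zipfile.ZipFile("test.zip").namelist()` in the standard library. Pass `include_practice=True` when preparing a perceptual experiment.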
Phoneme segmentation data
- 91 hand-segmented VCVs: in HTK format. This set consists of at least three items per consonant, in contexts in which the first and second vowels were identical, plus 19 randomly selected VCVs.
- Updated [05-03-2008] Automatically generated phoneme segmentation of the clean training material: in HTK format.
- Updated [06-03-2008] Automatically generated phoneme segmentations of the test data: in HTK format (.zip).
The models of the MFCC-based baseline recognition system (see the description of the model set) were used to create phoneme segmentations of the clean test data using forced alignment.
Warning: Everyone is invited to use these segmentations, but please be aware that they are not ‘perfect’ segmentations. If you obtain better segmentations and would like to share them, please send them to the organisers by e-mail and we will post them on this website.
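For readers unfamiliar with HTK label files, the sketch below shows one way to read such a segmentation in Python. It assumes the standard HTK label layout, in which each line holds a start time, an end time and a label, with times given in units of 100 ns. Whether the files on this page are individual label files or a master label file is not stated here, so the parser simply skips any line that does not match that layout. The names `Segment` and `parse_htk_labels` are hypothetical.

```python
from typing import List, NamedTuple

HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are in units of 100 ns


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    label: str


def parse_htk_labels(text: str) -> List[Segment]:
    """Parse 'start end label' lines; other lines are ignored."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # e.g. master label file headers or '.' terminators
        if not (fields[0].isdigit() and fields[1].isdigit()):
            continue
        start = int(fields[0]) / HTK_UNITS_PER_SECOND
        end = int(fields[1]) / HTK_UNITS_PER_SECOND
        segments.append(Segment(start, end, fields[2]))
    return segments


# Made-up label data, for illustration only.
example = "0 1500000 a\n1500000 2300000 b\n2300000 4000000 a\n"
for seg in parse_htk_labels(example):
    print(f"{seg.label}: {seg.start:.3f} s to {seg.end:.3f} s")
```

Converting to seconds on load makes it easier to compare an automatic segmentation against a hand segmentation of the same VCV.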