SpeM: Introduction

This page describes SpeM, a computational model of human speech recognition. SpeM (Speech based Model of speech recognition) was originally implemented as a research tool for the field of human speech recognition (HSR). It is a new and extended implementation of the theory underlying the Shortlist model (Norris, 1994), a computational model of human word recognition. The main advance of SpeM over earlier computational models of HSR is that it takes the acoustic speech signal as input, whereas Shortlist and other computational models of HSR accept only handcrafted symbolic representations.

SpeM consists of two modules:

  • An automatic phone recogniser (APR)
  • A word search module

The word search module parses the probabilistic phone graph created by the APR to find the most likely word or sequence of words, and computes an activation for each word based on the accumulated acoustic evidence for that word.
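To make the idea of searching a phone graph for the most likely word sequence more concrete, here is a minimal Python sketch. It is not SpeM's algorithm or input format: the edge tuples, the toy lexicon, the cost values, and the function names match_word and best_word_sequence are all assumptions made for illustration. It uses a simple dynamic programme over a graph whose nodes are assumed to be numbered in topological order, and it only finds the cheapest word sequence; it does not compute word activations.

```python
from collections import defaultdict

# Hypothetical toy phone graph: (start_node, end_node, phone, cost).
# Costs stand in for negative log likelihoods, so lower is better.
EDGES = [
    (0, 1, "k", 1.0), (0, 1, "g", 2.5),
    (1, 2, "ae", 0.5),
    (2, 3, "t", 0.7), (2, 3, "p", 1.2),
    (3, 4, "n", 0.3), (3, 4, "m", 0.9),
    (4, 5, "ae", 0.4),
    (5, 6, "p", 0.6), (5, 6, "t", 1.1),
]

# Hypothetical lexicon mapping each word to its (non-empty) phone sequence.
LEXICON = {
    "cat": ["k", "ae", "t"], "cap": ["k", "ae", "p"], "gap": ["g", "ae", "p"],
    "nap": ["n", "ae", "p"], "map": ["m", "ae", "p"], "mat": ["m", "ae", "t"],
}


def match_word(out_edges, start, phones):
    """Return {end_node: cost} for the cheapest paths from start spelling phones."""
    frontier = {start: 0.0}
    for phone in phones:
        reached = {}
        for node, cost in frontier.items():
            for end, label, edge_cost in out_edges.get(node, []):
                if label == phone:
                    total = cost + edge_cost
                    if total < reached.get(end, float("inf")):
                        reached[end] = total
        frontier = reached
        if not frontier:
            break  # no path in the graph spells this word from start
    return frontier


def best_word_sequence(edges, lexicon, start, final):
    """Return (cost, words) for the cheapest word sequence from start to final.

    Assumes every edge goes from a lower to a higher node number, so that
    visiting nodes in sorted order is a valid topological order.
    """
    out_edges = defaultdict(list)
    for s, e, label, cost in edges:
        out_edges[s].append((e, label, cost))

    best = {start: (0.0, [])}
    for node in sorted({s for s, _, _, _ in edges} | {start}):
        if node not in best:
            continue  # no complete word sequence ends at this node
        cost_so_far, words = best[node]
        for word, phones in lexicon.items():
            for end, cost in match_word(out_edges, node, phones).items():
                total = cost_so_far + cost
                if end not in best or total < best[end][0]:
                    best[end] = (total, words + [word])
    return best.get(final)


result = best_word_sequence(EDGES, LEXICON, start=0, final=6)
print(result[1], f"{result[0]:.1f}")
```

On this toy graph the cheapest sequence is "cat" followed by "nap". A real phone graph would be far larger, and a practical search would also need pruning and a way to handle phones that match no word.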

The word search module does not require an APR created with the same software package (Phicos; Steinbiss et al., 1993) that was used in the original SpeM experiments. It has also been used successfully in combination with HTK (Young et al., 2002) and MIT’s SUMMIT recognition system (Glass, 2003). The only requirement is that the input graphs have the same structure as the example file below.

If you have used SpeM in your research, please cite it using the reference given under ‘Additional information’ below. If you have any questions about using SpeM, please do not hesitate to contact me.