
Language-Independent Automatic Digits-In-Noise Test: Aladdin

Aladdin: Automatic LAnguage-independent Development of the Digits-In-Noise test

Authors: Sigrid Polspoel (1), Sophia E. Kramer (1), Bas Van Dijk (2), Cas Smits (1)

(1) Amsterdam UMC, Vrije Universiteit Amsterdam, Otolaryngology – Head and Neck Surgery, Ear & Hearing, Amsterdam Public Health research institute, De Boelelaan 1117, Amsterdam, Netherlands

(2) Cochlear, Advanced Innovation – Algorithms and Application – Cochlear Technology Centre, Schaliënhoevedreef 20i, 2800 Mechelen, Belgium

Background The digits-in-noise (DIN) test is a successful hearing test that is used as a screening instrument, as a diagnostic tool in clinics, and as a self-administered home test for cochlear implant (CI) users. A current limitation is that the speech stimuli are language specific, so the test has to be developed separately for each language. This makes development time-consuming and expensive, and leaves room for improvement. Another limitation is that the DIN test is not customized for CI users, which yields less accurate test results in this group. This project will tackle these issues by applying artificial intelligence techniques to automate the entire development procedure.

Goal The aim of the Automatic LAnguage-independent Development of the Digits-In-Noise test (Aladdin) project is to create a test development procedure for the automatic generation of digits-in-noise tests. This procedure will employ text-to-speech (TTS) and automatic speech recognition (ASR) systems to design DIN tests in various languages and for different target populations, such as CI users. Because all new DIN tests will follow the same development procedure, test results will become more comparable across languages than they currently are. Moreover, this project has the potential to make the DIN test affordable for low- and middle-income countries by drastically reducing development costs.

Method Multiple studies will be conducted to assess whether the current development procedure (Smits et al., 2013) can be replaced by an automatic one. First, we will evaluate whether speech produced by a TTS system can replace a human voice in the context of hearing tests. Next, speech recognition functions of the speech items will be obtained from three target groups (normal-hearing listeners, listeners with hearing loss, and CI users) to serve as a reference for the ASR system. Finally, ASR systems will be trained to construct speech recognition functions of synthesized speech material, including stimuli that have been processed by a CI processor. The speech recognition functions of the ASR systems will be compared to those obtained in the study with human listeners. The ultimate result is a system in which the TTS system creates the spoken digits and the ASR system equalizes recognition of the individual digits, resulting in accurate DIN tests in any language (Figure 1). We aim to complete the Aladdin project by the end of 2023.
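As an illustration of what a speech recognition function is, the sketch below fits a logistic curve to the proportion of correct responses measured at several signal-to-noise ratios (SNRs) for one digit. The logistic form, the logit-linear least-squares fit, the function names, and the example scores are all assumptions made for illustration; the abstract does not specify the model or the fitting method used in the project.

```python
import math

def logistic(snr_db, srt_db, slope):
    """Proportion correct at snr_db.

    srt_db is the SNR giving 50% correct; slope is the slope at that
    point in proportion per dB (assumed parameterisation).
    """
    return 1.0 / (1.0 + math.exp(-4.0 * slope * (snr_db - srt_db)))

def fit_recognition_function(snrs_db, proportions, eps=0.01):
    """Fit a logistic by a straight-line fit on logit-transformed scores.

    Returns (srt_db, slope). Scores are clipped to [eps, 1 - eps] so the
    logit stays finite; this is a simplification, not a maximum-likelihood fit.
    """
    clipped = [min(max(p, eps), 1.0 - eps) for p in proportions]
    logits = [math.log(p / (1.0 - p)) for p in clipped]
    n = len(snrs_db)
    mean_x = sum(snrs_db) / n
    mean_y = sum(logits) / n
    sxx = sum((x - mean_x) ** 2 for x in snrs_db)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(snrs_db, logits))
    b = sxy / sxx              # logit units per dB, equals 4 * slope
    a = mean_y - b * mean_x    # intercept, equals -4 * slope * srt
    return -a / b, b / 4.0

# Hypothetical scores for one digit (made-up numbers, not project data).
snrs = [-14.0, -12.0, -10.0, -8.0, -6.0, -4.0]
scores = [0.05, 0.15, 0.40, 0.70, 0.90, 0.97]
srt, slope = fit_recognition_function(snrs, scores)
print(f"SRT = {srt:.2f} dB SNR, slope = {slope:.3f} per dB")
```

The same routine could be applied to scores from human listeners and to scores from an ASR system, so that the two sets of fitted parameters can be compared digit by digit.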


Figure 1. A diagram of the Aladdin procedure. First, digits are produced by a TTS system in any language and then modified by an audio processor. For the CI user tests, the digits are also processed by a CI processor. Next, digit triplets and masking noise are constructed. Subsequently, the ASR system determines the speech recognition functions of the processed speech material. Based on these, level corrections are made (by the audio processor, hence the two-sided arrow) to make the speech items equally intelligible.
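The level-correction step can be sketched as follows. The sketch assumes that each digit is summarised by a speech reception threshold (SRT, the SNR at 50% correct) and that raising a digit's level by g dB lowers its SRT by g dB; the function name and the SRT values are hypothetical, and the actual correction rule used in the project is not stated here.

```python
def level_corrections(srts_db):
    """Return a gain in dB per digit that moves every SRT to the mean SRT.

    A digit that is harder than average (higher SRT) gets a positive gain,
    an easier digit gets a negative gain.
    """
    mean_srt = sum(srts_db.values()) / len(srts_db)
    return {digit: srt - mean_srt for digit, srt in srts_db.items()}

# Hypothetical per-digit SRTs in dB SNR (made-up numbers, not project data).
srts = {"one": -10.0, "two": -8.0, "three": -9.0}
gains = level_corrections(srts)
for digit, gain in gains.items():
    print(f"{digit}: apply {gain:+.1f} dB")
```

In practice the corrected material would be measured again, since a single pass only equalizes the digits to the extent that the assumed one-to-one shift between level and SRT holds.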


References:

Smits C., Goverts S. T., Festen J. M. The digits-in-noise test: Assessing auditory speech recognition abilities in noise. J. Acoust. Soc. Am. 133, 1693–1706 (2013).
