Factors Influencing the Korean Version of the Digit-in-Noise Test
Abstract
Background and Objectives
The digits-in-noise (DIN) test was developed as a simple and time-efficient hearing-in-noise test worldwide. The Korean version of the DIN (K-DIN) test was previously validated for both normal-hearing and hearing-impaired listeners. This study aimed to explore the factors influencing the outcomes of the K-DIN test further by analyzing the threshold (representing detection ability) and slope (representing test difficulty) parameters for the psychometric curve fit.
Subjects and Methods
In total, 35 young adults with normal hearing participated in the K-DIN test under the following four experimental conditions: 1) background noise (digit-shaped vs. pink noise); 2) gender of the speaker (male vs. female); 3) ear side (right vs. left); and 4) digit presentation levels (55, 65, 75, and 85 dB). The digits were presented using the method of constant stimuli. Participant responses to the stimulus trials were used to fit a psychometric function, and the threshold and slope parameters were estimated according to pre-determined criteria. Goodness of fit was assessed using the root-mean-square error.
Results
The listeners’ digit detection ability (threshold) was slightly better with pink noise than with digit-shaped noise, with similar test difficulties (slopes) across the digits. Speaker gender and the tested ear side influenced neither the detection ability nor the task difficulty. Additionally, lower presentation levels (55 and 65 dB) elicited better thresholds than the higher presentation levels (75 and 85 dB); however, the test difficulty varied slightly across the presentation levels.
Conclusions
The K-DIN test can be influenced by stimulus factors. Continued research is warranted to understand the accuracy and reliability of the test better, especially for its use as a promising clinical measure.
Introduction
Hearing impairment is a leading cause of disability and a significant public health burden in aging societies [1]. The number of hearing-impaired (HI) patients is increasing rapidly, and sufficient evidence demonstrates significant associations with depression, hospitalization risk, and cognitive disorders [2–5]. One of the first signs of early sensorineural hearing loss is difficulty in challenging listening situations [6]. Thus, an efficient speech-recognition-in-noise test is essential for earlier detection of hearing disorders.
Many tests exist for assessing speech recognition in noise, differing in the complexity of their speech material, context level, signal-to-noise ratio (SNR), and presentation level; examples include the Hearing-in-Noise Test, the Quick Speech-in-Noise test, the Matrix Sentence Test, and the Speech Perception in Noise test [7–10]. However, many of these speech-in-noise tests take a long time to administer and do not measure hearing alone, because sentence and word recognition also depend on memory and linguistic ability.
In 2013, Smits, et al. [8] developed a digits-in-noise (DIN) test as an alternative to the previous standard Dutch speech-in-noise test. Because digits are easy and familiar speech material, the DIN test is relatively immune to learning effects, linguistic ability, and other personal factors [11]. The participants use a digital keyboard to perform the test, allowing automation of the procedure. This simple and efficient speech-in-noise test in Dutch was successfully used as a screening test and was used in over 65,000 telephone calls during the first four months before it was implemented as an online test. Successful test performance led to development of the DIN in diverse languages. In the last decade, worldwide studies have shown the effectiveness of the DIN test in British English [12], American English [13], German [14], French [15], Mandarin [16], Polish [17], Russian [18], and Spanish [19]. The Korean version of the DIN (K-DIN) test was recently developed and optimized for mobile testing devices for normal hearing (NH) adults [20] and also validated in both NH and HI listeners by comparing with the previous Korean speech perception-in-noise test [21].
There has been a growing interest in increasing the efficiency and sensitivity of the DIN test under various conditions. Smits [22] standardized the number of digit triplets and used inter-aural antiphasic digits to improve the sensitivity of the test [23]. Low-pass filtered masking noise improved the sensitivity of the DIN test for high-frequency hearing loss groups [24].
As an extension of our previous K-DIN validation study [21], we tested the K-DIN under four condition modifications to identify factors influencing K-DIN performance: 1) type of background noise, 2) speaker gender, 3) test ear, and 4) stimulus presentation level. These four conditions are important factors in diagnostic hearing evaluations. The findings from this study could provide standard criteria for future use of the K-DIN test as a tool in clinical diagnosis.
Subjects and Methods
Subjects
The procedures were approved by the Institutional Review Board of Hallym University (#HIRB-2020-067), and the experiments were conducted in compliance with the Declaration of Helsinki and the International Conference on Harmonization Guidelines for Good Clinical Practice. All participants provided consent before taking part.
Twenty young adults with NH were involved in Experiments I, II, III, and IV. Participants ranged in age from 23 to 28 years (mean±standard deviation: 24.15±1.35 years), and 10 of the 20 participants were female. Normal hearing was defined as air conduction pure-tone threshold ≤25 dB hearing level (HL) averaged across 0.5, 1, 2, and 4 kHz (weighted four-frequency average). Mean pure-tone averages (PTAs) were 1.76±2.73 and 1.87±2.91 dB HL for the left and right ears, respectively. An additional 15 young adults with NH participated in Experiment III only, for a total of 35 young adults. The 35 participants ranged in age from 21 to 28 years (mean±standard deviation: 24.54±1.50 years), and 20 of the 35 were female. Mean PTAs were 1.86±2.96 and 1.93±3.03 dB HL for the left and right ears, respectively.
Because the total number of subjects was small and the additional 15 young adults participated only in Experiment III, normality was tested separately for the data from the 20-adult group and the 35-adult group. The Shapiro-Wilk and Kolmogorov-Smirnov tests were administered, and the data from both listener groups were normally distributed. The significance level was set at 0.05.
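The inclusion criterion above can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' screening procedure: the text mentions a "weighted" four-frequency average without stating weights, so equal weights are assumed here, as is the requirement that both ears meet the criterion. The audiogram values are hypothetical.

```python
from statistics import mean

PTA_FREQUENCIES_KHZ = (0.5, 1, 2, 4)
NORMAL_HEARING_CUTOFF_DB_HL = 25


def pure_tone_average(thresholds_db_hl):
    """Mean threshold (dB HL) over the four PTA frequencies.

    Assumption: equal weights, because the source gives no weighting scheme.
    """
    return mean(thresholds_db_hl[f] for f in PTA_FREQUENCIES_KHZ)


def has_normal_hearing(left, right):
    """Assumption: both ears must have a PTA at or below the 25 dB HL cutoff."""
    return all(
        pure_tone_average(ear) <= NORMAL_HEARING_CUTOFF_DB_HL
        for ear in (left, right)
    )


# Hypothetical audiograms: frequency in kHz -> threshold in dB HL.
left_ear = {0.5: 5, 1: 0, 2: 0, 4: 5}
right_ear = {0.5: 0, 1: 5, 2: 0, 4: 5}
impaired_ear = {0.5: 20, 1: 25, 2: 30, 4: 40}

left_pta = pure_tone_average(left_ear)
```

With these made-up values, the left-ear PTA is 2.5 dB HL, close to the group means reported above, while the hypothetical impaired ear averages 28.75 dB HL and would fail the criterion.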
Stimuli and procedure
All experiments were conducted in a single-walled, sound-attenuated booth. Signals were generated at a sampling rate of 44.1 kHz with MATLAB (version R2020b; MathWorks, Natick, MA, USA) on a Surface Pro touchscreen laptop (Microsoft, Redmond, WA, USA). To control the sound level of the stimuli precisely, an RME Babyface Pro soundcard (RME Audio, Haimhausen, Germany) was connected to the Surface Pro, and stimuli were presented through headphones (Sennheiser Electronics GmbH & Co. KG, Wedemark-Wennebostel, Germany). Each headphone’s frequency response was equalized using calibration measurements obtained with a Brüel & Kjær sound level meter (Brüel & Kjær Sound & Vibration Measurement A/S, Nærum, Denmark) with a 1-inch microphone in an artificial ear. All participants entered their responses on the laptop touchscreen, which displayed the MATLAB software interface.
The 10 digits (0–9) were recorded by Korean speakers and used as target stimuli. Specifically, eight trained professional actors (4 females: F0=224.8±18.1 Hz; 4 males: F0=172.3±44.5 Hz) recorded all single digits. Fundamental frequency (F0), which represents voice pitch, was estimated using the cepstrum algorithm in MATLAB, where the output is the Fourier transform of the log of the magnitude spectrum of the input waveform [25]. The F0 for each speaker was averaged across all of that speaker’s digit stimuli. After recording, all digits were equalized to the same root-mean-square level of −20 dB. Fig. 1 shows example spectrograms of digits 0 and 2 spoken by female speaker 1 (Supplementary Figs. 1–8 in the online-only Data Supplement), and power spectral densities of the two steady noises.
Digit-shaped noise was used as the common background noise throughout Experiments I–IV. The digit-shaped noise was generated with the same spectrum as the long-term averaged digit spectrum using Adobe Audition (Adobe, San Jose, CA, USA). The noise was gated on 50 msec before the onset of the digit stimulus and remained on for 1,500 msec. This ensured that the digit signals were presented within a steady background noise.
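The level equalization described above can be illustrated with a short Python sketch. It is not the authors' MATLAB code. The text does not state the reference for the −20 dB figure, so the sketch assumes dB relative to a full-scale amplitude of 1.0, and a synthetic tone stands in for a recorded digit.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def rms_db(samples):
    """RMS level in dB re full scale (assumed reference amplitude of 1.0)."""
    return 20.0 * math.log10(rms(samples))


def normalize_rms(samples, target_db=-20.0):
    """Scale samples so that their RMS level equals target_db.

    Note: an all-zero (silent) input would cause a division by zero.
    """
    target_rms = 10.0 ** (target_db / 20.0)
    gain = target_rms / rms(samples)
    return [s * gain for s in samples]


# Hypothetical stand-in for a recorded digit: 0.1 s of a 200-Hz tone at 44.1 kHz.
FS = 44100
tone = [0.5 * math.sin(2 * math.pi * 200 * n / FS) for n in range(FS // 10)]
equalized = normalize_rms(tone)
equalized_level_db = rms_db(equalized)
```

Applying the same target to every recording gives all digits an identical RMS level, so differences in percent correct across digits cannot be attributed to differences in overall level.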
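The noise gating can be expressed as sample indices at the 44.1 kHz sampling rate. The sketch below is illustrative only: the function names are hypothetical, and the 600-msec digit duration is an assumed value, since the source does not report digit durations.

```python
FS = 44100               # sampling rate in Hz (from the text)
NOISE_LEAD_MS = 50       # noise starts this long before the digit
NOISE_DURATION_MS = 1500  # total noise duration


def ms_to_samples(ms, fs=FS):
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(ms * fs / 1000)


def place_digit_in_noise(digit_len, fs=FS):
    """Return (digit_start, digit_end, noise_len) in samples.

    The digit begins NOISE_LEAD_MS after noise onset and must end before
    the noise does, so that it is fully embedded in the masker.
    """
    noise_len = ms_to_samples(NOISE_DURATION_MS, fs)
    start = ms_to_samples(NOISE_LEAD_MS, fs)
    end = start + digit_len
    if end > noise_len:
        raise ValueError("digit extends beyond the end of the noise")
    return start, end, noise_len


# Assumed digit duration of 600 msec (not reported in the source).
digit_len = ms_to_samples(600)
digit_start, digit_end, noise_len = place_digit_in_noise(digit_len)
```

Under these assumptions the noise spans 66,150 samples and the digit occupies samples 2,205 to 28,665, leaving steady noise on both sides of the digit.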
The method of constant stimuli was used to measure digit recognition. The target digit stimuli were fixed at a 65 dB sound pressure level (SPL) and embedded within a simultaneous background noise masker presented at one of five SNRs (−26, −20, −14, −8, and −2 dB). During the experiment, all participants were asked to listen carefully to the digit in the background noise and to enter on the screen the digit that they heard. The subject responses were averaged to calculate the percentage correct for each digit (0–9) in each experimental condition, as described below.
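Scoring under the method of constant stimuli amounts to tallying correct responses per digit and SNR. The following Python sketch shows one way to do this; the trial tuples are hypothetical responses made up for illustration, not data from the study.

```python
from collections import defaultdict


def percent_correct(trials):
    """Percent correct for each (digit, SNR) cell.

    trials: iterable of (presented_digit, snr_db, response_digit).
    Returns a dict mapping (digit, snr_db) to percent correct (0-100).
    """
    counts = defaultdict(lambda: [0, 0])  # [n_correct, n_total]
    for digit, snr_db, response in trials:
        tally = counts[(digit, snr_db)]
        tally[0] += int(response == digit)
        tally[1] += 1
    return {key: 100.0 * c / n for key, (c, n) in counts.items()}


# Hypothetical responses for two digits at two of the five SNRs.
trials = [
    (7, -26, 7), (7, -26, 1), (7, -26, 7), (7, -26, 7),
    (9, -26, 5), (9, -26, 9), (9, -26, 0), (9, -26, 3),
    (9, -2, 9), (9, -2, 9),
]
scores = percent_correct(trials)
```

Each digit then has one percent-correct value per SNR, which is the input needed for fitting a psychometric function to that digit.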
Experiment I: Effect of types of presenting noise
In Experiment I, two types of steady noise were used as background maskers: digit-shaped noise and pink noise. The pink noise masker was generated with cutoff frequencies of 100 Hz and 8,000 Hz. As with the digit-shaped noise, the pink noise was gated on 50 msec before the onset of the digit stimulus and remained on for 1,500 msec. The average spectral densities for the two noise maskers used in this study are shown in Fig. 1. The number of trials was 400 (10 digits×4 speakers×5 SNRs×2 noise conditions). Speaker gender was balanced by random sampling: 2 of the 4 female speakers and 2 of the 4 male speakers were used.
Experiment II: Effect of speaker gender
To examine the effect of speaker gender, all 8 speakers were included in the trials. Each digit stimulus was presented by 4 female and 4 male talkers at 5 SNRs. Thus, the number of trials in Experiment II was 400 (10 digits×8 talkers×5 SNRs).
Experiment III: Difference between ear sides
The stimuli and procedures for Experiment III were identical to those used in Experiment I. To evaluate the difference between the two ears, participants listened to 400 trials (10 digits×4 talkers×5 SNRs×2 monaural listening conditions) in digit-shaped noise.
Experiment IV: Effect of stimulus presentation level
The stimuli and procedures for Experiment IV were identical to those used in Experiment I. While target digit stimuli were fixed at 65 dB SPL throughout Experiments I–III, the K-DIN test was performed with three additional levels of digit stimuli (55, 75, and 85 dB SPL) in Experiment IV. Thus, the number of trials in Experiment IV was 600 (10 digits×4 talkers×5 SNRs×3 level conditions).
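The trial counts for the four experiments follow from fully crossing the factors listed above. A small Python sketch of that arithmetic is given below; the factor names and dictionary layout are illustrative choices, not part of the original test software.

```python
from itertools import product

DIGITS = range(10)
SNRS_DB = (-26, -20, -14, -8, -2)

# Experiment-specific factors, as described in the text.
designs = {
    "I": {"talker": range(4), "noise": ("digit-shaped", "pink")},
    "II": {"talker": range(8)},
    "III": {"talker": range(4), "ear": ("left", "right")},
    "IV": {"talker": range(4), "level_db_spl": (55, 75, 85)},
}


def trial_list(extra_factors):
    """Fully crossed list of trials: digit x SNR x experiment-specific factors."""
    names = ["digit", "snr_db", *extra_factors]
    levels = [DIGITS, SNRS_DB, *extra_factors.values()]
    return [dict(zip(names, combo)) for combo in product(*levels)]


trial_counts = {name: len(trial_list(factors)) for name, factors in designs.items()}
```

This reproduces the reported totals of 400 trials for Experiments I–III and 600 trials for Experiment IV. In practice the list would be shuffled before presentation, although the source does not describe the trial order.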
Results
Fig. 2 shows example raw data (symbols) and psychometric function fits (curves) for digits 0 and 9. Here, the psychometric function describes the relationship between SNR and subject response using a logistic function in the form of Eq. (1):
$$P(x) = \frac{100}{1 + e^{b(x - a)}} \qquad (1)$$
where P is the percent correct (0%–100%), x is the SNR, and “a” and “b” are the threshold and slope parameters, respectively. The threshold parameter “a” indicates detection ability for the digit stimulus and corresponds to the SNR at the 50% point on the psychometric function. The slope parameter “b” indicates the difficulty of the digit detection task and is represented by the slope of the psychometric function. For example, as shown in Fig. 2, digit 0 shows better detection ability (smaller SNR at threshold) than digit 9; however, the detection task is more difficult for digit 0 (shallower slope) than for digit 9. These threshold and slope parameters were estimated using the Curve Fitting Toolbox in MATLAB and interpreted throughout Experiments I–IV.
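To make the roles of the two parameters concrete, the sketch below fits a logistic function to percent-correct scores and reports the root-mean-square error. It rests on two assumptions: the logistic is taken to have the form P(x) = 100 / (1 + exp(b(x − a))), which is consistent with a 50% point at x = a and with the negative slope values reported here, and the fit uses a simple coarse-to-fine grid search rather than the MATLAB Curve Fitting Toolbox used in the study. The data points are synthetic, generated from known parameters.

```python
import math

SNRS_DB = (-26, -20, -14, -8, -2)


def logistic(x, a, b):
    """Percent correct at SNR x; a = threshold (50% point), b = slope (b < 0 rises)."""
    return 100.0 / (1.0 + math.exp(b * (x - a)))


def rmse(snrs, scores, a, b):
    """Root-mean-square error between the fitted curve and observed scores."""
    sq = sum((logistic(x, a, b) - p) ** 2 for x, p in zip(snrs, scores))
    return math.sqrt(sq / len(snrs))


def fit_logistic(snrs, scores, n_grid=41, n_rounds=5):
    """Coarse-to-fine grid search for (a, b) minimizing RMSE (illustrative only)."""
    a_lo, a_hi = min(snrs) - 10.0, max(snrs) + 10.0
    b_lo, b_hi = -2.0, -0.01
    for _ in range(n_rounds):
        a_step = (a_hi - a_lo) / (n_grid - 1)
        b_step = (b_hi - b_lo) / (n_grid - 1)
        a_grid = [a_lo + i * a_step for i in range(n_grid)]
        b_grid = [b_lo + j * b_step for j in range(n_grid)]
        err, a, b = min(
            (rmse(snrs, scores, a, b), a, b) for a in a_grid for b in b_grid
        )
        # Zoom in on a window of two grid steps around the current best point.
        a_lo, a_hi = a - 2 * a_step, a + 2 * a_step
        b_lo, b_hi = b - 2 * b_step, b + 2 * b_step
    return a, b, err


def midpoint_slope(b):
    """Slope of the curve at x = a in percent per dB: dP/dx = -25 * b."""
    return -25.0 * b


# Synthetic scores generated from known parameters (not study data).
true_a, true_b = -20.0, -0.33
scores = [logistic(x, true_a, true_b) for x in SNRS_DB]
fit_a, fit_b, fit_err = fit_logistic(SNRS_DB, scores)
```

Under this assumed form, a slope parameter of −0.33 corresponds to about 8.25 percentage points per dB at the threshold, so a value of b closer to zero means a shallower curve.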
Experiment I: Effect of types of presenting noise
Fig. 3 shows the estimated average psychometric function curves for digits 0–9 as a function of SNR in two background noise conditions (digit-shaped noise and pink noise). The percentage correct values for each digit were averaged across all subjects to obtain the psychometric function. The results show that, in the digit-shaped noise condition, the threshold “a” ranged between −24.52 dB (digit 7) and −16.63 dB (digit 9), and the mean threshold was −20.42±2.70 dB. In the pink noise condition, the threshold ranged between −26.16 dB (digit 2) and −19.91 dB (digit 5), and the mean threshold was −23.04±1.65 dB. The slope parameter “b” was similar between the two noise types, except for digit 9 in the pink noise condition. The mean slope was −0.33±0.05 for the digit-shaped noise and −0.28±0.07 for the pink noise. On average, thresholds were slightly higher in the digit-shaped noise than in the pink noise. Only digit “1” had a higher threshold in the pink noise; all other digits had higher thresholds in the digit-shaped noise. The detailed threshold and slope parameters of the fitted psychometric function and the fitting errors (goodness-of-fit) for each digit are shown in Table 1.
Experiment II: Effect of speaker gender
Fig. 4 shows the estimated average psychometric function curves for digits 0–9 as a function of SNR in two speaker-gender conditions. In the male speaker condition, the threshold a ranged between −23.67 dB (digit 7) and −15.54 dB (digit 6), and the mean threshold was −19.53±2.80 dB. In the female speaker condition, the threshold ranged between −25.37 dB (digit 7) and −15.80 dB (digit 5), and the mean threshold was −21.18±3.13 dB. According to the mean threshold values, female speakers yielded slightly better detection ability (1.65 dB) than male speakers. In particular, digits “4” and “6” yielded smaller SNRs in female speakers, with a greater than 4-dB SNR difference. The slope parameter b was similar between the two speaker-gender conditions except for digit “1,” and the average slope was −0.33±0.07 for male speakers and −0.40±0.09 for female speakers. The detailed threshold and slope parameters of the fitted psychometric function and the fitting errors (goodness-of-fit) for each digit are shown in Table 2.
Experiment III: Difference between ears
Fig. 5 shows the average psychometric function curves for digits 0–9 as a function of SNR in two monaural listening conditions (left-only and right-only). The threshold in the left ear condition ranged between −22.91 dB (digit 7) and −12.05 dB (digit 5), and the mean threshold was −17.96±3.67 dB. The threshold in the right ear condition ranged between −23.60 dB (digit 7) and −12.99 dB (digit 5), and the mean threshold was −18.23±3.55 dB. The slope parameters were −0.44±0.17 and −0.43±0.12 for the left and right ears, respectively. The detailed threshold and slope parameters of the fitted psychometric function and the fitting errors (goodness-of-fit) for each digit are shown in Table 3.
Experiment IV: Effect of stimulus presentation level
Fig. 6 shows the average psychometric function curves for digits 0–9 as a function of SNR in four target-level conditions (55, 65, 75, and 85 dB SPL). The threshold ranged between −23.98 dB (digit 7) and −12.84 dB (digit 9) in the 55-dB SPL condition, −23.07 dB (digit 7) and −12.80 dB (digit 5) in the 65-dB SPL condition, −21.11 dB (digit 1) and −10.72 dB (digit 5) in the 75-dB SPL condition, and −22.42 dB (digit 7) and −12.03 dB (digit 5) in the 85-dB SPL condition. Mean thresholds were −18.24±3.64, −18.13±3.28, −16.52±3.20, and −17.36±2.95 dB for the 55-, 65-, 75-, and 85-dB SPL conditions, respectively. The slope parameters were similar, at around −0.4, across the four presentation levels, except that the 75-dB presentation level had a shallower slope (−0.31) than the others. Detailed threshold and slope parameters of the fitted psychometric function and the fitting errors (goodness-of-fit) for each digit are shown in Table 4.
Discussion
Is the K-DIN test affected by noise type?
The findings from the two background noise conditions suggest that listeners with NH have slightly better digit detection ability in pink noise (mean threshold −23.04±1.65 dB) than in digit-shaped noise (mean threshold −20.42±2.70 dB). This difference may be attributed to energetic masking: digit-shaped noise interferes with the speech material more than pink noise does in the DIN test. However, the difference was small, and no difference in task difficulty was observed.
The type of background noise affects the result of the DIN test in certain conditions [26,27]. Smits et al. [28] compared the Dutch and American-English DIN tests in steady-state noise and interrupted noise in NH listeners. The two tests showed no difference in steady-state noise, but the English DIN yielded significantly better scores in interrupted noise. Previously, Smits and Houtgast [29] suggested that 16-Hz interrupted noise, which produced the highest spread in speech reception threshold (SRT) values, is more beneficial than 32-Hz interrupted noise or continuous speech-shaped noise. In contrast, speech-spectrum noise yielded better SRTs than multi-talker babble noise in the Persian DIN test [30].
HI listeners, especially those with high-frequency sensorineural hearing loss, are more influenced by the masking features of the noise type than NH listeners [29,31]. Patients with high-frequency hearing loss showed a higher correlation with DIN results than NH listeners in interrupted noise compared with broadband noise [24]. Because our study was conducted with NH listeners, the different masking features of the two noise types seem to have had only a small effect.
Is the K-DIN test affected by speaker gender?
In Experiment II, the female speaker yielded slightly better detection ability (mean threshold value −21.18±3.13 dB) than the male speaker (mean threshold value −19.53±2.80 dB), but the difference was small enough to suggest that K-DIN test performance is not affected by the speaker gender. A previous DIN study using other languages also reported that speaker gender was not a critical factor for the DIN test. In the Canadian version of the DIN test, developed in both English and French and in both female and male versions [32], the mean speech recognition threshold of four types of language-speaker versions varied by only 0.6 dB despite spectral variation across two speakers and two languages.
Are there differences between test ears?
In Experiment III, both threshold and slope parameters were similar between the right and left ears. Therefore, the side of monaural presentation seems to have no influence on overall DIN test performance for NH listeners. This result supports the K-DIN as an appropriate hearing screening tool because digit stimuli are little affected by the right-ear advantage or the central auditory pathway. Furthermore, this finding suggests that the K-DIN test could be used for testing HI listeners who have asymmetric hearing loss, with threshold and slope parameter references being estimated from NH listeners.
Is the K-DIN test affected by presentation level of stimuli?
In Experiment IV, lower stimulus presentation levels (55 and 65 dB SPL) resulted in slightly better thresholds than higher presentation levels (75 and 85 dB SPL). This suggests that NH listeners have better digit detection ability near their comfortable sound levels. However, this interpretation is limited because the difference was small and the thresholds did not worsen consistently with increasing level: the mean threshold at 85 dB SPL (−17.36±2.95 dB) was lower (better) than that at 75 dB SPL (−16.52±3.20 dB).
Previous studies in other languages have reported that the DIN test could be independent of sound level above a minimum level of 60 dB SPL [33]. In contrast, when bilateral or unilateral HI listeners are included, the presentation level of stimuli influenced the DIN test result [30]. Therefore, the estimated threshold and slope parameters in this study with NH listeners could be the standard criteria when the K-DIN test is used as a testing tool for listeners with various degrees of hearing loss.
Limitations and future research
Because repeated cooperation from the subjects was required to compare various factors, only a limited number of subjects participated in this study, which limits the generalization of the results. Also, not all participants (n=35) were involved in all four experiments: 20 completed all four experiments, while 15 were involved only in Experiment III. Further studies with larger samples are needed to improve the accuracy of DIN tests.
Clinical implications for the Korean DIN test
This study evaluates whether performance of the Korean DIN test is influenced by condition modifications (noise type, gender-specific speaker, ear side, and presentation level of stimuli) that can influence the pursuit of diagnostic hearing evaluations. By focusing on NH listeners, our findings have the potential to define standard criteria for the Korean DIN test validation.
The major findings are as follows. First, pink noise resulted in slightly better digit detection ability compared with digit-shaped noise. Second, speaker gender and ear side had no influence on the threshold value (SNR at 50% intelligibility). Last, lower stimulus presentation levels (55 dB and 65 dB) near the comfortable sound level resulted in better thresholds. These results suggest that the Korean version of the DIN test can be used as a validated hearing screening test that is easy and time efficient. Future research is needed to generate large-scale normative data and to optimize the testing parameters for developing a reliable Korean DIN test as a tool in clinical diagnosis.
Supplementary Materials
The online-only Data Supplement is available with this article at https://doi.org/10.7874/jao.2022.00472.
Acknowledgments
This work was supported by the Ministry of Education of the Republic of Korea, the National Research Foundation of Korea (NRF-2022S1A5C2A03091539), the Institute of Clinical Medicine Research of Bucheon St. Mary’s Hospital, Research Fund, 2018, and the Research Fund of the E.N.T. Catholic University of Korea that was created in the program year of 2022.
Notes
Conflicts of Interest
The authors have no financial conflicts of interest.
Author Contributions
Conceptualization: Jae-Hyun Seo, Yonghee Oh, Woojae Han. Data curation: Subin Kim, Chanbeom Kwak, Yonghee Oh. Formal analysis: Subin Kim, Chanbeom Kwak, Yonghee Oh. Funding acquisition: Woojae Han, Jae-Hyun Seo. Methodology: Yonghee Oh, Jae-Hyun Seo. Project administration: Yonghee Oh, Jae-Hyun Seo, Woojae Han. Visualization: Yonghee Oh, Chanbeom Kwak. Writing—original draft: Subin Kim. Writing—review & editing: all authors. Approval of final manuscript: all authors.