The Band-Importance Function for the Korean Standard Sentence Lists for Adults
Abstract
Background and Objectives
The band-importance function (BIF) refers to a value characterizing the relative importance of different frequencies to speech intelligibility. The purpose of this study was to derive the BIF for the Korean standard sentence lists for adults (KS-SL-A).
Subjects and Methods
In this study, sentences from the KS-SL-A were used as the speech material. Twenty-six normal-hearing Korean listeners participated and intelligibility scores in 8 filters with 3 signal-to-noise ratio conditions were obtained. Based on the intelligibility score percentages, the BIF for the KS-SL-A was derived by using an established protocol.
Results
Band-importance weights varied across frequency bands. The most important frequency region was around 316 Hz (20.0%), and the importance of the frequency bands below the center frequency (CF) of 1,778 Hz was 59.6%. Therefore, low frequencies below the CF of 1,778 Hz were more important than high frequencies above the CF of 1,778 Hz.
Conclusions
The BIF for the KS-SL-A could be applied towards developing a hearing aid fitting formula for Korean listeners.
Introduction
The band-importance function (BIF) refers to a value characterizing the relative importance of different frequencies to speech intelligibility [1,2]. Each frequency band's importance weight takes a value between 0.0 and 1.0, where a weight of 1.0 corresponds to 100% importance. The BIF has various significant applications in the field of hearing sciences [1-3] and is an important component of the Speech Intelligibility Index (SII), a model to predict speech intelligibility [1]. When the SII predicts a person's ability to understand speech, the predicted value is calculated based on the dynamic range of the speech, the BIF for the speech, and the person's hearing thresholds. The BIF can also be used to determine the frequency-gain response for a hearing aid fitting formula [3]. For example, the National Acoustic Laboratories' non-linear fitting procedure version 1 was developed based on the BIF and the dynamic range of speech when determining frequency-gain characteristics [3].
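The way the BIF enters the SII can be illustrated with a minimal sketch. It assumes the simplified band-audibility rule of ANSI S3.5-1997, in which a band counts as fully audible when the speech-to-noise ratio reaches +15 dB and as inaudible at -15 dB (a 30-dB dynamic range); corrections for level distortion and spread of masking are omitted. The function names and the four example weights are hypothetical and are not taken from this study.

```python
def band_audibility(snr_db, dynamic_range_db=30.0):
    """Fraction of the speech dynamic range that is audible in one band."""
    # Assumption: speech peaks sit half the dynamic range above the band RMS level.
    a = (snr_db + dynamic_range_db / 2.0) / dynamic_range_db
    return min(1.0, max(0.0, a))


def speech_intelligibility_index(importance, snr_db):
    """Sum over bands of importance weight times band audibility."""
    if abs(sum(importance) - 1.0) > 1e-6:
        raise ValueError("band-importance weights must sum to 1.0")
    return sum(w * band_audibility(s) for w, s in zip(importance, snr_db))


# Hypothetical 4-band example (illustrative weights, not measured data).
weights = [0.25, 0.25, 0.25, 0.25]
snrs = [15.0, 0.0, -15.0, 30.0]
sii = speech_intelligibility_index(weights, snrs)
```

Because each band's contribution is scaled by its importance weight, two speech materials with different BIFs can yield different SII values for the same listener and noise.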
The BIF has been shown to vary between languages. Wong, et al. [4] derived the BIF for Cantonese sentences. The authors reported that low-frequency information was more important for understanding Cantonese speech than for understanding English speech. In the case of Cantonese sentences, the most important frequencies were around 1,600 Hz and the weighting factor of band importance for frequencies from 0 to 355 Hz was 13.6% in one-third octave bands. In the case of English sentences, however, the most important frequencies were around 2,000 Hz, and the weighting factor of band importance for frequencies from 100 to 400 Hz was 4.63% in 9 frequency bands [5]. These results suggest that different languages may have different BIFs.
Jin, et al. [6] obtained the BIF by using Korean hearing-in-noise test (K-HINT) sentences [7]. They reported that the BIF for Korean sentences was different from that for English sentences. Specifically, low-frequency bands below the center frequency (CF) of 250 Hz were more important in the Korean BIF than in the English BIF. For example, the band-importance weight for Korean at the CF of 250 Hz was 8.4%, but the weight for English at the same CF was 2.65%. Thus, the authors [6] concluded that the Korean BIF may be different from those of other languages, which have different acoustic and linguistic characteristics.
The BIF can also differ between speech stimuli from the same language. The American National Standards Institute reported that BIFs for the Central Institute for the Deaf auditory test (CID-22) [8] and the Northwestern University auditory test number 6 (NU-6) [9] were different even though the stimulus types for both were similar [1]. For example, band-importance weights for the CID-22 at the CFs of 150 Hz, 250 Hz, and 350 Hz were 5.07%, 6.77%, and 6.41%, respectively; however, band-importance weights for the NU-6 at the same CFs were 2.34%, 3.68%, and 5.20%, respectively. Not only did these three frequency bands differ, but other frequency bands also showed different importance weights between the CID-22 words and the NU-6 words [1]. The CID-22 consisted of 200 words and each list included 50 words [8]. The CID-22 words were developed based on phonetically balanced monosyllabic words (e.g., 'wood', 'where', and 'chin') [8]. The NU-6 test comprised four lists of 50 words, which were developed based on phonemically balanced monosyllables (e.g., 'back', 'late', and 'such') [9]. Although the two sets of test words were developed using different rationales, the stimulus type was similar. These results suggest that the BIF can be dependent on speech materials, even when they originate from the same language.
The present study was designed to derive the BIF for the Korean standard sentence lists for adults (KS-SL-A) [10]. The KS-SL-A was developed for sentence recognition testing, which is used to evaluate listening skills in everyday life. There are a total of 8 lists, and each list comprises 10 sentences with 40 key words that are phonetically balanced across the lists. The KS-SL-A was chosen based on the selection criteria for vocabulary and sentence structures of the CID everyday sentence test [11]. Although the BIF based on the K-HINT sentences has already been derived [6], it is also important to derive the BIF for the KS-SL-A since the KS-SL-A and the K-HINT sentences were developed for different purposes. The K-HINT sentences were developed to test speech perception in a noisy environment, while the KS-SL-A was developed to test speech recognition in a quiet environment [7,8].
The purpose of this study was to derive the BIF for the KS-SL-A. The results from this study provide information on the relationship between standardized Korean sentence stimuli and the BIF. In addition, the BIF for the KS-SL-A could serve as an important tool for the development of a hearing aid fitting formula for Korean listeners.
Subjects and Methods
Participants
Twenty-six normal-hearing Korean listeners participated in this study. There were 13 male and 13 female participants (aged 19-26 years). Their auditory thresholds were 20 dB HL or better at octave frequencies from 250 Hz to 8,000 Hz. All participants had normal middle ear function.
Stimuli
Sentences of the KS-SL-A were used in the study as stimuli [10]. A total of 80 sentences were used, arranged in 8 sets of 10 sentences. Each test set comprised 40 keywords. Sentences were mixed with speech-shaped noise, which was matched to the spectrum of the sentences, to create signals that varied in signal-to-noise ratio (SNR). Three SNR conditions were used (-5, 0, and +5 dB). The noisy sentences were adjusted to have a level of 45 dB HL [12]. The noisy sentences were then filtered through both low- and high-pass filters with cutoff frequencies of 224, 447, 891, 1,413, 2,239, 3,548, 5,623, and 11,000 Hz using Cooledit 2.0 (Adobe Systems, San Jose, CA, USA). Bandwidths were selected at regular intervals of 1/3-octave band cutoff frequencies [13]. We used linear-phase finite impulse response filters with a rejection slope of 96 dB/octave at the desired cutoff frequencies.
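The eight cutoff frequencies define seven frequency bands. The band CFs quoted in the Results (for example, 316, 1,122, 1,778, and 2,818 Hz) agree to within about 1 Hz with the geometric mean of adjacent cutoffs, so the sketch below assumes that relationship; the source does not state how the CFs were computed, and the function names are hypothetical.

```python
import math

# Cutoff frequencies (Hz) listed in the Stimuli section.
CUTOFFS_HZ = [224, 447, 891, 1413, 2239, 3548, 5623, 11000]


def band_center_frequencies(cutoffs):
    """Geometric mean of each pair of adjacent cutoffs (an assumption)."""
    return [math.sqrt(lo * hi) for lo, hi in zip(cutoffs, cutoffs[1:])]


def band_widths_octaves(cutoffs):
    """Width of each band in octaves, log2(upper cutoff / lower cutoff)."""
    return [math.log2(hi / lo) for lo, hi in zip(cutoffs, cutoffs[1:])]


centers = band_center_frequencies(CUTOFFS_HZ)
widths = band_widths_octaves(CUTOFFS_HZ)

for cf, width in zip(centers, widths):
    print(f"CF ~ {cf:7.1f} Hz, width ~ {width:.2f} octave")
```

Under this assumption the bands are not equally wide: each spans a whole number of 1/3-octave steps, except for the top band, whose 11,000 Hz upper cutoff does not fall on the 1/3-octave grid.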
Equipment
To generate the filtered stimuli, an audio program (Cooledit 2.0, Adobe Systems, San Jose, CA, USA) was used. The stimuli were routed to an audiometer (GSI61; Grason-Stadler, Eden Prairie, MN, USA) and presented to the participant's right ear through headphones (TDH-50P; Telephonics Corporation, Farmingdale, NY, USA).
Procedure
To derive the BIF, intelligibility data were obtained using various filters with varying SNR conditions [6,13]. Participants listened to the speech stimuli through headphones. The task was to listen to a sentence and repeat back as much of the sentence as possible. Intelligibility (percent keywords correct) was then measured across filter conditions (8 low-pass and 8 high-pass filters) and noise conditions (3 SNRs). Since the stimulus set included only 80 sentences, each listener was tested, in a randomized order, on a subset (16 conditions) of the total number of conditions (16 filters × 3 SNRs = 48 conditions). Each participant took about two hours to complete the testing, and a break was given every 30 minutes.
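The condition grid and the scoring rule can be sketched as follows. The source does not describe how each listener's 16 conditions were chosen, so uniform random sampling is used here purely as a placeholder assumption, and all function names are hypothetical. With 80 sentences spread over 16 conditions, each condition would use 5 sentences (about 20 keywords), which is an inference from the numbers above rather than a stated detail.

```python
import itertools
import random

CUTOFFS_HZ = [224, 447, 891, 1413, 2239, 3548, 5623, 11000]
FILTER_TYPES = ("low-pass", "high-pass")
SNRS_DB = (-5, 0, 5)


def all_conditions():
    """Every (filter type, cutoff, SNR) combination: 2 x 8 x 3 = 48."""
    return list(itertools.product(FILTER_TYPES, CUTOFFS_HZ, SNRS_DB))


def assign_subset(rng, n_conditions=16):
    """Pick a listener's conditions in random order (placeholder assumption)."""
    return rng.sample(all_conditions(), n_conditions)


def percent_keywords_correct(n_correct, n_keywords):
    """Intelligibility score for one condition, in percent."""
    return 100.0 * n_correct / n_keywords


rng = random.Random(0)  # fixed seed so the sketch is reproducible
subset = assign_subset(rng)
score = percent_keywords_correct(15, 20)  # hypothetical tally: 15 of 20 keywords
```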
Derivation of the BIF
We used an established protocol [13] to derive the BIF. Briefly, the established protocol used an iteration technique exploiting the relationship between the BIF and the SII. Relative SII values were determined from percent correct scores. BIF values were then derived from the relative SII values.
First, the relationship between the percent correct scores and the relative SII values was determined. The initial point for the relationship was determined by plotting the percent correct scores as a function of band frequency at the highest SNR (+5 dB SNR) used in the present study. One curve represented the data where the high frequencies were removed using a low-pass filter and the other curve represented the data where the low frequencies were removed using a high-pass filter. The intersection of these two curves was considered the initial point in a series of graphical processes and curve interpolations for data at different SNRs, generating the constant values in Equation 1 [Eq. (1)]:
$$S = \left(1 - 10^{-AP/Q}\right)^{N} \tag{1}$$
In Eq. (1), which was originally proposed by Fletcher and Galt [14], S is the percent correct score, A is the SII value, P is a measure of the talker's and listener's proficiency, and Q and N are fitting constants. The proficiency value, P, was assumed to be 1 because all participants and talkers had normal hearing. The BIF was determined using the computed values of the fitting constants through the inverse of Eq. (1). Using the inverse of Eq. (1), all percent correct scores were transformed into associated SII values, and the cumulative values were converted to separate values for each band. The separate SII values for low- and high-pass filters were averaged, then the values for the same cutoff frequency band at the different SNRs were averaged. After the averaged values were rescaled to a range of 0.0 to 1.0, the final BIF values for each frequency band were determined. Kim and Jin [15] provided a more detailed description of the BIF protocol.
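The final steps of the protocol can be sketched in a few lines. The sketch assumes that Eq. (1) takes the form commonly attributed to Fletcher and Galt, S = (1 - 10^(-AP/Q))^N, with S expressed as a proportion rather than a percentage. It uses the Q and N values reported in the Results, and it takes "rescaling" to mean normalizing the band values so that they sum to 1.0. The scores in the example are hypothetical, the function names are illustrative, and the iterative graphical fitting of Q and N is not reproduced.

```python
import math

Q, N, P = 0.4402, 6.076, 1.0  # Q, N as reported in the Results; P assumed to be 1


def score_from_sii(a, q=Q, n=N, p=P):
    """Proportion correct S predicted from SII value A (assumed form of Eq. 1)."""
    return (1.0 - 10.0 ** (-a * p / q)) ** n


def sii_from_score(s, q=Q, n=N, p=P):
    """Inverse of the transfer function; valid for 0 <= s < 1."""
    return -(q / p) * math.log10(1.0 - s ** (1.0 / n))


def normalize(values):
    """Rescale band values so that they sum to 1.0."""
    total = sum(values)
    return [v / total for v in values]


# Hypothetical proportion-correct scores at four cutoffs (three bands), one SNR.
lp_scores = [0.00, 0.20, 0.60, 0.90]  # low-pass: rises with cutoff
hp_scores = [0.90, 0.65, 0.15, 0.00]  # high-pass: falls with cutoff

lp_sii = [sii_from_score(s) for s in lp_scores]
hp_sii = [sii_from_score(s) for s in hp_scores]

# Convert cumulative SII values to per-band values.
lp_bands = [hi - lo for lo, hi in zip(lp_sii, lp_sii[1:])]
hp_bands = [lo - hi for lo, hi in zip(hp_sii, hp_sii[1:])]

# Average the low- and high-pass estimates, then normalize.
mean_bands = [(a + b) / 2.0 for a, b in zip(lp_bands, hp_bands)]
bif = normalize(mean_bands)
```

In the full protocol the same averaging would also run across the three SNRs before normalization; a single SNR is shown here to keep the sketch short.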
Results
Mean intelligibility scores in the 8 filter and 3 SNR conditions for both low- and high-pass filters are shown in Figs. 1 and 2. Overall, the average intelligibility scores increased as the audible frequency band and SNR increased. At the highest SNR (+5 dB) with low-pass filtering, the mean intelligibility scores increased from 0.00% to 98.83% as cutoff frequencies increased from 224 Hz to 11,000 Hz. In contrast, mean intelligibility scores in high-pass filtered conditions decreased from 99.77% to 0.00% as cutoff frequencies increased from 224 Hz to 11,000 Hz. At the lowest SNR (-5 dB) with low-pass filtering, the mean intelligibility scores increased from 0.00% to 69.33% as cutoff frequencies increased from 224 Hz to 11,000 Hz. In contrast, mean intelligibility scores in high-pass filtered conditions decreased from 65.83% to 0.00% as cutoff frequencies increased from 224 Hz to 11,000 Hz.
The BIF for the KS-SL-A that was derived using the established protocol is shown in Table 1. In Eq. (1), the Q and N values that yielded the best-fit BIF were 0.4402 and 6.076, respectively. The R² value was 0.9963, indicating a reliable relationship between derived band-importance weights and obtained percent intelligibility values. Overall, band-importance weights varied across frequency bands. The highest band-importance weight was 20.0% at the CF of 316 Hz. The lowest band-importance weight was 10.0% at the CF of 1,122 Hz. The importance of the frequency bands below the CF of 1,778 Hz was 59.6%.
Discussion
The BIF for the KS-SL-A from this study was compared to the BIF for the K-HINT [6] in Fig. 3. The BIF for the K-HINT was derived for 21 frequency bands; therefore, in order to compare the results from the 2 studies, cumulative raw data were used [4]. Overall, the cumulative band-importance weights for the K-HINT were higher than the weights for the KS-SL-A below 3,000 Hz. In the case of the BIF for the KS-SL-A, frequency regions below 316 Hz and 2,818 Hz accounted for 20.0% and 76.5% of importance weight, respectively. The midpoint of the BIF was at around 1,122 Hz (46.2%). In the case of the BIF for the K-HINT, frequency regions below 350 Hz and 2,900 Hz accounted for 22.9% and 78.8% of importance weight, respectively. The midpoint of the BIF was at around 1,170 Hz (52.1%).
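Comparing cumulative weights across studies with different band structures requires evaluating both curves at common frequencies. The sketch below does this with linear interpolation on a log-frequency axis, which is an assumption: the source does not say how the cumulative curves were aligned. It uses only the cumulative values quoted in the text (including the 59.6% figure at the CF of 1,778 Hz from the Results), so the result is a rough check rather than a reproduction of Fig. 3. The function name is hypothetical.

```python
import math

# (frequency in Hz, cumulative importance in %) as quoted in the text.
KS_SL_A = [(316, 20.0), (1122, 46.2), (1778, 59.6), (2818, 76.5)]
K_HINT = [(350, 22.9), (1170, 52.1), (2900, 78.8)]


def cumulative_at(points, freq_hz):
    """Interpolate cumulative importance linearly on a log-frequency axis."""
    for (f_lo, c_lo), (f_hi, c_hi) in zip(points, points[1:]):
        if f_lo <= freq_hz <= f_hi:
            frac = math.log(freq_hz / f_lo) / math.log(f_hi / f_lo)
            return c_lo + frac * (c_hi - c_lo)
    raise ValueError("frequency outside the quoted range")


# K-HINT minus KS-SL-A cumulative importance at each K-HINT frequency.
differences = {}
for freq, k_hint_cum in K_HINT:
    try:
        differences[freq] = k_hint_cum - cumulative_at(KS_SL_A, freq)
    except ValueError:
        pass  # 2,900 Hz lies above the highest KS-SL-A point quoted here
```

At 350 Hz and 1,170 Hz the differences are positive, which is consistent with the statement that the cumulative K-HINT weights exceed those of the KS-SL-A in this region.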
The comparison of band-importance functions for Korean (current study), English [5], and Cantonese [4] sentences is shown in Fig. 4. The BIFs for English (9 frequency bands) and Cantonese (18 frequency bands) were also derived using different numbers of frequency bands. In order to compare the results from the 3 studies, cumulative raw data were used [4]. As shown in Fig. 4, low-frequency regions below 1,100 Hz were more important in the Korean and Cantonese BIFs than in the English BIF. In the case of the Korean and Cantonese BIFs, frequency regions below 1,122 and 1,250 Hz accounted for 46.2% and 51.2% of importance weight, respectively. However, in the case of the English BIF, the frequency region below 1,160 Hz accounted for 34.8%.
To obtain intelligibility data, the current study followed an established procedure [13]. However, a few recent reports have used different methodologies to obtain intelligibility data [16,17]. For example, Warren, et al. [16] measured intelligibility using one-octave band-pass filters (CFs of 250, 500, 1,000, 2,000, 4,000, and 8,000 Hz) at five different speech levels in quiet. Healy, et al. [17] obtained intelligibility data using 21 band-pass filters from the CF of 150 Hz to the CF of 8,500 Hz at a fixed speech level (70 dBA) in quiet. While the current study obtained the intelligibility data using various high- and low-pass filters at several SNRs, the two studies mentioned above used band-pass filters in a quiet condition. Because these studies used different methodologies, direct comparison may not be possible. Thus, validation studies may be required to determine which method is more reliable for deriving the BIF.
The present study had a few limitations. The BIF for the current study was derived using 7 frequency bands. However, other BIF studies derived their respective BIFs by using more finely segmented frequency bands, such as 9 [5], 18 [4], and 21 [6] bands. Although the results of the present study provide importance weights for the 7 frequency bands, further study is required to derive importance weights for a larger number of narrower frequency bands. Although we made comparisons with studies that derived their BIFs with the same rationale, these studies used different methodologies such as different frequency bands and talkers [4-6]. Thus, an exact comparison across these BIFs may not have been achieved.
In addition, intelligibility performance in the present study was measured using speech-shaped noise, which was also used in previous studies [4-6]. However, intelligibility performance can be affected by different types of noise [18]. Thus, in order to identify whether differences in BIFs are evident across different types of noise, various types of noise such as multi-talker babble may be considered in a future study.
The current findings can be applied to developing a hearing aid fitting formula for Korean listeners. In comparison with the formula for English listeners [3], one can take into consideration the importance of the low-frequency regions in Korean speech and place more weight on low-frequency gains when developing the hearing aid fitting formula for Korean listeners. Although other factors such as the dynamic range of speech should be considered to determine proper hearing aid gains, the present study may be an initial step towards developing a hearing aid fitting formula for Korean listeners.
Acknowledgments
This study was supported by a grant from Samsung Electronics Co., Ltd. and by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT & Future Planning (NRF-2015R1C1A1A01052458).
Notes
Conflicts of interest: The authors have no financial conflicts of interest.