Development and Validation of the Spatial Separation Sentence Test in Kannada
Abstract
Background and Objectives
This study aimed to develop and validate a modified version of the Speech in Noise Sentence Test in Kannada, which would be appropriate for testing the speech comprehension ability of children aged 8-12 years.
Subjects and Methods
A total of 120 sentences were chosen from 200 familiar sentences and split into four lists. Continuous discourse was used as the competing signal (distractor). Using MATLAB, the target stimulus was presented at 0° azimuth while the distractor’s location varied (+90° and -90° azimuth). The test was programmed to dynamically adjust the signal-to-noise ratio (SNR) based on participants’ responses. After initial validation, a pilot study was conducted with 60 typically hearing children aged 8 to 12 years.
Results
The SNR50 scores significantly improved when the distractor and target sentences were spatially separated across all groups. Age had a significant influence on the spatial separation scores. The test-retest reliability was excellent.
Conclusions
The developed stimuli effectively measured spatial separation, and the normative and psychometric analyses demonstrated reliable outcomes.
Introduction
Central auditory processing disorder (CAPD) is a deficit in auditory skills mediated by the central auditory nervous system [1]. CAPD encompasses various auditory behaviors, including temporal processing, auditory discrimination, binaural processing, dichotic listening, and sound localization [2]. It is prevalent among individuals with learning difficulties such as dyslexia, attention deficit disorder, and autism spectrum disorder. One of the characteristic symptoms of a child with CAPD is the struggle to comprehend speech in challenging environments, leading to academic performance issues [3].
The ability to understand speech in noisy situations is very commonly used as a first line of assessment in CAPD. Spatial separation is one of the key cues for identifying speech in noisy backgrounds. It helps the auditory system analyze complex sound scenes by segregating different sound sources. When speech and noise sources are spatially separated, the brain can distinguish between them more effectively, enabling the auditory system to focus on the target speech signal and suppress irrelevant noise. This creates a “spatial release from masking,” leading to improved speech intelligibility [4]. Spatial separation can be assessed using the Listening in Spatialized Noise (LiSN) test.
The LiSN evaluates how well a listener can comprehend a target sentence when the competing speech is delivered from a spatially separated location. The test has two main variables: 1) spatial separation of target and masker and 2) gender of the speaker (male vs. female speaker as stimulus and masker), giving four conditions in total. The results are interpreted as talker advantage (the ability to separate male from female speakers at different spatial locations) and spatial advantage (the ability to use spatial cues). The latter effect is considered robust. The clinical application of the LiSN test is to identify children with suspected CAPD and evaluate the effectiveness of interventions to remediate binaural interaction deficits [5,6]. LiSN is highly validated and has demonstrated its ability to accommodate diverse language groups, including the North American LiSN-Sentences test [7,8] and LiSN-Universal (LiSN-U) [9].
The prevailing LiSN tests contain semantic elements and accents that are difficult for the native population to comprehend. Therefore, the primary motivation was to develop a LiSN-type test that could assess the ability of children with suspected CAPD to understand speech in background noise using semantic elements that are comprehensible to the native population. To the best of our knowledge, the LiSN test is not available in any Indian language. Kannada is a Dravidian language spoken widely in the southern Indian state of Karnataka. It is a phonetic language with a large alphabet, an extensive vowel system, and complex grammatical systems (noun case, conjugations, and gender agreement). It follows subject-object-verb (SOV) order, as opposed to English, which follows subject-verb-object (SVO) order and lacks gendered nouns. Kannada is an agglutinative language, meaning that prefixes and suffixes are added to root words to convey various grammatical and semantic meanings, and the structure of a sentence can be modified in this way. For example, the English sentence “She eats an apple” corresponds to the Kannada sentence “Avalu sebu tindalu” (transliterated). The current study aims to develop and validate a quick, MATLAB-based Spatial Separation Sentence Test (SSST) in Kannada. The proposed test mainly evaluates the spatial advantage, chosen for its robustness, and does not assess the talker advantage. Additionally, the study aims to acquire pilot normative performance in a small set of children aged 8 to 12 years.
Subjects and Methods
The study received approval from the Institutional Ethics Committee, Kasturba Medical College and Kasturba Hospital, Manipal (IEC: 325/2022), and is registered with the Clinical Trials Registry–INDIA. The study consisted of two phases.
Phase I: Development and validation of stimulus
Stimulus preparation
For the target stimuli, 200 sentences were chosen from Kannada-language academic textbooks of the III and IV standards. Care was taken to keep sentence length between 4 and 6 words. Each sentence contained at least one content word used as a target word and strictly followed the SOV format. School teachers from the government school recruited five children with exemplary academic performance to participate in the familiarity check. The teacher read the sentences, and the children rated their familiarity on a 5-point scale, with 1 representing unknown and 5 representing most familiar. The 150 sentences with an average rating of 5 were selected for recording, and the remaining 50 sentences were discarded. Similarly, the two most comprehensible passages out of four were selected as distractor discourse.
Stimulus recording
The stimuli were recorded in a soundproof room using Adobe Audition 2020 software and a standard omnidirectional head-worn condenser microphone (Pyle Pro PMHM2) placed 6 cm away from the speaker’s mouth. A female native Kannada speaker recorded the selected target sentences, while a male speaker recorded the distracter discourses. Instructions were given to maintain a natural accent, intonation, and vocal effort, avoiding emphasis on specific keywords. After recording, unnecessary breaks and quiet periods were removed, and a noise cancellation algorithm was applied to ensure consistent intensity levels. The modified stimuli were saved in .wav format.
Content validation
A panel of five audiologists, each with a minimum of 10 years of work experience, was asked to validate all 150 recorded stimuli. The audiologists rated the stimuli (on a 5-point scale, with 5 being strongly recommended) according to parameters such as intensity, appropriateness, distortion, naturalness, and difficulty in identifying the tokens. Open-ended suggestions were also considered for the inclusion and exclusion of sentences. The sentences reported to be difficult were rerecorded using the same speaker, and all procedures mentioned above were followed. The final list included only sentences that received ratings above 4.8 from at least 75% of the audiologists. At this stage, 120 potential sentences were selected for the stimulus equalization procedure.
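The inclusion rule described above can be illustrated with a minimal Python sketch. The function name, its default arguments, and the example ratings are hypothetical and serve only to show the arithmetic; the study did not publish its scoring as code. With five raters, a 75% criterion means that at least four audiologists must rate a sentence above 4.8.

```python
def passes_content_validation(ratings, min_rating=4.8, min_fraction=0.75):
    """Return True if enough raters scored the sentence above min_rating.

    ratings: one rating per audiologist on the 5-point scale.
    min_rating and min_fraction mirror the thresholds stated in the text.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    high = sum(1 for r in ratings if r > min_rating)
    return high / len(ratings) >= min_fraction


# Hypothetical ratings from five audiologists (illustration only)
accepted = passes_content_validation([5, 5, 5, 5, 4])   # 4 of 5 raters -> 80%
rejected = passes_content_validation([5, 5, 5, 4, 4])   # 3 of 5 raters -> 60%
```

Under these assumptions the first sentence is retained and the second is dropped or sent for rerecording.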
To validate the sentence difficulty, 10 adult volunteers with normal hearing ability aged 19 to 25 were recruited. The sentences were presented sequentially from 1 to 120, starting at -10 dB signal-to-noise ratio (SNR). Correct responses were scored as 1, and incorrect responses as 0. This procedure was carried out at each of the following SNRs: -10, -7.5, -5, -2.5, 0, +2.5, +5, +7.5, and +10 dB. The overall correct scores per SNR for all 120 sentences were tabulated. We then removed sentences with extremely low scores at a given SNR and retained only those with comparable scores. The sentences were also shuffled between the lists to balance the difficulty. Thus, lists 1, 2, and 3 consisted of 30 comparable sentences each. Finally, we used a 3×1 repeated measures analysis of variance (ANOVA) (3 lists×1 SNR) to obtain the final statistics, which showed either no statistically significant difference between the lists or a mean difference of less than 2.0 dB. All the pruned sentences formed list 4, which was used for practice trials in the normative study.
Ultimately, the SSST consists of 120 sentences organized into four lists, with 30 target sentences in each list. One list was used as a practice set, and lists 1, 2, and 3 were used for three different spatial conditions. The details are mentioned in the procedure section.
Calibration setup and development of a MATLAB-based tool for SSST
Initially, all the stimuli were RMS normalized to within ±2 dB and kept below 10% of the maximum sound card volume using a custom MATLAB script (MathWorks, Natick, MA, USA). The sound card was a Realtek High-definition Audio (default) sound controller in a DELL Inspiron 3511 with an 11th Gen Intel® Core™ i5-1135G7 processor at 2.42 GHz. The stimuli were then played through AKG K272 high-definition stereo headphones (Harman Consumer Group Inc, Northridge, CA, USA), which have a frequency response from 16 Hz to 28,000 Hz, positioned over the 6 cm³ coupler (type 4152). Using a class 1 sound level meter (B&K Type 2250; Bruel & Kjaer, Lyngby, Denmark), the sound pressure level (SPL) of the stimuli was measured to be between 68 and 72 dB SPL at 40% system volume. The female speaker’s recorded target sentences and the distracter discourses were convolved at various azimuth angles as specified in Table 1.
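As an illustration of RMS level normalization, the following Python sketch scales a signal to a chosen RMS level. It is not the authors' MATLAB script: the target level of -26 dB re full scale, the test tone, and the function names are all assumptions made for the example.

```python
import math


def rms_db(samples):
    """RMS level in dB relative to full scale (amplitude 1.0)."""
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        raise ValueError("cannot compute the level of a silent signal")
    return 10 * math.log10(mean_square)


def normalize_rms(samples, target_db):
    """Apply one broadband gain so the signal's RMS level equals target_db."""
    gain = 10 ** ((target_db - rms_db(samples)) / 20)
    return [s * gain for s in samples]


# Hypothetical stimulus: one second of a 440 Hz tone sampled at 16 kHz
tone = [0.5 * math.sin(2 * math.pi * 440 * n / 16000) for n in range(16000)]
normalized = normalize_rms(tone, target_db=-26.0)  # assumed target level
```

Applying the same target level to every sentence would bring the stimuli to a common RMS level before acoustic calibration with the sound level meter.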
The “Oldenburg Hearing Device (OlHeaD) Head Related Transfer Function (HRTF) Database” version 1.0.3 was used for the convolution that produced the spatial separation [10]. It is the latest freely available and validated HRTF source, and it is widely used to produce spatialization under headphones. The program was executed using a graphical user interface (GUI) built on the MATLAB platform. The GUI included an SNR adjustment feature (Fig. 1) and controlled the stimulus presentation.
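The two signal-processing steps mentioned here, spatialization by convolution and SNR adjustment, can be sketched in Python as follows. This is a minimal illustration and not the study's MATLAB code. The impulse responses are toy placeholders rather than values from the OlHeaD database, and every function name is an assumption. In practice, the measured left-ear and right-ear head-related impulse responses for the desired azimuth would replace the toy values.

```python
import math


def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def spatialize(mono, hrir_left, hrir_right):
    """Filter a mono signal with a left/right HRIR pair; returns (left, right)."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


def rms(samples):
    """Root-mean-square amplitude of a sample sequence."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def scale_to_snr(target, distractor, snr_db):
    """Scale the distractor so that the target-to-distractor RMS ratio is snr_db."""
    gain = rms(target) / (rms(distractor) * 10 ** (snr_db / 20))
    return [d * gain for d in distractor]


# Toy impulse responses (hypothetical placeholders, NOT from the OlHeaD database):
# the right ear receives a delayed, attenuated copy of the signal.
hrir_left = [1.0, 0.0]
hrir_right = [0.0, 0.5]
left, right = spatialize([1.0, 2.0], hrir_left, hrir_right)
```

A direct-form convolution is used only to keep the sketch dependency-free; for full-length recordings an FFT-based routine would be the usual choice.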
Phase II: Pilot study
Participants
All the children were in mainstream education, in school grades 3 to 7, with Kannada as their native language or as a fluently spoken language. Written consent from caregivers was obtained in accordance with ethical standards. Fifteen children were recruited from each age group (8 to 12 years) based on their good academic performance, as reported by teachers, from Kannada medium schools in the Udupi district.
One hundred participants were screened using the Developmental Screening Test to evaluate children’s motor development, speech and language, and personal-social development [11]. Children with ear-related complaints or a history of sensory, neurological, or psychological illness were excluded. Following screening, 60 children were chosen for the study. These participants were evenly distributed among four groups (G1, G2, G3, and G4) ranging from 8 years to 11 years 11 months. Specifically, G1 included ages 8 to 8 years 11 months, G2 included ages 9 to 9 years 11 months, G3 included ages 10 to 10 years 11 months, and G4 included ages 11 to 11 years 11 months (Table 2).
Demographic information was gathered through an interview, and otoscopy was performed to examine the ear canal and tympanic membrane. The study only included children with hearing thresholds lower than 20 dB HL.
Test equipment
Pure-tone audiometry screening was performed using a GSI-18 screening audiometer attached to TDH-39 audiometric headphones using standard audiometry techniques. A DELL Inspiron 3511 laptop (P112F101) with an 11th Gen Intel® Core™ i5-1135G7 processor at 2.42 GHz, an inbuilt sound card, and a Realtek High-definition Audio (default) sound controller was used for the study. MATLAB R2022b v9.13.0.2105380 was installed on the laptop and ran the code developed for the SSST. The SPL of each stimulus was calibrated using a Bruel & Kjaer type 2250 (class 1) calibrated sound level meter (SLM) and a Bruel & Kjaer type 4152 artificial ear. For stimulus presentation, the laptop was paired with AKG K272 high-definition stereo headphones, which have a frequency response ranging from 16 Hz to 28,000 Hz.
Experimental setup
The entire test procedure was conducted within the selected educational institution during school hours. The school library or a silent classroom was chosen as the testing site, where the noise levels ranged from 40.3 dBA to 43.2 dBA. Testing was paused during interferences such as class intervals, leisure activities, and lunch intervals. Two tests were performed: hearing sensitivity screening and spatial separation testing.
Test procedure
Hearing sensitivity was screened using pure tone audiometry at 1 kHz, followed by 0.5, 4, and 2 kHz. Children were instructed to raise their hands to indicate when they heard a tone, and thresholds were recorded on a proforma sheet. The children who fulfilled the inclusion criteria underwent spatial separation testing.
For estimating SNR50, four lists of 30 sentences each were created as stimuli. The three test lists were each assigned to one of the three distracter-location conditions. In list 1, the target stimulus and distracter were both presented at 0° azimuth; in lists 2 and 3, the target stimulus arrived from 0° azimuth, whereas the competing message was perceived to arrive from +90° and -90° azimuth, respectively. List 4 was used exclusively for practice trials and contained 10 sentences in each of the three conditions. Once the participant understood the procedure, the actual test administration was initiated. Prior to testing, the following instruction was provided to the children: “You will hear two sentences simultaneously in both ears.” The participants were then instructed to listen to the target sentences and repeat back what they heard.
An up-and-down procedure was used to present the stimuli at various SNRs. The SSST started with the stimulus presented at +10 dB SNR for the first three presentations, and the SNR was adjusted in steps of 2.5 dB. A response was scored as correct only if the participant repeated all the content words in a given sentence. A criterion of at least two correct responses out of three (the 50% response criterion) was set at each SNR level; when this criterion was met, the SNR was reduced by 2.5 dB. If the participant scored below 50% across the consecutive presentations at a level, the ceiling was considered to have been reached, and this level was recorded as the participant’s SNR50 for that test condition. This testing process was repeated for all three test conditions. The mean test duration was around 20 minutes per participant.
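One possible reading of this descending track is sketched below in Python. It is not the authors' MATLAB implementation. The function name, the `respond` callback, and the rule for stopping when the 30-sentence list is exhausted are assumptions introduced for illustration; the starting level, step size, block of three presentations, and two-of-three criterion are taken from the text.

```python
def run_descending_track(respond, start_snr=10.0, step=2.5,
                         trials_per_level=3, min_correct=2, n_sentences=30):
    """Estimate SNR50 with a descending track.

    respond(snr) must return True when the listener repeats every content
    word of the sentence presented at that SNR (hypothetical callback).
    The track stops at the first level where fewer than min_correct of
    trials_per_level sentences are correct, and that level is returned.
    Assumption: if the list runs out first, the last level tested is returned.
    """
    max_levels = n_sentences // trials_per_level
    snr = start_snr
    last_tested = start_snr
    for _ in range(max_levels):
        correct = sum(1 for _ in range(trials_per_level) if respond(snr))
        last_tested = snr
        if correct < min_correct:
            return snr            # criterion failed: record this level
        snr -= step               # criterion met: make the task harder
    return last_tested


# Simulated listener (hypothetical) who is correct whenever SNR >= -5 dB
estimate = run_descending_track(lambda snr: snr >= -5.0)
```

With this simulated listener, the track passes every level from +10 dB down to -5 dB and stops at -7.5 dB, the first level at which the criterion is not met.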
Scoring
The content words from all 120 sentences were identified and highlighted on the scoring sheet. The minimum SNR level at which the participant responded correctly 50% of the time was recorded as the SNR50 (in dB) for that particular test condition.
Test-retest reliability
After a 3-week interval, a sample of 16 respondents was randomly drawn from the full sample to evaluate the test-retest reliability of the SSST. The same stimuli and method were used to retest these children, and the results were compared with their earlier test results.
Data analysis
All age groups’ mean speech reception thresholds (SRTs, SNR in dB), 95% confidence intervals, percentile scores, and interparticipant standard deviations (in dB) were pooled for all three SSST distracter conditions. The benefit (in dB) that the participant gained from using speaker cues and spatial cues to differentiate between the distracter and the target stimulus, that is, the difference in scores between conditions, was taken as the participant’s performance in the test.
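The benefit measure can be written as a simple difference. The Python sketch below assumes that the benefit is the co-located (0°) SNR50 minus the spatially separated SNR50, which is the usual convention but is not spelled out in the text. The function name and the example scores are hypothetical.

```python
def spatial_advantage(snr50_colocated, snr50_separated):
    """Benefit in dB from spatial separation; positive means separation helped."""
    return snr50_colocated - snr50_separated


# Hypothetical SNR50 values (dB) for one child, for illustration only
snr50 = {"0": -2.5, "+90": -10.0, "-90": -7.5}
advantage_plus90 = spatial_advantage(snr50["0"], snr50["+90"])    # 7.5 dB
advantage_minus90 = spatial_advantage(snr50["0"], snr50["-90"])   # 5.0 dB
```

Because a lower SNR50 indicates better performance, a positive value means the child tolerated more noise when the distracter was moved away from the target.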
Statistical analysis
The scores were documented in an Excel file and exported to IBM SPSS Statistics, Version 27.0.1.0 (IBM Corp., Armonk, NY, USA) for analysis. Descriptive statistics, such as the mean and standard deviation (SD), were used to summarize the dataset. To determine whether spatial release from masking differed significantly (p<0.05) between the left and right ears, the mean SNR (dB) scores for the two spatially separated conditions were compared. A parametric test (paired-sample t-test) was used to test for significant differences in the SRTs obtained between the three lists for each age group. For the repeated-measures factor of distracter location (i.e., 0° versus ±90°), ANOVA was performed on the mean SRT. The intraclass correlation coefficient (ICC) and its 95% confidence interval were used to examine the test-retest reliability.
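For readers unfamiliar with the ICC, the following Python sketch computes one common form from test and retest scores. The text does not state which ICC model was used in SPSS, so the choice of ICC(2,1) (two-way random effects, absolute agreement, single measurement) is an assumption, as are the function name and the example data. The confidence interval is omitted because it requires F-distribution quantiles.

```python
def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    scores: list of rows, one per participant, each holding k session scores.
    """
    n = len(scores)
    k = len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: participants, sessions, residual
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical test/retest SNR50 values (dB) with perfect agreement
perfect = icc_2_1([[-10.0, -10.0], [-7.5, -7.5], [-5.0, -5.0]])
```

Identical test and retest scores give an ICC of 1, while a constant shift between sessions lowers this absolute-agreement form even though the rank order of participants is unchanged.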
Results
SNR50 scores
The SNR50 scores for each SSST condition were calculated separately for all the participants. Table 3 shows the mean and standard deviation, 95% confidence interval, and percentiles obtained for the two ears separately across age ranges.
A repeated-measures ANOVA was carried out with 3 (condition)×4 (age groups) levels to evaluate the significant difference in scores within the age group. The results suggested that sphericity was not violated. There was a significant main effect of condition on SNR50 [F(2, 110)=35.874, p=0.001, η²=0.39]. There was a significant main effect of age on SNR50 [F(3, 55)=9.602, p=0.001, η²=0.34]. There was no significant interaction between the SSST conditions and age group [F(6, 110)=35.874, p=0.40, η²=0.05]. Figs. 2 and 3 show the mean SNR50 at each condition for all age groups.
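The effect sizes for the two main effects can be related to their F statistics with a short calculation. The Python sketch below assumes that the reported η² values are partial eta squared, which is the default output in SPSS but is not stated explicitly in the text. The function name is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Main effects as reported in the text
eta_condition = round(partial_eta_squared(35.874, 2, 110), 2)  # 0.39
eta_age = round(partial_eta_squared(9.602, 3, 55), 2)          # 0.34
```

Both values match the reported effect sizes for condition and age under this assumption.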
Discussion
The study developed and validated the SSST tool in Kannada for children.
Stimulus selection for SSST
The developed SSST differs from LiSN in three ways. First, the competing signal is randomly selected. Second, the study focuses on spatial advantage only. Assessing spatial hearing is relevant in children with APD, and the test is valid for children from 8 to 12 years. To ensure validity, the test was administered to school children under controlled conditions in a silent environment, minimizing any potential strain during the task. Third, the study prioritizes spatial release from masking and ignores the talker advantage. The literature supports the robustness of the spatial advantage in directly assessing the spatial processing of speech compared to the smaller talker advantage [5,6]. The decision to prioritize the spatial advantage in the study was based on the finding that it demonstrates minimal variation compared to the total advantage and is notably different from the talker advantage [5].
The stimuli were initially validated by the researchers and then by experts. The target sentence’s SNR was set at 0 dB to challenge auditory perception, which ensured that the target was difficult to perceive and lacked adaptation. The experts’ content validation results also confirmed the appropriateness of the stimulus materials.
The study results showed a significant effect of age on spatial advantage as assessed by SSST. SNR50 scores obtained across age groups vary such that they improve as age progresses. These findings align with the literature, where the spatial benefit increases with age while the change in talker advantage is negligible [12,13].
Ear effect
The study found a significant difference between the two spatial conditions (conditions 2 and 3), in which the distractor on the right side had a higher SNR50 than that on the left side. However, the difference in SNR50 between the +90° and -90° conditions is within ±2 dB, which is unlikely to impact clinical findings significantly. The current study tracked thresholds for +90° and -90° independently, mainly because of the implementation of a fixed list per condition and the lack of randomization between the ears. Hence, it allows us to estimate the direction effect. Previous psychophysical studies have also shown ear effects similar to those in the current study [14-16]. However, the classical LiSN test measures combined ±90° azimuth results and therefore does not intend to measure ear-specific spatial advantage.
Spatial release from masking in SSST
The study found that spatial separation significantly improved SNR50, with the lowest scores obtained when a 90° azimuth separated the target and distracter in the horizontal plane. The brain adapts to background noise by scaling the neuronal response gain, resulting in noise-tolerant cortical representations of speech. This leads to a release in masking for speech in noise recognition in human listeners. Binaural squelch and head-shadow advantage are two possible phenomena describing spatial release from masking.
The research also suggests that the mean SNR50 improves with age, but the difference is robust in eccentric azimuth conditions. However, spatial release is evident as early as 5–6 years, and spatial advantage is consistent across children aged 8 to 12 years [17]. Hence, the SSST could be valid and can be used for assessing spatial release from masking among children in the age group 8 to 12 years based on the consistency of spatial advantage across this age group.
Test-retest reliability
All the test conditions across the groups showed a reliability coefficient above 0.9. These results suggest excellent reliability for the age groups, material, and procedure. Therefore, the material developed in this study and potential future normative data can be a valuable tool in the clinical and hearing assessment of children aged 8–12 years.
Study limitation
The current test uses the classical fixed-list presentation method rather than randomized presentation. The step size used in the current study was 2.5 dB compared to 2 dB in LiSN. Instead of pooling the left and right spatial segregation, the current study estimated separate thresholds for +90° and -90° azimuth. Though the SNR50 differences were statistically significant, they are attributed largely to interaural level difference cues alone. The sentence equalization procedure adopted for the present study is also slightly different from that of the original LiSN. The excellent test-retest reliability and the reduction in test time could be considered advantages.
Implications
The present study focused solely on assessing spatial advantage, excluding measures of talker advantage and overall advantage. The findings indicate that SSST demonstrates high reliability compared to the original LiSN test and LiSN-S. However, a larger sample size could have provided a more representative sample. Data collection took place in a quiet room within a school setting, which represents the natural environment for school-going children. Testing was paused during interferences such as breaks and lunchtime. Therefore, current norms are suitable for open field testing but require verification in audiometric rooms.
Recommendations for clinical use
List 1 should be convolved so that the stimulus and distracter arrive from 0° azimuth. For list 2 and list 3, the stimulus and distracter should be separated by +90° or -90° azimuth, respectively. Controlling the amount of trial or practice sentences provided to the participants can result in obtaining a reliable and real-time response.
Conclusion
The developed and validated SSST in Kannada is suitable for assessing spatial processing skills in children. In the future, this test can be used in other clinical populations and other age groups.
Notes
Conflicts of Interest
The authors have no financial conflicts of interest.
Author Contributions
Conceptualization: Hari Prakash Palaniswany, Kanaka Ganapathy. Data curation: Asish Mervin Hermon, Hari Prakash Palaniswany. Formal analysis: Asish Mervin Hermon, Kanaka Ganapathy, Hari Prakash Palaniswany. Methodology: all authors. Project administration: Asish Mervin Hermon. Software: Arivudai Nambi Pitchai Muthu. Supervision: Kanaka Ganapathy, Hari Prakash Palaniswany. Validation: all authors. Visualization: all authors. Writing—original draft: Asish Mervin Hermon. Writing—review & editing: Kanaka Ganapathy, Hari Prakash Palaniswany. Approval of final manuscript: all authors.
Funding Statement
None
Acknowledgements
We are very grateful to all the school children and teachers who took part in this study.