Vowel Context Effect on the Perception of Stop Consonants in Malayalam and Its Role in Determining Syllable Frequency
Abstract
Background and Objectives
The study investigated vowel context effects on the perception of stop consonants in Malayalam. It also probed into the role of vowel context effects in determining the frequency of occurrence of various consonant-vowel (CV) syllables in Malayalam.
Subjects and Methods
The study used a cross-sectional pre-experimental post-test only research design on 30 individuals with normal hearing, who were native speakers of Malayalam. The stimuli included three stop consonants, each spoken in three different vowel contexts. The resultant nine syllables were presented in their original form and in five gating conditions. The participants’ consonant recognition in the different vowel contexts was assessed. The frequency of occurrence of the nine target syllables in the spoken corpus of Malayalam was also systematically derived.
Results
The consonant recognition score was better in the /u/ vowel context compared with /i/ and /a/ contexts. The frequency of occurrence of the target syllables derived from the spoken corpus of Malayalam showed that the three stop consonants occurred more frequently with the vowel /a/ compared with /u/ and /i/.
Conclusions
The findings show a definite vowel context effect on the perception of the Malayalam stop consonants. The observed context effect differs from that reported in other languages. Stop consonants are perceived better in the context of /u/ compared with the /a/ and /i/ contexts. Furthermore, the vowel context effects do not appear to determine the frequency of occurrence of different CV syllables in Malayalam.
Introduction
Speech is a string of phonemes coarticulated in rapid succession. Earlier studies [1,2] have shown that the acoustic cues of individual phonemes are altered by neighboring phonemes when coarticulated. Consonants and vowels are acoustically distinct from each other [3] and are the smallest units in a language. When used in specific rule-governed combinations, they constitute syllables, words, and sentences. Consonants are generally weaker in amplitude than vowels [3]. It is the vowels that boost the energy of the consonants when articulated in consonant-vowel (CV) or vowel-consonant (VC) combinations. Consonants, on the other hand, play a key role in speech intelligibility [4]. When coarticulated, consonants and vowels may alter each other’s acoustics, which may either facilitate or hinder their perception.
Earlier studies have demonstrated the coarticulatory effects of vowel context on the perception of consonants [5,6]. Of particular interest has been the effect of vowel context on the perception of stop consonants, owing to their dynamic spectrotemporal characteristics [7,8]. In an early attempt, Liberman, et al. [9] showed that the perception of a stop-release burst depends on the following vowel. If a noise burst centered at 1,600 Hz was followed by a steady-state vowel /i/ or /u/, listeners perceived /p/; if it was followed by a steady-state /a/, they perceived /k/.
Dubno and Levitt [10] investigated the effect of vowel context on the recognition of consonants in the CV and VC syllables. The results showed the lowest recognition in the context of /u/ and highest in the context of /a/. On the contrary, Helfer and Huntley [11] found significantly reduced consonant recognition in the context of /a/ compared to /i/ and /u/. The context effects have also been found to vary between CV and VC syllables. In general, consonant recognition is better in CV syllables compared to VC syllables [10,12,13]. In CV syllables, consonants were better recognized in /a/ context [12,13], while in the VC syllables, they were better recognized in /i/ [12] and /u/ [13] contexts. The results of CV and VC syllables reflect carry-over and anticipatory coarticulatory cues respectively. The differences between the two suggest that anticipatory and carry-over coarticulatory cueing differs based on the vowel context.
The perceptual salience of coarticulatory cues may also vary across different languages [14-16]. Crowther and Mann [14] found that the perceptual weightage of vocalic duration was strongest for native speakers of English, followed by Japanese and Mandarin speakers. Wagner, et al. [15] and Li, et al. [16] found cross-linguistic differences in the perceptual salience of coarticulatory cues of fricatives. On the contrary, Wagner [17] found no significant difference across English, Polish, Spanish, Dutch, and German for the perception of stop consonants based on coarticulatory cues in vowels, although such a difference existed for the perception of fricatives. It is important to note that the cross-linguistic differences in that study were derived only from comparisons across gates. No comparison was made across vowel contexts.
Languages differ in their phonetic structure, prosodic patterns, and patterns of phonetic contrasts (allophonic variations). This means that the inferences drawn on the role of coarticulatory cues in one language cannot be generalized to other languages. Singh and Black [18] reported that, for Hindi consonants, vowel context /i/ resulted in a better identification score compared to /a/. Kalaiah and Bhat [19] investigated the vowel context effect in Kannada. The vowels were /a/, /i/, /u/, /e/, and /o/. Results showed that the consonant recognition score was highest in the vowel contexts /o/ and /a/, while the recognition score was significantly reduced in /i/. Taken together, these studies indicate that there are cross-linguistic differences in the vowel context effect on consonant recognition.
Malayalam is a Dravidian language spoken in the southwest of India. It is spoken by more than 40 million people in Kerala, Lakshadweep Islands, Mahe, etc. [20]. It exhibits a rare seven-way place-of-articulation contrast in stop consonants and nasals. A well-defined, rule-based structure governs almost every allophone formation in Malayalam. The phonemes are more nasalized and therefore contain more low-frequency energy [21]. Narne, et al. [22] found that Malayalam has more perceptual weightage for low frequencies compared to English and attributed the findings to the inherent phonetic differences and the use of more nasalized speech. The greater allophonic variations and predominant low-frequency energy suggest that the vowel context effect would be different in Malayalam compared to other languages.
In all languages, certain CV combinations occur more frequently than others. For example, in English, Dutch, and German, less than 5% of the entire syllable inventory is sufficient to produce approximately 80% of all speech in those languages [23]. There is a tendency for spoken CV syllables to show preferred combinations: labial consonants with central, alveolars with front, and velars with back vowels [24]. The exact reasons for such preferences have not been explored. In this study, we hypothesize that the vowel context effect, if any, is an important variable that determines the preferred CV combination. For example, if the recognition of /p/ is reduced in the context of /i/, then according to this hypothesis /pi/ should occur less frequently in spoken corpora. To test this hypothesis, the current study investigated the relationship between the vowel context effect and the frequency of different CV combinations in Malayalam.
Subjects and Methods
The study included two experiments. In experiment 1, the effect of vowel context on the perception of gated stop consonants in Malayalam was investigated. In experiment 2, the frequency of occurrence of various CV syllables in Malayalam was determined.
Experiment 1
The perception of stop consonants /p/, /t/, and /k/ was studied in /a/, /i/, and /u/ vowel contexts. The cross-sectional pre-experimental post-test only design [25] was used. The original tokens were progressively truncated at 10 ms intervals. The perception of consonants across the various truncated tokens was compared.
Participants
Thirty adults in the age range of 18 years to 50 years participated in the study. All participants were native speakers of the Calicut dialect of Malayalam. They had normal hearing sensitivity in both ears: pure-tone hearing thresholds within or equal to 15 dB HL at octave frequencies between 250 and 8,000 Hz. They had type ‘A’ tympanograms with acoustic reflexes present in both ears, suggestive of normal middle ear functioning. The speech identification scores were 90% or more for the Malayalam phonetically balanced word list [26] in quiet, and 60% or more in the presence of speech noise at 0 dB SNR. Their speech and language abilities were assessed by a qualified speech-language pathologist and found to be clinically normal. They had no history of neurological disorders. Informed consent was obtained from all the participants prior to their participation in the study. The study was approved by the ethical board of All India Institute of Speech & Hearing (WOF-0383/2014-15).
Stimuli
Experiment 1 used original CVs and their truncated tokens. The original CVs had /p/, /t/, and /k/ consonants combined with /a/, /i/, and /u/ vowels. This resulted in nine original syllables. The CVs were non-meaningful in Malayalam. They were uttered by an adult male who was a native speaker of Malayalam and a professional orator. He was instructed to utter each syllable three times in a neutral tone, with sufficient intervals between successive utterances. The utterances were audio-recorded using a unidirectional microphone (AHUJA AUD-101 XLR, Ahuja Radios, New Delhi, India) kept at 6 inches from the mouth, and using Adobe Audition software version 3 (Adobe Systems Incorporated, San Jose, CA, USA). The recording was digitized at a sampling frequency of 44,100 Hz and 16-bit resolution. A word reference was given to the speaker prior to the recording of each CV.
The recorded samples were inspected for clarity of sound and waveform by five speech-language pathologists. For each CV, the sample rated best for its clarity and waveform was chosen. The final samples were normalized to the same average root mean square level using Adobe Audition software. These syllables were operationally termed ‘the original syllables.’
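The root mean square (RMS) normalization step can be illustrated with a minimal sketch. The study performed this step in Adobe Audition; the Python code below only shows the underlying arithmetic. The function names, the toy sine-wave signals, and the target level of 0.25 are assumptions made for illustration and are not values from the study.

```python
import math

def rms(samples):
    """Root mean square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def normalize_rms(samples, target_rms):
    """Scale the samples by a single gain so that their RMS equals target_rms."""
    current = rms(samples)
    if current == 0:
        raise ValueError("cannot normalize a silent signal")
    gain = target_rms / current
    return [s * gain for s in samples]

# Hypothetical toy signals standing in for two recorded syllables.
quiet = [0.1 * math.sin(2 * math.pi * 5 * n / 100) for n in range(100)]
loud = [0.8 * math.sin(2 * math.pi * 3 * n / 100) for n in range(100)]

target = 0.25  # assumed target level, not taken from the study
normalized = [normalize_rms(signal, target) for signal in (quiet, loud)]
```

After this step, both toy signals have the same RMS level even though their peak amplitudes originally differed by a factor of eight.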
The original syllables were then truncated using PRAAT software (version 6.1.40; http://www.praat.org/) to generate the gated tokens. In the waveform of each CV, the onset of the burst was located. Gates were placed every 10 ms from the burst onset to the onset of the steady state of the vowel (which is also the end of the formant transition). The original syllable was successively truncated (forward gating), yielding as many truncated tokens as there were gates. Utmost care was taken to truncate at the zero crossings.
Fig. 1 shows the truncation points for the syllable /ta/. Five gates were placed in each original syllable, resulting in five truncated tokens per syllable. In the first token, the first 10 ms was removed; in the fifth token, the first 50 ms was removed. Overall, there were 54 tokens: nine original tokens and 45 truncated tokens. The original syllable was operationally termed G0 (gate 0) and the subsequent ones were called G1, G2, G3, G4, and G5. The method of truncation was the same as that of Wagner [17]. A 1 kHz tone, normalized to the same scaling factor as the speech tokens, was concatenated before the list of tokens to calibrate the stimulus output.
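The forward gating procedure can be sketched in code. The study carried out the truncation in PRAAT; the Python sketch below is only an illustration of the logic, under the assumptions that each token keeps everything after its cut point and that each cut is moved to the nearest zero crossing. The function names, the 1 kHz test tone, and the burst onset index are hypothetical.

```python
import math

def nearest_zero_crossing(samples, index):
    """Return the index closest to `index` at which the waveform is zero
    or has just changed sign; fall back to `index` if none is found."""
    n = len(samples)
    for offset in range(n):
        for i in (index - offset, index + offset):
            if 1 <= i < n and (samples[i] == 0 or samples[i - 1] * samples[i] < 0):
                return i
    return index

def gate_tokens(samples, sample_rate, burst_onset, n_gates=5, step_ms=10):
    """Build tokens G1..Gn by removing 10, 20, ... ms counted from the burst onset."""
    tokens = []
    for g in range(1, n_gates + 1):
        cut = burst_onset + round(g * step_ms * sample_rate / 1000)
        cut = nearest_zero_crossing(samples, cut)
        tokens.append(samples[cut:])  # assumption: keep everything after the cut
    return tokens

SAMPLE_RATE = 44100  # sampling frequency reported in the study
BURST_ONSET = 220    # hypothetical burst onset, in samples

# Hypothetical stand-in for a recorded syllable: 100 ms of a 1 kHz tone.
signal = [math.sin(2 * math.pi * 1000 * n / SAMPLE_RATE) for n in range(4410)]
tokens = gate_tokens(signal, SAMPLE_RATE, BURST_ONSET)
```

At 44,100 Hz, each 10 ms gate corresponds to 441 samples, so the cut for gate G3, for example, falls about 1,323 samples after the burst onset before it is adjusted to a zero crossing.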
Procedure
Pure-tone audiometry, speech audiometry, and speech perception tests were carried out in sound-treated rooms wherein the ambient noise levels were within permissible levels (ANSI S3.1-1999). Participants were tested for their identification of consonants in a quiet room. The Paradigm software (version 2.5.0.68; Perception Research Systems Incorporated, Lawrence, KS, USA) was used to present the tokens. Each token was presented five times, resulting in 270 presentations, and the order of tokens was randomized by the software.
The tokens were presented with an inter-stimulus interval of at least 3 seconds and were delivered to the participants through high-fidelity Sennheiser HD 449 (Wedemark, Germany) headphones. The stimulus intensity was set at the most comfortable level. A graphical user interface was prepared using Paradigm software wherein the three consonants were displayed, and the participants were instructed to click on the consonant heard. A forced-choice identification task was used. The software was scripted such that the next stimulus was not presented until the participant responded. The participants completed the task in a single session. The responses were automatically stored in the Paradigm software. The correct responses were scored ‘1’ and the incorrect responses were scored ‘0.’ The total correct scores were converted into a percentage, which was used for all the statistical analyses.
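The trial structure and scoring described above can be sketched as follows. The study used the Paradigm software for presentation and response logging; this Python sketch only reproduces the arithmetic (3 consonants x 3 vowels x 6 gating conditions x 5 repetitions = 270 trials) and the conversion of 1/0 scores into a percentage. The function names, the seed, and the example responses are hypothetical.

```python
import random
from itertools import product

CONSONANTS = ["p", "t", "k"]
VOWELS = ["a", "i", "u"]
GATES = ["G0", "G1", "G2", "G3", "G4", "G5"]
REPETITIONS = 5

def build_trials(seed=None):
    """Return every (consonant, vowel, gate) token repeated five times, in random order."""
    trials = [
        (c, v, g)
        for c, v, g in product(CONSONANTS, VOWELS, GATES)
        for _ in range(REPETITIONS)
    ]
    random.Random(seed).shuffle(trials)
    return trials

def percent_correct(responses):
    """responses: list of (target_consonant, chosen_consonant) pairs.
    Each correct response scores 1 and each incorrect response scores 0."""
    scores = [1 if target == chosen else 0 for target, chosen in responses]
    return 100 * sum(scores) / len(scores)

trials = build_trials(seed=1)  # the seed is an arbitrary choice for reproducibility

# Hypothetical responses for one condition: three of four are correct.
example = [("p", "p"), ("t", "k"), ("k", "k"), ("p", "p")]
score = percent_correct(example)
```

In practice, `percent_correct` would be applied separately to the responses for each consonant, vowel context, and gating condition.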
Experiment 2
This experiment determined the frequency of occurrence of the target syllables in Malayalam. A pool of 20,000 Malayalam words was collected from various sources: textbooks, a dictionary, magazines, newspapers, conversations, and the corpus developed by the Central Institute of Indian Languages [27]. The selected words were nouns, verbs, and adjectives. Proper nouns were excluded. The selected words were judged for their familiarity by five native speakers of Malayalam, who rated them on a five-point rating scale. The judgments were compiled, and the words rated as ‘most familiar,’ ‘familiar,’ and ‘familiar but not used every day’ were considered for frequency analysis. Out of the 20,000 words, a total of 14,895 words met the criterion.
The selected words were then transcribed using the International Phonetic Alphabet for Malayalam [28]. The transcribed data comprised about 18,537 syllables. The transcribed data set was analysed using Systematic Analysis of Language Transcripts (SALT) software version 9 (SALT Software LLC, Madison, WI, USA) for the total frequency count. A database of Malayalam phonemes was prepared and saved in the SALT software. The SALT software compared the transcripts against this database and provided the phoneme count based on the loaded phoneme file.
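The frequency count itself is a simple tally, which the following sketch illustrates. The study obtained its counts with the SALT software; the Python code below is not that procedure, only an illustration of counting the nine target syllables in syllabified transcriptions. The function names and the toy word list are hypothetical, and the plain letters used here ignore the finer place-of-articulation distinctions that an IPA transcription of Malayalam would carry.

```python
from collections import Counter
from itertools import product

# The nine target CV syllables: /p/, /t/, /k/ combined with /a/, /i/, /u/.
TARGETS = {c + v for c, v in product("ptk", "aiu")}

def count_target_syllables(words):
    """words: iterable of words, each given as a list of syllable strings.
    Returns a Counter with an entry (possibly zero) for every target syllable."""
    counts = Counter({s: 0 for s in TARGETS})
    for syllables in words:
        for s in syllables:
            if s in TARGETS:
                counts[s] += 1
    return counts

def relative_frequency(counts, total_syllables):
    """Express each count as a percentage of all syllables in the corpus."""
    return {s: 100 * n / total_syllables for s, n in counts.items()}

# Hypothetical syllabified items, not real corpus entries.
toy_words = [["ka", "ta"], ["pu", "ka", "la"], ["ti"]]
counts = count_target_syllables(toy_words)
total = sum(len(w) for w in toy_words)
percentages = relative_frequency(counts, total)
```

Comparing the totals for the /a/, /i/, and /u/ syllables in such a table gives the kind of vowel-wise frequency comparison that Experiment 2 was designed to provide.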
Results
Experiment 1
The group data were statistically compared across the three vowel contexts, the three consonants, and the six gating conditions. The data were statistically analysed using Statistical Package for the Social Sciences software version 21.0 (IBM Corp., Armonk, NY, USA).
Fig. 2 gives the mean and standard deviation (SD) of percentage correct responses for the three stop consonants, in the three vowel contexts, in the six different gating conditions. The mean percentage was different across the three vowel contexts. The mean percentage also decreased from the G0 to the G5 condition for all the stop consonants in all three vowel contexts.
To begin with, the overall effect of vowel context on consonant recognition was tested by combining the recognition scores of all the consonants. Fig. 3 gives the mean and SD of the combined recognition scores. The mean percentage of recognition was higher in the context of vowel /u/ compared to vowel /a/ and /i/, irrespective of gating condition. No particular trend could be derived from comparing /a/ and /i/.
The results of repeated measures analysis of variance (ANOVA) showed a significant main effect of vowel context [F (2, 28)=678.631, p<0.001] as well as gating condition [F (5, 25)=1,274.214, p<0.001]. There was also a significant interaction between the effects of vowel context and gating condition [F (10, 20)= 78.411, p<0.001]. The subsequent Bonferroni pairwise comparisons showed significant difference (p<0.001) between /u/ and /a/, and between /u/ and /i/ vowel contexts. However, there was no significant difference between /a/ and /i/ vowel contexts (p=0.308).
The effect of vowel context on consonant recognition was further studied by considering the recognition scores of /p/, /t/, and /k/ separately. The results of three-way repeated measures ANOVA showed significant main effects of vowel context [F (2, 28)=27.957, p<0.001], consonant [F (2, 28)=31.481, p<0.001], and gate [F (2, 25)=1,389.425, p<0.001]. All the two-way interactions {i.e., consonant*vowel context [F (4, 26)=4.580, p<0.001], vowel context*gate [F (10, 20)=7.930, p<0.001], consonant*gate [F (10, 20)=12.268, p<0.001]} and the three-way interaction {vowel context*gate*consonant [F (10, 20)=38.755, p<0.001]} were significant. Owing to the significant interaction between consonant and vowel context, the effect of vowel context was tested separately for each consonant. Table 1 gives the results of the two-way ANOVA (vowel context and gate as repeated factors) carried out separately for each consonant. The results showed a significant vowel context effect, a significant gate effect, and a significant vowel context*gate interaction for all three consonants.
Subsequent Bonferroni multiple comparisons showed that the recognition scores of all three consonants were significantly higher in the context of /u/ compared to those in /a/ and /i/ (p<0.001). However, there was no significant difference between /a/ and /i/ contexts for any of the stop consonants [/p/ (p=1.00), /t/ (p=1.00), and /k/ (p=1.00)]. In view of the significant interaction between the vowel context effect and the gate effect, the vowel effect was further tested in each gating condition, separately for /p/, /t/, and /k/. Table 2 shows the results of the repeated measures ANOVA, which showed larger vowel context effects in the intermediate conditions (G2 and G3) compared to the extreme gate conditions (G0 and G5). Further pairwise comparisons are not presented owing to the journal’s restrictions on manuscript length.
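The p values of 1.00 reported for the /a/ versus /i/ comparisons are consistent with how a Bonferroni adjustment works: each unadjusted p value is multiplied by the number of comparisons and capped at 1. The sketch below illustrates that arithmetic only. The unadjusted p values in it are hypothetical and are not taken from the study, whose analysis was run in the Statistical Package for the Social Sciences.

```python
def bonferroni(p_values):
    """Multiply each p value by the number of comparisons, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical unadjusted p values for the three pairwise vowel comparisons.
raw = {"u-a": 0.0001, "u-i": 0.0002, "a-i": 0.45}
adjusted = dict(zip(raw, bonferroni(list(raw.values()))))
```

With three comparisons, any unadjusted p value above one third is reported as 1.00 after the adjustment.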
Discussion
The study explored: 1) the effect of vowel context (/a/, /i/, and /u/) on the perception of stop consonants (/p/, /t/, and /k/) in Malayalam and 2) the relationship between the vowel context effect and the frequency of different stop CV combinations in Malayalam. The results of experiment 1 showed strong evidence for the presence of a vowel context effect on the perception of stop consonants in Malayalam. Taken together, the results of the two experiments revealed that the vowel context effect does not determine the frequency of occurrence of a particular CV combination in Malayalam.
The vowel context effect on the perception of stop consonants has been shown in other languages in earlier investigations [10,12,13,18,19]. The current study is the first to show it in Malayalam. It was found that the recognition of the stop consonants was significantly better in the /u/ context compared to the other vowel contexts. The results are similar to those reported in English [11,13] in the VC context. There was no significant difference between /a/ and /i/ contexts in the present study, which is similar to the finding reported in Arabic [18]. The findings are in contradiction to the earlier studies in Hindi [18], English [10,12,18], Japanese [18], and Kannada [19]. This suggests that there are cross-linguistic differences in the effect of vowel context, as hypothesized in this study. The differences in the findings could be attributed either to the differences in acoustic cues across the languages or to the possible differences in perceptual weightage [14-17]. Furthermore, the previous studies had considered all classes of consonants (plosives, nasals, and fricatives) together while deriving the vowel context effect, because of which the vowel effect specific to stop consonants cannot be isolated. The present study, in contrast, showed the vowel effect exclusively on stop consonants.
The study also revealed a significant effect of gating and consonant type on the recognition of stop consonants, which is consistent with the findings of previous investigations [17]. The vowel context effect was evident only in the truncated tokens and not in the original tokens. This suggests that the vowel effect is unveiled only when the redundancy in the token is reduced and the listener is forced to rely on the coarticulatory cues for the perception of consonants. The effect of gating was smaller in the /u/ context compared to /a/ and /i/, which suggests that the coarticulatory cues were stronger when the stop consonants were uttered in the context of /u/. On comparing the three consonants, it was seen that the coarticulatory cues were most robust in the case of /t/ when uttered with /u/.
On comparing the vowel context effect separately in the three stop consonants, it was found that the context effect was greater in /k/ compared to /p/ and /t/. Kalaiah and Bhat [19], in their study in Kannada, found the least or no vowel effect on the recognition of /k/. The difference in the findings again supports the cross-linguistic differences in coarticulatory perception. The difference in the findings could be attributed to the acoustical differences in the coarticulated phonemes of the two languages.
In the study, it was hypothesized that the frequency of occurrence of a syllable (with a particular consonant and vowel combination) would be determined by the vowel context effect on consonant perception. In experiment 1, consonant recognition in the context of /u/ was resilient to truncation, while the recognition was significantly poorer in the context of /a/ and /i/. Accordingly, it was expected that the stop consonants would combine with /u/ more frequently in the spoken corpora of Malayalam compared to /a/ and /i/. However, syllables with the vowel /a/ were found to be the most frequent, followed by those with /u/ and /i/. This suggests that the vowel context effect is not the primary variable that determines the frequency of occurrence of different CVs in the spoken corpora. Although the exact reason for the high occurrence of stop consonants with the vowel /a/ is not known, one can speculate that the central place of articulation and the larger open cavity during the production of the vowel /a/ could be among the important variables.
The current study probed the vowel context effect only in stop consonants. Future studies can explore the same in other classes of consonants. Owing to the presence of the vowel context effect, it is advised that the vowels be counterbalanced in syllable identification tests to neutralize the vowel context effects. This would be all the more necessary when testing individuals with hearing impairment.
Acknowledgements
We wish to thank our Director, All India Institute of Speech and Hearing, for allowing us to conduct the study. We extend our sincere thanks to all our participants for their patient cooperation.
Notes
Conflicts of interest
The authors have no financial conflicts of interest.
Author Contributions
Conceptualization: all authors. Data curation: Dhanya Mohan. Formal analysis: Dhanya Mohan. Investigation: all authors. Methodology: all authors. Project administration: Dhanya Mohan. Resources: Dhanya Mohan. Software: Dhanya Mohan. Supervision: Sandeep Maruthy. Validation: all authors. Visualization: all authors. Writing—original draft: Dhanya Mohan. Writing—review & editing: all authors. Approval of final manuscript: all authors.