Sunday, October 9, 2016

Hidden Complexity

Gussenhoven's and Kenstowicz's passages describe complexities of speech that we almost never think about, despite using speech every day of our lives. In “The production of speech,” Gussenhoven provides an overview of the anatomy that creates the sounds of human speech. He describes how small changes, such as rounding the lips or raising and lowering the tongue’s dorsum, can create completely different sounds. Gussenhoven also gives anatomical explanations of different ‘styles’ of speaking, such as whispering, breathy voice, and creaky (laryngealized) voice.

Kenstowicz’s work provides a detailed analysis of how the sounds that we vocalize can be very different from the underlying phonemes that we believe we are saying. Many different rules determine which allophone we vocalize, and they depend on factors such as the adjacent phonemes and the pattern of stress within the word. What's even more intriguing is that we don’t usually notice the differences between allophones: we alter our speech based on intricate and subtle rules, completely unconscious of what we are doing. This allophonic variation shows that what our conscious mind receives is very different from what our ears hear. It seems so natural, but there is an incredible amount of subconscious pre-processing that takes place as we extract meaning from sound.
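To make the idea of context-dependent rules concrete, here is a minimal Python sketch. It is not taken from Kenstowicz's text: the function `realize_t`, the `VOWELS` set, and the three rules are my own simplified, textbook-style approximation of how American English /t/ is often described, and real speech has more variants than this.

```python
# Toy illustration only: simplified rules for American English /t/.
# VOWELS is a hypothetical stand-in for a real vowel inventory.
VOWELS = set("aeiou")

def realize_t(prev, nxt, next_stressed):
    """Pick a surface allophone of /t/ from its neighbors and stress.

    prev and nxt are the neighboring segments (None at a word edge);
    next_stressed says whether the following vowel carries stress.
    """
    if prev == "s":
        return "t"    # plain, unaspirated, as in "stop"
    if prev in VOWELS and nxt in VOWELS and not next_stressed:
        return "ɾ"    # flap, as in "atom" or "water"
    if nxt in VOWELS and next_stressed:
        return "tʰ"   # aspirated, as in "top" or "atomic"
    return "t"        # fallback; real speech has further variants


print(realize_t(None, "o", True))   # "top"
print(realize_t("s", "o", True))    # "stop"
print(realize_t("a", "o", False))   # "atom"
print(realize_t("a", "o", True))    # "atomic"
```

The same underlying /t/ comes out three different ways depending only on its neighbors and the stress pattern, which is the kind of rule we apply without ever noticing.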

This complexity may explain why it is so difficult for computers to understand human speech. Not only do many distinct allophones map to a single phoneme, but some phonemes are not pronounced at all! Kenstowicz gives the example of ‘tends’, in which the d is not pronounced. Furthermore, allophones vary from region to region, so there is no single mapping from allophones to phonemes. This multitude of allophones may explain why, during our class tests of speech recognition with Siri/Google Now, enunciating words sometimes seemed to make them less likely to be recognized. In trying to say every sound, we may have been adding sounds that are not usually present in spoken language (e.g. the t in winter).
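The mapping problem can be sketched in a few lines of Python. Everything here is hypothetical: `LEXICON` is a tiny hand-written pronunciation table, `candidates` is an illustrative lookup, and none of it reflects how Siri or Google Now actually work. It only shows why a single surface form does not always point to a single word.

```python
# Hypothetical pronunciation lexicon: each word lists the surface forms
# (as tuples of sounds) that a toy recognizer would accept for it.
LEXICON = {
    "winter": [("w", "ɪ", "n", "ɚ"), ("w", "ɪ", "n", "t", "ɚ")],
    "winner": [("w", "ɪ", "n", "ɚ")],
}

def candidates(surface, lexicon=LEXICON):
    """Return every word whose listed pronunciations include this surface form."""
    key = tuple(surface)
    return sorted(word for word, variants in lexicon.items() if key in variants)


print(candidates(["w", "ɪ", "n", "ɚ"]))       # casual form, no t
print(candidates(["w", "ɪ", "n", "t", "ɚ"]))  # careful form, with t
```

In this toy table the casual form matches both "winter" and "winner", while the careful form matches only "winter". A different dialect would need a different table, which is one way to picture why there is no single mapping.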

The articles also raise questions about how we learn language. We have no conscious experience of the incredible complexity of forming sound when speaking, or of mapping allophones to phonemes when listening. But we know how to do it, so at some point we learned all these rules and began to apply them to every word we hear and say. Kenstowicz describes the situation concisely: “The mysterious thing is how we learn these rules. No one has taught them to us. We are unable to discover them through introspection. Yet we all tacitly know them if we are native speakers of English” (Kenstowicz, 66).

In addition, it seems that some of the mismatches Kenstowicz observes arise from a conflict between the written representation of a word and its actual pronunciation. ‘Tends’ is a good example; it is almost impossible to pronounce the d in tends (I couldn’t do it!) but everyone believes it’s there. This could be a result of the standard spelling, which includes a d (probably because tends is an inflected form of tend). If so, it raises the question of how deeply writing influences speech, or at least our beliefs about what we are saying.

1 comment:

  1. I also find it interesting how we somehow manage to learn all of these unwritten rules of a language. There are all of these little pieces that we never think about and just use despite not knowing why. On a broader scale, there are tons of examples of this, like the rule in English about ordering adjectives in a specific way (opinion, size, age, shape, color, origin, material, purpose, then the noun). While I've never had to think about it in English, I've struggled with a similar concept in French. It's incredible how much we pick up without knowing it, and how much we struggle when it comes to learning another language.
