Saturday, October 8, 2016

Phonetics, Phonemes, and Artificial Language

            The Gussenhoven and Kenstowicz readings laid the foundations of phonetics and English phonology, respectively. The two readings complement each other well: Gussenhoven provides a handy reference for understanding most of the material in Kenstowicz, and Kenstowicz provides an opportunity to consider the uses and the reality of the material Gussenhoven presents.

In his chapter, Gussenhoven systematically outlines the mechanics and the structures the human body uses to produce the sounds of speech. He describes the different parts of the vocal tract and larynx, and how constrictions at various points form different sounds. One point I found particularly interesting is the author's observation that most of the structures humans use to produce speech have other primary biological purposes. I had never before considered that the mechanics of speech are built on a work-with-what-you-already-have foundation, but it now seems almost obvious. After all, one has to be able to breathe and eat before one can speak.

Kenstowicz's paper focuses on phonemes and the rules governing pronunciation in American English and other languages, and asks why languages form the rules they do. This reading was admittedly denser than the first, but the focus on [t] and the many examples provided made it easier to follow along and sound things out. In addition, referencing the Gussenhoven reading for vocabulary and visualization was invaluable. One point I found particularly fascinating was how unaware a native speaker of English is of the allophones of the phoneme /t/, and of the other manifestations of different phonemes.
One topic I considered while reading these two papers was the artificial production of language. Are allophones and the rules governing how phonemes are realized what make Google, Siri, and other digital companions sound robotic, or even incorrect, in their pronunciation of words, especially unfamiliar words or place names? Do the linguists who work on these voices narrowly transcribe words for the computer to pronounce, and then just have the computer do its best on words it is unfamiliar with? Or do they give the computers rules like the ones Kenstowicz discusses and have the computers apply them as best they can? At least the linguists don't have to worry about constructing vocal tracts for these machines and can simply record human voices.

2 comments:

  1. Ethan, I find your note about artificial language production really interesting – I actually didn't fully comprehend the significance of these readings in relation to Siri and Google until I read your post! I did consider speech recognition (and the need for context to differentiate words such as "writer" and "rider"), but didn't think about how hard it can be to replicate vocal production itself. I wonder if it is possible to program a "virtual vocal tract" that could copy the complex series of anatomical movements we execute to say words. Perhaps Google is already working on this, but it would certainly be interesting!

  2. I'm also interested in your point about the artificial production of language! After some research, I discovered that Siri's voice was actually recorded by Susan Bennett (an actual person), so it seems the pronunciation of specific words isn't what makes Siri sound robotic. Perhaps this artificial quality has to do with the context the words appear in (questions vs. statements, for example) or the tone in which they are spoken? Alternatively, maybe the words are too perfectly articulated, giving Siri a stiff, robotic sound.
