The Gussenhoven and
Kenstowicz readings laid foundations in phonetics and English phonology,
respectively, and the two were quite
complementary. Gussenhoven provides a handy reference for most of the
material in Kenstowicz, and Kenstowicz provides an opportunity to
consider the uses, and the reality, of the material Gussenhoven presents.
In
his chapter, Gussenhoven systematically outlines the mechanics and structures the
human body uses to produce the sounds of speech. He writes about the different parts of the vocal
tract and larynx, and how constrictions form different sounds. One point I found particularly interesting is
the author's observation that most of the structures humans use to produce speech
have other primary biological purposes. I had never before contemplated that
the mechanics of speech were built on a work-with-what-you-already-have foundation,
but it now seems almost obvious. After all, one has to be able to eat and
breathe before one can talk.
Kenstowicz's paper focuses on phonemes and the rules governing pronunciation in
American English and other languages, and asks why languages
form the rules they do. This reading was
admittedly denser than the first, but the focus on /t/ and the many examples
provided made it easier to follow along and sound things out. In
addition, referring back to the Gussenhoven reading for vocabulary and visualization was
invaluable. A point I found
particularly fascinating was how unconscious a native speaker of English is
of the allophones of the phoneme /t/ (the aspirated [tʰ] of top versus the flap [ɾ] of butter, for instance), and of the different realizations of other
phonemes.
One
topic I considered while reading these two papers was the artificial
production of language. Are allophones and the rules governing the realization of
phonemes what make Google, Siri, and other voice assistants sound robotic or even
incorrect in their pronunciation of words, especially unfamiliar ones
or the names of places? Do the linguists who work on these voices narrowly transcribe words for the computer to
pronounce, and then just have the computer do its best on words it is unfamiliar
with? Or do they give the computers rules
like the ones that Kenstowicz discusses and have the computers apply them to the
best of their abilities? At least the linguists don't have to worry about constructing vocal tracts for these machines and can simply record human voices.
Ethan, I find your note about artificial language production really interesting – I actually didn't fully grasp the significance of these readings in relation to Siri and Google until I read your post! I did consider speech recognition (and the need for context to differentiate words such as "writer" and "rider"), but I didn't think about how hard it can be to replicate vocal production itself. I wonder if it is possible to program a "virtual vocal tract" that could mimic the complex series of anatomical maneuvers we execute to say words. Perhaps Google is already working on this, but it would certainly be interesting!
I'm also interested in your point about artificial production of language! After some research, I discovered that Siri's voice was actually recorded by Susan Bennett (a real person), so it seems the pronunciation of specific words isn't what makes Siri sound robotic. Perhaps this artificial quality has to do with the context the words appear in (questions vs. statements, for example) or the tone in which they are spoken? Alternatively, maybe the words are too perfectly articulated, giving Siri a stiff, robotic sound quality.