Sunday, October 16, 2016

A Universal Grammar (Even For Computers!)

Andrew Carnie's book provides a comprehensive and fascinating introduction to the study of syntax and generative grammars. In the first chapter, Carnie defines many of the key terms of interest, then presents the ways in which the study of syntax is a science, one in which you generate hypotheses, test them against data, and then refine your rules based on the results of those tests. Following this, Carnie gets into what I was most interested in -- the concept of a Universal Grammar (UG).

The concept of a UG comes from the belief that language is innate, a stance championed by Noam Chomsky and now widely accepted. However, I had never fully considered why such a model was necessary until I read the section on the "logical problem of language acquisition." It isn't logical for humans to work out the rules of a language purely from the sentences they happen to hear, since no one hears every legal sentence of their language -- for all we know at the time, there could be some sentence we haven't heard that violates the rules we've learned. UG compensates for this by providing a "blueprint," to use Carnie's term, of the rules a human language can have, one that is innate in the human mind and allows us to both recognize and produce sentences we've never heard before.

Carnie continues to expand on this idea in chapters 2 and 3 -- chapter 2 explores how individual words carry syntactic rules within them, while chapter 3 introduces the fascinating concept of grammar trees. Grammar trees use the grammatical information encoded in words (discussed in depth in the second chapter) to build a model of the structure of an entire sentence. I was introduced to grammar trees (as well as the classification rules introduced in chapter 2) last year in the computational linguistics class Syntactic Theory and Implementation (SYMSYS 184) -- they were the basis of getting our program to recognize legal English sentences.
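
To make that idea concrete, here is a minimal sketch of the general approach -- not the system we built in class, and the grammar and category names are just toy assumptions -- showing how a handful of phrase-structure rules can be used to check whether a sentence is legal and to draw its tree. It uses Python with the NLTK library, where a small context-free grammar and a chart parser are enough for a sentence like the one mentioned below.

```python
import nltk

# A toy context-free grammar: each rule says how a phrase can be built
# from smaller phrases or from individual words (the quoted terminals).
grammar = nltk.CFG.fromstring("""
    S    -> NP VP
    NP   -> Det N | NP Conj NP
    VP   -> V PP
    PP   -> P NP
    Det  -> 'the'
    N    -> 'dog' | 'cat' | 'store'
    V    -> 'went'
    P    -> 'to'
    Conj -> 'and'
""")

parser = nltk.ChartParser(grammar)
tokens = "the dog and the cat went to the store".split()

# If the sentence is licensed by the grammar, the parser yields one or
# more trees showing its structure; if not, it yields nothing at all.
for tree in parser.parse(tokens):
    tree.pretty_print()
```

Running this prints a tree with the coordinated subject "the dog and the cat" grouped under one NP and "to the store" grouped under a PP inside the verb phrase -- essentially the grammar trees from chapter 3, drawn by a program. The hard part, of course, is that real English needs vastly more rules than these nine.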

That class, however, also brought into sharp focus the difficulty of getting a computer to recognize the intricacies of English sentences, to understand this Universal Grammar. It happens, of course, in so many apps we use today -- but I certainly appreciate the difficulty of it, after ten weeks of writing hundreds of lines of code, with the upshot being a program that understood sentences not much more complicated than "the dog and the cat went to the store." How computer programs can internalize the intricacies of the UG while also remaining efficient is something I'm definitely interested in understanding further.
