Saturday, October 15, 2016

Modeling vs Living Language

Andrew Carnie gives readers an accessible introduction to syntax in the first three chapters of his book. He starts by situating the study of syntax within the context of other fields of study in linguistics. Phonology, the study of the sound components of language, comes first, followed by morphology, the study of word formation. Syntax, the study of sentence structure, comes next, one level below semantics, the study of language meaning. The dominant theory of syntax, developed by Noam Chomsky, is that of a generative grammar, in which language is produced according to a set of subconscious rules. Carnie is quick to clarify that “rules” can be a misleading term. People think of grammatical rules as prescriptive, when in reality linguists aren’t focused on “proper” grammar, but rather on descriptive rules and what they can tell us about language and the brain.
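To make the idea of language being “generated” by rules a little more concrete, here is a toy sketch of my own (nothing from Carnie’s book, and far simpler than a real grammar of English): a few rewrite rules that get expanded recursively until only words remain.

```python
import random

# A made-up mini-grammar: each symbol maps to its possible expansions.
GRAMMAR = {
    "S":   [["NP", "VP"]],
    "NP":  [["the", "N"], ["the", "ADJ", "N"]],
    "VP":  [["V", "NP"], ["V"]],
    "N":   [["linguist"], ["sentence"], ["child"]],
    "ADJ": [["small"], ["grammatical"]],
    "V":   [["hears"], ["builds"], ["sleeps"]],
}

def expand(symbol):
    """Recursively rewrite a symbol until only words are left."""
    if symbol not in GRAMMAR:              # a terminal word
        return [symbol]
    production = random.choice(GRAMMAR[symbol])
    words = []
    for part in production:
        words.extend(expand(part))
    return words

print(" ".join(expand("S")))  # e.g. "the small child hears the sentence"
```

Even this tiny rule set produces sentences that nobody wrote down in advance, which is the basic intuition behind a generative grammar, though the rules Carnie describes are descriptive claims about the mind, not a lookup table.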

I’m interested to know whether, by modeling language alone, we will ever truly come to know everything there is to know about language. As discussed, we don’t understand the brain well enough to explain how rare sentences can be judged perfectly grammatical by children who have never heard them before. However, different dialects and grammars exist within different groups of people, and each functions in its own right. Carnie also says, “Many parts of language are built in, or innate” (15). I wonder how scientists come to decide what is innate as opposed to what is learned, and whether we will ever be able to truly isolate all the variables necessary to confirm this, on human subjects, no less.

In reading these chapters, I’m reminded of a lot of other projects and coursework in my life right now. In SymSys1, we have been learning about context-free grammars alongside finite state machines, and about how memory is necessary to generate sentences that actually make sense, because so much of what makes sense relies on a deep reading of context. In CS106B, like a lot of people this week, I made a random writing generator that takes an input file, the Bible or accumulated Eminem lyrics or whatever you give it, and produces random writing in the style of that file. It does this by taking a few words at a time and probabilistically mapping them to the words that commonly follow them in the input file. Although it only uses phrases that actually occur within the input file, the output is frequently nonsensical. I think this project illustrates quite a few of the problems with following rules, rather than modeling actual language, that Carnie is looking to point out.
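For anyone curious, here is a rough sketch of how that kind of generator works. This is my own simplification, not the actual assignment code, and the file name and function names are mine: it builds a table from each n-word chunk to the words that follow it, then walks the table, picking followers at random in proportion to how often they occurred.

```python
import random
from collections import defaultdict

def build_model(words, n=2):
    """Map every n-word chunk to the words that follow it in the input."""
    model = defaultdict(list)
    for i in range(len(words) - n):
        key = tuple(words[i:i + n])
        model[key].append(words[i + n])   # duplicates make common followers likelier
    return model

def generate(model, length=50):
    """Walk the table, repeatedly appending a randomly chosen follower."""
    key = random.choice(list(model.keys()))
    output = list(key)
    for _ in range(length):
        followers = model.get(key)
        if not followers:                 # dead end: this chunk never appears mid-text
            break
        output.append(random.choice(followers))
        key = tuple(output[-len(key):])   # slide the window forward
    return " ".join(output)

# "input.txt" stands in for whatever source text you feed the generator.
words = open("input.txt").read().split()
print(generate(build_model(words, n=2)))
```

Every chunk in the output really does occur somewhere in the source, but because the model only ever looks a couple of words back, nothing holds the sentence together beyond that small window, which is exactly why the results so often fall apart.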

1 comment:

  1. I also wonder how much we will ever know about language. If Carnie is correct in saying that there are parts of language that are merely innate, then I have a hard time believing that we can truly understand everything about language. The relation to n-grams is telling here: although it is relatively easy to generate random text that sounds like language, whether or not that output can actually be deemed "language" is debatable. There are so many rules, as Carnie has noted, that build up even the simplest grammars. So in turn, I wonder, "Is it possible for computers to replicate language, or will they never be able to because they lack an innate piece that humans have?"
