Sunday, October 23, 2016

Digital humanities and linguistics

Atkins and Levin took advantage of advances in technology to conduct linguistic analysis on massive sets of data from an electronic corpus. This reminds me of an introductory seminar I took last year on the digital humanities, in which we were able to examine texts from across dozens of centuries, authors, languages, etc., and begin to see patterns emerge. Using these techniques, we explored what these literary texts could tell us about changes in gender roles, artistic movements, and other societal aspects of interest during a particular time period. I was interested to see an electronic corpus put to use within linguistics in this reading, as Atkins and Levin examined the pattern of internal versus external causation behind the syntactic differences among the “shake” verbs. The “apparent idiosyncratic behavior of certain verbs” such as these, once stripped of the rich context that large sets of data provide, helps explain why English learners have a hard time understanding the subtleties in usage between near-synonyms, especially when learning material is presented as a limited lexical entry in widely varying dictionaries.


Haspelmath gives an interesting account of the differences between lexemes and word-forms, and goes on to describe compounds as often having hierarchical structures, much as syntactic phrases do. The distinction we make between the lexeme, an abstract entity that serves as the most basic “dictionary word” we have in our heads, and the considerably more abundant word-forms that exist concretely in our texts is an impressive feature of how we as speakers and writers have found a relatively systematic way of categorizing and decomposing language: we introduce levels of abstraction to (somewhat) neatly catalog and keep track of our complex languages. The hierarchical structure revealed later in the readings, then, is not a surprise; still, recognizing it should evoke admiration for the wonders of human language and the hard work linguists do. As Slobin and the frog-story project point out, there is still much to be studied, across a multitude of mediums, in the field of linguistics. The frog stories bring out the intricacies of the world’s languages, particularly in their motion verbs, and the difficulties of translation that arise because languages develop in different cultural contexts. This links back to our study last week of obstacles in text-to-text translation, and shows that it is not just correct syntactic rules and grammars but also semantic subtleties that pose challenges to the field.

1 comment:

  1. Love your highlight of the intersection between tech and linguistics. Being able to use an electronic corpus really expands the resources that are available to linguists; however, as the reading alludes to, I can imagine the many subtle difficulties that arise from having all of this lexical data in database format.
