Wednesday, July 8, 2009

Words Count. But Why Count Words?

Million Word March?? I’ve just slogged through what feels like a million words written about the “Million Word March.” Words vigorously promoting the “March” battling more words roundly panning it. What’s causing all that vehement verbiage? An ongoing project mounted by Global Language Monitor (www.languagemonitor.com). GLM self-describes as a site that:

“… documents, analyzes and tracks trends in language the world over, with a particular emphasis upon Global English.”

The MWM project claims to have an up-to-the-minute count of words in the English language. For years, GLM predicted the date when the millionth word would meet the entrance requirements (criteria set by GLM), pushing its predictions later and later and grabbing some media attention each time. Finally, this June, the magic moment arrived! To be precise: June 10, 2009 at 5:22 am Eastern Time. Too bad most of us were sleeping.

Lots of media covered it, but by now the focus had morphed: the controversy the project had sparked among linguists had become as newsworthy as the GLM-hyped lexical (non)-event.

CNN reported (www.cnn.com/2009/TECH/06/09/million.words) that GLM’s founder, Paul J.J. Payack, had become “somewhat of a pariah in the linguistic community.” That’s putting it mildly.

Back in 2006, Jesse Sheidlower, editor at Oxford University Press in New York, wrote in Slate (www.slate.com/id/2139611/#Return) that Payack’s claims were “suckering even the respectable press.” By 2009, New York Times reporter Jennifer Schuessler (June 14, “Keeping it Real on Dictionary Row”) wrote, “It’s hard to find scholars who react with anything less than blunt outrage at the headline-garnering ‘Million-Word March.’” Some reactions, wrote Schuessler, were unprintable. Printable ones included “it’s bushwa, fraud, hokum … a sham… hoax....”

A Bonanza! Like ambulance-chasers at a gruesome car wreck, sociologists typically race to get material from a sharp controversy, especially if it erupts in a usually staid community. What could be more staid than the community of dictionary scholars (“lexicographers”, to the cognoscenti)?

But the sociologists haven’t clustered around this sometimes nasty debate, basically because so few of them study language issues – and fewer still focus on lexicography. Yet a bonanza of issues for sociology of language, and of culture, lurks here. A few:

· Why is counting words newsworthy in the first place?

· How and why does word counting happen regularly in the dictionary world? How does that relate to thing-counting in many parts of our culture?

· Zeroing in on the controversy, what is at stake that gets so many people so riled up? [“Zeroing in” pun intended!]

Too much, in fact, to tackle in one post. I’ll just circle around the broadest question:

· Why the fascination with counting words?

Give or take 3 million? About three decades before Payack launched his maligned “March,” a scholar appropriately named Read estimated about 4 million words in English. Sidney Landau cites that tidbit in his respected book: Dictionaries: The Art and Craft of Lexicography (2nd ed., 2001, Cambridge U. Press). But Landau uses it only to pronounce the effort futile: “Read’s was as good a guess as any, but even so it is not very meaningful.” (Landau, p.28)

So, why do linguists even bother? Reasons vary from, at one extreme, nationalistic pride resting on questionable theories that languages with more words support more sophisticated thinking, to a more modest aim that Landau proposes. For his study, even a very approximate answer could be useful as a base to figure, roughly, what percentage of a language (in this case, English) is covered by its most complete dictionaries. The answer: No one knows, beyond the certainty that even an “unabridged dictionary” is nowhere near “complete.”

So, forget about the denominator, i.e., number of words in English. But what about the numerator – the word count in any given dictionary? That’s a highly competitive game they all seem to play, even though most of the reasons linguists gave to belittle Payack’s claimed count also apply within the clearer boundaries of a dictionary.

Consider verbs: Should you count each form separately? For example: Is the dictionary entry for “count” one word? Or should it be four, by also including: “counts,” “counted,” and “counting”? There are an awful lot of verbs, so your decision would have major impact.

Furthermore, like many words, “count” is polysemous (that’s jargon for “having multiple meanings”). So, maybe “count” should be counted three times: once for its meaning as a verb, to enumerate; another time for its meaning as a noun, i.e., the result of enumerating; and yet again as a noun that refers to a title of nobility? And how should we treat different “senses” of a word, such as the sense of “count (enumerating)” which means the non-numeric value placed on something, as in “In a democratic society, your opinion counts”?

Those are just a few of the complexities!
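To see how much the choice of convention matters, here is a rough sketch in Python. Everything in it is my own assumption for illustration: the five-entry toy lexicon, the `Sense` record and the `count_words` helper are hypothetical, not any dictionary publisher's actual method or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sense:
    """One meaning of a headword (a hypothetical record, for illustration)."""
    headword: str
    part_of_speech: str
    gloss: str
    forms: tuple  # inflected forms, including the base form


# Toy lexicon: invented entries, not taken from any real dictionary.
LEXICON = [
    Sense("count", "verb", "to enumerate",
          ("count", "counts", "counted", "counting")),
    Sense("count", "verb", "to matter",
          ("count", "counts", "counted", "counting")),
    Sense("count", "noun", "the result of enumerating", ("count", "counts")),
    Sense("count", "noun", "a title of nobility", ("count", "counts")),
    Sense("word", "noun", "a unit of language", ("word", "words")),
]


def count_words(lexicon):
    """Tally the same lexicon under four different counting conventions."""
    headwords = {s.headword for s in lexicon}
    headword_pos = {(s.headword, s.part_of_speech) for s in lexicon}
    spellings = {form for s in lexicon for form in s.forms}
    return {
        "headwords": len(headwords),                # one per dictionary entry
        "headword_pos_pairs": len(headword_pos),    # verb and noun counted apart
        "senses": len(lexicon),                     # every meaning counted
        "distinct_spellings": len(spellings),       # every inflected form counted
    }


if __name__ == "__main__":
    for convention, total in count_words(LEXICON).items():
        print(f"{convention}: {total}")
```

The same five entries come out as 2, 3, 5 or 6 “words,” depending only on which convention you pick. Scale that spread up to a full dictionary and two advertised totals stop being comparable unless both publishers counted the same way.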

And yet, the same dictionary publishers whose word-mavens smartly itemize the pitfalls that keep such counts from being reliably compared to each other typically feature just such counts on their book covers or their websites. Inconsistent? Not really, because it’s not the lexicographers sullying their purity; it’s the marketing departments doing their thing.

More Words for Your Money!! And that moves us into another topic worth probing: the struggle between the “professional and scientific” (or “the art and craft”) model of creating dictionary content versus the practical matter of financially supporting that expensive habit – after all, flashy word counts can pull in purchasers.

But that’s for another time – you’ve already gotten 907 words – totally free! (Now, 912).
