141 - Vocabulary

Recently on Facebook, I found myself taking a vocabulary test. Mixed emotions in play: enjoying being able to access FB at all; nervous, as at any test (what is at risk, but pride?); curious at the form of any such test.

We distinguish between words we read comfortably, or hear without questioning them - received words, receptive vocabulary  - and words which we will speak or write - productive vocabulary. Personally I recognise that the oral and aural vocabularies are significantly smaller than the written ones. In context, words I don’t know well enough to use productively are absorbed with ease. That is not to say one remembers them, of course. When I do a crossword, the result often includes words or names I (this one) didn’t know; this week, Phaedra and raceme; from two weeks ago, houyhnhnm (!), imbedded (not embedded), morel (not moral), natant (should have got that, floating). And I was only doing / attempting the Guardian.

I think my vocabulary is around 30,000 words. One test, http://my.vocabularysize.com/, says 28,800, where I thought perhaps three of my answers to the 100 words were dubious. Jebdu, http://www.jebdu.com/vocab.html, said I was in the 96th percentile, over 37,000 with no errors, and said it again with two errors. http://testyourvocab.com said 40,100 [cenacle? sparge, clerisy, epigone? Am I sure about conflate? prurient, sedulous, captious, adumbrate? (yes to all but the last)]. Clerisy and cenacle I didn’t know and should. ‘Test your vocab’ is an excellent site and explains what it does, how it does it and shares the statistical results.

Jebdu uses fonqa to see if you recognise a word not in the dictionary. All sorts of people have looked this up (and I’ll bet a few did so while doing the test, cheats). I knew the other three words (luckily), but fonqa now has a meaning: ‘an unknown word’. The Urban Dictionary2, fun that it is, says:

Cromulent was a fonqa until it got popular, now it's in all the dictionaries.

The Oxford Dictionaries added these three fonqas in 2014: ‘Hot Mess,’ ‘Side Boob,’ ‘Throw Shade’


If you are interested in how the counting is done, read the nitty gritty details. Short form, with detail cut out:

There is a wonderful resource, the British National Corpus, about 10^8 (100 million) words. This has word frequency, an important feature. Subtract derivations (-ly), inflected forms (-s, -ed), place names, people’s names, gibberish. Adjust the frequency count to reflect those changes. Rank by frequency. This leaves about 45,000 words, surprisingly. It turns out that the rest of the dictionary is mainly either scientific or archaic terms, or rare but easy put-together words like "unrivaled." And the non-put-together words above 35,000 or so are, let us tell you, hard. Having got the ranking, the question set is then chosen in ways that remove deducible words, combination words (phrases), cognates and words limited by nation [canny, I thought, outwith the dictionary].
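To make the "subtract, adjust, rank" step concrete, here is a minimal sketch in Python. It is my own illustration and not the site's code: the suffix list, the function names and the tiny sample are all assumptions, and the suffix stripping is deliberately crude (a real lemmatiser would do far better).

```python
from collections import Counter

# Hypothetical, deliberately crude suffix list (an assumption for this sketch).
SUFFIXES = ("ly", "ed", "s")


def base_form(word, vocabulary):
    """Strip one suffix, but only if what remains is itself a word we have seen."""
    for suffix in SUFFIXES:
        stem = word[: -len(suffix)]
        if word.endswith(suffix) and stem in vocabulary:
            return stem
    return word


def rank_by_frequency(tokens, proper_names=frozenset()):
    """Drop names, fold derived and inflected forms into their base, rank by count."""
    tokens = [t.lower() for t in tokens if t not in proper_names]
    vocabulary = set(tokens)
    # Folding 'walked' into 'walk' adds its count to the base word's count.
    counts = Counter(base_form(t, vocabulary) for t in tokens)
    return [word for word, _ in counts.most_common()]


sample = "the cat walked the cats walk quickly quick London the".split()
print(rank_by_frequency(sample, proper_names={"London"}))
```

On this toy sample, 'walked', 'cats' and 'quickly' are folded into 'walk', 'cat' and 'quick', and 'London' is dropped, so only four base words are left to rank.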

Then the (logarithmic) scale is set and with the now further limited set of words, each word you don’t ‘know’ is balanced by later words you do know. The position of balance fore and aft is where your vocabulary is deemed to lie.

So what we do is to test vocabulary in two steps. In the first, we pick around 40 words, stretching from the easiest to hardest words in English. This gives us a general idea of your vocabulary level. We then present a second narrower set of words, sorted by frequency, in a range where we think you'll know all the initial ones, none of the final ones, but have a wide mix of both in the middle. By testing you in this narrower range, we can come up with a quite accurate vocabulary estimate for people of any level.

To understand how we come up with the exact number at the end, let's start with an analogy. Imagine you have the whole dictionary of 45,000+ words, with words arranged in order from most-common to least-common, and you mark all the words you know. At the end, you go back, and discover that at exactly word #15,000, there are 2,000 words that came earlier (more common words) which you didn't know. And at word #15,000, there are 2,000 words which come afterwards (less common words) which you do know. The 2,000 after which you do know cancel out the 2,000 before you don't, and in the end it means you know 15,000 words.
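The balance-point analogy can be sketched in a few lines of Python. This is my own illustration of the analogy only, not the site's estimator (which works from a sample of words, not the whole dictionary); the function name and the scaled-down 20-word "dictionary" are assumptions.

```python
def balance_point(known):
    """known[i] is True if you know the word at frequency rank i + 1.

    Return the rank at which the unknown words at or before it
    equal the known words after it.
    """
    unknown_before = 0
    known_after = sum(known)  # at rank 0 every known word still lies ahead
    if unknown_before == known_after:
        return 0  # nothing known at all
    for rank, is_known in enumerate(known, start=1):
        if is_known:
            known_after -= 1
        else:
            unknown_before += 1
        if unknown_before == known_after:
            return rank
    raise AssertionError("unreachable: the two counts always meet")


# Scaled-down version of the analogy: 20 words instead of 45,000+.
# Two unknown words fall at or before rank 15, two known words fall after it.
marks = [True] * 13 + [False] * 2 + [True] * 2 + [False] * 3
print(balance_point(marks))  # 15
print(sum(marks))            # 15
```

The two printed numbers agree, and always will: each step along the list closes the gap between the two counts by exactly one, so the balance falls at the rank equal to the number of words marked. That is why the cancelling-out in the analogy gives the right answer.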


Clearly it is far easier to test receptive vocabulary than productive vocabulary. Clearly your vocabulary may include a load of words specialised to your business (or not) - these are left out, which will reduce the rating for medics, perhaps unfairly.

Vocabulary varies with age and again I refer you to testyourvocab, showing a graph that steeply climbs to the point where you cease education, climbs more slowly to about forty and sinks a bit on retirement. There is less decay among those with bigger vocabularies, who also live longer.1


Vocabulary size depends most, it seems, upon reading habits “between 4 and 15”, or in my mind, up to 15. Factoids from the same site ../blog tell me: we learn about a word a day, but 2.5 times that if foreign; that foreigners will have a vocabulary over 10,000 if living abroad (i.e. in an English-speaking country they match an eight-year-old native); that kids who read ‘lots’ pick up four new words a day.


Since I’m in my second childhood, allegedly, today’s four words are curmudgeon, clerisy, cortinate and cortile but not curtilage, cortège, cortina or cortisol.


DJS 20141020



1 Now there’s a wicked idea: read more, live longer. For reading lots enlarges your vocabulary, and reading fiction more so than non-fiction (surprise; could that be because the technical words are generally missing from the test?). What really happens is that the data gets ‘noisy’.


2 Read the Urban Dictionary for all those words you don’t know. To quote desbuckingham,

You know you're getting old when you have to look up the meaning of a word on urban dictionary.

I looked up Scoin, singular member of the family and got

A road game similar to punchbuggy, based on the Scions. Upon the sighting of a Scion, one must yell the mispronounced "scoin" and purple nurple someone else in the car. Calling a White Scoin arms you with 2 nurples.

Urngh?


Top picture from sodahead.com

© David Scoins 2017