A blog for fans of Bananagrams, word games, puzzles, and amazing things
Showing posts with label words. Show all posts

Sunday, December 15, 2013

The amazing ENABLE word-list project

While looking for a good word list to use for a project that I am working on, I discovered ENABLE (which stands for Enhanced North American Benchmark Lexicon), a word list that seems to have been compiled mainly by Alan Beale (with some help from Mendel Cooper) in order to create a reference that can be used when playing word games. Since it is an open and freely available list, it has served as the basis for the word lists used in many games, such as Words with Friends. What distinguishes this word list from the many others out there is how thoroughly its creation has been documented in the many files in the ENABLE package and its supplemental archive.

For this reason, many of the disadvantages of the Scrabble Tournament Word List can be eliminated. For instance, as the compilers themselves note:

In contrast to other word lists, the ENABLE list has not been crippled by being limited to words under an arbitrary length. The ENABLE list is eminently suitable for most word games, such as Anagrams and Clabbers, and for crossword puzzle solving, rather than just for Scrabble. A great deal of research has gone into removing this limitation, however the list is much the better for it.
Another critique of the Scrabble Word Lists and Dictionaries is that they are carrying around many words that were in dictionaries back in the 1970s but have long since disappeared from both usage and lexicons. The ENABLE supplement includes a list of 9,768 stale words (which it defines as words that appear in the Scrabble Tournament Word List but not in modern dictionaries).
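
That definition of a stale word is just a set difference, which can be sketched in a few lines of Python. This is only an illustration: the two tiny word sets and the function name are made up for the example, and the real ENABLE supplement was compiled with far more care than this.

```python
def find_stale_words(tournament_words, modern_words):
    """Return words in the tournament list that no modern dictionary has."""
    # Set difference: keep what is in the first set but not in the second.
    return sorted(set(tournament_words) - set(modern_words))


# Hypothetical miniature word lists, for illustration only.
tournament_list = {"AXAL", "CAT", "DOG", "WHERVE"}
modern_dictionary = {"AXIAL", "CAT", "DOG"}

stale = find_stale_words(tournament_list, modern_dictionary)
print(stale)
```

With real data, each set would be read from a word-list file, one word per line.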

Most of these stale words (like AXAL (an obsolete form of "axial") and WHERVE ("a round piece of wood put on a spindle to receive the thread")) were words I had never heard of and therefore had no problem eliminating from the word list for my project. There were also some words that I thought should be retained because they are in common usage, including SPELUNK/SPELUNKED/SPELUNKING (which, according to the Google Books Ngram Viewer, have been used with increasing frequency since about the 1940s) and UPSTANDING (which peaked in popularity in the 1920s, reached a local minimum around 1970, but has been on the upswing since 1990).

This is only a sampling of what makes ENABLE so useful. Amateur lexicographers and other interested parties can find and download the whole ENABLE package through this page.

Tuesday, December 4, 2012

Fun with collective nouns

In England in the Middle Ages, the common practice of hunting led people to coin terms for groups of animals that were specific to each sort of animal (such as a gaggle of geese or a pride of lions). These specialized collective nouns are therefore called "terms of venery" (where "venery" is another word for hunting). In the 14th and 15th centuries, this became a full-fledged fad, with silly terms being coined just for fun and with the process being extended from animals to groups of people.

These terms are still being concocted today. For those of us who like such neologisms, there is now a site dedicated to them: All Sorts (subtitled "a linguistic experiment").

A good place to start is the list of collective nouns sorted by popularity.

Some of my favorites are:

  • a seemingly empty room of ninjas
  • a brace of orthodontists
  • a hush of librarians
  • a _____ of mime artists
  • a heard of homonyms
  • a winter of discount tents
  • a clutch of handbags
  • a knot of string theorists
  • an array of programmers
  • a herd of eavesdroppers

The way the site works is that it catalogs whatever suggestions people make on Twitter (when they use the hashtag "#collectivenouns").

Naturally, I couldn't help but suggest a few:

  • a closet of skeletons
  • a clattering of abacuses
  • an epiphany of light bulbs
  • an ink cloud of octopuses
  • a curiosity of question marks
The All Sorts project is a great concept because it clearly demonstrates the frivolity and playfulness of minting new collective nouns. Try it. It's fun!

Sunday, January 15, 2012

The Deeper Meaning of Liff

While cleaning out my closet, I came across The Deeper Meaning of Liff by Douglas Adams and John Lloyd. The preface to the original, unexpanded version of this book (The Meaning of Liff) read:
In Life* there are many hundreds of common experiences, feelings, situations and even objects which we all know and recognize, but for which no word exists. On the other hand, the world is littered with thousands of spare words which spend their time doing nothing but loafing about on signposts pointing at places. Our job, as we see it, is to get these words down off the signposts and into the mouths of babes and sucklings and so on, where they can start earning their keep in everyday conversation and make a more positive contribution to society.

* And, indeed, in Liff.

Lloyd had helped Adams on the original Hitchhiker's Guide to the Galaxy radio scripts, and while visiting Greece, where Adams was supposed to be writing the novelization, they wound up playing a game that Douglas adapted from an English class exercise. As related in Neil Gaiman's Don't Panic: The Official Hitchhiker's Guide to the Galaxy (also from my closet),
...someone would say the name of a town, and someone else would say what it meant. [...] As John Lloyd explained [...] "Near the end of the holiday, I started writing them down, not having very much else to do. By the end of the holiday, we had about twenty of these things, some of the best ones in The Meaning of Liff, like 'Ely' — the first, tiniest inkling that something, somewhere, has gone terribly wrong."

Here are some of my favorite words from the book:

Ahenny (ah-HEN-nee) adj.
The way people stand when examining other people's bookshelves.

Ballycumber (ba-li-KUM-ber) n.
One of the six half-read books lying somewhere in your bed.

Boolteens (BOOL-teenz) pl. n.
The small scattering of foreign coins and halfpennies which inhabit dressing tables. Since they are never used and never thrown away boolteens account for a significant drain on the world's money supply.

Dalmilling (dal-MILL-ing) ptcpl. vb.
Continually making small talk to someone who is trying to read a book.

Delaware (DEL-a-wair) n.
The hideous stuff on the shelves of a rented house.

Duddo (DUD-doh) n.
The most deformed potato in any given collection of potatoes.

Dufton (DUF-tn) n.
The last page of a document that you always leave face down in the photocopier and have to go and retrieve later.

Farnham (FAR-num) n.
The feeling you get at about four o'clock in the afternoon when you haven't got enough done.

Ferfer (FER-fer) n.
One who is very excited that they've had a better idea than the one you've just suggested.

Frating Green (FRAY-ting GREEN) adj.
The shade of green which is supposed to make you feel comfortable in hospitals, industrious in schools and uneasy in police stations.

Fulking (FUL-king) ptcpl. vb.
Pretending not to be in when the carol-singers come round.

Hewish (HEW-ish) adj.
In a mood to swipe at vegetation with a stick.

Hoggeston (HOG-us-tn) n.
The act of overshaking a pair of dice in a cup in the mistaken belief that this will affect the eventual outcome in your favor and not irritate everyone else.

Kabwum (KAB-wum) n.
The cutesy humming noise you make as you go to kiss someone on the cheek.

Kent (kent) adj.
Politely determined not to help despite a violent urge to the contrary. Kent expressions are seen on the faces of people who are good at something watching someone else who can't do it at all.

Kentucky (ken-TUK-ee) adj.
Fitting exactly and satisfyingly. The cardboard box that slides neatly into an exact space in a garage, or the last book which exactly fills a bookshelf, is said to fit 'real nice and kentucky'.

Liff (lif) n.
A common object or experience for which no word yet exists.

Millinocket (MIL-in-ok-et) n.
The thing that rattles around inside an aerosol can.

Nacton (NAK-ton) n.
The 'n' with which cheap advertising copywriters replace the word 'and' (as in 'fish 'n' chips', 'mix 'n' match', 'assault 'n' battery'), in the mistaken belief that this is in some way chummy or endearing.

Plymouth (PLIM-uth) vb.
To relate an amusing story to someone without remembering that it was they who told it to you in the first place.

Quoyness (KWOY-nes) n.
The hatefulness of words like relionus and KopyKwik.

Rochester (RO-ches-ter) n.
One who is able to gain occupation of the armrests on both sides of their cinema or aircraft seat.

Scethrog (SKETH-rog) n.
One of those peculiar beards-without-moustaches worn by religious Belgians and American scientists which help them look like trolls.

Thrupp (THRUP) vb.
To hold a ruler on one end on a desk and make the other end go bbddbbddbbrrbrrrrddrr.

Woking (WOH-king) n.
Standing in the kitchen wondering what you came in here for.


There ought to be a word for a book that you've never fully read and haven't looked at in years, but suddenly can't bear to part with. Inspired by Douglas Adams, I've decided to call it a "Spennymoor".

Friday, December 16, 2011

Impossible objects. Impossible words.


Impossible objects are drawings of apparently three-dimensional objects which look correct when their individual parts are examined, but when you look at the object as a whole, it turns out to be not realizable. One of the most famous examples was created by D. H. Schuster and published in a psychology journal in 1964. The paper was titled "A New Ambiguous Figure: A Three-Stick Clevis", and said figure looks like this:


As emphasized by the colored background, the top of the object resembles the upper part of a stirrup (which has the basic form of what is known as a "clevis") and the bottom of the object looks like three parallel rods. Somewhere in between lies the ambiguity that destroys the three-dimensionality. Martin Gardner referred to such drawings as "undecidable figures".

The three-stick clevis has since gone by many other names: blivet, devil's tuning fork, widget, and poiuyt.

Mad Magazine used "poiuyt" as the name for the above blivet when they featured it on their March 1965 cover. The difficulty you may be having in deciding how to pronounce "poiuyt" is due to its unusual origin. The word "QWERTY" was formed by starting from the left side of the top row of a typewriter and taking the first six letters. Applying the same technique to the other end of the keyboard you get the looking glass version of "QWERTY"... "poiuyt".
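
For the programmers in the audience, the keyboard trick can be sketched in Python. The function names are my own, and the top row is written out by hand as an assumption about a standard QWERTY layout.

```python
TOP_ROW = "qwertyuiop"  # top letter row of a standard QWERTY keyboard


def from_left(row, n=6):
    """Take the first n letters, reading left to right."""
    return row[:n]


def from_right(row, n=6):
    """Take the first n letters, reading right to left."""
    return row[::-1][:n]


print(from_left(TOP_ROW))
print(from_right(TOP_ROW))
```

Reading from the left gives "qwerty"; reading from the right gives "poiuyt".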

Just as perspective drawing was invented to allow us to make two-dimensional depictions of three-dimensional things, spelling was invented to allow transcription of spoken language. And just as we can draw objects that are logically inconsistent, so can we write combinations of letters that correspond to no spoken word.

"Poiuyt" has no apparent, standard, or authoritative pronunciation. Dictionaries ignore it. If the reader will indulge me, I will nominate it as our first impossible word.


Another candidate for impossibility is "balge" (as in "balge yellow"), a term that is listed in Merriam-Webster's Third New International Dictionary as having no known pronunciation and no known origin (as if it spontaneously generated on a piece of paper on some lexicographer's desk). "Balge yellow" has been defined as "a brilliant yellow color" and "sunflower yellow", so at least that part of its wordhood is known.

A 1976 survey of color names by the National Bureau of Standards identified balge yellow as the color pictured here, which also goes by names such as "jonquil" and "Naples yellow". These redundant names may explain why use of "balge" ended.

Even though no one seems to know how to pronounce "balge", it doesn't feel undecidable in the way that "poiuyt" does, probably due to the latter's discombobulating four consecutive vowels.

My third nomination for impossible word is YHWH, which is the English transliteration of the Hebrew word יהוה. Controversy surrounds this word. It is used throughout the original Hebrew texts of the Old Testament as the primary name for God. Some pronounce it as "Jehovah" or "Yahweh". Since ancient Hebrew was written without vowels, the correct pronunciation of יהוה is not known. There is a strong taboo against speaking this name in Judaism, so it may be that whatever correct pronunciation might have existed has disappeared due to lack of use. Some believe that the pronunciation is a secret preserved by only a few people in each generation. What I like best about this word is that there is another name you can use when talking about it: "the Tetragrammaton" (from the Greek for "having four letters"). The undecidability of YHWH's pronunciation is in an entirely different class than that of poiuyt, but maybe a property of impossible words is that they are all impossible in their own ways. This one seems to be more of an arms-crossed, exasperated "Tetragrammaton, you're impossible!" way.

I feel obligated to mention one word that I thought would be impossible but has turned out not to be: Mxyzptlk. Mister Mxyzptlk is a mischievous prank-playing imp from the fifth dimension who occasionally visits Earth to wreak havoc until Superman deals with him. The gimmick was that the only way to send Mister Mxyzptlk back home was to trick him into saying his name backwards.

While Mxyzptlk has been pronounced in a variety of ways throughout the years, allegedly the DC Comics editor gave an authoritative pronunciation early on: "mix-yez-PIT-elick". But I suppose that one could claim that it is in Mxyzptlk's trickster nature that the pronunciation of his name refuses to be nailed down.


Then there are heteronyms which are words that are spelled the same way but pronounced differently:
"bass" can rhyme with "glass" or "space".

"wind" can be pronounced with a short I (like the thing that blows) or a long I (the verb that describes forming a ball of yarn).
They're better classified as ambiguous than outright impossible.

As heteronyms change pronunciation based on the context they are used in, they are analogous to the Necker Cube:


Rather than representing a figure that has no sensible three-dimensional realization, the Necker cube confounds the viewer because it has more than one realization. Most people initially see it as a wire-frame cube, viewed from above, with the lower-left square as the front face. After studying the figure for some time, it may seem to suddenly shift to a cube seen from below with the upper-right square as the front. I find that I can switch between the viewpoints by focusing on a face that appears to be at the back of the cube, which seems to cause it to pull forward.


The impossible word is an exceedingly rare thing because we tend to make up pronunciations for words, even if we have to break the laws of phonics. (Doing so yields ultraphonic words (words outside the range of normal phonics), such as Big Bird's pronunciation of ABCDEFGHIJKLMNOPQRSTUVWXYZ as a single long word by sneaking vowel sounds into strings of consonants like JKLMN.)


Spelling a word is a reductive, lossy process. Accents, tones, and sarcasm are all generally omitted. English orthography, in particular, requires collapsing the full spoken word into a few characters, introducing considerable ambiguity, but from this ambiguity are born many good things, like puns and poiuyts.



Further reading:
  • On balge: According to an 1875 Bulletin from the National Association of Wool Manufacturers, balge yellow was "generally employed on cassimere for vestings". Google Books also has the recipe for dyeing wool balge yellow.

A word about the very cool font used for the IMPOSSIBLE graphic above: ISOSIBILIA typography, designed by Rodrigo Fuenzalida for Neo2.

Tuesday, September 20, 2011

Why words are the lengths they are

Some words are long and others are short. What determines how long a particular word should be? If you look at some long words (like "serendipity", "pandemonium", and "hypothesis") and some short words ("my", "in", and "of"), you might come to the conclusion that short words are short because they are used frequently while long words can afford to be long because they come up rarely. This idea was first proposed by a Harvard linguist named George Zipf in 1936.

Researchers at the MIT Department of Brain and Cognitive Sciences took a fresh look at this question and came up with a new theory. They present their results in a paper titled (spoilers!) "Word lengths are optimized for efficient communication".

How much information is conveyed by a word? Consider the sentence that starts "After I got home, I walked the...". If I finish the sentence as "I walked the dog", the extra word "dog" doesn't convey much information because it's probably one of the words your brain was expecting. More surprising would have been "I walked the cat", "I walked the bulldozer", "I walked the quasar", or "I walked the plank". It is the amount of surprise that the researchers are equating with the information contained in a word. Consequently, the information content of a word depends on the context that it appears in. [For those who want a more quantitative explanation, the information contribution from a particular context (like "I walked the...") is -log(p), where p is the probability that the word appears at the end of that phrase and where log() is the natural logarithm function. To get the overall information content for a word like "dog", you average -log(p) over all the occurrences of "dog", so that contexts in which "dog" appears more often count for more.]
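
Here is a toy Python sketch of that bracketed calculation. The function name and the six-sentence "corpus" are invented for illustration, and the sketch averages -log(p) over every occurrence of the word, which is how I read the paper's definition; the researchers worked from enormous N-gram counts, not a handful of sentences.

```python
import math
from collections import Counter


def average_information(word, observations):
    """Average -log(p) over every occurrence of `word`.

    `observations` is a list of (context, word) pairs; p is the share of
    a context's occurrences that end in the word.
    """
    context_totals = Counter(context for context, _ in observations)
    pair_counts = Counter(observations)
    surprisals = [
        -math.log(pair_counts[(context, w)] / context_totals[context])
        for context, w in observations
        if w == word
    ]
    if not surprisals:
        raise ValueError(f"{word!r} never appears in the observations")
    return sum(surprisals) / len(surprisals)


# Made-up miniature corpus, for illustration only.
corpus = (
    [("I walked the", "dog")] * 3
    + [("I walked the", "plank")]
    + [("I fed the", "dog"), ("I fed the", "cat")]
)

print(average_information("dog", corpus))
print(average_information("plank", corpus))
```

In this toy corpus "plank" comes out as carrying more information than "dog", because it is the less expected way to finish "I walked the...".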

Ideally what the researchers would have liked to examine is the relationship between how long it takes to say words and how much information they convey, but it was easier (and, they argue, an adequate approximation) to use the number of letters in a word in place of its utterance duration. But later, they went back and ran the same tests (for a few languages) using number of syllables instead of number of letters, and the results were the same.

To calculate the relationship between word length and frequency, the researchers used the same N-gram data set that Google used in its N-gram viewer. This figure from the paper summarizes their findings:
The plot on the left shows word length versus word use frequency, with frequency decreasing from left to right. (Here the data has been divided into large groups of words ("bins") and the average lengths and frequencies have been used.) For the first few points (high-frequency words like "the"), the line is steep, but then it quickly flattens out, indicating that for low-frequency words, the frequency of the word doesn't change the length very much.

The plot on the right shows average word length versus the information content of the word. Here, the line starts off jagged but then becomes strongly-sloped and very straight. This tells us that how much information a word carries is indeed a good predictor of how long the word will be.

The researchers also cite other work that has shown that, when speaking, people will speak more information-dense syllables more slowly than less information-dense syllables. (If you've ever listened to the synthesized voice of something like a GPS, you'll be familiar with the jerkiness of the pronunciation that sounds like it is speaking some syllables too slowly and others too quickly.)

It would seem that a corollary to this principle is that as a word becomes more common (or more precisely, loses information density), it experiences a linguistic force, pushing it toward a shorter form. This shortening process is called phonetic erosion. Examples of the resulting shortenings (also called clippings) are "refrigerator" becoming "fridge", "going to" becoming "gonna", and "cabriolet" being completely replaced by "cab". Here are a few other terms that have evolved much shorter forms:
  • advertisement → ad
  • caravan → van
  • examination → exam
  • gasoline → gas
  • gymnasium → gym
  • influenza → flu
  • public house → pub
So, essentially, the researchers found that the old idea that word length is based mainly on frequency of word usage (short words are used often while long words are used rarely) does a poor job of explaining why words are the lengths they are. The amount of information in a word (averaged over the various contexts that it is used in) is a far better predictor for how long the word will be. The only exception to this is the 5% to 20% of words that are the least informative (generally short, high-frequency words like "the" and "and").

This result holds, not just for English, but also for the other ten languages that they examined (Czech, Dutch, French, German, Italian, Polish, Portuguese, Romanian, Spanish, and Swedish).

The basic idea that I take away from this work is that there is some maximum rate at which our brains can understand incoming speech, and that our speech patterns reformulate what we are saying to evenly distribute information over time. It makes me wonder whether pausing for effect is taking advantage of this fact. Similarly, when I say a word slowly to emphasize it, maybe I am just slowing it down to suggest that it contains a lot of information.

Epilogue: In case you were wondering, the actual ending to the sentence that started "After I got home, I walked the..." was "...tightrope."


Friday, July 1, 2011

Good three-letter words for Bananagrams, sorted by rareness

When playing Bananagrams, 2-letter words are great for rapidly adding letters to a grid, like when you are running off a string of "PEEL"s, but it's hard to make an entire grid out of 2-letter words. In this post, I'm going to examine some of the 3-letter words and show you which are most commonly used and which you might want to add to your active vocabulary.

First, consider this list of some 3-letter words you can make with the letter V:
eve, ivy, ova, rev, van, vat, veg, vet, vex, via, vie, vim, vow

We don't have a good set of data of what words are most commonly used in playing Bananagrams, and Scrabble word choice would likely be a poor substitute since players' priorities in Scrabble are very different than in Bananagrams. WordSquared offers a good compromise: its gameplay shares score maximization with Scrabble, but also includes the rapid score-aloof word-building (such as when evading or outflanking opponents) that epitomizes Bananagrams.

If you sort the above V words by word count (as obtained from WordSquared word pages), you get
van, eve, vet, vat, vie, via, vex, rev, vow, ivy, ova, veg, vim

But sorting by raw word count is not the most useful ordering since the Scrabble tile distribution is going to skew the results. There are lots of As, Es, Ns, and Ts, so of course VAN, EVE, VET, and VAT will be popular words. I wanted to subtract out this bias to see what words would be most and least used if all letters were equally likely to be available. I accomplished this by just dividing each word count by the number of times that each of its letters occurs in a standard Scrabble tile set [which corresponded to the Word2 distribution until the recent Word2 redesign, which fortunately happened after I finished this post]. (For example, the word VAN had been used in WordSquared 24198 times (over about 2 months). In a 100-tile Scrabble set, there are 2 Vs, 9 As, and 6 Ns, so I divided 24198 by 2 and by 9 and by 6 to get a normalized count of 224.1.) Sorting the words by the normalized word count gives:
vex, vow, ivy, van, vat, vet, vim, vie, rev, eve, veg, via, ova
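
The normalization can be sketched in Python as follows. The tile counts are the standard 100-tile English Scrabble distribution (blanks left out), and the function name is my own. One assumption to flag: a letter that appears twice in a word (like the E in EVE) is divided out twice here.

```python
# Letter counts in a standard 100-tile English Scrabble set (2 blanks omitted).
SCRABBLE_TILES = {
    "a": 9, "b": 2, "c": 2, "d": 4, "e": 12, "f": 2, "g": 3, "h": 2, "i": 9,
    "j": 1, "k": 1, "l": 4, "m": 2, "n": 6, "o": 8, "p": 2, "q": 1, "r": 6,
    "s": 4, "t": 6, "u": 4, "v": 2, "w": 2, "x": 1, "y": 2, "z": 1,
}


def normalized_count(word, raw_count):
    """Divide a raw usage count by the tile count of each letter in the word."""
    divisor = 1
    for letter in word.lower():
        divisor *= SCRABBLE_TILES[letter]
    return raw_count / divisor


# VAN was played 24198 times: 24198 / (2 * 9 * 6)
print(round(normalized_count("van", 24198), 1))
```

Sorting a table of raw counts by this function, from largest to smallest, would give an ordering like the one above.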

When people have a V, they are more likely to make VEX than any other three-letter V word. (In this case, 70% more likely than even the closest word (VOW).) On the other end, VIA is used fairly often in terms of raw word counts, but sparingly when considering how often it could be made.

Below are more word lists, all sorted by this effective word usage rate, starting from the most common words and ending with the most neglected:

3-letter words that contain the hardest letters, sorted from common to rare:

J words:
joy, jug, job, jam, jaw, jog, jay, jab, jig, jar, jet, jot, jut, jag, jib

K words:
key, kid, sky, kit, yak, kin, keg, wok, ink, ark, ask, oak, irk, ski, ken, ilk, koi, uke, eke, ska, auk

V words:
vex, vow, ivy, van, vat, vet, vim, vie, rev, eve, veg, via, ova

X words:
box, fox, wax, fix, mix, vex, fax, hex, tax, pox, six, sex, tux, lax, axe, lox, sax

Z words:
zip, zit, zoo, zap, zig, zag, fez


3-letter words that begin with vowels, sorted from common to rare:

Words that begin with A:
axe, ark, ask, any, awe, ace, arc, and, age, aim, arm, act, ash, ape, ago, ant, aye, art, air, all, aft, aid, ate, add, are, ail, ale, ado, apt, ass, asp

Words that begin with E:
elk, egg, elf, eve, eye, end, ego, elm, emu, ewe, eat, ebb, ear, eel, eke, era, eon, ere, eta

Words that begin with I:
ivy, ink, icy, ice, irk, ilk, ill, imp, inn, ire, its, ion

Words that begin with O:
off, owl, orb, oak, own, oil, old, owe, ova, one, oft, out, orc, our, odd, ode, oaf, opt, oat, oar, obi

Words that begin with U:
use, urn, ump, uke

Words that begin with Y:
you, yak, yam, yes, yet, yew, yap, yip, yin, yep, yon, yea


3-letter words that end with vowels, sorted from common to rare:

Words that end in A:
via, ova, yea, boa, pea, tea, bra, sea, spa, ska, era, goa, baa, eta

Words that end in E:
axe, the, bye, cue, ice, hue, she, pie, vie, eve, ace, awe, foe, eye, age, dye, bee, owe, hoe, rye, due, ape, woe, die, sue, aye, toe, tie, ewe, ore, rue, use, ate, lye, are, doe, ale, ode, pee, wee, see, ire, fie, uke, tee, eke, lee, ere

Words that end in I:
chi, ski, koi, phi, poi, obi, psi

Words that end in O:
zoo, who, ego, ago, boo, goo, moo, two, woo, too, coo, ado, pro, loo, bro, fro, rho, tao

Words that end in U:
you, flu, emu, gnu, tau

Words that end in Y:
joy, jay, why, key, guy, boy, buy, coy, way, toy, fly, cry, day, gay, sky, hay, bay, ivy, pay, shy, may, soy, hey, fry, lay, dry, say, ray, try, ply, sly, spy, icy, pry, nay, thy, fey, any, sty, ley


It's interesting to look at how different words fare. Auks and asps are nearly forgotten, but foxes and owls are quite popular. Greek letters (eta, phi, psi, rho, tau) and abbreviations for musical instruments (sax, uke) and for formal wear (tux) do not see much grid time. I am glad to see that everybody loves joy.

If you want to learn useful new words to add to your active Bananagrams vocabulary, the ends of those lists might be a good place to start.




Monday, June 20, 2011

Words deleted from the new British Scrabble dictionary

One point in favor of the British approach to Scrabble dictionaries is that they appear to actually delete words from the list once they stop appearing in their current source dictionaries. Some of the deleted words are words that the sources corrected, either by capitalization (Freon), splitting into two words ("jet plane", not "jetplane"), or elimination of abbreviations ("arccos" is not a word; it's an abbreviation for "arccosine").

When I last checked in on the UK Scrabble dictionary committee, they were talking about doing away with some obscure or erroneous words in the Collins Scrabble Words list. Frequently singled out were "smoyle" (an obsolete form of the verb "smile") and "Pernod" (a brand name for a French liqueur which also appears in the American Scrabble dictionary).

While nearly 400 words have been deleted, somehow both "smoyle" and "Pernod" survived the cuts. Here are some that did not:

APFELSTRUDEL

[The Anglicized form, "apple strudel", appears to have taken over for the original German form.]

ARCCOSES

[This is supposed to be the plural of "arccos", itself a deleted word since it is merely an abbreviation for the arccosine function in trigonometry. Including "arccos" as a word is a somewhat understandable mistake, but pluralizing it as "arccoses" is fairly egregious, as no one ever writes such a thing. This is probably one of the glaring problems in the previous edition of the Collins Official Scrabble Words that caused the world Scrabble tournament people to reject it and retain their old list.]

AWESTRIKE
AWESTRIKING

[AWESTRUCK and AWESTRICKEN are apparently still fine. It turns out that no one awestrikes. The Chambers Dictionary has switched to a hyphenated form: "awe-strikes". From surveying the Internet, I'd say it's more popular to "strike awe".]

BARRACOOTA

[This is an obsolete spelling for "barracuda".]

BELLPUSH    a button used in ringing a bell

["Bellpull" is still fine.]

BRICKSHAPED

BROADMINDED    incapable of being shocked. Opposite of shockable.

CARDCASTLE

["Cardcastle" is apparently an obsolete synonym for a house of cards. The last three words have all switched to hyphenated forms.]

CARPARK    a space for parking cars

[I was a bit disappointed by this deletion until I looked up the one instance I know this phrase from (The Restaurant at the End of the Universe), and discovered that Douglas Adams also preferred writing it as two words:
"I'm in the car park," said Marvin.
"The car park?" said Zaphod, "what are you doing there?"
"Parking cars, what else does one do in a car park?"
"OK, hang in there, we'll be right down."
In one movement Zaphod leapt to his feet, threw down the phone and wrote "Hotblack Desiato" on the bill.
"Come on guys," he said, "Marvin's in the car park. Let's get on down."
"What's he doing in the car park?" asked Arthur.
"Parking cars, what else? Dum dum."
]

CHILIOI    one thousand

[Greek word meaning "thousand"; it can be singular or plural; seems to come up most often because it appears in the Book of Revelation]

CORNRENT    rent paid in corn

[Naturally.]

DEPENDACIE

[This was already a very rarely used word, meaning "submissiveness". Shakespeare used it in Antony and Cleopatra, but modern printings have substituted the word "dependency".]

EUROPEANISE
EUROPEANIZE    To cause to become like the Europeans in manners or character; to habituate or accustom to European usages.

[Someone realized that these words are almost always capitalized. On the other hand "Francization" and "Francisation", the noun forms of Francize and Francise (meaning to make something French), have just been added to the CSW.]

FLASHFORWARD

[I have a feeling that LOST fans will have something to say about this. FLASHBACK is still on the list.]

GRENZ    as in grenz rays, X-rays of long wavelength produced in a device when electrons are accelerated through 25 kilovolts or less [adj]

[Grenz rays (ultrasoft X-rays with wavelengths between 0.07 nanometers and 0.4 nanometers) were discovered by German physician Gustav Bucky. Bucky noted that the effects of this radiation on biological tissue were somewhat like ultraviolet light and somewhat like the adjacent X-ray part of the spectrum, so he called them "Grenz rays" from the German word Grenze, meaning "boundary". The term seems to have been confined to medicine and is now falling out of usage as Grenz ray therapy is giving way to other techniques.]

HAMBURGHER    a patty of ground beef

HEROE    a man revered for his bravery, courage etc, also HERO

HOWSOMEVER

[Apparently this is an archaic form of "however".]

PARAMAECIUM
PARAMOECIUM    Any of various freshwater ciliate protozoans of the genus Paramecium, usually oval and having an oral groove for feeding.

PLAYBUS    a bus with activities for children

[This seems to be a British concept. As far as I can tell, it's a bit like a bookmobile, except that rather than being a mobile library, it's more like a mobile playground with possibly some educational elements or facilities. From photos I've found, I'd define a playbus as a double-decker bus with ball pits, slides, tunnels, all with lots of padding and primary colors. "Playbus" has apparently transitioned to a capitalized form.]

POCKETPHONE

SHOTPUT

SIDESTREET

["Shot put" and "side street" are now standard.]

STOCKHORN

[This now extinct musical instrument is similar to the better known "hornpipe" and the less well known "pibgorn". It was a single-reed woodwind constructed from a sheep's shin bone and used a cow's horn for the flared part at the end that amplifies the sound. The stockhorn is the Scottish version of this instrument. It also goes by the name "stock-and-horn".]

SWONE    a fainting fit

UPSWARM    to send up in a swarm

[Now you "up-swarm" something (e.g., bees). Shakespeare used this one too, but he wasn't up-swarming bees.]

WASM    an outmoded policy

[This is apparently a portmanteau word, resulting from the combination of WAS and ISM. Or looked at differently, an outdated ISM becomes a WASM. This is one of the words that was removed because it was dropped from the Chambers dictionary (the other UK source dictionary) due to lack of usage.]

WYSIWYG    what you see is what you get, matching computer display with what will be printed (adj)

YOS

[The shortest deleted word was deemed to be an incorrect pluralization of the noun "yo", where "yo" is defined as "an expression of calling for attention".]

I find this list intriguing. It's like a graveyard for forgotten words. (Here lies "Grenz rays".)

Remember, if you want to keep your favorite words alive, you have to use them. Write books about them! Insert them gratuitously into blog comments! Or the ideas they represent may become wasms.

Wednesday, June 8, 2011

The new Scrabble words (if you use the British Scrabble dictionary)


A couple of people asked for my opinion on the "new Scrabble words", so I looked them over. The first and most important thing to point out is that these new words have been added to Collins Official Scrabble Words (CSW), effectively the Scrabble tournament dictionary for most of the world, but not for the United States, Canada, or Thailand. Since Collins Official Scrabble Words (equivalent to the "SOWPODS" word list) automatically includes the American Scrabble tournament word list, new words are only added from the British side when they are absent from the most current American list.

There were about 2800 words added to this list. I've picked out the most interesting ones to discuss:

The good

One category of additions that I found most welcome is the many new computer and Internet terms: autosave, blogosphere, inbox, linkrot, metadata, overclock, permalink, timestamp, and whitelist.

(Less welcome is the inclusion of "readme" as an adjective (as in referring to files named "README.TXT" as "readme files").)

Other terms that I have heard frequently and seem appropriate for such a word list are: afterparty, arthouse, beestung, breadstick, buzzkill, edamame, fanboy, nunchucks, regift, ribeye, spork, and upsell.

The absence of "spork" and "nunchucks" from the American Scrabble dictionary had bothered me, so I am glad to see these additions.

The not-so-good

Other new words I am more skeptical about. "VoIP", which is clearly an acronym (Voice over Internet Protocol), is listed as a new word. Apparently some pronounce it like /voyp/ rather than spelling it out (/vee oh eye pea/), but as long as it is spelt with any capital letters, it seems clear that it can't be played in a word game without risking fisticuffs.

And they've added "XRAY", even though any sensible spelling would be "X-ray" even when it's used in a phonetic alphabet (Alpha, Bravo, Charlie...). Adding this entire phonetic alphabet has also resulted in the inclusion of "India", "Juliet", "November", "Quebec", and "Yankee".

The word grok comes from Robert Heinlein's Stranger in a Strange Land where he defines it as "to understand so thoroughly that the observer becomes a part of the observed — to merge, blend, intermarry, lose identity in group experience." Grokking is a profound, transformative understanding of something. Unfortunately, the new Collins Scrabble Words list has added to the past and present participles ("grokked" and "grokking") some alternate spellings which are clearly wrong ("grocked", "groked", "grocking", and "groking"). Maybe these will get fixed in a future version.

Also, the CSW indicates that the word "quantum" has grown a second pluralization ("quantums" as opposed to the standard "quanta"). This may turn out to be one of the words that the Collins word list editors wind up eating (the traditional punishment for any quickly recalled words).

The improper verbs

The new words FACEBOOK and MYSPACE are both listed as verbs, meaning (depending on who you ask) 1) to search for someone's profile on the respective web sites, 2) to post something on these sites, or 3) just to generally use these sites. Since the words in the word list are written in all capital letters, it's not possible to tell whether these words are supposed to be retaining or dropping their capitalization in verb form. There is a history of brand names becoming lowercase verbs: Hoover ⇒ hoover (to clean with a vacuum cleaner; also, to suck up like a vacuum cleaner), Xerox ⇒ xerox (to photocopy), Velcro ⇒ velcro (to fasten together the two fabric pieces of a hook-and-loop fastener). If I had to extract or propose a rule of thumb, I'd say that a brand name can become a lowercase verb when it has been generalized beyond the original brand. The definitions of FACEBOOK and MYSPACE as verbs seem both overly broad (in the range of actions they can denote) and overly narrow (in each only referring to one particular site). In contrast, another brand name that has just appeared on this word list as a verb is PHOTOSHOP. This seems totally appropriate to me since it's been around long enough that "to photoshop" means (in my mind) to manipulate an image using graphics software, without being restricted to Adobe's Photoshop.

The rest

The definitions below are quotations from the Zyzzyva word study program, and my comments are in brackets.

BEATBOXING    a form of hip-hop music in which the voice is used to simulate percussion instruments

BETCHA    a spelling of 'bet you' representing colloquial pronunciation

BLOKART    a land vehicle with a sail

[While we can welcome this word for the whimsicality it embodies, a different spelling ("BLOWKART") has left the building. Most likely, heading in the direction of the prevailing wind.]

BOBBLEHEAD    a type of collectible doll, with head often oversized compared to its body

CATFLAP    a small opening in a door to let a cat through

CHEESESTEAK    a sandwich filled with grilled beef and cheese

CHESSBOXING    a hybrid sport which combines the sport of boxing with games of chess in alternating rounds

[While this was originally a made-up sport, there are now regular international chessboxing tournaments in London.]

CATAPHOR    a word that has the same reference as another word used later

[A cataphor is a phrase for which the meaning only becomes clear later in the sentence. Example: "Although he worked very hard at his wall-balancing lessons, Humpty Dumpty ultimately had to contend with the fact that he was still egg-shaped." Here "he" cataphorically refers to Humpty Dumpty.]

CROWDSOURCE    to outsource work to an unspecified group of people, typically by making an appeal to the general public on the Internet

CRIA    the offspring of a llama

[Apparently this word is often used in crossword puzzles.]

CUSPY    of a computer program, well written and easy to use

DISEMVOWEL    to remove the vowels from (a word in a text message, email,etc) in order to abbreviate it

EMERSED    (Of leaves) rising above the surface of water

[Plants that grow out of the water are said to be emersed. Contrast with "immersed".]

ENURN    to put into an urn, also INURN

EXERGY    a measure of the maximum amount of work that can theoretically be obtained from a system

GLAMPING    a form of camping in which participants enjoy physical comforts associated with more luxurious types of holiday

MONOTASKING    the act of performing one task at a time

MWAH    a representation of the sound of a kiss (interj)

PAREIDOLIA    a psychological phenomenon involving a vague and random stimulus being perceived as significant e.g. seeing faces in clouds

PORLOCK    to hinder by an irksome intrusion or interruption

[This term comes from Samuel Taylor Coleridge's story of how he emerged from a dream with the poem that would have been Kubla Khan fully formed in his mind. He claims to have written down the first 54 lines (the only ones that were eventually published) before being interrupted by a visitor from Porlock. Some scholars doubt this story, but "to Porlock" makes for a great new verb. It seems to be mainly used in British English, but I vote that everyone start using it.]

RISORIUS    a facial muscle situated at the corner of the mouth

[The risorius is the muscle people use when they fake a smile (smiling with upturned lips, but not with their eyes). An authentic smile uses the zygomaticus major and zygomaticus minor muscles to pull up the corners of the mouth and also uses the orbicularis oculi muscles to raise the cheeks and form crow's feet around the eyes. Other primates (like lemurs, macaques, orangutans, gibbons, and chimpanzees) do not even have a well-defined risorius muscle.]

SKYLESS    without a sky

[It is listed in the 1911 version of the Century Dictionary with the definition: "Without sky; cloudy; dark; thick." I first thought that this word meant literally without a sky (as in a planet that has no atmosphere), and while it is occasionally used that way in science fiction, it's more generally used figuratively, in melancholy descriptions.]

SPARTICLE    a shadow particle such as a SQUARK believed to have been produced at the time of the Big Bang

[There is a theory called "supersymmetry" which would tidy up a lot of little mathematical problems with the current physics theories of how fundamental particles and forces work. Supersymmetry says that every fundamental particle has a supersymmetric partner. This scheme of adding an S to the beginning of the names of some fundamental particles to denote their hypothetical supersymmetric partners has produced such words as "sfermion", "stau sneutrino", "smuon", and "sstrange squark". These must be fun to pronounce! Sparticles are the sorts of things that physicists would love to find evidence for in particle accelerators like the Large Hadron Collider.]

SPLISH    to splash

[The Wiktionary currently has two definitions:
splish, the noun: "(onomatopoeia, humorous) splash"

and

splish, the verb: "(intransitive) To make a light splashing sound."

The "light splashing" definition rings true to me.]

STOOZE    to borrow money at an interest rate of 0%, a rate typically offered by credit card companies as an incentive for new customers

[The Wikipedia entry indicates that "stoozing" money includes, not just borrowing money at a 0% interest rate, but then investing it (for instance, in a high interest savings account), and then paying it back. This is a sneaky technique for earning money, apparently named for Stooz, a user of the Motley Fool's Credit Card discussion board in the UK, who used and posted about this technique often. While it was originally referred to as "doing a Stooz", a variant spelling has developed that drops the capitalization and adds a silent E.]

STORMSTAYED    isolated or unable to travel because of adverse weather conditions, esp a snowstorm

[This is useful as a more general term than "snowed in". I suggest we import this as "stormstuck".]

SUNGAZING    the practice of staring directly at the sun at sunset or sunrise, esp in the belief that doing so allows one to survive without eating food

[Bananagrammer.com recommends stargazing or moongazing, if you value your retina. Also, eating food occasionally is a good idea, unless you can photosynthesize.]

TRUTHINESS    the quality of being considered to be true because of what the believer wishes or feels, regardless of the facts

[Coined by Stephen Colbert, this is a word more loaded with connotation than a line of text can easily convey.]

TURDUCKEN    a dish consisting of a partially deboned turkey stuffed with a deboned duck, which itself is stuffed with a small deboned chicken

VELLUS    as in vellus hair, short fine unpigmented hair covering the human body

[The opposite of "terminal hair" (dark, thicker body hair).]

WHOLPHIN    a hybrid of a whale and a dolphin

[At Sea Life Park in Hawaii, a bottlenose dolphin and a false killer whale that were being kept together unexpectedly produced offspring. The false killer whale is actually another species of dolphin, but the discrepancy in sizes (the false killer whale mother was 14 feet long and weighed 2000 pounds while the father was 6 feet long and weighed 400 pounds) and the fact that such a combination had never before been seen made the world's first known false-killer-whale/dolphin hybrid a surprise. The fully grown wholphin is 10 feet long and weighs 600 pounds. She is also midway between her parents in shape and color and number of teeth. (Bottlenose dolphins have 88 teeth, false killer whales, 44, and the wholphin has 66.) She, in turn, has mated with a dolphin and given birth to another wholphin. This is another surprise, as hybrid animals (like the mule) are usually infertile.

"Wholphin" is sometimes also spelt "wolphin", although this variant did not make it into the Collins Scrabble Words list.]

Two words that are not new additions, but that I learned from looking through these word lists are: SCOPA (the hair on the legs of many bees, which transport pollen from flower to flower) and UPTALKING:
UPTALKING the practice of speaking with a rising intonation at the end of each statement, as if one were asking a question


It makes sense that a word list that aspires to represent a more international flavor of English be larger. And I have heard on more than one occasion that British English is actually less conservative and is changing more rapidly than American English, so a faster-growing British Scrabble word list is not unexpected. Reading through all these words has been educational and, at times, fascinating, but I'm sure glad that I don't have to memorize them all!


Wednesday, April 20, 2011

Innergrams and what they can tell us about word favoritism

So I was playing Word2, and I had a rack of A R T T U _ _ and a desire to build down from the word LEND. Reflexively, I spelled DART and was about to move on when I paused and asked myself "Why didn't I make DRAT?". DRAT is a perfectly fine word, and unlike DART, I don't recall ever playing it before. After building off the end of LEND, I was planning to build another word off the end of DART to extend my slaloming, weaving string of words off toward the horizon. This style of wordcrafting is not atypical, so many people must have previously encountered a similar choice between two words that start and end with the same letters. I began to wonder how they chose.

Fortunately WordSquared has a new feature that allowed me to find out. By clicking on a word on the board, you can pull up a pop-up box containing a little information about the word including definitions, who has recently played the word, and how many times it has been played (since statistics started being kept, which was about a month ago as of late March).

DART had been played 4789 times.

DRAT had been played 1723 times.

So it was not just me.

I decided to study this a little more. I compiled a long list of anagrams that share the same first and last letters (e.g., FORTH and FROTH, SEAHORSE and SEASHORE). Since it is only the inner letters that are scrambled, I decided to call them "innergrams".
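The definition above can be turned into a quick search. Here is a minimal sketch in Python, assuming you have some word list to feed it; the function names `innergram_key` and `find_innergrams` are made up for illustration and are not from any Word2 tool. Two words are innergrams when they share a first letter, a last letter, and the same multiset of inner letters.

```python
from collections import defaultdict

def innergram_key(word):
    """Key shared by all innergrams of a word: first letter, last letter, sorted inner letters."""
    w = word.lower()
    return (w[0], w[-1], "".join(sorted(w[1:-1])))

def find_innergrams(words):
    """Group words that differ only in the order of their inner letters."""
    groups = defaultdict(set)
    for word in words:
        # At least two inner letters are needed for there to be anything to scramble.
        if len(word) >= 4:
            groups[innergram_key(word)].add(word.lower())
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)

sample = ["dart", "drat", "forth", "froth", "seahorse", "seashore", "lend"]
for group in find_innergrams(sample):
    print(group)
```

Run against a full word list, the same grouping would also turn up triples and larger sets, not just pairs.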

The table below shows the resulting innergram pairs with each word's respective word count (as taken from Word2 statistics pages like this one). The words are sorted so the more frequently used one is always in the first column. The fifth column shows the ratio of the two word counts.

Since I wanted to identify data that was not strong enough to draw conclusions from, I used formal hypothesis testing. The chi-square goodness of fit test I used is described in detail here. The essence of it is that the more data you have and the farther the ratio of word counts is from 1, the stronger the evidence is that one word is preferentially being used over the other. The chi-square parameter (in column 6) measures how strong this evidence is. I've sorted the table by increasing evidence strength.

Admittedly, there are lots of situations where one of these words would be favored over another for in-game reasons (like, CRAVE was already on the board and CRAVEN was made by just adding an N, or maybe a triple-letter score square made CAVERN a higher scoring choice). Averaged over many instances, some of these effects should cancel out.

The first three rows have such a small chi-square value that it's pretty certain that people are not (on average) favoring one of these words over another. (Maybe for every person who makes CRAVEN by adding an N to CRAVE, there is someone else making CAVERN by adding an N to CAVER.) The gray rows are weakly supported. The rest of the rows have a big enough chi-square parameter that we can say with greater than 95% certainty that Word2 players favor the first word over the second word. In the last column of the table, I suggest reasons why.
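For anyone who wants to check a pair by hand, here is a rough sketch of the kind of test described above. It assumes the null hypothesis is that both words in a pair are played equally often, which gives one degree of freedom and a 95% critical value of about 3.841. The function name `chi_square_equal_use` is my own, and this is not necessarily the exact calculation behind the table.

```python
def chi_square_equal_use(count_a, count_b):
    """Chi-square goodness-of-fit statistic against the hypothesis of equal use."""
    expected = (count_a + count_b) / 2  # each word expected to get half the plays
    return ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected

# Critical value for 1 degree of freedom at the 5% significance level.
CRITICAL_95 = 3.841

dart, drat = 4789, 1723  # play counts quoted earlier in this post
stat = chi_square_equal_use(dart, drat)
print(f"ratio = {dart / drat:.2f}, chi-square = {stat:.1f}, favored = {stat > CRITICAL_95}")
```

With only two categories the statistic simplifies to (a - b)² / (a + b), which is why it grows both with the amount of data and with how far the ratio is from 1.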



Essentially, this is a listing of possible word blind spots. DART is a far more popular choice than DRAT, and unlike many of the examples on this list, this asymmetry cannot be explained by ART being a more frequently available hook than RAT. (RAT has been played 19,000 times and ART only 12,000 times.)

I have highlighted with orange the rows where there is strong evidence that the second word is a blind spot word. The green rows indicate that I suspect a blind spot exists, but other explanations could also account for the imbalance.

All innergrams are potentially useful tools for Bananagrams players since the ability to most quickly rearrange your grid can be the difference between winning and losing a game. Blind spot words are just the innergrams that you are most likely to not have immediately at hand... until now!

Blind spot words:
blot, causal, citric, coral, drat, garb, labile, prefect, reserve, rogue, slat, sloe, snag, stanch.

Other possible blind spot words:
brunt, clod, froth, gird, median, recuse, sidle, spilt



In the interest of completeness, below are the innergram pairs that I left out of the table because their word counts were too low. (No count exceeded 8.)

scalar / sacral
martial / marital
converse / conserve
eternity / entirety
preserve / perverse
coagulate / catalogue
seashore / seahorse
parental / paternal
compliant / complaint
observe / obverse
reprise / respire
metronome / monotreme
percept / precept

The last two pairs had word counts of zero. Build one of these words in Word2 and you may be the first!

Wednesday, March 23, 2011

Guess My Word - a fun word-guessing game

Important notice: This post is about an online game that no longer exists at the linked URL, but you can find the same game (now maintained by someone else) here.
Imagine if you were trying to play Twenty Questions but weren't allowed to ask about the meaning of the word, only its position in the dictionary. Well somebody else imagined it first, wrote it up, and put it online.

It's called "Guess my word!", and each day there is a new word to guess.

You start by making some initial guess about where the word might be in the alphabet:


You find out where the word is with respect to your guesses (with the closest words being highlighted in blue and red (indicating before and after, respectively)):


And after a number of guesses, eventually you should converge on the word:


...after which you can enter your name to appear on the leaderboard and check out other people's times and guesses. It's a good way to work on your active vocabulary when you are not playing Bananagrams...

You can play Guess My Word by clicking here. If you really like it, there are now two separate words you can play each day. Plus there is a new iPhone app called Lexicographer which is a little different, and for which I have written a separate review.

The best thing about this game is that you can also easily play it offline as a two-person game. You can take turns thinking of the secret word and guessing, and since this can be played without pen and paper (as long as you can remember the two current bounding words), it makes a great game to play in the car.
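If you are curious how quickly the bounding words can close in, here is a small sketch of a guesser that always picks the middle of the remaining range. It assumes the guesser and the word-picker share the same sorted word list, and the tiny `dictionary` below is invented for the example; real players guess from memory rather than by bisecting a list.

```python
def guess_word(secret, dictionary):
    """Return the sequence of guesses made by a player who always bisects the remaining range."""
    words = sorted(dictionary)
    lo, hi = 0, len(words) - 1
    guesses = []
    while lo <= hi:
        mid = (lo + hi) // 2
        guess = words[mid]
        guesses.append(guess)
        if guess == secret:
            break
        if secret < guess:
            hi = mid - 1  # secret comes before the guess alphabetically
        else:
            lo = mid + 1  # secret comes after the guess alphabetically
    return guesses

dictionary = ["apple", "banana", "cherry", "grape", "kiwi", "lemon", "mango"]
print(guess_word("kiwi", dictionary))
```

Because each guess halves the remaining range, the number of guesses grows only with the logarithm of the dictionary size.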




Monday, February 21, 2011

"Zen" and the art of Google N-gram Viewing

Over on the WordSquared blog, WordSquarers are pondering what should be a legal word in a word game. In particular, they are asking whether ZEN should be an allowed word in their game. "zen" is probably the most frequently asked about word because many people (myself included) initially expect that the Word2 game will accept it, but it never does...

The argument in favor of admitting "zen" to the dictionary is that usage suggests that there are two kinds of "Zen": capital-Z "Zen", which refers to Zen Buddhism and lowercase-Z "zen" which refers to a state of extreme calm and centeredness. Of course, the idea of this calm state is a reference to what is considered to be a result of the practice of Zen meditation.

It turns out that "Zen" is sometimes capitalized even in phrases like "a Zen outlook on life" or when something is said to be or feel "so Zen". This usage is consistent with "Zen" being a proper adjective (like "British").

To pursue this question further, I used Google's Ngram Viewer (which really ought to be spelt "N-gram Viewer") to compare the frequency of usage of the words "Zen" and "zen" in books over the last 200 years. The capitalized version completely dominates. (The oscillations in the appearance of "Zen" in English language books seem to reflect periodic variations in Western interest in Eastern mysticism. Roughly similar oscillations can be seen in the usage of "Tao".)



If you look at just the usage of "zen" over time,


you see that back in the 1800s, long before the concept of Zen was even popularized in Western society, instances of "zen" are present in print like some kind of background noise. And indeed, closer examination reveals that these "zen"s have nothing to do with Zen. They are frequently word fragments (like cases where the word "citizen" has been broken between pages and the OCR failed to transmit the dash in "-zen" to Google N-gram Viewer) or abbreviations of names in plays ("Zen." = Zenobia in some plays).

The same search, done on the American English corpus (rather than the overall English corpus, as above),


also fails to show a decisive increase in the usage of "zen" in English books in the U.S.

In contrast, many words admitted into the dictionary show usage patterns that clearly surpass their background noise levels. The first Google N-grams image above shows a good example of this behavior for the word "Zen". And consider "supersize",

a word accepted into the Merriam-Webster Collegiate Dictionary in 2006. Words that have such an abrupt exponential gain in usage must be the easiest for lexicographers to deal with.

I thought I had found a good argument in favor of making "zen" a word when I realized that there is another word which also has a nearly identical usage pattern. Both "Zen" and "Christian" refer to specific religions, and both are also used in a more relaxed fashion as an adjective (roughly meaning "placid" and "humane, altruistic", respectively). But the question of the correct case is the same with "christian": It is not frequently found in an uncapitalized form, and dictionaries nearly universally include only the capitalized form of the word.

It's possible that editors are keeping the uncapitalized "zen" out of books because it does not appear in dictionaries. And then, to the extent that dictionary inclusion reflects usage in print, "zen" is doomed to be perceived as a common misspelling and locked out of dictionaries forever. Of course, these days lexicographers search for new words in lots of other media, including the less rigorously edited Internet, so "zen" may yet be recognized as a legitimate word.


If you want to express your opinion about "zen", the comments on that Word2 blog post are still open, and you can always post here in the shiny new Bananagrammer comments area.

Tuesday, November 30, 2010

Unbananagrammable words

In my search for the longest word that you can make in a Bananagrams game, I found a few words (like "floccinaucinihilipilification") which can't be spelled with a standard 144-tile set of Bananagrams because there aren't quite enough of one letter or another. For instance, there are only two Ks in a bag of Bananagrams, and there are at least two words with three or more Ks in common use today.

I started off by studying the letter distribution for a Bananagrams set, shown here:
A  B  C  D  E  F  G  H  I  J  K  L  M
13 3  3  6  18 3  4  3  12 2  2  5  3

N  O  P  Q  R  S  T  U  V  W  X  Y  Z
8  11 3  2  9  6  9  6  3  3  2  3  2
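Checking a word against this distribution is easy to automate. Below is a minimal sketch in Python; the tile counts are the ones listed above, while the helper names `missing_tiles` and `is_bananagrammable` are invented for this example.

```python
from collections import Counter

# Tile counts for one 144-tile set, in alphabetical order (from the distribution above).
TILES = dict(zip(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
    [13, 3, 3, 6, 18, 3, 4, 3, 12, 2, 2, 5, 3, 8, 11, 3, 2, 9, 6, 9, 6, 3, 3, 2, 3, 2],
))

def missing_tiles(word):
    """Return {letter: shortfall} for every letter the word needs more of than the set has."""
    needed = Counter(word.upper())
    return {
        letter: count - TILES.get(letter, 0)
        for letter, count in needed.items()
        if count > TILES.get(letter, 0)
    }

def is_bananagrammable(word):
    """True if the word can be spelled with a single set of tiles."""
    return not missing_tiles(word)

for word in ["pizzazz", "knickknack", "floccinaucinihilipilification"]:
    print(word, missing_tiles(word))
```

Filtering a whole word list with `is_bananagrammable` is one way to produce a list of unbananagrammable words; the reported shortfall shows which letter is the culprit.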

And then I decided to compile a sampling of unbananagrammable words (leaving out some obscure or questionable ones), beginning with the letter...


B

Which word has more than three Bs? It appears in The Sound of Music. It's in a song (naturally). It's that "How do you solve a problem like Maria?" song. Answer: The word is flibbertigibbet, meaning a flighty person, and it's generally used in a critical fashion. I kind of like this word and vote that the nuns try to cut flibbertigibbets some slack.


C

There are about 100 words that would require more than three C tiles to construct, but most of them are highly technical, like "diplococcus" (a kind of bacteria) or compound words like "sacrococcyx" that have to do with things near or connecting to the tailbone. (As a corollary, you can make the word coccyx in Bananagrams (if you ever wind up with all 3 Cs).)

A concrescence is a collection of parts that have grown together, like a bunch of cells in a biological context. The verb meaning to grow together is "concresce".

My new favorite C-laden word is scacchic. It's a very obscure word meaning "pertaining to chess". And it is the shortest word in English that contains four Cs.


F

riffraff - The people who are to be kept out. Used in a dismissive fashion. Depending on who's using the word, the riffraff may contain flibbertigibbets.


K

Yes, we are already at K! The two common words with more than two Ks are kickback and knickknack (4 Ks!).

A third example is knickerbockers. Knickerbockers are short pants for men that come down below the knee, but not all the way down to the ankle. They have mostly gone out of fashion, but a variant of knickerbockers can still be seen as a part of the uniform worn by baseball players.

M

metagrammatism - This means the practice of forming anagrams. It more commonly goes by the term "anagrammatism".

mummiform - Shaped like a mummy. There are a lot of cool -iform words like "igniform", meaning shaped like fire and "cucumiform", meaning shaped like a cucumber.


P

whippersnapper - An inexperienced yet cocky kid. The term was originally coined to describe 17th-century slackers who hung out on the street, snapping whips for no reason. Somewhere along the way the meaning morphed to its current form. No thanks to the whippersnappers!


S

stresslessness has an amazing 7 Ss AND it is a word that people actually use sometimes.

possessionlessness, in contrast, seems a little unwieldy and virtually never appears in print. But it does have 8 Ss, and that counts for something.


Z

pizzazz is kind of amazing for being over 57% Z (not to mention 71% pizza). "Pizzazz" is the shortest word that you cannot spell with one set of Bananagrams tiles.


That's it! There are really very few words you can't spell with Bananagrams. And maybe a spell checker.




Saturday, January 30, 2010

The longest word that you can make in Bananagrams

How long can a word constructed from the tiles in a Bananagrams set be? I decided to find out.

"Pneumonoultramicroscopicsilicovolcanoconiosis" is sometimes referred to as the longest English word. It now seems that this word was invented by the National Puzzlers' League as a hoax since the first instance of it ever appearing in print is in a 1935 newspaper article:
Pneumonoultramicroscopicsilicovolcanokoniosis succeeded electrophotomicrographically as the longest word in the English language recognized by the National Puzzlers' League at the opening session of the organization's 103d semi-annual meeting held yesterday at the Hotel New Yorker. The puzzlers explained that the forty-five-letter word is the name of a special form of silicosis caused by ultra-microscopic particles of silica volcanic dust...

A book called Wordplay: A Curious Dictionary of Language Oddities tells the rest of the story:
Frank Scully, author of a series of puzzle books and later one of the early UFO enthusiasts, read the newspaper article and repeated the word in Bedside Manna: The Third Fun in Bed Book (Simon and Schuster, 1936, p. 87). On the strength of this citation, League members (with a wink from the editors?) got the word into both the OED Supplement and Webster's Third. There it remains even to this day.
Whether it really counts as a word or not is moot since it would require 6 C tiles, which is 3 more than are in a Bananagrams set. (See this post for a table of the letters in a Bananagrams set.)

"Supercalifragilisticexpialidocious" is a nonsense word from a song in the 1964 Disney movie Mary Poppins. In the movie, it is defined as a word to say when you don't know what to say. It is listed in some dictionaries, but only as a proper noun (i.e., the name of the song).

"Pseudopseudohypoparathyroidism" is an unusual word because of the double "pseudo". It looks like a double negative, but it's really not. Pseudopseudohypoparathyroidism is so called because it seems like pseudohypoparathyroidism in that both are disorders resulting in symptoms such as inadequate skeletal growth and shortness, but pseudohypoparathyroidism is caused by resistance to parathyroid hormone, while pseudopseudohypoparathyroidism is not. Regular hypoparathyroidism is caused by malfunction of the parathyroid glands resulting in low levels of parathyroid hormone and, as a consequence, low levels of calcium (and elevated levels of phosphorus) in the blood. It's a real word, if a highly technical one. However, the word "pseudopseudohypoparathyroidism" requires more P tiles than we have.

Which brings us to the frivolous little word "floccinaucinihilipilification". It appears to have been coined in the 18th century by Eton College students who combined a bunch of Latin roots, each meaning "nothing" or "insignificant". Floccinaucinihilipilification was defined to be the act of judging something to be worthless. (This is a typical example of 18th century college student hijinks, right up there with herding cows into campus libraries and taking apart the dean's mini-steamboat then reassembling it in someone's dorm room.) It's kind of a beautiful word. Too bad we are one C short of being able to spell it.

And so finally we arrive at "antidisestablishmentarianism", the long word you've all been waiting for. During the 19th century, the issue of whether the Church of England should be the state church of Britain was a contentious one. The movement favoring disestablishment of the state church was referred to as "disestablishmentarianism", and the counter-movement was called "antidisestablishmentarianism". It can actually be spelled with one set of Bananagrams tiles, is generally recognized as a real word, and is therefore (by my estimate) the longest possible word in a Bananagrams game.

Words like "antidisestablishmentarianism" are agglutinative constructions. English allows only limited use of such constructions, like when combining Latin roots to form words. Forming words (even new words) by agglutinative combination is so common in the German language that there effectively is no longest German word. And in fully agglutinative languages like Turkish, extra word parts can be added on to a base word to a much greater extent. Whereas in German, nouns are routinely extended into much larger nouns, in an agglutinative language, entire sentences can be built up from one long, space-free string of letters. Word games must be very different in Turkey!

So there you have it. A tour of the forces that push words beyond their normal lengths: hoaxes, pranks, politics, medical jargon, and musicals.

I will leave you with some really long words:

UNCHARACTERISTICALLY

COUNTERREVOLUTIONARIES



and, of course,

SUPERCALIFRAGILISTICEXPIALIDOCIOUS!

Saturday, October 17, 2009

Palindromes you are most likely to be able to make in Appletters

Since making palindromes earns players bonus points in the Applescore game (as described in this previous post on Appletters), I've compiled a list of the palindromes you are most likely to be able to make during the game (really short ones or ones that use common letters).

Off the top of my head, using only the minimal length examples:

bib, bob
dad, deed, did, dud
eke, ere, eve, ewe, eye
gag, gig
kook
mom
noon, nun
pap, pep, pip, pop, pup
radar
sees
tat, tit, tot
wow

And, of course, there is one Scrabble-legal letter combination that is two letters long and a palindrome: AA. A'a (pronounced /ah ah/) is a type of lava, which is thicker and more viscous than other types. It is characterized by flowing in a sporadic fashion and leaving a rough surface when it cools.
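
If you would rather not rely on the top of your head, a palindrome check is easy to sketch in Python. The function names and the sample word list here are invented for illustration; you would swap in whatever word list you actually play with.

```python
def is_palindrome(word):
    """True if the word reads the same forwards and backwards (case-insensitive)."""
    letters = word.lower()
    return letters == letters[::-1]


def palindromes(words, min_length=2):
    """Keep only the palindromes of at least min_length letters, shortest first."""
    found = [w for w in words if len(w) >= min_length and is_palindrome(w)]
    return sorted(found, key=lambda w: (len(w), w))


# A made-up sample; a real run would read from a full word list.
sample = ["radar", "apple", "noon", "snake", "aa", "kook", "tile"]
print(palindromes(sample))  # ['aa', 'kook', 'noon', 'radar']
```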

And if it hasn't already been suggested, then I propose a variation of Appletters called Palindrominoes: The word snake is formed as in regular Appletters, but if you can form a palindrome where the base letter (the "head" of the snake, I suppose) is inside the palindrome, then you are permitted to position letters both above and below the end-tile, as shown in the example below:
  e   
VIGOR
e I
N
KITE
Then you can give the snake multiple heads, like a hydra or a planarian or something. A little extra chaos to spice up your Appletters game.

Friday, July 31, 2009

What to do when you have too many vowels (other than panic)

This is the natural complement to the post about what to do when you have too many consonants. As mentioned before, if you dump one of your vowels, assuming that you are picking from the 144-letter Bananagrams distribution, you have an 8% chance of getting three vowels back. In my experience, if you have one extra vowel tile, and you just can't rearrange the grid to fit it in somewhere, even exchanging it for two vowels and a consonant tends to be easier to deal with. Of course, you could wind up with some consonants that are tricky to use (which may be why I rarely dump letters; I also enjoy rearranging the grid).
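
For the curious, the 8% figure can be roughly reproduced with a short Python sketch. It makes two simplifying assumptions of mine: the three replacement tiles are drawn from a full, untouched 144-tile bag (a real bunch will already be missing the tiles in play), and the bag has the standard Bananagrams vowel counts. The function name is made up for illustration.

```python
from math import comb

TOTAL_TILES = 144
# A, E, I, O, U tiles in the assumed standard distribution: 13 + 18 + 12 + 11 + 6
PLAIN_VOWELS = 60
Y_TILES = 3  # Y counts as a vowel only if you decide it does


def chance_all_vowels(vowels, total=TOTAL_TILES, draws=3):
    """Probability that every tile drawn (without replacement) is a vowel."""
    return comb(vowels, draws) / comb(total, draws)


print(f"Counting Y as a vowel: {chance_all_vowels(PLAIN_VOWELS + Y_TILES):.1%}")
print(f"Not counting Y:        {chance_all_vowels(PLAIN_VOWELS):.1%}")
```

With Y counted as a vowel this comes out at about 8%, and at about 7% without it.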

Whatever approach you choose, you definitely want to know some words with a large percentage of vowels. There are some nice all-vowel words out there like aye, eye, and you. Here is a sampling of some longer vowel-heavy words:

eunoia (83% vowels)
eerie, adieu, audio, bayou (80%)
year, ooze, area, iota, auto (75%)
sequoia (71%)
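
The percentages above can be checked with a few lines of Python. This is only a sketch: the function name is invented, and it counts Y as a vowel, which is the assumption that makes "year" and "bayou" come out the way the list has them.

```python
VOWELS = set("aeiouy")  # Y is treated as a vowel here (an assumption)


def vowel_percent(word):
    """Percentage of the word's letters that are vowels, rounded to a whole number."""
    letters = word.lower()
    return round(100 * sum(ch in VOWELS for ch in letters) / len(letters))


for word in ("eunoia", "bayou", "year", "sequoia"):
    print(f"{word}: {vowel_percent(word)}%")
```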

"Eunoia" and "sequoia" are also distinctive for being two of the shortest words containing all five vowels. "Eunoia" may get you in trouble if you try to use it since it's an obscure word. [I am partial to it because it is a brain word. It comes from a Greek word meaning "beautiful/favorable thinking". The "beautiful thinking" interpretation led to the obscure English usage of "eunoia" - a state of normal mental health. A stricter reading suggests that the Greek word referred to thinking that was favorable to someone (like one's spouse). The "blissful and benevolent state of mind" interpretation, though questionable, is the nicest.]

If you are interested in obscure words on the extremes of human language, check out the All-Vowel Words and All-Consonant Words dictionaries. They start with tame words like "eau" and "brr" and then spin off into highly arcane references (at times approaching Borges-level bizarreness). They come packaged together in a book called "Wye's Dictionary of Improbable Words", downloadable from Lulu for ~$14.

Sunday, June 14, 2009

"Za" and dictionaries

A recent article on the Wall Street Journal site discussed the effect of adding new words to the list of legal Scrabble words. The controversial words are things like "za" (a rare slang term for pizza), "qi" (a Chinese word, meaning life force... a new-fangled spelling of "chi"), and "zzz" (onomatopoeic word for the sound of snoring). Some argue that they make it too easy to use the letters "Z" and "Q", and that their point values should be reduced from 10. Others argue that some of these words are just lame. (It's a good article that also discusses the issues of rule changes in a more general sense.)

Apparently, the way the Official Scrabble Players Dictionary works is that if a word is added to just one of a set of five dictionaries, it is eligible to be considered for the "Scrabble-legal" list. My understanding is that it will be approved as long as it doesn't violate any obvious criteria (hyphenation, proper nounness, foreignness). The virtue of this is that it is an objective approach. I do wonder though how different the list would look if a word had to appear in four of the dictionaries before becoming a Scrabble word.

Should Bananagrams use the Official Scrabble Players Dictionary? Or should it come up with its own dictionary? Is "za" pronounced to rhyme with "baa", or does it retain the schwa sound from the end of the word "pizza"? Do I take every opportunity to use the word "schwa"? (Answer: Yes.)

Friday, May 29, 2009

Q-without-U words

My one concession to the Scrabble-memorizer approach is to know some of the Q-without-U words, as they are so useful in Bananagrams. Also, I do not like to invoke the DUMP rule.

A faqir is another spelling for "fakir", which is like a Hindu monk or ascetic. Tell your friends that it is the name for those guys who sleep on a bed of nails.

A qaid is an Arab chief. The Arabic language is big on the Q-without-U, as we will see.

A qanat is a system for distributing water in arid climates, developed by the ancient Persians, but still in use today. It's basically an underground tunnel, channeling water from some source along a path, with wells drilled down at different positions along its trajectory. It has its own Wikipedia page! If you have more than one, you have qanats.

A qat is an evergreen shrub found in East Africa and Arabia, used as a narcotic (similar in effect and intensity to ecstasy). Possibly one of the shortest words for a drug. (Individual letters and acronyms don't count! Show me the vowels!) You can also make references to "shrubbery" to defuse any qat-related tension.

A qintar is the Albanian penny. Probably you will never use this one, when you can just make qat.

Qoph is the 19th letter of the Hebrew alphabet. It looks like this: ק

It is mainly useful when making minimal pangrams: "Vext cwm fly zing jabs Kurd qoph".
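
If you want to verify that a candidate really is a minimal pangram (every letter of the alphabet exactly once), a small Python sketch will do it. The function names are made up for illustration, and only the 26 unaccented English letters are considered.

```python
from collections import Counter
import string


def letter_counts(phrase):
    """Count each English letter in the phrase, ignoring case, spaces, and punctuation."""
    return Counter(ch for ch in phrase.lower() if ch in string.ascii_lowercase)


def is_pangram(phrase):
    """True if every letter from a to z appears at least once."""
    return set(letter_counts(phrase)) == set(string.ascii_lowercase)


def is_minimal_pangram(phrase):
    """True if every letter from a to z appears exactly once."""
    counts = letter_counts(phrase)
    return is_pangram(phrase) and all(n == 1 for n in counts.values())


print(is_minimal_pangram("Vext cwm fly zing jabs Kurd qoph"))  # True
print(is_minimal_pangram("The quick brown fox jumps over the lazy dog"))  # False
```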

