Word Explainer

One of the hardest things to master as a writer is the appropriate use of vocabulary to get one’s point across effectively and engagingly to the target audience. So, as an author, what happens when your word processor pops up and says that your words are too complicated?

Well, what does a mere computation device know about words?

Roger (@ModeratePeril), a friend of mine who writes really thoughtful articles on games and media, got a popup telling him his text was too complicated:

Roger1

What on earth is the Flesch Reading Ease Test and why is it telling an accomplished writer to tone it down a bit? You only have to dig a little to turn up some basic facts about the Flesch tests – they are (unsurprisingly) a way of assigning a number to how easy any (English) text is to read. You simply count the number of sentences, words, and syllables, and crunch the numbers. Presto, you know how easy your text is to read:

Reading Ease = 206.835 − 1.015 × (total words ÷ total sentences) − 84.6 × (total syllables ÷ total words)

Sauce: Wikipedia, which is canon.
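If you want to try the count-and-crunch recipe yourself, here is a minimal Python sketch of the standard Flesch Reading Ease formula. The syllable counter is my own crude vowel-run guess and not what any particular word processor uses, and the function names are made up for illustration, so expect it to disagree with other tools on trickier words.

```python
import re

def count_syllables(word: str) -> int:
    """Crude estimate: one syllable per run of vowels, minus a silent final 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent ("confuse"), but not in "-le" endings ("table").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)

def flesch_reading_ease(text: str) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    # A sentence is whatever sits between runs of '.', '!' or '?'.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # A word is a run of letters, optionally joined by apostrophes ("don't").
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("need at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (206.835
            - 1.015 * (len(words) / len(sentences))
            - 84.6 * (syllables / len(words)))

print(flesch_reading_ease("The cat sat on the mat."))
```

Note that nothing in there looks at what the words mean. It only ever sees three counts.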

The higher the number you calculate, the easier your text is to read, apparently. But here is where things get interesting, because the number you obtain from the formula has an education level assigned as an equivalent: 90 is “understandable by an 11-year-old” and 30 is “college graduate” level.

Now, these tests were developed to assess which audiences will understand which text and with what level of effort. When used in moderation, and with a clear picture of the audience, they could help someone decide how to approach their writing. What they’re not designed for is a flat yea or nay on each and every piece of work! I therefore empathize with Roger’s reaction:

Roger2

It has an overlong cape which gets caught in the swivel chair wheels.

But there’s more to this story – I’m a scientist, and that means that by necessity I’m a Numbers Person. One does not simply apply a formula and mindlessly accept the results, even if it’s appropriate for the setting. No, if you care about your writing then you should understand the real reasons why the Flesch–Kincaid readability tests cannot be sufficient for every application. So, are you ready to argue semantics with me?

My first objection is that the formula I show above is empirical – it was discovered by trial and improvement without specific reference to an overall mathematical theory about how meaning is constructed [citation needed]. That means that, while it may well have held true for the test group, you have to be exceedingly cautious about any extrapolation made from it – such as applying a measure derived in 1975 to an audience in 2018. Language changes, people change, and people speak and read differently from place to place – there might not be any reason to believe that what was true in 1975’s New Orleans is sensible in 2018’s Glasgow. Empiricism is a touchy subject among scientists for a reason: we have been burned by it too often.

But what about those silly semantic arguments I teased you with earlier? Well, simply put, these tests measure features of the form of the words you use, but make no reference to their content. Linguists have a framework for classifying the rules which language follows. Sentences work well if they meet the following criteria:

  • Syntax – rules about how words and sentences must be constructed to be valid. “Thrifty” is a word but “salikhjdfg” is not; each sentence must have a verb; and so on.
  • Static semantics – to do with whether the meanings of the components of a sentence combine into something meaningful. “I are fling” doesn’t have an ascribable meaning on its own, but “I have a cat” does.
  • Semantics – to do with the actual meaning of a thing in context. “Horse ate me” means something, but if you intended to communicate that you ate a horse then this sentence fails on the semantic level. “I’m a Hodgeheg” doesn’t pass the semantics test… unless you’re reading the right story.

Applied to words, sentences, and whole texts, these readability tests use easily measured features of the syntax to infer something about the static semantics of a text, in the hope that this tells you something about the semantics. I think that is a bit naive. The problem is that there are plenty of situations where using simpler words and sentences dilutes the meaning of what you’re trying to communicate – and this is a problem which we encounter in science all the time! Science is hard to explain in “plain English” because you need to communicate a lot of specific meaning.

Using simple words and short sentences often doesn’t cut it, and we know this from long, hard experience. We even had a fad of “explaining using the ten hundred words people use most often,” known eventually as the Thing Explainer methodology. I even tried it myself, to no real benefit or impact, because the things one wishes to explain tend to demand a higher density of meaning than a “plain” sentence can hold. Communicating science is a careful balance between choosing readable language and packing in enough meaning to satisfy your audience; it takes more skill than simply replacing your long words with a greater number of shorter ones.

“OK, science is special. But I’m not a scientist, so I don’t care.” – A hypothetical sceptic.

Not so fast there, imaginary critic. Science is a part of my lived experience, so it’s easy for me to draw on for examples, but there are plenty of examples from elsewhere which show the mediocrity of a readability test. Here’s a sentence which is famous for being “technically a correct sentence” while also being almost unreadable:

“Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.”

It reads as nonsense, and it scores low on the readability test primarily because the word “buffalo” has three syllables – but that is not why the sentence is hard to read. An alternative is far easier to understand, for reasons entirely unrelated to word length, yet on the readability test it scores better but still not very well at all:

“Many bison from Buffalo, which other bison from Buffalo confuse, confuse some bison from Buffalo.”

The two sentences have similar syntax and static semantics, yet the second conveys its meaning dramatically better to the casual reader. The readability test can’t detect that – only that on average the word length has dropped a bit. I think this gets quite nicely to the point that meaning isn’t quantifiable in anything like so straightforward a way as counting syllables, and the proxy measurements we rely on are almost inevitably missing part of the larger picture. A readability test can’t tell you whether you’re reaching your audience, and unless you’re really struggling to do that you shouldn’t be making writing decisions based on empirical formulae.
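To put rough numbers on the two Buffalo sentences, here is the arithmetic as a short Python sketch using the standard Flesch Reading Ease formula. The word and syllable counts are my own hand counts (8 words and 24 syllables for the original, 15 words and 28 syllables for the rewrite, one sentence each), so a tool with a different syllable counter may land somewhere slightly different.

```python
def flesch_reading_ease(sentences: int, words: int, syllables: int) -> float:
    """Flesch Reading Ease from raw counts: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

# Hand-counted (an assumption, not tool output):
# "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."
original = flesch_reading_ease(sentences=1, words=8, syllables=24)

# "Many bison from Buffalo, which other bison from Buffalo confuse,
#  confuse some bison from Buffalo."
rewrite = flesch_reading_ease(sentences=1, words=15, syllables=28)

print(original)  # roughly -55
print(rewrite)   # roughly 34
```

The rewrite gains its points almost entirely from a lower syllables-per-word ratio, and it actually loses a few for being a longer sentence. Neither term has anything to say about why a human finds it easier to follow.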

This post was rated a 52.8 on the Flesch Reading Ease scale; “fairly difficult to read”. The tail does not wag the dog, indeed.

up_goer_five

xkcd.com/1133/

.

.

.

.

… well since you’re still here… one more example that entertained me is from Moby Dick [sauce]:

“Though amid all the smoking horror and diabolism of a sea-fight, sharks will be seen longingly gazing up to the ship’s decks, like hungry dogs round a table where red meat is being carved, ready to bolt down every killed man that is tossed to them; and though, while the valiant butchers over the deck-table are thus cannibally carving each other’s live meat with carving-knives all gilded and tasselled, the sharks, also, with their jewel-hilted mouths, are quarrelsomely carving away under the table at the dead meat; and though, were you to turn the whole affair upside down, it would still be pretty much the same thing, that is to say, a shocking sharkish business enough for all parties; and though sharks also are the invariable outriders of all slave ships crossing the Atlantic, systematically trotting alongside, to be handy in case a parcel is to be carried anywhere, or a dead slave to be decently buried; and though one or two other like instances might be set down, touching the set terms, places, and occasions, when sharks do most socially congregate, and most hilariously feast; yet is there no conceivable time or occasion when you will find them in such countless numbers, and in gayer or more jovial spirits, than around a dead sperm whale, moored by night to a whaleship at sea.”

– which scores negatively on the reading ease test, but doesn’t quite capture the same semantics as “sharks are bad, but so are people, because we both like to butcher stuff”.

About stoove

A physicist, researcher, and gamesman. Likes to think about the mathematics and mechanics behind all sorts of different things, and writing up the thoughts for you to read. A competent programmer, enjoys public speaking and mechanical keyboards. Has opinions which might even change from time to time.
This entry was posted in General Science, Opinion.

2 Responses to Word Explainer

  1. unceasingtoe says:

    Reblogged this on Not Another Research Blog and commented:
    Science, and the scientific method, can apply to myriad situations that many don’t often appreciate. In this case there is a nice example of why you should be careful of blindly following readability metrics when writing… A problem which Herman Melville thankfully didn’t have to worry about!

  2. Alex Smith says:

    While I’ll always advocate shorter sentences, there shouldn’t be an emphasis on always making the language more accessible – especially when compromising accuracy. Communication is hard as it is. Finding the right words from a basket of a commonly shared subset is usually insufficient to describe an experience which, often, you can only truly understand by experiencing it yourself. You can’t describe a sensation in words such that someone else knows precisely what you felt and to what extent (perhaps generally getting close, even empathising somewhat, and roughly agreeing with the words thrown-back at you is all we aim for) unless they’ve experienced it too. Often it’s easier just to use catch-all terms so that someone has a shot of understanding where you’re coming from, but using catch-all words lowers the precision of the meaning being received (measured). The specific meaning, as you say, has been lost.

    It is no different in science. The objective of communicating findings, of course, is to get across some knowledge whilst minimising any misunderstanding. So if outlandish or unusual language is required then it should be used. Anything you can find which gets close to the meaning that you want to convey should be used, (almost) regardless of how complicated it might become – I’ve often seen people create their own seven or eight words in a single paper because it was impractical to do otherwise. One could, of course, then go on to try and make an explanation more accessible, as that chap did, but not at the cost of discarding the more accurate and therefore the more helpful – or at least less unhelpful – wording.
