One of the hardest things to master as a writer is the appropriate use of vocabulary to get one's point across effectively and engagingly to the target audience. So, as an author, what happens when your word processor pops up and says that your words are too complicated?
Well, what does a mere computation device know about words?
What on earth is the Flesch Reading Ease Test and why is it telling an accomplished writer to tone it down a bit? You only have to dig a little to turn up some basic facts about the Flesch tests – they are (unsurprisingly) a way of assigning a number to how easy any (English) text is to read. You simply count the numbers of sentences, words, and syllables, and crunch the numbers. Presto, you know how easy your text is to read:
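The formula itself appears to have gone missing above, so here is a minimal Python sketch of the standard Flesch Reading Ease calculation. The weights are the published ones; the syllable count is taken as an input, because counting English syllables automatically is a hard problem in its own right:

```python
def flesch_reading_ease(total_words, total_sentences, total_syllables):
    """Flesch Reading Ease score: higher means easier to read.

    Uses the standard published weights. Syllables are supplied by
    hand here, since automatic syllable counting is unreliable.
    """
    return (206.835
            - 1.015 * (total_words / total_sentences)
            - 84.6 * (total_syllables / total_words))

# "The cat sat on the mat." – 6 words, 1 sentence, 6 syllables.
# Very simple text can score above 100.
print(flesch_reading_ease(6, 1, 6))
```

The score is driven by just two ratios – average sentence length (words per sentence) and average word length (syllables per word) – which is exactly the limitation discussed below.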
The higher the number you calculate, the easier your text is to read, apparently. But here is where things get interesting, because the number you obtain from the formula has an equivalent education level assigned to it: 90 is “understandable by an 11-year-old” and 30 is “college graduate” level.
Now, these tests were developed to assess which audiences will understand which text, and with what level of effort. Used in moderation, and with a clear picture of the audience, they can help someone decide how to approach their writing. What they’re not designed for is a flat yea or nay on any and every piece of work! I therefore empathize with Roger’s reaction:
But there’s more to this story – I’m a scientist, and that means that by necessity I’m a Numbers Person. One does not simply apply a formula and mindlessly accept the results, even if it’s appropriate for the setting. No, if you care about your writing then you should understand the real reasons that the Flesch–Kincaid readability tests are insufficient in every application. So, are you ready to argue semantics with me?
My first objection is that the formula I show above is empirical – it was discovered by trial and improvement, without reference to an overall mathematical theory about how meaning is constructed. That means that, while it may well have held true for the test group, you have to be exceedingly cautious about any extrapolation – such as applying a measure derived in 1975 to an audience in 2018. Language changes, people change, and people speak and read differently across the country – there is no particular reason to believe that what was true in 1975’s New Orleans is sensible in 2018’s Glasgow. Empiricism is a touchy subject among scientists for a reason: we are burned by it too often.
But what about those silly semantic arguments I teased you with earlier? Well, simply put, these tests measure features of the forms of the words you use, but make no reference to their content. Linguists have a framework for classifying the rules which language follows; a sentence can succeed or fail at each of the following levels:
- Syntax – rules about how words and sentences must be constructed to be valid. “Thrifty” is a word but “salikhjdfg” is not, each sentence must have a verb, and so on.
- Static semantics – to do with how the meanings of the components of a sentence combine. “I are fling” doesn’t have an ascribable meaning on its own, but “I have a cat” does.
- Semantics – to do with the actual meaning of the sentence in context. “Horse ate me” means something, but if you intended to communicate that you ate a horse then it fails on the semantic level. “I’m a Hodgeheg” doesn’t pass the semantics test… unless you’re reading the right story.
When applied to words, sentences, and texts as a whole, these readability tests use easily measured features of the syntax to infer something about the static semantics of a text, in the hope that this informs you about the semantics – which I think is a bit naive. The problem is that there are plenty of situations where using simpler words and sentences dilutes the meaning of what you’re trying to communicate – and this is a problem we encounter in science all the time! Science is hard to explain in “plain English” because you need to communicate a lot of specific meaning.
Using simple words and short sentences doesn’t often cut it, and we know this from long hard experience. We even had a fad of “explaining using the ten hundred words people use most often,” eventually known as the Thing Explainer methodology. I even tried it myself, to no real benefit or impact, because the things one wishes to explain tend to demand more meaning density than a “plain” sentence can hold. Communicating science is a carefully balanced skill of choosing readable language while packing in enough meaning to satisfy your audience – one which takes more skill than simply replacing your long words with more, shorter words.
“OK, science is special. But I’m not a scientist, so I don’t care.” – A hypothetical sceptic.
Not so fast there, imaginary critic. Science is part of my lived experience, so it’s easy for me to draw on for examples, but there are plenty of others which show up the mediocrity of a readability test. Here’s a famous sentence which is well known for being “technically a correct sentence” and also for being almost unreadable:
“Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo.”
It’s nonsense, and it scores low on the readability test primarily because the word “buffalo” has three syllables – which is not why the sentence is hard to read. An alternative is far easier to understand, for reasons entirely unrelated to word length, yet it scores better on the readability test – though still not very well at all:
“Many bison from Buffalo, which other bison from Buffalo confuse, confuse some bison from Buffalo.”
The rewritten sentence has similar syntax and static semantics but dramatically different readability, and dramatically different semantics to the casual reader – yet the readability test can’t detect that; it only sees that the average word length has dropped a bit. I think this gets quite nicely to the point that meaning isn’t quantifiable in anything like so straightforward a way as counting syllables, and the proxy measurements we rely on are almost inevitably missing part of the larger picture. A readability test can’t tell you whether you’re reaching your audience, and unless you’re really struggling to do that, you shouldn’t be making writing decisions based on empirical formulae.
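To make that concrete, here’s a rough Python check of the two sentences above. The syllable counts are my own hand counts (8 words of 3 syllables each for the first sentence; 15 words and 28 syllables for the second), so treat the exact figures as approximate:

```python
def flesch_reading_ease(words, sentences, syllables):
    # Standard Flesch Reading Ease formula: higher = easier to read.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

# "Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo."
# 8 words, 1 sentence, 3 syllables per word -> 24 syllables.
buffalo_score = flesch_reading_ease(8, 1, 24)

# "Many bison from Buffalo, which other bison from Buffalo confuse,
# confuse some bison from Buffalo."
# 15 words, 1 sentence, 28 syllables (counted by hand).
bison_score = flesch_reading_ease(15, 1, 28)

print(f"buffalo: {buffalo_score:.1f}, bison: {bison_score:.1f}")
```

The nonsense sentence comes out far below zero, while the intelligible rewrite lands around the “college graduate” band – the formula rewards shorter words, not clearer meaning.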
This post was rated a 52.8 on the Flesch Reading Ease scale: “fairly difficult to read”. The tail does not wag the dog, indeed.
… well, since you’re still here… one more example that entertained me is from Moby Dick [sauce]:
“Though amid all the smoking horror and diabolism of a sea-fight, sharks will be seen longingly gazing up to the ship’s decks, like hungry dogs round a table where red meat is being carved, ready to bolt down every killed man that is tossed to them; and though, while the valiant butchers over the deck-table are thus cannibally carving each other’s live meat with carving-knives all gilded and tasselled, the sharks, also, with their jewel-hilted mouths, are quarrelsomely carving away under the table at the dead meat; and though, were you to turn the whole affair upside down, it would still be pretty much the same thing, that is to say, a shocking sharkish business enough for all parties; and though sharks also are the invariable outriders of all slave ships crossing the Atlantic, systematically trotting alongside, to be handy in case a parcel is to be carried anywhere, or a dead slave to be decently buried; and though one or two other like instances might be set down, touching the set terms, places, and occasions, when sharks do most socially congregate, and most hilariously feast; yet is there no conceivable time or occasion when you will find them in such countless numbers, and in gayer or more jovial spirits, than around a dead sperm whale, moored by night to a whaleship at sea.”
– which scores negatively on the reading ease test, but whose semantics aren’t quite captured by the far more “readable” “sharks are bad, but so are people, because we both like to butcher stuff”.