Tuesday, 20 December 2011

Elements of scientific style

Scientists must write. In fact, between papers, posters and proposals, they must write a lot. So it's somewhat surprising that many science students are never given formal teaching in how to write well. They are expected to learn through a grand process of trial and error, with each submission throughout a degree corrected, graded and returned. The language in scientific publications is usually correct but I find that it is often unclear and cumbersome. What it lacks, I believe, is style.

All writers should be motivated to write well. What is better written is more widely read and that is what all scientists want. An important addition to this thought is that many readers of English-language journals are not native speakers. They are less likely to understand your use of the future perfect tense. All the more reason to avoid complicated constructions.

The time has at last come for me to start preparing my PhD thesis. To cultivate some inspiration, I've been trawling through a few materials about how to write better. This post is a summary of my findings. It is advice to myself that you may also find useful.

I think everyone should read two items. First, Section III of Strunk & White's Elements of Style describes straightforward ways to improve composition and provides contrasting examples of good and bad composition. Second, The Economist Style Guide, which was available online, is almost entirely dedicated to concise, jargon-free word choice. It is also populated with dry humour in its examples.

Technical points can be drawn from the style guides of relevant scientific organisations. For example, the American Institute of Physics and International Astronomical Union have style manuals that should be read by physicists and astronomers. Also, Donald Knuth (of TeX fame) released notes on mathematical writing based on a course at Stanford.

Composition Starts with a C...

... and so does everything about it. In many guides, the basics of style are Cs. The American Institute of Physics Style Manual is most explicit, declaring that writing should be clear, concise, and complete. But how does one achieve these things?

Clear writing is achieved through short, active sentences, at least for the important points. Readers remember short, sharp statements. Avoid long, wandering sentences, where a large number of clauses can lead to confusion, and especially avoid a sequence of loose sentences. Try to use simple tenses. The present tense usually suffices. If you find yourself writing in the future perfect or past continuous, consider for a moment whether it is necessary.

Concise writing is born of frugal use of words. For example, "The fact that..." can almost always be reduced to "That..." Do not declare that a result is "very interesting" or say things "in the opinion of the present authors". Let the reader decide what is interesting, and do not state the obvious. Aside from using fewer words, you can also use shorter, simpler words. The Economist Style Guide is full of examples. The added danger of complicated words is that they can be misused and misunderstood. As The Economist writes of "underprivileged", sometimes they don't even make sense:

Since a privilege is a special favour or advantage, it is by definition not something to which everyone is entitled. So underprivileged, by implying the right to privileges for all, is not just ugly jargon but also nonsense.

Ensuring that writing is complete is difficult in technical work. There is a fine line between writing everything an article needs to be logically complete and writing more than is necessary. To evaluate how much to write, you must first understand the audience for whom you are writing. A conference proceeding for a small meeting of researchers in a specific subfield requires less background material than an article for a widely circulated journal or a fellowship proposal that will be read by scientists in other fields.

Some requirements are simple. For a start, all symbols and abbreviations must be defined and, if necessary, explained. You may define all sorts of derived quantities but it will help the reader if you explain why they are interesting.

Another point about completeness regards the use of citations. Personally, I think some writers feel that a citation absolves them of any need to explain the content of the cited work. Many articles rely heavily on work presented fully in other articles. A citation should be accompanied by some explanation of the logic that leads to whatever conclusion or result is being employed, especially if that work is itself very long. A sentence or two may save your readers the trouble of looking up another paper and therefore make them more likely to continue reading. It is off-putting to open a six-page article to find that the authors rely heavily on detailed and subtle calculations or measurements in a poorly written 30-page epic.

Another C that can be added is to make sure your writing is correct. You shouldn't be leaving hanging participles or hidden verbs but there are subtler errors in English usage. Many are explained in The Economist Style Guide. A dry-witted example is the difference between "among" and "between":

To fall between two stools, however painful, is grammatically acceptable; to fall between the cracks is to challenge the laws of physics.

Structured writing

It's no secret that most scientists skim papers before committing to reading them in detail. It pays to write for these skimmers by structuring content appropriately. Besides, structured writing helps lead the reader along structured thought.

The smallest logical unit of composition is the paragraph. Start each paragraph with a sentence that captures its content so that those who are skimming through will quickly get an idea of how the article presents its content. The paragraph principle leads me to plan a paper from the top down, all the way to the paragraph level.

When it comes to arranging paragraphs, I found an interesting point in a talk given at MIT. Points made in paragraphs form a logical sequence but there are usually multiple dependencies between these thoughts. More than one idea depends on more than one preceding idea. So how should one link the paragraphs? The answer is to choose an arrangement that gives the fewest "crossovers", as in the image below.

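[Image: the logical relationship between the paragraphs (left), arranged by the "layered" approach and by the "linear" approach]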

The logical relationship between the paragraphs is shown at the left. If we write all the starting points in sequence (the "layered" approach), we end up crossing back and forth between logical sequences. By instead using a "linear" approach, we avoid this problem, even though some logical jumps are necessary to bring everything together.
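For the programmers among you, here is a minimal sketch of the idea in Python. Everything in it is my own illustration rather than anything from the talk: the seven-paragraph outline is hypothetical, and since the image defines "crossovers" graphically, I use a rough stand-in metric, namely the number of dependency links whose two paragraphs do not sit next to each other. It also assumes the outline has no circular dependencies.

```python
# Hypothetical outline: each paragraph maps to the paragraphs it depends on.
# Two chains of thought (A and B) that come together in a conclusion C.
DEPENDS_ON = {
    "A1": [], "A2": ["A1"], "A3": ["A2"],
    "B1": [], "B2": ["B1"], "B3": ["B2"],
    "C": ["A3", "B3"],
}


def layered_order(depends_on):
    """Write all starting points first, then everything that follows from them."""
    remaining = {p: set(deps) for p, deps in depends_on.items()}
    order = []
    while remaining:
        # A layer is every paragraph whose dependencies are already written.
        layer = [p for p, deps in remaining.items() if not deps]
        if not layer:
            raise ValueError("circular dependency in outline")
        order.extend(layer)
        for p in layer:
            del remaining[p]
        for deps in remaining.values():
            deps.difference_update(layer)
    return order


def linear_order(depends_on):
    """Follow each chain of thought as far as it goes before starting the next."""
    order = []
    done = set()

    def place(p):
        order.append(p)
        done.add(p)
        # Continue straight on to any paragraph that has just become writable.
        for q, deps in depends_on.items():
            if q not in done and p in deps and all(d in done for d in deps):
                place(q)

    for p, deps in depends_on.items():
        if p not in done and not deps:
            place(p)
    return order


def jumps(order, depends_on):
    """Count dependency links whose two paragraphs are not adjacent (assumed metric)."""
    position = {p: i for i, p in enumerate(order)}
    return sum(
        1
        for p, deps in depends_on.items()
        for d in deps
        if position[p] - position[d] != 1
    )


if __name__ == "__main__":
    for name, arrange in [("layered", layered_order), ("linear", linear_order)]:
        order = arrange(DEPENDS_ON)
        print(name, order, "jumps:", jumps(order, DEPENDS_ON))
```

For this toy outline, the layered arrangement interleaves the two chains, so almost every link spans an unrelated paragraph. The linear arrangement leaves a single jump, where the first chain has to wait for the second before the conclusion.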

Special problems for technical writers

Some advice on style must be cast aside in technical writing. For example, it is difficult to write without jargon, though it can be kept to a minimum. There is no better way of describing a photometric redshift than to call it a photometric redshift. So use those words but explain them if your target audience warrants it.

There are potential problems with technical words that carry undesirable connotations in other contexts. For example, in a recent paper, I describe a choice of equations that allows "arbitrary boundary conditions". A co-author pointed out that "arbitrary" is usually synonymous with "pointless" or "not really worth pursuing". But in the mathematical context, it is precisely the correct word to use. It sounds bad to the lay reader but it is technically correct so I stuck with it. If you can avoid such words, do so, but not if the price is the precise technical meaning.

Mathematical and theoretical work is often written in a textbook-like style that makes for dense and heavy reading. You can help the reader by slowing down the concept-barrage. Restate definitions in simple ways. Explain why a definition or equation is interesting. Separate technical segments with paragraphs written in a more conversational style. Even in less technical or mathematical sections, make sure that you have always motivated the detailed calculations you have written.

The final C

The advice in the previous paragraph may seem to contradict the principle of concise writing and this brings us to a final C: compromise. George Orwell's sixth elementary rule is quoted in The Economist:

Break any of these rules sooner than say anything outright barbarous.

Ultimately, following these rules steadfastly may lead to a clunky sentence that simply does not read well. You can do well within the guidelines of good writing but there are occasions where a rule is better off broken. Perhaps a sentence sounds better if it begins with a conjunction or there is no escaping the precise meaning of a clunky phrase. In these cases, so be it.

The final piece of advice is point 15 from Knuth's opening section:

There is a definite rhythm in sentences. Read what you have written, and change the wording if it does not flow smoothly.

If the proof of the pudding is in the eating, then the proof of the writing is in the reading. Follow this rule above all.
