March 2009

The danger of “language” in programming

I recently stumbled upon this blog entry from Steve Riley. There are a number of things in this that can be agreed with or argued about, but one in particular struck me as especially dangerous:

“[C++] has a million nuances (like the English language) that somehow make expressing exactly what you want to say seem easier than other languages.”

I can interpret this sentence in three ways. The first is that it is meaningless: just some pretty wording to impress people, only to say “C++ is good”. George Orwell warned us about this, but I doubt Steve was that sloppy. The second is literal: C++ shares a trait with natural languages, the “million nuances”. I don’t think that was the intended meaning, though it was how I first understood it. The last (and most likely) is the analogy: C++ and English have more nuances than other languages (programming and spoken, respectively). I would call that a bias towards his native tongue, but his point remains.

Anyway, comparing natural and programming languages disturbs me, because drawing the line between similarities and differences is hard. That makes the jump from right premises to wrong conclusions way too easy. Analogies are a great tool in many cases, but when talking about computers, as Dijkstra said, they just don’t work.

A fundamental difference between the two kinds of languages is often overlooked: their usage. Natural languages are exclusively used to talk to people, while programming languages are also used to talk to machines. This is not the same at all: when told to do something, people do as needed. Machines do as told.

In virtually every use of a computer, we have some specific need, which we describe to the computer using some programming language(s). (This need may evolve, but that’s not the point.) The resulting description of this need is the program. What often happens at this point is a disagreement between the computer and the human about the meaning of this program, and that disagreement is a bug.

There are two ways to deal with bugs: correcting them, and avoiding them. The production of a program of any kind always combines both approaches. Most of the time, the (explicit) focus is on correction. Sometimes, it is on avoidance. (Life-critical systems are a typical example.)

Of these two, the only one that is significantly influenced by the structure of programming languages is avoidance. It is therefore crucial to insist on it when choosing (or designing) a language. Several characteristics of programming languages can help or impede bug avoidance. Conciseness can help, for the number of defects per line of code is said to be pretty much constant across languages. A paranoid type system can help, if it is still convenient. But I think what helps most is obviousness. The more obvious, the better. Ideally no special case, no subtlety, no nuance. By itself, the language should be boring.

That leads us to why Steve’s above sentence is so dangerous. Not because it’s wrong (although it is), but because it’s right. C++ does have many nuances. It is a very interesting and very subtle language, to the point that even machines (namely compilers) disagree about its meaning. That is not good. That is a recipe for disaster.
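To make that last claim a bit more concrete, here is a small sketch of one such nuance: the order in which function arguments are evaluated is left unspecified by the C++ standard. The names first, second, add and evaluation_order are hypothetical, made up for this illustration. As far as I know, common compilers really do differ here (GCC and Clang are often reported to pick opposite orders), but take that as an observation, not a guarantee.

```cpp
#include <string>

// Hypothetical helpers: each one records that it ran, then returns a value.
static std::string call_log;

int first()  { call_log += "first ";  return 1; }
int second() { call_log += "second "; return 2; }

int add(int a, int b) { return a + b; }

// Reports the order in which the two arguments of add() were evaluated.
std::string evaluation_order() {
    call_log.clear();
    int sum = add(first(), second());  // sum is 3 whatever the order
    (void)sum;
    // The standard allows either "first second " or "second first " here.
    return call_log;
}
```

The sum is the same either way, so nothing looks wrong until the two functions start sharing state. Then the same source text means two different things, depending on who reads it.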