Spaces considered harmful

One thing I've noticed with many projects' source, from beginners' 100-liners to sizeable high-profile projects, is their tendency to utterly and completely fudge the indentation issue. My argument is that not only is indentation via spaces a bad idea, it's also morally wrong. It's a silly argument.

Space indentation is a bad idea

Many projects use somewhere between 2 and 4 spaces as the "indentation unit" of their source. Speaking to the developers, some of the reasons often given are :

"Tabs make the source too wide to read"

I've heard this numerous times, and each time I am left utterly bemused. The entire point of the hard tab character \t is that the actual horizontal space visible from printing it is not set in stone. You can set it to whatever you like. So in what way can tabs possibly affect the width of source ? They can't.

This doesn't even consider the fact that this complaint indicates that perhaps you need to rethink your source anyway, if you're having real width problems. This is discussed eloquently by Linus Torvalds in linuxbox:/usr/src/linux/Documentation/CodingStyle.
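To make the point about \t concrete, here is a minimal sketch of what a terminal or editor does when it displays a hard tab. The function name expand_tabs and its parameters are my own invention for illustration, not taken from any particular tool. The same bytes come out at whatever width the reader has chosen :

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: expand each hard tab to the next multiple of
// tab_width, the way a display would when showing the line.
std::string expand_tabs(const std::string& line, std::size_t tab_width)
{
	assert(tab_width > 0);
	std::string out;
	for (char c : line) {
		if (c == '\t') {
			// Pad to the next tab stop; always at least one column.
			std::size_t pad = tab_width - (out.size() % tab_width);
			out.append(pad, ' ');
		} else {
			out.push_back(c);
		}
	}
	return out;
}
```

Calling expand_tabs("\tfoo();", 8) gives eight columns of indentation and expand_tabs("\tfoo();", 2) gives two. The source file is identical in both cases; only the reader's setting differs.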

"It makes consistency impossible between developers"

Actually, if people are having trouble with using hard tabs, it means that they are having consistency problems - logically, they must be mixing space-indented and tab-indented source. Isn't it obvious that this is always a bad idea ? Each file and preferably each project must have a consistent style.

"I can't align comments"

Some developers like to make comments like this :

    if (foo || bar)      // this code is necessary
        frobnicate();    // due to bug 199 in libpointless
                         // which hasn't been fixed yet

and complain that tabs make this sort of alignment very difficult. To which I just say : "UGH!!". This side-commenting leads to long lines (and worse, short lines of comments that lead to dangerous terseness on occasion) and is particularly hard to edit. What happens when you need to add || baz == foo ? You have to play the re-formatting game. No fun. What on earth is wrong with the time-honoured :

    // this code is necessary due to bug 199 in libpointless
    // which hasn't been fixed yet
    if (foo || bar)
        frobnicate();

Space indentation is morally wrong

I believe in the light shone by the prophets, namely using a tab size of 8. However, I accept there are other, curious, people on this planet who prefer a different tabsize. That's OK - the wonder of hard tabbing means they can have their cake, and I can eat it (or something like that).

Now let's consider me and the other developer (let's call him "Hagbard") working on a project together. Hagbard started the project, using a space-based indentation unit of 2 spaces. I join in with a patch to introduce some memory leaks and seg faults. Suddenly, I find myself forced to use spaces, and an indentation size that is alien to me.

Let's review what has just happened. Hagbard has enforced his preferences upon me, for no good reason. It's fairly clear from this example what's going on - Hagbard is using a fascist source management system: "you don't like the indentation method - tough". Amazingly, space-based indentation is actually recommended in some coding styles (for example, Python's).

The inevitable result of this, on a medium-sized project, is a horrendous mish-mash of indentation sizes and styles. Don't stand there looking nonchalant - you've all seen such code; in universities, in business code, in student projects. Such chaos, and all avoidable merely by using that key to the left of your keyboard !

"But tabs don't always work !"

I've heard it said that viewing diffs doesn't work properly using hard tabs. All I can suggest here is that your tools are broken - I regularly both cat and vi patches, and they look fine. Fix your tools - don't impose fascist policies.

Tabs do not prevent alignment. Alignment is different from indentation. For example :

    for (std::vector<std::string>::const_iterator cit = v.begin();
         cit != v.end(); ++cit) {
        // etc.
    }

What's going on here ? Let's see (each <tab> is one hard tab, each ` is one space) :

    <tab>for (std::vector<std::string>::const_iterator cit = v.begin();
    <tab>`````cit != v.end(); ++cit) {
    <tab><tab>// etc.
    <tab>}

We're using spaces for alignment, but tabs for indentation. Show me the editor that this can look wrong on. The JSPWiki referenced below makes the claim that developers fuck this up. This alone (assuming it's true) does not excuse not doing it. Developers fuck up indentation too. The particularly bad example they show is fairly obviously from poorly-maintained software. You're not involved in poorly-maintained software are you ?

Note that the above code isn't too great anyway. Especially with if conditions, a small inline helper function describing the meaning of the test is often better. Don't end up like this (from the gcc source) :

    if ((reg_class_subset_p (class, rld[i].class)
         || reg_class_subset_p (rld[i].class, class))
        /* If the existing reload has a register, it must fit our class. */
        && (rld[i].reg_rtx == 0
            || TEST_HARD_REG_BIT (reg_class_contents[(int) class],
                                  true_regnum (rld[i].reg_rtx)))
        && out == 0 && rld[i].out == 0 && rld[i].in != 0
        && ((GET_CODE (in) == REG
             && GET_RTX_CLASS (GET_CODE (rld[i].in)) == 'a'
             && MATCHES (XEXP (rld[i].in, 0), in))
            || (GET_CODE (rld[i].in) == REG
                && GET_RTX_CLASS (GET_CODE (in)) == 'a'
                && MATCHES (XEXP (in, 0), rld[i].in)))
        && (rld[i].out == 0 || ! earlyclobber_operand_p (rld[i].out))
        && (reg_class_size[(int) class] == 1 || SMALL_REGISTER_CLASSES)
        && MERGABLE_RELOADS (type, rld[i].when_needed, opnum, rld[i].opnum))
      {

(There are worse examples !)
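As a rough illustration of the helper-function idea, here is a sketch. Everything in it (the Reload struct, its fields, and the three function names) is made up for the example; it is not gcc's data structure and does not reproduce the logic of the condition above. The point is only that each piece of a compound test gets a name :

```cpp
// Hypothetical record type, invented purely for this illustration.
struct Reload {
	bool has_input;
	bool has_output;
	int reg;	// 0 means "no register assigned"
};

// Each small helper names one part of the test.
inline bool is_input_only(const Reload& r)
{
	return r.has_input && !r.has_output;
}

inline bool has_no_register(const Reload& r)
{
	return r.reg == 0;
}

// The call site now reads as a sentence rather than a wall of operators.
inline bool can_merge(const Reload& r)
{
	return is_input_only(r) && has_no_register(r);
}
```

With helpers like these the call site shrinks to if (can_merge(r)), which rarely needs a continuation line at all, whatever your tabsize.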

"Ah, but what about this :"

    .... std::string s = "1. My quite long string. Really I believe in the sun,"
                         "2. the rain, and my magical guitar. Hurrah !\n";

The above source is space-indented. Look how the text lines up - this can be quite important for readability. But in fact using hard tabs correctly is only a little harder :

    .... std::string s = "1. My quite long string. Really I believe in the sun,"
                         "2. the rain, and my magical guitar. Hurrah !\n";

It's not too much to ask for, surely ?

"Hey, ever heard of indent ?"

Actually, yes. indent is a program that is capable of re-formatting some sources to a standard style. Unfortunately, it does not cover all common languages; and is not complete even in the best-supported languages such as C. It's generally impossible to get indent to work reliably and transparently for developers with a shared source tree.

Additionally, most editors are capable of transparently converting tabs/spaces on load and save. But of course, such things are still unreliable, and cannot be trusted. More importantly, this isn't an argument against using hard tabs; I can just as easily turn it around and say to space-based developers "why don't you turn this feature on ?"

Let's look at an example where automatic conversion from spaces to tabs goes wrong :

    while (p) {
      for (std::vector<std::string>::const_iterator cit = v.begin();
           cit != v.end(); ++cit) {
        // etc.
      }
    }

This is the source file, with spaces. Now imagine my editor reading this in, and converting the 2-spaces into 8-sized tabs on the screen. It will look like this :

    while (p) {
            for (std::vector<std::string>::const_iterator cit = v.begin();
                             cit != v.end(); ++cit) {
                    // etc.
            }
    }

Ugh. Note that the spaces lobby, who so like their multi-line statements to align correctly, have just forced me to read extremely badly-aligned code. More importantly, I must take considerable care to not accidentally re-align such broken indentation, as it will not work when converted back (here, my natural instinct to remove a tab on the line starting "cit != ..." will cause the space-based source to look horrendous). This actually impacts on my development time, and I know I'm not alone. Using tabs for indentation and spaces for alignment does not suffer from this problem to anything like the same extent, since the code looks right on my screen. Using all-spaces has actually lost data here.
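Here is a sketch of why the conversion goes wrong. I am assuming the editor does the naive thing, replacing every 2 leading spaces with one tab; the names spaces_to_tabs and first_column are hypothetical, and real editors may well be smarter or dumber than this. The converter has no way of telling indentation from alignment, so the alignment spaces get scaled up along with everything else :

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical converter: replace every `unit` leading spaces with one
// hard tab and keep any left-over spaces. It cannot know which of the
// spaces were indentation and which were alignment.
std::string spaces_to_tabs(const std::string& line, std::size_t unit)
{
	assert(unit > 0);
	std::size_t lead = 0;
	while (lead < line.size() && line[lead] == ' ')
		++lead;
	std::string out(lead / unit, '\t');
	out.append(lead % unit, ' ');
	out += line.substr(lead);
	return out;
}

// Column at which the first non-whitespace character is displayed.
std::size_t first_column(const std::string& line, std::size_t tab_width)
{
	assert(tab_width > 0);
	std::size_t col = 0;
	for (char c : line) {
		if (c == '\t')
			col += tab_width - (col % tab_width);
		else if (c == ' ')
			++col;
		else
			break;
	}
	return col;
}
```

In the space-indented file the continuation line starts 5 columns to the right of the "for", exactly under the "std". After conversion with unit 2 and display at tabsize 8, the "for" sits at column 8 and the continuation at column 25, so the gap has grown from 5 to 17.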

The rule here is simple : tabs for indentation, spaces for alignment.
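If you want to police this rule mechanically, a minimal sketch of a per-line check might look like the following. The function name leading_whitespace_ok is my own, and this is an illustration rather than a replacement for a real linter :

```cpp
#include <cstddef>
#include <string>

// Hypothetical check: true if the line's leading whitespace is zero or
// more tabs followed by zero or more spaces, i.e. no tab follows a space.
bool leading_whitespace_ok(const std::string& line)
{
	std::size_t i = 0;
	while (i < line.size() && line[i] == '\t')
		++i;	// indentation
	while (i < line.size() && line[i] == ' ')
		++i;	// alignment
	// A tab at this position can only have come after a space.
	return i == line.size() || line[i] != '\t';
}
```

Note the limitation: a line indented purely with spaces also passes, because the check cannot know whether those spaces were meant as indentation or as alignment. It only catches the case that is unambiguously wrong, a tab after a space.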

"My xterms can't copy/paste tabs"

Mine can't either, so I usually just use multiple vi sessions. However, it is possible to get this working properly (i.e. the tabs are not converted to spaces during the paste), as I have done it previously (yes, I forget how). Besides, there are often other things that get in the way, like line numbering. Again, broken tools do not mean it's OK to impose your indentation methods upon me.

"I can't share files with short tabs"

Consider a user who prefers to set his tabsize to 2. He develops on a wide terminal and produces hard-tabbed source files 80 characters wide at places (from his point of view). But you get this source, and, using 8-sized tabs, the lines that he has written are wrapped.

On the face of it, this scenario seems like a killer argument against hard tabbing. After all, haven't we just destroyed the possibility of sharing source nicely ?

Well, maybe, but it is not hard tabbing's fault. It is the fault of the original developer who writes lines that are too long. This is an argument against developing using small tabsizes, not hard tabbing per se. I would indeed recommend using a tabsize of 8, but this is not the consequence of hard tabbing - it is the consequence of writing bad code that is too wide.

The alternative, "fixing" this problem, is to hard code 2 spaces as an indentation unit in the source. But note - not only have we suddenly imposed indentation style on other people, but we still have the problem of the source lines being over-filled. The problem hasn't gone away just because the indentation is fixed, instead of variable.
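The arithmetic behind this scenario is easy to sketch. The function name display_width is hypothetical, and I am assuming every character other than a tab occupies exactly one column :

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of columns a line occupies when hard tabs
// advance to the next multiple of tab_width.
std::size_t display_width(const std::string& line, std::size_t tab_width)
{
	assert(tab_width > 0);
	std::size_t col = 0;
	for (char c : line)
		col += (c == '\t') ? tab_width - (col % tab_width) : 1;
	return col;
}
```

A line nested five levels deep with 70 characters of code is 80 columns wide at tabsize 2 and 110 columns wide at tabsize 8. Hard-coding 2 spaces fixes the figure at 80 for everyone, but the five levels of nesting and the 70 characters of code are still there.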

"Why should I expend effort for your benefit ?"

This is the most common objection I hear - the spaces lobby ask why they should bother using tabs: they don't need it, and they don't want it.

But this is ultimately selfish. It rings of people not writing documentation for others, because they understand the source, and they don't need it. Obviously (before you all go hysterical) documentation is far more important than the indentation issue, but my point is that the same attitude underlies both stances. Ultimately (assuming you are at all interested in other people reading and modifying your source, as any good developer is) making the tiny sacrifices you must in order to accommodate a wider range of developers will be to your benefit: if the source is more readable to more developers, you are more likely to get patches.

Think about the relative costs and benefits. The costs are minimal; the benefits are great.

Summary

In summary: don't do it, kids.

References

Take the ideas you find useful. Try not to get hung up on the labels. - Jonathan S. Shapiro
2003/09/07