hwechtla-tl: Virtues of the Unix shell


Time and again, I bump into some opinions or discussion about how the Unix user interface, the command line, should be improved, or maybe replaced with something completely/relatively different. This time, it was the comments on this lwn.net article: https://lwn.net/Articles/429118/#Comments

My own opinion is this: it's not so simple.

There exist numerous alternatives to the traditional Unix command line, even within Unix-like systems like modern Linux/*BSD systems. Some of them are the result of a great deal of thought; take scsh, for example. Many of these alternative utilities are not only user interface experiments, but also provide new and interesting conceptual models for processing data: for instance, if you want to replace plain text with structured objects, you can use any of the very-high-level programming languages, such as Python or Scheme; if you want to replace the file system with structured data, you can save all your data in different kinds of databases instead. So, with this plethora of tools, ready to use, just why does the temptation of the traditional Unix command line still exist?

It's because it's really hard to come up with a better alternative.

First off, it's hard to beat plain text as a medium for data. Inertia is not the only reason for object-oriented shells not taking off. Plain text is easy to view and inspect, and many data structures have a natural representation in plain text. Even those that do not have a natural representation in plain text can be readily expressed in plain text. Plain text has lots of tradition for how things are expressed -- that tradition is missing from structured data. It takes decades to build the same kind of tradition.

Sure, representing all data as intelligent objects sounds good, but when you start implementing the idea, you will soon realise that you have to make a lot of design decisions. How can you ever make an informed decision on all this stuff? This is the shell. The data you are manipulating might be anything, so you have to design a representation for everything. What's the equivalent of "tr" for objects? How could you be sure there should be one? How could you be sure there should not? What should it do for ".doc" files? What could be the interface of objects toward different classes of tools? What would the tool classes be?

It doesn't take long before using a single interface, the plain text, actually starts to seem like a good design choice.
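As a rough illustration of what the single plain-text interface buys, here is a minimal sketch in which four general-purpose tools that know nothing about each other are combined into a word-frequency counter. The function name word_freq and the sample input are made up for this example; only standard tr, sort and uniq are assumed.

```shell
#!/bin/sh
# word_freq is a hypothetical helper name, not something from the text above.
# It reads text on standard input and prints "count word" lines,
# most frequent word first.
word_freq() {
    tr -cs '[:alpha:]' '\n' |       # turn every run of non-letters into a newline
    tr '[:upper:]' '[:lower:]' |    # fold case so "The" and "the" count together
    sort |                          # group identical words on adjacent lines
    uniq -c |                       # collapse each group into "count word"
    sort -k1,1nr -k2,2              # highest count first, ties ordered by word
}

# Made-up sample input.
printf 'the cat and the hat\nThe End\n' | word_freq
```

None of the tools involved had to agree on anything except "lines of text", which is the whole interface.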

Another thing that gets more than its share of whining is the inconsistency of command syntax. Oh no, "mv" has this syntax, "dd" has another, and everything and your mother does things in its own style. Usually the excuse is history, but that's not the only reason.

Just imagine you got to redesign all the command interfaces. Remember, this is the shell. A command may do anything, so you have to design a user interface for any kind of task. Quite daunting, isn't it? And here's the problem of orthogonality and consistency: all these different kinds of tasks are, well, different. They are also somewhat similar, in ways that partially overlap. Should the archiver interface be modelled after the file system interface or the file conversion interface? Or maybe the mail folder interface? What kind of categories should the utilities be grouped into? What is the equivalent of "-n" for all text utilities? How could you be sure whether all utilities should have that switch? What should it do for "sort"? If you decide that all filename manipulation tools should have a "--also-copy" switch, how can you make sure it will make sense for all tools to come?
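To make the "-n" question concrete, here is a small sketch showing what that one switch currently means in four standard text utilities. The helper name sample and its three input lines are invented for the example.

```shell
#!/bin/sh
# sample is a hypothetical helper that prints three made-up input lines.
sample() { printf '10\n9\n100\n'; }

sample | sort -n      # sort: compare lines as numbers instead of as strings
sample | head -n 2    # head: -n takes an argument, the number of lines to print
sample | grep -n 0    # grep: prefix each matching line with its line number
sample | sed -n 2p    # sed: suppress automatic printing, so only line 2 appears
```

Each meaning is sensible for its own tool, and it is not obvious which of them a "consistent" redesign would pick for all four.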

My (relatively simple) laptop install has 2090 commands for the ordinary user. Some of these could be dispensed with, but many provide multiple kinds of functionality. Are you sure you could design a consistent interface to the 2k-10k functions that my commands provide? Just to pick a random one, what should the functions provided by the "uname" command be made into? What about "montage"? What kind of consistency would you suggest between these two commands?
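For anyone who wants a comparable figure for their own machine, the following is one possible way to count commands. It is only a sketch: how the number 2090 was obtained is not stated, and count_commands is a hypothetical helper that counts distinct names of executable regular files in the directories listed in $PATH, ignoring shell builtins, aliases and functions.

```shell
#!/bin/sh
# count_commands is a hypothetical helper, not the method used for the figure above.
# The body is a subshell, so the IFS change does not leak into the caller.
count_commands() (
    IFS=:                                   # split $PATH on colons
    for dir in $PATH; do
        [ -d "$dir" ] || continue           # skip empty or missing entries
        for f in "$dir"/*; do
            # keep only executable regular files, printing the bare name
            [ -f "$f" ] && [ -x "$f" ] && printf '%s\n' "${f##*/}"
        done
    done | sort -u | wc -l | tr -d ' '      # distinct names, count, strip padding
)

count_commands
```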

If you look at programming languages, you'll notice that the best of them have designed consistent interfaces for a few things, such as iterating over a collection, holding a resource while doing something, doing something on a specific condition, and reporting and handling errors. Standardising all this has taken the experience of several decades of programming language design and experimentation. Now, are you ready to standardise the interfaces for everything that users might want to do on their computer?

Or, might it be more sensible to let people choose the command line syntax for their commands, using existing tools as starting points where applicable?

In my opinion, this difficulty of designing consistent interfaces for everything in the world is the reason why we haven't seen, and maybe will never see, a user interface that stays consistent for all kinds of tasks, and all kinds of data. Sure, the "Edit" menu is quite consistent, all right, and you might have a standard way of requesting online help from the tools you work with. But that's really little consistency. Just imagine you'd have to design a consistent graphical user interface for everything you can do with the Unix command line. It probably won't happen. Now, Windows has a "PowerShell" that's supposed to be a more intelligent command line. The only gotcha is that most programs don't support its interface requirements. My guess is that they never will. That, or it will take them 30 years to do so, and along the way there will be as many bad design decisions as there have been in the first 30 years of the Unix command line.

The kind of consistency that copy&paste provides has always been available on Unix systems. You can always save your output in a file, and read your input from a file. That's actually more than you can do with copy&paste, and with a more consistent interface. And help? Lo and behold, we have a command for requesting help. It's so standard that "vi" has a keybinding for looking up help for the command / library call under the cursor! And the man pages are also actually useful, at least for non-GNU programs, unlike any online documentation for graphical programs I've ever read.
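A minimal sketch of the "save your output in a file, read your input from a file" pattern follows. The function name copy_paste_demo and the sample data are made up, and mktemp is assumed to be available (it is not part of POSIX, though it is present on common Linux and BSD systems).

```shell
#!/bin/sh
# copy_paste_demo is a hypothetical name; the fruit list is made-up sample data.
copy_paste_demo() {
    tmp=$(mktemp) || return 1
    printf 'pear\napple\nfig\n' > "$tmp"   # "copy": save one program's output in a file
    sort < "$tmp"                          # "paste": another program reads it as input
    rm -f "$tmp"                           # clean up the scratch file
}

copy_paste_demo
```

The same two redirections work for any program that writes to standard output or reads from standard input, which is the consistency the paragraph above refers to.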

Another line of thinking that supports the Unix way is that you should optimise for the common case. I don't know what the common case is for your computer usage, but for me it's editing text, processing text, arranging files, synchronising files with version control, making remote command connections, browsing the web, reading my e-mail, installing and updating software, and automating these processes where possible. Guess what? The Unix command line makes this very efficient. What are your odds of designing a user interface that achieves consistency between all kinds of things and makes the common tasks easy?

And then there's the complaint that shell syntax is horrible. Do tell. But then, what's the alternative? It's good that a command and its arguments are separated by spaces, because that's nice to write. Yes, then you need some way of telling that a space is to be taken literally (for example, as a part of a file name); would you sacrifice the common case of file names that don't have spaces for the uncommon case of file names that do? What about wildcards? Should they be requested separately? I wouldn't be so sure. Which one would you prefer:

rm tempfile *~ .gvpics/*

or

for f in ['tempfile'] + glob('*~') + glob('.gvpics/*'): remove(f)

What about very simple things, like:

diff -u file1 file1~

Would you rather have it like

differences("file1", "file1~", type="unified")

Would you even settle for

(run '(diff -u file1 file1~))
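The trade-off between unquoted common cases and quoted uncommon ones, raised above, can be tried safely in a scratch directory. This is only a sketch: quoting_demo and the file names are invented for the example, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# quoting_demo is a hypothetical name; all file names are made up.
# The body is a subshell, so the cd does not affect the caller.
quoting_demo() (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    touch tempfile 'my notes.txt' 'draft~' 'old draft~'
    rm tempfile *~            # common case: no quoting; the glob passes each match,
                              # including 'old draft~', as one argument
    rm 'my notes.txt'         # uncommon case: quote the name that contains a space
    ls | wc -l | tr -d ' '    # print how many files are left
    cd / && rm -rf "$dir"     # remove the scratch directory
)

quoting_demo
```

Note that the wildcard handles the file name with a space without any quoting, because the shell splits the command into words before it expands the pattern.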

Sure, making nontrivial scripts (especially ones that won't break) is hard in shell. It's natural, because the shell is not designed to be a general purpose programming tool. But could we really make it one without sacrificing its other virtues? It turns out that the shell is already a superb tool for specific programming purposes: it's one of the few programming languages with which relative amateurs are actually able to do something useful (from their subjective point of view).

What about the commands being cryptic and unmnemonic? Well, I've used the most touted alternative, long command names with dashes within, and completion for entering them effectively -- for instance, emacs functions and Scheme shells. It might be just me, but I find these long command names harder to remember (although easier to guess) than the short Unix-y ones. From my point of view, once I know how a name originates, it does not strain my memory at all to remember which letters are missing (such as o and e from "move"). And, it's easier to remember whether the command is "mv" or "vm" than it is to remember whether it is "move-file" or "file-move".

So, what I'm getting at is this. If you want to improve things, then rather than bashing "bash", you should be suggesting how all the different use cases should be handled. There are a lot of use cases out there, and the all-important common case varies by user.

Things can be done other ways, and have been. I do my data processing with at least a dozen different languages. The command line is definitely one of the most important of these. If I found it bad, I'd have at least five alternative languages to command my computer in.

The Unix command line can be improved, and has been. It's the kind of distributed, practical, experience-driven improvement that is taking place. That's why the Unix command line is so great, even despite its many shortcomings. It has been improved for a long time.

The Unix command line can be made consistent in specific areas, and has been. Actually, there are a great many common best practices in the Unix world, and they cover more things than you could possibly imagine. Where should I document the contents of a directory? A README file. What should I call this picture? smile.jpeg. How should I represent a set of phone numbers as data? Write them in a plain text file, one per line. What should I make the key for erasing a word? Control-W, or look it up in the tty settings. How should I represent a control character in my documentation? Put a caret in front of a letter (^H, for instance). Where should my interactive program report a problem? On the bottom line of the terminal.
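As a small sketch of why the "one per line" convention pays off, here is what can be done with such a phone number list using nothing but standard tools. The helper name phones and the numbers themselves are made up; in practice the list would live in a plain text file.

```shell
#!/bin/sh
# phones is a hypothetical helper standing in for a file of made-up
# phone numbers, written one per line.
phones() { printf '555 0101\n555-0199\n5550101\n'; }

phones | wc -l                    # how many entries are there?
phones | grep -c '^555'           # how many start with 555?
phones | tr -d ' -' | sort -u     # strip spaces and hyphens, drop duplicates
```

Because each number is a line, every line-oriented tool already "understands" the data without any format definition.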

So, if you are going to make yet another attempt at replacing the Unix command line, I wish you luck. I also suggest that you take a look at the alternatives that already exist. Maybe you'll learn something important: you could find that it's not easy to tell what you are trying to achieve.


(last modified 04.03.2011 09:38)