The 1996 LISA conference was held in Chicago, IL, from 29 September to 4 October 1996. LISA is the annual system administration conference organized by the USENIX association; it is the flagship event for SAGE, the system administrators' special interest group within USENIX.
The event was organized in the usual way: several days of long tutorials (half- or full-day) followed by a three-day conference. The tutorials cost extra. The main conference has two parallel tracks: `refereed papers' (three talks to a 90-minute session) and `invited talks' (90 minutes each). I usually spend most of my time in the invited talks, because the refereed papers are published in a proceedings, and half an hour usually isn't enough time for the speaker to say much anyway. There are also several hours of one- or two-hour BOF (birds-of-a-feather) sessions each evening on a variety of special topics.
Here are some notes on the parts of the conference I found most interesting, loosely grouped by subject matter. It is no accident that they reflect some of my own areas of interest; others might pick different highlights.
Peter Van Epp (Simon Fraser University) related SFU's experiences in replacing the backbone of their huge bridged Ethernet with ATM. The talk focused on issues rather than conclusions, which means it exposed a number of interesting dark corners in ATM. SFU seems satisfied with their new network, but they don't think ATM is entirely mature yet.
Several people from MCI (the company that runs the NSF ATM backbone these days) spoke on the tools they use to monitor OC3 network links. They use optical splitters to tap both fibres, feeding the result into two ATM cards (one for each traffic direction) in a PC with lots of memory. The PC captures the headers from ATM cells and IP packets and does a little preprocessing; it runs a dedicated program, not UNIX. Detailed data analysis is done later on other systems. The focus was on how to do monitoring rather than the results, but there are some interesting graphs of real data in the paper.
Heon Yeom (Seoul National University) talked about on-the-fly IP address and port number translation. The idea is to use a few externally-visible IP addresses for a larger number of machines inside a network; hence the need to map port numbers as well as addresses. They do it just to conserve external IP addresses, but the same idea could be used by a firewall to hide internal addresses. (I know of firewalls that remap IP addresses for this reason, but don't know whether any remap port numbers as well.) Remapping port numbers has interesting effects on protocols like rlogin where the port number has semantic meaning, or ftp where client and server are supposed to be aware of the numbers.
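The address-and-port mapping idea can be sketched in a few lines of Python. This is only an illustration of the general technique, not the scheme presented in the paper: the class name PortTranslator, the port range 40000-40999, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port number)


@dataclass
class PortTranslator:
    """Hypothetical translation table sharing one external address."""
    external_ip: str
    first_port: int = 40000   # assumed pool of external ports
    last_port: int = 40999
    _out: Dict[Endpoint, int] = field(default_factory=dict)
    _back: Dict[int, Endpoint] = field(default_factory=dict)

    def outbound(self, src: Endpoint) -> Endpoint:
        """Map an internal (address, port) to the external address and a
        port from the pool, reusing an existing mapping if there is one."""
        port = self._out.get(src)
        if port is None:
            port = self._allocate()
            self._out[src] = port
            self._back[port] = src
        return (self.external_ip, port)

    def inbound(self, dst_port: int) -> Optional[Endpoint]:
        """Find the internal endpoint for a reply arriving on an external
        port; None means no mapping exists for that port."""
        return self._back.get(dst_port)

    def _allocate(self) -> int:
        # Linear scan for the first unused external port.
        for port in range(self.first_port, self.last_port + 1):
            if port not in self._back:
                return port
        raise RuntimeError("no free external ports")


nat = PortTranslator("192.0.2.1")
first = nat.outbound(("10.0.0.5", 1023))
second = nat.outbound(("10.0.0.6", 1023))
```

Two inside machines using the same source port come out as different ports on the single external address. Note that an rlogin client's reserved source port (1023 in this example) is replaced by an unreserved one, which is the sort of side effect mentioned above.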
Louis Todd Heberlein (University of California at Davis) gave a long talk on formal models for detecting security breakins. It all seemed to boil down to two main approaches: heuristics that watch for the trails of known attacks, and schemes that cast out events that are known to be benign, reporting the exceptions as possible problems. All the models and lessons apply just as well to less sexy aspects of system administration, e.g. watching for software and hardware errors; I was surprised that the speaker didn't point that out, since he complained that funding for this sort of work is often hard to get.
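The two approaches can be contrasted with a small Python sketch. It is my own illustration, not anything shown in the talk; the log lines and both pattern lists are made up for the example.

```python
import re
from typing import Iterable, List

# Hypothetical signatures of known attacks (approach 1).
KNOWN_ATTACK_PATTERNS = [
    re.compile(r"ROOT LOGIN REFUSED"),
    re.compile(r"BAD SU"),
]

# Hypothetical patterns for events known to be benign (approach 2).
KNOWN_BENIGN_PATTERNS = [
    re.compile(r"^cron: "),
    re.compile(r"^sendmail: .*stat=Sent"),
]


def match_known_attacks(lines: Iterable[str]) -> List[str]:
    """Approach 1: report only lines that match a known attack pattern."""
    return [line for line in lines
            if any(p.search(line) for p in KNOWN_ATTACK_PATTERNS)]


def report_exceptions(lines: Iterable[str]) -> List[str]:
    """Approach 2: discard lines known to be benign, report the rest."""
    return [line for line in lines
            if not any(p.search(line) for p in KNOWN_BENIGN_PATTERNS)]


log = [
    "cron: job started",
    "login: ROOT LOGIN REFUSED on ttyp3",
    "sendmail: AA01234: to=fred, stat=Sent",
    "kernel: sd0: write error",
]
attacks = match_known_attacks(log)
exceptions = report_exceptions(log)
```

With this sample, the first approach reports only the refused root login, while the second reports the refused login and the disk error too; that is why the same machinery is useful for watching hardware and software errors.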
The CERT folks held their usual BOF. They gave a brief report on recent trends in breakins, which is covered pretty well in their published summaries. They also noted that updates to CERT advisories are now placed in the original file, rather than in separate README files (a welcome change), and that CERT has established a new `affiliates' program, partly to attract more funding.
There were several talks on coping with very large sites. Someone from the University of North Carolina held a BOF on what goes wrong when a system has tens of thousands of logins; most of the time was taken up by war stories. Dan Geer (LoneWolf Systems) talked about his experiences in keeping a very large web site running smoothly as it grew in size and popularity. Stuart McRobert (Imperial College, London) spoke on the origins and evolution of Sunsite Northern Europe, one of several archive sites supported by Sun Microsystems; as a measure of size, they now have about 70GB of disk. All of these talks seemed to convey the same general messages: monitor performance carefully; don't try to fix bottlenecks until you've measured them, because you're probably wrong otherwise; fixing today's bottleneck just buys you time until you find tomorrow's, so don't spend too much time on it. None of these are novel ideas, but they are often forgotten in real life.
There were two particularly interesting talks on automated system administration and software installation. Henry Spencer (SP Systems) described a system for maintaining and updating user accounts across a large network of machines at Sheridan College; the scheme is noteworthy for its simplicity and clever use of existing tools (most parts are written in expect, a slightly-enhanced tcl). Jim Trocki (American Cyanamid) explained how he uses a single-floppy, memory-resident copy of Linux as a PC administration tool. Because it can speak to the network and has all the normal UNIX tools, but can also read and write DOS file systems, Linux makes it far easier to automate installing programs and customizing configuration files; but because the Linux used is a minimal version that fits on a floppy and runs just in memory, there's no need to devote work or disk space to installing Linux on a system that will never run it in real life.
Marc Pelletier (Softgard Inc) gave a fascinating BOF on adapting UNIX to use Unicode as its sole character set. Unicode is a 16-bit character set containing characters for a wide variety of languages; it is an international standard. Pelletier has adapted a copy of Linux to use Unicode, with some concessions so that ASCII files still work as well. His approach is different from systems like Plan 9 in which programs use Unicode internally but files are kept in UTF (an 8-bit encoding, of which 7-bit ASCII is a proper subset); in Pelletier's system, text files are allowed to be straight 16-bit Unicode, and programs that read text are expected to cope. There are a number of interesting and sometimes messy issues. The cat command grows more internal complication and a new option (will anyone notice these days?). Rendering Unicode (displaying it on a screen) is complicated, because characters are language-specific; Hebrew characters are supposed to be printed right-to-left, for example. Pelletier has a test document containing a mix of English, Hebrew, and Chinese (rendered vertically). On the other hand, there are Unicode characters reserved to the implementation; Pelletier has modified the shell to use these special characters for quoting, wildcards, I/O redirection, and so on, so one can type a command argument containing > or * without worrying about quotes.
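The difference between the two file formats is easy to see in Python. This is only a sketch under two assumptions of mine: that Python's "utf-8" codec is a fair stand-in for the ASCII-compatible UTF encoding, and that "utf-16-be" (big-endian, no byte-order mark) is a fair stand-in for straight 16-bit Unicode; the byte order Pelletier's system uses was not something I noted.

```python
from typing import Tuple


def encoded_sizes(text: str) -> Tuple[int, int]:
    """Return the size in bytes of text as (8-bit UTF, 16-bit Unicode)."""
    return (len(text.encode("utf-8")), len(text.encode("utf-16-be")))


def ascii_compatible(text: str, encoding: str) -> bool:
    """True if pure-ASCII text encodes to exactly its ASCII bytes."""
    return text.encode(encoding) == text.encode("ascii")


english = "cat"
hebrew = "\u05e9\u05dc\u05d5\u05dd"   # four Hebrew letters
chinese = "\u4e2d\u6587"              # two Chinese characters

sizes = {name: encoded_sizes(s)
         for name, s in [("english", english),
                         ("hebrew", hebrew),
                         ("chinese", chinese)]}

# In the 16-bit form every ASCII character carries a zero byte.
wide = english.encode("utf-16-be")
```

An ASCII file is unchanged in the 8-bit encoding, whereas in the 16-bit form every ASCII character is accompanied by a zero byte, so programs that treat text as a stream of non-zero bytes have to be taught to cope.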
A panel on the merits of formal standards, apparently meant to be a lively debate, was in fact fairly calm: the pro-standards speakers admitted that the POSIX standards for system administration interfaces are not very good, and the strongly anti-standards speaker quickly made it clear that he didn't know what he was talking about. There was some entertainment and some insight into how standards work, but no real conclusion.
Someone whose name I missed ran a BOF on `Traits of a Great Sysadmin': a well-run discussion of the skills important to system administrators. The focus was on the non-technical skills in which so many technically-sharp people are weak: communicating comfortably with management and the user community; teaching people how to do things themselves, rather than just doing it all for them; understanding that someone who doesn't know how the kernel works isn't necessarily your intellectual inferior; and so on.
That BOF made an interesting foreword (for those who attended it) to the next day's talk by Randal Schwartz, noted perl expert and recently convicted computer felon. The subject was the latter experience. While working on contract for one part of Intel, Schwartz attempted to investigate and fix problems on systems in another division of Intel, where he had had a previous contract but was no longer employed. The trouble is that he didn't talk to the people who were responsible for those systems, and ended up stepping on many toes, including his own. The offended system administrators and Intel management pressed felony charges under Oregon's rather general computer crime laws; Schwartz was convicted, and sentenced to pay US$62000 in restitution (Intel's claimed costs of investigation) and to serve three months in jail (delayed to allow an appeal).
The intent of the talk was to make others aware of what happened, lest they fall into similar pits, and to further Schwartz's campaign to have such overly-broad laws fixed. It sounds to me like Schwartz, Intel, and the Oregon legal community all screwed up; Schwartz's actions were unwise, perhaps unethical, probably grounds for dismissal, but hardly felonious. Schwartz admits that his actions were dumb (though I'm not sure he really understands why); the other parties have yet to own up to their blunders.
I was disappointed by my first LISA, last year in Monterey CA; there were a few good talks, but on the whole it was a boring experience. This year's conference was much better, and was certainly worthwhile.
Some of the most interesting material at this LISA probably didn't belong there. Adapting UNIX to use Unicode is not system administration, and even the paper on IP address mapping seemed a bit out of place. On the other hand, there were lots of good talks that did have to do with system administration, so perhaps the mismatched talks don't matter.