Aharon Robbins


[Interview conducted on the 15th of March 2002 by Jonas Öberg]

As a longtime hacker for the GNU Project, what's the most important thing that the GNU Project has meant to you personally? Any specific software you use almost daily, or any contacts you've made through your GNU hacking that you wouldn't otherwise have had?

My work on GNU awk has led into many interesting things.

First of all, it has led to friendships with Brian Kernighan of Bell Labs (the "K" in "K&R"), and with Michael Brennan, the author of mawk. I value both of these friendships very highly.

Of more day-to-day importance, my work with gawk led me into working with O'Reilly & Associates. The first thing I did was revise "sed & awk". I have since done several more books for them. They also published the gawk documentation for 3.1 under the title "Effective awk Programming", 3rd edition, very scrupulously complying with the GNU FDL.

Most of my living now comes from writing for O'Reilly. Secondarily, I do contract programming, and work on gawk as time permits.

You're best known for hacking gawk, but you've also been involved in some other utilities that many people might not be aware of. For example, you've been involved in some parts of fileutils and shellutils, and you've also contributed some string and memory functions (strchr, memset, memcpy and so on). The question is, of course, what makes someone get involved at such an early stage of development? What was the incentive to get involved?

I'd been part of the USENET community since the early '80s, where writing and sharing code was a significant thing to do, before the GNU project really got started (or, at least, became well known).

In 1987 I picked up a copy of the original book on awk, and got interested in the language. But "new" awk was not widely available, so I thought I'd see if the FSF had a version, and if I could get involved in upgrading it to the features of new awk.

I initially started working with David Trueman, who had already volunteered to improve gawk. We worked together, quite well, for a number of years, until he had to bow out of the project.

The strchr, memset, and so on routines were shipped with gawk for portability. At the time, Unix systems varied more widely than they do now. We found it easiest to code to "standard" interfaces and then include replacement versions of routines, instead of using lots of #ifdef goo to take advantage of whatever a system had locally.
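[Editorial illustration: a minimal sketch of what such replacement routines might look like. The `my_` prefix is an assumption made here so that the sketch can be compiled alongside a modern C library; it is not taken from the gawk sources, and these are not gawk's actual implementations.]

```c
#include <stddef.h>

/*
 * Hypothetical replacement routines in the spirit of the approach
 * described above: code to the "standard" interface, and supply a
 * plain C version for systems that lack one. The my_ prefix is an
 * assumption for illustration only.
 */

/* Return a pointer to the first occurrence of c in s, or NULL. */
char *my_strchr(const char *s, int c)
{
    char ch = (char) c;

    for (;; s++) {
        if (*s == ch)
            return (char *) s;   /* also matches the terminating '\0' */
        if (*s == '\0')
            return NULL;
    }
}

/* Fill the first n bytes of dest with the byte value c. */
void *my_memset(void *dest, int c, size_t n)
{
    unsigned char *p = dest;

    while (n-- > 0)
        *p++ = (unsigned char) c;
    return dest;
}

/* Copy n bytes from src to dest; the regions must not overlap. */
void *my_memcpy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    while (n-- > 0)
        *d++ = *s++;
    return dest;
}
```

With routines like these available as fallbacks, the rest of a program can call one interface everywhere, and only the build decides whether the system's version or the bundled one is used.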

But in 1990, you had already been working with UNIX for a long time. You first learned UNIX in 1980, on a PDP-11 running Sixth Edition UNIX. What made you get involved in computers?

In my senior year in high school I took some "aptitude" tests, and computers was one of the things that showed up. My first year in college I took a programming class (in BASIC, no less!), and just kept going from there, eventually getting an M.S. in computer science from Georgia Tech.

When we spoke with Chet Ramey, the GNU bash maintainer, we discussed the influence of bash on the POSIX standardization. GNU awk is another project that has figured in the POSIX groups. What did the relationship between the POSIX groups and gawk look like, and what does it look like today? Are you still involved in the process in some way?

I was on the balloting group for the initial shell & utilities standard. As such I helped a lot with the gawk part and with some of the others in a more minor fashion.

Gawk is largely POSIX-compliant, even with the new 2001 standard, other than in one (arcane) area, which it was too late to fix in 3.1.

I am still on some of the various POSIX mailing lists, but not nearly as actively involved as I was previously. Having a wife and kids is largely responsible for this. :-)

And if we look at gawk for a minute, is there anything in particular you would like to introduce, but aren't able to because it would break existing functionality?

I've tried to avoid adding too much stuff. Treating regexps as first class objects would be a nice feature, as would multi-dimensional arrays and function pointers. But that's just stretching the language too much. (Not to mention, the gawk internal design.)

I am proud of the introduction of line profiling in gawk 3.1. I think that's a valuable tool.

Do you have any particularly fond memories from gawk hacking that you remember today and that you'd like to share? Such as the first patch you received, the first large corporation using gawk, or any other event that made you truly feel proud of what you had done?

Not really. It's enjoyable to see how widely gawk has been ported, and I have a team of porters and testers, many of whom have been helping out for a long time now. I continue to enjoy the occasional "wow this is great software/documentation" emails. Those don't seem to happen as often; with Linux being so popular, I guess that most of the mainline Unix tools are sort of taken for granted.

Hmmm, there is one interesting anecdote. Circa 1993 or so, Rick Adams at UUNET was using gawk to produce accounting scripts, running > 650,000 articles at a time through a program he'd written. I was just amazed. He needed help with the performance of gawk arrays when processing so much data. I revamped the array handling to grow the hash table dynamically as the array size grew, and that solved his problem.
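[Editorial illustration: a minimal sketch of a hash table that grows as entries are added, the general technique mentioned above. This is not gawk's array code. The names `table_new`, `table_lookup` and `table_grow`, the starting size, the growth threshold and the hash function are all assumptions chosen for the example.]

```c
#include <stdlib.h>
#include <string.h>

/* One key/count pair in a bucket chain. */
struct entry {
    char *key;
    long count;
    struct entry *next;
};

struct table {
    struct entry **buckets;
    size_t nbuckets;
    size_t nentries;
};

#define INITIAL_BUCKETS 16   /* assumed starting size */
#define MAX_CHAIN 4          /* assumed average chain length before growing */

/* Simple multiplicative string hash (an arbitrary choice for the sketch). */
static size_t hash(const char *s)
{
    size_t h = 5381;

    while (*s != '\0')
        h = h * 33 + (unsigned char) *s++;
    return h;
}

struct table *table_new(void)
{
    struct table *t = malloc(sizeof *t);

    if (t == NULL)
        return NULL;
    t->buckets = calloc(INITIAL_BUCKETS, sizeof *t->buckets);
    if (t->buckets == NULL) {
        free(t);
        return NULL;
    }
    t->nbuckets = INITIAL_BUCKETS;
    t->nentries = 0;
    return t;
}

/* Double the bucket array and relink every entry into it.
 * If allocation fails, the table is left as it was. */
static void table_grow(struct table *t)
{
    size_t newsize = t->nbuckets * 2;
    struct entry **nb = calloc(newsize, sizeof *nb);
    size_t i;

    if (nb == NULL)
        return;
    for (i = 0; i < t->nbuckets; i++) {
        struct entry *e = t->buckets[i];

        while (e != NULL) {
            struct entry *next = e->next;
            size_t j = hash(e->key) % newsize;

            e->next = nb[j];
            nb[j] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets = newsize;
}

/* Return a pointer to the count stored for key, creating the entry
 * with a count of 0 if needed. Returns NULL if memory runs out. */
long *table_lookup(struct table *t, const char *key)
{
    size_t i = hash(key) % t->nbuckets;
    struct entry *e;

    for (e = t->buckets[i]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return &e->count;

    /* Grow before inserting once chains average MAX_CHAIN entries. */
    if (t->nentries >= t->nbuckets * MAX_CHAIN) {
        table_grow(t);
        i = hash(key) % t->nbuckets;
    }
    e = malloc(sizeof *e);
    if (e == NULL)
        return NULL;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) {
        free(e);
        return NULL;
    }
    strcpy(e->key, key);
    e->count = 0;
    e->next = t->buckets[i];
    t->buckets[i] = e;
    t->nentries++;
    return &e->count;
}
```

Because the bucket array doubles whenever the table fills up, chains stay short no matter how many keys arrive, so a lookup does not slow down in proportion to the size of the array. Entries are relinked rather than copied, so pointers returned by `table_lookup` stay valid after the table grows.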

A couple of years ago, Valentin Hilbig wrote a webserver using awk (launched from inetd). Awk is certainly flexible enough to allow these sorts of hacks, but is there anything you think one should absolutely not attempt to do in awk, and gawk in particular?

Binary I/O, it's just not made for it.

The interface to the system is too limited (no chdir, no stat, etc.). This is rather sad. Gawk now has the ability to load dynamic libraries to add new built-ins, but this is sort of a bag on the side (it was contributed code), and I would eventually like to redesign both the awk-level interface and the internals for this.

You've written a couple of books for O'Reilly, such as UNIX in a Nutshell and, of course, Effective awk Programming. Did you ever get around to making an audio recording of chapter 2 of UNIX in a Nutshell?

A reader of rec.humor.funny, eh? :-) No, O'Reilly never decided to do anything serious about audio books. Besides, I don't think my voice would be the best for that. :-)

Or are you working on any new books today and if so, when can we expect them to see the light of day?

I have just finished revising the O'Reilly book on the Korn shell to cover ksh93, whose source code is now available. It should hit the bookstores in May or so. I'm pleased with the revision; this book needed updating, and I think the new edition maintains the high quality of the first edition.

As an author yourself, what do you feel about licenses such as the GNU Free Documentation License, and the process in general of releasing technical documentation freely?

Overall, I think it's good. O'Reilly has released several books under open licenses, including the gawk doc, as I mentioned earlier. I think it's possible to have documentation released under Free licenses, and still make money publishing print versions; we'll see how well the gawk doc does over the next few years.

In 1997, you moved from your home in the US to Israel together with your family. What is it that you work with there today and do you ever long to go back to the US?

As mentioned, I work independently. I use GNU/Linux for most of my work, although some contract programming has involved using Windows, for which I used Cygwin.

In terms of going back to the US, we visit there every summer, so that our kids can be with their grandparents and cousins. And I've made the occasional business trip. Although we miss our families, we don't particularly miss the US itself. (It is amazing how much American stuff one can find here in the grocery stores, as well as American chains: Toys "R" Us, Office Depot, Domino's Pizza, Burger King, McDonald's...)

And finally, what was the last movie you saw and what is your favourite one?

I'm not a huge movie fan. I don't remember the last movie I saw in a theater. My "favorite" movie is probably Walt Disney's "Mary Poppins", which I first saw at around age 5. :-)