<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<?xml-stylesheet type="text/css" href="../_sisu/css/sax.css"?>
<!-- Document processing information:
     * Generated by: SiSU 2.8.2 of 2011w10/5 (2011-03-11)
     * Ruby version: ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
     * 
     * Last Generated on: Fri Mar 11 15:28:22 +0100 2011
     * SiSU http://www.jus.uio.no/sisu
-->

<document>
<head>
<metadata>
	<meta>Title:</meta>
	<data class="md">
		The Cathedral and the Bazaar
	</data>
</metadata>
<metadata>
	<meta>Creator:</meta>
	<data class="md">
		Eric S. Raymond
	</data>
</metadata>
<metadata>
	<meta>Rights:</meta>
	<data class="md">
		Copyright &#169; 2000 Eric S. Raymond.<br /> License: Permission is granted to copy, distribute and/or modify this document under the terms of the Open Publication License, version 2.0.
	</data>
</metadata>
<metadata>
	<meta>Publisher:</meta>
	<data class="md">
		SiSU ‹http://www.jus.uio.no/sisu› (this copy)
	</data>
</metadata>
<metadata>
	<meta>Abstract:</meta>
	<data class="md">
		I anatomize a successful open-source project, fetchmail, that was run as a deliberate test of the surprising theories about software engineering suggested by the history of Linux. I discuss these theories in terms of two fundamentally different development styles, the "cathedral" model of most of the commercial world versus the "bazaar" model of the Linux world. I show that these models derive from opposing assumptions about the nature of the software-debugging task. I then make a sustained argument from the Linux experience for the proposition that "Given enough eyeballs, all bugs are shallow", suggest productive analogies with other self-correcting systems of selfish agents, and conclude with some exploration of the implications of this insight for the future of software.
	</data>
</metadata>
<metadata>
	<meta>Date created:</meta>
	<data class="md">
		1997-05-21
	</data>
</metadata>
<metadata>
	<meta>Date issued:</meta>
	<data class="md">
		1997-05-21
	</data>
</metadata>
<metadata>
	<meta>Date available:</meta>
	<data class="md">
		1997-05-21
	</data>
</metadata>
<metadata>
	<meta>Date modified:</meta>
	<data class="md">
		2002-08-02
	</data>
</metadata>
<metadata>
	<meta>Date:</meta>
	<data class="md">
		2002-08-02
	</data>
</metadata>
<metadata>
	<meta>Sourcefile:</meta>
	<data class="md">
		the_cathedral_and_the_bazaar.eric_s_raymond.sst
	</data>
</metadata>
<metadata>
	<meta>Filetype:</meta>
	<data class="md">
		SiSU text 2.0
	</data>
</metadata>
<metadata>
	<meta>Source digest:</meta>
	<data class="md">
		SHA256(the_cathedral_and_the_bazaar.eric_s_raymond.sst)= 608a77af8d498d32cb55f97c156f50e7877859288cef5d387d2e7004b3fcc085
	</data>
</metadata>
<metadata>
	<meta>Skin digest:</meta>
	<data class="md">
		SHA256(skin_sisu.rb)= 296e8f9c884bc0427ffad291d7e37538a90561a276da407a822b4214e600363b
	</data>
</metadata>
<metadata>
	<meta>Generated by:</meta>
	<data class="md">
		Generated by: SiSU 2.8.2 of 2011w10/5 (2011-03-11)
	</data>
</metadata>
<metadata>
	<meta>Ruby version:</meta>
	<data class="md">
		ruby 1.8.7 (2008-08-11 patchlevel 72) [i486-linux]
	</data>
</metadata>
<metadata>
	<meta>Document (dal) last generated:</meta>
	<data class="md">
		Fri Mar 11 15:28:18 +0100 2011
	</data>
</metadata>
</head>
<body>
<object id="1">
	<ocn>1</ocn>
	<text class="h1">
		The Cathedral and the Bazaar,<br />Eric S. Raymond
	</text>
</object>
<object id="2">
	<ocn>2</ocn>
	<text class="h4">
		The Cathedral and the Bazaar
	</text>
</object>
<object id="3">
	<ocn>3</ocn>
	<text class="norm">
		Linux is subversive. Who would have thought even five years ago (1991)
that a world-class operating system could coalesce as if by magic out
of part-time hacking by several thousand developers scattered all over
the planet, connected only by the tenuous strands of the Internet?
	</text>
</object>
<object id="4">
	<ocn>4</ocn>
	<text class="norm">
		Certainly not I. By the time Linux swam onto my radar screen in early
1993, I had already been involved in Unix and open-source development
for ten years. I was one of the first GNU contributors in the
mid-1980s. I had released a good deal of open-source software onto the
net, developing or co-developing several programs (nethack, Emacs's VC
and GUD modes, xlife, and others) that are still in wide use today. I
thought I knew how it was done.
	</text>
</object>
<object id="5">
	<ocn>5</ocn>
	<text class="norm">
		Linux overturned much of what I thought I knew. I had been preaching
the Unix gospel of small tools, rapid prototyping and evolutionary
programming for years. But I also believed there was a certain critical
complexity above which a more centralized, a priori approach was
required. I believed that the most important software (operating
systems and really large tools like the Emacs programming editor)
needed to be built like cathedrals, carefully crafted by individual
wizards or small bands of mages working in splendid isolation, with no
beta to be released before its time.
	</text>
</object>
<object id="6">
	<ocn>6</ocn>
	<text class="norm">
		Linus Torvalds's style of development&#8212;release early and often,
delegate everything you can, be open to the point of
promiscuity&#8212;came as a surprise. No quiet, reverent
cathedral-building here&#8212;rather, the Linux community seemed to
resemble a great babbling bazaar of differing agendas and approaches
(aptly symbolized by the Linux archive sites, who'd take submissions
from anyone) out of which a coherent and stable system could seemingly
emerge only by a succession of miracles.
	</text>
</object>
<object id="7">
	<ocn>7</ocn>
	<text class="norm">
		The fact that this bazaar style seemed to work, and work well, came as
a distinct shock. As I learned my way around, I worked hard not just at
individual projects, but also at trying to understand why the Linux
world not only didn't fly apart in confusion but seemed to go from
strength to strength at a speed barely imaginable to
cathedral-builders.
	</text>
</object>
<object id="8">
	<ocn>8</ocn>
	<text class="norm">
		By mid-1996 I thought I was beginning to understand. Chance handed me a
perfect way to test my theory, in the form of an open-source project
that I could consciously try to run in the bazaar style. So I
did&#8212;and it was a significant success.
	</text>
</object>
<object id="9">
	<ocn>9</ocn>
	<text class="norm">
		This is the story of that project. I'll use it to propose some
aphorisms about effective open-source development. Not all of these are
things I first learned in the Linux world, but we'll see how the Linux
world gives them particular point. If I'm correct, they'll help you
understand exactly what it is that makes the Linux community such a
fountain of good software&#8212;and, perhaps, they will help you become
more productive yourself.
	</text>
</object>
<object id="10">
	<ocn>10</ocn>
	<text class="h4">
		The Mail Must Get Through
	</text>
</object>
<object id="11">
	<ocn>11</ocn>
	<text class="norm">
		Since 1993 I'd been running the technical side of a small free-access
Internet service provider called Chester County InterLink (CCIL) in
West Chester, Pennsylvania. I co-founded CCIL and wrote our unique
multiuser bulletin-board software&#8212;you can check it out by
telnetting to locke.ccil.org. Today it supports almost three thousand
users on thirty lines. The job allowed me 24-hour-a-day access to the
net through CCIL's 56K line&#8212;in fact, the job practically demanded
it!
	</text>
</object>
<object id="12">
	<ocn>12</ocn>
	<text class="norm">
		I had gotten quite used to instant Internet email. I found having to
periodically telnet over to locke to check my mail annoying. What I
wanted was for my mail to be delivered on snark (my home system) so
that I would be notified when it arrived and could handle it using all
my local tools.
	</text>
</object>
<object id="13">
	<ocn>13</ocn>
	<text class="norm">
		The Internet's native mail forwarding protocol, SMTP (Simple Mail
Transfer Protocol), wouldn't suit, because it works best when machines
are connected full-time, while my personal machine isn't always on the
Internet, and doesn't have a static IP address. What I needed was a
program that would reach out over my intermittent dialup connection and
pull across my mail to be delivered locally. I knew such things
existed, and that most of them used a simple application protocol
called POP (Post Office Protocol). POP is now widely supported by most
common mail clients, but at the time, it wasn't built in to the mail
reader I was using.
	</text>
</object>
<object id="14">
	<ocn>14</ocn>
	<text class="norm">
		I needed a POP3 client. So I went out on the Internet and found one.
Actually, I found three or four. I used one of them for a while, but it
was missing what seemed an obvious feature, the ability to hack the
addresses on fetched mail so replies would work properly.
	</text>
</object>
<object id="15">
	<ocn>15</ocn>
	<text class="norm">
		The problem was this: suppose someone named `joe' on locke sent me
mail. If I fetched the mail to snark and then tried to reply to it, my
mailer would cheerfully try to ship it to a nonexistent `joe' on snark.
Hand-editing reply addresses to tack on &#60;@ccil.org&#62; quickly
got to be a serious pain.
	</text>
</object>
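<object>
	<text class="norm">
		As a rough illustration of the missing feature described above, the rewrite amounts to qualifying any bare local address with the mail server's domain. This is only a sketch under stated assumptions, not the code of fetchpop, popclient or fetchmail: the function names qualify_address and rewrite_header are hypothetical, and the header is assumed to be a plain comma-separated list of addresses.
	</text>
</object>
<object>
	<text class="code"><![CDATA[
```python
def qualify_address(address, domain="ccil.org"):
    """Append @domain to a bare local name; leave qualified addresses alone."""
    address = address.strip()
    if not address or "@" in address:
        return address
    return f"{address}@{domain}"


def rewrite_header(line, domain="ccil.org"):
    """Qualify every address in a simple comma-separated address header line."""
    name, sep, value = line.partition(":")
    # Hypothetical set of headers to rewrite; anything else passes through.
    if not sep or name.strip().lower() not in {"from", "to", "cc", "reply-to"}:
        return line
    fixed = [qualify_address(a, domain) for a in value.split(",")]
    return f"{name}: " + ", ".join(fixed)
```
]]></text>
</object>
<object>
	<text class="norm">
		With these definitions, rewrite_header("From: joe") yields "From: joe@ccil.org", while an address that already carries a domain passes through unchanged. Real mail headers also allow display names and angle-bracket forms, which this sketch deliberately ignores.
	</text>
</object>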
<object id="16">
	<ocn>16</ocn>
	<text class="norm">
		This was clearly something the computer ought to be doing for me. But
none of the existing POP clients knew how! And this brings us to the
first lesson:
	</text>
</object>
<object id="17">
	<ocn>17</ocn>
	<text class="indent1">
		1. Every good work of software starts by scratching a developer's
personal itch.
	</text>
</object>
<object id="18">
	<ocn>18</ocn>
	<text class="norm">
		Perhaps this should have been obvious (it's long been proverbial that
"Necessity is the mother of invention") but too often software
developers spend their days grinding away for pay at programs they
neither need nor love. But not in the Linux world&#8212;which may
explain why the average quality of software originated in the Linux
community is so high.
	</text>
</object>
<object id="19">
	<ocn>19</ocn>
	<text class="norm">
		So, did I immediately launch into a furious whirl of coding up a
brand-new POP3 client to compete with the existing ones? Not on your
life! I looked carefully at the POP utilities I had in hand, asking
myself "Which one is closest to what I want?" Because:
	</text>
</object>
<object id="20">
	<ocn>20</ocn>
	<text class="indent1">
		2. Good programmers know what to write. Great ones know what to rewrite
(and reuse).
	</text>
</object>
<object id="21">
	<ocn>21</ocn>
	<text class="norm">
		While I don't claim to be a great programmer, I try to imitate one. An
important trait of the great ones is constructive laziness. They know
that you get an A not for effort but for results, and that it's almost
always easier to start from a good partial solution than from nothing
at all.
	</text>
</object>
<object id="22">
	<ocn>22</ocn>
	<text class="norm">
		Linus Torvalds, for example, didn't actually try to write Linux from
scratch. Instead, he started by reusing code and ideas from Minix, a
tiny Unix-like operating system for PC clones. Eventually all the Minix
code went away or was completely rewritten&#8212;but while it was
there, it provided scaffolding for the infant that would eventually
become Linux.
	</text>
</object>
<object id="23">
	<ocn>23</ocn>
	<text class="norm">
		In the same spirit, I went looking for an existing POP utility that was
reasonably well coded, to use as a development base.
	</text>
</object>
<object id="24">
	<ocn>24</ocn>
	<text class="norm">
		The source-sharing tradition of the Unix world has always been friendly
to code reuse (this is why the GNU project chose Unix as a base OS, in
spite of serious reservations about the OS itself). The Linux world has
taken this tradition nearly to its technological limit; it has
terabytes of open sources generally available. So spending time looking
for someone else's almost-good-enough is more likely to give you good
results in the Linux world than anywhere else.
	</text>
</object>
<object id="25">
	<ocn>25</ocn>
	<text class="norm">
		And it did for me. With those I'd found earlier, my second search made
up a total of nine candidates&#8212;fetchpop, PopTart, get-mail, gwpop,
pimp, pop-perl, popc, popmail and upop. The one I first settled on was
`fetchpop' by Seung-Hong Oh. I put my header-rewrite feature in it, and
made various other improvements which the author accepted into his 1.9
release.
	</text>
</object>
<object id="26">
	<ocn>26</ocn>
	<text class="norm">
		A few weeks later, though, I stumbled across the code for popclient by
Carl Harris, and found I had a problem. Though fetchpop had some good
original ideas in it (such as its background-daemon mode), it could
only handle POP3 and was rather amateurishly coded (Seung-Hong was at
that time a bright but inexperienced programmer, and both traits
showed). Carl's code was better, quite professional and solid, but his
program lacked several important and rather tricky-to-implement
fetchpop features (including those I'd coded myself).
	</text>
</object>
<object id="27">
	<ocn>27</ocn>
	<text class="norm">
		Stay or switch? If I switched, I'd be throwing away the coding I'd
already done in exchange for a better development base.
	</text>
</object>
<object id="28">
	<ocn>28</ocn>
	<text class="norm">
		A practical motive to switch was the presence of multiple-protocol
support. POP3 is the most commonly used of the post-office server
protocols, but not the only one. Fetchpop and the other competition
didn't do POP2, RPOP, or APOP, and I was already having vague thoughts
of perhaps adding IMAP (Internet Message Access Protocol, the most
recently designed and most powerful post-office protocol) just for fun.
	</text>
</object>
<object id="29">
	<ocn>29</ocn>
	<text class="norm">
		But I had a more theoretical reason to think switching might be a good
idea as well, something I learned long before Linux.
	</text>
</object>
<object id="30">
	<ocn>30</ocn>
	<text class="indent1">
		3. "Plan to throw one away; you will, anyhow." (Fred Brooks, The
Mythical Man-Month, Chapter 11)
	</text>
</object>
<object id="31">
	<ocn>31</ocn>
	<text class="norm">
		Or, to put it another way, you often don't really understand the
problem until after the first time you implement a solution. The second
time, maybe you know enough to do it right. So if you want to get it
right, be ready to start over at least once [JB].
	</text>
</object>
<object id="32">
	<ocn>32</ocn>
	<text class="norm">
		Well (I told myself) the changes to fetchpop had been my first try. So
I switched.
	</text>
</object>
<object id="33">
	<ocn>33</ocn>
	<text class="norm">
		After I sent my first set of popclient patches to Carl Harris on 25
June 1996, I found out that he had basically lost interest in popclient
some time before. The code was a bit dusty, with minor bugs hanging
out. I had many changes to make, and we quickly agreed that the logical
thing for me to do was take over the program.
	</text>
</object>
<object id="34">
	<ocn>34</ocn>
	<text class="norm">
		Without my actually noticing, the project had escalated. No longer was
I just contemplating minor patches to an existing POP client. I took on
maintaining an entire one, and there were ideas bubbling in my head
that I knew would probably lead to major changes.
	</text>
</object>
<object id="35">
	<ocn>35</ocn>
	<text class="norm">
		In a software culture that encourages code-sharing, this is a natural
way for a project to evolve. I was acting out this principle:
	</text>
</object>
<object id="36">
	<ocn>36</ocn>
	<text class="indent1">
		4. If you have the right attitude, interesting problems will find you.
	</text>
</object>
<object id="37">
	<ocn>37</ocn>
	<text class="norm">
		But Carl Harris's attitude was even more important. He understood that
	</text>
</object>
<object id="38">
	<ocn>38</ocn>
	<text class="indent1">
		5. When you lose interest in a program, your last duty to it is to hand
it off to a competent successor.
	</text>
</object>
<object id="39">
	<ocn>39</ocn>
	<text class="norm">
		Without ever having to discuss it, Carl and I knew we had a common goal
of having the best solution out there. The only question for either of
us was whether I could establish that I was a safe pair of hands. Once
I did that, he acted with grace and dispatch. I hope I will do as well
when it comes my turn.
	</text>
</object>
<object id="40">
	<ocn>40</ocn>
	<text class="h4">
		The Importance of Having Users
	</text>
</object>
<object id="41">
	<ocn>41</ocn>
	<text class="norm">
		And so I inherited popclient. Just as importantly, I inherited
popclient's user base. Users are wonderful things to have, and not just
because they demonstrate that you're serving a need, that you've done
something right. Properly cultivated, they can become co-developers.
	</text>
</object>
<object id="42">
	<ocn>42</ocn>
	<text class="norm">
		Another strength of the Unix tradition, one that Linux pushes to a
happy extreme, is that a lot of users are hackers too. Because source
code is available, they can be effective hackers. This can be
tremendously useful for shortening debugging time. Given a bit of
encouragement, your users will diagnose problems, suggest fixes, and
help improve the code far more quickly than you could unaided.
	</text>
</object>
<object id="43">
	<ocn>43</ocn>
	<text class="indent1">
		6. Treating your users as co-developers is your least-hassle route to
rapid code improvement and effective debugging.
	</text>
</object>
<object id="44">
	<ocn>44</ocn>
	<text class="norm">
		The power of this effect is easy to underestimate. In fact, pretty well
all of us in the open-source world drastically underestimated how well
it would scale up with number of users and against system complexity,
until Linus Torvalds showed us differently.
	</text>
</object>
<object id="45">
	<ocn>45</ocn>
	<text class="norm">
		In fact, I think Linus's cleverest and most consequential hack was not
the construction of the Linux kernel itself, but rather his invention
of the Linux development model. When I expressed this opinion in his
presence once, he smiled and quietly repeated something he has often
said: "I'm basically a very lazy person who likes to get credit for
things other people actually do." Lazy like a fox. Or, as Robert
Heinlein famously wrote of one of his characters, too lazy to fail.
	</text>
</object>
<object id="46">
	<ocn>46</ocn>
	<text class="norm">
		In retrospect, one precedent for the methods and success of Linux can
be seen in the development of the GNU Emacs Lisp library and Lisp code
archives. In contrast to the cathedral-building style of the Emacs C
core and most other GNU tools, the evolution of the Lisp code pool was
fluid and very user-driven. Ideas and prototype modes were often
rewritten three or four times before reaching a stable final form. And
loosely-coupled collaborations enabled by the Internet, a la Linux,
were frequent.
	</text>
</object>
<object id="47">
	<ocn>47</ocn>
	<text class="norm">
		Indeed, my own most successful single hack previous to fetchmail was
probably Emacs VC (version control) mode, a Linux-like collaboration by
email with three other people, only one of whom (Richard Stallman, the
author of Emacs and founder of the Free Software Foundation) I have met
to this day. It was a front-end for SCCS, RCS and later CVS from within
Emacs that offered "one-touch" version control operations. It evolved
from a tiny, crude sccs.el mode somebody else had written. And the
development of VC succeeded because, unlike Emacs itself, Emacs Lisp
code could go through release/test/improve generations very quickly.
	</text>
</object>
<object id="48">
	<ocn>48</ocn>
	<text class="norm">
		The Emacs story is not unique. There have been other software products
with a two-level architecture and a two-tier user community that
combined a cathedral-mode core and a bazaar-mode toolbox. One such is
MATLAB, a commercial data-analysis and visualization tool. Users of
MATLAB and other products with a similar structure invariably report
that the action, the ferment, the innovation mostly takes place in the
open part of the tool where a large and varied community can tinker
with it.
	</text>
</object>
<object id="49">
	<ocn>49</ocn>
	<text class="h4">
		Release Early, Release Often
	</text>
</object>
<object id="50">
	<ocn>50</ocn>
	<text class="norm">
		Early and frequent releases are a critical part of the Linux
development model. Most developers (including me) used to believe this
was bad policy for larger than trivial projects, because early versions
are almost by definition buggy versions and you don't want to wear out
the patience of your users.
	</text>
</object>
<object id="51">
	<ocn>51</ocn>
	<text class="norm">
		This belief reinforced the general commitment to a cathedral-building
style of development. If the overriding objective was for users to see
as few bugs as possible, why then you'd only release a version every
six months (or less often), and work like a dog on debugging between
releases. The Emacs C core was developed this way. The Lisp library, in
effect, was not&#8212;because there were active Lisp archives outside
the FSF's control, where you could go to find new and development code
versions independently of Emacs's release cycle [QR].
	</text>
</object>
<object id="52">
	<ocn>52</ocn>
	<text class="norm">
		The most important of these, the Ohio State Emacs Lisp archive,
anticipated the spirit and many of the features of today's big Linux
archives. But few of us really thought very hard about what we were
doing, or about what the very existence of that archive suggested about
problems in the FSF's cathedral-building development model. I made one
serious attempt around 1992 to get a lot of the Ohio code formally
merged into the official Emacs Lisp library. I ran into political
trouble and was largely unsuccessful.
	</text>
</object>
<object id="53">
	<ocn>53</ocn>
	<text class="norm">
		But by a year later, as Linux became widely visible, it was clear that
something different and much healthier was going on there. Linus's open
development policy was the very opposite of cathedral-building. Linux's
Internet archives were burgeoning, multiple distributions were being
floated. And all of this was driven by an unheard-of frequency of core
system releases.
	</text>
</object>
<object id="54">
	<ocn>54</ocn>
	<text class="norm">
		Linus was treating his users as co-developers in the most effective
possible way:
	</text>
</object>
<object id="55">
	<ocn>55</ocn>
	<text class="indent1">
		7. Release early. Release often. And listen to your customers.
	</text>
</object>
<object id="56">
	<ocn>56</ocn>
	<text class="norm">
		Linus's innovation wasn't so much in doing quick-turnaround releases
incorporating lots of user feedback (something like this had been
Unix-world tradition for a long time), but in scaling it up to a level
of intensity that matched the complexity of what he was developing. In
those early times (around 1991) it wasn't unknown for him to release a
new kernel more than once a day! Because he cultivated his base of
co-developers and leveraged the Internet for collaboration harder than
anyone else, this worked.
	</text>
</object>
<object id="57">
	<ocn>57</ocn>
	<text class="norm">
		But how did it work? And was it something I could duplicate, or did it
rely on some unique genius of Linus Torvalds?
	</text>
</object>
<object id="58">
	<ocn>58</ocn>
	<text class="norm">
		I didn't think so. Granted, Linus is a damn fine hacker. How many of us
could engineer an entire production-quality operating system kernel
from scratch? But Linux didn't represent any awesome conceptual leap
forward. Linus is not (or at least, not yet) an innovative genius of
design in the way that, say, Richard Stallman or James Gosling (of NeWS
and Java) are. Rather, Linus seems to me to be a genius of engineering
and implementation, with a sixth sense for avoiding bugs and
development dead-ends and a true knack for finding the minimum-effort
path from point A to point B. Indeed, the whole design of Linux
breathes this quality and mirrors Linus's essentially conservative and
simplifying design approach.
	</text>
</object>
<object id="59">
	<ocn>59</ocn>
	<text class="norm">
		So, if rapid releases and leveraging the Internet medium to the hilt
were not accidents but integral parts of Linus's engineering-genius
insight into the minimum-effort path, what was he maximizing? What was
he cranking out of the machinery?
	</text>
</object>
<object id="60">
	<ocn>60</ocn>
	<text class="norm">
		Put that way, the question answers itself. Linus was keeping his
hacker/users constantly stimulated and rewarded&#8212;stimulated by the
prospect of having an ego-satisfying piece of the action, rewarded by
the sight of constant (even daily) improvement in their work.
	</text>
</object>
<object id="61">
	<ocn>61</ocn>
	<text class="norm">
		Linus was directly aiming to maximize the number of person-hours thrown
at debugging and development, even at the possible cost of instability
in the code and user-base burnout if any serious bug proved
intractable. Linus was behaving as though he believed something like
this:
	</text>
</object>
<object id="62">
	<ocn>62</ocn>
	<text class="indent1">
		8. Given a large enough beta-tester and co-developer base, almost every
problem will be characterized quickly and the fix obvious to someone.
	</text>
</object>
<object id="63">
	<ocn>63</ocn>
	<text class="norm">
		Or, less formally, "Given enough eyeballs, all bugs are shallow." I dub
this: "Linus's Law".
	</text>
</object>
<object id="64">
	<ocn>64</ocn>
	<text class="norm">
		My original formulation was that every problem "will be transparent to
somebody". Linus demurred that the person who understands and fixes the
problem is not necessarily or even usually the person who first
characterizes it. "Somebody finds the problem," he says, "and somebody
else understands it. And I'll go on record as saying that finding it is
the bigger challenge." That correction is important; we'll see how in
the next section, when we examine the practice of debugging in more
detail. But the key point is that both parts of the process (finding
and fixing) tend to happen rapidly.
	</text>
</object>
<object id="65">
	<ocn>65</ocn>
	<text class="norm">
		In Linus's Law, I think, lies the core difference underlying the
cathedral-builder and bazaar styles. In the cathedral-builder view of
programming, bugs and development problems are tricky, insidious, deep
phenomena. It takes months of scrutiny by a dedicated few to develop
confidence that you've winkled them all out. Thus the long release
intervals, and the inevitable disappointment when long-awaited releases
are not perfect.
	</text>
</object>
<object id="66">
	<ocn>66</ocn>
	<text class="norm">
		In the bazaar view, on the other hand, you assume that bugs are
generally shallow phenomena&#8212;or, at least, that they turn shallow
pretty quickly when exposed to a thousand eager co-developers pounding
on every single new release. Accordingly you release often in order to
get more corrections, and as a beneficial side effect you have less to
lose if an occasional botch gets out the door.
	</text>
</object>
<object id="67">
	<ocn>67</ocn>
	<text class="norm">
		And that's it. That's enough. If "Linus's Law" is false, then any
system as complex as the Linux kernel, being hacked over by as many
hands as that kernel was, should at some point have collapsed under
the weight of unforeseen bad interactions and undiscovered "deep" bugs.
If it's true, on the other hand, it is sufficient to explain Linux's
relative lack of bugginess and its continuous uptimes spanning months
or even years.
	</text>
</object>
<object id="68">
	<ocn>68</ocn>
	<text class="norm">
		Maybe it shouldn't have been such a surprise, at that. Sociologists
years ago discovered that the averaged opinion of a mass of equally
expert (or equally ignorant) observers is quite a bit more reliable a
predictor than the opinion of a single randomly-chosen one of the
observers. They called this the Delphi effect. It appears that what
Linus has shown is that this applies even to debugging an operating
system&#8212;that the Delphi effect can tame development complexity
even at the complexity level of an OS kernel. [CV]
	</text>
</object>
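<object>
	<text class="norm">
		The averaging claim above can be illustrated with a small simulation. It is a sketch only, and every number in it is an assumption made for the example: each observer's guess is modelled as the true value plus independent, unbiased Gaussian noise, and the function name delphi_demo and its parameters are hypothetical.
	</text>
</object>
<object>
	<text class="code"><![CDATA[
```python
import random
import statistics


def delphi_demo(observers=25, trials=2000, truth=100.0, noise=20.0, seed=1):
    """Return (mean error of one random observer, mean error of the group average)."""
    rng = random.Random(seed)
    single_errors, average_errors = [], []
    for _ in range(trials):
        # Each observer guesses the truth with independent, unbiased noise.
        guesses = [rng.gauss(truth, noise) for _ in range(observers)]
        single_errors.append(abs(rng.choice(guesses) - truth))
        average_errors.append(abs(statistics.fmean(guesses) - truth))
    return statistics.fmean(single_errors), statistics.fmean(average_errors)


single, averaged = delphi_demo()
print(f"one observer: {single:.2f}  group average: {averaged:.2f}")
```
]]></text>
</object>
<object>
	<text class="norm">
		Under these assumptions the averaged opinion lands much closer to the truth than a single randomly chosen observer does. If the observers shared a common bias, averaging would not remove it.
	</text>
</object>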
<object id="69">
	<ocn>69</ocn>
	<text class="norm">
		One special feature of the Linux situation that clearly helps along the
Delphi effect is the fact that the contributors for any given project
are self-selected. An early respondent pointed out that contributions
are received not from a random sample, but from people who are
interested enough to use the software, learn about how it works,
attempt to find solutions to problems they encounter, and actually
produce an apparently reasonable fix. Anyone who passes all these
filters is highly likely to have something useful to contribute.
	</text>
</object>
<object id="70">
	<ocn>70</ocn>
	<text class="norm">
		Linus's Law can be rephrased as "Debugging is parallelizable". Although
debugging requires debuggers to communicate with some coordinating
developer, it doesn't require significant coordination between
debuggers. Thus it doesn't fall prey to the same quadratic complexity
and management costs that make adding developers problematic.
	</text>
</object>
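<object>
	<text class="norm">
		The contrast drawn above can be put in numbers. In this sketch, which assumes the simplest possible model, n developers who must all coordinate with one another need a communication path for every pair, while n debuggers who each report only to one coordinating developer need just n paths. The function names are hypothetical.
	</text>
</object>
<object>
	<text class="code"><![CDATA[
```python
def pairwise_channels(n):
    """Paths needed when each of n developers coordinates with every other one."""
    return n * (n - 1) // 2


def hub_channels(n):
    """Paths needed when n debuggers each talk only to one coordinating developer."""
    return n


for n in (5, 50, 500):
    print(n, pairwise_channels(n), hub_channels(n))
```
]]></text>
</object>
<object>
	<text class="norm">
		For 50 participants that is 1225 pairwise paths against 50 hub-and-spoke ones, and the gap widens quadratically as the group grows.
	</text>
</object>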
<object id="71">
	<ocn>71</ocn>
	<text class="norm">
		In practice, the theoretical loss of efficiency due to duplication of
work by debuggers almost never seems to be an issue in the Linux world.
One effect of a "release early and often" policy is to minimize such
duplication by propagating fed-back fixes quickly [JH].
	</text>
</object>
<object id="72">
	<ocn>72</ocn>
	<text class="norm">
		Brooks (the author of The Mythical Man-Month) even made an off-hand
observation related to this: "The total cost of maintaining a widely
used program is typically 40 percent or more of the cost of developing
it. Surprisingly this cost is strongly affected by the number of users.
More users find more bugs." [emphasis added].
	</text>
</object>
<object id="73">
	<ocn>73</ocn>
	<text class="norm">
		More users find more bugs because adding more users adds more different
ways of stressing the program. This effect is amplified when the users
are co-developers. Each one approaches the task of bug characterization
with a slightly different perceptual set and analytical toolkit, a
different angle on the problem. The "Delphi effect" seems to work
precisely because of this variation. In the specific context of
debugging, the variation also tends to reduce duplication of effort.
	</text>
</object>
<object id="74">
	<ocn>74</ocn>
	<text class="norm">
		So adding more beta-testers may not reduce the complexity of the
current "deepest" bug from the developer's point of view, but it
increases the probability that someone's toolkit will be matched to the
problem in such a way that the bug is shallow to that person.
	</text>
</object>
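<object>
	<text class="norm">
		A back-of-the-envelope version of this argument: suppose, purely as an assumption for illustration, that each tester independently has the same small probability p of finding a given bug shallow. The chance that at least one of N testers does is then 1 - (1 - p)^N. The function name below is hypothetical.
	</text>
</object>
<object>
	<text class="code"><![CDATA[
```python
def chance_someone_finds_it_shallow(p, testers):
    """Probability that at least one of `testers` independent people finds the bug easy."""
    return 1.0 - (1.0 - p) ** testers


for n in (1, 10, 100, 1000):
    print(n, round(chance_someone_finds_it_shallow(0.01, n), 4))
```
]]></text>
</object>
<object>
	<text class="norm">
		With p = 0.01, a single tester has a 1 percent chance, a hundred testers give roughly a 63 percent chance, and a thousand give better than 99.99 percent. The bug itself is no simpler; only the odds of a match have changed.
	</text>
</object>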
<object id="75">
	<ocn>75</ocn>
	<text class="norm">
		Linus coppers his bets, too. In case there are serious bugs, Linux
kernel versions are numbered in such a way that potential users can make
a choice either to run the last version designated "stable" or to ride
the cutting edge and risk bugs in order to get new features. This
tactic is not yet systematically imitated by most Linux hackers, but
perhaps it should be; the fact that either choice is available makes
both more attractive. [HBS]
	</text>
</object>
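<object>
	<text class="norm">
		The text above does not spell out the numbering scheme. In the kernel series of that era (1.x and 2.x), an even second number such as 2.0 marked a stable series and an odd one such as 2.1 marked a development series. A minimal sketch of that check, with a hypothetical function name and assuming a "major.minor.patch" version string:
	</text>
</object>
<object>
	<text class="code"><![CDATA[
```python
def is_stable_series(version):
    """True if the second component of a 'major.minor.patch' string is even."""
    minor = int(version.split(".")[1])
    return minor % 2 == 0


print(is_stable_series("2.0.36"))   # even minor number: stable series
print(is_stable_series("2.1.132"))  # odd minor number: development series
```
]]></text>
</object>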
<object id="76">
	<ocn>76</ocn>
	<text class="h4">
		How Many Eyeballs Tame Complexity
	</text>
</object>
<object id="77">
	<ocn>77</ocn>
	<text class="norm">
		It's one thing to observe in the large that the bazaar style greatly
accelerates debugging and code evolution. It's another to understand
exactly how and why it does so at the micro-level of day-to-day
developer and tester behavior. In this section (written three years
after the original paper, using insights by developers who read it and
re-examined their own behavior) we'll take a hard look at the actual
mechanisms. Non-technically inclined readers can safely skip to the
next section.
	</text>
</object>
<object id="78">
	<ocn>78</ocn>
	<text class="norm">
		One key to understanding is to realize exactly why it is that the kind
of bug report non&#8211;source-aware users normally turn in tends not
to be very useful. Non&#8211;source-aware users tend to report only
surface symptoms; they take their environment for granted, so they (a)
omit critical background data, and (b) seldom include a reliable recipe
for reproducing the bug.
	</text>
</object>
<object id="79">
	<ocn>79</ocn>
	<text class="norm">
		The underlying problem here is a mismatch between the tester's and the
developer's mental models of the program; the tester, on the outside
looking in, and the developer on the inside looking out. In
closed-source development they're both stuck in these roles, and tend
to talk past each other and find each other deeply frustrating.
	</text>
</object>
<object id="80">
	<ocn>80</ocn>
	<text class="norm">
		Open-source development breaks this bind, making it far easier for
tester and developer to develop a shared representation grounded in the
actual source code and to communicate effectively about it.
Practically, there is a huge difference in leverage for the developer
between the kind of bug report that just reports externally-visible
symptoms and the kind that hooks directly to the developer's
source-code&#8211;based mental representation of the program.
	</text>
</object>
<object id="81">
	<ocn>81</ocn>
	<text class="norm">
		Most bugs, most of the time, are easily nailed given even an incomplete
but suggestive characterization of their error conditions at
source-code level. When someone among your beta-testers can point out,
"there's a boundary problem in line nnn", or even just "under
conditions X, Y, and Z, this variable rolls over", a quick look at the
offending code often suffices to pin down the exact mode of failure and
generate a fix.
	</text>
</object>
<object id="82">
	<ocn>82</ocn>
	<text class="norm">
		Thus, source-code awareness by both parties greatly enhances both good
communication and the synergy between what a beta-tester reports and
what the core developer(s) know. In turn, this means that the core
developers' time tends to be well conserved, even with many
collaborators.
	</text>
</object>
<object id="83">
	<ocn>83</ocn>
	<text class="norm">
		Another characteristic of the open-source method that conserves
developer time is the communication structure of typical open-source
projects. Above I used the term "core developer"; this reflects a
distinction between the project core (typically quite small; a single
core developer is common, and one to three is typical) and the project
halo of beta-testers and available contributors (which often numbers in
the hundreds).
	</text>
</object>
<object id="84">
	<ocn>84</ocn>
	<text class="norm">
		The fundamental problem that traditional software-development
organization addresses is Brooks's Law: "Adding more programmers to a
late project makes it later." More generally, Brooks's Law predicts
that the complexity and communication costs of a project rise with the
square of the number of developers, while work done only rises
linearly.
	</text>
</object>
<object id="85">
	<ocn>85</ocn>
	<text class="norm">
		Brooks's Law is founded on experience that bugs tend strongly to
cluster at the interfaces between code written by different people, and
that communications/coordination overhead on a project tends to rise
with the number of interfaces between human beings. Thus, problems
scale with the number of communications paths between developers, which
scales as the square of the number of developers (more precisely,
according to the formula N*(N - 1)/2 where N is the number of
developers).
	</text>
</object>
<object id="86">
	<ocn>86</ocn>
	<text class="norm">
		The Brooks's Law analysis (and the resulting fear of large numbers in
development groups) rests on a hidden assumption: that the
communications structure of the project is necessarily a complete
graph, that everybody talks to everybody else. But on open-source
projects, the halo developers work on what are in effect separable
parallel subtasks and interact with each other very little; code
changes and bug reports stream through the core group, and only within
that small core group do we pay the full Brooksian overhead. [SU]
	</text>
</object>
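<object>
	<text class="norm">
		The contrast between the two communication structures can be made
concrete with a little arithmetic. The following is a minimal sketch
under a stated assumption that is not part of the original analysis:
each halo contributor keeps a single channel to the core group as a
whole, while the core itself remains a complete graph. The function
names are illustrative only.

```python
def complete_graph_paths(n):
    """Paths when all n developers talk to each other: N*(N - 1)/2."""
    return n * (n - 1) // 2


def core_and_halo_paths(core, halo):
    """Paths when only the core group is a complete graph.

    Assumption (hypothetical, for illustration): every halo contributor
    has exactly one channel, to the core group as a whole.
    """
    return complete_graph_paths(core) + halo


if __name__ == "__main__":
    # Columns: total people, complete-graph paths, core-and-halo paths.
    for core, halo in [(1, 10), (3, 100), (3, 300)]:
        total = core + halo
        print(total, complete_graph_paths(total), core_and_halo_paths(core, halo))
```

Under that assumption the number of paths grows roughly linearly with
the size of the halo, while the quadratic term is confined to the small
core.
	</text>
</object>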
<object id="87">
	<ocn>87</ocn>
	<text class="norm">
		There are still more reasons that source-code&#8211;level bug
reporting tends to be very efficient. They center around the fact that
a single error can often have multiple possible symptoms, manifesting
differently depending on details of the user's usage pattern and
environment. Such errors tend to be exactly the sort of complex and
subtle bugs (such as dynamic-memory-management errors or
nondeterministic interrupt-window artifacts) that are hardest to
reproduce at will or to pin down by static analysis, and which do the
most to create long-term problems in software.
	</text>
</object>
<object id="88">
	<ocn>88</ocn>
	<text class="norm">
		A tester who sends in a tentative source-code&#8211;level
characterization of such a multi-symptom bug (e.g. "It looks to me like
there's a window in the signal handling near line 1250" or "Where are
you zeroing that buffer?") may give a developer, otherwise too close to
the code to see it, the critical clue to a half-dozen disparate
symptoms. In cases like this, it may be hard or even impossible to know
which externally-visible misbehaviour was caused by precisely which
bug&#8212;but with frequent releases, it's unnecessary to know. Other
collaborators will be likely to find out quickly whether their bug has
been fixed or not. In many cases, source-level bug reports will cause
misbehaviours to drop out without ever having been attributed to any
specific fix.
	</text>
</object>
<object id="89">
	<ocn>89</ocn>
	<text class="norm">
		Complex multi-symptom errors also tend to have multiple trace paths
from surface symptoms back to the actual bug. Which of the trace paths
a given developer or tester can chase may depend on subtleties of that
person's environment, and may well change in a not obviously
deterministic way over time. In effect, each developer and tester
samples a semi-random set of the program's state space when looking for
the etiology of a symptom. The more subtle and complex the bug, the
less likely that skill will be able to guarantee the relevance of that
sample.
	</text>
</object>
<object id="90">
	<ocn>90</ocn>
	<text class="norm">
		For simple and easily reproducible bugs, then, the accent will be on
the "semi" rather than the "random"; debugging skill and intimacy with
the code and its architecture will matter a lot. But for complex bugs,
the accent will be on the "random". Under these circumstances many
people running traces will be much more effective than a few people
running traces sequentially&#8212;even if the few have a much higher
average skill level.
	</text>
</object>
<object id="91">
	<ocn>91</ocn>
	<text class="norm">
		This effect will be greatly amplified if the difficulty of following
trace paths from different surface symptoms back to a bug varies
significantly in a way that can't be predicted by looking at the
symptoms. A single developer sampling those paths sequentially will be
as likely to pick a difficult trace path on the first try as an easy
one. On the other hand, suppose many people are trying trace paths in
parallel while doing rapid releases. Then it is likely one of them will
find the easiest path immediately, and nail the bug in a much shorter
time. The project maintainer will see that, ship a new release, and the
other people running traces on the same bug will be able to stop before
having spent too much time on their more difficult traces [RJ].
	</text>
</object>
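<object>
	<text class="norm">
		A toy calculation may help here. It is only a sketch, and the numbers
are invented: suppose five symptoms lead back to one bug, the cost of
each trace path cannot be predicted in advance, and in the parallel
case each path is tried by a different person. A lone developer
choosing a path at random pays the average cost on the first try; the
parallel group is done as soon as the cheapest path is finished.

```python
def expected_sequential_cost(path_costs):
    """Mean cost of the first path tried by one developer picking at random."""
    return sum(path_costs) / len(path_costs)


def parallel_cost(path_costs):
    """Elapsed time until the first success when every path is tried at once."""
    return min(path_costs)


# Hypothetical hours needed to trace each of five symptoms back to one bug.
costs = [2, 40, 8, 25, 15]
print(expected_sequential_cost(costs))  # average over the five paths
print(parallel_cost(costs))             # cost of the easiest path
```

The parallel group spends more total effort, but the elapsed time to a
fix is set by the easiest path rather than by the average one.
	</text>
</object>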
<object id="92">
	<ocn>92</ocn>
	<text class="h4">
		When Is a Rose Not a Rose?
	</text>
</object>
<object id="93">
	<ocn>93</ocn>
	<text class="norm">
		Having studied Linus's behavior and formed a theory about why it was
successful, I made a conscious decision to test this theory on my new
(admittedly much less complex and ambitious) project.
	</text>
</object>
<object id="94">
	<ocn>94</ocn>
	<text class="norm">
		But the first thing I did was reorganize and simplify popclient a lot.
Carl Harris's implementation was very sound, but exhibited a kind of
unnecessary complexity common to many C programmers. He treated the
code as central and the data structures as support for the code. As a
result, the code was beautiful but the data structure design ad-hoc and
rather ugly (at least by the high standards of this veteran LISP
hacker).
	</text>
</object>
<object id="95">
	<ocn>95</ocn>
	<text class="norm">
		I had another purpose for rewriting besides improving the code and the
data structure design, however. That was to evolve it into something I
understood completely. It's no fun to be responsible for fixing bugs in
a program you don't understand.
	</text>
</object>
<object id="96">
	<ocn>96</ocn>
	<text class="norm">
		For the first month or so, then, I was simply following out the
implications of Carl's basic design. The first serious change I made
was to add IMAP support. I did this by reorganizing the protocol
machines into a generic driver and three method tables (for POP2, POP3,
and IMAP). This and the previous changes illustrate a general principle
that's good for programmers to keep in mind, especially in languages
like C that don't naturally do dynamic typing:
	</text>
</object>
<object id="97">
	<ocn>97</ocn>
	<text class="indent1">
		9. Smart data structures and dumb code works a lot better than the
other way around.
	</text>
</object>
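<object>
	<text class="norm">
		The shape of a generic driver fed by per-protocol method tables can be
sketched as follows. This is not fetchmail's code (which is written in
C); it is a hypothetical Python illustration, and the command strings
are simplified placeholders rather than complete protocol dialogues.
The names PROTOCOLS and plan_session are assumptions made for the
example.

```python
# Hypothetical method tables: one small record of data per protocol.
# The command templates are simplified placeholders for illustration.
PROTOCOLS = {
    "POP2": {"port": 109, "login": "HELO {user}", "fetch": "READ {n}", "quit": "QUIT"},
    "POP3": {"port": 110, "login": "USER {user}", "fetch": "RETR {n}", "quit": "QUIT"},
    "IMAP": {"port": 143, "login": "A1 LOGIN {user}", "fetch": "A2 FETCH {n} RFC822", "quit": "A3 LOGOUT"},
}


def plan_session(protocol, user, message_numbers):
    """Generic driver: the same simple loop for every protocol in the table."""
    table = PROTOCOLS[protocol]
    commands = [table["login"].format(user=user)]
    for n in message_numbers:
        commands.append(table["fetch"].format(n=n))
    commands.append(table["quit"])
    return table["port"], commands


print(plan_session("POP3", "alice", [1, 2]))
print(plan_session("IMAP", "alice", [1]))
```

In a design like this, adding a protocol means adding a table entry;
the driver loop itself does not change.
	</text>
</object>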
<object id="98">
	<ocn>98</ocn>
	<text class="norm">
		Brooks, Chapter 9: "Show me your flowchart and conceal your tables, and
I shall continue to be mystified. Show me your tables, and I won't
usually need your flowchart; it'll be obvious." Allowing for thirty
years of terminological/cultural shift, it's the same point.
	</text>
</object>
<object id="99">
	<ocn>99</ocn>
	<text class="norm">
		At this point (early September 1996, about six weeks from zero) I
started thinking that a name change might be in order&#8212;after all,
it wasn't just a POP client any more. But I hesitated, because there
was as yet nothing genuinely new in the design. My version of popclient
had yet to develop an identity of its own.
	</text>
</object>
<object id="100">
	<ocn>100</ocn>
	<text class="norm">
		That changed, radically, when popclient learned how to forward fetched
mail to the SMTP port. I'll get to that in a moment. But first: I said
earlier that I'd decided to use this project to test my theory about
what Linus Torvalds had done right. How (you may well ask) did I do
that? In these ways:
	</text>
</object>
<object id="101">
	<ocn>101</ocn>
	<text class="norm">
		I released early and often (almost never less often than every ten
days; during periods of intense development, once a day).
	</text>
</object>
<object id="102">
	<ocn>102</ocn>
	<text class="norm">
		I grew my beta list by adding to it everyone who contacted me about
fetchmail.
	</text>
</object>
<object id="103">
	<ocn>103</ocn>
	<text class="norm">
		I sent chatty announcements to the beta list whenever I released,
encouraging people to participate.
	</text>
</object>
<object id="104">
	<ocn>104</ocn>
	<text class="norm">
		And I listened to my beta-testers, polling them about design decisions
and stroking them whenever they sent in patches and feedback.
	</text>
</object>
<object id="105">
	<ocn>105</ocn>
	<text class="norm">
		The payoff from these simple measures was immediate. From the beginning
of the project, I got bug reports of a quality most developers would
kill for, often with good fixes attached. I got thoughtful criticism, I
got fan mail, I got intelligent feature suggestions. Which leads to:
	</text>
</object>
<object id="106">
	<ocn>106</ocn>
	<text class="indent1">
		10. If you treat your beta-testers as if they're your most valuable
resource, they will respond by becoming your most valuable resource.
	</text>
</object>
<object id="107">
	<ocn>107</ocn>
	<text class="norm">
		One interesting measure of fetchmail's success is the sheer size of the
project beta list, fetchmail-friends. At the time of the latest revision of
this paper (November 2000) it has 287 members and is adding two or
three a week.
	</text>
</object>
<object id="108">
	<ocn>108</ocn>
	<text class="norm">
		Actually, when I revised in late May 1997 I found the list was
beginning to lose members from its high of close to 300 for an
interesting reason. Several people have asked me to unsubscribe them
because fetchmail is working so well for them that they no longer need
to see the list traffic! Perhaps this is part of the normal life-cycle
of a mature bazaar-style project.
	</text>
</object>
<object id="109">
	<ocn>109</ocn>
	<text class="h4">
		Popclient becomes Fetchmail
	</text>
</object>
<object id="110">
	<ocn>110</ocn>
	<text class="norm">
		The real turning point in the project was when Harry Hochheiser sent me
his scratch code for forwarding mail to the client machine's SMTP port.
I realized almost immediately that a reliable implementation of this
feature would make all the other mail delivery modes next to obsolete.
	</text>
</object>
<object id="111">
	<ocn>111</ocn>
	<text class="norm">
		For many weeks I had been tweaking fetchmail rather incrementally while
feeling like the interface design was serviceable but
grubby&#8212;inelegant and with too many exiguous options hanging out
all over. The options to dump fetched mail to a mailbox file or
standard output particularly bothered me, but I couldn't figure out
why.
	</text>
</object>
<object id="112">
	<ocn>112</ocn>
	<text class="norm">
		(If you don't care about the technicalia of Internet mail, the next two
paragraphs can be safely skipped.)
	</text>
</object>
<object id="113">
	<ocn>113</ocn>
	<text class="norm">
		What I saw when I thought about SMTP forwarding was that popclient had
been trying to do too many things. It had been designed to be both a
mail transport agent (MTA) and a local delivery agent (MDA). With SMTP
forwarding, it could get out of the MDA business and be a pure MTA,
handing off mail to other programs for local delivery just as sendmail
does.
	</text>
</object>
<object id="114">
	<ocn>114</ocn>
	<text class="norm">
		Why mess with all the complexity of configuring a mail delivery agent
or setting up lock-and-append on a mailbox when port 25 is almost
guaranteed to be there on any platform with TCP/IP support in the first
place? Especially when this means retrieved mail is guaranteed to look
like normal sender-initiated SMTP mail, which is really what we want
anyway.
	</text>
</object>
<object id="115">
	<ocn>115</ocn>
	<text class="norm">
		(Back to a higher level....)
	</text>
</object>
<object id="116">
	<ocn>116</ocn>
	<text class="norm">
		Even if you didn't follow the preceding technical jargon, there are
several important lessons here. First, this SMTP-forwarding concept was
the biggest single payoff I got from consciously trying to emulate
Linus's methods. A user gave me this terrific idea&#8212;all I had to
do was understand the implications.
	</text>
</object>
<object id="117">
	<ocn>117</ocn>
	<text class="indent1">
		11. The next best thing to having good ideas is recognizing good ideas
from your users. Sometimes the latter is better.
	</text>
</object>
<object id="118">
	<ocn>118</ocn>
	<text class="norm">
		Interestingly enough, you will quickly find that if you are completely
and self-deprecatingly truthful about how much you owe other people,
the world at large will treat you as though you did every bit of the
invention yourself and are just being becomingly modest about your
innate genius. We can all see how well this worked for Linus!
	</text>
</object>
<object id="119">
	<ocn>119</ocn>
	<text class="norm">
		(When I gave my talk at the first Perl Conference in August 1997,
hacker extraordinaire Larry Wall was in the front row. As I got to the
last line above he called out, religious-revival style, "Tell it, tell
it, brother!". The whole audience laughed, because they knew this had
worked for the inventor of Perl, too.)
	</text>
</object>
<object id="120">
	<ocn>120</ocn>
	<text class="norm">
		After a very few weeks of running the project in the same spirit, I
began to get similar praise not just from my users but from other
people to whom the word leaked out. I stashed away some of that email;
I'll look at it again sometime if I ever start wondering whether my
life has been worthwhile :-).
	</text>
</object>
<object id="121">
	<ocn>121</ocn>
	<text class="norm">
		But there are two more fundamental, non-political lessons here that are
general to all kinds of design.
	</text>
</object>
<object id="122">
	<ocn>122</ocn>
	<text class="indent1">
		12. Often, the most striking and innovative solutions come from
realizing that your concept of the problem was wrong.
	</text>
</object>
<object id="123">
	<ocn>123</ocn>
	<text class="norm">
		I had been trying to solve the wrong problem by continuing to develop
popclient as a combined MTA/MDA with all kinds of funky local delivery
modes. Fetchmail's design needed to be rethought from the ground up as
a pure MTA, a part of the normal SMTP-speaking Internet mail path.
	</text>
</object>
<object id="124">
	<ocn>124</ocn>
	<text class="norm">
		When you hit a wall in development&#8212;when you find yourself hard
put to think past the next patch&#8212;it's often time to ask not
whether you've got the right answer, but whether you're asking the
right question. Perhaps the problem needs to be reframed.
	</text>
</object>
<object id="125">
	<ocn>125</ocn>
	<text class="norm">
		Well, I had reframed my problem. Clearly, the right thing to do was (1)
hack SMTP forwarding support into the generic driver, (2) make it the
default mode, and (3) eventually throw out all the other delivery
modes, especially the deliver-to-file and deliver-to-standard-output
options.
	</text>
</object>
<object id="126">
	<ocn>126</ocn>
	<text class="norm">
		I hesitated over step 3 for some time, fearing to upset long-time
popclient users dependent on the alternate delivery mechanisms. In
theory, they could immediately switch to .forward files or their
non-sendmail equivalents to get the same effects. In practice the
transition might have been messy.
	</text>
</object>
<object id="127">
	<ocn>127</ocn>
	<text class="norm">
		But when I did it, the benefits proved huge. The cruftiest parts of the
driver code vanished. Configuration got radically simpler&#8212;no more
grovelling around for the system MDA and user's mailbox, no more
worries about whether the underlying OS supports file locking.
	</text>
</object>
<object id="128">
	<ocn>128</ocn>
	<text class="norm">
		Also, the only way to lose mail vanished. If you specified delivery to
a file and the disk got full, your mail got lost. This can't happen
with SMTP forwarding because your SMTP listener won't return OK unless
the message can be delivered or at least spooled for later delivery.
	</text>
</object>
<object id="129">
	<ocn>129</ocn>
	<text class="norm">
		Also, performance improved (though not so you'd notice it in a single
run). Another not insignificant benefit of this change was that the
manual page got a lot simpler.
	</text>
</object>
<object id="130">
	<ocn>130</ocn>
	<text class="norm">
		Later, I had to bring delivery via a user-specified local MDA back in
order to allow handling of some obscure situations involving dynamic
SLIP. But I found a much simpler way to do it.
	</text>
</object>
<object id="131">
	<ocn>131</ocn>
	<text class="norm">
		The moral? Don't hesitate to throw away superannuated features when you
can do it without loss of effectiveness. Antoine de Saint-Exup&#233;ry
(who was an aviator and aircraft designer when he wasn't authoring
classic children's books) said:
	</text>
</object>
<object id="132">
	<ocn>132</ocn>
	<text class="indent1">
		13. "Perfection (in design) is achieved not when there is nothing more
to add, but rather when there is nothing more to take away."
	</text>
</object>
<object id="133">
	<ocn>133</ocn>
	<text class="norm">
		When your code is getting both better and simpler, that is when you
know it's right. And in the process, the fetchmail design acquired an
identity of its own, different from the ancestral popclient.
	</text>
</object>
<object id="134">
	<ocn>134</ocn>
	<text class="norm">
		It was time for the name change. The new design looked much more like a
dual of sendmail than the old popclient had; both are MTAs, but where
sendmail pushes then delivers, the new popclient pulls then delivers.
So, two months off the blocks, I renamed it fetchmail.
	</text>
</object>
<object id="135">
	<ocn>135</ocn>
	<text class="norm">
		There is a more general lesson in this story about how SMTP delivery
came to fetchmail. It is not only debugging that is parallelizable;
development and (to a perhaps surprising extent) exploration of design
space is, too. When your development mode is rapidly iterative,
development and enhancement may become special cases of
debugging&#8212;fixing `bugs of omission' in the original capabilities
or concept of the software.
	</text>
</object>
<object id="136">
	<ocn>136</ocn>
	<text class="norm">
		Even at a higher level of design, it can be very valuable to have lots
of co-developers random-walking through the design space near your
product. Consider the way a puddle of water finds a drain, or better
yet how ants find food: exploration essentially by diffusion, followed
by exploitation mediated by a scalable communication mechanism. This
works very well; as with Harry Hochheiser and me, one of your outriders
may well find a huge win nearby that you were just a little too
close-focused to see.
	</text>
</object>
<object id="137">
	<ocn>137</ocn>
	<text class="h4">
		Fetchmail Grows Up
	</text>
</object>
<object id="138">
	<ocn>138</ocn>
	<text class="norm">
		There I was with a neat and innovative design, code that I knew worked
well because I used it every day, and a burgeoning beta list. It
gradually dawned on me that I was no longer engaged in a trivial
personal hack that might happen to be useful to a few other people. I had
my hands on a program that every hacker with a Unix box and a SLIP/PPP
mail connection really needs.
	</text>
</object>
<object id="139">
	<ocn>139</ocn>
	<text class="norm">
		With the SMTP forwarding feature, it pulled far enough in front of the
competition to potentially become a "category killer", one of those
classic programs that fills its niche so competently that the
alternatives are not just discarded but almost forgotten.
	</text>
</object>
<object id="140">
	<ocn>140</ocn>
	<text class="norm">
		I think you can't really aim or plan for a result like this. You have
to get pulled into it by design ideas so powerful that afterward the
results just seem inevitable, natural, even foreordained. The only way
to try for ideas like that is by having lots of ideas&#8212;or by
having the engineering judgment to take other people's good ideas
beyond where the originators thought they could go.
	</text>
</object>
<object id="141">
	<ocn>141</ocn>
	<text class="norm">
		Andy Tanenbaum had the original idea to build a simple native Unix for
IBM PCs, for use as a teaching tool (he called it Minix). Linus
Torvalds pushed the Minix concept further than Andrew probably thought
it could go&#8212;and it grew into something wonderful. In the same way
(though on a smaller scale), I took some ideas by Carl Harris and Harry
Hochheiser and pushed them hard. Neither of us was `original' in the
romantic way people think is genius. But then, most science and
engineering and software development isn't done by original genius,
hacker mythology to the contrary.
	</text>
</object>
<object id="142">
	<ocn>142</ocn>
	<text class="norm">
		The results were pretty heady stuff all the same&#8212;in fact, just
the kind of success every hacker lives for! And they meant I would have
to set my standards even higher. To make fetchmail as good as I now saw
it could be, I'd have to write not just for my own needs, but also
include and support features necessary to others but outside my orbit.
And do that while keeping the program simple and robust.
	</text>
</object>
<object id="143">
	<ocn>143</ocn>
	<text class="norm">
		The first and overwhelmingly most important feature I wrote after
realizing this was multidrop support&#8212;the ability to fetch mail
from mailboxes that had accumulated all mail for a group of users, and
then route each piece of mail to its individual recipients.
	</text>
</object>
<object id="144">
	<ocn>144</ocn>
	<text class="norm">
		I decided to add the multidrop support partly because some users were
clamoring for it, but mostly because I thought it would shake bugs out
of the single-drop code by forcing me to deal with addressing in full
generality. And so it proved. Getting RFC 822 address parsing right
took me a remarkably long time, not because any individual piece of it
is hard but because it involved a pile of interdependent and fussy
details.
	</text>
</object>
<object id="145">
	<ocn>145</ocn>
	<text class="norm">
		But multidrop addressing turned out to be an excellent design decision
as well. Here's how I knew:
	</text>
</object>
<object id="146">
	<ocn>146</ocn>
	<text class="indent1">
		14. Any tool should be useful in the expected way, but a truly great
tool lends itself to uses you never expected.
	</text>
</object>
<object id="147">
	<ocn>147</ocn>
	<text class="norm">
		The unexpected use for multidrop fetchmail is to run mailing lists with
the list kept, and alias expansion done, on the client side of the
Internet connection. This means someone running a personal machine
through an ISP account can manage a mailing list without continuing
access to the ISP's alias files.
	</text>
</object>
<object id="148">
	<ocn>148</ocn>
	<text class="norm">
		Another important change demanded by my beta-testers was support for
8-bit MIME (Multipurpose Internet Mail Extensions) operation. This was
pretty easy to do, because I had been careful to keep the code 8-bit
clean (that is, to not press the 8th bit, unused in the ASCII character
set, into service to carry information within the program). Not because
I anticipated the demand for this feature, but rather in obedience to
another rule:
	</text>
</object>
<object id="149">
	<ocn>149</ocn>
	<text class="indent1">
		15. When writing gateway software of any kind, take pains to disturb
the data stream as little as possible&#8212;and never throw away
information unless the recipient forces you to!
	</text>
</object>
<object id="150">
	<ocn>150</ocn>
	<text class="norm">
		Had I not obeyed this rule, 8-bit MIME support would have been
difficult and buggy. As it was, all I had to do was read the MIME
standard (RFC 1652) and add a trivial bit of header-generation logic.
	</text>
</object>
<object id="151">
	<ocn>151</ocn>
	<text class="norm">
		Some European users bugged me into adding an option to limit the number
of messages retrieved per session (so they can control costs from their
expensive phone networks). I resisted this for a long time, and I'm
still not entirely happy about it. But if you're writing for the world,
you have to listen to your customers&#8212;this doesn't change just
because they're not paying you in money.
	</text>
</object>
<object id="152">
	<ocn>152</ocn>
	<text class="h4">
		A Few More Lessons from Fetchmail
	</text>
</object>
<object id="153">
	<ocn>153</ocn>
	<text class="norm">
		Before we go back to general software-engineering issues, there are a
couple more specific lessons from the fetchmail experience to ponder.
Nontechnical readers can safely skip this section.
	</text>
</object>
<object id="154">
	<ocn>154</ocn>
	<text class="norm">
		The rc (control) file syntax includes optional `noise' keywords that
are entirely ignored by the parser. The English-like syntax they allow
is considerably more readable than the traditional terse keyword-value
pairs you get when you strip them all out.
	</text>
</object>
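<object>
	<text class="norm">
		How a parser can ignore noise keywords is easy to sketch. What follows
is a hypothetical illustration in Python, not fetchmail's actual
parser; it assumes a simplified grammar in which a declaration is
nothing but keyword-value pairs, and the particular set of noise words
is chosen for the example.

```python
# Illustrative set of noise words; the parser drops them before pairing.
NOISE = {"and", "with", "has", "wants", "options"}


def parse_declaration(line):
    """Drop noise words, then read what is left as keyword-value pairs."""
    words = [w for w in line.split() if w.lower() not in NOISE]
    return dict(zip(words[0::2], words[1::2]))


english_like = "poll mail.example.com with protocol pop3 and username alice"
terse = "poll mail.example.com protocol pop3 username alice"

# Both spellings yield the same parsed declaration.
print(parse_declaration(english_like) == parse_declaration(terse))
print(parse_declaration(english_like))
```

Because the noise words are discarded before any pairing happens, they
cost the parser one filtering step and nothing more.
	</text>
</object>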
<object id="155">
	<ocn>155</ocn>
	<text class="norm">
		These started out as a late-night experiment when I noticed how much
the rc file declarations were beginning to resemble an imperative
minilanguage. (This is also why I changed the original popclient
"server" keyword to "poll").
	</text>
</object>
<object id="156">
	<ocn>156</ocn>
	<text class="norm">
		It seemed to me that trying to make that imperative minilanguage more
like English might make it easier to use. Now, although I'm a convinced
partisan of the "make it a language" school of design as exemplified by
Emacs and HTML and many database engines, I am not normally a big fan
of "English-like" syntaxes.
	</text>
</object>
<object id="157">
	<ocn>157</ocn>
	<text class="norm">
		Traditionally programmers have tended to favor control syntaxes that
are very precise and compact and have no redundancy at all. This is a
cultural legacy from when computing resources were expensive, so
parsing stages had to be as cheap and simple as possible. English, with
about 50% redundancy, looked like a very inappropriate model then.
	</text>
</object>
<object id="158">
	<ocn>158</ocn>
	<text class="norm">
		This is not my reason for normally avoiding English-like syntaxes; I
mention it here only to demolish it. With cheap cycles and core,
terseness should not be an end in itself. Nowadays it's more important
for a language to be convenient for humans than to be cheap for the
computer.
	</text>
</object>
<object id="159">
	<ocn>159</ocn>
	<text class="norm">
		There remain, however, good reasons to be wary. One is the complexity
cost of the parsing stage&#8212;you don't want to raise that to the
point where it's a significant source of bugs and user confusion in
itself. Another is that trying to make a language syntax English-like
often demands that the "English" it speaks be bent seriously out of
shape, so much so that the superficial resemblance to natural language
is as confusing as a traditional syntax would have been. (You see this
bad effect in a lot of so-called "fourth generation" and commercial
database-query languages.)
	</text>
</object>
<object id="160">
	<ocn>160</ocn>
	<text class="norm">
		The fetchmail control syntax seems to avoid these problems because the
language domain is extremely restricted. It's nowhere near a
general-purpose language; the things it says simply are not very
complicated, so there's little potential for confusion in moving
mentally between a tiny subset of English and the actual control
language. I think there may be a broader lesson here:
	</text>
</object>
<object id="161">
	<ocn>161</ocn>
	<text class="indent1">
		16. When your language is nowhere near Turing-complete, syntactic sugar
can be your friend.
	</text>
</object>
<object id="162">
	<ocn>162</ocn>
	<text class="norm">
		Another lesson is about security by obscurity. Some fetchmail users
asked me to change the software to store passwords encrypted in the rc
file, so snoopers wouldn't be able to casually see them.
	</text>
</object>
<object id="163">
	<ocn>163</ocn>
	<text class="norm">
		I didn't do it, because this doesn't actually add protection. Anyone
who's acquired permissions to read your rc file will be able to run
fetchmail as you anyway&#8212;and if it's your password they're after,
they'd be able to rip the necessary decoder out of the fetchmail code
itself to get it.
	</text>
</object>
<object id="164">
	<ocn>164</ocn>
	<text class="norm">
		All .fetchmailrc password encryption would have done is give a false
sense of security to people who don't think very hard. The general rule
here is:
	</text>
</object>
<object id="165">
	<ocn>165</ocn>
	<text class="indent1">
		17. A security system is only as secure as its secret. Beware of
pseudo-secrets.
	</text>
</object>
<object id="166">
	<ocn>166</ocn>
	<text class="h4">
		Necessary Preconditions for the Bazaar Style
	</text>
</object>
<object id="167">
	<ocn>167</ocn>
	<text class="norm">
		Early reviewers and test audiences for this essay consistently raised
questions about the preconditions for successful bazaar-style
development, including both the qualifications of the project leader
and the state of code at the time one goes public and starts to try to
build a co-developer community.
	</text>
</object>
<object id="168">
	<ocn>168</ocn>
	<text class="norm">
		It's fairly clear that one cannot code from the ground up in bazaar
style [IN]. One can test, debug and improve in bazaar style, but it
would be very hard to originate a project in bazaar mode. Linus didn't
try it. I didn't either. Your nascent developer community needs to have
something runnable and testable to play with.
	</text>
</object>
<object id="169">
	<ocn>169</ocn>
	<text class="norm">
		When you start community-building, what you need to be able to present
is a plausible promise. Your program doesn't have to work particularly
well. It can be crude, buggy, incomplete, and poorly documented. What
it must not fail to do is (a) run, and (b) convince potential
co-developers that it can be evolved into something really neat in the
foreseeable future.
	</text>
</object>
<object id="170">
	<ocn>170</ocn>
	<text class="norm">
		Linux and fetchmail both went public with strong, attractive basic
designs. Many people thinking about the bazaar model as I have
presented it have correctly considered this critical, then jumped from
that to the conclusion that a high degree of design intuition and
cleverness in the project leader is indispensable.
	</text>
</object>
<object id="171">
	<ocn>171</ocn>
	<text class="norm">
		But Linus got his design from Unix. I got mine initially from the
ancestral popclient (though it would later change a great deal, much
more, proportionately speaking, than Linux has). So does the
leader/coordinator for a bazaar-style effort really have to have
exceptional design talent, or can he get by through leveraging the
design talent of others?
	</text>
</object>
<object id="172">
	<ocn>172</ocn>
	<text class="norm">
		I think it is not critical that the coordinator be able to originate
designs of exceptional brilliance, but it is absolutely critical that
the coordinator be able to recognize good design ideas from others.
	</text>
</object>
<object id="173">
	<ocn>173</ocn>
	<text class="norm">
		Both the Linux and fetchmail projects show evidence of this. Linus,
while not (as previously discussed) a spectacularly original designer,
has displayed a powerful knack for recognizing good design and
integrating it into the Linux kernel. And I have already described how
the single most powerful design idea in fetchmail (SMTP forwarding)
came from somebody else.
	</text>
</object>
<object id="174">
	<ocn>174</ocn>
	<text class="norm">
		Early audiences of this essay complimented me by suggesting that I am
prone to undervalue design originality in bazaar projects because I
have a lot of it myself, and therefore take it for granted. There may
be some truth to this; design (as opposed to coding or debugging) is
certainly my strongest skill.
	</text>
</object>
<object id="175">
	<ocn>175</ocn>
	<text class="norm">
		But the problem with being clever and original in software design is
that it gets to be a habit&#8212;you start reflexively making things
cute and complicated when you should be keeping them robust and simple.
I have had projects crash on me because I made this mistake, but I
managed to avoid this with fetchmail.
	</text>
</object>
<object id="176">
	<ocn>176</ocn>
	<text class="norm">
		So I believe the fetchmail project succeeded partly because I
restrained my tendency to be clever; this argues (at least) against
design originality being essential for successful bazaar projects. And
consider Linux. Suppose Linus Torvalds had been trying to pull off
fundamental innovations in operating system design during the
development; does it seem at all likely that the resulting kernel would
be as stable and successful as what we have?
	</text>
</object>
<object id="177">
	<ocn>177</ocn>
	<text class="norm">
		A certain base level of design and coding skill is required, of course,
but I expect almost anybody seriously thinking of launching a bazaar
effort will already be above that minimum. The open-source community's
internal market in reputation exerts subtle pressure on people not to
launch development efforts they're not competent to follow through on.
So far this seems to have worked pretty well.
	</text>
</object>
<object id="178">
	<ocn>178</ocn>
	<text class="norm">
		There is another kind of skill not normally associated with software
development which I think is as important as design cleverness to
bazaar projects&#8212;and it may be more important. A bazaar project
coordinator or leader must have good people and communications skills.
	</text>
</object>
<object id="179">
	<ocn>179</ocn>
	<text class="norm">
		This should be obvious. In order to build a development community, you
need to attract people, interest them in what you're doing, and keep
them happy about the amount of work they're doing. Technical sizzle
will go a long way towards accomplishing this, but it's far from the
whole story. The personality you project matters, too.
	</text>
</object>
<object id="180">
	<ocn>180</ocn>
	<text class="norm">
		It is not a coincidence that Linus is a nice guy who makes people like
him and want to help him. It's not a coincidence that I'm an energetic
extrovert who enjoys working a crowd and has some of the delivery and
instincts of a stand-up comic. To make the bazaar model work, it helps
enormously if you have at least a little skill at charming people.
	</text>
</object>
<object id="181">
	<ocn>181</ocn>
	<text class="h4">
		The Social Context of Open-Source Software
	</text>
</object>
<object id="182">
	<ocn>182</ocn>
	<text class="norm">
		It is truly written: the best hacks start out as personal solutions to
the author's everyday problems, and spread because the problem turns
out to be typical for a large class of users. This takes us back to the
matter of rule 1, restated in a perhaps more useful way:
	</text>
</object>
<object id="183">
	<ocn>183</ocn>
	<text class="indent1">
		18. To solve an interesting problem, start by finding a problem that is
interesting to you.
	</text>
</object>
<object id="184">
	<ocn>184</ocn>
	<text class="norm">
		So it was with Carl Harris and the ancestral popclient, and so with me
and fetchmail. But this has been understood for a long time. The
interesting point, the point that the histories of Linux and fetchmail
seem to demand we focus on, is the next stage&#8212;the evolution of
software in the presence of a large and active community of users and
co-developers.
	</text>
</object>
<object id="185">
	<ocn>185</ocn>
	<text class="norm">
		In The Mythical Man-Month, Fred Brooks observed that programmer time is
not fungible; adding developers to a late software project makes it
later. As we've seen previously, he argued that the complexity and
communication costs of a project rise with the square of the number of
developers, while work done only rises linearly. Brooks's Law has been
widely regarded as a truism. But we've examined in this essay a number
of ways in which the process of open-source development falsifies the
assumptions behind it&#8212;and, empirically, if Brooks's Law were the
whole picture Linux would be impossible.
	</text>
</object>
<object id="186">
	<ocn>186</ocn>
	<text class="norm">
		Gerald Weinberg's classic The Psychology of Computer Programming
supplied what, in hindsight, we can see as a vital correction to
Brooks. In his discussion of "egoless programming", Weinberg observed
that in shops where developers are not territorial about their code,
and encourage other people to look for bugs and potential improvements
in it, improvement happens dramatically faster than elsewhere.
(Recently, Kent Beck's "extreme programming" technique of deploying
coders in pairs looking over one another's shoulders might be seen as
an attempt to force this effect.)
	</text>
</object>
<object id="187">
	<ocn>187</ocn>
	<text class="norm">
		Weinberg's choice of terminology has perhaps prevented his analysis
from gaining the acceptance it deserved&#8212;one has to smile at the
thought of describing Internet hackers as "egoless". But I think his
argument looks more compelling today than ever.
	</text>
</object>
<object id="188">
	<ocn>188</ocn>
	<text class="norm">
		The bazaar method, by harnessing the full power of the "egoless
programming" effect, strongly mitigates the effect of Brooks's Law. The
principle behind Brooks's Law is not repealed, but given a large
developer population and cheap communications its effects can be
swamped by competing nonlinearities that are not otherwise visible.
This resembles the relationship between Newtonian and Einsteinian
physics&#8212;the older system is still valid at low energies, but if
you push mass and velocity high enough you get surprises like nuclear
explosions or Linux.
	</text>
</object>
<object id="189">
	<ocn>189</ocn>
	<text class="norm">
		The history of Unix should have prepared us for what we're learning
from Linux (and what I've verified experimentally on a smaller scale by
deliberately copying Linus's methods [EGCS]). That is, while coding
remains an essentially solitary activity, the really great hacks come
from harnessing the attention and brainpower of entire communities. The
developer who uses only his or her own brain in a closed project is
going to fall behind the developer who knows how to create an open,
evolutionary context in which feedback exploring the design space, code
contributions, bug-spotting, and other improvements come from
hundreds (perhaps thousands) of people.
	</text>
</object>
<object id="190">
	<ocn>190</ocn>
	<text class="norm">
		But the traditional Unix world was prevented from pushing this approach
to the ultimate by several factors. One was the legal constraints of
various licenses, trade secrets, and commercial interests. Another (in
hindsight) was that the Internet wasn't yet good enough.
	</text>
</object>
<object id="191">
	<ocn>191</ocn>
	<text class="norm">
		Before cheap Internet, there were some geographically compact
communities where the culture encouraged Weinberg's "egoless"
programming, and a developer could easily attract a lot of skilled
kibitzers and co-developers. Bell Labs, the MIT AI and LCS labs, UC
Berkeley&#8212;these became the home of innovations that are legendary
and still potent.
	</text>
</object>
<object id="192">
	<ocn>192</ocn>
	<text class="norm">
		Linux was the first project for which a conscious and successful effort
to use the entire world as its talent pool was made. I don't think it's
a coincidence that the gestation period of Linux coincided with the
birth of the World Wide Web, and that Linux left its infancy during the
same period in 1993&#8211;1994 that saw the takeoff of the ISP industry
and the explosion of mainstream interest in the Internet. Linus was the
first person who learned how to play by the new rules that pervasive
Internet access made possible.
	</text>
</object>
<object id="193">
	<ocn>193</ocn>
	<text class="norm">
		While cheap Internet was a necessary condition for the Linux model to
evolve, I think it was not by itself a sufficient condition. Another
vital factor was the development of a leadership style and set of
cooperative customs that could allow developers to attract
co-developers and get maximum leverage out of the medium.
	</text>
</object>
<object id="194">
	<ocn>194</ocn>
	<text class="norm">
		But what is this leadership style and what are these customs? They
cannot be based on power relationships&#8212;and even if they could be,
leadership by coercion would not produce the results we see. Weinberg
quotes Memoirs of a Revolutionist, the autobiography of the
19th-century Russian anarchist Pyotr Alexeyevich Kropotkin, to good
effect on this subject:
	</text>
</object>
<object id="195">
	<ocn>195</ocn>
	<text class="indent1">
		Having been brought up in a serf-owner's family, I entered active life,
like all young men of my time, with a great deal of confidence in the
necessity of commanding, ordering, scolding, punishing and the like.
But when, at an early stage, I had to manage serious enterprises and to
deal with [free] men, and when each mistake would lead at once to heavy
consequences, I began to appreciate the difference between acting on
the principle of command and discipline and acting on the principle of
common understanding. The former works admirably in a military parade,
but it is worth nothing where real life is concerned, and the aim can
be achieved only through the severe effort of many converging wills.
	</text>
</object>
<object id="196">
	<ocn>196</ocn>
	<text class="norm">
		The "severe effort of many converging wills" is precisely what a
project like Linux requires&#8212;and the "principle of command" is
effectively impossible to apply among volunteers in the anarchist's
paradise we call the Internet. To operate and compete effectively,
hackers who want to lead collaborative projects have to learn how to
recruit and energize effective communities of interest in the mode
vaguely suggested by Kropotkin's "principle of common understanding". They
must learn to use Linus's Law.[SP]
	</text>
</object>
<object id="197">
	<ocn>197</ocn>
	<text class="norm">
		Earlier I referred to the "Delphi effect" as a possible explanation for
Linus's Law. But more powerful analogies to adaptive systems in biology
and economics also irresistibly suggest themselves. The Linux world
behaves in many respects like a free market or an ecology, a collection
of selfish agents attempting to maximize utility which in the process
produces a self-correcting spontaneous order more elaborate and
efficient than any amount of central planning could have achieved.
Here, then, is the place to seek the "principle of common understanding".
	</text>
</object>
<object id="198">
	<ocn>198</ocn>
	<text class="norm">
		The "utility function" Linux hackers are maximizing is not classically
economic, but is the intangible of their own ego satisfaction and
reputation among other hackers. (One may call their motivation
"altruistic", but this ignores the fact that altruism is itself a form
of ego satisfaction for the altruist). Voluntary cultures that work
this way are not actually uncommon; one other in which I have long
participated is science fiction fandom, which unlike hackerdom has long
explicitly recognized "egoboo" (ego-boosting, or the enhancement of
one's reputation among other fans) as the basic drive behind volunteer
activity.
	</text>
</object>
<object id="199">
	<ocn>199</ocn>
	<text class="norm">
		Linus, by successfully positioning himself as the gatekeeper of a
project in which the development is mostly done by others, and
nurturing interest in the project until it became self-sustaining, has
shown an acute grasp of Kropotkin's "principle of common
understanding". This quasi-economic view of the Linux world enables us
to see how that understanding is applied.
	</text>
</object>
<object id="200">
	<ocn>200</ocn>
	<text class="norm">
		We may view Linus's method as a way to create an efficient market in
"egoboo"&#8212;to connect the selfishness of individual hackers as
firmly as possible to difficult ends that can only be achieved by
sustained cooperation. With the fetchmail project I have shown (albeit
on a smaller scale) that his methods can be duplicated with good
results. Perhaps I have even done it a bit more consciously and
systematically than he.
	</text>
</object>
<object id="201">
	<ocn>201</ocn>
	<text class="norm">
		Many people (especially those who politically distrust free markets)
would expect a culture of self-directed egoists to be fragmented,
territorial, wasteful, secretive, and hostile. But this expectation is
clearly falsified by (to give just one example) the stunning variety,
quality, and depth of Linux documentation. It is a hallowed given that
programmers hate documenting; how is it, then, that Linux hackers
generate so much documentation? Evidently Linux's free market in egoboo
works better to produce virtuous, other-directed behavior than the
massively-funded documentation shops of commercial software producers.
	</text>
</object>
<object id="202">
	<ocn>202</ocn>
	<text class="norm">
		Both the fetchmail and Linux kernel projects show that by properly
rewarding the egos of many other hackers, a strong
developer/coordinator can use the Internet to capture the benefits of
having lots of co-developers without having a project collapse into a
chaotic mess. So to Brooks's Law I counter-propose the following:
	</text>
</object>
<object id="203">
	<ocn>203</ocn>
	<text class="indent1">
		19. Provided the development coordinator has a communications medium at
least as good as the Internet, and knows how to lead without coercion,
many heads are inevitably better than one.
	</text>
</object>
<object id="204">
	<ocn>204</ocn>
	<text class="norm">
		I think the future of open-source software will increasingly belong to
people who know how to play Linus's game, people who leave behind the
cathedral and embrace the bazaar. This is not to say that individual
vision and brilliance will no longer matter; rather, I think that the
cutting edge of open-source software will belong to people who start
from individual vision and brilliance, then amplify it through the
effective construction of voluntary communities of interest.
	</text>
</object>
<object id="205">
	<ocn>205</ocn>
	<text class="norm">
		Perhaps this is not only the future of open-source software. No
closed-source developer can match the pool of talent the Linux
community can bring to bear on a problem. Very few could afford even to
hire the more than 200 (1999: 600, 2000: 800) people who have
contributed to fetchmail!
	</text>
</object>
<object id="206">
	<ocn>206</ocn>
	<text class="norm">
		Perhaps in the end the open-source culture will triumph not because
cooperation is morally right or software "hoarding" is morally wrong
(assuming you believe the latter, which neither Linus nor I do), but
simply because the closed-source world cannot win an evolutionary arms
race with open-source communities that can put orders of magnitude more
skilled time into a problem.
	</text>
</object>
<object id="207">
	<ocn>207</ocn>
	<text class="h4">
		On Management and the Maginot Line
	</text>
</object>
<object id="208">
	<ocn>208</ocn>
	<text class="norm">
		The original Cathedral and Bazaar paper of 1997 ended with the vision
above&#8212;that of happy networked hordes of programmer/anarchists
outcompeting and overwhelming the hierarchical world of conventional
closed software.
	</text>
</object>
<object id="209">
	<ocn>209</ocn>
	<text class="norm">
		A good many skeptics weren't convinced, however; and the questions they
raise deserve a fair engagement. Most of the objections to the bazaar
argument come down to the claim that its proponents have underestimated
the productivity-multiplying effect of conventional management.
	</text>
</object>
<object id="210">
	<ocn>210</ocn>
	<text class="norm">
		Traditionally-minded software-development managers often object that
the casualness with which project groups form and change and dissolve
in the open-source world negates a significant part of the apparent
advantage of numbers that the open-source community has over any single
closed-source developer. They would observe that in software
development it is really sustained effort over time and the degree to
which customers can expect continuing investment in the product that
matters, not just how many people have thrown a bone in the pot and
left it to simmer.
	</text>
</object>
<object id="211">
	<ocn>211</ocn>
	<text class="norm">
		There is something to this argument, to be sure; in fact, I have
developed the idea that expected future service value is the key to the
economics of software production in the essay The Magic Cauldron.
	</text>
</object>
<object id="212">
	<ocn>212</ocn>
	<text class="norm">
		But this argument also has a major hidden problem: its implicit
assumption that open-source development cannot deliver such sustained
effort. In fact, there have been open-source projects that maintained a
coherent direction and an effective maintainer community over quite
long periods of time without the kinds of incentive structures or
institutional controls that conventional management finds essential.
The development of the GNU Emacs editor is an extreme and instructive
example; it has absorbed the efforts of hundreds of contributors over
15 years into a unified architectural vision, despite high turnover and
the fact that only one person (its author) has been continuously active
during all that time. No closed-source editor has ever matched this
longevity record.
	</text>
</object>
<object id="213">
	<ocn>213</ocn>
	<text class="norm">
		This suggests a reason for questioning the advantages of
conventionally-managed software development that is independent of the
rest of the arguments over cathedral vs. bazaar mode. If it's possible
for GNU Emacs to express a consistent architectural vision over 15
years, or for an operating system like Linux to do the same over 8
years of rapidly changing hardware and platform technology; and if (as
is indeed the case) there have been many well-architected open-source
projects of more than 5 years' duration, then we are entitled to
wonder what, if anything, the tremendous overhead of
conventionally-managed development is actually buying us.
	</text>
</object>
<object id="214">
	<ocn>214</ocn>
	<text class="norm">
		Whatever it is, it certainly doesn't include reliable execution by
deadline, or on budget, or to all features of the specification; it's a
rare "managed" project that meets even one of these goals, let alone
all three. Nor does it appear to be the ability to adapt to changes in
technology and economic context during the project lifetime;
the open-source community has proven far more effective on that score
(as one can readily verify, for example, by comparing the 30-year
history of the Internet with the short half-lives of proprietary
networking technologies&#8212;or the cost of the 16-bit to 32-bit
transition in Microsoft Windows with the nearly effortless upward
migration of Linux during the same period, not only along the Intel
line of development but to more than a dozen other hardware platforms,
including the 64-bit Alpha).
	</text>
</object>
<object id="215">
	<ocn>215</ocn>
	<text class="norm">
		One thing many people think the traditional mode buys you is somebody
to hold legally liable and potentially recover compensation from if the
project goes wrong. But this is an illusion; most software licenses are
written to disclaim even warranty of merchantability, let alone
performance&#8212;and cases of successful recovery for software
nonperformance are vanishingly rare. Even if they were common, feeling
comforted by having somebody to sue would be missing the point. You
didn't want to be in a lawsuit; you wanted working software.
	</text>
</object>
<object id="216">
	<ocn>216</ocn>
	<text class="norm">
		So what is all that management overhead buying?
	</text>
</object>
<object id="217">
	<ocn>217</ocn>
	<text class="norm">
		In order to understand that, we need to understand what software
development managers believe they do. A woman I know who seems to be
very good at this job says software project management has five
functions:
	</text>
</object>
<object id="218">
	<ocn>218</ocn>
	<text class="norm">
		To define goals and keep everybody pointed in the same direction
	</text>
</object>
<object id="219">
	<ocn>219</ocn>
	<text class="norm">
		To monitor and make sure crucial details don't get skipped
	</text>
</object>
<object id="220">
	<ocn>220</ocn>
	<text class="norm">
		To motivate people to do boring but necessary drudgework
	</text>
</object>
<object id="221">
	<ocn>221</ocn>
	<text class="norm">
		To organize the deployment of people for best productivity
	</text>
</object>
<object id="222">
	<ocn>222</ocn>
	<text class="norm">
		To marshal resources needed to sustain the project
	</text>
</object>
<object id="223">
	<ocn>223</ocn>
	<text class="norm">
		Apparently worthy goals, all of these; but under the open-source model,
and in its surrounding social context, they can begin to seem strangely
irrelevant. We'll take them in reverse order.
	</text>
</object>
<object id="224">
	<ocn>224</ocn>
	<text class="norm">
		My friend reports that a lot of resource marshalling is basically
defensive; once you have your people and machines and office space, you
have to defend them from peer managers competing for the same
resources, and from higher-ups trying to allocate the most efficient
use of a limited pool.
	</text>
</object>
<object id="225">
	<ocn>225</ocn>
	<text class="norm">
		But open-source developers are volunteers, self-selected for both
interest and ability to contribute to the projects they work on (and
this remains generally true even when they are being paid a salary to
hack open source). The volunteer ethos tends to take care of the
"attack" side of resource-marshalling automatically; people bring their
own resources to the table. And there is little or no need for a
manager to "play defense" in the conventional sense.
	</text>
</object>
<object id="226">
	<ocn>226</ocn>
	<text class="norm">
		Anyway, in a world of cheap PCs and fast Internet links, we find pretty
consistently that the only really limiting resource is skilled
attention. Open-source projects, when they founder, essentially never
do so for want of machines or links or office space; they die only when
the developers themselves lose interest.
	</text>
</object>
<object id="227">
	<ocn>227</ocn>
	<text class="norm">
		That being the case, it's doubly important that open-source hackers
organize themselves for maximum productivity by
self-selection&#8212;and the social milieu selects ruthlessly for
competence. My friend, familiar with both the open-source world and
large closed projects, believes that open source has been successful
partly because its culture only accepts the most talented 5% or so of
the programming population. She spends most of her time organizing the
deployment of the other 95%, and has thus observed first-hand the
well-known variance of a factor of one hundred in productivity between
the most able programmers and the merely competent.
	</text>
</object>
<object id="228">
	<ocn>228</ocn>
	<text class="norm">
		The size of that variance has always raised an awkward question: would
individual projects, and the field as a whole, be better off without
more than 50% of the least able in it? Thoughtful managers have
understood for a long time that if conventional software management's
only function were to convert the least able from a net loss to a
marginal win, the game might not be worth the candle.
	</text>
</object>
<object id="229">
	<ocn>229</ocn>
	<text class="norm">
		The success of the open-source community sharpens this question
considerably, by providing hard evidence that it is often cheaper and
more effective to recruit self-selected volunteers from the Internet
than it is to manage buildings full of people who would rather be doing
something else.
	</text>
</object>
<object id="230">
	<ocn>230</ocn>
	<text class="norm">
		Which brings us neatly to the question of motivation. An equivalent and
often-heard way to state my friend's point is that traditional
development management is a necessary compensation for poorly motivated
programmers who would not otherwise turn out good work.
	</text>
</object>
<object id="231">
	<ocn>231</ocn>
	<text class="norm">
		This answer usually travels with a claim that the open-source community
can be relied on only to do work that is "sexy" or technically
sweet; anything else will be left undone (or done only poorly) unless
it's churned out by money-motivated cubicle peons with managers
cracking whips over them. I address the psychological and social
reasons for being skeptical of this claim in Homesteading the
Noosphere. For present purposes, however, I think it's more interesting
to point out the implications of accepting it as true.
	</text>
</object>
<object id="232">
	<ocn>232</ocn>
	<text class="norm">
		If the conventional, closed-source, heavily-managed style of software
development is really defended only by a sort of Maginot Line of
problems conducive to boredom, then it's going to remain viable in each
individual application area for only so long as nobody finds those
problems really interesting and nobody else finds any way to route
around them. Because the moment there is open-source competition for a
"boring" piece of software, customers are going to know that it was
finally tackled by someone who chose that problem to solve because of a
fascination with the problem itself&#8212;which, in software as in
other kinds of creative work, is a far more effective motivator than
money alone.
	</text>
</object>
<object id="233">
	<ocn>233</ocn>
	<text class="norm">
		Having a conventional management structure solely in order to motivate,
then, is probably good tactics but bad strategy; a short-term win, but
in the longer term a surer loss.
	</text>
</object>
<object id="234">
	<ocn>234</ocn>
	<text class="norm">
		So far, conventional development management looks like a bad bet now
against open source on two points (resource marshalling, organization),
and like it's living on borrowed time with respect to a third
(motivation). And the poor beleaguered conventional manager is not
going to get any succour from the monitoring issue; the strongest
argument the open-source community has is that decentralized peer
review trumps all the conventional methods for trying to ensure that
details don't get slipped.
	</text>
</object>
<object id="235">
	<ocn>235</ocn>
	<text class="norm">
		Can we save defining goals as a justification for the overhead of
conventional software project management? Perhaps; but to do so, we'll
need good reason to believe that management committees and corporate
roadmaps are more successful at defining worthy and widely shared goals
than the project leaders and tribal elders who fill the analogous role
in the open-source world.
	</text>
</object>
<object id="236">
	<ocn>236</ocn>
	<text class="norm">
		That is on the face of it a pretty hard case to make. And it's not so
much the open-source side of the balance (the longevity of Emacs, or
Linus Torvalds's ability to mobilize hordes of developers with talk of
"world domination") that makes it tough. Rather, it's the demonstrated
awfulness of conventional mechanisms for defining the goals of software
projects.
	</text>
</object>
<object id="237">
	<ocn>237</ocn>
	<text class="norm">
		One of the best-known folk theorems of software engineering is that 60%
to 75% of conventional software projects either are never completed or
are rejected by their intended users. If that range is anywhere near
true (and I've never met a manager of any experience who disputes it)
then more projects than not are being aimed at goals that are either
(a) not realistically attainable, or (b) just plain wrong.
	</text>
</object>
<object id="238">
	<ocn>238</ocn>
	<text class="norm">
		This, more than any other problem, is the reason that in today's
software engineering world the very phrase "management committee" is
likely to send chills down the hearer's spine&#8212;even (or perhaps
especially) if the hearer is a manager. The days when only programmers
griped about this pattern are long past; Dilbert cartoons hang over
executives' desks now.
	</text>
</object>
<object id="239">
	<ocn>239</ocn>
	<text class="norm">
		Our reply, then, to the traditional software development manager, is
simple&#8212;if the open-source community has really underestimated the
value of conventional management, why do so many of you display
contempt for your own process?
	</text>
</object>
<object id="240">
	<ocn>240</ocn>
	<text class="norm">
		Once again the example of the open-source community sharpens this
question considerably&#8212;because we have fun doing what we do. Our
creative play has been racking up technical, market-share, and
mind-share successes at an astounding rate. We're proving not only that
we can do better software, but that joy is an asset.
	</text>
</object>
<object id="241">
	<ocn>241</ocn>
	<text class="norm">
		Two and a half years after the first version of this essay, the most
radical thought I can offer to close with is no longer a vision of an
open-source&#8211;dominated software world; that, after all, looks
plausible to a lot of sober people in suits these days.
	</text>
</object>
<object id="242">
	<ocn>242</ocn>
	<text class="norm">
		Rather, I want to suggest what may be a wider lesson about software
(and probably about every kind of creative or professional work). Human
beings generally take pleasure in a task when it falls in a sort of
optimal-challenge zone; not so easy as to be boring, not too hard to
achieve. A happy programmer is one who is neither underutilized nor
weighed down with ill-formulated goals and stressful process friction.
Enjoyment predicts efficiency.
	</text>
</object>
<object id="243">
	<ocn>243</ocn>
	<text class="norm">
		Relating to your own work process with fear and loathing (even in the
displaced, ironic way suggested by hanging up Dilbert cartoons) should
therefore be regarded in itself as a sign that the process has failed.
Joy, humor, and playfulness are indeed assets; it was not mainly for
the alliteration that I wrote of "happy hordes" above, and it is no
mere joke that the Linux mascot is a cuddly, neotenous penguin.
	</text>
</object>
<object id="244">
	<ocn>244</ocn>
	<text class="norm">
		It may well turn out that one of the most important effects of open
source's success will be to teach us that play is the most economically
efficient mode of creative work.
	</text>
</object>
<object id="245">
	<ocn>245</ocn>
	<text class="h4">
		Epilog: Netscape Embraces the Bazaar
	</text>
</object>
<object id="246">
	<ocn>246</ocn>
	<text class="norm">
		It's a strange feeling to realize you're helping make history....
	</text>
</object>
<object id="247">
	<ocn>247</ocn>
	<text class="norm">
		On January 22, 1998, approximately seven months after I first published
The Cathedral and the Bazaar, Netscape Communications, Inc. announced
plans to give away the source for Netscape Communicator. I had had no
clue this was going to happen before the day of the announcement.
	</text>
</object>
<object id="248">
	<ocn>248</ocn>
	<text class="norm">
		Eric Hahn, executive vice president and chief technology officer at
Netscape, emailed me shortly afterwards as follows: "On behalf of
everyone at Netscape, I want to thank you for helping us get to this
point in the first place. Your thinking and writings were fundamental
inspirations to our decision."
	</text>
</object>
<object id="249">
	<ocn>249</ocn>
	<text class="norm">
		The following week I flew out to Silicon Valley at Netscape's
invitation for a day-long strategy conference (on 4 Feb 1998) with some
of their top executives and technical people. We designed Netscape's
source-release strategy and license together.
	</text>
</object>
<object id="250">
	<ocn>250</ocn>
	<text class="norm">
		A few days later I wrote the following:
	</text>
</object>
<object id="251">
	<ocn>251</ocn>
	<text class="indent1">
		Netscape is about to provide us with a large-scale, real-world test of
the bazaar model in the commercial world. The open-source culture now
faces a danger; if Netscape's execution doesn't work, the open-source
concept may be so discredited that the commercial world won't touch it
again for another decade.
	</text>
</object>
<object id="252">
	<ocn>252</ocn>
	<text class="indent1">
		On the other hand, this is also a spectacular opportunity. Initial
reaction to the move on Wall Street and elsewhere has been cautiously
positive. We're being given a chance to prove ourselves, too. If
Netscape regains substantial market share through this move, it just
may set off a long-overdue revolution in the software industry.
	</text>
</object>
<object id="253">
	<ocn>253</ocn>
	<text class="indent1">
		The next year should be a very instructive and interesting time.
	</text>
</object>
<object id="254">
	<ocn>254</ocn>
	<text class="norm">
		And indeed it was. As I write in mid-2000, the development of what was
later named Mozilla has been only a qualified success. It achieved
Netscape's original goal, which was to deny Microsoft a monopoly lock
on the browser market. It has also achieved some dramatic successes
(notably the release of the next-generation Gecko rendering engine).
	</text>
</object>
<object id="255">
	<ocn>255</ocn>
	<text class="norm">
		However, it has not yet garnered the massive development effort from
outside Netscape that the Mozilla founders had originally hoped for.
The problem here seems to be that for a long time the Mozilla
distribution actually broke one of the basic rules of the bazaar model;
it didn't ship with something potential contributors could easily run
and see working. (Until more than a year after release, building
Mozilla from source required a license for the proprietary Motif
library.)
	</text>
</object>
<object id="256">
	<ocn>256</ocn>
	<text class="norm">
		Most negatively (from the point of view of the outside world) the
Mozilla group didn't ship a production-quality browser for two and a
half years after the project launch&#8212;and in 1999 one of the
project's principals caused a bit of a sensation by resigning,
complaining of poor management and missed opportunities. "Open source,"
he correctly observed, "is not magic pixie dust."
	</text>
</object>
<object id="257">
	<ocn>257</ocn>
	<text class="norm">
		And indeed it is not. The long-term prognosis for Mozilla looks
dramatically better now (in November 2000) than it did at the time of
Jamie Zawinski's resignation letter&#8212;in the last few weeks the
nightly releases have finally passed the critical threshold to
production usability. But Jamie was right to point out that going open
will not necessarily save an existing project that suffers from
ill-defined goals or spaghetti code or any of software
engineering's other chronic ills. Mozilla has managed to provide an
example simultaneously of how open source can succeed and how it could
fail.
	</text>
</object>
<object id="258">
	<ocn>258</ocn>
	<text class="norm">
		In the meantime, however, the open-source idea has scored successes
and found backers elsewhere. Since the Netscape release we've seen a
tremendous explosion of interest in the open-source development model,
a trend both driven by and driving the continuing success of the Linux
operating system. The trend Mozilla touched off is continuing at an
accelerating rate.
	</text>
</object>
</body>
</document>

