Extending the World Wide Web for Writing and Viewing Collaborative Hypermedia

by Philip Greenspun (philg@mit.edu)

INTRODUCTION

"You make me sound like a world-class caustic nympho bitch," wrote one woman about her portrayal in Travels with Samantha, a travelogue I wrote for the Web [Greenspun 1993a]. This was much funnier than anything I'd written about her and it made me yearn for a system that would support a single author and hundreds of commenters to produce a collaborative book.

I envision an author writing something and putting it up on the Web. Comments come back as proposed transactions to the book (corrections, additions, sidenotes, footnotes, linked articles). The author can quickly approve or discard transactions, much as he or she would deal with email. For example, a spelling correction would come in tagged as such and the act of typing "a" when the correction was in front of the author would cause the manuscript to be changed on disk. A footnote would similarly be inserted into the text automatically and the other footnotes would be renumbered.

Assuming the existence of an annotated manuscript, we need a viewer so that a reader can enjoy the author's primary text and look at comments and supplements in a convenient manner when desired. Furthermore, if there is to be a reliable stream of new annotations, the viewer must provide a mechanism for a reader to make a very precise annotation, i.e., to specify which paragraph or sentence a comment addresses, which word's spelling is being corrected, etc.

If this were only useful for travelogues and travel guidebooks, it would still be worth building. After all, no guidebook author is going to be as expert on every cathedral in Italy as a Harvard student who did an art history thesis on the one in Assisi. A guidebook that combined the insights of all the world's leading experts in a convenient manner would be very valuable indeed.
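As a rough illustration of the transaction idea described in this introduction, here is a minimal Python sketch of approving a spelling correction and a footnote. Everything in it is an assumption made for illustration: the names Manuscript, Transaction, approve, and numbered_footnotes are hypothetical, the manuscript lives in memory rather than on disk, and the sample sentences are made up. Footnote numbers are derived from position, which is one simple way to get the renumbering for free.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Manuscript:
    """Hypothetical in-memory manuscript: paragraphs plus anchored footnotes."""
    paragraphs: List[str]
    # Each footnote is (paragraph index, text), kept in reading order.
    footnotes: List[Tuple[int, str]] = field(default_factory=list)


@dataclass
class Transaction:
    """A reader's proposed change (field names are illustrative)."""
    kind: str        # "correction" or "footnote"
    paragraph: int   # zero-based index of the paragraph concerned
    new: str         # replacement text, or the footnote body
    old: str = ""    # text to be replaced (corrections only)


def approve(ms: Manuscript, tx: Transaction) -> None:
    """Apply an approved transaction; a discarded one is simply never applied."""
    if tx.kind == "correction":
        para = ms.paragraphs[tx.paragraph]
        if not tx.old or tx.old not in para:
            raise ValueError("text to correct not found in paragraph")
        ms.paragraphs[tx.paragraph] = para.replace(tx.old, tx.new, 1)
    elif tx.kind == "footnote":
        ms.footnotes.append((tx.paragraph, tx.new))
        # Stable sort by paragraph index keeps footnotes in reading order.
        ms.footnotes.sort(key=lambda note: note[0])
    else:
        raise ValueError(f"unknown transaction kind: {tx.kind}")


def numbered_footnotes(ms: Manuscript) -> List[Tuple[int, str]]:
    """Numbers come from position, so every insertion renumbers the rest."""
    return [(n, text) for n, (_, text) in enumerate(ms.footnotes, start=1)]


# Made-up sample text, for illustration only.
book = Manuscript(["We drove nroth to Assisi.", "The cathedral was closed."])
approve(book, Transaction("footnote", 1, "Opening hours vary."))
approve(book, Transaction("correction", 0, "north", old="nroth"))
approve(book, Transaction("footnote", 0, "See the map."))
```

In this sketch the footnote that was approved first ends up as number 2 once a footnote on an earlier paragraph is approved. A real authoring tool would also have to write the changed manuscript back to disk and cope with corrections whose target text has since moved.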
However, collaborative books of the kind described above are not only good for guidebooks and travelogues. I'm convinced that they are the right medium for literary criticism. Have you ever tried to read The Sound and the Fury [Faulkner 1946]? It is a lot clearer if you have three books of professorial criticism on the novel open at the same time. However, turning the pages of four books simultaneously isn't much fun. If one were able to view the primary text and annotations by experts in a synchronized manner, that would be immensely more convenient.

Naturally, we do not want to force the author, readers, and annotators to be on the same computer system. Hence we are talking about a wide area hypertext system.

It is all well and good to build the world's nicest wide area hypertext system, but realistically nobody is going to use it unless it is compatible with the World Wide Web or perhaps sold by Microsoft. As I was wasting my time programming MIT Lisp Machines when Bill Gates was writing BASIC and buying MS-DOS, I'm going to limit this paper to what we need to add to Web browsers and clients to make collaborative books practical.

THE WEB TODAY

Churchill's famous quotation, usually rendered as "We are near the bleak choice between shame and war. We shall choose shame and get war" [Gilbert 1991], best describes our current situation with HTML and the Web.

HTML represents the worst of two worlds. We could have taken a formatting language and added hypertext anchors so that users had beautifully designed documents on their desktops. We could have developed a powerful document structure language so that browsers could automatically do intelligent things with Web documents. What we have got with HTML is ugly documents without formatting or structural information. Because HTML has only a single directive for separating paragraphs, many modern novels cannot even be rendered readable, much less attractive.
For example, consider The English Patient [Ondaatje 1992]. Although its narrative style is about as unconventional as you'd expect for a Booker Prize winner, The English Patient is formatted very typically for a modern novel. Sections are introduced with a substantial amount of whitespace (3 cm), a large capital letter about twice the height of the normal font, and the first few words in small caps. Paragraphs are not typically separated by vertical whitespace as in Mosaic but by their first line being indented about three characters. (This makes dialog much easier to read than in Mosaic, by the way, where whitespace cuts huge gaps between short sentences and breaks the flow of dialog.) Chronological or thematic breaks are denoted by vertical whitespace between paragraphs, anywhere from one line's worth to a couple of centimeters. If the thematic break has been large, it gets a lot of whitespace and the first line of the next paragraph is not indented. If the thematic break is small, it gets only a line of whitespace and the first line of the next paragraph is indented.

The English Patient is not an easy book to read in paperback. It would become, however, a virtually impossible book to read in Mosaic because neither the author's nor the book designer's intents are expressible in HTML.

Before we worry about how to do an annotated version of The English Patient with criticism from all over the Internet, we should make sure that HTML is expressive enough so that a browser can render the primary text readable. Let's assume for the sake of argument that this has been accomplished somehow.

WHAT KINDS OF ANNOTATIONS DO WE NEED?

The simplest annotation is an essay on the entire work. Examples of this may be seen in the "Other Voices" section of [Greenspun 1993b].

The correction either complains about an error in a specific portion of a work or proposes that a specific block of text be replaced with a supplied new block.
The footnote is a pointer from a word, sentence, or paragraph to a substantial work that is somehow related to the specified word, sentence, or paragraph.

The sidenote is an argument with, amplification of, or comment on a word, sentence, or paragraph. Each sidenote should arrive classified as one of these three so that readers can choose selectively among a large group of comments.

We will want recursion so that, for example, it is possible to add a footnote or a sidenote to an existing sidenote.

HOW IS THE READER TO SEPARATE THE WHEAT FROM THE CHAFF?

Most people who have experienced free-for-all discussions on USENET newsgroups do not describe the experience as a happy one. However, all of the mechanisms that have been invented and proposed for taming the USENET monster may be employed here. If you really hate that stuffy Yale professor's interpretation of Hamlet, press a button and you'll never even be offered a comment from him again (though that comment would still be available to you on an "all comments" menu option). If you have generally liked comments from an intelligent person at Bell Labs, that person's annotations will be given greater prominence. If you are just starting out with the Web and have never expressed any preferences, but know that your friend Chantal Wright has exquisite taste, you can ask to have her preferences used as a starting point.

The best guarantee that noise won't overwhelm the signal is that the entire collaborative book is under the control of a single author. A comment that is completely ill-informed or off the topic won't go in. This isn't exactly in the best Internet spirit, but hey, who wrote the book to begin with?

In any case, Internet is fairly safe from domination by any one person. Suppose that a Democratic member of Congress wrote a book called Wounded Dove about the evils of drugs and how his appropriations for the DEA and prisons were helping to win the War on Drugs.
Suppose then that he invited constituents to submit comments and experiences and included those that supported the War on Drugs but excluded comments from people who liked drugs and from Libertarians opposed to this role for the government. Nothing would stop any Internet user from compiling the rejected comments into Comments on Wounded Dove, which would become available to readers who searched for "Wounded" in a Web crawler [Pinkerton 1994].

PRESENTATION (IS ONE WINDOW ENOUGH?)

Mosaic has one window. If you want to look at a footnote or an annotation or a link to anywhere else, the primary text disappears. This is unacceptable for several reasons. First, one may wish to look at the primary text and an annotation simultaneously without having to operate two or more browsers simultaneously and perform window system gymnastics. Second, it is an established principle of human interface design that users value visual stability [Apple Computer 1992]. There is nothing stable about having the primary text disappear every few seconds as annotations are checked out. Finally, it is important to have visual cues to separate annotations from the primary text. They should be in different fonts or different windows or different positions or something.

I think that Mosaic needs to be extended in at least three ways to facilitate presentation of collaborative books. First, there should be a "comment bar" along the left or right edge of the window indicating the presence of comments. This way, the flow of the text isn't interrupted by little GIFs or too many anchors. If a comment is really on a specific word, then perhaps that word will have to be rendered specially, but sentence and paragraph comments shouldn't interrupt the flow of the text. Second, footnotes should pop up in a window near the bottom of the screen and occlude as little of the primary text as possible, and sidenotes should show up in scrollable windows to the left or right of the main Mosaic window.
If the user really gets interested in a sidenote, he should have the option of moving that text into the main window and replacing the primary text. Third, looking up words in a dictionary or encyclopedia should be easy even if those words aren't themselves hypertext anchors (this isn't something for collaborative books per se, but is sorely needed). Users should be able to set a default dictionary or encyclopedia and then click the mouse on a word to have its definition pop up in a side window. Special documents, e.g., Beowulf, might supply dictionary URLs that override the user's default.

EXTENDING THESE IDEAS TO DOCUMENTS WITH MULTIPLE AUTHORS

Although it can be argued that what I propose is not a true collaborative work because one author retains control at all times, most of my ideas for authoring can be extended to the case of multiple authors and all of my ideas about presentation can be extended as well.

When the document is created, the authors can decide amongst themselves how much control is to be exercised. Choices include total chaos (anyone can add or remove anything), club chaos (anyone within a club can add or remove anything; changes from the outside require approval of the club), democratic chaos (changes need to be voted on), etc. Structured email messages [Malone 1987] suffice to handle voting and approval. The principal difference from the single author case is that changes may be made in the "primary text" as well as annotations being added in sidenotes.

When the document is presented, users within the "author club" gain the authority to propose changes in the primary text, but otherwise the look of the annotated document doesn't change much.

STICKING ANNOTATIONS INTO HTML WITHOUT BREAKING CURRENT BROWSERS

It isn't pretty, but it is possible to put arbitrary information into the HEAD of an HTML level 2 document. Current versions of Mosaic simply ignore those extra META tags.
Tags of the form (add-annotation :paragraph 8 :type :sidenote :text "This cathedral was actually started in 1346 but a war between the city-states of ...") would enable a sufficiently powerful browser to add all the extra hooks I talked about above.

CONCLUSION

Somebody should extend a Web browser and build some authoring tools so that we can start experimenting with collaborative books, an exciting new form of literature that is only possible with Internet.

ACKNOWLEDGMENTS

The Advanced Computing Laboratory of Los Alamos National Laboratory, Los Alamos, NM 87545, provided me with ideas and a glorious array of beautifully maintained computers. I am particularly grateful to Ron Daniel, David Forslund, and Jerry Delapp for making my summer at the ACL productive.

Years of support and stimulation from the MIT AI Lab and Laboratory for Computer Science have left an indelible stamp on my brain and it is impossible for me to remember who gave me which idea anymore. Some of them might even be my own!

REFERENCES

Apple Computer 1992. Macintosh Human Interface Guidelines. Addison-Wesley.

Faulkner, William 1946. The Sound and the Fury & As I Lay Dying. Modern Library, New York.

Gilbert, Martin 1991. Churchill: A Life. Henry Holt & Company, New York, page 595.

Greenspun, Philip 1993a. Travels with Samantha. http://www-swiss.ai.mit.edu/samantha/travels-with-samantha.html

Greenspun, Philip 1993b. Berlin and Prague: Nazis, Jews, stamp collectors, and beautiful women. http://www-swiss.ai.mit.edu/philg/berlin-prague/book-cover.html

Malone, Thomas W., Grant, Kenneth R., Lai, Kum-Yew, Rao, Ramana, and Rosenblitt, David 1987. "Semistructured Messages are Surprisingly Useful for Computer-Supported Coordination." ACM Transactions on Office Information Systems, 5, 2, pp. 115-131.

Ondaatje, Michael 1992. The English Patient. Vintage International, New York.

Pinkerton, Brian 1994. The Web Crawler. http://www.biotech.washington.edu/WebCrawler/WebCrawler.html