Part Two: Technical Issues, Strategy, and Implementation

Whereas Part One described itself as being about "content, not technology," Part Two is about "technology, not content," and while the first part was intended as a "rhetorical goad," Part Two is intended to be a cattle prod: electric and an immediate realization of the abstractions of rhetoric.

With these lofty goals in mind, I will present here an overview of the Processed Book implementation, followed by a top-down view of the various thoughts and issues that drove the design, culminating in some discussion of the design decisions and trade-offs.

Concepts and Facilities

Let's begin by presenting an overview of the more salient aspects of the Processed Book implementation.  If you've already sampled the Processed Book online, you may want to skip this section.  If you haven't yet used the Processed Book, this section provides the essential technical background, introduces the terminology, and summarizes the concepts and facilities that are referred to throughout the rest of this paper.

Overview

PBOS (Processed Book Operating System, or Processed Book Open Source, depending on the lunar phase) is an implementation of some of the ideas in the original essay "The Processed Book".  PBOS provides a very rich context for adding value to a Web-based book through multi-user comments, various kinds of connections, and analytical and visualization tools.  The results can range from a few, formal review notes for an author, to a crayonesque markup by mobs of postmodern deconstructors, with the visibility of each controlled by the viewer. 

This implementation (i.e., PBOS) is focused on providing very general and extensible facilities for medium to large-scale documents, which are useful for all of the applications mentioned or hinted at in the original paper.

The aim of this section is to provide a comprehensive overview of the implementation's concepts and facilities, both for beginning users and as a basis for informed discussions.  By highlighting the more unusual features and their utility, I hope to give some sense of how PBOS differs operationally from wikis, blogs, and Web pages.

From a practical point of view, the aim of PBOS itself is to be a complete and extensible environment for the creation, maintenance, review, and distribution of Processed Books.  Hence the acronymic conceit of "Operating System."

Please note that while PBOS should operate on any standards-compliant Web browser, it makes extensive use of JavaScript and so far has been tested for reliability only with Internet Explorer 6.0 and Firefox 1.x on Microsoft Windows, and with Safari 2.0.1 on Mac OS X.

PBOS Concepts

The heart of PBOS is simple and pure:

Beyond its conceptual heart, the circulatory system of PBOS is Annotations, the quintessential element of PBOS, implementing the various aspects of the Processed Book in the several different types of Annotations:

PBOS Tools

Annotations are added by simply selecting text in the Book, selecting an Annotation type from the resulting menu, and filling the resulting dialog box.

Clicking on the text icons in the right margin column allows operations on the associated Annotations.  A normal (left, on Windows) click either opens a window to view the Annotation's data, or follows the associated link.

Right clicking brings up a menu that allows editing, deletion, or movement of the Annotation.

Besides all the usual suspects for movement and maintenance in Books, text and Annotations, PBOS provides some more exotic operations on entire books:

This allows sophisticated lexical analysis, as well as marking points of interest in the Book.

This allows quick visual filtering for information encoded by the Annotations.
This is primarily useful to extract and analyze complex Annotations from large Books, or for selecting Annotations for editing.
Extensions are useful as a way to add semantics to PBOS so that new kinds of processes can be accommodated.
Under Library, you can access the expected operations related to whole Books; the most interesting is:

Perspectives

Introduction

This section tracks the initial and evolving discussions of what a Processed Book might be and also introduces ideas that will recur throughout this paper.  We also used these perspectives to stimulate discussion of capabilities, features, and user interface ideas, and to avoid binding ourselves to specific solutions too early in the design process.

One recurring theme is the distinctions among structure, syntax, and semantics.  This discussion arose because there's a tendency to assume that software can discern meaning, when in fact it can generally only comprehend structure and syntax.  The problem with this is that, of course, meaning is what matters to a human being—to such a degree that we even unconsciously correct lexical and syntactic errors.  Our approach was to try to capture some faint ghosts of the actual semantics in syntactical and structural ways and to add features that focused on these reflections of semantics.

Another recurring theme is "semantic nets", where we are using the term in its technical sense.  The core idea of a semantic net is that the links among the items have some iota of meaning, typically by indicating some relationship between the linked elements.  This means that in a consistently encoded document you can extract all things related in some common way.  In some senses, such an extraction represents a very wimpy and noisy form of semantics, but it at least encodes some sense of meaning in a way that allows for useful analysis.
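To make the idea of labelled links concrete, here is a minimal sketch in Python (PBOS itself is written in PHP, so this is an illustration rather than PBOS code).  The class name, the relationship labels, and the item identifiers are all hypothetical.

```python
from collections import defaultdict


class SemanticNet:
    """A toy semantic net: items joined by links that carry a relationship label."""

    def __init__(self):
        # relationship label -> list of (source, target) pairs
        self._links = defaultdict(list)

    def link(self, source, relation, target):
        """Record that `source` stands in `relation` to `target`."""
        self._links[relation].append((source, target))

    def related(self, relation):
        """Return every (source, target) pair joined by the given relationship."""
        return list(self._links.get(relation, []))


# Hypothetical notes attached to hypothetical chapters.
net = SemanticNet()
net.link("note-1", "critiques", "chapter-2")
net.link("note-2", "supports", "chapter-2")
net.link("note-3", "critiques", "chapter-5")

# Extract all things related in one common way.
critiques = net.related("critiques")
print(critiques)
```

Because the relationship is stored with each link, the extraction is a simple lookup; all of the "meaning" lies in how consistently the labels were applied.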

Historical/Distinctions

Of course, the ultimate progenitor of the Processed Book is the "book" itself.  Asimov's 1973 essay, "The Ancient and the Ultimate", extols it at length, making it abundantly clear that the utility, user interface, portability, random access capabilities, etc. of the traditional book far exceed those of the (then) cutting-edge video cassette; the book remains the standard by which we should judge its replacements.  This better-than-retro thinking raised two questions for us: how can we be "better than a book", and how do we keep the virtues of the traditional book intact?

The second progenitor of the Processed Book is so obvious that it's easily overlooked: the World Wide Web.  In many ways the Processed Book is simply a constrained Web site, where a nearly unchangeable core document can be surrounded by a plethora of annotations and links that add value, without creating the confusion inherent in the highly dynamic documents characteristic of a typical Web site.

One important question is, How is a Processed Book different from the other forms of Net media that interconnect tightly related content?  Specifically, we were at pains to understand how our project would differ from the wiki and blog.

Functionally, we viewed blogs as a completely distinct form, both because they are almost invariably narrative (as opposed to the tree/network composition of the Processed Book) and because little structure is imposed upon them.

Wikis are distinct from our implementation of the Processed Book in that they are extremely changeable (see Part One for additional discussion of this point).

The closest implementation to our concept of the Processed Book is the widely used Slashdot site, which imposes structure and has unchangeable core documents at its heart.  Those core documents are typically very small, however, and Slashdot does not allow annotation of individual pieces of them.

"Context is King"

Media pundits were off by only a single letter in declaring that "content is king".  The interesting thing here is that in many ways the Processed Book adds context around the core document, adding value to it, but still letting it retain its original semantics as a standalone document.

The irony is that content devoid of meaning turned out to have little operational value on the Internet, while the context that surrounds content, in the form of site organization or the credibility of a news organization, is perceived to have a lot of value.  It's our conceit that the Processed Book might have a similar effect in that it provides a great deal of context around a particular book.

The Identity Problem

At the risk of recursively making this document itself into a Processed Book, here's a quotation from an early e-mail among the Processed Book team about the identity problem:

"It struck me as odd that you started from the view of a traditional, non-electronic book (even in your choice of words), rather than calling it, say, "The Information Nexus—Processed Book as Metaphor", to emphasize that you're not talking about books at all. 

I now understand you're deliberately tantalizing your intended audience with a phrase that would induce domestic visions of tomes and Cuisinarts, while jacking their brains into the CyBerPuNk library of yesterday's future…

I think, however, that there's a fundamental issue of "identity" that must be somehow recognized: an actual book's physical and informational identities are, literally (and literarily), bound between its covers, and this notion is broken in a most fundamental way when its information is dissociated from its paper."

This raises the point that a book is not about paper and covers, but really about the meaning of what lies inside, the content.  By placing this meaning in an interactive and extensible medium, the boundary is blurred between the book-as-physical-object and the book-as-content.  We ultimately resolved this problem by making it difficult to change the "core document" so that its identity was (mostly) preserved.

User control

In some of our early discussions, there were ideas that involved things happening outside the control of the user, so that the user became a passive viewer, much as in watching television.  We were unhappy with this, so as an antidote to this pernicious tendency toward passivity we put a great deal of emphasis on user control: for instance, the ability to control how the annotations appear in the margin, or even whether they appear at all.

A recent episode of the popular CSI television show, however, made an interesting point about the value of automatically controlled context.  The opening shot was of a sea-bottom shipwreck, apparently old, followed by a fairly rapid zoom back to a satellite shot of a hurricane, making it clear that the wreck was current, and emphasizing that semantics can be powerfully enhanced, or even changed, by an automatically controlled view of context.  We didn't find a way to implement this particular feature in our context, but we wonder whether someone will work on it using the PBOS open source code (this is one of many things we would like to see people develop).

Anchored Context

As we progressed in our discussions, it became clear that a substantial value of the Processed Book was as a sort of "inverse context": the core document (the "book") provided the context for the surrounding material.  This meant that the book could be viewed in the context of all of its critiques, homages, corrections, and history, among many other things.

This ability to have a static core surrounded by a dynamic commentary, all of which was accessible to a reader, seemed to be an important aspect of the Processed Book and greatly influenced the later design.

GutenBorg: the "printing press" for the Processed Book

Lastly, we considered naming the project "GutenBorg", for its marriage of tradition and technology, since we were in effect providing the printing press for the Processed Book, the mechanism by which the content is made available to the reader.

But the implementation is really even more ambitious in some respects than a printing press: through its annotations it's the pen for critics, the warehouse (where books are stored before sale), the bookstore, the collated margin notes from all readers ambitious enough to produce them, and the ultimate archival library, accessible anywhere on earth where a few kilobits per minute can be brought to a screen.  The printing press is a means of production, but PBOS creates the supply chain and life-cycle environment in which a book resides.

In one sense this is just the usual statement that the Internet dis-intermediates, but in another, it helped us to view the Processed Book as some inevitable aspect of the future: it's made inevitable by the virtual elimination of the distance from author to reader, and the addition of an ever-growing value of surrounding context.

Aspects

In the original essay "The Processed Book," five aspects of the Processed Book were defined.  Let's review here how we translated these ideas into the online implementation.

The Book as Portal

The idea of a portal mostly translates into outbound links, either to specific external content or to places within external content.  The portal idea views the book as being at the center of a network and surrounded by connected content, with most of that content being static.

For the static links, this is largely implemented by the outbound links facility, but we also extended the portal idea into the realm of dynamic content by integrating the BizVantage service with dynamic content determined moment-by-moment by the users' interests.

The Book as Self-Referencing Text

This encompasses several concepts, the simplest of which is that a book contains either simple or sophisticated self-references.  A simple self-reference might be an index or even a table of contents.  A more complex form of self-reference would be a dictionary that requires that every word that occurs in the dictionary also have a definition in that same dictionary.  A yet more complex aspect of this is the idea that the book includes some form of lexical, syntactic, or semantic analysis of itself as part of its text.  It's not sufficient that there be just the capability for analysis; for the term "self-referencing" to apply, the results of analysis actually have to be included in the book.  Presumably these results cannot analyze themselves, to avoid a recursive descent into semantic hell.

There is a literal sense in which we haven't implemented this internal self-referencing at all, since all of the mechanisms we provide reside outside the text of the book itself, which remains relatively sacrosanct in our implementation.  However, we have in spirit implemented the capability to do fairly complex and even slightly semantic analyses, using the "dissect text" capability, which draws on the WordNet software to use its semantic net of relationships to find semantically related words/phrases.

The Book as Platform

This aspect of the Processed Book is primarily about annotations, either added to the basic book or as inbound links from other content.  Note that the term "metatag" in the original paper is used in a very different way than it is used in HTML.

We've provided two implementations of this: "inbound links" that allow linking from other sites to specific points within the Book; and the "note" annotation type, which allows arbitrary, and recursively annotated, notes to be added or created.

The Book as Machine Component

The concept here is that of the machine readability, and processability, of the base content, across a wide range of viewpoints: semantic, syntactic, and structural.

We've implemented this with the "dissect text" capability: PBOS enables a WordNet-based lexical analysis to extract paragraphs, sentences and phrases, and either get simple statistics or do additional processing.  We also allow automatic annotations, marking each occurrence of the words and phrases found by the analysis as Annotations.
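The structural half of this kind of analysis can be sketched in a few lines.  The following Python is only an illustration under stated assumptions: it uses plain regular expressions rather than WordNet, the function name `dissect` is hypothetical, and the sample text is invented.

```python
import re
from collections import Counter


def dissect(text):
    """Split text into paragraphs and sentences and count word occurrences."""
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    sentences = []
    for paragraph in paragraphs:
        sentences.extend(s for s in re.split(r"(?<=[.!?])\s+", paragraph) if s)
    # Simple statistics: case-insensitive word frequencies.
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "paragraphs": paragraphs,
        "sentences": sentences,
        "word_counts": Counter(words),
    }


sample = "The river was wide. The raft was small.\n\nHuck watched the river."
result = dissect(sample)
print(len(result["paragraphs"]), len(result["sentences"]))
print(result["word_counts"].most_common(3))
```

A real implementation would need a more careful sentence splitter (abbreviations, quotations), which is part of why leaning on existing lexical software is attractive.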

In addition, we allow analysis of the annotations themselves so that it's possible to get either simple statistics or to run arbitrarily complex queries to count and/or extract specific annotations, or annotations having specific properties.  For instance, the user can get a list (or just counts) of all Annotations with the word "critique" in the Annotation text.
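The "critique" query might look something like the following sketch.  It is written in Python over an in-memory list purely for illustration; the `Annotation` fields and the sample data are assumptions, and PBOS itself runs such queries against its database.

```python
import re
from dataclasses import dataclass


@dataclass
class Annotation:
    kind: str    # hypothetical annotation type name, e.g. "note"
    author: str
    text: str


def find_annotations(annotations, word):
    """Return the annotations whose text contains `word` as a whole word."""
    needle = word.lower()
    return [
        a for a in annotations
        if needle in re.findall(r"[a-z']+", a.text.lower())
    ]


# Invented sample data.
annotations = [
    Annotation("note", "ann", "A critique of the opening argument."),
    Annotation("note", "bob", "Supporting evidence for this claim."),
    Annotation("outbound link", "ann", "See the longer critique, linked here."),
]

matches = find_annotations(annotations, "critique")
print(len(matches))                  # just the count
print([a.author for a in matches])   # or the list itself
```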

The Book as Network Node

This concept potentially subsumes all of the aforementioned aspects of the Processed Book, but in addition to that, PBOS also allows the normal forms of HTML linking to be included in the core document.

Utility

At a global level, one way to approach user interface design is to try and define the utility or value of the application in various areas where it might be used.  This avoids falling into the "local motherhood" trap of designing each element of the interface to local perfection, while ignoring the overall context of use.

For academic or scholarly use, we saw two places of value for PBOS.  The first is in supporting the discourse that occurs in this environment, where critical and thoughtful comments can be collected and assimilated in a single shared reference location.  In this context one might think of the Processed Book as an ongoing colloquium about the core book itself.

The second use is more about the book being the center of discourse than about the deliberate collection of discussions.  The idea is that the Processed Book can become a center of knowledge, linked to and from related materials, and subject to annotation from a wide variety of sources—including those far beyond the range of traditional academic discussion.

For commercial use, which is likely to develop after the open source code gets downloaded by curious developers, we believe that the Processed Book is primarily useful for dynamic documents, where a relatively large document such as a corporate strategy or new product marketing plan can be rapidly annotated, while still preserving the original document.  The corporate strategy document is our touchstone here, because turning strategy into tactics is exactly where the concept of adding annotations to an unchanging core document adds substantial value in a way not addressed by more conventional tools.

We also considered, but didn't follow up on, the idea of using the Processed Book context as a support tool for consensus decision-making, whether in the commercial or not-for-profit spheres.  To make this work we would need to add some capabilities to allow "voting" on decisions, which was beyond the scope of the current project, although we did look for something "off the shelf" that we could bolt on to PBOS.  Perhaps other developers will use the open source code to add this functionality.

For technical use, far and away the best application seems to be for technical documents under heavy use, where it is important to retain the invariant base document, but also crucial to collect notes on what is actually being done, discoveries about deficiencies in the core document, etc.

Model Uses

To have some touchstone applications on which we could try out ideas about user interface and relevant capabilities, we also considered for PBOS a number of typical uses, as opposed to classes of user:

We expect one of the most common uses will be the adding of personal notes to a document stored under PBOS for personal use, perhaps for writing reports or reviews of the content of the PBOS book.

Operationally this makes ease of use and learning key.  Also critical is the ability to extract and organize annotations outside of the core document.  We anticipate that this would often require access by small groups of people; thus the design of any access control mechanisms would have to accommodate small groups as well as individuals.

Another likely use would be as groupware, especially in environments where a small to medium-size group might be attempting to develop a medium to large-scale document through continuous review and critique.  This implied making PBOS useful as a form of groupware, but we didn't pursue this very far, as groupware is already a well-developed area.

In this context (that is, collaborative software), there might also be an interest in some consensus form of annotation or correction, a view that was reinforced both by the current popularity of James Surowiecki's book The Wisdom of Crowds and by previous exposure to the Delphi method of group decision-making.  But again this verged on groupware, so we elected not to follow this path.

Technical documents are another likely area of application.  PBOS could be used to develop such documents from scratch and especially in critiquing them, with the goal of iterative improvement.  Yet again, this led us in the direction of groupware, so we elected to focus primarily on the ability to add critical and supporting commentary to a pretty much fixed core document.

It's interesting to note that many of these uses involved the iterative refinement of the document, which really wasn't part of the original Processed Book concept.  Once you have the ability to interact with a document and make annotations, it becomes very natural to want to reflect those annotations in the base document.  (This matter arose for Joe Esposito when we began working on the project together.  As our ideas developed, he wanted to go back to the original "Processed Book" essay and revise, but he held off—reluctantly—in favor of making annotations on the essay instead.)

Task-oriented

Reader/Annotator/Author Model

In software design, it's often useful to have mental models of different kinds of uses that will be made of the product.  This should have a major influence on design and lead to much more effective user interfaces.

For PBOS, there were clear sets of users with distinct needs:

Design

Design Issues

In this section, we'll discuss some high-level design decisions that aren't obvious from the requirements.

On the user interface, we made two basic decisions early on:

As mentioned above, it was important to leverage existing software:

There are a couple of things we would do differently knowing what we now know:

What follows are some quick sketches of the design issues for the major components, along with some of the reasons for the design decisions.

BizVantage is not ideal for this particular Processed Book implementation because it's designed for individuals or very small groups with a common interest.  Because it learns from what articles users choose to view, it may not be particularly effective in the demonstration environment, where many different people with many different interests are influencing BizVantage's selection of articles.

The "create incoming link" capability looks deceptively simple, being symmetric with the "create outgoing link" capability.  It turns out to be interesting and unexpectedly complex, both conceptually and from an implementation viewpoint.

Because annotations could be connected to any point in the text, we wanted to allow users to do what ordinarily only Webmasters can do: create links into the middle of Web pages.  We also wanted to have some degree of control over whether those links would work if the annotations were removed (if they were removed, we wanted to be able to turn the incoming links off).

The net result was a fairly elaborate scheme that allows addressing down to the individual character level within the document, but still allows incoming links to be managed by their creator as if they're part of the PBOS document.
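A minimal sketch of this kind of addressing, assuming a link record that stores character offsets, an owner, and an on/off switch.  This is Python for illustration only; the field names and the sample text are hypothetical, and the actual PBOS scheme is more elaborate.

```python
from dataclasses import dataclass


@dataclass
class IncomingLink:
    link_id: str
    book_id: str
    start: int            # offset of the first targeted character
    end: int              # offset just past the last targeted character
    owner: str            # the user who created the link
    enabled: bool = True  # links can be switched off without being deleted


def resolve(link, book_text):
    """Return the targeted text, or None if the link is off or out of range."""
    if not link.enabled:
        return None
    if not (0 <= link.start < link.end <= len(book_text)):
        return None
    return book_text[link.start:link.end]


def set_enabled(link, user, enabled):
    """Only the link's creator may turn it on or off."""
    if user != link.owner:
        raise PermissionError("only the link's creator may manage it")
    link.enabled = enabled


book_text = "It was a dark and stormy night."
start = book_text.index("stormy")
link = IncomingLink("in-1", "book-1", start, start + len("stormy"), owner="ann")

print(resolve(link, book_text))
set_enabled(link, "ann", False)
print(resolve(link, book_text))
```

Keeping the switch on the link record, rather than deleting the link, is what lets an incoming link go dark when its annotation is removed.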

Extensibility was dealt with by requiring all data be kept in the database and making the database itself self-describing, essentially prohibiting any other form of data storage.  While this made certain things fairly clumsy, it had the effect of isolating the data from the program quite effectively.

We then put direct extensibility features into the application so that new annotation types could be added, as well as new elements to existing or new annotation types.
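One way to picture a self-describing database is a schema in which annotation types and their fields are themselves rows, so that adding a type changes data rather than table definitions.  The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; every table, column, and type name is an assumption made for illustration, not the PBOS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE annotation_type (name TEXT PRIMARY KEY);
CREATE TABLE annotation_field (
    type_name  TEXT REFERENCES annotation_type(name),
    field_name TEXT,
    data_type  TEXT,
    PRIMARY KEY (type_name, field_name)
);
CREATE TABLE annotation (
    id        INTEGER PRIMARY KEY,
    type_name TEXT REFERENCES annotation_type(name)
);
CREATE TABLE annotation_value (
    annotation_id INTEGER REFERENCES annotation(id),
    field_name    TEXT,
    value         TEXT
);
""")


def define_type(name, fields):
    """Add a new annotation type by inserting rows; no table is altered."""
    conn.execute("INSERT INTO annotation_type VALUES (?)", (name,))
    conn.executemany(
        "INSERT INTO annotation_field VALUES (?, ?, ?)",
        [(name, field, data_type) for field, data_type in fields.items()],
    )


def add_annotation(type_name, values):
    """Store an annotation, checking its fields against the stored description."""
    allowed = {
        row[0]
        for row in conn.execute(
            "SELECT field_name FROM annotation_field WHERE type_name = ?",
            (type_name,),
        )
    }
    unknown = set(values) - allowed
    if unknown:
        raise ValueError(f"fields not defined for {type_name}: {sorted(unknown)}")
    cur = conn.execute("INSERT INTO annotation (type_name) VALUES (?)", (type_name,))
    annotation_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO annotation_value VALUES (?, ?, ?)",
        [(annotation_id, field, str(value)) for field, value in values.items()],
    )
    return annotation_id


define_type("note", {"text": "string"})
# A new type with new fields, added without touching the schema.
define_type("vote", {"choice": "string", "weight": "integer"})
vote_id = add_annotation("vote", {"choice": "yes", "weight": 2})
```

The clumsiness mentioned above shows up here too: every value is stored as text and must be interpreted through its field description.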

We also mandated that all of the core capabilities of PBOS be implemented in PHP and MySQL.

In this section we will provide an overview of some of the design issues that arose during the project.

Requirements

There were a few very general, high level requirements that drove most of the design:

We had to cover all of the "aspects" mentioned in the original Processed Book paper; and, because of our limited budget, we needed to get maximum leverage by using existing software any place it was feasible, even at the cost of compromising functionality.

On performance, our only criterion was to make sure that all operations the user expected in real time could be completed in under three seconds.  This isn't a very stiff criterion, since the normal goal would be under one second, but we wanted to make sure that we had all the functionality done, in-budget, even at the expense of some speed.  We were also expecting that we could get such performance fairly easily using JavaScript, thereby pushing the real-time aspects out to the user's computer (the client).

On the implementation environment, we decided to stick with a single language and system to the extent possible, somewhat against our own inclinations to mix and match systems and languages where convenient.  This was largely because we wanted the resulting system to be both portable and easy to modify, and we felt that intermixing, say, Perl, C++, Python and LISP wouldn't give us this result.  So we settled on PHP, with a frosting of SQL via MySQL.  This should also make it relatively easy to port PBOS to a Microsoft ASP environment.

The last requirement was extensibility, because of our hope that PBOS might find wider use, and because we were primarily interested in exploring the full range of Processed Book capabilities.  Extensibility had to occur in two dimensions: it should be very easy to extend the annotation types, and associated data types, since the concept of annotations was at the heart of almost all the Processed Book capabilities; and it should be relatively straightforward to modify or extend the code underlying the Processed Book.

We lavished a good deal of attention on user interface, on the premise that making the Processed Book actually usable, especially with a short learning curve, would give the most value to our funders and users in selling the concept and in getting the users to spread the idea of the Processed Book by word of mouth.  Towards this end, we set informal goals of: less than one minute for a Reader to begin reading text, including viewing annotations; less than ten minutes of training to get Annotators to the point of adding simple Annotations; and less than 30 minutes for Authors to understand what they needed to do to add new Books to the Processed Book library.

We decided early on that we should preserve the invariance of the Book itself; operationally, this meant that the Book still had to be readable as a book, even if it were heavily annotated.  We didn't think that 600 years of interface optimization of the printed book form should be thrown away in this merger of Cuisinart and print.

Lastly, we didn't want to limit, at least conceptually, the kinds of things that could be connected to the book or the form of its content.  The general rule was that anything that could be done on the Web could be put into a Processed Book.

Implementation Notes and Considerations

As Arthur C. Clarke noted, "any sufficiently advanced technology is indistinguishable from magic", and user interface is one such technology.  We've gone to considerable work to connect the Processed Book concepts to a simple, somewhat elegant interface that's quick to learn and easy to use.  Not quite magic, but at least sleight-of-hand.

The user interface, which includes most of the formatting and real-time interactions, requires a robust and standards-compliant implementation of JavaScript to operate.

PBOS requires a Web server that supports PHP (at least version 4.3.4) and a MySQL database (at least 3.23.58).  To support the conversion of HTML into our internal Wiki-style formatting when adding new books, PBOS requires a Perl interpreter (at least 5.6.0) with the HTML::WikiConverter library (at least 0.30).

PBOS currently puts image files directly into the Web server's file system, so for adding images to books it requires that a "book-images" directory exists in the same directory as the PHP files, and that the Web server be able to create subdirectories in that directory.

The Dissect Text function requires that PHP be able to access WordNet functions.  We created a PHP module by downloading WordNet 1.7.1 and the source code available from here: http://www.foxsurfer.com/wordnet/.

Design: Undone Features

Inevitably, in a project like this many ideas are considered, and discarded, for reasons of budget, feasibility or time.  In this section we provide perhaps an overly detailed description of various things batted around, largely because they extend the concept of the Processed Book in ways not mentioned in the original paper and probably represent the closest thing to the value of real experience with a Processed Book implementation.

XML output

One thing we considered seriously was having all screen output be in XML format, with the thought that this would be useful for people writing additional machine processing applications of Processed Book output.

To our surprise, we discovered that many Web browsers don't support setting up XML formatting capabilities and then simply sending XML data such that it is automatically formatted.  Since we didn't have time or budget to deal with this in a more complicated way, we elected not to do it, but we hope members of the development community will fill this gap.

For PBOS at this point it would be moderately difficult to arrange for true XML output, since it would involve changing all of the output mechanisms throughout.
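For readers wondering what machine-readable output of this kind might look like, here is a small sketch that serializes one annotation as XML using Python's standard xml.etree.ElementTree module.  The element and attribute names are invented for illustration; PBOS defines no such format.

```python
import xml.etree.ElementTree as ET


def annotation_to_xml(annotation):
    """Serialize one annotation (a plain dict) as an XML string."""
    elem = ET.Element("annotation", {"type": annotation["type"]})
    # Where in the book the annotation is attached (character offsets).
    anchor = ET.SubElement(elem, "anchor")
    anchor.set("start", str(annotation["start"]))
    anchor.set("end", str(annotation["end"]))
    # The annotation's own text; special characters are escaped automatically.
    ET.SubElement(elem, "text").text = annotation["text"]
    return ET.tostring(elem, encoding="unicode")


xml_text = annotation_to_xml(
    {"type": "note", "start": 120, "end": 134, "text": "Check this date & source."}
)
print(xml_text)
```

Producing such output for a single record is easy; the difficulty described above lies in routing every screen of an existing application through a mechanism like this.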

Semantic Web

In a sense, Semantic Web technology is the next step up from XML.  We considered trying to put out at least the structured information in a standardized form that was compatible with the Semantic Web technology, but ultimately decided the concepts of the Semantic Web were not really well-developed enough to make this worthwhile.  More specifically, we couldn't see any short-term leverage that we could demonstrate from doing this, so it seemed too abstract to be worth the bother.  For PBOS, Semantic Web output would be even harder to add at this point than XML output.

Semantics: Context Views

Several of us were interested in capturing some form of semantics, being very frustrated in only being able to deal with text as a series of meaningless words.

One of the things we considered in this regard was the idea of "context views".  The core idea here was to extract fragments of a Processed Book based on the semantics of some context, as (weakly) interpreted by WordNet.

For instance, imagine Huckleberry Finn interpreted through the context of "religion": all the passages that contain any words descended from religion or their synonyms in WordNet could be assembled into a single document or perhaps simply annotated in the margin in the usual way.

The result would be the ability to focus on a book from a single context or viewpoint, either seeing just that viewpoint extracted or being able to see it embedded in the original context.

We didn't implement this largely because we had already implemented a mechanism with WordNet, and we felt this would be just gilding the lily.  It would be quite simple to do, since it does not extend the basic concepts of PBOS at all, and is closely related to the existing "dissect text" mechanism.
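A context view could be sketched roughly as follows.  This Python illustration uses a small hand-built word list as a stand-in for the related words WordNet would supply; the word list, the function name, and the sample passages (which are invented, not quotations from the novel) are all assumptions.

```python
import re

# Hand-built stand-in for the set of words WordNet would relate to a context.
RELATED_WORDS = {
    "religion": {"religion", "church", "prayer", "pray", "sermon", "preacher"},
}


def context_view(passages, context):
    """Return (passage index, matching words) for passages touching the context."""
    vocabulary = RELATED_WORDS[context]
    view = []
    for index, passage in enumerate(passages):
        words = set(re.findall(r"[a-z']+", passage.lower()))
        hits = words & vocabulary
        if hits:
            view.append((index, sorted(hits)))
    return view


passages = [
    "The widow made him pray before supper.",
    "They floated down the river at night.",
    "The preacher gave a long sermon in the church.",
]

view = context_view(passages, "religion")
print(view)
```

The returned indexes could drive either presentation described above: assembling the matching passages into a separate document, or marking them in the margin of the original.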

Semantics: Constrained English

The basic idea of "constrained English" is to have at least some form of annotation where the language is limited lexically and grammatically to something understandable by a computer.  This would mean that the semantics of the annotation could be understood, at least as well as computers can ever understand human language.  Implicit in this is the idea that the annotation would be checked to conform to whatever we had defined as the standards.

For instance, students could paraphrase each paragraph of a novel in constrained English, resulting in annotations that, at least in theory, capture some of the semantics of the original writing.

Another example would be to have standard language for describing, say, physical events, so that the actors, objects and time sequence of an event in a novel could be captured.
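To make the idea of checking an annotation against a defined standard concrete, here is a minimal sketch. The lexicon, the single sentence shape (actor, verb, object, optional time word), and the function name are all hypothetical; deciding what language to enforce is exactly the part that was never settled.

```python
# Hypothetical lexicon: every word an annotation may use, grouped by role.
LEXICON = {
    "actor": {"huck", "jim", "tom"},
    "verb": {"builds", "finds", "hides"},
    "object": {"raft", "canoe", "money"},
    "time": {"today", "yesterday", "tonight"},
}
# Hypothetical grammar: one sentence shape, with an optional trailing time word.
PATTERN = ("actor", "verb", "object", "time")


def parse_event(sentence):
    """Check a sentence against the lexicon and grammar; return its roles."""
    words = sentence.lower().rstrip(".").split()
    if not 3 <= len(words) <= len(PATTERN):
        raise ValueError(f"expected 3 or 4 words, got {len(words)}")
    event = {}
    for role, word in zip(PATTERN, words):
        if word not in LEXICON[role]:
            raise ValueError(f"{word!r} is not an allowed {role}")
        event[role] = word
    return event


print(parse_event("Jim hides money tonight."))
```

A sentence that passes the check comes back as a small record of actor, verb, object, and time, which is the machine-readable form the annotation would store. Anything outside the lexicon or the sentence shape is rejected with an error rather than stored.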

Note that this standardized encoding would also support more accurate translation to other languages, if only because many of English's ambiguities would be filtered out by the guaranteed precision of the constrained form.

We didn't pursue constrained English very far, since it looked like a major project on its own.  We speculate that work of this kind is being undertaken in the field of computational linguistics, and it is our hope that researchers in that area will turn their attention to the Processed Book.

Except for making the decisions of exactly what language to enforce, constrained English would be pretty easy to implement inside PBOS, since it doesn't extend the basic concepts and since Annotations are a very general mechanism.  Note also that since the annotation types are extensible, not even the database needs to be changed.

Consensus Process

As alluded to earlier, we were intrigued with the idea of a "consensus process", where users might be polled to get their consensus opinion on, say, numbers, facts, or opinions embedded in a core document.

In practice, this might work by adding an annotation that invokes an automatic process: for example, software that performs iterative refinement among the members of a group until they reach a consensus value, or range of values, for some specific content.  This consensus, and perhaps notes on how it was arrived at, would then become part of the annotation.
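One way to picture the iterative refinement is the following minimal sketch, which is an assumption rather than a design we settled on. A real process would re-poll the group each round; here each round is simulated by assuming every member moves a fixed fraction (the hypothetical `pull` parameter) of the way toward the group median. The function name and the shape of the returned record are likewise hypothetical.

```python
from statistics import median


def run_consensus(estimates, tolerance=1.0, pull=0.5, max_rounds=20):
    """Refine numeric estimates until they lie within `tolerance` of each other.

    Each round stands in for re-polling the group: every member is assumed
    to move a fraction `pull` of the way toward the current group median.
    """
    current = list(estimates)
    history = [list(current)]
    for _ in range(max_rounds):
        if max(current) - min(current) <= tolerance:
            break
        centre = median(current)
        current = [value + pull * (centre - value) for value in current]
        history.append(list(current))
    return {
        "value": median(current),
        "low": min(current),
        "high": max(current),
        "rounds": len(history) - 1,
        "converged": max(current) - min(current) <= tolerance,
        "history": history,
    }


result = run_consensus([10, 20, 40], tolerance=5)
print(result["value"], result["low"], result["high"], result["rounds"])
```

The returned record carries both the consensus range and the round-by-round history, which corresponds to the "notes on how it was arrived at" that would be stored with the annotation.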

As this was beyond the scope of the original Processed Book project, and seemed like it would be almost orthogonal to the rest of the implementation, not to mention quite a bit of work even for a simple consensus process, we elected not to pursue it.

We think, however, that for certain applications, this is one of the most promising simple extensions to PBOS, and could yield quite a bit of value if applied in the right context.

Once the consensus-driving procedure had been created, it should be relatively simple to add it to PBOS, since this would only mean adding a new kind of annotation, something that PBOS was designed to do easily.

User Views

This is simply the idea of having named and saved sets of the parameters that control the view the user sees of a PBOS document.  It would complicate the user interface significantly, adding another layer of potential confusion.  We elected not to implement it, as we felt the additional confusion was not compensated for by a significant gain in function.  However, in a context where giving individual users different viewpoints on the same document was important, this could be a very valuable capability.

Because this would extend the user model in significant ways, it's not trivial to implement, and would involve significant changes to the database.
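As a rough illustration of what a named, saved view might look like as data, here is a minimal sketch. The particular parameters (annotation types, authors, margin notes) and the class names are hypothetical; the actual set of view-controlling parameters in PBOS may differ, and a real implementation would persist these records in the database rather than in memory.

```python
from dataclasses import dataclass, field, replace


@dataclass(frozen=True)
class ViewSettings:
    """Hypothetical display parameters; the real PBOS parameters may differ."""
    annotation_types: frozenset = frozenset()
    authors: frozenset = frozenset()
    show_margin_notes: bool = True


@dataclass
class UserViews:
    """Named, saved views belonging to one user."""
    saved: dict = field(default_factory=dict)

    def save(self, name, settings):
        self.saved[name] = settings

    def load(self, name):
        return self.saved[name]


views = UserViews()
review = ViewSettings(annotation_types=frozenset({"comment"}), show_margin_notes=False)
views.save("review", review)
# Derive a second view from the first by changing a single parameter.
views.save("editor-only", replace(review, authors=frozenset({"editor"})))
print(sorted(views.saved))
```

Making the settings immutable means a saved view cannot be changed accidentally while it is in use; a modified view is always saved as a new record under its own name.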

RSS Feeds

During the course of this project, RSS feeds became more important as an element of the Web.  We might well have included them if they had been important when the project was conceived.

Outbound feeds would presumably connect some classes of annotations to the outside world, so the changes would be automatically propagated to other sites.  Note that it makes little sense to have an RSS feed related to the core document, since it's not expected to change very much.

Inbound feeds are problematic, since the usual convention of embedding the RSS headline information in the page doesn't make sense if you view the core document as static.  One could relax this view, viewing the RSS headlines as "logically static", since they always derive from the same source.  Or one could add in RSS inbound feeds as an Annotation type, so that the Annotation would have to be clicked to see its actual content of RSS headlines.
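The second option, an inbound feed held inside an Annotation, might look something like the following minimal sketch. The feed is an inline RSS 2.0 sample with placeholder URLs, and the annotation record format and function name are hypothetical, since PBOS has no such annotation type.

```python
import xml.etree.ElementTree as ET

# Inline sample feed with placeholder content, used instead of a network fetch.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Example feed</title>
  <item><title>First headline</title><link>http://example.com/1</link></item>
  <item><title>Second headline</title><link>http://example.com/2</link></item>
</channel></rss>"""


def rss_annotation(feed_xml, anchor_text):
    """Build a hypothetical annotation record holding the feed's headlines."""
    root = ET.fromstring(feed_xml)
    headlines = [
        (item.findtext("title"), item.findtext("link"))
        for item in root.iter("item")
    ]
    return {"type": "rss", "anchor": anchor_text, "headlines": headlines}


annotation = rss_annotation(SAMPLE_FEED, "some passage in the core document")
for title, link in annotation["headlines"]:
    print(title, link)
```

Because the headlines live in the annotation record and not in the page, the core document stays static; the reader sees the feed content only after clicking the annotation.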

In a certain sense, RSS feeds are the equivalent of the BizVantage component, since they deal with updates selected from the Web.  So it might make sense to add an external RSS page, much like the BizVantage page, with content selected by matching the text that the annotation is connected to.

All of these are fairly significant extensions of the basic PBOS ideas and would require quite a bit of work.  It also seems like this capability would only add marginal value to the Processed Book.

Book Circles

Inspired by the success of the Oprah Winfrey reading groups, the idea here was to provide some very specific tools to allow groups of people to annotate in standardized ways: to ask questions of one another, to comment upon the meaning of particular passages, etc., as a means to encourage discussion and understanding of the book itself.  This would be quite easy to do, since as described, it mostly involves adding additional Annotation types, a capability which is already built into PBOS.

Collections of Documents

PBOS is focused on annotations added to a single book at a time, viewing the book as the center of its annotation universe.

An obvious extension would be to allow a single annotation set to span many books, presumably categorized in some way, so that the cross-annotations make sense and have utility.  (This begins to touch on the idea for Project Casaubon, which was discussed in Part One.)

Because PBOS already has the notion of a library, this would not be extremely difficult to do, but it would involve alteration of the database, so that in effect a collection of books is viewed as a single large book for certain kinds of purposes, such as "analyze" and "dissect text".

Audio Books

One of the more exotic ideas considered was that of an audio version of PBOS.  The core idea here is to make all of the mechanisms accessible through sound and voice, applied to an audio core document.  While the technology of recording, word recognition, and playback control is certainly well enough developed to support this, it's unclear how one makes Annotations that don't interfere with the audio "reading" of a book.  In PBOS terminology, how do we distinguish the "core document" from the Annotations, while making the listener aware, in real time, that an Annotation is associated with the words just heard?

Another consideration is that the bandwidth for selection on a screen is much greater than the bandwidth available for selection in audio, since the entire screen is visible to the user at once.  This makes it difficult to map the annotations onto the content, as well as to give the user a sense of all the annotations available at any point.

As is probably obvious, this is by far the hardest extension to contemplate, much less implement.  It's essentially a new implementation, from the ground up, although presumably the existing database could be used: since its content is dynamically mapped into the display, it contains little that's only relevant to the display and that content is cleanly isolated.