-- StephenRamsay - 20 Aug 2005
Hackers, Scholars, and Academe

Academic peer review--the process by which research articles and grant proposals are submitted to third-party experts for evaluation--serves a two-fold purpose. It ensures that academic research meets a high standard of quality, and it signals to other scholars that a particular research activity has, in the opinion of other members of the research community, met this standard. Scholars whose work has been repeatedly subjected to this process and found worthy may use this fact as evidence in support of promotion and tenure, further grant funding, and other opportunities for professional advancement. Work that has not been peer reviewed is automatically viewed with suspicion, and academics whose work repeatedly fails to gain the approval of review panels may fall victim to what is called--not without justice--academia's "publish or perish" model.

All of this is of grave concern to scholars in digital humanities, and in particular, those who design and build software for scholarly inquiry. Most software systems represent a serious investment of time and energy (comparable, in most cases, to the writing of a monograph or the management of a large experimental research project). Yet the "seal of approval" so necessary to the academic success of individual researchers is largely unavailable to those who undertake such projects. The situation is doubly alarming for practitioners in digital humanities, since it is obvious that the growth and continuance of the discipline rely on the availability of such tools. In this paper, I'd like to examine several aspects of the traditional peer-review process and consider the ways in which each aspect is (or might be) applied to the evaluation and review of software.
By doing so, I hope to isolate the particular elements that would have to be present in order for a peer-review process to work in this particular subfield of the humanities, and for both individual researchers and the larger scholarly community to gain the benefits of such a process.

Guarding against Error

Errors of fact, insufficient awareness of prior research, and plagiarism not only render research defective and unreliable, but bring discredit upon both authors and publishers. Outside, independent reviews provide a useful check against unintentional instances of such errors (as well as providing a barrier against willful infractions).

In the world of software, errors are largely caught through a combination of mechanisms, some internal to the development process and others at work in the wider marketplace of users. Most high-quality software is subjected to a number of quality control mechanisms before being released to a wider public. These include both extensive component-level testing and formal "code reviews" in which a number of project participants critique the organization and engineering principles evident in the source code. Such reviews typically point out errors of style in the writing of programs, as well as facilitating discussion of the overall cogency of the system, its adherence to design standards, and its compliance with the particular idioms of a programming language. Code that fails to meet these standards is changed or discarded, and programmers who routinely engage in sloppy workmanship may find that they are not trusted with the more complex parts of the system (or they may experience other, far more dire professional consequences). Bad code is as embarrassing to a programmer as bad data is to a scientist. Like traditional peer review, code review tries to eliminate errors before they become ensconced in a more stable form.

Once software is released, market forces tend to take over.
Buggy, poorly maintained software seldom finds favor with users (even when it's the only thing available). Stable, robust software is used, maintained, and gradually improved.

Providing Community Feedback

Peer review often represents the first exposure of an idea to the wider community of scholars. In the case of a grant proposal, the feedback generated may bear on important questions of scope, feasibility, and need, as well as providing encouragement, advice, and the benefit of experience. In a sense, it provides a gauge of the market for a particular idea or approach.

Ideas for software nearly always come either from some concrete need on the part of the developer or a perceived need within the marketplace. In both cases, several questions need to be posed before any code is written. Does something already exist that can do the job? Are there lessons to be learned from past attempts? Would the benefit of modifying or improving an existing system warrant the work involved? Would a program that duplicates the functionality of another program (perhaps differently, or with an enhanced set of features) find favor with users? Most such questions are easily resolved by consulting potential users as well as other experts in the development of software. Anyone who proposes to create a new programming language, for example, will be cautioned to study the scholarly literature of programming languages carefully before undertaking such a task in order to avoid reinventing a technique or re-solving a well understood problem. Asking a group of potential users if they would find a particular system useful also yields important information about whether a system should be built, and if so, how it should work.[1] Attempts to get at such information range from large focus group studies, standardization efforts, and RFCs [2], to ad hoc surveys and other, less formal methods of gauging interest.
Software written in complete isolation from the community of both users and developers, like bug-ridden software, seldom finds favor with users (if it manages to find users at all).

The Imprimatur of Publication

Publication with a well known and respected academic press (or funding from a respected institution) provides clear evidence to other researchers that a particular work or research endeavor has been properly vetted by independent experts in the field. Software, by contrast, is not "published" in the conventional sense, but there are several analogues to the imprimatur that publication typically grants. Companies that have garnered a following among users may be considered as publishers, but the analogy quickly breaks down. Software companies are not required to vet their products with an independent entity, and software produced by a company is almost always developed in house.

In the world of open source software, however, several organizations exist that do vet software from outside entities and attempt to guarantee a standard of quality (typically through a formal peer review process). The Free Software Foundation and the Apache Software Foundation are two outstanding examples of such organizations, and both have formal review processes.[3] Both foundations are independent, non-profit entities committed to supporting particular activities in the area of software development. Both insist that contributed software comply with particular code and quality standards. The FSF maintains a list of active evaluators who judge projects on the basis of quality, design, and overall integration with the goals and purpose of the larger project. The ASF has various elected committees and boards that perform similar functions.
In both cases, the admission of developers into the project is primarily merit based; developers who consistently demonstrate skill and craftsmanship through stable, well maintained code are given increasingly important roles in the management of the overall system. In both cases, the resulting software is nearly universally recognized by users as being of a very high quality. In many cases, the software produced by these two organizations has become the benchmark against which all other implementations are judged.

Software in the Humanities

The existing mechanisms of software development can be made to resemble the traditional process of academic peer review, and it is not difficult to imagine the creation of an organizational entity that, like the FSF and ASF, provides a merit-based system of code review and evaluation. Projects brought under the aegis of such an entity could easily provide the two-fold benefits of peer review described above (the assessment of quality and the imprimatur of publication).

But an objection naturally arises: While such a process might successfully guarantee quality for software, it would do little to make the creation of software itself a legitimate research activity in the humanities. Moreover, one might argue that the study of software design and the assessment of quality already exist within their own disciplines (computer science, information technology, and allied fields). Peer review, in other words, presupposes disciplinarity. While one might establish a peer review process in the hope of gaining further legitimacy for a particular field, that effort is likely to fail in the absence of some widespread agreement about the legitimacy of the activity as such. In the case of scholarly technology, the question lies upon two axes: Is software development in the humanities an activity distinct from software development as such, and if it is, are there ways to convince the wider scholarly community of its legitimacy as a discipline?
There are many ways in which the development of software in the humanities is indistinct from any kind of software development. The particular disciplines of system design, software engineering, and programming apply to a great number of design endeavors, and a scholarly technologist in the humanities pursues his or her craft with the same framework as any software technologist. In this sense, digital humanities and those branches of computer science concerned with such issues speak a common language. Yet recent years have seen the rise of a number of distinct (though clearly cognate) disciplines devoted to the application of computing technology to specific areas of inquiry, including bioinformatics, computational linguistics, statistical computing, and many others. What distinguishes such disciplines is not their novel approach to design or engineering, but the substantial level of domain expertise necessary for undertaking projects within their purview. Bioinformatics is the province not of computer scientists, but of biologists--some of whom devote themselves solely to the task of designing and building systems for analyzing biological data. Statistical computing is similarly bifurcated; relatively few computer scientists possess the domain expertise necessary for creating advanced systems for statistical analysis of the sort needed by professional statistics researchers, and so statisticians themselves undertake the task of designing domain-specific systems. [4] The situation is no different in the humanities. Scholarly questions and methods of inquiry that arise in fields like literary studies, history, philosophy, and language study are of sufficient complexity to warrant the existence of experts with substantial training in these subjects. The question of legitimacy--assuming one grants the argument made above--is more subtle and sociological in nature. 
Right now, the role of software development in the humanities is not unlike that of scholarly editing within the context of, for example, an English department. However much work textual editing might involve, the product of such effort is still often seen as belonging to "the lower criticism"--something necessary to more traditional philosophical and exegetical activities, but nonetheless inferior to them. Editing a book "counts less" than writing a book, and writing a piece of software presumably counts less than using software to achieve some stated research goal. [5] Such views are cause for considerable indignation among both textual editors and scholarly technologists, but the former group (being the elder of the two by far) has done much more to mitigate this lamentable situation. Textual editors, who spend many years studying their subject and honing their skills, have organized scholarly societies for the purpose of advancing the understanding of the subject and maintain journals in the specific area of textual scholarship. Critical editions of texts are subjected to peer review processes every bit as rigorous as any process aimed at evaluating traditional research. It is easy to identify leaders in the field, domain experts, and a standard of quality by which editing projects may be judged. As a result of this, many English departments have a textual specialist, and more critically, several others who will defend the intellectual integrity of the activity against all detractors. Increasingly, those who would persist in the view that editing is somehow a lesser thing are accused of ignorance, partisanship, or both. Like textual editors, developers of scholarly software are universally convinced that the activity is of paramount importance to certain areas of humanistic inquiry, rigorous and engaging as an intellectual pursuit, and scholarly in nature. 
Translating this into cultural legitimacy is largely a function of practitioners' ability to create the institutions that lend legitimacy to any scholarly activity. This means professional societies not merely for researchers who use computing in the context of humanistic inquiry, but associations solely dedicated to the advancement of software design and development within the larger discipline. It means journals devoted not to research results from software-based projects, but journals devoted to the design and creation of the tools themselves. Finally, it means the creation of an organizational entity that can facilitate the process of peer review as it pertains to the specific area of software development. No one of these elements is likely to create the desired effect, but in concert (and in cooperation) they may well do so. The advancement of individual researchers is an important side effect of this process, but the ultimate goal is the continuance of computer-based study in the humanities as such. Without tools, the discipline cannot continue to thrive. Creating the institutions of academic legitimacy does not guarantee that legitimacy will emerge within the larger, more diverse community of scholars. The utility of scholarly software still ultimately depends upon the research questions it adjudicates (much as a critical edition is useful only insofar as it helps to facilitate scholarly inquiry). The real question is whether those who would devote themselves to facilitating such activities--perhaps at the expense of not having the time to engage in such activities themselves--deserve to have their work placed on the same plane as those who engage full time in the more mainstream activity. Scholarly technologists think they do.
It remains for those of us who do this kind of work not only to create the institutions of legitimacy for our discipline, but to ensure that the fruits of our labors are evident to all.

Footnote 1: Questions of the form, "What kind of software would you find useful?" seldom yield actionable results. It is unlikely that, for example, the invention of the web browser would have emerged from such a question. More specific queries--"What if you had a system that would let you view linked documents from all over the world on your computer?"--often provide much more useful insights.
Footnote 2: The RFCs or "Requests for Comment" are a series of Internet standards put forth by individual researchers or small groups and vetted by the wider community of Internet users. The process is less formal than that promulgated by standards bodies such as ISO, ANSI, or IEEE, but several of these documents have become the de facto standards used by the industry at large. The most famous RFC is undoubtedly RFC 822 ("Standard for the Format of ARPA Internet Text Messages"), which describes the standard format for email messages. Despite the name, RFCs are typically designated as such even after the standard has stabilized and has entered into widespread use.
Footnote 3: The Free Software Foundation (http://www.gnu.org) is the organizational entity behind the GNU Project, which sponsors the creation and maintenance of a wide variety of utilities for UNIX-like operating systems as well as several other extremely popular tools. The Apache Software Foundation (http://www.apache.org/) produces the most popular web server in general use (Apache HTTPD) and also sponsors a number of large development efforts related to XML, web services, and web application frameworks.
Footnote 4: This is not to say that scholars whose primary background is in computer science can't build (or haven't built) such systems, but only to point out that the sociological aspects of disciplinarity appear, in the case of very complicated systems, to favor the movement of domain experts in another field toward the discipline of computer science and engineering.
Footnote 5: It is worth pointing out that this same situation exists within computer science itself. The development of tools is in most cases subservient (from the standpoint of a scholarly journal or funding organization) to the principle the tool demonstrates or the research it enables.