Peer Review of Humanities Computing Software: How, what, and by whom?
John Bradley, King's College London
The abstract for this session reports that software development for Humanities-oriented tools seems to have slowed down, and suggests that establishing a mechanism for peer review for software might be one of the ways by which software development for the Humanities could be revitalised. It is perhaps natural to start thinking about peer review in the context in which peer review is normally considered -- using the model that is involved in getting an article published in a scholarly journal. To me, however, it has seemed useful to expand our thinking about the role of peer review in supporting digital development in the humanities in three basic ways:
- What are we reviewing?
- Who are the peers?
- For what purpose are they reviewing?
First, then, what are we reviewing? The question is important because the world in which humanities computing is done today is substantially different from that which was in operation in the 1980s and '90s -- the last great flowering of general software tools intended for humanists. There is the growth of XML and processing tools for it such as XSLT. A great deal of work around marked up text that formerly would have required specialised software can be done with these tools alone. The open source revolution has made powerful engines, such as traditional relational database engines like MySQL, XSLT processors, or the more exotic XML-based databases such as eXist, available for free, and these tools can provide a sophisticated basis upon which to build. Serious work in the humanities, different from that done by our non-technical colleagues, can be done using these tools with little or no traditional programming required, and much of this work can already be found today at the centre of humanities computing.
So, perhaps we think we are interested in using peer review to promote the development of tools for our non-technical colleagues -- for the academic next door who, although s/he has a computer on his/her desk, uses it currently only for word processing and browsing the WWW. If we expect innovative software to be developed that might be used by this community on their own materials on their desktop machine, we must these days think of GUI-based software. There are, after all, precedents for this in a sister field -- the Social Sciences. There, a number of pieces of specialised software to support Qualitative Analysis of texts have been developed, and several, including Nud*ist, NVivo and Atlas/ti, have received rather wide acceptance among that academic community.
If we wish to develop GUI-based software, then we must first recognise that GUI development is very expensive. It is conventional wisdom in the programming industry that 80% of the code for GUI-based programs merely handles the support of clicking, selecting and highlighting objects appearing on the screen. Only about 20% of the code development, then, goes into the development of new functions for the software to do. Furthermore, coding for GUI interaction is very complex: it makes use of sophisticated programming methodologies and strategies, which makes it a professional activity. Finally, beyond the heavy overhead of GUI development, there is always the infamous cost of handling the last 10% of the features or bugs that turn up, which end up taking far more than 10% of the overall development time.
Furthermore, software built using the various open-source tools is generally only usable by those with quite a sophisticated understanding of the computer. Our non-technical colleague next door who only uses word processing software and a browser would find it quite impossible to use famous open-source software like Apache, or even the various free XSLT processors, let alone free database engines such as eXist or MySQL without considerable help. To package up software so that it is truly useable by anyone with a desktop computer, particularly if it does something other than a well understood application such as word processing, requires an extensive commitment to the development of program documentation, well beyond what is normally done for open-source software. It also requires continued technical support.
For all these reasons, GUI-oriented software is often a team project, often involving professional programming staff. Software development of this kind is a profession on its own, and one with its own standards and mature practice that is quite outside of the humanities. Software development does have its own peer review practices, but the peers are programming professionals, and they employ criteria in their peer review that are quite different from those used in academic circles. In the end, developing standalone software for non-technical colleagues involves an amount of work more on the scale of a book than an article, and requires resources, financial and otherwise, that are substantially larger than what is needed to write an article for a traditional journal.
This brings us to a second observation about peer review: that it turns up in many areas other than journal review. Indeed, Horace Freeland Judson, in his controversial article "Structural Transformations of the Sciences and the End of Peer Review", distinguishes between what he calls "peer review" for grant applications and "refereeing" for journal publishing. Grant applications provide funding to support development of larger scale humanities projects, and given the high cost of software development, the grant approach might well provide a better model for the kind of peer review that should be applied to software tools development. Peer review is also, of course, involved in appointments to academic posts and, at least in the UK, may involve peers from institutions other than the appointment's host institution. Perhaps support for software development should look at these kinds of peer review for models in addition to considering the model of journal article-oriented peer review.
So when thinking about peer review to support an academic and/or tenure appointment, who are the peers, and what are they looking for? For academics who wish to program, the peers that matter are other academics. For these people in turn the most important criterion surely has to be the extent to which the work has led to a significant development of their field -- the same criterion that they use to evaluate the more traditional research output of other colleagues.
Software evaluation in that context is, it is true, somewhat of an uphill battle. There are, however, places where software is evaluated from a disciplinary perspective, and in at least three that we have examined we find confirmation of the importance of "significance to the field" as the criterion for evaluation, but also some hints on how one might fit tool building into this context.
First, let us turn to the Stoa Consortium's website (http://www.stoa.org). The Stoa's stated purpose is
"to foster a new style of refereed scholarly publications in the humanities not only of interest to specialists but also -- and just as importantly -- accessible by design and choice of medium to wide public audiences." (http://www.stoa.org/goals.shtml)
Note that their primary interest is in making available digital resources that can serve both the scholar and the broader public over the WWW. The fact that they are "resource oriented" is important, and this is an issue to which we will return later. What the Stoa publishes are "research tools" that are, in some sense, tools to support research in the traditional scholarly sense, so that they can be recognised as significant in that way. Because, however, they are digital in conception rather than print-oriented, they are able to take advantage of the things that digital resources can do well and print can do only less satisfactorily. It is, perhaps, in this way that the things they publish can work to "develop and refine new models for scholarly collaboration via the internet".
The Stoa evaluates all scholarship submitted to it according to three criteria published on their website:
- the quality and importance of the work from a disciplinary perspective. The collaborators on particular projects will in many cases already have taken steps to secure this necessary and traditional sort of review for themselves, but the Stoa can also help to establish structures and procedures for it as needed.
- accessibility to wide audiences. On the implementation of this point we mean to be very flexible, but we do firmly believe that the issue should always be addressed.
- consistency with the technical considerations advanced by this consortium. The idea is NOT to generate an unduly complex and burdensome set of regulations, but simply to help promote good practices (defined as those which enhance long-term archivability and interoperability).
Note that the first stated criterion is "the quality and importance of the work from a disciplinary perspective", and that the Stoa actually assumes that the traditional sort of review that justifies the work in this way will have been done already. Criterion number 3 touches on technical considerations, but the Stoa acknowledges there that their criteria for good technical design are still rather general -- such as preparing material in XML rather than HTML. Overall, we can see that the Stoa treats the importance of the work from a disciplinary perspective as so central that it expects the case for it to have already been made.
A second example touches more on the discipline-specific evaluation of software tools rather than digital resources, and comes from Melina Alexa and Cornelia Zuell's excellent book "A Review of Software for Text Analysis". Here the authors are reviewing a collection of pieces of software that support text analysis from a Social Sciences perspective. In the concluding remarks one finds one of the reasons why Alexa and Zuell wrote the book: "assisting text analysts [...] in questioning, testing and enriching their analysis methodology", and, indeed, the pieces of software they review are categorised both on how well they support existing textual analysis methodologies in their field and on how well they challenge or extend these methods. Once again, the tool is not evaluated seriously on the quality of the programming (although, of course, a defective piece of software that did not work properly would presumably get short shrift in this work, even if it tried to embody or support some good research practice).
The United Kingdom's Research Assessment Exercise (RAE) (found at http://www.hero.ac.uk) provides a third example of criteria for digital work in non-computing fields, and provides some further useful insight into what is considered important to support academic research in the higher education sector. The stated purpose of the RAE is:
…to enable the higher education funding bodies to distribute public funds for research selectively on the basis of quality. Institutions conducting the best research receive a larger proportion of the available grant so that the infrastructure for the top level of research in the UK is protected and developed. (http://www.hero.ac.uk/rae/AboutUs/)
Elsewhere on the site one finds the statement that "we believe that the UK RAE is the only such exercise which is conducted over all disciplines on a national scale." Furthermore, the RAE is run as a largely open and transparent process. Independent peer review panels are created for each academic discipline, and each participating academic department submits a report to the appropriate panel that shows the research activity its members undertook. The RAE site not only publishes online the guidelines for how these research reports should be evaluated for the purpose of the exercise, but also provides access to the full submissions from participating academic departments. It also contains summaries (to the discipline level, not to the level of the department) written by the peer review panels giving an "account of their observations about the strengths, weaknesses and intensity of activity of the research areas falling within the unit of assessment" (http://www.hero.ac.uk/rae/overview/) across the UK. The RAE is aimed not at individuals but at departments. Good research performers nevertheless contribute to the ranking, since each department submits, as part of its research evidence, a list of up to 4 research objects for evaluation for each academic involved in the submission. A panel of peers evaluates the submitted materials and uses this evaluation as one of the main criteria to rank the quality of research for the department as a whole.
It is instructive to review the outcome reports on the RAE site from the various discipline-specific peer review panels, looking for the impact of software development. Programming and software are not mentioned in the Computer Science report. Software tool development is not explicitly mentioned in the report for linguistics, even though there are world-class centres doing this work in the UK. The RAE's "definition of research" is also instructive. It includes not only classic humanities research under the rubric of "scholarship", but also makes explicit mention of "the invention and generation of ideas, images, performances and artefacts including design, where these lead to new or substantially improved insights" -- a category of research presumably aimed to support performance-oriented disciplines such as Music, Drama, or Art.
It is interesting that comments on "practice as research projects" appeared in review panel reports for Music, Art and Design and Drama, Dance and the Performing Arts. All three reports, one way or another, pointed out that departments needed to "report [on research] embedded in practice", (my italics) with a comment that there was "a need for greater rigour, but also diversity, in ways of presenting/validating practice as research". Thus, there is a possibility that software development could be viewed as pieces of "practice", although it is interesting that the reviewers rated most highly those pieces of practice for which a relevant research agenda could be most clearly identified, and a similar argument would, one presumes, have to be made about software submitted as research output. In a department such as French, for example, the software would need to clearly work within the context of a research agenda that could be understood by French colleagues.
Perhaps a different approach to the issue of getting "research recognition" for software development is to frame it not in the context of a traditional humanities department, but in the context of Humanities Computing. After all, one finds that gradually more and more Humanities schools are beginning to understand that HC stimulates new insights into humanities research, and that HC can be viewed as separate from traditional humanities departments, from computer science, and from computing as a profession. At King's College London, for example, the Centre for Computing in the Humanities (CCH) is now viewed as an academic department within the School of Humanities. Our development work has not been, to date, so much on designing "stand-alone" software tools for others to use. Instead, our projects aim to produce "born digital" research resources, and we have indeed been instrumental in attracting several large scale humanities projects to King's on the strength of this vision. Our role in attracting research funding has been effective enough that the CCH has recently been recognised as having a strategic role in developing the School of Humanities' research agenda.
The development of digital research resources rather than stand-alone software tools provides a recognisable connection with established disciplines, and the research resources we develop are, we believe, clearly and significantly different from the traditional "research tool" in book form. Our work with the discipline specialists is viewed as a partnership with other departments, and is by its very nature collaborative, since it is recognised that we bring our own insights to the material under study that are different from those provided by the discipline researchers. To achieve this we become engaged with the nature of the materials under study in a way that goes beyond what would be necessary for us to provide a "service" role, and that places us clearly in a partnership position with the discipline academics. The digital resources we, with our discipline colleagues, produce are (or will be) browser accessible, and therefore immediately available to a broad range of discipline researchers, although they don't look like journal articles or monographs transferred to electronic form. In the end, the digital resources that emerge from our collaborative work show not only our own local discipline colleagues, but also a broader humanities community, that the computer can support truly new insights into Humanities materials.
So, how can Humanities Computing be made significant more broadly in the humanities? One thing to do is ensure that one publishes HC-oriented articles in established discipline journals -- not only in journals such as Literary and Linguistic Computing (LLC) or Text Technology that are read by an HC-aware audience. Perhaps we should promote the "practice as research" model more actively in non-performance disciplines, and draw our academic colleagues' attention to the relevance of this model to humanities-oriented tools development. We could endeavour to ensure that the tools that do emerge can be justified as yielding new or substantially improved insights in our chosen discipline, and/or the Humanities as a whole.
Within the Humanities Computing community, we might turn over an occasional issue of LLC or Text Technology to serious, critical software reviews, where the focus is on how the tool seriously advances scholarly research. Perhaps the new ADHO "electronic publication" would provide a good vehicle for raising the profile of digital tools -- although it would be important to ensure that such a publication was set up in such a way that non-computing colleagues could take it as seriously as they would take a traditional print journal published by a major publisher (as LLC is). A book like Alexa and Zuell's "A Review of Software for Text Analysis", published as it is more for the discipline specialist than for the HC specialist, could provide further support. Such a book would not need to be an introduction to Humanities Computing; rather, it should be a critical assessment of how software changes research -- something perhaps like an updated version of Rosanne Potter's 1989 collection Literary Computing and Literary Criticism.
Computing has important things to do in the Humanities. As a part of the HC community, we need to support research work that further develops the place of computing there, and the development of peer review to support this work is one of the ways to achieve this goal. We must, however, ensure that we understand the proper role of peer review in providing this support. We believe that there are many issues in software development that make peer review for journal articles the wrong starting point on which to model what is needed.
Comments
Steve Ramsay says . . .
Nice work, John. Your characteristically precise appraisal of cognate procedures (like those employed by Stoa) is quite useful. A few things:
1. "For academics who wish to program, the peers that matter are other academics. For these people in turn the most important criteria surely has to be to what extent the work has lead to a significant development of their field -- the same criteria that they use to evaluate the more traditional research output of other colleagues."
I think there's a distinction that needs to be made here between peer review as it is applied in the case of a refereed article or grant and peer review in the context of tenure and promotion. In the former case, the peers are not evaluating the impact of the research, but, in a sense, its probable impact (as well, I think, as a number of other matters).
2. This article rightly outlines the enormous challenges involved with creating general purpose applications for ordinary users, but I think it assumes that this is the desired direction for software in HC. I'm not sure it is (or rather, I'm not sure that it's the only possible scenario). Perhaps it would be better to argue this point straightaway? In other words, to say that the goal should be to create these types of tools.
3. "So, how can Humanities Computing be made significant more broadly in the humanities?" You probably want to be careful here, since this is a much broader question than the one proposed for this series of essays. After all, there are many things understood to be within the ambit of humanities computing which do not include the creation of software.
4. I noticed a few typos, but didn't feel empowered to alter your work without your permission. Then I said to myself, "Wait! This is a wiki! You're supposed to alter people's work." So, I went back and fixed a couple of things (but I think I missed some). I think we should all agree to fix minor errors when we encounter them.