Just What Do They Do? is a SSHRC-funded project to study how users use text analysis tools.
Project Description
A huge array of text analysis tools has emerged over the past two decades, ranging from simple embeddable widgets to powerful linguistic parsers. Web-based text analysis tools can be integrated into wikis, blogs, document repositories, and full-scale commercial publishing platforms. Just What Do They Do? (JWDTD) seeks to discover how these analytical tools are emerging on the web, how they are being integrated by publishers and content providers, and how they are being used by readers.
Specifically we will be looking at two types of users:
Publishers or Content Providers.
The study will focus on academic content providers from small scholarly blogs to academic journal publishers to large aggregators like JSTOR. What kinds of text analysis tools are content providers developing? Which tools are they integrating into their publishing platforms?
Readers.
No one knows how people are using these text analysis tools. We have tools at hand that can provide quantitative use statistics, but we need to go beyond a simple log of usage frequency to understand how analytic tools are integrated into the research process. Interviews and usability assessments will gather qualitative data to understand how individuals think about these text analysis tools, and how they would like to use them.
Ultimately, Just What Do They Do? will provide evidence to support the design of text analysis tools that are best suited to researchers’ needs.
Methodology
Just What Do They Do? asks how these tools are used by author/publishers and readers. Because this phenomenon is new, few have studied it; the exception is a PhD thesis by Peter Arthur that looks at the educational effectiveness of embedded reading tools.
JWDTD proposes four strategies for studying the new analytics:
Tool Testing and Review
JWDTD has been testing and reviewing text analysis tools, and we have developed guidelines for testing them.
JWDTD will also conduct an environmental scan of the variety of analytical tools "in the wild". What types of tools are made available to bloggers for enhancing their sites? What sorts of analytics are professional sites adding? What types of visualization and analytics are most commonly embedded? Why have word clouds taken off as a popular enhancement to content?
Badge Bazaar
We have developed a number of embeddable tools for academic use by online journals and e-text archives (Voyeur, and TAPoRware through the TAPoR portal). We will develop a web site where authors or website developers can explore and select different embeddable tools, both ours and others. This site, which we will call the Badge Bazaar, will be a social research site that not only logs what people search for and try, but also gives users an opportunity to vote for tools, comment on tools, and propose new tools. Our hypothesis is that with appropriate outreach to the research community we can get sufficient traffic on the site to gather useful statistics and narrative comments. The Bazaar will also be one of the research outcomes.
Usability Studies
To balance the information gathered by the Bazaar, we will run usability studies where research users, drawn primarily from the digital humanities community, are interviewed about their needs, then introduced to analytical tools and observed completing tasks on sites enhanced with analytics. These sessions will be captured and studied to help us understand how users think about using the tools.
Recommendation Engine
The Voyeur system we have developed is modular enough that it can propose different combinations of analytics for different types of texts. We will work with recommendation engines developed in computer science, such as those used commercially to recommend purchases based on what you have selected, like Amazon’s “Customers Who Bought this Item Also Bought” feature.
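As a rough illustration of the "also bought" idea applied to tools, the sketch below counts which tools appear together in the same session and recommends the most frequent companions. It is only a minimal sketch under stated assumptions: the session data, the tool names, and the functions build_cooccurrence and recommend are hypothetical, and it does not describe how Voyeur or any commercial recommendation engine actually works.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_cooccurrence(sessions):
    """Count how often each pair of tools appears in the same session."""
    counts = defaultdict(Counter)
    for tools in sessions:
        # Deduplicate within a session so repeated use counts only once.
        for a, b in combinations(sorted(set(tools)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(counts, tool, k=3):
    """Return up to k tools most often used alongside `tool`."""
    companions = counts.get(tool, Counter())
    # Sort by descending count, then by name so ties are deterministic.
    ranked = sorted(companions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical usage sessions: each list holds the tools one user tried.
sessions = [
    ["word_cloud", "concordance", "frequency_list"],
    ["word_cloud", "frequency_list"],
    ["concordance", "collocates"],
    ["word_cloud", "frequency_list", "collocates"],
]

counts = build_cooccurrence(sessions)
print(recommend(counts, "word_cloud"))
# prints ['frequency_list', 'collocates', 'concordance']
```

Simple co-occurrence counting is used here because it needs nothing beyond a log of which tools were tried together; a fuller engine would presumably also take into account the type of text being analyzed.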
Research Outcomes
JWDTD will produce three types of outcomes. First, there will be new research web sites, like the Badge Bazaar and the recommendation engine site, which will meet a need for research tool discovery. These will build on, and make more accessible, research tools like Voyeur that have been developed for the digital humanities community. The second type of outcome will be posters, conference papers, and peer-reviewed papers, which will be shared through digital humanities, information studies, and computer science venues. Lastly, we will return the results, including information about tools, through other social research sites that provide lists of tools and research about tools. The idea is to seed our results through the web research community, where they will be visible and useful. Example sites include the DiRT (Digital Research Tools) wiki.
We believe the outcomes of this project will be of interest far beyond the humanities computing community. They will be useful to content developers and, importantly, will serve as a guide to just what users do, so that we are ready for large-scale analytics as very large corpora become available.
JWDTD proposes that we can learn now how users think about analytics as we scale up. Anyone developing a research web site with significant content should be interested in how users use analytical tools.