Links for Text Analysis
Tutorials and Documentation
The following are links to useful tutorials and help files for different text analysis tools.
TAPoR Recipes is a collection of tutorials built around tasks. Associated with each one is an Exercise with a walk-through of an actual project using the recipe. For example,
Identify Themes Within a Text shows how you can use text analysis tools to find interesting themes in a text. These were written by Shawn Day.
TAPoR Training includes a video tutorial on using the portal for text analysis and one on using HyperPo. These are an excellent way to see how you can use the TAPoR portal and HyperPo.
OCR and Using Electronic Texts
This is the script for a workshop of the TAPoR McMaster node on scanning texts, optical character recognition, and using electronic texts.
TACTweb Workbook
This is an interactive workbook for the TACTweb environment.
TAPoRLive Instructions
These are the instructions for the TAPoR Live CD.
Face of Text streaming video
Streaming video of selected speakers from the Face of Text conference.
Using TACT is available from the MLA as a free PDF. It is a book that introduces text analysis and how to use TACT.
QuickGuide: TACT is a guide by the Indiana University Library Electronic Text Resource Service (LETRS).
The basics of concording and Method in text-analysis are two introductions by Willard McCarty for his Fundamentals of the digital humanities course.
--
GeoffreyRockwell - 06 May 2005
Lists of Tools
Here are some links to lists of tools.
TAPoR Tools is the list of tools the TAPoR portal knows about. These are web services you can use right from the portal.
Text Analysis Software is a site at the University of Alberta that lists different text tools.
Tools Center is a site with information about tools that is run by the Center for History and New Media.
TAPoRware has a list of tools available.
The Text Analysis Info Page is a site that has information about a number of tools, a useful glossary (explanation of terms), and other related information.
Intute is a "free online service providing you with access to the very best Web resources for education and research." You can search for tools.
DiRT: Digital Research Tools wiki is a great collection managed by a team including Lisa Spiro.
KDnuggets, a site on "Knowledge Discovery" or data mining, has a page with links to Text Analysis, Text Mining, and Information Retrieval Software.
UCSB Toy Chest is an "English Department Knowledge Base at the University of California, Santa Barbara" with reviews.
Computer Content Analysis Programs is a list from the web site The Content Analysis Guidebook Online.
LingPipe Competition is a good list of linguistic tools, both commercial and academic.
--
GeoffreyRockwell - 27 Aug 2006
Projects
Some text analysis projects and lists of projects:
Also, see About Computer Assisted Text Analysis, where there are links to lists of tools.
TADA Projects
The Globalisation Project
The Globalization and Autonomy Online Compendium is a collective publication by the team of leading Canadian and international scholars who are part of the SSHRCC Major Collaborative Research Initiative on Globalization and Autonomy.
The Big See
- To explore the potential for visualization and HPC in the analysis of texts.
- To explore the potential for high-resolution text visualizations. What can we do if we have very high-resolution displays, and how would an interactive text visualization be different in this case?
- To explore the use of animation in text visualization.
TAPoRWare
TAPoRware is a collection of tools that enable users to perform text analysis on XML, HTML, and plain text files over the Web.
Philologic
PhiloLogic™ is the primary full-text search, retrieval and analysis tool developed by the ARTFL Project and the Digital Library Development Center (DLDC) at the University of Chicago.
NORA
The goal of the nora project is to produce software for discovering, visualizing, and exploring significant patterns across large collections of full-text humanities resources in existing digital libraries.
HyperPO
HyperPo is an extensible text reading program. The first-time user can have HyperPo display an electronic text with little or no supplementary information (in a format that would seem comforting and familiar), and progressively add types of textual information deemed potentially interesting or useful.
DUCT
The Electronic Text Centre's (ETC) development of the Digitally Unified Collections of Texts (D.U.C.T.) portal was conducted largely in the spirit of Sue Fisher's "Needs Assessment Report" and its accompanying recommendations for building an infrastructure that brings together TAPoR content and the functions to be performed against it, most notably the managing and searching of texts and the associated interoperability issues.
MashingTexts
This site documents the Mashing Texts project funded by SSHRC through the Research Development Initiatives program. This project is the companion to the Digital Texts 2.0 project. JiTR (Just-in-Time Text Research) is the tentative name of the framework being developed.
The McMaster Museum Roman Coins Collection
The goal of the McMaster Museum of Art Online Roman Coin Collection is to contribute to the growing collection of primary historical and numismatic sources on the world wide web. By giving researchers and students access to the primary source materials in the museum's collection, we hope to encourage the use of modern information technology in the pursuit of classical scholarship. By displaying the collection online, we hope to create more visibility for the collection at the McMaster Museum of Art.
Dictionary of Words in the Wild
What would it be like to live in a world in which there were no written words to be seen, ever? (Willard McCarty)
This project is in its infancy. The idea is a database of photos of words taken in everyday settings. The dictionary has an API so it can support tools that would return, for example, a phrase of word pictures.
TAPoR
The TAPoR portal will be a workbench of text processing tools that users can use on e-texts they bring or find on the Internet.
XText
XTeXT is a text search and retrieval/web application platform provided to the TAPoR project by isagn inc. It is based on search technology developed at the University of Waterloo. It is fast, scalable, extensible and under continuous improvement. Read more about XTeXT or XTeXT workshops.
--
ShawnDay - 14 Apr 2008
Blogs
TADA Meeting Blog is a blog of notes about the original TADA meeting.
Scribblings & Musings is Stéfan Sinclair's blog.
grockwel: Research Notes is Geoffrey Rockwell's blog. It has archives on Text Analysis, Text Technology and TAPoR, and Visualization.
Word Send is kept by Vika Zafrin.
TAPoR General News is the news feed from the TAPoR project.
Companies and Commercial Products
TextAnalyst by Megaputer is a Text Mining package.
--
GeoffreyRockwell - 27 Aug 2006