Document Management
Document Management is a rather nebulous supercategory covering several overlapping product classifications, such as Knowledge Management (KM), Enterprise Content Management (ECM), Content Management Systems (CMS) and Digital Asset Management (DAM). This environmental scan will address this overlap as we seek to connect these products to the various needs identified by the user stories and scenarios.
Measurability
As a first step towards comparing these dissimilar solutions, this document will identify the functions of the subcomponents of these products and briefly describe them. This will help us measure how applicable each product is to the needs identified in the stories and scenarios. We can also identify the sort of questions to ask about both existing open source solutions and our perceived needs.
Questions
| Location | Where will documents be stored? Where will people need to go to access documents? |
| Filing | How will documents be filed? What methods will be used to organize or index the documents to assist in later retrieval? |
| Retrieval | How will documents be found? |
| Security | How will documents be kept secure? How will unauthorized personnel be prevented from reading, modifying or destroying documents? |
| Disaster Recovery | How can documents be recovered in case of destruction or other issues? |
| Retention | How long should documents be retained? There will be a wide variety of documents amassed to support editorial decisions and resulting from the research process. What format do we need them in? How long do they need to be retained? |
| Archiving | How can documents be preserved for future readability? As above. When does support material move to an archival repository? |
| Distribution | How do we make documents available to the people that need them? |
| Workflow | If documents need to pass from one person to another, what are the rules for how their work should flow? |
| Creation | How are documents created? This question becomes important when multiple people need to collaborate, and the logistics of version control and authoring arise. |
| Authentication/Approval | How do we satisfy the requirements for legal submission to government and private industry, demonstrating that the documents are original and meet their standards for authentication? |
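One way to use these questions as a comparison tool is to treat each question area as a row in a checklist and rate every candidate product against it. The following is only a minimal sketch in Python: the three-level rating scale, the class name `Candidate` and the placeholder product name are assumptions made for illustration and are not part of this scan.

```python
from dataclasses import dataclass, field

# The eleven question areas listed in the table above.
QUESTION_AREAS = [
    "Location", "Filing", "Retrieval", "Security", "Disaster Recovery",
    "Retention", "Archiving", "Distribution", "Workflow", "Creation",
    "Authentication/Approval",
]

# Hypothetical rating scale; the scan itself does not define one.
RATING_POINTS = {"no": 0, "partial": 1, "yes": 2}


@dataclass
class Candidate:
    """A product under review, with one rating per question area."""
    name: str
    ratings: dict = field(default_factory=dict)

    def rate(self, area, rating):
        """Record how well the product answers one question area."""
        if area not in QUESTION_AREAS:
            raise ValueError(f"unknown question area: {area}")
        if rating not in RATING_POINTS:
            raise ValueError(f"unknown rating: {rating}")
        self.ratings[area] = rating

    def score(self):
        """Sum the points for every area rated so far."""
        return sum(RATING_POINTS[r] for r in self.ratings.values())

    def unanswered(self):
        """List the question areas that still need investigation."""
        return [a for a in QUESTION_AREAS if a not in self.ratings]


candidate = Candidate("Product A")  # placeholder name, not a real review
candidate.rate("Workflow", "yes")
candidate.rate("Archiving", "partial")
print(candidate.score())            # 3
print(len(candidate.unanswered()))  # 9
```

Keeping the unanswered areas visible is deliberate: a product with a low score may simply not have been investigated fully yet.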
Perceived Orlando/O.Canada Requirements
- desiderata.rtf: An initial outline of document management requirements provided by Susan Brown.
A distillation of this document suggests that the following are the core document management requirements:
The basic proposed object in the O.Canada model is the entry. The entry is really a collection in itself, of things both meta and data. The polished entry itself will become the published object, but research data and descriptive data (including hooks to process and role) should relate to all components of the collection.
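To make the "entry as collection" idea concrete, here is a minimal sketch in Python. It is an assumption about how the model might be represented, not a design taken from desiderata.rtf: the class names `Entry` and `Component`, the `kind` values and the metadata keys are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One item in an entry's collection (hypothetical model)."""
    kind: str                      # e.g. "research_note" or "polished_entry"
    content: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Entry:
    """An entry treated as a collection of components plus shared metadata."""
    title: str
    # Descriptive data, including hooks to process and role, that should
    # relate to every component in the collection.
    descriptive: dict = field(default_factory=dict)
    components: list = field(default_factory=list)

    def add(self, component):
        """Attach a component to the collection and return it."""
        self.components.append(component)
        return component

    def metadata_for(self, component):
        """Merge entry-level descriptive data with the component's own."""
        return {**self.descriptive, **component.metadata}

    def published(self):
        """Return the polished components, i.e. the published objects."""
        return [c for c in self.components if c.kind == "polished_entry"]


entry = Entry("Sample entry", descriptive={"process": "drafting", "role": "researcher"})
note = entry.add(Component("research_note", "Notes from a source", {"source": "archive"}))
entry.add(Component("polished_entry", "<entry>...</entry>"))
print(entry.metadata_for(note))
print(len(entry.published()))  # 1
```

The point of the merge in `metadata_for` is that process and role information is recorded once on the entry yet remains visible from every research note or draft attached to it.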
The existing Orlando system is composed of:
- The current repository is a set of individual files within the file system of a server.
- CheckIn and CheckOut are manually managed via a simple database.
- Document meta resides within the text of the document itself and is appended manually.
- Versioning and rollback are not currently available.
- The XML Editor is a separate, dated application identified as inefficient and buggy.
- Management relies on manually generated reports and extracts generated on demand from the database.
- Roles are managed outside of the digital system.
The basic requirements for O.Canada are:
- A Repository to contain all digital assets
- Allow for the collection of research notes attached to the evolving entry (research collection)
- Provide versioning of all documents within collections
- Process/Workflow Management
- To link Roles (editor, contributor, researcher) to the research and publishing process
- Metadata
- Security
- Roles
  - Contributor
  - Editor
  - Bibliography Checker
  - Researcher
  - End-User
- Dashboard
This may or may not require an API to:
- other Repository
- XML Editor/Viewer
- All documents start from standardized templates that should be managed by the workflow engine
- Also need to be able to search and identify existing name authority for xref
- A series of standard reference sources that a researcher starts with
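Since the existing Orlando system manages CheckIn and CheckOut by hand and has no versioning or rollback, it may help to picture the minimum behaviour the versioning requirement implies. The following Python sketch is illustrative only: the class `VersionedDocument`, its method names and the in-memory storage are assumptions, and none of the products reviewed below necessarily works this way.

```python
class CheckoutError(Exception):
    """Raised when a check-out, check-in or rollback is not allowed."""


class VersionedDocument:
    """A document with exclusive check-out and a full version history."""

    def __init__(self, name, initial_text=""):
        self.name = name
        self._versions = [initial_text]   # version 1 is index 0
        self._checked_out_by = None

    @property
    def latest(self):
        return self._versions[-1]

    def history(self):
        """Return a copy of every stored version, oldest first."""
        return list(self._versions)

    def check_out(self, user):
        """Lock the document for one user and hand back the latest text."""
        if self._checked_out_by is not None:
            raise CheckoutError(f"{self.name} is checked out by {self._checked_out_by}")
        self._checked_out_by = user
        return self.latest

    def check_in(self, user, new_text):
        """Store a new version and release the lock; returns the version number."""
        if self._checked_out_by != user:
            raise CheckoutError(f"{user} has not checked out {self.name}")
        self._versions.append(new_text)
        self._checked_out_by = None
        return len(self._versions)

    def rollback(self, version):
        """Restore an old version by appending it as the newest one."""
        if self._checked_out_by is not None:
            raise CheckoutError(f"{self.name} is checked out by {self._checked_out_by}")
        if not 1 <= version <= len(self._versions):
            raise ValueError(f"no such version: {version}")
        self._versions.append(self._versions[version - 1])
        return len(self._versions)


doc = VersionedDocument("entry.xml", "first draft")
doc.check_out("editor")
doc.check_in("editor", "second draft")
doc.rollback(1)
print(doc.latest)          # first draft
print(len(doc.history()))  # 3
```

Rollback appends rather than deletes, so the history itself is never lost; that choice is mine and would need checking against how each candidate repository handles it.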
Document Lifecycle Management Solutions
This solution is based on the Fedora repository. It offers intriguing possibilities, but I have not been able to discern how well it supports the publishing workflow as clearly as with the other products.
Initially promising. Good XML support, including editing and management of DTDs within the system. Also implements JBoss routines to allow for user-configurable workflows. Includes a dashboard.
"Alfresco have integrated state-of-the-art open source and Java technology such as Spring, Hibernate, Lucene, MyFaces, JSR-168, JSR-170 and web services into a simple-to-use, extensible, Enterprise Content Management (ECM) system. The intelligent repository provides out-of-the-box portal integration and full content control with integrated document management, security, document status and workflow. This allows Alfresco to turn your file system into a simple to use, compliant, auditable repository."
Pros: mature, handles workflow very well using jBPM with JBoss. They use TinyMCE (and herald the embedded HTML editor ;-)... the question is what sort of functionality we can port in.
Cons: weak on document interrelations (i.e. collections), although their spaces metaphor does provide some of this; a little Windows/Office-centric.
Question: how well would it work with Oxygen, for example?
Alfresco is a modern state-of-the-art ECM built using Spring, Hibernate, Lucene and jBPM, based on standards such as JSR-170, JSR-168, Web Services and REST. This allows Alfresco to be deployed in any J2SE 5.0 (JRE 5.0) application server such as Apache Tomcat or JBoss Application Server.
Worst case: I really like the workflow, dashboard and rules definition. It would be useful to play through it and compare it to the existing Orlando system to imagine how that could be improved.
A mature and well-supported open source solution. Supports grouping of data, meta-tags, versioning, and has some nice time saving features for dealing with large volumes of data. I particularly like the ability to have pre-defined data requests per document type - as well as the ‘discussion’. It doesn’t have anything specific for media management - but the system’s flexibility keeps that from being an issue. Per O.Canada requirements, it also has some basic collaboration through checkin/out and ‘work flow management’.
Looks to be a very promising small-scale solution (rapid prototyping). JLibrary allows large volume management, meta-data for groups, and even a relationship manager (separate from meta and categories). What it is missing is the workflow component.
Conclusion: May be useful for getting something going quickly, so we can start gathering feedback and play with some concepts.
I haven't been able to reach their site or related sites for the last couple of weeks. Not a good sign, although I had noted interesting things about the project.
Strong dashboard and process defined by task/user. Roles are strong, almost verging on project management with a repository behind it.
Very focussed on international support. Workflow again isn't engineered in, but remains open to play. Focus is on versioning and access control.
One of the main strengths here is the collaborative nature of the process. Takes discussion into the meta of a document. Strong repository, with less focus on management than on quick and ready access and strong, flexible search. RESTful.
Workflow seems to be adaptable. Seems a little more robust and multi-purpose than some of the other solutions. Collections of related documents (or hierarchy) are again not a strong point; my sense is that the main drivers of DCMS, the medical and compliance industries, find this less of a need. CPS is also a little more focussed on the end user, and not as much on document creation, management and publishing.
Raw Repositories with Appropriate APIs
We have been discussing whether to move JiTR over to Fedora as it has various features. James has found a project, eSciDoc, that has created a workflow management system with Fedora.
Solid foundation and potential, but it is purely foundational and provides nothing 'right out of the box'; I am not saying this is a bad thing.
Exploring the XML Side
Collected Considerations
Without letting APIs get ahead of the analysis, I am trying to stay repository-agnostic at this stage. We therefore need to consider repository APIs.
--
ShawnDay - 4 Jan 2008