Documentum Primer
Written by Frank Jacquette
Wednesday, 05 March 2008 17:04
Throughout the pharmaceutical and life sciences world Documentum has firmly entrenched itself as the document management heavyweight. But what is it, and what is all this "document management" stuff anyway?
Simply put, document management is the art of using computers and software to store, control, modify, and publish electronic documents. Electronic documents can be plain files, such as text files, word processing files, or spreadsheet files. They can also be scanned images of actual paper documents; dealing with scanned images is an entire discipline unto itself, usually called imaging.
The need for document management arose in the early 1990s as organizations began to store more and more of their documentation on computers. Pharmaceutical companies, in particular, wanted to automate the process of creating a New Drug Application (NDA); a typical NDA weighs in at many thousands of pages, and is created by dozens or hundreds of people over several years.
A document management system like Documentum provides several important features, such as file format neutrality, centralized storage and check-in/check-out, security, document attributes, full-text indexing, versioning, renditions, and publishing of aggregate documents.

File format neutrality
Electronic documents come in many different file formats: files are created using different applications (like Microsoft Word or Microsoft Excel), on different platforms (like Microsoft Windows, Apple Macintosh, or UNIX), and in different versions of each application. A document management system happily stores documents regardless of their source and with little regard to their internal contents.

Centralized storage and check-in/check-out
Documentum (and most industrial-strength document management applications) is server-based, which means that it stores files on a central computer called a server rather than on each individual user's computer. In addition, the document management system ensures that only one person can edit a document at a time by enforcing a check-in/check-out system. Only one person can have a document checked out at a time, and until that person checks the document back in, other users can view, but not edit, the document. In conjunction with security and versioning, this is a powerful capability. The centralized storage location is frequently called a document repository.

Security
The document management system enforces security rules on each document. These rules are typically defined by the system administrator, and specify who may create, read, write, version, delete, or otherwise control individual documents. Users are required to log in with a user name and password, and user operations are typically recorded in an audit log. This creates a trail of accountability associated with each document while also preventing the accidental or intentional deletion of important documents.
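To make the check-in/check-out and security ideas concrete, here is a minimal in-memory sketch in Python. It is a toy illustration only and is not Documentum's API: the `Repository` and `Document` classes, the `writers` permission set, and the audit log layout are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Document:
    name: str
    content: bytes
    locked_by: Optional[str] = None  # user currently holding the check-out


@dataclass
class Repository:
    """Hypothetical central store; a real system would persist all of this."""
    writers: set[str]  # users who are allowed to edit documents
    documents: dict[str, Document] = field(default_factory=dict)
    audit_log: list[tuple[str, str, str]] = field(default_factory=list)

    def _record(self, user: str, action: str, name: str) -> None:
        # Every successful operation leaves a trail of accountability.
        self.audit_log.append((user, action, name))

    def view(self, user: str, name: str) -> bytes:
        # Viewing is allowed even while someone else holds the check-out.
        self._record(user, "view", name)
        return self.documents[name].content

    def check_out(self, user: str, name: str) -> bytes:
        doc = self.documents[name]
        if user not in self.writers:
            raise PermissionError(f"{user} may not edit {name}")
        if doc.locked_by is not None:
            raise RuntimeError(f"{name} is checked out by {doc.locked_by}")
        doc.locked_by = user
        self._record(user, "check_out", name)
        return doc.content

    def check_in(self, user: str, name: str, new_content: bytes) -> None:
        doc = self.documents[name]
        if doc.locked_by != user:
            raise RuntimeError(f"{user} does not hold the check-out for {name}")
        doc.content = new_content
        doc.locked_by = None  # release the lock so others can edit
        self._record(user, "check_in", name)
```

In this sketch a second `check_out` on the same document raises an error until the first user calls `check_in`, while `view` keeps working for everyone. That is the behavior described above, reduced to its simplest form.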
Document attributes
In addition to the information kept within the document itself, there are other pieces of information associated with the document, such as title, author, and creation date. Documentum calls these bits of data attributes; some other systems call them metadata. The document management system maintains these attributes for you, and can typically be extended to record custom attributes specific to your business.

Full-text indexing
A document management system provides the ability to search across the documents stored in the system, even though those documents are stored in diverse file formats.

Versioning
When you edit a file on your computer and save it, the new copy usually overwrites the old one, and the previous version is lost forever. Document management systems don't overwrite old versions; instead, they version the document and store the new copy alongside the old one. To the user, it looks like a single document, but if necessary the system enables the user to go back and retrieve older versions. In conjunction with the check-in/check-out system and the security provided by the system, this is a powerful way to record the entire lifecycle of a document.

Renditions
Renditions are copies of an original document stored in the system, but rendered in a different file format. This is most useful in conjunction with the publishing of collections of documents. As an example, consider a system where you have several Word documents, an Excel spreadsheet, dozens of text documents, and some scanned images of paper documents. Ordinarily these diverse file formats would not happily coexist in a single aggregate document, but we can coerce them into doing so by creating renditions of each in a single common file format; Adobe's Portable Document Format (PDF) is the most popular choice. The rendition happily coexists with the original file format, and is only used when needed for a purpose like publishing.
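The relationship between attributes, versions, and renditions can be sketched as a small data model. This is an illustration under stated assumptions, not how Documentum stores anything internally: the class names, the "1.N" version labels, and the format strings such as "msword" and "pdf" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Version:
    label: str        # illustrative label such as "1.0" or "1.1"
    content: bytes    # the file in its original (native) format
    file_format: str  # hypothetical format name, e.g. "msword"
    # Renditions sit alongside the original, keyed by format name.
    renditions: dict[str, bytes] = field(default_factory=dict)


@dataclass
class VersionedDocument:
    attributes: dict[str, str]  # title, author, and any custom attributes
    versions: list[Version] = field(default_factory=list)

    def check_in(self, content: bytes, file_format: str) -> str:
        # Append a new version instead of overwriting the previous one.
        label = f"1.{len(self.versions)}"
        self.versions.append(Version(label, content, file_format))
        return label

    @property
    def current(self) -> Version:
        # To the user the document looks like a single item: the latest version.
        return self.versions[-1]

    def get(self, label: str) -> Version:
        # Older versions remain retrievable by label.
        for version in self.versions:
            if version.label == label:
                return version
        raise KeyError(label)

    def add_rendition(self, label: str, file_format: str, data: bytes) -> None:
        # The rendition is stored next to the original and does not replace it.
        self.get(label).renditions[file_format] = data
```

A caller would check in a draft, check in a revision, and then attach a PDF rendition to the revision. The draft stays available through `get("1.0")`, and the native content of the revision is unchanged by the rendition.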
Publishing aggregate documents
An aggregate document is one that is composed of many separate files. The term Documentum prefers is virtual document; in the early days they were referred to as compound documents. An NDA is a perfect example of an aggregate document: it may include Word documents developed by the pharmaceutical company, Excel spreadsheets containing data, scanned case report forms from clinical trials, and even esoteric files such as chemical structure diagrams or CAD drawings of manufacturing equipment. The goal is to produce a single large document in a single format, and this is where the document management system becomes an indispensable tool.

Publishing consists of creating renditions for all of the aggregate document's components in a single common file format, such as PDF, and then joining those files into one large file. Publishing is a sub-field all to itself, and utilities such as Liquent's CoreDossier facilitate the document assembly and rendering process. A true publishing engine can repaginate the assembled document, apply common headers and footers, develop comprehensive tables of contents, and perform other functions that help make the final document look as if it were created in one long seamless process.

So how did Documentum become the dominant document management software on the market, especially within pharmaceutical companies? Documentum started life as a software project in Xerox's Palo Alto Research Center (PARC) and was spun off as a separate company in 1990. Having identified the pharmaceutical regulatory process as its first target market, the company quickly established itself as first to market with an industrial-strength system. Documentum's success derives from three important facts:
The last is Documentum's greatest strength, but it also contributes to the significant confusion surrounding Documentum and just what it is. Documentum software actually consists of several components, many of which may be hidden from the user:
If you're using a Documentum system today, it could be built using any one of the client components (WorkSpace, RightSite, Desktop Client, or WDK; there aren't too many DocuWorks, Accelera, ViewSpace, or SmartSpace users left out there) or could even be built entirely from scratch with a custom application on the desktop talking directly to the Documentum server. Clearly a large group of systems are Documentum-based, but it may not be at all obvious which parts are basic out-of-the-box software and which are customizations. For most users the distinction is unimportant so long as the system makes their jobs easier.