Documentum Primer
Written by Frank Jacquette   
Wednesday, 05 March 2008 17:04

Throughout the pharmaceutical and life sciences world Documentum has firmly entrenched itself as the document management heavyweight. But what is it, and what is all this "document management" stuff anyway?

Simply put, document management is the art of using computers and software to store, control, modify, and publish electronic documents. Electronic documents can be plain files, such as text files, word processing files, or spreadsheet files. Electronic documents can also be scanned images of actual paper documents (dealing with scanned images is an entire discipline unto itself, and is usually called imaging.)

The need for document management arose in the early 1990s as organizations began to store more and more of their documentation on computers. Pharmaceutical companies, in particular, wanted to automate the process of creating a New Drug Application (NDA); a typical NDA weighs in at many thousands of pages, and is created by dozens or hundreds of people over several years.

A document management system like Documentum provides several important features, such as file format neutrality, centralized storage and check-in/check-out, security, document attributes, full-text indexing, versioning, renditions, and publishing of aggregate documents.

File format neutrality

Electronic documents come in many different file formats: files are created using different applications (like Microsoft Word or Microsoft Excel), on different platforms (like Microsoft Windows, Apple Macintosh, or UNIX), and in different versions of each application. A document management system happily stores documents regardless of their source and with little regard to their internal contents.
 
Centralized storage and check-in/check-out

Documentum (and most industrial-strength document management applications) is server-based, which means that it stores files on a central computer called a server rather than on each individual user's computer. In addition, the document management system ensures that only one person can edit the document at a time by enforcing a check-in/check-out system. Only one person can have a document checked out at a time, and until that person checks the document back in, other users can view, but not edit, the document. In conjunction with security and versioning, this is a powerful capability. The centralized storage location is frequently called a document repository.
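To make the check-in/check-out rule concrete, here is a minimal sketch in Python. It is not Documentum's API: the Repository class, its method names, and the in-memory storage are all hypothetical, chosen only to show the one-editor-at-a-time behavior described above.

```python
class CheckedOutError(Exception):
    """Raised when a checkout or check-in violates the one-editor rule."""


class Repository:
    """Toy in-memory repository (hypothetical; a real server persists centrally)."""

    def __init__(self):
        self._content = {}  # document name -> current content
        self._locks = {}    # document name -> user holding the checkout

    def add(self, name, content):
        self._content[name] = content

    def view(self, name):
        # Viewing never requires a checkout.
        return self._content[name]

    def check_out(self, name, user):
        holder = self._locks.get(name)
        if holder is not None and holder != user:
            raise CheckedOutError(f"{name} is checked out by {holder}")
        self._locks[name] = user
        return self._content[name]

    def check_in(self, name, user, new_content):
        if self._locks.get(name) != user:
            raise CheckedOutError(f"{user} does not hold the checkout for {name}")
        self._content[name] = new_content
        del self._locks[name]  # release the document for other editors
```

While one user holds the checkout, a second call to check_out by anyone else raises CheckedOutError, but view continues to work for everyone.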
 
Security

The document management system enforces security rules on each document. These rules are typically defined by the system administrator, and define who may create, read, write, version, delete, or otherwise control individual documents. Users are required to log in with a user name and password, and user operations are typically recorded in an audit log. This creates a trail of accountability associated with each document while also preventing the accidental or intentional deletion of important documents.
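As a rough illustration of per-document security rules combined with an audit log, consider the sketch below. The SecuredDocument class, the operation names, and the log layout are assumptions made for this example and do not reflect how Documentum actually stores its permissions.

```python
from datetime import datetime, timezone


class SecuredDocument:
    """Hypothetical document carrying its own permissions and audit log."""

    def __init__(self, name, permissions):
        self.name = name
        self.permissions = permissions  # user -> set of allowed operations
        self.audit_log = []             # (timestamp, user, operation, allowed)

    def perform(self, user, operation):
        allowed = operation in self.permissions.get(user, set())
        # Record every attempt, including refused ones, for accountability.
        self.audit_log.append(
            (datetime.now(timezone.utc), user, operation, allowed)
        )
        if not allowed:
            raise PermissionError(f"{user} may not {operation} {self.name}")
        return True
```

Logging the attempt before raising means that refused operations leave a trace as well, which is the point of an accountability trail.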
 
Document attributes

In addition to the information kept within the document itself, there are other pieces of information associated with the document, such as title, author, and creation date. Documentum calls these bits of data attributes; some other systems call these bits metadata. The document management system maintains these attributes for you, and can typically be extended to record custom attributes specific to your business.
 
Full-text Indexing

A document management system provides the ability to search across the documents stored in the system, despite the fact that those documents are stored in diverse file formats.
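One common way to make such searches fast is an inverted index, which maps each word to the documents that contain it. The sketch below is only illustrative: it assumes the text has already been extracted from each file format, and the function names build_index and search are hypothetical rather than part of any Documentum interface.

```python
import re
from collections import defaultdict


def build_index(texts):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in texts.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names of documents containing every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

Because the index works on extracted text, a Word file, a spreadsheet, and a text file can all be searched the same way once their contents have been pulled out.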
 
Versioning

When you edit a file on your computer and save it, it usually overwrites the old version, causing that previous version to be lost forever. Document management systems don't overwrite old versions; instead, they version the document and store the new copy alongside the old one. To the user, it looks like a single document, but if necessary the system enables the user to go back and retrieve older versions. In conjunction with the check-in/check-out system and the security provided by the system, this is a powerful means of recording the entire lifecycle of a document.
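The idea of keeping every saved copy can be sketched in a few lines of Python. The VersionedDocument class and its simple 1-based version numbers are hypothetical, and a real system's version labels may look quite different.

```python
class VersionedDocument:
    """Hypothetical document that keeps every saved copy instead of overwriting."""

    def __init__(self, name, content):
        self.name = name
        self._versions = [content]  # oldest first

    @property
    def current(self):
        # To the user the document looks like a single item: the latest copy.
        return self._versions[-1]

    def save(self, content):
        """Store a new copy alongside the old ones and return its version number."""
        self._versions.append(content)
        return len(self._versions)

    def version(self, number):
        """Retrieve an older copy by its 1-based version number."""
        if not 1 <= number <= len(self._versions):
            raise IndexError(f"{self.name} has no version {number}")
        return self._versions[number - 1]
```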
 
Renditions


Renditions are copies of an original document stored in the system, but rendered in a different file format. This is most useful in conjunction with the publishing of collections of documents. As an example, consider a system where you have several Word documents, an Excel spreadsheet, dozens of text documents, and some scanned images of paper documents. Ordinarily these diverse file formats would not happily co-exist in a single aggregate document, but we can coerce them into doing so by creating renditions of each in a single common file format; Adobe's Portable Document Format (PDF) is the most popular choice. The rendition is stored alongside the original file, and is only used when needed for a purpose like publishing.
 
Publishing aggregate documents

An aggregate document is one that is composed of many separate files (the term Documentum prefers is virtual document, and in the early days they were referred to as compound documents.) An NDA is a perfect example of an aggregate document: it may include Word documents developed by the pharmaceutical company, Excel spreadsheets containing data, scanned case report forms from clinical trials, and even esoteric files such as chemical structure diagrams or CAD drawings of manufacturing equipment. The goal is to produce a single large document in a single format, and this is where the document management system becomes an indispensable tool. Publishing consists of creating renditions for all of the aggregate document's components in a single common file format, such as PDF, and then joining those files into one large file.
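The publishing step can be pictured as gathering one rendition per component, in order, ready to be joined. The sketch below stops short of actually merging PDF files, which would require a PDF library; the Component class and publish function are hypothetical names used only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Component:
    """One file in an aggregate document (hypothetical structure)."""
    name: str
    source_format: str
    renditions: dict = field(default_factory=dict)  # format -> rendered bytes


def publish(components, target_format="pdf"):
    """Collect each component's rendition in the common format, in order.

    Returns (name, rendered bytes) pairs ready to be joined into one file.
    """
    missing = [c.name for c in components if target_format not in c.renditions]
    if missing:
        raise ValueError(f"no {target_format} rendition for: {', '.join(missing)}")
    return [(c.name, c.renditions[target_format]) for c in components]
```

Refusing to publish when any rendition is missing reflects the requirement that every component be available in the single common format before the pieces are joined.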

Publishing is a sub-field all to itself, and utilities such as Liquent's CoreDossier facilitate the document assembly and rendering process. A true publishing engine can repaginate the assembled document, apply common headers and footers, develop comprehensive tables of contents, and perform other functions that help make the final document look as if it was created in one long seamless process.

So how did Documentum become the dominant document management software on the market, especially within pharmaceutical companies?

Documentum started life as a software project in Xerox's Palo Alto Research Center (PARC) and was spun off as a separate company in 1990. Having identified the pharmaceutical regulatory process as its first target market, the company quickly established itself as first to market with an industrial strength system.

Documentum's success derives from three important facts:
  • The software was built upon industry-wide standards. Attribute data was stored in a relational database like Oracle or Informix (and later Sybase and SQL Server) rather than in a proprietary, Documentum-only format. Documentum partnered with Adobe early to support PDF documents in Documentum. Combined with its file format neutrality, this made Documentum a popular choice in a world that still had significant diversity in platforms and file formats.
  • The system was truly industrial-strength. At a time when most desktop PCs still had 640 K of main memory and hard drives measured in tens of megabytes, Documentum was designed to accommodate millions of documents. As computing power grew, Documentum kept pace.
  • The software was expandable and customizable. Recognizing that any off-the-shelf solution would be unable to handle the business environment of every customer, Documentum enabled and encouraged customization of their software. This created a group of consultants and vendors who specialized in the development of Documentum customizations, and those vendors in turn promoted Documentum and encouraged its use while simultaneously expanding its capabilities.

The last is Documentum's greatest strength, but it also contributes to the significant confusion surrounding Documentum and just what it is. Documentum software actually consists of several components, many of which may be hidden from the user:

  • The Documentum server is the heart of the system, and although it has been enhanced and modified over the years it is still recognizable as the same product that Documentum first started with in the early 1990s. Over the years Documentum (the company) has marketed the server as the DocPage Server, EDMS 98, Documentum 4i, and other names. It runs on a centralized server system.
  • DocuWorks was Documentum's first attempt at a desktop client: the software the end user uses to talk to the server. Somewhat underpowered, it was replaced by WorkSpace.
  • WorkSpace was Documentum's mainstay on the desktop for several years, and is still in use in many places despite no longer being supported by the company. It provides a friendly cabinet/folder view and is easily customized. Most custom applications developed for Documentum prior to 1999 build upon WorkSpace. WorkSpace was available for Microsoft Windows and Macintosh platforms.
  • Accelera was Documentum's first foray into the world of web-based clients. Simple and straightforward, it was popular but somewhat underpowered. It was replaced by the disastrous first version of RightSite.
  • RightSite was Documentum's second venture on the web, and it attempted to reproduce most of the functionality of WorkSpace in a web-based environment. RightSite had two faces that were presented to the user: ViewSpace, which was a read-only view, and SmartSpace, which was closer to WorkSpace but somewhat slimmed down. To further add to the confusion, Documentum simultaneously introduced a client/server version of SmartSpace which looked a lot like WorkSpace. The initial version of RightSite was over-ambitious and poorly architected considering the state of web technology at the time.
  • The second release of RightSite was almost a complete re-write. More stable than its predecessor, it still wasn't as easy to customize as WorkSpace.
  • The Documentum Desktop Client was the first major change in Documentum's desktop strategy since the introduction of WorkSpace. The Desktop Client provided a Documentum interface that looked and felt like the Microsoft Windows File Manager. Although this provided a smoother integration with the Windows look and feel, customizations previously developed for WorkSpace didn't work with the Desktop Client. The Desktop Client is still the primary desktop product, and WorkSpace has been declared obsolete.
  • In 2000, Documentum announced that RightSite would be phased out in favor of the Documentum Web Development Kit (WDK.) The WDK enables developers to build web-based user interfaces to the Documentum server more quickly than was possible with RightSite.

If you're using a Documentum system today, it could be built using any one of the client components (WorkSpace, RightSite, Desktop Client, or WDK; there aren't too many DocuWorks, Accelera, ViewSpace, or SmartSpace users left out there) or could even be built entirely from scratch with a custom application on the desktop talking directly to the Documentum server.

Clearly a large group of systems are Documentum-based, but it may not be at all obvious which parts are basic out-of-the-box software and which are customizations. For most users the distinction is unimportant so long as the system makes their jobs easier.
 