Eradication of Illiteracy

What talented team had defined as “the information to retrieve” has a special name: projection, and, for most relational databases, names of attributes, presented in projection, are case-insensitive.

It seems that some members of talented team think that they are smart enough to read this blog and make some conclusions about security:

API> ?,c,select user_password from dm_user where user_name=USER
user_password   
----------------
****************
(1 row affected)

API> ?,c,select * from (select user_password from dm_user where user_name=USER)
user_password   
----------------
****************
(1 row affected)

But all their attempts are doomed to failure:

API> ?,c,select USER_PASSWORD from dm_user where user_name=USER
user_password
-----------------------------------------------------------------------
AAAAEAjkr5it6wBqYfLetO/ob9j+75axyTIlb6WpnS8vLcP58ppmenSigXCm4pT1Q3nG ...

API> readquery,c,select * from (select * from dm_user where user_name=USER)
...
q0
API> next,c,q0
...
OK
API> get,c,q0,user_password
...
AAAAEAjkr5it6wBqYfLetO/ob9j+75axyTIlb6WpnS8vLcP58ppmenSigXCm4pT1Q3nGK ...

Surprise: Documentum loses data

I’m not sure whether it is a common practice, but all Documentum projects I have participated in, have the following requirement: every document, created in repository, must have unique human-readable identifier, which business users may refer in their daily activities, extra requirements could be:

  • identifiers are sequential
  • there should no holes in sequence
  • sequence starts from the beginning on every new day/week/month/year

and, depending on the requirements, implementation could be either optimistic:

private int getAnnualNextNumber(ICounter counter)
        throws DfException {
    DfException resultException = null;
    for (int i = 0; i < 10; i++) {
        try {
            synchronized (CounterService.class) {
                int lastNumber = counter.getLastNumber() + 1;
                counter.setLastNumber(lastNumber);
                counter.save();
                return lastNumber;
            }
        } catch (DfException dfe) {
            resultException = dfe;
            // neither fetch nor revert is required because DFC
            // resets object's state in
            // com.documentum.fc.client.DfSysObject#doSave
            // counter.fetch(null);
        }
    }
    throw resultException;
}

or pessimistic:

private int getAnnualNextNumber(ICounter counter)
    throws DfException {
    synchronized (CounterService.class) {
        counter.lock();
        int lastNumber = counter.getLastNumber() + 1;
        counter.setLastNumber(lastNumber);
        counter.save();
        return lastNumber;
    }
}

Interesting thing here is in case of optimistic implementation we observe that our code generates duplicates, from audit perspective it looks like CS does not receive updated value from DFC:

Q & A. XVI

I will work, maybe, in a D2 implementation project that could be released in a public site. I do not have updated information regarding D2 4.7 security holes: I need an independent point of view and you are probably the only person that has a clear understanding of what I am talking. Can you help me to understand what has not yet been fixed just in the D2 layer?

Current D2 security status: any authenticated user may gain superuser privileges šŸ™‚

Unicode support

Yesterday I discovered a funny blogpost about unicode support in Documentum (have no idea why it is named “DOCUMENTUM PROBLEMS AND HOW TO FIX THEM: #1” if it does not contain any solution), and now I would like to share my vision on the problem.

It is not clear why that blogpost is referring to “CS-49851 ā€“ “Server does not recognize a UTF-8 enabled database and unnecessarily errors on attribute length””, because I have seen other related CRs dated by 2005 or so, however I can explain why OpenText will never implement a proper unicode support in Documentum.

At current moment Documentum supports four database engines:

  • MSSQL
  • Oracle
  • DB2
  • PostgreSQL

What do you think, which database is the most problematic from unicode perspective? To answer this question we must understand what does “varchar(n)” mean for every database:

database data type description
MSSQL varchar(n) Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are charvarying or charactervarying.
nvarchar(n) Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying
Oracle varchar2(n) The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column. For each row, Oracle Database stores each value in the column as a variable-length field unless a value exceeds the column’s maximum length, in which case Oracle Database returns an error. Using VARCHAR2 and VARCHAR saves on space used by the table.
DB2 varchar(n) Varying-length character strings with a maximum length of n bytes. n must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 32704.
vargraphic(n) Varying-length graphic strings. The maximum length, n, must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 16352.
PostgreSQL varchar(n) SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string.

So, in order to implement proper unicode support Documentum must:

  • Do nothing for PostgreSQL
  • Change string semantics from byte to character in case of Oracle (i.e. alter table dm_ysobject_s modify (object_name varchar2(255 char)))
  • Change string datatype from varchar to vargraphic in case of DB2 (I believe something like ALTER TABLE DM_SYSOBJECT_S ALTER COLUMN OBJECT_NAME SET DATA TYPE VARGRAPHIC(255), though I’m not sure it will work)
  • Discontinue support of MSSQL because this database wrongly assumes that the maximum length of any UTF-8 character is 2 bytes (compare: Ć© (C3 A9) and eĢ (65 CC 81))

So, it is clear that it is not possible to implement proper unicode support in case of MSSQL, so OpenText will do nothing because otherwise Documentum will behave differently on different databases.

Cooking composer

Well, no doubts that Documentum Composer is an evil, and even vendor fails to maintain it, nevertheless it is a kind of evil we need to deal with – I do know how to create a lot of Documentum artefacts using API and DQL only, but I have no idea how to install process templates, moreover, somewhen in 2009 there was a hope, that EMC would create a robust technique to transfer process templates between repositories, but that was just a hope – current support of XPDL does not allow to transfer workflow templates between Documentum repositories :(, so, I decided to share my composer-related experience.

First of all, lets define objectives we pursue when dealing with composer, in my opinion there are following goals:

  • we must store all composer-related stuff in version control system
  • the build process must be fully automated, be a part of SDLC and support CI/CD practices
  • the deployment phase should not take a lot of time

Storing composer project in VCS

The first steps you need to perform after creating composer project are (actually, this was not obvious to me 6 years ago, because I was not experienced eclipse user):

  • export project to the filesystem folder backed by VCS
  • remove project from workspace
  • import project into workspace

Automating build process and shortening deployment phase

This parts are already challenging. When EMC developed composer (actually it is just Eclipse plugin), they did really think that developers would use it as IDE (how wrong they were), and because of that Documentum composer lacks some vital functionality. If you are not familiar with evolution of build automation tools, below is gist:

  • in 1976 Stuart Feldman created make, before him developers used shell-scripts to build their software
  • in 2000 James Duncan Davidson released first public version of Apache Ant
  • in 2002 Takari’s Jason van Zyl created Apache Maven
  • and in 2007 Hans Dockter and Adam Murdoch released a first version of gradle

Actually, gradle is my personal choice, but I also do not experience any difficulties with both maven and ant, however, talented team still thinks that shell-scripts is a good option:

(Un)fortunately it is not an option for me, so, I wrote a simple eclipse plugin, which allows somehow automate composer tasks, for example, you may write following ant build file:

<?xml version="1.0"?>

<project name="myproject" default="all">

    <macrodef name="copy.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="true" replace="true" 
            />
        </sequential>
    </macrodef>

    <macrodef name="import.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="false" 
            />
        </sequential>
    </macrodef>

    <macrodef name="copy.dar">
        <attribute name="project" />
        <sequential>
            <mkdir dir="${output.dir}" />
            <pro.copyDar project="@{project}" todir="${output.dir}" />
        </sequential>
    </macrodef>

    <target name="create-workspace" description="Create local composer workspace">
        <import.project project="MyDocumentumProject" />
    </target>

    <target name="create-build-workspace" description="Create build composer workspace">
        <copy.project project="MyDocumentumProject" />
    </target>

    <target name="importcontent" description="Import content">
        <pro.importContents file="${basedir}/importcontents.txt" />
    </target>

    <target name="build-workspace" description="build eclipse project">
        <eclipse.incrementalBuild kind="full" />
    </target>

    <target name="clean-workspace" description="clean eclipse project">
        <eclipse.incrementalBuild kind="clean" />
    </target>

    <target name="copy">
        <copy.dar project="MyDocumentumProject" />
    </target>

    <target name="setoptions" description="Set upgrade options">
        <pro.setUpgradeOptions file="${basedir}/upgradeoptions.txt" />
    </target>

    <target name="all" depends="create-build-workspace, importcontent, setoptions, build-workspace, copy" />

</project>

and call ant from either ant or maven (via exec-maven-plugin), or gradle (via JavaExec or ComposerExec).