Unicode support

Yesterday I discovered a funny blog post about Unicode support in Documentum (I have no idea why it is titled “DOCUMENTUM PROBLEMS AND HOW TO FIX THEM: #1” when it does not contain any solution), and now I would like to share my view of the problem.

It is not clear why that blog post refers to “CS-49851 – “Server does not recognize a UTF-8 enabled database and unnecessarily errors on attribute length””, because I have seen other related CRs dating from 2005 or so. However, I can explain why OpenText will never implement proper Unicode support in Documentum.

At the moment Documentum supports four database engines:

  • MSSQL
  • Oracle
  • DB2
  • PostgreSQL

Which database do you think is the most problematic from a Unicode perspective? To answer this question we must understand what “varchar(n)” means in each database:

| Database | Data type | Description |
|---|---|---|
| MSSQL | varchar(n) | Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are char varying or character varying. |
| MSSQL | nvarchar(n) | Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying. |
| Oracle | varchar2(n) | The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column. For each row, Oracle Database stores each value in the column as a variable-length field unless a value exceeds the column’s maximum length, in which case Oracle Database returns an error. Using VARCHAR2 and VARCHAR saves on space used by the table. |
| DB2 | varchar(n) | Varying-length character strings with a maximum length of n bytes. n must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 32704. |
| DB2 | vargraphic(n) | Varying-length graphic strings. The maximum length, n, must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 16352. |
| PostgreSQL | varchar(n) | SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string. |

So, in order to implement proper Unicode support, Documentum must:

  • Do nothing for PostgreSQL
  • Change string semantics from byte to character in the case of Oracle (i.e. alter table dm_sysobject_s modify (object_name varchar2(255 char)))
  • Change the string datatype from varchar to vargraphic in the case of DB2 (I believe something like ALTER TABLE DM_SYSOBJECT_S ALTER COLUMN OBJECT_NAME SET DATA TYPE VARGRAPHIC(255), though I’m not sure it will work)
  • Discontinue support for MSSQL, because this database wrongly assumes that any Unicode character fits in 2 bytes (compare: precomposed é, U+00E9, UTF-8 bytes C3 A9, and decomposed é, U+0065 U+0301, UTF-8 bytes 65 CC 81)

So it is clear that proper Unicode support cannot be implemented for MSSQL, and therefore OpenText will do nothing, because otherwise Documentum would behave differently on different databases.
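
The difference between counting bytes, 2-byte units and characters can be checked with a few lines of Java. This is only a minimal sketch: the class and method names (VarcharLengthDemo, utf8Bytes, utf16Bytes, codePoints) are made up for illustration and have nothing to do with Documentum or any database driver. It compares the precomposed é (U+00E9) with its decomposed form (U+0065 U+0301), which look identical on screen:

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

// Hypothetical demo class: shows how the "length" of the same visible
// character depends on what is being counted.
final class VarcharLengthDemo {

    // Bytes needed to store the string as UTF-8 (what byte semantics counts).
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Bytes needed to store the string as UTF-16 without a byte order mark.
    static int utf16Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    // Number of Unicode code points in the string.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String composed = "\u00E9"; // é as a single code point
        // NFD normalization splits it into 'e' + combining acute accent.
        String decomposed = Normalizer.normalize(composed, Normalizer.Form.NFD);

        for (String s : new String[] {composed, decomposed}) {
            System.out.printf("utf8=%d bytes, utf16=%d bytes, code points=%d%n",
                    utf8Bytes(s), utf16Bytes(s), codePoints(s));
        }
        // prints:
        // utf8=2 bytes, utf16=2 bytes, code points=1
        // utf8=3 bytes, utf16=4 bytes, code points=2
    }
}
```

So a column that looks wide enough for a given number of "characters" may or may not hold a given string, depending on whether the database counts bytes, 2-byte units or characters, and on how the client normalized the text.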

Cooking composer

Well, there is no doubt that Documentum Composer is evil, and even the vendor fails to maintain it. Nevertheless, it is the kind of evil we need to deal with: I do know how to create a lot of Documentum artefacts using API and DQL only, but I have no idea how to install process templates. Moreover, sometime in 2009 there was a hope that EMC would create a robust technique for transferring process templates between repositories, but that was just a hope: the current XPDL support does not allow transferring workflow templates between Documentum repositories :( So I decided to share my Composer-related experience.

First of all, let’s define the objectives we pursue when dealing with Composer. In my opinion, the goals are the following:

  • we must store all Composer-related stuff in a version control system
  • the build process must be fully automated, be a part of the SDLC and support CI/CD practices
  • the deployment phase should not take a lot of time

Storing composer project in VCS

The first steps you need to perform after creating a Composer project are the following (actually, this was not obvious to me 6 years ago, because I was not an experienced Eclipse user):

  • export the project to a filesystem folder backed by the VCS
  • remove the project from the workspace
  • import the project back into the workspace

Automating build process and shortening deployment phase

These parts are already challenging. When EMC developed Composer (actually, it is just an Eclipse plugin), they really did think that developers would use it as an IDE (how wrong they were), and because of that Documentum Composer lacks some vital functionality. If you are not familiar with the evolution of build automation tools, below is the gist:

  • in 1976 Stuart Feldman created make; before that, developers used shell scripts to build their software
  • in 2000 James Duncan Davidson released the first public version of Apache Ant
  • in 2002 Takari’s Jason van Zyl created Apache Maven
  • and in 2007 Hans Dockter and Adam Murdoch released the first version of Gradle

Actually, Gradle is my personal choice, but I do not experience any difficulties with either Maven or Ant. However, the talented team still thinks that shell scripts are a good option.

(Un)fortunately it is not an option for me, so I wrote a simple Eclipse plugin that allows Composer tasks to be automated to some degree. For example, you may write the following Ant build file:

<?xml version="1.0"?>

<project name="myproject" default="all">

    <macrodef name="copy.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="true" replace="true" 
            />
        </sequential>
    </macrodef>

    <macrodef name="import.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="false" 
            />
        </sequential>
    </macrodef>

    <macrodef name="copy.dar">
        <attribute name="project" />
        <sequential>
            <mkdir dir="${output.dir}" />
            <pro.copyDar project="@{project}" todir="${output.dir}" />
        </sequential>
    </macrodef>

    <target name="create-workspace" description="Create local composer workspace">
        <import.project project="MyDocumentumProject" />
    </target>

    <target name="create-build-workspace" description="Create build composer workspace">
        <copy.project project="MyDocumentumProject" />
    </target>

    <target name="importcontent" description="Import content">
        <pro.importContents file="${basedir}/importcontents.txt" />
    </target>

    <target name="build-workspace" description="build eclipse project">
        <eclipse.incrementalBuild kind="full" />
    </target>

    <target name="clean-workspace" description="clean eclipse project">
        <eclipse.incrementalBuild kind="clean" />
    </target>

    <target name="copy">
        <copy.dar project="MyDocumentumProject" />
    </target>

    <target name="setoptions" description="Set upgrade options">
        <pro.setUpgradeOptions file="${basedir}/upgradeoptions.txt" />
    </target>

    <target name="all" depends="create-build-workspace, importcontent, setoptions, build-workspace, copy" />

</project>

and call it from either Ant or Maven (via exec-maven-plugin), or Gradle (via JavaExec or ComposerExec).

Are changes coming?

Last week something weird happened: a couple of researchers disclosed information about vulnerabilities in Documentum xPression and Documentum WDK applications, and these disclosures are qualitatively different from what EMC was publishing previously: these disclosures had been coordinated. Let’s explain this point. When Documentum was under EMC’s wing, EMC never published correct/true information about security flaws: they always underestimated the security impact and never noted that exploits/PoCs were available in the wild. Such behaviour obviously had a negative impact on customers: customers see that the vulnerability impact is medium and prefer not to install security fixes – that is a kind of…

Why CURSOR_SHARING=FORCE sucks. Part II

I believe everybody who maintains a Documentum repository with intensive workflow sees the following query at the top of database performance reports:

UPDATE /*+ USE_NL(dmi_workitem_s) */
      dmi_workitem_s
   SET a_wq_name = :p0
 WHERE r_object_id =
          ANY (SELECT /*+ CARDINALITY(1) */
                      wis2.r_object_id
                 FROM (SELECT wis3.r_object_id AS r_object_id
                         FROM (  SELECT /*+ CARDINALITY(1) */
                                        wis.r_object_id AS r_object_id
                                   FROM dmi_workitem_s wis
                                  WHERE     wis.r_runtime_state >= 0
                                        AND wis.r_runtime_state <= 1
                                        AND wis.r_auto_method_id >
                                               '0000000000000000'
                                        AND wis.a_wq_name = ' '
                                        AND wis.r_next_retry_date < SYSDATE
                               ORDER BY r_creation_date ASC) wis3
                        WHERE ROWNUM <= 90) wis2);

This query is performed by the workflow agent, and its poor performance actually reveals poor database design, because the dmi_workitem_s table does not contain a column with high selectivity:

  • records with r_runtime_state IN (0, 1) relate to both auto and manual activities
  • records with r_auto_method_id > '0000000000000000' relate to both completed and non-completed auto activities

Actually, in the case of MSSQL and PostgreSQL it would be possible to create an “ideal” index for this query, like:

CREATE INDEX idx_auto_tasks
   ON dmi_workitem_s (r_object_id, r_next_retry_date, r_creation_date)
   WHERE r_runtime_state >= 0
     AND r_runtime_state <= 1
     AND r_auto_method_id > '0000000000000000'
     AND a_wq_name = ' ';

because both MSSQL and PostgreSQL support partial indexes. In the case of Oracle the solution is not so straightforward: it requires either rewriting the query into the following form:

UPDATE /*+ USE_NL(dmi_workitem_s) */
      dmi_workitem_s
   SET a_wq_name = :p0
 WHERE r_object_id =
          ANY (SELECT /*+ CARDINALITY(1) */
                      wis2.r_object_id
                 FROM (SELECT wis3.r_object_id AS r_object_id
                         FROM (  SELECT /*+ CARDINALITY(1) */
                                        wis.r_object_id AS r_object_id
                                   FROM dmi_workitem_s wis
                                  WHERE     CASE
                                               WHEN     wis.r_runtime_state >=
                                                           0
                                                    AND wis.r_runtime_state <=
                                                           1
                                                    AND wis.r_auto_method_id >
                                                           '0000000000000000'
                                                    AND wis.a_wq_name = ' '
                                               THEN
                                                  1
                                            END = 1
                                        AND wis.r_next_retry_date < SYSDATE
                               ORDER BY r_creation_date ASC) wis3
                        WHERE ROWNUM <= 90) wis2);

together with the following function-based index:

CREATE INDEX idx_auto_tasks
   ON dmi_workitem_s (
      CASE
         WHEN     r_runtime_state >= 0
              AND r_runtime_state <= 1
              AND r_auto_method_id > '0000000000000000'
              AND a_wq_name = ' '
         THEN
            1
      END);

or a materialized view such as:

CREATE MATERIALIZED VIEW mv_auto_tasks
   REFRESH FAST ON COMMIT
   ENABLE QUERY REWRITE
AS
   SELECT wis.r_object_id, wis.r_creation_date, wis.r_next_retry_date
     FROM dmi_workitem_s wis
    WHERE     wis.r_runtime_state >= 0
          AND wis.r_runtime_state <= 1
          AND wis.r_auto_method_id > '0000000000000000'
          AND wis.a_wq_name = ' ';

and take advantage of query rewrite:

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                   |                    |     1 |    70 |     3  (34)| 00:00:01 |
|   1 |  UPDATE                            | DMI_WORKITEM_S     |       |       |            |          |
|   2 |   NESTED LOOPS                     |                    |     1 |    70 |     3  (34)| 00:00:01 |
|   3 |    VIEW                            |                    |     1 |    10 |     3  (34)| 00:00:01 |
|*  4 |     COUNT STOPKEY                  |                    |       |       |            |          |
|   5 |      VIEW                          |                    |     1 |    10 |     3  (34)| 00:00:01 |
|*  6 |       SORT ORDER BY STOPKEY        |                    |     1 |    28 |     3  (34)| 00:00:01 |
|*  7 |        MAT_VIEW REWRITE ACCESS FULL| MV_AUTO_TASKS      |     1 |    28 |     2   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN               | D_1F024BE98000018C |     1 |    60 |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(ROWNUM<=90)
   6 - filter(ROWNUM<=90)
   7 - filter("MV_AUTO_TASKS"."R_NEXT_RETRY_DATE"<SYSDATE@!)
   8 - access("R_OBJECT_ID"="WIS2"."R_OBJECT_ID")

Unfortunately, due to the CURSOR_SHARING=FORCE recommendation, neither option is applicable, and the only “option” is to use a hex editor to modify the Documentum binary: in the case of Oracle we need to add the CURSOR_SHARING_EXACT hint and modify the WHERE clause.
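
As far as I understand it, the reason is that with CURSOR_SHARING=FORCE Oracle replaces the literals in the incoming statement with system-generated bind variables (named like :"SYS_B_0"), so the CASE expression in the query no longer matches the expression stored in the index or in the materialized view. The toy Java sketch below only illustrates the textual effect; the class LiteralReplacementSketch, its method forceBinds and the regular expression are my own simplification and not Oracle’s actual algorithm:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration: mimics, very roughly, how literals end up
// replaced by system-generated binds under CURSOR_SHARING=FORCE.
final class LiteralReplacementSketch {

    // A single-quoted string literal or a standalone integer literal.
    private static final Pattern LITERAL =
            Pattern.compile("'[^']*'|\\b\\d+\\b");

    // Replaces every literal with :"SYS_B_n", numbering them left to right.
    static String forceBinds(String sql) {
        Matcher m = LITERAL.matcher(sql);
        StringBuffer out = new StringBuffer();
        int n = 0;
        while (m.find()) {
            String bind = ":\"SYS_B_" + n++ + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(bind));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String indexed = "CASE WHEN r_runtime_state >= 0 AND r_runtime_state <= 1"
                + " AND r_auto_method_id > '0000000000000000'"
                + " AND a_wq_name = ' ' THEN 1 END";
        String rewritten = forceBinds(indexed);
        System.out.println(rewritten);
        // The rewritten text no longer equals the indexed expression.
        System.out.println(indexed.equals(rewritten)); // prints false
    }
}
```

That is why the CURSOR_SHARING_EXACT hint is needed: it asks Oracle to leave the literals of that particular statement alone.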

A FATAL error has occurred. Part II

20 months ago I described a bizarre behaviour in Webtop; now it is time to describe how to solve the problem (actually, a customer has shared a simple test case: a user changes his password via Ctrl+Alt+Del on a Windows computer, and after that he needs to clear cookies in order to get Webtop working again). I do think the best option here is to replace the user’s actual password with a login ticket, and the best candidate for that is com.documentum.web.formext.session.AuthenticationService:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSessionManager;
import com.documentum.fc.common.DfException;
import com.documentum.fc.common.DfLoginInfo;
import com.documentum.fc.common.IDfLoginInfo;

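/**
 * After every successful login, replaces the password stored in the
 * session manager identity with a login ticket.
 *
 * Note: AuthenticationService, SessionManagerHttpBinding and
 * PasswordExpiredException are WDK classes; this listing assumes they are
 * visible to the compiler (same package or imported).
 *
 * @author Andrey B. Panfilov <andrey@panfilov.tel>
 */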
public class AuthenticationServiceCustom extends AuthenticationService {

    public AuthenticationServiceCustom() {
        super();
    }

    @Override
    public void login(HttpSession httpSession, String principalName,
            String docbase, HttpServletRequest req)
        throws DfException {
        super.login(httpSession, principalName, docbase, req);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String principalName,
            String docbase)
        throws DfException {
        super.login(httpSession, principalName, docbase);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase,
            String userLoginName, String userPassword, String domain)
        throws PasswordExpiredException, DfException {
        super.login(httpSession, docbase, userLoginName, userPassword, domain);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase, String domain,
            Object binaryCredential)
        throws DfException {
        super.login(httpSession, docbase, domain, binaryCredential);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase, String domain,
            Object binaryCredential, HttpServletRequest req)
        throws DfException {
        super.login(httpSession, docbase, domain, binaryCredential, req);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase,
            String userLoginName, String password, String domain,
            HttpServletRequest req)
        throws DfException {
        super.login(httpSession, docbase, userLoginName, password, domain, req);
        replaceTicket(docbase);
    }

    private void replaceTicket(String docbase) throws DfException {
        IDfSessionManager sessionManager = SessionManagerHttpBinding
                .getSessionManager();
        IDfSession session = null;
        try {
            // keep only the part of the docbase name before the first dot
            int dotIndex = docbase.indexOf('.');
            if (dotIndex != -1) {
                docbase = docbase.substring(0, dotIndex);
            }
            session = sessionManager.getSession(docbase);
            // request a ticket with the timeout taken from
            // max_login_ticket_timeout in the server config
            int timeout = session.getServerConfig()
                    .getInt("max_login_ticket_timeout");
            String ticket = session.getLoginTicketEx(null, "docbase", timeout,
                    false, docbase);
            String userName = session.getLoginUserName();
            // drop the identity that still holds the real password
            if (sessionManager.hasIdentity(docbase)) {
                sessionManager.clearIdentity(docbase);
            }
            // register the ticket in place of the password
            IDfLoginInfo loginInfo = new DfLoginInfo(userName, ticket);
            sessionManager.setIdentity(docbase, loginInfo);
        } finally {
            if (session != null) {
                sessionManager.release(session);
            }
        }
    }

}