Unicode support

Yesterday I came across an amusing blog post about Unicode support in Documentum (I have no idea why it is titled “DOCUMENTUM PROBLEMS AND HOW TO FIX THEM: #1” when it does not contain any solution), and now I would like to share my view of the problem.

It is not clear why that blog post refers to “CS-49851 – “Server does not recognize a UTF-8 enabled database and unnecessarily errors on attribute length””, because I have seen other related CRs dating back to 2005 or so; however, I can explain why OpenText will never implement proper Unicode support in Documentum.

At the moment Documentum supports four database engines:

  • MSSQL
  • Oracle
  • DB2
  • PostgreSQL

Which database do you think is the most problematic from a Unicode perspective? To answer this question we must understand what “varchar(n)” means in each database:

| database | data type | description |
|---|---|---|
| MSSQL | varchar(n) | Variable-length, non-Unicode string data. n defines the string length and can be a value from 1 through 8,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size is the actual length of the data entered + 2 bytes. The ISO synonyms for varchar are char varying or character varying. |
| MSSQL | nvarchar(n) | Variable-length Unicode string data. n defines the string length and can be a value from 1 through 4,000. max indicates that the maximum storage size is 2^31-1 bytes (2 GB). The storage size, in bytes, is two times the actual length of data entered + 2 bytes. The ISO synonyms for nvarchar are national char varying and national character varying. |
| Oracle | varchar2(n) | The VARCHAR2 datatype stores variable-length character strings. When you create a table with a VARCHAR2 column, you specify a maximum string length (in bytes or characters) between 1 and 4000 bytes for the VARCHAR2 column. For each row, Oracle Database stores each value in the column as a variable-length field unless a value exceeds the column’s maximum length, in which case Oracle Database returns an error. Using VARCHAR2 and VARCHAR saves on space used by the table. |
| DB2 | varchar(n) | Varying-length character strings with a maximum length of n bytes. n must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 32704. |
| DB2 | vargraphic(n) | Varying-length graphic strings. The maximum length, n, must be greater than 0 and less than a number that depends on the page size of the table space. The maximum length is 16352. |
| PostgreSQL | varchar(n) | SQL defines two primary character types: character varying(n) and character(n), where n is a positive integer. Both of these types can store strings up to n characters (not bytes) in length. An attempt to store a longer string into a column of these types will result in an error, unless the excess characters are all spaces, in which case the string will be truncated to the maximum length. (This somewhat bizarre exception is required by the SQL standard.) If the string to be stored is shorter than the declared length, values of type character will be space-padded; values of type character varying will simply store the shorter string. |
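
To make the byte/character distinction in the table above concrete, here is a small Java sketch that measures the same text in code points, 2-byte UTF-16 units and UTF-8 bytes. The class and method names are made up for illustration, and it only shows how string lengths diverge; it says nothing about how any particular database stores the data internally.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

public class StringLengths {

    /** Number of bytes the string occupies when encoded as UTF-8. */
    public static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    /** Number of Unicode code points in the string. */
    public static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Number of UTF-16 code units, i.e. 2-byte units. */
    public static int utf16Units(String s) {
        return s.length();
    }

    public static void main(String[] args) {
        String composed = "\u00E9";     // é as a single code point
        String decomposed = "e\u0301";  // e followed by a combining acute accent
        String emoji = "\uD83D\uDE00";  // U+1F600, outside the Basic Multilingual Plane

        for (String s : new String[] {composed, decomposed, emoji}) {
            System.out.printf("%d code point(s), %d UTF-16 unit(s), %d UTF-8 byte(s)%n",
                    codePoints(s), utf16Units(s), utf8Bytes(s));
        }

        // Both spellings of é are canonically equivalent: NFC turns one into the other.
        String nfc = Normalizer.normalize(decomposed, Normalizer.Form.NFC);
        System.out.println("NFC(decomposed) equals composed: " + nfc.equals(composed));
    }
}
```

A column limit expressed in bytes, in 2-byte units or in characters therefore admits different strings, which is why the same 255-character object name may fit in one database and fail in another.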

So, in order to implement proper Unicode support Documentum must:

  • Do nothing for PostgreSQL
  • Change string semantics from byte to character in case of Oracle (i.e. alter table dm_sysobject_s modify (object_name varchar2(255 char)))
  • Change string datatype from varchar to vargraphic in case of DB2 (I believe something like ALTER TABLE DM_SYSOBJECT_S ALTER COLUMN OBJECT_NAME SET DATA TYPE VARGRAPHIC(255), though I’m not sure it will work)
  • Discontinue support of MSSQL, because this database wrongly assumes that no character needs more than 2 bytes (compare: é, U+00E9, encoded in UTF-8 as C3 A9, and é, U+0065 U+0301, encoded in UTF-8 as 65 CC 81)

So, it is clear that proper Unicode support cannot be implemented on MSSQL, and therefore OpenText will do nothing, because otherwise Documentum would behave differently on different databases.

Cooking composer

Well, there is no doubt that Documentum Composer is evil, and even the vendor fails to maintain it; nevertheless, it is the kind of evil we have to deal with. I do know how to create a lot of Documentum artefacts using API and DQL only, but I have no idea how to install process templates. Moreover, back in 2009 there was a hope that EMC would create a robust technique for transferring process templates between repositories, but that was just a hope: the current XPDL support does not allow transferring workflow templates between Documentum repositories :(. So, I decided to share my Composer-related experience.

First of all, let’s define the objectives we pursue when dealing with Composer. In my opinion the goals are the following:

  • we must store all Composer-related stuff in a version control system
  • the build process must be fully automated, be a part of the SDLC and support CI/CD practices
  • the deployment phase should not take a lot of time

Storing composer project in VCS

The first steps you need to perform after creating a Composer project are (actually, this was not obvious to me 6 years ago, because I was not an experienced Eclipse user):

  • export the project to a filesystem folder backed by VCS
  • remove the project from the workspace
  • import the project into the workspace

Automating build process and shortening deployment phase

These parts are considerably more challenging. When EMC developed Composer (actually it is just an Eclipse plugin), they really did think that developers would use it as an IDE (how wrong they were), and because of that Documentum Composer lacks some vital functionality. If you are not familiar with the evolution of build automation tools, here is the gist:

  • in 1976 Stuart Feldman created make; before that, developers used shell scripts to build their software
  • in 2000 James Duncan Davidson released the first public version of Apache Ant
  • in 2002 Takari’s Jason van Zyl created Apache Maven
  • and in 2007 Hans Dockter and Adam Murdoch released the first version of Gradle

Actually, Gradle is my personal choice, but I have no difficulties with either Maven or Ant; however, the talented team still thinks that shell scripts are a good option:

(Un)fortunately it is not an option for me, so I wrote a simple Eclipse plugin which makes it possible to automate Composer tasks to some extent. For example, you may write the following Ant build file:

<?xml version="1.0"?>

<project name="myproject" default="all">

    <macrodef name="copy.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="true" replace="true" 
            />
        </sequential>
    </macrodef>

    <macrodef name="import.project">
        <attribute name="project" />
        <sequential>
            <pro.importProject project="@{project}" 
                               location="${composer.project.dir}" 
                               copy="false" 
            />
        </sequential>
    </macrodef>

    <macrodef name="copy.dar">
        <attribute name="project" />
        <sequential>
            <mkdir dir="${output.dir}" />
            <pro.copyDar project="@{project}" todir="${output.dir}" />
        </sequential>
    </macrodef>

    <target name="create-workspace" description="Create local composer workspace">
        <import.project project="MyDocumentumProject" />
    </target>

    <target name="create-build-workspace" description="Create build composer workspace">
        <copy.project project="MyDocumentumProject" />
    </target>

    <target name="importcontent" description="Import content">
        <pro.importContents file="${basedir}/importcontents.txt" />
    </target>

    <target name="build-workspace" description="build eclipse project">
        <eclipse.incrementalBuild kind="full" />
    </target>

    <target name="clean-workspace" description="clean eclipse project">
        <eclipse.incrementalBuild kind="clean" />
    </target>

    <target name="copy">
        <copy.dar project="MyDocumentumProject" />
    </target>

    <target name="setoptions" description="Set upgrade options">
        <pro.setUpgradeOptions file="${basedir}/upgradeoptions.txt" />
    </target>

    <target name="all" depends="create-build-workspace, importcontent, setoptions, build-workspace, copy" />

</project>

and call it from Ant, Maven (via exec-maven-plugin), or Gradle (via JavaExec or ComposerExec).
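
Whatever build tool does the calling, in the end something has to start a headless Eclipse and hand it the build file. A minimal sketch of such a launcher in plain Java is shown below. It assumes that the build file is executed by Eclipse's standard org.eclipse.ant.core.antRunner application and that the Equinox launcher jar, the workspace and the build file locations are passed in by the caller; the class name and all paths are hypothetical and are not taken from the plugin described above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeadlessAntLauncher {

    /** Builds the command line for a headless Eclipse Ant run (assumed layout, see above). */
    public static List<String> command(Path launcherJar, Path workspace,
                                       Path buildFile, String... targets) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-jar");
        cmd.add(launcherJar.toString());               // Equinox launcher jar of the Composer/Eclipse install
        cmd.add("-application");
        cmd.add("org.eclipse.ant.core.antRunner");     // Eclipse's headless Ant runner
        cmd.add("-data");
        cmd.add(workspace.toString());                 // workspace the build operates on
        cmd.add("-buildfile");
        cmd.add(buildFile.toString());
        cmd.addAll(Arrays.asList(targets));            // Ant targets, e.g. "all"
        return cmd;
    }

    /** Starts the process, forwards its output to the console and returns the exit code. */
    public static int run(List<String> cmd) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(cmd).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 3) {
            // Without arguments just show what would be executed (placeholder paths).
            System.out.println(String.join(" ", command(Paths.get("launcher.jar"),
                    Paths.get("workspace"), Paths.get("build.xml"), "all")));
            return;
        }
        System.exit(run(command(Paths.get(args[0]), Paths.get(args[1]),
                Paths.get(args[2]), "all")));
    }
}
```

The same argument list can be handed to exec-maven-plugin or to a Gradle JavaExec task instead of ProcessBuilder; the point is only that the build becomes a single non-interactive command that a CI server can run.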

Are changes coming?

Pro Documentum

Last week something weird happened: a couple of researchers disclosed information about vulnerabilities in Documentum xPression and Documentum WDK applications:

and these disclosures are qualitatively different from what EMC was publishing previously: these disclosures had been coordinated. Let’s explain this point. When Documentum was under EMC’s wing, EMC never published correct/true information about security flaws: they always underestimated the security impact and never mentioned that exploits/PoCs were available in the wild. Such behaviour obviously had a negative impact on customers: customers see that the vulnerability impact is medium and prefer not to install security fixes – that is a kind of…


Why CURSOR_SHARING=FORCE sucks. Part II

I believe everybody who maintains a Documentum repository with intensive workflow sees the following query at the top of database performance reports:

UPDATE /*+ USE_NL(dmi_workitem_s) */
      dmi_workitem_s
   SET a_wq_name = :p0
 WHERE r_object_id =
          ANY (SELECT /*+ CARDINALITY(1) */
                      wis2.r_object_id
                 FROM (SELECT wis3.r_object_id AS r_object_id
                         FROM (  SELECT /*+ CARDINALITY(1) */
                                        wis.r_object_id AS r_object_id
                                   FROM dmi_workitem_s wis
                                  WHERE     wis.r_runtime_state >= 0
                                        AND wis.r_runtime_state <= 1
                                        AND wis.r_auto_method_id >
                                               '0000000000000000'
                                        AND wis.a_wq_name = ' '
                                        AND wis.r_next_retry_date < SYSDATE
                               ORDER BY r_creation_date ASC) wis3
                        WHERE ROWNUM <= 90) wis2);

This query is performed by the workflow agent, and its poor performance actually reveals poor database design, because the dmi_workitem_s table does not contain a column with high selectivity:

  • records with r_runtime_state IN (0, 1) relate to both auto and manual activities
  • records with r_auto_method_id > '0000000000000000' relate to both completed and non-completed auto activities
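
The point about selectivity is easier to see when the query's selection logic is written out over a handful of rows. The following Java sketch is an in-memory imitation under stated assumptions: the WorkItem class, the sample method id and the row values are invented, and only the predicates, the ordering and the row limit are taken from the query above.

```java
import java.time.Instant;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

public class AutoTaskSelection {

    /** Hypothetical in-memory stand-in for one dmi_workitem_s row. */
    public static final class WorkItem {
        final String objectId;
        final int runtimeState;
        final String autoMethodId;
        final String wqName;
        final Instant nextRetryDate;
        final Instant creationDate;

        public WorkItem(String objectId, int runtimeState, String autoMethodId,
                        String wqName, Instant nextRetryDate, Instant creationDate) {
            this.objectId = objectId;
            this.runtimeState = runtimeState;
            this.autoMethodId = autoMethodId;
            this.wqName = wqName;
            this.nextRetryDate = nextRetryDate;
            this.creationDate = creationDate;
        }
    }

    /** The four static predicates of the query; each one alone matches many rows. */
    public static boolean isUnclaimedAutoTask(WorkItem w) {
        return w.runtimeState >= 0 && w.runtimeState <= 1
                && w.autoMethodId.compareTo("0000000000000000") > 0
                && " ".equals(w.wqName);
    }

    /** Oldest unclaimed auto tasks that are due, at most 'limit' of them. */
    public static List<String> pick(List<WorkItem> rows, Instant now, int limit) {
        return rows.stream()
                .filter(AutoTaskSelection::isUnclaimedAutoTask)
                .filter(w -> w.nextRetryDate.isBefore(now))
                .sorted(Comparator.comparing((WorkItem w) -> w.creationDate))
                .limit(limit)
                .map(w -> w.objectId)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        String method = "1000000000000001"; // made-up method id
        List<WorkItem> rows = Arrays.asList(
                new WorkItem("manual", 0, "0000000000000000", " ", now.minusSeconds(10), now.minusSeconds(300)),
                new WorkItem("finished", 2, method, " ", now.minusSeconds(10), now.minusSeconds(200)),
                new WorkItem("claimed", 0, method, "wq1", now.minusSeconds(10), now.minusSeconds(150)),
                new WorkItem("pending", 0, method, " ", now.minusSeconds(10), now.minusSeconds(50)));
        System.out.println(pick(rows, now, 90));
    }
}
```

Only the combination of all four conditions is selective, so an index on any single column does not help much; what the query needs is an index over the combined predicate.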

Actually, in case of MSSQL and PostgreSQL it would be possible to create an “ideal” index for this query, like:

CREATE INDEX idx_auto_tasks
   ON dmi_workitem_s (r_object_id, r_next_retry_date, r_creation_date)
   WHERE r_runtime_state >= 0
     AND r_runtime_state <= 1
     AND r_auto_method_id > '0000000000000000'
     AND a_wq_name = ' ';

because both MSSQL and PostgreSQL support partial indexes (MSSQL calls them filtered indexes). In case of Oracle the solution is not so straightforward, because it is required either to rewrite the query to the following form:

UPDATE /*+ USE_NL(dmi_workitem_s) */
      dmi_workitem_s
   SET a_wq_name = :p0
 WHERE r_object_id =
          ANY (SELECT /*+ CARDINALITY(1) */
                      wis2.r_object_id
                 FROM (SELECT wis3.r_object_id AS r_object_id
                         FROM (  SELECT /*+ CARDINALITY(1) */
                                        wis.r_object_id AS r_object_id
                                   FROM dmi_workitem_s wis
                                  WHERE     CASE
                                               WHEN     wis.r_runtime_state >=
                                                           0
                                                    AND wis.r_runtime_state <=
                                                           1
                                                    AND wis.r_auto_method_id >
                                                           '0000000000000000'
                                                    AND wis.a_wq_name = ' '
                                               THEN
                                                  1
                                            END = 1
                                        AND wis.r_next_retry_date < SYSDATE
                               ORDER BY r_creation_date ASC) wis3
                        WHERE ROWNUM <= 90) wis2);

and create the following function-based index:

CREATE INDEX idx_auto_tasks
   ON dmi_workitem_s (
      CASE
         WHEN     r_runtime_state >= 0
              AND r_runtime_state <= 1
              AND r_auto_method_id > '0000000000000000'
              AND a_wq_name = ' '
         THEN
            1
      END);

or to create a materialized view like:

CREATE MATERIALIZED VIEW mv_auto_tasks
   REFRESH FAST ON COMMIT
   ENABLE QUERY REWRITE
AS
   SELECT wis.r_object_id, wis.r_creation_date, wis.r_next_retry_date
     FROM dmi_workitem_s wis
    WHERE     wis.r_runtime_state >= 0
          AND wis.r_runtime_state <= 1
          AND wis.r_auto_method_id > '0000000000000000'
          AND wis.a_wq_name = ' ';

and take advantage of query rewrite:

---------------------------------------------------------------------------------------------------------
| Id  | Operation                          | Name               | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                   |                    |     1 |    70 |     3  (34)| 00:00:01 |
|   1 |  UPDATE                            | DMI_WORKITEM_S     |       |       |            |          |
|   2 |   NESTED LOOPS                     |                    |     1 |    70 |     3  (34)| 00:00:01 |
|   3 |    VIEW                            |                    |     1 |    10 |     3  (34)| 00:00:01 |
|*  4 |     COUNT STOPKEY                  |                    |       |       |            |          |
|   5 |      VIEW                          |                    |     1 |    10 |     3  (34)| 00:00:01 |
|*  6 |       SORT ORDER BY STOPKEY        |                    |     1 |    28 |     3  (34)| 00:00:01 |
|*  7 |        MAT_VIEW REWRITE ACCESS FULL| MV_AUTO_TASKS      |     1 |    28 |     2   (0)| 00:00:01 |
|*  8 |    INDEX UNIQUE SCAN               | D_1F024BE98000018C |     1 |    60 |     0   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   4 - filter(ROWNUM<=90)
   6 - filter(ROWNUM<=90)
   7 - filter("MV_AUTO_TASKS"."R_NEXT_RETRY_DATE"<SYSDATE@!)
   8 - access("R_OBJECT_ID"="WIS2"."R_OBJECT_ID")

Unfortunately, due to the CURSOR_SHARING=FORCE recommendation neither option is applicable, and the only “option” is to use a hex editor to modify the Documentum binary – in case of Oracle we need to place the CURSOR_SHARING_EXACT hint and modify the where clause.

A FATAL error has occurred. Part II

20 months ago I described a bizarre behaviour in Webtop; now it is time to describe how to solve the problem (actually, a customer has shared a simple test case: a user changes his password via Ctrl+Alt+Del on a Windows computer, and after that he needs to clear cookies in order to force Webtop to work). I do think the best option here is to replace the user’s actual password with a login ticket, and the best candidate for that is com.documentum.web.formext.session.AuthenticationService:

import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpSession;

import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSessionManager;
import com.documentum.fc.common.DfException;
import com.documentum.fc.common.DfLoginInfo;
import com.documentum.fc.common.IDfLoginInfo;

/**
 * @author Andrey B. Panfilov <andrey@panfilov.tel>
 */
public class AuthenticationServiceCustom extends AuthenticationService {

    public AuthenticationServiceCustom() {
        super();
    }

    @Override
    public void login(HttpSession httpSession, String principalName,
            String docbase, HttpServletRequest req)
        throws DfException {
        super.login(httpSession, principalName, docbase, req);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String principalName,
            String docbase)
        throws DfException {
        super.login(httpSession, principalName, docbase);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase,
            String userLoginName, String userPassword, String domain)
        throws PasswordExpiredException, DfException {
        super.login(httpSession, docbase, userLoginName, userPassword, domain);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase, String domain,
            Object binaryCredential)
        throws DfException {
        super.login(httpSession, docbase, domain, binaryCredential);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase, String domain,
            Object binaryCredential, HttpServletRequest req)
        throws DfException {
        super.login(httpSession, docbase, domain, binaryCredential, req);
        replaceTicket(docbase);
    }

    @Override
    public void login(HttpSession httpSession, String docbase,
            String userLoginName, String password, String domain,
            HttpServletRequest req)
        throws DfException {
        super.login(httpSession, docbase, userLoginName, password, domain, req);
        replaceTicket(docbase);
    }

    /**
     * Replaces the identity stored in the session manager (user name and
     * password) with the same user name and a freshly issued login ticket.
     */
    private void replaceTicket(String docbase) throws DfException {
        IDfSessionManager sessionManager = SessionManagerHttpBinding
                .getSessionManager();
        IDfSession session = null;
        try {
            // keep only the part of the name before the first dot
            int dotIndex = docbase.indexOf('.');
            if (dotIndex != -1) {
                docbase = docbase.substring(0, dotIndex);
            }
            session = sessionManager.getSession(docbase);
            // ask for the longest ticket lifetime the server allows
            int timeout = session.getServerConfig()
                    .getInt("max_login_ticket_timeout");
            String ticket = session.getLoginTicketEx(null, "docbase", timeout,
                    false, docbase);
            String userName = session.getLoginUserName();
            // drop the password-based identity before registering the ticket
            if (sessionManager.hasIdentity(docbase)) {
                sessionManager.clearIdentity(docbase);
            }
            IDfLoginInfo loginInfo = new DfLoginInfo(userName, ticket);
            sessionManager.setIdentity(docbase, loginInfo);
        } finally {
            if (session != null) {
                sessionManager.release(session);
            }
        }
    }

}

Why CURSOR_SHARING=FORCE sucks

As you might have guessed, my colleagues involved me in load testing activities. Actually, there is nothing challenging in writing load tests (I do think that analysis is another topic), but forcing Documentum to work under load is a big challenge. On the first iteration we got the following docbase sessions graph:

Which is actually weird, because I suppose that collected metrics should somehow reflect the load: in our scenario we applied a roughly constant load, but the docbase sessions metric did not look constant 😦, which could mean one of the following:

  • I’m an idiot and my estimates were wrong, but in this case it means that the customer should not use Documentum due to unpredictable performance
  • Something was wrong on the Documentum side

Let’s check database side …

Recommendation 1: SQL Tuning
Estimated benefit is 22.41 active sessions, 56.8% of total activity.
--------------------------------------------------------------------
Action
   Investigate the SELECT statement with SQL_ID "34h8xv6j5mx04" for 
   possible performance improvements. You can supplement the information 
   given here with an ASH report for this SQL_ID.
   Related Object
      SQL statement with SQL_ID 34h8xv6j5mx04.
      select gs.group_name, gs.is_dynamic, 
      gs.is_dynamic_default,gs.is_protected, gs.is_module_only from 
      dm_group_s gs, dm_group_r gr where gs.r_object_id = gr.r_object_id 
      and gs.is_dynamic = :"SYS_B_0" and gs.is_dynamic_default = :"SYS_B_1" 
      and (gr.users_names = :P0 or gr.groups_names = :"SYS_B_2" or 
      gr.groups_names in (select gr1.i_nondyn_supergroups_names from 
      dm_group_r gr1, dm_group_r gr2 where gr1.r_object_id = 
      gr2.r_object_id and (gr2.users_names = :P1 or gr2.groups_names = 
      :"SYS_B_3") and gr1.i_nondyn_supergroups_names IS NOT NULL))


SQL ID: 34h8xv6j5mx04                     DB/Inst: ECM/ECM      Snaps: 13-14
-> 1st Capture and Last Capture Snap IDs
   refer to Snapshot IDs witin the snapshot range
-> select gs.group_name, gs.is_dynamic, gs.is_dynamic_default,gs.is_prote...

    Plan Hash           Total Elapsed                 1st Capture   Last Capture
#   Value                    Time(ms)    Executions       Snap ID        Snap ID
--- ---------------- ---------------- ------------- ------------- --------------
1   2451896125             85,924,271       16,7191            14             14
          -------------------------------------------------------------       

Execution Plan
------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop
------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                    |       |       | 15936 (100)|          |       |      
|   1 |  FILTER                             |                    |       |       |            |          |       |      
|   2 |   PX COORDINATOR                    |                    |       |       |            |          |       |      
|   3 |    PX SEND QC (RANDOM)              | :TQ10001           |  9328K|   676M| 15936   (2)| 00:03:12 |       |      
|   4 |     HASH JOIN                       |                    |  9328K|   676M| 15936   (2)| 00:03:12 |       |      
|   5 |      BUFFER SORT                    |                    |       |       |            |          |       |      
|   6 |       PART JOIN FILTER CREATE       | :BF0000            |   386K|    19M|  1859   (1)| 00:00:23 |       |      
|   7 |        PX RECEIVE                   |                    |   386K|    19M|  1859   (1)| 00:00:23 |       |      
|   8 |         PX SEND PARTITION (KEY)     | :TQ10000           |   386K|    19M|  1859   (1)| 00:00:23 |       |      
|   9 |          TABLE ACCESS BY INDEX ROWID| DM_GROUP_S         |   386K|    19M|  1859   (1)| 00:00:23 |       |      
|  10 |           INDEX RANGE SCAN          | D_XXXXXXXX8000010A |   773K|       |   124   (1)| 00:00:02 |       |      
|  11 |      PX PARTITION HASH JOIN-FILTER  |                    |    18M|   407M| 14035   (2)| 00:02:49 |:BF0000|:BF000
|  12 |       TABLE ACCESS FULL             | DM_GROUP_R         |    18M|   407M| 14035   (2)| 00:02:49 |:BF0000|:BF000
|  13 |   NESTED LOOPS                      |                    |     4 |   228 |     3   (0)| 00:00:01 |       |      
|  14 |    NESTED LOOPS                     |                    |    88 |   228 |     3   (0)| 00:00:01 |       |      
|  15 |     PARTITION HASH ALL              |                    |     4 |   136 |     2   (0)| 00:00:01 |     1 |    20
|  16 |      INDEX RANGE SCAN               | D_XXXXXXXX80000056 |     4 |   136 |     2   (0)| 00:00:01 |     1 |    20
|  17 |     PARTITION HASH ITERATOR         |                    |    22 |       |     1   (0)| 00:00:01 |   KEY |   KEY
|  18 |      INDEX RANGE SCAN               | D_XXXXXXXX80000109 |    22 |       |     1   (0)| 00:00:01 |   KEY |   KEY
|  19 |    TABLE ACCESS BY LOCAL INDEX ROWID| DM_GROUP_R         |     1 |    23 |     1   (0)| 00:00:01 |     1 |     1
------------------------------------------------------------------------------------------------------------------------
 
Full SQL Text

SQL ID       SQL Text                                                         
------------ -----------------------------------------------------------------
34h8xv6j5mx0 select gs.group_name, gs.is_dynamic, gs.is_dynamic_default, gs.is
             _protected, gs.is_module_only from dm_group_s gs, dm_group_r gr w
             here gs.r_object_id = gr.r_object_id and gs.is_dynamic = :"SYS_B_
             0" and gs.is_dynamic_default = :"SYS_B_1" and (gr.users_names = :
             P0 or gr.groups_names = :"SYS_B_2" or gr.groups_names in (select 
             gr1.i_nondyn_supergroups_names from dm_group_r gr1, dm_group_r gr
             2 where gr1.r_object_id = gr2.r_object_id and (gr2.users_names = 
             :P1 or gr2.groups_names = :"SYS_B_3") and gr1.i_nondyn_supergroup
             s_names IS NOT NULL))

That is awesome! A single SQL query consumes 60% of database resources! Before trying to optimize this query we need to figure out what it really looks like – how can we know that we have optimized a query if we don’t know how to execute it? And here we are:

SELECT gs.group_name,
       gs.is_dynamic,
       gs.is_dynamic_default,
       gs.is_protected,
       gs.is_module_only
  FROM dm_group_s gs, dm_group_r gr
 WHERE     gs.r_object_id = gr.r_object_id
       AND gs.is_dynamic = 1
       AND gs.is_dynamic_default = 1
       AND (   gr.users_names = :P0
            OR gr.groups_names = 'dm_world'
            OR gr.groups_names IN (
          SELECT gr1.i_nondyn_supergroups_names
            FROM dm_group_r gr1, dm_group_r gr2
           WHERE     gr1.r_object_id = gr2.r_object_id
                     AND (   gr2.users_names = :P1
                             OR gr2.groups_names = 'dm_world')
                     AND gr1.i_nondyn_supergroups_names IS NOT NULL)
       );

This query is intended to return dynamic groups enabled by default (gs.is_dynamic = 1 AND gs.is_dynamic_default = 1). Now guess how many dynamic groups enabled by default our docbase has. Zero! 60% of database resources to return an empty resultset! But due to CURSOR_SHARING=FORCE Oracle does not understand that we are asking it to return an empty resultset, so it selects a “suboptimal” execution plan, though the optimal one is the following:

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                           | Name               | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
--------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                    |                    |     1 |    77 |     2   (0)| 00:00:01 |       |       |
|*  1 |  FILTER                             |                    |       |       |            |          |       |       |
|   2 |   NESTED LOOPS                      |                    |     1 |    77 |     2   (0)| 00:00:01 |       |       |
|   3 |    NESTED LOOPS                     |                    |    25 |    77 |     2   (0)| 00:00:01 |       |       |
|*  4 |     TABLE ACCESS BY INDEX ROWID     | DM_GROUP_S         |     1 |    53 |     1   (0)| 00:00:01 |       |       |
|*  5 |      INDEX RANGE SCAN               | D_XXXXXXXX8000010A |     1 |       |     1   (0)| 00:00:01 |       |       |
|   6 |     PARTITION HASH ITERATOR         |                    |    25 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|*  7 |      INDEX RANGE SCAN               | D_XXXXXXXX80000109 |    25 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|   8 |    TABLE ACCESS BY LOCAL INDEX ROWID| DM_GROUP_R         |    25 |   600 |     1   (0)| 00:00:01 |     1 |     1 |
|   9 |   NESTED LOOPS                      |                    |     2 |   114 |     3   (0)| 00:00:01 |       |       |
|  10 |    NESTED LOOPS                     |                    |    26 |   114 |     3   (0)| 00:00:01 |       |       |
|  11 |     PARTITION HASH ALL              |                    |    15 |   495 |     2   (0)| 00:00:01 |     1 |    20 |
|* 12 |      INDEX RANGE SCAN               | D_XXXXXXXX80000056 |    15 |   495 |     2   (0)| 00:00:01 |     1 |    20 |
|  13 |     PARTITION HASH ITERATOR         |                    |    13 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 14 |      INDEX RANGE SCAN               | D_XXXXXXXX80000109 |    13 |       |     1   (0)| 00:00:01 |   KEY |   KEY |
|* 15 |    TABLE ACCESS BY LOCAL INDEX ROWID| DM_GROUP_R         |     1 |    24 |     1   (0)| 00:00:01 |     1 |     1 |
--------------------------------------------------------------------------------------------------------------------------
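
What CURSOR_SHARING=FORCE does to the statement text can be imitated in a few lines of Java. This is only a rough imitation with a made-up class name and a simplistic regular expression, not Oracle's actual algorithm: it replaces string and integer literals with :"SYS_B_n" placeholders, the same shape that is visible in the report above, and leaves existing bind variables such as :P0 alone.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ForcedCursorSharing {

    // Either a single-quoted string literal (with '' as an escaped quote)
    // or a standalone integer that is not part of an identifier or bind name.
    private static final Pattern LITERAL = Pattern.compile(
            "'(?:[^']|'')*'|(?<![\\w:\"])\\d+(?![\\w\"])");

    /** Replaces every literal with a numbered system bind placeholder. */
    public static String forceBinds(String sql) {
        Matcher m = LITERAL.matcher(sql);
        StringBuffer out = new StringBuffer();
        int n = 0;
        while (m.find()) {
            String bind = ":\"SYS_B_" + n++ + "\"";
            m.appendReplacement(out, Matcher.quoteReplacement(bind));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String sql = "gs.is_dynamic = 1 AND gs.is_dynamic_default = 1 "
                + "AND (gr.users_names = :P0 OR gr.groups_names = 'dm_world')";
        System.out.println(forceBinds(sql));
        // Statements that differ only in literals collapse into the same text.
        System.out.println(forceBinds("is_dynamic = 1").equals(forceBinds("is_dynamic = 0")));
    }
}
```

Once the literals are gone, the statement text no longer tells the optimizer which values are being compared, so a condition that can never match on this docbase looks like any other value.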

So, we stabilised the execution plan for this query, and on the next iteration we got the following docbase sessions graph:

Good news: I’m not an idiot.

Q & A. XV

As a follow-up to XCP2 vs ACLs

I have very….hm, how to call this stupidity of ACL security model logic….I have repository with permissions inheriting from folder. Folder is created by regular user and ACL assigned to folder is owned by this user, with class set to REGULAR. When another regular user needs to add document to this folder, it is not possible, with DM_SYSOBJECT_E_INVALID_ACL_DOMAIN exception, since folder ACL is regular and thereby not alowed to be used/set by another regular user, only superuser or folder ACL owner. So, ACL from folder may not be inherited to document and document can not be created.

Why, when ACL with its entries should specify exactly who can do smth and with which permissions?
And, why default ACLs created by regular users are not PUBLIC?
And, why cant I set by some docbase configuration that all ACLs created by regular users are PUBLIC?

Well, when I said that the fundamentals guide is a bit confusing I was too polite; the home truth is that the fundamentals guide is a piece of dog crap. Let me explain.

From fundamentals guide:

ACLs are either external or internal ACLs:

  • External ACLs are created explicitly by users. The name of an external ACL is determined by the user. External ACLs are managed by users, either the user who creates them or superusers.
  • Internal ACLs are created by Content Server. Internal ACLs are created in a variety of situations. For example, if a user creates a document and grants access to the document to HenryJ, Content Server assigns an internal ACL to the document. (The internal ACL is derived from the default ACL with the addition of the permission granted to HenryJ.) The names of internal ACL begin with dm_. Internal ACLs are managed by Content Server.

The external and internal ACLs are further characterized as public or private ACLs:

  • Public ACLs are available for use by any user in the repository. Public ACLs created by the repository owner are called system ACLs. System ACLs can only be managed by the repository owner. Other public ACLs can be managed by their owners or a user with Sysadmin or Superuser privileges.
  • Private ACLs are created and owned by a user other than the repository owner. However, unlike public ACLs, private ACLs are available for use only by their owners, and only their owners or a superuser can manage them.

From object reference guide:

acl_class (Integer) specifies whether the ACL is a regular ACL, a template, an instance of a template, or a public ACL. Valid values are:

  • 0: Regular ACL
  • 1: Template ACL
  • 2: Template instance
  • 3: Public ACL

r_is_internal (Boolean) indicates whether the ACL was created explicitly by a user or implicitly by the server.

First of all, the internal/external classification seems extremely confusing. I would prefer the terms temporary/permanent, because ACLs with r_is_internal=TRUE are subject to deletion by the dm_clean job. Moreover, since the dm_clean job uses the following query:

SELECT x.r_object_id
  FROM dm_acl_s x
 WHERE     x.r_is_internal = 1
       AND NOT EXISTS
                  ( (SELECT a1.r_object_id
                       FROM dm_acl_s a1, dm_sysobject_s b
                      WHERE     a1.object_name = b.acl_name
                            AND a1.owner_name = b.acl_domain
                            AND a1.r_object_id = x.r_object_id)
                   UNION
                   (SELECT a2.r_object_id
                      FROM dm_acl_s a2, dm_user_s c
                     WHERE     a2.object_name = c.acl_name
                           AND a2.owner_name = c.acl_domain
                           AND a2.r_object_id = x.r_object_id)
                   UNION
                   (SELECT a3.r_object_id
                      FROM dm_acl_s a3, dmi_type_info_s d
                     WHERE     a3.owner_name = d.acl_domain
                           AND a3.object_name = d.acl_name
                           AND a3.r_object_id = x.r_object_id))

it is clear that the dm_clean job pays no attention to the value of the acl_class attribute. Next, when does Content Server create temporary ACLs?

  • When we grant access to a sysobject directly:
    API> create,c,dm_document
    ...
    09024be980077401
    API> set,c,l,acl_name
    SET> Global User Default ACL
    ...
    OK
    API> set,c,l,acl_domain
    SET> dm_dbo
    ...
    OK
    API> save,c,l
    ...
    OK
    API> get,c,l,acl_name
    ...
    Global User Default ACL
    API> grant,c,l,dm_world,AccessPermit,,6
    ...
    OK
    API> save,c,l
    ...
    OK
    API> get,c,l,acl_name
    ...
    dm_45024be980003115
    
  • When we grant access to a sysobject indirectly (via the owner_permit/world_permit attributes, or when we take advantage of ACL templates and assign a new alias set to the sysobject):
    API> set,c,l,world_permit
    SET> 7
    ...
    OK
    API> save,c,l
    ...
    OK
    API> get,c,l,acl_name
    ...
    dm_45024be980003116
    
  • Another case, which I will describe later

Now about ACL classes. Frankly speaking, I do not understand the phrase “ACLs available for use” here, because there are the following activities that we may or may not be allowed to perform with ACLs:

  • create
  • assign to sysobject
  • modify
  • delete

so, I will try to examine all cases. First, we need to understand what Content Server means by an ACL’s owner (the value of the owner_name attribute). If you think it must be a valid user name, you are wrong: it may actually be any valid user or group (technically a group is also a user, because all dm_group records have corresponding dm_user records), or even the ‘dm_world’ keyword:

API> create,c,dm_acl
...
45024be980003117
API> set,c,l,owner_name
SET> dm_bof_registry
...
OK
API> save,c,l
...
OK
API> create,c,dm_acl
...
45024be980003118
API> set,c,l,owner_name
SET> dm_superusers
...
OK
API> save,c,l
...
OK
API> create,c,dm_acl
...
45024be98000311b
// Content Server replaces dm_dbo
// with the repository owner's name,
// and from here on I will do the same
API> set,c,l,owner_name
SET> dm_dbo
...
OK
API> save,c,l
...
OK
API> create,c,dm_acl
...
45024be980003119
API> set,c,l,owner_name
SET> dm_world
...
OK
API> save,c,l
...
OK
API> create,c,dm_acl
...
45024be98000311a
API> set,c,l,owner_name
SET> non_existing_user
...
OK
API> save,c,l
...
[DM_ACL_E_USER_NOT_EXIST]error:  "The owner_name or accessor_name 'non_existing_user' 
  given in the ACL 'dm_45024be98000311a' does not exist."

And when we say that a “user is an owner of an ACL”, this actually means one of the following:

  • the value of the ACL’s owner_name attribute is ‘dm_world’
  • the value of the ACL’s owner_name attribute is the name of the user
  • the value of the ACL’s owner_name attribute is a valid group and the user is a member of that group
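
The three cases above can be written down as a single predicate. This is only a minimal sketch of the rule as I described it, not Content Server code: the function name and parameters are hypothetical, and I assume that user_groups already contains every group the user belongs to, including nested membership.

```python
DM_WORLD = "dm_world"  # keyword accepted as an ACL owner in the session above


def is_acl_owner(user_name, user_groups, acl_owner_name):
    """Return True if the user "belongs to" the ACL's owner_name.

    user_groups is assumed to hold all groups the user is a member of
    (directly or through nested groups); resolving that is out of scope here.
    """
    if acl_owner_name == DM_WORLD:
        return True  # everybody belongs to dm_world
    if acl_owner_name == user_name:
        return True  # the ACL is owned by the user directly
    return acl_owner_name in user_groups  # the ACL is owned by one of the user's groups
```

For example, a member of dm_superusers is treated as an owner of the ACL whose owner_name was set to dm_superusers in the session above, even though no user with that name ever logs in.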

Now the rules:

  • Nobody may create ACLs with acl_class=2, and nobody may set the value of acl_class to 2:
    API> create,c,dm_acl
    ...
    45024be98000312c
    API> set,c,l,acl_class
    SET> 2
    ...
    OK
    API> save,c,l
    ...
    [DM_ACL_E_CANT_CHANGE_INSTANCE]error:  
     "The ACL  is an instance of an ACL template."
    
  • Nobody but superusers may change the value of the object_name attribute (I have no idea what the reason for this restriction was):
    API> retrieve,c,dm_acl where object_name='Global User Default ACL'
    ...
    45024be9800001c6
    API> grant,c,l,dm_world,AccessPermit,,7
    ...
    OK
    API> save,c,l
    ...
    OK
    API> set,c,l,object_name
    SET> test
    ...
    OK
    API> save,c,l
    ...
    [DM_ACL_E_CHANGE_OBJNAME_PRIV]error:  
      "Only SUPERUSER can change object_name."
    
    
    API> retrieve,c,dm_user where user_name=USER
    ...
    11024be980001100
    API> get,c,l,user_privileges
    ...
    8
    
  • Regular users are allowed to:
    • modify an ACL if they belong to the ACL’s owner
    • set the ACL’s owner only to a value they belong to
  • Sysadmins are allowed to:
    • modify an ACL if the ACL’s owner is dm_dbo, but they are not allowed to set the ACL’s owner to a value other than one the sysadmin belongs to
    • modify an ACL if its acl_class is 3, regardless of its owner
    • set the ACL’s owner to dm_dbo. This behaviour seems inconsistent, because in this case the effective permissions of sysadmins are the same as those of superusers, except for the object_name case:
      API> fetch,c,45024be980003137
      ...
      OK
      API> save,c,l
      ...
      [DM_ACL_E_NOT_OWNER]error:  
        "The ACL 'dm_45024be980003137' can only be modified by 
        its owner 'dmadmin' or superusers."
      
      
      API> set,c,l,owner_name
      SET> dm_dbo
      ...
      OK
      API> save,c,l
      ...
      OK
      
  • It is possible to assign an ACL to a sysobject only if one or more of the following requirements are met:
    • the ACL’s acl_class is 3
    • the ACL’s owner_name is dm_dbo
    • the sysobject’s owner (not the current user!) belongs to the ACL’s owner:
      API> retrieve,c,dm_acl where owner_name='dmadmin'
      ...
      45024be9800001a9
      API> get,c,l,acl_class
      ...
      0
      API> get,c,l,object_name
      ...
      dm_45024be9800001a9
      API> create,c,dm_document
      ...
      09024be98007756b
      API> set,c,l,acl_name
      SET> dm_45024be9800001a9
      ...
      OK
      API> set,c,l,acl_domain
      SET> dmadmin
      ...
      OK
      API> save,c,l
      ...
      [DM_SYSOBJECT_E_INVALID_ACL_DOMAIN]error:  
        "The dm_document '' is given an invalid ACL domain 'dmadmin'."
      
      // but
      API> create,c,dm_document
      ...
      09024be98007756c
      API> set,c,l,acl_name
      SET> dm_45024be9800001a9
      ...
      OK
      API> set,c,l,acl_domain
      SET> dmadmin
      ...
      OK
      API> set,c,l,owner_name
      SET> dmadmin
      ...
      OK
      API> save,c,l
      ...
      OK
      
    • the current user is a superuser; in this case Content Server creates a new temporary ACL:
      API> ?,c,select user_privileges, user_name from dm_user where user_name=USER
      user_privileges  user_name
      ---------------  ---------
                   16  dmadmin
      (1 row affected)
      
      API> retrieve,c,dm_acl where owner_name='sysadmin' and acl_class=0
      ...
      45024be980003136
      API> get,c,l,object_name
      ...
      dm_45024be980003136
      API> create,c,dm_document
      ...
      09024be980077580
      API> save,c,l
      ...
      OK
      API> get,c,l,acl_name
      ...
      dm_45024be980000101
      API> set,c,l,acl_name
      SET> dm_45024be980003136
      ...
      OK
      API> set,c,l,acl_domain
      SET> sysadmin
      ...
      OK
      API> save,c,l
      ...
      OK
      API> get,c,l,acl_name
      ...
      dm_45024be980003144
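
To make the modification and assignment rules above easier to reason about, here is a small model of them. It is a sketch of my reading of the observed behaviour, not Content Server logic: Principal, Acl, the role labels, can_modify and can_assign are all hypothetical names, group membership is assumed to be already resolved, and I assume that sysadmins keep the rights of regular users. It also ignores the fact that in the superuser case the server attaches a new temporary ACL rather than the requested one.

```python
from dataclasses import dataclass

DM_WORLD = "dm_world"
PUBLIC = 3  # acl_class value of a public ACL, per the object reference guide


@dataclass(frozen=True)
class Principal:
    name: str
    groups: frozenset = frozenset()  # assumed to include nested membership
    role: str = "regular"  # hypothetical labels: "regular", "sysadmin", "superuser"


@dataclass(frozen=True)
class Acl:
    owner_name: str
    acl_class: int = 0  # 0 = regular ACL


def belongs_to(principal, owner_name):
    """True if the principal "belongs to" the given ACL owner."""
    return owner_name in (DM_WORLD, principal.name) or owner_name in principal.groups


def can_modify(principal, acl, repo_owner):
    """May the principal modify the ACL? repo_owner is the name behind dm_dbo."""
    if principal.role == "superuser":
        return True
    if principal.role == "sysadmin" and (
        acl.owner_name == repo_owner or acl.acl_class == PUBLIC
    ):
        return True
    return belongs_to(principal, acl.owner_name)


def can_assign(current_user, sysobject_owner, acl, repo_owner):
    """May the ACL be set on a sysobject owned by sysobject_owner?"""
    return (
        acl.acl_class == PUBLIC
        or acl.owner_name == repo_owner
        or belongs_to(sysobject_owner, acl.owner_name)  # the owner, not the current user
        or current_user.role == "superuser"  # server substitutes a temporary ACL
    )
```

In this model the situation from the question is can_assign(bob, bob, Acl("alice"), repo_owner) with two regular users alice and bob: none of the four conditions holds, which corresponds to the DM_SYSOBJECT_E_INVALID_ACL_DOMAIN error, while the same call with acl_class set to 3 succeeds.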
      

As for the questions…

Indeed, it is not possible to specify a default acl_class, even in the data dictionary:

API> apply,c,,ALLOW_BASE_TYPE_CHANGES,ALLOW_CHANGE_FLAG,B,T
...
q0
API> ?,c,q0
result      
------------
T           
(1 row affected)

API> ?,c,alter type dm_acl modify (acl_class (SET default=3))
[DM_QUERY2_E_DATA_DICT_ERROR_FOR_ATTR_A_C]error:  
 "The following error(s) occurred processing an ALTER/CREATE statement 
 for type dm_acl, attribute acl_class."

[DM_DATA_DICT_E_TYPE_CANNOT_HAVE_DEFAULT_VALUE]error:  
 "You cannot specify a DEFAULT value for any attribute of the system type dm_acl."

Creating a TBO for dm_acl is not an option, because temporary ACLs are created on the Content Server side. On the other hand, nothing prevents you from creating TBOs that override certain IDfSysObject and IDfUser methods, which gives you full control over what is going on; the only question is why a mature product still does not support such basic functionality 🙂 For example, ACL inheritance as implemented in xCP2 differs from the default CS implementation: when Content Server recognises that it is not possible to follow the rules described above, it creates a temporary ACL (I have no idea which behaviour is better: getting an exception or getting different ACLs). That means EMC spent some time implementing new functionality, but the result is poor.