What is wrong in Documentum. Part III

Now the first part of my own thoughts about missing features in Documentum (actually, EMC has launched an Enhancement Request System (ERS) program, but I have no idea what they are going to achieve with it: aren’t their product managers able to check the bug tracker?).

Unicode support

It is very embarrassing, but in 2014 Documentum still does not support Unicode properly: when I declare an attribute as string(10), I expect that a user will be able to store up to 10 characters in it, but Documentum has another opinion about how to count characters. It counts bytes, not characters:

API> ?,c,create type test_type (test_attr string(10)) with supertype null
(1 row affected)

API> create,c,test_type
API> set,c,l,test_attr
SET> тесттест
[DFC_OBJECT_BADATTRVALUE] value is too big for attribute 'test_attr'. 
    Value UTF-8 length is 16. Maximum length is 10.

API> ?,c,create test_type object set test_attr='тесттест'
(1 row affected)
   "attempt to assign string of excessive length to attribute 0"

API> get,c,0001ffd78000911b,test_attr
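
The numbers in the error message above are easy to reproduce outside Documentum. A minimal Python 3 sketch (the helper names are mine and are not part of any Documentum API) that compares the character count of the test value with its UTF-8 byte count:

```python
# Illustrative helpers only; nothing here talks to a repository.

def char_length(value):
    """Number of Unicode code points in the string."""
    return len(value)


def utf8_length(value):
    """Number of bytes the string occupies when encoded as UTF-8."""
    return len(value.encode("utf-8"))


def fits_by_bytes(value, declared_length):
    """Mimic a byte-counting length check like the one in the error above."""
    return utf8_length(value) <= declared_length


value = "тесттест"
print(char_length(value))        # 8 characters
print(utf8_length(value))        # 16 bytes: each Cyrillic letter takes 2 bytes
print(fits_by_bytes(value, 10))  # False, although 8 <= 10 in characters
```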

Hence, when I design a database schema, I must take this annoying behaviour into account, i.e. I have to make an attribute twice as long (for Russian) as the business analyst expects it to be, and in the case of WDK I must also override DocbaseAttributeValueTag to force it to render input fields using the “original” length. But that is only part of the problem: the real problems begin when I integrate Documentum with external (modern and properly designed) systems. For example, let’s imagine that I’m going to synchronize accounts with MS Active Directory. What problems await me there?

  1. I can’t map MSAD’s displayName attribute to dm_user’s description attribute (the value may not fit once its length is counted in bytes): I have to either rewrite the LDAP synchronization job, or create my own type for MSAD users with a custom description field and start customizing the web interface
  2. Internally, the LDAP synchronization job maps MSAD’s distinguished name to dm_user’s user_ldap_dn attribute, and again it fails due to the difference in length semantics between MSAD and Documentum

It seems that the EMC guys know about the problem, but they are trying to solve it in a weird way: in D7 they just increased the length of some dm_user and dm_group attributes. But what about other attributes? Is 255 bytes enough for object_name? Is 740 bytes enough for a folder path? Is 400 bytes enough for title?
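
To put those byte limits into perspective, here is a rough Python sketch of how many characters fit into each of them for alphabets of different UTF-8 width. The limits are the ones quoted above; the function name and the table layout are illustrative.

```python
# Byte limits quoted in the text; the labels are just for display.
LIMITS = {"object_name": 255, "folder path": 740, "title": 400}

# UTF-8 width per character: Latin 1, Cyrillic 2, most CJK 3, emoji 4.
WIDTHS = {"latin": 1, "cyrillic": 2, "cjk": 3, "emoji": 4}


def max_chars(byte_limit, bytes_per_char):
    """Worst-case number of characters that fit into byte_limit bytes."""
    return byte_limit // bytes_per_char


for name, limit in LIMITS.items():
    row = ", ".join(
        "%s=%d" % (script, max_chars(limit, width))
        for script, width in WIDTHS.items()
    )
    print("%-12s %3d bytes: %s" % (name, limit, row))
```

In other words, a 255-byte object_name holds at most 127 Cyrillic characters, and fewer still for scripts that need three or four bytes per character.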

Ugly security model

I know of two common models for granting access permissions in Documentum:

  1. Centralized ACLs: an administrator/application sets up some pre-defined ACLs and assigns them to a document according to the document’s attributes
  2. Workflow-based ACLs: ACLs are assigned to a document during some business process, and target users get certain permissions on the document according to their roles in that business process

The problem is that it is not possible to mix both approaches; moreover, business users get confused by this: from the user’s perspective, sending a document to a reviewer through a Documentum workflow is the same as sending it by e-mail, and there is no need to grant extra permissions on a document when you send it by e-mail, since all recipients can read e-mail attachments. I tried to solve this problem using aliases, but the solution was very limited. The idea was to set up centralized ACLs as templates, something like:

API> dump,c,l

  object_name                     : my_acl
  description                     : my_acl
  owner_name                      : repo
  globally_managed                : F
  acl_class                       : 1


  r_is_internal                   : F
  r_accessor_name              [0]: dm_world
                               [1]: dm_owner
                               [2]: dm_superusers
                               [3]: %write_accessor01
                              [12]: %write_accessor10
                              [13]: %version_accessor01
                              [22]: %version_accessor10
                              [82]: %browse_accessor10
  r_accessor_permit            [0]: 1
                               [1]: 7
                               [2]: 7
                               [3]: 6
                              [12]: 6
                              [13]: 5
                              [22]: 5

and create a default alias set for all documents:

  alias_name                   [0]: write_accessor01
                               [9]: write_accessor10
                              [10]: version_accessor01
                              [19]: version_accessor10

After that it was possible to manage permissions in a workflow through the alias set while inheriting the pre-defined permissions from the ACL template. I think it’s obvious why the described approach is too limited.
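
To make the limitation concrete, here is a toy Python model of the scheme. It is not Documentum code: `resolve_acl` and the data layout are assumptions made for illustration, and only two of the permit levels from the dump are modelled. Each `%alias` accessor in the template is looked up in the document’s alias set and unfilled slots are dropped, so the number of accessors per permit level is capped by the number of slots prepared in the template:

```python
# Toy model of an ACL template whose accessors are alias references.
# Permit levels follow the dump above: 6 = write, 5 = version.
TEMPLATE = [("dm_world", 1), ("dm_owner", 7), ("dm_superusers", 7)]
TEMPLATE += [("%%write_accessor%02d" % i, 6) for i in range(1, 11)]
TEMPLATE += [("%%version_accessor%02d" % i, 5) for i in range(1, 11)]


def resolve_acl(template, alias_set):
    """Replace %alias accessors with values from alias_set.

    Entries whose alias has no value are skipped.
    """
    resolved = []
    for accessor, permit in template:
        if accessor.startswith("%"):
            accessor = alias_set.get(accessor[1:])
            if not accessor:
                continue
        resolved.append((accessor, permit))
    return resolved


# A workflow step fills a couple of slots for the current performers.
alias_set = {"write_accessor01": "reviewer1", "version_accessor01": "author1"}
for accessor, permit in resolve_acl(TEMPLATE, alias_set):
    print(accessor, permit)
```

In this model an eleventh reviewer with write access simply has nowhere to go.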

What needs to be done to make the Documentum security model more flexible? I thought about two options:

  1. Add the ability to assign multiple ACLs to a single document (e.g. make the acl_domain and acl_name attributes repeating, or use the same approach as in ACL templates, i.e. make the r_template_id attribute repeating)
  2. Extend the role model: currently Documentum gives special treatment only to the document’s owner, so why not add a couple of extra repeating attributes to dm_sysobject, like read_accessors, write_accessors, etc.?
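
Neither option exists in the product, so the following Python sketch is purely hypothetical. It models option 1, where a document carries several ACLs and the effective permit for a user is the highest one granted by any of them. The function and the ACL layout are invented for illustration, owner and superuser handling is ignored, and the permit numbers are the usual Documentum levels (1 = none ... 7 = delete).

```python
# Hypothetical model: a document with several ACLs instead of one.
NONE, BROWSE, READ, RELATE, VERSION, WRITE, DELETE = range(1, 8)


def effective_permit(user, groups, acls):
    """Highest permit granted to the user by any of the ACLs.

    Each ACL is a dict mapping accessor name to permit level; the user
    matches entries for its own name, its groups and dm_world.
    """
    names = {user, "dm_world"} | set(groups)
    best = NONE
    for acl in acls:
        for accessor, permit in acl.items():
            if accessor in names and permit > best:
                best = permit
    return best


# Centralized ACL chosen by document attributes.
central_acl = {"dm_world": NONE, "legal_dept": READ}
# ACL attached by a workflow for the current reviewers.
workflow_acl = {"reviewer1": WRITE}

print(effective_permit("reviewer1", ["sales_dept"], [central_acl, workflow_acl]))
print(effective_permit("bob", ["legal_dept"], [central_acl, workflow_acl]))
```

With such a rule the workflow could add or remove its own ACL without touching the centralized one.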

No scripting engine

Another embarrassing thing: Documentum has no embeddable scripting engine at all. For administrative routines they may offer dmbasic, which makes my eyes bleed. Compare:

Sub Main()
  If (dmAPIGet("connect,test,dmadmin,dmadmin")="") Then
    Print "Could not connect to Docbase"
    Exit Sub
  End If

  DmQuery = dmAPIGet("query,c,select r_object_id, object_name" & _
                       " from dm_document enable(return_top 10)")
  If (DmQuery = "") Then
     Print "Could not execute query"
     Exit Sub
  End If

  While(dmAPIExec("next,c," & DmQuery))
    ObjectName   = dmAPIGet("get,c," & DmQuery & ",object_name")
    Print ObjectName
  Wend

  ret = dmAPIExec("close,c," & DmQuery)

End Sub

with how the same would be written in Python:

from dctmpy.docbaseclient import DocbaseClient

def main():
    session = DocbaseClient(host="", port=12000,
                            username="dmadmin", password="dmadmin")
    for q in session.query("select r_object_id, object_name "
                           "from dm_document enable(return_top 10)"):
        print(q['object_name'])

if __name__ == '__main__':
    main()

or in Groovy:

withSession(docbase: "test",
        user: "dmadmin", password: "dmadmin", {
    it.query("select r_object_id, object_name " +
            "from dm_document enable(return_top 10)", {
        println it['object_name']
    })
})

or in Groovy again:

session = newSession(docbase: "test",
        user: "dmadmin", password: "dmadmin")
for (q in session.select("select r_object_id, object_name " +
        "from dm_document enable(return_top 10)")) {
    println q['object_name']
}

I bet EMC QA have a lot of fun with dmbasic. Though there are a lot of options for implementing an embeddable scripting engine in Java, EMC prefers to do nothing; moreover, they invent square wheels. Take a look at BPM: I can’t believe that it is really convenient to design anything in it, but in xCP EMC went further, and now BPM is the only option to customize xCP.

UPD. It seems that EMC has got an internet advocate from India: http://rarrao.wordpress.com/. Better they hadn’t.

I even made a screenshot to save this imbecility. I bet the word “embeddable” caused some difficulties in understanding, so I’m not even going to mention the abbreviation “DSL”.

Statement #1: The dmbasic script he pasted is very good with proper error handling and proper messages to the user.

Neither Python nor Groovy requires this kind of “proper error handling”, because both languages are modern and both know about exceptions:

Python 2.6.6 (r266:84292, Oct 12 2012, 14:23:48)
[GCC 4.4.6 20120305 (Red Hat 4.4.6-4)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from dctmpy.docbaseclient import DocbaseClient
>>> session = DocbaseClient(host="", port=12000,
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "dctmpy/docbaseclient.py", line 57, in __init__
  File "dctmpy/docbaseclient.py", line 308, in authenticate
    result = self.authenticate_user(self.username, self.obfuscate(self.password))
  File "dctmpy/docbaseclient.py", line 418, in inner
    return method(self, NULL_ID, name, request(self, *args), cls)
  File "dctmpy/rpc/__init__.py", line 113, in as_object
    return session.apply(RPC_APPLY_FOR_OBJECT, object_id, method, request, cls)
  File "dctmpy/docbaseclient.py", line 233, in apply
    response = self.rpc(rpc_id, [self._get_method(method), object_id, request])
  File "dctmpy/docbaseclient.py", line 214, in rpc
    raise RuntimeError(reason)
RuntimeError: DM_SESSION_E_AUTH_FAIL: nonexistinguser

Statement #2: If I have to write the python like code in dmbasic, it would look like this

The given example is syntactically incorrect:

 ~]$ cat > 1.ebs
Sub Main()
  DmQuery = dmAPIGet("query,c,select r_object_id, object_name" & _
                       " from dm_document enable(return_top 10)")
  While(dmAPIExec("next,c," & DmQuery))
    ObjectName   = dmAPIGet("get,c," & DmQuery & ",object_name")
    print ObjectName
  ret = dmAPIExec("close,c," & DmQuery)
End Sub
[dmadmin@docu70dev01 ~]$ dmbasic -f 1.ebs
dmbasic: Error 168 in line 2, col 43: Encountered: end of line
Expecting: '=', .

The full analog of the Python/Groovy code should look like:

Sub Main()
  session = dmAPIGet("connect,ssc_dev,dmadmin,dmadmin")

  If (session = "") Then
     Print dmAPIGet("getmessage,a")
     Exit Sub
  End If

  ' good news that "query" api method accepts query
  ' as last argument - no need to escape commas
  ' try to play with createaudit method
  DmQuery = dmAPIGet("query,c,select r_object_id, object_name" & _
                       " from dm_document enable(return_top 10)")

  If (DmQuery = "") Then
     Print dmAPIGet("getmessage,c")
     Exit Sub
  End If

  While(dmAPIExec("next,c," & DmQuery))
    ObjectName  = dmAPIGet("get,c," & DmQuery & ",object_name")
    DmErr = dmAPIGet("getmessage,c")
    If (DmErr <> "") Then
      Print DmErr
      Exit Sub
    End If
    Print ObjectName
  Wend
  ret = dmAPIExec("close,c," & DmQuery)
End Sub

Fuck yeah! getmessage after every api call!
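
For contrast, this is roughly what the exception-based style buys. The Python sketch below uses neither dmbasic nor dctmpy: `fake_api_get` is a stand-in I made up that follows the same convention of returning an empty string on failure, and `checked` wraps it once so that callers never have to test return values by hand.

```python
class ApiError(RuntimeError):
    """Raised when the wrapped call reports a failure."""


def fake_api_get(command):
    """Stand-in for a status-by-return-value API: '' means failure."""
    known = {"get,c,q0,object_name": "report.doc"}
    return known.get(command, "")


def checked(api_call):
    """Wrap api_call so that an empty result raises ApiError."""
    def wrapper(command):
        result = api_call(command)
        if result == "":
            raise ApiError("call failed: %s" % command)
        return result
    return wrapper


api_get = checked(fake_api_get)

try:
    print(api_get("get,c,q0,object_name"))
    print(api_get("get,c,q0,no_such_attr"))
except ApiError as error:
    # One handler covers every call in the block above.
    print(error)
```

The check is written once in the wrapper instead of after every call.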

Statement #3: Each language has its own pros and cons and are meant as tools to get the job done.

Yep, however Documentum has/had 20 (twenty) vulnerable (arbitrary code execution) docbase methods, and all of them are written in dmbasic.
