Rumours, rumours, rumours…

This morning I discovered an amusing article about Documentum's future: Maybe OpenText Will Delight Documentum Users After All, by Virginia Backaitis. According to this article, the diagram below (I always thought that the purpose of any diagram is to visualise complex data or statements, but it seems that I was wrong):

proves that OpenText will invest in Documentum! LOL 🙂

PS. Shame on FME! These guys decided not to whistle for a wind and have silently improved their products instead. Such impatient guys – they should have waited two years and continued to lose money.

JMS high availability feature

The topic of this blogpost seems to be one of the most useless things in Documentum – I already mentioned that in What is wrong in Documentum. Part I, but it seems that the talented team is still trying to “improve” this “feature”:

so, I think this topic deserves a more thorough explanation.

Well, why do we need JMS at all? Frankly speaking, ten years ago I was bullshitting customers by describing JMS as “an embedded application server, purposed to execute custom business logic” – means nothing, but sounds good 🙂 In real life JMS is a single point of failure which causes 80% of issues. So, if it is so unstable and affects business users, we need to stabilise it somehow – and here lies the big difference between the talented team and me: when I see the phrase “single point of failure” I start thinking about how to eliminate or avoid the “point of failure”; when the talented team sees the same phrase, they think about how to eliminate the “single” (I have no idea why – maybe it is just hard to read the whole phrase).

Quick question: JMS in Documentum 7.3 supports both load balancing and failover (this was highlighted as a key enhancement) – but do we really need load-balancing capabilities? How do we answer this question? Put load on the Content Server and measure which resources are consumed by which components; typically, I get something like:

So, Content Server consumes five times more resources than JMS – do I need to load balance JMS? Definitely not! I need to load balance Content Servers.

Now about failover capabilities. Let’s list everything that is typically executed on JMS:

  • workflow methods – I can’t understand why the workflow agent is still a part of Content Server if JMS can do the same; moreover, there is no reason to failover workflow methods – slight delays in workflows caused by JMS failures do not affect business users, you just need to monitor JMS health (see the sketch after this list)
  • job methods – the same considerations as above (dm_agent_exec is a piece of shit)
  • lifecycle methods – looks like a design gap: I can’t understand why lifecycle transitions are implemented as docbase methods: the dm_bp_transition_java method just reads information about lifecycle states and executes the corresponding SBOs – why not do the same on the TBO side?
  • custom docbase methods – you are a bloody idiot if custom docbase methods are a part of your application
  • mail methods – looks like a design gap too: it would be more convenient to add a special flag to dmi_queue_item objects indicating whether the corresponding email has been sent or not
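
Monitoring JMS health does not require any HA machinery – periodically polling the DmMethods servlet is usually enough. A minimal sketch (the host, port and check interval below are assumptions, adjust them to your environment):

import java.net.HttpURLConnection;
import java.net.URL;

public class JmsHealthCheck {

    // hypothetical endpoint – host and port depend on your installation
    private static final String JMS_URL =
            "http://contentserver:9080/DmMethods/servlet/DoMethod";

    public static void main(String[] args) throws Exception {
        HttpURLConnection connection =
                (HttpURLConnection) new URL(JMS_URL).openConnection();
        connection.setConnectTimeout(5000);
        connection.setReadTimeout(5000);
        // HTTP 200 means the method servlet is alive and accepting requests
        int code = connection.getResponseCode();
        System.out.println("JMS is "
                + (code == 200 ? "UP" : "DOWN (HTTP " + code + ")"));
    }

}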

Did I miss something? Yes: in Documentum 7.3 the talented team implemented SAML authentication through JMS – taking an open-source C++ implementation was too difficult.

Going private

Recently I have noticed that ECD/IIG/whoeveridontcare employees use information from this blog in their semi-official documents without keeping references to the original blogposts, so I have decided to make this blog private starting from the 1st of April. Engaged readers will be able to request access – and I will be able to refuse.

Explanation for dfc.diagnostics.resources.enable

It is a bit weird, but everybody who has previously asked me about session leaks didn’t properly understand the root cause of the problem – the typical question was: “we started getting a lot of DM_API_E_NO_SESSION errors, we increased the value of the concurrent_sessions parameter in server.ini, but this didn’t help”. Well, I have written this blogpost in order to shed some light on session leaks in DFC. First of all, you may face the two following errors:

DfServiceException:: THREAD: main; MSG: [DM_SESSION_E_MAX_SESSIONS_EXCEEDED]
  error:  "Maximum number of server sessions exceeded"; ERRORCODE: 100; NEXT: null
	at com.documentum.fc.client.impl.docbase.DocbaseExceptionMapper.newException(DocbaseExceptionMapper.java:44)
	at com.documentum.fc.client.impl.docbase.DocbaseExceptionMapper.newException(DocbaseExceptionMapper.java:34)
	at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient.newSessionByAddr(NetwiseDocbaseRpcClient.java:118)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.beginSession(DocbaseConnection.java:299)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.open(DocbaseConnection.java:130)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.<init>(DocbaseConnection.java:100)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.<init>(DocbaseConnection.java:60)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionFactory.newDocbaseConnection(DocbaseConnectionFactory.java:26)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionManager.createNewConnection(DocbaseConnectionManager.java:180)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionManager.getDocbaseConnection(DocbaseConnectionManager.java:110)
	at com.documentum.fc.client.impl.session.SessionFactory.newSession(SessionFactory.java:23)
	at com.documentum.fc.client.impl.session.PrincipalAwareSessionFactory.newSession(PrincipalAwareSessionFactory.java:44)
	at com.documentum.fc.client.impl.session.PooledSessionFactory.newSession(PooledSessionFactory.java:49)
	at com.documentum.fc.client.impl.session.SessionManager.getSessionFromFactory(SessionManager.java:134)
	at com.documentum.fc.client.impl.session.SessionManager.newSession(SessionManager.java:72)
	at com.documentum.fc.client.impl.session.SessionManager.getSession(SessionManager.java:191)

and

DfServiceException:: THREAD: main; MSG: [DM_API_E_NO_SESSION]
  error:  "There are no more available sessions."; ERRORCODE: 100; NEXT: null
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionManager.getConnectionFromPool(DocbaseConnectionManager.java:168)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionManager.getDocbaseConnection(DocbaseConnectionManager.java:94)
	at com.documentum.fc.client.impl.session.SessionFactory.newSession(SessionFactory.java:23)
	at com.documentum.fc.client.impl.session.PrincipalAwareSessionFactory.newSession(PrincipalAwareSessionFactory.java:44)
	at com.documentum.fc.client.impl.session.PooledSessionFactory.newSession(PooledSessionFactory.java:49)
	at com.documentum.fc.client.impl.session.SessionManager.getSessionFromFactory(SessionManager.java:134)
	at com.documentum.fc.client.impl.session.SessionManager.newSession(SessionManager.java:72)
	at com.documentum.fc.client.impl.session.SessionManager.getSession(SessionManager.java:191)

The first one (DM_SESSION_E_MAX_SESSIONS_EXCEEDED) means that you are running out of Content Server sessions (the concurrent_sessions parameter in server.ini); the second one (DM_API_E_NO_SESSION) means that you are running out of DFC sessions (the dfc.session.max_count parameter in dfc.properties). The problem is that both errors may be caused either by incorrect sizing or by a programming mistake. So, before trying to investigate session leaks you need to rule out a sizing issue – this might be a bit challenging because dumb DFC does not provide any runtime statistics about sessions (recent DFC versions support the dfc.connection.profile_connections and dfc.connection.trace_connections_only flags in dfc.properties, which cause the com.documentum.fc.client.impl.connection.docbase.DocbaseConnectionManager class to write extra debug information into log files, but I would prefer something more reliable, JMX for example).
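
For reference, the two limits and the tracing flags mentioned above live in different files (the values below are purely illustrative, not recommendations):

# server.ini (Content Server side)
concurrent_sessions = 1000

# dfc.properties (DFC side)
dfc.session.max_count = 1000
# optional: make DocbaseConnectionManager write extra debug information
dfc.connection.profile_connections = true
dfc.connection.trace_connections_only = true

Anyway, let’s pretend that a sizing issue is not our case. What is next?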

The official documentation (DFC Development Guide) states the following:

and there are at least two mistakes:

  1. it also logs information about leaks of stream objects created through the ISysObjectInternal#getStream method (a closing sketch follows this list); the corresponding log message looks like:
    [Resource Housekeeper] ERROR com.documentum.fc.impl.util.io.InputStreamHandle$DisposableData  - [DFC_STREAM_NOT_CLOSED] Unclosed stream found during garbage collection, com.documentum.fc.client.content.impl.PullerInputStream@28aac04e.
    com.documentum.fc.impl.util.ThrowableStack: Stack when stream was created
    	at com.documentum.fc.client.content.impl.ContentInputStream.<init>(ContentInputStream.java:17)
    	at com.documentum.fc.client.content.impl.PusherPullerContentAccessor.buildStreamFromContext(PusherPullerContentAccessor.java:46)
    	at com.documentum.fc.client.content.impl.PusherPullerContentAccessor.getStream(PusherPullerContentAccessor.java:28)
    	at com.documentum.fc.client.content.impl.ContentAccessorFactory.getStream(ContentAccessorFactory.java:37)
    	at com.documentum.fc.client.content.impl.Store.getStream(Store.java:63)
    	at com.documentum.fc.client.content.impl.FileStore___PROXY.getStream(FileStore___PROXY.java)
    	at com.documentum.fc.client.content.impl.Content.getStream(Content.java:185)
    	at com.documentum.fc.client.content.impl.Content___PROXY.getStream(Content___PROXY.java)
    	at com.documentum.fc.client.content.impl.ContentManager.getStream(ContentManager.java:84)
    	at com.documentum.fc.client.content.impl.ContentManager.getStream(ContentManager.java:53)
    	at com.documentum.fc.client.DfSysObject.getStream(DfSysObject.java:1994)
  2. the last statement is incorrect; the correct statement is: if DFC weren’t buggy it would place an appropriate message in the log
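
Regarding the first item: to avoid DFC_STREAM_NOT_CLOSED the stream must be closed explicitly, along these lines (a minimal sketch; consumeContent is a hypothetical helper):

import java.io.InputStream;

import com.documentum.fc.client.IDfSysObject;

public class StreamHandling {

    // reads and discards the content, but always closes the stream
    public static void consumeContent(IDfSysObject object) throws Exception {
        InputStream stream = object.getContent();
        try {
            byte[] buffer = new byte[8192];
            while (stream.read(buffer) >= 0) {
                // ... process the chunk
            }
        } finally {
            stream.close();
        }
    }

}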

So, what happens when we enable dfc.diagnostics.resources.enable? The short answer is: if your code suffers from leaks of transaction sessions, DFC replaces them with a memory leak 🙂
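
By a “leak of transaction sessions” I mean code along these lines (a deliberately broken sketch; sessionManager is an IDfSessionManager, doSomething is a hypothetical method that may throw):

IDfSession session = sessionManager.getSession("DCTM_DEV");
session.beginTrans();
doSomething(session); // throws -> neither abortTrans() nor release() gets called
session.commitTrans();
sessionManager.release(session);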

Now the long answer. DFC manages the following types of sessions:

  • com.documentum.fc.client.impl.session.Session
  • com.documentum.fc.client.impl.session.WeakSessionHandle
  • com.documentum.fc.client.impl.session.StrongSessionHandle
  • com.documentum.fc.client.impl.session.DynamicSessionHandle

the first one is the real/original session, the last three are decorators. We typically deal with StrongSessionHandle (IDfSessionManager#getSession) and DynamicSessionHandle (IDfTypedObject#getSession):

@Test
public void test() throws Exception {
    IDfClientX clientX = new DfClientX();
    IDfClient client = clientX.getLocalClient();
    IDfSessionManager sessionManager = client.newSessionManager();
    IDfLoginInfo loginInfo = clientX.getLoginInfo();
    loginInfo.setUser("dmadmin");
    loginInfo.setPassword("dmadmin");
    sessionManager.setIdentity("DCTM_DEV", loginInfo);
    IDfSession session = sessionManager.getSession("DCTM_DEV");
    System.out.println("session managers return "
            + session.getClass().getName());
    IDfSysObject object = (IDfSysObject) session
            .getObjectByQualification("dm_document");
    System.out.println("typed objects store "
            + object.getSession().getClass().getName());
    sessionManager.release(session);
    sessionManager.flushSessions();
    System.out.println("we may use dynamic session handles "
            + "even if original session was released");
    object.getString("object_name");
    object.save();
    System.out.println("we may not use strong session handles "
            + "if it was released");
    object = (IDfSysObject) session.getObjectByQualification("dm_document");
}

session managers return com.documentum.fc.client.impl.session.StrongSessionHandle
typed objects store com.documentum.fc.client.impl.session.DynamicSessionHandle
we may use dynamic session handles even if original session was released
we may not use strong session handles if it was released

com.documentum.fc.common.DfRuntimeException: Using session handle that has already been released

	at com.documentum.fc.client.impl.session.StrongSessionHandle.referenceSession(StrongSessionHandle.java:64)
	at com.documentum.fc.client.impl.session.SessionHandle.referenceScopableSession(SessionHandle.java:74)
	at com.documentum.fc.client.impl.session.SessionHandle.getObjectByQualification(SessionHandle.java:752)

And when we enable dfc.diagnostics.resources.enable, DFC starts keeping track of StrongSessionHandle instances, i.e. if an instance of the StrongSessionHandle class wasn’t properly released, DFC logs a corresponding error. How would you implement such functionality? Actually, the implementation is pretty straightforward – we need to know the circumstances under which our object was constructed; in order to do that we save the construction stack into the object and log that stack in the finalize() method, something like:

import java.io.Closeable;
import java.io.IOException;

import org.junit.Test;

public class FinalizationDemo {

    private final boolean _diagnosticsEnabled = true;

    @Test
    public void test() throws Exception {
        try {
            while (true) {
                // the resource is never closed – leaked on purpose
                IResource resource = getResource();
            }
        } catch (OutOfMemoryError error) {
            error.printStackTrace();
            System.exit(1);
        }
    }

    private IResource getResource() {
        return new Resource();
    }

    interface IResource extends Closeable {

        boolean isClosed();

    }

    class Resource implements IResource {

        private boolean _closed;

        private Throwable _construction;

        // ballast that makes every leaked instance expensive
        private final byte[] data = new byte[1000 * 100];

        public Resource() {
            if (_diagnosticsEnabled) {
                _construction = new Exception();
            }
        }

        @Override
        public void close() throws IOException {
            System.out.println("Closed");
            _closed = true;
            _construction = null;
        }

        @Override
        public boolean isClosed() {
            return _closed;
        }

        @Override
        protected void finalize() throws Throwable {
            try {
                if (_construction != null && !isClosed()) {
                    // the resource was never closed – log where it was created
                    _construction.printStackTrace();
                }
            } finally {
                super.finalize();
            }
        }
    }

}

ECD’s implementation is different – instead of using finalizers they use weak references; moreover, before logging the error message they try to clean up resources (such behaviour suffers from the side effects described in the previous post), and their code looks something like:

import java.io.Closeable;
import java.io.IOException;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;

import org.junit.Test;

public class DisposableDemo {

    private final ReferenceQueue<Object> _queue = new ReferenceQueue<>();
    private final Map<Reference<?>, IDisposable> _data = Collections
            .synchronizedMap(new IdentityHashMap<Reference<?>, IDisposable>());

    private final boolean _diagnosticsEnabled = true;

    public DisposableDemo() {
        // watcher thread that disposes resources whose references get enqueued
        Thread thread = new Thread(new DisposingTask());
        thread.setPriority(Thread.MAX_PRIORITY);
        thread.setDaemon(true);
        thread.start();
    }

    class DisposingTask implements Runnable {
        @Override
        public void run() {
            try {
                Reference<?> reference;
                while ((reference = _queue.remove()) != null) {
                    IDisposable disposable = _data.remove(reference);
                    if (disposable == null) {
                        continue;
                    }
                    try {
                        disposable.dispose();
                    } catch (IOException ex) {
                        ex.printStackTrace();
                    }
                }
            } catch (InterruptedException ex) {
                Thread.currentThread().interrupt();
            }
        }
    }

    private void register(Object resource, IDisposable disposable) {
        WeakReference<Object> reference = new WeakReference<Object>(resource, _queue);
        _data.put(reference, disposable);
    }

    @Test
    public void test() throws Exception {
        try {
            while (true) {
                IResource resource = getResource();
                // System.gc();
            }
        } catch (OutOfMemoryError error) {
            error.printStackTrace();
            System.exit(1);
        }
    }

    private IResource getResource() {
        IResource resource = new Resource();
        ResourceDecorator decorator = new ResourceDecorator(resource);
        if (_diagnosticsEnabled) {
            register(decorator, decorator._disposable);
        }
        return decorator;
    }

    interface IDisposable {

        void dispose() throws IOException;

    }

    interface IResource extends Closeable {

        boolean isClosed();

    }

    class Resource implements IResource {

        private boolean _closed;

        // ballast that makes every leaked instance expensive
        private final byte[] data = new byte[1000 * 100];

        @Override
        public void close() throws IOException {
            System.out.println("Closed");
            _closed = true;
        }

        @Override
        public boolean isClosed() {
            return _closed;
        }

    }

    class ResourceDecorator implements IResource {

        private final IResource _resource;

        private final IDisposable _disposable;

        public ResourceDecorator(IResource resource) {
            _resource = resource;
            if (_diagnosticsEnabled) {
                _disposable = new Disposable(_resource);
            } else {
                _disposable = null;
            }
        }

        @Override
        public void close() throws IOException {
            _resource.close();
        }

        @Override
        public boolean isClosed() {
            return _resource.isClosed();
        }
    }

    // !!! must not refer to ResourceDecorator, otherwise the decorator
    // never becomes weakly reachable
    static class Disposable implements IDisposable {

        private IResource _resource;

        private final Throwable _construction;

        public Disposable(IResource resource) {
            _resource = resource;
            _construction = new Exception();
        }

        @Override
        public void dispose() throws IOException {
            if (!_resource.isClosed()) {
                // closing resource
                _resource.close();
                // log construction stacktrace
                _construction.printStackTrace();
            }
        }

    }

}
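
Note the design choice: the weak reference points to the decorator handed out to the client code, while Disposable keeps a strong reference to the underlying resource. Once the client code drops the decorator, the reference gets enqueued and the watcher thread is still able to close the resource and log the construction stacktrace. That is also why Disposable must not refer to the decorator – such a reference would keep the referent strongly reachable forever, and the diagnostics would never fire.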

Numbers…

  • EMC acquired Documentum for $2.2 billion in 2003-2004 (I suggest checking the US Inflation Calculator before arguing)
  • OpenText acquired ECD (Documentum, Captiva, XHive, Kazeon, etc) for $1.6 billion in 2016
  • According to Mark Barrenechea, at the moment of acquisition ECD had about 5600 customers; moreover, the number of OpenText customers was about the same (i.e. about 5000–6000)
  • SAP has about 345000 customers
  • Alfresco has about 1800 enterprise customers (a bit strange, because they reported growth of 500 customers per year in both 2010 and 2012)

Conclusions:

  • the SAP sales channel does not work: there is no chance that OpenText will improve Documentum sales due to the SAP partnership
  • ECD/IIG has lost about 10000 customers (I think that in 2003 Documentum had about 5000 customers, plus I take the Alfresco growth trend into account) due to poor support and non-targeted sales
  • OpenText will try to push InfoArchive to existing SAP customers (there is no big market for InfoArchive: Java API only, no connectors to other systems except SAP and SP, outdated technology)
  • OpenText will try to win those 10000 lost Documentum customers back

Job scheduling

I’m not sure what the most buggy part of Content Server is, but the most unpredictable one is agentexec – I do believe it was designed during a bad trip by a brain-damaged person hired under an equal opportunity employment program. I believe the diagram below will help you troubleshoot the behaviour of agentexec:

There is also an outdated document from EMC intended to explain job scheduling.