Session management. Beginning

It is worth reading about the probability density function before continuing with this post.

One of my skypemates is worried about the following performance problem. They developed an integration between a customer application and EMC Documentum. Initially the integration used DFS, but later, due to performance issues, they switched to pure DFC. After that they had to upgrade to the latest supported DFC version, and then the customer started complaining about performance. My skypemate associates these complaints with changes made in the latest versions of DFC (ISM), but I’m very skeptical about his suspicions. In my opinion any upgrade is a stress for a production environment and gives the customer a reason to complain. This is the nature of any customer: it always seems like a brilliant idea to bring up old issues right after something has been changed in production.

Though I do not share my skypemate’s opinion, his suspicions gave me the idea to write a series of posts about session management in DFC. But before discussing aspects of the DFC implementation we need to perform some basic measurements of typical DFC operations, otherwise we will not be able to tell which operations are relatively fast and which are slow. To perform these measurements I designed a series of microbenchmarks. There are about a dozen of them, but for now we are interested in only four: connection, authentication, object creation and fetch.

Benchmark methodology

The main benchmark class (tel.panfilov.documentum.benchmark.Benchmark) spawns a given number of threads. Each thread performs a certain operation in a loop and increments a global counter upon successful completion of the operation. The main thread reads the global counter at fixed time intervals and prints the difference from the previous value to standard output. Here I assume that all such measurements are independent. The diagrams shown below were created with the following parameters:

  1. number of concurrent threads = 1
  2. number of measurements = 1000
  3. sleep time between measurements in milliseconds = 10000

For example:

java tel.panfilov.documentum.benchmark.Benchmark ssc_dev dmadmin dmadmin \
> tel.panfilov.documentum.benchmark.impl.Connection 1 1000
Executions per 10000ms: 2, iteration: 1
Executions per 10000ms: 3, iteration: 2
Executions per 10000ms: 4, iteration: 3
Executions per 10000ms: 3, iteration: 4

...

Executions per 10000ms: 3, iteration: 998
Executions per 10000ms: 2, iteration: 999
Executions per 10000ms: 3, iteration: 1000
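
The measurement loop described above can be sketched in a few lines of Python. This is not the actual Java Benchmark class, only a minimal illustration of the same idea: `run_benchmark` and its parameters are names invented for this sketch, and the operation is a stand-in that sleeps instead of talking to a content server.

```python
import threading
import time


def run_benchmark(operation, threads=1, measurements=5, interval_ms=100):
    """Run `operation` in worker threads and sample a shared counter.

    Returns the number of successful executions seen in each interval.
    """
    counter = 0
    lock = threading.Lock()
    stop = threading.Event()

    def worker():
        nonlocal counter
        while not stop.is_set():
            try:
                operation()
            except Exception:
                continue  # only successful completions are counted
            with lock:
                counter += 1

    workers = [threading.Thread(target=worker, daemon=True)
               for _ in range(threads)]
    for w in workers:
        w.start()

    samples = []
    previous = 0
    for iteration in range(1, measurements + 1):
        time.sleep(interval_ms / 1000.0)
        with lock:
            current = counter
        # Report the difference from the previous reading, not the total.
        samples.append(current - previous)
        print("Executions per %dms: %d, iteration: %d"
              % (interval_ms, current - previous, iteration))
        previous = current

    stop.set()
    for w in workers:
        w.join()
    return samples


if __name__ == "__main__":
    # Stand-in operation (assumption): sleep 10 ms instead of a real DFC call.
    run_benchmark(lambda: time.sleep(0.01))
```

Because each sample is a difference between two readings of one counter, the samples can be fed directly into a histogram or density estimate, which is why the probability density function matters for reading the diagrams.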

Connection and authentication benchmarks

The Connection benchmark was designed not to leverage session pooling because I wanted to understand how slowly the Documentum stack performs connection handshaking, and the result is extremely poor, 6-8 connections per second:

Because authentication is a part of connection handshaking, I designed the Authentication benchmark to understand how slowly Documentum performs authentication. The result for a user with an inline password is 30-40 successful authentications per second (unix authentication is 4 times slower, and I suppose that LDAP authentication is slower still):

Compare these results with an Oracle database (OracleJDBC benchmark):

Why do the connection and authentication benchmarks demonstrate such poor performance?

Connection handshaking

When a DFC client tries to establish a new connection with the content server, it performs the following sequence of actions:

  1. establishes a TCP connection with the content server (after the TCP handshake the content server spawns a new process/thread, which establishes a connection with the underlying database)
  2. sends the “new session” RPC
  3. requests the list of available RPC commands by sending the “ENTRY_POINTS” RPC command
  4. sends the “AUTHENTICATE_USER” RPC command
  5. sends the “GET_ERRORS” RPC command
  6. receives available messages from the content server

Actually, the algorithm described above has the following performance gaps:

  1. it would be a good idea to maintain several spare or idle server processes/threads that stand ready to serve incoming requests, instead of spawning a new one for every connection
  2. there is no reason to request entry points on every “connection” request: entry points should be cached on the client side
  3. there is no reason to request messages from the server by sending the “GET_ERRORS” RPC command; this step is actually initiated by the content server, which sets extra flags in its response to the “AUTHENTICATE_USER” RPC command:
    [dmadmin@docu70dev01 ~]$ iapi
    Please enter a docbase name (docubase): ssc_dev
    Please enter a user (dmadmin):
    Please enter password for dmadmin:
    
    
    
            EMC Documentum iapi - Interactive API interface
            (c) Copyright EMC Corp., 1992 - 2012
            All rights reserved.
            Client Library Release 7.0.0130.0537
    
    
    Connecting to Server using docbase ssc_dev
    // receiving of this message is initiated by CS
    [DM_SESSION_I_SESSION_START]info:  "Session 0101ffd78010db5e started for user dmadmin."
    
    
    Connected to Documentum Server running Release 7.0.0140.0644  Linux.Oracle
    Session id is s0
    API>
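
The second gap in the list above, caching entry points on the client side, can be illustrated with a small sketch. It is not how DFC or dctmpy is implemented: `EntryPointCache` and `fake_fetch` are hypothetical names, and the entry point ids are made up. The only point is that the ENTRY_POINTS round trip runs once per (host, port, docbase id) instead of once per connection.

```python
import threading


class EntryPointCache:
    """Caches the RPC entry point table per (host, port, docbase id)."""

    def __init__(self, fetch):
        # `fetch` performs the real ENTRY_POINTS round trip (assumption).
        self._fetch = fetch
        self._cache = {}
        self._lock = threading.Lock()

    def get(self, host, port, docbase_id):
        key = (host, port, docbase_id)
        with self._lock:
            if key not in self._cache:
                # First connection to this docbase: pay for one round trip.
                self._cache[key] = self._fetch(host, port, docbase_id)
            return self._cache[key]


calls = []


def fake_fetch(host, port, docbase_id):
    # Hypothetical stand-in for the ENTRY_POINTS RPC; the ids are invented.
    calls.append((host, port, docbase_id))
    return {"ENTRY_POINTS": 1, "AUTHENTICATE_USER": 2}


cache = EntryPointCache(fake_fetch)
first = cache.get("192.168.2.56", 12000, 131031)
second = cache.get("192.168.2.56", 12000, 131031)
print("round trips performed:", len(calls))
```

A real cache would also need some invalidation rule, for example dropping the entry when the server version reported during the handshake changes. That rule is an assumption of this sketch, not something measured here.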
    

Furthermore, the content server does not cache a user’s credentials after successful authentication, so every time authentication is performed the content server “honestly” checks the credentials again. So the result is “predictable”. To confirm my suspicions about the suboptimal connection handshaking algorithm, I hacked my dctmpy library to measure how fast it performs only the first and second steps of connection handshaking (plus disconnect) and got the expected result: about 25 connections per second, i.e. a 3 times improvement:

[dmadmin@docu67dev02 ~]$ cat > test1.py
# coding=utf-8

from timeit import timeit


def main():
    setup = """\
from dctmpy.docbaseclient import DocbaseClient
session = DocbaseClient(
    host="192.168.2.56",
    port=12000,
    docbaseid=131031
)
entrypoints = session.entrypoints
session.disconnect()
    """

    stmt = """\
session = DocbaseClient(
    host="192.168.2.56",
    port=12000,
    entrypoints=entrypoints,
    docbaseid=131031
)
session.disconnect()
    """

    A = timeit(setup=setup, stmt=stmt, number=1000)
    print("%15s %6.2fs" % ("Python", A))


if __name__ == '__main__':
    main()
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  39.08s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  36.94s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  40.12s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  39.45s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  36.24s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  37.98s
[dmadmin@docu67dev02 ~]$ python test1.py
         Python  42.81s
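
The connection rate follows directly from the timings above, since each run performs 1000 connect/disconnect cycles. A small sketch of the arithmetic, using only the numbers printed above and the 6-8 connections per second baseline from the Connection benchmark; the comparison is rough, because the two measurements come from different clients:

```python
from statistics import mean

# Wall-clock seconds for 1000 connect/disconnect cycles, copied from the runs above.
timings = [39.08, 36.94, 40.12, 39.45, 36.24, 37.98, 42.81]
cycles = 1000

rates = [cycles / t for t in timings]
mean_rate = cycles / mean(timings)
print("min %.1f, max %.1f, overall %.1f connections per second"
      % (min(rates), max(rates), mean_rate))

# Full DFC handshake measured earlier in the post: 6-8 connections per second.
for baseline in (6, 8):
    print("speedup vs %d/s: %.1fx" % (baseline, mean_rate / baseline))
```

The individual runs land between roughly 23 and 28 connections per second, which is where the "about 25" figure comes from.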

Object creation and fetch

The results for these two benchmarks are disappointing too: I suppose they should be at least 10 times better.

About 20 sysobject creations per second:

About 80-100 fetches per second:

12 thoughts on “Session management. Beginning”

  1. Good work, Andrew. The research looks very persuasive. So CS and DFC seem to have quite a large reserve for optimization work 🙂. In spite of Moore’s law, it looks careless at the very least not to review the old architecture in modern circumstances.

  2. Relax, it’s just the beginning: in the next posts I’m going to demonstrate that ISM is a fake and show how to achieve the same results by putting just one line in Java code.

