Maximum length of DQL statement

A week ago a colleague stumped me with the following question: “What is the maximum length of a DQL statement?”

The official documentation, as expected, says nothing about such a restriction, so I ran a small experiment:

from dctmpy.docbaseclient import DocbaseClient

def main():
    part1 = "SELECT r_object_id FROM dm_sysobject WHERE 1=1 "
    part2 = " AND 1=1" * 10000
    part3 = " ENABLE(RETURN_TOP 1)"
    session = DocbaseClient(host="192.168.2.52", port=10000,
                            username="test01", password="test01")
    for q in session.query("%s" % (part1 + part2 + part3)):
        print q['r_object_id']

if __name__ == '__main__':
    main()

but got a weird error:

error(10054, 'An existing connection was forcibly closed by the remote host')

Hmm, it seems that Content Server forcibly closes the connection when it receives a query that is too long. OK, let's use a “binary” search to find the maximum allowed length of a DQL statement:

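from dctmpy.docbaseclient import DocbaseClient


def main():
    part1 = "SELECT r_object_id FROM dm_sysobject WHERE 1=1 "
    part2 = " AND 1=1"
    part3 = " ENABLE(RETURN_TOP 1)"
    i = 20
    s = 0  # number of " AND 1=1" clauses in the longest query accepted so far
    while i >= 0:
        # Try to set bit i, i.e. send a query with s + 2 ** i clauses.
        args = part1 + part2 * (s + 2 ** i) + part3
        print "Trying query with length=%d" % len(args)
        try:
            # The server drops the connection on failure,
            # so open a fresh session for every attempt.
            session = DocbaseClient(host="192.168.2.52", port=10000,
                                    username="test01", password="test01")
            for q in session.query(args):
                pass
        except Exception:
            i -= 1
            continue
        s += 2 ** i
        i -= 1
    # Report the longest query that actually succeeded,
    # not merely the last one attempted.
    print "Maximum query length: %d" % len(part1 + part2 * s + part3)


if __name__ == '__main__':
    main()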

Result:

Trying query with length=8388676
Trying query with length=4194372
Trying query with length=2097220
Trying query with length=1048644
Trying query with length=524356
Trying query with length=262212
Trying query with length=131140
Trying query with length=65604
Trying query with length=32836
Trying query with length=49220
Trying query with length=57412
Trying query with length=61508
Trying query with length=63556
Trying query with length=64580
Trying query with length=64068
Trying query with length=63812
Trying query with length=63940
Trying query with length=63876
Trying query with length=63844
Trying query with length=63828
Trying query with length=63820
Maximum query length: 63820
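
The lengths in this log follow directly from the query template: every attempt is 68 fixed characters plus 8 characters per " AND 1=1" clause. Below is a minimal, server-free sketch of the same bit-by-bit search. The function names, `fake_server_accepts` and its `LIMIT` of 64000 are made-up stand-ins for the real Content Server, not measured values, and the code is written to run on both Python 2 and Python 3.

```python
PART1 = "SELECT r_object_id FROM dm_sysobject WHERE 1=1 "
PART2 = " AND 1=1"
PART3 = " ENABLE(RETURN_TOP 1)"


def query_length(n):
    """Length of the test query that contains n extra ' AND 1=1' clauses."""
    return len(PART1) + n * len(PART2) + len(PART3)


def largest_accepted(accepts, max_bit=20):
    """Return the largest n below 2 ** (max_bit + 1) for which accepts(n) is true.

    Assumes accepts is monotonic: once it fails for some n,
    it fails for every larger n as well.
    """
    n = 0
    for bit in range(max_bit, -1, -1):
        candidate = n + 2 ** bit
        if accepts(candidate):
            n = candidate  # keep this bit set
    return n


# Hypothetical stand-in for the server: accept queries of up to LIMIT characters.
LIMIT = 64000


def fake_server_accepts(n):
    return query_length(n) <= LIMIT


best = largest_accepted(fake_server_accepts)
print("clauses=%d length=%d" % (best, query_length(best)))
```

Against a real server, `accepts` would open a new session, run the query, and return False on any exception.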

Now let's try the resulting statement in DQL Editor. It fails with an error that mentions SET_PUSH_OBJECT_STATUS.

Hmm, SET_PUSH_OBJECT_STATUS is an RPC command, so this error means that for long queries DFC sends extra RPCs, and that my home-grown implementation of the Documentum protocol differs from EMC’s. After some research I found that DFC sends long RPCs in the following way:

  • apply,c,,SET_PUSH_OBJECT_STATUS,_PUSHED_ID_,ID,objectId,_PUSH_STATUS_,B,T
  • apply,c,objectId,RPC_NAME,CHUNK1
  • apply,c,objectId,RPC_NAME,CHUNK2
  • apply,c,objectId,RPC_NAME,CHUNKN
  • apply,c,objectId,RPC_NAME,_USE_SESSION_CHUNKED_OBJ_STRING_
  • apply,c,,SET_PUSH_OBJECT_STATUS,_PUSHED_ID_,ID,objectId,_PUSH_STATUS_,B,F

In the case of the EXEC RPC command, DFC sends 0000000000000000 as objectId, so the SET_PUSH_OBJECT_STATUS RPC command raises a DM_SESSION_E_NON_EXIST_OBJ error. After some meditation over the Python code I decided that it would be a good idea to send the session identifier as objectId instead.
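
As a rough illustration only (this is not the actual dctmpy change), the call sequence listed above could be modelled like this. The function name, the tuple layout, the `chunk_size` parameter and the sample object id are all assumptions made for the sketch. The real chunk size and wire format are not known here.

```python
CHUNK_MARKER = "_USE_SESSION_CHUNKED_OBJ_STRING_"


def chunked_rpc_calls(object_id, rpc_name, payload, chunk_size):
    """List the apply calls for one long RPC, in the order given in the list above.

    Each call is modelled as a tuple (object id, RPC name, arguments).
    chunk_size is a made-up parameter for this sketch.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    def push_status(flag):
        return ("", "SET_PUSH_OBJECT_STATUS",
                ["_PUSHED_ID_", "ID", object_id, "_PUSH_STATUS_", "B", flag])

    calls = [push_status("T")]  # first call in the list: _PUSH_STATUS_ = T
    for start in range(0, len(payload), chunk_size):
        chunk = payload[start:start + chunk_size]
        calls.append((object_id, rpc_name, [chunk]))
    calls.append((object_id, rpc_name, [CHUNK_MARKER]))  # marker call
    calls.append(push_status("F"))  # last call in the list: _PUSH_STATUS_ = F
    return calls


# Placeholder object id and a tiny payload, just to print the sequence.
for obj, name, args in chunked_rpc_calls("0100000000000001", "EXEC",
                                         "abcdefghij", 4):
    print(",".join(["apply", "c", obj, name] + args))
```

The output uses the same `apply,c,...` notation as the list, with one line per call.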

And now my implementation is able to execute much longer DQL statements:

from dctmpy.docbaseclient import DocbaseClient


def main():
    part1 = "SELECT r_object_id FROM dm_sysobject WHERE 1=1 "
    part2 = " AND 1=1" * 20000
    part3 = " ENABLE(RETURN_TOP 1)"
    print "Query length: %d" % (len(part1 + part2 + part3))
    session = DocbaseClient(host="192.168.2.52", port=10000,
                            username="test01", password="test01")
    for q in session.query("%s" % (part1 + part2 + part3)):
        print "r_object_id: %s" % q['r_object_id']


if __name__ == '__main__':
    main()

Result:

Query length: 160068
r_object_id: 3a01fd0880000153

This makes it clear that DFC has a bug that prevents it from running DQL queries longer than roughly 63,000 bytes. Unfortunately, the DQL parser in Content Server is so slow that executing queries longer than 128K is practically pointless anyway:

from time import time

from dctmpy.docbaseclient import DocbaseClient


def main():
    part1 = "SELECT r_object_id FROM dm_sysobject WHERE 1=1 "
    part2 = " AND 1=1"
    part3 = " ENABLE(RETURN_TOP 1)"
    i = 0
    session = DocbaseClient(host="192.168.2.52", port=10000,
                            username="test01", password="test01")
    while True:
        i += 1
        args = part1 + part2 * (2 ** i) + part3
        t = time()
        for q in session.query("%s" % args):
            pass
        print "%d\t%d" % (len(args), time() - t)


if __name__ == '__main__':
    main()
