Q & A. XVI

I may be working on a D2 implementation project that could be released on a public site. I do not have up-to-date information regarding D2 4.7 security holes: I need an independent point of view, and you are probably the only person who has a clear understanding of what I am talking about. Can you help me understand what has not yet been fixed in the D2 layer itself?

Current D2 security status: any authenticated user may gain superuser privileges 🙂

Q & A. XIV

Hi,
I need to link multiple documents to a folder in Documentum.
Those documents are checked out. Can you please advise how I can achieve this?

Thanks
Ram

Unfortunately, this guy didn’t provide a good description of his problem, so let’s pretend he is trying to do something like:

update dm_document objects
link '/target/folder'
where ...

and getting something like:

API> ?,c,update dm_document objects link '/Temp' where r_object_id='09024be98006ab34'
[DM_QUERY_F_UP_SAVE]fatal:  "UPDATE:  An error has occurred during a save operation."

[DM_SYSOBJECT_E_LOCKED]error:  
  "The operation on  sysobject was unsuccessful because it is locked by user DCTM_DEV."


API> get,c,09024be98006ab34,r_lock_owner
...
DCTM_DEV

Now watch my hands – it’s magic:

API> ?,c,alter group dm_escalated_allow_save_on_lock  add dmadmin

API> ?,c,update dm_document objects link '/Temp' where r_object_id='09024be98006ab34'
objects_updated
---------------
              1
(1 row affected)
[DM_QUERY_I_NUM_UPDATE]info:  "1 objects were affected by your UPDATE statement."


API> revert,c,09024be98006ab34
...
OK
API> get,c,09024be98006ab34,r_lock_owner
...
DCTM_DEV
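
For readers who work through DFC rather than iapi, the same trick can be scripted with DQL; below is a minimal sketch (docbase, credentials, folder path and r_object_id are placeholders taken from the example above, and the account is assumed to have enough privileges to alter the group):

import com.documentum.com.DfClientX;
import com.documentum.fc.client.DfQuery;
import com.documentum.fc.client.IDfQuery;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.common.DfLoginInfo;

// A minimal sketch: escalate the current account via the
// dm_escalated_allow_save_on_lock group, then re-run the UPDATE ... LINK query.
public class LinkLockedDocument {
    public static void main(String[] args) throws Exception {
        IDfSession session = new DfClientX().getLocalClient()
                .newSession("DCTM_DEV", new DfLoginInfo("dmadmin", "dmadmin"));
        try {
            new DfQuery("alter group dm_escalated_allow_save_on_lock add dmadmin")
                    .execute(session, IDfQuery.DF_EXEC_QUERY).close();
            new DfQuery("update dm_document objects link '/Temp' "
                    + "where r_object_id='09024be98006ab34'")
                    .execute(session, IDfQuery.DF_EXEC_QUERY).close();
        } finally {
            session.disconnect();
        }
    }
}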

Q & A. XIII

Q:

Is there any way to directly run a Java server method on a specific Docbase service instance (in HA setup)?

I have a workaround with a job set up to run on a designated target server; however, I want to execute a method instantly, via the “EXECUTE do_method…” call for example, instead of relying on the agent exec scheduler.

Thanks.

A:

The answer depends on what you really want to achieve, so let me explain. When you execute a docbase method (the discussion below is about Java methods only) there are at least three participants:

  1. The Content Server, which sends an HTTP request to the JMS and waits for the response – this is your current session
  2. The JMS instance that executes your docbase method – in a normal situation the Content Server always prefers the “embedded” JMS instance
  3. The Content Server the JMS instance connects back to – this depends on how your docbase method is written (for example, for workflow and job methods the Content Server provides the -docbase_name argument in <docbase>.<server> form, so it binds the JMS instance to a specific Content Server)

As regards your problem…

If you want to stay within “EXECUTE do_method…” the answer is: “No, you can’t influence how the Content Server selects a JMS instance; if you want to execute a docbase method on a specific JMS you need to connect to the corresponding Content Server”.

However, if your objective is something like “We need to execute a com.documentum.fc.methodserver.IDfMethod implementation on a specific JMS instance”, the answer is “Yes, it is possible”:

API> retrieve,c,dm_server_config
...
3d024be980000102
API> dump,c,l
...  


  app_server_name [0]: do_method
                  [1]: do_mail
                  [2]: do_bpm
                  [3]: heavy_proc <- this one runs the show
  app_server_uri  [0]: http://localhost:9080/DmMethods/servlet/DoMethod
                  [1]: http://localhost:9080/DmMail/servlet/DoMail
                  [2]: http://localhost:9080/bpm/servlet/DoMethod
                  [3]: http://localhost:8888/


~]$ nc -l 8888
... waiting ...

API> apply,c,,HTTP_POST,
      APP_SERVER_NAME,S,heavy_proc,
      ARGUMENTS,S,'
        -method_verb MyMethodClass 
        -__dm_docbase__ DCTM_DEV 
        -__dm_server_config__ DCTM_DEV 
        -docbase_name DCTM_DEV.DCTM_DEV 
        -user_name dmadmin 
      ',TRACE_LAUNCH,B,T
...
q0

~]$ nc -l 8888
POST / HTTP/1.1
User-Agent: Documentum Server 7.2.0030.0195  Linux64.Oracle (HTTP Client)
Host: localhost:8888
Connection: close
Content-Type: application/x-www-form-urlencoded
Content-Length: 567

__signature_params__=method_verb.....
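
The same HTTP_POST apply call can be issued from DFC via IDfSession#apply; here is a rough sketch (the parameter values mirror the iapi example above, “session” is an already established IDfSession, and MyMethodClass/DCTM_DEV/dmadmin are the placeholders used earlier):

import com.documentum.fc.client.IDfCollection;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.common.DfException;
import com.documentum.fc.common.DfList;
import com.documentum.fc.common.IDfList;

public class HttpPostExample {
    // Posts the method arguments to the app server registered as "heavy_proc".
    public static void postToJms(IDfSession session) throws DfException {
        IDfList params = new DfList();
        IDfList types = new DfList();
        IDfList values = new DfList();

        params.appendString("APP_SERVER_NAME");
        types.appendString("S");
        values.appendString("heavy_proc");

        params.appendString("ARGUMENTS");
        types.appendString("S");
        values.appendString("-method_verb MyMethodClass -__dm_docbase__ DCTM_DEV"
                + " -__dm_server_config__ DCTM_DEV -docbase_name DCTM_DEV.DCTM_DEV"
                + " -user_name dmadmin");

        params.appendString("TRACE_LAUNCH");
        types.appendString("B");
        values.appendString("T");

        // no target object id is required here, hence null
        IDfCollection result = session.apply(null, "HTTP_POST", params, types, values);
        try {
            while (result.next()) {
                // dump whatever attributes the server returns
                for (int i = 0; i < result.getAttrCount(); i++) {
                    String name = result.getAttr(i).getName();
                    System.out.println(name + ": " + result.getString(name));
                }
            }
        } finally {
            result.close();
        }
    }
}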

Q & A. XII

Yesterday I was asked a weird question, and the answer to that question deserves an individual blogpost 🙂

Q:

Hi,

I always try to follow your posts, and most of the time they just leave me confused because some of the conclusions drawn here contradict what the EMC guides say. I am currently looking into an issue where the application throws a “No more available sessions” error. One of the approaches I was thinking of suggesting was using the session manager to release sessions instead of session.disconnect, which we have seen cause some issues in the past for one of our applications. Can you throw some light on this?

Further, from what I see above – is this always the recommended approach for getting a session?
IDfSessionManager sMgr = client.newSessionManager();
IDfSession session = sMgr.getSession("docbase_name");

Is there any particular scenario where a private session obtained via “IDfSessionManager.newSession()” is used?
A video from EMC architect that uses this code – https://www.youtube.com/watch?v=YjcxOfiiCNM

A:

This blog is not about “how to do Documentum routines”; it is about challenges I have faced and how I have managed those challenges. If you think my opinion is inconsistent with EMC’s documentation/guides/white papers/etc., you are always free to take advantage of your support contract and ask EMC why their documentation is a piece of shit, but don’t ask me about that – when I write a blogpost I always try to provide a code snippet to prove my opinion.
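
For reference, the getSession()/release() pattern the question refers to looks roughly like the snippet below (a minimal sketch; docbase name and credentials are placeholders):

import com.documentum.com.DfClientX;
import com.documentum.fc.client.IDfClient;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSessionManager;
import com.documentum.fc.common.DfLoginInfo;

public class SessionManagerDemo {
    public static void main(String[] args) throws Exception {
        IDfClient client = new DfClientX().getLocalClient();
        IDfSessionManager manager = client.newSessionManager();
        manager.setIdentity("DCTM_DEV", new DfLoginInfo("dmadmin", "dmadmin"));

        IDfSession session = manager.getSession("DCTM_DEV");
        try {
            System.out.println(session.getServerVersion());
        } finally {
            // hand the session back to the manager instead of calling disconnect()
            manager.release(session);
        }
    }
}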

Feedback storm

Last week I received a dozen requests related to one of my recent blogposts, and all those requests contained the same two questions:

  • Why is it password protected?
  • When will it be publicly available?

The short story is: after installing the latest Content Server patches I faced severe compatibility issues, and further research revealed that those compatibility issues were caused by “security enhancements” which do not really look like security fixes. For example, I spent about 6 hours finding out why my application had started failing on the latest patchset and how to disable the new weird behaviour, and just 30 minutes writing a new proof of concept which bypasses the new “security enhancements”. After that I performed a deep analysis of the last ten security vulnerabilities announced by EMC as remediated and found the same problem: nothing was fixed; moreover, some of them contain such crude mistakes that they look more like backdoors than mistakes. At the moment I’m trying to bring together all the information I have, and this blogpost will be publicly available soon.

Q & A. XI

Hopefully, not too boring – if it is, please just reply “sorry, too boring” so I don’t get my hopes up.

I stumbled across this post, which leads me to believe you may have more insight into session management than most – Session management. Misconceptions

I was a bit disturbed by your comment about the white-paper – I was so excited when I found it, and now you say it’s filled with misconceptions – yikes!

I have a Web ‘application’ that serves as a public site – no logins at all. When I first took over this site our Content Server was crashing constantly, running out of memory. Why…because the public application was creating a new session for each and every user of the site. The same Content Server also supports a non-public login-based site, meaning every user has their own Documentum login. This application historically used a lot of CS memory for each session on the CS (not a 1-to-1 I know).

We started having an issue (out of nowhere) this week which severely impacted the public application. For the past 5-6 years this application has run supporting up to 300 concurrent users off of ~2 client DFC sessions. We are able to do this because the majority of our requests are NOT served by DCTM; instead we use a full-text index (not xPlore). We finally had to take the entire site offline to try to figure out what the heck was going on.

A few pieces of info led us to believe that the issues were related to a slow storage unit backing one of our Filestores. With no traffic flowing to the app, one person tried to download a content file…while waiting for about 1-minute, another user hit another page that runs 3 simple DQL queries. From the application side of things, we saw no response on the page with the queries, UNTIL the content download had returned. This sure appears to be a threading/contention issue…

We have run a few tests since, but haven’t yet nailed down anything. I can tell you that we use @Singleton to mark our application’s single session manager. That session manager provides a session via getSession() to each class that asks for one – regardless of HTTP session or request. In my attempt to band-aid the issue, I switched the session request for the slow file download to newSession(). Not only did this seem to prevent other requests from waiting on the slow download, it also suddenly resulted in lightning fast downloads. I was concerned that this might cause too many sessions, as some people scrape our site for the content downloads – however so far we’ve only got up to 15 sessions on the Content Servers from the ‘public’ user.

I am not super experienced with threading, but a quick play around in the code, adding a thread sleep after a getSession() is called, then opening a new browser tab and asking for another page did not show any waiting. This makes me think it’s something within the Session itself that has a threading issue?

Thanks for any thoughts you can share!

As regards your feelings: in my universe (not EMC’s) white-papers are not intended to fill documentation gaps; a white-paper may be a good starting point for working with a product/technology, but nothing else, and white-papers typically contain information about the authors (which gives the reader a clue about the author’s competence – EMC’s white-papers have no such information, so I consider all EMC’s white-papers a piece of shit simply because some of them are wrong and misleading), product versions, purpose and target audience. But EMC, instead of improving their documentation, tries to fill documentation gaps with white-papers; the problem is that white-papers are not documentation and they do not provide a contract about product behaviour (documentation does).

As regards your problem…

When I first took over this site our Content Server was crashing constantly, running out of memory. Why…because the public application was creating a new session for each and every user of the site.

I believe you are talking about some ancient 32-bit Content Server builds for MS Windows. Indeed, there was such a problem: on the one hand MS Windows has limits on concurrent TCP connections, though those limits are far beyond practical needs; on the other hand, on MS Windows the Content Server services client connections using threads within a single process, so the number of concurrent sessions the Content Server can create is also limited by the amount of memory a single process can consume – for 32-bit builds this limit is about 2GB (I’m not an MS Windows expert, so the real values may differ, but the main idea is correct). There is also an interesting fact about UNIX builds: on UNIX the Content Server services client connections using separate processes, so technically the number of concurrent client connections is limited only by the amount of memory available on the CS host, but even in this case EMC puts dumb limits on the number of concurrent sessions.

With no traffic flowing to the app, one person tried to download a content file…while waiting for about 1-minute, another user hit another page that runs 3 simple DQL queries. From the application side of things, we saw no response on the page with the queries, UNTIL the content download had returned. This sure appears to be a threading/contention issue…

I do think that either your diagnosis is wrong, or you have not provided enough information, or you have missed something (actually, taking a thread dump will give you more clues than just analysing code). Let me explain. Your hypothesis is: “when we share a DFC session among threads and one thread performs a content transfer, other threads are unable to execute DQL queries”. Relying on the input data you have provided, I can say without a doubt that this hypothesis is clearly untenable. The justification is: DFC transfers content from the Content Server in 64K chunks; to request a chunk DFC sends an RPC to the Content Server, and when DFC executes DQL queries it also sends RPCs to the Content Server; RPC calls are serialised (you can’t send multiple RPC calls simultaneously), so there is contention between threads, but the behaviour differs from yours. Below is a proof of concept – I created a document of about 1GB and try to download it and perform queries simultaneously; my expectation is that if the “synchronized” keyword in Java provided fair scheduling, my code would be able to perform about 1024*1024/64/2=8192 “simultaneous” queries (1024*1024/64 is the number of RPCs required to transfer the content, 2 is the number of RPCs required to execute a query):

public static void main(String[] args) throws Exception {
	final IDfClient client = new DfClientX().getLocalClient();
	client.getClientConfig().setString(DfPreferences.DFC_DOCBROKER_HOST, "192.168.0.253");
	final IDfSession session = client.newSession("DCTM_DEV", new DfLoginInfo("dmadmin", "dmadmin"));
	IDfSysObject sysObject = (IDfSysObject) session.getObject(DfId.valueOf("08024be980023d9d"));
	Thread t1 = new Thread(new Runnable() {
		public void run() {
			try {
				((IDfSysObject) session.getObject(DfId.valueOf("08024be980023d9d"))).getFile(null);
			} catch (DfException ex) {
				throw new RuntimeException(ex);
			}
		}
	});
	Thread t2 = new Thread(new Runnable() {
		public void run() {
			int count = 0;
			try {
				while (!Thread.currentThread().isInterrupted()) {
					IDfQuery q = new DfQuery("select count(*) from dm_server_config");
					q.execute(session, IDfQuery.DF_EXEC_QUERY).close();
					count++;
				}
			} catch (Exception ex) {
				// ignore
			}
			System.out.println(count);
		}
	});
	t1.start();
	t2.start();
	t1.join();
	t2.interrupt();
}

The actual result is about 9000-10000 queries, and the corresponding thread dumps are:

"Thread-9" prio=5 tid=0x00007ff3739b6800 nid=0x5b07 waiting for monitor entry [0x000070000185e000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection$IteratorSynchronizer.getType(DocbaseConnection.java:1971)
	- waiting to lock <0x0000000703009670> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.collection.TypedDataCollection.getTypedData(TypedDataCollection.java:39)
	at com.documentum.fc.client.impl.collection.TypedDataCollection.<init>(TypedDataCollection.java:46)
	at com.documentum.fc.client.impl.collection.TypedDataCollection.newInstance(TypedDataCollection.java:31)
	at com.documentum.fc.client.DfQuery.runQuery(DfQuery.java:167)
	at com.documentum.fc.client.DfQuery.execute(DfQuery.java:216)
	at QAXI$2.run(QAXI.java:50)
	at java.lang.Thread.run(Thread.java:745)

"Thread-8" prio=5 tid=0x00007ff3738ae800 nid=0x360b runnable [0x000070000175a000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:152)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
	at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:554)
	at sun.security.ssl.InputRecord.read(InputRecord.java:509)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:934)
	- locked <0x0000000703141a80> (a java.lang.Object)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:891)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
	- locked <0x000000070322d4c0> (a sun.security.ssl.AppInputStream)
	at com.documentum.fc.impl.util.io.MessageChannel.readSocket(MessageChannel.java:129)
	at com.documentum.fc.impl.util.io.MessageChannel.readLength(MessageChannel.java:100)
	at com.documentum.fc.impl.util.io.MessageChannel.getIncomingMessageLength(MessageChannel.java:92)
	at com.documentum.fc.impl.util.io.MessageChannel.read(MessageChannel.java:77)
	- locked <0x00000007031b50f0> (a com.documentum.fc.impl.util.io.MessageChannel)
	at com.documentum.fc.client.impl.connection.netwise.AbstractNetwiseRpcClient.receiveMessage(AbstractNetwiseRpcClient.java:183)
	at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient.getBlock(NetwiseDocbaseRpcClient.java:1023)
	- locked <0x00000007031b5068> (a com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.getBlock(DocbaseConnection.java:1473)
	- locked <0x0000000703009670> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.connection.docbase.RawPuller.getBlock(RawPuller.java:52)
	- locked <0x0000000703009670> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.content.impl.BlockPuller.nextBlock(BlockPuller.java:49)
	at com.documentum.fc.client.content.impl.PullerInputStream.getNextBuffer(PullerInputStream.java:73)
	at com.documentum.fc.client.content.impl.PullerInputStream.ensureBufferHasData(PullerInputStream.java:63)
	at com.documentum.fc.client.content.impl.PullerInputStream.read(PullerInputStream.java:88)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:50)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:30)
	at com.documentum.fc.client.content.impl.LocalContentFile.<init>(LocalContentFile.java:38)
	at com.documentum.fc.client.content.impl.LocalContentManager$LocalContentDirectory.createFile(LocalContentManager.java:480)
	at com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager.createLocalContentFile(LocalContentManager.java:363)
	- locked <0x00000007cd220198> (a com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager)
	at com.documentum.fc.client.content.impl.LocalContentManager.createContentFile(LocalContentManager.java:148)
	at com.documentum.fc.client.content.impl.ContentManager.namelessGetFile(ContentManager.java:253)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:198)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:173)
	at com.documentum.fc.client.DfSysObject.getFileEx2(DfSysObject.java:1972)
	- locked <0x000000070310dfe0> (a com.documentum.fc.client.DfSysObject)
	at com.documentum.fc.client.DfSysObject.getFileEx(DfSysObject.java:1964)
	at com.documentum.fc.client.DfSysObject.getFile(DfSysObject.java:1959)
	at com.documentum.fc.client.DfSysObject___PROXY.getFile(DfSysObject___PROXY.java)
	at QAXI$1.run(QAXI.java:38)
	at java.lang.Thread.run(Thread.java:745)

or:

"Thread-9" prio=5 tid=0x00007fc9f322f000 nid=0x5b0f runnable [0x000070000185e000]
   java.lang.Thread.State: RUNNABLE
	at java.net.SocketInputStream.socketRead0(Native Method)
	at java.net.SocketInputStream.read(SocketInputStream.java:152)
	at java.net.SocketInputStream.read(SocketInputStream.java:122)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:442)
	at sun.security.ssl.InputRecord.read(InputRecord.java:480)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:934)
	- locked <0x00000007b2bb45b0> (a java.lang.Object)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:891)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:102)
	- locked <0x00000007b2bb4660> (a sun.security.ssl.AppInputStream)
	at com.documentum.fc.impl.util.io.MessageChannel.readSocket(MessageChannel.java:129)
	at com.documentum.fc.impl.util.io.MessageChannel.readLength(MessageChannel.java:100)
	at com.documentum.fc.impl.util.io.MessageChannel.getIncomingMessageLength(MessageChannel.java:92)
	at com.documentum.fc.impl.util.io.MessageChannel.read(MessageChannel.java:77)
	- locked <0x00000007b2bb4690> (a com.documentum.fc.impl.util.io.MessageChannel)
	at com.documentum.fc.client.impl.connection.netwise.AbstractNetwiseRpcClient.receiveMessage(AbstractNetwiseRpcClient.java:183)
	at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient.closeCollection(NetwiseDocbaseRpcClient.java:934)
	- locked <0x00000007b2bb4710> (a com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient)
	at com.documentum.fc.client.impl.connection.docbase.netwise.NetwiseDocbaseRpcClient$TypedDataIterator.close(NetwiseDocbaseRpcClient.java:1366)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection$IteratorSynchronizer.close(DocbaseConnection.java:1963)
	- locked <0x00000007b2bb4790> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.collection.TypedDataCollection.close(TypedDataCollection.java:54)
	- locked <0x00000007b10120d0> (a com.documentum.fc.client.impl.collection.TypedDataCollection)
	at com.documentum.fc.client.impl.collection.CollectionHandle.close(CollectionHandle.java:42)
	at QAXI$2.run(QAXI.java:50)
	at java.lang.Thread.run(Thread.java:745)

"Thread-8" prio=5 tid=0x00007fc9f322e000 nid=0x320b waiting for monitor entry [0x000070000175a000]
   java.lang.Thread.State: BLOCKED (on object monitor)
	at com.documentum.fc.client.impl.connection.docbase.RawPuller.getBlock(RawPuller.java:52)
	- waiting to lock <0x00000007b2bb4790> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.content.impl.BlockPuller.nextBlock(BlockPuller.java:49)
	at com.documentum.fc.client.content.impl.PullerInputStream.getNextBuffer(PullerInputStream.java:73)
	at com.documentum.fc.client.content.impl.PullerInputStream.ensureBufferHasData(PullerInputStream.java:63)
	at com.documentum.fc.client.content.impl.PullerInputStream.read(PullerInputStream.java:88)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:133)
	at java.io.FilterInputStream.read(FilterInputStream.java:107)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:50)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:30)
	at com.documentum.fc.client.content.impl.LocalContentFile.<init>(LocalContentFile.java:38)
	at com.documentum.fc.client.content.impl.LocalContentManager$LocalContentDirectory.createFile(LocalContentManager.java:480)
	at com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager.createLocalContentFile(LocalContentManager.java:363)
	- locked <0x00000007abec1350> (a com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager)
	at com.documentum.fc.client.content.impl.LocalContentManager.createContentFile(LocalContentManager.java:148)
	at com.documentum.fc.client.content.impl.ContentManager.namelessGetFile(ContentManager.java:253)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:198)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:173)
	at com.documentum.fc.client.DfSysObject.getFileEx2(DfSysObject.java:1972)
	- locked <0x00000007b2b8e5b8> (a com.documentum.fc.client.DfSysObject)
	at com.documentum.fc.client.DfSysObject.getFileEx(DfSysObject.java:1964)
	at com.documentum.fc.client.DfSysObject.getFile(DfSysObject.java:1959)
	at com.documentum.fc.client.DfSysObject___PROXY.getFile(DfSysObject___PROXY.java)
	at QAXI$1.run(QAXI.java:38)
	at java.lang.Thread.run(Thread.java:745)

Note that if I introduce transactions into my PoC I get the behaviour you have described (I do not want to say that transactions are the cause of your problem – just start your troubleshooting by taking a thread dump):

public static void main(String[] args) throws Exception {
	final IDfClient client = new DfClientX().getLocalClient();
	client.getClientConfig().setString(DfPreferences.DFC_DOCBROKER_HOST, "192.168.0.253");
	final IDfSession session = client.newSession("DCTM_DEV", new DfLoginInfo("dmadmin", "dmadmin"));
	IDfSysObject sysObject = (IDfSysObject) session.getObject(DfId.valueOf("08024be980023d9d"));
	Thread t1 = new Thread(new Runnable() {
		public void run() {
			try {
				session.beginTrans();
				((IDfSysObject) session.getObject(DfId.valueOf("08024be980023d9d"))).getFile(null);
			} catch (DfException ex) {
				throw new RuntimeException(ex);
			} finally {
				try {
					session.abortTrans();
				} catch (DfException ex) {
					// ignore
				}
			}
		}
	});
	Thread t2 = new Thread(new Runnable() {
		public void run() {
			int count = 0;
			try {
				while (!Thread.currentThread().isInterrupted()) {
					IDfQuery q = new DfQuery("select count(*) from dm_server_config");
					q.execute(session, IDfQuery.DF_EXEC_QUERY).close();
					count++;
				}
			} catch (Exception ex) {
				// ignore
			}
			System.out.println(count);
		}
	});
	t1.start();
	t2.start();
	t1.join();
	t2.interrupt();
}

The result is 2-3 “simultaneous” queries (it actually should be 1, but writing a full PoC is a bit boring) and the thread dump looks like:

"Thread-9" prio=5 tid=0x00007fb0c2a91800 nid=0x5d07 in Object.wait() [0x000070000185e000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0x00000007bae39970> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.waitForCorrectSharingContext(DocbaseConnection.java:820)
	- locked <0x00000007bae39970> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.evaluateRpc(DocbaseConnection.java:1108)
	- locked <0x00000007bae39970> (a com.documentum.fc.client.impl.connection.docbase.DocbaseConnection)
	at com.documentum.fc.client.impl.connection.docbase.DocbaseConnection.applyForCollection(DocbaseConnection.java:1265)
	at com.documentum.fc.client.impl.docbase.DocbaseApi.exec(DocbaseApi.java:83)
	at com.documentum.fc.client.impl.session.Session.query(Session.java:3630)
	at com.documentum.fc.client.impl.session.SessionHandle.query(SessionHandle.java:2322)
	at com.documentum.fc.client.DfQuery.runQuery(DfQuery.java:167)
	at com.documentum.fc.client.DfQuery.execute(DfQuery.java:216)
	at QAXI$2.run(QAXI.java:57)
	at java.lang.Thread.run(Thread.java:745)

"Thread-8" prio=5 tid=0x00007fb0c3887800 nid=0x1307 runnable [0x000070000175a000]
   java.lang.Thread.State: RUNNABLE
	at java.io.FileOutputStream.writeBytes(Native Method)
	at java.io.FileOutputStream.write(FileOutputStream.java:345)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:51)
	at com.documentum.fc.impl.util.io.StreamUtility.copyContents(StreamUtility.java:30)
	at com.documentum.fc.client.content.impl.LocalContentFile.<init>(LocalContentFile.java:38)
	at com.documentum.fc.client.content.impl.LocalContentManager$LocalContentDirectory.createFile(LocalContentManager.java:480)
	at com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager.createLocalContentFile(LocalContentManager.java:363)
	- locked <0x00000007bae93480> (a com.documentum.fc.client.content.impl.LocalContentManager$SessionLocalContentManager)
	at com.documentum.fc.client.content.impl.LocalContentManager.createContentFile(LocalContentManager.java:148)
	at com.documentum.fc.client.content.impl.ContentManager.namelessGetFile(ContentManager.java:253)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:198)
	at com.documentum.fc.client.content.impl.ContentManager.getFile(ContentManager.java:173)
	at com.documentum.fc.client.DfSysObject.getFileEx2(DfSysObject.java:1972)
	- locked <0x00000007bb011c60> (a com.documentum.fc.client.DfSysObject)
	at com.documentum.fc.client.DfSysObject.getFileEx(DfSysObject.java:1964)
	at com.documentum.fc.client.DfSysObject.getFile(DfSysObject.java:1959)
	at com.documentum.fc.client.DfSysObject___PROXY.getFile(DfSysObject___PROXY.java)
	at QAXI$1.run(QAXI.java:39)
	at java.lang.Thread.run(Thread.java:745)

I can tell you that we use @Singleton to mark our application’s single session manager. That session manager provides a session via getSession() to each class that asks for one – regardless of HTTP session or request. In my attempt to band-aid the issue, I switched the session request for the slow file download to newSession(). Not only did this seem to prevent other requests from waiting on the slow download, it also suddenly resulted in lightning fast downloads. I was concerned that this might cause too many sessions, as some people scrape our site for the content downloads – however so far we’ve only got up to 15 sessions on the Content Servers from the ‘public’ user.

Sharing sessions/session managers among threads is not safe, and using newSession() is not safe either – please check my explanation on ECN. If you want to find a balance between performance and the number of Content Server sessions, consider implementing a pool of session managers – you can simply adopt Commons Pool (a minimal sketch follows after the list below). Also, check how you transfer content from the Content Server. I might also suggest using ACS for content transfer operations, but this will complicate your solution and most probably will bring no benefits, for the following reasons:

  • you will start depending on JMS, so you will need a backup plan for handling JMS availability issues
  • when creating an ACS link the Content Server generates a SHA1 checksum for the content – if your storage is slow this may be an issue because the content will be read twice: once to generate the SHA1 checksum and a second time when sending the content
  • never ever use this pattern for getting ACS URLs – if ACS is unavailable, IDfExportOperation transfers the content to the DFC host; use com.documentum.fc.client.IDfSysObject#getAcsRequests instead
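
A minimal sketch of such a pool of session managers on top of Apache Commons Pool 2 is shown below (docbase name, credentials and pool size are placeholders; treat it as a starting point, not production code):

import org.apache.commons.pool2.BasePooledObjectFactory;
import org.apache.commons.pool2.PooledObject;
import org.apache.commons.pool2.impl.DefaultPooledObject;
import org.apache.commons.pool2.impl.GenericObjectPool;

import com.documentum.com.DfClientX;
import com.documentum.fc.client.IDfSession;
import com.documentum.fc.client.IDfSessionManager;
import com.documentum.fc.common.DfLoginInfo;

public class SessionManagerPoolDemo extends BasePooledObjectFactory<IDfSessionManager> {

    @Override
    public IDfSessionManager create() throws Exception {
        // each pooled object is a dedicated session manager with its own identity
        IDfSessionManager manager = new DfClientX().getLocalClient().newSessionManager();
        manager.setIdentity("DCTM_DEV", new DfLoginInfo("dmadmin", "dmadmin"));
        return manager;
    }

    @Override
    public PooledObject<IDfSessionManager> wrap(IDfSessionManager manager) {
        return new DefaultPooledObject<>(manager);
    }

    public static void main(String[] args) throws Exception {
        GenericObjectPool<IDfSessionManager> pool =
                new GenericObjectPool<>(new SessionManagerPoolDemo());
        pool.setMaxTotal(15); // upper bound on concurrently borrowed session managers

        IDfSessionManager manager = pool.borrowObject();
        try {
            IDfSession session = manager.getSession("DCTM_DEV");
            try {
                System.out.println(session.getServerVersion()); // do real work here
            } finally {
                manager.release(session);
            }
        } finally {
            pool.returnObject(manager);
        }
    }
}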

Q & A. X

Q:

Hi,
I am trying to write a standalone DFC/D2 program. I create a DFC session and then put it into a D2 context via D2Session.initTBO. I then perform normal DFC set and save operations on a sysobject. When I try to apply a D2 configuration, for example D2AuditConfig.apply, I get the error below. How do I correct this??

ERROR 1 – D2 lockbox file or D2Method.passphrase property within it could not be found.
Exception in thread “main” DfException:: THREAD: main; MSG: Impossible to decrypt the method server response; ERRORCODE: ff; NEXT: null
at com.emc.d2.api.methods.D2Method.start(D2Method.java:417)

A:

You have two options:

  • put and set up all the Lockbox stuff on the client side
  • take advantage of reflection (the docbase name and password below are placeholders):
    // getDeclaredField also covers the case where s_ticket is not public
    Field ticketField = D2Session.class.getDeclaredField("s_ticket");
    ticketField.setAccessible(true);
    Map tickets = (Map) ticketField.get(null);
    tickets.put("docbase_name", "dmadmin_password");
    

Q:

Also, can’t I disable Lockbox altogether in a 7.2 + D2 4.5 environment?

A:

Download the latest (or maybe the previous) service pack for D2 4.2, extract the com.emc.common.java.crypto.AESCrypto class from C6-Common-4.2.0.jar, and insert it into C6-Common-4.5.0.jar.

Q & A. IX

Hi,

I’m relatively new to the xCP2 (and Documentum) world and I have a question: the color of the application’s background is blue and I don’t like blue… just joking!

I couldn’t find tips about continuous integration anywhere. Is there a way to build an xCP2 project from the command line? I want to set up continuous integration using Jenkins, SOMETHING to build, and (I think) the xmstool to deploy it every night.
However, I couldn’t find anything to build the project from the command line (and I searched up to Google page 5!) in a UNIX environment. Do you have any clues about what I can use?

Thanks,
Regards,
Rochdi

Hi,

I’m not sure that I’ll be able to answer your question because I have never tried to perform automatic builds of xCP2 projects before (actually, after some unsuccessful attempts to create a complex (not a “hello world”) xCP2 application we found that at the moment xCP2 is completely unusable/unstable/unreliable/undocumented, and we switched back to WDK/DFC). However, EMC folks say that there is an option to mavenize an xCP2 project – it seems that an attempt to execute “mvn package” is doomed to failure if xCP Designer is up and running:

> G:\app\maven\3.2.3\bin\mvn package -Dmaven.repo.local=G:\app\xCPDesigner\maven

....

[ERROR] Unexpected system error packaging dar for project 'ap'
[INFO] ------------------------------------------------------------------------
[INFO] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 6.120 s
[INFO] Finished at: 2015-07-30T21:34:08+10:00
[INFO] Final Memory: 18M/223M
[INFO] ------------------------------------------------------------------------
[ERROR] Failed to execute goal com.emc.xcp.builder:xcp-dar:1.0.8:run (xcp-dar) on project ap: 
   Xcp mojo executing command 'preparepackage' for project 'ap' failed unexpectedly:
java.lang.IllegalArgumentException: xcpProject must not be null.
[ERROR] at com.emc.xcp.builder.packaging.PackagerUtil.generateRunConfig(PackagerUtil.java:18)
[ERROR] at internal.com.emc.xcp.builder.packaging.projectpackagers.RunConfigPackager.doPackaging(RunConfigPackager.java:14)
[ERROR] at internal.com.emc.xcp.builder.packaging.InternalPackagerUtil.packageProject(InternalPackagerUtil.java:22)
[ERROR] at internal.com.emc.xcp.builder.packaging.maven.XcpDarCommand.execute(XcpDarCommand.java:43)
[ERROR] at internal.com.emc.xcp.builder.build.maven.MavenHookManager.execute(MavenHookManager.java:29)
[ERROR] at internal.com.emc.xcp.builder.build.maven.MavenHookServlet.doPost(MavenHookServlet.java:51)
[ERROR] at javax.servlet.http.HttpServlet.service(HttpServlet.java:755)
[ERROR] at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
[ERROR] at org.eclipse.equinox.http.registry.internal.ServletManager$ServletWrapper.service(ServletManager.java:180)
[ERROR] at org.eclipse.equinox.http.servlet.internal.ServletRegistration.service(ServletRegistration.java:61)
[ERROR] at org.eclipse.equinox.http.servlet.internal.ProxyServlet.processAlias(ProxyServlet.java:128)
[ERROR] at org.eclipse.equinox.http.servlet.internal.ProxyServlet.service(ProxyServlet.java:68)
[ERROR] at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
[ERROR] at org.eclipse.equinox.http.jetty.internal.HttpServerManager$InternalHttpServiceServlet.service(HttpServerManager.java:384)
[ERROR] at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:598)
[ERROR] at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:486)

but in offline mode it does build something (note the designerPath property):

G:\app\maven\3.2.3\bin\mvn package -Dmaven.repo.local=G:\app\xCPDesigner\maven -DdesignerPath=G:\app\xCPDesigner

...

[INFO] Webapp assembled in [5613 msecs]
[INFO] Building war: G:\app\xCPDesigner\Applications\ap\ap\target\ap-ap-1.0.0.war
[WARNING] Warning: selected war files include a WEB-INF/web.xml which will be ignored
(webxml attribute is missing from war task, or ignoreWebxml attribute is specified as 'true')
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:02 min
[INFO] Finished at: 2015-07-30T21:36:41+10:00
[INFO] Final Memory: 15M/218M
[INFO] ------------------------------------------------------------------------

Q & A. VIII

I am grateful to you for creating dctmpy, which I am planning to use heavily in my icinga2 monitoring environment. The work you have done is commendable. Earlier I was planning to write my own custom plugins using Perl, then I figured out it was not going to be easy considering the effort required to make Db::Documentum (which Scott created) work in D6+ environments. That is when I came across your wonderful work. I have successfully tested functionalities like login, sessioncount, targets, etc.

I believe you have recently introduced a few modes. As the documentation about dctmpy on the EMC community (https://community.emc.com/people/aldago-zF7Lc/blog/2014/05/19/monitoring-documentum-with-nagios-and-dctmpy-plugin) is not up to date, could you please provide some details about the different modes, like jobs, indexagents, acsstatus, timeskew, xplorestatus, query, method, indexqueue, serverworkqueue and countquery, and how to use them? I am especially interested in query and countquery. If this means I can run any query in the docbase and compare the output against thresholds we supply, that will be awesome.

Eagerly looking forward to your response,
Once again, thank you so much!
– Vishnu

First of all, I strongly recommend not treating Alvaro’s post as a guide to action – not because it’s wrong, but because configuring nagios *.cfg files by hand is like writing sendmail.cf without m4 – my preference is to use opsview.

Before describing the capabilities of dctmpy I think it is worth defining what needs to be monitored and why, otherwise the monitoring objectives are not clear – for example, I tried to understand what ReveilleSoftware really does, and after checking some presentations and YouTube clips I got the impression that ReveilleSoftware just draws bars and pies :). So, dctmpy seems to be the only reliable monitoring solution for Documentum; others, if they exist, are either based on DFC, which requires some extra setup or hacks, or on other Documentum services, which makes them dependent on the underlying service.

Docbroker service

Typical Docbroker issues are:

  1. Docbroker is down – somebody forgot to start it, Docbroker failed, there are connectivity issues, or an attacker stopped the Docbroker
  2. Content Server is not registered on Docbroker – misconfiguration on the CS side or connectivity issues
  3. The wrong Content Server is registered on Docbroker – I have seen some stupid cases where infrastructure guys clone PROD to UAT (EMC does not provide any reliable solution for loading data into a repository, so cloning is the most reliable way to do it) but forget to modify the network settings, and after that users work with the wrong environment
  4. An attacker poisoned the registration information
  5. Docbroker is running under DoS – for some weird reason Docbroker’s implementation is extremely ugly, and even a telnet connection to the Docbroker port causes a DoS, example:
    # session 1
     ~]$ nc 192.168.13.131 1489
    <just enter here>
    
    # session 2
    ~]$ time timeout 20 dmqdocbroker -c getdocbasemap
    dmqdocbroker: A DocBroker Query Tool
    dmqdocbroker: Documentum Client Library Version: 7.2.0000.0054
    Targeting current host
    Targeting port 1489
    
    real    0m20.002s
    user    0m0.002s
    sys     0m0.000s
     ~]$ echo $?
    124
    
  6. DoS caused by a slow client or network problems – yes, it’s weird, but a client or server with network issues can affect the whole Documentum infrastructure, so it is always a good idea to use different docbrokers for different services

I believe all these situations are covered by nagios_check_docbroker; some examples:

Basic check of availability:

nagios_check_docbroker -H 192.168.13.131:1489
CHECKDOCBROKER OK - docbase_map_time is 6ms, Registered docbases: DCTM_DEV
| docbase_map_time=6ms;100;;0

The same for SSL connection (note -s flag and increased response time):

nagios_check_docbroker -H 192.168.13.131:1490 -s
CHECKDOCBROKER OK - docbase_map_time is 423ms, Registered docbases: DCTM_DEV
| docbase_map_time=423ms;;;0

Adding response time thresholds:

nagios_check_docbroker -H 192.168.13.131:1490 -s -w 100
CHECKDOCBROKER WARNING - docbase_map_time is 490ms (outside range 0:100),
      Registered docbases: DCTM_DEV
| docbase_map_time=490ms;100;;0

nagios_check_docbroker -H 192.168.13.131:1490 -s -w 100 -c 200
CHECKDOCBROKER CRITICAL - docbase_map_time is 442ms (outside range 0:200),
           Registered docbases: DCTM_DEV
| docbase_map_time=442ms;100;200;0

Checking registration of certain docbase(s):

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV
CHECKDOCBROKER OK - docbase_map_time is 7ms,
      Server DCTM_DEV.DCTM_DEV is registered on 192.168.13.131:1489
| docbase_map_time=7ms;;;0

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV1
CHECKDOCBROKER CRITICAL - 
      Docbase DCTM_DEV1 is not registered on 192.168.13.131:1489,
      docbase_map_time is 7ms
| docbase_map_time=7ms;;;0

# multiple docbases
nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV1,DCTM_DEV
CHECKDOCBROKER CRITICAL - 
      Docbase DCTM_DEV1 is not registered on 192.168.13.131:1489,
      docbase_map_time is 5ms,
      Server DCTM_DEV.DCTM_DEV is registered on 192.168.13.131:1489
| docbase_map_time=5ms;;;0

Checking registration of certain server(s):

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV.DCTM_DEV
CHECKDOCBROKER OK - docbase_map_time is 6ms,
       Server DCTM_DEV.DCTM_DEV@192.168.13.131 is registered on 192.168.13.131:1489
| docbase_map_time=6ms;;;0

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV.DCTM
CHECKDOCBROKER CRITICAL - 
       Server DCTM_DEV.DCTM is not registered on 192.168.13.131:1489,
       docbase_map_time is 11ms
| docbase_map_time=11ms;;;0

#multiple servers
nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV.DCTM,DCTM_DEV.DCTM_DEV
CHECKDOCBROKER CRITICAL - 
       Server DCTM_DEV.DCTM is not registered on 192.168.13.131:1489,
       docbase_map_time is 7ms, 
       Server DCTM_DEV.DCTM_DEV@192.168.13.131 is registered on 192.168.13.131:1489
| docbase_map_time=7ms;;;0

Checking IP addresses of registered servers:

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV.DCTM_DEV@192.168.13.131
CHECKDOCBROKER OK - docbase_map_time is 8ms,
       Server DCTM_DEV.DCTM_DEV@192.168.13.131 is registered on 192.168.13.131:1489
| docbase_map_time=8ms;;;0

nagios_check_docbroker -H 192.168.13.131:1489 -d DCTM_DEV.DCTM_DEV@192.168.13.132
CHECKDOCBROKER CRITICAL - 
       Server DCTM_DEV.DCTM_DEV (status: Open) is registered on 192.168.13.131:1489 
        with wrong ip address: 192.168.13.131, expected: 192.168.13.132,
       docbase_map_time is 7ms
| docbase_map_time=7ms;;;0

Checking malicious registrations (note -f flag):

nagios_check_docbroker -H 192.168.13.131:1489 -f -d DCTM_DEV.DCTM_DEV@192.168.13.132
CHECKDOCBROKER CRITICAL - 
       Server DCTM_DEV.DCTM_DEV (status: Open) is registered on 192.168.13.131:1489
         with wrong ip address: 192.168.13.131, expected: 192.168.13.132, 
       Malicious server DCTM_DEV.DCTM_DEV@192.168.13.131 (status: Open)
         is registered on 192.168.13.131:1489,
       docbase_map_time is 9ms
| docbase_map_time=9ms;;;0

Repository services

Actually, there are a lot of things to be monitored; nagios_check_docbase covers the most common issues. The common command line pattern for all checks is:

nagios_check_docbase -H <hostname> -p <port> -i <docbaseid> -l <username>
 -a <password> -m <mode> -n <name> [-s] [-t <timeout>] <specific arguments>

where:

  • hostname – hostname or ip address where Documentum is running
  • port – tcp port Documentum is listening on (this is not a docbroker port)
  • docbaseid – docbase identifier (see docbase_id in server.ini, might be omitted but in this case you will get stupid exceptions in repository log)
  • username – username to connect to Documentum
  • password – password to connect to Documentum
  • -s – defines whether to use SSL connection
  • timeout – defines timeout in seconds after which check fails, default is 60 seconds (useful for query checks), for example:
    nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10000 \
      -m countquery \
      --query "select count(*) from dm_folder a, dm_folder b, dm_folder c"
    COUNTQUERY UNKNOWN: Timeout: check execution aborted after 60s
    
    nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
      -m countquery -t 3600 \
      --query "select count(*) from dm_folder a, dm_folder b, dm_folder c"
    COUNTQUERY OK - countquery is 14544652121
    | countquery=14544652121;;;0 query_time=2703163ms;;;0
    
  • name – name of check displayed in output, default is uppercase of check name, for example:
    nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m login
    LOGIN OK - user: dmadmin, connection: 1229ms, authentication: 136ms
    | authentication_time=136ms;;;0 connection_time=1229ms;;;0
    
    nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
     -m login -n superuser_login
    SUPERUSER_LOGIN OK - user: dmadmin, connection: 941ms, authentication: 86ms
    | authentication_time=86ms;;;0 connection_time=941ms;;;0
    
  • mode – one of:
    • sessioncount – checks count of active sessions in repository, i.e. hot_list_size in COUNT_SESSIONS RPC command result, example (last number in performance output is a value of concurrent_sessions in server.ini):
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
        -m sessioncount
      SESSIONCOUNT OK - sessioncount is 4
      | sessioncount=4;;;0;100
      
      # critical threshold
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
         -m sessioncount -c 2
      SESSIONCOUNT CRITICAL - sessioncount is 4 (outside range 0:2)
      | sessioncount=4;;2;0;100
      
      # warning and critical thresholds:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
         -m sessioncount -w 2 -c 6
      SESSIONCOUNT WARNING - sessioncount is 4 (outside range 0:2)
      | sessioncount=4;2;6;0;100
      
    • targets – checks whether repository is registered on all configured docbrokers, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m targets
      TARGETS OK - DCTM_DEV.DCTM_DEV has status Open on docu72dev01:1489
      
    • indexagents – checks status of configured index agents, i.e. checks that status returned by FTINDEX_AGENT_ADMIN RPC is 100, example:
      # no index agents configured in docbase:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
         -m indexagents
      INDEXAGENTS WARNING - No indexagents
      
      # stopped index agent
      nagios_check_docbase -H 192.168.2.56:12000/131031 -l dmadmin -a dmadmin \
           -m indexagents
      INDEXAGENTS WARNING - Indexagent docu70dev01_9200_IndexAgent is stopped
      
    • jobs – checks job scheduling, i.e. checks whether the job is in an active state (might be picked up by agentexec), checks the last return code of the job method, and checks whether agentexec honors the schedule (the last check is very inaccurate because of the weird agentexec implementation, so checking jobs which are supposed to be executed frequently might produce unexpected results), example:
      # single job
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m jobs \
         --job dm_UpdateStats
      JOBS OK - dm_UpdateStats last run - 1 days 02:31:34 ago
      
      # multiple jobs (comma-separated list)
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m jobs \
         --job dm_ConsistencyChecker,dm_UpdateStats
      JOBS CRITICAL - dm_ConsistencyChecker is inactive,
          dm_UpdateStats last run - 1 days 02:35:22 ago
      
      # job with bad last return code
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m jobs \
         --job dm_usageReport
      JOBS CRITICAL - dm_usageReport has status: FAILED:  
         Could not launch method dm_usageReport:  OS error: (No Error), DM error: ()
      
    • nojobs – checks whether a certain job is not scheduled (i.e. the “jobs” mode reversed) – the default Documentum installation schedules certain jobs which consume a lot of resources but do nothing useful; such jobs must be disabled, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m nojobs --job dm_DBWarning
      NOJOBS CRITICAL - dm_DBWarning is active
      
    • timeskew – checks the time difference in seconds between the Documentum host and the monitoring server, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m timeskew
      TIMESKEW OK - timeskew is 66.02
      | timeskew=66.0209999084;;;0
      
      # critical theshold
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m timeskew -c 60
      TIMESKEW CRITICAL - timeskew is 66.23 (outside range 0:60)
      | timeskew=66.2279999256;;60;0
      
      # warning and critical thesholds
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m timeskew -w 60 -c 120
      TIMESKEW WARNING - timeskew is 66.17 (outside range 0:60)
      | timeskew=66.1689999104;60;120;0
      
    • query – executes a select statement and checks whether the count of returned rows is inside the specified threshold ranges (for the checks described previously the threshold ranges were trivial (i.e. “less than”), but for this check you may want to specify more complex conditions like “the count of returned rows must be greater than the specified threshold”; see the nagios-plugin documentation for threshold formats); additionally, the output may be formatted by specifying the –format argument, example:
      # no thresholds, just formatted output
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
         --query "select user_name,user_state from dm_user where user_state<>0" \
         -m query --format {user_name}:{user_state}
      QUERY OK - hacker:1 - 3ms
      | count=1;;;0 query_time=3ms;;;0
      
      # count of rows does not exceed critical threshold 
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          --query "select user_name,user_state from dm_user where user_state<>0" \
          -m query --format {user_name}:{user_state} -w 0 -c 1
      QUERY WARNING - hacker:1 - 3ms (outside range 0:0)
      | count=1;0;1;0 query_time=3ms;;;0
      
      # count of rows is greater than or equal to critical threshold
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          --query "select user_name,user_state from dm_user where user_state<>0" \
          -m query --format {user_name}:{user_state} -c 2:
      QUERY CRITICAL - hacker:1 - 3ms (outside range 2:)
      | count=1;;2:;0 query_time=3ms;;;0
      
      # count of rows is greater than or equal to critical threshold
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          --query "select user_name,user_state from dm_user where user_state<>0" \
          -m query --format {user_name}:{user_state} -c 2:
      QUERY CRITICAL - hacker:1 (outside range 2:)
      | count=1;;2:;0 query_time=3ms;;;0
      
      # also check query execution time against thresholds
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          --query "select user_name,user_state from dm_user where user_state<>0" \
          -m query --format {user_name}:{user_state} -c 2: --criticaltime 2
      QUERY CRITICAL - hacker:1 - 3ms (outside range 2:)
      | count=1;;2:;0 query_time=3ms;;2;0
      
    • method – technically it is the same as the “query” mode, but it accepts only “execute do_method” queries and additionally checks the value of the launch_failed result attribute; I believe this approach to checking JMS health is more reliable than the “jmsstatus” mode (see below), example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m method --query "execute do_method with method='JMSHealthChecker'"
      METHOD OK
      | query_time=14ms;;;0
      
    • countquery – technically it is the same as “query” mode, but this mode assumes that query returns only single row with single attribute (actually it just picks up the first row and the first attribute in row), example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m countquery --query "select count(*) from dm_sysobject"
      COUNTQUERY OK - countquery is 8746
      | countquery=8746;;;0 query_time=7ms;;;0
      
    • workqueue – checks the total number of non-completed auto-activities for the whole repository; in effect it checks whether the configured number of workflow agents is sufficient, and in some cases growth of the workflow queue may indicate issues either with the workflow agent or with the JMS, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m workqueue
      WORKQUEUE OK - workqueue is 0
      | workqueue=0;;;0
      
    • serverworkqueue – checks the number of non-completed auto-activities for current server, i.e. number of auto-activities acquired by server’s workflow agent, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m serverworkqueue
      SERVERWORKQUEUE OK - DCTM_DEV is 0
      | DCTM_DEV=0;;;0
      
    • indexqueue – checks the index agent queue size; it’s worth combining this check with the “indexagents” check because, again, due to the weird implementation of the index agent it may report “running” status but not process the queue, example:
      nagios_check_docbase -H 192.168.2.56:12000/131031 -l dmadmin -a dmadmin \
           -m indexqueue -w 1000 -c 2000
      INDEXQUEUE CRITICAL - _fulltext_index_user is 4.978e+04 (outside range 0:2000)
      | _fulltext_index_user=49781;1000;2000;0
      
    • ctsqueue – the same as “indexqueue” but for CTS, no example because I do not have CTS installed
    • failedtasks – checks the number of failed auto-activities, example:
      nagios_check_docbase -H 192.168.2.56:12000/131031 -l dmadmin -a dmadmin \
             -m failedtasks
      FAILEDTASKS CRITICAL - 1 task(s): 'Last Performer' (tp002-000_user1)
      
    • login – checks if certain user is able to authenticate (I use this to check LDAP availability), example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 -m login
      LOGIN OK - user: dmadmin, connection: 1804ms, authentication: 93ms
      | authentication_time=93ms;;;0 connection_time=1804ms;;;0
      
      # thresholds
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
          -m login --warningtime 500 --criticaltime 1000
      LOGIN WARNING - user: dmadmin, connection: 909ms, authentication: 86ms
      | authentication_time=86ms;500;1000;0 connection_time=909ms;500;1000;0
      
    • jmsstatus – checks availability of JMS, example:
      nagios_check_docbase -H dctms://dmadmin:dmadmin@192.168.13.131:10001 \
         -m jmsstatus
      JMSSTATUS OK - http://docu72dev01:9080/DmMethods/servlet/DoMethod - 60ms, 
                     http://docu72dev01:9080/DmMail/servlet/DoMail - 2ms, 
                     http://docu72dev01:9080/bpm/servlet/DoMethod - 6ms
      | response_time_08024be980000ced_do_bpm=6ms;;;0
      response_time_08024be980000ced_do_mail=2ms;;;0
      response_time_08024be980000ced_do_method=60ms;;;0
      
    • ctsstatus – checks availability of CTS, no example
    • acsstatus – checks availability of ACS, no example
    • xplorestatus – checks availability of xPlore, no example

Because the host, port, docbaseid, username and password arguments are mandatory, it is hard to create a flexible setup in nagios (for example, opsview allows only four arguments per template), so these arguments may be collapsed into a single one (host) using the following convention (see the previous examples):

dctm[s]://username:password@host:port/docbaseid

The password may also be obfuscated using the following approach:

echo -ne "password" | \
  perl -na -F// -e 'print reverse map{sprintf("%02x",(ord$_^0xB6||0xB6))}@F'

for example:

 ~]$ echo -ne dmadmin | \
> perl -na -F// -e 'print reverse map{sprintf("%02x",(ord$_^0xB6||0xB6))}@F'
d8dfdbd2d7dbd2[dmadmin@docu72dev01 ~]$
 ~]$ check_docbase.py -H dctms://dmadmin:d8dfdbd2d7dbd2@192.168.13.131:10001 \
       -m login
LOGIN OK - user: dmadmin, connection: 1805ms, authentication: 93ms
| authentication_time=93ms;;;0 connection_time=1805ms;;;0

Q & A. VII

Hi Andrew,

Hope you are doing fine. Just wanted your suggestion on a type adoption scenario for an xCP 2.2 environment.

Suppose we create a new xCP 2.2 project “Project-1”, adopt a custom type, say some_type, and then proceed and deploy this application. That is fine, but if in the future we want to create a new xCP project “Project2” that requires the same custom type, i.e. some_type, we cannot adopt some_type in “Project2” because the type is already adopted in “Project1” – xCP Designer doesn’t allow it by design.

So I’m thinking of creating a project, say “BaseTypesProject”, adopting all the custom types there first, and then, when in the future we want to create a new xCP project that needs the same custom types, we would reference “BaseTypesProject” as a dependency project so that we get all the types.

But for some strange reason it looks like there is no way to make a project a dependency project in xCP 2.2. I am wondering what the best way to solve this issue is.

I do believe that you may temporarily “unadopt” adopted types in your DEV environment and import them into another project – check this topic on ECN.