Start Java Method Server Properly

This post was originally published on ECN.

A long time ago I noticed some misbehavior in Content Server: if Java Method Server (JMS) has been down for a long time and is then started, it takes Content Server a significant amount of time to notice that JMS is up and running again. In that situation I used one of the following two “solutions” to get workflow methods working again:

  • restart Content Server
  • disable and enable JMS in DA (DA->Admin->basic configuration->Java Method Server)

Both solutions were wrong! Content Server uses the following stupid algorithm to check JMS availability:

  • if CS finds out that JMS is unreachable, it executes the JMSHealthCheckMethod method
  • if the execution of JMSHealthCheckMethod is unsuccessful, CS increases the check interval
  • a successful execution of JMSHealthCheckMethod resets the check interval
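
The three rules above can be sketched as a small state model. This is only an illustration of the bookkeeping, not Content Server code: the class name, the method names and the wait_policy hook are hypothetical, and the 10-seconds-per-failure policy is a placeholder chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class JmsHealthState:
    """Toy model of the per-JMS bookkeeping; not Content Server code."""

    # Hypothetical hook: maps failure_count to seconds until the next check.
    wait_policy: Callable[[int], int]
    failure_count: int = 0
    marked_dead: bool = False
    next_retry_time: float = 0.0

    def record_failure(self, now: float) -> None:
        """A failed health check pushes the next check further away."""
        self.failure_count += 1
        self.marked_dead = True
        self.next_retry_time = now + self.wait_policy(self.failure_count)

    def record_success(self) -> None:
        """A successful health check resets the interval."""
        self.failure_count = 0
        self.marked_dead = False
        self.next_retry_time = 0.0

    def check_due(self, now: float) -> bool:
        """True when the JMS is marked dead and the wait has elapsed."""
        return self.marked_dead and now >= self.next_retry_time


# Placeholder policy, chosen only for illustration: 10 seconds per failure.
state = JmsHealthState(wait_policy=lambda failures: 10 * failures)
state.record_failure(now=0.0)   # next check at t=10
state.record_failure(now=10.0)  # next check at t=30
print(state.failure_count, state.next_retry_time)  # 2 30.0
```

The point of the model is that a JMS which has just come back is not noticed until the current wait elapses, however long that wait has grown.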

Let’s see which requests Content Server sends while JMS is unavailable, using the nc utility and the debugging capabilities of the DUMP_JMS_CONFIG_LIST RPC command:

~]$ while true; do date; nc -l 9080 < /dev/null; done  
Tue Feb  4 06:06:42 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:06:48 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:07:48 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:09:48 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:13:48 MSK 2014

Note the growing intervals between timestamps: 06:06:48 -> 06:07:48 -> 06:09:48 -> 06:13:48 (1, 2 and 4 minutes).
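
To make the growth explicit, here is a minimal sketch that computes the gaps between the four timestamps from the nc output. The helper name gaps_in_seconds is made up for this example.

```python
from datetime import datetime

# Timestamps printed by the nc loop, copied from the output above.
stamps = ["06:06:48", "06:07:48", "06:09:48", "06:13:48"]


def gaps_in_seconds(stamps):
    """Return the number of seconds between consecutive HH:MM:SS stamps."""
    times = [datetime.strptime(s, "%H:%M:%S") for s in stamps]
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


gaps = gaps_in_seconds(stamps)
print(gaps)  # [60, 120, 240]
```

Each gap is twice the previous one.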

API> apply,c,,DUMP_JMS_CONFIG_LIST  
...  
q0  
API> next,c,q0  
...  
OK  
API> dump,c,q0  
...  
USER ATTRIBUTES  
  
  jms_list_last_refreshed         : Tue Feb  4 05:50:46 2014  
  incr_wait_time_on_failure       : 30  
  max_wait_time_on_failure        : 3600  
  current_jms_index               : 0  
  jms_config_id                [0]: 0801d92080000b65  
  jms_config_name              [0]: JMS docu67dev01:9080 for repo.repo  
  server_config_id             [0]: 3d01d92080000102  
  server_config_name           [0]: repo  
  jms_to_cs_proximity          [0]: 1  
  is_disabled_in_docbase       [0]: F  
  is_marked_dead_in_cache      [0]: T  
  intended_purpose             [0]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS  
  last_failure_time            [0]: Tue Feb  4 06:06:48 2014  
  next_retry_time              [0]: Tue Feb  4 06:07:48 2014  
  failure_count                [0]: 2  
  
SYSTEM ATTRIBUTES  
  
APPLICATION ATTRIBUTES  
  
INTERNAL ATTRIBUTES  
  
API> close,c,q0  
...  
OK   
API> apply,c,,DUMP_JMS_CONFIG_LIST  
...  
q0  
API> next,c,q0  
...  
OK  
API> dump,c,q0  
...  
USER ATTRIBUTES  
  
  jms_list_last_refreshed         : Tue Feb  4 05:50:46 2014  
  incr_wait_time_on_failure       : 30  
  max_wait_time_on_failure        : 3600  
  current_jms_index               : 0  
  jms_config_id                [0]: 0801d92080000b65  
  jms_config_name              [0]: JMS docu67dev01:9080 for repo.repo  
  server_config_id             [0]: 3d01d92080000102  
  server_config_name           [0]: repo  
  jms_to_cs_proximity          [0]: 1  
  is_disabled_in_docbase       [0]: F  
  is_marked_dead_in_cache      [0]: T  
  intended_purpose             [0]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS  
  last_failure_time            [0]: Tue Feb  4 06:07:48 2014  
  next_retry_time              [0]: Tue Feb  4 06:09:48 2014  
  failure_count                [0]: 3  
  
SYSTEM ATTRIBUTES  
  
APPLICATION ATTRIBUTES  
  
INTERNAL ATTRIBUTES  
  
API> close,c,q0  
...  
OK  
  
  
API> apply,c,,DUMP_JMS_CONFIG_LIST  
...  
q0  
API> next,c,q0  
...  
OK  
API> dump,c,q0  
...  
USER ATTRIBUTES  
  
  jms_list_last_refreshed         : Tue Feb  4 05:50:46 2014  
  incr_wait_time_on_failure       : 30  
  max_wait_time_on_failure        : 3600  
  current_jms_index               : 0  
  jms_config_id                [0]: 0801d92080000b65  
  jms_config_name              [0]: JMS docu67dev01:9080 for repo.repo  
  server_config_id             [0]: 3d01d92080000102  
  server_config_name           [0]: repo  
  jms_to_cs_proximity          [0]: 1  
  is_disabled_in_docbase       [0]: F  
  is_marked_dead_in_cache      [0]: T  
  intended_purpose             [0]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS  
  last_failure_time            [0]: Tue Feb  4 06:09:48 2014  
  next_retry_time              [0]: Tue Feb  4 06:13:48 2014  
  failure_count                [0]: 4  
  
SYSTEM ATTRIBUTES  
  
APPLICATION ATTRIBUTES  
  
INTERNAL ATTRIBUTES  
  
API> close,c,q0  
...  
OK

Note the last_failure_time and next_retry_time timestamps: 06:06:48 -> 06:07:48 -> 06:09:48 -> 06:13:48, the same as in the nc output.
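
If you want to track these values over time, the dump is easy to parse. The following is a rough sketch that assumes the dump text has already been captured into a string; the function names are made up, and the parser keeps only one value per attribute name, so it fits a single JMS entry as in my output.

```python
import re
from datetime import datetime

# Excerpt of the third dump above.
DUMP = """\
  incr_wait_time_on_failure       : 30
  max_wait_time_on_failure        : 3600
  is_marked_dead_in_cache      [0]: T
  last_failure_time            [0]: Tue Feb  4 06:09:48 2014
  next_retry_time              [0]: Tue Feb  4 06:13:48 2014
  failure_count                [0]: 4
"""

# name, optional [index], colon, value
LINE = re.compile(r"^\s*(\w+)\s*(?:\[\d+\])?\s*:\s*(.*?)\s*$")


def parse_dump(text):
    """Map attribute names to their raw string values."""
    attrs = {}
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            attrs[match.group(1)] = match.group(2)
    return attrs


def parse_time(value):
    """Parse a timestamp in the format used by the dump."""
    return datetime.strptime(value, "%a %b %d %H:%M:%S %Y")


attrs = parse_dump(DUMP)
wait = parse_time(attrs["next_retry_time"]) - parse_time(attrs["last_failure_time"])
print(attrs["failure_count"], int(wait.total_seconds()))  # 4 240
```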

Initially I thought that Content Server increments the check interval by 30 seconds (see incr_wait_time_on_failure in the output of DUMP_JMS_CONFIG_LIST), but the behavior seems to depend on whether Content Server is able to establish a TCP connection. If you simply shut down Java Method Server and execute DUMP_JMS_CONFIG_LIST, you will find that the check interval is incremented by 30 seconds. In my experiment, however, I used nc to dump the HTTP traffic, so Content Server was able to establish a TCP connection, and the interval behaved differently: it doubled after each failure (1, 2 and 4 minutes).
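
The two behaviors can be compared against the numbers from the dumps. Both formulas below are my guesses fitted to the observations, not documented Content Server logic, and treating max_wait_time_on_failure as an upper cap is likewise an assumption.

```python
INCR_WAIT = 30   # incr_wait_time_on_failure from the dump
MAX_WAIT = 3600  # max_wait_time_on_failure from the dump


def linear_wait(failure_count: int) -> int:
    """Candidate 1: add INCR_WAIT per failure (assumed shape)."""
    return min(INCR_WAIT * failure_count, MAX_WAIT)


def doubling_wait(failure_count: int) -> int:
    """Candidate 2: start at INCR_WAIT and double per failure (assumed shape)."""
    return min(INCR_WAIT * 2 ** (failure_count - 1), MAX_WAIT)


# failure_count -> seconds between last_failure_time and next_retry_time,
# read off the three dumps above.
observed = {2: 60, 3: 120, 4: 240}

results = {}
for name, policy in (("linear", linear_wait), ("doubling", doubling_wait)):
    results[name] = all(policy(n) == wait for n, wait in observed.items())
    print(name, results[name])
# linear False
# doubling True
```

Only the doubling formula reproduces all three observations from the nc experiment; the linear one already diverges at failure_count 3 (90 seconds instead of 120).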

Now about the proper way to reset the check interval:

API> apply,c,,TIME  
...  
q0  
API> next,c,q0  
...  
OK  
API> get,c,q0,result  
...  
2/4/2014 06:39:28  
API> close,c,q0  
...  
OK  
API> apply,c,,DUMP_JMS_CONFIG_LIST  
...  
q0  
API> next,c,q0  
...  
OK  
API> dump,c,q0  
...  
USER ATTRIBUTES  
  
  jms_list_last_refreshed         : Tue Feb  4 05:50:46 2014  
  incr_wait_time_on_failure       : 30  
  max_wait_time_on_failure        : 3600  
  current_jms_index               : 0  
  jms_config_id                [0]: 0801d92080000b65  
  jms_config_name              [0]: JMS docu67dev01:9080 for repo.repo  
  server_config_id             [0]: 3d01d92080000102  
  server_config_name           [0]: repo  
  jms_to_cs_proximity          [0]: 1  
  is_disabled_in_docbase       [0]: F  
  is_marked_dead_in_cache      [0]: T  
  intended_purpose             [0]: DM_JMS_PURPOSE_DEFAULT_EMBEDDED_JMS  
  last_failure_time            [0]: Tue Feb  4 06:37:48 2014  
  next_retry_time              [0]: Tue Feb  4 07:09:48 2014  
  failure_count                [0]: 7  
  
SYSTEM ATTRIBUTES  
  
APPLICATION ATTRIBUTES  
  
INTERNAL ATTRIBUTES  
  
API> close,c,q0  
...  
OK  
API> apply,c,,REFRESH_JMS_CONFIG_LIST  
...  
q0  
API> next,c,q0  
...  
OK  
API> get,c,q0,result  
...  
T  
API> close,c,q0  
...  
OK  
-- previous execution of JMSHealthCheckMethod at 06:37:48  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:37:48 MSK 2014  
  
  
-- rescheduled execution of JMSHealthCheckMethod at 06:43:18 (not 07:09:48!!!)  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:42:48 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:43:18 MSK 2014  
POST /bpm/servlet/DoMethod HTTP/1.1  
User-Agent: Documentum Server 6.7.1240.0300  Linux.Oracle (HTTP Client)  
Host: docu67dev01:9080  
Connection: close  
Content-Type: application/x-www-form-urlencoded  
Content-Length: 502  
  
... method_verb=com.documentum.bpm.rtutil.JMSHealthCheckMethod ...  
Tue Feb  4 06:44:18 MSK 2014  

Executing the REFRESH_JMS_CONFIG_LIST RPC command resets the JMSHealthCheckMethod check interval.
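
To put a number on the difference, here is a small calculation using the values from the transcript above: the server time returned by TIME, the next_retry_time from the dump, and the rescheduled check annotated at 06:43:18. The variable names are mine, and the results are approximate because TIME was called shortly before the refresh itself.

```python
from datetime import datetime

# Values copied from the transcript above.
server_time = datetime.strptime("2/4/2014 06:39:28", "%m/%d/%Y %H:%M:%S")
next_retry = datetime.strptime("Tue Feb  4 07:09:48 2014", "%a %b %d %H:%M:%S %Y")
rescheduled_check = datetime(2014, 2, 4, 6, 43, 18)  # from the nc annotation

wait_without_refresh = int((next_retry - server_time).total_seconds())
wait_with_refresh = int((rescheduled_check - server_time).total_seconds())
print(wait_without_refresh, wait_with_refresh)  # 1820 230
```

Without the refresh, Content Server would have waited about half an hour before checking JMS again; with it, the check happened within a few minutes.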
