anti-performance series

It has been a year since I started nurturing the idea of writing a blogpost about performance best practices. Unfortunately, that idea was doomed from the start: there is a lot of material that needs to be verified before posting, and that would no doubt take a lot of time. So I “invented” another format: I will try to prove or debunk statements from the performance guides provided by the talented team.

Actually, some performance-related statements were already debunked in previous posts.

Minimizing and consolidating activities
System throughput varies between 3-100 activities per second, depending on system configuration and hardware. Workflows with more activities take longer to complete. The largest performance impact for processing activities results from opening up a new Content Server session. As a result, the biggest performance improvement comes from minimizing the number of discrete activities in a workflow. Minimize the number of workflow activities by, 1) eliminating unnecessary activities altogether or 2) consolidating the steps performed by multiple activities, into a single condensed activity.
To improve the completion rate of individual activities, do the following:

  • Use the bpm_noop template wherever possible. This particular noop does not create an additional template and does not send an HTTP post to the JMS
  • Within the automatic activity, do the work on behalf of a superuser instead of a regular user
  • Turn off auditing whenever unnecessary

Iteratively modify the number of system workflow threads to assess the impact on user response time, activity throughput, and system resource consumption. More workflow threads result in greater automatic activity throughput up to the point where system resource consumption degrades performance. Scale up slowly to understand when resource limitations begin to show (Content Server CPU and database CPU utilization). The following provides some guidelines:

  • A single CPU Content Server host cannot process 10,000 activities per hour, regardless of how it is configured
  • Be cautious if CPU or memory utilization exceeds 80% for any tier in the system
  • Do not configure more than three threads per CPU core

If throughput requirements exceed the capacity that a single Content Server can provide, add more Content Servers. Each Content Server instance (and associated workflow agent) nominally supports 15 concurrent workflow threads. Deploy one Content Server instance for every multiple of 15 concurrent workflow threads required by your solution. Avoid more than 25 workflow threads for any Content Server.
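To make the sizing rules quoted above concrete, here is a minimal sketch of the arithmetic in Python. The three constants come straight from the quoted text; the function names are mine, and reading “one instance for every multiple of 15” as “round up” is my assumption, not something the guide spells out.

```python
import math

# Figures taken from the quoted guide.
NOMINAL_THREADS_PER_SERVER = 15  # "nominally supports 15 concurrent workflow threads"
MAX_THREADS_PER_SERVER = 25      # "avoid more than 25 workflow threads"
MAX_THREADS_PER_CORE = 3         # "no more than three threads per CPU core"


def servers_needed(required_threads: int) -> int:
    """Content Server instances for a given thread count.

    Assumption: partial multiples of 15 are rounded up.
    """
    if required_threads <= 0:
        return 0
    return math.ceil(required_threads / NOMINAL_THREADS_PER_SERVER)


def max_threads_for_host(cores: int) -> int:
    """Upper bound on workflow threads for one host according to the guide:
    at most 3 per core, and never more than 25 per Content Server."""
    return min(cores * MAX_THREADS_PER_CORE, MAX_THREADS_PER_SERVER)
```

With these rules a 4-core host tops out at 12 threads, while anything with 9 or more cores hits the flat limit of 25.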

In general, the statements above are misleading:

  • I doubt that “The largest performance impact for processing activities results from opening up a new Content Server session”. First, JMS does not open new sessions: all sessions are already in the session pool. The bad thing here is that DFC performs authentication when it acquires a session from the pool: CS generates a new ticket for every auto-activity, these tickets never match the passwords associated with pooled sessions and, if my memory serves me right, such reauthentication takes 2 RPCs. Second, dealing with a workitem typically takes 4 RPCs: begin transaction, acquire, complete, commit (plus Content Server does some extra work: creating the next activity, updating the workflow object, etc.), and on top of that we need to do some useful work (i.e. perform business logic)
  • workflow delays caused by processing of auto-activities do not affect business users: business users are not robots, they do not complete tasks as quick as thought, so a couple of extra minutes won’t make a difference. On the other hand, “consolidating” auto-activities has a negative impact on project complexity: you need to either consolidate both code and docbase methods or create an extra layer whose purpose is to implement such consolidation (actually, we use the second option, but that wasn’t influenced by performance considerations). So it is much better to keep the code simple, even though EMC’s idea about consolidation sounds reasonable
  • I have no idea what the grounds were for suggesting that auto-activities be invoked under a superuser account (I would accept the following scenario: all auto-activities are invoked under the installation owner account and CS/JMS takes advantage of trusted authentication, but the workflow agent does not support such an option). My preferred option is to assign “previous activity performer” as the performer of the auto-activity and take advantage of dynamic groups. This approach keeps track of the last performer of manual activities, so business users are able to see who has sent them a task
  • “10,000 auto-activities per hour for single CPU host” is an extremely pessimistic estimate: 30,000-50,000 is closer to reality on modern hardware
  • there is no scientific explanation why we need to limit the number of workflow threads to 25 (extra licence fees?). I do believe that “2 * number of cores” is a good starting point for any hardware configuration
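The numbers in the list above are easy to sanity-check. Below is a rough sketch in Python; the RPC counts are the ones I quoted from memory (not measured values), and the function names are made up for illustration.

```python
# RPC counts as recalled in the text above; treat them as assumptions.
REAUTH_RPCS = 2    # re-authentication when a pooled session is acquired
WORKITEM_RPCS = 4  # begin transaction, acquire, complete, commit


def overhead_rpcs_per_activity() -> int:
    """Fixed RPC overhead per auto-activity, excluding business logic."""
    return REAUTH_RPCS + WORKITEM_RPCS


def per_hour_to_per_second(per_hour: float) -> float:
    """Convert an hourly activity rate to activities per second."""
    return per_hour / 3600.0


def starting_threads(cores: int) -> int:
    """Starting point for the number of workflow threads: 2 * cores."""
    return 2 * cores
```

By this count, re-authentication accounts for 2 of the 6 fixed RPCs per auto-activity. Also note that 10,000 activities per hour is only about 2.78 per second, which is below the lower bound of “3-100 activities per second” (3 per second is 10,800 per hour) quoted from the very same guide, while 30,000-50,000 per hour corresponds to roughly 8.3-13.9 per second.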
