Parallelism challenge

In a recent project, after refactoring some improperly implemented functionality (don’t worry, that functionality was implemented by previous consultants :)), we realized that we needed to delete about 50 million sysobjects. The bad news is that Documentum performs deletes very slowly: in a single thread it is capable of deleting about 50 sysobjects per second, so in our case it would take about 12 days, which is not a good prospect 😦 The obvious solution was to use multithreading, but I prefer not to code in Java when it comes to administration routines. So I implemented a very basic STDIN multiplexer using the Bourne-again shell:


#!/bin/bash

# amount of processes to spawn (first argument)
PROCS=$1
shift

# command to spawn (remaining arguments)
CMD="$*"
# opened descriptors
declare -a DESCRIPTORS
# spawned pids
declare -a PROCESSES

# closes all descriptors opened previously
close() {
 local d
 for d in ${DESCRIPTORS[*]}; do
  eval "exec $d<&-"
 done
}

# waits for all spawned processes
# the wait builtin does not work because
# processes are spawned in subshells
waitall() {
 local pid
 while :; do
  for pid in "$@"; do
   # drop the pid from the list ...
   shift
   kill -0 $pid 2>/dev/null
   if [ "x0" = x$? ]; then
    # ... and put it back if it is still alive
    set -- "$@" $pid
   fi
  done
  (("$#" > 0)) || break
  sleep 5
 done
}

# finds the children of a process
findchild() {
 local pid=$1
 local p pp
 while read -r p pp; do
  if [ "x$pid" = "x$pp" ]; then
   echo $p
  fi
 done < <(ps -eo pid,ppid)
}

# terminates a process tree
killtree() {
 local pid=$1
 local sig=${2-TERM}
 local child
 # stop the process first so it cannot spawn new children
 kill -STOP $pid 2>/dev/null
 if [ "x0" = x$? ]; then
  for child in `findchild $pid`; do
   killtree $child $sig
  done
  kill -$sig $pid 2>/dev/null
  kill -CONT $pid 2>/dev/null
 fi
}

# closes all descriptors and
# terminates all spawned processes
abort() {
 local pid
 close
 for pid in ${PROCESSES[*]}; do
  killtree $pid TERM
  killtree $pid KILL
 done
}

# emergency exit: close all descriptors
# and terminate spawned processes
trap "abort" 0

for (( i=0; i<=$PROCS-1; i+=1 )); do
 # spawning command ({FD} syntax needs bash 4.1 or later)
 exec {FD}> >($CMD) || exit $?
 # storing pid of spawned process
 PROCESSES[i]=$!
 # storing opened descriptor
 DESCRIPTORS[i]=$FD
done

# pass each input line to the next process in round-robin order
while IFS= read -r line; do
 i=$(((i + 1) % $PROCS))
 printf '%s\n' "$line" >&${DESCRIPTORS[i]}
done

# normal exit: close all descriptors
# and wait for spawned processes
close
waitall ${PROCESSES[*]}

trap - 0
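
To see the round-robin idea in isolation, here is a minimal sketch. It is not the script above: the function name multiplex_demo, the output directory argument and the use of cat as a stand-in worker are all made up for illustration, and it assumes bash 4.1 or later for the {fd} redirection syntax.

```shell
#!/bin/bash
# hypothetical demo: spread STDIN lines over N workers in round-robin order
# usage: multiplex_demo PROCS OUTDIR < input
multiplex_demo() {
 local procs=$1 outdir=$2
 local -a fds pids
 local i fd pid line
 for (( i=0; i<procs; i++ )); do
  # each "worker" is just cat writing to its own file
  exec {fd}> >(cat > "$outdir/worker$i.out")
  pids[i]=$!
  fds[i]=$fd
 done
 i=0
 while IFS= read -r line; do
  printf '%s\n' "$line" >&"${fds[i]}"
  i=$(( (i + 1) % procs ))
 done
 # closing the descriptors gives every worker an EOF
 for fd in "${fds[@]}"; do
  exec {fd}>&-
 done
 # poll until the workers are gone, as waitall does
 for pid in "${pids[@]}"; do
  while kill -0 "$pid" 2>/dev/null; do sleep 0.1; done
 done
}
```

Fed the numbers 1 to 7 with three workers, worker0.out should end up with lines 1, 4 and 7, worker1.out with 2 and 5, and worker2.out with 3 and 6. Note that the demo starts at worker 0, while the script above increments the index before the first write.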

Now I’m able to perform deletes in parallel:

 ~$] 40 iapi docbase -Udmadmin -Ppassword -X \
> < delete_objects.api > delete_objects.log
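
As a rough sanity check of the numbers, here is a small sketch of the arithmetic. The function name estimate_seconds is made up, and the multi-process figure assumes, optimistically, that throughput scales linearly with the number of processes; I did not measure the actual speed-up here.

```shell
#!/bin/bash
# hypothetical helper: estimate_seconds OBJECTS RATE_PER_SEC [PROCS]
# prints the estimated run time in seconds, rounded up
# ASSUMPTION: throughput scales linearly with PROCS
estimate_seconds() {
 local objects=$1 rate=$2 procs=${3:-1}
 local per_sec=$(( rate * procs ))
 echo $(( (objects + per_sec - 1) / per_sec ))
}

estimate_seconds 50000000 50      # single thread
estimate_seconds 50000000 50 40   # 40 processes, ideal scaling
```

The first call prints 1000000 (about 11.6 days), the second 25000 (about 7 hours), so even a fraction of the ideal speed-up makes the job manageable.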

4 thoughts on “Parallelism challenge”

  1. Good afternoon, Andrey
    Please take a look, I think something got missed

    exec ‘{FD}’
    ++ iapi dmprod -Udmadmin -P -X
    ./ line 82: exec: {FD}: not found


  2. Pingback: Workflow throughput | Documentum in a (nuts)HELL
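
A note on the error in the first comment: “exec: {FD}: not found” is what bash prints when it does not understand the {varname} redirection syntax, which appeared in bash 4.1; older versions treat {FD} as the name of a command to exec. A minimal guard could look like the sketch below, where the function name supports_named_fd is made up for illustration.

```shell
#!/bin/bash
# hypothetical guard: succeeds when bash supports {varname} redirection (4.1+)
# optional arguments MAJOR MINOR override the running shell's version
supports_named_fd() {
 local major=${1:-${BASH_VERSINFO[0]}}
 local minor=${2:-${BASH_VERSINFO[1]}}
 (( major > 4 || (major == 4 && minor >= 1) ))
}

if ! supports_named_fd; then
 echo "bash 4.1 or later is required, found $BASH_VERSION" >&2
 exit 1
fi
```

Placed at the top of the script, this fails early with a readable message instead of the confusing exec error.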
