Do you like obsessive advertising?

I have no idea who came up with the dumb idea of displaying a document’s content in a browser – I have always thought that if you are unable to open a file in a specialized application, you are not meant to see that file. Today I noticed on LinkedIn an advertisement for yet another square wheel and realised that ARender’s advertising is really obsessive. Some examples:

But being a curious person, I decided to give ARender a chance and “tried” it. The result, as expected, was mediocre – 10MB of network traffic for a small PDF file. Interesting: how can it be fast (quote: “Extremely fast startup time, no application download required at client side.”) if it sends a bunch of HTTP requests on every resize? But maybe network traffic is not an issue anymore – after all, we are in 2015. OK, let’s explore the ARender site.

Oops…

Oops…

Do you have any idea why I like the /etc/passwd file (I believe passing /dev/zero is also a funny option)? It contains information about users’ home directories, which in turn contain .bash_history files:

Oh no…
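
To spell that out: every entry in /etc/passwd names a user’s home directory, so once a viewer happily renders arbitrary local paths, a short loop over those entries is enough to locate readable shell histories. A minimal local sketch of the idea (plain Python, nothing ARender-specific):

```python
#!/usr/bin/env python3
"""Sketch: /etc/passwd lists every user's home directory, and each of those
may contain a readable .bash_history."""
import os

with open("/etc/passwd") as passwd:
    for line in passwd:
        # passwd format: name:password:UID:GID:GECOS:home:shell
        fields = line.rstrip("\n").split(":")
        if len(fields) < 7:
            continue
        user, home = fields[0], fields[5]
        history = os.path.join(home, ".bash_history")
        if os.path.isfile(history):
            print(f"{user}: {history}")
```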

PS. I got a response from the ARender team:

Greetings,

We have read your blogpost thoroughly regarding the problems you raised on our document viewer, ARender.

First of all, many thanks for sending us the potential weaknesses and bugs you could find in order for us to improve and consolidate our solutions.

Regarding your raised issues about ARender’s bandwidth usage, this originates from our backward compatibility with Internet Explorer 6. As the latter does not handle resizing of pictures very well, we had to request pictures with different sizes for each window size change. Now with ARender 3, and the drop of IE6 compatibility, we will soon be able to use a resizing mechanism, with only some key picture sizes requested on demand when the quality starts to be altered by the zoom. This will leverage the number of images requested but also the number of http requests.

For the security issue regarding the access to critical system paths, it is possible in ARender to turn off the filesystem access, and in the future, to restrict specific paths once ARender enters production environment. We also recently integrated ARender in docker, that we will try to promote and push as standard usage. As ARender is then deployed in a minimalistic environment, there will be no services exposure other than ARender itself and no access to the real host filesystem.

Some ideas about organising storage for content files

Memento mori

When planning how you are going to store content files, always think about disaster recovery. The typical case is: storage admins ask how much disk space you need and then provision one large 10-20TB LUN for Documentum – this is completely wrong, because in case of disaster recovery your primary goal is to decrease RTO and RPO, and restoring “obsolete” files from 10-20TB LUNs won’t help you. Business users always have preferences about what needs to be recovered first – it may be content of specific/business-critical types, or content loaded within the last two days/weeks/months. Also keep in mind that Documentum does not work without the content of the /System cabinet.

General considerations are:

  1. always prefer NAS to SAN – in general, NAS appliances are slower than SAN, but that is not an issue for Documentum; furthermore, most NAS appliances have built-in capabilities which do not exist in SAN appliances. For example, if you need to scale your repository across multiple servers you have two options: create a cluster filesystem (cluster software costs extra money and requires extra maintenance) or use NAS. A NAS appliance is typically a symbiosis between filesystem, network and disk drivers, so most NAS appliances have built-in replication and snapshot capabilities (SAN appliances may have such capabilities too, but the problem is that a SAN appliance has no idea what is stored in the underlying LUN)
  2. if you have no choice and SAN is the only option, always use a volume manager – never ever create a filesystem on a LUN without a volume manager, otherwise in the future you will be unable to perform extremely simple operations without downtime. For example, if I need to move all data from one storage to another (somebody decided to decommission an old appliance, or I decided to move old data to slower storage), I just add a new physical volume to the existing disk group, remove the old physical volume and wait while the volume manager moves the data between physical volumes online (see the sketch after this list)
  3. split content volumes into maintainable pieces – it may be 3-6 months’ worth of data or 1-2TB volumes; in my deployments I have found that 2TB is an optimal size
  4. try to understand the business value of stored content and design storage accordingly – the Content Storage Services option is your friend here
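
To illustrate the second point, below is a minimal sketch of the online migration sequence (vgextend / pvmove / vgreduce) wrapped in Python; the volume group name and device paths are hypothetical placeholders, not anything Documentum-specific.

```python
#!/usr/bin/env python3
"""Sketch: move content files from an old LUN to a new one online,
using LVM. All names and paths are placeholders."""
import subprocess

VG = "vg_content"                       # hypothetical volume group holding content files
OLD_PV = "/dev/mapper/old_storage_lun"  # LUN being decommissioned (placeholder)
NEW_PV = "/dev/mapper/new_storage_lun"  # LUN on the new appliance (placeholder)

def run(*cmd):
    """Echo and run a command, failing loudly on a non-zero exit code."""
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run("pvcreate", NEW_PV)       # label the new LUN as an LVM physical volume
run("vgextend", VG, NEW_PV)   # add it to the existing volume group
run("pvmove", OLD_PV)         # migrate all extents off the old LUN, online
run("vgreduce", VG, OLD_PV)   # remove the emptied LUN from the volume group
run("pvremove", OLD_PV)       # clear the LVM label so the LUN can be unmapped
```

Nothing in this sequence requires stopping the Content Server, which is exactly the point of putting a volume manager between the filesystem and the LUN.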

Trusted Content Services

Never ever use the Trusted Content Services option for encrypting content files; the considerations are:

  • it does not bring any value from a security perspective – even stubborn EMC employees have realised that
  • there are different opinions about how to properly use AV software in a Documentum environment: some guys think that real-time scanning is good and get something like: , other guys think that periodic AV scans of content volumes are OK – but what are you going to find if all the content is encrypted? Moreover, viruses have a dumb nature: a file treated as harmless today may turn out to be harmful tomorrow, so encryption is not the AV’s friend.
  • it seems that EMC fails to provide backward compatibility for the TCS option across releases and operating systems: How will content be re-encrypted during TCS 7.2 upgrade?, Documentum Migration from AIX to Linux