PDF.js

Technology Services Group recently published a blog post comparing PDF.js and OpenAnnotate. Unfortunately, the whole comparison rests on the assumption that PDF.js does not support progressive loading:

PDF.js loads the entire PDF into the client via JavaScript. This works fine for moderately large documents (10 pages), however many of our clients have documents in the 300-700 page range. Larger files put a lot of strain on the network, and leaves minimal options when it comes to performance tuning.

This is simply not true: PDF.js has supported progressive loading since 2013 (see "Implement progressive loading of PDFs"). Actually, it is more correct to say that PDF.js has supported progressive loading since birth, because PDF.js was originally created as a Firefox extension, has been included in Mozilla Firefox since 2012 (version 15), and has been enabled by default since 2013 (version 19). Unfortunately, after receiving a couple of valuable comments, Technology Services Group embarrassingly decided to remove those comments and close the blog post to further comments.

Those valuable comments were:

  • PDF.js does support PDF annotation via a plugin
  • If TSG thinks that progressive loading requires linearized PDFs, why do they not optimize all PDFs before storing them?

The most interesting thing here is the fact that progressive loading does not require linearized PDFs.
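
To see why linearization is not a prerequisite, recall that every PDF records the byte offset of its cross-reference table after the startxref keyword at the very end of the file. A viewer that can issue HTTP range requests may fetch that tail first and then request only the objects a given page needs. The sketch below illustrates the idea in Java; it is not PDF.js code, and the class TailXref, its method names and the 1024-byte tail size are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of PDF.js): finds the xref offset from the file tail. */
class TailXref {
    /** Number of trailing bytes requested first; an assumed value for this sketch. */
    static final int TAIL_SIZE = 1024;

    /** Builds the HTTP Range header value covering the last TAIL_SIZE bytes of a file. */
    static String tailRange(long fileLength) {
        long start = Math.max(0, fileLength - TAIL_SIZE);
        return "bytes=" + start + "-" + (fileLength - 1);
    }

    /** Parses the decimal offset that follows the last "startxref" keyword in the tail. */
    static long startXrefOffset(byte[] tail) {
        // ISO-8859-1 maps every byte to one char, so binary data cannot break decoding
        String text = new String(tail, StandardCharsets.ISO_8859_1);
        int idx = text.lastIndexOf("startxref");
        if (idx < 0) {
            throw new IllegalArgumentException("no startxref keyword in tail");
        }
        String rest = text.substring(idx + "startxref".length()).trim();
        int end = 0;
        while (end < rest.length() && Character.isDigit(rest.charAt(end))) {
            end++;
        }
        if (end == 0) {
            throw new IllegalArgumentException("no offset after startxref");
        }
        return Long.parseLong(rest.substring(0, end));
    }
}
```

For a 5000-byte file, TailXref.tailRange(5000) gives bytes=3976-4999. Once the tail is in hand, startXrefOffset tells the client where the cross-reference table starts, and from there it can request individual objects by offset. Linearization only moves the first page's data and some hints to the front of the file; it speeds up the first page but is not needed for this to work.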

Mastering PDF annotations

About a month ago I was asked about a possible solution for the following case:

  1. the customer has an EDMS based on EMC Documentum
  2. after a certain lifecycle state documents become read-only (that is, the document content gets signed and is not supposed to change anymore)
  3. despite the read-only state, users want to be able to add comments to the content
  4. the customer is potentially ready to convert all content to PDF

You may say that PDF annotations are an option for me. Bullshit! I have checked a couple of implementations and found all of them unsuitable; the primary reasons are:

  1. some implementations require read-write access to the content (there is a workaround that extracts PDF annotations upon upload, but it looks very ugly)
  2. some implementations are nailed down to a specific application and, moreover, are intended to work only through a browser, which is very inconvenient. This stupid concept is perfectly demonstrated in an EMC presentation: the user's monitor is capable of displaying two A4 pages simultaneously, but the UI is able to display only 1/3 of a page. Worst UI ever.

So, I started to study the vendor's (i.e. Adobe's) documentation and found a possible solution: Acrobat® Online Collaboration: Setup and Administration. Unfortunately, Adobe's document does not state clearly what must be injected into a PDF to make it "online-review-capable", but after some debugging I found the following solution based on the iText library:

import java.io.FileOutputStream;
import java.util.Date;

// Assumes iText 5; with iText 2.x the package is com.lowagie.text.pdf
import com.itextpdf.text.pdf.PdfReader;
import com.itextpdf.text.pdf.PdfStamper;

// The class name is arbitrary
public class SharedReviewEnabler {
    public static void main(String[] args) throws Exception {
        String id = "randomId";                            // review identifier
        String baseUrl = "http://docu70dev01/annotations"; // WebDAV root for comments
        Date date = new Date();

        // Document-level JavaScript that registers the PDF for a shared review
        StringBuilder javaScript = new StringBuilder();
        javaScript.append("(function () {");
        // Run only in Acrobat 8+, or in Reader when annotation rights are granted
        javaScript
                .append("if (app.viewerVersion >= 8 && (!app.viewerType.match(/Reader/) || ");
        javaScript
                .append("requestPermission(permission.annot, permission.create) == permission.granted)) {");
        javaScript.append("var msg = {").append("doc: this,");
        javaScript.append("initiator: (new String(\"\")),");
        javaScript.append("id: (new String(\"").append(id).append("\")),");
        // Comments for this review live under <baseUrl>/<id>/
        javaScript.append("source: (new String(\"").append(baseUrl).append("/")
                .append(id).append("/\")),");
        javaScript
                .append("driver: (new String(\"urn://ns.adobe.com/Collaboration/SharedReview/WebDAV\")),");
        javaScript.append("invitees: (new String(\"\")),");
        // JavaScript's Date constructor expects milliseconds since the epoch
        javaScript.append("sentDate: (new Date(").append(date.getTime())
                .append(")),");
        javaScript.append("deadDate: (new Boolean(false)),");
        javaScript.append("requireSave: (new Boolean(false)),");
        javaScript.append("cc: (new String(\"\")),");
        javaScript.append("distributionMethod: (new String(\"MANUAL\")),");
        javaScript.append("versionInfo: (new Number(11)),");
        javaScript.append("accessLevel: (new Number(0))");
        javaScript.append("};");
        javaScript.append("Collab.registerReview(msg);");
        javaScript.append("}");
        javaScript.append("})()");

        // Stamp the script into a copy of the source PDF
        String file = "G:\\Users\\andrey\\Documents\\test.pdf";
        PdfReader reader = new PdfReader(file);
        PdfStamper stamper = new PdfStamper(reader, new FileOutputStream(
                "G:\\Users\\andrey\\Documents\\test_review.pdf"));
        stamper.addJavaScript(javaScript.toString());
        stamper.close();
        reader.close();
    }
}

After that, Acrobat Reader gets some cool functionality.

All I need now is to implement a WebDAV server 🙂
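
As a starting point for such a server, here is a rough sketch built on the JDK's built-in com.sun.net.httpserver package. It only stores and returns files under /annotations in memory using plain HTTP verbs. WebDAV-specific methods such as PROPFIND, MKCOL and LOCK are answered with 405, and which of them Acrobat actually needs is not worked out here. The class name AnnotationStore and everything inside it are hypothetical.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory file store; a starting point, not a complete WebDAV server. */
class AnnotationStore {
    private final Map<String, byte[]> files = new ConcurrentHashMap<>();
    private final HttpServer server;

    /** Pass port 0 to let the operating system pick a free port. */
    AnnotationStore(int port) throws IOException {
        server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/annotations", this::handle);
    }

    /** Starts the server and returns the port it is actually listening on. */
    int start() {
        server.start();
        return server.getAddress().getPort();
    }

    void stop() {
        server.stop(0);
    }

    private void handle(HttpExchange ex) throws IOException {
        String path = ex.getRequestURI().getPath();
        byte[] body = new byte[0];
        int status;
        switch (ex.getRequestMethod()) {
            case "OPTIONS":
                ex.getResponseHeaders().add("Allow", "OPTIONS, GET, PUT, DELETE");
                status = 200;
                break;
            case "PUT":
                // readAllBytes requires Java 9 or later
                byte[] previous = files.put(path, ex.getRequestBody().readAllBytes());
                status = previous == null ? 201 : 204;
                break;
            case "GET":
                byte[] stored = files.get(path);
                status = stored == null ? 404 : 200;
                if (stored != null) {
                    body = stored;
                }
                break;
            case "DELETE":
                status = files.remove(path) == null ? 404 : 204;
                break;
            default:
                status = 405; // PROPFIND, MKCOL, LOCK etc. are not implemented in this sketch
        }
        // A response length of -1 tells HttpServer that no body follows
        ex.sendResponseHeaders(status, body.length == 0 ? -1 : body.length);
        if (body.length > 0) {
            try (OutputStream out = ex.getResponseBody()) {
                out.write(body);
            }
        }
        ex.close();
    }
}
```

Calling new AnnotationStore(8080).start() would serve http://localhost:8080/annotations/, which could then stand in for the baseUrl used above while the missing WebDAV methods are added one by one.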