Friday, January 20, 2012

Add Javascript to Existing PDF files (Python)

Several tools on the net help you add JavaScript to new PDF files, but I could not find one that adds JavaScript to existing PDF files.

Here is a Python script that does that. It needs a modified version of pyPDF (the original is at http://pybrary.net/pyPdf/). The modified pyPDF is at http://goo.gl/sJcMO; either install it as a Python package or simply unzip it in your current directory.
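
A quick way to check that the modified package, and not a stock pyPDF, is the one being picked up is to look for the addJS method that the script below calls. This is only a sketch: has_add_js is a hypothetical helper name, and it assumes that addJS on PdfFileWriter is what the modification adds, which the post implies but does not state.

```python
import importlib


def has_add_js(module_name='pyPdf'):
    """Return True if module_name imports and its PdfFileWriter has addJS.

    has_add_js is a hypothetical helper for illustration only; the class and
    method names are taken from the script in this post.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The package is neither installed nor unzipped on the import path.
        return False
    writer_class = getattr(module, 'PdfFileWriter', None)
    return hasattr(writer_class, 'addJS')
```

Calling has_add_js() from the directory where you unzipped the package should return True if the modified pyPDF is importable from there.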


Help text and Usage for this script:
$ python addjs2pdf.py 
Usage: addjs2pdf.py [options] in-pdf-file out-pdf-file

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -j JAVASCRIPT, --javascript=JAVASCRIPT
                        javascript to embed (default embedded JavaScript is
                        app.alert messagebox)
  -f JAVASCRIPTFILE, --javascriptfile=JAVASCRIPTFILE
                        javascript file to embed

  add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened
  Based on modified pyPDF http://pybrary.net/pyPdf/ and inspiration from https://DidierStevens.com

$
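
The two options interact in a fixed order in the script: -j takes precedence over -f, and with neither option the default app.alert message is embedded. A minimal sketch of that selection logic, pulled out into a standalone function (choose_javascript and DEFAULT_JS are illustrative names, not part of the script):

```python
# Default script, copied from addjs2pdf.py.
DEFAULT_JS = ("app.alert({cMsg: 'Hello from PDF JavaScript', "
              "cTitle: 'Testing PDF JavaScript', nIcon: 3});")


def choose_javascript(inline=None, path=None):
    """Return the JavaScript to embed, mirroring the script's precedence.

    inline corresponds to -j and path to -f. This is a hypothetical helper:
    the script itself does the same thing inline with if/elif/else.
    """
    if inline is not None:
        # -j wins, even when -f is given as well.
        return inline
    if path is not None:
        # The script opens the file in 'rb' mode; text mode is used here
        # so that the sketch returns a str on both Python 2 and 3.
        with open(path, 'r') as handle:
            return handle.read()
    return DEFAULT_JS
```

For example, choose_javascript(inline="app.alert('hi');", path="ignored.js") returns the inline string without ever opening the file.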

Here is addjs2pdf.py:
from pyPdf import PdfFileWriter, PdfFileReader

import optparse

def Main():
    """add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened
    """

    parser = optparse.OptionParser(usage='usage: %prog [options] in-pdf-file out-pdf-file', version='%prog 0.1')
    parser.add_option('-j', '--javascript', help='javascript to embed (default embedded JavaScript is app.alert messagebox)')
    parser.add_option('-f', '--javascriptfile', help='javascript file to embed')
    (options, args) = parser.parse_args()

    if len(args) != 2:
        parser.print_help()
        print ''
        print '  add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened'
        print '  Based on modified pyPDF http://pybrary.net/pyPdf/ and inspiration from https://DidierStevens.com'
        print ''
        return

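    # The input file is left open on purpose: pyPdf may still read from it
    # when the output is written further down.
    input1 = PdfFileReader(open(args[0], "rb"))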
    output = PdfFileWriter()
        
    pages = input1.getNumPages()
    for p in range(pages):
        output.addPage(input1.getPage(p))
    # Pick the JavaScript to embed: -j wins over -f; with neither, use a default alert.
    if options.javascript is None and options.javascriptfile is None:
        javascript = """app.alert({cMsg: 'Hello from PDF JavaScript', cTitle: 'Testing PDF JavaScript', nIcon: 3});"""
    elif options.javascript is not None:
        javascript = options.javascript
    else:
        try:
            fileJavaScript = open(options.javascriptfile, 'rb')
        except IOError:
            print "error opening file %s" % options.javascriptfile
            return

        try:
            javascript = fileJavaScript.read()
        except IOError:
            print "error reading file %s" % options.javascriptfile
            return
        finally:
            fileJavaScript.close()

    output.addJS(javascript)
    # Write the copied pages plus the embedded JavaScript to the output file.
    with open(args[1], "wb") as outputStream:
        output.write(outputStream)

if __name__ == '__main__':
    Main()
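
The internals of the modified pyPDF's addJS are not shown in this post, so the following is only a sketch of what a document-level script looks like inside a PDF, not a description of that method. In the PDF format, a script is stored as an action dictionary with /S /JavaScript and the code in /JS, and document-level scripts are listed under the /JavaScript entry of the catalog's /Names dictionary. The function names and the object number '5 0 R' are made up for illustration.

```python
def pdf_literal_string(text):
    """Wrap text as a PDF literal string, escaping backslashes and parentheses."""
    # Escape backslashes first so the escapes added next are not doubled.
    escaped = text.replace('\\', '\\\\')
    escaped = escaped.replace('(', '\\(').replace(')', '\\)')
    return '(' + escaped + ')'


def javascript_action(javascript):
    """Return the text of a JavaScript action dictionary holding the script."""
    return '<< /Type /Action /S /JavaScript /JS %s >>' % pdf_literal_string(javascript)


def names_entry(script_name, action_ref):
    """Return a /Names dictionary that registers one document-level script.

    action_ref is an indirect reference to the action object, for example
    '5 0 R' (an illustrative object number).
    """
    return '<< /JavaScript << /Names [%s %s] >> >>' % (
        pdf_literal_string(script_name), action_ref)
```

For instance, javascript_action("app.alert('hi');") yields an action dictionary whose /JS value has its parentheses escaped, which is the kind of object a viewer runs when the document is opened.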

4 comments:

  1. Very easy to understand post, but I still have a question. When I do a document.close(), I get an error "The document has no pages". I am not sure what I am doing wrong.

    1. @Tom: Can you post (or email) the code that you are using? There is no document.close() in the above code nor do I understand why a .close() at any of the above code places can throw that error. -- Moorthy rsmoorthy at gmail.com

  2. Hi there! Thanks for your post.

    I was wondering if it is possible to programmatically add page-level JavaScript (instead of document-level JavaScript) to a PDF, perhaps with pyPDF or any other library you're aware of?

    I need to add page-open JavaScript to a whole bunch of PDFs. Unfortunately I'm not able to use document-level JavaScript because I'm using 3D annotations and I can't get a handle on those at document open.


    - Arvind
