Friday, January 20, 2012

Add JavaScript to Existing PDF Files (Python)

There are several tools on the net that help you add JavaScript to new PDF files, but I could not find any tool that adds JavaScript to existing PDF files.

Here is a Python script that does that. It requires a modified version of pyPdf (the original is at http://pybrary.net/pyPdf/). The modified pyPdf is available at http://goo.gl/sJcMO. You can either install it as a package or simply unzip it into your current directory.


Help text and Usage for this script:
$ python addjs2pdf.py 
Usage: addjs2pdf.py [options] in-pdf-file out-pdf-file

Options:
  --version             show program's version number and exit
  -h, --help            show this help message and exit
  -j JAVASCRIPT, --javascript=JAVASCRIPT
                        javascript to embed (default embedded JavaScript is
                        app.alert messagebox)
  -f JAVASCRIPTFILE, --javascriptfile=JAVASCRIPTFILE
                        javascript file to embed

  add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened
  Based on modified pyPDF http://pybrary.net/pyPdf/ and inspiration from https://DidierStevens.com

$

Here is addjs2pdf.py:
from pyPdf import PdfFileWriter, PdfFileReader

import optparse

def Main():
    """add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened
    """

    parser = optparse.OptionParser(usage='usage: %prog [options] in-pdf-file out-pdf-file', version='%prog 0.1')
    parser.add_option('-j', '--javascript', help='javascript to embed (default embedded JavaScript is app.alert messagebox)')
    parser.add_option('-f', '--javascriptfile', help='javascript file to embed')
    (options, args) = parser.parse_args()

    if len(args) != 2:
        parser.print_help()
        print ''
        print '  add-js-to-pdf, use it to add embedded JavaScript to a PDF document that will execute automatically when the document is opened'
        print '  Based on modified pyPDF http://pybrary.net/pyPdf/ and inspiration from https://DidierStevens.com'
        print ''
        return

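    # Open the existing PDF and create an empty writer for the result.
    input1 = PdfFileReader(file(args[0], "rb"))
    output = PdfFileWriter()

    # Copy every page of the input PDF into the output unchanged.
    pages = input1.getNumPages()
    for p in range(pages):
        output.addPage(input1.getPage(p))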
    if options.javascript is None and options.javascriptfile is None:
        # Neither -j nor -f was given: embed the default alert box.
        javascript = """app.alert({cMsg: 'Hello from PDF JavaScript', cTitle: 'Testing PDF JavaScript', nIcon: 3});"""
    elif options.javascript is not None:
        # -j takes precedence over -f when both are given.
        javascript = options.javascript
    else:
        try:
            fileJavaScript = open(options.javascriptfile, 'rb')
        except IOError:
            print "error opening file %s" % options.javascriptfile
            return

        try:
            javascript = fileJavaScript.read()
        except IOError:
            print "error reading file %s" % options.javascriptfile
            return
        finally:
            fileJavaScript.close()

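    # Embed the JavaScript (addJS is provided by the modified pyPdf),
    # then write the resulting PDF to the output file.
    output.addJS(javascript)
    outputStream = file(args[1], "wb")
    output.write(outputStream)
    outputStream.close()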

if __name__ == '__main__':
    Main()
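
The script above targets Python 2 (print statements, file(), optparse). As a rough sketch only, here is how the option handling and the choice of which JavaScript to embed might look on Python 3 using the standard library. The names build_parser, resolve_javascript, DEFAULT_JS, in_pdf and out_pdf are hypothetical and not part of the original script, the PDF handling itself is left out, and the file is read in text mode here, whereas the original opens it with 'rb'.

```python
import argparse

# Same default payload as the script above.
DEFAULT_JS = ("app.alert({cMsg: 'Hello from PDF JavaScript', "
              "cTitle: 'Testing PDF JavaScript', nIcon: 3});")


def build_parser():
    """Hypothetical argparse equivalent of the optparse setup above."""
    parser = argparse.ArgumentParser(
        prog="addjs2pdf.py",
        description="Add embedded JavaScript to an existing PDF document.")
    parser.add_argument("-j", "--javascript",
                        help="javascript to embed")
    parser.add_argument("-f", "--javascriptfile",
                        help="javascript file to embed")
    # Positional names are assumptions; the original only checks len(args).
    parser.add_argument("in_pdf", help="existing PDF file to read")
    parser.add_argument("out_pdf", help="PDF file to write")
    return parser


def resolve_javascript(inline=None, path=None):
    """Pick the JavaScript to embed: -j first, then -f, then the default."""
    if inline is not None:
        return inline
    if path is not None:
        # Text mode is an assumption of this sketch.
        with open(path, "r") as js_file:
            return js_file.read()
    return DEFAULT_JS


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    ["-j", "app.alert('hi');", "in.pdf", "out.pdf"])
print(resolve_javascript(args.javascript, args.javascriptfile))
```

Keeping the selection logic in its own function makes the precedence explicit: when both -j and -f are given, the inline script wins and the file is never opened, which matches the behaviour of the original if/elif/else chain.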

5 comments:

  1. Very easy to understand post, but I still have a question. When I do a document.close(), I get an error "The document has no pages". I am not sure what I am doing wrong.

    1. @Tom: Can you post (or email) the code that you are using? There is no document.close() in the above code, and I do not see why a .close() anywhere in that code would throw that error. -- Moorthy rsmoorthy at gmail.com

  2. Hi there! Thanks for your post.

    I was wondering if it is possible to programmatically add page-level JavaScript (instead of document-level JavaScript) to a PDF, perhaps with pyPDF or any other library you're aware of?

    I need to add page-open JavaScript to a whole bunch of PDFs. Unfortunately I'm not able to use document-level JavaScript because I'm using 3D annotations and I can't get a handle on those at document open.


    - Arvind

  3. This is because all browsers have accepted JavaScript as a scripting language and provide integrated support for it. All you need to do is properly handle the tasks that depend on each browser's DOM.
    best web design tutorials
