It is an interesting challenge, so I dug into the topic a bit.
There are indeed a number of related Java libraries; the difficulty is choosing one. I wanted to try the ICEpdf library, but they require registration before downloading, so I skipped it...
The article at
http://adamtaft.blogspot.fr/2010/04/open-source-java-pdf-viewing-library.html makes an interesting (quick) comparison of some libraries, and gives a snippet for Apache PDFBox. Hey, it is an Apache project, as free as it can get, and it seems, from some reports, to be in decent shape, so I gave it a try.
I adapted the given code to Processing:
    import org.apache.pdfbox.*;
    import org.apache.pdfbox.pdmodel.*;
    import java.awt.image.BufferedImage;

    void setup()
    {
      size(800, 800);
      background(#F0F8FF);

      // createInput() returns null when the file cannot be opened
      InputStream inputStream = createInput("G:/Downloads/PDF Samples/example_001.pdf");
      if (inputStream == null)
      {
        println("Cannot open the PDF file");
        return;
      }
      PDDocument doc = null;
      try
      {
        doc = PDDocument.load(inputStream);
        // Take the first page of the document
        final PDPage page = (PDPage) doc.getDocumentCatalog().getAllPages().get(0);

        // Render the page to an AWT image at 120 dpi
        BufferedImage bi = page.convertToImage(BufferedImage.TYPE_INT_ARGB, 120);
        println("Conversion done");
        // Wrap the AWT image in a PImage and draw it at the top-left corner
        PImage image = new PImage(bi);
        image(image, 0, 0);
      }
      catch (Exception e)
      {
        e.printStackTrace();
      }
      finally
      {
        // Always release the document, even when rendering failed
        if (doc != null)
        {
          try { doc.close(); }
          catch (IOException e) { println("Problem when closing doc: " + e.getMessage()); }
        }
      }
    }
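A note on the 120 passed to convertToImage: as far as I can tell it is a resolution in dots per inch, and PDF pages are measured in points (72 per inch), so the rendered image can easily be larger than the 800x800 window. Here is a rough sketch of the arithmetic in plain Java. The class and method names are mine, and the A4 page size (595 x 842 points) is only an illustrative assumption, I did not check the size of the sample file.

```java
public class PageScale {
    // PDF user space defaults to 72 units (points) per inch.
    static final float POINTS_PER_INCH = 72f;

    // Size in pixels of a length given in points, rendered at the given dpi.
    public static int pixelsAt(float points, int dpi) {
        return Math.round(points * dpi / POINTS_PER_INCH);
    }

    // Scale factor (never above 1) that makes a w x h image fit in maxW x maxH.
    public static float fitScale(int w, int h, int maxW, int maxH) {
        return Math.min(1f, Math.min((float) maxW / w, (float) maxH / h));
    }

    public static void main(String[] args) {
        // Assumed page size: A4, 595 x 842 points (illustrative only).
        int w = pixelsAt(595f, 120);   // 992
        int h = pixelsAt(842f, 120);   // 1403
        float s = fitScale(w, h, 800, 800);
        System.out.println(w + " x " + h + " px, scale to fit: " + s);
    }
}
```

In the sketch, the same factor could be applied by drawing with image(image, 0, 0, image.width * s, image.height * s) instead of image(image, 0, 0), or simply by asking for a lower resolution.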
The PDF file is from
http://www.tcpdf.org/examples.php which seems to be a good source of various kinds of PDF files. You probably don't need a very sophisticated rendering, anyway.
PDFBox had issues with the embedded fonts, and used a default one instead.
To run it, I made a PDFBox/library folder in the sketchbook libraries folder, and put these files there:
pdfbox-1.7.1.jar
fontbox-1.7.1.jar
jempbox-1.7.1.jar
commons-logging-1.1.1.jar
I renamed pdfbox-1.7.1.jar to PDFBox.jar to follow Processing's naming conventions. Commons Logging is a hard dependency of the library (so are fontbox and jempbox, see the Dependencies page of the site).
It displayed the content of the PDF file successfully.
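The sketch relies on the PImage constructor that accepts an AWT image. If that constructor is not available in your Processing version (an assumption on my part, I only ran the code as shown), the pixels can be copied by hand. Below is a minimal plain-Java sketch of the copy step; the class and method names are hypothetical, only BufferedImage.getRGB is a real API.

```java
import java.awt.image.BufferedImage;

public class ArgbPixels {
    // Copies a BufferedImage into a packed ARGB int array, row by row.
    // getRGB() converts from whatever the image type is to default ARGB.
    public static int[] toArgb(BufferedImage src) {
        int w = src.getWidth();
        int h = src.getHeight();
        int[] pixels = new int[w * h];
        src.getRGB(0, 0, w, h, pixels, 0, w);
        return pixels;
    }

    public static void main(String[] args) {
        // Tiny stand-in for the image returned by the PDF renderer.
        BufferedImage bi = new BufferedImage(2, 1, BufferedImage.TYPE_INT_ARGB);
        bi.setRGB(0, 0, 0xFFFF0000); // opaque red
        bi.setRGB(1, 0, 0x80FFFFFF); // half-transparent white
        int[] pixels = toArgb(bi);
        System.out.println(Integer.toHexString(pixels[0]));
    }
}
```

On the Processing side, the array could then be copied into a PImage made with createImage(w, h, ARGB): call loadPixels(), copy into its pixels array, then call updatePixels().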