Hi,
I have a simple sketch that takes a .csv file generated by a process and produces a PDF report from it. I want the output to be 8.5x11 inches at 300 dpi (standard US letter). The CSV file is relatively large: it is a test report generated from an electric car battery pack test and records the voltage, resistance, etc. of every cell (there are 108 cells...) a few times a second (typically 2-4 times a second). The test can run for two or more hours, and the resulting file is too big to work with practically in MS Excel; typically there are 10000-15000 rows and 552 columns.
Among some textual data, my sketch generates a graph in the PDF that has 108 x rows points, drawn on top of each other with high transparency. This makes the resulting file very large and sluggish in OS X Preview.app. Even if only 1 x rows points (e.g. 12000 points) are generated, there is still a noticeable slowdown. Is there a way I can optimize this chart so the PDF is smaller?
The code I am using to output the pdf is:
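import processing.pdf.*;  // needed for the PDF renderer

int PageHorizDim = 2550;  // 8.5 in x 300 dpi
int PageVertDim = 3300;   // 11 in x 300 dpi
size(PageHorizDim, PageVertDim, PDF, "output.pdf");  // size() has to come before any drawing call
background(255, 255, 255);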
The entire sketch, including reading in the ~70MB .csv file, runs in less than 5 seconds on my MacBook. It takes maybe 15 seconds to open the PDF. I should note that Preview.app does not actually crash, so I suppose as long as I can print the file it should be OK. I don't know if a printer can handle the file, but even with the ~1M points the PDF is still only 7.2MB. Still, I wonder if the output can be optimized somewhat, and if it is worth it? The area of the graph (points) is 1500x750.
Answers
Moved to General Discussion since that's not a Question about Code (the code you show isn't significant).
Changed markup of code. Don't hit the C button and then paste the code; paste the code, select it, then hit the C button.
And your problem probably has no simple solution: a PDF file is a vector file, so the reader has to render each item individually, which can take quite some time when there are a lot of items to process.
Exporting to a bitmap would solve the issue, but you lose the high scalability.
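To make the bitmap idea concrete, here is a minimal sketch in plain Java that flattens a cloud of translucent points into a single grayscale image, so the PDF would carry one image instead of a million vector items. The class name `PointRaster`, the method `rasterize`, and its parameters are made up for illustration and are not from the original sketch; it also assumes black points on a white background. Inside a Processing sketch, drawing into an offscreen buffer from `createGraphics()` and placing it with `image()` would probably be the more natural route.

```java
import java.awt.image.BufferedImage;
import java.util.Arrays;

// Hypothetical helper (not from the original sketch): flattens many
// translucent black points into one grayscale bitmap.
final class PointRaster {

    // xs/ys are pixel coordinates, alpha is the opacity of a single point (0..1).
    static BufferedImage rasterize(int[] xs, int[] ys, int w, int h, float alpha) {
        // Brightness of each pixel, starting at 1.0 (white background).
        float[] light = new float[w * h];
        Arrays.fill(light, 1f);

        // Drawing black with opacity alpha over a pixel scales its
        // brightness by (1 - alpha); overlapping points compound.
        for (int i = 0; i < xs.length; i++) {
            int x = xs[i], y = ys[i];
            if (x < 0 || x >= w || y < 0 || y >= h) continue; // skip points off the canvas
            light[y * w + x] *= (1f - alpha);
        }

        // Convert the accumulated brightness to 8-bit gray samples.
        BufferedImage img = new BufferedImage(w, h, BufferedImage.TYPE_BYTE_GRAY);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                img.getRaster().setSample(x, y, 0, Math.round(255f * light[y * w + x]));
            }
        }
        return img;
    }
}
```

For the 1500x750 graph area mentioned in the question, the image has a fixed size no matter how many points go into it. The cost is the one already named: the graph no longer scales like vector output.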
I wound up only using every 10th point, which helps the performance drastically, and increased the alpha value. Your answer was as I feared: most PDF viewers are just not meant to deal with 1M+ points... For release software I will probably just plot the min, max, and average cell voltage in the end, reducing 107 x rows points. ;)
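For anyone finding this later, both reductions can be sketched roughly as below, in plain Java. The class `PlotReduce` and its methods are hypothetical names, not the code from my sketch, and it assumes each CSV row has already been parsed into a `float[]` of cell voltages.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not the original sketch): two ways to cut down
// the number of points before they are drawn into the PDF.
final class PlotReduce {

    // Keep every step-th row: rows 0, step, 2*step, ...
    static List<float[]> everyNth(List<float[]> rows, int step) {
        if (step < 1) throw new IllegalArgumentException("step must be >= 1");
        List<float[]> kept = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += step) {
            kept.add(rows.get(i));
        }
        return kept;
    }

    // Collapse one row of cell voltages into {min, max, mean}.
    static float[] minMaxMean(float[] cells) {
        if (cells.length == 0) throw new IllegalArgumentException("row has no cells");
        float min = cells[0];
        float max = cells[0];
        double sum = 0;
        for (float v : cells) {
            min = Math.min(min, v);
            max = Math.max(max, v);
            sum += v;
        }
        return new float[] { min, max, (float) (sum / cells.length) };
    }
}
```

With a step of 10, a 12000-row file leaves 1200 rows to plot, and `minMaxMean` leaves three values per row instead of one per cell.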
Also, thank you for the, erm... moderating. :)