
Any recommendations for keeping Prince's memory footprint down?

petrmotejlek
Hiya.

I have an HTML file of about 5 MB. Even when I remove all stylesheets and images, Prince still seems to use about 190 MB of memory while processing it. Is there anything in particular I can do to the HTML so that it consumes less memory while still producing the same PDF file? I already tried splitting the HTML into multiple files (with no regard for the resulting PDF -- I just removed a couple of higher-level divs and then split the rest into multiple files), and memory use still comes to about 190 MB.

The file is attached (I am not sending over any of the stylesheets or images, since they don't seem to change the memory footprint). It is a pricelist/catalogue, so it consists mainly of tables (with product images and prices).

Thank you!

PS: I tested this using the Debian 12 release of Prince 15.2-1 (https://www.princexml.com/download/prince_15.2-1_debian12_amd64.deb).
PPS: With the Alpine release of Prince, the same file seems to take a bit over 600 MB of memory :D.
  1. pricelist.html (4.7 MB)
wangp
Hi, if you are using `/usr/bin/time -v` to measure the maximum resident set size (RSS), please note that the BusyBox implementation of time (used on Alpine Linux) reports an incorrect value of Max RSS. It is inflated by a factor of 4.

https://bugs.busybox.net/show_bug.cgi?id=15751
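
One way to cross-check the figure without depending on any particular `time` implementation is to read the kernel's own accounting. The following is only a minimal sketch, assuming Linux and Python 3; the function name `peak_child_rss_kib` is made up for illustration. It relies on `resource.getrusage`, whose `ru_maxrss` field is reported in kilobytes on Linux.

```python
import resource
import subprocess
import sys


def peak_child_rss_kib(cmd):
    """Run cmd to completion and return (exit_code, peak RSS in KiB).

    Assumes Linux, where ru_maxrss is reported in kilobytes.
    RUSAGE_CHILDREN reports the largest waited-for child of this
    process, so start a fresh interpreter for each measurement.
    """
    completed = subprocess.run(cmd, check=False)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to measure.
    code, kib = peak_child_rss_kib(sys.argv[1:])
    print(f"exit={code} peak_rss={kib / 1024:.1f} MiB")
```

Saved as, say, `peak_rss.py` (a hypothetical file name), it could be run as `python3 peak_rss.py prince pricelist.html -o pricelist.pdf` on both Debian and Alpine to compare the two builds on equal terms.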
petrmotejlek
Thanks, wangp. However, the issue still stands. I only did the extra test with Alpine to cover more than one environment. With the Debian 12 release of Prince (running on Debian 12), this particular HTML file causes about 190 MB of memory to be consumed. A mere 5 MB file should, IMHO, not eat 190 MB of memory.

I was really hoping for some kind of guidance as to how one should structure files like this better, to make Prince use less memory.
mikeday
May I ask whether the goal is to run Prince on bigger documents than this, or on servers with limited memory available, or to run many copies of Prince in parallel, or something else?
petrmotejlek
Hey, @mikeday. The app is a pricelist/catalogue generator. It takes a database and exports all products from each brand as a separate HTML file (one of which you see as the attachment), then it loops over the pricelists and converts each one into a separate PDF file. So, there's no parallelization or anything like that. There are about 30 brands in total, so I generate 30 HTML files and then try to convert them one by one into separate PDF files.
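
To make the workflow concrete, the conversion step amounts to roughly the following. This is only an illustrative sketch in Python (the real app is wrapped in PHP); the function names and directory layout are hypothetical, and the only Prince-specific part is the usual `prince input.html -o output.pdf` invocation.

```python
import subprocess
from pathlib import Path


def prince_command(html_path, pdf_path):
    """Build the argument list for one conversion: prince INPUT -o OUTPUT."""
    return ["prince", str(html_path), "-o", str(pdf_path)]


def convert_all(html_dir, pdf_dir, run=subprocess.run):
    """Convert every *.html file in html_dir, strictly one at a time.

    Returns the list of PDF paths requested. `run` is injectable so the
    loop can be exercised without Prince installed.
    """
    pdf_dir = Path(pdf_dir)
    pdf_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for html_path in sorted(Path(html_dir).glob("*.html")):
        pdf_path = pdf_dir / (html_path.stem + ".pdf")
        # check=True raises on the first failed conversion, ending the loop.
        run(prince_command(html_path, pdf_path), check=True)
        written.append(pdf_path)
    return written
```

Because each Prince process exits before the next one starts, peak memory is set by the largest single pricelist and not by the number of brands.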

The app is part of a bigger ecosystem, hosted in K8s, and as such, it needs (ideally) some resource limits (namely CPU and memory). This app did not use to run in K8s, but I am now migrating it there, as it is the last piece of the ecosystem which does not run in K8s.

And I was really surprised to see it get killed by the OOM killer when I set the memory limit to 256 MB (Prince isn't the only thing running there, there's also the wrapping PHP code). I was thinking, yeah, it will probably land somewhere between 128 and 200 MB :).

And that's when I started testing with just the plain HTML file (with no styles and images) to see why on earth Prince needs 190 MB of memory to produce a PDF out of that. To me it feels like maybe I am just generating the HTML wrong. Maybe there's some hack/workaround I could use to bring memory use down. Hence my original question -- maybe there are some tips :).
mikeday
If each file only takes a few seconds and uses a few hundred MB then that doesn't seem like a problem: the memory usage is transient and temporary and should be well within the server capacity. The difficulty for us here is that while it may be possible to reduce the memory usage for simple documents, this may result in more complex code for complicated documents, so a small decrease in memory usage can incur a heavy maintenance burden. It's often easier to install more RAM! :D