Prince8.1 server edition maxing out CPU

ScrappyTexas
Hi there,

We're having a production issue at the moment.

I'm producing PDFs from a C# ASP.NET website. It calls another local website and generates the PDFs from that site's URLs. The code looks like this:

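public const string FileRoot = @"c:\PDFDirectory\";

// Renders the dynamic page for the given ID to a PDF file under FileRoot.
void BuildPDFURL(string FileName, int TypeID, int ID) {
    // Point the wrapper at the installed Prince 8.1 executable.
    Prince prn = new Prince(@"C:\Program Files (x86)\Prince8.1\Engine\bin\prince.exe");
    prn.SetLog("C:\\PDFDirectory\\log.txt");
    // JavaScript is needed for the table of contents script in the page.
    prn.SetJavaScript(true);
    // Fetch the page from the local site and write the PDF.
    prn.Convert("http://127.0.0.1:10300/page.aspx?id=" + ID.ToString(), FileRoot + FileName);
}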

When I call Prince on a large page (about 600 KB if you were to download it as HTML), it pegs one CPU of our virtual machine. If a concurrent user generates another page, the whole machine locks up.

These PDFs can take a good while to generate, 30 seconds plus, but the issue is that this freezes our production environment.

As mentioned in the title, we have Prince8.1 (Rev5).

I've tested Prince10 Rev6 on my dev environment and it doesn't improve things (even though my CPU is WAY more powerful than the 2.2GHz Xeon processor on our VM).

Is there any way I can improve this? The PDFs sometimes include images served over the web, always include JS / CSS, and can be added to by a user at any time, which is why I generate them from a dynamic page rather than a static document.

Any help would be greatly appreciated.

Thanks in advance,
Tom
mikeday
There are several issues involved here, as 600kb of HTML normally should not take more than a few seconds to convert.

If the documents can contain images loaded from the web, that can significantly increase conversion times, depending on the speed of the remote server and the available bandwidth. You can test this by running Prince with the --no-network flag on your local machine and seeing whether the conversion time changes.

Also if you are running JavaScript, that could be having an impact. Does the conversion time change when JavaScript is disabled?

Regarding the locking up, is this purely due to server CPU usage, or is it some kind of threading issue? You would not expect one request to block other requests from being processed. Do you get similar behaviour if Prince converts a local file instead of retrieving the HTML from the server?
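
One way to run these comparisons is to time Prince from the command line under each configuration. The following is only a rough sketch for Node.js, under several assumptions: prince is on the PATH, the page has first been saved locally as page.html (a hypothetical file name, since --no-network is meant to rule out network access), and the helper names princeArgs, variants and timeVariants are made up for illustration. The --javascript, --no-network and -o options are Prince command-line options; check them against the version you have installed.

```javascript
// Build the Prince command-line arguments for one timing experiment.
// `input` is the HTML to convert, `output` is the PDF path (both hypothetical).
function princeArgs(input, output, opts) {
  var args = [];
  if (opts.javascript) args.push("--javascript"); // JavaScript only runs when requested
  if (opts.noNetwork) args.push("--no-network");  // do not fetch remote resources
  args.push(input, "-o", output);
  return args;
}

// The three configurations suggested above: as-is, no network, no JavaScript.
var variants = [
  { name: "baseline", javascript: true, noNetwork: false },
  { name: "no-network", javascript: true, noNetwork: true },
  { name: "no-javascript", javascript: false, noNetwork: false }
];

// Run every variant once and report the wall-clock time of each conversion.
// Assumes a `prince` executable can be found on the PATH.
function timeVariants(input) {
  var spawnSync = require("child_process").spawnSync;
  return variants.map(function (v) {
    var start = Date.now();
    var result = spawnSync("prince", princeArgs(input, v.name + ".pdf", v));
    return { name: v.name, ms: Date.now() - start, status: result.status };
  });
}

// Example: console.log(timeVariants("page.html"));
```

If one variant is much faster than the baseline, that points at remote resources or at the script as the main cost. Converting the saved file also covers the last question, since the HTML is no longer retrieved from the web server.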
ScrappyTexas
Hi Mike,

Thanks for your reply.

As requested, I ran the generator with JavaScript disabled and it was much quicker. The CPU hit was the same, but for a fraction of the time.

So this leads me to another question: is there a way to build a linked table of contents without JavaScript?

Here is what I currently run:
<script type="text/javascript">
function getText(e) {
    var text = "";
    for (var x = e.firstChild; x != null; x = x.nextSibling) {
        if (x.nodeType == x.TEXT_NODE) {
            text += x.data;
        } else if (x.nodeType == x.ELEMENT_NODE) {
            text += getText(x);
        }
    }
    return text;
}

function maketoc() {
    var hs = document.getElementsByClassName("toc-include");
    var toc = document.getElementById('toc-container');
    for (var i = 0; i < hs.length; i++) {
        var text = document.createTextNode(getText(hs[i]));
        var span = document.createElement("span");
        span.appendChild(text);
        hs[i].setAttribute("id", "ch" + i);
        var link = document.createElement("a");
        link.setAttribute("href", "#ch" + i);
        if (hs[i].tagName == "H3" || hs[i].tagName == "h3") {
            link.className = "subHead2";
        } else if (hs[i].tagName == "H2" || hs[i].tagName == "h2") {
            link.className = "subHead1";
        }
        link.appendChild(span);
        var p = document.createElement("p");
        p.appendChild(link);
        toc.appendChild(p);
    }
}
</script>

This loops over every heading element marked with the toc-include class, gives it an id, and adds a link to it in the table of contents, so the page numbers are appended correctly.
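
The mapping the script performs can also be sketched apart from the DOM calls, which makes it easier to check on its own. This is only an illustration: the function name buildTocEntries and the plain { tagName, text } objects are hypothetical stand-ins for the real heading elements, and the ids and class names are the ones used in the script above.

```javascript
// Map a list of headings to table-of-contents entries.
// Each heading is a plain object { tagName, text } standing in for a DOM element.
function buildTocEntries(headings) {
  return headings.map(function (h, i) {
    var tag = h.tagName.toUpperCase();
    var className = "";              // other heading levels get no class
    if (tag === "H3") {
      className = "subHead2";
    } else if (tag === "H2") {
      className = "subHead1";
    }
    return {
      id: "ch" + i,                  // id to set on the heading itself
      href: "#ch" + i,               // target of the link in the TOC
      className: className,
      text: h.text
    };
  });
}

// Example with two made-up headings.
var entries = buildTocEntries([
  { tagName: "H2", text: "Overview" },
  { tagName: "h3", text: "Details" }
]);
```

Each entry carries what the loop needs to set the heading's id and to create the matching link.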

I appreciate the logic isn't very efficient, but I couldn't get a working TOC any other way.

Thanks and best regards,
Tom
mikeday
This script should run noticeably faster in Prince 10 compared to Prince 8.1, as we have made a number of improvements to JavaScript evaluation for DOM methods like getElementsByClassName.

If it is still slow with Prince 10, perhaps you could attach or email me (mikeday@yeslogic.com) a copy of the HTML so we can do some performance profiling.