
How do I determine the number of pages in a generated PDF using Java?

moss
I've noticed a very similar question: How do I determine the number of pages in a generated PDF?

I'm interested in doing the same using Java.

When I add the JavaScript from the above example to the markup, I can see that "total-page-count" is being logged, but I need to access this value in our Java code.

We're on Prince 10 rev 3.
hallvord
Hi moss,
I haven't tested this yet, just read the docs, but it looks like it should be simple to get this working by saving the same snippet of JS in a file:

function logPageCount()
{
    Log.data("total-page-count", Prince.pageCount);
}

Prince.addEventListener("complete", logPageCount, false);


and then calling

Prince.addScript('/path/to/that/file.js');


and using the PrinceEvents API to listen for messages and record the page count.
http://www.princexml.com/doc/java/api/com/princexml/PrinceEvents.html
Let me know if this works for you.

Announcement: repos for tests/utils

moss
Hallvord, that basically does the same thing as I did, just using a different method (add the script using external file vs include it in a script tag within the markup).

That doesn't work though because that event callback seems to be executed only when "a warning or error message is received from Prince", as per the API reference.

To be honest, I'm not sure if the functionality works even in case of a warning or error message. I tested a conversion using
<html><body>hello

I expected this conversion to fail, but there was no call into the callbacks, and I could see in the Prince log that the conversion was successful as it reported "finished: success". What am I missing here?


jim_albright
I find the total pages by looking at a log file produced in debugging mode.
stats: total-page-count: 6



Wed Sep 23 14:14:24 2015: ---- begin
Wed Sep 23 14:14:24 2015: debug: loading license: C:\Program Files (x86)\Prince\Engine/license/license.dat
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine/style/fonts.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine/style/fonts.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: Loading document...
Wed Sep 23 14:14:24 2015: loading XML input: C:\Users\Albright\Documents\My Princess\input\Biblica.en.NIV.bible.2011-1TH.htm.copy
Wed Sep 23 14:14:24 2015: debug: loaded document: C:\Users\Albright\Documents\My Princess\input\Biblica.en.NIV.bible.2011-1TH.htm.copy
Wed Sep 23 14:14:24 2015: debug: sniffed doctype: XHTML
Wed Sep 23 14:14:24 2015: Running scripts...
Wed Sep 23 14:14:24 2015: loading script: C:\Users\Albright\Documents\My Princess\PrincessJavascript.js
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Users\Albright\Documents\My Princess\PrincessJavascript.js
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: Applying style sheets...
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine/style/xhtml-ns.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine/style/xhtml-ns.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine/style/common.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine/style/common.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine\style\hyph.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine\style\hyph.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine\style\mathml.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine\style\mathml.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Program Files (x86)\Prince\Engine\style\svg.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Program Files (x86)\Prince\Engine\style\svg.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Users\Albright\Documents\My Princess\css\USX 1 column ragged Prince.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Users\Albright\Documents\My Princess\css\USX 1 column ragged Prince.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: loading style sheet: C:\Users\Albright\Documents\Visual Studio 2008\Projects\princess2\princess2\bin\x86\Debug\RequiredFiles\PrincessShowTrackingColor.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: C:\Users\Albright\Documents\Visual Studio 2008\Projects\princess2\princess2\bin\x86\Debug\RequiredFiles\PrincessShowTrackingColor.css
Wed Sep 23 14:14:24 2015: debug: loaded resource: type: no
Wed Sep 23 14:14:24 2015: Preparing document...
Wed Sep 23 14:14:24 2015: Converting document...
Wed Sep 23 14:14:24 2015: debug: pack
Wed Sep 23 14:14:24 2015: debug: font request: bold sans-serif
Wed Sep 23 14:14:24 2015: debug: font scan: Arial
Wed Sep 23 14:14:24 2015: debug: font scan: Arial, 34 matches
Wed Sep 23 14:14:24 2015: debug: font scan: MS Gothic
Wed Sep 23 14:14:24 2015: debug: font scan: MS Gothic, 7 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Andika Basic
Wed Sep 23 14:14:24 2015: debug: font scan: Andika Basic, 1 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Charis SIL
Wed Sep 23 14:14:24 2015: debug: font scan: Charis SIL, 24 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Times New Roman
Wed Sep 23 14:14:24 2015: debug: font scan: Times New Roman, 34 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Symbol
Wed Sep 23 14:14:24 2015: debug: font scan: Symbol, 1 matches
Wed Sep 23 14:14:24 2015: debug: font scan: MingLiU
Wed Sep 23 14:14:24 2015: debug: font scan: MingLiU, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: MS Mincho
Wed Sep 23 14:14:24 2015: debug: font scan: MS Mincho, 14 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Mangal
Wed Sep 23 14:14:24 2015: debug: font scan: Mangal, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Vrinda
Wed Sep 23 14:14:24 2015: debug: font scan: Vrinda, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Raavi
Wed Sep 23 14:14:24 2015: debug: font scan: Raavi, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Shruti
Wed Sep 23 14:14:24 2015: debug: font scan: Shruti, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Latha
Wed Sep 23 14:14:24 2015: debug: font scan: Latha, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Gautami
Wed Sep 23 14:14:24 2015: debug: font scan: Gautami, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Tunga
Wed Sep 23 14:14:24 2015: debug: font scan: Tunga, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Kartika
Wed Sep 23 14:14:24 2015: debug: font scan: Kartika, 2 matches
Wed Sep 23 14:14:24 2015: debug: font scan: Arial Unicode MS
Wed Sep 23 14:14:24 2015: debug: font scan: Arial Unicode MS, 15 matches
Wed Sep 23 14:14:24 2015: used font: Arial, Bold
Wed Sep 23 14:14:24 2015: debug: font request: serif
Wed Sep 23 14:14:24 2015: used font: Andika Basic, Regular
Wed Sep 23 14:14:24 2015: debug: font request: sans-serif
Wed Sep 23 14:14:24 2015: used font: Arial, Regular
Wed Sep 23 14:14:24 2015: debug: font request: Charis SIL
Wed Sep 23 14:14:24 2015: used font: Charis SIL, Regular
Wed Sep 23 14:14:25 2015: debug: writing output: C:\Users\Albright\Documents\My Princess\input\Biblica.en.NIV.bible.2011-1TH.pdf
Wed Sep 23 14:14:25 2015: stats: total-page-count: 6
Wed Sep 23 14:14:25 2015: finished: success
Wed Sep 23 14:14:25 2015: ---- end

Jim Albright
Wycliffe Bible Translators
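
For anyone who wants to pull that line out of the log programmatically, here is a minimal Java sketch. It assumes the log lines look like the dump above (timestamp, then "stats: total-page-count: N"); the class and method names are made up for illustration and are not part of any Prince API.

```java
import java.util.List;
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Prince Java wrapper.
final class PrinceLogPageCount {
    // Matches the tail of a line such as
    // "Wed Sep 23 14:14:25 2015: stats: total-page-count: 6".
    private static final Pattern PAGE_COUNT =
            Pattern.compile("stats: total-page-count: (\\d{1,9})\\s*$");

    private PrinceLogPageCount() {}

    // Returns the count from the last matching line, or empty if no line matches.
    static OptionalInt fromLogLines(List<String> lines) {
        OptionalInt result = OptionalInt.empty();
        for (String line : lines) {
            Matcher m = PAGE_COUNT.matcher(line);
            if (m.find()) {
                result = OptionalInt.of(Integer.parseInt(m.group(1)));
            }
        }
        return result;
    }
}
```

The lines could come from something like Files.readAllLines() on the log file of a single conversion. Taking the last match is a design choice so that a log file appended to by several runs yields the most recent count.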

hallvord
Hi moss, I should probably test before trying to give advice :)

It is possible to use the JavaScript hack - but it needs a bit of adjustment. You can either make the script output a warning-type message:

    Log.warning("total-page-count: " + Prince.pageCount);


or set
prince.setVerbose(true)

in your Java program and use Log.info() in your JS. The verbose output contains a lot of information, and the message listener will have to weed through all of it to find the total-page-count message, so you may prefer the Log.warning() method. It isn't as semantically "pure", though, since it doesn't really make sense to report the total page count as a "warning" ;)
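
A rough sketch of what the listener side could look like in Java. To keep it compilable without the Prince jar, the class below does not implement PrinceEvents; it only mirrors an onMessage(msgType, msgLocation, msgText) callback, which is an assumption about the interface's shape. The class name and the message format "total-page-count: N" are also assumptions for illustration.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical listener. In real code it would be declared
// "implements PrinceEvents" (assumed to declare this same callback).
final class PageCountListener {
    private static final Pattern PAGE_COUNT =
            Pattern.compile("total-page-count:\\s*(\\d{1,9})");

    private OptionalInt pageCount = OptionalInt.empty();

    // msgType is ignored so that both the Log.warning() and the
    // verbose Log.info() variants are picked up.
    public void onMessage(String msgType, String msgLocation, String msgText) {
        if (msgText == null) {
            return;
        }
        Matcher m = PAGE_COUNT.matcher(msgText);
        if (m.find()) {
            pageCount = OptionalInt.of(Integer.parseInt(m.group(1)));
        }
    }

    // Empty until a matching message has been received.
    OptionalInt getPageCount() {
        return pageCount;
    }
}
```

After the conversion call returns, getPageCount() would hold the value if a matching message reached the callback; whether Prince actually delivers these messages to the callback depends on the wrapper version in use.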

Hope this helps!


mikeday
We are currently adding a way to get these data messages back from the C# and Java wrappers. :)
hallvord
I posted my test code here:
https://gist.github.com/hallvors/1d216f00225e35f3c1271e9df9a00e2e
(Log.warning("I haven't written Java code for years and never learnt it in depth, so this is likely not a glorious example to be copied faithfully.. :-p)")



hallvord
and @mikeday - that's good news, OP will appreciate a cleaner solution I'm sure :)


moss
jim_albright wrote:
I find the total pages by looking at a log file produced in debugging mode.


Yes, I was able to do it as well but I wasn't asking about how to do that. I'm asking about having access to that piece of information in code.

@hallvord I have no trouble sending this value to the console using the verbose setting (which we use as default anyway). The issue is the lack of an easy way to extract that information in code. You could create a separate log file for each conversion and then load and parse it, but that's long-winded and adds I/O. Also, I think all the Log.warning and Log.error calls do is write the message to the log. They do not seem to trigger the event handler at all.

@mikeday Good to hear that you plan on including this in the wrappers. We need to move on with the development though, so we will use other methods.

Can you explain why the event handler isn't called even if I attempt to convert a malformed input?
mikeday
We don't currently issue errors or warnings for "malformed" HTML, as almost all documents are malformed in one way or another and the parser will correct them automatically, much like a browser.

However, if you set the input type to "xml" then the conversion will fail and you should receive error and warning messages.

Also, Prince now gives warning messages for CSS syntax errors, and of course for files that cannot be loaded or downloaded.