Forum How do I...?

Prince on AWS Lambda appears to only work on cold starts

jbk
I'm pretty new to both Prince and Lambda but I'm noticing something quite weird—the very first time I run prince on a new Lambda it works fine, but on repeated runs within the next few minutes it finishes almost immediately but produces an incomplete PDF. The file cuts off in the middle of a table row after about 100 pages (the document should be about 2,500 pages).

If I wait >7 minutes to ensure I'm getting a cold Lambda, it works again (once).

I'm using the latest Lambda package (Prince 14.2) and the Lambda's resources are maxed out.

Locally (M1 Mac mini) it works fine every time.

Anyone seen anything like this before?
alfie
Hi jbk,

Sounds unusual. I would have expected the opposite to happen i.e having it faster and working once the Lambda was warm.

Are you able to try smaller documents to see it it's not a timeout issue? Or maybe incrementally remove parts of the document until it starts working again, and then we can debug from there?

Thanks,

Alfie
jbk
Hi Alfie,

It's definitely not a timeout issue, I'm able to reproduce it now with much smaller documents and it happens even with runs that only take a couple seconds.

I'll keep fiddling with removing different parts of the document, but here's where I am so far:
  • with a 160KB HTML file it works consistently, cold start or warm
  • with a 300KB HTML file (107-page PDF) I can reproduce this error repeatedly, but it's now intermittent—sometimes happens on warm starts but sometimes works fine
  • even with the 300KB HTML file I can't get it to ever fail on a cold start
  • almost the entire document is one HTML table, but splitting this out into about 20 tables so they're each shorter doesn't seem to make a difference (still sometimes works, sometimes doesn't)
  • my lambda is running Node 14, Prince 14.2 with the basic package setup downloaded from the website and extracted directly in the lambda directory (not a layer), and has 10GB RAM, 10GB ephemeral storage, and the max timeout (15min)

More details below, but let me know if there's anything else I can do to help debug! Thanks.

- Jacob



Here is the entire Node script running on the Lambda—it just calls prince with the filenames passed in at runtime.

import { promisify } from 'util'
import { basename, resolve } from 'path'
import { execFile } from 'child_process'
import { createReadStream, createWriteStream } from 'fs'
import { S3Client, GetObjectCommand, PutObjectCommand } from '@aws-sdk/client-s3'

const execFilePromise = promisify(execFile)
const s3 = new S3Client()

export const handler = async (event, context) => {
  const { input, output, bucket, debug } = event

  const html = `/tmp/${basename(input)}`
  const pdf = `/tmp/${basename(output)}`

  const download = await s3.send(
    new GetObjectCommand({
      Bucket: bucket,
      Key: input,
    })
  )
  download.Body.pipe(createWriteStream(html))

  try {
    const { stderr, stdout } = await execFilePromise(
      debug ? resolve('./princedebug') : resolve('./prince'),
      [html, '-o', pdf, '--verbose', '--debug']
    )
    if (stderr) {
      console.log(stderr)
    }
    if (stdout) {
      console.log(stdout)
    }
  } catch (error) {
    throw error
  }

  await s3.send(
    new PutObjectCommand({
      Bucket: bucket,
      Key: output,
      Body: createReadStream(pdf),
      ContentType: 'application/pdf',
    })
  )

  if (debug) {
    await s3.send(
      new PutObjectCommand({
        Bucket: bucket,
        Key: `${output}.debug`,
        Body: createReadStream('/tmp/prince.debug'),
        ContentType: 'text/plain',
      })
    )
  }

  return {
    statusCode: 204,
  }
}


I'm attaching the input HTML, one correct PDF, and one incomplete PDF. You can see that the incomplete PDF ends on page 60 in the middle of a table row and the table of contents entries that should link to pages at the end of the document say page 0.

Here are the prince logs for both those runs on Lambda.

Successful (003_600kb_NOT_working-multiple_tables_2022-05-09_06-10-41_complete.pdf):

prince: debug: init locking for OpenSSL
prince: debug: loading license: /var/task/prince-engine/license/license.dat
prince: debug: loading /var/task/prince-engine/license/license.dat because it is the main resource
prince: loading document: /var/task/prince-engine/license/license.dat
prince: debug: loaded resource: /var/task/prince-engine/license/license.dat
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/fonts.css
prince: debug: loaded resource: /var/task/prince-engine/style/fonts.css
prince: debug: loaded resource: type: no
prince: debug: enabling parallel downloads
prince: Loading document...
prince: loading HTML5 input: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: loading document: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: loaded resource: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: loaded resource: type: no
prince: debug: loaded document: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: sniffed doctype: XHTML
prince: Applying style sheets...
prince: loading style sheet: /var/task/prince-engine/style/common.css
prince: debug: loaded resource: /var/task/prince-engine/style/common.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/html.css
prince: debug: loaded resource: /var/task/prince-engine/style/html.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/hyph.css
prince: debug: loaded resource: /var/task/prince-engine/style/hyph.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/mathml.css
prince: debug: loaded resource: /var/task/prince-engine/style/mathml.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/svg.css
prince: debug: loaded resource: /var/task/prince-engine/style/svg.css
prince: debug: loaded resource: type: no
prince: /tmp/003_600kb_NOT_working-multiple_tables.html: warning: unsupported properties: text-decoration-line
prince: Preparing document...
prince: Converting document...
prince: debug: pack
prince: debug: font request: serif
prince: debug: font scan: times new roman
prince: debug: font scan: times new roman, 0 matches
prince: debug: font scan: dejavu serif
prince: debug: font scan: dejavu serif, 0 matches
prince: debug: font scan: dejavu lgc serif
prince: debug: font scan: dejavu lgc serif, 0 matches
prince: debug: font scan: liberation serif
prince: debug: found font: liberation serif Italic
prince: debug: found font: liberation serif Italic
prince: debug: found font: liberation serif Regular
prince: debug: found font: liberation serif Regular
prince: debug: font scan: liberation serif, 4 matches
prince: debug: font scan: opensymbol
prince: debug: font scan: opensymbol, 0 matches
prince: debug: font scan: dejavu sans
prince: debug: font scan: dejavu sans, 0 matches
prince: debug: font scan: ar pl uming cn
prince: debug: font scan: ar pl uming cn, 0 matches
prince: debug: font scan: ar pl sungtil gb
prince: debug: font scan: ar pl sungtil gb, 0 matches
prince: debug: font scan: kochi mincho
prince: debug: font scan: kochi mincho, 0 matches
prince: debug: font scan: ipamincho
prince: debug: font scan: ipamincho, 0 matches
prince: debug: font scan: takaomincho
prince: debug: font scan: takaomincho, 0 matches
prince: debug: font scan: unbatang
prince: debug: font scan: unbatang, 0 matches
prince: debug: font scan: baekmuk batang
prince: debug: font scan: baekmuk batang, 0 matches
prince: debug: font scan: lohit devanagari
prince: debug: font scan: lohit devanagari, 0 matches
prince: debug: font scan: lohit bengali
prince: debug: font scan: lohit bengali, 0 matches
prince: debug: font scan: ani
prince: debug: font scan: ani, 0 matches
prince: debug: font scan: mukti narrow
prince: debug: font scan: mukti narrow, 0 matches
prince: debug: font scan: lohit punjabi
prince: debug: font scan: lohit punjabi, 0 matches
prince: debug: font scan: lohit gujarati
prince: debug: font scan: lohit gujarati, 0 matches
prince: debug: font scan: lohit tamil
prince: debug: font scan: lohit tamil, 0 matches
prince: debug: font scan: lohit telugu
prince: debug: font scan: lohit telugu, 0 matches
prince: debug: font scan: lohit kannada
prince: debug: font scan: lohit kannada, 0 matches
prince: debug: font scan: lohit malayalam
prince: debug: font scan: lohit malayalam, 0 matches
prince: debug: font scan: lohit oriya
prince: debug: font scan: lohit oriya, 0 matches
prince: debug: font scan: garuda
prince: debug: font scan: garuda, 0 matches
prince: debug: font scan: joypixels
prince: debug: font scan: joypixels, 0 matches
prince: debug: font scan: noto color emoji
prince: debug: font scan: noto color emoji, 0 matches
prince: loading font: /var/task/fonts/LiberationSerif-Regular.ttf
prince: debug: loaded resource: /var/task/fonts/LiberationSerif-Regular.ttf
prince: debug: loaded resource: type: no
prince: used font: Liberation Serif, Regular
prince: debug: font request: bold serif
prince: loading font: /var/task/fonts/LiberationSerif-Bold.ttf
prince: debug: loaded resource: /var/task/fonts/LiberationSerif-Bold.ttf
prince: debug: loaded resource: type: no
prince: used font: Liberation Serif, Bold
prince: Resolving cross-references...
prince: debug: pack
prince: debug: writing PDF to file: /tmp/003_600kb_NOT_working-multiple_tables_2022-05-09_06-10-41.pdf
prince: Finished: success


Unsuccessful (003_600kb_NOT_working-multiple_tables_2022-05-09_06-11-09_incomplete.pdf):

prince: debug: init locking for OpenSSL
prince: debug: loading license: /var/task/prince-engine/license/license.dat
prince: debug: loading /var/task/prince-engine/license/license.dat because it is the main resource
prince: loading document: /var/task/prince-engine/license/license.dat
prince: debug: loaded resource: /var/task/prince-engine/license/license.dat
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/fonts.css
prince: debug: loaded resource: /var/task/prince-engine/style/fonts.css
prince: debug: loaded resource: type: no
prince: debug: enabling parallel downloads
prince: Loading document...
prince: loading HTML5 input: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: loading document: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: loaded resource: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: loaded resource: type: no
prince: debug: loaded document: /tmp/003_600kb_NOT_working-multiple_tables.html
prince: debug: sniffed doctype: XHTML
prince: Applying style sheets...
prince: loading style sheet: /var/task/prince-engine/style/common.css
prince: debug: loaded resource: /var/task/prince-engine/style/common.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/html.css
prince: debug: loaded resource: /var/task/prince-engine/style/html.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/hyph.css
prince: debug: loaded resource: /var/task/prince-engine/style/hyph.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/mathml.css
prince: debug: loaded resource: /var/task/prince-engine/style/mathml.css
prince: debug: loaded resource: type: no
prince: loading style sheet: /var/task/prince-engine/style/svg.css
prince: debug: loaded resource: /var/task/prince-engine/style/svg.css
prince: debug: loaded resource: type: no
prince: /tmp/003_600kb_NOT_working-multiple_tables.html: warning: unsupported properties: text-decoration-line
prince: Preparing document...
prince: Converting document...
prince: debug: pack
prince: debug: font request: serif
prince: debug: font scan: times new roman
prince: debug: font scan: times new roman, 0 matches
prince: debug: font scan: dejavu serif
prince: debug: font scan: dejavu serif, 0 matches
prince: debug: font scan: dejavu lgc serif
prince: debug: font scan: dejavu lgc serif, 0 matches
prince: debug: font scan: liberation serif
prince: debug: found font: liberation serif Italic
prince: debug: found font: liberation serif Italic
prince: debug: found font: liberation serif Regular
prince: debug: found font: liberation serif Regular
prince: debug: font scan: liberation serif, 4 matches
prince: debug: font scan: opensymbol
prince: debug: font scan: opensymbol, 0 matches
prince: debug: font scan: dejavu sans
prince: debug: font scan: dejavu sans, 0 matches
prince: debug: font scan: ar pl uming cn
prince: debug: font scan: ar pl uming cn, 0 matches
prince: debug: font scan: ar pl sungtil gb
prince: debug: font scan: ar pl sungtil gb, 0 matches
prince: debug: font scan: kochi mincho
prince: debug: font scan: kochi mincho, 0 matches
prince: debug: font scan: ipamincho
prince: debug: font scan: ipamincho, 0 matches
prince: debug: font scan: takaomincho
prince: debug: font scan: takaomincho, 0 matches
prince: debug: font scan: unbatang
prince: debug: font scan: unbatang, 0 matches
prince: debug: font scan: baekmuk batang
prince: debug: font scan: baekmuk batang, 0 matches
prince: debug: font scan: lohit devanagari
prince: debug: font scan: lohit devanagari, 0 matches
prince: debug: font scan: lohit bengali
prince: debug: font scan: lohit bengali, 0 matches
prince: debug: font scan: ani
prince: debug: font scan: ani, 0 matches
prince: debug: font scan: mukti narrow
prince: debug: font scan: mukti narrow, 0 matches
prince: debug: font scan: lohit punjabi
prince: debug: font scan: lohit punjabi, 0 matches
prince: debug: font scan: lohit gujarati
prince: debug: font scan: lohit gujarati, 0 matches
prince: debug: font scan: lohit tamil
prince: debug: font scan: lohit tamil, 0 matches
prince: debug: font scan: lohit telugu
prince: debug: font scan: lohit telugu, 0 matches
prince: debug: font scan: lohit kannada
prince: debug: font scan: lohit kannada, 0 matches
prince: debug: font scan: lohit malayalam
prince: debug: font scan: lohit malayalam, 0 matches
prince: debug: font scan: lohit oriya
prince: debug: font scan: lohit oriya, 0 matches
prince: debug: font scan: garuda
prince: debug: font scan: garuda, 0 matches
prince: debug: font scan: joypixels
prince: debug: font scan: joypixels, 0 matches
prince: debug: font scan: noto color emoji
prince: debug: font scan: noto color emoji, 0 matches
prince: loading font: /var/task/fonts/LiberationSerif-Regular.ttf
prince: debug: loaded resource: /var/task/fonts/LiberationSerif-Regular.ttf
prince: debug: loaded resource: type: no
prince: used font: Liberation Serif, Regular
prince: debug: font request: bold serif
prince: loading font: /var/task/fonts/LiberationSerif-Bold.ttf
prince: debug: loaded resource: /var/task/fonts/LiberationSerif-Bold.ttf
prince: debug: loaded resource: type: no
prince: used font: Liberation Serif, Bold
prince: Resolving cross-references...
prince: debug: pack
prince: debug: writing PDF to file: /tmp/003_600kb_NOT_working-multiple_tables_2022-05-09_06-11-09.pdf
prince: Finished: success
  1. 003_600kb_NOT_working-multiple_tables.html581.2 kB
  2. 003_600kb_NOT_working-multiple_tables_2022-05-09_06-10-41_complete.pdf407.7 kB
  3. 003_600kb_NOT_working-multiple_tables_2022-05-09_06-11-09_incomplete.pdf242.0 kB

Edited by jbk

jbk
Quick additional note: removing the unsupported 'text-decoration-line' CSS property doesn't fix it, and the incomplete PDFs are always slightly different sizes and cut off in a different place every time. I noticed this testing larger documents too—the first one would work but the following ones would cut off randomly after approximately 100-150 pages but never in the same place.
wangp
The PDFs are not corrupted (not cut off prematurely) so the next thing I would check is that the input being fed into Prince is complete. I don't know Node or AWS Lambda at all, but perhaps when you download the input from S3 into the temporary file, you need to flush the pipe or something like that?
alfie
Nice call @wangp.

Jacob: are you able to print a debug message for the size of the HTML input that you've read it? That should help pinpoint a short read
jbk
Well crap, that was it @wangp!

My Node script wasn't waiting for
  download.Body.pipe(createWriteStream(html))

to finish running, so Prince was being invoked with just part of the HTML file.

I hadn't seen your comment @alfie but thanks, that's what I started doing, console logging a ton of stuff and trying to print file metadata. I actually couldn't get the file size to reliably show up correctly (even now that it's working), but a bunch of my log messages were out of order which is what eventually helped me pinpoint this.

I would guess the reason it was always about 100-150 pages was that that's how much HTML could be downloaded and piped into the file in the milliseconds before Prince got to it.

Forcing the script to wait for the stream to close appears to have fixed everything:
  await new Promise(resolve => {
    download.Body.pipe(createWriteStream(html)).on('finish', resolve)
  })


This is what I get for just skimming the Node docs on streams.

Thanks so much for your help everyone!!

Edited by jbk

jbk
Still curious why it consistently did work on cold starts... 🤔

Edited by jbk

alfie
I'm guessing because cold starts took longer, making the request to S3 able to finish on time
jbk
Yeah that has to be it... I knew cold starts took time to 'boot up' but do they also run slower?

Once the handler function actually runs, I would think it would take the same amount of time to get from 'starting to download the file' to 'starting prince', unless the whole thing runs a bit slower or prince needs to set some stuff up the first time it runs.
alfie
On the first run, things like fonts will need to be brought in and parsed etc. After that, caches should take over leading to an faster future runs.