Page "break" problem with big tables (>= page size)

iamleo
Hi,
I have some output with tables that can be longer than one page. The algorithm that breaks these tables is "wrong", regardless of whether "page-break-inside: avoid" is used or not.

In the attachment, you'll see a minimal pathological example where, instead of "detecting" immediately that the table will be too long and starting to output it right away, princexml first shows a page with only the title, then a page with only the subtitle, and only then the table...

There are 3 cases:
1) table size = 1 page + page-break-inside: avoid
2) table size = 1 page
3) table size > 1 page + page-break-inside: avoid

The same wrong behavior is observed for all.

Best regards.
Lorenzo
  1. bug.html (64.2 kB)
    source
  2. bug.pdf (41.6 kB)
    output
mikeday
The heading elements (h1, h2, etc.) have page-break-after: avoid applied to them by default, could that be causing the problem?
iamleo
No, this is unrelated to the page-break-after value of h1 and h2.

With smaller tables you get the right behavior; see the attached document.
  1. bug.html (96.8 kB)
    source with also smaller tables
  2. bug.pdf (48.9 kB)
    output with also smaller tables
mikeday
If you remove the "page-break-inside: avoid" from table elements and .nobreak divs, there will be no unexpected page breaks. After that, where would you prefer to prevent it from breaking?

If the primary goal is to prevent small tables from breaking, but allow big tables to break, unfortunately that is currently difficult to achieve unless you count the number of rows and then decide whether to apply page-break-inside or not.
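The row-counting workaround mentioned above could be sketched roughly as follows. This is only an illustration run as a preprocessing step before the HTML is handed to the formatter, not part of Prince itself. The names TableRowCounter, break_policies and MAX_ROWS_PER_PAGE are hypothetical, and the threshold is an assumption that depends on page size, fonts and row heights (row count only approximates table height).

```python
from html.parser import HTMLParser

# Hypothetical threshold: number of rows that fill one page for a given
# layout. It must be tuned per document; it is not a Prince setting.
MAX_ROWS_PER_PAGE = 40


class TableRowCounter(HTMLParser):
    """Counts <tr> elements for each top-level <table> in a document."""

    def __init__(self):
        super().__init__()
        self.row_counts = []  # one entry per top-level table, in document order
        self._depth = 0       # current <table> nesting depth

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self._depth += 1
            if self._depth == 1:
                self.row_counts.append(0)
        elif tag == "tr" and self._depth == 1:
            # Rows of nested tables are ignored; only top-level rows count.
            self.row_counts[-1] += 1

    def handle_endtag(self, tag):
        if tag == "table" and self._depth > 0:
            self._depth -= 1


def break_policies(html, max_rows=MAX_ROWS_PER_PAGE):
    """Return a page-break-inside value ('avoid' or 'auto') per top-level table."""
    counter = TableRowCounter()
    counter.feed(html)
    counter.close()
    # Small tables keep 'avoid'; tables of about a page or more may break.
    return ["avoid" if n < max_rows else "auto" for n in counter.row_counts]
```

The returned values could then be written into each table's style attribute or mapped to a class before conversion; how that is done depends on how the HTML is generated.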
iamleo
No, the result isn't logical at all.

Problem 1, we have:
- Title
- Subtitle
- Table height of 1 page
We get (in terms of pages):
1. Title
2. Subtitle
3. Table height of 1 page
We would normally expect (due to the nobreak on the table), in terms of pages:
1. Title + subtitle
2. Table height of 1 page

=> 1 extra page for no reason (the table is not broken, no problem with that)

Problem 2, we have:
- Title
- Subtitle
- Table height of MORE than 1 page (it will have to be broken in ALL situations)
We get (in terms of pages):
1. Title
2. Subtitle
3. Table height of MORE than 1 page
We would normally expect, in terms of pages:
1. Title + subtitle + start of the table
2. End of the table

=> Yes, the table is broken; it needs to be, since it is longer than one page. But there is no need for extra pages before it eventually gets broken.
mikeday
It's because the h2 and the table are inside a nobreak div, and because the h2 has page-break-after: avoid in the default style sheet.

If you comment out the .nobreak class and add "h2 { page-break-after: auto }" it should be better.
iamleo
Your algorithm should compute the height of these elements and then choose the best page layout (as other libs, or even LaTeX, will do).

I have even removed the nobreak div. As you can see in the example below, the result doesn't change. The only remaining "nobreak" is there to avoid (if possible) breaking tables. The problem persists.
  1. bug.html (96.5 kB)
    source
  2. bug.pdf (48.9 kB)
    output
mikeday
Yes, it is a limitation of our page breaking support which we hope to fix in the future.

Can you also add "h2 { page-break-after: auto }" to the style sheet?
iamleo
Adding h2 { page-break-after: auto } will help with problem 1, but not at all with problem 2...

But again, the "nobreaks" in the document are clearly logical: they are hints to avoid, *if possible*, breaking between some elements. In the end, the algorithm must choose the best layout, and in your lib the final result adds extra pages for no good reason. I hope this can be improved. The lib is great, however...
mikeday
Yes, smarter page breaking is necessary to fix problem 2. Basically, look ahead and see whether the entire table can fit on the next page, and if it can't, put it on the current page. Sounds simple, but so many other things are going on with page breaking that it has been difficult to implement. :)
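As a rough illustration of the look-ahead idea described above (not Prince's actual implementation), the decision could be sketched like this. The function name choose_start_page and its parameters are hypothetical; heights are in arbitrary but consistent units, and the sketch ignores headings, keep-with-next rules and everything else that interacts with page breaking.

```python
def choose_start_page(table_height, remaining_height, page_height):
    """Decide where a table with page-break-inside: avoid should start.

    Returns "current" to start on the current page, or "next" to push the
    whole table to the next page.
    """
    if table_height <= remaining_height:
        # Fits in the space left on this page: no break is needed at all.
        return "current"
    if table_height <= page_height:
        # Does not fit here, but fits on a fresh page: move it to honor 'avoid'.
        return "next"
    # Taller than a full page: it must break anyway, so start right away
    # instead of leaving blank space before it.
    return "current"
```

With a 700-unit page and 200 units left, a 500-unit table would move to the next page, while a 900-unit table would start on the current page and break.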
iamleo
I understand, and it doesn't sound simple to me (also a developer) :) That's the only problem I had with your lib (and I had many, many more with the others I have tried), but even if the problem is hard, it would be great if it could be improved.

My fix for the moment will be to detect whether our tables are going to be 1 page or more, and to remove "page-break-inside: avoid" for those.
vijay.v
@mikeday We are facing the same type of problem. When we have a long table, we have to place it in landscape on a separate page. This creates a blank area where it breaks to the new page, which needs to be avoided. How can I do this?

  1. article.html (220.5 kB)
  2. article.pdf (345.7 kB)
  3. style-1.css (3.7 kB)