Two column bug causes tables to run off the page

bookdev
On June 28, 2008, when you created the two-column Wikipedia samples (http://www.princexml.com/howcome/2008/wikipedia/united-states.pdf), the table “Leading population centers” automatically took a column-span of 2 because it was too wide to fit in one column.
However, when I run the exact same command line using PrinceXML version 7 (prince --no-author-style -s http://www.princexml.com/howcome/2008/wikipedia/wiki2.css http://en.wikipedia.org/wiki/United_States -o United_States.pdf), the same table has a column-span of 1 and, as a result, either overlaps the column on the right or runs off the right side of the page.
Is this due to a bug in PrinceXML 7? How can I get PrinceXML to make the table two columns wide like yours? I can't see any problems with the table that would cause this. See: <table class="infobox" style="text-align: center; width: 97%; margin-right: 10px; font-size: 90%;">
In a two-column layout (body { columns: 2 }), could you please suggest how to make tables that are too wide for a single column automatically take a column-span of 2?
bookdev
This bug shows in the Wikipedia Norway sample PDF (http://www.princexml.com/howcome/2008/wikipedia/norway.pdf). That is, the 2008 version of PrinceXML displays wide tables correctly spanning two columns but version 7 of PrinceXML doesn't, with the result that wide tables either overlap the right-hand column or run off the right side of the page.
howcome
The problem is that Wikipedia's markup and styling keep changing; it's a moving target. The markup also tends to lose semantics (e.g., class names) and gain style attributes. To detect the population centre table, I added this rule:

table[style="text-align:center; width:97%; margin-right:10px; font-size:90%"]
{ float: bottom }

That is, I select an element based on the content of its style attribute. It would have been much better if Wikipedia used a class name like <table class=wide> or something. I've tried to improve Wikipedia's markup in the past, but with no detectable success:

http://www.princexml.com/howcome/2009/wikipedia/

Anyway, you can find an updated style sheet here:

http://www.princexml.com/howcome/2010/wikipedia/wiki2-table-detect.css

It will only detect the table as long as the style attribute has exactly the content shown above. Therefore, don't expect it to work forever.

A better approach to formatting Wikipedia may be to intercept the content before it's converted to semantically shallow HTML.

-h&kon
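
The exact-match selector above stops working as soon as the style attribute changes at all. A somewhat looser variant, offered only as a sketch, is a CSS3 substring attribute selector. The substring "width: 97%" below is an assumption taken from the table markup quoted in the first post; its spacing has to match whatever the page actually serves, so this remains brittle too.

```css
/* Assumption: the wide table's style attribute contains "width: 97%" somewhere. */
/* [style*="..."] matches a substring instead of the whole attribute value. */
table[style*="width: 97%"] { float: bottom }
```
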
bookdev
Thanks so much for the updated style sheet, Håkon. That worked perfectly. Since Wikipedia's markup and styling are a moving target, I went with: table { float: bottom }. That fixed the overflow problem for all tables. Unfortunately, it had the unintended consequence of making nested tables pop out and run underneath the tables they were embedded in. Could you please let me know if there’s any way to avoid that? (I figure if you don’t know, no one will.)

Thanks for the link to your Wikipedia markup suggestions. Wikipedia would be much improved if they took your advice. I also found your Printing a Book with CSS helpful.

I agree that working directly with the “wiki language” would have some advantages over HTML.

Thanks, again. It’s not often you get help from the Chairman.
mikeday
You could try something like "table table { float: none }" to avoid floating tables inside tables.
bookdev
Thanks, Mike. That worked perfectly.
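
For reference, the rules discussed in this thread can be collected into one small style sheet. This is only a sketch that combines the two-column setting from the original question, the catch-all table rule, and the nested-table fix; it is not the content of the updated wiki2-table-detect.css.

```css
/* Two-column layout, as in the original question. */
body { columns: 2 }

/* Catch-all: float every table to the bottom of the page, */
/* so it no longer depends on Wikipedia's exact style attribute. */
table { float: bottom }

/* Nested tables stay in place inside their parent table. */
table table { float: none }
```

The third rule is more specific than the second, so it wins for any table nested inside another table regardless of the order in which the rules appear.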