Problem rendering long Hebrew strings

conner_bw
Feed the following into Prince:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en">
<head>
<meta content="text/html; charset=UTF-8" http-equiv="content-type" />
<style type="text/css">
@page { size: 10cm; }
</style>
<title>Hebrew Test</title>
</head>
<body>
<div class="chapter" id="hebrew-test"><div class="chapter-title-wrap"><h3 class="chapter-number">1</h3><h2 class="chapter-title">Hebrew Test</h2></div><div class="ugc chapter-ugc"><p>It is for economic reasons that in Gen. 31:43, he says: או חיום לאלה מה־אעשה ולבנתי לי־הוא ראה אשר־אתהוכל צאני והצאן בני הבניםו בנתי הבנתי אל־יעקבאו ילדו אשר לבניהן (“The daughters are my daughters, the children are my children, the flocks are my flocks, and all that you see is mine. But what can I do today about these daughters of mine, or about their children whom they have borne?”). When read in light of the ancient value of children’s labor rather than modern emotional pricelessness, Laban’s motivation becomes clear to the modern reader, as it would have been to the ancient audience.</p> </div></div>
</body>
</html>


Expected: Hebrew text breaks "gracefully" at the spaces between words and wraps over multiple lines, based on the page size.

Actual: Hebrew characters printed on single line, bleeding out of page margins.

Tested with Prince 8.1-r5 and 9.0

Help?
mikeday
It works for me (see PDF below). Are you applying any additional CSS, and are the spaces actual spaces or non-breaking spaces? Can you attach the actual document instead of pasting it? Also, which operating system are you running Prince on?
  1. hebrew.pdf (43.8 kB): Works for me
conner_bw
Please find attached assets (HTML, broken PDF).

$ prince testme.html

I am running on Ubuntu and Debian.

Cheers.
  1. testme.zip (42.7 kB): Test Me
mikeday
The spaces between the Hebrew words are all U+00A0 non-breaking spaces, which is why the text isn't breaking there. Can you change them to regular spaces?
conner_bw
Thanks for your reply.

This is user-generated input. I can massage the text, yes, but the filter will be indiscriminate and may inadvertently break the user's intention in other places. I will have a go at this.

In comparison, if I open the HTML file in a browser (Chrome, Firefox) and resize the viewport, it renders as expected. Shouldn't Prince try to match that behaviour?

Thanks.

mikeday
Chrome 24 seems to break the text, although oddly, but Firefox 14 does not. I don't know what behaviour browsers should be aiming for; all you can do with a space is break it or not, and it's a non-breaking space. :)
conner_bw
Ok. Will work with this advice. Thanks!

(Edit/PS: I'm on Chrome 28 and Firefox 22.)

mikeday
Testing with Chrome 28 and Firefox 22 on Ubuntu, it still seems to break in Chrome and not break in Firefox. I'm not sure why it's breaking in Chrome, nor why you see it breaking in Firefox.
conner_bw
No worries. I'm on your side now. :)

Non-breaking means non-breaking. We will deal with this. It's just hard to explain invisible characters to non-devs. Also, the fact that I didn't notice this myself isn't a winner. Haha.

Cheers.
mikeday
You can always use the Prince text-replace property to change them to regular spaces:
div { prince-text-replace: '\A0' ' ' }

Or JavaScript regular expressions, or your scripting language of choice. :)
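
For the regular-expression route, here is a minimal sketch of what such a pre-processing filter might look like. The function names are hypothetical, and the narrower variant rests on an assumption: that only non-breaking spaces sitting between two characters of the Unicode Hebrew block (U+0590 to U+05FF) should become regular spaces, so that deliberate non-breaking spaces elsewhere in user input (for example between a number and its unit) are left alone. Whether that matches what your users intend is something you would need to check against real content.

```javascript
// Sketch only: these helper names are hypothetical, not part of Prince or any library.

const NBSP = '\u00A0';

// Blunt filter: turn every non-breaking space into a regular space.
function replaceAllNbsp(text) {
  return text.split(NBSP).join(' ');
}

// Narrower filter (assumption: only Hebrew runs need fixing).
// Replaces a non-breaking space only when it sits between two characters
// from the Unicode Hebrew block (U+0590 to U+05FF).
// The lookahead does not consume the following character, so that character
// can act as the left-hand side of the next match in a chain of words.
function replaceNbspBetweenHebrew(text) {
  return text.replace(/([\u0590-\u05FF])\u00A0(?=[\u0590-\u05FF])/g, '$1 ');
}

// Example: the NBSP in "10 cm" is kept, the one between Hebrew words is replaced.
const sample = '10\u00A0cm \u05D0\u05E9\u05E8\u00A0\u05DC\u05D1\u05E0\u05D9\u05D4\u05DF';
console.log(JSON.stringify(replaceNbspBetweenHebrew(sample)));
```

The blunt version is the equivalent of the text-replace rule above; the narrower one trades completeness for a smaller chance of undoing a non-breaking space the user put there on purpose.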