ZWJ (UTF-8, Zero Width Joiner)

bentahar
Hi

First, thank you so much for producing and maintaining such a brilliant tool. I use it all the time :)

I use Prince to produce PDFs for my Kindle from Arabic texts. I often want to use ZWJ instead of a 'kashida', but Prince renders a visible symbol at that position, which looks like an upward arrow with an 'x' head (see the attached image). This is not exclusive to Arabic text, as the attached image shows. A similar problem exists with ZWNJ (Zero Width Non-Joiner).

By the way, will kashidas be used for justification in the next release? That would be very useful for justifying poetry. Thanks. (http://www.princexml.com/roadmap/)

ZWJ.png


ZWJ.xml: (I hope the encoding is preserved)
<?xml version="1.0"?>
<root>
<p>ZWJ: ‍</p>
<p>ZWNJ: ‌</p>
<p>Hello</p>
<p>Hello ‍ again</p>
<p2>هـ (with kashida)</p2>
<p2>ه‍ (with ZWJ)</p2>
<p2>بن‌طاهر (with ZWNJ)</p2>
</root>

ZWJ.css:
root {
  display: block;
  font-size: 60px;
  font-family: tahoma;
}
p, p2 {
  display: block;
  background: skyblue;
  margin-top: 0.25em;
}
p2 {
  direction: rtl;
}
@page {
  size: 800px;
}
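
Since ZWJ and ZWNJ are invisible, it is hard to tell by eye whether they survived a copy and paste of the XML above. A minimal Python sketch for checking is shown here; the function name `find_invisible` and the sample string are only illustrative and are not part of Prince.

```python
import unicodedata

# The invisible characters discussed in this thread.
INVISIBLE = {"\u200b", "\u200c", "\u200d"}

def find_invisible(text):
    """Return (index, code point, Unicode name) for each invisible character."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch))
        for i, ch in enumerate(text)
        if ch in INVISIBLE
    ]

# Same shape as the <p>Hello ... again</p> line in ZWJ.xml.
sample = "Hello \u200d again"
for hit in find_invisible(sample):
    print(hit)
```

Running the same function over the text of ZWJ.xml (read as UTF-8) should list one entry per ZWJ or ZWNJ if the encoding was preserved.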


Best regards,
Bentahar

PS. There is another problem which I will report later. It is related to the placement of diacritics, and seems to depend on the specific font used. I will investigate this further.
  1. ZWJ.png (61.9 kB)
mikeday
Currently Prince does not give the ZWJ and ZWNJ characters any special treatment, so if the font has glyphs for them, they are displayed like any other character. Could you perhaps use a zero width space instead of the zero width non-joiner? That should prevent ligatures from forming without leaving any visible gap, although it will allow a line break between the two characters.
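
As a rough sketch of that workaround, the substitution could be done as a preprocessing step before the document is passed to Prince. This is plain Python string handling, not a Prince feature, and the helper name `zwnj_to_zwsp` is made up for illustration.

```python
ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER
ZWSP = "\u200b"  # ZERO WIDTH SPACE

def zwnj_to_zwsp(text):
    """Replace every ZWNJ with ZWSP, leaving all other characters untouched."""
    return text.replace(ZWNJ, ZWSP)

# The word from the sample document, written with an explicit ZWNJ escape.
word = "بن\u200cطاهر"
converted = zwnj_to_zwsp(word)
print(ZWNJ in converted, ZWSP in converted)
```

Keep in mind the trade-off mentioned above: unlike ZWNJ, the zero width space is a line break opportunity.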

As for zero width joiner, I'm not quite sure how we should connect characters that would not normally be connected. Can you give an example of this for Arabic?
bentahar
Thanks Mike.

The Zero Width Space trick really does work, so thanks for the tip :-)

The reason I tried using ZWJ is to colour parts of a word. (In Latin-based scripts the letters are separate, so no special care is needed for something like: Hello World!)
In the attached example, I have set three lines from a poem: the first is plain, the second uses ZWJ, and the third uses kashida. The desired outcome is the one with ZWJ, but without the spurious characters that appear where the ZWJs are placed.
If this outcome can be achieved without even having to insert the ZWJs, that would be great, i.e. treat words separated only by XML tag(s) as one word, as in the first instances of these verses (see the XML source).
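
To make the ZWJ approach concrete, here is a small Python sketch of the kind of markup I mean. It is only an illustration: the helper name `colour_tail` and the `em` tag are hypothetical (the attached sources may use different element names), and it assumes the split falls between two letters that would normally join.

```python
from xml.sax.saxutils import escape

ZWJ = "\u200d"  # ZERO WIDTH JOINER

def colour_tail(head, tail, tag="em"):
    """Split one word across a tag boundary, with a ZWJ on each side.

    The ZWJ after `head` asks its last letter to keep its joining form,
    and the ZWJ before `tail` does the same for the first letter of `tail`.
    """
    return f"{escape(head)}{ZWJ}<{tag}>{ZWJ}{escape(tail)}</{tag}>"

markup = colour_tail("بن", "طاهر")
print(markup)
```

The resulting element can then be given a different `color` in the CSS.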

Attached is a zip file with the XML and CSS sources. I hope you can help with this.

Many thanks in advance.

Kind regards,
Bentahar
  1. zwj.png (90.5 kB)
  2. zwj.zip (16.4 kB)
mikeday
Thanks for the nice sample document. We will add support for the ZWJ character in Prince 8.1.
mikeday
Prince 8.1 is now available for download and supports the zero-width joiner character.
jim_albright
ZWNJ is working great in Prince 11.1. Thank you again.

Jim Albright
Wycliffe Bible Translators