Troubles with Japanese punctuation
We're having an issue with Japanese font rendering: punctuation is displayed vertically centered instead of aligned to the bottom. My assumption is that this is a font issue, since it works locally on macOS but not on our server running Ubuntu 18.04.
Is there anything specific I can check or that we need to do when rendering these types of characters in Prince? Maybe some CSS we're missing?
Which font are you using, and which font is ending up in the final PDF file? (You can check under document properties in the PDF viewer).
The PDF is being rendered with Open Sans Condensed as the base font and the Japanese text looks to be using UMingCN. The text here is user editable and could end up in any number of languages in case that helps at all.
Would you be able to send me (email@example.com) a sample of the HTML?
I think the issue here is that UMingCN is a Chinese font, and although it includes glyphs for Japanese it isn't really the best choice for this text. You could specify a Japanese font with the font-family property, or you could specify the text language with the lang="ja" attribute if you are using the latest build of Prince. (However, it appears that we will need to update the list of default fonts, as Ubuntu is currently shipping TakaoMincho while we default to Kochi Mincho and IPAMincho.)
I tried adding TakaoMincho and TakaoGothic to the font lookup in fonts.css, but that did not seem to help either. While in there I noticed that the Japanese fonts are using a unicode-range of "U+3040-309F, U+30A0-30FF, U+4E00-9FBF". The particular characters in question are U+3001 and U+3002, which are part of U+3000-303F (CJK Symbols and Punctuation). These fall outside the listed ranges, so they appear to fall through to the next font, the Chinese one, which would explain the font being used.
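To double-check the range arithmetic, here is a small sketch in Python. It is not Prince code and says nothing about how Prince does font matching internally; the helper name in_ranges is made up for illustration, and the ranges are simply copied from the unicode-range quoted above.

```python
# Ranges copied from the unicode-range quoted above
# (Hiragana, Katakana, and part of the CJK ideographs).
JAPANESE_RANGES = [(0x3040, 0x309F), (0x30A0, 0x30FF), (0x4E00, 0x9FBF)]


def in_ranges(ch, ranges=JAPANESE_RANGES):
    """Return True if the character's code point lies in any listed range."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in ranges)


# U+3001 and U+3002 are the punctuation marks in question;
# U+3042 (hiragana), U+30A2 (katakana) and U+65E5 (kanji) are for comparison.
for ch in "\u3001\u3002\u3042\u30a2\u65e5":
    print(f"U+{ord(ch):04X} covered: {in_ranges(ch)}")
```

The two punctuation marks come out as not covered while the kana and kanji samples are, which is consistent with the punctuation being picked up by a later font in the list.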
From what I can tell, different languages prefer these marks either in the middle or at the bottom left, so I'm not sure how we could best handle this with an input box that may contain any language.
Today we have updated the latest build to reference the Takao fonts, and it does not have the unicode-range issue.
However, there is still the problem that if you have user-submitted text in any language, it is difficult to know whether to apply lang="zh" or lang="ja". It might be possible to check for Japanese-specific characters and default to Chinese in their absence. But even this is not as good as knowing the actual user preference.
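As a rough sketch of that heuristic, something like the following could run on the user text before the HTML is generated. Python is used here only for illustration, the function name guess_cjk_lang is hypothetical, and treating hiragana and katakana as the "Japanese-specific characters" is an assumption on my part.

```python
def guess_cjk_lang(text):
    """Guess a lang value for CJK text.

    Returns "ja" if any kana is present, "zh" if only Han ideographs
    are found, and None if the text contains neither.
    """
    has_han = False
    for ch in text:
        cp = ord(ch)
        if 0x3040 <= cp <= 0x30FF:  # Hiragana and Katakana blocks
            return "ja"
        if 0x4E00 <= cp <= 0x9FFF:  # CJK Unified Ideographs block
            has_han = True
    return "zh" if has_han else None


print(guess_cjk_lang("こんにちは、世界。"))
print(guess_cjk_lang("你好，世界。"))
print(guess_cjk_lang("Hello"))
```

The result could then be written into a lang attribute on the element wrapping the user text. Note the limitation: Japanese text written entirely in kanji would be classified as Chinese, which is one more reason an explicit user choice is preferable.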
Thanks Mike! I'll give the latest build a shot and see what happens. We may end up having to do some sort of language selection since, like you said, it's hard to know what language it should be given a user input field that could be anything.