How to convert HTML documents with lots of non-standard Unicode characters?

IZh
Hi!

I have an HTML document that has lots of "non-standard" (non-English) characters:
mathematical symbols, CJK characters, etc. Consider the following example:
---8<----------------------------------
<!DOCTYPE html>
<html>
<head>
<title>Unicode test</title>
</head>
<body>
&zscr;
</body>
</html>
---8<----------------------------------
It displays fine in my browser because I have the Cambria Math font, which contains this character,
but as I understand it, Prince is limited to only a few font families by default. Am I right?
Prince doesn't find the character and replaces it with '?'.

As I understand it, I cannot just write CSS for my fonts because I don't have a single font
that contains all of the characters in my document. So the browser uses one font for some
characters and another font for others.

Is it possible to set Prince up so that, for every character it cannot find, it looks for a font that contains it?

Thank you.
Attachment: unicode-test.html (0.1 kB), sample document
IZh
Yes, I figured out that it's possible to provide several fonts for each family in fonts.css, but it's not easy to find out which font contains a particular character.

It would be better to be able to use all available fonts. The fonts listed in fonts.css would take precedence, but if Prince encountered a character that none of the listed fonts supports, it would use the first font (among all fonts installed on the system) that contains it.
mikeday
You can add a @font-face rule in the default fonts.css so that Prince will include Cambria Math as one of the default serif fonts. There is no way for Prince to search every font on the system, as in many cases this would give unwanted results.
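
As a rough sketch of what such a rule might look like (the font list here is an assumption for illustration; your installed fonts.css will already have its own serif entry with different names, and it is probably safest to append to that existing list rather than copy this verbatim):

```css
/* Sketch only: a serif mapping in Prince's fonts.css.
   "Times New Roman" is an illustrative placeholder for whatever the
   existing rule already lists; "Cambria Math" is appended last so it is
   only tried for glyphs the earlier fonts do not contain. */
@font-face {
    font-family: serif;
    src: local("Times New Roman"),
         local("Cambria Math")
}
```

With a rule along these lines, text set in the generic serif family should fall back to Cambria Math for characters such as &zscr; that the earlier fonts lack.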

If you don't want to modify fonts.css you can just specify multiple font families in your document:
<p style="font-family: Cambria, 'Cambria Math', Arial, sans-serif">
...
</p>