
How do I indicate character encoding?

erelsgl
I ran Prince on an XHTML file and got an error message:

prince: www/za/productdefinition/draft-backup-only-06-06-14.html:9: error: Input is not proper UTF-8, indicate encoding ! Bytes: 0xE4 0xE2 0xE3 0xF8


The encoding really is not UTF-8. However, I do have encoding information in the HTML file:
<meta http-equiv='Content-Type' content='text/html; charset=windows-1255' />


and in the CSS file linked from the HTML file:
@charset "windows-1255";


and I still get this error.

What should I do to define the encoding properly?
mikeday
XHTML documents must be well-formed XML, so the HTML-specific trick of using the <meta> tag to indicate the encoding will not work.

If you are using any encoding other than ASCII, UTF-8 or UTF-16, then you need to specify the encoding in the XML declaration, like this (using your own encoding name, such as windows-1255, in place of iso-8859-1):
<?xml version="1.0" encoding="iso-8859-1"?>

The XML declaration must be the first thing in the XML file. It cannot even be preceded by whitespace or comments, because the XML parser needs to read the encoding before it can figure out how to parse the rest of the file.
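If you have many files to fix, the rule above can be scripted. Here is a minimal Python sketch, not anything built into Prince: the function name with_xml_declaration is made up for illustration, and it assumes the file's encoding is ASCII-compatible (as windows-1255 is) so the declaration can be handled at the byte level. It strips leading whitespace and any existing declaration, but it does not try to remove leading comments.

```python
import re

# Matches an existing XML declaration at the very start of the data.
# The \s after "xml" keeps this from matching <?xml-stylesheet ...?>.
_XML_DECL = re.compile(rb'^<\?xml\s[^>]*\?>')


def with_xml_declaration(data: bytes, encoding: str) -> bytes:
    """Return data with an XML declaration naming `encoding` as its first bytes.

    Hypothetical helper: assumes an ASCII-compatible encoding, so the
    document bytes are passed through untouched.
    """
    # Nothing may precede the declaration, so drop leading whitespace.
    body = data.lstrip(b" \t\r\n")
    # Remove any existing declaration so the result has exactly one.
    body = _XML_DECL.sub(b"", body, count=1).lstrip(b"\r\n")
    decl = f'<?xml version="1.0" encoding="{encoding}"?>\n'.encode("ascii")
    return decl + body
```

To use it, read the XHTML file in binary mode, pass the bytes through with_xml_declaration(data, "windows-1255"), and write the result back in binary mode so the original bytes are not re-encoded.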

With regard to CSS, unfortunately Prince does not yet support the @charset syntax in CSS files, so they must be in ASCII or UTF-8. The exception is CSS rules included within the <style> element of an XHTML file, which can be written in the encoding used by the XHTML document.
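One way to get an external stylesheet into UTF-8 is to transcode it yourself. The following is a rough Python sketch under a few assumptions: the function name css_to_utf8 is hypothetical, the source file really is windows-1255 (Python's codec name for it is cp1255), and any leading @charset rule should be dropped because it would no longer match the file's bytes after conversion.

```python
import re

# A leading @charset rule, e.g. @charset "windows-1255";
_CHARSET_RULE = re.compile(r'^\s*@charset\s+"[^"]*"\s*;\s*', re.IGNORECASE)


def css_to_utf8(data: bytes, source_encoding: str = "cp1255") -> bytes:
    """Transcode stylesheet bytes to UTF-8 and drop a leading @charset rule.

    Hypothetical helper: source_encoding is an assumption about the input
    and must be changed if the stylesheet uses a different encoding.
    """
    text = data.decode(source_encoding)
    # The old @charset would be wrong once the bytes are UTF-8.
    text = _CHARSET_RULE.sub("", text, count=1)
    return text.encode("utf-8")
```

Read the CSS file in binary mode, pass the bytes through css_to_utf8, and write the result to the file that the XHTML document links to.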
erelsgl
This solved the problem!
khelll
I tried everything related to Unicode, but it is still not working.