Clearly, there's a mistake in the line above. It doesn't make sense to Russian somebody's punctuation, does it? To learn what went wrong, you can jump to the last example in this document. Or, if you can bear the suspense, you can read from the top and learn how the 'text-replace' property can help you polish your documents before printing.
Our formatter of choice is Prince, an HTML-and-CSS-to-PDF converter. The screenshots you see, and the PDF documents linked from this guide, have all been generated with Prince. You can easily create the same PDF files by downloading Prince and pointing it to the HTML links provided in this document.
A standard keyboard has the normal dash character (-), but not the longer endash (–) or emdash (—). It's convenient to just type two normal dashes and let the computer replace the string before printing. This is exactly what Prince does with the 'text-replace' property in this code:
    body { prince-text-replace: "--" "\2014" }
    ...
    <p>Mais--oui!
The code \2014 refers to the hexadecimal Unicode number of the emdash character (U+2014). You don't need to memorize these codes: they are easy to look up, or you can simply copy them from this document.
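To try the rule above, the fragment can be placed in a complete file. The following is a minimal sketch rather than a file from the original guide: the file name dashes.html and the use of an embedded style element are assumptions made for illustration.

```html
<!DOCTYPE html>
<html lang="fr">
<head>
<meta charset="utf-8">
<title>Mais oui</title>
<style>
/* Replace two typed hyphens with an emdash (U+2014) before printing */
body { prince-text-replace: "--" "\2014" }
</style>
</head>
<body>
<p>Mais--oui!</p>
</body>
</html>
```

Saved as dashes.html, the file can be converted with a command such as `prince dashes.html -o dashes.pdf`. A browser does not recognize the prince-prefixed property and will simply ignore it, so the two hyphens stay as typed on screen.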
If you prefer the endash over the emdash, this is quickly fixed for the whole document by changing just one character in the style sheet:
    body { prince-text-replace: "--" "\2013" }
    ...
    <p>Mais--oui!
In some countries there is a tradition of adding a space before and after the endash. Here's how to do it:
    body { prince-text-replace: "--" "\2008\2013\2008" }
    ...
    <p>Mais--oui!
The Unicode character \2008 is called a punctuation space, and it traditionally has the same width as the period (full stop).
In French, one typically adds some space before exclamation marks, question marks, colons, and semi-colons. The hair space character \200A is handy for this task. To specify a list of replacements, we just write the pairs of strings one after another.
    body {
      prince-text-replace:
        "!" "\200A!"   /* add hair space before exclamation mark */
        "?" "\200A?"   /* add hair space before question mark */
        ":" "\200A:"   /* add hair space before colon */
        ";" "\200A;"   /* add hair space before semi-colon */
    }
    ...
    <p>Non? Voir: <br>le deux-points; et le point-virgule!
CSS will happily ignore the comments and newlines between the pairs.
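Because the pairs are simply listed one after another, the dash replacement and the French spacing can live in the same declaration. The sketch below is an illustrative combination, not an example from the original guide. It relies on ordinary CSS cascade behavior: two separate 'prince-text-replace' declarations for the same element do not add up, because the later one overrides the earlier one.

```html
<!DOCTYPE html>
<html lang="fr">
<head>
<meta charset="utf-8">
<title>Ponctuation</title>
<style>
/* Illustrative combination of two earlier examples in ONE declaration.
   A second prince-text-replace declaration on body would replace this
   one (normal CSS cascade), not extend it. */
body {
  prince-text-replace:
    "--" "\2014"    /* two hyphens become an emdash */
    "!"  "\200A!"   /* hair space before exclamation mark */
    "?"  "\200A?"   /* hair space before question mark */
}
</style>
</head>
<body>
<p>Non? Mais--oui!</p>
</body>
</html>
```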
In English, the apostrophe is used in contractions (that's) and possessive nouns (Joe's keyboard). Our keyboards only have the typewriter's apostrophe, but in print you probably want to use the typesetter's apostrophe. The typesetter’s apostrophe, I mean. Prince will easily fix this:
    body { prince-text-replace: "'" "\2019" }
    ...
    <p>That's Joe's keyboard.
The 'text-replace' property is very powerful, but also limited. It will only replace one string with another, and you cannot specify wildcards or regular expressions. Therefore, you will not be able to process a pair of typewriter's quote marks:
"Quote marks are nice"
... into proper quotation marks. To achieve this, you should use the <q> element instead:
    <q>Quote marks are nice</q>
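The q element leaves the choice of quotation marks to the style sheet. As a sketch, the standard CSS 'quotes' property can select them. The nested quote and the particular marks chosen here are illustrative assumptions, not part of the original example.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Quotes</title>
<style>
/* First pair: marks for outer quotes; second pair: marks for quotes
   nested inside another quote */
q { quotes: "\201C" "\201D" "\2018" "\2019" }
</style>
</head>
<body>
<p><q>Quote marks are <q>very</q> nice</q></p>
</body>
</html>
```

With this rule, the outer quote is set in double curly quotation marks and the inner one in single curly quotation marks, without any quote characters in the HTML source.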
As can be seen above, the 'text-replace' property makes it easy to polish your punctuation without changing the HTML source. However, we must also stress that the property is powerful to the point of being dangerous. For example, one line of code can change the meaning of the text:
    body { prince-text-replace: "Polish" "Russian" }
    ...
    <p>The Polish coastline on the Baltic sea.
Such changes may have unforeseen geopolitical consequences. Also, words sometimes appear in contexts where you don't expect them, such as in the heading of this document:
    body { prince-text-replace: "Polish" "Russian" }
    ...
    <p>Polish your punctuation with one powerful property.
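One way to limit this kind of accident is to attach the property to a narrower selector than body, so that headings and other text are left alone. The following is a sketch under assumptions: the class name demo is hypothetical, and this guide does not itself prescribe the approach.

```html
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Scoped replacement</title>
<style>
/* "demo" is a hypothetical class name. The rule is not set on body,
   so the h1 below is outside its scope and keeps its wording. */
p.demo { prince-text-replace: "Polish" "Russian" }
</style>
</head>
<body>
<h1>Polish your punctuation with one powerful property</h1>
<p class="demo">The Polish coastline on the Baltic sea.</p>
</body>
</html>
```

Here only the paragraph carrying the class is subject to the replacement. The heading is matched by no 'prince-text-replace' rule, so its text is printed as written.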