Discretionary Hyphens

dauwhe
Hi Mike,

Could you describe how Prince processes a word containing a discretionary hyphen? We've been doing some testing, and it appears that Prince sometimes removes *some* of the dictionary-defined break points from such a word.

I'm asking because some users are complaining that discretionary hyphens are sometimes ignored. I wonder how much of this happens because the justification algorithm decides that the desired break point is not allowed.

Here's a little test file I'm using. In each row, the first column is the word with the discretionary hyphen, and the second column shows (with an asterisk) where the discretionary hyphen sits in the word. The entries at the top appear to be the dictionary patterns that control how this word hyphenates (although I'm sure there's more going on there).

<!DOCTYPE html>
<html>
<head>
<title>Discretionary Hyphens</title>
<style>
  @page {
   size: letter;
   margin: 1in;
  }
  
  body { hyphens: prince-expand-all; }
  
  p {
    text-align: justify;
    font-size: 10pt;
    margin: 0;
    padding: 0;
  }
  
  span { hyphens: auto; }
</style>
</head>
<body>
<h1>Hyphens</h1>

<p>su2r</p>
<p>pri4s</p>
<p>s3ing</p>
<p>1ly</p>
<p> </p>
<p>surprisingly</p>
<p>s&shy;urprisingly | <span>s*urprisingly</span></p>
<p>su&shy;rprisingly | <span>su*rprisingly</span></p>
<p>sur&shy;prisingly | <span>sur*prisingly</span></p>
<p>surp&shy;risingly | <span>surp*risingly</span></p>
<p>surpr&shy;isingly | <span>surpr*isingly</span></p>
<p>surpri&shy;singly | <span>surpri*singly</span></p>
<p>surpris&shy;ingly | <span>surpris*ingly</span></p>
<p>surprisi&shy;ngly | <span>surprisi*ngly</span></p>
<p>surprisin&shy;gly | <span>surprisin*gly</span></p>
<p>surprising&shy;ly | <span>surprising*ly</span></p>
<p>surprisingl&shy;y | <span>surprisingl*y</span></p>
</body>
</html>


Thanks,

Dave
dauwhe
Ah, forgot the PDF that results from the above HTML.
Attachment: discretionary-hyphens-001.pdf (29.1 kB), PDF result
mikeday
Good question, I will take a look at this and get back to you.
jbzech
I also struggle with this occasionally. For me, the issue tends to be compound words where the editorial rule is to hyphenate between the two words. Today's example:

Business-
people blah blah


which was hyphenating as

Businesspeo-
ple blah blah


Technically the hyphenation is not wrong, but editorially it is. I can turn off hyphenation for the word with a span.no-hyphen, but then my lines get very, very loose.

My understanding of the hyphenation algorithm suggests that I should be able to add

.bus6i6ness5peo6ple. 


to the .dic file to fix it. But that doesn't work. The only thing that works seems to be a much shorter bit, such as

peo6pl


which I am afraid may have other ramifications on the hyphenation in my book.

How are discretionary hyphens supported in PrinceXML, if they are?

Is there a good resource that explains the rules behind the hyphenation algorithm?
dauwhe
In my experience, discretionary hyphens usually (but not always) work as you hope. Add &shy; where you want the word to break.
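
For a whole book, adding &shy; by hand gets tedious, so here is a minimal Python sketch that inserts the soft hyphen character (U+00AD, which is what &shy; stands for) at an editorially preferred break. The function name and the `|` marker convention are my own assumptions, not anything Prince defines, and whether Prince honors the resulting break is still subject to the caveat above.

```python
import re

SOFT_HYPHEN = "\u00ad"  # the character that the &shy; entity stands for


def add_discretionary_hyphens(text, preferred_breaks):
    """Insert soft hyphens into every whole-word match of the listed words.

    preferred_breaks is a list of words in which '|' marks the wanted
    break, e.g. "business|people" (a hypothetical convention).
    """
    for marked in preferred_breaks:
        plain = marked.replace("|", "")
        pattern = re.compile(r"\b" + re.escape(plain) + r"\b", re.IGNORECASE)

        def repl(match, marked=marked):
            found = match.group(0)  # keeps the capitalization used in the text
            out, i = [], 0
            for ch in marked:
                if ch == "|":
                    out.append(SOFT_HYPHEN)
                else:
                    out.append(found[i])
                    i += 1
            return "".join(out)

        text = pattern.sub(repl, text)
    return text


result = add_discretionary_hyphens("Businesspeople blah blah", ["business|people"])
print(repr(result))
```

Running the text through a step like this before Prince keeps the preferred breaks in one list instead of scattered through the source.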

I think Prince uses the TeX justification algorithm. If you are brave, buy a copy of The TeXbook and read away!
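
Short of reading the book, the pattern-matching part of TeX hyphenation (Liang's algorithm) can be sketched in a few lines of Python. This is only an illustration under assumptions: it uses just the four patterns listed at the top of the test file earlier in the thread, it ignores the minimum fragment lengths that TeX enforces at the start and end of a word, and nothing here is confirmed to match what Prince does internally. The function names are mine.

```python
import re


def parse_pattern(pattern):
    """Split a pattern such as 'su2r' into its letters and inter-letter values."""
    letters = re.sub(r"\d", "", pattern)
    values = [0] * (len(letters) + 1)
    slot = 0
    for ch in pattern:
        if ch.isdigit():
            values[slot] = int(ch)  # digit applies to the slot before the next letter
        else:
            slot += 1
    return letters, values


def break_positions(word, patterns):
    """Return the indexes i where a break before word[i] is allowed."""
    dotted = "." + word.lower() + "."  # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)
    for pattern in patterns:
        letters, values = parse_pattern(pattern)
        start = dotted.find(letters)
        while start != -1:
            for offset, value in enumerate(values):
                # Where patterns overlap, the highest value wins.
                scores[start + offset] = max(scores[start + offset], value)
            start = dotted.find(letters, start + 1)
    # scores[i + 1] is the slot just before word[i]; odd means "may break".
    return [i for i in range(1, len(word)) if scores[i + 1] % 2 == 1]


def hyphenate(word, patterns):
    pieces, last = [], 0
    for pos in break_positions(word, patterns):
        pieces.append(word[last:pos])
        last = pos
    pieces.append(word[last:])
    return "-".join(pieces)


patterns = ["su2r", "pri4s", "s3ing", "1ly"]
print(hyphenate("surprisingly", patterns))  # surpris-ing-ly
```

With only these four patterns, the odd values in s3ing and 1ly allow breaks, while the even values in su2r and pri4s forbid them. A real dictionary has thousands of patterns, so the actual break points for the word will differ.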

jbzech
I believe the last time I brought this up, Mike referred to it as a black box I'd do well not to look into...

I did try &shy; in this case, and it did not work. I guess the hyphenation calculations somehow overruled it. Still, I didn't realize it was an option until today, so I'll have it to fall back on in the future.

Thanks, dauwhe.
abc
Hi Mike,

Has the issue raised in the initial post been addressed yet?

Abc
jim_albright
Try higher numbers. There may already be matching words or patterns in the file. I have found that the TeX hyphenation works. I started with a sorted list of ALL the words in the file and then added the hyphen points.

You can then create your own hyphen file to use.
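
Here is a small Python sketch of why higher numbers can matter in TeX-style patterns: where several patterns cover the same slot between two letters, the highest value wins, and only an odd result allows a break. The competing pattern `o7p` is purely hypothetical, made up to show the effect. I have not checked what the real .dic file contains, so treat this as one possible explanation for the businesspeople case, not a confirmed one.

```python
def slot_values(word, patterns):
    """Return the winning value for each slot; index i is the slot before word[i]."""
    dotted = f".{word}."  # dots mark the word boundaries
    scores = [0] * (len(dotted) + 1)
    for pattern in patterns:
        letters = "".join(c for c in pattern if not c.isdigit())
        values, slot = [0] * (len(letters) + 1), 0
        for c in pattern:
            if c.isdigit():
                values[slot] = int(c)
            else:
                slot += 1
        start = dotted.find(letters)
        while start != -1:
            for k, v in enumerate(values):
                scores[start + k] = max(scores[start + k], v)  # highest value wins
            start = dotted.find(letters, start + 1)
    return scores[1:len(word) + 2]


word = "businesspeople"
peo_ple = word.index("ple")     # slot for the unwanted break "businesspeo-ple"
existing = ["o7p"]              # hypothetical pattern already in the file

with_six = slot_values(word, existing + [".bus6i6ness5peo6ple."])
with_eight = slot_values(word, existing + [".bus6i6ness5peo8ple."])

print(with_six[peo_ple])    # 7: odd, so the break is still allowed
print(with_eight[peo_ple])  # 8: even, so the break is now forbidden
```

In both runs the slot between "business" and "people" keeps its odd value of 5, so the editorially preferred break stays available.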

Jim Albright
Wycliffe Bible Translators

abc
Okay, thanks. I will give it a go.