Thousands of Letters Dropped

scott_w
I have attached two PDFs. The one titled "dropped_letters" was generated by Prince; the second, "dropped_letters_reducedSize", is a copy of the first that I saved as a reduced-size PDF. Reducing the file size drops letters.

As an example, look at page 13 of "dropped_letters.pdf" and find the title "Influencing Others through Literacy." Use the text edit tool in Acrobat to copy this text and paste it into a text editor. What you get when you paste is "Influencing Others through iteracy": the "L" is missing, even though you can see it in the PDF. Now save as a reduced-size PDF (File/Save As/Reduced Size PDF) and you will see the letter "L" disappear along with a lot of other letters. In one of my 300-page test files it deleted thousands of letters.

I believe this has something to do with how the PDF is created by Prince and is not a bug in Adobe Acrobat. I'm running tests against non-Prince-created files to verify this.
  1. dropped_letters.zip (633.2 kB)
dauwhe
I've been unable to open the zip file.

What fonts are you using?
mikeday
I can't reproduce this problem in Acrobat 7.0 Pro or Adobe Reader 9; in both cases, when I copy and paste the text, there are no missing letters. However, neither of these versions has a "save as reduced PDF" option as far as I can see, so I can't test this aspect. Which Acrobat version are you using?
scott_w
The font we are using is a version of Palatino extended by our font team; it is Unicode/OpenType compliant.

I have tested the file using Acrobat Pro 7.0, which does have a Reduce File Size option under the File menu. My tests with 7.0 did not show the error. I tested it in Acrobat 10.0.0 and it behaved as I described in my earlier post: copying and pasting text showed missing characters, and reducing the file size deleted characters from view. I then tested it in 10.0.3: copying and pasting still resulted in missing characters, but reducing the file size DID NOT delete characters.

This is very odd. It may be an Acrobat bug; we are still looking into it and have not yet been able to duplicate it with a non-Prince-generated PDF. Thoughts?
scott_w
I ran some more tests trying different fonts and testing the results in Acrobat 10.0.0. I have tested three TrueType fonts and three OpenType fonts. The TrueType fonts did not give any problems; the OpenType fonts DID have missing letters.
mikeday
If you can recreate this with a very simple test document, e.g. one containing a single word, and send us the document and font that cause the issue, we may be able to figure out what is going wrong.
scott_w
I was able to create a very small file that fails. The key to the failure is the occurrence of "fl" and "fi", two ligature characters. As I looked at my original sample it appears that on every line where there was an "fl" or "fi" there was a missing character.

I tested this using fonts on the server and on my local drive. I'm supplying three fonts to try; their names are included in the CSS file, so just comment and uncomment them to switch between fonts. I'm also attaching a few PDFs so you can see what I'm seeing.

As previously mentioned, when you copy and paste the text out of the PDF into a text editor, some text is missing: the "X" at the end of the two lines. However, when you save as a reduced-size PDF (File/Save As/Reduced Size PDF) in Acrobat 10.0.0, you will not always see the characters drop, even though they are missing in a copy-and-paste test. For some fonts they drop, for others they stay, and some throw an error and replace all of the text with dots.

I have one of my font guys in next week and we will look at this with him as well. We are still not sure where this is happening; we haven't yet seen it with PDFs generated from InDesign using the same fonts. Hopefully you will be able to help us track this down, or at least eliminate Prince as a variable.

Just an FYI: we registered my copy yesterday and ran the tests again. Some of the samples I sent you were run before the registration file was updated.
  1. dropped_letters_testFiles.zip (882.5 kB)
mikeday
The ligature aspect could be a helpful clue. Can you confirm that this only affects OTF fonts and not TTF fonts? Unfortunately I still can't reproduce the cut and paste dropping characters issue with any of those PDF files, having tested in Acrobat 7.0, Adobe Reader 9, and Adobe Reader X.
scott_w
Yes, it appears to be only OTF.
mikeday
And when you open the PDFs you uploaded that use the OTF font, do you get the copy/paste issue in Adobe Reader?
scott_w
No. Adobe Reader does not give me the copy/paste error.
mikeday
But Acrobat 10 does?
scott_w
Yes, Acrobat Pro 10.0.0 does. We have some machines with 10.0.3, where the text doesn't disappear when you reduce the file size, but copy/paste still shows missing text.
jim_albright
Are the ligatures automatically created by Prince or not?

Jim Albright
Wycliffe Bible Translators

mikeday
Yes, Prince will automatically apply ligature substitutions found in OpenType fonts.
scott_w
I just got off the phone with Adobe. We have run some more tests here and it looks like it might be Adobe Acrobat.
scott_w
It has been almost four years and Adobe still hasn't addressed this bug. The cutting and pasting portion has been fixed, but when you optimize or reduce the PDF file size using Acrobat's tools the characters still drop. Because one of our divisions produces its own fonts, we thought it might be an issue with how we create our fonts, but we haven't been able to track it down. None of our TrueType or OpenType TT fonts exhibit this problem, while most but not all of our OpenType PS fonts do.

Recently we even found that fonts we haven't built have the same issues. I am going to assume that you have access to the Minion Pro and Myriad Pro OpenType PS fonts. Could you verify that you are seeing what we are, and let us know whether you feel this is a PrinceXML issue or something else?

Here is the HTML5 test file:

<html>
	<head>
		<title>PDF Font Ligature Test</title>
		<meta charset="UTF-8" />
	</head>
	<body>
		<p>PDF Font Ligature Test</p>
		<section style="font-family:'MyriadPro-Regular', 'Myriad Pro';">
			<p>Myriad Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
		<section style="font-family:'MinionPro-Regular', 'Minion Pro';">
			<p>Minion Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
	</body>
</html>


We just run it with no other CSS applied (other than the Prince default). This produces a PDF with three fonts embedded: Times New Roman, Myriad Pro and Minion Pro.

When you optimize or reduce the file size of the PDF using Acrobat (11.0.0), the ligature characters (fl and fi) are swapped for non-ligature characters and the letter "X" is dropped.

Also, if you take the PDF that Prince created and compress it using the online service http://smallpdf.com/compress-pdf, all sorts of characters in the two OpenType PS fonts get blown away.

There is a workaround, though. If I set "font-variant:prince-opentype(ccmp)" on the body element of the above test file, as in the example below, the above tests do not produce any negative results.

<html>
	<head>
		<title>PDF Font Ligature Test</title>
		<meta charset="UTF-8" />
	</head>
	<body style="font-variant:prince-opentype(ccmp);">
		<p>PDF Font Ligature Test</p>
		<section style="font-family:'MyriadPro-Regular', 'Myriad Pro';">
			<p>Myriad Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
		<section style="font-family:'MinionPro-Regular', 'Minion Pro';">
			<p>Minion Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
	</body>
</html>
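
The same workaround could probably be written as a stylesheet rule instead of an inline attribute, which would allow it to be limited to the sections set in OpenType PS fonts. What follows is only a sketch: the class name "otps-workaround" is made up for illustration, and it assumes Prince treats "font-variant: prince-opentype(ccmp)" in a style rule the same way it does inline on body. It is untested against Acrobat's Reduced Size PDF step.

```html
<html>
	<head>
		<title>PDF Font Ligature Test</title>
		<meta charset="UTF-8" />
		<style>
			/* Hypothetical class name. The declaration itself is the one
			   applied to body in the workaround file above. */
			.otps-workaround { font-variant: prince-opentype(ccmp); }
		</style>
	</head>
	<body>
		<p>PDF Font Ligature Test</p>
		<section class="otps-workaround" style="font-family:'MyriadPro-Regular', 'Myriad Pro';">
			<p>Myriad Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
		<section class="otps-workaround" style="font-family:'MinionPro-Regular', 'Minion Pro';">
			<p>Minion Pro Regular (OpenType-PS)</p>
			<p>flX</p>
			<p>fiX</p>
		</section>
	</body>
</html>
```

If this behaves like the body-level version, text outside the two sections would keep Prince's default ligature handling, so fonts that do not show the problem would not have to lose their ligatures.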


This is a really big problem for us. We currently produce content in over 100 languages using traditional publishing software and are ramping up to match that using PrinceXML; we have over 30 languages being output by PrinceXML in testing now and are moving to a server license. Having to turn the ligatures off to work around this issue is a problem.

Any help would be greatly appreciated.
dauwhe
Just tried this with Prince 9.0r5 and Acrobat 10.1.13 on Mac OS 10.9.5. Attached is the result of saving the resulting PDF at a reduced size.

Is all this happening on the same machine? Are the fonts available to Prince also available to Acrobat?

Thanks,

Dave
  1. font-reduced.pdf (198.1 kB)
scott_w
Using Prince 9.0r2 and Acrobat 11.0.0 on Mac 10.10.1.

Your PDF shows the OpenType fonts "Embedded" and "Embedded Subset" but my PDF only shows them "Embedded".

However when I reduce your PDF using Acrobat/File/Save As Other.../Reduced Size PDF... the letter "X" drops again. (See attached)

And when I used the http://smallpdf.com/compress-pdf to compress it I also lost lots of type.
  1. SW_font-reduced.pdf (38.2 kB)
scott_w
Sorry, yes: this is happening on multiple machines, and the fonts are available to Prince and Acrobat.
mikeday
Sorry about this. We do hope to work around these PDF viewer bugs by changing the way Prince embeds OpenType fonts with PostScript/Type1 outlines.
scott_w
Do you have any idea how long it might be before we see this corrected? Is "font-variant:prince-opentype(ccmp)" the safest workaround if we have to use OpenType fonts?

Some of our OpenType PS fonts are working and some are not. Is there something we can change in the font, other than converting them to TrueType, that would work?
mikeday
At the moment I don't think there is anything you can change in the font, besides converting the whole thing to TrueType outlines.