
Embedded HTML not printed

Hi,
in my html pages I use

<embed src="url" width="100%" height="300">

The embedded HTML is not printed to PDF using Prince.

Please advise
Best regards
Stefan
mikeday
Prince does not support the embed element in this way, although there is partial support for iframe.
mrwarper
Just an idea: add a script that issues an XMLHttpRequest for the src attribute of every <embed> element, then replaces each element with the contents of whatever is linked. I think I can dig up an example of something similar if it's really necessary.
Hi mrwarper,
please provide an example.
Thanks
Stefan
mrwarper
This is an example I use in real life, trimmed down to the very basics. (I checked it in a browser just in case I broke something with the trimming -- it works fine.)

XMLHttpRequest has always had problems with local filesystems (because there are no HTTP transfers), so you probably want to test it on a local web server before moving on to other stuff, modifying, etc.

In index.htm you'll find a <script> tag in the head linking to merger.js, and another one triggering the function LoadExternal later in the body. The latter is not necessary if you enable window.addEventListener at the end of merger.js to trigger LoadExternal when the body finishes loading -- it is only there to start loading the external stuff earlier and avoid the wait.

You'll see the LoadExternal function in merger.js looks for an <a> element with the id "retrieveme" (pointing to external.htm in this case), and sets its target for download, with the function InjectExternal to be executed when the data arrives. InjectExternal is the one that does the real replacement, by looking up #retrieveme again.

While LoadExternal should be easy enough to modify to retrieve more files from different elements' pointers, the current InjectExternal doesn't know which element it was invoked for (that is, which element it must replace when injecting its data), hence the initial lookup. If you were to invoke it several times, you should devise a way to connect target elements and downloaded data sets before replacing like crazy.
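
One possible way to connect each target element with its own download is to start every request from a small helper function, so the callback keeps a reference to the element it belongs to. The following is only a minimal sketch applied to the <embed> case from the original question, not the code in merger.zip: the names replaceEmbeds, loadInto, doc and createRequest are made up for illustration, and the optional createRequest parameter exists only so the function can be tried without a browser.

```javascript
// Hypothetical sketch: replace every <embed src="..."> with the markup
// found at its src. Not part of merger.js; all names are assumptions.
function replaceEmbeds(doc, createRequest) {
  var embeds = doc.getElementsByTagName('embed');

  // Copy the (possibly live) collection first, because replacing
  // elements later would shift the indices of a live collection.
  var targets = [];
  for (var i = 0; i < embeds.length; i++) {
    targets.push(embeds[i]);
  }

  // Each call gets its own `target` and `xhr`, so the callback always
  // knows which element its data belongs to, whatever the arrival order.
  function loadInto(target) {
    var xhr = createRequest ? createRequest() : new XMLHttpRequest();
    xhr.open('GET', target.getAttribute('src'), true);
    xhr.onreadystatechange = function () {
      if (xhr.readyState !== 4 || xhr.status !== 200) {
        return; // not finished yet, or the download failed
      }
      var wrapper = doc.createElement('div');
      wrapper.innerHTML = xhr.responseText;
      target.parentNode.replaceChild(wrapper, target);
    };
    xhr.send(null);
  }

  for (var j = 0; j < targets.length; j++) {
    loadInto(targets[j]);
  }
}
```

In a page this would be called as replaceEmbeds(document) once the body has loaded. The downloaded markup is injected unfiltered here, so anything coming from the imported <head> would still need to be cleaned up separately.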

Hope this helps :)
  1. merger.zip (2.1 kB)
hallvord
Hi mrwarper, nice work :)
I would like to suggest that you consider adding this to the prince-scripts repository, let's make a new "utils" section for it: https://github.com/yeslogic/prince-scripts/

One minor comment: the beHead() method might not do what you probably expect since it limits itself to the "outer" elements. If I sent you something like
<p><script>foo()</script></p>

as far as I can tell from skimming the code, this method would just see the P tag and thus not remove the SCRIPT. That might not be the behaviour you wanted. Consider something like
var elms = externalContents.getElementsByTagName('*');
for(var i = elms.length -1; elms[i]; i--){ ... }
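
Filled in, the backwards loop suggested above might look like the following. This is only a sketch of the idea and not code from merger.js: the function name removeByTagName and its parameters are assumptions, and it asks for the unwanted tag directly instead of walking '*' and filtering.

```javascript
// Hypothetical helper: remove every descendant of `root` with the given
// tag name, at any nesting depth. Names are made up for illustration.
function removeByTagName(root, tagName) {
  var elms = root.getElementsByTagName(tagName);
  // Walk backwards: in a real DOM the collection is live and shrinks as
  // elements are removed, so counting down keeps the index valid.
  for (var i = elms.length - 1; i >= 0; i--) {
    elms[i].parentNode.removeChild(elms[i]);
  }
}
```

With the <p><script>foo()</script></p> example, removeByTagName(externalContents, 'script') would remove the SCRIPT and leave the P in place.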


mrwarper
Thank you, Hallvord :)

Please feel free to add the code above to your repository -- the purpose of publishing it here is to try and help others, so as long as you give credit I can't see why you shouldn't do the same.

I guess you have had a deeper look at the code by now, but in case you haven't: the purpose of the function beHead is to remove any HTML elements that might live in the <head> of the imported file (and nowhere else) before injecting external data into the main document.

Imported file data have neither <head> nor <body> tags; instead you get a concatenation of their contents -- so unless you remove them prior to injecting the data into place, you will end up with an extraneous <title> and other elements coming from the imported <head> in your main HTML. Thus I'm trying to sanitize only elements that would be direct descendants of that <head>, and if, for whatever reason, <script>s lived buried somewhere in the <body>, I'm fine with that.
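
A top-level-only cleanup of this kind could be sketched roughly as follows. This is not the actual beHead from merger.zip: the function name stripHeadElements and the list of tag names are assumptions chosen for illustration.

```javascript
// Hypothetical list of elements that normally belong in <head>.
var HEAD_ONLY = { TITLE: true, META: true, LINK: true, BASE: true, STYLE: true, SCRIPT: true };

// Remove head-type elements from the direct children of `container`
// only; anything nested deeper is deliberately left untouched.
function stripHeadElements(container) {
  var child = container.firstChild;
  while (child) {
    var next = child.nextSibling; // remember before a possible removal
    if (child.nodeType === 1 && HEAD_ONLY[child.nodeName.toUpperCase()]) {
      container.removeChild(child);
    }
    child = next;
  }
}
```

Note that a <script> or <style> that was a direct child of the imported <body> would be removed as well, since after the concatenation it sits at the same level as the former <head> contents.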

I suppose that function of mine is not the pinnacle of data sanitation, but it's not fed uncontrolled HTML and I haven't had a problem with it so far. However, I'll gladly welcome any further comments addressing obvious shortcomings or oversights ;)