Problem: "transfer closed with 332 bytes remaining to r

elephant
Setup: A simple Python server which serves HTML and CSS files.

A simple script that calls Prince as described in the docs results in the following call:

import subprocess

# Arguments: the Prince executable, the input URL, and the output PDF path.
call = ['prince', 'http://localhost:8082/test.html', 'test.pdf']
subprocess.call(call)


Running this gives the following error on the command line:

prince: http://localhost:8082/test.html: error: transfer closed with 13 bytes remaining to read
prince: http://localhost:8082/test.html: error: could not load input file
prince: error: no input documents to process


The page loads properly in a browser, though. Output from LiveHTTPHeaders in Firefox for the same file is:
http://localhost:8082/test.html

GET /test.html HTTP/1.1
...

HTTP/1.x 200 OK
Server: SimpleHTTP/0.6 Python/2.5.1
Date: Thu, 20 Sep 2007 15:35:13 GMT
Content-Type: text/html
Content-Length: 397
Last-Modified: Thu, 20 Sep 2007 15:17:06 GMT


If I override the default Python server and add

Content-Type: text/html; charset=utf-8


(that is, simple encoding information), it seems to work and Prince can fetch the page. The problem continues for a CSS file which is referenced from the HTML with

<link rel="stylesheet" type="text/css" href="/css/test.css" />


which again results in a similar error output from prince:

prince: http://localhost:8082/test.css: warning: transfer closed with 332 bytes remaining to read


My question is: how does Prince differ from browsers in handling CSS and HTML fetching via HTTP? Does it always assume UTF-8 if nothing is given, and therefore fail to load these files?

Maybe this is not a Prince bug at all, but to solve my problem I have to understand why this happens (I might not be able to rewrite every server to send correct encoding information for CSS files). I also think that UTF-8 is not the default encoding for HTML (it should be ISO-8859-1, I think) and maybe not for CSS either (although it might be). BTW, adding @charset "ascii" at the beginning of the stylesheet does not help.

Thanks a lot!
mikeday
The actual encoding does not matter, as Prince does not attempt to decode the text until the entire file has been retrieved from the server. So it must be some kind of HTTP related issue, perhaps an incompatibility between the curl library that we are using and the HTTP server. Perhaps you could try running the curl utility from the command-line on your URL and seeing if it works?
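
For anyone who would rather script the check than run curl by hand, here is a minimal sketch in Python 3. The function name check_content_length is made up for this illustration, and the sketch assumes a plain-HTTP URL with no query string. It compares the Content-Length header with the number of body bytes that actually arrive before the connection closes.

```python
import http.client
from urllib.parse import urlsplit


def check_content_length(url, timeout=10):
    """Return (declared, received) byte counts for a plain-HTTP GET of url.

    Hypothetical helper for illustration; the query string is ignored.
    """
    parts = urlsplit(url)
    conn = http.client.HTTPConnection(parts.hostname, parts.port or 80,
                                      timeout=timeout)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        header = resp.getheader("Content-Length")
        try:
            body = resp.read()
        except http.client.IncompleteRead as exc:
            # The server closed the connection before the declared number
            # of bytes arrived; exc.partial holds what did come through.
            body = exc.partial
    finally:
        conn.close()
    declared = int(header) if header is not None else None
    return declared, len(body)
```

If the server declares more bytes than it sends, the two returned numbers will differ, and the difference should correspond to the "bytes remaining to read" figure in the error message.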
elephant
Sorry, I found the actual problem, which is in the Python 2.5.1 SimpleHTTPServer and not in Prince. It may even be best to remove this topic from the forum, as it is not Prince-related.

Sorry again.
mikeday
That's okay, many integration problems are like that. It's worth leaving it on the forum in case other people experience similar issues.
elephant
So just a few remarks on why we had problems:

- Python 2.4 SimpleHTTPServer sends a different value for the HTTP header "Content-Length" than Python 2.5 SimpleHTTPServer for "text/*" content types on Windows when "\r\n" line breaks are used. Actually Python 2.4 is correct and 2.5 is not; I still need to report this problem. Python 2.5 seems to compute the Content-Length counting each \r\n as 2 characters, but sends the file with \n line breaks only, so the actual length is lower. Hence the Prince message "...x bytes remaining to read".

- On the "client side", Prince (and BTW curl) seem to be very picky about the correct Content-Length being sent. Browsers like Firefox seem to ignore it when the reported Content-Length and the actual length differ.
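
To make the mismatch concrete, here is a small sketch. It does not use SimpleHTTPServer itself; it only models the behaviour described above, on the assumption that the length is taken from the on-disk size while the body is read in Windows text mode. The function name simulate_text_mode_serving and the sample stylesheet are made up for this illustration.

```python
def simulate_text_mode_serving(raw):
    """Model a server that sizes a file in binary but sends it in text mode.

    Returns (declared_length, bytes_sent, bytes_missing).
    """
    # On-disk size: every CRLF line break counts as two bytes.
    declared = len(raw)
    # A Windows text-mode read collapses each CRLF to a single LF.
    sent = raw.replace(b"\r\n", b"\n")
    return declared, len(sent), declared - len(sent)


# A three-line stylesheet saved with Windows line endings.
css = b"body {\r\n  color: black;\r\n}\r\n"
declared, sent, missing = simulate_text_mode_serving(css)
print("declared=%d sent=%d missing=%d" % (declared, sent, missing))
```

Under this model the shortfall is exactly one byte per line break, so the "13 bytes remaining" reported for test.html would presumably correspond to 13 line breaks in that file.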

@Mike
When I think more about the problem, this may still be seen as a bug in Prince though. The CSS spec says to use any styles in a stylesheet even if they are not correctly closed. In this case the stylesheet is even correct; only the transfer is not. One could argue Prince (and curl) should actually use the style rules they get, even if the file they receive is smaller than the HTTP Content-Length says it should be.
I guess it would be better if Prince printed the warning as it does now but still used the CSS if it is valid.

Hope I am not talking gibberish here...
elephant
Sorry about the multiple posts; I kept getting a PHP error saying something like "cannot send email", so I thought the post did not come through...
mikeday
Right, at the moment if a resource has not been fully downloaded from HTTP with no errors then Prince will not use it at all. It's true that for CSS this might be a bit too cautious, but for images and XML documents they generally won't be usable if the end of the file has been chopped off. For now we would probably wait and see if there are more instances of this server behaviour in the wild before changing Prince, as it would be nice to be able to rely on the Content-Length header.
elephant
The handling of files via HTTP is completely reasonable and, as there is a workaround for my particular problem, completely satisfactory. I guess it is just good to be aware of the way Prince handles this, and maybe there is a possibility to improve on this sometime :)
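
For anyone hitting the same thing: one possible server-side approach, which is only a sketch and not necessarily the workaround referred to above, is to read the file in binary mode and derive the Content-Length from the bytes that will actually be sent. The helper name load_for_http is made up for this illustration.

```python
def load_for_http(path):
    """Read a file in binary mode and build a matching Content-Length header.

    Hypothetical helper: because the body is never newline-translated,
    the declared length always equals the number of bytes sent.
    """
    with open(path, "rb") as f:
        body = f.read()
    headers = {"Content-Length": str(len(body))}
    return headers, body
```

A custom request handler could then send these headers followed by the body unchanged, so strict clients such as Prince and curl receive exactly the number of bytes they were told to expect.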