Problem outputting PDFs from HTML in .NET web app

okeeffe
I'm working on a proof of concept to convert some pages on our web site (Sitecore, ASP.NET 4.0) to PDF. We have a print stylesheet that we'd like to apply so that (theoretically) the PDF would look similar to the print version of the page. Here's the code I have so far:
// Path to the Prince executable on this machine.
Prince prince = new Prince("C:\\Program Files (x86)\\Prince\\engine\\bin\\prince.exe");
prince.SetLog(ConfigurationManager.AppSettings["logfile"]);
prince.SetHTML(true); // parse the input as HTML rather than XML
prince.AddStyleSheet(ConfigurationManager.AppSettings["stylesheet"]);
// Set the content type before the PDF is written to the response.
Response.ContentType = "application/pdf";
prince.Convert(Request.Url.ToString(), Response.OutputStream);

When this runs, a PDF is output, but it only contains about 20% of the page. I've run this code against many different web pages and can't get a PDF that contains all of the page content. I assume the issue is malformed markup or a JavaScript file getting in the way. The log file records a number of markup errors, but I don't know how these affect what ends up in the PDF file. Here's an example of the errors logged:
error: htmlParseEntityRef: expecting ';'
error: Unexpected end tag : br
error: Unexpected end tag : br
error: Unexpected end tag : p

Would we be better off not trying to convert the live HTML and instead creating a print-friendly version of the page to convert? I'm feeling a bit lost at this point trying to figure out how PrinceXML converts web pages and how much the formatting and styles get in the way. If anyone has any suggestions or can describe how you convert web pages in your .NET web site, I'd be most grateful.
mikeday
20% vertically or horizontally? :)

Probably the web page is one big absolutely positioned block or float that cannot be easily split across multiple sheets of paper. It may be helpful to tweak some of the CSS in a print-specific style sheet to avoid this.
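
As a rough sketch of that kind of tweak, the rules below put the main layout containers back into normal flow so their content can break across pages. The selectors (#wrapper, #content, .column) are hypothetical placeholders, since I haven't seen the actual markup; substitute whatever elements on your page are absolutely positioned or floated. These rules would go in the print style sheet you already pass to AddStyleSheet.

```css
/* Hypothetical selectors: replace with the site's real layout containers. */
#wrapper,
#content,
.column {
    position: static;   /* undo absolute or fixed positioning */
    float: none;        /* let the block flow normally so it can be split */
    width: auto;
    height: auto;       /* a fixed height can clip content */
    overflow: visible;  /* avoid hiding content that overflows the box */
}
```

If more of the page appears in the PDF after a change like this, the layout blocks are the likely culprit and the selectors can be narrowed from there.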