Printing InfoPath forms that have rich text boxes may add blank pages


Consider the following scenario. You create a Microsoft InfoPath form template and add one or more Rich Text Box controls to the form. A user fills out the form and adds enough content to the rich text box to fill more than a single printed page. When the form is printed, the content in the rich text box may not be completely printed. Blank pages may be included in the printed content, and the contents of the rich text box may overlap other controls or content in the form.


This problem occurs because of a design limitation in InfoPath. InfoPath displays the XML data in a form by transforming it into HTML content and then rendering that HTML content by using the Windows HTML engine, MSHTML. Typically, this works as expected. XML data can easily be transformed into HTML, HTML can easily be rendered by MSHTML, and HTML provides a wide and flexible range of rendering options. However, in some cases, a display that is optimized for web browsers is not the ideal behavior for forms.

When a large amount of text is entered in a rich text box in InfoPath, the control may expand across a page boundary. When a field spans a page boundary, the MSHTML engine automatically moves the beginning of the field to the top of the next printed page. The MSHTML engine does this to keep the printed data for each field together as much as it can. For example, consider an image that is added to a Picture control. It would not be desirable to print half of the picture on one page and the other half on another page. MSHTML cannot determine whether the rich text box contains text that can be split across pages without impairing readability or images that should not be split across pages. Therefore, MSHTML treats the control like all other controls. This is the best generalized behavior.
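The keep-together behavior described above can be illustrated with a small model. The following Python sketch is a simplification for illustration only and is not MSHTML's actual layout algorithm. The page height, the field names, and the `lay_out` function are all hypothetical. It shows that moving a field to the top of the next page leaves unused space on the previous page, and that a field taller than one page still crosses a page boundary after it is moved.

```python
from dataclasses import dataclass

PAGE_HEIGHT = 1000  # hypothetical printable height of one page, in arbitrary units


@dataclass
class Placement:
    name: str
    start_page: int  # zero-based page on which the field starts
    offset: int      # distance from the top of that page
    overflows: bool  # True if the field still crosses a page boundary


def lay_out(fields, page_height=PAGE_HEIGHT):
    """Place (name, height) fields top to bottom using a keep-together rule."""
    placements = []
    page, offset = 0, 0
    for name, height in fields:
        remaining = page_height - offset
        # Keep-together rule (assumed model): a field that does not fit in
        # the space left on the current page starts on the next page.
        if height > remaining and offset > 0:
            page += 1
            offset = 0
        overflows = height > page_height - offset
        placements.append(Placement(name, page, offset, overflows))
        # Advance the cursor past the field, carrying over onto later pages.
        end = offset + height
        page += end // page_height
        offset = end % page_height
    return placements


# A 900-unit field after a 200-unit header is moved to the second page,
# which leaves 800 units of the first page unused.
for placement in lay_out([("header", 200), ("notes", 900), ("signature", 100)]):
    print(placement)
```

In this model, a field that is 2,500 units high is still moved to the top of the next page, but it is reported with `overflows=True` because no single page can hold it. This is the case in which the printed output is most likely to be affected.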


There is no simple workaround to resolve this problem. The only way to prevent the problem is to make sure that the contents of the rich text box do not span across a page boundary. However, InfoPath cannot predict the height of a Rich Text Box control, and InfoPath cannot predict whether the rich text box will break across a page boundary. This condition depends on the content the user adds to the control.

The best way to prevent this issue is to limit how much content is entered into a Rich Text Box control. Consider that InfoPath forms are intended to collect information from users in an XML format that other business applications can easily reuse. If you enter large amounts of formatted content into a single form field, you are largely bypassing this purpose. Generally, it would be better to enter that data into individual fields in the form that correspond to the purpose of the data so that it can be easily reused or reported on. If the purpose of the Rich Text Box control is for users to add information from supporting documentation to the data that they added to the form, then consider using a File Attachment control to attach supporting documents without requiring users to copy text from one place and paste it somewhere else.

As an example of good separation of data, consider a form for Help Desk personnel to track support calls. Some items, such as name, site, and product, have individual fields that use a Drop-Down List Box control or a Combo Box control. Log files, although their contents could be copied into a text box or into a rich text box, are attached as files. Finally, elements of the report that require rich text are divided over several rich text boxes by theme. For example, symptoms, environment, sets of troubleshooting steps, and resolution all use different Rich Text Box controls. With this design, the contents of rich text boxes rarely span page boundaries.
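The Help Desk design can be sketched as data. The following Python example is a minimal illustration and does not reproduce the XML that InfoPath actually generates. The element names and the `build_support_call` function are hypothetical, and rich text content is simplified to plain strings. It shows how keeping each item in its own field lets another application read one value directly instead of parsing it out of a large block of formatted text.

```python
import xml.etree.ElementTree as ET

# Hypothetical themes, one per Rich Text Box control in the example form.
RICH_TEXT_THEMES = ("symptoms", "environment", "troubleshooting", "resolution")


def build_support_call(name, site, product, log_files, sections):
    """Build an illustrative XML record with one element per form field."""
    root = ET.Element("supportCall")
    # Items chosen from a list get their own individual fields.
    ET.SubElement(root, "name").text = name
    ET.SubElement(root, "site").text = site
    ET.SubElement(root, "product").text = product
    # Log files are attached as files, not pasted into a text field.
    attachments = ET.SubElement(root, "attachments")
    for file_name in log_files:
        ET.SubElement(attachments, "file").text = file_name
    # Rich text is divided by theme, so each field stays short.
    for theme in RICH_TEXT_THEMES:
        ET.SubElement(root, theme).text = sections.get(theme, "")
    return root


call = build_support_call(
    "A. User", "Site 1", "Product X", ["app.log"],
    {"symptoms": "The application stops responding at startup."},
)
# A reporting tool can read a single field without touching the rich text.
print(call.findtext("product"))
```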

Article ID: 2705421 - Last Review: 25 Apr 2012 - Revision: 1