The Docutils Publisher

Author: David Goodger
Contact: goodger@python.org
Date: 2005-06-27
Revision: 3599
Copyright: This document has been placed in the public domain.

Publisher Convenience Functions

Each of these functions sets up a docutils.core.Publisher object, then calls its publish method. docutils.core.Publisher.publish handles everything else. There are five convenience functions in the docutils.core module:

publish_cmdline:

for command-line front-end tools, like rst2html.py. There are several examples in the tools/ directory. A detailed analysis of one such tool is in Inside A Docutils Command-Line Front-End Tool.

publish_file:

for programmatic use with file-like I/O. In addition to writing the encoded output to a file, this function also returns the encoded output as a string.

publish_string:

for programmatic use with string I/O. Returns the encoded output as a string.

publish_parts:

for programmatic use with string input; returns a dictionary of document parts. Dictionary keys are the names of parts, and values are Unicode strings; encoding is up to the client. Useful when only portions of the processed document are desired. See publish_parts Details below.

There are usage examples in the docutils/examples.py module.

publish_programmatically:

for custom programmatic use. This function implements common code and is used by publish_file, publish_string, and publish_parts. It returns a 2-tuple: the encoded string output and the Publisher object.
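
As a minimal sketch of the string-I/O case, the helper below wraps publish_string. The helper name rst_to_html is hypothetical, and the choices of writer_name="html" and the "unicode" output encoding (which asks for an unencoded Unicode string instead of encoded output) are assumptions made for this illustration, not requirements of the function.

```python
def rst_to_html(source):
    """Render reStructuredText source to a complete HTML document.

    Hypothetical helper for illustration; not part of Docutils.
    """
    # Imported inside the function so that merely loading this module
    # does not require Docutils to be installed.
    from docutils.core import publish_string

    return publish_string(
        source,
        writer_name="html",
        # Assumption of this sketch: request an unencoded Unicode string
        # rather than encoded output.
        settings_overrides={"output_encoding": "unicode"},
    )
```

Calling rst_to_html("Hello *world*") should give a full HTML page in which the emphasized word appears inside <em> tags.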

Configuration

To pass application-specific setting defaults to the Publisher convenience functions, use the settings_overrides parameter. Pass a dictionary of setting names & values, like this:

overrides = {'input_encoding': 'ascii',
             'output_encoding': 'latin-1'}
output = publish_string(..., settings_overrides=overrides)

Settings from command-line options override configuration file settings, which in turn override application defaults. For details, see Docutils Runtime Settings. See Docutils Configuration Files for details about individual settings.
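
The precedence order can be pictured as layered dictionaries, with later layers winning. The sketch below is only an illustration of that ordering in plain Python; resolve_settings and its three arguments are hypothetical names, and this is not how Docutils itself merges settings internally.

```python
def resolve_settings(app_defaults, config_file, command_line):
    """Merge three settings layers; later layers take precedence.

    Hypothetical illustration of the precedence order only.
    """
    resolved = {}
    # Lowest priority first: each update overwrites earlier values.
    for layer in (app_defaults, config_file, command_line):
        resolved.update(layer)
    return resolved


settings = resolve_settings(
    app_defaults={"input_encoding": "ascii", "output_encoding": "latin-1"},
    config_file={"output_encoding": "utf-8"},
    command_line={"input_encoding": "utf-8"},
)
print(settings)
# → {'input_encoding': 'utf-8', 'output_encoding': 'utf-8'}
```

Here the application defaults (what settings_overrides supplies) lose to both the configuration file and the command line.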

Encodings

The default output encoding of Docutils is UTF-8. If your input text contains any non-ASCII characters, you may have to do a bit more setup. Docutils may also introduce some non-ASCII text of its own if you use auto-symbol footnotes or the "contents" directive.
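
One possible form of that extra setup is to state both encodings explicitly through settings_overrides. The sketch below assumes Latin-1 encoded byte input and UTF-8 output; the helper name latin1_rst_to_utf8_html and the choice of the HTML writer are hypothetical and only serve the illustration.

```python
def latin1_rst_to_utf8_html(source_bytes):
    """Convert Latin-1 encoded reStructuredText to UTF-8 encoded HTML.

    Hypothetical helper for illustration; not part of Docutils.
    """
    # Imported inside the function so that merely loading this module
    # does not require Docutils to be installed.
    from docutils.core import publish_string

    return publish_string(
        source_bytes,
        writer_name="html",
        settings_overrides={
            "input_encoding": "latin-1",   # how to decode the input bytes
            "output_encoding": "utf-8",    # how to encode the result
        },
    )
```

For example, latin1_rst_to_utf8_html("café".encode("latin-1")) should return output in which the accented character is encoded as UTF-8.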

publish_parts Details

The docutils.core.publish_parts convenience function returns a dictionary of document parts. Dictionary keys are the names of parts, and values are Unicode strings.

Each Writer component may publish a different set of document parts, described below. Currently only the HTML Writer implements more than the "whole" part.

Parts Provided By All Writers

whole
parts['whole'] contains the entire formatted document.

Parts Provided By the HTML Writer

body
parts['body'] is equivalent to parts['fragment']. It is not equivalent to parts['html_body'].
docinfo
parts['docinfo'] contains the document bibliographic data.
footer
parts['footer'] contains the document footer content, meant to appear at the bottom of a web page, or repeated at the bottom of every printed page.
fragment
parts['fragment'] contains the document body (not the HTML <body>). In other words, it contains the entire document, less the document title, subtitle, docinfo, header, and footer.
header
parts['header'] contains the document header content, meant to appear at the top of a web page, or repeated at the top of every printed page.
html_body
parts['html_body'] contains the HTML <body> content, less the <body> and </body> tags themselves.
html_head

parts['html_head'] contains the HTML <head> content, less the stylesheet link and the <head> and </head> tags themselves. Since publish_parts returns Unicode strings and does not know about the output encoding, the "Content-Type" meta tag's "charset" value is left unresolved, as "%s":

<meta http-equiv="Content-Type" content="text/html; charset=%s" />

The interpolation should be done by client code.

html_prolog

parts['html_prolog'] contains the XML declaration and the doctype declaration. The XML declaration's "encoding" attribute's value is left unresolved, as "%s":

<?xml version="1.0" encoding="%s" ?>

The interpolation should be done by client code.

html_subtitle
parts['html_subtitle'] contains the document subtitle, including the enclosing <h2 class="subtitle"> & </h2> tags.
html_title
parts['html_title'] contains the document title, including the enclosing <h1 class="title"> & </h1> tags.
meta
parts['meta'] contains all <meta ... /> tags.
stylesheet
parts['stylesheet'] contains the document stylesheet link.
subtitle
parts['subtitle'] contains the document subtitle text and any inline markup. It does not include the enclosing <h2> & </h2> tags.
title
parts['title'] contains the document title text and any inline markup. It does not include the enclosing <h1> & </h1> tags.
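
The parts described above might be used as in the following sketch. The helper names fill_encoding, rst_html_parts, and title_and_fragment are hypothetical. fill_encoding resolves the "%s" placeholder with str.replace rather than the % operator; that substitution is an assumption of this sketch, chosen so that a part without a placeholder passes through unchanged.

```python
def fill_encoding(part, encoding):
    """Resolve the "%s" placeholder left in html_head or html_prolog."""
    # str.replace instead of the % operator: a part that contains no
    # placeholder is returned unchanged rather than raising an error.
    return part.replace("%s", encoding)


def rst_html_parts(source):
    """Return the HTML Writer's parts dictionary for the given source."""
    # Imported inside the function so that merely loading this module
    # does not require Docutils to be installed.
    from docutils.core import publish_parts

    return publish_parts(source, writer_name="html")


def title_and_fragment(source):
    """Return the document title text and the body without the title."""
    parts = rst_html_parts(source)
    return parts["title"], parts["fragment"]
```

A client that embeds a document in its own page template would typically take parts['title'] and parts['fragment'] this way, and call fill_encoding on parts['html_head'] and parts['html_prolog'] only when it emits a complete page in a known encoding.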