
A. Proposed components

The purpose of this document is to describe the architecture proposed, rather than to list every desirable component involved. This appendix gives an overview of various realistic formats and options, in order to facilitate more natural discussion using real-world examples. It is hoped that this helps give an idea of how using the system would feel.

Additional modules correspond to the other formats Doctools was originally intended to render. Most notably this includes a module for Puredocs (or several, depending on the exact requirements), which was one of the first formats for Doctools to transform alongside Docbook.

“Other” formats such as Puredocs are omitted since they are self-explanatory. The modules listed here are intended to give an idea of how a Docbook-like schema may be split by namespace and implemented in parallel.

Modules

This list is based on my experience of the types of things documents tend to contain. It's a rough outline of what I expect we will write, not a comprehensive list.

The implementations for these can be written from scratch, picked from the carcasses of previous projects, imported from other open-source projects, provided by external tools, or some combination of the above.

  • page: Fundamental page generation. This would not be used if output is to be embedded as part of a larger system.

  • A docbook module, in an unguaranteed state of implementation. This is mostly for the convenience of backwards compatibility. It will contain very little other than the official Docbook stylesheets. Possibly our PDF-via-TeXML implementation will be discontinued in favour of the other modules.

  • docbook-simple. This is expected to be a complete implementation.

  • metadata describing the document's author, publisher, publication date, and so on.

    Document descriptions and other metadata are to be provided for the benefit of search systems, which especially suits XHTML.

  • revisions of the document, describing the document's history and edits, and marking up added and deleted regions. The marked-up regions may be well served by a namespaced attribute added to the usual tags, referencing an entry in the revision history.

  • notes attached to the document. PDF is particularly adept at providing a mechanism to view these notes.

  • quotes, inlined in body text and set as blockquotes.

  • margin comments. Use-cases for these seem rare. ConTeXt supports them well.

  • frontmatter: a title page, an abstract, and other similar prefacing information.

  • structural elements: chapters, sections, parts, appendices, volumes, and so on. Tables of contents, figures and other listable items. Listable items are expected to be registered per-module, most likely by an xpath expression. See the section called “Notes” for a discussion of passing notes between modules.

  • lists, including itemised lists, ordered lists and definitions of terms.

  • Body text, providing paragraphs, emphasis, and the text itself.

  • figures, both external and inlined.

  • tables

  • datetime: dates and times, unsurprisingly.

  • maths: mathematics, perhaps input in TeX's format. MathML would be sensible to implement, too.

  • footnotes. ConTeXt has particularly good support for these, although LaTeX's is limited.

    Footnotes may need to be grouped per section; the note-passing mechanism may provide a hook for where to render lists of footnotes.

  • bibliography references (referencing them, not generating them).

  • Inter-document links. These exist already: InterdocumentLinks.

    Linking within a document and to URLs can be provided by standard XML mechanisms such as XLink (as Docbook 5 now uses).

  • glossary of terms. Again ConTeXt is particularly helpful here.

  • keywords for indexing documents.

  • website structure. Most importantly, pages, and the relations between them. This is expected to be similar to the XInclude-based system (which I enjoyed far more than Docbook-website) with keywords per-page intersecting as sets (these may be made available via notes) to suggest relevance of related pages.
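
As a rough illustration of the revisions item above, the following sketch shows how a namespaced attribute on ordinary tags might reference entries in a revision history. The namespace URI, the rev:added and rev:deleted attribute names, and the rev:history element are all assumptions made up for this example; nothing here is an agreed schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for the revisions module (an assumption, not a real schema).
REV_NS = "http://example.org/doctools/revisions"

# Illustrative document: ordinary <para> tags carry a namespaced attribute
# whose value is the id of a revision in the history.
DOCUMENT = """<doc xmlns:rev="http://example.org/doctools/revisions">
  <rev:history>
    <rev:revision id="r1">Initial draft</rev:revision>
    <rev:revision id="r2">Added footnotes</rev:revision>
  </rev:history>
  <para>Unchanged text.</para>
  <para rev:added="r2">A new paragraph.</para>
  <para rev:deleted="r2">An old paragraph.</para>
</doc>"""


def changes_by_revision(root):
    """Return (history, changes): revision descriptions, and the regions each revision touched."""
    history = {
        rev.get("id"): (rev.text or "").strip()
        for rev in root.iter(f"{{{REV_NS}}}revision")
    }
    changes = {rev_id: [] for rev_id in history}
    for element in root.iter():
        for kind in ("added", "deleted"):
            # ElementTree stores namespaced attributes under "{uri}name" keys.
            rev_id = element.get(f"{{{REV_NS}}}{kind}")
            if rev_id in changes:
                changes[rev_id].append((kind, element.tag, (element.text or "").strip()))
    return history, changes


history, changes = changes_by_revision(ET.fromstring(DOCUMENT))
for rev_id, description in history.items():
    print(rev_id, description, changes[rev_id])
```

Because the attribute lives in its own namespace, a module that knows nothing about revisions can ignore it and still process the paragraph normally.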
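
The structural elements item above expects listable items to be registered per module, most likely by an XPath expression. A minimal sketch of that idea follows, using the limited XPath subset in Python's standard library. The registry, the function names and the namespace URIs are hypothetical, chosen only to make the example self-contained.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

# Hypothetical per-module namespaces (assumptions for illustration only).
NAMESPACES = {
    "fig": "http://example.org/doctools/figures",
    "tbl": "http://example.org/doctools/tables",
}

# Registry mapping the name of a listing to the XPath that selects its items.
LISTABLE: Dict[str, str] = {}


def register_listable(name: str, xpath: str) -> None:
    """Called by a module to declare which of its elements may appear in a listing."""
    LISTABLE[name] = xpath


def collect_listables(root: ET.Element) -> Dict[str, List[str]]:
    """Gather the titles of every registered listable item, in document order."""
    return {
        name: [element.get("title", "") for element in root.findall(xpath, NAMESPACES)]
        for name, xpath in LISTABLE.items()
    }


# Each module registers its own listable items.
register_listable("figures", ".//fig:figure")
register_listable("tables", ".//tbl:table")

DOCUMENT = """<doc xmlns:fig="http://example.org/doctools/figures"
     xmlns:tbl="http://example.org/doctools/tables">
  <section>
    <fig:figure title="Pipeline overview"/>
    <tbl:table title="Module status"/>
    <fig:figure title="Namespace layout"/>
  </section>
</doc>"""

listings = collect_listables(ET.fromstring(DOCUMENT))
print(listings)
```

The structural module then only needs the registry to build a list of figures or a list of tables; it does not need to know anything else about the modules that own those elements.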
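
The website structure item above suggests intersecting per-page keywords as sets to find related pages. One way this might look is sketched below; the function name, the ranking by number of shared keywords, and the sample pages are assumptions for illustration rather than part of the proposal.

```python
from typing import Dict, List, Set, Tuple


def related_pages(
    page: str,
    keywords: Dict[str, Set[str]],
    minimum_shared: int = 1,
) -> List[Tuple[str, Set[str]]]:
    """Rank other pages by how many keywords they share with `page`."""
    own = keywords[page]
    scored = []
    for other, other_keywords in keywords.items():
        if other == page:
            continue
        shared = own & other_keywords  # set intersection
        if len(shared) >= minimum_shared:
            scored.append((other, shared))
    # Most shared keywords first; ties broken alphabetically by page name.
    scored.sort(key=lambda item: (-len(item[1]), item[0]))
    return scored


# Made-up keyword sets, as might be gathered from each page's keywords.
site_keywords = {
    "index": {"doctools", "overview"},
    "modules": {"doctools", "docbook", "namespaces"},
    "footnotes": {"docbook", "context", "namespaces"},
    "contact": {"email"},
}

suggestions = related_pages("modules", site_keywords)
for name, shared in suggestions:
    print(name, sorted(shared))
```

Raising minimum_shared would make the suggestions stricter, which may matter on sites where a few very common keywords appear on almost every page.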