                                   NoMarks

  NoMarks is a {“Lightweight markup language”}¹ without any markups or
  markdowns.  That’s not strictly true, as it employs various symbols to define
  the structure, but it doesn’t try to overprice nor underprice you by trying
  its hardest to be as unambiguous as possible in how it treats your content.
  This is achieved by using indentation for structural nesting and placing
  various symbols (or, well, marks) in said indentation to define what content
  follows.  This is of course similar to how, for example, Markdown² works.
  The difference is that a symbol appearing at the beginning of the textual
  content of a line can never be misinterpreted as a symbol/mark.  The
  indentation also makes it easy to see what level of nesting your content is
  at, making it easy to navigate.  Some symbols are also treated as special in
  inline content.  Here, too, NoMarks tries very hard to make processing
  unambiguous.  To stay true to this goal, there’s no escape mechanism using
  backslashes or similar.  But let’s not waste any more time with this rather
  informal introduction.  Instead, let’s look at the constructs that make up
  NoMarks.  But (!) first, let’s clear up some possible confusions about the
  NoMarks name.  NoMarks consists of a couple of components and may name any of
  them at any time.  We have

= NoMarks, the language. = The language used in input documents.
= NoMarks, the converter. = The software that turns documents written in the
    NoMarks language into NoMarks-specific XML documents.
= NoMarks, the distribution. = The software and all supporting content.
= NoMarks, the project. = The project that encapsulates the above components,
    and more.

  In this document, the NoMarks name may stand alone when it’s obvious from the
  context what it means, but most often refers to /NoMarks, the language/.

¹ See http://en.wikipedia.org/wiki/Lightweight_markup_language
² See http://daringfireball.net/projects/markdown/

§ Block Structures

    A NoMarks document consists of zero or more blank lines, a title, zero or
    more blocks, zero or more sections, and finally zero or more blank lines.
    Titles, blocks, and sections are separated from each other by one or more
    blank lines.  A blank line consists of zero or more whitespace characters.
    Any blank lines at the beginning or end of the document are ignored.
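
    As a rough illustration of the blank-line rule, here’s a minimal Python
    sketch that splits a document into runs of non-blank lines.  The name
    ‹split_chunks› is made up for this example and isn’t part of NoMarks, and
    the sketch deliberately ignores the special treatment of blank lines
    inside code blocks.

```python
def split_chunks(text):
    """Split text into lists of consecutive non-blank lines.

    A line counts as blank if it contains nothing but whitespace, so
    leading and trailing blank lines simply disappear.
    """
    chunks = []
    current = []
    for line in text.splitlines():
        if line.strip() == "":
            # A blank line ends the chunk being collected, if any.
            if current:
                chunks.append(current)
                current = []
        else:
            current.append(line)
    if current:
        chunks.append(current)
    return chunks
```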

  § Document Title

      A NoMarks document’s title consists of one or more lines of text.  As
      we’ll discuss further in the next section, any lines that follow the
      first must be indented at the same level as the first line, in this case
      two spaces.  It is customary for a NoMarks document’s title to be
      centered.  This document’s title was written as

                                           NoMarks

      The document title is left completely unprocessed for inlines.

  § Paragraphs

      A paragraph consists of one or more lines indented by 2×n
      spaces, where /n/ is the current section level (nesting)
      followed by 2 additional spaces, making up the tag of the
      content that follows.  A “tag” defines what content follows and is seen
      at the beginning of all block-level elements.  This paragraph was thus
      written as

          A paragraph consists of one or more lines indented by 2×n
          spaces, where /n/ is the current section level (nesting)
          followed by 2 additional spaces, making up the tag of the
          content that follows. …

      Actually, this indentation rule is common to all block elements.  Thus,
      the current indentation tells you how deeply you are nested and tells a
      NoMarks processor how to properly treat different elements in the
      document.

      Furthermore, it isn’t strictly true that lines beyond the first are
      tagged by two spaces.  These lines are tagged as “continuation lines”,
      and this rule applies to all block-level elements.
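
      The indentation arithmetic can be sketched in a few lines of Python.
      This is only an illustration under simplifying assumptions: the name
      ‹classify_line› is hypothetical, only the paragraph tag (two spaces)
      and the code block tag (four spaces) are handled, and continuation
      lines aren’t told apart from first lines.

```python
PARAGRAPH_TAG = "  "   # two spaces
CODE_TAG = "    "      # four spaces


def classify_line(line, level):
    """Classify a line found at section level `level` (0 = top level).

    Returns a (kind, content) tuple, or None if the line isn't indented
    deeply enough or carries neither of the two tags handled here.
    """
    indent = " " * (2 * level)
    if not line.startswith(indent):
        return None
    rest = line[len(indent):]
    # Check the longer tag first, as it begins with the shorter one.
    if rest.startswith(CODE_TAG):
        return ("code", rest[len(CODE_TAG):])
    if rest.startswith(PARAGRAPH_TAG):
        return ("paragraph", rest[len(PARAGRAPH_TAG):])
    return None
```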

  § Code Blocks

      A code block consists of one or more lines tagged by four spaces.  Note
      that this is in addition to the current indentation level as described in
      the previous section.  The code block from the previous section was thus
      written as

                  A paragraph consists of one or more lines indented by 2×n
                  spaces, where /n/ is the current section level (nesting)
                  followed by 2 additional spaces, making up the tag of the
                  content that follows. …

      Any blank lines “between” what would otherwise be seen as separate code
      blocks are included in the first code block, so they don’t require any
      indentation.

      The contents of a code block are left completely unprocessed, so you
      don’t need to worry about symbols getting misinterpreted as inline
      structures.

  § Enumerations

      An enumeration – an ordered list – consists of one or more enumerated
      items.  An enumerated item consists of a list-item number followed by a
      space, then by a paragraph, that is, two spaces and one or more inlines.
      This paragraph may then be followed by zero or more block-level elements.
      These block-level elements will be indented one level further in than the
      item itself, so the indentation will be 2×(n+1), where /n/ is the current
      section level.

      The enumeration

    ₁   This is item 1.
    ₂   This is item 2.

        It consists of two paragraphs.
    ₃   This is item 3.

      is thus written as

        ₁   This is item 1.
        ₂   This is item 2.

            It consists of two paragraphs.
        ₃   This is item 3.

      Perhaps you caught on to the reason for the three spaces?  Though not
      strictly necessary for being able to unambiguously parse the first line,
      they are nonetheless needed for the aesthetics of having them line up
      with any following block elements.  Having this additional indent also
      makes it possible to have more than nine items in the list and easier to
      see the numerals.
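
      To make the item syntax concrete, here’s a hedged Python sketch that
      recognizes the first line of an enumerated item.  The function name
      ‹parse_enumerated_item› is invented for this example, and the sketch
      assumes that at least one space separates the number from the text,
      rather than insisting on exactly three.

```python
SUBSCRIPT_DIGITS = "₀₁₂₃₄₅₆₇₈₉"


def parse_enumerated_item(line, level):
    """Return (number, text) for an enumerated item line, else None.

    `level` is the current section level; the list-item number is
    expected right after 2 * level spaces of indentation.
    """
    indent = " " * (2 * level)
    if not line.startswith(indent):
        return None
    rest = line[len(indent):]
    count = 0
    while count < len(rest) and rest[count] in SUBSCRIPT_DIGITS:
        count += 1
    if count == 0 or not rest[count:].startswith(" "):
        return None
    # int() doesn't accept subscript digits, so map them by position.
    digits = "".join(str(SUBSCRIPT_DIGITS.index(d)) for d in rest[:count])
    return int(digits), rest[count:].strip()
```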

  § Itemizations

      An itemization – an unordered list – works the same way as an
      enumeration.  The only difference is that a bullet is used as a tag
      instead of list-item numbers:

        •   This is item 1
        •   This is item 2
        •   Item 3 here

      renders

    •   This is item 1
    •   This is item 2
    •   Item 3 here

  § Definitions

    = Definitions. = A list of definitions consists of one or more definitions.
    = Definition. = A definition consists of a term followed by one or more
        blocks that define the meaning of the term.
    = Term. = A term consists of a “= ” tag in the indent followed by
        unprocessed text up until a terminating “. = ”.

      The above list of definitions was thus written as

        = Definitions. = A list of definitions consists of one or more definitions.
        = Definition. = A definition consists of a term followed by one or more
          blocks that define the meaning of the term.
        = Term. = A term consists of a “= ” tag in the indent followed by
          unprocessed text up until a terminating “. = ”.

      There may be any number of spaces between ‘.’ and ‘=’ in the termination
      of a term.
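
      The term syntax maps naturally onto a regular expression.  The
      following Python sketch is only an approximation: the name
      ‹parse_term› is hypothetical, the line is assumed to have had its
      indentation removed already, and dropping the final ‘.’ from the term
      is an assumption about how the term should be reported.

```python
import re

# "= " tag, the term (as short as possible), a period, any number of
# spaces, and finally "= " followed by the start of the definition.
TERM = re.compile(r"^= (?P<term>.+?)\. *= (?P<rest>.*)$")


def parse_term(line):
    """Return (term, start_of_definition) or None if line isn't a term."""
    match = TERM.match(line)
    if match is None:
        return None
    return match.group("term"), match.group("rest")
```

      As the term is matched as briefly as possible, a term that itself
      contains periods, such as “Man:indent.list”, still ends at the first
      period that is followed by spaces and ‘=’.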

  § Quotes

      A quote consists of one or more lines followed by zero or one
      attributions.  Each line consists of one or more inlines.  The
      attribution also consists of one or more inlines.

      To quote Shakespeare:

    > To be, or not to be, that is the question:
    > Whether ’tis nobler in the mind to suffer
    > The slings and arrows of outrageous fortune,
    > Or to take arms against a sea of troubles,
    > And by opposing end them?

      Here, each line is an actual line from Hamlet.  It was rendered via

        > To be, or not to be, that is the question:
        > Whether ’tis nobler in the mind to suffer
        > The slings and arrows of outrageous fortune,
        > Or to take arms against a sea of troubles,
        > And by opposing end them?

      Sometimes you want a quote to span multiple lines, but only actually
      render as one.  In that case you simply replace the ‘>’ part of the tag
      with a space, ‘ ’, making it a continuation line.  Let’s
      use the following quote, due to Antoine de Saint-Exupéry, as an example:

    > A designer knows he has achieved perfection not when there is nothing
      left to add, but when there is nothing left to take away.
    — Antoine de Saint-Exupéry

      This was written as

        > A designer knows he has achieved perfection not when there is nothing
          left to add, but when there is nothing left to take away.
        — Antoine de Saint-Exupéry

      This example also introduced the attribution line.  It is tagged by “— ”,
      that is, an em dash, ‘—’, followed by a space, ‘ ’.
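
      The three kinds of quote lines can be told apart by their two-character
      tags.  Here’s a small Python sketch of that idea; the name
      ‹parse_quote› is made up for this example, and it assumes that the
      section indentation has already been removed from each line.

```python
def parse_quote(lines):
    """Return (quote_lines, attribution) for the lines of one quote.

    Each line starts with a two-character tag: "> " for a new quote
    line, two spaces for a continuation line, or an em dash and a
    space for the attribution.
    """
    quote_lines = []
    attribution = None
    for line in lines:
        tag, content = line[:2], line[2:].strip()
        if tag == "> " and attribution is None:
            quote_lines.append(content)
        elif tag == "  " and quote_lines and attribution is None:
            # A continuation line is joined onto the previous line.
            quote_lines[-1] += " " + content
        elif tag == "— " and quote_lines and attribution is None:
            attribution = content
        else:
            raise ValueError(f"unexpected line in quote: {line!r}")
    return quote_lines, attribution
```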

  § Tables

      Tables consist of zero or one rows that make up the head of the table,
      followed by one or more rows that make up its body.  The head is, if
      present, separated from the body by a separator line.  Each row consists
      of one or more cells demarcated by ‘‹|›’ symbols, where the first cell’s
      left demarcation symbol appears, as is usual, in the indent.  The
      separator line is tagged by “‹|-›” and may be followed by any content.
      For clarity, it should display nicely with the rest of the table,
      continuing the sequence of ‘-’ across each column, using ‘+’ signs where
      it intersects with columns, and be terminated by a ‘‹|›’.

      Thus, the table

    | Part  | Consists of |
    |-------+-------------|
    | Table | Head?, Body |
    | Head  | Row         |
    | Body  | Rows        |
    | Row   | Cells       |

      is written as

          Thus, the table

        | Part  | Consists of |
        |-------+-------------|
        | Table | Head?, Body |
        | Head  | Row         |
        | Body  | Rows        |
        | Row   | Cells       |

          is written as

        …
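
      Splitting such a table into a head and a body is straightforward.  The
      Python sketch below is a simplification under stated assumptions: the
      names ‹split_row› and ‹parse_table› are hypothetical, the indentation
      has already been stripped from each line, and cells are assumed not to
      contain ‘|’ characters of their own.

```python
def split_row(line):
    """Split one table row into a list of stripped cell strings."""
    inner = line.strip()
    if inner.startswith("|"):
        inner = inner[1:]
    if inner.endswith("|"):
        inner = inner[:-1]
    return [cell.strip() for cell in inner.split("|")]


def parse_table(lines):
    """Return (head, body); head is one row or None, body a list of rows."""
    head = None
    body = []
    for line in lines:
        if line.startswith("|-"):
            # The separator ends the head, which is exactly one row.
            if head is not None or len(body) != 1:
                raise ValueError("expected one head row before the separator")
            head, body = body[0], []
        else:
            body.append(split_row(line))
    if not body:
        raise ValueError("a table needs at least one body row")
    return head, body
```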

  § Footnotes

      Any block may be followed by a list of footnotes.  Footnotes are usually
      used for various inlines that lack a decent syntax.  Footnotes consist of
      a footnote identifier followed by a space followed by the footnote
      definition.  Valid footnote identifiers are ‹¹›, ‹²›, ‹³›, ‹⁴›, ‹⁵›, ‹⁶›,
      ‹⁷›, ‹⁸›, ‹⁹›, and ‹⁰›.  These identifiers may be combined further to
      create more than the initial ten identifiers, but it’s bad form to use
      too many footnotes per block.

      Footnote definitions aren’t free-form and must follow a set of patterns.
      Here’s an example:

          Drop me a line¹.

        ¹ Email me at mailto:example@example.com

      This is a paragraph consisting of the text “Drop me a line”.  The word
      “line” has a footnote associated with it, identified by ‹¹›.  The
      footnote is defined as a link to “mailto:example@example.com”.  The link
      title will be “Email me”.  In the NoMarks XML language, known as NML,
      this will be converted to

        <p>Drop me a
          <link title="Email me"
                uri="mailto:example@example.com">line</link></p>

      The identifiers don’t have to come in order, which means that you don’t
      have to re-order them in the event that footnotes are added and deleted
      during editing.

      Footnotes nest, so you don’t have to define a footnote as soon as
      possible.  You can, for example, leave all footnoting of an enumeration’s
      items to the list as a whole, or even the section that the enumeration
      appears in.  (In fact, you can’t define footnotes for the first paragraph
      of an item anyway.  This is a limitation that may be lifted in the
      future, but it’s really not that much of a limitation, as having
      footnotes sprinkled inside your lists makes for very messy lists.  In
      fact, having blocks in your lists at all is sort of messy, but let’s not
      digress…)

      Footnotes nest all the way to the outermost section level.  To be
      specific, content is checked for missing footnotes before the first level
      1 section, then after each level 1 section.

  § Figures

      Figures may be used to place content to the right or left of the current
      content.  We can, for example, place the NoMarks logo to the right of this
      paragraph¹.  This was introduced via a footnote:

          We can, for example, place the NoMarks logo to the right of this
          paragraph¹.

        ¹ The NoMarks logo, see data/nomarks.svg (a crossed-over ‘M’)

    ¹ The NoMarks logo, see data/nomarks.svg (a crossed-over ‘M’)

      Such a footnote consists of a title, followed by “, see ”, a URI to an
      image, and an optional alternate text inside parentheses to use in place
      of the image in case it can’t be displayed or viewed.
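
      A figure footnote of this shape can be picked apart with a regular
      expression.  This Python sketch is illustrative only; the name
      ‹parse_figure› is invented here, and it assumes that the footnote
      identifier has been removed and that the URI contains no whitespace.

```python
import re

# Title, then ", see ", then a URI, then optional " (alternate text)".
FIGURE = re.compile(r"^(?P<title>.+), see (?P<uri>\S+)(?: \((?P<alt>.*)\))?$")


def parse_figure(definition):
    """Return (title, uri, alternate_text) or None; alt may be None."""
    match = FIGURE.match(definition)
    if match is None:
        return None
    return match.group("title"), match.group("uri"), match.group("alt")
```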

      Figures can also be displayed in line with other content.  Here’s a
      bigger version of the NoMarks logo displayed in this manner:

    Fig.
                             data/nomarks-256.svg
                             (A crossed-over ‘M’)

                      A big version of the NoMarks logo.

      This was rendered via

            Fig.
                                     data/nomarks-256.svg
                                     (A crossed-over ‘M’)

                              A big version of the NoMarks logo.

      We thus have “Fig.” as a tag, followed by a URI to the image that we want
      to display, followed by an optional alternate text in parentheses,
      followed by a title for the figure.  Whitespace and newlines between the
      tag and the URI and between the URI and the figure are ignored, so you’re
      free to lay it out in a manner that makes it clear to someone reading the
      text that a figure would take up quite a bit of space at this point.  The
      alternate text must come on the line right after the URI, if present.

§ Sections

    Sections consist of a title tagged by “§ ”, followed by zero or more
    blocks, followed by zero or more subsections.  The blocks and subsections
    are indented one additional level.

    The first section of this document was written as

      § Block Structures

          A NoMarks document consists of zero or more blank lines, a title, …

§ Inlines

    Inlines consist of one or more words or inline elements – as described
    below – separated by whitespace or punctuation.  All whitespace is
    normalized into one space character, ‘ ’.

  § Abbreviations

      Abbreviations are defined using footnotes.  Defining an abbreviation,
      such as NML¹, looks like

    ¹ Abbreviation for NoMarks xmL

          Abbreviations are defined using footnotes.  Defining an abbreviation,
          such as NML¹, looks like

        ¹ Abbreviation for NoMarks xmL

      Any text that follows “Abbreviation for” will be used as the definition
      of the abbreviation.

  § Code

      Inline code is surrounded by ‘‹’ and ‘›’.  Taking a line earlier in this
      document as an example, we thus have

        Valid footnote identifiers are ‹¹›, ‹²›, …

      The content inside a code inline is left completely unprocessed.  You
      can thus use code inlines to write characters that would otherwise be
      seen as special for NoMarks, as long as it makes sense to mark them as
      code.  The end character, ‘›’, can occur at the end of a code inline, for
      example, ‹‹code is written like this››, by following the last ‘›’ by
      another ‘›’.  Juxtaposed code inlines are merged, so ‹‹a›››‹‹b›› becomes
      ‹a››‹b›.

  § Emphasis

      Emphasis is written like ‹/this/›.  Emphasis doesn’t nest.  The ending
      solidus, ‘/’, will only be seen as an ending emphasis if it’s not
      followed by another solidus.

  § Escaped Characters

      To be able to write characters that would otherwise be seen as special
      for NoMarks, escaped characters may be used.  Escaped characters are
      written as ‹‘c’›, where /c/ is the character to escape.  Note how well
      this meshes with the standard practice of surrounding a character with
      single quotes in English when you are referring to the character itself.
      Also note that the single quotes will be retained in any output, as the
      whole sequence is simply seen as text.  Only sequences where only one
      character appears between single quotes are handled this way, so you can
      use unescaped sequences inside nested quotes (“…‘…’…”) if you so wish.

  § Groups

      Sometimes you want a footnote to refer to more than one inline.  You can
      then group a number of inlines by surrounding them with ‘{’ and ‘}’:

        {Drop me a line}¹.

      Here, footnote ‹¹› will refer to the words “Drop me a line”.  Groups
      nest.

  § Footnotes

      Footnotes may be added to any inline by appending them to it.  More than
      one footnote may be added by separating the footnotes with a ‹⁺›.
      Footnotes are processed left to right, so the leftmost will be the
      innermost one.  Thus,

        {Drop me a line}¹⁺²

      will be represented by the following (imaginary) XML structure

        <footnote identifier="²">
          <footnote identifier="¹">
            <group>Drop me a line</group>
          </footnote>
        </footnote>
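
      The left-to-right rule amounts to wrapping the inline once per
      identifier.  Here’s a minimal Python sketch that builds the same
      (imaginary) XML structure as a string; the name ‹wrap_footnotes› is
      made up for this example, and no XML escaping is attempted.

```python
def wrap_footnotes(content, references):
    """Wrap content in one footnote element per referenced identifier.

    `references` is the string of identifiers that follows the inline,
    for example "¹⁺²".  The leftmost identifier ends up innermost.
    """
    xml = content
    for identifier in references.split("⁺"):
        xml = f'<footnote identifier="{identifier}">{xml}</footnote>'
    return xml
```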

  § Links

      Links are defined using footnotes.  You can, for example, link to
      http://example.com¹ by writing

    ¹ See IANA’s Example Domain at http://example.com/

          Links are defined using footnotes.  You can, for example, link to
          http://example.com¹ by writing

        ¹ See IANA’s Example Domain at http://example.com/

      Link definitions come in a couple of varieties.  You can write any of the
      following:

    •   IANA’s Example Domain at http://example.com/
    •   IANA’s Example Domain: http://example.com/
    •   See IANA’s Example Domain at http://example.com/
    •   See IANA’s Example Domain: http://example.com/
    •   See the IANA Example Domain at http://example.com/
    •   See the IANA Example Domain: http://example.com/
    •   See http://example.com/

      The text ahead of the final “ at ” or “: ” before the URI¹ is the link’s
      title, minus any leading “See ” or “See the ”.  The last variant doesn’t
      have a title.  The URI is of course the URI that the link should point
      to.

    ¹ Abbreviation for Uniform Resource Identifier
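
      The link definition varieties listed above can be handled by a handful
      of string operations.  The following Python sketch is an approximation
      under stated assumptions: the name ‹parse_link› is hypothetical, the
      footnote identifier has been removed, whitespace has been normalized,
      and the URI is taken to be the last space-separated word.

```python
def parse_link(definition):
    """Return (title, uri) for a link definition; title may be None."""
    body, _, uri = definition.rpartition(" ")
    title = None
    # The title ends at the final " at " or ": " before the URI.
    for separator in (" at", ":"):
        if body.endswith(separator):
            title = body[:-len(separator)]
            break
    if title is None:
        # Only the bare "See URI" variant lacks a title.
        if body != "See":
            raise ValueError(f"not a link definition: {definition!r}")
        return None, uri
    # Check the longer prefix first so that "See the " wins over "See ".
    for prefix in ("See the ", "See "):
        if title.startswith(prefix):
            title = title[len(prefix):]
            break
    return title, uri
```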

§ Rationale

    So why would you want to write documents using NoMarks?  Well, if you’re
    the kind of person who likes your “source” documents to look much like your
    “target” documents, that is, your NoMarks text documents to look much like
    the documents generated from them (for example, HTML5 documents viewed in a
    web browser), NoMarks may be for you.

    The NoMarks text format has been designed with two criteria in mind: ease
    of writing and – more importantly – ease of reading. These criteria are met
    by applying the same constraints on the format:

  •   Use as few constructs as possible
  •   Make blocks and inlines stand out from their surroundings
  •   Separate “meta-information” from “actual information”

    All block formatting relies on the use of tags in the margin to signify
    what content follows.  These tags have been chosen to make it exceedingly
    easy to see what they represent: section signs for sections, bullets for
    itemizations, numbers for enumerations, e-mail quoting for quotes, and so
    on.  The use of a margin also avoids (almost) all confusion about what is
    what and clearly shows nesting relationships in an otherwise linear format.

    For inlines, unlikely symbols have been chosen, both to stand out and to
    avoid being confused with text.  Code inlines use seldom-used quotation
    marks to pop out.  Grouping uses brackets that clearly divide their content
    from their surroundings.  Emphasis uses a pre-existing convention that
    actually makes its content look slanted.  The “escaping” mechanism uses the
    convention of putting the symbol inside single quotes.  Footnotes clearly
    separate out “meta-information” from actual “information”.  Abbreviations
    and links aren’t necessarily important information while reading a passage,
    but can provide vital information to some readers.  This metadata can be
    rendered in even more beneficial ways in other document formats.

§ Previous Work

    NoMarks wasn’t created in a vacuum.  There already existed a plenitude of
    different {lightweight markup languages}¹ and NoMarks was influenced by
    quite a few of them, such as reStructuredText², Textile³, Markdown⁴,
    AsciiDoc⁵, Org-mode⁶, and even TeX⁷, though TeX is hardly “lightweight”.
    Let’s now review where NoMarks’ various constructs originate from.

  ¹ See http://en.wikipedia.org/wiki/Lightweight_markup_language
  ² See http://en.wikipedia.org/wiki/ReStructuredText
  ³ See http://en.wikipedia.org/wiki/Textile_(markup_language)
  ⁴ See http://en.wikipedia.org/wiki/Markdown
  ⁵ See http://en.wikipedia.org/wiki/AsciiDoc
  ⁶ See http://en.wikipedia.org/wiki/Org-mode
  ⁷ See http://en.wikipedia.org/wiki/TeX

    The idea of using indentation/a margin for nesting is an original idea, at
    least in this context. Programmers use it all the time to make the nesting
    structure of the code they’re writing clear to human readers of the code
    and some programming languages, such as Python¹, use it for making it clear
    to the computer as well.  XML documents are similarly indented to show the
    nesting structure of elements.  For NoMarks, it came about by first using a
    two-space margin for putting tags in, making these tags stand out and also
    making sure that they wouldn’t be confused by the processor as text.  When
    it came time to add sections and subsections, it became clear that a wider
    margin could be used to define nesting as well.

  ¹ See Python (programming language):
    http://en.wikipedia.org/wiki/Python_(programming_language)

    The use of tags to mark what kind of block-level structure follows isn’t
    new.  Textile and Markdown use it for headings, lists, and so on.  Putting
    them in the margin, however, is, as already stated, new.  Using Unicode
    symbols is also new.  This choice was made for two reasons: they stand out
    more and they look better.

    Paragraphs are written in much the same way as they would be in TeX,
    though TeX only cares about empty lines between blocks of text.

    Quotes use an age-old convention used in e-mails¹ for quoting previous
    e-mails.

  ¹ See http://en.wikipedia.org/wiki/Posting_style#Quoted_line_prefix

    Tables use the Org-mode table format, which makes them easy to edit in
    Emacs.  Org-mode’s tables are actually not unique to Org-mode (and were
    integrated from an older Emacs mode).  This way of formatting tables in
    plain text is intuitive and can be seen in various other formats, such as
    reStructuredText, Markdown, and MediaWiki¹.

  ¹ See http://en.wikipedia.org/wiki/MediaWiki

    Footnotes use Unicode symbols that, while actually being generic
    superscripts, look much like footnotes would in a printed document.  This
    convention is sometimes used in e-mail, but hasn’t been seen in any other
    lightweight markup language to date.  The use of footnotes for providing
    meta-information exists in reStructuredText and Markdown, though these
    formats are more explicit about their formatting and don’t read as well.

    Figures are difficult to represent in text and other lightweight markup
    languages often use rather ugly escapes.  NoMarks tries to make them at
    least look nice, even though it can’t represent the actual image.

    The use of ‘‹’ and ‘›’ to demarcate inline code is unique.  Markdown uses
    pairs of ‘`’ characters.  Org-mode uses pairs of ‘=’ characters.  Using
    unique start and end characters makes it possible to nest them properly,
    though true nesting isn’t actually used; still, any sequence of characters
    may be constructed inside inline code in NoMarks.

    NoMarks’ use of ‘/’ characters around emphasized text seems to be
    unique among lightweight markup languages.  It has its origin in e-mail
    conventions and has been chosen over ‘*’ and ‘_’ as it looks better.

    Escaping using “‘…’” is a convention often used when discussing symbols in
    text processing and similar pursuits and is probably familiar to most
    English readers.

    Many of the existing lightweight markup languages are implemented
    using regular expressions to substitute the various constructs with their
    output format equivalent.  Thus, there’s usually only one output format and
    there’s no real formal specification for the language.  Having only one
    possible output format can be limiting, and even though this format, most
    often being HTML, can often be parsed and molded into something else, not
    having an intermediate format makes these languages hard to work with for
    external software.  Lacking a formal specification also creates problems,
    as it makes it hard to create compatible implementations.

    The NoMarks language has an intermediate XML representation, known as
    NoMarks XML, or NML for short.  NoMarks comes with a {RELAX NG}¹ grammar
    for this XML representation, which is easily malleable using XSLT².  The
    NoMarks language is described, in software, using a BNF grammar, and even
    though this grammar isn’t quite enough to implement it, as there is some
    context-sensitivity in the language, it’s fairly easy to port the grammar
    to other software that may want to build upon it.

  ¹ See the RELAX NG home page at http://relaxng.org/
  ² See XSL Transformations (XSLT) Version 1.0: http://www.w3.org/TR/xslt

§ Alternatives

    As was mentioned in the previous section on previous work, there are a lot
    of lightweight markup languages that one can use instead of NoMarks.
    NoMarks tries to find a sweet spot between ease of use and functionality.
    If you want advanced features, such as math and graph markup, AsciiDoc¹ may
    be for you.  If you want a simple format for comment entry on a blog,
    Markdown² may suit you well.

  ¹ See http://www.methods.co.nz/asciidoc/
  ² See http://daringfireball.net/projects/markdown/

    But why use a lightweight markup language at all?  Why not use Word?  Well,
    to be frank, if you’re actually asking and you’re actually expecting an
    answer, then NoMarks, or other lightweight markup languages, is probably
    not for you.  But let’s look at what makes NoMarks plus any text editor
    better suited for writing documents than Word:

  = Diffs. = Granted, Word comes with its own diffing tool, but NoMarks
      documents can easily be diffed by a version-control system and integrated
      in that kind of workflow.
  = Conversions. = Granted, Word documents can now be converted to many
      formats, but the output is often very complex and Office Open XML is
      extremely complex to process.
  = Automated changes. = Granted, Word can be programmed, so many changes can
      also be automated, but they must be done using Word.  You have to own
      Word.
  = Control. = You have absolute control over your document.  There’s not a
      piece of software between you and your words.
  = Reading. = A Word document requires a piece of software capable of
      processing an extremely complex document format, and there aren’t many of
      those around.  A NoMarks document can be read by any piece of software
      that lets you display text.

    All of these arguments more or less boil down to the fact that a Word
    document requires a specific tool to view, change, or otherwise process
    it, whereas a NoMarks document requires no such tool.

§ Installation Instructions

    The normal ‹configure && make && make install› routine applies.  There’s a
    non-standard option, ‹--enable-xml-catalog-update›, that’ll try to update
    the system’s XML catalog with information on where to find the {RELAX NG}¹
    grammar and the {XSL templates}².  All of this is irrelevant if you install
    via a package manager, as it’ll have all this covered for you.

¹ See the RELAX NG home page at http://relaxng.org/
² See XSL Transformations (XSLT) Version 1.0: http://www.w3.org/TR/xslt

§ Command-line Interface

    The NoMarks distribution comes with a command-line utility that turns
    NoMarks documents into NoMarks XML documents.  It’s called ‹nmc› and
    may be invoked as follows:

      % nmc README > README.nml

    After execution, ‹README.nml› will contain the XML representation of the
    NoMarks document ‹README›.

    This isn’t very useful by itself, so let’s convert it to an HTML5¹
    document using xsltproc²:

      % xsltproc http://disu.se/software/nml/xsl/1.0/html.xsl \
        README.nml > README.html

    We could also turn it into a manual page:

      % xsltproc http://disu.se/software/nml/xsl/1.0/man.xsl \
        README.nml > README.1

    The ‹README› isn’t really written as a manual page, so we might not want to
    do that particular transformation in reality, though.

    You can read more about the ‹nmc› command in the ‹nmc› manual page:

      % man 1 nmc

    While we’re on the topic of manual pages, a description of the NoMarks
    language also exists in manual page format, if you ever need to refer to it
    quickly:

      % man 7 nmt

¹ See HTML: The Markup Language: http://www.w3.org/TR/html-markup/
² See the XML C parser and toolkit of Gnome: http://www.xmlsoft.org/

§ XSL Templates

    The section on the command-line interface introduced the XSL templates that
    come with the NoMarks distribution.  This section expands on their use
    and is meant to be used as a reference.

  § HTML5

      The HTML5¹ template, ‹http://disu.se/software/nml/xsl/1.0/html.xsl›, is
      rather straight-forward.  Notable is that it generates ‹id› attributes
      for all sections by normalizing their titles into valid HTML5 IDs by
      turning all non-empty sequences of whitespace into single ‘-’ characters,
      removing all non-ASCII-alpha-numerics, downcasing the result, and then
      concatenating this result onto the ID of their parent section (separated
      by another ‘-’ character), if any.  This means that you should take care
      to use unique section titles among sibling sections (and to avoid
      conflict-generating combinations of different parent and child IDs).
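
      The ID normalization can be approximated in Python as follows.  This
      is a sketch of the description above, not the stylesheet’s actual
      code: the name ‹section_id› is hypothetical, and it assumes that
      surrounding whitespace is dropped first and that ‘-’ characters
      survive the removal step.

```python
import re


def section_id(title, parent_id=None):
    """Derive an ID from a section title and an optional parent ID."""
    # Turn every run of whitespace into a single '-'.
    slug = re.sub(r"\s+", "-", title.strip())
    # Drop everything but ASCII letters, digits, and '-', then downcase.
    slug = re.sub(r"[^A-Za-z0-9-]", "", slug).lower()
    if parent_id:
        return f"{parent_id}-{slug}"
    return slug
```

      With these rules, the sections “Block Structures” and its child
      “Footnotes” get the IDs ‹block-structures› and
      ‹block-structures-footnotes›, which keeps the latter apart from the
      “Footnotes” section under “Inlines”.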

      Two parameters are currently available:

    = Lang. = The source language, which defaults to “none”.
    = Stylesheet. = The CSS²⁺³ stylesheet to link to, which defaults to
        ‹style.css›.

      The HTML5 template depends on the following XSL extensions:

    •   http://exslt.org/common⁴
    •   http://exslt.org/functions⁵

      Xsltproc⁶ implements these and is known to process the stylesheet
      correctly.

      The stylesheet has been written with extension in mind.  You may
      specifically be interested in overriding the ‹nml:adjust-uri› function to
      adjust any URIs that occur in ‹image› and ‹link› elements to ones that fit
      with the location of the generated HTML5 documents.  Perusing the source
      should give you ideas as to what you might want to adjust beyond that.
      We try to keep the names of functions and templates and the general
      functionality of the stylesheet the same between versions, but the “API⁷”
      is still in flux, so things may break.

  ¹ See HTML: The Markup Language: http://www.w3.org/TR/html-markup/
  ² Abbreviation for Cascading Style Sheets
  ³ See the Cascading Style Sheets home page at
    http://www.w3.org/Style/CSS/Overview.en.html
  ⁴ See EXSLT – Common at http://www.exslt.org/exsl/
  ⁵ See EXSLT – Functions at http://www.exslt.org/func/
  ⁶ See the XML C parser and toolkit of Gnome: http://www.xmlsoft.org/
  ⁷ Abbreviation for Application Programming Interface

  § Manual Pages

      Manual pages may be generated using
      ‹http://disu.se/software/nml/xsl/1.0/man.xsl›.  It has quite a few
      parameters, of which you’ll probably only want to adjust a few:

    = Section. = The manual section that the generated manual page belongs to.
        Defaults to 1.
    = Date. = The date to use in the manual page footer.  Defaults to today’s
        date in the “DD Month, YYYY” format.
    = Source. = The source of the manual page, usually being the project’s
        name.  Defaults to the empty string.
    = Manual. = The name of the manual that this manual page belongs to.
        Defaults to “User Commands”.
    = Man:indent. = The default indent to use for indents.  Defaults to 4.
    = Man:indent.list. = The indent to use for lists.  Defaults to
        ‹man:indent›.
    = Man:indent.code. = The indent to use for code blocks.  Defaults to
        ‹man:indent›.
    = Man:indent.table. = The indent to use for tables.  Defaults to
        ‹man:indent›.
    = Man:indent.figure. = The indent to use for figures.  Defaults to
        ‹man:indent›.

      Manual page generation from NoMarks documents can be automated using
      Makefile rules.  Looking at the Automake file ‹Makefile.am› that is used
      to build this project gives us, assuming that we’ve set ‹NMC› and
      ‹XSLTPROC› to appropriate values, the following rules:

        .nmt.nml:
        	rm -f $@ $@.tmp
        	$(NMC) $< > $@.tmp
        	mv $@.tmp $@

        .nml.1:
        	rm -f $@ $@.tmp
        	$(XSLTPROC) \
        	  --stringparam source "$(PACKAGE_STRING)" \
        	  http://disu.se/software/nml/xsl/1.0/man.xsl $< > $@.tmp
        	mv $@.tmp $@

        .SUFFIXES: .nmt .nml .1

      The manual page template depends on the following XSL extensions:

    •   http://exslt.org/common¹
    •   http://exslt.org/dates-and-times²
    •   http://exslt.org/functions³
    •   http://exslt.org/strings⁴

      Xsltproc again implements these and is known to process the stylesheet
      correctly.

  ¹ See EXSLT – Common: http://www.exslt.org/exsl/
  ² See EXSLT – Dates And Times: http://www.exslt.org/dates-and-times/
  ³ See EXSLT – Functions: http://www.exslt.org/func/
  ⁴ See EXSLT – Strings: http://www.exslt.org/str/

§ Validation

    If you don’t trust that the ‹nmc› command is outputting valid NoMarks XML
    documents, you can use the {RELAX NG}¹ schema that comes with the NoMarks
    distribution.  Its URI is
    ‹http://disu.se/software/nml/relaxng/1.0/nml.rnc› and validation is known to
    work with Jing², version 20091111:

      % java -jar jing-20091111.jar \
        -c http://disu.se/software/nml/relaxng/1.0/nml.rnc \
        README.nml

¹ See the RELAX NG home page at http://relaxng.org/
² See Jing – A RELAX NG validator in Java at
  http://www.thaiopensource.com/relaxng/jing.html

§ Development Dependencies

    The ‹nmc› software is built using the GNU autotools.  Autoconf¹ 2.69 is
    required for generating ‹configure›.  Makefiles are generated with
    Automake² 1.13.1 or newer.

    The grammar is compiled using Bison³ 2.7 or newer.  A yacc-compatible
    grammar compiler isn’t enough, as quite a few Bison-specific extensions are
    used, with good reason.

    Xsltproc⁴ is used to generate the manual pages.  Version 1.1.27 is known to
    work.

    The tools that build the Unicode tables used by the library to classify
    characters for correct lexing depend upon Unicode data that Wget⁵ fetches
    from the {Unicode repository}⁶.  Any version of Wget should work; 1.14 is
    known to work.  The Unicode data currently used is version 6.2.0.

    Valgrind⁷ may optionally be used to test the source for memory-related
    errors.  Run ‹make maintainer-check-valgrind› to do so.

    The ‹configure› script supports a non-standard option:
    ‹--enable-gcc-warnings›.  When given, this option enables a lot of
    additional warnings that help ensure that the source stays both correct and
    conforming.
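
    Before bootstrapping the build, it can help to confirm that the tools
    named above are installed.  The following sketch only checks that each
    command can be found on ‹PATH›; it doesn’t compare version numbers, and
    the helper ‹nml_missing_tools› is hypothetical rather than something the
    distribution provides.

```shell
# Hypothetical helper, not shipped with NoMarks: prints every command from
# the argument list that cannot be found on PATH and returns non-zero if at
# least one is missing.  Versions are not checked.
nml_missing_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      missing=1
    fi
  done
  return "$missing"
}

nml_missing_tools autoconf automake bison xsltproc wget ||
  echo 'install the tools listed above before building' >&2
```

    Valgrind is left out of the list because it’s optional.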

¹ Autoconf: http://www.gnu.org/software/autoconf/
² Automake: http://www.gnu.org/software/automake/
³ Bison: http://www.gnu.org/software/bison/
⁴ Xsltproc: http://www.xmlsoft.org/
⁵ Wget: http://www.gnu.org/software/wget/
⁶ Unicode repository: http://www.unicode.org/Public/UNIDATA/
⁷ Valgrind: http://valgrind.org/

§ Build Dependencies

    The C source code requires a C99-compliant compiler, for example, GCC¹.

  ¹ See http://www.gnu.org/software/gcc/

§ Runtime Dependencies

    NMC has no runtime dependencies beyond the standard C runtime.  You will,
    however, want some sort of XML toolchain to turn the generated XML into
    something interesting.  Xsltproc¹ supports all the extensions that the XSL
    templates require (common, dates-and-times, functions, strings) and is
    widely available.

  ¹ See http://www.xmlsoft.org/

§ Reporting Bugs

    Please report any bugs that you encounter to the {issue tracker}¹.

  ¹ See https://github.com/now/nmc/issues

§ Authors

    Nikolai Weibull wrote the code, the tests, the manual pages, and this
    README.
