Repository: msprev/panzer Branch: master Commit: 9a45452c3d6b Files: 20 Total size: 166.3 KB Directory structure: gitextract_wzwtz714/ ├── .gitignore ├── LICENSE.txt ├── MANIFEST.in ├── README.md ├── README.rst ├── doc/ │ ├── .gitignore │ ├── README.md │ └── makefile ├── panzer/ │ ├── __init__.py │ ├── cli.py │ ├── const.py │ ├── document.py │ ├── error.py │ ├── info.py │ ├── load.py │ ├── meta.py │ ├── panzer.py │ ├── util.py │ └── version.py └── setup.py ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # python stuff *.pyc __pycache__/ /dist/ build/ venv/ /*.egg-info .ropeproject/ # vim tags tags # test stuff test/output-panzer/ test/output-pandoc/ ================================================ FILE: LICENSE.txt ================================================ Copyright (c) 2015, Mark Sprevak All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: - Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. - Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. - Neither the name of Mark Sprevak nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ================================================ FILE: MANIFEST.in ================================================ include README.rst include LICENSE.txt ================================================ FILE: README.md ================================================ ----- Development has ceased on panzer. Over the years, pandoc has gained powerful new functionality (e.g. the `--metadata-file` option and Lua filters) that means that 90% of what can be done with panzer can be done with pandoc and some simple wrapper scripts. I no longer use panzer in my own workflow for this reason. If you would like to take over development of panzer, let me know. ----- # panzer panzer adds *styles* to [pandoc](http://johnmacfarlane.net/pandoc/index.html). Styles provide a way to set all options for a pandoc document with one line (‘I want this document be an article/CV/notes/letter’). You can think of styles as a level up in abstraction from a pandoc template. Styles are combinations of templates, metadata settings, pandoc command line options, and instructions to run filters, scripts and postprocessors. These settings can be customised on a per writer and per document basis. Styles can be combined and can bear inheritance relations to each other. panzer exposes a large amount of structured information to the external processes called by styles, allowing those processes to be both more powerful and themselves controllable via metadata (and hence also by styles). 
Styles simplify makefiles, bundling everything related to the look of the document in one place. You can think of panzer as an exoskeleton that sits around pandoc and configures pandoc based on a single choice in your document. To use a style, add a field with your style name to the yaml metadata block of your document: ``` yaml style: Notes ``` Multiple styles can be supplied as a list: ``` yaml style: - Notes - BoldHeadings ``` Styles are defined in a yaml file ([example](https://github.com/msprev/dot-panzer/blob/master/styles/styles.yaml)). The style definition file, plus associated executables, are placed in the `.panzer` directory in the user’s home folder ([example](https://github.com/msprev/dot-panzer)). A style can also be defined inside the document’s metadata block: ``` yaml --- style: Notes styledef: Notes: all: metadata: numbersections: false latex: metadata: numbersections: true fontsize: 12pt commandline: columns: "`75`" lua-filter: - run: macroexpand.lua filter: - run: deemph.py ... ``` Style settings can be overridden by adding the appropriate field outside a style definition in the document’s metadata block: ``` yaml --- style: Notes numbersections: true filter: - run: smallcaps.py commandline: - pdf-engine: "`xelatex`" ... ``` # Installation ``` bash pip3 install git+https://github.com/msprev/panzer ``` *Requirements:* - [pandoc](http://johnmacfarlane.net/pandoc/index.html) \> 2.0 - [Python 3](https://www.python.org/downloads/) - [pip](https://pip.pypa.io/en/stable/index.html) (included in most Python 3 distributions) *To upgrade existing installation:* ``` bash pip3 install --upgrade git+https://github.com/msprev/panzer ``` On Arch Linux systems, the AUR package [panzer-git](https://aur.archlinux.org/packages/panzer-git/) can be used. ## Troubleshooting An [issue](https://github.com/msprev/panzer/issues/20) has been reported using pip to install on Windows. If the method above does not work, use the alternative install method below. 
``` git clone https://github.com/msprev/panzer cd panzer python3 setup.py install ``` *To upgrade existing installation:* ``` cd /path/to/panzer/directory/cloned git pull python3 setup.py install --force ``` # Use Run `panzer` on your document as you would `pandoc`. If the document lacks a `style` field, this is equivalent to running `pandoc`. If the document has a `style` field, panzer will invoke pandoc plus any associated scripts, filters, and populate the appropriate metadata fields. `panzer` accepts the same command line options as `pandoc`. These options are passed to the underlying instance of pandoc. pandoc command line options can also be set via metadata. panzer has additional command line options. These are prefixed by triple dashes (`---`). Run the command `panzer -h` to see them: ``` -h, --help, ---help, ---h show this help message and exit -v, --version, ---version, ---v show program's version number and exit ---quiet only print errors and warnings ---strict exit on first error ---panzer-support PANZER_SUPPORT panzer user data directory ---pandoc PANDOC pandoc executable ---debug DEBUG filename to write .log and .json debug files ``` Panzer expects all input and output to be utf-8. 
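As a rough illustration of how panzer's triple-dash options sit alongside ordinary pandoc options, here is a minimal sketch that assembles an argument list for a panzer call. The helper `build_panzer_argv` and its parameters are hypothetical and not part of panzer; only the option names (`---quiet`, `---strict`, `---panzer-support`, and pandoc's `--output`) come from the text above.

```python
from typing import List, Optional


def build_panzer_argv(
    source: str,
    output: str,
    *,
    quiet: bool = False,
    strict: bool = False,
    support_dir: Optional[str] = None,
    pandoc_args: Optional[List[str]] = None,
) -> List[str]:
    """Assemble an argument list for a panzer call (hypothetical helper)."""
    # pandoc options are written exactly as they would be for pandoc itself
    argv = ["panzer", source, "--output", output]
    argv += list(pandoc_args or [])
    # panzer's own options are distinguished by their triple-dash prefix
    if quiet:
        argv.append("---quiet")
    if strict:
        argv.append("---strict")
    if support_dir is not None:
        argv += ["---panzer-support", support_dir]
    return argv
```

Assuming `panzer` is on the PATH, the returned list could be passed to `subprocess.run`.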
# Style definition A style definition may consist of: | field | value | value type | | :------------ | :--------------------------------- | :---------------------------- | | `parent` | parent(s) of style | `MetaList` or `MetaInlines` | | `metadata` | default metadata fields | `MetaMap` | | `commandline` | pandoc command line options | `MetaMap` | | `template` | pandoc template | `MetaInlines` or `MetaString` | | `preflight` | run before input doc is processed | `MetaList` | | `filter` | pandoc filters | `MetaList` | | `lua-filter` | pandoc lua filters | `MetaList` | | `postprocess` | run on pandoc’s output | `MetaList` | | `postflight` | run after output file written | `MetaList` | | `cleanup` | run on exit irrespective of errors | `MetaList` | Style definitions are hierarchically structured by *name* and *writer*. Style names by convention should be MixedCase (`MyNotes`) to avoid confusion with other metadata fields. Writer names are the same as those of the relevant pandoc writer (e.g. `latex`, `html`, `docx`, etc.) A special writer, `all`, matches every writer. - `parent` takes a list or single style. Children inherit the properties of their parents. Children may have multiple parents. - `metadata` contains default metadata set by the style. Any metadata field that can appear in a pandoc document can appear here. - `commandline` specifies pandoc’s command line options. - `template` is a pandoc [template](http://johnmacfarlane.net/pandoc/demo/example9/templates.html) for the style. - `preflight` lists executables run before the document is processed. These are run after panzer reads the input, but before that input is sent to pandoc. - `filter` lists pandoc [json filters](http://johnmacfarlane.net/pandoc/scripting.html). Filters gain two new properties from panzer. For more info, see section on [compatibility](#compatibility) with pandoc. - `lua-filter` lists pandoc [lua filters](https://pandoc.org/lua-filters.html). 
- `postprocess` lists executables to pipe pandoc’s output through. Standard unix executables (`sed`, `tr`, etc.) are examples of possible use. Postprocessors are skipped if a binary writer (e.g. `docx`) is used.
- `postflight` lists executables run after the output has been written. If output is stdout, postflight scripts are run after stdout has been flushed.
- `cleanup` lists executables run before panzer exits and after postflight scripts. Cleanup scripts run irrespective of whether an error has occurred earlier.

Example:

``` yaml
Notes:
    all:
        metadata:
            numbersections: false
    latex:
        metadata:
            numbersections: true
            fontsize: 12pt
        commandline:
            wrap: preserve
        filter:
            - run: deemph.py
        postflight:
            - run: latexmk.py
```

If panzer were run on the following document with the latex writer selected,

``` yaml
---
title: "My document"
style: Notes
...
```

it would run pandoc with filter `deemph.py` and command line option `--wrap=preserve` on the following and then execute `latexmk.py`.

``` yaml
---
title: "My document"
numbersections: true
fontsize: 12pt
...
```

## Style overriding

Styles may be defined:

- ‘Globally’ in `.yaml` files in `.panzer/styles/`
- ‘Locally’ in `.yaml` files in the current working directory (`./styles/`)
- ‘In document’ inside a `styledef` field in the document’s yaml metadata block

If no `.panzer/styles/` directory is found, panzer will look for global style definitions in `.panzer/styles.yaml` if it exists.

If no `./styles/` directory is found in the current working directory, panzer will look for local style definitions in `./styles.yaml` if it exists.
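The directory-then-file fallback described above can be sketched as follows. This is only an illustration, not panzer's own code: the function name `find_style_files` is hypothetical, and the `*.yaml` glob and the sorted order are assumptions.

```python
from pathlib import Path
from typing import List


def find_style_files(base: Path) -> List[Path]:
    """Return style definition files under `base` (hypothetical helper).

    Prefers `<base>/styles/*.yaml`; falls back to `<base>/styles.yaml`
    only when no `styles/` directory exists.
    """
    styles_dir = base / "styles"
    if styles_dir.is_dir():
        # assumption: files are taken in sorted order
        return sorted(styles_dir.glob("*.yaml"))
    fallback = base / "styles.yaml"
    return [fallback] if fallback.is_file() else []
```

Under these assumptions, global definitions would come from `find_style_files(Path.home() / ".panzer")` and local ones from `find_style_files(Path.cwd())`.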
Overriding among style settings is determined by the following rules:

| \# | overriding rule                                                    |
| :- | :----------------------------------------------------------------- |
| 1  | Local style definitions override global style definitions          |
| 2  | In document style definitions override local style definitions     |
| 3  | Writer-specific settings override settings for `all`               |
| 4  | In a list, later styles override earlier ones                      |
| 5  | Children override parents                                          |
| 6  | Fields set outside a style definition override any style’s setting |

For fields that pertain to scripts/filters, overriding is *additive*; for other fields, it is *non-additive*:

- For `metadata`, `template`, and `commandline`, if one style overrides another (say, a parent and child set `numbersections` to different values), then inheritance is non-additive, and only one (the child) wins.
- For `preflight`, `lua-filter`, `filter`, `postflight` and `cleanup`, if one style overrides another, then the ‘winner’ adds its items after those of the ‘loser’. For example, if the parent adds to `postflight` an item `- run: latexmk.py`, and the child adds `- run: printlog.py`, then `printlog.py` will be run after `latexmk.py`.
- To remove an item from an additive list, add it as the value of a `kill` field: for example, `- kill: latexmk.py`.

Arguments passed to panzer directly on the command line trump any style settings, and cannot be overridden by any metadata setting. Filters specified on the command line (via `--filter` and `--lua-filter`) are run first, and cannot be removed. All lua filters are run after json filters.

pandoc options set via panzer’s command line invocation override any set via `commandline`.

Multiple input files are joined according to pandoc’s rules. Metadata are merged using left-biased union. This means that overriding behaviour when merging multiple input files differs from that of panzer: it is always non-additive.
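The difference between additive and non-additive overriding among styles can be sketched as below. This illustrates the rules as stated and is not panzer's implementation: the name `override` is hypothetical, style definitions are modelled as plain dictionaries, and merging `metadata` key by key (so that the loser's untouched keys survive) is an assumption.

```python
# Fields whose items accumulate when one style overrides another
ADDITIVE_FIELDS = ("preflight", "lua-filter", "filter", "postflight", "cleanup")


def override(loser: dict, winner: dict) -> dict:
    """Combine two style definitions, with `winner` overriding `loser`."""
    merged = dict(loser)
    for field, value in winner.items():
        if field in ADDITIVE_FIELDS:
            # additive: the winner's items are placed after the loser's
            merged[field] = list(loser.get(field, [])) + list(value)
        elif isinstance(value, dict):
            # non-additive per key (assumption): the winner's value replaces
            # the loser's for each key they share, e.g. `numbersections`
            merged[field] = {**loser.get(field, {}), **value}
        else:
            # non-additive: e.g. `template` is simply replaced
            merged[field] = value
    return merged
```

With a parent that sets `postflight` to `[{"run": "latexmk.py"}]` and a child that sets it to `[{"run": "printlog.py"}]`, `override(parent, child)` lists `latexmk.py` first and `printlog.py` second, matching the example in the text. The sketch ignores `kill` items and the appending of repeated key-value options in `commandline`.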
If fed input from stdin, panzer buffers this to a temporary file in the current working directory before proceeding. This is required to allow preflight scripts to access the data. The temporary file is removed when panzer exits. ## The run list Executables (scripts, filters, postprocessors) are specified by a list (the ‘run list’). The list determines what gets run when. Processes are executed from first to last in the run list. If an item appears as the value of a `run:` field, then it is added to the run list. If an item appears as the value of a `kill:` field, then any previous occurrence is removed from the run list. Killing an item does not prevent it from being added later. A run list can be completely emptied by adding the special item `- killall: true`. Arguments can be passed to executables by listing them as the value of the `args` field of that item. The value of the `args` field is passed as the command line options to the external process. This value of `args` should be a quoted inline code span (e.g. ``"`--options`"``) to prevent the parser interpreting it as markdown. Note that json filters always receive the writer name as their first argument. Lua filters cannot take arguments and the contents of their `args` field is ignored. Example: ``` yaml - filter: - run: setbaseheader.py args: "`--level=2`" - postprocess: - run: sed args: "`-e 's/hello/goodbye/g'`" - postflight: - kill: open_pdf.py - cleanup: - killall: true ``` The filter `setbaseheader.py` receives the writer name as its first argument and `--level=2` as its second argument. When panzer is searching for a filter `foo.py`, it will look for: | \# | look for | | :- | :---------------------------------------------- | | 1 | `./foo.py` | | 2 | `./filter/foo.py` | | 3 | `./filter/foo/foo.py` | | 4 | `~/.panzer/filter/foo.py` | | 5 | `~/.panzer/filter/foo/foo.py` | | 6 | `foo.py` in PATH defined by current environment | Similar rules apply to other executables and to templates. 
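The run list semantics and the search order above can be sketched in a few lines. Both functions are hypothetical illustrations rather than panzer's own code; run list items are modelled as plain dictionaries, and the final PATH lookup (row 6 of the table) is left out.

```python
from pathlib import Path
from typing import List


def apply_run_list(items: List[dict]) -> List[dict]:
    """Resolve `run`, `kill` and `killall` items into a final run list."""
    run_list: List[dict] = []
    for item in items:
        if item.get("killall"):
            # `- killall: true` empties the list built so far
            run_list = []
        elif "kill" in item:
            # remove any previous occurrence; later re-adding is still allowed
            run_list = [e for e in run_list if e["run"] != item["kill"]]
        elif "run" in item:
            run_list.append(item)
    return run_list


def candidate_paths(name: str, kind: str = "filter",
                    support_dir: str = "~/.panzer") -> List[str]:
    """List the locations checked for an executable, in order (rows 1 to 5)."""
    stem = Path(name).stem  # "foo.py" -> "foo", the named subdirectory
    return [
        f"./{name}",
        f"./{kind}/{name}",
        f"./{kind}/{stem}/{name}",
        f"{support_dir}/{kind}/{name}",
        f"{support_dir}/{kind}/{stem}/{name}",
    ]
```

For example, killing `a.py` and then running it again leaves `a.py` at the end of the run list, which reflects the rule that killing an item does not prevent it from being added later.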
The typical structure for the support directory `.panzer` is:

    .panzer/
        cleanup/
        filter/
        lua-filter/
        postflight/
        postprocess/
        preflight/
        template/
        shared/
        styles/

Within each directory, each executable may have a named subdirectory:

    postflight/
        latexmk/
            latexmk.py

## Pandoc command line options

Arbitrary pandoc command line options can be set using metadata via `commandline`. `commandline` can appear outside a style definition and in a document’s metadata block, where it overrides the settings of any style.

`commandline` contains one field for each pandoc command line option. The field name is the unabbreviated name of the relevant pandoc command line option (e.g. `standalone`).

- For pandoc flags, the value should be boolean (`true`, `false`), e.g. `standalone: true`.
- For pandoc key-values, the value should be a quoted inline code span, e.g. ``include-in-header: "`path/to/my/header`"``.
- For pandoc repeated key-values, the value should be a list of inline code spans, e.g.

    ``` yaml
    commandline:
        include-in-header:
            - "`file1.txt`"
            - "`file2.txt`"
            - "`file3.txt`"
    ```

Repeated key-value options in `commandline` are added after any provided from the command line. Overriding styles append to repeated key-value lists of the styles that they override.

`false` plays a special role. `false` means that the pandoc command line option with the field’s name, if set, should be unset. `false` can be used for both flags and key-value options (e.g. `include-in-header: false`).

Example:

``` yaml
commandline:
    standalone: true
    slide-level: "`3`"
    number-sections: false
    include-in-header: false
```

This passes the options `--standalone --slide-level=3` to pandoc and removes any `--number-sections` and `--include-in-header=...` options.
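The handling of flags, key-values, repeated key-values and `false` can be sketched as below. This is a simplified model, not panzer's implementation: the function names are hypothetical, values are plain Python objects rather than parsed inline code spans, and `base` stands for the `commandline` settings of a style that is being overridden.

```python
from typing import Dict, List, Union

Value = Union[bool, str, List[str]]


def apply_commandline(base: Dict[str, Value],
                      override: Dict[str, Value]) -> Dict[str, Value]:
    """Apply an overriding `commandline` mapping on top of `base`."""
    result = dict(base)
    for name, value in override.items():
        if value is False:
            # `false` unsets the option, whether flag or key-value
            result.pop(name, None)
        elif isinstance(value, list):
            # repeated key-values: append to the overridden style's list
            result[name] = list(result.get(name, [])) + list(value)
        else:
            result[name] = value
    return result


def render_args(options: Dict[str, Value]) -> List[str]:
    """Turn the resolved mapping into pandoc command line arguments."""
    args: List[str] = []
    for name, value in options.items():
        if value is True:
            args.append(f"--{name}")
        elif isinstance(value, list):
            args.extend(f"--{name}={v}" for v in value)
        else:
            args.append(f"--{name}={value}")
    return args
```

Applied to the example above, with a base that had set `number-sections` and `include-in-header`, the result renders as `--standalone --slide-level=3`.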
These pandoc command line options cannot be set via `commandline`: - `bash-completion` - `dump-args` - `filter` - `from` - `help` - `ignore-args` - `list-extensions` - `list-highlight-languages` - `list-highlight-styles` - `list-input-formats` - `list-output-formats` - `lua-filter` - `metadata` - `output` - `print-default-data-file` - `print-default-template` - `print-highlight-style` - `read` - `template` - `to` - `variable` - `version` - `write` # Passing messages to external processes External processes have just as much information as panzer does. panzer sends its information to external processes via a json message. This message is sent as a string over stdin to scripts (preflight, postflight, cleanup scripts). It is stored inside a `CodeBlock` of the AST for filters. Note that filters need to parse the `panzer_reserved` field and deserialise the contents of its `CodeBlock` to recover the json message. Some relevant discussion is [here](https://github.com/msprev/panzer/issues/38#issuecomment-367664291). Postprocessors do not receive a json message (if you need it, you should probably be using a filter). JSON_MESSAGE = [{'metadata': METADATA, 'template': TEMPLATE, 'style': STYLE, 'stylefull': STYLEFULL, 'styledef': STYLEDEF, 'runlist': RUNLIST, 'options': OPTIONS}] - `METADATA` is a copy of the metadata branch of the document’s AST (useful for scripts, not useful for filters) - `TEMPLATE` is a string with path to the current template - `STYLE` is a list of current style(s) - `STYLEFULL` is a list of current style(s) including all parents, grandparents, etc. in order of application - `STYLEDEF` is a copy of all style definitions employed in document - `RUNLIST` is a list of processes in the run list; it has the following structure: RUNLIST = [{'kind': 'preflight'|'filter'|'lua-filter'|'postprocess'|'postflight'|'cleanup', 'command': 'my command', 'arguments': ['argument1', 'argument2', ...], 'status': 'queued'|'running'|'failed'|'done' }, ... ... 
] - `OPTIONS` is a dictionary containing panzer’s and pandoc’s command line options: ``` python OPTIONS = { 'panzer': { 'panzer_support': const.DEFAULT_SUPPORT_DIR, 'pandoc': 'pandoc', 'debug': str(), 'quiet': False, 'strict': False, 'stdin_temp_file': str() # tempfile used to buffer stdin }, 'pandoc': { 'input': list(), # list of input files 'output': '-', # output file; '-' is stdout 'pdf_output': False, # if pandoc will write a .pdf 'read': str(), # reader 'write': str(), # writer 'options': {'r': dict(), 'w': dict()} } } ``` `options` contains the command line options with which pandoc is called. It consists of two separate dictionaries. The dictionary under the `'r'` key contains all pandoc options pertaining to reading the source documents to the AST. The dictionary under the `'w'` key contains all pandoc options pertaining to writing the AST to the output document. Scripts read the json message above by deserialising json input on stdin. Filters can read the json message by reading the metadata field, `panzer_reserved`, stored as a raw code block in the AST, and deserialising the string `JSON_MESSAGE_STR` to recover the json: panzer_reserved: json_message: | ``` {.json} JSON_MESSAGE_STR ``` # Receiving messages from external processes panzer captures stderr output from all executables. This is for pretty printing of info and errors. Scripts and filters should send json messages to panzer via stderr. If a message is sent to stderr that is not correctly formatted, panzer will print it verbatim prefixed by a ‘\!’. The json message that panzer expects is a newline-separated sequence of utf-8 encoded json dictionaries, each with the following structure: { 'level': LEVEL, 'message': MESSAGE } - `LEVEL` is a string that sets the error level; it can take one of the following values: ``` 'CRITICAL' 'ERROR' 'WARNING' 'INFO' 'DEBUG' 'NOTSET' ``` - `MESSAGE` is a string with your message # Compatibility panzer accepts pandoc filters. 
panzer allows filters to behave in two new ways:

1. Json filters can take more than one command line argument (first argument still reserved for the writer).
2. A `panzer_reserved` field is added to the AST metadata branch with goodies for filters to mine.

For pandoc, json filters and lua-filters are applied in the order specified by the respective occurrences of `--filter` and `--lua-filter` on the command line. This behaviour is not entirely supported in panzer. Instead, all json filters are applied first, in the order specified on the command line and in the style definition (command line filters are applied first and are unkillable). Then the lua-filters are applied, also in the order specified on the command line and by the style definition (command line filters are applied first and are unkillable). The reasons for the divergence from pandoc’s behaviour are complex but derive mainly from performance benefits.

The following pandoc command line options cannot be used with panzer:

- `--bash-completion`
- `--dump-args`
- `--ignore-args`
- `--list-extensions`
- `--list-highlight-languages`
- `--list-highlight-styles`
- `--list-input-formats`
- `--list-output-formats`
- `--print-default-template`, `-D`
- `--print-default-data-file`
- `--version`, `-v`
- `--help`, `-h`

The following metadata fields are reserved for use by panzer:

- `styledef`
- `style`
- `template`
- `preflight`
- `filter`
- `lua-filter`
- `postflight`
- `postprocess`
- `cleanup`
- `commandline`
- `panzer_reserved`
- `read`

The writer name `all` is also occupied.

# Known issues

Pull requests welcome:

- Slower than I would like (calls to subprocess slow in Python)
- Calls to subprocesses (scripts, filters, etc.) block ui
- [Possible issue under Windows](https://github.com/msprev/panzer/pull/9), so far reported by only one user. A leading dot plus slash is required on filter filenames. Rather than having `- run: foo.bar`, on Windows one needs to have `- run: ./foo.bar`. More information on this is welcome.
I am happy to fix compatibility problems under Windows. # FAQ 1. Why do I get the error `[Errno 13] Permission denied`? Filters and scripts must be executable. Vanilla pandoc allows filters to be run without their executable permission set. panzer does not allow this. The solution: set the executable permission of your filter or script, `chmod +x myfilter_name.py` For more, see [here](https://github.com/msprev/panzer/issues/22). 2. Does panzer expand `~` or `*` inside field of a style definition? panzer does not do any shell expansion/globbing inside a style definition. The reason is described [here](https://github.com/msprev/panzer/issues/23). TL;DR: expansion and globbing are messy and not something that panzer is in a position to do correctly or predictably inside a style definition. You need to use the full path to reference your home directory inside a style definition. # Similar - - - - # Release notes - 1.4.1 (22 February 2018): - improved support of lua filters thanks to feedback from [jzeneto](https://github.com/jzeneto) - 1.4 (20 February 2018): - support added for lua filters - 1.3.1 (18 December 2017): - updated for pandoc 2.0.5 [\#35](https://github.com/msprev/panzer/issues/34). Support for all changes to command line interface and `pptx` writer. - 1.3 (7 November 2017): - updated for pandoc 2.0 [\#31](https://github.com/msprev/panzer/issues/31). Please note that this version of panzer *breaks compatibility with versions of pandoc earlier than 2.0*. Please upgrade to a version of pandoc \>2.0. Versions of pandoc prior to 2.0 will no longer be supported in future releases of panzer. - 1.2 (12 January 2017): - fixed issue introduced by breaking change in panzer 1.1 [\#27](https://github.com/msprev/panzer/issues/27). Added panzer compatibility mode for pandoc versions \<1.18. All version of pandoc \>1.12.1 should work with panzer now. 
- 1.1 (27 October 2016): - breaking change: support pandoc 1.18’s new api; earlier versions of pandoc will not work - 1.0 (21 July 2015): - new: `---strict` panzer command line option: [\#10](https://github.com/msprev/panzer/issues/10) - new: `commandline` allows repeated options using lists: [\#3](https://github.com/msprev/panzer/issues/3) - new: `commandline` lists behave as additive in style inheritance: [\#6](https://github.com/msprev/panzer/issues/6) - new: support multiple yaml style definition files: [\#4](https://github.com/msprev/panzer/issues/4) - new: support local yaml style definition files: [\#4](https://github.com/msprev/panzer/issues/4) - new: simplify format for panzer’s json message: [ce2a12](https://github.com/msprev/panzer/commit/f3a6cc28b78957827cb572e254977c2344ce2a12) - new: reproduce pandoc’s reader depending on writer settings: [\#1](https://github.com/msprev/panzer/issues/1), [\#7](https://github.com/msprev/panzer/issues/7) - fix: refactor `commandline` implementation: [\#1](https://github.com/msprev/panzer/issues/1) - fix: improve documentation: [\#2](https://github.com/msprev/panzer/issues/2) - fix: unicode error in `setup.py`: [\#12](https://github.com/msprev/panzer/issues/12) - fix: support yaml style definition files without closing empty line: [\#13](https://github.com/msprev/panzer/issues/13) - fix: add `.gitignore` files to repository: [PR\#1](https://github.com/msprev/panzer/pull/9) - 1.0b2 (23 May 2015): - new: `commandline` - set arbitrary pandoc command line options via metadata - 1.0b1 (14 May 2015): - initial release ================================================ FILE: README.rst ================================================ ================= panzer user guide ================= :Author: Mark Sprevak :Date: 6 November 2018 -------------- Development has ceased on panzer. Over the years, pandoc has gained powerful new functionality (e.g. 
the ``--metadata-file`` option and Lua filters) that means that 90% of what can be done with panzer can be done with pandoc and some simple wrapper scripts. I no longer use panzer in my own workflow for this reason. If you would like to take over development of panzer, let me know. -------------- panzer ====== panzer adds *styles* to `pandoc `__. Styles provide a way to set all options for a pandoc document with one line (‘I want this document be an article/CV/notes/letter’). You can think of styles as a level up in abstraction from a pandoc template. Styles are combinations of templates, metadata settings, pandoc command line options, and instructions to run filters, scripts and postprocessors. These settings can be customised on a per writer and per document basis. Styles can be combined and can bear inheritance relations to each other. panzer exposes a large amount of structured information to the external processes called by styles, allowing those processes to be both more powerful and themselves controllable via metadata (and hence also by styles). Styles simplify makefiles, bundling everything related to the look of the document in one place. You can think of panzer as an exoskeleton that sits around pandoc and configures pandoc based on a single choice in your document. To use a style, add a field with your style name to the yaml metadata block of your document: .. code:: yaml style: Notes Multiple styles can be supplied as a list: .. code:: yaml style: - Notes - BoldHeadings Styles are defined in a yaml file (`example `__). The style definition file, plus associated executables, are placed in the ``.panzer`` directory in the user’s home folder (`example `__). A style can also be defined inside the document’s metadata block: .. code:: yaml --- style: Notes styledef: Notes: all: metadata: numbersections: false latex: metadata: numbersections: true fontsize: 12pt commandline: columns: "`75`" lua-filter: - run: macroexpand.lua filter: - run: deemph.py ... 
Style settings can be overridden by adding the appropriate field outside a style definition in the document’s metadata block: .. code:: yaml --- style: Notes numbersections: true filter: - run: smallcaps.py commandline: - pdf-engine: "`xelatex`" ... Installation ============ .. code:: bash pip3 install git+https://github.com/msprev/panzer *Requirements:* - `pandoc `__ > 2.0 - `Python 3 `__ - `pip `__ (included in most Python 3 distributions) *To upgrade existing installation:* .. code:: bash pip3 install --upgrade git+https://github.com/msprev/panzer On Arch Linux systems, the AUR package `panzer-git `__ can be used. Troubleshooting --------------- An `issue `__ has been reported using pip to install on Windows. If the method above does not work, use the alternative install method below. :: git clone https://github.com/msprev/panzer cd panzer python3 setup.py install *To upgrade existing installation:* :: cd /path/to/panzer/directory/cloned git pull python3 setup.py install --force Use === Run ``panzer`` on your document as you would ``pandoc``. If the document lacks a ``style`` field, this is equivalent to running ``pandoc``. If the document has a ``style`` field, panzer will invoke pandoc plus any associated scripts, filters, and populate the appropriate metadata fields. ``panzer`` accepts the same command line options as ``pandoc``. These options are passed to the underlying instance of pandoc. pandoc command line options can also be set via metadata. panzer has additional command line options. These are prefixed by triple dashes (``---``). 
Run the command ``panzer -h`` to see them: :: -h, --help, ---help, ---h show this help message and exit -v, --version, ---version, ---v show program's version number and exit ---quiet only print errors and warnings ---strict exit on first error ---panzer-support PANZER_SUPPORT panzer user data directory ---pandoc PANDOC pandoc executable ---debug DEBUG filename to write .log and .json debug files Panzer expects all input and output to be utf-8. Style definition ================ A style definition may consist of: =============== ================================== ================================= field value value type =============== ================================== ================================= ``parent`` parent(s) of style ``MetaList`` or ``MetaInlines`` ``metadata`` default metadata fields ``MetaMap`` ``commandline`` pandoc command line options ``MetaMap`` ``template`` pandoc template ``MetaInlines`` or ``MetaString`` ``preflight`` run before input doc is processed ``MetaList`` ``filter`` pandoc filters ``MetaList`` ``lua-filter`` pandoc lua filters ``MetaList`` ``postprocess`` run on pandoc’s output ``MetaList`` ``postflight`` run after output file written ``MetaList`` ``cleanup`` run on exit irrespective of errors ``MetaList`` =============== ================================== ================================= Style definitions are hierarchically structured by *name* and *writer*. Style names by convention should be MixedCase (``MyNotes``) to avoid confusion with other metadata fields. Writer names are the same as those of the relevant pandoc writer (e.g. ``latex``, ``html``, ``docx``, etc.) A special writer, ``all``, matches every writer. - ``parent`` takes a list or single style. Children inherit the properties of their parents. Children may have multiple parents. - ``metadata`` contains default metadata set by the style. Any metadata field that can appear in a pandoc document can appear here. - ``commandline`` specifies pandoc’s command line options. 
- ``template`` is a pandoc `template `__ for the style. - ``preflight`` lists executables run before the document is processed. These are run after panzer reads the input, but before that input is sent to pandoc. - ``filter`` lists pandoc `json filters `__. Filters gain two new properties from panzer. For more info, see section on `compatibility <#compatibility>`__ with pandoc. - ``lua-filter`` lists pandoc `lua filters `__. - ``postprocessor`` lists executable to pipe pandoc’s output through. Standard unix executables (``sed``, ``tr``, etc.) are examples of possible use. Postprocessors are skipped if a binary writer (e.g. ``docx``) is used. - ``postflight`` lists executables run after the output has been written. If output is stdout, postflight scripts are run after stdout has been flushed. - ``cleanup`` lists executables run before panzer exits and after postflight scripts. Cleanup scripts run irrespective of whether an error has occurred earlier. Example: .. code:: yaml Notes: all: metadata: numbersections: false latex: metadata: numbersections: true fontsize: 12pt commandline: wrap: preserve filter: - run: deemph.py postflight: - run: latexmk.py If panzer were run on the following document with the latex writer selected, .. code:: yaml --- title: "My document" style: Notes ... it would run pandoc with filter ``deemph.py`` and command line option ``--wrap=preserve`` on the following and then execute ``latexmk.py``. .. code:: yaml --- title: "My document" numbersections: true fontsize: 12pt ... Style overriding ---------------- Styles may be defined: - ‘Globally’ in ``.yaml`` files in ``.panzer/styles/`` - ‘Locally’ in ``.yaml`` files in the current working directory ``./styles/``) - ‘In document’ inside a ``styledef`` field in the document’s yaml metadata block If no ``.panzer/styles/`` directory is found, panzer will look for global style definitions in ``.panzer/styles.yaml`` if it exists. 
If no ``./styles/`` directory is found in the current working directory, panzer will look for local style definitions in ``./styles.yaml`` if it exists.

Overriding among style settings is determined by the following rules:

= ==================================================================
# overriding rule
= ==================================================================
1 Local style definitions override global style definitions
2 In document style definitions override local style definitions
3 Writer-specific settings override settings for ``all``
4 In a list, later styles override earlier ones
5 Children override parents
6 Fields set outside a style definition override any style’s setting
= ==================================================================

For fields that pertain to scripts/filters, overriding is *additive*; for other fields, it is *non-additive*:

- For ``metadata``, ``template``, and ``commandline``, if one style overrides another (say, a parent and child set ``numbersections`` to different values), then inheritance is non-additive, and only one (the child) wins.
- For ``preflight``, ``lua-filter``, ``filter``, ``postflight`` and ``cleanup``, if one style overrides another, then the ‘winner’ adds its items after those of the ‘loser’. For example, if the parent adds to ``postflight`` an item ``- run: latexmk.py``, and the child adds ``- run: printlog.py``, then ``printlog.py`` will be run after ``latexmk.py``.
- To remove an item from an additive list, add it as the value of a ``kill`` field: for example, ``- kill: latexmk.py``

Arguments passed to panzer directly on the command line trump any style settings, and cannot be overridden by any metadata setting. Filters specified on the command line (via ``--filter`` and ``--lua-filter``) are run first, and cannot be removed. All lua filters are run after json filters. pandoc options set via panzer’s command line invocation override any set via ``commandline``.
Multiple input files are joined according to pandoc’s rules. Metadata are merged using left-biased union. This means overriding behaviour when merging multiple input files is different from that of panzer, and always non-additive. If fed input from stdin, panzer buffers this to a temporary file in the current working directory before proceeding. This is required to allow preflight scripts to access the data. The temporary file is removed when panzer exits. The run list ------------ Executables (scripts, filters, postprocessors) are specified by a list (the ‘run list’). The list determines what gets run when. Processes are executed from first to last in the run list. If an item appears as the value of a ``run:`` field, then it is added to the run list. If an item appears as the value of a ``kill:`` field, then any previous occurrence is removed from the run list. Killing an item does not prevent it from being added later. A run list can be completely emptied by adding the special item ``- killall: true``. Arguments can be passed to executables by listing them as the value of the ``args`` field of that item. The value of the ``args`` field is passed as the command line options to the external process. This value of ``args`` should be a quoted inline code span (e.g. :literal:`"`--options`"`) to prevent the parser interpreting it as markdown. Note that json filters always receive the writer name as their first argument. Lua filters cannot take arguments and the contents of their ``args`` field is ignored. Example: .. code:: yaml - filter: - run: setbaseheader.py args: "`--level=2`" - postprocess: - run: sed args: "`-e 's/hello/goodbye/g'`" - postflight: - kill: open_pdf.py - cleanup: - killall: true The filter ``setbaseheader.py`` receives the writer name as its first argument and ``--level=2`` as its second argument. 
When panzer is searching for a filter ``foo.py``, it will look for:

= =================================================
# look for
= =================================================
1 ``./foo.py``
2 ``./filter/foo.py``
3 ``./filter/foo/foo.py``
4 ``~/.panzer/filter/foo.py``
5 ``~/.panzer/filter/foo/foo.py``
6 ``foo.py`` in PATH defined by current environment
= =================================================

Similar rules apply to other executables and to templates.

The typical structure for the support directory ``.panzer`` is:

::

   .panzer/
       cleanup/
       filter/
       lua-filter/
       postflight/
       postprocess/
       preflight/
       template/
       shared/
       styles/

Within each directory, each executable may have a named subdirectory:

::

   postflight/
       latexmk/
           latexmk.py

Pandoc command line options
---------------------------

Arbitrary pandoc command line options can be set using metadata via ``commandline``. ``commandline`` can appear outside a style definition and in a document’s metadata block, where it overrides the settings of any style.

``commandline`` contains one field for each pandoc command line option. The field name is the unabbreviated name of the relevant pandoc command line option (e.g. ``standalone``).

- For pandoc flags, the value should be boolean (``true``, ``false``), e.g. \ ``standalone: true``.
- For pandoc key-values, the value should be a quoted inline code span, e.g. \ :literal:`include-in-header: "`path/to/my/header`"`.
- For pandoc repeated key-values, the value should be a list of inline code spans, e.g.

.. code:: yaml

   commandline:
       include-in-header:
           - "`file1.txt`"
           - "`file2.txt`"
           - "`file3.txt`"

Repeated key-value options in ``commandline`` are added after any provided from the command line. Overriding styles append to repeated key-value lists of the styles that they override.

``false`` plays a special role. ``false`` means that the pandoc command line option with the field’s name, if set, should be unset. ``false`` can be used for both flags and key-value options (e.g.
``include-in-header: false``). Example: .. code:: yaml commandline: standalone: true slide-level: "`3`" number-sections: false include-in-header: false This passes the following options to pandoc ``--standalone --slide-level=3`` and removes any ``--number-sections`` and ``--include-in-header=...`` options. These pandoc command line options cannot be set via ``commandline``: - ``bash-completion`` - ``dump-args`` - ``filter`` - ``from`` - ``help`` - ``ignore-args`` - ``list-extensions`` - ``list-highlight-languages`` - ``list-highlight-styles`` - ``list-input-formats`` - ``list-output-formats`` - ``lua-filter`` - ``metadata`` - ``output`` - ``print-default-data-file`` - ``print-default-template`` - ``print-highlight-style`` - ``read`` - ``template`` - ``to`` - ``variable`` - ``version`` - ``write`` Passing messages to external processes ====================================== External processes have just as much information as panzer does. panzer sends its information to external processes via a json message. This message is sent as a string over stdin to scripts (preflight, postflight, cleanup scripts). It is stored inside a ``CodeBlock`` of the AST for filters. Note that filters need to parse the ``panzer_reserved`` field and deserialise the contents of its ``CodeBlock`` to recover the json message. Some relevant discussion is `here `__. Postprocessors do not receive a json message (if you need it, you should probably be using a filter). :: JSON_MESSAGE = [{'metadata': METADATA, 'template': TEMPLATE, 'style': STYLE, 'stylefull': STYLEFULL, 'styledef': STYLEDEF, 'runlist': RUNLIST, 'options': OPTIONS}] - ``METADATA`` is a copy of the metadata branch of the document’s AST (useful for scripts, not useful for filters) - ``TEMPLATE`` is a string with path to the current template - ``STYLE`` is a list of current style(s) - ``STYLEFULL`` is a list of current style(s) including all parents, grandparents, etc. 
in order of application - ``STYLEDEF`` is a copy of all style definitions employed in document - ``RUNLIST`` is a list of processes in the run list; it has the following structure: :: RUNLIST = [{'kind': 'preflight'|'filter'|'lua-filter'|'postprocess'|'postflight'|'cleanup', 'command': 'my command', 'arguments': ['argument1', 'argument2', ...], 'status': 'queued'|'running'|'failed'|'done' }, ... ... ] - ``OPTIONS`` is a dictionary containing panzer’s and pandoc’s command line options: .. code:: python OPTIONS = { 'panzer': { 'panzer_support': const.DEFAULT_SUPPORT_DIR, 'pandoc': 'pandoc', 'debug': str(), 'quiet': False, 'strict': False, 'stdin_temp_file': str() # tempfile used to buffer stdin }, 'pandoc': { 'input': list(), # list of input files 'output': '-', # output file; '-' is stdout 'pdf_output': False, # if pandoc will write a .pdf 'read': str(), # reader 'write': str(), # writer 'options': {'r': dict(), 'w': dict()} } } ``options`` contains the command line options with which pandoc is called. It consists of two separate dictionaries. The dictionary under the ``'r'`` key contains all pandoc options pertaining to reading the source documents to the AST. The dictionary under the ``'w'`` key contains all pandoc options pertaining to writing the AST to the output document. Scripts read the json message above by deserialising json input on stdin. Filters can read the json message by reading the metadata field, ``panzer_reserved``, stored as a raw code block in the AST, and deserialising the string ``JSON_MESSAGE_STR`` to recover the json: :: panzer_reserved: json_message: | ``` {.json} JSON_MESSAGE_STR ``` Receiving messages from external processes ========================================== panzer captures stderr output from all executables. This is for pretty printing of info and errors. Scripts and filters should send json messages to panzer via stderr. 
If a message is sent to stderr that is not correctly formatted, panzer will print it verbatim prefixed by a ‘!’.

The json message that panzer expects is a newline-separated sequence of utf-8 encoded json dictionaries, each with the following structure:

::

   { 'level': LEVEL, 'message': MESSAGE }

- ``LEVEL`` is a string that sets the error level; it can take one of the following values:

  ::

     'CRITICAL'
     'ERROR'
     'WARNING'
     'INFO'
     'DEBUG'
     'NOTSET'

- ``MESSAGE`` is a string with your message

Compatibility
=============

panzer accepts pandoc filters. panzer allows filters to behave in two new ways:

1. Json filters can take more than one command line argument (first argument still reserved for the writer).
2. A ``panzer_reserved`` field is added to the AST metadata branch with goodies for filters to mine.

For pandoc, json filters and lua-filters are applied in the order specified by the respective occurrences of ``--filter`` and ``--lua-filter`` on the command line. This behaviour is not entirely supported in panzer. Instead, all json filters are applied first, in the order specified on the command line and in the style definition (command line filters are applied first and are unkillable). Then the lua-filters are applied, also in the order specified on the command line and in the style definition (command line filters are applied first and are unkillable). The reasons for the divergence from pandoc’s behaviour are complex but derive mainly from performance benefits.
The following pandoc command line options cannot be used with panzer:

- ``--bash-completion``
- ``--dump-args``
- ``--ignore-args``
- ``--list-extensions``
- ``--list-highlight-languages``
- ``--list-highlight-styles``
- ``--list-input-formats``
- ``--list-output-formats``
- ``--print-default-template``, ``-D``
- ``--print-default-data-file``
- ``--version``, ``-v``
- ``--help``, ``-h``

The following metadata fields are reserved for use by panzer:

- ``styledef``
- ``style``
- ``template``
- ``preflight``
- ``filter``
- ``lua-filter``
- ``postflight``
- ``postprocess``
- ``cleanup``
- ``commandline``
- ``panzer_reserved``
- ``read``

The writer name ``all`` is also occupied.

Known issues
============

Pull requests welcome:

- Slower than I would like (calls to subprocess slow in Python)
- Calls to subprocesses (scripts, filters, etc.) block ui
- `Possible issue under Windows `__, so far reported by only one user. A leading dot plus slash is required on filter filenames. Rather than having ``- run: foo.bar``, on Windows one needs to have ``- run: ./foo.bar``. More information on this is welcome. I am happy to fix compatibility problems under Windows.

FAQ
===

1. Why do I get the error ``[Errno 13] Permission denied``?

   Filters and scripts must be executable. Vanilla pandoc allows filters to be run without their executable permission set. panzer does not allow this. The solution: set the executable permission of your filter or script, ``chmod +x myfilter_name.py``. For more, see `here `__.

2. Does panzer expand ``~`` or ``*`` inside a field of a style definition?

   panzer does not do any shell expansion/globbing inside a style definition. The reason is described `here `__. TL;DR: expansion and globbing are messy and not something that panzer is in a position to do correctly or predictably inside a style definition. You need to use the full path to reference your home directory inside a style definition.
Similar ======= - https://github.com/mb21/panrun - https://github.com/htdebeer/pandocomatic - https://github.com/balachia/panopy - https://github.com/phyllisstein/pandown Release notes ============= - 1.4.1 (22 February 2018): - improved support of lua filters thanks to feedback from `jzeneto `__ - 1.4 (20 February 2018): - support added for lua filters - 1.3.1 (18 December 2017): - updated for pandoc 2.0.5 `#35 `__. Support for all changes to command line interface and ``pptx`` writer. - 1.3 (7 November 2017): - updated for pandoc 2.0 `#31 `__. Please note that this version of panzer *breaks compatibility with versions of pandoc earlier than 2.0*. Please upgrade to a version of pandoc >2.0. Versions of pandoc prior to 2.0 will no longer be supported in future releases of panzer. - 1.2 (12 January 2017): - fixed issue introduced by breaking change in panzer 1.1 `#27 `__. Added panzer compatibility mode for pandoc versions <1.18. All version of pandoc >1.12.1 should work with panzer now. 
- 1.1 (27 October 2016): - breaking change: support pandoc 1.18’s new api; earlier versions of pandoc will not work - 1.0 (21 July 2015): - new: ``---strict`` panzer command line option: `#10 `__ - new: ``commandline`` allows repeated options using lists: `#3 `__ - new: ``commandline`` lists behave as additive in style inheritance: `#6 `__ - new: support multiple yaml style definition files: `#4 `__ - new: support local yaml style definition files: `#4 `__ - new: simplify format for panzer’s json message: `ce2a12 `__ - new: reproduce pandoc’s reader depending on writer settings: `#1 `__, `#7 `__ - fix: refactor ``commandline`` implementation: `#1 `__ - fix: improve documentation: `#2 `__ - fix: unicode error in ``setup.py``: `#12 `__ - fix: support yaml style definition files without closing empty line: `#13 `__ - fix: add ``.gitignore`` files to repository: `PR#1 `__ - 1.0b2 (23 May 2015): - new: ``commandline`` - set arbitrary pandoc command line options via metadata - 1.0b1 (14 May 2015): - initial release ================================================ FILE: doc/.gitignore ================================================ .tmp/* *.pdf *.html ================================================ FILE: doc/README.md ================================================ --- title: "panzer user guide" author: Mark Sprevak date: 6 November 2018 ... --- Development has ceased on panzer. Over the years, pandoc has gained powerful new functionality (e.g. the `--metadata-file` option and Lua filters) that means that 90% of what can be done with panzer can be done with pandoc and some simple wrapper scripts. I no longer use panzer in my own workflow for this reason. If you would like to take over development of panzer, let me know. --- # panzer panzer adds *styles* to [pandoc][]. Styles provide a way to set all options for a pandoc document with one line ('I want this document be an article/CV/notes/letter'). 
You can think of styles as a level up in abstraction from a pandoc template. Styles are combinations of templates, metadata settings, pandoc command line options, and instructions to run filters, scripts and postprocessors. These settings can be customised on a per writer and per document basis. Styles can be combined and can bear inheritance relations to each other. panzer exposes a large amount of structured information to the external processes called by styles, allowing those processes to be both more powerful and themselves controllable via metadata (and hence also by styles). Styles simplify makefiles, bundling everything related to the look of the document in one place. You can think of panzer as an exoskeleton that sits around pandoc and configures pandoc based on a single choice in your document. To use a style, add a field with your style name to the yaml metadata block of your document: ``` {.yaml} style: Notes ``` Multiple styles can be supplied as a list: ``` {.yaml} style: - Notes - BoldHeadings ``` Styles are defined in a yaml file ([example][example-yaml]). The style definition file, plus associated executables, are placed in the `.panzer` directory in the user's home folder ([example][example-dot-panzer]). A style can also be defined inside the document's metadata block: ``` {.yaml} --- style: Notes styledef: Notes: all: metadata: numbersections: false latex: metadata: numbersections: true fontsize: 12pt commandline: columns: "`75`" lua-filter: - run: macroexpand.lua filter: - run: deemph.py ... ``` Style settings can be overridden by adding the appropriate field outside a style definition in the document's metadata block: ``` {.yaml} --- style: Notes numbersections: true filter: - run: smallcaps.py commandline: - pdf-engine: "`xelatex`" ... 
``` # Installation ``` {.bash} pip3 install git+https://github.com/msprev/panzer ``` *Requirements:* * [pandoc][] > 2.0 * [Python 3][] * [pip][] (included in most Python 3 distributions) *To upgrade existing installation:* ``` {.bash} pip3 install --upgrade git+https://github.com/msprev/panzer ``` On Arch Linux systems, the AUR package [panzer-git](https://aur.archlinux.org/packages/panzer-git/) can be used. ## Troubleshooting An [issue][pip-issue] has been reported using pip to install on Windows. If the method above does not work, use the alternative install method below. git clone https://github.com/msprev/panzer cd panzer python3 setup.py install *To upgrade existing installation:* cd /path/to/panzer/directory/cloned git pull python3 setup.py install --force # Use Run `panzer` on your document as you would `pandoc`. If the document lacks a `style` field, this is equivalent to running `pandoc`. If the document has a `style` field, panzer will invoke pandoc plus any associated scripts, filters, and populate the appropriate metadata fields. `panzer` accepts the same command line options as `pandoc`. These options are passed to the underlying instance of pandoc. pandoc command line options can also be set via metadata. panzer has additional command line options. These are prefixed by triple dashes (`---`). Run the command `panzer -h` to see them: ``` -h, --help, ---help, ---h show this help message and exit -v, --version, ---version, ---v show program's version number and exit ---quiet only print errors and warnings ---strict exit on first error ---panzer-support PANZER_SUPPORT panzer user data directory ---pandoc PANDOC pandoc executable ---debug DEBUG filename to write .log and .json debug files ``` Panzer expects all input and output to be utf-8. 
# Style definition

A style definition may consist of:

  field           value                                value type
  --------------- ------------------------------------ -----------------------------
  `parent`        parent(s) of style                   `MetaList` or `MetaInlines`
  `metadata`      default metadata fields              `MetaMap`
  `commandline`   pandoc command line options          `MetaMap`
  `template`      pandoc template                      `MetaInlines` or `MetaString`
  `preflight`     run before input doc is processed    `MetaList`
  `filter`        pandoc filters                       `MetaList`
  `lua-filter`    pandoc lua filters                   `MetaList`
  `postprocess`   run on pandoc's output               `MetaList`
  `postflight`    run after output file written        `MetaList`
  `cleanup`       run on exit irrespective of errors   `MetaList`

Style definitions are hierarchically structured by *name* and *writer*. Style names by convention should be MixedCase (`MyNotes`) to avoid confusion with other metadata fields. Writer names are the same as those of the relevant pandoc writer (e.g. `latex`, `html`, `docx`, etc.). A special writer, `all`, matches every writer.

- `parent` takes a list or single style. Children inherit the properties of their parents. Children may have multiple parents.
- `metadata` contains default metadata set by the style. Any metadata field that can appear in a pandoc document can appear here.
- `commandline` specifies pandoc's command line options.
- `template` is a pandoc [template][] for the style.
- `preflight` lists executables run before the document is processed. These are run after panzer reads the input, but before that input is sent to pandoc.
- `filter` lists pandoc [json filters][]. Filters gain two new properties from panzer. For more info, see section on [compatibility](#compatibility) with pandoc.
- `lua-filter` lists pandoc [lua filters](https://pandoc.org/lua-filters.html).
- `postprocess` lists executables to pipe pandoc's output through. Standard unix executables (`sed`, `tr`, etc.) are examples of possible use. Postprocessors are skipped if a binary writer (e.g. `docx`) is used.
- `postflight` lists executables run after the output has been written. If output is stdout, postflight scripts are run after stdout has been flushed. - `cleanup` lists executables run before panzer exits and after postflight scripts. Cleanup scripts run irrespective of whether an error has occurred earlier. Example: ``` {.yaml} Notes: all: metadata: numbersections: false latex: metadata: numbersections: true fontsize: 12pt commandline: wrap: preserve filter: - run: deemph.py postflight: - run: latexmk.py ``` If panzer were run on the following document with the latex writer selected, ``` {.yaml} --- title: "My document" style: Notes ... ``` it would run pandoc with filter `deemph.py` and command line option `--wrap=preserve` on the following and then execute `latexmk.py`. ``` {.yaml} --- title: "My document" numbersections: true fontsize: 12pt ... ``` ## Style overriding Styles may be defined: - 'Globally' in `.yaml` files in `.panzer/styles/` - 'Locally' in `.yaml` files in the current working directory `./styles/`) - 'In document' inside a `styledef` field in the document's yaml metadata block If no `.panzer/styles/` directory is found, panzer will look for global style definitions in `.panzer/styles.yaml` if it exists. If no `./styles/` directory is found in the current working directory, panzer will look for local style definitions in `./styles.yaml` if it exists. 
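The `Notes` example above can be read as a two-step merge for the `metadata` field: settings under `all` are applied first, then settings for the selected writer replace them. The following is only a sketch of that reading, not panzer's implementation. The function name `resolve_metadata` and the use of plain Python dicts in place of pandoc's metadata types are assumptions made for illustration.

```python
def resolve_metadata(style, writer):
    """Merge a style's `all` metadata with its writer-specific metadata.

    Hypothetical helper: `style` is a plain dict shaped like the yaml
    style definition shown above.
    """
    merged = {}
    # settings under `all` are applied first ...
    merged.update(style.get("all", {}).get("metadata", {}))
    # ... then writer-specific settings replace them field by field
    merged.update(style.get(writer, {}).get("metadata", {}))
    return merged


notes = {
    "all": {"metadata": {"numbersections": False}},
    "latex": {"metadata": {"numbersections": True, "fontsize": "12pt"}},
}

latex_meta = resolve_metadata(notes, "latex")
html_meta = resolve_metadata(notes, "html")
```

With the latex writer this gives the same two fields as the expanded document shown above; with any writer that has no entry of its own, only the `all` settings remain.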
Overriding among style settings is determined by the following rules:

  \#   overriding rule
  ---- -------------------------------------------------------------------------------
  1    Local style definitions override global style definitions
  2    In document style definitions override local style definitions
  3    Writer-specific settings override settings for `all`
  4    In a list, later styles override earlier ones
  5    Children override parents
  6    Fields set outside a style definition override any style's setting

For fields that pertain to scripts/filters, overriding is *additive*; for other fields, it is *non-additive*:

- For `metadata`, `template`, and `commandline`, if one style overrides another (say, a parent and child set `numbersections` to different values), then inheritance is non-additive, and only one (the child) wins.
- For `preflight`, `lua-filter`, `filter`, `postflight` and `cleanup`, if one style overrides another, then the 'winner' adds its items after those of the 'loser'. For example, if the parent adds to `postflight` an item `- run: latexmk.py`, and the child adds `- run: printlog.py`, then `printlog.py` will be run after `latexmk.py`.
- To remove an item from an additive list, add it as the value of a `kill` field: for example, `- kill: latexmk.py`

Arguments passed to panzer directly on the command line trump any style settings, and cannot be overridden by any metadata setting. Filters specified on the command line (via `--filter` and `--lua-filter`) are run first, and cannot be removed. All lua filters are run after json filters. pandoc options set via panzer's command line invocation override any set via `commandline`.

Multiple input files are joined according to pandoc's rules. Metadata are merged using left-biased union. This means overriding behaviour when merging multiple input files is different from that of panzer, and always non-additive.

If fed input from stdin, panzer buffers this to a temporary file in the current working directory before proceeding.
This is required to allow preflight scripts to access the data. The temporary file is removed when panzer exits. ## The run list Executables (scripts, filters, postprocessors) are specified by a list (the 'run list'). The list determines what gets run when. Processes are executed from first to last in the run list. If an item appears as the value of a `run:` field, then it is added to the run list. If an item appears as the value of a `kill:` field, then any previous occurrence is removed from the run list. Killing an item does not prevent it from being added later. A run list can be completely emptied by adding the special item `- killall: true`. Arguments can be passed to executables by listing them as the value of the `args` field of that item. The value of the `args` field is passed as the command line options to the external process. This value of `args` should be a quoted inline code span (e.g. ``"`--options`"``) to prevent the parser interpreting it as markdown. Note that json filters always receive the writer name as their first argument. Lua filters cannot take arguments and the contents of their `args` field is ignored. Example: ``` {.yaml} - filter: - run: setbaseheader.py args: "`--level=2`" - postprocess: - run: sed args: "`-e 's/hello/goodbye/g'`" - postflight: - kill: open_pdf.py - cleanup: - killall: true ``` The filter `setbaseheader.py` receives the writer name as its first argument and `--level=2` as its second argument. When panzer is searching for a filter `foo.py`, it will look for: # look for --- ------------------------------------------------- 1 `./foo.py` 2 `./filter/foo.py` 3 `./filter/foo/foo.py` 4 `~/.panzer/filter/foo.py` 5 `~/.panzer/filter/foo/foo.py` 6 `foo.py` in PATH defined by current environment Similar rules apply to other executables and to templates. 
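The run list rules and the search order above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not panzer's code: `build_run_list` and `candidate_paths` are hypothetical names, items are plain dicts rather than pandoc metadata, and the final fallback to `PATH` is left out.

```python
import os


def build_run_list(items):
    """Apply `run`, `kill` and `killall` items in order (hypothetical helper)."""
    run_list = []
    for item in items:
        if item.get("killall"):
            # `killall: true` empties everything added so far
            run_list = []
        elif "kill" in item:
            # drop previous occurrences; a later `run` may add the item back
            run_list = [e for e in run_list if e["command"] != item["kill"]]
        elif "run" in item:
            run_list.append({"command": item["run"], "args": item.get("args", "")})
    return run_list


def candidate_paths(kind, name, support_dir="~/.panzer"):
    """List locations 1 to 5 of the search order for an executable of a given kind."""
    stem = os.path.splitext(name)[0]
    support = os.path.expanduser(support_dir)
    return [
        os.path.join(".", name),
        os.path.join(".", kind, name),
        os.path.join(".", kind, stem, name),
        os.path.join(support, kind, name),
        os.path.join(support, kind, stem, name),
    ]


items = [
    {"run": "latexmk.py"},
    {"run": "open_pdf.py"},
    {"kill": "latexmk.py"},
    {"run": "printlog.py", "args": "--verbose"},
]
commands = [entry["command"] for entry in build_run_list(items)]
```

Because killing only removes earlier occurrences, the order in which styles contribute their items matters: a `kill` placed before the matching `run` has no effect.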
The typical structure for the support directory `.panzer` is:

    .panzer/
        cleanup/
        filter/
        lua-filter/
        postflight/
        postprocess/
        preflight/
        template/
        shared/
        styles/

Within each directory, each executable may have a named subdirectory:

    postflight/
        latexmk/
            latexmk.py

## Pandoc command line options

Arbitrary pandoc command line options can be set using metadata via `commandline`. `commandline` can appear outside a style definition and in a document's metadata block, where it overrides the settings of any style.

`commandline` contains one field for each pandoc command line option. The field name is the unabbreviated name of the relevant pandoc command line option (e.g. `standalone`).

- For pandoc flags, the value should be boolean (`true`, `false`), e.g. `standalone: true`.
- For pandoc key-values, the value should be a quoted inline code span, e.g. ``include-in-header: "`path/to/my/header`"``.
- For pandoc repeated key-values, the value should be a list of inline code spans, e.g.

``` {.yaml}
commandline:
    include-in-header:
        - "`file1.txt`"
        - "`file2.txt`"
        - "`file3.txt`"
```

Repeated key-value options in `commandline` are added after any provided from the command line. Overriding styles append to repeated key-value lists of the styles that they override.

`false` plays a special role. `false` means that the pandoc command line option with the field's name, if set, should be unset. `false` can be used for both flags and key-value options (e.g. `include-in-header: false`).

Example:

``` {.yaml}
commandline:
    standalone: true
    slide-level: "`3`"
    number-sections: false
    include-in-header: false
```

This passes the options `--standalone --slide-level=3` to pandoc and removes any `--number-sections` and `--include-in-header=...` options.
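One way to picture how a `commandline` mapping turns into pandoc arguments is the sketch below. It is not panzer's implementation: `commandline_to_args` is a hypothetical name, values are assumed to have been unwrapped from their inline code spans already, and precedence between panzer's own command line and style settings is not modelled.

```python
def commandline_to_args(commandline, existing=None):
    """Translate a `commandline` mapping into a list of pandoc arguments.

    Hypothetical helper: `existing` holds options already set elsewhere.
    """
    args = list(existing or [])
    for name, value in commandline.items():
        flag = "--" + name
        if value is False:
            # `false` unsets the option, whether it is a flag or a key-value
            args = [a for a in args
                    if a != flag and not a.startswith(flag + "=")]
        elif value is True:
            if flag not in args:
                args.append(flag)
        elif isinstance(value, list):
            # repeated key-values are added after those already present
            args.extend("{}={}".format(flag, v) for v in value)
        else:
            args.append("{}={}".format(flag, value))
    return args


example = {
    "standalone": True,
    "slide-level": "3",
    "number-sections": False,
    "include-in-header": False,
}
already_set = ["--number-sections", "--include-in-header=old.tex"]
result = commandline_to_args(example, already_set)
```

Here `already_set` stands in for options that were set earlier, for instance by a parent style, so that the effect of `false` is visible in `result`.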
These pandoc command line options cannot be set via `commandline`: - `bash-completion` - `dump-args` - `filter` - `from` - `help` - `ignore-args` - `list-extensions` - `list-highlight-languages` - `list-highlight-styles` - `list-input-formats` - `list-output-formats` - `lua-filter` - `metadata` - `output` - `print-default-data-file` - `print-default-template` - `print-highlight-style` - `read` - `template` - `to` - `variable` - `version` - `write` # Passing messages to external processes External processes have just as much information as panzer does. panzer sends its information to external processes via a json message. This message is sent as a string over stdin to scripts (preflight, postflight, cleanup scripts). It is stored inside a `CodeBlock` of the AST for filters. Note that filters need to parse the `panzer_reserved` field and deserialise the contents of its `CodeBlock` to recover the json message. Some relevant discussion is [here](https://github.com/msprev/panzer/issues/38#issuecomment-367664291). Postprocessors do not receive a json message (if you need it, you should probably be using a filter). ``` JSON_MESSAGE = [{'metadata': METADATA, 'template': TEMPLATE, 'style': STYLE, 'stylefull': STYLEFULL, 'styledef': STYLEDEF, 'runlist': RUNLIST, 'options': OPTIONS}] ``` - `METADATA` is a copy of the metadata branch of the document's AST (useful for scripts, not useful for filters) - `TEMPLATE` is a string with path to the current template - `STYLE` is a list of current style(s) - `STYLEFULL` is a list of current style(s) including all parents, grandparents, etc. in order of application - `STYLEDEF` is a copy of all style definitions employed in document - `RUNLIST` is a list of processes in the run list; it has the following structure: ``` RUNLIST = [{'kind': 'preflight'|'filter'|'lua-filter'|'postprocess'|'postflight'|'cleanup', 'command': 'my command', 'arguments': ['argument1', 'argument2', ...], 'status': 'queued'|'running'|'failed'|'done' }, ... ... 
] ``` - `OPTIONS` is a dictionary containing panzer's and pandoc's command line options: ``` {.python} OPTIONS = { 'panzer': { 'panzer_support': const.DEFAULT_SUPPORT_DIR, 'pandoc': 'pandoc', 'debug': str(), 'quiet': False, 'strict': False, 'stdin_temp_file': str() # tempfile used to buffer stdin }, 'pandoc': { 'input': list(), # list of input files 'output': '-', # output file; '-' is stdout 'pdf_output': False, # if pandoc will write a .pdf 'read': str(), # reader 'write': str(), # writer 'options': {'r': dict(), 'w': dict()} } } ``` `options` contains the command line options with which pandoc is called. It consists of two separate dictionaries. The dictionary under the `'r'` key contains all pandoc options pertaining to reading the source documents to the AST. The dictionary under the `'w'` key contains all pandoc options pertaining to writing the AST to the output document. Scripts read the json message above by deserialising json input on stdin. Filters can read the json message by reading the metadata field, `panzer_reserved`, stored as a raw code block in the AST, and deserialising the string `JSON_MESSAGE_STR` to recover the json: panzer_reserved: json_message: | ``` {.json} JSON_MESSAGE_STR ``` # Receiving messages from external processes panzer captures stderr output from all executables. This is for pretty printing of info and errors. Scripts and filters should send json messages to panzer via stderr. If a message is sent to stderr that is not correctly formatted, panzer will print it verbatim prefixed by a '!'. The json message that panzer expects is a newline-separated sequence of utf-8 encoded json dictionaries, each with the following structure: { 'level': LEVEL, 'message': MESSAGE } - `LEVEL` is a string that sets the error level; it can take one of the following values: 'CRITICAL' 'ERROR' 'WARNING' 'INFO' 'DEBUG' 'NOTSET' - `MESSAGE` is a string with your message # Compatibility panzer accepts pandoc filters. 
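Since ordinary pandoc filters and scripts run unchanged under panzer, reporting back to panzer is an optional extra that a small helper can cover. The sketch below follows the two message formats described in the preceding sections. The names `format_message`, `log` and `read_json_message` are assumptions for illustration and are not part of panzer.

```python
import json
import sys

LEVELS = ("CRITICAL", "ERROR", "WARNING", "INFO", "DEBUG", "NOTSET")


def format_message(level, message):
    """Serialise one message as a single-line json dictionary."""
    if level not in LEVELS:
        raise ValueError("unknown level: " + level)
    return json.dumps({"level": level, "message": message})


def log(level, message, stream=None):
    """Write one newline-terminated message to stderr for panzer to capture."""
    stream = sys.stderr if stream is None else stream
    stream.write(format_message(level, message) + "\n")
    stream.flush()


def read_json_message(text):
    """Deserialise the json message a script receives on stdin.

    The message is a one-element list wrapping a dictionary, so the
    dictionary itself is returned.
    """
    return json.loads(text)[0]
```

A postflight script might call `read_json_message(sys.stdin.read())` once at start-up and then `log("INFO", "starting")` as it works. Anything written to stderr in another format would be printed by panzer verbatim with a '!' prefix.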
panzer allows filters to behave in two new ways:

1. Json filters can take more than one command line argument (first argument still reserved for the writer).
2. A `panzer_reserved` field is added to the AST metadata branch with goodies for filters to mine.

For pandoc, json filters and lua-filters are applied in the order specified by the respective occurrences of `--filter` and `--lua-filter` on the command line. This behaviour is not entirely supported in panzer. Instead, all json filters are applied first, in the order specified on the command line and in the style definition (command line filters are applied first and are unkillable). Then the lua-filters are applied, also in the order specified on the command line and in the style definition (command line filters are applied first and are unkillable). The reasons for the divergence from pandoc's behaviour are complex, but derive mainly from performance benefits.

The following pandoc command line options cannot be used with panzer:

- `--bash-completion`
- `--dump-args`
- `--ignore-args`
- `--list-extensions`
- `--list-highlight-languages`
- `--list-highlight-styles`
- `--list-input-formats`
- `--list-output-formats`
- `--print-default-template`, `-D`
- `--print-default-data-file`
- `--version`, `-v`
- `--help`, `-h`

The following metadata fields are reserved for use by panzer:

* `styledef`
* `style`
* `template`
* `preflight`
* `filter`
* `lua-filter`
* `postflight`
* `postprocess`
* `cleanup`
* `commandline`
* `panzer_reserved`
* `read`

The writer name `all` is also occupied.

# Known issues

Pull requests welcome:

* Slower than I would like (calls to subprocess are slow in Python)
* Calls to subprocesses (scripts, filters, etc.) block ui
* [Possible issue under Windows](https://github.com/msprev/panzer/pull/9), so far reported by only one user. A leading dot plus slash is required on filter filenames. Rather than having `- run: foo.bar`, on Windows one needs to have `- run: ./foo.bar`. More information on this is welcome.
I am happy to fix compatibility problems under Windows.

# FAQ

1.  Why do I get the error `[Errno 13] Permission denied`?

    Filters and scripts must be executable. Vanilla pandoc allows filters to be run without their executable permission set. panzer does not allow this. The solution: set the executable permission of your filter or script, `chmod +x myfilter_name.py`. For more, see [here](https://github.com/msprev/panzer/issues/22).

2.  Does panzer expand `~` or `*` inside a field of a style definition?

    panzer does not do any shell expansion/globbing inside a style definition. The reason is described [here](https://github.com/msprev/panzer/issues/23). TL;DR: expansion and globbing are messy and not something that panzer is in a position to do correctly or predictably inside a style definition. You need to use the full path to reference your home directory inside a style definition.

# Similar

*
*
*
*

# Release notes

- 1.4.1 (22 February 2018):
    - improved support of lua filters thanks to feedback from [jzeneto](https://github.com/jzeneto)
- 1.4 (20 February 2018):
    - support added for lua filters
- 1.3.1 (18 December 2017):
    - updated for pandoc 2.0.5 [#35](https://github.com/msprev/panzer/issues/34). Support for all changes to command line interface and `pptx` writer.
- 1.3 (7 November 2017):
    - updated for pandoc 2.0 [#31](https://github.com/msprev/panzer/issues/31). Please note that this version of panzer *breaks compatibility with versions of pandoc earlier than 2.0*. Please upgrade to a version of pandoc >2.0. Versions of pandoc prior to 2.0 will no longer be supported in future releases of panzer.
- 1.2 (12 January 2017):
    - fixed issue introduced by breaking change in panzer 1.1 [#27](https://github.com/msprev/panzer/issues/27). Added panzer compatibility mode for pandoc versions <1.18. All versions of pandoc >1.12.1 should work with panzer now.
- 1.1 (27 October 2016): - breaking change: support pandoc 1.18's new api; earlier versions of pandoc will not work - 1.0 (21 July 2015): - new: `---strict` panzer command line option: [#10](https://github.com/msprev/panzer/issues/10) - new: `commandline` allows repeated options using lists: [#3](https://github.com/msprev/panzer/issues/3) - new: `commandline` lists behave as additive in style inheritance: [#6](https://github.com/msprev/panzer/issues/6) - new: support multiple yaml style definition files: [#4](https://github.com/msprev/panzer/issues/4) - new: support local yaml style definition files: [#4](https://github.com/msprev/panzer/issues/4) - new: simplify format for panzer's json message: [ce2a12](https://github.com/msprev/panzer/commit/f3a6cc28b78957827cb572e254977c2344ce2a12) - new: reproduce pandoc's reader depending on writer settings: [#1](https://github.com/msprev/panzer/issues/1), [#7](https://github.com/msprev/panzer/issues/7) - fix: refactor `commandline` implementation: [#1](https://github.com/msprev/panzer/issues/1) - fix: improve documentation: [#2](https://github.com/msprev/panzer/issues/2) - fix: unicode error in `setup.py`: [#12](https://github.com/msprev/panzer/issues/12) - fix: support yaml style definition files without closing empty line: [#13](https://github.com/msprev/panzer/issues/13) - fix: add `.gitignore` files to repository: [PR#1](https://github.com/msprev/panzer/pull/9) - 1.0b2 (23 May 2015): - new: `commandline` - set arbitrary pandoc command line options via metadata - 1.0b1 (14 May 2015): - initial release [pandoc]: http://johnmacfarlane.net/pandoc/index.html [panzer]: https://github.com/msprev [python 3]: https://www.python.org/downloads/ [json filters]: http://johnmacfarlane.net/pandoc/scripting.html [template]: http://johnmacfarlane.net/pandoc/demo/example9/templates.html [example-yaml]: https://github.com/msprev/dot-panzer/blob/master/styles/styles.yaml [example-dot-panzer]: https://github.com/msprev/dot-panzer [setuptools 
for Python3]: http://stackoverflow.com/questions/14426491/python-3-importerror-no-module-named-setuptools [pip]: https://pip.pypa.io/en/stable/index.html [pip-issue]: https://github.com/msprev/panzer/issues/20 ================================================ FILE: doc/makefile ================================================ SOURCE = README.md PANDOC = /usr/local/bin/pandoc PANZER = /usr/local/bin/panzer TEMPLATES = /Users/msprevak/Dropbox/msprevak/dot-panzer/template OUT_BASENAME := $(basename $(SOURCE)) .PHONY: clean ######################################################################## # Make the documentation files for release # ######################################################################## doc: $(SOURCE) $(PANDOC) $(SOURCE) -o ../$(OUT_BASENAME).md -t gfm --standalone; \ $(PANDOC) $(SOURCE) -o ../$(OUT_BASENAME).rst -t rst --standalone ######################################################################## # Standard makefile for markdown file # ######################################################################## html: $(SOURCE) $(PANZER) $(SOURCE) -o $(OUT_BASENAME).html pdf: $(SOURCE) $(PANZER) $(SOURCE) -o $(OUT_BASENAME).tex docx: $(SOURCE) $(PANZER) $(SOURCE) -o $(OUT_BASENAME).docx pandoc-html: $(SOURCE) $(PANDOC) \ $(SOURCE) \ -o .tmp/_$(OUT_BASENAME)_.html \ --template $(TEMPLATES)/article.html \ -H $(TEMPLATES)/Momento-pandoc.css \ --smart \ --standalone clean: - rm -f .tmp/* \ $(OUT_BASENAME).pdf $(OUT_BASENAME).docx $(OUT_BASENAME).html \ ../$(OUT_BASENAME).md \ ../$(OUT_BASENAME).rst ================================================ FILE: panzer/__init__.py ================================================ """ check that python3 is being used """ import sys if sys.version_info[0] != 3: print("panzer cannot run --- it requires Python 3") sys.exit(1) ================================================ FILE: panzer/cli.py ================================================ """ command line options for panzer """ import argparse import os 
import shutil import sys import tempfile from . import const from . import version PANZER_DESCRIPTION = ''' Panzer-specific arguments are prefixed by triple dashes ('---'). Other arguments are passed to pandoc. panzer default user data directory: "%s" pandoc default executable: "%s" ''' % (const.DEFAULT_SUPPORT_DIR, shutil.which('pandoc')) PANZER_EPILOG = ''' Copyright (C) 2017 Mark Sprevak Web: http://sites.google.com/site/msprevak This is free software; see the source for copying conditions. There is no warranty, not even for merchantability or fitness for a particular purpose. ''' def parse_cli_options(options): """ parse command line options """ # # disable pylint warnings: # + Too many local variables (too-many-locals) # + Too many branches (too-many-branches) # pylint: disable=R0912 # pylint: disable=R0914 # # 1. Parse options specific to panzer panzer_known, unknown = panzer_parse() # 2. Update options with panzer-specific values for field in panzer_known: val = panzer_known[field] if val: options['panzer'][field] = val # 3. Parse options specific to pandoc pandoc_known, unknown = pandoc_parse(unknown) # 2. Update options with pandoc-specific values for field in pandoc_known: val = pandoc_known[field] if val: options['pandoc'][field] = val # 3. Check for pandoc output being pdf if os.path.splitext(options['pandoc']['output'])[1].lower() == '.pdf': options['pandoc']['pdf_output'] = True # 4. 
Detect pandoc's writer # - first case: writer explicitly specified by cli option if options['pandoc']['write']: pass # - second case: html default writer for stdout elif options['pandoc']['output'] == '-': options['pandoc']['write'] = 'html' # - third case: writer set via output filename extension else: ext = os.path.splitext(options['pandoc']['output'])[1].lower() implicit_writer = const.PANDOC_WRITER_MAPPING.get(ext) if implicit_writer is not None: options['pandoc']['write'] = implicit_writer else: # - html is default writer for unrecognised extensions options['pandoc']['write'] = 'html' # 5. Input from stdin # - if one of the inputs is stdin then read from stdin now into # - temp file, then replace '-'s in input filelist with reference to file if '-' in options['pandoc']['input']: # Read from stdin now into temp file in cwd stdin_bytes = sys.stdin.buffer.read() with tempfile.NamedTemporaryFile(prefix='__panzer-', suffix='__', dir=os.getcwd(), delete=False) as temp_file: temp_filename = os.path.join(os.getcwd(), temp_file.name) options['panzer']['stdin_temp_file'] = temp_filename temp_file.write(stdin_bytes) temp_file.flush() # Replace all reference to stdin in pandoc cli with temp file for index, val in enumerate(options['pandoc']['input']): if val == '-': options['pandoc']['input'][index] = options['panzer']['stdin_temp_file'] # 6. 
Remaining options for pandoc opt_known, unknown = pandoc_opt_parse(unknown) # - sort them into reader and writer phase options for opt in opt_known: # undo weird transform that argparse does to match option name # https://docs.python.org/dev/library/argparse.html#dest opt_name = str(opt).replace('_', '-') if opt_name not in const.PANDOC_OPT_PHASE: print('ERROR: ' 'do not know reader/writer type of command line option "--%s"' '---ignoring' % opt_name) continue for phase in const.PANDOC_OPT_PHASE[opt_name]: options['pandoc']['options'][phase][opt_name] = opt_known[opt] # cli option is mutable by `commandline` metadata if # - not set ( == None or == False) # - of type list if opt_known[opt] == None \ or opt_known[opt] == False \ or type(opt_known[opt]) is list: options['pandoc']['mutable'][phase][opt_name] = True else: options['pandoc']['mutable'][phase][opt_name] = False options['pandoc'] = set_quirky_dependencies(options['pandoc']) # 7. print error messages for unknown options for opt in unknown: if opt in const.PANDOC_BAD_OPTS: print('ERROR: ' 'pandoc command line option "%s" not supported by panzer' '---ignoring' % opt) else: print('ERROR: ' 'do not recognize command line option "%s"' '---ignoring' % opt) return options def panzer_parse(): """ return list of arguments recognised by panzer + unknowns """ panzer_parser = argparse.ArgumentParser( description=PANZER_DESCRIPTION, epilog=PANZER_EPILOG, formatter_class=argparse.RawTextHelpFormatter, add_help=False) panzer_parser.add_argument("-h", "--help", '---help', '---h', action="help", help="show this help message and exit") panzer_parser.add_argument('-v', '--version', '---version', '---v', action='version', version=('%(prog)s ' + version.VERSION)) panzer_parser.add_argument("---quiet", action='store_true', help='only print errors and warnings') panzer_parser.add_argument("---strict", action='store_true', help='exit on first error') panzer_parser.add_argument("---panzer-support", help='panzer user data directory') 
panzer_parser.add_argument("---pandoc", help='pandoc executable') panzer_parser.add_argument("---debug", help='filename to write .log and .json debug files') panzer_known_raw, unknown = panzer_parser.parse_known_args() panzer_known = vars(panzer_known_raw) return (panzer_known, unknown) def pandoc_parse(args): """ return list of arguments recognised by pandoc + unknowns """ pandoc_parser = argparse.ArgumentParser(prog='pandoc') pandoc_parser.add_argument('input', nargs='*') pandoc_parser.add_argument("--read", "-r", "--from", "-f") pandoc_parser.add_argument("--write", "-w", "--to", "-t") pandoc_parser.add_argument("--output", "-o") pandoc_parser.add_argument("--template") pandoc_parser.add_argument("--filter", nargs=1, action='append') pandoc_parser.add_argument("--lua-filter", nargs=1, action='append') pandoc_known_raw, unknown = pandoc_parser.parse_known_args(args) pandoc_known = vars(pandoc_known_raw) return (pandoc_known, unknown) def pandoc_opt_parse(args): """ return list of pandoc command line options """ opt_parser = argparse.ArgumentParser(prog='pandoc') # general options opt_parser.add_argument("--data-dir") # reader options opt_parser.add_argument('--abbreviations') opt_parser.add_argument('--base-header-level') opt_parser.add_argument('--default-image-extension') opt_parser.add_argument('--extract-media') opt_parser.add_argument('--file-scope', action='store_true') opt_parser.add_argument('--indented-code-classes') opt_parser.add_argument('--metadata', '-M', nargs=1, action='append') opt_parser.add_argument('--old-dashes', action='store_true') opt_parser.add_argument('--preserve-tabs', '-p', action='store_true') opt_parser.add_argument('--tab-stop') opt_parser.add_argument('--track-changes') # writer options opt_parser.add_argument('--ascii', action='store_true') opt_parser.add_argument('--atx-headers', action='store_true') opt_parser.add_argument('--biblatex', action='store_true') opt_parser.add_argument('--bibliography', nargs=1, action='append') 
opt_parser.add_argument('--chapters', action='store_true') opt_parser.add_argument('--citation-abbreviations') opt_parser.add_argument('--columns') opt_parser.add_argument('--csl') opt_parser.add_argument('--css', '-c', nargs=1, action='append') opt_parser.add_argument('--dpi') opt_parser.add_argument('--email-obfuscation') opt_parser.add_argument('--eol') opt_parser.add_argument('--epub-chapter-level') opt_parser.add_argument('--epub-cover-image') opt_parser.add_argument('--epub-embed-font') opt_parser.add_argument('--epub-metadata') opt_parser.add_argument('--epub-subdirectory') opt_parser.add_argument('--gladtex', action='store_true') opt_parser.add_argument('--highlight-style') opt_parser.add_argument('--html-q-tags', action='store_true') opt_parser.add_argument('--id-prefix') opt_parser.add_argument('--include-after-body', '-A', nargs=1, action='append') opt_parser.add_argument('--include-before-body', '-B', nargs=1, action='append') opt_parser.add_argument('--include-in-header', '-H', nargs=1, action='append') opt_parser.add_argument('--incremental', '-i', action='store_true') opt_parser.add_argument('--jsmath') opt_parser.add_argument('--katex') opt_parser.add_argument('--katex-stylesheet') opt_parser.add_argument('--pdf-engine') opt_parser.add_argument('--pdf-engine-opt', nargs=1, action='append') opt_parser.add_argument('--latexmathml', '-m') opt_parser.add_argument('--listings', action='store_true') opt_parser.add_argument('--log') opt_parser.add_argument('--mathjax') opt_parser.add_argument('--mathml', action='store_true') opt_parser.add_argument('--mimetex') opt_parser.add_argument('--natbib', action='store_true') opt_parser.add_argument('--no-highlight', action='store_true') opt_parser.add_argument('--no-tex-ligatures', action='store_true') opt_parser.add_argument('--no-wrap', action='store_true') opt_parser.add_argument('--number-offset') opt_parser.add_argument('--number-sections', '-N', action='store_true') opt_parser.add_argument('--reference-doc') 
opt_parser.add_argument('--reference-links', action='store_true') opt_parser.add_argument('--reference-location') opt_parser.add_argument('--request-header') opt_parser.add_argument('--resource-path') opt_parser.add_argument('--section-divs', action='store_true') opt_parser.add_argument('--self-contained', action='store_true') opt_parser.add_argument('--slide-level') opt_parser.add_argument('--standalone', '-s', action='store_true') opt_parser.add_argument('--syntax-definition') opt_parser.add_argument('--table-of-contents', '--toc', action='store_true') opt_parser.add_argument('--title-prefix', '-T') opt_parser.add_argument('--toc-depth') opt_parser.add_argument('--top-level-division') opt_parser.add_argument('--variable', '-V', nargs=1, action='append') opt_parser.add_argument('--verbose', action='store_true') opt_parser.add_argument('--webtex') opt_parser.add_argument('--wrap') opt_known_raw, unknown = opt_parser.parse_known_args(args) opt_known = vars(opt_known_raw) return (opt_known, unknown) def set_quirky_dependencies(pandoc): """ Set defaults for pandoc options that are dependent in a quirky way, and that panzer route via json would disrupt. Quirky here means that pandoc would have to know the writer to set the reader to the correct defaults or vice versa """ # --smart: reader setting # True when the output format is latex or context, unless --no-tex-ligatures # is used. 
# if (pandoc['write'] == 'latex' or \ # pandoc['write'] == 'beamer' or \ # pandoc['write'] == 'context') \ # and pandoc['options']['w']['no-tex-ligatures'] == False: # pandoc['options']['r']['smart'] = True ## this is commented out as apparently not needed with pandoc >2.0 return pandoc ================================================ FILE: panzer/const.py ================================================ """ constants for panzer code """ import os DEBUG_TIMING = False USE_OLD_API = False REQUIRE_PANDOC_ATLEAST = "2.0" DEFAULT_SUPPORT_DIR = os.path.join(os.path.expanduser('~'), '.panzer') ENCODING = 'utf8' # keys to access type and content of metadata fields T = 't' C = 'c' # list of 'kind' of items on runlist, in order they should run RUNLIST_KIND = ['preflight', 'filter', 'lua-filter', 'postprocess', 'postflight', 'cleanup'] # 'status' of items on runlist QUEUED = 'queued' RUNNING = 'running' FAILED = 'failed' DONE = 'done' # ast of an empty pandoc document EMPTY_DOCUMENT = {"blocks":[],"pandoc-api-version":[1,17,0,4],"meta":{}} EMPTY_DOCUMENT_OLDAPI = [{'unMeta': {}}, []] # writers that give binary outputs # these cannot be written to stdout BINARY_WRITERS = ['odt', 'docx', 'epub', 'epub3', 'pptx'] # forbidden options for panzer command line PANDOC_BAD_OPTS = [ '--bash-completion', '--dump-args', '--ignore-args', '--list-extensions', '--list-highlight-languages', '--list-highlight-styles', '--list-input-formats', '--list-output-formats', '--print-default-data-file', '--print-default-template', '--print-highlight-style', '-D' ] # forbidden options for 'commandline' metadata field PANDOC_BAD_COMMANDLINE = [ 'bash-completion', 'dump-args', 'filter', 'from', 'help', 'ignore-args', 'list-extensions', 'list-highlight-languages', 'list-highlight-styles', 'list-input-formats', 'list-output-formats', 'lua-filter', 'metadata', 'output', 'print-default-data-file', 'print-default-template', 'print-highlight-style', 'read', 'template', 'to', 'variable', 'version', 'write' ] # 
additive command line options PANDOC_OPT_ADDITIVE = ['metadata', 'variable', 'bibliography', 'include-in-header', 'include-before-body', 'include-after-body', 'css', 'pdf-engine-opt'] # pandoc's command line options, divided by reader or writer PANDOC_OPT_PHASE = { # general options 'data-dir': 'rw', 'log': 'rw', # reader options 'abbreviations': 'r', 'base-header-level': 'r', 'bibliography': 'r', 'citation-abbreviations': 'r', 'csl': 'r', 'default-image-extension': 'r', 'extract-media': 'r', 'file-scope': 'r', 'indented-code-classes': 'r', 'metadata': 'r', 'metadata-file': 'r', 'old-dashes': 'r', 'preserve-tabs': 'r', 'strip-empty-paragraphs': 'r', 'tab-stop': 'r', 'track-changes': 'r', # writer options 'ascii': 'w', 'atx-headers': 'w', 'biblatex': 'w', 'chapters': 'w', 'columns': 'w', 'css': 'w', 'dpi': 'w', 'email-obfuscation': 'w', 'eol': 'w', 'epub-chapter-level': 'w', 'epub-cover-image': 'w', 'epub-embed-font': 'w', 'epub-metadata': 'w', 'epub-subdirectory': 'w', 'gladtex': 'w', 'highlight-style': 'w', 'html-q-tags': 'w', 'id-prefix': 'w', 'include-after-body': 'w', 'include-before-body': 'w', 'include-in-header': 'w', 'incremental': 'w', 'jsmath': 'w', 'katex': 'w', 'katex-stylesheet': 'w', 'latexmathml': 'w', 'listings': 'w', 'mathjax': 'w', 'mathml': 'w', 'mimetex': 'w', 'natbib': 'w', 'no-highlight': 'w', 'no-tex-ligatures': 'w', 'no-wrap': 'w', 'number-offset': 'w', 'number-sections': 'w', 'pdf-engine': 'w', 'pdf-engine-opt': 'w', 'reference-doc': 'w', 'reference-links': 'w', 'reference-location': 'w', 'request-header': 'w', 'resource-path': 'w', 'section-divs': 'w', 'self-contained': 'w', 'slide-level': 'w', 'standalone': 'w', 'syntax-definition': 'w', 'table-of-contents': 'w', 'title-prefix': 'w', 'toc-depth': 'w', 'top-level-division': 'w', 'variable': 'w', 'verbose': 'w', 'webtex': 'w', 'wrap': 'w' } # Adapted from https://github.com/jgm/pandoc/blob/master/pandoc.hs#L841 PANDOC_WRITER_MAPPING = { "" : "markdown", ".tex" : "latex", ".latex" : "latex", 
".ltx" : "latex", ".context" : "context", ".ctx" : "context", ".rtf" : "rtf", ".rst" : "rst", ".s5" : "s5", ".native" : "native", ".json" : "json", ".txt" : "markdown", ".text" : "markdown", ".md" : "markdown", ".markdown" : "markdown", ".textile" : "textile", ".lhs" : "markdown+lhs", ".texi" : "texinfo", ".texinfo" : "texinfo", ".db" : "docbook", ".odt" : "odt", ".docx" : "docx", ".epub" : "epub", ".org" : "org", ".asciidoc" : "asciidoc", ".adoc" : "asciidoc", ".fb2" : "fb2", ".opml" : "opml", ".icml" : "icml", ".tei.xml" : "tei", ".tei" : "tei", ".ms" : "ms", ".roff" : "ms", ".pptx" : "pptx", ".1" : "man", ".2" : "man", ".3" : "man", ".4" : "man", ".5" : "man", ".6" : "man", ".7" : "man", ".8" : "man", ".9" : "man" } ================================================ FILE: panzer/document.py ================================================ """ panzer document class and its methods """ import json import os import pandocfilters import subprocess import sys from . import error from . import meta from . import util from . import info from . 
import const class Document(object): """ representation of pandoc/panzer documents - ast: pandoc abstract syntax tree of document - style: list of styles for document - stylefull: full list of styles including all parents - styledef: style definitions - runlist: run list for document - options: panzer and pandoc command line options - template: template for document - output: string filled with output when processing complete """ # # disable pylint warnings: # + Too many instance attributes # pylint: disable=R0902 # def __init__(self): """ new blank document """ # - defaults if const.USE_OLD_API: self.ast = const.EMPTY_DOCUMENT_OLDAPI else: self.ast = const.EMPTY_DOCUMENT self.style = list() self.stylefull = list() self.styledef = dict() self.runlist = list() self.template = None self.output = None self.options = { 'panzer': { 'panzer_support' : const.DEFAULT_SUPPORT_DIR, 'pandoc' : 'pandoc', 'debug' : str(), 'quiet' : False, 'strict' : False, 'stdin_temp_file' : str() }, 'pandoc': { 'input' : ['-'], 'output' : '-', 'pdf_output' : False, 'read' : str(), 'write' : str(), 'template' : str(), 'filter' : list(), 'lua_filter' : list(), 'options' : {'r': dict(), 'w': dict()}, 'mutable' : {'r': dict(), 'w': dict()} } } def empty(self): """ empty document of all its content but `self.options` (used when re-reading from input with new reader command line options) """ # - defaults if const.USE_OLD_API: self.ast = const.EMPTY_DOCUMENT_OLDAPI else: self.ast = const.EMPTY_DOCUMENT self.style = list() self.stylefull = list() self.styledef = dict() self.runlist = list() self.template = None self.output = None def populate(self, ast, global_styledef, local_styledef): """ populate document's: `self.ast`, `self.styledef`, `self.style`, `self.stylefull` remaining fields: `self.template` - set after 'transform' applied `self.runlist` - set after 'transform' applied `self.output` - set after 'pandoc' applied """ # - set self.ast: if ast: self.ast = ast else: info.log('ERROR', 'panzer', 
'source document(s) empty') # - check if panzer_reserved key already exists in metadata metadata = self.get_metadata() try: meta.get_content(metadata, 'panzer_reserved') info.log('ERROR', 'panzer', 'special field "panzer_reserved" already in metadata' '---will be overwritten') except error.MissingField: pass # - set self.styledef self.populate_styledef(global_styledef, local_styledef) # - set self.style and self.stylefull self.populate_style() # - remove any styledef not used in doc self.styledef = {key: self.styledef[key] for key in self.styledef if key in self.stylefull} def populate_styledef(self, global_styledef, local_styledef): """ populate `self.styledef` from `global_styledef` `local_styledef` document style definition inside `styledef` metadata field """ info.log('INFO', 'panzer', info.pretty_title('style definitions')) # - print global style definitions if global_styledef: info.log('INFO', 'panzer', 'global:') for line in info.pretty_keys(global_styledef): info.log('INFO', 'panzer', ' ' + line) else: info.log('INFO', 'panzer', 'no global definitions loaded') # - print local style definitions overridden = dict() if local_styledef: info.log('INFO', 'panzer', 'local:') for line in info.pretty_keys(local_styledef): info.log('INFO', 'panzer', ' ' + line) # - extract and print document style definitions indoc_styledef = dict() try: indoc_styledef = meta.get_content(self.get_metadata(), 'styledef', 'MetaMap') info.log('INFO', 'panzer', 'document:') for line in info.pretty_keys(indoc_styledef): info.log('INFO', 'panzer', ' ' + line) except error.MissingField as err: info.log('DEBUG', 'panzer', err) except error.WrongType as err: info.log('ERROR', 'panzer', err) # - update the style definitions (self.styledef).update(global_styledef) (self.styledef).update(local_styledef) (self.styledef).update(indoc_styledef) # - print messages about overriding messages = list() messages += ['local document definition of "%s" overrides global definition of "%s"' % (key, key) for 
key in self.styledef if key in local_styledef and key in global_styledef] messages += ['document definition of "%s" overrides local definition of "%s"' % (key, key) for key in self.styledef if key in indoc_styledef and key in local_styledef] messages += ['document definition of "%s" overrides global definition of "%s"' % (key, key) for key in self.styledef if key in indoc_styledef and key in global_styledef and key not in local_styledef] for m in messages: info.log('INFO', 'panzer', m) def populate_style(self): """ populate `self.style` and `self.stylefull` """ info.log('INFO', 'panzer', info.pretty_title('document style')) # - try to extract value of style field try: self.style = meta.get_list_or_inline(self.get_metadata(), 'style') if self.style == ['']: raise error.MissingField except error.MissingField: info.log('INFO', 'panzer', 'no "style" field found, run only pandoc') return except error.WrongType as err: info.log('ERROR', 'panzer', err) return info.log('INFO', 'panzer', 'style:') info.log('INFO', 'panzer', info.pretty_list(self.style)) # - expand the style hierarchy self.stylefull = meta.expand_style_hierarchy(self.style, self.styledef) info.log('INFO', 'panzer', 'full hierarchy:') info.log('INFO', 'panzer', info.pretty_list(self.stylefull)) def build_runlist(self): """ populate `self.runlist` using `self.ast`'s metadata """ info.log('INFO', 'panzer', info.pretty_title('run list')) metadata = self.get_metadata() runlist = self.runlist for kind in const.RUNLIST_KIND: # - sanity check try: field_type = meta.get_type(metadata, kind) if field_type != 'MetaList': info.log('ERROR', 'panzer', 'value of field "%s" should be of type "MetaList"' '---found value of type "%s", ignoring it' % (kind, field_type)) continue except error.MissingField: pass # - if 'filter', add filter list specified on command line first if kind == 'filter': for cmd in self.options['pandoc']['filter']: entry = dict() entry['kind'] = 'filter' entry['status'] = const.QUEUED entry['command'] = 
cmd[0] entry['arguments'] = list() runlist.append(entry) # - if 'lua-filter', add filter list specified on command line first elif kind == 'lua-filter': for cmd in self.options['pandoc']['lua_filter']: entry = dict() entry['kind'] = 'lua-filter' entry['status'] = const.QUEUED entry['command'] = cmd[0] entry['arguments'] = list() runlist.append(entry) # - add commands specified in metadata if kind in metadata: entries = meta.get_runlist(metadata, kind, self.options) runlist.extend(entries) # - now some cleanup: # -- filters: add writer as first argument for entry in runlist: if entry['kind'] == 'filter': writer = self.options['pandoc']['write'] strip_exts = (writer.split('+')[0].split('-'))[0] entry['arguments'].insert(0, strip_exts) # -- postprocessors: remove them if output kind is pdf # .. or if a binary writer is selected if self.options['pandoc']['pdf_output'] \ or self.options['pandoc']['write'] in const.BINARY_WRITERS: new_runlist = list() for entry in runlist: if entry['kind'] == 'postprocess': info.log('INFO', 'panzer', 'postprocess "%s" skipped --- output of pandoc is binary file' % entry['command']) continue new_runlist.append(entry) runlist = new_runlist msg = info.pretty_runlist(runlist) for line in msg: info.log('INFO', 'panzer', line) self.runlist = runlist def apply_commandline(self, metadata): """ 1. parse `self.ast`'s `commandline` field 2. 
        apply result to update self.options['pandoc']['options']
        (the result is the set of options used for calling pandoc)
        """
        if 'commandline' not in metadata:
            return
        commandline = meta.parse_commandline(metadata)
        if not commandline:
            return
        self.options['pandoc']['options'] = \
            meta.update_pandoc_options(self.options['pandoc']['options'],
                                       commandline,
                                       self.options['pandoc']['mutable'])

    def lock_commandline(self):
        """
        make the command line options all immutable
        """
        for phase in self.options['pandoc']['mutable']:
            for opt in self.options['pandoc']['mutable'][phase]:
                self.options['pandoc']['mutable'][phase][opt] = False

    def json_message(self, clear=False):
        """
        create json message to pass to executables. This method does 2 things:

        1. injects json message into `panzer_reserved` field of `self.ast`
        2. returns json message as a string

        if 'clear' is set, then embedded json message is removed and None
        returned
        """
        metadata = self.get_metadata()
        # - delete old 'panzer_reserved' key
        if 'panzer_reserved' in metadata:
            del metadata['panzer_reserved']
        if clear:
            self.set_metadata(metadata)
            return None
        # - create a decrapified version of self.options
        # - remove stuff only of internal use to panzer
        options = dict()
        options['panzer'] = dict(self.options['panzer'])
        options['pandoc'] = dict(self.options['pandoc'])
        del options['pandoc']['template']
        del options['pandoc']['filter']
        del options['pandoc']['lua_filter']
        del options['pandoc']['mutable']
        # - build new json_message
        data = [{'metadata': metadata,
                 'template': self.template,
                 'style': self.style,
                 'stylefull': self.stylefull,
                 'styledef': self.styledef,
                 'runlist': self.runlist,
                 'options': options}]
        json_message = json.dumps(data)
        # - inject into metadata
        content = {"json_message": {
            "t": "MetaBlocks",
            "c": [{"t": "CodeBlock",
                   "c": [["", ["json"], []], json_message]}]}}
        meta.set_content(metadata, 'panzer_reserved', content, 'MetaMap')
        self.set_metadata(metadata)
        # - return json_message
        return json_message

    def purge_style_fields(self):
        """
        remove metadata fields from `self.ast` used to call panzer
        """
        # - copy the list, so that `+=` below does not mutate
        #   const.RUNLIST_KIND in place
        kill_list = list(const.RUNLIST_KIND)
        kill_list += ['style']
        kill_list += ['styledef']
        kill_list += ['template']
        kill_list += ['commandline']
        metadata = self.get_metadata()
        new_metadata = {key: metadata[key]
                        for key in metadata
                        if key not in kill_list}
        self.set_metadata(new_metadata)

    def get_metadata(self):
        """ return metadata branch of `self.ast` """
        return meta.get_metadata(self.ast)

    def set_metadata(self, new_metadata):
        """ set metadata branch of `self.ast` to `new_metadata` """
        if const.USE_OLD_API:
            try:
                self.ast[0]['unMeta'] = new_metadata
            except (IndexError, KeyError):
                self.ast = const.EMPTY_DOCUMENT_OLDAPI
                self.ast[0]['unMeta'] = new_metadata
        else:
            self.ast['meta'] = new_metadata

    def transform(self):
        """ transform `self` by applying styles listed in `self.stylefull` """
        writer = self.options['pandoc']['write']
        info.log('INFO', 'panzer', 'writer:')
        info.log('INFO', 'panzer', ' %s' % writer)
        # 1. Do transform
        # - start with blank metadata
        new_metadata = dict()
        # - apply styles, first to last
        for style in self.stylefull:
            all_s = meta.get_nested_content(self.styledef, [style, 'all'], 'MetaMap')
            new_metadata = meta.update_metadata(new_metadata, all_s)
            self.apply_commandline(all_s)
            cur_s = meta.get_nested_content(self.styledef, [style, writer], 'MetaMap')
            new_metadata = meta.update_metadata(new_metadata, cur_s)
            self.apply_commandline(cur_s)
        # - add in metadata from the document itself
        indoc_data = self.get_metadata()
        # -- add items from additive fields in indoc metadata
        new_metadata = meta.update_additive_lists(new_metadata, indoc_data)
        for field in const.RUNLIST_KIND:
            if field in indoc_data:
                del indoc_data[field]
        # -- add all other (non-additive) fields in
        new_metadata.update(indoc_data)
        # -- apply items from indoc `commandline` field
        self.apply_commandline(indoc_data)
        # 2.
Apply kill rules to trim run lists for field in const.RUNLIST_KIND: try: original_list = meta.get_content(new_metadata, field, 'MetaList') trimmed_list = meta.apply_kill_rules(original_list) if trimmed_list: meta.set_content(new_metadata, field, trimmed_list, 'MetaList') else: # if all items killed, delete field del new_metadata[field] except error.MissingField: continue except error.WrongType as err: info.log('WARNING', 'panzer', err) continue # 3. Set template try: if meta.get_type(new_metadata, 'template') == 'MetaInlines': template_raw = meta.get_content(new_metadata, 'template', 'MetaInlines') template_str = pandocfilters.stringify(template_raw) elif meta.get_type(new_metadata, 'template') == 'MetaString': template_str = meta.get_content(new_metadata, 'template', 'MetaString') if template_str == '': raise error.MissingField else: raise error.WrongType self.template = util.resolve_path(template_str, 'template', self.options) except (error.MissingField, error.WrongType) as err: info.log('DEBUG', 'panzer', err) if self.template: info.log('INFO', 'panzer', info.pretty_title('template')) info.log('INFO', 'panzer', ' %s' % info.pretty_path(self.template)) # 4. 
Update document's metadata self.set_metadata(new_metadata) def run_scripts(self, kind, do_not_stop=False): """ execute commands of type `kind` listed in `self.runlist` `do_not_stop`: runlist executed no matter what errors occur (used by cleanup scripts) """ # - check if no run list to run to_run = [entry for entry in self.runlist if entry['kind'] == kind] if not to_run: return info.log('INFO', 'panzer', info.pretty_title(kind)) # - maximum number of executables to run for i, entry in enumerate(self.runlist): # - skip entries that are not of the right kind if entry['kind'] != kind: continue # - build the command to run command = [entry['command']] + entry['arguments'] filename = os.path.basename(entry['command']) info.log('INFO', 'panzer', info.pretty_runlist_entry(i, len(self.runlist), entry['command'], entry['arguments'])) info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command)) # - run the command stderr = str() try: entry['status'] = const.RUNNING process = subprocess.Popen(' '.join(command), stdin=subprocess.PIPE, stderr=subprocess.PIPE, shell=True) # send panzer's json message to scripts via stdin in_pipe = self.json_message() in_pipe_bytes = in_pipe.encode(const.ENCODING) stderr_bytes = process.communicate(input=in_pipe_bytes)[1] entry['status'] = const.DONE stderr = stderr_bytes.decode(const.ENCODING) if stderr: entry['stderr'] = info.decode_stderr_json(stderr) except OSError as err: entry['status'] = const.FAILED info.log('ERROR', filename, err) continue except Exception as err: # pylint: disable=W0703 # if do_not_stop: always run next script # disable pylint warnings: # + Catching too general exception entry['status'] = const.FAILED if do_not_stop: info.log('ERROR', filename, err) continue else: raise finally: info.log_stderr(stderr, filename) def jsonfilter(self): """ pipe through external command listed in filters """ to_run = [entry for entry in self.runlist if entry['kind'] == 'filter'] if not to_run: return info.log('INFO', 'panzer', 
                 info.pretty_title('filter'))
        # Run commands
        for i, entry in enumerate(self.runlist):
            if entry['kind'] != 'filter':
                continue
            # - add debugging info
            command = [entry['command']] + entry['arguments']
            filename = os.path.basename(entry['command'])
            info.log('INFO', 'panzer',
                     info.pretty_runlist_entry(i,
                                               len(self.runlist),
                                               entry['command'],
                                               entry['arguments']))
            info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command))
            # - run the command and log any errors
            stderr = str()
            try:
                entry['status'] = const.RUNNING
                self.json_message()
                # Set up incoming pipe
                in_pipe = json.dumps(self.ast)
                # Set up outgoing pipe in case of failure
                out_pipe = in_pipe
                process = subprocess.Popen(' '.join(command),
                                           stderr=subprocess.PIPE,
                                           stdin=subprocess.PIPE,
                                           stdout=subprocess.PIPE,
                                           shell=True)
                in_pipe_bytes = in_pipe.encode(const.ENCODING)
                out_pipe_bytes, stderr_bytes = \
                    process.communicate(input=in_pipe_bytes)
                entry['status'] = const.DONE
                out_pipe = out_pipe_bytes.decode(const.ENCODING)
                stderr = stderr_bytes.decode(const.ENCODING)
                if stderr:
                    entry['stderr'] = info.decode_stderr_json(stderr)
            except OSError as err:
                entry['status'] = const.FAILED
                info.log('ERROR', filename, err)
                continue
            except Exception:
                entry['status'] = const.FAILED
                raise
            finally:
                # log messages the filter wrote to stderr
                info.log_stderr(stderr, filename)
            # 4. Update document's data with output from commands
            try:
                self.ast = json.loads(out_pipe)
                # remove embedded json message
                self.json_message(clear=True)
            except ValueError:
                info.log('ERROR', 'panzer',
                         'failed to receive json object from filter'
                         '---skipping filter')
                continue

    def pandoc(self):
        """ run pandoc on document

        Normally, input to pandoc is passed via stdin and output is received
        via stdout. The exception is when the output file has a .pdf extension
        or a binary writer is selected. Then the output is simply the binary
        file, which panzer does not process further, and the internal document
        is not updated by pandoc.
        """
        # 1.
Build pandoc command command = [self.options['panzer']['pandoc']] command += ['-'] command += ['--read', 'json'] command += ['--write', self.options['pandoc']['write']] command += ['--output', self.options['pandoc']['output']] # - template specified on cli has precedence if self.options['pandoc']['template']: command += ['--template=%s' % self.options['pandoc']['template']] elif self.template: command += ['--template=%s' % self.template] opts = meta.build_cli_options(self.options['pandoc']['options']['w']) command += opts info.log('INFO', 'panzer', info.pretty_title('pandoc write')) # - add lua filters luaopts = list() for i, entry in enumerate(self.runlist): if entry['kind'] != 'lua-filter': continue command += ['--lua-filter', entry['command']] luaopts += ['--lua-filter', entry['command']] info.log('INFO', 'panzer', info.pretty_runlist_entry(i, len(self.runlist), entry['command'], entry['arguments'])) # 2. Prefill input and output pipes in_pipe = json.dumps(self.ast) out_pipe = str() stderr = str() # 3. 
Run pandoc command if opts or luaopts: info.log('INFO', 'panzer', 'pandoc writing with options:') info.log('INFO', 'panzer', info.pretty_list(opts + luaopts, separator=' ')) else: info.log('INFO', 'panzer', 'running') info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command)) try: info.time_stamp('ready to do popen') process = subprocess.Popen(command, stderr=subprocess.PIPE, stdin=subprocess.PIPE, stdout=subprocess.PIPE) info.time_stamp('popen done') in_pipe_bytes = in_pipe.encode(const.ENCODING) out_pipe_bytes, stderr_bytes = \ process.communicate(input=in_pipe_bytes) info.time_stamp('communicate done') out_pipe = out_pipe_bytes.decode(const.ENCODING) stderr = stderr_bytes.decode(const.ENCODING) except OSError as err: info.log('ERROR', 'pandoc', err) finally: info.log_stderr(stderr) sys.stdout.buffer.write(out_pipe_bytes) sys.stdout.flush() # mark all lua filters as 'done' for entry in self.runlist: if entry['kind'] == 'lua-filter': entry['status'] = const.DONE # 4. Capture output of pandoc if it is to stdout if self.options['pandoc']['pdf_output'] \ or self.options['pandoc']['write'] in const.BINARY_WRITERS: # do nothing with a binary output pass elif self.options['pandoc']['output'] == '-': self.output = out_pipe def postprocess(self): """ postprocess through external command listed in 'postprocess' """ to_run = [entry for entry in self.runlist if entry['kind'] == 'postprocess'] if not to_run: return info.log('INFO', 'panzer', info.pretty_title('postprocess')) # prepare the input # case 1: pandoc output written to stdout if self.options['pandoc']['output'] == '-': in_pipe = self.output info.log('INFO', 'panzer', "input read from pandoc's stdout") # case 2: pandoc output written to file else: with open(self.options['pandoc']['output'], 'r', encoding=const.ENCODING) as fp: in_pipe = fp.read() info.log('INFO', 'panzer', 'input read from "%s"' % self.options['pandoc']['output']) # Run commands for i, entry in enumerate(self.runlist): if entry['kind'] != 
'postprocess': continue # - add debugging info command = [entry['command']] + entry['arguments'] filename = os.path.basename(entry['command']) info.log('INFO', 'panzer', info.pretty_runlist_entry(i, len(self.runlist), entry['command'], entry['arguments'])) info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command)) # - run the command and log any errors stderr = str() try: entry['status'] = const.RUNNING # Set up outgoing pipe in case of failure out_pipe = in_pipe process = subprocess.Popen(' '.join(command), stderr=subprocess.PIPE, stdin=subprocess.PIPE, stdout=subprocess.PIPE, shell=True) in_pipe_bytes = in_pipe.encode(const.ENCODING) out_pipe_bytes, stderr_bytes = \ process.communicate(input=in_pipe_bytes) entry['status'] = const.DONE out_pipe = out_pipe_bytes.decode(const.ENCODING) stderr = stderr_bytes.decode(const.ENCODING) if stderr: entry['stderr'] = info.decode_stderr_json(stderr) except OSError as err: entry['status'] = const.FAILED info.log('ERROR', filename, err) continue except Exception: entry['status'] = const.FAILED raise finally: info.log_stderr(stderr, filename) self.output = out_pipe in_pipe = out_pipe # 4. 
write final output
        # case 1: stdout as output
        if self.options['pandoc']['output'] == '-':
            # encode `out_pipe` rather than reuse `out_pipe_bytes`, which is
            # unbound if no postprocessor could be started
            sys.stdout.buffer.write(out_pipe.encode(const.ENCODING))
            sys.stdout.flush()
            info.log('INFO', 'panzer', 'output written to stdout')
            info.log('DEBUG', 'panzer', 'output written to stdout by panzer')
        # case 2: output to file
        else:
            with open(self.options['pandoc']['output'], 'w',
                      encoding=const.ENCODING) as output_file:
                output_file.write(out_pipe)
                output_file.flush()
            info.log('INFO', 'panzer', 'output written to "%s"'
                     % self.options['pandoc']['output'])

================================================
FILE: panzer/error.py
================================================
""" Exception classes for panzer """

class PanzerError(Exception):
    """ base class for all panzer exceptions """
    pass

class SetupError(PanzerError):
    """ error in the setup phase """
    pass

class BadASTError(PanzerError):
    """ malformatted AST encountered (e.g. C or T fields missing) """
    pass

class BadArgsFormat(PanzerError):
    """ args field for item in run list has incorrect format """
    pass

class NoArgsAllowed(PanzerError):
    """ no command line arguments allowed to be passed to lua filters """
    pass

class MissingField(PanzerError):
    """ looked for metadata field, did not find it """
    pass

class WrongType(PanzerError):
    """ looked for value of a type, encountered different type """
    pass

class InternalError(PanzerError):
    """ function invoked with invalid parameters """
    pass

class StrictModeError(PanzerError):
    """
    An error on `--strict` mode that causes panzer to exit
    - On `--strict` mode: exception raised if any error of level 'ERROR' or
      above is logged
    - Without `--strict` mode: exception never raised
    """
    pass

================================================
FILE: panzer/info.py
================================================
""" functions for logging and printing info """
import json
import logging
import logging.config
import os
import time
from . import const
from .
import error

# - lookup table for internal strings to logging levels
LEVELS = {
    'CRITICAL' : logging.CRITICAL,
    'ERROR'    : logging.ERROR,
    'WARNING'  : logging.WARNING,
    'INFO'     : logging.INFO,
    'DEBUG'    : logging.DEBUG,
    'NOTSET'   : logging.NOTSET
}

def start_logger(options):
    """ start the logger """
    # - default configuration
    config = {
        'version'                  : 1,
        'disable_existing_loggers' : False,
        'formatters': {
            'detailed': {
                'format': '%(asctime)s - %(levelname)s - %(message)s'
            },
            'minimal': {
                'format': '%(message)s'
            }
        },
        'handlers': {
            'log_file_handler': {
                'class'     : 'logging.FileHandler',
                'level'     : 'DEBUG',
                'formatter' : 'detailed',
                'filename'  : options['panzer']['debug'] + '.log',
                'encoding'  : const.ENCODING
            },
            'console': {
                'class'     : 'logging.StreamHandler',
                'level'     : 'INFO',
                'formatter' : 'minimal',
                'stream'    : 'ext://sys.stderr'
            }
        },
        'loggers': {
            __name__: {
                'handlers'  : ['console', 'log_file_handler'],
                'level'     : 'DEBUG',
                'propagate' : True
            }
        }
    }
    # - set 'debug' mode if requested
    if not options['panzer']['debug']:
        config['loggers'][__name__]['handlers'].remove('log_file_handler')
        del config['handlers']['log_file_handler']
    else:
        # - delete old log file if it exists
        # - don't see value in keeping old logs here...
filename = config['handlers']['log_file_handler']['filename'] if os.path.exists(filename): os.remove(filename) # - set 'quiet' mode if requested if options['panzer']['quiet']: verbosity_level = 'WARNING' else: verbosity_level = 'INFO' config['handlers']['console']['level'] = verbosity_level # - set 'strict' mode if requested if options['panzer']['strict']: log.strict_mode = True # - send configuration to logger logging.config.dictConfig(config) log('DEBUG', 'panzer', pretty_start_log('panzer starts')) log('DEBUG', 'panzer', pretty_title('OPTIONS')) log('DEBUG', 'panzer', pretty_json_repr(options)) def log(level_str, sender, message): """ send a log message """ my_logger = logging.getLogger(__name__) # set strict_mode to default value, if not already set if not hasattr(log, "strict_mode"): log.strict_mode = False # - lookup table for internal strings to pretty output strings pretty_levels = { 'CRITICAL' : 'FATAL: ', 'ERROR' : 'ERROR: ', 'WARNING' : 'WARNING: ', 'INFO' : ' ', 'DEBUG' : ' ', 'NOTSET' : ' ' } message = str(message) sender_str = '' message_str = '' level = LEVELS.get(level_str, LEVELS['ERROR']) # -- level pretty_level_str = pretty_levels.get(level_str, pretty_levels['ERROR']) # -- sender if sender != 'panzer': # sender_str = ' ' + sender + ': ' sender_str = ' ' # -- message message_str = message output = '' output += pretty_level_str output += sender_str output += message_str my_logger.log(level, output) # - if 'strict' mode and error logged, raise exception to exit panzer if log.strict_mode and (level_str == 'ERROR' or level_str == 'CRITICAL'): log.strict_mode = False raise error.StrictModeError def go_quiet(): """ force logging level to be --quiet """ my_logger = logging.getLogger(__name__) my_logger.setLevel(LEVELS['WARNING']) def go_loud(options): """ return logging level to that set in options """ my_logger = logging.getLogger(__name__) if options['panzer']['quiet']: verbosity_level = 'WARNING' else: verbosity_level = 'INFO' 
my_logger.setLevel(LEVELS[verbosity_level]) def decode_stderr_json(stderr): """ return a list of decoded json messages in stderr """ # - check for blank input if not stderr: # - nothing to do return list() # - split the input (based on newlines) into list of json strings output = list() for line in stderr.split('\n'): if not line: # - skip blank lines: no valid json or message to decode continue json_message = list() try: json_message = json.loads(line) except ValueError: # - if json cannot be decoded, just log as ERROR prefixed by '!' json_message = {'level': 'ERROR', 'message': '!' + line} output.append(json_message) return output def log_stderr(stderr, sender=str()): """ send a log from external executable """ # 1. check for blank input if not stderr: # - nothing to do return # 2. get a string with sender's name if sender: # - remove file extension from sender's name if present sender = os.path.splitext(sender)[0] # 3. now handle the messages sent by sender json_message = decode_stderr_json(stderr) for item in json_message: level = item['level'] message = item['message'] log(level, sender, message) def pretty_keys(dictionary): """ return pretty printed list of dictionary keys, num per line """ if not dictionary: return [] # - number of keys printed per line num = 5 # - turn into sorted list keys = list(dictionary.keys()) keys.sort() # - fill with blank elements to width num missing = (len(keys) % num) if missing != 0: to_add = num - missing keys.extend([''] * to_add) # - turn into 2D matrix matrix = [[keys[i+j] for i in range(0, num)] for j in range(0, len(keys), num)] # - calculate max width for each column len_matrix = [[len(col) for col in row] for row in matrix] max_len_col = [max([row[j] for row in len_matrix]) for j in range(0, num)] # - pad with spaces matrix = [[row[j].ljust(max_len_col[j]) for j in range(0, num)] for row in matrix] # - return list of lines to print matrix = [' '.join(row) for row in matrix] return matrix def pretty_list(input_list, 
separator=', '): """ return pretty printed list """ if input_list: output = ' %s' % separator.join(input_list) else: output = ' empty' return output def pretty_json_repr(data): """ return pretty printed data as a json """ return json.dumps(data, sort_keys=True, indent=2) def pretty_title(title): """ return pretty printed section title """ output = '-' * 5 + ' ' + title.lower() + ' ' + '-' * 5 return output def pretty_start_log(title): """ return pretty printed title for starting log """ output = '>' * 10 + ' ' + title + ' ' + '<' * 10 return output def pretty_end_log(title): """ return pretty printed title for ending log """ output = '>' * 10 + ' ' + title + ' ' + '<' * 10 + '\n\n' return output def pretty_path(input_path): """ return path string replacing '~' for home directory """ home_path = os.path.expanduser('~') cwd_path = os.getcwd() output_path = input_path.replace(home_path, '~').replace(cwd_path, './') return output_path def pretty_runlist(runlist): """ return pretty printed runlist """ if not runlist: return [' empty'] output = list() current_kind = str() for i, entry in enumerate(runlist): if current_kind != entry['kind']: output.append(entry['kind'] + ':') current_kind = entry['kind'] basename = pretty_path(entry['command']) if entry['arguments']: basename += ' ' basename += ' '.join(entry['arguments']) line = '%d' % (i+1) line = line.rjust(3, ' ') line += ' %s' % basename output.append(line) return output def pretty_runlist_entry(num, max_num, command, arguments): """ return pretty printed run list entry """ basename = command if arguments: basename += ' ' basename += ' '.join(arguments) cur = '%d' % (num+1) cur = cur.rjust(len(str(max_num)), ' ') line = ' [%s/%d] %s' % (cur, max_num, basename) return line def time_stamp(text): """ print time since first & previous time_stamp call """ if not const.DEBUG_TIMING: return try: now = time.time() - time_stamp.start except AttributeError: time_stamp.start = time.time() now = 0 try: elapsed = now - 
time_stamp.last except AttributeError: elapsed = 0 now_str = str(round(now * 1000)).rjust(7) now_str += ' msec' now_str += ' ' now_str += text.ljust(30) if elapsed * 1000 > 1: now_str += str(round(elapsed * 1000)).rjust(7) now_str += ' msec' else: now_str += ' ' * 12 time_stamp.last = now print(now_str) ================================================ FILE: panzer/load.py ================================================ """ loading documents into panzer """ import os import json import subprocess from . import error from . import info from . import const from . import meta def load(options): """ return ast from running pandoc on input documents """ # 1. Build pandoc command command = [options['panzer']['pandoc']] command += options['pandoc']['input'].copy() if options['pandoc']['read']: command += ['--read', options['pandoc']['read']] command += ['--write', 'json', '--output', '-'] opts = meta.build_cli_options(options['pandoc']['options']['r']) command += opts info.log('INFO', 'panzer', info.pretty_title('pandoc read')) info.log('DEBUG', 'panzer', 'loading source document(s)') info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command)) if opts: info.log('INFO', 'panzer', 'pandoc reading with options:') info.log('INFO', 'panzer', info.pretty_list(opts, separator=' ')) else: info.log('INFO', 'panzer', 'running') out_pipe = str() stderr = str() ast = None try: process = subprocess.Popen(command, stderr=subprocess.PIPE, stdout=subprocess.PIPE) out_pipe_bytes, stderr_bytes = process.communicate() out_pipe = out_pipe_bytes.decode(const.ENCODING) stderr = stderr_bytes.decode(const.ENCODING) except OSError as err: info.log('ERROR', 'pandoc', err) finally: info.log_stderr(stderr) try: ast = json.loads(out_pipe) except ValueError: raise error.BadASTError('failed to receive valid ' 'json object from pandoc') return ast def load_all_styledefs(options): """ return global, local styledef pair finds global styledef from `.panzer/styles/*.{yaml,yml}` finds local styledef from 
    `./styles/*.{yaml,yml}`
    """
    support_dir = options['panzer']['panzer_support']
    info.log('DEBUG', 'panzer', 'loading global style definitions file')
    global_styledef = load_styledef(support_dir, options)
    if global_styledef == {}:
        info.log('WARNING', 'panzer', 'no global style definitions found')
    info.log('DEBUG', 'panzer', 'loading local style definitions file')
    local_styledef = load_styledef('.', options)
    return global_styledef, local_styledef

def load_styledef(path, options):
    """ return metadata branch as dict of styledef file at `path`

    reads from `path/styles/*.{yaml,yml}`
    (if this fails, checks `path/styles.yaml` as legacy option)
    returns {} if no metadata found
    """
    # - read in style definition data from yaml files
    styles_dir = os.path.join(path, 'styles')
    filenames = list()
    # - read from .panzer/styles/*.{yaml,yml}
    if os.path.exists(styles_dir):
        filenames = [os.path.join(path, 'styles', f)
                     for f in os.listdir(styles_dir)
                     if f.endswith('.yaml') or f.endswith('.yml')]
    # - read .panzer/styles.yaml -- legacy option
    elif os.path.exists(os.path.join(path, 'styles.yaml')):
        filenames = [os.path.join(path, 'styles.yaml')]
    data = list()
    for f in filenames:
        with open(f, 'r', encoding=const.ENCODING) as styles_file:
            data += styles_file.readlines()
            data += ['\n']
    if data == []:
        return dict()
    # - top and tail with metadata markings
    data.insert(0, "---\n")
    data.append("...\n")
    data_string = ''.join(data)
    # - build pandoc command
    command = [options['panzer']['pandoc']]
    command += ['-']
    command += ['--write', 'json']
    command += ['--output', '-']
    opts = meta.build_cli_options(options['pandoc']['options']['r'])
    # - remove inappropriate options for styles.yaml
    # - `opts` holds strings of the form '--NAME' or '--NAME=VALUE', so
    #   compare on NAME rather than on the whole string
    BAD_OPTS = ['metadata', 'track-changes', 'extract-media']
    opts = [x for x in opts
            if x.lstrip('-').split('=')[0] not in BAD_OPTS]
    command += opts
    info.log('DEBUG', 'panzer', 'run "%s"' % ' '.join(command))
    # - send to pandoc to convert to json
    in_pipe = data_string
    out_pipe = ''
    stderr = ''
    try:
        process = subprocess.Popen(command,
stderr=subprocess.PIPE, stdin=subprocess.PIPE, stdout=subprocess.PIPE) in_pipe_bytes = in_pipe.encode(const.ENCODING) out_pipe_bytes, stderr_bytes = process.communicate(input=in_pipe_bytes) out_pipe = out_pipe_bytes.decode(const.ENCODING) stderr = stderr_bytes.decode(const.ENCODING) except OSError as err: info.log('ERROR', 'pandoc', err) finally: info.log_stderr(stderr) # - convert json to python dict ast = None try: ast = json.loads(out_pipe) except ValueError: raise error.BadASTError('failed to receive valid ' 'json object from pandoc') # - return metadata branch of dict if not ast: return dict() else: return meta.get_metadata(ast) ================================================ FILE: panzer/meta.py ================================================ """ Functions for manipulating metadata """ import pandocfilters import shlex from . import const from . import info from . import util from . import error def update_metadata(old, new): """ return `old` updated with `new` metadata """ # 1. Update with values in 'metadata' field try: old.update(get_content(new, 'metadata', 'MetaMap')) except (error.MissingField, KeyError): pass except error.WrongType as err: info.log('WARNING', 'panzer', err) # 2. Update with values in fields for additive lists old = update_additive_lists(old, new) # 3. 
Update 'template' field if 'template' in new: old['template'] = new['template'] return old def update_additive_lists(old, new): """ return old updated with info from additive lists in new """ for field in const.RUNLIST_KIND: try: try: new_list = get_content(new, field, 'MetaList') except error.MissingField: # field not in incoming metadata, move to next list continue try: old_list = get_content(old, field, 'MetaList') except error.MissingField: # field not in old metadata, start with an empty list old_list = list() except error.WrongType as err: # wrong type of value under field, skip to next list info.log('WARNING', 'panzer', err) continue old_list.extend(new_list) set_content(old, field, old_list, 'MetaList') return old def apply_kill_rules(old_list): """ return old_list after applying kill rules """ new_list = list() for item in old_list: # 1. Sanity checks check_c_and_t_exist(item) item_content = item[const.C] item_type = item[const.T] if item_type != 'MetaMap': info.log('ERROR', 'panzer', 'fields "' + '", "'.join(const.RUNLIST_KIND) + '" ' 'value must be of type "MetaMap"---ignoring 1 item') continue if len(item_content.keys() & {'run', 'kill', 'killall'}) != 1: info.log('ERROR', 'panzer', 'must contain exactly one "run", "kill", ' 'or "killall" per item---ignoring 1 item') continue # 2. 
Now operate on content if 'run' in item_content: if get_type(item_content, 'run') != 'MetaInlines': info.log('ERROR', 'panzer', '"run" value must be of type "MetaInlines"' '---ignoring 1 item') continue new_list.append(item) elif 'kill' in item_content: try: to_be_killed = get_content(item_content, 'kill', 'MetaInlines') except error.WrongType as err: info.log('WARNING', 'panzer', err) continue new_list = [i for i in new_list if get_content(i[const.C], 'run', 'MetaInlines') != to_be_killed] continue elif 'killall' in item_content: try: if get_content(item_content, 'killall', 'MetaBool') is True: new_list = list() except error.WrongType as err: info.log('WARNING', 'panzer', err) continue else: # Should never occur, caught by previous syntax check continue return new_list def get_nested_content(metadata, fields, expected_type_of_leaf=None): """ return content of field by traversing a list of MetaMaps args: metadata : dictionary to traverse fields : list of fields to traverse in dictionary from shallowest to deepest. Content of every field, except the last, must be type 'MetaMap' (otherwise fields could not be traversed). The content of final field in the list is returned. expected_type_of_leaf : (optional) expected type of final field's content Returns: content of final field in list, or the empty dict ({}) if field of expected type is not found """ current_field = fields.pop(0) try: # If on a branch... if fields: next_content = get_content(metadata, current_field, 'MetaMap') return get_nested_content(next_content, fields, expected_type_of_leaf) # Else on a leaf... 
else: return get_content(metadata, current_field, expected_type_of_leaf) except error.MissingField: # current_field not found, return {}: nothing to update return dict() except error.WrongType as err: info.log('WARNING', 'panzer', err) # wrong type found, return {}: nothing to update return dict() def get_content(metadata, field, expected_type=None): """ return content of field """ if field not in metadata: raise error.MissingField('field "%s" not found' % field) check_c_and_t_exist(metadata[field]) if expected_type: found_type = metadata[field][const.T] if found_type != expected_type: raise error.WrongType('value of "%s": expecting type "%s", ' 'but found type "%s"' % (field, expected_type, found_type)) return metadata[field][const.C] def get_type(metadata, field): """ return type of field """ if field not in metadata: raise error.MissingField('field "%s" not found' % field) check_c_and_t_exist(metadata[field]) return metadata[field][const.T] def set_content(metadata, field, content, content_type): """ set content and type of field in metadata """ metadata[field] = {const.C: content, const.T: content_type} def get_list_or_inline(metadata, field): """ return content of MetaList or MetaInlines item coerced as list """ field_type = get_type(metadata, field) if field_type == 'MetaInlines': content_raw = get_content(metadata, field, 'MetaInlines') content = [pandocfilters.stringify(content_raw)] return content elif field_type == 'MetaString': content_raw = get_content(metadata, field, 'MetaString') content = [content_raw] return content elif field_type == 'MetaList': content = list() for content_raw in get_content(metadata, field, 'MetaList'): content.append(pandocfilters.stringify(content_raw)) return content else: raise error.WrongType('"%s" value must be of type "MetaInlines", ' '"MetaList", or "MetaString"' % field) def get_metadata(ast): """ returns metadata branch of ast or {} if not present """ try: if const.USE_OLD_API: metadata = ast[0]['unMeta'] else: 
metadata = ast['meta'] except KeyError: metadata = dict() return metadata def get_runlist(metadata, kind, options): """ return run list for kind from metadata """ runlist = list() # - return empty list unless entries of kind are in metadata try: metadata_list = get_content(metadata, kind, 'MetaList') except (error.WrongType, error.MissingField) as err: info.log('WARNING', 'panzer', err) return runlist for item in metadata_list: check_c_and_t_exist(item) item_content = item[const.C] # - create new entry entry = dict() entry['kind'] = kind entry['command'] = str() entry['status'] = const.QUEUED # - get entry command command_raw = get_content(item_content, 'run', 'MetaInlines') command_str = pandocfilters.stringify(command_raw) entry['command'] = util.resolve_path(command_str, kind, options) # - get entry arguments entry['arguments'] = list() if 'args' in item_content: try: # - lua filters cannot take arguments if kind == 'lua-filter': raise error.NoArgsAllowed if get_type(item_content, 'args') != 'MetaInlines': raise error.BadArgsFormat args_content = get_content(item_content, 'args', 'MetaInlines') if len(args_content) != 1 \ or args_content[0][const.T] != 'Code': raise error.BadArgsFormat arguments_raw = args_content[0][const.C][1] entry['arguments'] = shlex.split(arguments_raw) except error.NoArgsAllowed: info.log('ERROR', 'panzer', '"%s": lua filters do not take arguments -- arguments ignored' % command_str) entry['arguments'] = list() except error.BadArgsFormat: info.log('ERROR', 'panzer', 'Cannot read "args" of "%s". 
' 'Syntax should be args: "`--ARGUMENTS`"' % command_str)
                entry['arguments'] = list()
        runlist.append(entry)
    return runlist

def check_c_and_t_exist(item):
    """ check item contains both C and T fields """
    if const.C not in item:
        message = 'Value of "%s" corrupt: "C" field missing' % repr(item)
        raise error.BadASTError(message)
    if const.T not in item:
        message = 'Value of "%s" corrupt: "T" field missing' % repr(item)
        raise error.BadASTError(message)

def expand_style_hierarchy(stylelist, styledef):
    """ return stylelist expanded to include all parent styles """
    expanded_list = []
    for style in stylelist:
        if style not in styledef:
            # - style not in styledef tree
            info.log('ERROR', 'panzer',
                     'No style definition found for style "%s" --- ignoring it'
                     % style)
            continue
        defcontent = get_content(styledef, style, 'MetaMap')
        if 'parent' in defcontent:
            # - non-leaf node
            parents = get_list_or_inline(defcontent, 'parent')
            expanded_list.extend(expand_style_hierarchy(parents, styledef))
        expanded_list.append(style)
    return expanded_list

def build_cli_options(dic):
    """
    return a sorted list of command line options specified in the options
    dictionary `dic`
    """
    # - flags
    flags = ['--%s' % opt for opt in dic if dic[opt] is True]
    flags.sort()
    # - key-values
    keyvals = ['--%s=%s' % (opt, dic[opt]) for opt in dic
               if type(dic[opt]) is str]
    keyvals.sort()
    # - repeated key-values
    rkeys = [key for key in dic if type(dic[key]) is list]
    rkeys.sort()
    rkeyvals = list()
    for key in rkeys:
        rkeyvals += ['--%s=%s' % (key, val[0]) for val in dic[key]]
    return flags + keyvals + rkeyvals

def parse_commandline(metadata):
    """
    return a dictionary of pandoc command line options by parsing the
    `commandline` field in metadata; return None if `commandline` is absent
    in metadata
    """
    if 'commandline' not in metadata:
        return None
    field_type = get_type(metadata, 'commandline')
    if field_type != 'MetaMap':
        info.log('ERROR', 'panzer',
                 'Value of field "%s" should be of type "MetaMap"'
                 '---found value of type "%s", ignoring it' %
                 ('commandline', field_type))
        return None
    content = get_content(metadata, 'commandline')
    # 1. remove bad options from `commandline`
    bad_opts = list(const.PANDOC_BAD_COMMANDLINE)
    for key in content:
        if key in bad_opts:
            info.log('ERROR', 'panzer',
                     '"%s" forbidden entry in panzer "commandline" '
                     'map---ignoring' % key)
        if key not in const.PANDOC_OPT_PHASE:
            info.log('ERROR', 'panzer',
                     'do not recognise pandoc command line option "--%s" '
                     'in "commandline" map---ignoring' % key)
            # append the whole key; `+=` would add its characters one by one
            bad_opts.append(key)
    content = {key: content[key] for key in content if key not in bad_opts}
    # 2. parse remaining opts
    commandline = {'r': dict(), 'w': dict()}
    for key in content:
        # 1. extract value of field with name 'key'
        val = None
        val_t = get_type(content, key)
        val_c = get_content(content, key)
        # if value is 'false', set OPTION: False
        if val_t == 'MetaBool' and val_c is False:
            val = False
        # if value is 'true', set OPTION: True
        elif val_t == 'MetaBool' and val_c is True \
                and key not in const.PANDOC_OPT_ADDITIVE:
            val = True
        # if value type is inline code, set OPTION: VAL
        elif val_t == 'MetaInlines':
            if len(val_c) != 1 or val_c[0][const.T] != 'Code':
                info.log('ERROR', 'panzer',
                         'Cannot read option "%s" in "commandline" field. '
                         'Syntax should be OPTION: "`VALUE`"' % key)
                continue
            if key in const.PANDOC_OPT_ADDITIVE:
                val = [get_list_or_inline(content, key)]
            else:
                val = get_list_or_inline(content, key)[0]
        # if value type is list of inline codes, set OPTION: [VALS]
        elif val_t == 'MetaList' and key in const.PANDOC_OPT_ADDITIVE:
            errs = False
            for item in val_c:
                if item[const.T] != 'MetaInlines' \
                        or item[const.C][0][const.T] != 'Code':
                    info.log('ERROR', 'panzer',
                             'Cannot read option "%s" in "commandline" field. '
                             'Syntax should be - OPTION: "`VALUE`"' % key)
                    errs = True
            if not errs:
                val = [[x] for x in get_list_or_inline(content, key)]
            else:
                continue
        # otherwise, signal error
        else:
            info.log('ERROR', 'panzer',
                     'Cannot read entry "%s" with type "%s" in '
                     '"commandline"---ignoring' % (key, val_t))
            continue
        # 2. update commandline dictionary with key, val
        for phase in const.PANDOC_OPT_PHASE[key]:
            commandline[phase][key] = val
    return commandline


def update_pandoc_options(old, new, mutable):
    """
    return dictionary of pandoc command line options 'old' updated with 'new'
    only options marked as mutable can be changed
    """
    for p in ['r', 'w']:
        for key in new[p]:
            # if not a mutable command line option, then skip it
            if not mutable[p][key]:
                continue
            # if 'False', reset old[p][key] to default
            elif new[p][key] is False:
                if type(old[p][key]) is list:
                    old[p][key] = list()
                elif type(old[p][key]) is str:
                    old[p][key] = None
                elif type(old[p][key]) is bool:
                    old[p][key] = False
            # if list, extend old list with new
            elif key in old[p] and type(old[p][key]) is list:
                old[p][key].extend(new[p][key])
            # otherwise, override old with new
            else:
                old[p][key] = new[p][key]
    return old


================================================
FILE: panzer/panzer.py
================================================
#!/usr/bin/env python3
""" panzer: pandoc with styles

for more info:

Author    : Mark Sprevak
Copyright : Copyright 2015, Mark Sprevak
License   : BSD3
"""

import json
import os
import subprocess
import sys
from . import cli
from . import document
from . import error
from . import info
from . import load
from . import meta
from . import util
from . import version

__version__ = version.VERSION


# Main function

def main():
    """ the main function """
    info.time_stamp('panzer started')
    doc = document.Document()
    try:
        doc.options = cli.parse_cli_options(doc.options)
        util.check_pandoc_exists(doc.options)
        old_reader_opts = dict(doc.options['pandoc']['options']['r'])
        info.time_stamp('cli options parsed')
        info.start_logger(doc.options)
        info.time_stamp('logger started')
        util.check_support_directory(doc.options)
        info.time_stamp('support directory checked')
        global_styledef, local_styledef = load.load_all_styledefs(doc.options)
        info.time_stamp('local + global styledefs loaded')
        ast = load.load(doc.options)
        info.time_stamp('document loaded')
        doc.populate(ast, global_styledef, local_styledef)
        doc.transform()
        doc.lock_commandline()
        new_reader_opts = doc.options['pandoc']['options']['r']
        # check if `commandline` contains any new reader options
        if new_reader_opts != old_reader_opts:
            # re-read input documents with new reader settings
            opts = meta.build_cli_options(new_reader_opts)
            info.log('INFO', 'panzer',
                     info.pretty_title('pandoc read with metadata options'))
            info.log('INFO', 'panzer', 'pandoc reading with options:')
            info.log('INFO', 'panzer', info.pretty_list(opts, separator=' '))
            info.go_quiet()
            doc.empty()
            global_styledef, local_styledef = load.load_all_styledefs(doc.options)
            ast = load.load(doc.options)
            doc.populate(ast, global_styledef, local_styledef)
            doc.transform()
            info.go_loud(doc.options)
        doc.build_runlist()
        doc.purge_style_fields()
        info.time_stamp('document transformed')
        doc.run_scripts('preflight')
        info.time_stamp('preflight scripts done')
        doc.jsonfilter()
        info.time_stamp('json filters done')
        doc.pandoc()
        info.time_stamp('pandoc done')
        doc.postprocess()
        info.time_stamp('postprocess done')
        doc.run_scripts('postflight')
        info.time_stamp('postflight scripts done')
    except error.SetupError as err:
        # - errors that occur before logging starts
        print(err, file=sys.stderr)
        sys.exit(1)
    except error.StrictModeError:
        info.log('CRITICAL', 'panzer',
                 'cannot continue because error occurred while in "strict" mode')
        sys.exit(1)
    except subprocess.CalledProcessError:
        info.log('CRITICAL', 'panzer',
                 'cannot continue because of fatal error')
        sys.exit(1)
    except (KeyError,
            error.MissingField,
            error.BadASTError,
            error.WrongType,
            error.InternalError) as err:
        # - panzer exceptions not caught elsewhere, should have been
        info.log('CRITICAL', 'panzer', err)
        sys.exit(1)
    finally:
        doc.run_scripts('cleanup', do_not_stop=True)
        # - if temp file created in setup, remove it
        if doc.options['panzer']['stdin_temp_file']:
            os.remove(doc.options['panzer']['stdin_temp_file'])
            info.log('DEBUG', 'panzer',
                     'deleted temp file: %s'
                     % doc.options['panzer']['stdin_temp_file'])
        # - write json message to file if ---debug set
        if doc.options['panzer']['debug']:
            filename = doc.options['panzer']['debug'] + '.json'
            content = info.pretty_json_repr(json.loads(doc.json_message()))
            with open(filename, 'w', encoding='utf8') as output_file:
                output_file.write(content)
                output_file.flush()
        info.log('DEBUG', 'panzer', info.pretty_end_log('panzer quits'))
    # - successful exit
    info.time_stamp('finished')
    sys.exit(0)


if __name__ == '__main__':
    main()


================================================
FILE: panzer/util.py
================================================
"""
Support functions for non-core operations
"""

import errno
import os
import subprocess
import sys
from . import const
from . import error
from . import info


def check_pandoc_exists(options):
    """ check pandoc exists """
    try:
        stdout_bytes = subprocess.check_output([options['panzer']['pandoc'],
                                                "--version"])
        stdout = stdout_bytes.decode(const.ENCODING)
    except PermissionError as err:
        raise error.SetupError('%s cannot be executed as pandoc executable'
                               % options['panzer']['pandoc'])
    except OSError as err:
        # `os.errno` was removed in Python 3.7; use the `errno` module
        if err.errno == errno.ENOENT:
            raise error.SetupError('%s not found as pandoc executable'
                                   % options['panzer']['pandoc'])
        else:
            raise error.SetupError(err)
    stdout_list = stdout.splitlines()
    pandoc_ver = stdout_list[0].split(' ')[1]
    # print('pandoc version: %s' % pandoc_ver, file=sys.stderr)
    if versiontuple(pandoc_ver) < versiontuple(const.REQUIRE_PANDOC_ATLEAST):
        raise error.SetupError('pandoc %s or greater required'
                               '---found pandoc version %s' %
                               (const.REQUIRE_PANDOC_ATLEAST, pandoc_ver))
    # check whether to use the new >=1.18 pandoc API or old (<1.18) one
    NEW_PANDOC_API = "1.18"
    if versiontuple(pandoc_ver) < versiontuple(NEW_PANDOC_API):
        const.USE_OLD_API = True
        # print('using old (<1.18) pandoc API')
    # else:
        # print('using new (>=1.18) pandoc API')


def versiontuple(version_string):
    """ return tuple of version_string """
    # pylint: disable=W0141
    # disable warning for using builtin 'map'
    return tuple(map(int, (version_string.split("."))))


def check_support_directory(options):
    """ check support directory exists """
    if options['panzer']['panzer_support'] != const.DEFAULT_SUPPORT_DIR:
        if not os.path.exists(options['panzer']['panzer_support']):
            info.log('ERROR', 'panzer',
                     'panzer support directory "%s" not found'
                     % options['panzer']['panzer_support'])
            info.log('WARNING', 'panzer',
                     'using default panzer support directory: %s'
                     % const.DEFAULT_SUPPORT_DIR)
            options['panzer']['panzer_support'] = const.DEFAULT_SUPPORT_DIR
    if options['panzer']['panzer_support'] == const.DEFAULT_SUPPORT_DIR:
        if not os.path.exists(const.DEFAULT_SUPPORT_DIR):
            info.log('WARNING', 'panzer',
                     'default panzer support directory "%s" not found' %
                     const.DEFAULT_SUPPORT_DIR)
            info.log('WARNING', 'panzer',
                     'create empty support directory "%s"?'
                     % const.DEFAULT_SUPPORT_DIR)
            input(" Press Enter to continue...")
            create_default_support_dir()
    os.environ['PANZER_SHARED'] = \
        os.path.join(options['panzer']['panzer_support'], 'shared')


def create_default_support_dir():
    """ create an empty panzer support directory """
    # - create .panzer
    os.mkdir(const.DEFAULT_SUPPORT_DIR)
    info.log('INFO', 'panzer', 'created "%s"' % const.DEFAULT_SUPPORT_DIR)
    # - create subdirectories of .panzer
    subdirs = ['preflight',
               'filter',
               'lua-filter',
               'postprocess',
               'postflight',
               'cleanup',
               'template',
               'styles']
    for subdir in subdirs:
        target = os.path.join(const.DEFAULT_SUPPORT_DIR, subdir)
        os.mkdir(target)
        info.log('INFO', 'panzer', 'created "%s"' % target)
    # - create styles.yaml
    style_definitions = os.path.join(const.DEFAULT_SUPPORT_DIR,
                                     'styles',
                                     'styles.yaml')
    open(style_definitions, 'w').close()
    info.log('INFO', 'panzer', 'created empty "styles/styles.yaml"')


def resolve_path(filename, kind, options):
    """ return path to filename of kind field """
    basename = os.path.splitext(filename)[0]
    paths = list()
    paths.append(filename)
    paths.append(os.path.join(kind, filename))
    paths.append(os.path.join(kind, basename, filename))
    paths.append(os.path.join(options['panzer']['panzer_support'],
                              kind,
                              filename))
    paths.append(os.path.join(options['panzer']['panzer_support'],
                              kind,
                              basename,
                              filename))
    # - return the first candidate that exists, else the bare filename
    for path in paths:
        if os.path.exists(path):
            return path
    return filename


================================================
FILE: panzer/version.py
================================================
""" version of panzer """

VERSION = "1.4.1"


================================================
FILE: setup.py
================================================
# encoding: utf-8

from setuptools import setup
from panzer.version import VERSION

with open('README.rst', 'r', encoding='utf8') as file:
    readme_text = file.read()

setup(name='panzer',
      version=VERSION,
      description='pandoc with styles',
      long_description=readme_text,
      url='https://github.com/msprev/panzer',
      author='Mark Sprevak',
      author_email='mark.sprevak@ed.ac.uk',
      license='LICENSE.txt',
      packages=['panzer'],
      install_requires=['pandocfilters'],
      include_package_data=True,
      keywords=['pandoc'],
      classifiers=[
          'Development Status :: 4 - Beta',
          'Environment :: Console',
          'Intended Audience :: End Users/Desktop',
          'Intended Audience :: Developers',
          'License :: OSI Approved :: BSD License',
          'Operating System :: OS Independent',
          'Programming Language :: Python :: 3',
          'Topic :: Text Processing'
      ],
      entry_points={
          'console_scripts': [
              'panzer = panzer.panzer:main'
          ]
      },
      zip_safe=False)