Git-Pandoc Academic Workflow Software setup

ZettelDistraction · May 2023

Philosopher Richard Y. Chappell has written a straightforward and readable Git+Pandoc software setup and workflow guide suitable for academic writing. [Academics don't want you to know this secret trick: this free software setup is workable for writers of all kinds.] Professor Chappell's guide walks you through the software installation, which includes Git for version control, Pandoc for document processing, and the VSCode editor with plugins for Pandoc citations and document export. After the one-time software installation, the writer begins each document with a Markdown template. The Markdown template includes some LaTeX macros for page formatting, so a LaTeX distribution is required; Prof. Chappell recommends MiKTeX. Modifying the Pandoc LaTeX template is unnecessary (though I have done so to avoid repetition); instead, add LaTeX packages and macros to the Markdown document template as needed.
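
A minimal sketch of what such a template might look like, assuming a file named draft.md. The title, author, and the setspace package are placeholders, not taken from Prof. Chappell's guide. The YAML header sets page formatting through Pandoc's standard LaTeX variables, and header-includes carries extra LaTeX packages and macros:

```sh
# Write a hypothetical starting document. Pandoc reads the YAML block
# between the "---" lines; header-includes is passed into the LaTeX preamble.
cat > draft.md <<'EOF'
---
title: "Working Title"
author: "A. Writer"
date: "May 2023"
fontsize: 12pt
geometry: margin=1in
header-includes:
  - \usepackage{setspace}
  - \onehalfspacing
---

# Introduction

Text goes here.
EOF

# Convert only if Pandoc is installed. PDF output also needs a LaTeX
# distribution such as MiKTeX.
if command -v pandoc >/dev/null 2>&1; then
    pandoc draft.md -o draft.pdf || echo "conversion failed" >&2
fi
```

Copying draft.md for each new document keeps the formatting in one place without touching Pandoc's own LaTeX template.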

The setup is interesting and will get you part of the way to a Zettelkasten with VSCode. The installed software and workflow are compatible with Zettlr and Obsidian, though this requires additional configuration.

Will · May 2023

@ZettelDistraction, thanks for this. I've installed and configured Richard Y. Chappell's workflow, which is simple and works beautifully.

I have been using a more elaborate workflow because I can tolerate some fussiness.

It draws on:

Jeffrey Moro - [How I Write](https://jeffreymoro.com/blog/2020-09-21-how-i-write/)
Scott Selisker's "Plain Text Workflow for Writing with Atom"
Dennis Tenen and Grant Wythoff's "Sustainable Authorship in Plain Text using Pandoc and Markdown"

The big difference is that I have been using a Makefile to run the Pandoc conversions rather than VSCode's Pandoc plugin.
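
For readers who have not used make, the core of such a Makefile is a rule that rebuilds an output only when its source has changed. The sketch below imitates that rule in plain shell. It is not the Makefile described above, and the layout of one PDF per Markdown file in the current directory is an assumption:

```sh
# Succeeds (exit status 0) when the output file is missing or older than
# the source. The -nt test is supported by bash, dash, and most sh variants.
needs_rebuild() {
    [ ! -e "$2" ] || [ "$1" -nt "$2" ]
}

# Rebuild every out-of-date PDF, roughly like a "%.pdf: %.md" pattern rule.
build_pdfs() {
    for src in *.md; do
        [ -e "$src" ] || continue        # no Markdown files: skip the literal "*.md"
        out="${src%.md}.pdf"
        if needs_rebuild "$src" "$out"; then
            echo "building $out"
            pandoc "$src" -o "$out" || echo "pandoc failed for $src" >&2
        fi
    done
}

# Run only when Pandoc is installed, so the sketch is safe to paste.
if command -v pandoc >/dev/null 2>&1; then
    build_pdfs
fi
```

A real Makefile expresses the same dependency check declaratively and works with any editor, which is the main attraction over an editor plugin.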

I am confused about the "documentclass: article" entry in the YAML. I had to use the Wayback Machine to find a detailed list of the available options. When I selected the class I thought would best suit my documents (memoir), conversions failed at the LaTeX step. I may have to get ChatGPT to help me with that.
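
For reference, documentclass is one of the variables that Pandoc's default LaTeX template reads, and classoption passes options to that class. A minimal test file, assuming the name memoir-test.md and two memoir options picked only for illustration, makes it easier to see where a conversion fails. Whether memoir compiles cleanly depends on the installed Pandoc and TeX versions, so treat this as a diagnostic sketch rather than a known-good setup:

```sh
# A one-chapter document that only exercises the class setting.
cat > memoir-test.md <<'EOF'
---
title: "Class Test"
documentclass: memoir
classoption:
  - oneside
  - 11pt
---

# A Chapter

One paragraph is enough to see whether the class loads.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # Keep the intermediate .tex so a LaTeX error can be traced to a line.
    pandoc memoir-test.md -s -o memoir-test.tex || echo "Pandoc step failed" >&2
    pandoc memoir-test.md -o memoir-test.pdf \
        || echo "LaTeX step failed; inspect memoir-test.tex" >&2
fi
```

If the .tex file is produced but the PDF is not, the problem lies in the LaTeX run (often a missing package) rather than in the YAML.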

Andy · May 2023

This kind of workflow has been in widespread use for the past decade, I would say. Early adopters were writing in Pandoc Markdown earlier, but around 2013 it became clear to me that Pandoc Markdown was going mainstream when I saw articles like John W. Maxwell's "Building publishing workflows with Pandoc and Git" (2013) and Dennis Tenen & Grant Wythoff's "Sustainable authorship in plain text using Pandoc and Markdown" (2014). I suspect Mac users were especially quick to jump on the bandwagon because Mac users have had BibDesk, an excellent BibTeX-based reference manager, since the early 2000s (I started using it in 2008). I have been writing in Pandoc Markdown in Scrivener for so many years that I can't remember when I started. Of course, new evangelists for Pandoc Markdown are needed for new generations of academics and software.

Will · May 2023

Crap, I fell down another rabbit hole. No fault of yours. ChatGPT 3.5 tutored me on some of the finer points of LaTeX's documentclass options and referred me to Wikibooks, where I found a link to Peter Wilson's 615-page documentation on just the memoir documentclass. I have a little reading to do.

While we were sitting at the kitchen table this morning, Mary asked me about my day and what I was working on. Even though I wanted to share my work, I couldn't bring myself to bore her with the details of LaTeX formatting. We live in two separate worlds.

@Andy, thanks for the article references. You describe evidence of the durability of the Markdown, Pandoc, and LaTeX writing toolchain.

Andy · May 2023

@Will said:

@Andy, thanks for the article references. You describe evidence of the durability of the Markdown, Pandoc, and LaTeX writing toolchain.

It's worth noting that this durability is based on the foundations laid by Unix, as John W. Maxwell (again) described well in "Text processing techniques and traditions" (2018):

Unix defines a computing paradigm that we still live within. In Unix, everything is a file, or is at least treated as a file; the core file format is simply a sequence of text characters; the output of one process (as text) can become the input (as text) of another; an interactive "shell" environment (the 'command line') provides a standardized textual input and output mechanism; system documentation (manuals) are part of the system itself. The promise made by the original programmers to AT&T back in the early 1970s resulted in a toolkit called the Documentor's Workbench, a set of simple text-processing tools that are still with us, 40 years later. The internal documentation system for Unix—the command "man" (for manual)—includes a whole range of functionality, including editing software, spell checkers, grammar checkers, formatters, and indeed typesetting software. These features make Unix a machine built of text. It is phenomenally good for text processing, because text processing is at its very heart.

ZettelDistraction · May 2023

Thank you for these references--I will borrow them for a note on workflow. Here is another Pandoc script reference by Michał Wyrwa.

For the social sciences, Kieran Healy's article on workflow software, "Choosing Your Workflow Applications," discusses Emacs, R, and version control. There is a shorter published version. I'm not a social scientist, and I don't use Emacs (I would if my pinky fingers could adjust to the key combinations), but I occasionally use R. The advice here is valuable, and it's nice to see an expert write about their process.

I started working with Unix on a Sun Workstation over 40 years ago, in 1981--the philosophy that everything is a file in Unix was the first thing anyone mentioned about it.

dandennison84 · May 2023

I've been using this workflow for a long time now (minus VSCode, plus Zettlr). A useful Pandoc command is
pandoc -D latex > ~/my-default.latex
This writes Pandoc's default LaTeX template to a file. You will find it is nearly identical to the defaults shipped with your text editor (Zettlr, for example). The template uses YAML header variables to control LaTeX formatting and inserts your content at the $body$ element. You will see $variable-name$ all over the place. When you add those variables to your YAML header, Pandoc substitutes their values into the template and produces the output.
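
To make the mechanism concrete, here is a deliberately tiny template, far smaller than the real default, with the hypothetical file names mini.latex and mini.md. It shows the pieces described above: variables filled from the YAML header, a conditional around an optional one, and $body$ for the converted text.

```sh
# A toy template. $title$ and $author$ come from the YAML header;
# $body$ is where Pandoc places the converted document text.
cat > mini.latex <<'EOF'
\documentclass{article}
$if(title)$
\title{$title$}
\author{$for(author)$$author$$sep$ \and $endfor$}
$endif$
\begin{document}
$if(title)$
\maketitle
$endif$
$body$
\end{document}
EOF

# A matching document whose header supplies the two variables.
cat > mini.md <<'EOF'
---
title: "Template Test"
author: "A. Writer"
---

Body text.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # --template replaces the default template with the toy one.
    pandoc mini.md --template=mini.latex -o mini.tex \
        || echo "conversion failed" >&2
fi
```

The toy template loads no packages, so it only suits plain paragraphs. For real documents it is safer to start from the output of pandoc -D latex and edit that.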

The "dark side" of plain text publishing shows up when you produce multiple outputs (PDF, HTML, docx, for example) and want to customize them. You will eventually find that the limitations of Markdown rear their heads, because it doesn't offer enough structural elements. As soon as you want to customize things, you will at the very least be writing your own .tex template to feed to Pandoc along with your content. At most, you will be writing custom HTML and LaTeX, among other things. The mechanisms most often used to customize your plain text are YAML headers (your YAML block might eventually be longer than your text :P) and HTML or LaTeX code embedded directly in your text. If you use an editor like Zettlr, it has direct support for that.

If you aren't careful, your plain text ends up looking like Frankenstein markup, riddled with embedded HTML and LaTeX code.
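
One way to keep the embedded markup contained, sketched here with a hypothetical file mixed.md, is Pandoc's raw attribute syntax. A span tagged {=latex} or {=html} is passed through only to that output format and left out of the others, so one source can carry format-specific code for several outputs:

```sh
# One Markdown source with a LaTeX-only and an HTML-only fragment.
cat > mixed.md <<'EOF'
# Results

Ordinary Markdown appears in every output format.

`\newpage`{=latex}

`<hr class="section-break">`{=html}

The next section starts here.
EOF

if command -v pandoc >/dev/null 2>&1; then
    # The HTML output keeps the <hr>; the LaTeX output keeps \newpage.
    pandoc mixed.md -o mixed.html || echo "HTML conversion failed" >&2
    pandoc mixed.md -o mixed.tex  || echo "LaTeX conversion failed" >&2
fi
```

This does not remove the clutter, but it makes each fragment's target explicit, which helps when the same file feeds several outputs.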

ZettelDistraction · May 2023

@dandennison84, since you mentioned Zettlr, which I use, here is how Zettlr calls Pandoc. Zettlr's export functions call Pandoc with an export.FORMAT.yaml defaults file, where FORMAT is the export format, along with a bibliography file if one is specified. Here is an example where FORMAT=latex.

pandoc --defaults=%AppData%\Roaming\pandoc\export.latex.yaml --bibliography media\MyLibrary.json Hom20221026.md -o Hom20221026.tex
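
For readers who have not seen a Pandoc defaults file, it is a YAML file whose keys mirror the command-line options. The sketch below is illustrative only: it is not the export.latex.yaml that Zettlr ships, and the file names and variable values are placeholders.

```sh
# A hypothetical defaults file; each key corresponds to a Pandoc option.
cat > export.latex.example.yaml <<'EOF'
from: markdown
to: latex
standalone: true
citeproc: true
variables:
  documentclass: article
  fontsize: 12pt
EOF

# A throwaway note to convert.
printf '# A note\n\nSome text.\n' > note.md

if command -v pandoc >/dev/null 2>&1; then
    pandoc --defaults=export.latex.example.yaml note.md -o note.tex \
        || echo "conversion failed" >&2
fi
```

Options given on the command line, such as --bibliography in the call above, are combined with whatever the defaults file sets.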

The following links point to the Pandoc configuration files in my Zettel GitHub repository; descriptions of the files are in the repository wiki. Guilty as charged: I modified the Pandoc defaults files and LaTeX template. My modified defaults files are export.latex.yaml and export.pdf.yaml; the modified default LaTeX template is template.tex.

@dandennison84 said:
If you aren't careful, your plain text ends up looking like Frankenstein markup, riddled with embedded HTML and LaTeX code.

The result is more like the fishbowl head of Morbius from Doctor Who than Frankenstein, but that's a matter of opinion. In the episode where the brain of Morbius is attached to the body with the giant crustacean claw, the brain falls from the operating table during the transplant procedure and lands with a thud on a hard floor.

[Image: Doctor Who Experience, Cardiff Bay, Port Teigr, 6 November 2016. Photo by MangakaMaiden Photography, licensed under Creative Commons Attribution 2.0 Generic.]

The Brain of Morbius. (2023, May 13). In Wikipedia. https://en.wikipedia.org/wiki/The_Brain_of_Morbius

dandennison84 · May 2023

Thanks, I'll take a look! I have my own custom ones as well. I don't use the Zettlr flow much, other than the lovely export that gives me quick previews. I tend to use command-line tools for the rest (since I sometimes swap editors for various reasons). I might take a look at how you are doing it, though. I really like the ergonomics of Zettlr (and its lack of bloat) and how easy it makes the things I need to do.

Andy · May 2023

@dandennison84 said:

The "dark side" of plain text publishing shows up when you produce multiple outputs (PDF, HTML, docx, for example) and want to customize them. You will eventually find that the limitations of Markdown rear their heads, because it doesn't offer enough structural elements. As soon as you want to customize things, you will at the very least be writing your own .tex template to feed to Pandoc along with your content. At most, you will be writing custom HTML and LaTeX, among other things. The mechanisms most often used to customize your plain text are YAML headers (your YAML block might eventually be longer than your text :P) and HTML or LaTeX code embedded directly in your text. If you use an editor like Zettlr, it has direct support for that.

If you aren't careful, your plain text ends up looking like Frankenstein markup, riddled with embedded HTML and LaTeX code.

I have found that custom styles in Pandoc Markdown and (if necessary) a Pandoc filter (see also GitHub topic: pandoc-filter) add a lot of flexibility and allow me to postpone the formatting of the custom structural elements until the writing is done. In Scrivener (which isn't entirely plain-text writing, only plain-text output) I can configure the compile settings to apply Pandoc Markdown custom styles to parts of the text during compilation without seeing the custom style markup in the text while I'm writing, which reduces the "Frankenstein markup" effect even more.
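
As an illustration of the custom-style approach (a sketch with the made-up style names Epigraph and KeyTerm and a hypothetical file styled.md, not the Scrivener setup described above), Pandoc Markdown can tag a block or a span with a custom-style attribute, which the docx writer maps to a Word style of the same name:

```sh
# A fenced div sets a paragraph style; a bracketed span sets a character style.
cat > styled.md <<'EOF'
::: {custom-style="Epigraph"}
A block that should take the Word paragraph style named Epigraph.
:::

Normal text with a [marked phrase]{custom-style="KeyTerm"} inside it.
EOF

if command -v pandoc >/dev/null 2>&1; then
    pandoc styled.md -o styled.docx || echo "conversion failed" >&2
fi
```

Passing --reference-doc with a .docx file that defines those styles controls how they look, so the formatting decisions can wait until the writing is done.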

Zettelkasten Forum
