reptee (Contributor) commented Sep 10, 2025

Hello, this PR adds a writer for vimdoc, the documentation format vim uses for its help pages.

Vimdoc is very loose and has no formal spec. The best references I could find are vim’s/neovim’s help pages, :h help-writing, neovim’s treesitter grammar for vimdoc (including highlights.scm), and panvimdoc.

When I decided to write the vimdoc writer I found panvimdoc, which implements vimdoc generation as a custom writer, but having a robust built-in solution is better.

To generate idiomatic vimdoc, the writer relies heavily on definition lists. To see why, look at this fragment of options.txt:

==============================================================================
1. Setting options                  *set-option* *E764*

                            *:se* *:set*
:se[t][!]       Show all options that differ from their default value.
            When [!] is present every option is on a separate
            line.

:se[t][!] all       Show all options.
            When [!] is present every option is on a separate
            line.

                                *E518* *E519*
:se[t] {option}?    Show value of {option}.
            NOTE: some legacy options were removed. |nvim-removed|

:se[t] {option}     Toggle option: set, switch it on.
            Number option: show value.
            String option: show value.

:se[t] no{option}   Toggle option: Reset, switch it off.

Or this fragment from treesitter.txt:

    `contains?`                                *treesitter-predicate-contains?*
        Match a string against parts of the text corresponding to a node: >query
            ((identifier) @foo (#contains? @foo "foo"))
            ((identifier) @foo-bar (#contains? @foo-bar "foo" "bar"))
<
    `any-contains?`                        *treesitter-predicate-any-contains?*
        Like `contains?`, but for quantified patterns only one captured node
        must match.

    `any-of?`                                    *treesitter-predicate-any-of?*
        Match any of the given strings against the text corresponding to
        a node: >query
            ((identifier) @foo (#any-of? @foo "foo" "bar"))
<
        This is the recommended way to check if the node matches one of many
        keywords, as it has been optimized for this.

Definition lists are everywhere and they always have tags.

Tags are important because vim’s documentation system is mostly based on the idea that you can quickly find information via :help tag. The problem with tags is that each one must be unique, so each tag is prefixed with the value of the vimdoc-prefix metadata variable if it is present, and left intact otherwise.

---
vimdoc-prefix: myproj
---

# header 1

[term]{#term}
: definition

`code`{#somecode}
: explanation

converts to

========================================================================
header 1                                                 *myproj-header-1*

term                                                         *myproj-term*
    definition
`code`                                                   *myproj-somecode*
    explanation

Notice how tags exceed writerColumns by two. This is because stars surrounding tags are concealed when viewing help pages in vim/neovim.
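
The prefixing and alignment described above can be sketched roughly as follows. This is an illustration in Python, not the Haskell code from this PR: `make_tag`, `tagged_line` and the `columns` parameter are hypothetical names, and the default of 72 is assumed to match pandoc's default column width.

```python
def make_tag(identifier, prefix=None):
    """Return the tag name, prefixed with "<prefix>-" when a prefix is given."""
    return f"{prefix}-{identifier}" if prefix else identifier


def tagged_line(text, tag, columns=72):
    """Right-align *tag* after text.

    The raw line is two characters wider than `columns`, because the two
    stars around the tag are concealed when the help page is displayed.
    """
    marked = f"*{tag}*"
    raw_width = columns + 2
    # Keep at least one space between the text and the tag.
    padding = max(1, raw_width - len(text) - len(marked))
    return text + " " * padding + marked
```

For example, `tagged_line("header 1", make_tag("header-1", "myproj"))` gives a 74-character line ending in `*myproj-header-1*`, which displays as 72 characters once the stars are hidden.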

Vim also expects help files to start with a line of the form $filename.txt<Tab>$ShortDescription, so the writer expects another variable, filename, which is paired with abstract to produce that line. See [test/vimdoc/definition-lists.vimdoc].

I added golden tests for the vimdoc writer, including new vimdoc-specific ones:

  • converting online vim help links to local references (see [test/vimdoc/vim-online-doc.markdown] and its vimdoc counterpart)
  • definition lists as shown above (see [test/vimdoc/definition-lists.markdown] and its vimdoc counterpart)
  • TOC generation (see [test/vimdoc/headers.markdown] and its two vimdoc counterparts)

I am obviously open to changes, because this is my first attempt at writing a pandoc writer. Previously I have only written a couple of lua filters.

I haven’t written any documentation (despite creating the writer to help others write documentation 😄) because I would like to get approval on the code first.

Disclosure: I used an LLM to prototype TOC generation. This is the only place where I used AI, and later it was heavily refactored by hand.

Current limitations

  • HorizontalRule produces dashed rules which makes vim think that empty line is a header when opening gO
  • Code blocks in vimdoc usually start on the last line of the previous paragraph, but the vimdoc writer emits them as a separate paragraph.
  • I don’t know how to render BlockQuotes, Quoted, Cite and figures. For now blockquotes are simply blocks prefixed with "| ", and quotes and Cites are rendered as is, with formatting stripped.
  • Table support is not perfect: tables with a multiline header can’t be rendered idiomatically, because the first row of a vimdoc table should end with ~ so that the line is highlighted.
  • There’s no math in vim, so it renders as ${{math}}$
  • Referencing the same footnote multiple times creates multiple footnotes. I see markdown does the same, so I guess this is not unexpected.
  • Nothing is done with unicode. I saw some functionality related to emoji handling (both in pandoc and panvimdoc), but since (neo)vim is unicode-aware I assume unicode symbols may be rendered unchanged

Support for vimdoc, documentation format used by vim in its help pages

Relies heavily on definition lists and precise text alignment to
generate tags

jgm (Owner) commented Sep 11, 2025

Very cool!

> HorizontalRule produces dashed rules which makes vim think that empty line is a header when opening gO

I'd suggest producing something else, e.g. * * * *, that doesn't have this consequence.

The build is failing because of warnings (we use -Wall -Werror).

I downloaded writer.vimdoc and opened it with vim. One thing was strange. I am using a terminal that is 80 columns wide, but the lines exceeded this by 1. For example,

image

This seems to relate to your comment

> Notice how tags exceed writerColumns by two. This is because stars surrounding tags are concealed when viewing help pages in vim/neovim.

and I wonder whether they should only exceed by one?

reptee commented Sep 11, 2025

> I downloaded writer.vimdoc and opened it with vim. One thing was strange. I am using a terminal that is 80 columns wide, but the lines exceeded this by 1. For example,

I reckon this is correct behavior: traditionally vim help pages have a 78-character limit, so that a line with a right-aligned tag and its two concealed stars still fits in 80 characters. You can :set conceallevel=0 to see why it wraps.

UPD: I took a second look, this time at the help pages bundled with vim. Headers that contain a tag start the tag at column 57, so 22 characters are reserved for the unconcealed tag. Neovim's help pages are less consistent (see e.g. api.txt). I would still right-align tags, because it looks nice and renders correctly when writerColumns=78 and the terminal width is 80.
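
To make the 78/80 arithmetic concrete, here is a small Python sketch that compares the raw width of a line with its width once the stars around tags are concealed. The `TAG` pattern is a simplified assumption, not vim's actual syntax rule, and `concealed_width` is a hypothetical helper.

```python
import re

# Simplified tag pattern (an assumption): a star, one or more characters
# that are neither stars nor whitespace, and a closing star.
TAG = re.compile(r"\*([^*\s]+)\*")


def concealed_width(line):
    """Width of the line as displayed when the stars around tags are hidden."""
    return len(TAG.sub(r"\1", line))


# A line produced with writerColumns=78: the raw text is 80 characters wide.
line = "header 1" + " " * 55 + "*myproj-header-1*"
```

Here `len(line)` is 80 while `concealed_width(line)` is 78, so the displayed line stays within the traditional 78-column limit.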

I also found out that gO is neovim-specific keymap (oops).

reptee commented Sep 11, 2025

Regarding the CI failure: it says that the forM import is redundant, but I don't get any warnings when building (and HLS does not report any either), and if I remove the import I get a compilation error. I am now running cabal clean; cabal build --ghc-options '$flagsFromCabalFile' to see if I get any warnings that way.

reptee commented Sep 11, 2025

Clean build with ghc 9.8.4 produced no warnings. Installing ghc 9.0.2 right now to see if it changes anything

reptee commented Sep 11, 2025

It turns out that with older GHCs Control.Monad.{Reader,Writer} re-export forM, so the explicit import is redundant there. But explicitly hiding forM produces a warning on newer GHCs, because there Control.Monad.{Reader,Writer} no longer re-export it...

Comment on lines +414 to +418
This is strong, and so is this.

An emphasized link /url.

This is strong and em.
Owner

So there's no emphasis in vimdoc? Would it make sense to just use the markdown conventions?

Contributor Author

*italicized* and **strengthened** would be treated as references and concealed, and I am personally not a fan of the underscore style, _italicized_ and __strengthened__.

Subjectively, ~~deleted~~ is not so bad, but in vim ~~text~~ is displayed as ~~text~ with the helpHeader highlight group if it happens to be the last word on the line. Neovim does not suffer from this thanks to treesitter highlighting.

image image

jgm commented Sep 15, 2025

Have you put the extra test files in pandoc.cabal's extra-source-files stanza?

A good way to check that you have everything needed there is to do cabal sdist, unpack the archive and build from there.

reptee commented Sep 15, 2025

> Have you put the extra test files in pandoc.cabal's extra-source-files stanza?
>
> A good way to check that you have everything needed there is to do cabal sdist, unpack the archive and build from there.

Nope, I missed that. Will fix now.

@jgm jgm merged commit a0cfb3f into jgm:main Sep 15, 2025
11 of 14 checks passed
jgm commented Sep 15, 2025

Great, thanks!
