A tool to automate technical content generation for configuration files

Many months ago, a technical writer colleague of mine complained that they were struggling to keep up with the frequent releases the company was doing at the time. They had multiple products on their plate, each with multiple configuration files (sometimes more than 10). Although the configuration files overlapped within each product, because of the componentized platform the company had built the products upon, each product could in theory ship different release versions of the components that used these configuration files. All of this had to be documented as readable (and, most importantly, usable) technical content.

There were a number of problems with the manual process they were following.

  1. Most configuration files lacked descriptive comments that could help a potential user or a technical writer.
  2. Even when there were descriptive comments, the comments themselves lacked proper context. It was easy for a technical writer to misunderstand what the developer who wrote the comment was trying to express.
  3. Technical writers often lacked the deep technical knowledge behind what certain configuration elements represented.
  4. Figuring out the minute details of a configuration element could mean days, or sometimes weeks, of written back-and-forth with the developer team.
  5. The technical writer team simply could not scale efficiently with the frequent release cycles.

They wanted to automate this and create a process that needed minimal human interaction to generate somewhat meaningful content.

Being a developer (and a good friend), I thought I could help.

The problem

When it comes to technical content, the gap to fill is the one between the deep and accurate but inarticulate technical knowledge that developers have, and the broader, domain-related but shallow technical knowledge and excellent writing skills that technical writers have. Unfortunately, just putting technical writers and developers in the same room is not going to do that. The gap remains a gap most of the time because of communication errors, differences in knowledge level, or (unfortunately) wildly running egos.

When a configuration file is added to a product, it is the developer writing the implementation who knows the actual purpose, parameters, and usage of the configuration options. Therefore, any documentation for the configuration files has to be written at development time.

This is not a new concept. API documentation, from the code level to the API method definition level, is handled this way. Code comments, if structured, can be parsed into readable content. DSLs like Swagger provide mechanisms to document API methods at the time they are written. These details are then automatically translated into documentation artifacts, like HTML or Markdown files. This should be the case for configuration files too.

The answer

This is the main rationale for starting the project, which at the time I code-named “docblock”. Its purpose was to introduce a way for developers to write structured comments in configuration files at development time, and to parse those comments to generate documentation automatically. It would be a small binary tool that takes a bunch of configuration files as input and outputs Markdown or HTML.

When development of the tool started, as my pet project, the use case I was interested in (and that my technical writer colleague was eager to tackle) was translating XML-based configuration files. The output would be Markdown, which in turn could be used to generate HTML content as well.

The structure of the documentation comment was the next story to address. I could either

  1. Provide a predefined set of “fields” in a documentation comment for the developer to fill in
  2. Let each use case (e.g.: company, project, component) decide the structure of the comment

The second approach is a bit more complex, but it would address most use cases. It gave rise to a concept called a parser language.

The parser language in docblock is the format in which documentation comments are written. Docblock uses this user-defined format to translate the comments and generate content from them.

The standard parser language definition shipped with docblock is the following.

A documentation comment adhering to the above format looks like the following.

  1. @doc marks the comment as a documentation comment and tells docblock to start looking for defined fields. This string is called the keystone in the parser format.
  2. Next comes the element description. This should describe the purpose and the usage of the particular config element.
  3. @type is a user-defined field in the above parser format. In this case, it stands for the type of values that the particular configuration element expects.
  4. @possible_values is also a user-defined field. It stands for a list of possible values that can be used for the configuration element. This is useful for configuration options that expect one of a fixed set of values instead of an open spectrum of values (e.g. none, all, single).

As is evident by now, docblock draws direct inspiration from JavaDoc-style code comments.

When docblock parses the above documentation comment, the HTML content generated looks like the following.

Figure: Tabular rendering. Notice that the column names match the values in the parser format.
Figure: Panned-out description.
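The tabular rendering can be approximated with a few lines of Go. The sketch below is only an illustration of the idea that column names come from the field names in the parser format. The `Row` type, the `renderTable` function, and the leading `element` column are assumptions, not docblock's real output code.

```go
package main

import (
	"fmt"
	"strings"
)

// Row is a hypothetical record for one documented configuration element.
type Row struct {
	Element string
	Values  map[string]string // keyed by field name from the parser format
}

// escapeCell keeps a value from breaking the Markdown table layout.
func escapeCell(s string) string {
	s = strings.ReplaceAll(s, "|", "\\|")
	return strings.ReplaceAll(s, "\n", " ")
}

// renderTable builds a Markdown table with one column per field name.
func renderTable(fields []string, rows []Row) string {
	var b strings.Builder
	b.WriteString("| element |")
	for _, f := range fields {
		b.WriteString(" " + f + " |")
	}
	b.WriteString("\n|---|")
	for range fields {
		b.WriteString("---|")
	}
	b.WriteString("\n")
	for _, r := range rows {
		b.WriteString("| " + escapeCell(r.Element) + " |")
		for _, f := range fields {
			b.WriteString(" " + escapeCell(r.Values[f]) + " |")
		}
		b.WriteString("\n")
	}
	return b.String()
}

func main() {
	rows := []Row{{
		Element: "cache",
		Values: map[string]string{
			"type":            "string",
			"possible_values": "none, all, single",
		},
	}}
	fmt.Print(renderTable([]string{"type", "possible_values"}, rows))
}
```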

The intended stakeholders of the tool are the following.

  1. Technical Writer Admin: This role decides and defines the parser format that developers should adhere to when documenting the configuration files.
  2. Developer: This role documents the configuration files in the agreed format and uses the tool to verify that the content is correct.
  3. Team/Project Base Technical Writer: This role uses the tool to generate the documentation content, and handles corrective tasks such as adding content that is not automatically generated, correcting formatting, and removing sections that are not generally needed.

Meta

The code is open source and is licensed under the Apache License v2. The code can be found at the GitLab repository. The first release is v0.1.

Docblock development halted after some time, simply because I ran out of free time to devote to coding. By the time I got back to the source code, the story that the tool was initially developed for had moved on: the organization was moving towards non-XML configuration files. In any case, I had to finish the first iteration of development and reach a milestone release. v0.1 is that first iteration of docblock (in other words, this is an answer to a question asked more than a year ago).

The tool is developed in Go. Why Go? Because it is the only language I can code in that generates standalone binaries. I also wanted to grow from the student I am right now into an experienced Go user.

The code is written in a way that input configuration file types and output content types can be introduced in the future easily. However, for now, only XML is supported as an input type, and only Markdown (and Markdown generated HTML) is generated as output.
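One common way to get this kind of extensibility in Go is to put input parsing and output rendering behind small interfaces and look them up by name. The sketch below shows that pattern only. It is not taken from the docblock source, and every name in it (`Entry`, `InputParser`, `OutputRenderer`, `generate`, and the toy `lines` input type) is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// Entry is a hypothetical documented configuration element.
type Entry struct {
	Element     string
	Description string
}

// InputParser turns raw configuration file content into entries.
type InputParser interface {
	Parse(content string) ([]Entry, error)
}

// OutputRenderer turns entries into a document.
type OutputRenderer interface {
	Render(entries []Entry) string
}

// lineParser is a toy stand-in for a real input type such as XML.
// It reads "element=description" lines.
type lineParser struct{}

func (lineParser) Parse(content string) ([]Entry, error) {
	var entries []Entry
	for _, line := range strings.Split(content, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			return nil, fmt.Errorf("malformed line %q", line)
		}
		entries = append(entries, Entry{
			Element:     strings.TrimSpace(parts[0]),
			Description: strings.TrimSpace(parts[1]),
		})
	}
	return entries, nil
}

// markdownRenderer writes one Markdown section per entry.
type markdownRenderer struct{}

func (markdownRenderer) Render(entries []Entry) string {
	var b strings.Builder
	for _, e := range entries {
		fmt.Fprintf(&b, "## %s\n\n%s\n\n", e.Element, e.Description)
	}
	return b.String()
}

// Registries: supporting a new type means adding one map entry.
var parsers = map[string]InputParser{"lines": lineParser{}}
var renderers = map[string]OutputRenderer{"markdown": markdownRenderer{}}

// generate picks a parser and a renderer by name and runs them in sequence.
func generate(inType, outType, content string) (string, error) {
	p, ok := parsers[inType]
	if !ok {
		return "", fmt.Errorf("unsupported input type %q", inType)
	}
	r, ok := renderers[outType]
	if !ok {
		return "", fmt.Errorf("unsupported output type %q", outType)
	}
	entries, err := p.Parse(content)
	if err != nil {
		return "", err
	}
	return r.Render(entries), nil
}

func main() {
	out, err := generate("lines", "markdown", "cache=Enables request caching")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

With this shape, the core `generate` function never changes when a new input or output type is added.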

I’m planning to add YAML and TOML as input types next (which means docblock cannot yet auto-generate its own config documentation). However, these will not be as simple to translate as XML was, since YAML parsers generally do not include comments in the object model.

Although docblock works for the intended (though narrowly scoped) story of XML config file automation, other use cases may not find it directly applicable, at least not yet. However, any input or testimonials of good or bad experiences are greatly appreciated.

This is my first serious repository hosted on GitLab. Working with the interface was a joy, and I really like the CI/CD Pipelines feature that free users get. I only had to define a .gitlab-ci.yml file, and GitLab took care of the rest. With a proper Makefile, using GitLab CI/CD was a breeze.

The issue board is another cool feature that GitLab excels in. There are minor details that make the experience a lot better for a technical user. Wiki and tagging have a few UX gaps that could be filled, but it is still a great service for a free tier.

If any of you are wondering, the name docblock was supposed to be only a codename. However, I’m extremely bad at naming things, and Go’s philosophy of managing dependencies does not permit easy name changes. And no, it does not carry any puns. It’s just an abbreviation of “documentation block”.


Written on February 15, 2019 by chamila de alwis.

Originally published on Medium