Monday, 16 May 2016

API Documentation

I have been developing applications and frameworks for a long time.  Most of them have exposed APIs for third-party developers.  I have also, of course, consumed a great many APIs, libraries and frameworks from a variety of sources.  We are always told that we must document our code, usually in the form of comments and maybe in the form of technical and design documentation.  In the real world though, what form should this take?

Developer Documentation

I work mostly in the .NET world, so the most obvious starting point is to write XML comments in the code itself.  This does two things:

  • Places the documentation close to the code itself.  This, in theory, means that the documentation is more likely to be maintained as the code evolves.
  • As the developer writes the code they should, hopefully, consider the interfaces they are writing from a consumer's perspective.  I reason that if I can't describe in words what the method I am writing does, how on earth can I expect someone else to guess at how it should be used?
(This can lead to very "green" code that nods in the direction of Knuth's "Literate Programming").

There is a counter-argument that placing documentation alongside the code is dangerous because it can rapidly become out-of-date and misleading.  This sounds defeatist to me, essentially: "if we can't do it perfectly it is better to not do it at all".  I think that in many cases even dated documentation is better than none at all.

Over the years my attitude to XML commenting in code has become increasingly hard-line.  These days I document absolutely everything, public or private.  The thought required is useful, and the comments are available through IntelliSense anywhere in a solution, so even if consumers never see them they are there for me and other solution developers.

XML comments are not enough on their own though; they can only provide the very detailed low-level view of the code.  Much higher-level documentation is required to explain concepts, provide overviews and give developers the structure they need to use an API effectively.  For this we need more traditional tools to present prose and diagrams in a structured, formatted style.  This could be in the easily edited form of wikis, Markdown documents or straightforward word processor documents.  Or it could be in more structured forms such as MAML or similar structured documentation languages.

Having written all this documentation we need to make it accessible to the developers too.  The simplest approach would be to stick it on a website, or provide it alongside the binaries in a MS Help 1.0 CHM.  A slightly more complex approach would be to integrate it into the developers' tools using MS Help Viewer.


The documentation described above involves moving around large volumes of information.  Although it may be possible to do this by hand, in practice one needs to use tools.  In the early days the de-facto tool for .NET documentation was NDoc.  This took the binaries and XML documentation files and generated MS Help files or HTML websites.  It all worked fairly cleanly and there were enough options to be able to customise the output.  Sadly, a new version of NDoc was never released for .NET 2.0 - the maintainer received too much abuse and not enough support.
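
To make the input to such tools concrete: the C# compiler can emit an XML documentation file built from the XML comments, and generators pair that file with the binaries.  The following is a minimal sketch in Python (standard library only) that reads such a file and lists each member's summary, marking members that have none.  The sample XML, including the Acme.Widgets names, and the function name read_summaries are invented for illustration; this is not how NDoc or any other tool is actually implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample of a compiler-generated XML documentation file.
# The assembly, type and member names are invented for this example.
SAMPLE_XML = """<doc>
  <assembly><name>Acme.Widgets</name></assembly>
  <members>
    <member name="T:Acme.Widgets.Widget">
      <summary>Represents a single widget.</summary>
    </member>
    <member name="M:Acme.Widgets.Widget.Resize(System.Int32)">
      <summary>
        Resizes the widget.
      </summary>
      <param name="size">The new size in pixels.</param>
    </member>
    <member name="P:Acme.Widgets.Widget.Size">
    </member>
  </members>
</doc>
"""


def read_summaries(xml_text):
    """Return {member id: summary text}, with None for a missing summary."""
    root = ET.fromstring(xml_text)
    summaries = {}
    for member in root.iter("member"):
        summary = member.find("summary")
        # itertext() also picks up text inside nested tags such as <see/>.
        text = "".join(summary.itertext()) if summary is not None else ""
        # Collapse the indentation and line breaks left by the comment layout.
        summaries[member.get("name")] = " ".join(text.split()) or None
    return summaries


if __name__ == "__main__":
    for member_id, summary in read_summaries(SAMPLE_XML).items():
        print(member_id, "->", summary or "(undocumented)")
```

The prefix on each member id (T: for types, M: for methods, P: for properties) is what lets a generator match a comment to the corresponding item found in the binaries.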

With the demise of NDoc the developer community was left in something of a documentation tool vacuum.  There are several commercial offerings (such as Innovasys DocumentX), but in the open source world at least these are fairly unattractive.  Tools such as Doxygen can be persuaded to generate documentation for .NET, but this doesn't follow the .NET idiom as well as many would desire.

To fill the gap Microsoft started an open-source project called Sandcastle in 2006.  Initially it was purely a toolset with a fairly high barrier to entry.  The Sandcastle Help File Builder was developed independently to simplify usage.  Over time the two projects have merged and moved to GitHub where SHFB has now become the new de-facto standard for .NET documentation.

Recently Microsoft China have released a new tool, DocFx, that provides a slightly different approach to project documentation.  It is still developing very rapidly.  Rather than relying upon the very structured MAML format for conceptual documentation it uses the much simpler Markdown format, which has the added advantage of being well understood and quite well provided for with tools.  There is no need to install any tooling; instead a single NuGet package includes everything necessary to build documentation as part of a static HTML site.

The Ideal Documentation System

Despite tools such as Sandcastle and DocFx I still don't feel as if my needs are well met.  The basic concepts behind XML comments are excellent and I think most day-to-day requirements at that level are well met (apart from the lack of support for namespace documentation).  My gripes are at the level of conceptual documentation and the generation of a finished product.  Here are my requirements for what I think would be the ideal documentation solution:
  • All documentation should be as close as possible to the code that implements the functionality.  XML comments make an excellent starting point, but I also want conceptual documentation to appear in close proximity.  MAML documents in separate projects are not near enough.  The MD files supported by DocFx are better, but still require configuration to make them work.  I want to be able to dump a file anywhere in a project and see it appear in the appropriate namespace in the finished documentation.
  • There need to be minimal barriers to editing the documentation.  This means the use of very simple formatting markup (like the Markdown used by DocFx) or tools that make the formatting easier (WYSIWYG or similar).
  • The project defaults should satisfy most needs for most users most of the time.  There should be very few manual steps necessary to get a documentation project generating the right output.  Where the requirements are a bit more complex good configuration tools and wizards need to take the edge off.
  • Documentation should be generated as part of the normal build.  Sandcastle projects still feel a little flakey and can be very slow.  DocFx projects are similar - they frequently crash in some of my solutions.
  • The documentation build should be quick - any friction in the documentation is likely to result in this relatively low-priority (to many developers) task being ignored.
  • It should be easy to preview documentation, and the preview should be realistic and quick.
Based on these criteria I would say that none of the existing documentation tools (either free or commercial) is ideal.  So, the next step is to consider whether I should "scratch my own itch" and develop a tool that does what I need.
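
As an illustration of the first requirement (drop a file anywhere and have it appear under the right namespace), here is a minimal sketch in Python.  It assumes the common convention that project folders mirror namespaces, which is not guaranteed in every solution; the function names, the root namespace "Acme" and the file paths are all hypothetical, and no existing tool is claimed to work this way.

```python
from pathlib import PurePosixPath


def namespace_for(doc_path, root_namespace):
    """Derive a namespace from the folder a conceptual document sits in.

    Assumption: folders mirror namespaces, so Widgets/Layout/overview.md
    in a project rooted at Acme belongs to Acme.Widgets.Layout.
    """
    folders = PurePosixPath(doc_path).parent.parts
    return ".".join((root_namespace,) + folders)


def group_by_namespace(doc_paths, root_namespace):
    """Group project-relative document paths by their derived namespace."""
    groups = {}
    for path in sorted(doc_paths):
        key = namespace_for(path, root_namespace)
        groups.setdefault(key, []).append(path)
    return groups


if __name__ == "__main__":
    # Hypothetical Markdown files scattered through a project tree.
    docs = ["Widgets/Layout/overview.md", "Widgets/getting-started.md", "intro.md"]
    for namespace, paths in group_by_namespace(docs, "Acme").items():
        print(namespace, paths)
```

A convention like this needs no per-file configuration, which is the point of the requirement; a real tool would still need an override for projects whose folders do not match their namespaces.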