I’ve been reading a range of computing-related research in recent times and there’s one thing in particular that bugs me: research presented without (or with insufficient) source code.
On several occasions I’ve attempted to implement a neat algorithm that I’ve read about and have been confounded: written explanations often confuse more than they enlighten, and pseudo-code (where present) invariably glosses over critical implementation details, disguising the complexity of the implementation and/or run-time.
(And, in my limited experience, seeking assistance from the author via email rarely elicits a helpful response. I’d like to hope that that is not the norm, but I fear that it is…)
The inclusion of source code with published research provides the means for others to more easily reproduce, test and validate the assertions from a paper, particularly exposing flawed assumptions and bugs.
More importantly, it should make it easier for others to build upon the research. The implementation reveals a great deal about research – its scope, shortcomings and opportunities for improvement – in ways that are often not exposed by the text alone. It provides a platform that can be built upon directly.
Standing on the shoulders of giants is easier if we can get a hand up from those already doing so, rather than having to re-engineer the same stepladder that got them there.
Even if done poorly, the availability of source code should still result in an improvement over the current state of affairs – I’d rather have poorly written source code than a poorly written explanation of the same.
Space restrictions for submitted papers certainly do not encourage the inclusion of source code. To be fair, it makes little sense that reams of source code should be published in paper form.
Realistically, it makes little sense to expect that recent or future research will be read printed on paper. Ever. The common 2-column A4 format for publication is terrible for reading on-screen and not particularly pleasant even when printed. Continuing to prepare research for publication in this legacy style ensures that papers remain hard to read, while also preventing easy inclusion of more effective ways of presenting information – not just source code, but larger, more detailed diagrams, interactive systems and more useful, intuitive navigation (to name a few).
The issue of intellectual property of course needs consideration, but if the technique is able to be described publicly it is not unreasonable to expect that an implementation should be no less freely available – even if both are encumbered by patent or other restrictions.
For researchers, my exhortation is to publish your source – it can only make your research more relevant and useful.
Jonathan Adamczewski is a graduate researcher at the University of Tasmania. Follow him on brnz.org/hbr