Code Review, Round 2
Despite research becoming increasingly reliant on software, very little of that code is developed collaboratively or shared before a paper is published (if at all), and even less of it is checked during peer review. In August-September 2013, the Mozilla Science Lab, PLOS, and Prof. Marian Petre (Open University, UK) therefore conducted a pilot study of code review of scientific software. Professional software developers from Mozilla reviewed parts of the software associated with papers published during the preceding year in "PLoS Computational Biology", and those reviews were then shared with the authors of the papers.
This study's principal findings were that:
- both developers and scientists were enthusiastic about the possibility of collaborating in this way, but
- a "drive-by" review at the end of the research process by someone who didn't understand the science being investigated wasn't actually much help.
We now believe that scientists would benefit more from ongoing reviews as they develop their code, and that the reviewer must be familiar with the science itself in order to do a "deep" review. We know from a variety of studies that code review not only improves the quality of software, but also increases programmers' productivity (because "measure twice, cut once" saves time). While most researchers are not programmers, and most code in a lab is not written for use by other people, we want to find out whether code review will have the same benefits in science.
More importantly, though, we want to see whether integrating code review into the research cycle will spur scientists to work in more open, more collaborative ways in general. Once researchers are used to reading one another's code, will they be more likely to re-use it as well? Will adoption of code review encourage them to use more open tools in writing their papers, and help them see how to make data more reusable as well?
To start answering these questions, we plan to conduct a second pilot study beginning next month. In it, we will match an experienced scientist-programmer (the mentor) with a small pre-existing team of scientists who are developing and sharing software, but who do not think of themselves primarily as software developers, and who are not yet doing code reviews. The mentor will do a small number of code reviews early in the study to show the team what reviews look like, when they should be done, how results should be communicated, etc. After that, mentors will be responsible for coaching the scientists over a 12-week period as they do reviews themselves.
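To make the mentoring concrete, here is a hypothetical sketch of the kind of lightweight review comment we have in mind. It is not taken from either pilot; the script, function name, and missing-value convention are invented purely for illustration.

```python
# Hypothetical analysis snippet under review (illustrative only).
# "REVIEW:" comments show how a mentor's feedback might read in a pull request.

def mean_expression(values, missing=-999):
    """Average the expression readings, skipping values flagged as missing."""
    total = 0.0
    count = 0
    for v in values:
        if v != missing:      # REVIEW: a magic sentinel is easy to mishandle;
            total += v        #         NaN plus numpy.nanmean may be safer.
            count += 1
    return total / count      # REVIEW: this raises ZeroDivisionError when all
                              #         readings are missing; guard or document it.
```

The specific fixes matter less than the habit: short, frequent exchanges like this are what the mentors will be modelling and then handing over to the teams.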
We will observe mentors and teams throughout the study period to determine:
- what skills are actually transferred,
- how those skills are adapted to meet scientists' domain-specific needs,
- what transference techniques are most effective, and
- what other shifts in working practice result (e.g., more collaborative or iterative work).
All findings will be published under open access licenses to encourage further discussion and uptake.
We will start this study in February 2014. We are currently looking for a few more mentors and, more importantly, for small groups of scientists who would like to give code review a try. If you are interested, please mail us a short paragraph describing your research, and a link to your version control repository if you have one.
FAQ
- How good a domain match between mentor and research group can we hope for?
  - We should be able to match physicists with physicists and ecologists with ecologists, but anything finer-grained than that is unlikely.
- Does the software have to be open source and publicly available?
  - Yes. This is partly a matter of setting a good example, but we also want to avoid the overhead of managing non-disclosure agreements.
- Do mentors and groups have to be physically co-located?
  - No. In fact, we expect that most mentors won't be in the same place as the groups they're working with, but will instead interact via Skype, email, and the like.