Research Software Survey Results
There were 29 responses to the Research Software survey run at the end of 2016 – not bad considering it was only sent to about 100 people. Responses were received from the Schools of Biology, Chemistry, Computer Science, Geography & Geosciences, History, Mathematics & Statistics, Medicine, Physics & Astronomy, and Psychology and Neuroscience. These weren’t distributed evenly; only Biology, Computer Science, and Physics & Astronomy had more than two responses each.
13 of those 29 described their “highest level of training/education in programming/software development” as “Self-taught/learned on the job”.
All but 9 of those who responded indicated that they use some sort of version control system, in contrast with previous work which indicated much lower use of such tools. Unsurprisingly, GitHub is the most popular platform. All of the respondents from Computer Science used two or more version control services, while 45% of respondents from other schools don’t use version control at all.
When choosing a version control system, the following considerations were given the most weight (figure in brackets indicates number of respondents considering the factor “important” or “very important”):
- The cost (24)
- The ease of use (25)
- The ability to directly create and manage repositories (rather than going through a service desk) (25)
- The ability to browse the repositories (to which you have appropriate access) on the service (24)
- The ability to collaborate with partners outside the University (25)
followed closely by
- The ability to make the repository publicly accessible (21)
- A web interface (22)
- Integration with other systems e.g. testing platforms, software archives (19)
- Issue tracking (19)
While “ease of use” is clearly subjective, it should be noted that GitHub meets all of these criteria, especially considering the free private repositories it provides to academic users.
However, a significant number considered two other requirements not met by GitHub to be important:
- The ability to manage repository access permissions based on organisational structures (e.g. class, research group, school, institution) (17)
- All the data is stored within the University, or with an approved partner subject to a formal data sharing agreement (14)
These requirements can be met by the Git instance run and maintained by IT Services, but that fails to meet a number of the other “important” requirements above, as it is only available to St Andrews users, can’t be browsed, and repositories can only be created via the IT Service Desk. On the face of it, there may be a service gap here which requires attention.
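For anyone who wants to see the version control figures above as proportions, here is a minimal Python sketch. The counts are copied from the lists above; the short labels are abbreviations made up for this sketch, and using all 29 responses as the denominator assumes that every respondent rated every factor, which the survey results do not confirm.

```python
# Counts are copied from the survey results above. The short labels are
# abbreviations chosen for this sketch, not wording from the survey.
# Assumption: all 29 respondents rated every factor.
TOTAL_RESPONSES = 29

important_counts = {
    "Cost": 24,
    "Ease of use": 25,
    "Self-service repository creation": 25,
    "Browsable repositories": 24,
    "External collaboration": 25,
    "Public repositories": 21,
    "Web interface": 22,
    "Integration with other systems": 19,
    "Issue tracking": 19,
    "Permissions by organisational structure": 17,
    "Data stored within the University": 14,
}


def share(count, total=TOTAL_RESPONSES):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


def ranked(counts):
    """Sort factors by count, highest first (ties keep their input order)."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for label, count in ranked(important_counts):
        print(f"{label:<42} {count:>2}/{TOTAL_RESPONSES}  {share(count)}%")
```

Read this way, the two requirements that GitHub does not meet were still rated important by roughly half of the respondents or more, under the assumption stated above.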
16 of the 29 share code publicly at the end of a project; 7 share code publicly during a project. When asked about reasons for not sharing code, the most popular responses were:
- Unlikely to be of interest to anyone (8)
- Would take too much effort to clean it up (6)
- Sensitivity concerns (e.g. commercially sensitive due to collaboration with industrial partner) (4)
Two indicated in the comment field for this question that academic advantage/being “scooped” was a disincentive to sharing.
Most of the respondents don’t archive their code at the end of a project; half of them attach a licence to their code.
9 of the respondents were aware of funder “requirements around the publishing, sharing and/or archiving of software developed during a project”. 11 didn’t know if their funders had any such requirements.
All but 1 of the respondents expressed an interest in at least one of the Software Carpentry lessons. The most popular ones were as follows:
- Version Control with Git (18)
- The Unix Shell (16)
- Programming with Python (15)
- Programming with MATLAB (14)
- Programming with R (10)
- Intermediate/Advanced R Lessons (Contributed Lesson) (10)
- R for Reproducible Scientific Analysis (9)
- Automation and Make (9)
There have already been discussions with CAPOD about running two Software Carpentry workshops (each over two days) this semester, with a view to running them on a regular basis. Watch out for more information on those.
18 people said that they were not interested in any of the Data Carpentry lessons. The most popular of those lessons was Ecology: Data Analysis and Visualization in R, which had 6 expressions of interest. Two respondents expressed an interest in C++ training in the question about any other training requirements.
Many thanks to all who responded. These figures will be a huge help to us in our efforts to improve the support for research software engineering at the University of St Andrews. Our immediate priorities are the Software Carpentry workshops we’re hoping to run this semester, examining what looks like a potential service gap around version control, and developing guidance on best practice. Please consider contributing to the development of the guidance materials, and be aware that we’ll be looking for volunteer helpers at the Software Carpentry workshops when the dates are confirmed.