Presented by Dr. Timothy Menzies, Professor of Computer Science, NC State University, and IEEE Fellow
Much of the data described in this presentation is drawn from computational science research software, and Tim's research team engaged deeply with the PIs funded by NSF's Software Infrastructure for Sustained Innovation (SI2) program, launched in 2010 to fund software research at multiple scales. This program has included annual PI meetings, where Tim presented and interacted with this developer community. When Tim refers to "you" in this presentation, it is this community of computational science developers/PIs, including but not limited to those in the SI2 program.
Funded by an NSF EAGER grant, we have been applying empirical software engineering (SE) methods to software systems built for computational science. What we expected to see, and what we saw, were two very different things.
Initially, we spent time talking to computational scientists from CERN and around America. Most expressed concerns that their software was somehow not as good as it might be. Yet those concerns proved unfounded, at least for the code we could access.
Using feedback from the computational science community, we found and analyzed 40+ packages (some of which were very widely used in computational science). Many of those systems had been maintained by large teams, for many years. For example:
LAMMPS has 16,000+ commits from 80 developers (since 2012).
Trilinos (which is a more recent package) has been built by 80,000 commits from over 200 developers.
Elasticsearch is over 8 years old and has been built by 40,000+ commits from over 1100 developers.
deal.II has been maintained and extended since 1990 via 40,000+ commits from 100 active developers.
Note that some of these projects (e.g., deal.II) are much larger and show greater longevity than many open source projects. When we talked to the developers of these 40+ packages (particularly the post-docs), we found a group that was very well versed in current coding practices (GitHub, Travis CI, etc.).
LESSON 1: Many of those systems were written in modern languages (e.g., Python) or used modern programming tools (e.g., version control).
The reasons for this were economic and sociological: these developers are smart people who know that after their NSF funding is over, they might get well-paid jobs in the software industry. Hence, it was in their interests to know current practices. Accordingly:
LESSON 2: Increasingly, computational software is being written using state-of-the-art software tools and practices.
When we applied standard SE defect predictors to that code, to our surprise, they mostly failed, since:
LESSON 3: Computational science code has a different (and lower) bug rate than other kinds of software.
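To make Lesson 3 concrete, here is a minimal sketch of one plausible way a lower bug rate can trip up a standard defect predictor. This is an assumption for illustration, not the analysis described in the talk: the commit labels are invented, and the "always clean" predictor and the score function are hypothetical names. The point is that when very few commits are buggy, a predictor that never flags anything looks highly accurate while finding no bugs at all.

```python
def score(actual, predicted):
    """Return (accuracy, recall) for binary labels where 1 means 'buggy'."""
    pairs = list(zip(actual, predicted))
    correct = sum(1 for a, p in pairs if a == p)
    true_pos = sum(1 for a, p in pairs if a == 1 and p == 1)
    actual_pos = sum(actual)
    accuracy = correct / len(pairs)
    # Recall is the fraction of real bugs that the predictor flagged.
    recall = true_pos / actual_pos if actual_pos else 0.0
    return accuracy, recall


# Hypothetical commit labels (invented numbers, not data from the study).
higher_bug_rate = [1] * 30 + [0] * 70  # 30 of 100 commits are buggy
lower_bug_rate = [1] * 3 + [0] * 97    # 3 of 100 commits are buggy


def always_clean(labels):
    """A trivial predictor that never flags a commit as buggy."""
    return [0] * len(labels)


for name, labels in [("higher", higher_bug_rate), ("lower", lower_bug_rate)]:
    accuracy, recall = score(labels, always_clean(labels))
    print(f"{name} bug rate: accuracy={accuracy:.2f} recall={recall:.2f}")
```

In this toy setting, the trivial predictor scores better on accuracy as bugs get rarer, yet its recall stays at zero, so accuracy alone says little about whether a predictor is useful on low-defect code.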
Standard empirical SE methods, when applied to computational science code, failed to build useful defect predictors. In fact, to handle the computational science codes, we had to develop new methods that could handle such exemplary software. In the end, such predictors could be built, but only after significantly extending standard empirical SE methods. Hence:
LESSON 4: Computational science is an excellent testbed for the rest of the SE community to stress test their tools.
Note that the above suffers from a sampling bias (we could only examine the open source packages). But one thing is clear: the state of the practice in computational science software is much healthier and more insightful than what is commonly believed.