
Science Gateways and Scientific Research, Part 1

By Marlon Pierce

You can read Science Gateways and Scientific Research, Part 2, here.

Author’s Note: This blog post is based on “Towards a Science Gateway Reference Architecture”, which was presented at the International Workshop for Science Gateways. Citation: Pierce, M. E., Miller, M. A., Brookes, E. H., Wong, M., Afgan, E., Liu, Y., Gesing, S., Dahan, M., Marru, S. & Walker, T. (2018). Towards a Science Gateway Reference Architecture. Paper given at the 10th International Workshop on Science Gateways (IWSG 2018), 13-15 June 2018, Edinburgh, Scotland. DOI: I thank my co-authors for their input and critical reading of the original submission.

I have become dissatisfied with our working definitions of the term “science gateway”. This term, which goes back to the TeraGrid project, is generally understood by its practitioners and has been quite successfully adopted in some quarters of the cyberinfrastructure research community; hence, we have the Science Gateways Community Institute and the International Workshop on Science Gateways series. However, we still need to explain ourselves to colleagues who call what they do “science portals” or “virtual research environments” or other terms, and we also need to explain ourselves to those outside of our field. These include the scientists and educators who may benefit from science gateways, the broader computer science research community, operators of infrastructure that supports scientific research, and program officers who set strategic research priorities for government agencies.

A working definition for science gateways that I’ve personally written into numerous papers and proposals goes something like this: they are “user-centric environments that enable broader and deeper use of advanced computing resources, storage, data collections, and scientific applications. Gateways include graphical user interfaces (frequently Web browser-based), application programming interfaces (APIs), and middleware that provide access to software and data.”  

I thought it was time to step back from this definition, which I find too operational, in order to examine the core reasons why gateways exist, why many of them have become so widely used and successful, and what opportunities there are for common, community-wide development and for taking the field forward. Some months back, I extended an invitation to several colleagues in the community to help with this rethinking, resulting in a paper that we submitted to the International Workshop on Science Gateways 2018.

While I think the paper is certainly worth a read as is, I would like to use this blog post as an opportunity to summarize some of the main ideas.

It’s About Science

The key for me in rethinking the definition of our own term for ourselves was that first word, “science”.  Science gateways need to enable scientific research. We also need to think about gateways as appropriate subjects themselves for scientific cyberinfrastructure research. And we need to balance the two. “Cyberinfrastructure research” always runs the risk of too much navel gazing and tail swallowing, becoming disconnected from the communities it nominally intends to serve. But likewise, if we don’t consider the computer science research opportunities, we’ll build dead systems that can’t scale, can’t evolve, and can’t attract new intellectual input.

Let’s consider first the use of science gateways to support scientific research. Broadly, scientific research consists of exploration of the current state of a research field, formulation of testable hypotheses or research questions, design and execution of experiments, management and sharing of data and metadata about experiments, analysis of experiments and development of conclusions, and communication of the conclusions and supporting methods through a broadening circle of colleagues, culminating in broadly available formal publications that are accessible to the community and reproducible, hopefully leading to the start of a new cycle of research.

The real scientific process is of course not linear. There are dead ends, ambiguous or opposing results, and accidental discoveries that happen during a research effort that alter its course, leaving some promising early results unexplored. One may think of research as exploring and mapping a new terrain, with the unfortunate limitation that only the routes to specific destinations reach formal publication.  

I think science gateways can help on all these fronts, serving not only to supplement the “happy path” of publication but also to provide a way to record and make available a scientist’s entire research effort.

Publication, in its broadest sense, is the keystone activity of scientific research and so is a good place to start thinking about what a science gateway could do and be. In the paper, we wrote: “A science gateway is a software implementation that creates a specific set of capabilities, based in large part on access to externally managed resources, to support the creation, sharing, publication, and broad distribution of scientific research results. ‘Publication’ may mean the traditional processes used by peer-reviewed journals and the custom of pre-publication used in many scientific fields to circulate results quickly, stake a claim to a particular finding, and solicit initial feedback. It may also mean the broader dissemination of findings through non-traditional means that are associated with research altmetrics.”

To me, this leads to an important conclusion about the role of science gateways. They can supplement the current publication process by providing mechanisms for “findable, accessible, interoperable, and reusable” (FAIR) results. FAIR is normally thought of in terms of data management, but I think it is useful also to consider it in terms of publications of results.

What do we do with this conclusion? In the second part of this blog post, which will be published soon, I’ll work through some of the implications of connecting science gateways directly to the (broadly considered) scientific publication process and identify some core capabilities that emerge. This, I hope, will additionally point to the science of science gateways that we can pursue as a community.