Science Gateways and Scientific Research, Part 2

By Marlon Pierce

You can read Science Gateways and Scientific Research, Part 1, here

Author’s Note: This blog post is based on “Towards a Science Gateway Reference Architecture”, which was presented at the International Workshop on Science Gateways (IWSG 2018). Citation: Pierce, M. E., Miller, M. A., Brookes, E. H., Wong, M., Afgan, E., Liu, Y., Gesing, S., Dahan, M., Marru, S. & Walker, T. (2018). Towards a Science Gateway Reference Architecture. Paper given at the 10th International Workshop on Science Gateways (IWSG 2018), 13-15 June 2018. Edinburgh, Scotland. DOI: I thank my co-authors for their input and critical reading of the original submission.

In my previous blog post, I argued for viewing science gateways as cyberinfrastructure that directly supports scientific research and for tying them to a broadly considered publication process. In this post, I’ll work through some of the implications of this view for the capabilities that a science gateway should offer. These are described more thoroughly in the IWSG 2018 paper.

The point of view that we put forward in our IWSG 2018 paper is that users of a science gateway, when they use it for research, generate results that need to be preserved and that support the Findable, Accessible, Interoperable, and Reusable (FAIR) principles. This implies that science gateways have core, common capabilities. We summarize these in Table 1.

Table 1: Core science gateway capabilities

| Core Gateway Capability | Description |
| --- | --- |
| Recognize Users | Gateways provide authentication, authorization, and identity management. Recognition of the user is a prerequisite for the capabilities that follow, all of which support the research process. |
| Integrate Services | Gateways act as agents that integrate scientific and other services for their users. These may be developed in house or they may be integrated from third-party service providers. The core services of a specific gateway define what it does. |
| Organize User Interactions into Sessions | Gateways help scientific users search and explore data sets and conduct computational experiments. The latter include both input and output data and metadata that others may explore. It is thus useful to organize these interactions into “sessions.” |
| Persistently Store User Interactions | Science gateways allow researchers to recover previous sessions after the initial interaction. Gateways support this feature to provide reproducibility or repeatability, help users organize their results, and avoid unnecessary repetition. Persistent sessions allow users to check their work and assist gateway operators in diagnosing user-reported issues. |
| Enable Sharing of Interactions | Sessions are a core implementation concept for many science gateways. A specific scientific publication may be supported by many different sessions, perhaps from many different researchers. Sessions and their constituent elements should, therefore, be sharable. |

The concept of persistent sessions is the key element of this table. The idea is that the gateway records how the user interacts with it as the user works toward a particular result. A straightforward example is the execution of a particular workflow with specific inputs. The gateway remembers all the details. The user can view these details, organized as sessions and perhaps coupled with search tools. The sessions, if not all the data, are stored persistently by the gateway so that a user can always retrieve at least the steps they took to reach a particular result.
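To make the idea concrete, here is a minimal sketch of a persistent session record in Python. It is only an illustration under stated assumptions: the `Session` fields, the `SessionStore` class, and the one-JSON-file-per-session layout are all hypothetical and do not come from the IWSG 2018 paper or from any particular gateway framework.

```python
import json
import tempfile
from dataclasses import dataclass, field, asdict
from pathlib import Path


@dataclass
class Session:
    """One recorded interaction, e.g. a single workflow run (hypothetical schema)."""
    session_id: str
    user: str
    workflow: str
    inputs: dict
    outputs: dict = field(default_factory=dict)
    tags: list = field(default_factory=list)


class SessionStore:
    """Keeps each session as a JSON file so it can be retrieved later."""

    def __init__(self, root):
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, session):
        path = self.root / f"{session.session_id}.json"
        path.write_text(json.dumps(asdict(session), indent=2))

    def load(self, session_id):
        path = self.root / f"{session_id}.json"
        return Session(**json.loads(path.read_text()))

    def find_by_tag(self, tag):
        # A very simple "search tool": scan every stored session for the tag.
        sessions = (self.load(p.stem) for p in sorted(self.root.glob("*.json")))
        return [s for s in sessions if tag in s.tags]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = SessionStore(tmp)
        store.save(Session("run-001", "alice", "fit-model",
                           inputs={"temperature": 300}, tags=["figure-2"]))
        print(store.find_by_tag("figure-2"))
```

Even if the output data itself is discarded, a record like this keeps the workflow name and inputs, which is the minimum a user needs to retrace the steps behind a result.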

These persistent sessions can be organized in various ways, such as all the results used to create a particular graph or table in a paper.

Persistent sessions become even more powerful if the gateway allows the user to share them with other users or even make them available to all the users of a gateway.

The final step, and the great opportunity for gateways acting collectively, is to share session metadata across multiple gateways in a format that is independent of any one gateway and potentially independent of all gateways. The metadata then becomes an independent digital object, discoverable by standard Internet search technologies.
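One way to picture a gateway-independent session object is sketched below in Python. This is an assumption-laden illustration, not a proposed standard: the envelope fields (`identifier`, `source_gateway`, `session`) and the function name `export_session_metadata` are hypothetical. The sketch derives the identifier from the content of the record rather than from anything the gateway assigns, so the object can be referenced without knowing where it came from.

```python
import hashlib
import json


def export_session_metadata(session, source_gateway):
    """Wrap a session record in a gateway-neutral envelope (hypothetical format)."""
    # Canonical serialization: sorted keys and fixed separators, so the same
    # record always yields the same bytes and therefore the same identifier.
    canonical = json.dumps(session, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "identifier": f"sha256:{digest}",
        # Provenance only; the identifier does not depend on this value.
        "source_gateway": source_gateway,
        "session": session,
    }


if __name__ == "__main__":
    record = {"workflow": "fit-model", "inputs": {"temperature": 300}}
    exported = export_session_metadata(record, "gateway-a.example.org")
    print(json.dumps(exported, indent=2))
```

Because the identifier depends only on the session content, two gateways exporting the same session would produce the same identifier, which is one simple way for an object to outlive the gateway that created it.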

This sounds FAIR to me.