By Eroma Abeysinghe
Explanation of Apache Airavata for Hosting a Gateway
Apache Airavata is a middleware framework that enables you to manage, execute, and monitor your application workflows on computing resources such as national supercomputers, campus clusters, and computing clouds. Airavata captures and organizes the metadata associated with these submissions, allowing you to share, clone, and resubmit computational experiments. To give middleware users a friendlier interface, a gateway client is set up to communicate with the Apache Airavata API. The PHP Gateway for Airavata (PGA) is the reference gateway implementation that users can use to set up a working gateway quickly. The PGA itself is open source and can be modified and customized further to meet user requirements.
What are the benefits to science gateway developers? What problems does it address?
Apache Airavata middleware and the PGA provide general-purpose services for managing users, computational experiments, and computational data. Using the existing gateway web client (PGA) is the quickest way to start using Apache Airavata middleware for computational job submissions on HPC resources. Interested gateway developers can get hosted versions of both the PGA and Apache Airavata, run by the Science Gateways Research Center (SGRC) at Indiana University, so that potential gateway providers can see whether Apache Airavata is the right solution for them.
Airavata’s focus is on gateways that provide access to scientific software running on high-performance computing resources. There are a number of usage patterns. One is a “software as a service” gateway in which a gateway provider (gateway owner/admin/PI) wants to make their software widely available to a community of users. The gateway provider takes responsibility for installing the software on target resources that run the code well, keeping the code up to date, and configuring web interfaces that help users run the code properly. This eliminates the need to distribute code directly to end users and support it there. Another common pattern is a campus gateway that provides (typically) well-known scientific software applications on university campus resources. These gateways help increase the usage of campus computing facilities. Finally, you can use Apache Airavata to build domain gateways that serve a specific but broad scientific domain. These gateways provide access to well-known applications of a specific type (bioinformatics or computational chemistry, for example), usually with a wide range of resources.
To support all of these scenarios, Apache Airavata has a rich set of functions for describing scientific applications and how to run them on specific resources. Airavata doesn’t need to be installed directly on a computing resource, and it can be integrated with many different clusters. Thus a gateway provider may want to pull together several different resources (a campus cluster and some XSEDE supercomputers (https://www.xsede.org/ecosystem/resources), for example) to make them available to a user community.
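To make the idea of describing an application once and running it on several resources more concrete, here is a minimal Python sketch. It is not Airavata's data model or API: the class names, resource names, and install paths are all hypothetical and exist only to illustrate how one application can be tied to several resources.

```python
from dataclasses import dataclass, field


@dataclass
class ComputeResource:
    """A remote machine a gateway can submit jobs to (hypothetical model)."""
    name: str
    scheduler: str  # for example "SLURM" or "PBS"


@dataclass
class ApplicationDeployment:
    """Where one application is installed on one resource (hypothetical)."""
    resource: ComputeResource
    executable_path: str


@dataclass
class Application:
    """A scientific code installed by the gateway provider on one or more resources."""
    name: str
    deployments: list = field(default_factory=list)

    def deploy(self, resource, executable_path):
        # Record that this application is installed on the given resource.
        self.deployments.append(ApplicationDeployment(resource, executable_path))

    def available_on(self):
        # Names of every resource where users could run this application.
        return [d.resource.name for d in self.deployments]


# Illustrative values only: one campus cluster and one national machine.
campus = ComputeResource("campus-cluster", "SLURM")
national = ComputeResource("national-supercomputer", "PBS")

app = Application("example-chemistry-code")
app.deploy(campus, "/opt/apps/example/bin/run")
app.deploy(national, "/sw/example/bin/run")
print(app.available_on())
```

The point of the sketch is that the gateway, not the end user, keeps track of where each code is installed, so users pick an application and a resource without needing to know the install details.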
What is used in Apache Airavata and PGA development?
Apache Airavata is open source middleware software developed in Java. It provides a programming-language-independent API that was developed using Apache Thrift. The API is accessible through multiple clients linked with web interfaces, desktop interfaces, and Jupyter notebooks. As described above, the PGA is our reference implementation client and is developed in PHP. We also have a desktop reference client developed in Java and JavaFX. The desktop client can be integrated with other desktop client applications and is primarily used with SEAGrid.org, an Airavata-based gateway. A gateway can mix and match the two clients, allowing users to do intensive work on their desktop client but then check on their work, share it, and so forth, using a PGA-based web interface.
Apache Airavata middleware integrates a number of other services in production. Airavata uses Keycloak (http://www.keycloak.org/) for user identity management, RabbitMQ for internal inter-module record queueing, ZooKeeper for distributed coordination, and MySQL as its production database.
What would make someone choose this solution over another?
Airavata and PGA are integrated to work with multiple job schedulers and HPC resources, ranging from national and campus clusters to cloud resources and private clusters. As a result, you can start executing and managing your HPC jobs in very little time. Apache Airavata specifically targets scientific software as a service on HPC resources, so if you want to make your HPC code available to a larger community without overburdening yourself with support issues, Apache Airavata is a good fit. Apache Airavata doesn’t need to run on the HPC resource itself and can be used to deliver software on many computing resources through a single gateway.
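As a rough illustration of what working with multiple job schedulers involves, the Python sketch below renders the same job request as a batch script header for two common schedulers. It is not how Airavata does this internally: the function name and parameters are hypothetical, and only the SLURM and PBS directives themselves are standard.

```python
def render_job_script(scheduler, job_name, nodes, walltime, command):
    """Build a batch script for the given scheduler (illustrative sketch)."""
    if scheduler == "SLURM":
        header = [
            f"#SBATCH --job-name={job_name}",
            f"#SBATCH --nodes={nodes}",
            f"#SBATCH --time={walltime}",
        ]
    elif scheduler == "PBS":
        header = [
            f"#PBS -N {job_name}",
            f"#PBS -l nodes={nodes}",
            f"#PBS -l walltime={walltime}",
        ]
    else:
        # Only two schedulers are covered in this sketch.
        raise ValueError(f"unsupported scheduler: {scheduler}")
    # Shebang first, then scheduler directives, then the command to run.
    return "\n".join(["#!/bin/bash", *header, command]) + "\n"


# The same request, rendered for two different schedulers.
for scheduler in ("SLURM", "PBS"):
    print(render_job_script(scheduler, "demo", 2, "01:00:00", "./run_simulation"))
```

A gateway user fills in a web form once; middleware of this kind takes care of translating that request into whatever each target cluster's scheduler expects.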
In addition to the software itself, users may be interested in the Airavata support model. While remaining committed to open source, the Science Gateways Research Center (SGRC) runs SciGaP, a hosted Apache Airavata platform that works with different types of gateways: campus gateways, application gateways, and domain gateways. As a result, the SGRC has, over the years, gained knowledge that spans multiple science disciplines. The SGRC is made up of experienced experts with diverse backgrounds who have been actively involved in building gateways and contributing to the gateway community for 15 years! SGRC staff are part of SGCI’s EDS service area, and you can request gateway development support directly from SGCI by submitting a Consulting Services Request Form.
Steps of implementation:
Implementation is straightforward and quick! You can have a gateway up and running with very little help from the Airavata team/SGRC. Upon request, SGCI will provide you with resources to host the PGA and a step-by-step guide for deploying it. All you have to do is follow the instructions in the Deploying a Gateway with Airavata Using SGCI Hosting guide.