Webinar: Reproducible big data science—A case study in continuous FAIRness using Globus and Globus Genomics
October 9, 2019
Reproducible big data science: A case study in continuous FAIRness using Globus and Globus Genomics
Presented by Ravi Madduri, Scientist, Data and Learning, Argonne National Laboratory, and Senior Scientist, University of Chicago Consortium for Advanced Science and Engineering
Big biomedical data create exciting opportunities for discovery but make it difficult to capture analyses and outputs in forms that are findable, accessible, interoperable, and reusable (FAIR). In response, we describe tools that make it easy to capture, and assign identifiers to, data and code throughout the data lifecycle. We illustrate the use of these tools via a case study involving a multi-step analysis that creates an atlas of putative transcription factor binding sites from terabytes of ENCODE DNase I hypersensitive sites sequencing data. We show how the tools automate routine but complex tasks, capture analysis algorithms in understandable and reusable forms, and harness fast networks and powerful cloud computers to process data rapidly, all without sacrificing usability or reproducibility—thus ensuring that big data are not hard-to-(re)use data.
In this talk, we will describe the enhancements made to Globus Genomics to support working with datasets referred to by minids, analyzing BagIt-based research objects called BDBags, and executing software encapsulated in Docker containers with unique identifiers. We will also describe the tools and services developed to create end-to-end reproducible analysis pipelines that adhere to FAIR principles.
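As a rough illustration of the fixity idea behind BagIt-based research objects, the sketch below builds and checks a BagIt-style checksum manifest for an in-memory payload. It is not the BDBag tooling or the minid service; the function names `build_manifest` and `verify_manifest` and the sample file names are hypothetical, chosen only for this example.

```python
import hashlib
from typing import Dict, List


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(content).hexdigest()


def build_manifest(payload: Dict[str, bytes]) -> List[str]:
    """Build BagIt-style manifest lines: '<checksum>  data/<file name>'.

    `payload` maps hypothetical file names to their contents; a real bag
    would read the files from the bag's data/ directory instead.
    """
    return [
        f"{sha256_hex(content)}  data/{name}"
        for name, content in sorted(payload.items())
    ]


def verify_manifest(manifest: List[str], payload: Dict[str, bytes]) -> bool:
    """Check that the payload still matches the recorded checksums."""
    return manifest == build_manifest(payload)


# Hypothetical payload standing in for analysis inputs.
payload = {"peaks.bed": b"chr1\t100\t200\n", "README.txt": b"example\n"}
manifest = build_manifest(payload)

# Any change to the payload is detected when the manifest is re-checked.
tampered = dict(payload, **{"peaks.bed": b"chr1\t100\t201\n"})
print(verify_manifest(manifest, payload))   # unchanged payload
print(verify_manifest(manifest, tampered))  # modified payload
```

Recording a checksum for every file is what lets a later user confirm that the data they retrieved is exactly the data that was analyzed.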
- Reproducible big data science: A case study in continuous FAIRness
- Globus Genomics has more general applicability to gateways serving other domains and types of big (or not-so-big) data in the service of reproducibility. At 37:56 in the video, Ravi Madduri answers this question: Can a science gateway that offers computational resources but not long-term storage of users' data use or add your framework to offer some of the needed information to the user for reproducibility? Do we need to add the code to our gateway or can we call your REST API?
Webinar: Portable, Scalable Computation with Containers and Abaco Functions
September 11, 2019
Portable, Scalable Computation with Containers and Abaco Functions
Presented by Joe Stubbs, Research Associate and Lead of Cloud and Interactive Computing Group, Texas Advanced Computing Center, University of Texas, Austin
Linux container technologies such as Docker have made it easier than ever to package and move applications from one computing environment to another. In the last five or so years, a new cloud computing model called “Functions-as-a-Service” (FaaS) has started to gain traction for the way it can reduce the maintenance associated with deploying and scaling application components. In this webinar, we will discuss the primary concepts of FaaS and introduce the Abaco API platform, hosted at the Texas Advanced Computing Center. Abaco (Actor Based Containers) is an NSF-funded project to provide researchers with a Functions-as-a-Service platform based on Docker containers and the Actor Model of Concurrent Computation. After an introduction, we’ll walk through an example of packaging, deploying, and scaling a TensorFlow image classifier on the Abaco cloud. Some prior familiarity with Docker and HTTP/REST APIs would be helpful, though not strictly required.
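For readers new to the actor model, the following minimal sketch shows its core idea: an actor owns a mailbox and handles queued messages one at a time. It does not use the Abaco API; the `Actor` class and the `classify` function are hypothetical, and a plain Python function stands in for the containerized function that a FaaS platform would run.

```python
from collections import deque
from typing import Callable, Deque, List


class Actor:
    """A toy actor: messages are queued, then handled one at a time."""

    def __init__(self, handler: Callable[[str], str]) -> None:
        self.handler = handler
        self.mailbox: Deque[str] = deque()
        self.results: List[str] = []

    def send(self, message: str) -> None:
        """Queue a message; the sender does not wait for a result."""
        self.mailbox.append(message)

    def process_all(self) -> None:
        """Handle queued messages in arrival order until the mailbox is empty."""
        while self.mailbox:
            message = self.mailbox.popleft()
            self.results.append(self.handler(message))


def classify(message: str) -> str:
    """Placeholder for an image classifier; returns a made-up label string."""
    return f"label-for:{message}"


actor = Actor(classify)
actor.send("cat.jpg")
actor.send("dog.jpg")
actor.process_all()
print(actor.results)
```

Because senders only enqueue messages, the work can be scaled by running more handlers against the same mailbox without changing the calling code.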
Webinar: Community Building 101—Building and Growing your Gateway
August 14, 2019
Community Building 101—Building and Growing your Gateway
Let's talk about building a community around your gateway! What is an online community and how can you build one that not only grows but lasts? In this webinar, I'll walk through 10 questions you can ask yourself from the beginning of a new gateway project to ensure you’re building a welcoming, inclusive, and best of all, sustainable gateway community.
- Inclusion Resources from Listen Community Consulting
Webinar: Containers and Kubernetes—A Crash Course
July 10, 2019
Containers and Kubernetes—A Crash Course
The container paradigm has shaken up developers and sysadmins alike. There is a steep learning curve in developing and hosting new applications, and the benefits may not be immediately observed. Containers offer gateways a reliable way to enable reproducible, self-contained applications in a lightweight, efficient, and fast environment. In addition, Kubernetes orchestrates containers to create a tightly integrated system for automating the deployment of containerized applications, such as those packaged with Docker. Over the next hour, Bob and Jeff will give a brief introduction to both containers and Kubernetes in the hope that the terms won't be scary and new users will be able to ask the right questions.
- Introduction to Containers tutorial
- Introduction to Kubernetes tutorial
- Katacoda tutorials on different container-related subjects
Webinar: Web Analytics Guide for Gateway Websites—Getting Started with Definitions, Goals, and Metrics
June 12, 2019
Web Analytics Guide for Gateway Websites: Getting Started with Definitions, Goals, and Metrics
Presented by Shari Thurow, Founder and SEO Director, Omni Marketing Interactive
Learn how to use web analytics software to measure important user activities on your gateway sites. Track site traffic, search and browsing patterns, visitor info, page performance, top-performing pages, exit pages, and other key performance indicators (KPIs).
- Persona Template (PPT download)
- Calls to Action (CTAs) Worksheet (XLS download)
- A comparison of the information seeking patterns of researchers in the physical and social sciences (article)
- Searching and sourcing online academic literature: Comparisons of doctoral students and junior faculty in education (article)
- Student digital information-seeking behaviour in context (article)