Webinar: BOINC - Volunteer Computing for Science Gateways
October 11, 2017
BOINC: Volunteer Computing for Science Gateways
Presented by David Anderson, Research Scientist, Space Sciences Laboratory, at the University of California, Berkeley and Steven Clark, Purdue and nanoHUB
BOINC (Berkeley Open Infrastructure for Network Computing) is a distributed computing infrastructure based on a centralized server that coordinates volunteer computer resources. The volunteered resources can come from a variety of types of systems including home computers, institutional servers, and smartphones. BOINC has been used as the underlying foundation for a number of distributed computing projects.
Now, a collaboration between UC Berkeley and Purdue is adding volunteer computing to the nanoHUB nanoscience gateway. Owners of personal computers (Windows, Mac, Linux) will be able to support nanoHUB by transparently running compute-intensive nanoHUB applications in the background on these computers. The goal is to greatly increase the computing throughput available to nanoHUB (perhaps tens of thousands of CPUs) at a lower cost than that of commercial clouds and dedicated hardware. This will support new paradigms, such as uncertainty quantification and anticipated computing, that can add significant scientific utility to nanoHUB.
We will describe the technical aspects of this work — moving jobs between batch systems, and using virtualization to run Linux jobs on consumer devices — as well as our plans for recruiting volunteers.
Our work will become part of the HUBzero software, allowing any Hub to add its own volunteer computing capability. More generally, the technology we're developing will simplify the task of adding volunteer computing to any science gateway.
BOINC Slides (David Anderson's part)
nanoHUB Slides (Steven Clark's part)
Webinar: Gateway Showcase featuring I-TASSER and Chem Compute
September 13, 2017
Gateway Showcase featuring I-TASSER and Chem Compute
Presented by Chengxin Zhang, Department of Computational Medicine and Bioinformatics, University of Michigan
Presented by Mark Perri, Assistant Professor of Chemistry, Sonoma State University
I-TASSER Science Gateway for Protein Structure and Function Prediction
Chengxin Zhang, S. M. Mortuza, and Yang Zhang
Department of Computational Medicine and Bioinformatics, University of Michigan
Abstract: I-TASSER (Iterative Threading ASSembly Refinement) is a composite method for atomic-level protein structure prediction and structure-based protein function annotation. The online server system has been widely used by the biomedical community, with registered users coming from more than 130 countries. In Fall 2016, I-TASSER began using XSEDE’s Comet supercomputer; as a result, I-TASSER singlehandedly tripled Comet’s users by adding more than 8,000 unique users in the first several months of integrating with Comet. In this presentation, we will discuss the following: (1) how I-TASSER generates protein structure and function predictions; (2) how I-TASSER is integrated with the XSEDE Gateway system and how the XSEDE resources have enhanced the functionality of I-TASSER and its usefulness to the community; (3) how to use the I-TASSER Gateway and how to interpret the I-TASSER output results, for normal users; and (4) how to improve the quality of the I-TASSER model, for advanced users (e.g., introducing restraints and strategies for modeling multi-domain proteins). The I-TASSER gateway is available at http://zhanglab.ccmb.med.umich.edu/I-TASSER/.
Chem Compute Science Gateway: Computational Chemistry for Undergraduates
Presented by Mark Perri, Assistant Professor of Chemistry, Sonoma State University
Abstract: We report on our efforts to provide a free, easy-to-use, but powerful interface for undergraduate students to submit computational jobs. Computational experiments are critical for engaging students in Physical Chemistry because they provide the opportunity for visualization of concepts, such as molecular orbitals. The Chem Compute Science Gateway allows students to submit jobs to the NSF XSEDE supercomputer network. Students can submit jobs using the GAMESS (General Atomic and Molecular Electronic Structure System) ab initio quantum chemistry package and the TINKER Molecular Modeling Package (Molecular Dynamics).
Webinar: Interactive Best Practices: Job Management & Scheduling
August 9, 2017
Interactive Best Practices: Job Management & Scheduling
Do you wonder what is the best way to manage and schedule computational jobs? How do experts approach this task? This month’s webinar invites people from a variety of resources to share how they approach job management and scheduling. The resources represented include the Condor Project, the CIPRES gateway, and the SEAGrid gateway.
Questions asked in Chat during the webinar
Q: Can SEAGrid be locally hosted (say within a HUBzero hub) or does it require SciGaP?
A: Yes, it can be hosted anywhere. To learn more details, jump to the 56-minute mark in the video recording.
Q: Is the repository for SEAGrid sharing publicly available?
A: Yes, it is on GitHub and can be forked.
Webinar: Jupyter as a Gateway for Scientific Collaboration and Education
July 20, 2017
Jupyter as a Gateway for Scientific Collaboration and Education
Presented by Carol Willing, Cal Poly SLO and Jupyter Steering Council
Project Jupyter, evolved from the IPython environment, provides a platform for interactive computing that is widely used today in research, education, journalism and industry. The core premise of the Jupyter architecture is to design tools around the experience of interactive computing, building an environment, protocol, file format and libraries optimized for the computational process when there is a human in the loop, in a live iteration with ideas and data assisted by the computer.
The Jupyter Notebook, a system that allows users to compose rich documents that combine narrative text and mathematics together with live code and the output of computations in any format compatible with a web browser (plots, animations, audio, video, etc.), provides a foundation for scientific collaboration. The next generation of the Jupyter web interface, JupyterLab, will combine in a single user interface not only the notebook, but multiple other tools to access Jupyter services and remote computational resources and data. A flexible and responsive UI allows the user to mix Notebooks, terminals, text editors, graphical consoles and more, presenting in a single, unified environment the tools needed to work with a remote environment. Furthermore, the entire design is extensible and based on plugins that interoperate via open APIs, making it possible to design new plugins tailored to specific types of data or user needs.
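As a rough illustration of the document model described above, a notebook file is a JSON document holding a list of cells, some narrative and some code. The following is a minimal sketch, assuming version 4 of the notebook file format; the helper function names are hypothetical and are not part of any Jupyter library.

```python
import json


def markdown_cell(text):
    """Build a narrative (Markdown) cell. Hypothetical helper for illustration."""
    return {"cell_type": "markdown", "metadata": {}, "source": text}


def code_cell(source):
    """Build an unexecuted code cell with no stored outputs. Hypothetical helper."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": source,
    }


def make_notebook(cells):
    """Wrap cells in the top-level notebook structure (format version 4 assumed)."""
    return {"cells": cells, "metadata": {}, "nbformat": 4, "nbformat_minor": 4}


notebook = make_notebook(
    [
        markdown_cell("# Area of a circle\nNarrative text and math: $A = \\pi r^2$"),
        code_cell("import math\nmath.pi * 2.0 ** 2"),
    ]
)

# Serialize to the JSON text that would be stored in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving the printed text to a file with an .ipynb extension should give a document that the Notebook interface can open; outputs are added to the code cell only after it is run.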
JupyterHub enables Jupyter Notebook and JupyterLab to be used by groups of users for research collaboration and education. We believe JupyterHub provides a foundation on which to build modern scientific gateways that support a wide range of user scenarios, from interactive data exploration in high-level languages like Python, Julia or R, to the education of researchers and students whose work relies on traditional HPC resources.
View the slides (Slideshare)
View the slides (Speaker Deck)
There's also a repo https://github.com/willingc/2017-science-gateways with the talk and a resources.md file with links https://github.com/willingc/2017-science-gateways/blob/master/resources/resources.md
Questions asked during the webinar (and some answers posted in chat)
If you have further questions, the best way to reach Carol is on Gitter or the Jupyter mailing list.
Q: I understand Jupyter Notebook isn't just for Python, but can you use other languages? Is Perl one of those languages?
A: Yes, here is a list of supported languages: https://github.com/jupyter/jupyter/wiki/Jupyter-kernels
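For context on how other languages plug in: each kernel is registered through a small JSON description (a kernel.json file) that tells Jupyter how to launch it. The sketch below builds such a description in Python. The field names (argv, display_name, language) follow the kernel spec format; the launch command "my-perl-kernel" is a hypothetical placeholder, not a real program, and this was not shown in the webinar.

```python
import json


def kernel_spec(argv, display_name, language):
    """Build the dictionary that a kernel.json file would contain."""
    return {"argv": argv, "display_name": display_name, "language": language}


# "{connection_file}" is a placeholder that Jupyter replaces with the path to a
# file describing how to connect to the kernel. The command name is hypothetical.
spec = kernel_spec(
    argv=["my-perl-kernel", "-f", "{connection_file}"],
    display_name="Perl (example)",
    language="perl",
)

spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

A real kernel from the list above ships its own kernel.json, so users normally install it rather than write one by hand.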
Q: Is there a list for people exploring Jupyter in education? I am going to a hackathon on the subject, but I know there have been others. Is everything on GitHub, or is there a better way to find out who is doing what?
A: Here are the mailing lists Carol mentioned:
Project Jupyter: https://groups.google.com/forum/#!forum/jupyter
Teaching with Jupyter: https://groups.google.com/forum/#!forum/jupyter-education
Jupyter in HPC: https://groups.google.com/forum/#!forum/jupyter-hpc
Q: How is the Jupyter ecosystem thinking about security, as code is executable in the user's web browser?
A: On security, see https://gist.github.com/minrk/3af30da44d2495b7e19a and, although dated, https://blog.jupyter.org/2016/08/03/security-fix-notebook-4-2-2/
Q: Jupyter Notebooks are great for educational purposes; can you give an example or two of research/science projects using Jupyter Notebook?
A: Check out nbviewer and Jupyter's gallery of interesting notebooks.
Here's the link to the JupyterLab + Real Time Collaboration presentation Carol mentioned: https://channel9.msdn.com/Events/PyData/Seattle2017/BRK11
Many thanks to Johnathan Rush for providing many links on the fly during the webinar!
Webinar: Gateway Showcase featuring VectorBase and CitSci.org
June 14, 2017
Gateway Showcase featuring VectorBase and CitSci.org
VectorBase: A bioinformatics resource for invertebrate vectors and other organisms related with human diseases
Presented by Gloria I. Giraldo-Calderón, PhD, VectorBase Scientific Liaison/Outreach Manager
Contact: ggiraldo at nd dot edu
Abstract: VectorBase (www.vectorbase.org) is a free, web-based bioinformatics resource center (BRC) for invertebrate vectors of human pathogens, funded by NIAID/NIH. This database is the ‘home’ of 40 genomes of arthropod vectors and pests and also has transcriptomes, proteomes and population data for an even wider list of species. The population biology data includes lab and field collected information and, in addition to the data imported from external databases or directly submitted by users, VectorBase also generates and computes primary data. Over its 13 years of existence, the hosted data has been used for basic and translational research, as reflected in numerous scientific publications that use data from one or more studies in new or repurposed analyses, descriptions, and hypothesis testing. Raw and processed data can be exported or downloaded in a variety of formats, visualized, browsed, queried and analyzed with the site tools or any external tools. VectorBase data, tools, and resources are updated every two months. The website has extensive documentation resources for new and experienced users including tutorials, video tutorials, practice exercises, answer keys, and sample files.
CitSci.org: A platform for engaging citizen scientists through individualized websites
Presented by Greg Newman, Director CitSci.org & Research Scientist, Natural Resource Ecology Laboratory, Colorado State University
Contact: Gregory.Newman at colostate dot edu
Abstract: Citizen science empowers individuals to pursue their interests in the scientific world. Members of CitSci.org are encouraged to investigate their own scientific questions or jump on board as a volunteer for an existing project. In parallel, citizen science programs create their own online projects where trained volunteers and scientists together answer local, regional, and global questions, inform natural resource decisions, advance scientific understanding, and improve environmental education. The platform provides tools to empower the citizen science gateway creators and their participants to ask questions, select methods, submit data, analyze data, and share results. CitSci.org provides tools for the entire research process and full spectrum of citizen science program needs: creating new projects, managing project members, building custom data sheets, analyzing collected data, and gathering participant feedback. To date, our volunteer coordinators have started 414 projects that have contributed a total of 697,984 measurements for analysis to answer local, regional and/or global questions.