Data Science Workflows Architect - RCSB Protein Data Bank

Data Science Workflows Architect

RCSB Protein Data Bank is seeking a highly motivated Data Science Workflows Architect. This position involves working closely with a team of scientists and software developers. The candidate should be comfortable in a dynamic environment and able to think creatively, generate new ideas, and implement solutions. The candidate should enjoy working with scientific data and engaging with other programmers and scientists in a collaborative team environment.



* Serve as a subject-matter expert on engineering software running in a high-performance computing environment.

* Evaluate an appropriate workflow management system to run a complex scientific workflow within a high-performance computing environment.

* Design and implement the chosen workflow management system to execute on top of the Slurm job scheduling platform.

* Optimize the workflow for horizontal and vertical scalability.

* Work closely with other software developers to find workflow bottlenecks and suggest areas of improvement.

Required Knowledge, Skills, and Abilities

* Bachelor’s degree (minimum) in computer science or a related discipline

* Programming skills in a high-level language, preferably Python

* Experience developing software and workflows in an HPC environment

* Knowledge of workflow engines, e.g. Airflow, StreamFlow, or Prefect

* Ability to debug highly complex software and workflows in an HPC environment

* Experience with a container orchestration platform (e.g. Kubernetes) or cloud provider platforms (e.g. AWS) would be an advantage


Outstanding Benefits Package

RCSB PDB is a friendly and collaborative working environment with excellent professional development opportunities. At Rutgers, this academic position offers New Jersey state benefits and a faculty-level salary. Benefits include comprehensive health and retirement plan options.

For more details about state and university benefits, see our employee benefits page.


RCSB PDB impacts millions of users around the world working in fundamental biology, biomedicine, bioengineering, and energy sciences. Our work involves data analysis, integration, transformation, and presentation and visualization of data using complex interactive graphical user interfaces. An important aspect is providing users with the ability to search and explore the PDB data archive. Solutions are implemented using a wide range of components developed and maintained in-house, in addition to third-party tools, libraries, frameworks, and technologies.

See Why Join Us.