HPC System Administrator - UCLA's Office of Advanced Research Computing
- Published on Sunday, 12 November 2023 12:32
APPLICATIONS PROGR 3
HPC System Administrator
$5,958 - $12,908 Monthly
2200-RESEARCH TECHNOLOGY GROUP
UCLA's Office of Advanced Research Computing (OARC) supports and enhances the university mission of education, research, and service through the development and execution of innovative and sustainable technology practices, programs, services, infrastructure, policies, and partnerships.
The OARC High Performance Computing (HPC) Systems Research Technology Group (RTG) (https://oarc.ucla.edu/get-help/areas-expertise/high-performance-computing) supports thousands of UCLA researchers and over 300 research groups through consultation and the operation of the Hoffman2 High Performance Research Cluster. More information on the Hoffman2 cluster may be found in the Hoffman2 Cluster Documentation: https://www.hoffman2.idre.ucla.edu/
The Hoffman2 cluster environment consists of approximately 1000 compute nodes, GPU nodes, high speed networking, high-performance storage, backup equipment, and extensive hardware and software support infrastructure, spread across multiple data centers.
The HPC System Administrator, as part of the HPC team, will serve as a technical expert supporting OARC's HPC environment in the areas of systems and application software development, HPC cluster system administration, and management of the backup system environment.
Duties and Responsibilities:
- Perform day-to-day system administration tasks for a large Red Hat Linux-based cluster.
- Develop custom applications and system scripts to assist in the operation and management of the OARC HPC system.
- Maintain and update legacy software applications related to OARC's Hoffman2 shared computing cluster.
- Use appropriate software engineering techniques and methodology to produce secure, robust, reliable, and well documented software.
- Install, configure, monitor, maintain, and support complex software applications, operating systems, and hardware typical in the academic HPC environment.
- Maintain and update software, technical, and operational documentation.
- Learn, assess, evaluate, and apply new technologies and concepts applicable to fulfilling OARC's researcher-focused mission.
- Support researchers and colleagues using the HPC system through a ticketing system.
- Maintain OARC's HPC storage backup system.
- Read and understand the Qualifications section.
Requires the ability to work from UCLA's Westwood campus as operational demands dictate. FlexWork / hybrid schedules will be considered based on work demands and operational needs.
Percentage of Time: 100
Shift Hours: 8:00 am - 5:00 pm