HPC DevOps Engineer at UW Institute for Health Metrics and Evaluation

Company: Eastern Washington University
Location: Washington
Closing Date: 23/10/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

IHME has an outstanding opportunity for an HPC DevOps Engineer, Infrastructure.

POSITION PURPOSE
The purpose of this position is to define, implement, and automate processes relating to the deployment of applications within IHME’s infrastructure. The HPC DevOps Engineer will support development staff as they integrate new processes into the development workflow by defining processes, architecting environments, and advocating best practices relating to application security, scalability, and reusability. Additionally, a qualified candidate will monitor the deployed applications to ensure that our products remain available to their audiences. We are looking for a highly creative and organized individual with substantial experience in systems and IT operations who thrives on building rapidly deployable, scalable, and stable computational environments.

DUTIES AND RESPONSIBILITIES

Process engineering

  • Define and improve the automation processes for the system administration team.
  • Work with HPC software engineers to ensure that tools, scripts, and programs are implemented to make full use of our resources.
  • Ensure that development processes allow for rapid deployment, rapid recovery from failures, and overall application stability.
  • Understand and promote application security best practices in development processes.

Architecting and managing environments

  • Manage various development, testing, and production environments for internal and public-facing data visualizations and web applications.
  • Develop strategies for recovering from failures by implementing checkpoints and other methods in the HPC environment.
  • Implement and maintain application infrastructure monitoring and reporting tools.
  • Work with Systems Administrators to ensure stable, functional environments.
  • Understand the methods and technologies used in the storage, manipulation, and display of geospatial and high-dimensional epidemiological information.

Development and maintenance of internal tools

  • Develop and maintain internal tools related to the deployment of software-defined infrastructure.

Research knowledge

  • Become familiar with substantive areas of expertise at IHME and their comprehensive data needs in order to perform complex multidisciplinary analyses.
  • Develop an understanding of overall IHME team structure in order to more effectively manage information channels and inter-team dependencies.

General

  • Stay up to date on the latest technologies surrounding application deployment and large datasets, and have a good understanding of when and where they should be implemented to benefit the Institute.
  • Perform additional duties as assigned that fall within the reasonable scope of this position and the IHME team overall.

MINIMUM REQUIREMENTS

  • Bachelor’s degree in computer science or related field and five years’ related experience, or equivalent combination of education and experience.

ADDITIONAL REQUIREMENTS

  • Outstanding interpersonal skills, including team ethic and relationship building.
  • Strong background in Linux/Unix administration.
  • Strong software engineering and scripting experience.
  • Experience with systems and IT operations.
  • Experience with configuration management and automation (e.g., Salt, Ansible, Chef).
  • Familiarity with Linux Containers and Virtualization (Docker/LXC, vSphere, OpenStack).
  • A working understanding of code and script (Bash, Python, or Ruby).
  • Experience with monitoring and metrics-gathering tools (ELK stack, Nagios, Sensu).
  • Understanding of Software Development Life Cycle, Continuous Integration and Continuous Delivery.
  • Experience with version control systems (Git).
  • Comfortable with Agile methodologies and working closely within small teams.
  • A commitment to working alongside others at IHME to illuminate the health impacts of systemic racism and to work within IHME to make our organization more diverse and inclusive.

DESIRED QUALIFICATIONS

  • Experience with distributed storage systems such as Qumulo, Ceph, or StorNext.
  • Knowledge of cluster management and scheduling systems like SGE/UGE, Torque, or Slurm.

CONDITIONS OF EMPLOYMENT

  • Weekend and evening work sometimes required.
  • This position is open to anyone authorized to work in the US. The UW is not able to sponsor visas for staff positions.
  • Office is located in Seattle, Washington. This position is eligible to work fully remote in the US; the work schedule must overlap 50% of IHME office hours, which run between 8 a.m. and 6 p.m. Pacific Time.