HPC SYSTEMS ENGINEER

Company:  ALTA IT Services
Location: Rockville
Closing Date: 16/10/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
Job Description

HPC Systems Engineer

Remote - but must be local to DC Metro area for some onsite meetings

Work with a GPU-focused 4,000+ core HPC cluster and a 1,500+ HPC cluster, supporting the hardware and operating system environments

Ability to translate technical concepts in HPC and research computing to scientists and other non-technical personnel

Ability to determine meaningful metrics and usage data for leadership

Support bioinformatics applications for a large and diverse research community with needs in genomics, cryo-electron microscopy, and AI/ML

Monitor the portfolio of software applications and be proactive in planning upgrades and license renewals

Monitor and report on cluster performance and generate data to show usage and trends

Triage support requests from the research community and work with others in the Scientific Infrastructure team to resolve issues and complete service requests

Collaborate with researchers to guide them in effective use of the HPC resources, such as job scheduler submission, data formats, and building data workflows

Engage with researchers to understand their HPC needs, including data life cycle management, integration of scientific instruments with HPC, and storage capacity and compute requirements

Provide input to the Scientific Infrastructure team leader for setting priorities for cluster operations, scheduling policies, resources needed, etc.

Attend and actively participate in daily standup meetings to provide updates on progress, discuss obstacles, and coordinate tasks with other team members

Education:
BS/BA (or equivalent)

Required Experience:
Five years of related experience

Required Technical Skills:
Minimum of five years of experience with servers, datacenters, networking, and related technologies
Minimum of five years of experience managing Linux systems
Experience with the Spack package manager, including making packages from PyPI, R, and GitHub
Experience installing and packaging GPU applications and optimizing job submission scripts that are used for ML model training, data mining operations, or high-res graphics rendering
Experience with Python scripting
Experience using Git distributed workflows
Experience with Ansible to manage system configuration
Experience with Terraform for provisioning systems
