Senior Software Engineer

Company: Microsoft
Location: Mead
Closing Date: 22/10/2024
Salary: £125 - £150 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

Want to impact the foundation for future AI storage development in Azure, the world's computer? The Azure Managed Lustre File System (AMLFS) team leads development, deployment, and monitoring of the most popular High-Performance Computing (HPC) parallel file system in the world: Lustre. The AMLFS Platform Team is responsible for end-to-end delivery of AMLFS images, cluster deployment, logs and metrics, and configuration compliance.


As a Senior Software Engineer in the AMLFS Platform team you'll be responsible for developing the reliable deployment of AMLFS in Azure, assessing and mitigating security risks, developing comprehensive unit and system-level tests, and diagnosing, mitigating, and fixing the most challenging deployment and upgrade customer issues. You'll design and develop logging, monitoring, and reporting capabilities for AMLFS and help define and measure key Service Level Indicators designed to make our product increasingly robust. This opportunity will allow you to develop expertise in distributed system design, grow proficient in navigating and managing Linux operating systems, and collaborate with the core storage, compute, and networking teams that form the foundation of Azure.


Responsibilities

  1. Collaborates with appropriate stakeholders to determine user requirements for a scenario.
  2. Drives identification of dependencies and the development of design documents for a product, application, service, or platform.
  3. Creates, implements, optimizes, debugs, refactors, and reuses code to establish and improve performance and maintainability, effectiveness, and return on investment (ROI).
  4. Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items.
  5. Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor the system/product/service for degradation, downtime, or interruptions, alerting stakeholders about status, and initiating actions to restore the system/product/service for simple and complex problems when appropriate.
  6. Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.

Qualifications

Required Qualifications:

  1. Bachelor's Degree in Computer Science or related technical field AND 4+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
    • OR equivalent experience.
  2. 2+ years of working, developing, and debugging within a Linux operating system environment and at least a broad understanding of Linux fundamentals.
  3. 2+ years of experience with high-performance computing OR distributed systems in an industry or academic setting.

Other Requirements:

  1. Ability to meet Microsoft, customer, and/or government security screening requirements is required for this role. These requirements include, but are not limited to, the following specialized security screenings:
    • Microsoft Cloud Background Check: This position will be required to pass the Microsoft Cloud Background Check upon hire/transfer and every two years thereafter.

Preferred Qualifications:

  1. Bachelor's Degree in Computer Science or related technical field AND 8+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
    • OR Master's Degree in Computer Science or related technical field AND 6+ years technical engineering experience with coding in languages including, but not limited to, C, C++, C#, Java, JavaScript, or Python
    • OR equivalent experience
  2. 4+ years of experience with high-performance computing OR distributed systems in an industry or academic setting.
  3. Experience with the Lustre parallel file system OR an equivalent parallel or distributed file system.
  4. 4+ years of working, developing, and debugging within a Linux operating system environment.