Company: Huntington Bank
Location: Hopkins
Closing Date: 07/11/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description
As the Director of Resiliency Engineering, you will be responsible for leading our resiliency engineering initiatives and building a robust framework to ensure the reliability, availability, and performance of our services. You will oversee a team of engineers and collaborate with cross-functional teams to develop and implement strategies that enhance the resilience of our technology stack. This role requires a strategic thinker with a hands-on approach and strong leadership skills.
Key Responsibilities:
- Define and execute the vision and strategy for resiliency engineering, aligning with company goals and ensuring the highest levels of service availability and performance.
- Build, mentor, and lead a high-performing resiliency engineering team, fostering a culture of excellence, innovation, and collaboration.
- Establish and promote best practices for resiliency engineering, including service level objectives (SLOs), incident response, chaos engineering, and disaster recovery planning.
- Work closely with product, engineering, and operations teams to ensure alignment on resiliency goals and to integrate resiliency practices into the software development lifecycle.
- Oversee the implementation of monitoring and alerting systems to detect issues proactively and lead incident response efforts to minimize downtime and customer impact.
- Collaborate with engineering and infrastructure teams to analyze usage patterns and optimize resource allocation to ensure performance during peak loads.
- Drive innovation in resiliency practices by exploring new tools, technologies, and methodologies to enhance system robustness.
- Communicate effectively with executive leadership and other stakeholders regarding resiliency initiatives, progress, and challenges.
Basic Qualifications:
- Bachelor’s degree in Computer Science, Engineering, or a related field.
- 12+ years of experience in software engineering, reliability engineering, or a related field, with at least 5 years in a leadership role.
- 5+ years of experience leading teams in building and maintaining high-availability systems and services in a cloud-based environment.
- 5+ years of experience with distributed systems, microservices architecture, and cloud technologies (AWS, GCP, Azure).
- 5+ years of experience with chaos engineering practices and tools, as well as incident management and disaster recovery processes.
Preferred Qualifications:
- Master’s degree in Computer Science, Engineering, or a related field.
- Proven track record of leading teams in building and maintaining high-availability systems and services in a cloud-based environment.
- Strong understanding of distributed systems, microservices architecture, and cloud technologies (AWS, GCP, Azure).
- Experience with chaos engineering practices and tools, as well as incident management and disaster recovery processes.
- Excellent problem-solving skills and a proactive approach to identifying and addressing challenges.
- Strong communication and interpersonal skills, with the ability to influence and collaborate across all levels of the organization.
- Experience in a senior engineering leadership role within a fast-paced, high-growth technology company.
- Familiarity with DevOps practices and culture.
- Knowledge of container and orchestration technologies (e.g., Docker, Kubernetes) and infrastructure as code (IaC) tools (e.g., Terraform, Ansible).