Job Summary:
Our Performance and Reliability teams lead the improvement, optimization, and availability of applications across the Disney organization and its business units. We take a consultative approach to Reliability Engineering: supporting, educating, mentoring, and delivering automation that fosters performance and resiliency as best practice.
The Senior Site Reliability Engineer is a key member of our Performance and Reliability embedded teams. We focus on planning, scoping, solution architecting, software design, and implementation based on functional and performance capability requirements. We leverage cloud-native, commercial, and open-source tools and frameworks to solve complex business needs. These solutions touch a wide range of functional areas. This role will collaborate with Software Engineers, Product Owners, and others across teams and business areas to influence solutions and platforms across the organization.
Responsibilities:
Builds solutions for problems of sizable scope and complexity and sees them through to successful customer deployment.
Champions Infrastructure as Code (IaC); provides thought leadership; establishes enterprise-level infrastructure patterns.
Builds and enhances Continuous Integration and Delivery (CI/CD) pipelines.
Regularly reviews existing systems, policies, and practices, identifying solutions that improve service delivery efficiency and the current environment.
Mentors less experienced engineers. Collaborates with product engineering leaders to find innovative solutions for moderately complex problems.
Writes code that establishes and enhances frameworks, typically for software programs and systems that have little or no precedent.
Reviews code for design, testability, and usability.
Develops specifications for assigned components, projects, or fixes.
Builds solutions that scale and perform.
Participates in project proposals, architecture, and design. Contributes to the architectural design and implementation of assigned projects and may lead the effort.
Oversees technical maintenance. Performs troubleshooting for systems that tend to be large and highly complex.
Performs design, development, documentation, and/or testing.
Applies experience to resolve a variety of complex issues.
Basic Qualifications:
7+ years of experience in the Reliability Engineering field.
Well-versed in Reliability Engineering principles, patterns, and best practices.
Ability to understand the business domain from both a technical and product viewpoint.
5+ years of experience working with AWS Cloud Infrastructure and resources.
5+ years of experience designing and implementing automation tools.
5+ years of experience running and monitoring large-scale distributed systems.
Proficient in Python and/or another programming language.
Well-versed in modern infrastructure services and concepts such as containerization, distributed systems, and microservices.
Well-versed in Software Engineering principles and patterns.
Experience working with globally distributed teams.
Experience as a coach and mentor within a business environment.
Experience working within an Agile environment.
Preferred Qualifications:
Bachelor's degree in computer science or related field, or equivalent training or work experience.