Senior DevOps SRE

Company: Futran Tech Solutions Pvt. Ltd.
Location: Burbank
Closing Date: 09/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

As a senior engineer, the candidate is regarded by fellow Disney team members as a 'go-to' individual: someone who has a clear understanding of SRE principles and best practices and can explain them thoroughly to a given audience. To be successful in this role, the candidate will continuously uphold and improve all relevant reliability aspects of Disney services, with an increased focus on SLOs, while raising the reliability of a variety of large-scale guest-facing and internal services.
Responsibilities:
• Build safe and secure automation for infrastructure and developer enablement, following the Disney Security Configuration Standards while seeking best practices from other teams;
• Develop useful telemetry, alerts, and response to reduce Mean Time To Repair (MTTR);
• Collaborate and provide technical excellence within and across teams;
• Consult on best practices and develop tools to enable smooth adoption of good service reliability practices and methods;
• Identify areas of improvement in reliability, efficiency, and operations;
• Build tools to help your SRE team quickly pinpoint, isolate and resolve issues related to infrastructure, platform services and applications;
• Continuously refine monitoring processes, configurations, and thresholds;
• Develop runbooks and tools to streamline processes and shorten problem resolution time;
• Write code that improves scalability, performance, maintainability, and security;
• Add, tune and maintain alert configurations and documentation as needed;
• Cultivate full-team participation in high quality, thoughtful software;
• Develop and improve CI/CD processes to improve release cadence and success;
• Use Chaos Engineering principles and methodologies to test what you build under real-world conditions;
• Mentor SREs in technical and non-technical SRE responsibilities;
• Take primary responsibility for large (multi-person) efforts, including planning, execution, and training
Basic Qualifications:
• Creative and innovative outside-the-box thinking
• 5+ years of experience in SRE, DevOps, technical operations, systems engineering, software engineering or related discipline
• Excellent communication skills, both verbal and written
• Passionate and curious about ways to leverage technology while continually learning
• Ability to identify root-cause sources of instability in high-traffic, large-scale distributed systems
• Experience in designing, building, and operating large-scale production systems
• Proficient with containers in enterprise production environments (e.g. Docker, Kubernetes, LXC, AWS ECS and EKS)
• Experience with configuration management and orchestration (e.g. Terraform, CloudFormation, Ansible)
• Comfortable in one or more of the following languages: Python, Java, Scala, Go, Rust, Ruby, or similar
• Experience with scripting languages such as Ruby, Bash, PowerShell or Python
• Skilled in Cloud/PaaS/SaaS Environments (e.g. AWS, Azure, Google Cloud Compute)
• Hands-on experience using source control (Git, GitHub) and feature branching strategies
• Experience with continuous integration tools (e.g. Jenkins, GitLab CI/CD, AWS CodeBuild, CodeDeploy, CodePipeline, Azure DevOps, Spinnaker)
• Knowledge of best practices and IT operations in an always-up, always-available service
• Expertise in scalable testing, automation, continuous integration frameworks and best practices
• Experience in SDLC, distributed systems, networking, hardware, logistics and operations or capacity planning
• UNIX/Linux administration, troubleshooting, performance tuning, and security
• Must be detail-oriented, self-organized, committed to quality and capable of tracking multiple issues simultaneously
Preferred Education:
• BS Degree in Computer Science, Electrical & Computer Engineering or Mathematics; or equivalent experience.
