Responsibilities:
- Engage in and enhance the entire lifecycle of large-scale distributed systems, spanning system design consulting, launch reviews, deployment, operation, and refinement.
- Establish service availability across multiple global data centers.
- Develop tools and software to enhance service reliability, scalability, and operability.
- Measure and monitor availability, latency, and overall service health.
- Implement sustainable incident response protocols and conduct thorough postmortems.
- Participate in on-call rotations spanning multiple continents.
Key Requirements:
- Bachelor's degree in Computer Science with a minimum of 3 years of experience in a related field.
- Expertise in Unix/Linux operating systems and IP networking.
- Proficiency in programming in at least one of the following languages: C, C++, Java, Python, Perl, or Go.
- Experience in problem-solving, resolving application issues, or managing production operations.
- Experience in automating routine tasks.
- Strong communication skills with a sense of ownership and drive.
- Preferred: experience in designing, analyzing, and troubleshooting large-scale distributed systems.