Site Reliability Engineer

Company: JobRialto

Location: West Lake Hills

Closing Date: 03/11/2024

Hours: Full Time

Type: Permanent

Job Requirements / Description

Job Summary
The Site Reliability Engineer (SRE) is responsible for leading the production support, readiness, availability, and resiliency of critical applications, infrastructure, and batch jobs. This role focuses on cloud computing (AWS and Azure), enterprise tools (Jenkins, Docker, Kubernetes, etc.), and implementing practices in resiliency engineering, automation, observability, and chaos testing. The SRE will work within a centralized support services team, managing complex distributed systems and enhancing reliability across all levels of the infrastructure.
Key Responsibilities
• Oversee production support, ensuring readiness, availability, and resiliency of critical systems and applications.
• Lead implementation of practices related to resiliency engineering, automation, observability, and chaos testing.
• Collaborate with cross-functional teams to address issues related to hardware, software, network, applications, and cloud service providers.
• Provide cloud and platform engineering support for production environments, participating in an on-call rotation.
• Troubleshoot application issues in Unix/Linux environments involving J2EE, WebSphere, Tomcat, and SQL.
• Utilize observability tools like Prometheus, Grafana, ELK, Datadog, and Splunk to monitor applications and infrastructure.
• Ensure scalability and resiliency of complex distributed systems.
• Implement infrastructure as code using tools like Terraform, Chef, and Ansible.
• Automate day-to-day activities using Python and Ansible.
• Perform chaos testing to build system resilience.
• Manage and monitor SSL certificates and handle security and patching for on-prem servers.
Required Qualifications
• Bachelor's degree in a technology-related field (e.g., Engineering, Computer Science).
• 5-8+ years of hands-on experience deploying or supporting multi-tiered distributed systems.
• Hands-on experience with public cloud environments (AWS and Azure).
• Experience with container orchestration (Kubernetes).
• Experience with batch processing tools (Control-M, Informatica).
• Strong understanding of cloud computing and DevOps concepts, including CI/CD pipelines.
• Experience with monitoring and observability tools (Prometheus, Datadog, Grafana, Splunk).
• Experience in Unix/Linux environments with J2EE, WebSphere, Tomcat, and SQL.
• Familiarity with ITIL processes like incident and change management.
• Hands-on experience with infrastructure as code tools (Terraform, Ansible, Chef).
• Proven experience in chaos testing and building resilient systems.
• Strong skills in scripting languages (Python, Bash, Korn shell).
Preferred Qualifications
• Experience in cloud development and migration.
• Proficiency in managing large datasets using query languages and visualization tools.
• Strong understanding of API testing tools (SoapUI, Postman).
• Experience with Agile methodology and with managing on-prem server fleets.
• Experience with web development (Django, JavaScript).
Certifications
• AWS or Azure cloud certifications (preferred but not required).
Education: Bachelor's Degree