Job ID: 24-05261 Site Reliability Engineer
Newton, MA Hybrid
6 Months
$38.73 per hour on W2
We are seeking a skilled Site Reliability Engineer (SRE) Level 2 to join our dynamic team. The ideal candidate will have a strong technical background, excellent problem-solving skills, and a passion for enhancing system reliability and performance. You will play a crucial role in monitoring, automating, and optimizing our infrastructure to ensure the seamless operation of our services.
Key Responsibilities:
System Monitoring and Incident Response: Monitor system health, performance metrics, and availability. Respond promptly to incidents and outages, ensuring minimal downtime.
Infrastructure Management: Manage and optimize both cloud and on-premises infrastructure using Infrastructure as Code (IaC) tools.
Automation: Develop and maintain automation scripts and tools to enhance operational efficiency and reduce manual tasks.
Collaboration: Work closely with development teams to implement CI/CD practices and improve deployment processes.
Capacity Planning: Analyze usage patterns and forecast capacity needs to ensure system scalability and reliability.
Documentation: Create and maintain comprehensive documentation for systems, processes, and incident response protocols.
Security Best Practices: Implement and enforce security measures to protect infrastructure and data.
Post-Incident Reviews: Conduct post-mortems on incidents to identify root causes and implement corrective actions.
Skills:
Strong knowledge of Linux/Unix systems and proficiency in scripting languages (e.g., Python, Bash).
Familiarity with cloud platforms (e.g., AWS) and their services.
Experience with containers and container orchestration (e.g., Docker, Kubernetes).
Proficiency in using monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios).
Experience with version control systems (e.g., Git).
Strong troubleshooting skills with the ability to diagnose complex system issues.
Excellent verbal and written communication skills for collaboration with cross-functional teams.
Understanding of Agile development practices.