Site Reliability Engineer

Company: Cypress HCM
Location: Newton
Closing Date: 09/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

Site Reliability Engineer 2

Description:

Reason: Special Project | Department: Stock US Eng | 6 Months

Job Summary

We are seeking a skilled Site Reliability Engineer (SRE) Level 2 to join our dynamic team. The ideal candidate will have a strong technical background, excellent problem-solving skills, and a passion for enhancing system reliability and performance. You will play a crucial role in monitoring, automating, and optimizing our infrastructure to ensure the seamless operation of our services.

Key Responsibilities:

  • System Monitoring and Incident Response: Monitor system health, performance metrics, and availability. Respond promptly to incidents and outages, ensuring minimal downtime.
  • Infrastructure Management: Manage and optimize both cloud and on-premise infrastructure using Infrastructure as Code (IaC) tools.
  • Automation: Develop and maintain automation scripts and tools to enhance operational efficiency and reduce manual tasks.
  • Collaboration: Work closely with development teams to implement CI/CD practices and improve deployment processes.
  • Capacity Planning: Analyze usage patterns and forecast capacity needs to ensure system scalability and reliability.
  • Documentation: Create and maintain comprehensive documentation for systems, processes, and incident response protocols.
  • Security Best Practices: Implement and enforce security measures to protect infrastructure and data.
  • Post-Incident Reviews: Conduct post-mortems on incidents to identify root causes and implement corrective actions.

Required Skills:

  • Strong knowledge of Linux/Unix systems and proficiency in scripting languages (e.g., Python, Bash).
  • Familiarity with cloud platforms (e.g., AWS) and their services.
  • Experience with containers and container orchestration (e.g., Docker, Kubernetes).
  • Proficiency in using monitoring and alerting tools (e.g., Prometheus, Grafana, Nagios).
  • Experience with version control systems (e.g., Git).
  • Strong troubleshooting skills with the ability to diagnose complex system issues.
  • Excellent verbal and written communication skills for collaboration with cross-functional teams.
  • Understanding of Agile development practices and methodologies.

Experience:

1-4 years of experience in Site Reliability Engineering or a similar role.

Location:

Newton, MA

Schedule:

  • Start Date: 10/14/2024
  • Estimated End Date: 04/07/2025
  • Hours Per Week: 40.00
  • Hours Per Day: 8.00

Compensation:

Up to $38.73/hr. (W2/Non-Exempt)

#33897117
