At NICE, we don’t limit our challenges. We challenge our limits. Constantly. We’re relentless. We’re ambitious. And we make an impact. Our NICErs bring their A game and spend each day turning it into an A+. And if you’re like us, we can offer you the kind of challenge that will light a fire within you.
Senior DevOps Engineer / Site Reliability Engineer (SRE)
About The Team
The CXone Expert product is a multi-tenant SaaS platform, designed to handle millions of requests with high performance and reliability. Each Expert site can easily host a complex hierarchy of tens of thousands of pages (articles), with layers of fine-grained permissioning, server- and client-side customizations and branding, and other complex business logic. Our enterprise customers have a global presence, and they expect their content to be delivered with low latency across the globe and with near-zero downtime.
CXone Expert is an agile engineering organization, and QA is fully automated. We release new versions of our platform every week through our CI/CD pipeline. Our application infrastructure runs on AWS and is almost entirely containerized and orchestrated by Kubernetes.
We need a Sr. DevOps Engineer to round out our Site Reliability / DevOps team. This person will be the go-to for research and development of architectural changes from the infrastructure up. We already have AWS, CloudFormation, Kubernetes, and Linux experts on the team, and need someone to partner with them to design and build improvements to our platform that help us scale reliably as we expand our customer base over the coming years. Another important part of this role is helping other engineers on the team design and implement software that scales well and is highly reliable. This is a hands-on role: you will get your hands dirty and refactor existing system and application code yourself.
Responsibilities:
- Infrastructure as Code (IaC):
- Collaborate with development teams to design, implement, and maintain infrastructure using tools like Terraform or CloudFormation.
- Automate provisioning, configuration, and scaling of cloud resources.
- Monitoring and Incident Response:
- Set up monitoring and alerting systems (e.g., Datadog, OpenTelemetry, New Relic, Prometheus) to track service-level indicators (SLIs) and respond to incidents promptly.
- Participate in on-call rotations, diagnosing and resolving production issues.
- Capacity Planning and Performance Optimization:
- Analyze system performance, identify bottlenecks, and optimize resource utilization.
- Work closely with developers to improve application performance.
- Reliability Engineering:
- Define and track service-level objectives (SLOs) and error budgets.
- Implement chaos engineering practices (e.g., game days, fault injection) to validate system resilience.
- Continuous Integration and Deployment (CI/CD):
- Enhance CI/CD pipelines, ensuring smooth and reliable software releases.
- Implement blue-green deployments, canary releases, and feature flags.
- Security and Compliance:
- Collaborate with security teams to ensure compliance with industry standards (e.g., CIS, NIST, SOC, ISO).
- Implement security best practices, including access controls, encryption, and vulnerability scanning.
- Documentation and Knowledge Sharing:
- Document infrastructure, processes, and incident response procedures.
- Share knowledge with team members through internal workshops or presentations.
Requirements:
- Education: Bachelor’s degree in computer science, software engineering, or a related field. Master’s degree preferred.
- Experience:
- Minimum of 5 years in DevOps or SRE roles.
- Proficiency in cloud platforms (AWS, Azure, GCP), containers (Docker), and container orchestration (Kubernetes).
- Programming and Scripting:
- Strong scripting skills (Python, Bash, or similar).
- Familiarity with Git and version control.
- Collaboration and Communication:
- Excellent teamwork and communication skills.
- Ability to work across cross-functional teams.
- Problem-Solving and Analytical Skills:
- Proven ability to troubleshoot complex issues and optimize system performance.
- Certifications (optional):
- AWS Certified DevOps Engineer, Google Professional DevOps Engineer, or similar.
About NICE
NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NICE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.
Known as an innovation powerhouse that excels in AI, cloud and digital, NICE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.
NICE is proud to be an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, national origin, age, sex, marital status, ancestry, neurotype, physical or mental disability, veteran status, gender identity, sexual orientation or any other category protected by law.