(Full Time) Site Reliability Engineer at RADAR (United States)
Site Reliability Engineer
RADAR United States
Date Posted: 14 Mar, 2023
Work Location: San Francisco, United States
Salary Offered: $100000 — $230000 yearly
Job Type: Full Time
Experience Required: 6+ years
Remote Work: Yes
Stock Options: No
Vacancies: 1 available
About Us
Be part of an exciting, well-funded startup changing the world of retail and beyond. RADAR’s mission is to revolutionize customer experience in retail through precise identification of inventory in stores and distribution centers, transforming the in-store experience for employees and customers alike.
About the Role
As a cloud Site Reliability Engineer, you will be involved with our fast-paced releases and collaborate closely with the application development team. The role requires hands-on participation and a deep understanding of cloud-related technologies, management platforms, and networking.
Responsibilities
- Run the production environment by monitoring availability and taking a holistic view of system health, and correct any issues in low-latency, high-performance, scalable systems within a polyglot architecture.
- Lead capacity planning, automate server capacity monitoring and scaling, and establish best practices for metrics gathering, monitoring, and alarming.
- Provide tooling to monitor and resolve issues with the system's persistent data stores, and handle basic data administration and data pipeline optimization.
- Evangelize high engineering standards and best practices across multiple areas.
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
- Provide primary operational support and engineering for multiple large-scale distributed software applications.
- Follow key SRE practices: preventive measures against failures, availability, performance, monitoring, alerting, and incident response.
- Document “tribal knowledge” and conduct post-incident reviews and corrections.
- Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
- Debug production issues across services and levels of the stack.
About You
Requirements:
- Bachelor’s degree or equivalent experience in Engineering, Computer Science, or a related field.
- 7+ years of professional experience in DevOps / SRE handling production procedures, and a certification with a major cloud provider such as GCP, AWS, or Azure.
- In-depth experience with Docker Compose / Docker Swarm, Kubernetes cluster deployment, cluster design, sizing, and containerization.
- In-depth experience deploying microservice architectures and applications, and supporting serverless architectures.
- In-depth experience with infrastructure-as-code and config management for VMs and containers, using Terraform, Ansible, or comparable tooling.
- In-depth experience with Prometheus, the TICK stack, Elastic, Logstash / Filebeat, and Telegraf, among others.
- Prior experience building out solutions with Vault and Consul for secrets and configuration management.
- Prior in-depth experience with open-source databases, cloud-native databases, and cloud-native messaging frameworks.
- Rock solid with scripting languages such as Python, Ruby, Go, and shell, and with YAML constructs.
- Working knowledge of networking concepts, VPN, and VPC constructs in the cloud.
- Understanding of operations tools (PagerDuty, CloudWatch, Datadog, Sentry, etc.).
- Deeply conversant with cloud infrastructure security best practices.
- Good understanding of at least one of the following CI / CD tools: Atlassian tooling, Jenkins, CircleCI, or cloud-native deployment tools, and a deep understanding of GitOps.
What We’re Looking For In Teammates
We are looking for exceptional people to join our growing team and have a positive impact on our culture, technology, and product from day one. We deeply value humility, curiosity, and a positive attitude.
What It’s Like To Work With Us
We respect each other and each of our contributions, and we believe that the best solutions will come from a diversity of ideas and perspectives.