Site Reliability Engineer

Company: RADAR
Location: San Francisco
Closing Date: 31/10/2024
Salary: $100,000 to $230,000 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

Date Posted: 14 Mar, 2023

Work Location: San Francisco, United States

Salary Offered: $100,000 to $230,000 yearly

Job Type: Full Time

Experience Required: 6+ years

Remote Work: Yes

Stock Options: No

Vacancies: 1 available

About Us

Be part of an exciting, well-funded startup changing the world of retail and beyond. RADAR’s mission is to revolutionize customer experience in retail through precise identification of inventory in stores and distribution centers, completely transforming the in-store experience for employees and customers alike.

About the Role

As a cloud Site Reliability Engineer, you will be involved with our fast-paced releases and collaborate closely with the application development team. The role requires hands-on participation and a deep understanding of cloud-related technologies, management platforms, and networking.

Responsibilities

  1. Run the production environment by monitoring availability and taking a holistic view of system health, and correct any issues in low-latency, high-performance, scalable systems within a polyglot architecture.
  2. Lead capacity planning, automate server capacity monitoring and scaling, and establish best practices for metrics gathering, monitoring, and alarming.
  3. Provide tooling to monitor and resolve any issues with persistent data stores in the system, basic data administration, and optimization for the data pipeline.
  4. Evangelize high engineering standards and best practices across multiple areas.
  5. Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement.
  6. Provide primary operational support and engineering for multiple large-scale distributed software applications.
  7. Follow key SRE practices covering failure prevention, availability, performance, monitoring, alerting, and incident response.
  8. Document “tribal knowledge” and conduct post-incident reviews and corrections.
  9. Improve operational processes (such as deployments and upgrades) to make them as boring as possible.
  10. Debug production issues across services and levels of the stack.

About You

Requirements:

  1. Bachelor’s degree or equivalent experience in Engineering, Computer Science, or a related field.
  2. 7+ years of professional experience in DevOps / SRE handling production procedures, and a certification from a major cloud provider such as GCP, AWS, or Azure.
  3. In-depth experience with Docker Compose / Docker Swarm, Kubernetes cluster deployment, cluster design, sizing, and containerization.
  4. In-depth experience deploying microservice architecture, applications, and supporting serverless architectures.
  5. In-depth experience with infrastructure-as-code and config management for VMs and containers, using Terraform, Ansible, or comparable tooling.
  6. In-depth experience with Prometheus, the TICK stack, Elastic, Logstash / Filebeat, and Telegraf, amongst others.
  7. Prior experience in building out solutions with Vault and Consul for secrets and configuration management.
  8. Prior in-depth experience with open-source databases, cloud-native databases, and cloud-native messaging frameworks.
  9. Rock solid with scripting languages such as Python, Ruby, Go, and shell, and with YAML constructs.
  10. Working knowledge of networking concepts, VPN, and VPC constructs in the cloud.
  11. Understanding of operations tools (PagerDuty, CloudWatch, Datadog, Sentry, etc.).
  12. Deeply conversant with cloud infrastructure security best practices.
  13. Good understanding of at least one of the following CI / CD tools: Atlassian tooling, Jenkins, CircleCI, or cloud-native deployment tools, plus a deep understanding of GitOps.

What We’re Looking For In Teammates

We are looking for exceptional people to join our growing team and have a positive impact on our culture, technology, and product from day one. We deeply value humility, curiosity, and a positive attitude.

What It’s Like To Work With Us

We respect each other and each of our contributions, and we believe that the best solutions will come from a diversity of ideas and perspectives.
