AI Hardware Systems Engineer

Company: Sibitech
Location: Washington
Closing Date: 07/11/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

At eBay, we’re more than a global ecommerce leader — we’re changing the way the world shops and sells. Our platform empowers millions of buyers and sellers in more than 190 markets around the world. We’re committed to pushing boundaries and leaving our mark as we reinvent the future of ecommerce for enthusiasts.

Our customers are our compass, authenticity thrives, bold ideas are welcome, and everyone can bring their unique selves to work — every day. We’re in this together, sustaining the future of our customers, our company, and our planet.

Join a team of passionate thinkers, innovators, and dreamers — and help us connect people and build communities to create economic opportunity for all.

The Role:

We are looking for a Systems Software Engineer to join our team to qualify new AI-related hardware technologies and automate their testing, as well as to support some of our traditional qualification efforts. This person will interface with internal eBay teams working on AI platforms, other platform teams, key technology and systems integration vendors, AI open source software communities, and other members of the hardware engineering team.

Key Responsibilities:

  • You will work as part of the Hardware Engineering team to reduce the cost of purchasing and operating eBay’s fleet of servers, saving millions of dollars a year.
  • Your primary focus will be working on our AI hardware platforms.
  • You will translate internal customer requests into requirements, and develop benchmarks and test suites to ensure our platforms meet their needs.
  • Evaluate the performance and reliability of new hardware platforms and hardware components using automated tests, with a strong focus on AI accelerators.
  • Expand and maintain the automation we use daily for testing and reliability work.
  • Develop performance test plans and experiments with our customer teams to ensure we can use our hardware to its full potential.
  • Work with our customers to debug and address any reliability or performance issues they have with our server products.
  • Identify and suggest the ideal OS and BIOS settings for our systems.
  • Explore and propose new hardware/software technologies that improve the performance or reduce the cost of our products, particularly new AI accelerators.
  • You will improve our monitoring and data collection tooling to ensure we’re recording relevant information.

What you need:

  • You have 5-8 years of systems engineering experience on Linux.
  • You should understand how to configure servers to expose AI accelerators.
  • You have experience with AI frameworks and benchmarking tools such as PyTorch, DeepSpeed, or MLPerf, ideally including benchmarking of services or accelerators.
  • You should be able to explain how Linux utilizes various hardware components, and what tunables it provides.
  • We primarily use Python and Bash for automating tasks; you must be proficient in one of these languages.
  • You should have used a revision control system like Git, and be familiar with concepts like branching and merging.
  • You must be able to build and run containers with Docker or a similar technology.
  • You should understand how to compile and build source code, especially the Linux kernel.
  • BS in EE or CS, with continued formal or informal education.

Desired Skills:

  • It would be a bonus if you understood the differences between AI accelerators from multiple vendors, and the differences in their architectures.
  • We’d like you to be familiar with extending a monitoring framework like Prometheus.
  • We’d like for you to be familiar with Kubernetes and cloud computing concepts.
  • It would be a bonus if you’ve used profiling and performance tools like perf, VTune, or Performance Co-Pilot.
  • It would be great if you have experience analyzing logs and working with data repositories to help drive technical decisions.
  • It would be a bonus if you’ve deployed and configured systems at scale using standard technologies like PXE, Ansible, Salt, or Puppet.
  • Position ideally based in San Jose, CA with minimal travel required.