Data Engineer

Company: Oliver James
Location: San Francisco
Closing Date: 17/10/2024
Salary: £150 - £200 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

Location: San Francisco, CA (Hybrid)

Qualifications:

  • Master's degree or PhD in a related field.
  • Proficient in Python.
  • Strong background in Software Engineering.
  • Meticulous in preventing and catching data mistakes.
  • Enthusiastic about engaging deeply with raw data.
  • Committed to adhering to engineering best practices.

Responsibilities:

  • Apply a strong understanding of why high-quality data matters for building high-performance machine learning systems.
  • Integrate novel, high-quality text data sources into established data pipelines.
  • Build models dedicated to precise classification and extraction of valuable text from raw HTML.
  • Develop a sophisticated OCR pipeline to extract pretraining text from images and scans, ensuring exceptional quality.
  • Amass an extensive volume of multimodal data, for example transcripts covering thousands of years of video.
  • Devise innovative data generation pipelines that capitalize on existing data, such as the conversion of code from one programming language to another.
  • Unify various annotation service providers into a user-friendly interface tailored for researchers.