Data Engineer, Python / AWS

Company:  Credera
Location: Washington
Closing Date: 20/10/2024
Salary: £125 - £150 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

  • Remote role for a contractor anywhere in North America
  • 12-month contract with possible extension
  • Rate: $65 - $80/hr USD, based on experience

Description:

Data Engineers at TA Digital work closely with Subject Matter Experts (SMEs) to design the ontology (data model), develop data pipelines, and integrate Foundry with the external systems that hold the data. Data engineers also provide guidance and support on accessing and leveraging the data foundation to create new workflows or analyze data.

Responsibilities Include:

  • Leverage Generative AI on AWS data
  • Integrate new data sources to Foundry using Data Connection
  • Implement two-way integrations between Foundry and external systems
  • Develop pipelines transforming tabular or unstructured data
  • Implement data transformations in PySpark / PySpark SQL to derive new datasets or create ontology objects
  • Set up support structures for pipelines running in production
  • Monitor and debug critical issues such as data staleness or data quality
  • Improve performance of data pipelines (latency, resource usage)
  • Design and implement an ontology based on business requirements and available data
  • Provide data engineering context for application development
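The monitoring responsibilities above (catching data staleness before it reaches production consumers) can be sketched with a minimal check in Python; the dataset names, threshold, and metadata shape are hypothetical, not part of any Foundry API:

```python
from datetime import datetime, timedelta, timezone

def find_stale_datasets(last_updated, max_age=timedelta(hours=24), now=None):
    """Return names of datasets whose last successful build is older than max_age.

    last_updated: dict mapping dataset name -> datetime of last successful build.
    """
    now = now or datetime.now(timezone.utc)
    return [name for name, ts in last_updated.items() if now - ts > max_age]

# Hypothetical pipeline metadata for illustration.
now = datetime(2024, 10, 1, 12, 0, tzinfo=timezone.utc)
updates = {
    "orders_clean": now - timedelta(hours=2),        # fresh
    "customers_ontology": now - timedelta(hours=30), # stale
}
print(find_stale_datasets(updates, now=now))  # → ['customers_ontology']
```

In practice such a check would read build timestamps from the pipeline scheduler and alert rather than print, but the comparison logic is the same.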

Requirements:

  • Generative AI on AWS – experience with services such as Amazon Bedrock, Amazon SageMaker, Amazon EC2, Amazon EC2 UltraClusters, AWS Trainium, or AWS Inferentia
  • Python – complete language proficiency
  • SQL – proficiency in querying language (join types, filtering, aggregation) and data modeling (relationship types, constraints)
  • PySpark – basic familiarity (DataFrame operations, PySpark SQL functions) and awareness of how it differs from other DataFrame implementations (e.g., pandas)
  • Distributed compute – conceptual knowledge of Hadoop and Spark (driver, executors, partitions)
  • Databases – general familiarity with common relational database models and proprietary implementations such as SAP, Salesforce, etc.
  • Git – knowledge of version control / collaboration workflows and best practices
  • Iterative working – familiarity with agile and iterative methodologies and with rapid user-feedback gathering
  • Data quality – familiarity with data-quality best practices
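The SQL proficiency listed above (join types, filtering, aggregation) can be illustrated with a minimal sketch using Python's built-in sqlite3 module; the table names and data are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (id INTEGER, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 75.0);
""")

# LEFT JOIN keeps every customer; GROUP BY aggregates order amounts;
# HAVING filters on the aggregate; ORDER BY sorts the result.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total
    FROM customers c
    LEFT JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    HAVING total > 60
    ORDER BY total DESC
""").fetchall()
print(rows)  # → [('Acme', 150.0), ('Globex', 75.0)]
```

The same join/filter/aggregate pattern carries over directly to PySpark SQL, where the query would run against Spark DataFrames registered as temporary views.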
