Company: Jobs via eFinancialCareers
Location: Dallas
Closing Date: 04/11/2024
Salary: £100 - £125 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description
Our challenge
We are seeking an experienced Python Developer with a strong background in PySpark to join our data engineering team. The ideal candidate will have a robust understanding of big data processing, experience with Apache Spark, and a proven track record in Python programming. You will be responsible for developing scalable data processing and analytics solutions in a cloud environment.
The Role
Responsibilities:
- Design, build, and maintain scalable and efficient data processing pipelines using PySpark.
- Develop high-performance algorithms, predictive models, and proof-of-concept prototypes.
- Work closely with data scientists and analysts to transform data into actionable insights.
- Write reusable, testable, and efficient Python code.
- Optimize data retrieval and develop dashboards and reports for business stakeholders.
- Implement data ingestion, data cleansing, deduplication, and data consolidation processes (see the sketch after this list).
- Leverage cloud-based big data services and architectures (AWS, Azure, or GCP) for processing large datasets.
- Collaborate with cross-functional teams to define and refine data and analytics requirements.
- Ensure systems meet business requirements and industry practices for security and privacy.
- Stay updated with the latest innovations in big data technologies and PySpark enhancements.
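
For illustration only, here is a minimal PySpark sketch of the kind of pipeline described above, covering ingestion, cleansing, deduplication, and consolidation. The dataset, paths, and column names (an "events" feed with id, user_id, amount, and event_ts fields) are assumptions for the example, not part of this role's actual stack.

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("events-pipeline").getOrCreate()

# Ingest: read a raw feed (path and format are illustrative).
raw = spark.read.json("s3://example-bucket/raw/events/")

# Cleanse: drop records missing required fields and normalize types.
cleansed = (
    raw.dropna(subset=["id", "event_ts"])
       .withColumn("amount", F.col("amount").cast("double"))
       .withColumn("event_ts", F.to_timestamp("event_ts"))
)

# Deduplicate: keep one row per business key.
deduped = cleansed.dropDuplicates(["id"])

# Consolidate: aggregate per user per day for downstream analytics.
daily = (
    deduped.groupBy("user_id", F.to_date("event_ts").alias("event_date"))
           .agg(F.sum("amount").alias("total_amount"),
                F.count("*").alias("event_count"))
)

# Persist in a partitioned columnar format.
daily.write.mode("overwrite").partitionBy("event_date").parquet(
    "s3://example-bucket/curated/daily_user_totals/")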
Requirements:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- Minimum of 3 years of experience in Python development.
- Strong experience with Apache Spark and its components (Spark SQL, Streaming, MLlib, GraphX) using PySpark.
- Demonstrated ability to write efficient, complex queries against large data sets (see the query sketch after this list).
- Knowledge of data warehousing principles and data modeling concepts.
- Proficient understanding of distributed computing principles.
- Experience with at least one cloud provider (AWS, Azure, GCP), including their big data processing services.
- Strong problem-solving skills and ability to work under tight deadlines.
- Excellent communication and collaboration abilities.
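
As a hedged illustration of the "efficient, complex queries" requirement: a window function that returns each customer's most recent order in a single pass, avoiding a self-join. The table location and column names (customer_id, order_ts) are hypothetical.

from pyspark.sql import SparkSession, functions as F
from pyspark.sql.window import Window

spark = SparkSession.builder.appName("latest-orders").getOrCreate()
orders = spark.read.parquet("s3://example-bucket/curated/orders/")

# Rank each customer's orders by recency, then keep only the top row.
w = Window.partitionBy("customer_id").orderBy(F.col("order_ts").desc())

latest_orders = (
    orders.withColumn("rn", F.row_number().over(w))
          .filter(F.col("rn") == 1)
          .drop("rn")
)

Written as a self-join or correlated subquery, the same result typically forces an extra shuffle over the large table; the window form is the idiomatic single-shuffle approach.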
It would be great if you also had:
- Experience with additional big data tools like Hadoop, Kafka, or similar technologies.
- Familiarity with machine learning frameworks and libraries.
- Experience with data visualization tools and libraries.
- Knowledge of containerization and orchestration technologies (Docker, Kubernetes).
- Contributions to open-source projects or a strong GitHub portfolio showcasing relevant projects.