Data Engineer

Company: Oregon Health & Science University
Location: Washington
Closing Date: 22/10/2024
Salary: £125 - £150 Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

The CDC Foundation helps the Centers for Disease Control and Prevention (CDC) save and improve lives by unleashing the power of collaboration between CDC, philanthropies, corporations, organizations and individuals to protect the health, safety and security of America and the world. The CDC Foundation is the go-to nonprofit authorized by Congress to mobilize philanthropic partners and private-sector resources to support CDC’s critical health protection mission. Since 1995, the CDC Foundation has raised over $1.9 billion and launched more than 1,300 programs impacting a variety of health threats from chronic disease conditions including cardiovascular disease and cancer, to infectious diseases like rotavirus and HIV, to emergency responses, including COVID-19 and Ebola. The CDC Foundation managed hundreds of programs in the United States and in more than 90 countries last year. Visit for more information.

Overview

The Data Engineer will play a crucial role in advancing the CDC Foundation’s mission by designing, building, and maintaining modern data infrastructure for the Northwest Portland Area Indian Health Board (NPAIHB) Data Hub project. This role is aligned to the Workforce Acceleration Initiative (WAI). WAI is a federally funded CDC Foundation program with the goal of helping the nation’s public health agencies by providing them with the technology and data experts they need to accelerate their information system improvements.

Working closely with the Data Hub Team, the Data Engineer will create the architecture needed for data storage, processing, analysis, and secure transfer to Tribal Leaders and public health professionals. The Data Engineer will collaborate with epidemiologists, data content experts, IT staff, the Data Hub Project Director, and others to develop and implement scalable solutions that align with the objectives of the NPAIHB’s Data Hub project.

NPAIHB’s Data Hub Team is currently developing a system, “The NW Tribal Data Hub,” to provide comprehensive, user-friendly public health data dashboards for its 43 member Tribes. The Data Engineer will ensure the successful design and implementation of a newly created public health database and the ingestion of additional data into the system, and will create tables, views, and other database structures to support epidemiological analysis, visualization, and reporting to Tribes. The data, sourced primarily from state and federal agencies, include vital statistics (births, deaths), cancer registries, emergency department data, clinical service data, and others. The Data Engineer’s work will be pivotal in enhancing the capacity of Tribal public health departments to conduct data-driven activities, advancing Tribal data sovereignty, and empowering Tribes to improve health outcomes within their communities.

NPAIHB is a tribally owned and operated non-profit organization serving the 43 federally recognized Tribes in the states of Idaho, Oregon, and Washington. Led by the organization’s Board of Directors, NPAIHB’s mission is to “eliminate health disparities and improve the quality of life of American Indians and Alaska Natives by supporting Northwest Tribes in their delivery of culturally appropriate, high-quality health programs and services.” NPAIHB is a mission-driven organization with a staff of over 120 professionals dedicated to advancing Tribal health for the 7th generation in the Pacific Northwest.

The Data Engineer will be hired by the CDC Foundation and assigned to the Data Hub Team at NPAIHB. This position is eligible for a fully remote work arrangement for U.S. based candidates.

Responsibilities

  • Design a data hub roadmap to streamline secure and reliable data management, including ingestion, processing, and storage through enhancements or implementation of new systems and pipelines.
  • Load data into storage systems or data warehouses, transforming, cleaning, and organizing with dimensional modeling techniques to ensure accuracy, consistency, and efficient querying.
  • Transform and structure data to ensure it is optimized for use in data visualization software, enabling accurate and effective visual representations of epidemiological data.
  • Collaborate closely with the project epidemiologist to ensure they gain a comprehensive understanding of the data pipeline architecture and data engineering methods to support long-term maintenance and sustainability of the system.
  • Collaborate closely with the project epidemiologist to understand data requirements and ensure that data infrastructure and workflows align with epidemiological needs.
  • Ensure thorough and clear documentation of database architecture and workflows to promote sustainability, consistency, and ease of maintenance.
  • Define business rules around data governance for the Data Hub. Apply rigorous data quality checks and validation processes to guarantee the accuracy and reliability of the data released, emphasizing the importance of delivering correct and trustworthy data to support public health initiatives.
  • Optimize data pipelines, infrastructure, and workflows for performance and scalability.
  • Monitor data pipelines and systems for performance issues, errors, and anomalies, and implement solutions to address them.
  • Analyze and interpret datasets to identify data management needs and advise on data management strategy.
  • Implement security measures to protect sensitive information.
  • Collaborate with epidemiologists, analysts, and other partners to understand current and future data needs and requirements, and to ensure that the data infrastructure supports the organization’s goals and objectives.
  • Collaborate with cross-functional teams to understand data requirements and to design and implement scalable database solutions in accordance with end users’ business needs.
  • Implement and maintain ETL processes to ensure the accuracy, completeness, and consistency of data.
  • Design and manage data storage systems, including migration of SAS datasets to a PostgreSQL relational database.
  • Apply knowledge about industry trends, best practices, and emerging technologies in data engineering, and incorporate the trends into the organization’s data infrastructure.
  • Provide technical guidance to other staff on preparing and structuring data for visualization, leveraging knowledge of visualization tools to support the creation of meaningful and insightful visual outputs.
  • Communicate effectively with partners at all levels of the organization to gather requirements, provide updates, and present findings.

Qualifications

Required Qualifications

  • Bachelor’s degree in Computer Science, Information Technology, Data Science, or a related field.
  • Minimum of five (5) years of related informatics experience, preferably with three (3) years of experience in a lead data engineer position.
  • Demonstrated expertise in building SQL relational databases and transitioning non-relational data into a structured relational format, ensuring seamless integration and optimized performance.
  • Proficiency in SQL programming and other languages commonly used in data engineering, such as Python, Java, or Scala. The candidate should be able to implement data automations within existing frameworks rather than writing one-off scripts.
  • Experience transforming and preparing data into formats suitable for data visualization software, ensuring it is structured for optimal use in dashboards and other visual outputs.
  • Strong understanding of database systems, including relational databases (e.g., MySQL, PostgreSQL) and NoSQL databases (e.g., MongoDB, Cassandra), with PostgreSQL preferred.
  • Experience with engineering best practices such as source control, automated testing, continuous integration and deployment, and peer review, and experience serving as a subject matter expert on these topics.
  • Knowledge of data warehousing concepts and tools.
  • Experience with cloud computing platforms, with a preference for experience in an AWS environment.
  • Expertise in data modeling, ETL (Extract, Transform, Load) processes, and data integration techniques.
  • Familiarity with agile development methodologies, software design patterns, and best practices.
  • Strong analytical thinking and problem-solving abilities.
  • Excellent verbal and written communication skills, including the ability to convey technical concepts to non-technical partners effectively.
  • Flexibility to adapt to evolving project requirements and priorities.
  • Outstanding interpersonal and teamwork skills, and the ability to develop productive working relationships with colleagues and partners.
  • Experience working in a virtual environment with remote partners and teams.
  • Proficiency in Microsoft Office.
  • Ability to travel occasionally for in-person meetings (travel costs will be covered by NPAIHB).

Preferred Qualifications

  • Experience facilitating data requirements gathering sessions to support data modeling plans.
  • Experience planning and designing database models based on business data requirements.
  • Experience working with complex public health, health care, or other non-business data requiring advanced processing and analysis techniques.
  • Experience transitioning SAS datasets and analyses into relational database structures.
  • Experience building data pipelines within Amazon Web Services (AWS), using services such as Amazon Relational Database Service (RDS), Amazon Aurora Serverless, AWS Glue, and AWS Lambda.
  • Experience creating complex fields and visuals in Amazon QuickSight or similar data visualization tools (Tableau, Microsoft Power BI, etc.).
  • Experience with dimensional modeling in scenarios where dimensions and fields change over time.
  • Experience with implementing data suppression techniques and familiarity with HIPAA, PHI, and other data confidentiality regulations.
  • Experience providing mentorship, training, and knowledge transfer of data engineering techniques to build the organization’s capacity for ongoing system management and development.