Pyx Health is looking for a motivated and technically solid Data Engineer to join our growing Data & Analytics team. In this role, you will contribute to building and maintaining our data infrastructure on Azure, working alongside senior engineers to develop reliable data pipelines that support our healthcare solutions. You will work with Databricks, Airflow, Python, and SQL to implement and improve data workflows in a collaborative, high-growth environment. Only candidates residing in the USA may apply.
KEY RESPONSIBILITIES:
Pipeline Development & Maintenance
• Build and maintain data pipelines for ingesting, transforming, and cleaning healthcare data in the Azure cloud using Databricks, PySpark, and Delta Lake
• Implement pipeline logic from defined specifications, with guidance from senior engineers on architectural decisions
• Monitor pipelines for failures and performance issues, escalating complex problems appropriately
Orchestration & Automation
• Develop and maintain Airflow DAGs with appropriate error handling and retry logic
• Support deployment and configuration of pipelines via Astronomer on Azure using the Astro CLI
• Contribute to improving pipeline reliability and reducing manual intervention
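As a rough illustration of the orchestration work above, an Airflow DAG with retry logic might look like the following sketch. All names (the DAG id, the task, the schedule) are invented for illustration and are not Pyx Health's actual pipelines.

```python
# Illustrative Airflow DAG with error handling and retry logic.
# DAG id, task names, and schedule are hypothetical examples.
from datetime import datetime, timedelta

from airflow.decorators import dag, task

default_args = {
    "retries": 3,                           # retry transient failures
    "retry_delay": timedelta(minutes=5),    # wait between attempts
    "retry_exponential_backoff": True,      # back off on repeated failures
}

@dag(
    dag_id="example_healthcare_ingest",
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args=default_args,
)
def example_healthcare_ingest():
    @task
    def ingest():
        # Pull source files into the lake; raising an exception here
        # triggers Airflow's retry behavior configured above.
        ...

    ingest()

example_healthcare_ingest()
```

Deployment of DAG files like this one is what the Astronomer/Astro CLI workflow mentioned above manages.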
Data Modeling & Storage
• Implement data models for efficient storage and retrieval using Delta Lake, including merge/upsert patterns
• Apply Change Data Capture (CDC) patterns under the direction of senior team members
• Write and optimize T-SQL for SQL Server operations, stored procedures, and ETL support
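The merge/upsert pattern named above has simple semantics: match incoming rows to the target on a key, update the matches, and insert the rest. Here is a minimal plain-Python sketch of those semantics (the field names are invented; a real pipeline would express this with Delta Lake's MERGE, not dictionaries).

```python
# Plain-Python sketch of merge/upsert semantics: rows that match the
# target on the key are updated, unmatched rows are inserted.
# Field names ("member_id", "plan") are hypothetical.

def upsert(target: dict, updates: list[dict], key: str = "member_id") -> dict:
    """Merge `updates` into `target` keyed by `key`: update on match, else insert."""
    merged = dict(target)
    for row in updates:
        merged[row[key]] = row  # matched key -> overwrite; new key -> insert
    return merged

target = {
    "A1": {"member_id": "A1", "plan": "basic"},
    "A2": {"member_id": "A2", "plan": "plus"},
}
updates = [
    {"member_id": "A2", "plan": "premium"},  # matched: updated
    {"member_id": "A3", "plan": "basic"},    # unmatched: inserted
]
result = upsert(target, updates)
```

In PySpark with Delta Lake, the same pattern maps onto `DeltaTable.merge(...)` chained with `whenMatchedUpdateAll()` and `whenNotMatchedInsertAll()`.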
Azure Infrastructure
• Work within the Azure ecosystem including ADLS Gen2, Key Vault, Logic Apps, and Azure DevOps
• Follow established security and scalability standards when building data infrastructure
• Support infrastructure tasks with guidance from senior engineers on architecture
Data Quality & Monitoring
• Implement data quality checks and validation logic within pipelines
• Troubleshoot and resolve pipeline failures, documenting root causes and resolutions
• Contribute to monitoring and alerting improvements
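Data quality checks of the kind described above often start as row-level validation inside a pipeline step. The sketch below collects all rule violations per record instead of failing on the first error; the column names and rules are invented for illustration.

```python
# Minimal row-level data quality check: return every rule violation
# for a record, so a pipeline can quarantine or report bad rows.
# Column names ("member_id", "age") and rules are illustrative only.

def validate_row(row: dict) -> list[str]:
    """Return a list of human-readable violations for one record."""
    errors = []
    if not row.get("member_id"):
        errors.append("member_id is missing")
    age = row.get("age")
    if age is None or not (0 <= age <= 120):
        errors.append(f"age out of range: {age!r}")
    return errors

rows = [
    {"member_id": "A1", "age": 42},
    {"member_id": "", "age": 130},
]
# Report only rows with at least one violation, keyed by row index.
report = {i: errs for i, r in enumerate(rows) if (errs := validate_row(r))}
```

In a pipeline, a non-empty report would typically feed the monitoring and alerting improvements mentioned above.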
Collaboration & Documentation
• Work closely with data scientists, analysts, and business stakeholders to understand data needs
• Participate in code reviews, incorporating feedback to improve code quality
• Document pipelines, processes, and implementation decisions clearly and consistently
• Stay current with data engineering technologies, particularly in the healthcare space