Sr Data Engineer

Pyx Health

Data Science

Tucson, AZ, USA

USD 150k-170k / year

Posted on May 8, 2026

Location: Remote
Base Pay: $150,000.00 - $170,000.00 / Year
Employee Type: FT Exempt

Description

Pyx Health, a high-growth startup, is looking for a talented and motivated Senior Data Engineer to join our team. In this role, you will play a pivotal part in building and maintaining our data infrastructure on Azure. You’ll use your expertise in Databricks, Airflow, Python, and SQL to design, develop, and implement robust data pipelines that power our healthcare solutions. Only candidates residing in the USA may apply.

KEY RESPONSIBILITIES:

· Pipeline Design & Development: Design, build, and maintain end-to-end data pipelines for ingesting, transforming, and cleaning large volumes of healthcare data in the Azure cloud using Databricks, PySpark, and Delta Lake.

· Orchestration & Automation: Develop and maintain production-grade Airflow DAGs (including dynamic DAGs) with robust error handling and retry logic. Deploy and configure pipelines via Astronomer on Azure using the Astro CLI.

· Data Modeling & Optimization: Design and optimize data models for efficient storage and retrieval using Delta Lake, including merge/upsert patterns and Change Data Capture (CDC) strategies. Leverage T-SQL for SQL Server Change Tracking, stored procedures, and source-side ETL debugging.

· Azure Infrastructure: Work across the Azure ecosystem including ADLS Gen2, Key Vaults, Logic Apps, Azure DevOps, and Astronomer on Azure to build secure, scalable data infrastructure.

· Data Quality & Security: Develop and implement automation scripts and tools to ensure data quality, security, and scalability. Monitor and troubleshoot pipelines, proactively identifying and resolving issues.

· Collaboration & Mentorship: Collaborate with data scientists, analysts, and business stakeholders to translate data needs into effective solutions. Participate in code reviews and provide technical mentorship to junior team members.

· Documentation: Document data pipelines, processes, and architectural decisions for clarity and maintainability. Stay current with data engineering technologies and trends, particularly in the healthcare domain.

Requirements

· Minimum 5 years of experience as a Data Engineer with strong expertise in Azure cloud services (ADLS, Synapse Analytics, Data Factory, or similar).

· In-depth knowledge of SQL Server, including T-SQL scripting, data modeling, and query optimization.

· Proven ability to start, run, manage, and complete a technical project with minimal project management oversight.

· Prior experience as a lead, supervisor, or senior individual contributor with mentorship responsibilities.

· Strong cross-functional communication skills; able to work effectively with business partners across all branches of the organization.

· Experience with CI/CD pipelines, DevOps principles, and version control systems (Git, Azure DevOps).

· Excellent root cause analysis and problem-solving skills.

· Familiarity with healthcare data standards and regulations (HIPAA, HL7, etc.) is a strong plus.

· Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field (or equivalent work experience).

MUST HAVES:

· Databricks SQL (Advanced): Delta Lake, merge/upsert, CDC patterns.

· Python (Advanced): Data engineering-grade pipeline logic, CDC ingestion, file egress, SQL migration runners, PySpark/Spark DataFrame API.

· Databricks Spark Notebooks (Advanced): Notebook-based pipeline development, cluster configuration, Delta table operations.

· Airflow Python Development (Advanced): Production DAGs, dynamic DAGs, error handling, retries.

· Airflow Astro Configuration (Mid): Astronomer deployment, Astro CLI, connection/variable management.

· Azure Ecosystem (Mid): Key Vaults, Logic Apps, Astronomer on Azure, ADLS Gen2, Azure DevOps.

· T-SQL (Mid): SQL Server Change Tracking, stored procedures, source-side ETL debugging.

NICE TO HAVES:

· Azure Data Factory (Mid): Pipeline authoring, triggers, integration runtimes.

· Git + Azure DevOps CI/CD (Mid): Branching, PRs, notebook/SQL migration deployments.
