Senior Data Engineer | M.S. Statistics | Miami, FL
Building scalable data platforms and analytics infrastructure for healthcare and fintech.
Founder of DAR Analytics, a data engineering consultancy focused on designing modern data architectures, implementing robust ELT pipelines, and enabling data-driven decision-making for organizations in regulated industries.
I design and build end-to-end data platforms with a focus on reliability, scalability, and cost efficiency. My work spans the full data lifecycle — from ingestion and transformation to modeling and observability.
Core areas:
- Data Modeling & Transformation — Dimensional modeling, slowly changing dimensions, and modular SQL transformations with dbt on Snowflake
- Pipeline Orchestration — Production Airflow DAGs with complex dependency management, SLA monitoring, and retry strategies (Astronomer)
- Cloud Data Infrastructure — AWS-native architectures using S3, EMR Serverless (PySpark), and Snowflake as the central warehouse
- Data Quality — Schema enforcement, freshness checks, and automated testing integrated into the transformation layer
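To make the data quality item concrete, here is a minimal Python sketch of schema enforcement and a freshness check. It is illustrative only and not taken from any project listed here: the column names, the `EXPECTED_SCHEMA` mapping, and both helper functions are hypothetical, and in a dbt project these checks would more typically be expressed as tests and source freshness rules.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical expected schema for a claims staging table: column -> Python type.
EXPECTED_SCHEMA = {"claim_id": str, "member_id": str, "paid_amount": float}


def schema_violations(row, schema=EXPECTED_SCHEMA):
    """Return a list of human-readable problems for one row (empty if clean)."""
    problems = []
    for column, expected_type in schema.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(f"bad type for {column}: {type(row[column]).__name__}")
    return problems


def is_fresh(last_loaded_at, max_age, now=None):
    """Return True if the most recent load is no older than max_age."""
    now = now or datetime.now(timezone.utc)
    return now - last_loaded_at <= max_age


if __name__ == "__main__":
    print(schema_violations({"claim_id": "C1", "paid_amount": "10.5"}))
    print(is_fresh(datetime.now(timezone.utc) - timedelta(hours=2), timedelta(hours=24)))
```

Returning a list of problems instead of raising on the first one keeps every violation for a row visible in a single pass, which tends to be more useful when triaging a bad load.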
Sources (APIs, S3, SFTP, Databases)
               │
               ▼
┌─────────────────────────────┐
│  Airflow (Astronomer)       │  Orchestration layer
│  ├── Ingestion DAGs         │  API pulls, S3 discovery, SFTP sync
│  ├── EMR Serverless Jobs    │  Heavy transformations (PySpark)
│  └── dbt Orchestration      │  Model runs, tests, snapshots
└─────────────────────────────┘
               │
               ▼
┌─────────────────────────────┐
│  Snowflake                  │  Central data warehouse
│  ├── RAW / Landing          │  COPY INTO from S3, Snowpipe
│  ├── STAGING                │  dbt staging models (cleaning, typing)
│  ├── DATA MART              │  Business-layer dimensional models
│  └── HYBRID                 │  Cross-domain joined datasets
└─────────────────────────────┘
               │
               ▼
BI / Analytics / Reverse ETL
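The layering in the diagram can be read as a dependency graph. The following minimal Python sketch derives a valid run order with the standard library's `graphlib`. It assumes each Snowflake layer depends on the one above it and that HYBRID is built after DATA MART; that ordering, the node names, and the `run_order` helper are assumptions for illustration, not the actual DAG definitions.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each key is built after every node in its set.
LAYER_DEPENDENCIES = {
    "raw": {"sources"},
    "staging": {"raw"},
    "data_mart": {"staging"},
    "hybrid": {"data_mart"},  # assumption: cross-domain joins read from the marts
    "bi": {"data_mart", "hybrid"},
}


def run_order(dependencies):
    """Return the nodes in an order that respects every dependency."""
    return list(TopologicalSorter(dependencies).static_order())


if __name__ == "__main__":
    for step in run_order(LAYER_DEPENDENCIES):
        print(step)
```

An orchestrator such as Airflow resolves the same kind of graph when scheduling tasks; the sketch only shows the ordering logic, not scheduling, retries, or SLAs.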
Healthcare
- Claims data pipelines, clinical data integration, HIPAA-compliant architectures
- Master Data Management (MDM) for patient and provider entities
- Analytics platforms supporting population health and operational reporting
Fintech
- Transaction processing and fraud detection data models
- Real-time and batch pipelines for marketplace and auction platforms
- Competitor intelligence and pricing analytics at scale (EMR Serverless)
Production-grade dbt project for healthcare claims analytics on Snowflake. Dimensional modeling, incremental processing, data quality testing, SCD Type 2 snapshots.
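For readers unfamiliar with SCD Type 2, the idea is that a changed record closes its current version and opens a new one, so history is preserved. dbt snapshots do this in SQL; the Python sketch below only illustrates the row-versioning logic. The `apply_scd2` function and the `valid_from` / `valid_to` column names are hypothetical and are not taken from the project.

```python
from datetime import date


def apply_scd2(history, incoming, key, tracked, as_of):
    """Return a new history list with Type 2 changes applied.

    history:  list of dicts holding the key, the tracked columns,
              valid_from, and valid_to (None marks the current version).
    incoming: list of dicts holding the key and the tracked columns.
    """
    result = [dict(row) for row in history]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["valid_to"] is None}

    for record in incoming:
        existing = current.get(record[key])
        if existing is not None:
            if all(existing[col] == record[col] for col in tracked):
                continue  # nothing changed; keep the current version open
            existing["valid_to"] = as_of  # close the superseded version
        new_row = {key: record[key], "valid_from": as_of, "valid_to": None}
        new_row.update({col: record[col] for col in tracked})
        result.append(new_row)
    return result


if __name__ == "__main__":
    history = [{"provider_id": "P1", "specialty": "cardiology",
                "valid_from": date(2023, 1, 1), "valid_to": None}]
    incoming = [{"provider_id": "P1", "specialty": "oncology"},
                {"provider_id": "P2", "specialty": "radiology"}]
    for row in apply_scd2(history, incoming, "provider_id", ["specialty"], date(2024, 1, 1)):
        print(row)
```

Re-running the function with the same incoming records adds no rows, which is the behavior you want from a snapshot that may be executed repeatedly.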
Production-ready Airflow DAGs for healthcare data engineering. EMR Serverless batch processing, S3-to-Snowflake ingestion, dbt orchestration, API data pulls.
Infrastructure as Code for Snowflake. Modular Terraform managing warehouses, databases, RBAC roles, grants, and S3 storage integrations across environments.
Data lakehouse with Medallion architecture (Bronze/Silver/Gold). Spark Structured Streaming + Kafka for fintech transactions, Delta Lake, batch and real-time pipelines.