MS in Bioinformatics graduate (Data Science concentration) from Northeastern University (GPA 3.9/4.0), with experience in clinical data pipelines, genomic analysis, and machine learning.
- Clinical & Genomic Data Analysis — GWAS, case-control studies, variant calling, and epidemiological modeling
- Machine Learning for Healthcare — Predictive models for clinical outcomes using random forests, XGBoost, and SVMs
- ETL & Data Pipelines — End-to-end pipelines using Python, R, BigQuery, and PySpark on large-scale EHR datasets
- Biostatistics — Survival analysis, ANOVA, hypothesis testing, logistic regression, and study design
- Data Visualization — Dashboards and reports using Tableau, Looker, Power BI, and ggplot2
Languages & Libraries: Python, R, SQL, SAS, Stata, NumPy, Pandas, Scikit-learn, PyTorch, PySpark
Tools: BigQuery, Tableau, Looker, Power BI, REDCap, Docker, Hadoop, Git
Clinical & Research: CDISC, SDTM, ADaM, ICD-10, EHR, ETL, IRB, ICH-GCP
- Pharmacy background → Bioinformatics grad — I understand the biology behind the data, not just the numbers
- Enjoy working on problems where statistics and genomics collide — GWAS, survival analysis, clinical trial design
- Google Advanced Data Analytics Professional Certificate (2025)