Professional with 4 years of experience in Data Analysis, specializing in the end-to-end data lifecycle. Expertise includes architecting data infrastructures, performing statistical research, and implementing machine learning systems. Focused on building reliable systems that transform raw data into actionable business intelligence.
| Pillar | Focus Areas |
|---|---|
| Data Engineering | Pipeline Idempotency, Data Contracts (Pydantic), Schema-on-Read/Write. |
| Cloud Infrastructure | Infrastructure as Code (Terraform), Medallion Architecture, GCP/BigQuery. |
| Data Analysis | Statistical Hypothesis Testing, Exploratory Data Analysis (EDA), BI Serving Layers. |
| Machine Learning | Ensemble Modeling (XGBoost), MLOps (MLflow), Model Performance Auditing. |
1. Data Engineering: De-Crypto Pipeline
Infrastructure: A modular ETL engine designed for financial data ingestion.
- Implementation: Financial precision using `decimal.Decimal` and system idempotency through `Upsert` strategies.
- Quality: Dual-layer data validation firewalls and automated unit testing with Pytest.
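
A minimal sketch of how `decimal.Decimal` and an upsert can combine to make a load step idempotent. It is illustrative only and not the pipeline's actual code: the `prices` table, the `upsert_prices` helper, the in-memory SQLite database, and the sample values are all assumptions made for this example.

```python
import sqlite3
from decimal import Decimal


def upsert_prices(conn, rows):
    """Insert rows keyed by (symbol, ts); re-running the same batch changes nothing."""
    conn.executemany(
        """
        INSERT INTO prices (symbol, ts, price)
        VALUES (?, ?, ?)
        ON CONFLICT(symbol, ts) DO UPDATE SET price = excluded.price
        """,
        # Decimals are stored as text so no binary floating-point rounding occurs.
        [(symbol, ts, str(price)) for symbol, ts, price in rows],
    )
    conn.commit()


# Hypothetical schema: the primary key is what makes the upsert possible.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices ("
    "symbol TEXT NOT NULL, ts TEXT NOT NULL, price TEXT NOT NULL, "
    "PRIMARY KEY (symbol, ts))"
)

# Made-up sample values, used only to exercise the code.
batch = [
    ("BTC", "2024-01-01T00:00:00Z", Decimal("42000.12345678")),
    ("ETH", "2024-01-01T00:00:00Z", Decimal("2300.10")),
]

upsert_prices(conn, batch)
upsert_prices(conn, batch)  # Simulated retry: must not create duplicates.

row_count = conn.execute("SELECT COUNT(*) FROM prices").fetchone()[0]
stored = Decimal(
    conn.execute("SELECT price FROM prices WHERE symbol = 'BTC'").fetchone()[0]
)

# Decimal arithmetic stays exact where binary floats do not.
decimal_sum_exact = Decimal("0.1") + Decimal("0.2") == Decimal("0.3")
float_sum_exact = 0.1 + 0.2 == 0.3
```

Running the load twice leaves the table with one row per key, and the stored price round-trips without losing digits.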
2. Cloud Infrastructure: GCP Data Platform Hub
Cloud Scale: A cloud-native platform following the Medallion Architecture (Bronze, Silver, Gold).
- Automation: Infrastructure provisioned via Terraform for reproducible cloud environments.
- Analytics: Decoupled architecture using Cloud Functions and BigQuery for high-performance modeling.
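
The Bronze, Silver, Gold layering can be sketched in plain Python, independent of any cloud service. This is a simplified illustration under assumed names (`to_silver`, `to_gold`) and made-up records; the actual platform runs on Cloud Functions and BigQuery, which this sketch does not attempt to reproduce.

```python
from collections import defaultdict
from decimal import Decimal, InvalidOperation

# Bronze: raw records kept exactly as ingested (made-up sample data).
bronze = [
    {"symbol": "btc", "volume": "10.5"},
    {"symbol": "BTC", "volume": "4.5"},
    {"symbol": "eth", "volume": "not-a-number"},
    {"symbol": "ETH", "volume": "3"},
]


def to_silver(records):
    """Silver: typed, normalized records; unparseable rows are dropped."""
    silver = []
    for rec in records:
        try:
            volume = Decimal(rec["volume"])
        except InvalidOperation:
            continue  # A real pipeline might route these to a quarantine table.
        silver.append({"symbol": rec["symbol"].upper(), "volume": volume})
    return silver


def to_gold(records):
    """Gold: business-level aggregate, here total volume per symbol."""
    totals = defaultdict(Decimal)  # Decimal() defaults to 0
    for rec in records:
        totals[rec["symbol"]] += rec["volume"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Each layer reads only from the one before it, so a cleaning rule can change in Silver without touching the raw Bronze data.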
3. Data Analysis: Crypto Market Analysis
Statistical Discovery: Analysis layer connecting raw data with predictive systems.
- Research: Spearman Rank Correlation analysis of volume shocks and market reversals.
- Serving: Optimized BigQuery SQL views for centralized Business Intelligence dashboards.
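
As a hedged illustration of the research step, the snippet below computes a Spearman rank correlation with `scipy.stats.spearmanr`. The two series are synthetic and chosen to be perfectly monotonic; they are not project data and say nothing about real market behavior. The variable names are assumptions.

```python
from scipy.stats import spearmanr

# Synthetic series: as the volume shock grows, the next-period return falls.
volume_shock = [0.1, 0.4, 0.9, 1.6, 2.5, 3.6]
next_period_return = [0.05, 0.03, 0.01, -0.02, -0.04, -0.09]

# Spearman correlates ranks rather than raw values, so it captures any
# monotonic relationship, not only a linear one.
rho, p_value = spearmanr(volume_shock, next_period_return)

print(f"rho={rho:.2f}, p={p_value:.4f}")
```

Because the synthetic relationship is strictly decreasing, `rho` comes out at -1; real market data would produce a weaker coefficient whose p-value has to be interpreted with care.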
4. Machine Learning: Crypto ML Predictor
Predictive Modeling: System for classifying structural market anomalies.
- Inference: Deployment of XGBoost Classifiers to map market regimes with high precision.
- Tracking: Integration of MLflow for experiment tracking and model auditing.
- Languages: Python (Pandas, Scikit-learn, Scipy), SQL (PostgreSQL, BigQuery).
- Tools & Platforms: Docker, Terraform, GCP, MLflow, Git, Pytest.
- Methodologies: ADR (Architecture Decision Records), CI/CD, Statistical Modeling, DataOps.
Jose Cortes Ramos - Engineering Reliable and Intelligent Data Systems