Data Engineer | Technical Writer | Building Scalable, Reliable Data Pipelines | Cloud & Workflow Automation
With a passion for modern data stack tooling, I specialize in building production-ready data pipelines using Python, Apache Flink, dbt, and cloud-native GCP services, with a focus on clean, maintainable streaming and batch processing, orchestration, and infrastructure-as-code.
Core scripting and advanced querying for data engineering workflows.
Building robust batch and real-time data ingestion pipelines at scale.
Cloud data architecture and modern data warehousing solutions.
Orchestrating reliable, production-grade data workflows.
- Stream Processing: Apache Flink / PyFlink, Redpanda (Kafka-compatible)
- Data Ingestion: dlt (data load tool), PySpark, REST API pipelines
- Orchestration & Workflow: Apache Airflow, Kestra, Prefect
- Data Transformation: dbt Cloud, SQL
- Infrastructure & Deployment: Docker, Terraform, GCS, BigQuery
- CI/CD: GitHub Actions
- Version Control: Advanced Git
- Monitoring & Reliability: Structured logging, pipeline health checks, alerting
- Documentation: Pipeline lineage, runbooks, data dictionaries
A production-grade hybrid streaming and batch cryptocurrency analytics pipeline on GCP, tracking BTC, ETH, SOL, BNB, and ADA in real time at near-zero infrastructure cost (~$0.01/month).
Impact: Delivers live price aggregations with ~1 minute latency alongside enriched daily market context (market cap, OHLC candles, 24h change %) — all surfaced in a public, auto-refreshing Grafana Cloud dashboard.
Key Challenge: Designing a dual-lane architecture that keeps compute costs at zero by running Flink, Redpanda, and Airflow locally in Docker while using GCP only for storage — replacing expensive BigQuery Streaming Inserts with free Load Jobs via a GCS JSONL intermediate layer.
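A minimal sketch of that load-job pattern with the google-cloud-bigquery client; the bucket, dataset, and table names below are hypothetical placeholders:

```python
# Sketch: load JSONL files from GCS into BigQuery with a (free) Load Job
# instead of paid Streaming Inserts. Bucket and table IDs are hypothetical.
from google.cloud import bigquery

client = bigquery.Client()

job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
    write_disposition=bigquery.WriteDisposition.WRITE_APPEND,
    autodetect=True,  # infer the schema from the JSONL records
)

# The wildcard URI picks up every window file the streaming job has flushed.
load_job = client.load_table_from_uri(
    "gs://coinpulse-raw/streaming/*.jsonl",    # hypothetical bucket/prefix
    "my-project.coinpulse.raw_price_windows",  # hypothetical table ID
    job_config=job_config,
)
load_job.result()  # blocks until the load succeeds or raises
```

Load Jobs are free within quota while Streaming Inserts bill per ingested byte; the trade-off is a minute or two of extra latency, which the 1-minute windows already tolerate.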
Architecture:
- Streaming lane: Binance WebSocket → Python Producer → Redpanda → PyFlink (1-min tumbling windows; sketched after this list) → GCS JSONL → BigQuery
- Batch lane: CoinGecko API → Airflow 7-task DAG → GCS Parquet → BigQuery
- Transformation: dbt Cloud staging views + incremental mart tables (daily @ 07:00 UTC)
- Visualization: Grafana Cloud, 6 panels, 30-second auto-refresh
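To make the streaming lane concrete, here is a minimal PyFlink Table API sketch of the 1-minute tumbling-window aggregation; the topic, field names, and paths are hypothetical, and the Kafka/filesystem connector JARs plus GCS credentials are assumed to be configured:

```python
# Sketch of the streaming lane's 1-minute tumbling-window aggregation.
from pyflink.table import EnvironmentSettings, TableEnvironment
from pyflink.table.expressions import col, lit
from pyflink.table.window import Tumble

t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Source: trade events from Redpanda (Kafka-compatible). Topic and fields
# are hypothetical.
t_env.execute_sql("""
    CREATE TABLE trades (
        symbol STRING,
        price DOUBLE,
        event_time TIMESTAMP(3),
        WATERMARK FOR event_time AS event_time - INTERVAL '5' SECOND
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'crypto-trades',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'latest-offset',
        'format' = 'json'
    )
""")

# Sink: newline-delimited JSON on GCS, later picked up by a free BigQuery
# Load Job. Bucket is hypothetical.
t_env.execute_sql("""
    CREATE TABLE gcs_jsonl_sink (
        symbol STRING,
        window_start TIMESTAMP(3),
        avg_price DOUBLE,
        high DOUBLE,
        low DOUBLE
    ) WITH (
        'connector' = 'filesystem',
        'path' = 'gs://coinpulse-raw/streaming/',
        'format' = 'json'
    )
""")

# 1-minute tumbling windows keyed by symbol.
(
    t_env.from_path("trades")
    .window(Tumble.over(lit(1).minutes).on(col("event_time")).alias("w"))
    .group_by(col("symbol"), col("w"))
    .select(
        col("symbol"),
        col("w").start.alias("window_start"),
        col("price").avg.alias("avg_price"),
        col("price").max.alias("high"),
        col("price").min.alias("low"),
    )
    .execute_insert("gcs_jsonl_sink")
)
```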
Stack: PyFlink 2.2.0 · Redpanda · Apache Airflow 2.9.2 · dbt Cloud · BigQuery · GCS · Terraform · Grafana Cloud · Python · Docker
Live Dashboard: derrickryangiggs.grafana.net | Repo: coinpulse
An end-to-end ELT pipeline ingesting World Bank external debt data (JEDH + QEDS datasets) into BigQuery, with dbt Cloud transformations and a Looker Studio dashboard tracking sovereign debt trends across 120+ countries.
Impact: Automated quarterly ingestion of World Bank IDS data, surfacing debt-to-GNI ratios, creditor composition, and external debt stock trends across developing economies in an interactive public dashboard.
Key Challenge: Fixing inflated, double-counted debt totals in staging models, caused by World Bank aggregate region codes being included alongside country-level records; the corrected figures were verified against published World Bank numbers.
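A hedged sketch of the ingestion call using wbgapi's option to exclude aggregates; the series code, time range, and GCS path are hypothetical choices:

```python
# Sketch: pull a World Bank debt series with wbgapi and exclude aggregate
# region codes so they are not double-counted alongside country rows.
# Series code, time range, and GCS path are hypothetical.
import wbgapi as wb

df = wb.data.DataFrame(
    "DT.DOD.DECT.CD",    # external debt stocks, total (example series)
    time=range(2000, 2025),
    skipAggs=True,       # drop aggregates like regions and income groups
    labels=True,         # keep human-readable economy names
)

# Writing straight to GCS requires the gcsfs and pyarrow packages.
df.to_parquet("gs://sovereign-debt-raw/ids/external_debt.parquet")
```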
Architecture:
- Ingestion: wbgapi Python library → Apache Airflow (CeleryExecutor, Docker Compose) → GCS Parquet → BigQuery
- Transformation: dbt Cloud (staging → mart layer, incremental models)
- Visualization: Looker Studio connected to BigQuery mart tables
Stack: Python · Apache Airflow · dbt Cloud · BigQuery · GCS · PySpark · Docker · Looker Studio
Repo: sovereign-debt-observatory
A cloud-native batch pipeline analyzing global tech ecosystem health by correlating layoffs trends with YC startup activity, built entirely on GCP with infrastructure-as-code.
Impact: Enables macro-level analysis of tech sector cycles — surfacing patterns between funding activity, layoff waves, and startup formation rates in a Looker Studio dashboard refreshed on a weekly schedule.
Key Challenge: Joining two independently sourced datasets (Layoffs.fyi + YC company data) with different granularities and update cadences into a coherent, time-aligned analytical model without double-counting events across reporting periods.
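One way to avoid that double-counting, sketched here with pandas on illustrative mini-frames, is to roll both sources up to a shared monthly grain before joining:

```python
# Sketch: align two differently grained sources to a monthly grain so each
# event lands in exactly one reporting period. Data is illustrative.
import pandas as pd

layoffs = pd.DataFrame({
    "date": pd.to_datetime(["2023-01-05", "2023-01-20", "2023-02-03"]),
    "laid_off": [120, 300, 80],
})
yc = pd.DataFrame({
    "company": ["a", "b", "c"],
    "batch_start": pd.to_datetime(["2023-01-01", "2023-01-01", "2023-06-01"]),
})

# Roll both up to calendar months (month-start index).
layoffs_monthly = layoffs.resample("MS", on="date")["laid_off"].sum()
yc_monthly = yc.groupby(
    yc["batch_start"].dt.to_period("M").dt.to_timestamp()
).size()

# Outer-join on the shared monthly index; missing months become zero.
aligned = pd.concat(
    [layoffs_monthly.rename("laid_off"), yc_monthly.rename("new_yc_companies")],
    axis=1,
).fillna(0)
print(aligned)
```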
Architecture:
- Ingestion: REST APIs + CSV sources → Kestra workflow orchestration → GCS
- Transformation: dbt Cloud (staging → mart layer)
- Infrastructure: Terraform (GCS bucket, BigQuery datasets, IAM)
- Visualization: Looker Studio
Stack: Python · Kestra · dbt Cloud · BigQuery · GCS · Terraform · Looker Studio · Docker
Repo: tech-ecosystem-observatory
Built a production-ready ingestion pipeline using dlt (data load tool) to extract, normalize, and load NYC taxi trip data into a cloud data warehouse.
Impact: Automated end-to-end data loading with schema inference, incremental loading, and built-in data quality checks.
Key Challenge: Handling schema evolution across different taxi dataset versions while maintaining idempotent, reliable loads.
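A minimal sketch of that pattern in dlt; the endpoint, primary key, and cursor column are hypothetical, but a merge write disposition plus an incremental cursor is what makes re-runs idempotent:

```python
# Sketch: incremental, idempotent loading with dlt. Endpoint and column
# names are hypothetical placeholders.
import dlt
import requests

@dlt.resource(
    name="taxi_rides",
    write_disposition="merge",  # upsert on the primary key => safe re-runs
    primary_key="ride_id",
)
def taxi_rides(
    pickup=dlt.sources.incremental(
        "pickup_datetime", initial_value="2009-01-01T00:00:00"
    ),
):
    # Only fetch records newer than the last stored cursor value; dlt
    # infers (and evolves) the schema from the yielded records.
    resp = requests.get(
        "https://example.com/api/taxi-trips",  # placeholder endpoint
        params={"since": pickup.last_value},
        timeout=30,
    )
    resp.raise_for_status()
    yield resp.json()

pipeline = dlt.pipeline(
    pipeline_name="nyc_taxi",
    destination="bigquery",
    dataset_name="taxi_data",
)
print(pipeline.run(taxi_rides()))
```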
Stack: Python · dlt · SQL · GitHub Actions
Leveraged PySpark to process and analyze large-scale datasets using distributed computing techniques.
Impact: Applied big data processing fundamentals to transform raw datasets into structured, analysis-ready formats.
Key Challenge: Optimizing Spark jobs for performance while maintaining code clarity and reproducibility in Jupyter notebooks.
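An illustrative PySpark sketch of the kind of transform involved; the input path and column names are hypothetical:

```python
# Sketch: read raw CSV, fix types, and aggregate with PySpark.
# Input path and column names are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("batch-processing-demo").getOrCreate()

trips = (
    spark.read.option("header", True)
    .csv("data/raw/taxi/*.csv")  # placeholder path
    .withColumn("trip_distance", F.col("trip_distance").cast("double"))
    .withColumn("fare_amount", F.col("fare_amount").cast("double"))
)

# Analysis-ready summary: per-zone trip counts, revenue, and distance.
summary = trips.groupBy("pickup_zone").agg(
    F.count("*").alias("trips"),
    F.sum("fare_amount").alias("total_fares"),
    F.avg("trip_distance").alias("avg_distance"),
)
summary.show()
```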
Stack: PySpark · Python · Jupyter Notebook
Open to Remote & Hybrid Opportunities
GitHub: github.com/Derrick-Ryan-Giggs · Blog: medium.com/@derrickryangiggs · dev.to/derrickryangiggs · ryan-giggs.hashnode.dev · LinkedIn: in/ryan-giggs-a19330265
Open to collaborating on interesting data infrastructure projects and discussions about data engineering, cloud architecture, and modern data stack tooling.
Last Updated: 2026-04-24