shahfazal/README.md

Bonjour, I'm Fazal! 👋

Software Engineer | French Dev in NYC 🗽🇫🇷

Lead SWE @ Salesforce. 20+ years of deterministic systems work, now building and learning in public: agent evals, AI tooling, and more.

Find me: shahfazal.com | LinkedIn


Active Projects

ARIA-ready descriptions for civic data visualizations. Submitted to the Kaggle Gemma 4 Good Hackathon (May 2026).

Fine-tuned Gemma 4 E4B on 61 hand-curated examples, paired with a deterministic verification layer that grounds extracted numbers against source CSV when available. Numbers from the model are treated as claims to be checked, not tokens to be trusted.

Tech stack: Unsloth, TRL, PyTorch, Modal, FastAPI, Python

Shipped:

  • Fine-tuned model published to HuggingFace
  • Deterministic verifier with four states (verified / partial / unverified / structural-issue)
  • Live demo on Modal
  • Two upstream vision DPO fixes contributed to unslothai/unsloth#5196 (merged April 29, 2026)
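
To make the "claims to be checked" idea concrete, here is a minimal sketch of a number-grounding check. It is not the project's actual verifier: the function names, the regex, and especially the rule for each of the four states are assumptions made for illustration.

```python
import csv
import io
import re

# Assumption: a "claim" is any non-negative integer or decimal in the text.
NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_numbers(text):
    """Return every number found in the model's description as a float."""
    return [float(m) for m in NUMBER_RE.findall(text)]


def csv_numbers(csv_text):
    """Collect every cell of the source CSV that parses as a number."""
    values = set()
    for row in csv.reader(io.StringIO(csv_text)):
        for cell in row:
            try:
                values.add(float(cell))
            except ValueError:
                continue  # header or text cell
    return values


def verify(description, csv_text=None, tol=1e-9):
    """Classify a description into one of four states (hypothetical rules)."""
    if not description.strip():
        return "structural-issue"  # assumption: empty output is structural
    claims = extract_numbers(description)
    if csv_text is None or not claims:
        return "unverified"  # nothing to ground against, or nothing to check
    source = csv_numbers(csv_text)
    matched = [c for c in claims if any(abs(c - s) <= tol for s in source)]
    if len(matched) == len(claims):
        return "verified"
    return "partial" if matched else "unverified"
```

For example, with a CSV containing `2023,12.5` and `2024,40`, a description that cites 12.5 and 41 would come back as `partial` under these rules: one number is grounded, the other is not.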

Session browser for Claude Code. Five shipped versions:

  • v0.1: Local session explorer (parses ~/.claude/ directory)
  • v0.2: Memory browser (reads project memory states)
  • v0.3: Compaction viewer (analyzes context window compression)
  • v0.4: Resilience — environment health checks, session export endpoint, externalized pricing config
  • v0.5: Stats dashboard — D3 heatmap, per-project cost bars, cumulative cost line, date-range filter, graceful degradation on parse failures

Upcoming: Driver.js guided help tour across the nav surfaces

Tech stack: Python, Flask, Jinja2, D3.js, pytest

Use case: Browse session history, review memory evolution, analyze cost and token usage across projects.
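
A rough sketch of how graceful degradation on parse failures and an externalized pricing config can fit together. This is not the project's code: it assumes sessions are stored as JSON Lines, and the field names (`project`, `input_tokens`, `output_tokens`) and pricing keys are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical pricing config; placeholder values, not real prices.
PRICING = {"input_per_mtok": 1.0, "output_per_mtok": 2.0}


def parse_session_lines(lines):
    """Parse JSONL records, skipping bad lines instead of failing."""
    records, skipped = [], 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # blank lines are not errors
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            skipped += 1  # degrade gracefully: count it and move on
            continue
        if isinstance(record, dict):
            records.append(record)
        else:
            skipped += 1  # valid JSON but not a record
    return records, skipped


def cost_per_project(records, pricing=PRICING):
    """Sum the estimated cost per project from token counts."""
    totals = defaultdict(float)
    for record in records:
        cost = (
            record.get("input_tokens", 0) * pricing["input_per_mtok"]
            + record.get("output_tokens", 0) * pricing["output_per_mtok"]
        ) / 1_000_000
        totals[record.get("project", "unknown")] += cost
    return dict(totals)
```

Returning the skipped count alongside the records lets a dashboard render what it could parse while still reporting how much it could not.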


French municipal elections 2026 data viz.

Data sources: DVF (property prices), 2nd round results (Ministère de l'Intérieur)

Stack: Python + pandas (pipeline), Plotly.js (charts), Leaflet.js (maps), Driver.js (help tours), vanilla JS

Shipped:

  • 5 interactive tabs: quintile breakdown, abstention box plot by bloc, price distribution box plot, Paris-Lyon-Marseille choropleth, and price per m² vs abstention scatter plot with year toggle
  • 838 communes analysed, DVF 2024 + 2025
  • Guided tours, full French UI, accessibility attributes
  • Published as a reuse (réutilisation) on data.gouv.fr for the Défi 1 challenge
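
As an illustration of the quintile breakdown, the sketch below bins communes by price per m² and averages abstention within each bin. The real pipeline uses pandas; this version sticks to the standard library, and the function name and input shape are assumptions.

```python
from bisect import bisect_right
from collections import defaultdict
from statistics import mean, quantiles


def abstention_by_price_quintile(communes):
    """Group (price_per_m2, abstention_pct) pairs into price quintiles.

    Returns {quintile_index: (commune_count, mean_abstention)}, with
    quintile 0 the cheapest fifth and quintile 4 the most expensive.
    """
    prices = [price for price, _ in communes]
    cuts = quantiles(prices, n=5)  # four cut points between five bins
    bins = defaultdict(list)
    for price, abstention in communes:
        bins[bisect_right(cuts, price)].append(abstention)
    return {q: (len(vals), mean(vals)) for q, vals in sorted(bins.items())}
```

Calling it with a list of `(price_per_m2, abstention_pct)` tuples returns one entry per non-empty quintile, which maps directly onto a five-bar chart.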

Key lesson: Declarative specs upfront beat imperative iteration. Full build log coming in blog series.


Recent Contributions

Fixed search_datasets reporting resources_count: 4 for every dataset. The v2 search API returns resources as a HATEOAS link dict, so the client was counting its 4 keys instead of reading resources.total. Found while using the MCP, isolated against the live API, locked with a regression unit test, and verified end-to-end through the local MCP loop (using the call_tool.py I shipped in #100).

Impact: Consuming models now see true per-dataset resource counts instead of a constant 4.
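
The bug is easy to reproduce in isolation. The sketch below is not the actual patch: the function names are hypothetical, and the sample payload is a made-up stand-in for a v2 search result whose `resources` field is a link dict with four keys.

```python
def resources_count_buggy(dataset):
    """Counts the keys of the link dict, so it returns 4 every time."""
    return len(dataset["resources"])


def resources_count(dataset):
    """Read the real count from resources.total when given a link dict."""
    resources = dataset.get("resources")
    if isinstance(resources, dict):
        return int(resources.get("total", 0))
    if isinstance(resources, list):
        return len(resources)  # inline resource lists still work
    return 0


# Hypothetical payload shaped like a HATEOAS link; values are invented.
sample = {
    "title": "Example dataset",
    "resources": {
        "rel": "subsection",
        "href": "https://example.invalid/api/2/datasets/abc/resources/",
        "type": "GET",
        "total": 17,
    },
}
```

On `sample`, the buggy version reports 4 (the number of keys) while the fixed version reports 17.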

Reduced dev friction when testing the official French data.gouv.fr MCP server. Added a /health endpoint that runs a full MCP handshake plus tool call, and a call_tool.py script that replaces the manual 3-curl handshake with a single command.

Impact: Lowers barrier for contributors testing MCP integrations locally.
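
For context, an MCP session over JSON-RPC generally takes three messages before a tool result comes back: `initialize`, the `notifications/initialized` notification, and `tools/call`. The sketch below only builds those payloads. It is not the shipped call_tool.py; the helper name, client info, and example arguments are assumptions.

```python
import json
from itertools import count


def build_handshake(tool_name, arguments, protocol_version="2025-03-26"):
    """Build the three JSON-RPC messages for a minimal MCP tool call."""
    ids = count(1)
    initialize = {
        "jsonrpc": "2.0",
        "id": next(ids),
        "method": "initialize",
        "params": {
            "protocolVersion": protocol_version,
            "capabilities": {},
            # Hypothetical client identity, for illustration only.
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
    }
    # Notifications carry no "id" because no response is expected.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    call = {
        "jsonrpc": "2.0",
        "id": next(ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return [initialize, initialized, call]


if __name__ == "__main__":
    # The "query" argument name is an assumption about the tool's schema.
    for message in build_handshake("search_datasets", {"query": "dvf"}):
        print(json.dumps(message))
```

Sending these in order over the server's transport is what a single-command wrapper saves contributors from doing by hand.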

Filed unslothai/unsloth#5196 reporting two vision DPO blockers on Gemma 4 (tokenization hang in dataset.map + data collator schema mismatch) with reproductions and documented workaround attempts. Fix merged into Unsloth main on April 29, 2026.


Writing

Posts (and ramblings) at shahfazal.com/posts:

  • "Nobody Tests the Steering Wheel" - Why agent evals need observe-first methodology
  • "Claude Gatekeep You Yet?" - Why it's important to stop and think before handing the reins to your coding agent
  • Declarative Viz series (upcoming) - Build log from elections-municipales-2026
  • CivicInsight retrospective (upcoming) - 5 weeks, 19 sessions, what shipped and what got dropped

What I'm Working On

Next up: Decompressing from CivicInsight. Picking up backlog projects.

Backlog:

  • ADS-B + Gemma 4 voice assistant on Raspberry Pi (family collaborative project)
  • TinyDiffusion (3-phase learning project: 1D scalar diffusion → 2x2 unconditional → 2x2 conditional)
  • CodeHaiku (fine-tune Gemma 4 to write PR review comments as haiku)
  • Public agent eval demo using datagouv-mcp
  • AI workflow optimizer (analyzes Claudio session exports for inefficiency patterns)
  • Plotly a11y toolkit

ML Foundations

Before building production eval systems, rebuilt intuition from first principles:

  • TinyNet - Neural net from scratch (Python, no frameworks)
  • NYC EV LSTM - Spatio-temporal demand forecasting

These aren't production systems - they're foundational exercises to understand backprop, overfitting, and temporal modeling before applying those concepts to agent evaluation.


Philosophy: If it can't be measured, it can't be trusted. I apply 20+ years of production engineering rigor (observability, regression detection, test harness design) to the chaos of agentic systems.

Pinned

  1. datagouv-mcp (Public)

    Forked from datagouv/datagouv-mcp

    Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

    Python

  2. elections-municipales-2026 (Public)

    Analysis of French 2026 municipal elections — transport access, voter turnout, political nuance

    JavaScript

  3. gmail-takeout-rag (Public)

    A RAG POC for processing Google Takeout Gmail exports and making them searchable via natural language queries. Works both as a Jupyter notebook for exploration and as an MCP server for Cursor IDE i…

    Python

  4. hello-neural-world (Public)

    A minimal neural network learning project with a 4→3→2 architecture. Learn backpropagation, overfitting, and generalization through hands-on experimentation.

    Jupyter Notebook

  5. datagouv/datagouv-mcp (Public)

    Official data.gouv.fr Model Context Protocol (MCP) server that allows AI chatbots to search, explore, and analyze datasets from the French national Open Data platform, directly through conversation.

    Python 1.5k 125

  6. claudio (Public)

    Browse your Claude Code sessions locally. No cloud, no sync, no accounts.

    Python 1 1