HOZHENWAI/README.md

Hi, I'm Olivier 👋

Mathematician by training (PhD in extreme value theory), full-stack AI engineer in practice. I like problems that need both a whiteboard and a deploy pipeline.

🔭 These days I'm co-founder & CDO at Nabu, where I've spent the last six years building the data and ML side of a document-intelligence platform for customs & trade: turning messy, unstructured trade documents into structured, calibrated data. In practice, a lot of slow manual processing collapses into a few minutes.

What I work on

  • Document AI & RAG: LangChain / LangGraph, Weaviate & pgvector, visual-rich document understanding, custom OCR pipelines
  • LLM systems with some rigour: dynamic structured outputs, logprob-calibrated confidence scores, and MILP where it earns its keep
  • The stack around it: FastAPI, PostgreSQL, React/Vue, AWS (EKS, SageMaker), Terraform, Kubernetes, and a soft spot for observability (OpenTelemetry, Datadog, SigNoz, Grafana)

⚑ Outside work, where most of my GitHub lives

  • 📈 Quant & crypto tinkering: backtesting ideas, poking at market data, and the occasional Ethereum rabbit hole
  • 🧪 ML / RL & generative AI for fun: reinforcement learning, self-hosted LLMs, Stable Diffusion
  • 🏠 Self-hosting & homelab: Proxmox, ZFS, Grafana dashboards, and a healthy distrust of the cloud for personal stuff
  • 🗃️ Data hoarding: web archiving, media library tooling, metadata wrangling, giving everything a tidy, well-tagged home. If it can be catalogued, I've probably tried.

🛠️ A few of my own projects


⚑ Fun fact: all of this runs on three servers and ~200 TB at home, which I assure everyone is "for the homelab" and definitely not just hoarding.

💬 Always happy to talk document AI, applied ML with real math behind it, quant experiments, or homelab over-engineering.

📫 hozhenwai@gmail.com · LinkedIn · based in Strasbourg 🇫🇷

Pinned

  1. Beets-Plugin_VGMdb (Public)

    A small plugin to collect metadata from VGMdb and manage a VGMdb collection

    Python · 20 stars · 5 forks

  2. py-prisma2markdown (Public archive)

    Python

  3. scripts-utils (Public)

    Shell