TA
Focus: AI Engineering - Agentic Systems - Data Engineering - Cloud Infrastructure - Backend Systems - Engineering Leadership
Engineering at scale. I optimize data systems that handle billions of records, cut infrastructure costs, and actually last. Hands-on from product discovery through architecture and implementation to deployment.
Engineering since 2011. Pipelines, EMR clusters, Airflow DAGs, and the unglamorous work of cutting cloud costs from the inside. I profile actual bottlenecks before changing anything, and prefer durable fixes over clever ones.
Built teams too: grew my team at iPrice from 3 to 7 and mentored two engineers who were later promoted.
AWS Certified Solutions Architect - Associate (July 2022).
Backend at scale
- 20M+ monthly visitors served
- <250ms API p95 latency
- 4x crawler throughput (60K -> 250K pages/hour)
Data engineering
- 6B records at peak
- $6K+ monthly infrastructure savings (EMR $7K -> $3K, OpenSearch $2K -> self-hosted, reporting ~$1K)
- <8 h data processing time
Leadership
- Grew my team from 3 to 7, with 2 engineers later promoted
See theanh.github.io for the long versions.
- SketchNet: convolutional neural network (CNN) for hand-drawn sketches. 95.1% accuracy, 938 KB model, 1 ms inference. Live demo.
- DIA Risk Screener: five algorithms scoring the same molecule for drug-induced autoimmunity risk. The spread between their probabilities is the trust signal. 0.896 best test AUC (area under the ROC curve), 477 compounds. Live demo.
- PCA Audio Toolkit: Principal Component Analysis (PCA)-based audio denoising and lossy compression. Live demo.
- Cutting Spark shuffle cost: wide vs narrow transformations on billion-record EMR pipelines.
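
The DIA Risk Screener's trust signal can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the function name `disagreement_spread`, the choice of max minus min as the spread, and the example scores are all assumptions made for the sketch.

```python
from statistics import mean


def disagreement_spread(probabilities):
    """Summarize several models' risk probabilities for one molecule.

    Returns (mean probability, spread). Spread is taken here as
    max - min, which is an assumption; the real project may use a
    different measure such as standard deviation.
    """
    if not probabilities:
        raise ValueError("need at least one probability")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return mean(probabilities), max(probabilities) - min(probabilities)


# Hypothetical scores from five models for a single compound.
scores = [0.80, 0.75, 0.85, 0.78, 0.82]
avg, spread = disagreement_spread(scores)
print(f"mean risk: {avg:.2f}, spread: {spread:.2f}")
```

A small spread means the models agree, so the mean is easier to trust; a large spread flags a compound that deserves a closer look.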
Python - TypeScript - JavaScript - PHP - PyTorch - scikit-learn - XGBoost - LLM applications - AWS (Athena, EMR, S3, SQS, ElastiCache, Lambda, RDS) - Azure Data Warehouse - Apache Airflow - PySpark - Elasticsearch - MySQL - PostgreSQL - SQL Server - Cassandra - Laravel - Symfony - RESTful API - GraphQL - Docker - Terraform - ELK Stack
Building AI applications now: LLM-powered automation in production, and learning ML by building (see SketchNet, DIA Risk Screener). I also travel and read.
"Great code is minimal to no code."