vitamin33/README.md

Hi, I'm Vitalii — Senior Full-Stack Engineer

I build streaming-first backends and mobile apps that are easy to operate: clear APIs, solid tests, and dashboards that tell the truth. Lately I’ve focused on LLM-powered features with measurable latency, cost, and reliability.

  • Backend: Python (FastAPI, AsyncIO, Pydantic v2), SQLAlchemy 2.0, Celery, PostgreSQL, Redis
  • AI: OpenAI API, pragmatic RAG (Qdrant/pgvector), MLflow experiments, local models via Ollama / llama.cpp / vLLM
  • Ops: Docker, GitHub Actions, Prometheus + Grafana, OpenTelemetry tracing, Sentry
  • Mobile: Flutter (Dart, BLoC/Cubit, Provider, auto_route), Android/Kotlin (incl. BLE/GATT)

What I’m building now — RAGline

A public PoC that shows senior-level backend practices and LLM orchestration in one repo.

  • FastAPI service with JWT multi-tenancy, typed contracts, and SSE/WebSocket streaming
  • Idempotent writes, outbox → Redis Streams fan-out, retries, DLQ
  • Observability first: route histograms, cache hit ratio, stream lag; OTel traces across API → DB → Redis → worker
  • /chat orchestration with tool calls and RAG; switch to local models by setting OPENAI_API_BASE
  • CI/CD with lint/type/tests + ownership guards; small PRs, fast merges
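
To make the "idempotent writes, outbox" bullet concrete, here is a minimal sketch of the pattern. It is not RAGline's actual code: it uses an in-memory SQLite database from the standard library so it runs anywhere, while the real service targets PostgreSQL and Redis Streams. The table names (`items`, `outbox`), the `item.created` event type, and the helper names are hypothetical.

```python
import json
import sqlite3


def init_db(conn):
    """Create the hypothetical `items` and `outbox` tables."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS items (
            id INTEGER PRIMARY KEY,
            idempotency_key TEXT NOT NULL UNIQUE,
            payload TEXT NOT NULL
        );
        CREATE TABLE IF NOT EXISTS outbox (
            id INTEGER PRIMARY KEY,
            event_type TEXT NOT NULL,
            body TEXT NOT NULL,
            published INTEGER NOT NULL DEFAULT 0
        );
        """
    )


def create_item(conn, idempotency_key, payload):
    """Insert the item and its outbox event in one transaction.

    Returns (item_id, created). A replay with the same key returns the
    existing id and created=False, without writing a second event.
    """
    with conn:  # commits on success, rolls back on exception
        row = conn.execute(
            "SELECT id FROM items WHERE idempotency_key = ?",
            (idempotency_key,),
        ).fetchone()
        if row is not None:
            return row[0], False
        cur = conn.execute(
            "INSERT INTO items (idempotency_key, payload) VALUES (?, ?)",
            (idempotency_key, json.dumps(payload)),
        )
        item_id = cur.lastrowid
        conn.execute(
            "INSERT INTO outbox (event_type, body) VALUES (?, ?)",
            ("item.created", json.dumps({"item_id": item_id})),
        )
        return item_id, True


def drain_outbox(conn, publish):
    """Publish pending events in insertion order, then mark them published.

    Marking happens after publishing, so delivery is at-least-once:
    a crash between the two steps causes a re-publish on the next drain.
    """
    rows = conn.execute(
        "SELECT id, event_type, body FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for event_id, event_type, body in rows:
        publish(event_type, json.loads(body))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (event_id,))
    return len(rows)
```

In a real deployment, `publish` would append to a Redis Stream, and the `UNIQUE` constraint on `idempotency_key` would act as the backstop when two concurrent requests race past the initial lookup.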

Why it matters: you can see p50/p95 numbers, traces, and the exact code that keeps releases predictable.
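
For readers new to p50/p95, here is a small sketch of how those numbers can be computed from raw latency samples using the nearest-rank method. This is an illustration only: the `percentile` helper and the sample values are hypothetical, and a Prometheus histogram estimates quantiles from buckets instead of sorting raw samples.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile: the smallest sample with at least q% of data at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < q <= 100:
        raise ValueError("q must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(q * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 80, 100, 300, 90]
p50 = percentile(latencies_ms, 50)  # 100
p95 = percentile(latencies_ms, 95)  # 300
```

The single slow request barely moves p50 but fully determines p95 here, which is why tail percentiles are the ones worth watching.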


Recent highlights

  • Creative Coach (FastAPI + Celery): streaming insights over SSE/WebSockets; multi-model routing with OpenAI-compatible clients cut serving cost by ~98%; tracked 50+ experiment metrics in MLflow with Prometheus/Grafana dashboards.
  • AI Agent Development System: multi-agent dev platform with real-time telemetry, eval gates, prompt governance, canary + rollback, and local-model support (Ollama / llama.cpp / vLLM). Saves ~13.5–31 h/week in my workflow and improves DORA-style outcomes.
  • RAGline (public PoC): streaming-first Python backend with JWT multi-tenancy, idempotent writes, outbox → Redis Streams, and /chat orchestration with RAG (Qdrant/pgvector). Local model switch via OPENAI_API_BASE; first-token and p95 latency measured with OpenTelemetry.
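
The local-model switch mentioned above can be as small as reading one environment variable. The sketch below assumes the service falls back to the public OpenAI endpoint when `OPENAI_API_BASE` is unset; the `resolve_llm_endpoint` helper and its return shape are hypothetical, not RAGline's actual code.

```python
import os

# Public OpenAI endpoint, used when no override is set.
DEFAULT_BASE = "https://api.openai.com/v1"


def resolve_llm_endpoint(env=None):
    """Pick the base URL for an OpenAI-compatible client.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    # Treat empty or whitespace-only values as unset; drop trailing slashes.
    base = env.get("OPENAI_API_BASE", "").strip().rstrip("/") or DEFAULT_BASE
    return {"base_url": base, "uses_default": base == DEFAULT_BASE}
```

For example, setting `OPENAI_API_BASE=http://localhost:11434/v1` points the client at Ollama's OpenAI-compatible endpoint without touching any code.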

Tech I use (and enjoy)

  • Python: FastAPI, AsyncIO, Pydantic v2, SQLAlchemy 2.0, Celery
  • Data/Infra: PostgreSQL, Redis, Docker, GitHub Actions
  • AI/LLM: OpenAI API, RAG with Qdrant/pgvector, MLflow; local models via Ollama / llama.cpp / vLLM
  • Observability: Prometheus, Grafana, OpenTelemetry, Sentry
  • Mobile: Flutter, Dart (BLoC/Cubit, Provider, auto_route), Android/Kotlin (BLE)
  • Cloud: AWS (Lambda, AppSync, DynamoDB), GCP (GKE/Cloud Run), Firebase


How I work

  • Contract-first: OpenAPI & event schemas before code
  • Metrics-first: set SLOs, watch p95, fix root causes, add a safeguard
  • CI discipline: tests, lint, types, security checks on every PR
  • Small batches: feature flags, frequent releases, clean rollbacks
  • Leadership: led 3 teams (2–6 devs); set review/testing standards; mentor kindly, ship steadily

Contact


☕️ Support
