Rares8921/README.md

Rareș Cocoșilă-Dumitriu

Software Engineer | AI & Cloud Infrastructure | ML Systems | Distributed Systems

I build production-grade machine learning systems focused on real-time inference, scalability, reliability, and cost efficiency.

My work sits at the intersection of backend engineering, distributed systems, and applied AI.

Website · LinkedIn · Resume · Email


About

I focus on systems where machine learning must operate under real production constraints: latency, scale, reliability, and cost.

My work spans inference systems, distributed architectures, retrieval pipelines, orchestration layers, and applied ML systems.


Featured Projects

Selected systems aligned with production ML and distributed infrastructure work. More experiments and builds are available in my repositories.

Enterprise Multimodal Document Intelligence Platform

  • End-to-end OCR + LayoutLMv3 + retrieval + LLM system for document understanding.
  • 10k+ docs/day, 5.8k QPS, 94.3% extraction accuracy.
  • Distributed microservices architecture with async pipelines, caching, and observability.

Cost-Aware Autoscaling GPU Inference Cluster

  • Multi-tenant GPU inference platform with batching and Redis scheduling.
  • 2.3x throughput improvement with p99 latency under 100 ms during 5x traffic spikes.
  • 32% infrastructure cost reduction via autoscaling + warm pool design.
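
The batching idea behind the inference platform can be illustrated with a minimal sketch. This is not the code from the cost-aware-inference-cluster repository: the `Request` type, the `form_batches` function, and the `max_batch_size` and `max_wait_ms` parameters are hypothetical names chosen for illustration, and the policy shown (close a batch when it is full or when its oldest request has waited too long) is one common approach assumed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical inference request; field names are illustrative only."""
    tenant: str
    arrival_ms: float


def form_batches(
    requests: List[Request],
    max_batch_size: int = 4,
    max_wait_ms: float = 10.0,
) -> List[List[Request]]:
    """Group requests into batches in arrival order.

    The open batch is closed when it already holds max_batch_size requests,
    or when the next request arrives more than max_wait_ms after the oldest
    request in the open batch.
    """
    batches: List[List[Request]] = []
    current: List[Request] = []
    for req in sorted(requests, key=lambda r: r.arrival_ms):
        if current:
            is_full = len(current) >= max_batch_size
            waited_too_long = req.arrival_ms - current[0].arrival_ms > max_wait_ms
            if is_full or waited_too_long:
                batches.append(current)
                current = []
        current.append(req)
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    arrivals = [0.0, 1.0, 2.0, 3.0, 4.0, 30.0]
    reqs = [Request(tenant="demo", arrival_ms=t) for t in arrivals]
    for batch in form_batches(reqs):
        print([r.arrival_ms for r in batch])
```

A larger `max_batch_size` tends to raise GPU utilization, while a smaller `max_wait_ms` bounds the queueing delay added to each request, so the two values trade throughput against tail latency.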

Human Behaviour Modeling for AI

  • Eye-tracking-based ML pipeline operating on raw gaze streams.
  • BiLSTM + Attention model achieving 56% accuracy vs. a 9% baseline.
  • Behavioral analysis against computational saliency models.

Core Technologies

Languages
Python · Java · C++ · C · JavaScript · TypeScript

Backend
FastAPI · Spring Boot · Node.js · REST APIs · Microservices · Async Systems

Infrastructure
Docker · Kubernetes · Redis · AWS · Linux · Git · CI/CD · Nginx

AI / ML
Transformers · LLMs · NLP · OCR · Computer Vision · Multimodal AI · PyTorch · OpenCV


Current Focus

  • Real-time ML inference systems
  • Distributed systems and scalability
  • MLOps and production observability
  • Cost-efficient AI infrastructure
  • Retrieval and decision systems

Always building.

Pinned

  1. Personal-Projects Public

    Showcase of my personal projects, featuring a diverse range of software development experiments and problem-solving exercises. Explore projects ranging from application development, image processin…

    Jupyter Notebook 1

  2. 99-haskell-problems Public

    This repository contains my solutions to the 99 Haskell Problems, a popular set of exercises designed to challenge and enhance your understanding of functional programming using Haskell.

    Haskell 5

  3. Eye-Tracking-data-analysis Public

    HTML

  4. cost-aware-inference-cluster Public

    Python

  5. enterprise-multimodal-rag-platform Public

    Python