- London
- https://substack.com/@makraduli
- in/filip-makraduli-55091310a
- @f_makraduli
Stars
⚡ Haystack + OpenSearch + Cognee — hybrid search, graph memory, streaming answers.
Reliable model swapping for any local OpenAI/Anthropic compatible server - llama.cpp, vllm, etc
Domain-specific language designed to streamline the development of high-performance GPU/CPU/accelerator kernels
Open-source inference server and production cluster for all the models your agent needs.
Community maintained hardware plugin for vLLM on Apple Silicon
rvLLM: High-performance LLM inference in Rust. Drop-in vLLM replacement.
Comprehensive ML/AI interview codex with iterative system design, production-ready code, and 2026 standards. Includes LLM/GenAI, RAG systems, agentic AI, and algorithms from scratch.
PDF Parser for AI-ready data. Automate PDF accessibility. Open-source.
A framework for teaching AI to write like you. Not like a better version of you. Like you.
The best-benchmarked open-source AI memory system. And it's free.
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
A vector index built on TurboQuant, written in Rust with Python bindings
A high-throughput and memory-efficient inference and serving engine for LLMs
Code supporting the "Beyond Linearity in Attention Projections: The Case for Nonlinear Queries" paper
Weaviate is an open-source vector database that stores both objects and vectors, allowing for the combination of vector search with structured filtering with the fault tolerance and scalability of …
A reactive notebook for Python — run reproducible experiments, query with SQL, execute as a script, deploy as an app, and version with git. Stored as pure Python. All in a modern, AI-native editor.
AI agents running research on single-GPU nanochat training automatically
A collection of tricks and tools to speed up transformer models
Become a cracked AI/ML Research Engineer
Cost-efficient and pluggable Infrastructure components for GenAI inference
AirLLM 70B inference with a single 4GB GPU
Beads - A memory upgrade for your coding agent
AI agents can now use real Android and iOS apps, just like a human.
Recursive Language Models (RLMs) implementation based on the paper by Zhang, Kraska, and Khattab
Open catalog of datasets used to train and align LLMs across pretraining, mid-training, and post-training.
AI Hero's open-source examples and course material. Learn AI Engineering with a single repo.
A modern web interface for managing and interacting with vLLM servers (www.github.com/vllm-project/vllm). Supports both GPU and CPU modes, with special optimizations for macOS Apple Silicon and ent…