👀 Check out my nano implementation of the DeepSeek-V3 model
- 🧑‍🔬 Uses Mixture of Experts (MoE) routing, which sends each token to a few "experts" so only a fraction of the parameters run per token, speeding up inference
- 🤯 Uses Multi-head Latent Attention (MLA), which compresses the key-value cache to cut memory use and speed up inference further
- 📖 Implements Rotary Position Embeddings (RoPE), Multilayer Perceptrons (MLP), and more!
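
To give a feel for the MoE routing idea, here is a minimal pure-Python sketch of a generic top-k softmax gate. It is an illustration under assumptions, not the repo's code and not necessarily DeepSeek-V3's exact gating: the names `route_top_k`, `moe_forward`, and the toy `experts` list are hypothetical, and the "experts" are plain scalar functions rather than MLPs.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_logits, k):
    """Return [(expert_index, weight)] for the k highest-scoring experts.

    Weights are renormalized so the selected experts sum to 1.
    """
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, gate_logits, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(gate_logits, k))

# Toy experts (hypothetical): scalar functions standing in for expert MLPs.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]

# In a real model the gate logits come from a learned linear layer per token.
routes = route_top_k([0.1, 2.0, -1.0, 2.0], k=2)
output = moe_forward(3.0, [0.1, 2.0, -1.0, 2.0], experts, k=2)
```

Only two of the four toy experts are evaluated here, which is where the inference savings come from: compute scales with `k`, not with the total number of experts.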
🚨 Check out my implementation of a nano-BERT model with Apple's MLX framework
- 🍎 Uses MLX, Apple's open-source machine learning framework built for Apple silicon
- 🔱 Comes with Masked Language Modeling (MLM) and Multi-head Attention
- 🎯 Fine-tuned to summarize long texts (answer SQuAD-style questions), so it could possibly be used with RAG
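
As a rough illustration of Masked Language Modeling, the sketch below applies the masking recipe from the original BERT paper (pick about 15% of tokens; replace 80% of those with `[MASK]`, 10% with a random token, and leave 10% unchanged). Whether nano-BERT uses these exact ratios is an assumption, and `mask_tokens` is a hypothetical helper written in plain Python rather than MLX.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for one MLM training example.

    labels[i] holds the original token at positions the model must
    predict, and None everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)           # model must recover this token
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)      # 80%: replace with [MASK]
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: random token
            else:
                inputs.append(tok)       # 10%: keep the original
        else:
            labels.append(None)          # not a prediction target
            inputs.append(tok)
    return inputs, labels

tokens = "the quick brown fox jumps over the lazy dog".split()
vocab = sorted(set(tokens))
inputs, labels = mask_tokens(tokens, vocab, rng=random.Random(0))
```

The loss is then computed only at positions where `labels` is not `None`.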
👻 Check out my Black-Box Hallucination Detection Case Study
- 🚀 Tests three methods (SelfCheckGPT, SAC3, and semantic entropy) for identifying hallucinated answers
- 📦 Requires no knowledge of, or access to, model internals or logits
- 👑 Implemented and tested the methods using Meta's Llama-3.2-3B-Instruct and Microsoft's Phi-3-mini-128k-instruct
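
For intuition on the semantic entropy method, here is a minimal black-box sketch: sample several answers to the same question, group answers that mean the same thing, and compute the entropy of the group frequencies. High entropy suggests the model is inconsistent and may be hallucinating. This is a simplified illustration, not the case study's code: `same_normalized` is a hypothetical stand-in for the semantic-equivalence check (which is typically done with an entailment model), and the sampled answers are made up for the example.

```python
import math

def cluster_answers(answers, equivalent):
    """Greedily group answers; each joins the first cluster it matches."""
    clusters = []
    for answer in answers:
        for cluster in clusters:
            if equivalent(answer, cluster[0]):
                cluster.append(answer)
                break
        else:
            clusters.append([answer])
    return clusters

def semantic_entropy(answers, equivalent):
    """Entropy (in nats) over the share of answers in each meaning cluster."""
    clusters = cluster_answers(answers, equivalent)
    n = len(answers)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

def same_normalized(a, b):
    """Hypothetical equivalence check: case- and punctuation-insensitive match."""
    def norm(s):
        return s.strip().lower().rstrip(".")
    return norm(a) == norm(b)

# Made-up samples: a consistent model vs. an inconsistent one.
consistent = ["Paris", "paris.", "Paris.", "PARIS"]
scattered = ["Paris", "Lyon", "Marseille", "Nice"]

low = semantic_entropy(consistent, same_normalized)
high = semantic_entropy(scattered, same_normalized)
```

Because it needs only sampled text, this fits the black-box setting: no logits or model internals are required.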
- ✍️ Smart Document System/Assistant - Taking inspiration from Microsoft's Copilot and Semantic Indexing to create a document system that can use
- 📈 Calibration and Simulation of Rough Volatility Models - Optimized Monte Carlo simulations of fractional Brownian volatility, using the Rough Bergomi model, on a multi-node HPC cluster
- 💻 Collaborative Code Editor - Uses WebSockets and Amazon S3 to create a collaborative experience for pair (or group) coding. One might say it's Google Docs for code 😉