SamanSathenjeri/README.md

Hey, it's Saman Sathenjeri ❤️

C Java Python OCaml C++ NVIDIA PyTorch NumPy Pandas scikit-learn Docker Kubernetes Amazon S3

AlexNet

👀 Check out my nano implementation of the DeepSeek-V3 model

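  • 🧑‍🔬 Comes with Mixture of Experts (MoE), which routes each token to a small set of "experts" so that only part of the network runs per token, speeding up inference
  • 🤯 Also comes with Multi-head Latent Attention (MLA), which cuts inference cost further by caching a compressed key-value representation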
  • 📖 Contains implemented Rotary Position Embeddings (RoPE), Multilayer Perceptrons (MLP), and more!
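
As a rough illustration of the routing idea behind MoE, here is a minimal sketch of generic top-k softmax gating in plain Python. It is not taken from the nano-deepseekV3 repository, and DeepSeek-V3's actual router differs in its details; the toy experts, the `route_top_k` and `moe_forward` names, and the router scores are all assumptions made up for the example.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(router_logits, k=2):
    """Pick the k highest-scoring experts and renormalize their gate weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(router_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, router_logits, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    return sum(w * experts[i](x) for i, w in route_top_k(router_logits, k))

# Hypothetical toy experts: each one is just a scalar function.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
logits = [0.1, 2.0, -1.0, 2.0]  # made-up router scores for a single token

selected = route_top_k(logits, k=2)
output = moe_forward(3.0, experts, logits, k=2)
```

Only two of the four experts are evaluated for this token, which is where the inference savings come from: compute scales with `k`, not with the total number of experts.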

🚨 Check out my implementation of a nano-BERT model with Apple's MLX framework

  • 🍎 Usage of Apple's open-source machine learning framework developed for Apple silicon
  • 🔱 Comes with Masked Language Modeling (MLM) and Multi-headed Attention
  • 🎯 Fine-tuned to summarize long texts (answer SQuAD-style questions), so it could potentially be paired with RAG
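
For readers new to Masked Language Modeling, the sketch below shows the masking step in the style of the original BERT recipe (pick about 15% of tokens; of those, 80% become `[MASK]`, 10% become a random token, 10% stay unchanged). It uses plain Python rather than MLX, and the `mask_tokens` helper, the tiny vocabulary, and the probabilities are assumptions for illustration, not the repository's code.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Return (inputs, labels) for one MLM training example.

    labels[i] holds the original token at positions the model must predict
    and None everywhere else.
    """
    rng = rng or random.Random()
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model is trained to recover this token
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK)               # 80%: replace with [MASK]
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))  # 10%: replace with a random token
            else:
                inputs.append(tok)                # 10%: keep the original token
        else:
            labels.append(None)  # position is ignored by the loss
            inputs.append(tok)
    return inputs, labels

# Hypothetical example sentence and vocabulary.
tokens = "the model predicts the hidden words from context".split()
vocab = sorted(set(tokens))
inputs, labels = mask_tokens(tokens, vocab, rng=random.Random(0))
```

The loss is computed only at positions where `labels` is not `None`, so the model learns to reconstruct tokens from their surrounding context.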

👻 Check out my Black-Box Hallucination Detection Case Study

  • 🚀 Tests three different methods (SelfCheckGPT, SAC3, and semantic entropy) to identify hallucinated answers
  • 📦 Requires no knowledge or access to model internals or logits
  • 👑 Implemented and tested methods using Meta's Llama-3.2-3B-Instruct and Microsoft's Phi-3-mini-128k-instruct
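
To give a feel for how a black-box check can work without logits, here is a minimal sketch of a frequency-based variant of semantic entropy: sample several answers to the same question, group the ones that mean the same thing, and measure how spread out the groups are. The `same_text` equivalence test is a deliberately crude stand-in (real setups typically use an entailment model for this step), and none of the names or sample answers below come from the case study itself.

```python
import math

def cluster_answers(answers, equivalent):
    """Greedily group answers; each answer joins the first cluster it matches."""
    clusters = []
    for a in answers:
        for c in clusters:
            if equivalent(a, c[0]):
                c.append(a)
                break
        else:
            clusters.append([a])
    return clusters

def semantic_entropy(answers, equivalent):
    """Entropy (in nats) of the empirical distribution over meaning clusters."""
    clusters = cluster_answers(answers, equivalent)
    n = len(answers)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

def same_text(a, b):
    """Hypothetical equivalence check: case- and punctuation-insensitive match."""
    def norm(s):
        return s.strip().lower().rstrip(".")
    return norm(a) == norm(b)

# Made-up samples for the same prompt.
consistent = ["Paris", "paris.", "Paris"]
scattered = ["Paris", "Lyon", "Marseille"]

low = semantic_entropy(consistent, same_text)   # all answers agree
high = semantic_entropy(scattered, same_text)   # every answer disagrees
```

Low entropy means the sampled answers agree in meaning; high entropy means the model keeps changing its answer, which is treated as a signal of possible hallucination.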

🧑‍🍳 Projects that are cooking:

  • ✍️ Smart Document System/Assistant - Taking inspiration from Microsoft's Copilot and Semantic Indexing to create a document system that can use
  • 📈 Calibration and Simulation of Rough Volatility Models - Optimized Monte Carlo simulations of fractional Brownian volatility, using the Rough Bergomi model, on a multi-node HPC cluster
  • 💻 Collaborative Code Editor - Uses WebSockets and Amazon S3 to create a collaborative experience for pair (or group) coding. One might say it's Google Docs for code 😉

Pinned

  1. mlx-BERT2BERT-Summarizer (Public)

    A custom BERT2BERT transformer for text summarization, built in MLX for full MacBook compatibility for both training and inference.

    Python

  2. nano-deepseekV3 (Public)

    A compact, single-GPU optimized version of DeepSeek-V3, trained on FineWebEDU for research and experimentation.

    Python

  3. HallucinationDetection (Public)

    A lightweight comparative analysis of 3 modern Black-Box Hallucination Detection methods for language models, including SAC3, SelfCheckGPT, and Semantic Entropy.

    Python

  4. codeEditor (Public)

    Real-time, collaborative coding environment using WebSockets for instant communication between users and Amazon S3 for persistent storage of code files.

    JavaScript

  5. oboeExtraFeatures (Public)

    Extra visual and recall features for the Oboe app, supporting the AI-powered courses it creates

    Python

  6. Simulated-Banking-System (Public)

    A full-stack banking simulation platform where users can create accounts, perform secure transactions, and sign high-value envelopes (> $10,000) for verification

    Java