Engineer
- San Francisco Bay Area
- https://stephengineer.github.io/
- @StephEngineers
Stars
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
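AIInfra (AI infrastructure) covers the AI system stack from the underlying hardware, such as chips, up to the upper software stack that supports training and inference of large AI models.
AISystem mainly refers to AI systems, including full-stack low-level AI technologies such as AI chips, AI compilers, and AI inference and training frameworks.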
🃏 Extracts multiple types of annotations from the hateful-memes dataset and feeds them into multi-modal transformers to achieve high accuracy.
My learning notes and code for ML systems (ML SYS).
verl: Volcano Engine Reinforcement Learning for LLMs
This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."
Maple Mono: Open source monospace font with round corners, ligatures, and Nerd-Font icons for IDE and terminal, with fine-grained customization options and an exact 2:1 width ratio between Chinese and English characters.
🏋 Modern open-source fitness coaching platform. Create workout plans, track progress, and access a comprehensive exercise database.
AutoClip: AI-powered video clipping and highlight generation · a tool for intelligent highlight extraction and editing, aimed at derivative (remix) content creation
100+ Fine-tuning Tutorial Notebooks on Google Colab, Kaggle and more.
Text-audio foundation model from Boson AI
An open-source AI agent that brings the power of Gemini directly into your terminal.
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
A Distributed Attention Towards Linear Scalability for Ultra-Long Context, Heterogeneous Data Training
MAGI-1: Autoregressive Video Generation at Scale
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Unleash Next-Level AI! 🚀 💻 Code Generation: DeepSeek r1 + Claude 3.7 Sonnet - Unparalleled Performance! 📝 Content Creation: DeepSeek r1 + Gemini 2.5 Pro - Superior Quality! 🔌 OpenAI-Compatible. 🌊 S…
FlashMLA: Efficient Multi-head Latent Attention Kernels
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
Mental health large language model (LLM x Mental Health): pre- and post-training, datasets, evaluation, deployment, and RAG, with InternLM / Qwen / Baichuan / DeepSeek / Mixtral / LLama / GLM series models
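《Hello 算法》: a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports code in Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart. The Simplified and Traditional Chinese editions are updated in sync; the English version is being translated.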
📚 Freely available programming books
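This project aims to share the technical principles behind large models along with hands-on experience (large-model engineering and deploying large-model applications).
We want to create a repo that illustrates the usage of transformers in Chinese.
🤗 A more elegant way to subscribe to WeChat Official Accounts, with support for private deployment and RSS feed generation for WeChat Official Accounts (based on WeChat Reading).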