Stars
Instruct-tune LLaMA on consumer hardware
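This project covers the concepts and code implementations most often tested in Machine Learning, Deep Learning, and NLP interviews, which are also the theoretical fundamentals every algorithm engineer must know.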
Get started with building Fullstack Agents using Gemini 2.5 and LangGraph
llama3 implementation one matrix multiplication at a time
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery 🧑‍🔬
MiniCPM4 & MiniCPM4.1: Ultra-Efficient LLMs on End Devices, achieving 3+ generation speedup on reasoning tasks
Code repo for realtime multi-person pose estimation in CVPR'17 (Oral)
Democratizing Reinforcement Learning for LLMs
Acceptance rates for the major AI conferences
Qwen3-omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
Official repository of 'Visual-RFT: Visual Reinforcement Fine-Tuning' & 'Visual-ARFT: Visual Agentic Reinforcement Fine-Tuning'
Official implementation for "Automatic Chain of Thought Prompting in Large Language Models" (stay tuned & more will be updated)
Convolutional Neural Networks on Graphs with Fast Localized Spectral Filtering
Codebase for Aria - an Open Multimodal Native MoE
A collection of notebooks/recipes showcasing use cases of open-source models with Together AI.
OLMoE: Open Mixture-of-Experts Language Models
[CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
[NeurIPS 2021] [T-PAMI] DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
Person Re-ranking (CVPR 2017)
🔦 A PyTorch implementation of Google Brain's SpecAugment: A Simple Data Augmentation Method for Automatic Speech Recognition
🧀 Code and models for the ICML 2023 paper "Grounding Language Models to Images for Multimodal Inputs and Outputs".
Action recognition using soft attention based deep recurrent neural networks
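Welcome to the "LLM-travel" repository! Explore the mysteries of large language models (LLMs) 🚀. Dedicated to understanding, discussing, and implementing the techniques, principles, and applications related to large models.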
An implementation of our CVPR 2018 work 'Domain Adaptive Faster R-CNN for Object Detection in the Wild'
OpenThinkIMG is an end-to-end open-source framework that empowers LVLMs to think with images.
Official implementation of the paper "Attentive Eraser: Unleashing Diffusion Model’s Object Removal Potential via Self-Attention Redirection Guidance" (AAAI 2025 Oral)