Stars
CheXOne: A Reasoning-Enabled Vision–Language Foundation Model for Chest X-ray Interpretation
[Notice] The repo is temporarily locked during an ownership transfer. In the meantime, we are maintaining it here: https://github.com/ultraworkers/claw-code-parity. The fastest repo in history to surpass 100K sta…
An in-the-wild benchmark for AI agents in the OpenClaw Environment.
Transforming cold farewells into warm Skills? It's giving rebirth era. Welcome to Digital Life 1.0. 🫶
[Survey] A Comprehensive Survey of Self-Evolving AI Agents: A New Paradigm Bridging Foundation Models and Lifelong Agentic Systems
A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.
This is a curated list of "Embodied AI or robot with Large Language Models" research. Watch this repository for the latest updates! 🔥
NVIDIA Isaac Sim™ is an open-source application on NVIDIA Omniverse for developing, simulating, and testing AI-driven robots in realistic virtual environments.
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
A reading list for large models safety, security, and privacy (including Awesome LLM Security, Safety, etc.).
A Curated List of Vision-Language-Action (VLA) and World Action Models (WAM) Research and Beyond
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
We introduce 'Thinking with Video', a new paradigm leveraging video generation for multimodal reasoning. Our VideoThinkBench shows that Sora-2 surpasses GPT5 by 10% on eyeballing puzzles and reache…
A PowerPoint add-in to insert LaTeX equations into PowerPoint presentations on Windows and Mac
🌟 100+ original LLM / RL principle diagrams 📚, presented by the author of 《大模型算法》! 💥 (100+ LLM/RL Algorithm Maps)
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
A curated list of large VLM-based VLA models for robotic manipulation.
The official code for the paper "GPSToken: Gaussian Parameterized Spatially-adaptive Tokenization for Image Representation and Generation"
CVPR2026 ANTS: Adaptive Negative Textual Space Shaping for OOD Detection via Test-Time MLLM Understanding and Reasoning
A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.
Code, documentation, and discussion around the MIMIC-CXR database
Enhanced Generative Structure Prior for Text Image Super-Resolution [TPAMI]
GLM-4.6V/4.5V/4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning