Stars
Windows Agent Arena (WAA) 🪟 is a scalable OS platform for testing and benchmarking multi-modal AI agents.
A Universal Platform for Training and Evaluation of Mobile Interaction
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
[NeurIPS 2025 Spotlight] OpenCUA: Open Foundations for Computer-Use Agents
Out-of-the-box (OOTB) GUI Agent for Windows and macOS
The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
Ideogram 4: Open image model at the forefront of design
A cross-platform GUI automation Python module for human beings. Used to programmatically control the mouse & keyboard.
Native and Compact Structured Latents for 3D Generation
Benchmarking Agentic Procedural 3D Modeling Via Code
[CVPR 2026] VGGDrive: Empowering Vision-Language Models with Cross-View Geometric Grounding for Autonomous Driving
Code for "Improving Robotic Manipulation with Efficient Geometry-Aware Vision Encoder"
A Minimal and Elegant Framework & Tutorial for Real-Time Interactive World Models
A curated collection of resources, tools, and frameworks for developing GUI Agents.
Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos (ICML 2026)
OpenMMLab Pre-training Toolbox and Benchmark
Scalable pipeline for synthesizing verifiable RLVR training data for computer-use agents