Agent-S Public
Forked from simular-ai/Agent-S
Agent S: an open agentic framework that uses computers like a human
Python Apache License 2.0 Updated Dec 12, 2024
anthropic-quickstarts Public
Forked from anthropics/claude-quickstarts
A collection of projects designed to help developers quickly get started with building deployable applications using the Anthropic API
TypeScript MIT License Updated Nov 13, 2024
OmniParser Public
Forked from microsoft/OmniParser
A simple screen parsing tool towards pure vision based GUI agent
Jupyter Notebook Creative Commons Attribution 4.0 International Updated Nov 5, 2024
OSWorld Public
Forked from xlang-ai/OSWorld
[NeurIPS 2024] OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Python Apache License 2.0 Updated Nov 4, 2024
visualwebarena Public
Forked from web-arena-x/visualwebarena
VisualWebArena is a benchmark for multimodal agents.
Python MIT License Updated Sep 27, 2024
RATT Public
Forked from jinghanzhang1998/RATT
RATT: A Thought Structure for Coherent and Correct LLM Reasoning
Jupyter Notebook Updated Jul 11, 2024
world-model-for-language-model Public
Forked from szxiangjn/world-model-for-language-model
Python Updated Jul 10, 2024
AppAgent Public
Forked from TencentQQGYLab/AppAgent
AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.
Python MIT License Updated May 26, 2024
The Basic Repository of the Graduation Thesis "High-quality image editing based on Diffusion Model", Zhejiang University
MIT License Updated Mar 30, 2024
MiDaS Public
Forked from isl-org/MiDaS
Code for robust monocular depth estimation described in "Ranftl et al., Towards Robust Monocular Depth Estimation: Mixing Datasets for Zero-shot Cross-dataset Transfer, TPAMI 2022"
Python MIT License Updated Feb 14, 2024