Model Context Protocol (MCP) server for AI-assisted development ("vibe coding") of MDK applications.

JavaScript 22 3 Updated Feb 9, 2026

AI agents can now use real Android and iOS apps, just like a human.

Python 2,200 175 Updated Feb 12, 2026

[CVPR 2025] Open-source, End-to-end, Vision-Language-Action model for GUI Agent & Computer Use.

Python 1,714 130 Updated Jan 20, 2026

Every front-end GUI client for ChatGPT, Claude, and other LLMs

3,945 278 Updated Jan 22, 2026

[NeurIPS'25] GUI-Actor: Coordinate-Free Visual Grounding for GUI Agents

Python 381 46 Updated Feb 11, 2026

GUI Grounding for Professional High-Resolution Computer Use

Python 330 44 Updated Jan 7, 2026

Agent S: an open agentic framework that uses computers like a human

Python 9,766 1,127 Updated Jan 19, 2026

This is the repo for the paper "OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use" (ACL 2025 Oral).

377 19 Updated Aug 16, 2025

Python script to upload videos on YouTube using Selenium

Python 659 208 Updated Feb 12, 2023

Goodreads Quote API

Python 22 5 Updated Oct 22, 2024

[ICML'24] SeeAct is a system for generalist web agents that autonomously carry out tasks on any given website, with a focus on large multimodal models (LMMs) such as GPT-4V(ision).

Python 822 107 Updated Feb 3, 2025

AI agent using GPT-4V(ision) capable of using a mouse/keyboard to interact with web UI

JavaScript 1,066 101 Updated Dec 9, 2024

🖥️ Run AI Agent in your browser.

Python 15,597 2,684 Updated Aug 31, 2025

Zonos-v0.1 is a leading open-weight text-to-speech model trained on more than 200k hours of varied multilingual speech, delivering expressiveness and quality on par with—or even surpassing—top TTS …

Python 7,161 814 Updated Mar 5, 2025

Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

957 29 Updated Nov 14, 2025

ui-screenshot-to-prompt is an AI-powered tool that analyzes UI images to generate detailed prompts for AI coders. It uses computer vision and natural language processing to break down UI components…

Python 191 33 Updated Oct 15, 2025

A curated list of awesome UI agent resources, encompassing Web, App, OS, and beyond (continually updated)

274 32 Updated Dec 15, 2025

A collection of AI Agents papers (Updated biweekly)

1,067 80 Updated Feb 15, 2026

A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!

HTML 24,695 5,266 Updated Feb 10, 2026

JavaScript API for Chrome and Firefox

TypeScript 93,594 9,376 Updated Feb 17, 2026

The model, data and code for the visual GUI Agent SeeClick

HTML 465 29 Updated Jul 13, 2025

verl: Volcano Engine Reinforcement Learning for LLMs

Python 19,254 3,257 Updated Feb 16, 2026

A list of AI autonomous agents

25,846 2,228 Updated Feb 26, 2025

[NeurIPS 2025] 🌐 WebThinker: Empowering Large Reasoning Models with Deep Research Capability

Python 1,405 136 Updated Dec 8, 2025

Monte Carlo permutation tests

Python 310 109 Updated Mar 3, 2025

https://hf.co/hexgrad/Kokoro-82M

JavaScript 5,674 646 Updated Aug 6, 2025

Our Vim dotfiles

Vim Script 150 118 Updated May 24, 2025

[NeurIPS'23 Spotlight] "Mind2Web: Towards a Generalist Agent for the Web" -- the first LLM-based web agent and benchmark for generalist web agents

Jupyter Notebook 946 120 Updated Nov 5, 2025

A simple screen-parsing tool toward a pure vision-based GUI agent

Jupyter Notebook 24,363 2,118 Updated Sep 12, 2025