Stars
Dataset of Mouse and Touchscreen Input Performance
AI agents running research on single-GPU nanochat training automatically
Virtual Screen Reader is a screen reader simulator for unit tests.
verl: Volcano Engine Reinforcement Learning for LLMs
Building a comprehensive and handy list of papers for GUI agents
Resources and paper list for "Thinking with Images for LVLMs". This repository accompanies our survey on how LVLMs can leverage visual information for complex reasoning, planning, and generation.
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
VIP cheatsheet for Stanford's CME 295 Transformers and Large Language Models
This repository showcases various advanced techniques for Retrieval-Augmented Generation (RAG) systems. RAG systems combine information retrieval with generative models to provide accurate and cont…
ScreenCoder — Turn any UI screenshot into clean, editable HTML/CSS with full control. Fast, accurate, and easy to customize.
FULL Augment Code, Claude Code, Cluely, CodeBuddy, Comet, Cursor, Devin AI, Junie, Kiro, Leap.new, Lovable, Manus, NotionAI, Orchids.app, Perplexity, Poke, Qoder, Replit, Same.dev, Trae, Traycer AI…
Vibetest MCP - automated QA testing using Browser-Use agents
C++ implementation of a ScienceDirect paper "An accelerating cpu-based correlation-based image alignment for real-time automatic optical inspection"
An open-source audio wake word (or phrase) detection framework with a focus on performance and simplicity.
Grounded SAM 2: Ground and Track Anything in Videos with Grounding DINO, Florence-2 and SAM 2
Run Segment Anything Model 2 on a live video stream
A robust, efficient, low-latency speech-to-text library with advanced voice activity detection, wake word activation and instant transcription.
Datasets on Website Aesthetics for Machine Learning
Android in Docker solution with noVNC support and video recording
A command-line utility and Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
JohannesBuchner / imagehash
Forked from bunchesofdonald/photohash
A Python Perceptual Image Hashing Module
Pretty good call graphs for dynamic languages
MagentaA11y is a tool built to simplify the process of accessibility testing.
Google Play scraper for Python inspired by <facundoolano/google-play-scraper>