ishine
  • gerzz.inc
  • shanghai

Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasoning.

Python 40 5 Updated Apr 11, 2026

Chrome extension & CLI to let agents control your browser. Runs Playwright snippets in a stateful sandbox. Available as CLI or MCP

HTML 3,591 151 Updated May 27, 2026

Official JAX implementation of End-to-End Test-Time Training for Long Context

Python 620 47 Updated Feb 15, 2026

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest model, MiroThinker-1.7, achieves 74.0 and 75.3 on BrowseComp and BrowseComp Zh, respectively.

Python 8,267 638 Updated Apr 25, 2026

This repository contains tools to download, crawl, and process French political speeches from the vie-publique.fr public dataset. It allows for the collection of speech metadata and the scraping of…

Jupyter Notebook 3 1 Updated Jan 4, 2026

General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.

Python 4,457 777 Updated Jun 6, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,820 177 Updated Apr 20, 2026

DeepTutor: Agent-native, Open-sourced Personalized Tutoring. https://deeptutor.info/.

Python 24,724 3,341 Updated Jun 12, 2026

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 90,819 6,161 Updated Jun 12, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,380 698 Updated May 17, 2026

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

Python 20,089 3,556 Updated Jun 12, 2026

A collection of sample agents built with Agent Development Kit (ADK)

Python 9,645 2,663 Updated Jun 12, 2026
Python 47 3 Updated Mar 29, 2026

A highly optimized engine for the neutts-air model that generates minutes of audio in seconds. Over 200x realtime on modern hardware!

Python 119 11 Updated Nov 24, 2025

A family of efficient speech models for multilingual phone recognition

Python 64 10 Updated Feb 12, 2026

Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"

Python 162 18 Updated Mar 3, 2026

Open Audio Watermarking Tool

Python 516 49 Updated Dec 22, 2025
Python 7 3 Updated Oct 24, 2025

Generate audio signals corresponding to moving sources/receivers in a shoebox-shaped room (Python)

Python 10 1 Updated Nov 14, 2025

[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

Python 47 1 Updated Nov 21, 2025
Python 11,527 784 Updated Feb 9, 2026

A new dataset that includes long audio, captions of local audio events, and temporal boundaries

12 Updated Mar 26, 2026

SpikeMamba presents a novel integration of spiking neural networks (SNNs) with the Mamba state space model architecture, investigating the potential for biologically-inspired temporal dynamics in l…

Python 6 Updated Sep 9, 2025

Resources to develop programming and software development skills

HTML 27 11 Updated Sep 21, 2023

Extracted system prompts from Anthropic - Claude Fable 5, Opus 4.8, Claude Code, Claude Design. OpenAI - ChatGPT 5.5 Thinking, GPT 5.5 Instant, Codex. Google - Gemini 3.5 Flash, 3.1 Pro, Antigravit…

JavaScript 41,833 6,925 Updated Jun 12, 2026

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

Python 34,790 3,106 Updated May 25, 2026

Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"

Python 18 1 Updated Oct 9, 2025

[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"

Python 373 15 Updated Jun 27, 2025

Towards Fine-grained Audio Captioning with Multimodal Contextual Cues

Python 87 4 Updated Jan 4, 2026