  • gerzz.inc
  • shanghai


Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasoning.

Python 40 5 Updated Apr 11, 2026

Chrome extension & CLI to let agents control your browser. Runs Playwright snippets in a stateful sandbox. Available as CLI or MCP

HTML 3,628 155 Updated Jun 23, 2026

Official JAX implementation of End-to-End Test-Time Training for Long Context

Python 620 47 Updated Feb 15, 2026

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest model, MiroThinker-1.7, achieves 74.0 and 75.3 on BrowseComp and BrowseComp Zh, respectively.

Python 8,306 642 Updated Apr 25, 2026

This repository contains tools to download, crawl, and process French political speeches from the vie-publique.fr public dataset. It allows for the collection of speech metadata and the scraping of…

Jupyter Notebook 3 1 Updated Jan 4, 2026

General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.

Python 5,096 842 Updated Jun 6, 2026

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,821 178 Updated Apr 20, 2026

DeepTutor: Agent-native Personalized Tutoring. https://deeptutor.info/.

Python 24,948 3,379 Updated Jun 24, 2026

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 91,549 5,971 Updated Jun 23, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,450 710 Updated May 17, 2026

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

Python 20,273 3,612 Updated Jun 24, 2026

A collection of sample agents built with Agent Development Kit (ADK)

Python 9,734 2,694 Updated Jun 22, 2026
Python 47 3 Updated Mar 29, 2026

A highly optimized engine for the neutts-air model that generates minutes of audio in seconds. Over 200x realtime on modern hardware!

Python 119 11 Updated Nov 24, 2025

A family of efficient speech models for multilingual phone recognition

Python 63 10 Updated Feb 12, 2026

Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"

Python 164 18 Updated Mar 3, 2026

Open Audio Watermarking Tool

Python 517 49 Updated Dec 22, 2025
Python 7 3 Updated Oct 24, 2025

Generate audio signals corresponding to moving sources/receivers in a shoebox-shaped room (Python)

Python 10 1 Updated Nov 14, 2025

[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

Python 47 1 Updated Nov 21, 2025
Python 11,613 792 Updated Feb 9, 2026

A new dataset that includes long audio, captions of local audio events, and temporal boundaries

13 Updated Mar 26, 2026

SpikeMamba presents a novel integration of spiking neural networks (SNNs) with the Mamba state space model architecture, investigating the potential for biologically-inspired temporal dynamics in l…

Python 6 Updated Sep 9, 2025

Resources to develop programming and software development skills

HTML 27 10 Updated Sep 21, 2023

Extracted system prompts from Anthropic - Claude Fable 5, Opus 4.8, Claude Code, Claude Design. OpenAI - ChatGPT 5.5 Thinking, GPT 5.5 Instant, Codex. Google - Gemini 3.5 Flash, 3.1 Pro, Antigravit…

JavaScript 45,591 7,488 Updated Jun 24, 2026

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents with the layout fully preserved; supports services such as Google/DeepL/Ollama/OpenAI and provides CLI/GUI/MCP/Docker/Zotero

Python 35,151 3,141 Updated May 25, 2026

Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"

Python 18 1 Updated Oct 9, 2025

[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"

Python 374 15 Updated Jun 27, 2025

Towards Fine-grained Audio Captioning with Multimodal Contextual Cues

Python 87 4 Updated Jan 4, 2026