ishine

speech asr/speech-recognition tts/text-to-speech vc/voice-conversion ac/accent-conversion

166 followers · 261 following

gerzz.inc
shanghai
dubbing-ai.com dubbingai.io

Stars

Alittleegg / Eureka-Audio

Eureka-Audio: A 1.7B lightweight audio–language model that matches 7B–30B models on ASR, audio understanding, and paralinguistic reasoning.

Python 40 5 Updated Apr 11, 2026

remorses / playwriter

Chrome extension & CLI to let agents control your browser. Runs Playwright snippets in a stateful sandbox. Available as CLI or MCP

HTML 3,591 151 Updated May 27, 2026

test-time-training / e2e

Official JAX implementation of End-to-End Test-Time Training for Long Context

Python 620 47 Updated Feb 15, 2026

MiroMindAI / MiroThinker

MiroThinker is a deep research agent optimized for complex research and prediction tasks. Our latest model, MiroThinker-1.7, achieves 74.0 and 75.3 on BrowseComp and BrowseComp Zh, respectively.

Python 8,267 638 Updated Apr 25, 2026

zincjian / french_speech_datacrawler

This repository contains tools to download, crawl, and process French political speeches from the vie-publique.fr public dataset. It allows for the collection of speech metadata and the scraping of…

Jupyter Notebook 3 1 Updated Jan 4, 2026

alexzhang13 / rlm

General plug-and-play inference library for Recursive Language Models (RLMs), supporting various sandboxes.

Python 4,457 777 Updated Jun 6, 2026

Tongyi-MAI / MAI-UI

MAI-UI: Real-World Centric Foundation GUI Agents ranging from 2B to 235B

Jupyter Notebook 1,820 177 Updated Apr 20, 2026

HKUDS / DeepTutor

DeepTutor: Agent-native, Open-sourced Personalized Tutoring. https://deeptutor.info/.

Python 24,724 3,341 Updated Jun 12, 2026

microsoft / playwright

Playwright is a framework for Web Testing and Automation. It allows testing Chromium, Firefox and WebKit with a single API.

TypeScript 90,819 6,161 Updated Jun 12, 2026

sgl-project / mini-sglang

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,380 698 Updated May 17, 2026

google / adk-python

An open-source, code-first Python toolkit for building, evaluating, and deploying sophisticated AI agents with flexibility and control.

Python 20,089 3,556 Updated Jun 12, 2026

google / adk-samples

A collection of sample agents built with Agent Development Kit (ADK)

Python 9,645 2,663 Updated Jun 12, 2026

Xinxi-Zhang / Re-MeanFlow

Python 47 3 Updated Mar 29, 2026

ysharma3501 / FastNeuTTS

A highly optimized engine for the neutts-air model that generates minutes of audio in seconds. Over 200x realtime on modern hardware!

Python 119 11 Updated Nov 24, 2025

lingjzhu / zipa

A family of efficient speech models for multilingual phone recognition

Python 64 10 Updated Feb 12, 2026

FrontierLabs / F5R-TTS

Official code for "F5R-TTS: Improving Flow-Matching based Text-to-Speech with Group Relative Policy Optimization"

Python 162 18 Updated Mar 3, 2026

resemble-ai / Perth

Open Audio Watermarking Tool

Python 516 49 Updated Dec 22, 2025

mandip42 / rirmega

Python 7 3 Updated Oct 24, 2025

ehabets / das-generator

Generate audio signals corresponding to moving sources/receivers in a shoebox-shaped room (Python)

Python 10 1 Updated Nov 14, 2025

roudimit / Omni-R1

[ASRU 2025] Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

Python 47 1 Updated Nov 21, 2025

Tongyi-MAI / Z-Image

Python 11,527 784 Updated Feb 9, 2026

line / CASTELLA

A new dataset that includes long audio, captions of local audio events, and temporal boundaries

12 Updated Mar 26, 2026

The-Swarm-Corporation / SpikeMamba

SpikeMamba presents a novel integration of spiking neural networks (SNNs) with the Mamba state space model architecture, investigating the potential for biologically-inspired temporal dynamics in l…

Python 6 Updated Sep 9, 2025

Mistrymm7 / AEC-Design-Technologist

Resources to develop programming and software development skills

HTML 27 11 Updated Sep 21, 2023

asgeirtj / system_prompts_leaks

Extracted system prompts from Anthropic - Claude Fable 5, Opus 4.8, Claude Code, Claude Design. OpenAI - ChatGPT 5.5 Thinking, GPT 5.5 Instant, Codex. Google - Gemini 3.5 Flash, 3.1 Pro, Antigravit…

JavaScript 41,833 6,925 Updated Jun 12, 2026

KeisukeImoto / LEAD_dataset

10 Updated Dec 4, 2024

PDFMathTranslate / PDFMathTranslate

[EMNLP 2025 Demo] PDF scientific paper translation with preserved formats - AI-based full-text bilingual translation of PDF documents that fully preserves the layout, supporting Google/DeepL/Ollama/OpenAI and other services, and providing CLI/GUI/MCP/Docker/Zotero

Python 34,790 3,106 Updated May 25, 2026

colaudiolab / AudioSet-R

Official implementation: "AudioSet-R: A Refined AudioSet with Multi-Stage LLM Label Reannotation"

Python 18 1 Updated Oct 9, 2025

liuhuadai / OmniAudio

[ICML 2025] PyTorch Implementation of "OmniAudio: Generating Spatial Audio from 360-Degree Video"

Python 373 15 Updated Jun 27, 2025

FreedomIntelligence / FusionAudio

Towards Fine-grained Audio Captioning with Multimodal Contextual Cues

Python 87 4 Updated Jan 4, 2026
