Durgesh92
🎯
Focusing
  • Infolabs Global
  • Dubai

Python 1 Updated Jun 11, 2026

"ViMax: Agentic Video Generation (Director, Screenwriter, Producer, and Video Generator All-in-One)"

Python 10,404 1,517 Updated Jun 13, 2026

The retrieval layer for production AI systems. Lightning-fast (<10ms) search without vector databases. Built for browser, edge, on-device, and cloud.

Python 423 50 Updated Jun 18, 2026

A standalone desktop/smartTV overlay that translates system audio into 3D Sign Language animation in real-time.

TypeScript 3 Updated May 26, 2026

Open-source American English TTS model. 6 voices and a high-performance inference library for Apple Silicon.

Python 17 2 Updated May 20, 2026

[SIGGRAPH 2026] Pixal3D: Pixel-Aligned 3D Generation from Images

Python 1,784 165 Updated May 24, 2026

tiny-world-builder

JavaScript 1,075 143 Updated Jun 18, 2026

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation" (CVPR'25 Spotlight).

Python 12,944 1,250 Updated Nov 5, 2025

[ECCV 2026] Implementation of "Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length"

Python 2,169 242 Updated Jun 18, 2026

Self-hosted, real-time digital human agent platform. Build voice-first AI agents with WebRTC, persona memory, tools, RAG, and optional digital-human video.

Python 1,230 171 Updated Jun 17, 2026

FlashRT is a high-performance realtime inference engine for small-batch, latency-sensitive AI workloads. The flagship integration is production VLA control for Pi0, Pi0.5, GROOT N1.6, and Pi0-FAST.…

C++ 357 41 Updated Jun 15, 2026
Python 92 14 Updated May 14, 2026

Browser-based text-to-speech powered by OmniVoice. Runs entirely locally via WebGPU and WebAssembly.

JavaScript 12 3 Updated Apr 12, 2026

Open source video conferencing app powered by LiveKit. Built with Django and React.

Python 2,106 241 Updated Jun 18, 2026

Self-hosted DTLN noise suppression plugin for LiveKit Agents — no cloud API, no per-minute fees

Python 42 10 Updated Apr 16, 2026

Building a truly open-source multilingual TTS, including the dataset, covering more than 150 languages with voice cloning.

Jupyter Notebook 55 4 Updated Apr 23, 2026

A framework for efficient model inference with omni-modality models

Python 5,193 1,134 Updated Jun 18, 2026

🎙️ VoxSherpa TTS Offline Neural Text-to-Speech Engine for Android ⚡ Sherpa-ONNX powered 🔊 Natural voice synthesis 📱 Fully offline processing 🚀 No cloud • No limits

Java 132 22 Updated Jun 16, 2026

Vietnamese TTS with instant voice cloning • On-device • Real-time CPU inference • 24kHz audio quality • Convert text to Vietnamese speech • Vietnamese text to speech • Vietnamese TTS

Python 1,875 558 Updated Jun 10, 2026

Automate the process of making money online.

Python 30,970 3,346 Updated Jun 14, 2026

VoXtream is a Full-Stream Zero-shot TTS model with Extremely Low Latency and Speaking rate Control

Python 241 30 Updated May 30, 2026

This repository contains the official code for LPIPS-AttnWav2Lip. The paper has been accepted by the journal Speech Communication.

Python 13 4 Updated Jan 30, 2026

Detect Anything in Real Time: Real-time object detection using frontier object detection models.

Python 294 42 Updated Mar 26, 2026
Jupyter Notebook 45 6 Updated Apr 28, 2026

vLLM plugin for Reka models

Python 9 Updated Jun 15, 2026

Open Source Speech Language Model

Jupyter Notebook 995 107 Updated May 11, 2026

The open-source app everyone uses to manage agents at work

TypeScript 70,874 13,187 Updated Jun 18, 2026

SOTA industrial-grade voice activity detection and audio event detection, supporting 100+ languages and outperforming Silero-VAD, TEN-VAD, FunASR-VAD, and WebRTC-VAD

Python 426 28 Updated May 6, 2026

Real-time voice-to-avatar interaction server combining OpenAI Realtime API for conversational AI with an Audio to Expression model for synchronized avatar facial animation.

Python 9 Updated May 18, 2026

🕷️ An adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl!

Python 64,798 6,366 Updated Jun 17, 2026