Starred repositories

Instantly generate AI-powered subtitles on your device. Works standalone or connects to DaVinci Resolve.

TypeScript 2,072 108 Updated Nov 4, 2025

very good whiteboard SDK / infinite canvas SDK

TypeScript 43,666 2,826 Updated Nov 5, 2025

Speech-to-text, text-to-speech, speaker diarization, speech enhancement, source separation, and VAD using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Andr…

C++ 8,711 963 Updated Nov 5, 2025

The most advanced Nano Banana image generator and editor application. Your central hub for AI image generation and revisions. Intuitive UI features reference images, editing with image masks, versi…

TypeScript 449 97 Updated Sep 17, 2025

faster_whisper GUI with PySide6

Python 2,755 161 Updated Dec 8, 2024

from Google AI Studio

TypeScript 143 38 Updated Sep 19, 2025

One-click generation of product marketing and general-content short videos, with AI batch automatic clipping, in a polished cross-platform desktop tool.

TypeScript 2,281 311 Updated Nov 5, 2025

A MapBoxGL and D3 web mapping tool for exploring the dynamic population of Manhattan.

JavaScript 189 47 Updated May 2, 2023

A free and open-source, self-hosted, AI-based live meeting note taker and minutes summary generator that can run completely on your local device (macOS and Windows support added. Working on addi…

Rust 8,109 640 Updated Nov 2, 2025
JavaScript 303 57 Updated Aug 7, 2025

This repository contains the official implementation of the research papers, "MobileCLIP" CVPR 2024 and "MobileCLIP2" TMLR August 2025

Python 1,286 103 Updated Oct 9, 2025

This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025

Python 6,843 478 Updated May 5, 2025

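Press the Option key to start recording and release it to stop; the recording is then transcribed with the Groq Whisper Large V3 Turbo model. Because Groq is very fast, most voice input comes back within 1-2 s, and thanks to Whisper's capabilities the transcription quality is very good.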

Python 582 64 Updated Jan 29, 2025
TypeScript 1,837 310 Updated Sep 28, 2025

An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

Python 14,800 1,671 Updated Oct 30, 2025

A program that embeds ruby (furigana) into fonts so that ruby text can be displayed in game subtitles.

Python 40 2 Updated Feb 2, 2019

Simultaneous speech-to-text model

Python 8,268 771 Updated Oct 30, 2025

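An intelligent work-and-life assistant app built with Flutter that supports API calls to large AI models on many cloud platforms. Beyond the usual large-model features, it includes everyday tools such as minimalist bookkeeping, a random dish picker, a cat and dog home, waifu images, MAL anime rankings, BGM anime news, and diet and health.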

Dart 73 15 Updated Jun 10, 2025

Take notes with your voice & transform them with AI

TypeScript 449 67 Updated Aug 5, 2025

🚀 The open-source alternative to Twilio.

TypeScript 7,401 453 Updated Nov 4, 2025

Open-Source AI Presentation Generator and API (Gamma, Beautiful AI, Decktopus Alternative)

TypeScript 2,742 518 Updated Oct 10, 2025

Digital Mind Extension

JavaScript 6,862 1,054 Updated Oct 26, 2025

A personalized language-learning tool that combines Duolingo-style lessons with your own curated vocabulary lists. Seamlessly add words from books, articles, or videos, and revisit them through in…

TypeScript 1,854 176 Updated Aug 8, 2025

The open-source CapCut alternative

TypeScript 43,286 4,114 Updated Oct 24, 2025

LiYing is an automated photo processing program that automates the post-processing workflow for ID photos in general photo studios.

Python 2,987 243 Updated Oct 18, 2025

VOICE → WORDS

Swift 8 2 Updated Sep 22, 2025

Talk to Type

Swift 65 8 Updated Jul 1, 2025

Examples in the MLX framework

Python 7,970 1,105 Updated Oct 7, 2025

Multilingual Voice Understanding Model

Python 6,880 641 Updated Aug 15, 2025

Multilingual Voice Understanding Model (adding OpenAI Speech to Text compatible interfaces)

Python 1 Updated Dec 4, 2024