Starred repositories

A dataset containing synchronized visual, inertial and GNSS raw measurements.

C++ 241 51 Updated Jul 25, 2022

FusionFly is an open-source toolkit for standardizing GNSS (Global Navigation Satellite System) and IMU (Inertial Measurement Unit) data with Factor Graph Optimization (FGO). The system provides a …

TypeScript 6 2 Updated Apr 23, 2025

A unified evaluation suite for speech-to-text translation, covering SpeechLLMs, SFMs, and cascaded systems across diverse real-world speech phenomena.

Jupyter Notebook 19 2 Updated Dec 23, 2025

DPDFNet: causal single-channel speech enhancement that boosts DeepFilterNet2 with dual-path RNN blocks for stronger long-range temporal and cross-band modeling. Repo includes PyTorch implementation…

Python 6 3 Updated Dec 19, 2025

LibriVAD - a scalable open-source dataset derived from LibriSpeech and augmented with diverse real-world and synthetic noise sources, in addition to deep learning benchmarks.

Python 2 1 Updated Dec 22, 2025

Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 325 7 Updated Dec 16, 2025

Live speech translation application built with Electron 34 and React, using OpenAI's Realtime API.

TypeScript 663 54 Updated Dec 19, 2025

PyTorch implementation of "Resource Efficient 3D Convolutional Neural Networks", with code and pretrained models.

Python 820 156 Updated Dec 21, 2022
Python 5 1 Updated Aug 28, 2025

EdgeFace: Efficient Face Recognition Model for Edge Devices [TBIOM 2024], winner of the compact track of the IJCB 2023 Efficient Face Recognition Competition

Jupyter Notebook 133 22 Updated Aug 7, 2025

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything

Jupyter Notebook 1,039 74 Updated Jun 14, 2024

NeurIPS 2025

Python 16 1 Updated Sep 24, 2025

Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.

Go 7,013 902 Updated Dec 23, 2025

A library for making PyTorch models streamable

Python 52 4 Updated Dec 18, 2025

Lightning-YOLOs provides clean, modular YOLO object detection models built on PyTorch Lightning, making it easier to train, extend, and experiment with modern YOLO variants in research and producti…

Python 20 3 Updated Dec 13, 2025

TPAMI: Frequency-aware Feature Fusion for Dense Image Prediction

Jupyter Notebook 468 23 Updated Nov 25, 2025

Learning in infinite dimension with neural operators.

Python 3,277 799 Updated Dec 16, 2025

Official implementation of "Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation".

Jupyter Notebook 105 4 Updated Dec 19, 2025

Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"

Python 48 1 Updated Dec 20, 2025

Official code of Motus: A Unified Latent Action World Model

Python 179 2 Updated Dec 22, 2025

Official Implementation of Dynamic erf (Derf).

Python 89 10 Updated Dec 12, 2025

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Python 2,048 366 Updated Feb 23, 2024

HaMeR: Reconstructing Hands in 3D with Transformers

Python 823 109 Updated Mar 22, 2025

Pixio: a capable vision encoder dedicated to dense tasks, simply by pixel reconstruction

Python 195 7 Updated Dec 23, 2025

A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

266 5 Updated Dec 1, 2025

The official repository of TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module designs and a specially curated dataset.

Python 13 Updated Nov 18, 2025

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,970 129 Updated Dec 18, 2025

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,414 178 Updated Dec 23, 2025