dsx-ai

DS.Xu dsx-ai

37 followers · 200 following

shanghai


Starred repositories

HKUST-Aerial-Robotics / GVINS-Dataset

A dataset containing synchronized visual, inertial and GNSS raw measurements.

C++ 241 51 Updated Jul 25, 2022

ada-jl4025 / FusionFly

FusionFly is an open-source toolkit for standardizing GNSS (Global Navigation Satellite System) and IMU (Inertial Measurement Unit) data with Factor Graph Optimization (FGO). The system provides a …

TypeScript 6 2 Updated Apr 23, 2025

sarapapi / hearing2translate

A unified evaluation suite for speech-to-text translation, covering SpeechLLMs, SFMs, and cascaded systems across diverse real-world speech phenomena.

Jupyter Notebook 19 2 Updated Dec 23, 2025

ceva-ip / DPDFNet

DPDFNet: causal single-channel speech enhancement that boosts DeepFilterNet2 with dual-path RNN blocks for stronger long-range temporal and cross-band modeling. Repo includes PyTorch implementation…

Python 6 3 Updated Dec 19, 2025

zhenghuatan / LibriVAD

LibriVAD - a scalable open-source dataset derived from LibriSpeech and augmented with diverse real-world and synthetic noise sources, in addition to deep learning benchmarks.

Python 2 1 Updated Dec 22, 2025

MiniMax-AI / VTP

Towards Scalable Pre-training of Visual Tokenizers for Generation

Python 325 7 Updated Dec 16, 2025

kizuna-ai-lab / sokuji

Live speech translation application built with Electron 34 and React, using OpenAI's Realtime API.

TypeScript 663 54 Updated Dec 19, 2025

okankop / Efficient-3DCNNs

PyTorch Implementation of "Resource Efficient 3D Convolutional Neural Networks", codes and pretrained models.

Python 820 156 Updated Dec 21, 2022

dohun-mat / FeatherFace

Python 5 1 Updated Aug 28, 2025

otroshi / edgeface

EdgeFace: Efficient Face Recognition Model for Edge Devices [TBIOM 2024], winner of the compact track of the IJCB 2023 Efficient Face Recognition Competition

Jupyter Notebook 133 22 Updated Aug 7, 2025

THU-MIG / RepViT

RepViT: Revisiting Mobile CNN From ViT Perspective [CVPR 2024] and RepViT-SAM: Towards Real-Time Segmenting Anything

Jupyter Notebook 1,039 74 Updated Jun 14, 2024

changsn / SparseDiT

NeurIPS 2025

Python 16 1 Updated Sep 24, 2025

NexaAI / nexa-sdk

Run the latest LLMs and VLMs across GPU, NPU, and CPU with PC (Python/C++) & mobile (Android & iOS) support, running quickly with OpenAI gpt-oss, Granite4, Qwen3VL, Gemma 3n and more.

Go 7,013 902 Updated Dec 23, 2025

CorentinJ / TorchStream

A library for making PyTorch models streamable

Python 52 4 Updated Dec 18, 2025

Borda / lightning-YOLOs

Lightning-YOLOs provides clean, modular YOLO object detection models built on PyTorch Lightning, making it easier to train, extend, and experiment with modern YOLO variants in research and producti…

Python 20 3 Updated Dec 13, 2025

Linwei-Chen / FreqFusion

TPAMI: Frequency-aware Feature Fusion for Dense Image Prediction

Jupyter Notebook 468 23 Updated Nov 25, 2025

neuraloperator / neuraloperator

Learning in infinite dimension with neural operators.

Python 3,277 799 Updated Dec 16, 2025

Insta360-Research-Team / panoramic-vision-survey

297 2 Updated Oct 13, 2025

Insta360-Research-Team / DAP

Official implementation of "Depth Any Panoramas: A Foundation Model for Panoramic Depth Estimation".

Jupyter Notebook 105 4 Updated Dec 19, 2025

WeChatCV / WeDetect

Official repository of paper "WeDetect: Fast Open-Vocabulary Object Detection as Retrieval"

Python 48 1 Updated Dec 20, 2025

thu-ml / Motus

Official code of Motus: A Unified Latent Action World Model

Python 179 2 Updated Dec 22, 2025

zlab-princeton / Derf

Official Implementation of Dynamic erf (Derf).

Python 89 10 Updated Dec 12, 2025

vchoutas / smplify-x

Expressive Body Capture: 3D Hands, Face, and Body from a Single Image

Python 2,048 366 Updated Feb 23, 2024

geopavlakos / hamer

HaMeR: Reconstructing Hands in 3D with Transformers

Python 823 109 Updated Mar 22, 2025

facebookresearch / pixio

Pixio: a capable vision encoder dedicated to dense tasks, simply by pixel reconstruction

Python 195 7 Updated Dec 23, 2025

Shengwang-Community / Conversational-AI-IOT-Sample

7 2 Updated Dec 15, 2025

attention-survey / Efficient_Attention_Survey

A Survey of Efficient Attention Methods: Hardware-efficient, Sparse, Compact, and Linear Attention

266 5 Updated Dec 1, 2025

lysanderism / TimeAudio

The official repository of TimeAudio, a comprehensive framework that incorporates fine-grained acoustic cues into LALMs with enhanced module designs and a specially curated dataset.

Python 13 Updated Nov 18, 2025

facebookresearch / perception_models

State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!

Jupyter Notebook 1,970 129 Updated Dec 18, 2025

facebookresearch / sam-audio

The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…

Python 2,414 178 Updated Dec 23, 2025

Lists (13)

360Pana

Agent

Awesome-X

Deep Learning Framework

ffmpeg

Infer NN

ISP

LM

Raw Image Deblur

Segment

Speech Enhancement

Super Resolution

YOLO-Series
