Stars
A scalable, distributed, collaborative, document-graph database, for the realtime web
A super fast graph database that uses GraphBLAS under the hood for its sparse adjacency matrix graph representation. Our goal is to provide the best Knowledge Graph for LLM (GraphRAG).
The repository provides code for running inference with the Meta Segment Anything Audio Model (SAM-Audio), links for downloading the trained model checkpoints, and example notebooks that show how t…
FoundationDB - the open source, distributed, transactional key-value store
Sharp Monocular View Synthesis in Less Than a Second
Fast-FoundationStereo: Real-Time Zero-Shot Stereo Matching
A curated list of awesome System Design (A.K.A. Distributed Systems) resources.
Research code for CVPR 2021 paper "End-to-End Human Pose and Mesh Reconstruction with Transformers"
Run Segment Anything Model 2 on a live video stream
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties
Reproduce AlignedReID: Surpassing Human-Level Performance in Person Re-Identification, using PyTorch.
[CVPR2022] DanceTrack: Multiple Object Tracking in Uniform Appearance and Diverse Motion
[IJCV-2021] FairMOT: On the Fairness of Detection and Re-Identification in Multi-Object Tracking
Framework-agnostic sliced/tiled inference + interactive UI + error analysis plots
Open-source platform to build and deploy AI agent workflows.
A Rust library integrated with ONNXRuntime, providing a collection of Computer Vision and Vision-Language models such as YOLO, FastVLM, and more.
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Official implementation of "A Simple Visual-Textual Baseline for Pedestrian Attribute Recognition" [TCSVT 2022]
PyTorch Pedestrian Attribute Recognition: A strong PyTorch baseline for pedestrian attribute recognition and multi-label classification.
[NeurIPS2024] PLIP: Language-Image Pre-training for Person Representation Learning
Official implementation for "CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification without Concrete Text Labels" (AAAI 2023)
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
Creating an anime avatar from a facial image
An Open Phone Agent Model & Framework. Unlocking the AI Phone for Everyone