Stars
Universal Video Temporal Grounding with Generative Multi-modal Large Language Models
Benchmarking for Audio-Text and Audio-Visual Generation; Supports FAD, FD_VGG, FD_PANNs, FD_PaSST, IS_PaSST, IS_PANNs, KL_PaSST, KL_PANNs, LAION-CLAP, MS-CLAP, DeSync
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
[NeurIPS 2025] PyTorch implementation of [ThinkSound], a unified framework for generating audio from any modality, guided by Chain-of-Thought (CoT) reasoning.
[CVPR 2025] Adaptive Keyframe Sampling for Long Video Understanding
**Deep Video Discovery (DVD)** is a deep-research style question answering agent designed for understanding extra-long videos.
A GUI client for Windows, Linux, and macOS that supports Xray, sing-box, and others
Proxy setup adapted for AutoDL platform servers, using Clash as the proxy tool
✨✨[NeurIPS 2025] This is the official implementation of our paper "Video-RAG: Visually-aligned Retrieval-Augmented Long Video Comprehension"
Easy Data Preparation with latest LLMs-based Operators and Pipelines.
verl: Volcano Engine Reinforcement Learning for LLMs
Time-R1: Framework and resources for endowing LLMs with comprehensive temporal reasoning (understanding, prediction, creative generation) using a novel three-stage RL curriculum. Includes the Time-…
🔥CVPR 2025 Multimodal Large Language Models Paper List
Named Entity Recognition (NER) for biomedical research papers using BERT, BioBERT, BiLSTM, and CRF models. Implements deep learning and reinforcement learning to enhance medical text extraction accu…
Notebook for BERT medical named entity recognition
Repository for Publicly Available Clinical BERT Embeddings
BioBERT model fine-tuned for the NER task on the PubMed dataset
[Lanqiao Cup Python Sprint Course] Video collection: https://space.bilibili.com/398421867/lists?sid=4898042&spm_id_from=333.788.0.0
Code accompanying my PyTorch study notes; click to view the online e-book of the PyTorch notes
✔ (Completed) The most comprehensive deep learning notes [Tudui, PyTorch] [Mu Li, Dive into Deep Learning] [Andrew Ng, Deep Learning] [Dafei, LLM Agents]
Large Language-and-Vision Assistant for Biomedicine, built towards multimodal GPT-4 level capabilities.
An Open-source RL System from ByteDance Seed and Tsinghua AIR
BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays