Daming-W
🎯 Focusing


The real story of the hardship and darkness behind the development of the Noah Pangu large model.

11,366 1,348 Updated Jul 9, 2025

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python 4,295 328 Updated Dec 15, 2025

[ICCV 2025] Official code of "ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation"

Python 532 55 Updated Dec 10, 2025

The official implementation of [Quality over Quantity: Boosting Data Efficiency Through Ensembled Multimodal Data Curation] (AAAI 2025).

Python 6 Updated May 8, 2025
Python 6 1 Updated May 8, 2024

Closed-loop evaluation for end-to-end VLM autonomous driving agent

Python 24 4 Updated Mar 8, 2025

OpenEMMA, a permissively licensed open source "reproduction" of Waymo’s EMMA model.

Python 876 108 Updated May 13, 2025

HE-Drive: Human-Like End-to-End Driving with Vision Language Models

Python 249 16 Updated Aug 17, 2025

Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)

Python 64,338 7,794 Updated Dec 21, 2025
Python 542 42 Updated Jun 8, 2025

[CVPR 2024] LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Jupyter Notebook 831 70 Updated Apr 14, 2025

LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.

Python 3,103 217 Updated May 19, 2025
Python 4,461 435 Updated Sep 14, 2025

[ACL 2024 (Findings)] ICC: Quantifying Image Caption Concreteness for Multimodal Dataset Curation

Python 5 Updated Aug 24, 2024

This is an official PyTorch implementation of TCA-Net: Triplet Concatenated-Attentional Network for Multimodal Engagement Estimation.

Python 4 Updated Jun 23, 2024

[NeurIPS 2024] MoVA: Adapting Mixture of Vision Experts to Multimodal Context

Python 168 4 Updated Sep 25, 2024

A Framework of Small-scale Large Multimodal Models

Python 938 95 Updated Apr 26, 2025

Scenario Understanding with Visual-Question-Answering Based on Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

Python 1 Updated May 24, 2024

This research project has constructed a two-stage clustering-based retrieval framework, as well as a deep learning-based retrieval algorithm using the CLIP model, which demonstrates zero-shot abili…

Python 4 Updated Mar 4, 2024

LAVIS - A One-stop Library for Language-Vision Intelligence

Jupyter Notebook 11,076 1,088 Updated Nov 18, 2024
Python 9 1 Updated Jul 13, 2022
Python 6 1 Updated Sep 24, 2023

The most complete repository of AI algorithm interview questions: 1,000 questions in 25 categories

1,313 114 Updated Aug 13, 2023

📚 Essential fundamentals for technical interviews, Leetcode, computer operating systems, computer networks, system design

1 Updated Feb 23, 2022

This research presents probabilistic machine learning methods that optimize basic image ranking models and make them suitable for visual navigation and exploration tasks in complex real-world environments.

Jupyter Notebook 1 Updated Sep 15, 2022

Solving algorithm problems is all about patterns; labuladong is all you need! English version supported! Crack LeetCode, not only how, but also why.

Markdown 131,251 23,601 Updated Oct 8, 2025

Introductory deep learning tutorial and outstanding articles

Jupyter Notebook 16,991 3,844 Updated Apr 21, 2022