Python 1 Updated Dec 7, 2021

Enhancing Zero-shot CLIP with Cross-Modality Attention

4 Updated Jan 13, 2022

The multi-view version of MonoDETR on nuScenes dataset

21 1 Updated Nov 4, 2022

[CVPR 2022] PointCLIP: Point Cloud Understanding by CLIP

Python 409 37 Updated Nov 24, 2022

[NeurIPS 2022] Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training

Python 226 27 Updated May 4, 2023

Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

92 6 Updated Jun 14, 2023

[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

45 2 Updated Jun 14, 2023

[CVPR 2023] Learning 3D Representations from 2D Pre-trained Models via Image-to-Point Masked Autoencoders

Python 230 18 Updated Aug 10, 2023

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

Jupyter Notebook 2 Updated Mar 23, 2024

[CVPR 2023] Parameter is Not All You Need: Starting from Non-Parametric Networks for 3D Point Cloud Analysis

Python 540 51 Updated Apr 9, 2024

Personalize Segment Anything Model (SAM) with 1 shot in 10 seconds

Python 1,664 114 Updated Jul 22, 2024

[ICLR 2025] Mathematical Visual Instruction Tuning for Multi-modal Large Language Models

155 1 Updated Dec 5, 2024

[ECCV 2024] Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems?

Python 181 16 Updated Apr 28, 2025

[ICCV 2023] The first DETR model for monocular 3D object detection with depth-guided transformer

Python 443 46 Updated Jul 15, 2025

HTML 1 Updated Aug 15, 2020

TRI-ML Monocular Depth Estimation Repository

Python 1 Updated May 28, 2021

Vectornet for trajectory prediction, implemented in PyTorch/Torch_geometric (WIP)

Python 1 Updated Jun 8, 2021

Waymo Open Dataset

C++ 1 Updated Jul 13, 2021

OpenMMLab Detection Toolbox and Benchmark

Python 1 Updated Jul 19, 2022

OpenMMLab Semantic Segmentation Toolbox and Benchmark.

Python 1 Updated Jul 19, 2022

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Python 1 Updated Apr 14, 2024

Official repository of T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT

Python 1 Updated Jul 29, 2025

[AAAI 2023] Zero-Shot Enhancement of CLIP with Parameter-free Attention

Python 93 2 Updated Apr 29, 2023

Reading list for research topics in Masked Image Modeling

3 Updated May 22, 2023

Awesome List of Masked Image Modeling (MIM) Papers for Self-supervised Visual Representation Learning

11 3 Updated Jun 4, 2023

Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 4 Updated Jun 5, 2023

✨✨Latest Papers and Datasets on Multimodal Large Language Models, and Their Evaluation.

6 3 Updated Jun 27, 2023

Align 3D Point Cloud with Multi-modalities for Large Language Models

Python 461 28 Updated Dec 9, 2023