hekj
  • Institute of Automation, Chinese Academy of Sciences
  • BEIJING, CHINA


Official implementation of Human-Aware Vision-and-Language Navigation: Bridging Simulation to Reality with Dynamic Human Interactions (NeurIPS DB Track'24 Spotlight).

C++ 54 7 Updated Dec 20, 2024

Official implementation of WebVLN: Vision-and-Language Navigation on Websites

Python 35 2 Updated Jan 2, 2024

Everyday Object Disrupts Vision-and-Language Navigation Agent via Backdoor (VLN-ATT)

Python 9 Updated Dec 18, 2024

[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free

Python 964 51 Updated Jun 27, 2024

A human-annotated, fine-grained dataset for Vision-and-Language Navigation

13 1 Updated Jan 20, 2022

Official implementation of Frequency-enhanced Data Augmentation for Vision-and-Language Navigation (NeurIPS 2023)

Python 14 Updated Jan 8, 2024

Inpaint anything using Segment Anything and inpainting models.

Jupyter Notebook 7,606 657 Updated Feb 29, 2024

[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"

Python 433 35 Updated Apr 5, 2025

Official implementation of Think Global, Act Local: Dual-scale Graph Transformer for Vision-and-Language Navigation (CVPR'22 Oral).

Python 262 19 Updated Jun 27, 2023

Room-across-Room (RxR) is a large-scale, multilingual dataset for Vision-and-Language Navigation (VLN) in Matterport3D environments. It contains 126k navigation instructions in English, Hindi and T…

Python 176 14 Updated Jul 26, 2023

The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…

Python 36,564 5,142 Updated Mar 23, 2026

Code and Data of the CVPR 2022 paper: Bridging the Gap Between Learning in Discrete and Continuous Environments for Vision-and-Language Navigation

Python 149 12 Updated Oct 31, 2023

Code for NeurIPS 2021 paper "Curriculum Learning for Vision-and-Language Navigation"

Python 15 1 Updated Dec 13, 2022

A curated list of Multimodal Related Research.

Python 1,390 147 Updated Aug 5, 2023

CVPR 2024/2023/2022/2021/2020/2019/2018/2017: a collection of papers, code, explainers, and livestreams, compiled by the 极市 team

12,507 2,251 Updated Apr 25, 2024

Official implementation of History Aware Multimodal Transformer for Vision-and-Language Navigation (NeurIPS'21).

Python 143 14 Updated Jun 14, 2023

Recent Transformer-based CV and related works.

1,338 142 Updated Aug 22, 2023

Reading list for research topics in multimodal machine learning

6,844 897 Updated Aug 20, 2024

PyTorch code for ICRA'21 paper: "Hierarchical Cross-Modal Agent for Robotics Vision-and-Language Navigation"

Python 89 8 Updated Jun 27, 2024

Know What and Know Where: An Object-and-Room Informed Sequential BERT for Indoor Vision-Language Navigation

C++ 16 2 Updated Feb 7, 2022

Codebase for the Airbert paper

Python 46 7 Updated Mar 20, 2023

Reading list for research topics in embodied vision

704 78 Updated Jun 13, 2025

Code of the CVPR 2021 Oral paper: A Recurrent Vision-and-Language BERT for Navigation

Python 202 36 Updated Aug 13, 2022

awesome grounding: A curated list of research papers in visual grounding

1,128 105 Updated Sep 21, 2025

Reading list for research topics in embodied vision

1 Updated Jul 16, 2021

[ACM MM 2021 Oral] Official repo of "Neighbor-view Enhanced Model for Vision and Language Navigation"

C++ 78 2 Updated Nov 16, 2022

PyTorch implementation of CVPR 2021 paper "MUST-GAN: Multi-level Statistics Transfer for Self-driven Person Image Generation"

Python 75 17 Updated Jul 28, 2021

Evaluation code for various unsupervised automated metrics for Natural Language Generation.

Python 1,391 226 Updated Aug 20, 2024

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

1,155 104 Updated Aug 19, 2022