Hi, welcome to my Github πŸ‘‹

I am Jiazheng Xu, a fourth-year PhD student at Tsinghua University.

  • πŸ”­ Interested in multimodal generative models, especially RLHF and alignment. Find my up-to-date publication list in Google Scholar!
  • 🌱 Some of my proud leading works about RLHF for multimodal generative models:
    • ImageReward (NeurIPS'23): the first general-purpose text-to-image human preference reward model (RM) for RLHF, outperforming CLIP/BLIP/Aesthetic by 30% in terms of human preference prediction.
    • CogVideoX (ICLR'25): a large-scale diffusion transformer models designed for generating videos based on text prompts.
    • VisionReward (AAAI'26): a fine-grained and multi-dimensional reward model for image and video generation, outperforming VideoScore by 17.2% and enabling multi-objective optimization, applied to the RLHF of CogVideoX, boosting preference by 30%.
  • 🌱 I'm also honored to work with the team on multimodal foundation models:
    • CogVLM (NeurIPS'24): a powerful open-source visual language model (VLM), which achieves state-of-the-art performance on 10 classic cross-modal benchmarks.
    • CogAgent (CVPR'24): a visual agent being able to return a plan, next action, and specific operations with coordinates for any given task on any GUI screenshot, enhancing GUI-related question-answering capabilities.
    • GLM-4.1V-Thinking / GLM-4.5V / GLM-4.6V: a series of VLMs with reasoning paradigms, which achieved the strongest performance upon release.
  • πŸ’¬ Feel free to drop me an email for:
    • Any form of collaboration
    • Any issue about my works or code
    • Interesting ideas to discuss or just chatting
