π Iβm currently working on the topic of visual perception and my long-term goal is to build general foundation models.
β‘ Recently I'm focusing on vision-language model and unified visual models.
π« If you are also interested in relevant issues, feel free to chat with me!