I am a third-year PhD student at Show Lab, National University of Singapore, working with Prof. Mike Shou. Before my PhD, I spent three years on label-efficient learning for scene understanding, focusing on weakly supervised object localization and semantic segmentation. In the first year of my PhD, I worked on visual prompt learning and effective, controllable image synthesis. I currently focus on unifying multimodal understanding and generation within a native unified multimodal model. I have pre-trained and post-trained two models, Show-o and Show-o2, with up to 7 billion trainable parameters on billion-scale datasets.
🎯
Focusing
PhD student at NUS.
- NUS, Tencent, SZU
- Singapore
- UTC +08:00
- https://sierkinhane.github.io/
Pinned
- showlab/Show-o: [ICLR 2025] Repository for Show-o series, One Single Transformer to Unify Multimodal Understanding and Generation.
- showlab/BoxDiff: [ICCV 2023] BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
- CVI-SZU/CCAM: [CVPR 2022] C2AM: Contrastive learning of Class-agnostic Activation Map for Weakly Supervised Object Localization and Semantic Segmentation
- showlab/VisorGPT: [NeurIPS 2023] Customize spatial layouts for conditional image synthesis models, e.g., ControlNet, using GPT
- CVI-SZU/CLIMS: [CVPR 2022] CLIMS: Cross Language Image Matching for Weakly Supervised Semantic Segmentation
- CRNN_Chinese_Characters_Rec: (CRNN) Chinese Characters Recognition.