- Xidian University
- Xi'an, China
Stars
Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.
Fixes the following messages that Cursor shows during the free trial: Your request has been blocked as our system has detected suspicious activity / You've reached your trial request limit. / Too many free trial accounts used on this machine.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
A curated collection of fun and creative examples generated with Nano Banana🍌, Gemini-2.5-flash-image based model. We also release Nano-consistent-150K openly to support the community's development…
Wan: Open and Advanced Large-Scale Video Generative Models
PyTorch code and models for the DINOv2 self-supervised learning method.
Reference PyTorch implementation and models for DINOv3
⏰ Collaboratively track worldwide conference deadlines (Website, Python Cli, Wechat Applet) / If you find it useful, please star this project, thanks~
Open Source Image and Video Restoration Toolbox for Super-resolution, Denoise, Deblurring, etc. Currently, it includes EDSR, RCAN, SRResNet, SRGAN, ESRGAN, EDVR, BasicVSR, SwinIR, ECBSR, etc. Also …
Awesome curated collection of images and prompts generated by gemini-2.5-flash-image (aka Nano Banana) state-of-the-art image generation and editing model. Explore AI generated visuals created with…
OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340
This list of writing prompts covers a range of topics and tasks, including brainstorming research ideas, improving language and style, conducting literature reviews, and developing research plans.
Edit anything in images powered by segment-anything, ControlNet, StableDiffusion, etc. (ACM MM)
Collect super-resolution related papers, data, repositories
👁️ 🖼️ 🔥PyTorch Toolbox for Image Quality Assessment, including PSNR, SSIM, LPIPS, FID, NIQE, NRQM(Ma), MUSIQ, TOPIQ, NIMA, DBCNN, BRISQUE, PI and more...
Project Page for "LISA: Reasoning Segmentation via Large Language Model"
GPT4V-level open-source multi-modal model based on Llama3-8B
🦁 Lion, new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(w), in Pytorch
Monkey (LMM): Image Resolution and Text Label Are Important Things for Large Multi-modal Models (CVPR 2024 Highlight)
A comprehensive collection of IQA papers
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
③[ICML2024] [IQA, IAA, VQA] All-in-one Foundation Model for visual scoring. Can efficiently fine-tune to downstream datasets.
IQA: Deep Image Structure and Texture Similarity Metric
[AAAI 2023] Exploring CLIP for Assessing the Look and Feel of Images
[IEEE TPAMI] A Survey on All-in-One Image Restoration: Taxonomy, Evaluation and Future Trends
CVPR 2025: Frequency Dynamic Convolution for Dense Image Prediction