Hey 👋🏽, I'm cpuimage
Hi, I am ZhiHan Gao, living in Shantou, China.
I specialize in developing audio, video, and image processing algorithms, and I share my open-source projects on GitHub. If you find my projects useful, please consider buying me a coffee. Your support is greatly appreciated!
Professional Experience
- 👨🏽‍💻 I have worked at leading tech companies, including Baidu and KingSoft.
- 🌱 Developed algorithms for multiple applications.
- 💡 Delivered AI-based technical customization services and successfully implemented several AI projects.
Research Progress and Achievements
- 🌱 Here are some of my past research endeavors and achievements in deep learning and statistical algorithms:
- Deep Learning
- A Trimap-Free Solution for Real-Time Automatic Portrait Matting on Mobile Devices
- A Robust Optimizer With Accelerated Convergence Capability in Deep Learning
- A General and Adaptive Robust Loss Structure Scheme
- A Robust Loss Weighting Solution For Learning Long-Tail Data
- Image Synthesis and Semantic Manipulation Using Stable Diffusion Networks
- Stable Diffusion Architecture Optimization And Deployment On Mobile Devices
- A Robust Solution For Accelerated Training Convergence And Learning Long-Tail Data
- An Arbitrary-Resolution Super-Resolution Solution for the Real World
- Accelerate Stable Diffusion FP16 Inference Deployment Optimization with TensorRT
- Port Stable Diffusion X4 Upscaler To TensorFlow And Support FP16 Inference Deployment
- Port Stable Diffusion PromptGen (GPT2) To TensorFlow And Support ONNX Inference Deployment
- Stable Diffusion Architectural Distillation
- Content-aware 3-view synthesis based on Stable Diffusion in Game Art
- Super Resolution Solution based on Stable Diffusion
- Video Editing techniques based on Stable Diffusion
- Port Stable Diffusion XL 1.0 To TensorFlow And Support FP16 Inference Deployment
- A Plug-And-Play Algorithm For Asynchronous Inference With Frequency-Domain Decomposable Reconstruction For Arbitrary Visual Scenes
- Stable Diffusion Inference With PyTorch Weights And More Features Like Stable Diffusion Web UI In Keras 3.x
- FLUX.1 Support FP16 Inference Deployment and Low Memory LoRA Training In PyTorch
- LLM from Scratch with PyTorch
- Enhanced FaceFusion: Decoupled Modules and Optimized Inference for Visual Performance
- Ultra High-Resolution Portrait Retouching
- Training-Free Universal High-Resolution Synthesis for Any Video Model
- Chunked Flash Attention in Keras
- Robustness and Speed, Effortlessly: An Adaptive, Efficient Optimizer for Stable Training
  - Learning-Rate-Free
  - Warmup-Free
  - Normalization-Free
  - Corrected Gradient Accumulation: Large-Batch-Equivalent Performance
  - Long-Tailed Gradient Mitigation
  - Accelerated Convergence
  - Memory-Efficient
- Loss Regularization: A Novel Approach to Enhance Model Generalization and Convergence
- A Simple Yet Effective Approach to Multi-Task Learning via Dynamic Loss Weighting
- A Parameter-Free Weight Regularization Approach
- Towards Stable Batch Normalization via Adaptive Moving Averages
- AdamSage: An Adaptive Optimizer for Mixed-Precision and Large-Batch Training
  - Full PyTorch AdamW Inheritance: A drop-in replacement requiring zero code changes.
  - Corrected Gradient Accumulation: Delivers true large-batch-equivalent performance.
  - Learning-Rate-Free: Eliminates a critical hyperparameter.
  - Unified Closure & AMP GradScaler Support: Ensures seamless mixed-precision training.
  - Memory-Efficient Training
- Statistical Algorithms
- Real-time and embedded implementation of speech enhancement algorithms based on Minimum Mean-Square Error Short-Time Spectral Amplitude (MMSE-STSA) estimation
Collaboration and Contact
- 👯 I'm looking to collaborate on audio and image algorithms
- 💬 I'm available for paid technical services and solution consulting