forked from PaddlePaddle/PaddleGAN

Commit: add styleclip; update 2022; add weight url; update doc & img url (7 changed files, 1,061 additions, 61 deletions)
```python
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse

import paddle
from ppgan.apps import StyleGANv2ClipPredictor

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--latent",
                        type=str,
                        help="path to the image latent codes")

    parser.add_argument("--neutral", type=str, help="neutral description")
    parser.add_argument("--target", type=str, help="target description")
    parser.add_argument("--beta_threshold",
                        type=float,
                        default=0.12,
                        help="beta threshold for channel editing")

    parser.add_argument("--direction_offset",
                        type=float,
                        default=5.0,
                        help="offset value of the edited attribute")

    parser.add_argument("--direction_path",
                        type=str,
                        default=None,
                        help="path to latent editing directions")

    parser.add_argument("--output_path",
                        type=str,
                        default='output_dir',
                        help="path to the output image dir")

    parser.add_argument("--weight_path",
                        type=str,
                        default=None,
                        help="path to the model checkpoint")

    parser.add_argument("--model_type",
                        type=str,
                        default=None,
                        help="type of model for loading a pretrained model")

    parser.add_argument("--size",
                        type=int,
                        default=1024,
                        help="resolution of the output image")

    parser.add_argument("--style_dim",
                        type=int,
                        default=512,
                        help="number of style dimensions")

    parser.add_argument("--n_mlp",
                        type=int,
                        default=8,
                        help="depth of the mlp layers")

    parser.add_argument("--channel_multiplier",
                        type=int,
                        default=2,
                        help="channel multiplier")

    parser.add_argument("--cpu",
                        dest="cpu",
                        action="store_true",
                        help="cpu mode.")

    args = parser.parse_args()

    if args.cpu:
        paddle.set_device('cpu')

    predictor = StyleGANv2ClipPredictor(
        output_path=args.output_path,
        weight_path=args.weight_path,
        model_type=args.model_type,
        seed=None,
        size=args.size,
        style_dim=args.style_dim,
        n_mlp=args.n_mlp,
        channel_multiplier=args.channel_multiplier,
        direction_path=args.direction_path)
    predictor.run(args.latent, args.neutral, args.target, args.direction_offset,
                  args.beta_threshold)
```
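The `--latent` argument points to a NumPy `.npy` file. A minimal sketch of preparing such a file, assuming a random array stands in for a real code produced by Pixel2Style2Pixel, and assuming the W+ code for a 1024-pixel generator has 18 style layers of dimension 512:

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for a real latent from Pixel2Style2Pixel (dst.npy):
# 18 style layers, 512-dimensional style vectors (assumed shape).
latent = np.random.randn(18, 512).astype("float32")

out_dir = tempfile.mkdtemp()
latent_path = os.path.join(out_dir, "dst.npy")
np.save(latent_path, latent)

# The script would receive latent_path via --latent and load it internally.
loaded = np.load(latent_path)
print(loaded.shape)  # (18, 512)
```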
# StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

## Introduction

StyleGAN V2 is an image generation model. The CLIP-guided editing module uses attribute manipulation vectors obtained with the CLIP (Contrastive Language-Image Pre-training) model to map text prompts to input-agnostic directions in StyleGAN's style space, enabling interactive text-driven image manipulation.

This module uses a pretrained StyleGAN V2 generator and the Pixel2Style2Pixel model for image encoding. At present, only a portrait editing model (trained on the FFHQ dataset) is available.

The Paddle-CLIP and dlib packages are required by this module:

```
pip install -e .
pip install paddleclip
pip install dlib-bin
```
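In the global-directions variant of StyleCLIP, the text-driven edit direction is derived from the CLIP embeddings of the neutral and target prompts: the normalized difference of the two embeddings gives an input-agnostic direction. A minimal NumPy sketch of that idea, with random vectors standing in for real CLIP text embeddings:

```python
import numpy as np

def text_edit_direction(e_neutral, e_target):
    """Normalized difference of two prompt embeddings: a sketch of the
    StyleCLIP global-directions idea (real embeddings come from CLIP)."""
    delta = e_target - e_neutral
    return delta / np.linalg.norm(delta)

# Random stand-ins for CLIP embeddings of, e.g., "face" and "young face".
rng = np.random.default_rng(0)
e_neutral = rng.standard_normal(512)
e_target = rng.standard_normal(512)

direction = text_edit_direction(e_neutral, e_target)
print(round(float(np.linalg.norm(direction)), 6))  # 1.0
```

The unit-norm direction is then translated into per-channel edits in StyleGAN's style space, which is what the provided `direction` weight file stores.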
|
||
## How to use | ||
|
||
### Editing | ||
|
||
``` | ||
cd applications/ | ||
python -u tools/styleganv2clip.py \ | ||
--latent <<PATH TO STYLE VECTOR> \ | ||
--output_path <DIRECTORY TO STORE OUTPUT IMAGE> \ | ||
--weight_path <YOUR PRETRAINED MODEL PATH> \ | ||
--model_type ffhq-config-f \ | ||
--size 1024 \ | ||
--style_dim 512 \ | ||
--n_mlp 8 \ | ||
--channel_multiplier 2 \ | ||
--direction_path <PATH TO STORE ATTRIBUTE DIRECTIONS> \ | ||
--neutral <DESCRIPTION OF THE SOURCE IMAGE> \ | ||
--target <DESCRIPTION OF THE TARGET IMAGE> \ | ||
--beta_threshold 0.12 \ | ||
--direction_offset 5 | ||
--cpu | ||
``` | ||
|
||
**params:** | ||
- latent: The path of the style vector which represents an image. Come from `dst.npy` generated by Pixel2Style2Pixel or `dst.fitting.npy` generated by StyleGANv2 Fitting module | ||
- output_path: the directory where the generated images are stored | ||
- weight_path: pretrained StyleGANv2 model path | ||
- model_type: inner model type, currently only `ffhq-config-f` is available. | ||
- direction_path: The path of CLIP mapping vector | ||
- stat_path: The path of latent statisitc file | ||
- neutral: Description of the source image,for example: face | ||
- target: Description of the target image,for example: young face | ||
- beta_threshold: editing threshold of the attribute channels | ||
- direction_offset: Offset strength of the attribute | ||
- cpu: whether to use cpu inference, if not, please remove it from the command | ||
|
||
>inherited params for the pretrained StyleGAN model | ||
- size: model parameters, output image resolution | ||
- style_dim: model parameters, dimensions of style z | ||
- n_mlp: model parameters, the number of multi-layer perception layers for style z | ||
- channel_multiplier: model parameters, channel product, affect model size and the quality of generated pictures | ||
|
||
### Results | ||
|
||
Input portrait: | ||
<div align="center"> | ||
<img src="../../imgs/stylegan2fitting-sample.png" width="300"/> | ||
</div> | ||
|
||
with | ||
> direction_offset = [ -1, 0, 1, 2, 3, 4, 5] | ||
> beta_threshold = 0.1 | ||
edit from 'face' to 'boy face': | ||
|
||
![stylegan2clip-sample-boy](https://user-images.githubusercontent.com/29187613/187344690-6709fba5-6e21-4bc0-83d1-5996947c99a4.png) | ||
|
||
|
||
edit from 'face' to 'happy face': | ||
|
||
![stylegan2clip-sample-happy](https://user-images.githubusercontent.com/29187613/187344681-6509f01b-0d9e-4dea-8a97-ee9ca75d152e.png) | ||
|
||
|
||
edit from 'face' to 'angry face': | ||
|
||
![stylegan2clip-sample-angry](https://user-images.githubusercontent.com/29187613/187344686-ff5047ab-5499-420d-ad02-e0908ac71bf7.png) | ||
|
||
edit from 'face' to 'face with long hair': | ||
|
||
![stylegan2clip-sample-long-hair](https://user-images.githubusercontent.com/29187613/187344684-4e452631-52b0-47cf-966e-3216c0392815.png) | ||
|
||
|
||
|
||
edit from 'face' to 'face with curly hair': | ||
|
||
![stylegan2clip-sample-curl-hair](https://user-images.githubusercontent.com/29187613/187344677-c9a3aa9f-1f3c-41b3-a1f0-fcd48a9c627b.png) | ||
|
||
|
||
edit from 'head with black hair' to 'head with gold hair': | ||
|
||
![stylegan2clip-sample-gold-hair](https://user-images.githubusercontent.com/29187613/187344678-5220e8b2-b1c9-4f2f-8655-621b6272c457.png) | ||
|
||
## Make Attribute Direction Vector

For details, please refer to [Puzer/stylegan-encoder](https://github.com/Puzer/stylegan-encoder/blob/master/Learn_direction_in_latent_space.ipynb).

Pretrained weights for the `stylegan2` model trained on the `ffhq-config-f` dataset are currently provided:

direction: https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f-styleclip-global-directions.pdparams

stats: https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f-styleclip-stats.pdparams

## Training

1. Extract the style latent vector stats:
```
python styleclip_getf.py
```
2. Calculate the mapping vector using the CLIP model:
```
python ppgan/apps/styleganv2clip_predictor.py extract
```

## Reference

- 1. [StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery](https://arxiv.org/abs/2103.17249)

```
@article{Patashnik2021StyleCLIPTM,
  title={StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery},
  author={Or Patashnik and Zongze Wu and Eli Shechtman and Daniel Cohen-Or and D. Lischinski},
  journal={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2021},
  pages={2065-2074}
}
```

- 2. [Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation](https://arxiv.org/abs/2008.00951)

```
@article{richardson2020encoding,
  title={Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation},
  author={Richardson, Elad and Alaluf, Yuval and Patashnik, Or and Nitzan, Yotam and Azar, Yaniv and Shapiro, Stav and Cohen-Or, Daniel},
  journal={arXiv preprint arXiv:2008.00951},
  year={2020}
}
```