
add styleclip (PaddlePaddle#643)
* add styleclip

* update 2022

* add weight url

* update doc & img url
ultranity authored Sep 1, 2022
1 parent 0541ace commit d1225d0
Showing 7 changed files with 1,061 additions and 61 deletions.
99 changes: 99 additions & 0 deletions applications/tools/styleganv2clip.py
@@ -0,0 +1,99 @@
# Copyright (c) 2022 PaddlePaddle Authors. All Rights Reserve.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import argparse

import paddle
from ppgan.apps import StyleGANv2ClipPredictor

if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--latent",
                        type=str,
                        help="path to first image latent codes")

    parser.add_argument("--neutral", type=str, help="neutral description")
    parser.add_argument("--target", type=str, help="target description")
    parser.add_argument("--beta_threshold",
                        type=float,
                        default=0.12,
                        help="beta threshold for channel editing")

    parser.add_argument("--direction_offset",
                        type=float,
                        default=5.0,
                        help="offset value of edited attribute")

    parser.add_argument("--direction_path",
                        type=str,
                        default=None,
                        help="path to latent editing directions")

    parser.add_argument("--output_path",
                        type=str,
                        default='output_dir',
                        help="path to output image dir")

    parser.add_argument("--weight_path",
                        type=str,
                        default=None,
                        help="path to model checkpoint")

    parser.add_argument("--model_type",
                        type=str,
                        default=None,
                        help="type of model for loading pretrained model")

    parser.add_argument("--size",
                        type=int,
                        default=1024,
                        help="resolution of output image")

    parser.add_argument("--style_dim",
                        type=int,
                        default=512,
                        help="dimension of the style vector")

    parser.add_argument("--n_mlp",
                        type=int,
                        default=8,
                        help="number of MLP layers in the mapping network")

    parser.add_argument("--channel_multiplier",
                        type=int,
                        default=2,
                        help="channel multiplier factor")

    parser.add_argument("--cpu",
                        dest="cpu",
                        action="store_true",
                        help="cpu mode.")

    args = parser.parse_args()

    if args.cpu:
        paddle.set_device('cpu')

    predictor = StyleGANv2ClipPredictor(
        output_path=args.output_path,
        weight_path=args.weight_path,
        model_type=args.model_type,
        seed=None,
        size=args.size,
        style_dim=args.style_dim,
        n_mlp=args.n_mlp,
        channel_multiplier=args.channel_multiplier,
        direction_path=args.direction_path)
    predictor.run(args.latent, args.neutral, args.target, args.direction_offset,
                  args.beta_threshold)
144 changes: 144 additions & 0 deletions docs/en_US/tutorials/styleganv2clip.md
@@ -0,0 +1,144 @@
# StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery

## Introduction

The task of StyleGAN V2 is image generation, while the CLIP-guided editing module uses attribute manipulation vectors obtained from the CLIP (Contrastive Language-Image Pre-training) model to map text prompts to input-agnostic directions in StyleGAN's style space, enabling interactive text-driven image manipulation.


This module uses a pretrained StyleGAN V2 generator and the Pixel2Style2Pixel model for image encoding. At present, only the portrait editing model (trained on the FFHQ dataset) is available.
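
Conceptually, the edit happens entirely in StyleGAN's style space. The snippet below is a minimal, self-contained sketch of the global-direction idea, not PaddleGAN's internal implementation: the tensor shapes and the `channel_relevance` mapping are illustrative assumptions, and in practice the precomputed direction file linked later in this document plays that role.

```
import paddle

# Sketch only: map the CLIP text delta (target prompt minus neutral prompt)
# to a per-channel relevance score, zero out channels below the beta
# threshold for disentanglement, then move the style code along the
# remaining direction by the requested offset.
def edit_style(style_s, clip_delta, channel_relevance, beta=0.12, alpha=5.0):
    relevance = paddle.matmul(channel_relevance, clip_delta)  # [num_channels]
    mask = paddle.cast(paddle.abs(relevance) >= beta, relevance.dtype)
    return style_s + alpha * relevance * mask

# Toy shapes; real channel counts depend on the generator configuration.
s = paddle.randn([6048])
delta = paddle.randn([512])
delta = delta / paddle.norm(delta)
relevance_map = paddle.randn([6048, 512])
edited = edit_style(s, delta, relevance_map)
```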

The Paddle-CLIP and dlib packages are needed for this module.

```
pip install -e .
pip install paddleclip
pip install dlib-bin
```

## How to use

### Editing

```
cd applications/
python -u tools/styleganv2clip.py \
 --latent <PATH TO STYLE VECTOR> \
--output_path <DIRECTORY TO STORE OUTPUT IMAGE> \
--weight_path <YOUR PRETRAINED MODEL PATH> \
--model_type ffhq-config-f \
--size 1024 \
--style_dim 512 \
--n_mlp 8 \
--channel_multiplier 2 \
--direction_path <PATH TO STORE ATTRIBUTE DIRECTIONS> \
--neutral <DESCRIPTION OF THE SOURCE IMAGE> \
--target <DESCRIPTION OF THE TARGET IMAGE> \
--beta_threshold 0.12 \
 --direction_offset 5 \
 --cpu
```

**params:**
- latent: The path of the style vector that represents an image; use `dst.npy` generated by Pixel2Style2Pixel or `dst.fitting.npy` generated by the StyleGANv2 Fitting module
- output_path: the directory where the generated images are stored
- weight_path: pretrained StyleGANv2 model path
- model_type: internal model type; currently only `ffhq-config-f` is available.
- direction_path: The path of CLIP mapping vector
- stat_path: The path of the latent statistics file
- neutral: Description of the source image, for example: face
- target: Description of the target image, for example: young face
- beta_threshold: editing threshold of the attribute channels
- direction_offset: Offset strength of the attribute
- cpu: run inference on CPU; remove this flag from the command to run on GPU

> Parameters inherited from the pretrained StyleGAN model
- size: model parameter, output image resolution
- style_dim: model parameter, dimension of the style vector z
- n_mlp: model parameter, number of multi-layer perceptron layers for style z
- channel_multiplier: model parameter, channel multiplier; affects model size and the quality of generated images
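
The CLI above is a thin wrapper, so the predictor can also be driven from Python. The call below mirrors how `applications/tools/styleganv2clip.py` constructs and runs `StyleGANv2ClipPredictor`; the latent file name and the prompts are placeholders to replace with your own.

```
import paddle
from ppgan.apps import StyleGANv2ClipPredictor

paddle.set_device('gpu')  # use 'cpu' to match the --cpu flag

# Arguments mirror the CLI defaults shown above.
predictor = StyleGANv2ClipPredictor(output_path='output_dir',
                                    weight_path=None,
                                    model_type='ffhq-config-f',
                                    seed=None,
                                    size=1024,
                                    style_dim=512,
                                    n_mlp=8,
                                    channel_multiplier=2,
                                    direction_path=None)
# run(latent, neutral, target, direction_offset, beta_threshold)
predictor.run('dst.npy', 'face', 'happy face', 5.0, 0.12)
```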

### Results

Input portrait:
<div align="center">
<img src="../../imgs/stylegan2fitting-sample.png" width="300"/>
</div>

With
> direction_offset = [-1, 0, 1, 2, 3, 4, 5]
> beta_threshold = 0.1

edit from 'face' to 'boy face':

![stylegan2clip-sample-boy](https://user-images.githubusercontent.com/29187613/187344690-6709fba5-6e21-4bc0-83d1-5996947c99a4.png)


edit from 'face' to 'happy face':

![stylegan2clip-sample-happy](https://user-images.githubusercontent.com/29187613/187344681-6509f01b-0d9e-4dea-8a97-ee9ca75d152e.png)


edit from 'face' to 'angry face':

![stylegan2clip-sample-angry](https://user-images.githubusercontent.com/29187613/187344686-ff5047ab-5499-420d-ad02-e0908ac71bf7.png)

edit from 'face' to 'face with long hair':

![stylegan2clip-sample-long-hair](https://user-images.githubusercontent.com/29187613/187344684-4e452631-52b0-47cf-966e-3216c0392815.png)



edit from 'face' to 'face with curly hair':

![stylegan2clip-sample-curl-hair](https://user-images.githubusercontent.com/29187613/187344677-c9a3aa9f-1f3c-41b3-a1f0-fcd48a9c627b.png)


edit from 'head with black hair' to 'head with gold hair':

![stylegan2clip-sample-gold-hair](https://user-images.githubusercontent.com/29187613/187344678-5220e8b2-b1c9-4f2f-8655-621b6272c457.png)

## Make Attribute Direction Vector

For details, please refer to [Puzer/stylegan-encoder](https://github.com/Puzer/stylegan-encoder/blob/master/Learn_direction_in_latent_space.ipynb)

Currently, pretrained weights for the `stylegan2` `ffhq-config-f` model are provided:

direction: https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f-styleclip-global-directions.pdparams

stats: https://paddlegan.bj.bcebos.com/models/stylegan2-ffhq-config-f-styleclip-stats.pdparams
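
Both files are standard `.pdparams` archives: download them and point `--direction_path` (and `stat_path`, where applicable) at the local copies. If you want to inspect their contents, they should load with `paddle.load`; the file names below assume the originals were kept.

```
import paddle

directions = paddle.load('stylegan2-ffhq-config-f-styleclip-global-directions.pdparams')
stats = paddle.load('stylegan2-ffhq-config-f-styleclip-stats.pdparams')
print(type(directions), type(stats))
```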

## Training

1. Extract the style latent vector statistics:
```
python styleclip_getf.py
```
2. Calculate the mapping vectors using the CLIP model:

```
python ppgan/apps/styleganv2clip_predictor.py extract
```

## Reference

- 1. [StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery](https://arxiv.org/abs/2103.17249)

```
@article{Patashnik2021StyleCLIPTM,
title={StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery},
author={Or Patashnik and Zongze Wu and Eli Shechtman and Daniel Cohen-Or and D. Lischinski},
journal={2021 IEEE/CVF International Conference on Computer Vision (ICCV)},
year={2021},
pages={2065-2074}
}
```
- 2. [Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation](https://arxiv.org/abs/2008.00951)

```
@article{richardson2020encoding,
title={Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation},
author={Richardson, Elad and Alaluf, Yuval and Patashnik, Or and Nitzan, Yotam and Azar, Yaniv and Shapiro, Stav and Cohen-Or, Daniel},
journal={arXiv preprint arXiv:2008.00951},
year={2020}
}
```