
(Pixel2Pixel)Image-to-Image translation with conditional adversarial networks

Date: 2019-07-09 20:43:59

Introduction

1. The paper develops a common framework for all problems that amount to predicting pixels from pixels.

2. CNNs learn to minimize a loss function (an objective that scores the quality of results), and although the learning process is automatic, a lot of manual effort still goes into designing effective losses.

3. When a CNN minimizes the Euclidean (L2) distance between predicted and ground-truth pixels, it tends to produce blurry results.

Why? Because the L2 distance is minimized by averaging all plausible outputs, which causes blurring.

4. GANs learn a loss that tries to classify whether the output image is real or fake; blurry images will not be tolerated, since they look obviously fake!

5. They apply cGANs to image-to-image translation tasks, conditioning on the input image to generate a corresponding output image.


 

Related work

1. Image-to-image translation problems are often formulated as per-pixel classification or regression. But these formulations treat the output space as "unstructured": each output pixel is considered conditionally independent from all the others given the input image. (independence!)

2. Conditional GANs instead learn a structured loss.

3. cGANs differ in that the loss is learned; in theory, it can penalize any possible structure that differs between output and target.

4. The choices of generator and discriminator architecture:

For G: a 'U-Net'-based architecture.

For D: a PatchGAN classifier, which penalizes structure at the scale of image patches.

The purpose of the PatchGAN is to capture local style statistics.
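To make the patch idea concrete, here is a minimal PatchGAN-style discriminator sketch in PyTorch. The layer widths and depths are illustrative, not the paper's exact 70x70 configuration; the point is that the output is a grid of per-patch real/fake logits rather than a single scalar.

```python
import torch
import torch.nn as nn

# Minimal PatchGAN-style discriminator sketch (layer sizes illustrative).
# It maps a conditioned (input, output) image pair to an N x N grid of
# real/fake scores, one per local image patch.
class PatchDiscriminator(nn.Module):
    def __init__(self, in_channels=6):  # 3 (input) + 3 (output) channels
        super().__init__()
        def block(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 4, 2, 1),
                nn.LeakyReLU(0.2, inplace=True),
            )
        self.net = nn.Sequential(
            block(in_channels, 64),
            block(64, 128),
            block(128, 256),
            nn.Conv2d(256, 1, 4, 1, 1),  # one logit per patch
        )

    def forward(self, x, y):
        # Condition on the input x by concatenating it with the
        # (real or generated) output y along the channel axis.
        return self.net(torch.cat([x, y], dim=1))

D = PatchDiscriminator()
x = torch.randn(1, 3, 256, 256)   # input image (e.g. an edge map)
y = torch.randn(1, 3, 256, 256)   # output image
patch_scores = D(x, y)
print(patch_scores.shape)  # torch.Size([1, 1, 31, 31])
```

Each of the 31x31 logits sees only a limited receptive field of the input pair, which is what restricts the discriminator's attention to local structure.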


 

Method

1. The overall framework: conditional GANs learn a mapping from an observed image $x$ and a random noise vector $z$ to the ground-truth output $y$: $G: \{x, z\} \rightarrow y$.

[Figure: conditional GAN mapping an input edge map to an output photo]

2. Unlike an unconditional GAN, both the generator and discriminator observe the input edge map.

3. objective function:

$\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}[\log D(x, y)] + \mathbb{E}_{x,z}[\log(1 - D(x, G(x, z)))]$

G tries to minimize this objective against an adversarial D that tries to maximize it.
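As a sketch of the two sides of this minimax game in PyTorch (the function names and the stand-in `G` and `D` below are illustrative, not the paper's implementation; `D(x, y)` is assumed to return the probability that the pair is real):

```python
import torch

# Sketch of the cGAN minimax objective. D maximizes
# log D(x, y) + log(1 - D(x, G(x, z))); G minimizes the second term.
def d_loss(D, G, x, y, z):
    # D's loss is the negation of its objective (binary cross-entropy form).
    real = D(x, y)
    fake = D(x, G(x, z).detach())   # do not backprop into G here
    return -(torch.log(real) + torch.log(1 - fake)).mean()

def g_loss(D, G, x, z):
    # G minimizes log(1 - D(x, G(x, z))).
    fake = D(x, G(x, z))
    return torch.log(1 - fake).mean()

# Toy stand-ins to exercise the losses:
G = lambda x, z: x + z
D = lambda x, y: torch.sigmoid((x * y).mean())
x, y, z = torch.rand(4), torch.rand(4), torch.rand(4)
print(float(d_loss(D, G, x, y, z)) > 0,   # True: BCE-style loss is positive
      float(g_loss(D, G, x, z)) < 0)      # True: log of a value in (0, 1)
```

In practice the non-saturating form, maximizing $\log D(x, G(x, z))$ for G, is often used instead of minimizing $\log(1 - D(x, G(x, z)))$, since it gives stronger early gradients.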

4. They test the importance of conditioning the discriminator by comparing against an unconditional variant in which the discriminator does not observe x (the edge map):

$\mathcal{L}_{GAN}(G, D) = \mathbb{E}_{y}[\log D(y)] + \mathbb{E}_{x,z}[\log(1 - D(G(x, z)))]$

5. It is beneficial to mix the GAN objective with a more traditional loss, such as the L2 distance.

6. G is then tasked not only with fooling the discriminator but also with being near the ground-truth output in an L2 sense.

7. The L1 distance is used for the additional loss rather than L2, as L1 encourages less blurring. (remember it!)

8. $\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}[\lVert y - G(x, z) \rVert_1]$

Final objective:

$G^{*} = \arg\min_G \max_D \mathcal{L}_{cGAN}(G, D) + \lambda \mathcal{L}_{L1}(G)$
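A sketch of the combined generator-side loss (the helper name and its arguments are illustrative; $\lambda = 100$ is the weighting used in the paper's experiments):

```python
import torch

# Sketch of the combined generator loss: cGAN term + lambda * L1 term.
# `d_fake_prob` stands for D(x, G(x, z)); names are illustrative.
def generator_objective(d_fake_prob, fake_y, real_y, lam=100.0):
    adv = torch.log(1 - d_fake_prob).mean()   # cGAN term that G minimizes
    l1 = torch.abs(real_y - fake_y).mean()    # L1 term: encourages sharpness
    return adv + lam * l1

# With a perfect reconstruction (L1 term = 0) and D undecided (prob 0.5),
# the loss reduces to log(0.5):
y = torch.ones(2, 3)
loss = generator_objective(torch.tensor(0.5), y, y)
print(round(float(loss), 4))  # -0.6931
```

The large $\lambda$ means the L1 term dominates early in training, anchoring G to the ground truth while the adversarial term pushes it toward sharp, realistic detail.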

9. Without $z$ (the random noise vector), the net could still learn a mapping from $x$ to $y$, but it would produce deterministic outputs and therefore fail to match any distribution other than a delta function.

10. As for $z$: Gaussian noise was often used in the past, but the authors find this strategy ineffective, as G simply learns to ignore the noise. Instead, noise is provided only in the form of dropout, applied at both training and test time; even so, only minor stochasticity is observed in the nets' outputs.
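A minimal illustration of this dropout-as-noise behavior (the layer here is a stand-in, not the paper's generator):

```python
import torch
import torch.nn as nn

# Keeping a Dropout layer in train mode at inference makes the output
# stochastic: each forward pass samples a different dropout mask.
drop = nn.Dropout(p=0.5)
drop.train()                      # dropout active, as at training time
x = torch.ones(1, 64)
out1, out2 = drop(x), drop(x)
stochastic = not torch.equal(out1, out2)  # different masks per call

drop.eval()                       # standard inference: dropout is identity
deterministic = torch.equal(drop(x), x)
print(stochastic, deterministic)  # True True
```

This is why pix2pix applies dropout at test time: switching the generator's dropout layers to eval mode would collapse the output back to a single deterministic image per input.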


 Network Architecture

1. The generator and discriminator architectures are adapted from those in DCGANs.


For G: U-Net; DCGAN modules; encoder-decoder; bottleneck; shuttling information across the net.

The job:

1. Mapping a high-resolution input grid to a high-resolution output grid.

2. Although the input and output differ in surface appearance, both are renderings of the same underlying structure.

The characteristic:

Structure in the input is roughly aligned with structure in the output.

The previous approach:

1. An encoder-decoder network is applied.

2. The net downsamples until a bottleneck layer, then switches to upsampling.

The problem:

1. A great deal of low-level information is shared between the input and output, and shuttling this information directly across the net is desirable. For example, in image colorization, the input and output share the locations of prominent edges.

The solution:

To give the generator a means to circumvent the bottleneck for information like this, skip connections are added; this architecture is called a 'U-Net'.

[Figure: encoder-decoder vs. U-Net generator architectures]
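A toy two-level version of this skip-connection idea (channel sizes and depth are illustrative, far smaller than the paper's generator):

```python
import torch
import torch.nn as nn

# Tiny U-Net sketch: each encoder activation is concatenated with the
# decoder activation at the mirrored scale, shuttling low-level
# information (e.g. edge locations) past the bottleneck.
class TinyUNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.enc1 = nn.Conv2d(3, 16, 4, 2, 1)             # 256 -> 128
        self.enc2 = nn.Conv2d(16, 32, 4, 2, 1)            # 128 -> 64 (bottleneck)
        self.dec2 = nn.ConvTranspose2d(32, 16, 4, 2, 1)   # 64 -> 128
        self.dec1 = nn.ConvTranspose2d(16 + 16, 3, 4, 2, 1)  # skip doubles channels
        self.act = nn.ReLU()

    def forward(self, x):
        e1 = self.act(self.enc1(x))
        e2 = self.act(self.enc2(e1))
        d2 = self.act(self.dec2(e2))
        # Skip connection: concatenate the mirrored encoder features.
        return self.dec1(torch.cat([d2, e1], dim=1))

net = TinyUNet()
out = net(torch.randn(1, 3, 256, 256))
print(out.shape)  # torch.Size([1, 3, 256, 256])
```

Without the `torch.cat` skip, everything the decoder knows would have to squeeze through the 64x64 bottleneck; with it, aligned low-level structure flows straight from encoder to decoder.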


The results of different loss functions:

[Figure: qualitative results under different losses]

An L1 or L2 loss alone produces blurry results on image generation problems.


 


Source: https://www.cnblogs.com/ChenKe-cheng/p/11158248.html
