GANs1

Last updated: 2024-10-24 04:33:07


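Preface: GANs can be roughly divided into three categories according to how their output is steered: Controllable Generation, Random Generation, and Conditional Generation. Random generation gives no control over what is generated, whereas conditional and controllable generation do (for example, asking for a long-haired black cat). Conditional generation usually encodes the desired class as a one-hot vector, appends it to the noise during training, and then controls the output by choosing the one-hot vector at generation time. Controllable generation builds on random generation: it either inspects the features of generated samples and interpolates the noise accordingly, or adds a classifier to guide and filter the output.
Goal of this example: train a classifier and use it to steer the noise (the input), and thereby control the generated image.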
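To make the conditional case concrete, here is a minimal sketch of appending a one-hot class vector to the noise. The sizes (`z_dim = 64`, `n_classes = 3`) and the chosen labels are illustrative assumptions, not values taken from this post.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

n_samples, z_dim, n_classes = 4, 64, 3   # illustrative sizes (assumptions)

# Random noise, one row per sample
noise = torch.randn(n_samples, z_dim)

# Desired class for each sample, then its one-hot encoding
labels = torch.tensor([0, 2, 1, 2])
one_hot = F.one_hot(labels, num_classes=n_classes).float()

# A conditional generator would take this concatenated vector as its input
noise_and_labels = torch.cat([noise, one_hot], dim=1)
print(noise_and_labels.shape)   # torch.Size([4, 67])
```

Changing an entry of `labels` changes only the last `n_classes` columns, which is how a conditional generator is told what to produce.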

First, define the network structures of the generator and the classifier.

import torch
from torch import nn
from tqdm.auto import tqdm                  # progress bar
from torchvision import transforms
from torchvision.utils import make_grid     # image-grid visualization
from torchvision.datasets import CelebA     # CelebA training set
from torch.utils.data import DataLoader
import matplotlib.pyplot as plt
torch.manual_seed(0) # Set for our testing purposes, please do not change!

# Settings that the later code uses but the original post never defines (assumed values).
# z_dim must match the pretrained generator checkpoint.
z_dim = 64
batch_size = 128
device = 'cuda' if torch.cuda.is_available() else 'cpu'

def show_tensor_images(image_tensor, num_images=16, size=(3, 64, 64), nrow=3):
    '''
    Function for visualizing images: Given a tensor of images, number of images, and
    size per image, plots and prints the images in an uniform grid.
    '''
    image_tensor = (image_tensor + 1) / 2
    image_unflat = image_tensor.detach().cpu()
    image_grid = make_grid(image_unflat[:num_images], nrow=nrow)  # tile the images into one grid
    plt.imshow(image_grid.permute(1, 2, 0).squeeze())
    plt.show()

class Generator(nn.Module):
    '''
    Generator Class
    Values:
        z_dim: the dimension of the noise vector, a scalar
        im_chan: the number of channels of the output image, a scalar
              (CelebA is rgb, so 3 is our default)
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, z_dim=10, im_chan=3, hidden_dim=64):
        super(Generator, self).__init__()
        self.z_dim = z_dim
        # Build the neural network
        self.gen = nn.Sequential(
            self.make_gen_block(z_dim, hidden_dim * 8),
            self.make_gen_block(hidden_dim * 8, hidden_dim * 4),
            self.make_gen_block(hidden_dim * 4, hidden_dim * 2),
            self.make_gen_block(hidden_dim * 2, hidden_dim),
            self.make_gen_block(hidden_dim, im_chan, kernel_size=4, final_layer=True),
        )

    def make_gen_block(self, input_channels, output_channels, kernel_size=3, stride=2, final_layer=False):
        '''
        Function to return a sequence of operations corresponding to a generator block of DCGAN;
        a transposed convolution, a batchnorm (except in the final layer), and an activation.
        Parameters:
            input_channels: how many channels the input feature representation has
            output_channels: how many channels the output feature representation should have
            kernel_size: the size of each convolutional filter, equivalent to (kernel_size, kernel_size)
            stride: the stride of the convolution
            final_layer: a boolean, true if it is the final layer and false otherwise
                      (affects activation and batchnorm)
        '''
        if not final_layer:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.BatchNorm2d(output_channels),
                nn.ReLU(inplace=True),
            )
        else:
            return nn.Sequential(
                nn.ConvTranspose2d(input_channels, output_channels, kernel_size, stride),
                nn.Tanh(),
            )

    def forward(self, noise):
        '''
        Function for completing a forward pass of the generator: Given a noise tensor,
        returns generated images.
        Parameters:
            noise: a noise tensor with dimensions (n_samples, z_dim)
        '''
        x = noise.view(len(noise), self.z_dim, 1, 1)
        return self.gen(x)

def get_noise(n_samples, z_dim, device='cpu'):
    '''
    Function for creating noise vectors: Given the dimensions (n_samples, z_dim)
    creates a tensor of that shape filled with random numbers from the normal distribution.
    Parameters:
        n_samples: the number of samples in the batch, a scalar
        z_dim: the dimension of the noise vector, a scalar
        device: the device type
    '''
    return torch.randn(n_samples, z_dim, device=device)

class Classifier(nn.Module):
    '''
    Classifier Class
    Values:
        im_chan: the number of channels of the output image, a scalar
              (CelebA is rgb, so 3 is our default)
        n_classes: the total number of classes in the dataset, an integer scalar
        hidden_dim: the inner dimension, a scalar
    '''
    def __init__(self, im_chan=3, n_classes=2, hidden_dim=64):
        super(Classifier, self).__init__()
        self.classifier = nn.Sequential(
            self.make_classifier_block(im_chan, hidden_dim),
            self.make_classifier_block(hidden_dim, hidden_dim * 2),
            self.make_classifier_block(hidden_dim * 2, hidden_dim * 4, stride=3),
            self.make_classifier_block(hidden_dim * 4, n_classes, final_layer=True),
        )

    def make_classifier_block(self, input_channels, output_channels, kernel_size=4, stride=2, final_layer=False):
        '''
        Function to return a sequence of operations corresponding to a classifier block;
        a convolution, a batchnorm (except in the final layer), and an activation (except in the final layer).
        Parameters:
            input_channels: how many channels the input feature representation has
            output_channels: how many channels the output feature representation should have
            kernel_size: the size of each convolutional filter, equivalent to (kernel_size, kernel_size)
            stride: the stride of the convolution
            final_layer: a boolean, true if it is the final layer and false otherwise
                      (affects activation and batchnorm)
        '''
        if final_layer:
            return nn.Sequential(
                nn.Conv2d(input_channels, output_channels, kernel_size, stride),
            )
        else:
            return nn.Sequential(
                nn.Conv2d(input_channels, output_channels, kernel_size, stride),
                nn.BatchNorm2d(output_channels),
                nn.LeakyReLU(0.2, inplace=True),
            )

    def forward(self, image):
        '''
        Function for completing a forward pass of the classifier: Given an image tensor,
        returns an n_classes-dimension tensor of class scores.
        Parameters:
            image: an image tensor with im_chan channels
        '''
        class_pred = self.classifier(image)
        return class_pred.view(len(class_pred), -1)
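As a sanity check on the two architectures, the spatial sizes can be traced by hand. The sketch below applies the standard output-size formulas for unpadded `nn.ConvTranspose2d` and `nn.Conv2d` layers to the kernel sizes and strides used above; the helper names `conv_out` and `conv_transpose_out` are made up for this illustration.

```python
def conv_out(size, kernel_size, stride):
    """Output size of nn.Conv2d with no padding and dilation 1."""
    return (size - kernel_size) // stride + 1

def conv_transpose_out(size, kernel_size, stride):
    """Output size of nn.ConvTranspose2d with no padding or output_padding."""
    return (size - 1) * stride + kernel_size

# Generator: four blocks with kernel 3, then the final block with kernel 4, all stride 2
gen_sizes = [1]
for kernel_size in (3, 3, 3, 3, 4):
    gen_sizes.append(conv_transpose_out(gen_sizes[-1], kernel_size, stride=2))
print(gen_sizes)   # [1, 3, 7, 15, 31, 64]

# Classifier: kernel 4 everywhere, strides 2, 2, 3, 2
clf_sizes = [64]
for stride in (2, 2, 3, 2):
    clf_sizes.append(conv_out(clf_sizes[-1], kernel_size=4, stride=stride))
print(clf_sizes)   # [64, 31, 14, 4, 1]
```

So a `(n, z_dim, 1, 1)` noise tensor becomes a 64x64 image, and the classifier reduces a 64x64 image to a single spatial position holding one score per class.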

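At this point the classifier either has to be trained, or a pretrained classifier has to be loaded (choose one of the two).
- First, training it by hand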

# Train a classifier
def train_classifier(filename):
    import seaborn as sns
    import matplotlib.pyplot as plt

    # Target all the classes, so that's how many the classifier will learn
    label_indices = range(40)

    n_epochs = 3
    display_step = 500
    lr = 0.001
    beta_1 = 0.5
    beta_2 = 0.999
    image_size = 64

    # The transform resizes and crops the input images, converts them to tensors,
    # and normalizes them to [-1, 1]
    transform = transforms.Compose([
        transforms.Resize(image_size),
        transforms.CenterCrop(image_size),
        transforms.ToTensor(),
        transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
    ])

    # Load the training set
    dataloader = DataLoader(
        CelebA(".", split='train', download=True, transform=transform),
        batch_size=batch_size,
        shuffle=True)

    classifier = Classifier(n_classes=len(label_indices)).to(device)
    class_opt = torch.optim.Adam(classifier.parameters(), lr=lr, betas=(beta_1, beta_2))
    criterion = nn.BCEWithLogitsLoss()

    cur_step = 0
    classifier_losses = []
    # classifier_val_losses = []
    for epoch in range(n_epochs):
        # Dataloader returns the batches
        for real, labels in tqdm(dataloader):
            real = real.to(device)
            labels = labels[:, label_indices].to(device).float()

            class_opt.zero_grad()
            class_pred = classifier(real)
            class_loss = criterion(class_pred, labels)
            class_loss.backward() # Calculate the gradients
            class_opt.step() # Update the weights
            classifier_losses += [class_loss.item()] # Keep track of the average classifier loss

            ## Visualization code ##
            # Every 500 steps, print the mean loss and plot the loss curve
            if cur_step % display_step == 0 and cur_step > 0:
                class_mean = sum(classifier_losses[-display_step:]) / display_step
                print(f"Step {cur_step}: Classifier loss: {class_mean}")
                step_bins = 20
                x_axis = sorted([i * step_bins for i in range(len(classifier_losses) // step_bins)] * step_bins)
                # Keyword arguments: recent seaborn versions no longer accept x and y positionally
                sns.lineplot(x=x_axis, y=classifier_losses[:len(x_axis)], label="Classifier Loss")  # draw the line plot
                plt.legend()   # show the legend
                plt.show()
                torch.save({"classifier": classifier.state_dict()}, filename)
            cur_step += 1

train_classifier("Train_result")
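The classifier is trained with `nn.BCEWithLogitsLoss` because each CelebA image carries 40 independent yes/no attributes rather than a single class. As a rough illustration of what that loss computes with its default mean reduction, here is a plain-Python sketch; the function name `bce_with_logits` and the sample values are made up, and PyTorch itself uses a numerically more stable formulation.

```python
import math

def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over every (sample, attribute) entry."""
    total, count = 0.0, 0
    for row_logits, row_targets in zip(logits, targets):
        for x, y in zip(row_logits, row_targets):
            p = 1.0 / (1.0 + math.exp(-x))   # sigmoid turns the logit into a probability
            total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
            count += 1
    return total / count

# A logit of 0 means "50/50", so every entry costs log(2) whatever the label is
undecided = bce_with_logits([[0.0, 0.0, 0.0]], [[1.0, 0.0, 1.0]])
print(undecided)   # about 0.693

# Confident and correct predictions cost almost nothing
confident = bce_with_logits([[10.0, -10.0]], [[1.0, 0.0]])
print(confident)
```

Each attribute is scored on its own, so predicting "Smiling" correctly is not penalized for also predicting "Eyeglasses".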

Alternatively, load the pretrained models here

# Download the pretrained models from Google Drive
#!wget -q -O pretrained_celeba.pth "=1Hp3A7cAmCPWwPtKzp9sj91jF1szwioJS"
#!wget -q -O pretrained_classifier.pth "=1-8OsxfPiUHVB8vNtJCLrPDMMkuKKNDU7"
# Remove the leading "#" to download the pretrained models with the wget commands above, then load them.
# Loading kept failing for me, and I do not understand why.
import torch
gen = Generator(z_dim).to(device)
gen_dict = torch.load("pretrained_celeba.pth", map_location=torch.device(device))["gen"]  # pick the generator weights out of the checkpoint
gen.load_state_dict(gen_dict)
gen.eval()    # switch to eval mode

n_classes = 40
classifier = Classifier(n_classes=n_classes).to(device)
class_dict = torch.load("pretrained_classifier.pth", map_location=torch.device(device))["classifier"]  # pick the classifier weights out of the checkpoint
classifier.load_state_dict(class_dict)
classifier.eval()
print("Loaded the models!")

opt = torch.optim.Adam(classifier.parameters(), lr=0.01)
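The post does not say what the loading error was. One possible cause, and this is only an assumption, is that the download saved something other than the checkpoint, such as an HTML page. The sketch below inspects the first bytes of each file to tell these cases apart; `describe_download` is a hypothetical helper, not part of PyTorch.

```python
def describe_download(path):
    """Guess what a downloaded file contains from its first bytes."""
    try:
        with open(path, "rb") as handle:
            head = handle.read(64).lstrip().lower()
    except FileNotFoundError:
        return "missing"
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html page (the download did not return the model file)"
    if head.startswith(b"pk"):
        # Zip magic bytes: torch.save has written zip archives by default since PyTorch 1.6
        return "zip archive (looks like a torch.save checkpoint)"
    return "unknown (possibly an older pickle-based checkpoint)"

for name in ("pretrained_celeba.pth", "pretrained_classifier.pth"):
    print(name, "->", describe_download(name))
```

If a file turns out to be an HTML page, the problem lies in the download step rather than in `torch.load`.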

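Now for the main course: how to control the generated result with the classifier. The noise vector is updated by gradient ascent on the classifier score (the gradient is added, so the score goes up): new = old + (∇ old * weight)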

def calculate_updated_noise(noise, weight):
    # Gradient-based update of the noise: step along the gradient stored in noise.grad
    # Get the new noise
    new_noise = noise + noise.grad * weight
    return new_noise
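To see the update rule in isolation, here is a minimal sketch that replaces the generator and classifier with a toy linear score. `direction` and `toy_score` are hypothetical stand-ins for `classifier(gen(noise))[:, target].mean()`, and unlike the loop in this post the sketch resets `noise.grad` before every step.

```python
import torch

torch.manual_seed(0)

z_dim = 8
direction = torch.randn(z_dim)       # hypothetical fixed "classifier" weights

def toy_score(noise):
    # Stand-in for the mean target-class score of the generated images
    return (noise @ direction).mean()

noise = torch.randn(4, z_dim).requires_grad_()
before = toy_score(noise).item()

grad_steps = 10
for _ in range(grad_steps):
    if noise.grad is not None:
        noise.grad.zero_()           # start each step from a fresh gradient
    score = toy_score(noise)
    score.backward()                 # fills noise.grad
    with torch.no_grad():
        noise += noise.grad * (1 / grad_steps)   # ascent: new = old + grad * weight

after = toy_score(noise).item()
print(before, after)
```

Because the gradient is added rather than subtracted, `after` ends up larger than `before`; subtracting it would push the score down instead.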

Using the attributes predicted by the CelebA classifier, pick one of them to constrain what the generator produces.

# First generate a bunch of images with the generator
n_images = 8
fake_image_history = []
grad_steps = 10 # Number of gradient steps to take
skip = 2 # Number of gradient steps to skip in the visualization

feature_names = ["5oClockShadow", "ArchedEyebrows", "Attractive", "BagsUnderEyes", "Bald", "Bangs",
"BigLips", "BigNose", "BlackHair", "BlondHair", "Blurry", "BrownHair", "BushyEyebrows", "Chubby",
"DoubleChin", "Eyeglasses", "Goatee", "GrayHair", "HeavyMakeup", "HighCheekbones", "Male",
"MouthSlightlyOpen", "Mustache", "NarrowEyes", "NoBeard", "OvalFace", "PaleSkin", "PointyNose",
"RecedingHairline", "RosyCheeks", "Sideburn", "Smiling", "StraightHair", "WavyHair", "WearingEarrings",
"WearingHat", "WearingLipstick", "WearingNecklace", "WearingNecktie", "Young"]

### Change me! ###
target_indices = feature_names.index("Smiling") # Feel free to change this value to any string from feature_names!

noise = get_noise(n_images, z_dim).to(device).requires_grad_()
for i in range(grad_steps):
    opt.zero_grad()
    fake = gen(noise)
    fake_image_history += [fake]
    fake_classes_score = classifier(fake)[:, target_indices].mean()
    # backward() fills noise.grad. The classifier itself is never updated: opt.step() is not
    # called, and opt.zero_grad() only clears the gradients stored on the classifier's parameters.
    # noise.grad is not reset here, so it accumulates across steps.
    fake_classes_score.backward()
    noise.data = calculate_updated_noise(noise, 1 / grad_steps)

# Set the figure size for display
plt.rcParams['figure.figsize'] = [n_images * 2, grad_steps * 2]
show_tensor_images(torch.cat(fake_image_history[::skip], dim=2), num_images=n_images, nrow=n_images)

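In theory this produces face images that satisfy the classifier. In practice each noise update changes more than the smile: other attributes shift as well (for example, adding a smile also turns short hair into long hair), because the features are entangled.
Disentangling method: an L2 regularization penalty. score = target score + penalty term, where the penalty term is the negated L2 norm of the change in the other classes' scores, multiplied by a penalty weight.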

def get_score(current_classifications, original_classifications, target_indices, other_indices, penalty_weight):
    '''
    Function to return the score of the current classifications, penalizing changes
    to other classes with an L2 norm.
    Parameters:
        current_classifications: the classifications associated with the current noise
        original_classifications: the classifications associated with the original noise
        target_indices: the index of the target class
        other_indices: the indices of the other classes
        penalty_weight: the amount that the penalty should be weighted in the overall score
    '''
    # Steps: 1) Calculate the change between the original and current classifications (as a tensor)
    #           by indexing into the other_indices you're trying to preserve, like in x[:, features].
    #        2) Calculate the norm (magnitude) of changes per example.
    #        3) Multiply the mean of the example norms by the penalty weight.
    #           This will be your other_class_penalty.
    #           Make sure to negate the value since it's a penalty!
    #        4) Take the mean of the current classifications for the target feature over all the examples.
    #           This mean will be your target_score.
    #### START CODE HERE ####
    # Calculate the change between the original and current classifications for the other classes
    other_distances = current_classifications[:, other_indices] - original_classifications[:, other_indices]
    # Calculate the norm (magnitude) of changes per example and multiply by penalty weight
    other_class_penalty = -torch.norm(other_distances, dim=1).mean() * penalty_weight
    # Take the mean of the current classifications for the target feature
    target_score = current_classifications[:, target_indices].mean()
    #### END CODE HERE ####
    return target_score + other_class_penalty
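A small worked example may help with the arithmetic of the penalized score. The sketch below recomputes the same quantity in plain Python on made-up numbers; `penalized_score` is a hypothetical helper that takes nested lists instead of tensors.

```python
import math

def penalized_score(current, original, target_index, penalty_weight):
    """Mean target score minus penalty_weight times the mean L2 change in the other scores."""
    n_classes = len(current[0])
    other = [i for i in range(n_classes) if i != target_index]
    norms = [
        math.sqrt(sum((cur[i] - orig[i]) ** 2 for i in other))
        for cur, orig in zip(current, original)
    ]
    target_score = sum(row[target_index] for row in current) / len(current)
    return target_score - penalty_weight * sum(norms) / len(norms)

original = [[0.0, 1.0, 2.0]]
current = [[1.0, 4.0, 6.0]]   # target (index 0) is now 1.0; the others moved by (3, 4)
result = penalized_score(current, original, target_index=0, penalty_weight=0.1)
print(result)   # 1.0 - 0.1 * 5.0 = 0.5
```

The other two scores moved by a vector of length 5, so half of the target score of 1.0 is cancelled by the penalty. A larger `penalty_weight` keeps the other attributes steadier at the cost of slower progress on the target.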

Noise updates with the disentangling penalty

fake_image_history = []
### Change me! ###
target_indices = feature_names.index("Smiling") # Feel free to change this value to any string from feature_names from earlier!
other_indices = [cur_idx != target_indices for cur_idx, _ in enumerate(feature_names)]
noise = get_noise(n_images, z_dim).to(device).requires_grad_()
original_classifications = classifier(gen(noise)).detach()
for i in range(grad_steps):
    opt.zero_grad()
    fake = gen(noise)
    fake_image_history += [fake]
    fake_score = get_score(
        classifier(fake),
        original_classifications,
        target_indices,
        other_indices,
        penalty_weight=0.1
    )
    fake_score.backward()
    noise.data = calculate_updated_noise(noise, 1 / grad_steps)

plt.rcParams['figure.figsize'] = [n_images * 2, grad_steps * 2]
show_tensor_images(torch.cat(fake_image_history[::skip], dim=2), num_images=n_images, nrow=n_images)

Published: 2024-03-04 10:44:13

Article link: https://www.elefans.com/category/jswz/34/1709048.html