Created
February 11, 2018 10:19
07.Style-Transfer
{ | |
"cells": [ | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Neural Style Transfer" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"An application that applies the style of a reference image to a target image while preserving the content of the original.\n", | |
"\n", | |
"Style = textures, colors, and visual patterns\n", | |
"\n", | |
"- Content = the higher-level macrostructure of the image\n", | |
"- Loss = distance(style(reference) - style(generated)) + distance(content(original) - content(generated))\n", | |
"- Distance = L2 norm" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## The Content Loss" | |
] | |
}, | |
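{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"- Activations from the earlier layers of a network contain local information about the image.\n", | |
"- Activations from later layers contain increasingly global and abstract information.\n", | |
"- Put differently, the activations of the different layers of a convnet provide a decomposition of the image's contents over different spatial scales.\n", | |
"- The content of an image should therefore be captured by the representations of the upper layers of the convnet.\n", | |
"- A good candidate for the content loss is the L2 norm between the activations of an upper layer computed over the target image and the activations of the same layer computed over the generated image." | |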
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## The Style Loss" | |
] | |
}, | |
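{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"- The content loss uses only a single upper layer, whereas the style loss uses multiple convnet layers.\n", | |
"- The aim is to capture style information at all spatial scales extracted by the convnet.\n", | |
"- The style loss uses the Gram matrix of a layer's activations, that is, the inner product of the feature maps of that layer.\n", | |
"- This inner product can be understood as a map of the correlations between the layer's features.\n", | |
"- These correlations capture the statistics of the patterns at a particular spatial scale.\n", | |
"- The style loss therefore tries to keep the correlations within the activations of each layer similar between the reference image and the generated image.\n", | |
"- This in turn helps ensure that the textures found at different spatial scales look similar in the reference image and the generated image." | |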
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"## Neural style transfer in Keras" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"- Use the VGG19 network.\n", | |
"- Set up a network that computes VGG19 layer activations for the reference, target, and generated images.\n", | |
"- Use these layer activations to define the loss function described above.\n", | |
"- Run gradient descent to minimize this loss function." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 2, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stderr", | |
"output_type": "stream", | |
"text": [ | |
"C:\\ProgramData\\Anaconda3\\envs\\py35\\lib\\site-packages\\h5py\\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n", | |
" from ._conv import register_converters as _register_converters\n", | |
"Using TensorFlow backend.\n" | |
] | |
} | |
], | |
"source": [ | |
"from keras.preprocessing.image import load_img, img_to_array" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 3, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"width, height : 1920 1200\n" | |
] | |
} | |
], | |
"source": [ | |
"# This is the path to the image you want to transform.\n", | |
"target_image_path = './data/image/blue-moon-lake.jpg'\n", | |
"\n", | |
"# This is the path to the style image.\n", | |
"style_reference_image_path = './data/image/starry_night.jpg'\n", | |
"\n", | |
"# Dimensions of the generated picture.\n", | |
"width, height = load_img(target_image_path).size\n", | |
"print(\"width, height : \", width, height)\n", | |
"\n", | |
"img_height = 400\n", | |
"img_width = int(width * img_height / height)" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"- We need auxiliary functions for loading, preprocessing, and postprocessing the images that go in and out of the VGG19 convnet." | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 4, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"import numpy as np\n", | |
"from keras.applications import vgg19" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 16, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def preprocess_image(image_path):\n", | |
"    # Load, resize, add a batch axis, and apply VGG19 preprocessing\n", | |
"    img = load_img(image_path, target_size=(img_height, img_width))\n", | |
"    img = img_to_array(img)\n", | |
"    img = np.expand_dims(img, axis=0)\n", | |
"    img = vgg19.preprocess_input(img)\n", | |
"    return img\n", | |
"\n", | |
"def deprocess_image(x):\n", | |
"    # Remove zero-centering by adding back the mean pixel\n", | |
"    x[:, :, 0] += 103.939\n", | |
"    x[:, :, 1] += 116.779\n", | |
"    x[:, :, 2] += 123.68\n", | |
"    # BGR -> RGB\n", | |
"    x = x[:, :, ::-1]\n", | |
"    x = np.clip(x, 0, 255).astype('uint8')\n", | |
"    return x\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 6, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Model loaded\n" | |
] | |
} | |
], | |
"source": [ | |
"from keras import backend as K\n", | |
"\n", | |
"target_image = K.constant(preprocess_image(target_image_path))\n", | |
"style_reference_image = K.constant(preprocess_image(style_reference_image_path))\n", | |
"\n", | |
"target_image, style_reference_image\n", | |
"\n", | |
"# This placeholder will contain our generated image\n", | |
"combination_image = K.placeholder((1, img_height, img_width, 3))\n", | |
"\n", | |
"# We combine the 3 images into a single batch.\n", | |
"input_tensor = K.concatenate([target_image, style_reference_image, combination_image], axis=0)\n", | |
"\n", | |
"input_tensor\n", | |
"# We build the VGG19 network with our batch of 3 images as input.\n", | |
"# The model will be loaded with pre-trained ImageNet weights.\n", | |
"model = vgg19.VGG19(input_tensor=input_tensor, weights='imagenet', include_top=False)\n", | |
"\n", | |
"print('Model loaded')" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 7, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def content_loss(base, combination):\n", | |
"    return K.sum(K.square(combination - base))" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 8, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def gram_matrix(x):\n", | |
"    # Put channels first, then flatten each feature map into one row\n", | |
"    features = K.batch_flatten(K.permute_dimensions(x, (2, 0, 1)))\n", | |
"    gram = K.dot(features, K.transpose(features))\n", | |
"    return gram\n", | |
"\n", | |
"def style_loss(style, combination):\n", | |
"    S = gram_matrix(style)\n", | |
"    C = gram_matrix(combination)\n", | |
"    channels = 3\n", | |
"    size = img_height * img_width\n", | |
"    return K.sum(K.square(S - C)) / (4 * (channels ** 2) * (size ** 2))\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 9, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"def total_variation_loss(x):\n", | |
"    a = K.square(\n", | |
"        x[:, :img_height - 1, :img_width - 1, :] - x[:, 1:, :img_width - 1, :])\n", | |
"    b = K.square(\n", | |
"        x[:, :img_height - 1, :img_width - 1, :] - x[:, :img_height - 1, 1:, :])\n", | |
"    return K.sum(K.pow(a + b, 1.25))\n" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 10, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Dict mapping layer names to activation tensors\n", | |
"outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])\n", | |
"\n", | |
"# Name of layer used for content loss\n", | |
"content_layer = 'block5_conv2'\n", | |
"\n", | |
"# Names of layers used for style loss\n", | |
"style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1', 'block5_conv1']\n", | |
"\n", | |
"# Weights in the weighted average of the loss components\n", | |
"total_variation_weight = 1e-4\n", | |
"style_weight = 1\n", | |
"content_weight = 0.025\n", | |
"\n", | |
"# Define the loss by adding all components to a 'loss' variable.\n", | |
"# Batch index 0 is the target image, 1 the style reference, 2 the generated image.\n", | |
"loss = K.variable(0.)\n", | |
"layer_features = outputs_dict[content_layer]\n", | |
"target_image_features = layer_features[0, :, :, :]\n", | |
"combination_features = layer_features[2, :, :, :]\n", | |
"loss += content_weight * content_loss(target_image_features, combination_features)\n", | |
"\n", | |
"for layer_name in style_layers:\n", | |
"    layer_features = outputs_dict[layer_name]\n", | |
"    style_reference_features = layer_features[1, :, :, :]\n", | |
"    combination_features = layer_features[2, :, :, :]\n", | |
"    sl = style_loss(style_reference_features, combination_features)\n", | |
"    # Weight each layer's style loss equally\n", | |
"    loss += (style_weight / len(style_layers)) * sl\n", | |
"loss += total_variation_weight * total_variation_loss(combination_image)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": 14, | |
"metadata": {}, | |
"outputs": [], | |
"source": [ | |
"# Get the gradients of the loss with respect to the generated image\n", | |
"grads = K.gradients(loss, combination_image)[0]\n", | |
"\n", | |
"# Function to fetch the values of the current loss and the current gradients\n", | |
"fetch_loss_and_grads = K.function([combination_image], [loss, grads])\n", | |
"\n", | |
"class Evaluator(object):\n", | |
"    # Computes loss and gradients in one pass and caches the gradients,\n", | |
"    # because the optimizer asks for them through two separate callbacks.\n", | |
"\n", | |
"    def __init__(self):\n", | |
"        self.loss_value = None\n", | |
"        self.grad_values = None\n", | |
"\n", | |
"    def loss(self, x):\n", | |
"        assert self.loss_value is None\n", | |
"        x = x.reshape((1, img_height, img_width, 3))\n", | |
"        outs = fetch_loss_and_grads([x])\n", | |
"        loss_value = outs[0]\n", | |
"        grad_values = outs[1].flatten().astype('float64')\n", | |
"        self.loss_value = loss_value\n", | |
"        self.grad_values = grad_values\n", | |
"        return self.loss_value\n", | |
"\n", | |
"    def grads(self, x):\n", | |
"        assert self.loss_value is not None\n", | |
"        grad_values = np.copy(self.grad_values)\n", | |
"        self.loss_value = None\n", | |
"        self.grad_values = None\n", | |
"        return grad_values\n", | |
"\n", | |
"evaluator = Evaluator()" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [ | |
{ | |
"name": "stdout", | |
"output_type": "stream", | |
"text": [ | |
"Start of iteration 0\n", | |
"Current loss value : 9664402000.0\n", | |
"Image saved as style_transfer_result_at_iteration_0.png\n", | |
"Iteration 0 completed in 1025s\n", | |
"Start of iteration 1\n", | |
"Current loss value : 4350824400.0\n", | |
"Image saved as style_transfer_result_at_iteration_1.png\n", | |
"Iteration 1 completed in 1048s\n", | |
"Start of iteration 2\n", | |
"Current loss value : 2894161000.0\n", | |
"Image saved as style_transfer_result_at_iteration_2.png\n", | |
"Iteration 2 completed in 1052s\n", | |
"Start of iteration 3\n", | |
"Current loss value : 2291391200.0\n", | |
"Image saved as style_transfer_result_at_iteration_3.png\n", | |
"Iteration 3 completed in 1018s\n", | |
"Start of iteration 4\n", | |
"Current loss value : 1886620200.0\n", | |
"Image saved as style_transfer_result_at_iteration_4.png\n", | |
"Iteration 4 completed in 964s\n", | |
"Start of iteration 5\n", | |
"Current loss value : 1597830000.0\n", | |
"Image saved as style_transfer_result_at_iteration_5.png\n", | |
"Iteration 5 completed in 1072s\n", | |
"Start of iteration 6\n", | |
"Current loss value : 1432617900.0\n", | |
"Image saved as style_transfer_result_at_iteration_6.png\n", | |
"Iteration 6 completed in 1035s\n", | |
"Start of iteration 7\n", | |
"Current loss value : 1296190000.0\n", | |
"Image saved as style_transfer_result_at_iteration_7.png\n", | |
"Iteration 7 completed in 1040s\n", | |
"Start of iteration 8\n", | |
"Current loss value : 1194520600.0\n", | |
"Image saved as style_transfer_result_at_iteration_8.png\n", | |
"Iteration 8 completed in 1102s\n", | |
"Start of iteration 9\n", | |
"Current loss value : 1102861800.0\n", | |
"Image saved as style_transfer_result_at_iteration_9.png\n", | |
"Iteration 9 completed in 1052s\n", | |
"Start of iteration 10\n", | |
"Current loss value : 1036696600.0\n", | |
"Image saved as style_transfer_result_at_iteration_10.png\n", | |
"Iteration 10 completed in 1010s\n", | |
"Start of iteration 11\n", | |
"Current loss value : 979493440.0\n", | |
"Image saved as style_transfer_result_at_iteration_11.png\n", | |
"Iteration 11 completed in 1072s\n", | |
"Start of iteration 12\n", | |
"Current loss value : 920844800.0\n", | |
"Image saved as style_transfer_result_at_iteration_12.png\n", | |
"Iteration 12 completed in 957s\n", | |
"Start of iteration 13\n", | |
"Current loss value : 866996350.0\n", | |
"Image saved as style_transfer_result_at_iteration_13.png\n", | |
"Iteration 13 completed in 2850s\n", | |
"Start of iteration 14\n", | |
"Current loss value : 828140900.0\n", | |
"Image saved as style_transfer_result_at_iteration_14.png\n", | |
"Iteration 14 completed in 1581s\n", | |
"Start of iteration 15\n", | |
"Current loss value : 792919740.0\n", | |
"Image saved as style_transfer_result_at_iteration_15.png\n", | |
"Iteration 15 completed in 1156s\n", | |
"Start of iteration 16\n" | |
] | |
} | |
], | |
"source": [ | |
"%%time\n", | |
"from scipy.optimize import fmin_l_bfgs_b\n", | |
"# scipy.misc.imsave was removed in SciPy 1.2; on newer versions imageio.imwrite is a replacement.\n", | |
"from scipy.misc import imsave\n", | |
"import time\n", | |
"\n", | |
"result_prefix = 'style_transfer_result'\n", | |
"iterations = 20\n", | |
"\n", | |
"# Run scipy-based optimization (L-BFGS) over the pixels of the generated image\n", | |
"# so as to minimize the neural style loss.\n", | |
"# This is our initial state: the target image.\n", | |
"# Note that 'scipy.optimize.fmin_l_bfgs_b' can only process flat vectors.\n", | |
"x = preprocess_image(target_image_path)\n", | |
"x = x.flatten()\n", | |
"for i in range(iterations):\n", | |
"    print('Start of iteration', i)\n", | |
"    start_time = time.time()\n", | |
"    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x, fprime=evaluator.grads, maxfun=20)\n", | |
"    print('Current loss value : ', min_val)\n", | |
"\n", | |
"    # Save the current generated image\n", | |
"    img = x.copy().reshape((img_height, img_width, 3))\n", | |
"    img = deprocess_image(img)\n", | |
"    fname = result_prefix + '_at_iteration_%d.png' % i\n", | |
"    imsave(fname, img)\n", | |
"    end_time = time.time()\n", | |
"    print('Image saved as', fname)\n", | |
"    print('Iteration %d completed in %ds' % (i, end_time - start_time))" | |
] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [ | |
"![image](style_transfer_result_at_iteration_15.png)" | |
] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
}, | |
{ | |
"cell_type": "code", | |
"execution_count": null, | |
"metadata": {}, | |
"outputs": [], | |
"source": [] | |
}, | |
{ | |
"cell_type": "markdown", | |
"metadata": {}, | |
"source": [] | |
} | |
], | |
"metadata": { | |
"kernelspec": { | |
"display_name": "python3.5", | |
"language": "python", | |
"name": "python3.5" | |
}, | |
"language_info": { | |
"codemirror_mode": { | |
"name": "ipython", | |
"version": 3 | |
}, | |
"file_extension": ".py", | |
"mimetype": "text/x-python", | |
"name": "python", | |
"nbconvert_exporter": "python", | |
"pygments_lexer": "ipython3", | |
"version": "3.5.4" | |
} | |
}, | |
"nbformat": 4, | |
"nbformat_minor": 2 | |
} |
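The three loss terms defined in the notebook (content, style via Gram matrices, and total variation) can be checked outside Keras. The following is a minimal NumPy sketch, not the notebook's implementation: it mirrors the `content_loss`, `gram_matrix`, `style_loss`, and `total_variation_loss` cells on small arrays. Passing `size` as an argument instead of reading `img_height * img_width` from globals, and the toy array shapes, are assumptions made for illustration.

```python
import numpy as np


def content_loss(base, combination):
    """Sum of squared differences between two activation maps."""
    return np.sum(np.square(combination - base))


def gram_matrix(x):
    """Gram matrix of a (height, width, channels) activation map."""
    # Put channels first, then flatten each feature map into one row.
    features = np.transpose(x, (2, 0, 1)).reshape(x.shape[2], -1)
    # Entry (i, j) is the inner product of feature maps i and j.
    return features @ features.T


def style_loss(style, combination, size, channels=3):
    """Squared Gram-matrix distance with the notebook's scaling factor."""
    S = gram_matrix(style)
    C = gram_matrix(combination)
    return np.sum(np.square(S - C)) / (4 * (channels ** 2) * (size ** 2))


def total_variation_loss(x):
    """Penalize differences between neighboring pixels of a (1, H, W, C) image."""
    a = np.square(x[:, :-1, :-1, :] - x[:, 1:, :-1, :])   # vertical neighbors
    b = np.square(x[:, :-1, :-1, :] - x[:, :-1, 1:, :])   # horizontal neighbors
    return np.sum(np.power(a + b, 1.25))


# Toy data (hypothetical shapes, chosen only to keep the example small).
rng = np.random.default_rng(0)
style = rng.normal(size=(4, 5, 3))
generated = rng.normal(size=(4, 5, 3))
flat_image = np.ones((1, 4, 5, 3))

print("content loss of an image with itself:", content_loss(style, style))
print("Gram matrix shape:", gram_matrix(style).shape)
print("style loss between two random maps:", style_loss(style, generated, size=20))
print("total variation of a constant image:", total_variation_loss(flat_image))
```

The Gram matrix has one row and one column per channel regardless of the spatial size, which is why the style loss compares texture statistics rather than pixel positions. A constant image has no variation between neighboring pixels, so its total variation loss is zero.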