@ecjang
Created February 11, 2018 10:19
07.Style-Transfer
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Neural Style Transfer"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"레퍼런스의 이미지를 타겟이미지에 적용시키는 응용. 단 원본의 content는 유지.\n",
"스타일 = Txture, Colors, Visual Patterns\n",
"\n",
"- Content = The Higher-Level macrostructure of the Image\n",
"- Loss = distance(style(ref) - style(generated)) + distance(content(orginal) - content(generated))\n",
"- Distance = L2 norm"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Content Loss"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- 신경망의 앞단 레이어의 활성화 된 값들은 이미지에 대한 국소적인 정보를 가지고 있다.\n",
"- 상대적으로 이후의 레이어으는 전체적이며 추상적인 정보를 가지고 있다.\n",
"- 다르게 말하면 convent의 다른 레이어의 활성화값은 다른 공간스케일의 내용에 대한 분해된 값을 제공한다.\n",
"- 그러므로 content에 대한 정보는 convnet의 상위 레이어에 표현될 것이다.\n",
"- 좋은 conetent loss의 후보는 타겟 이미지와 생성된 이이지의 convent의 top layer 사이의 L2 norm 값이다."
]
},
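{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the idea above, the content loss can be illustrated with NumPy on two made-up activation arrays. The names `base_features`, `generated_features`, and `content_loss_np`, as well as the array shapes, are assumptions for illustration and are not real VGG19 outputs. The sum of squared differences used here is the squared L2 norm, the same form as the `content_loss` function defined later in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Hypothetical upper-layer activations (height, width, channels); not real VGG19 outputs.\n",
"rng = np.random.RandomState(0)\n",
"base_features = rng.rand(4, 4, 8)\n",
"generated_features = base_features + 0.1\n",
"\n",
"def content_loss_np(base, combination):\n",
"    # Sum of squared differences, i.e. the squared L2 norm of the difference\n",
"    return np.sum(np.square(combination - base))\n",
"\n",
"# Identical activations give a loss of exactly zero\n",
"print(content_loss_np(base_features, base_features))\n",
"# A uniform offset of 0.1 over 4 * 4 * 8 values gives roughly 128 * 0.01 = 1.28\n",
"print(content_loss_np(base_features, generated_features))"
]
},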
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## The Style Loss"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- content loss 에서는 하나의 상위 레이어만 사용하지만 style loss에서는 여러개의 convent 레이어르 사용한다.\n",
"- convent의 모든 공간 스케일로부터 스타일 정보를 추출하기를 원한다.\n",
"- style loss로는 활성화 값의 Gram maxrix를 사용한다. feature map 사이의 내적값이다.\n",
"- 내적인 레이어의 피쳐들 사이의 관계에 대한 맵을 표현하는 것으로 이해할 수 있다.\n",
"- 이러한 관계는 특정 공간에 스케일의 패턴에 대한 통계를 인식한다.\n",
"- 그러므로 style loss는 다른 레이어의 활성화 사이의 관계를 비슷하게 유지할려고 한다.\n",
"- 그러면 이 방법을 사용하면 다른 공간 스케일에서 발견된 텍스쳐가 ref 이미지와 generated 이밎 사이에 비슷할 것을 보장해준다."
]
},
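{
"cell_type": "markdown",
"metadata": {},
"source": [
"The Gram matrix described above can be sketched in plain NumPy. This is an illustrative version under stated assumptions: `gram_matrix_np` is a hypothetical helper, and the input is a small made-up array of shape (height, width, channels) rather than a real layer activation. Entry (i, j) of the result is the inner product of feature map i with feature map j."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def gram_matrix_np(x):\n",
"    # Move channels first, then flatten each feature map into one row\n",
"    features = np.transpose(x, (2, 0, 1)).reshape(x.shape[2], -1)\n",
"    # All pairwise inner products between feature maps\n",
"    return np.dot(features, features.T)\n",
"\n",
"# Hypothetical activations: 3 feature maps on a 2x2 spatial grid\n",
"x = np.arange(12, dtype='float64').reshape(2, 2, 3)\n",
"gram = gram_matrix_np(x)\n",
"print(gram.shape)                 # (3, 3): one row and column per channel\n",
"print(np.allclose(gram, gram.T))  # True: a Gram matrix is symmetric"
]
},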
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Neural style transfer in Keras"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- VGG19 Network를 사용\n",
"- ref, target, generated 이미지에 대한 VGG19 레이어 활성화 값을 계산하는 네트워크를 만든다.\n",
"- 위에서 설명한 loss function을 정의하기 위해서 레이어 활성화를 사용한다.\n",
"- loss function을 최소화하기 위한 gradient descent 과정을 거친다."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"C:\\ProgramData\\Anaconda3\\envs\\py35\\lib\\site-packages\\h5py\\__init__.py:34: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.\n",
" from ._conv import register_converters as _register_converters\n",
"Using TensorFlow backend.\n"
]
}
],
"source": [
"from keras.preprocessing.image import load_img, img_to_array"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"width, height : 1920 1200\n"
]
}
],
"source": [
"# This is the path to the Image you want to transform.\n",
"target_image_path = './data/image/blue-moon-lake.jpg'\n",
"\n",
"# This is the path to the Style Image.\n",
"style_reference_image_path = './data/image/starry_night.jpg'\n",
"\n",
"# Dismensions of the generated picture.\n",
"width, height = load_img(target_image_path).size\n",
"print(\"width, height : \", width, height)\n",
"\n",
"img_height = 400\n",
"img_width = int(width * img_height / height)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- VGG19 convert에 들어가고 나올 이미지들에 대한 로딩, 전처리, 후처리를 위한 보조 함수들이 필요하다."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from keras.applications import vgg19"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"def preprocess_image(image_path):\n",
" img = load_img(image_path, target_size=(img_height, img_width))\n",
" img = img_to_array(img)\n",
" img = np.expand_dims(img, axis=0)\n",
" img = vgg19.preprocess_input(img)\n",
" return img\n",
"\n",
"def deprocess_image(x):\n",
" # Remove zero-conter by mean pixel\n",
" x[:, :, 0] += 103.939\n",
" x[:, :, 1] += 116.779\n",
" x[:, :, 2] += 123.68\n",
" # BGR -> RGB\n",
" x = x[:, :, ::-1]\n",
" x = np.clip(x, 0, 255).astype('uint8')\n",
" return x\n"
]
},
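{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the relationship between the two helpers concrete, here is a NumPy-only round trip. It is a sketch under assumptions: `preprocess_np` is a hypothetical stand-in that mirrors what `vgg19.preprocess_input` does to a single image (RGB to BGR, then subtracting the ImageNet mean pixel), and `deprocess_np` rounds before casting, which the `deprocess_image` function above does not do."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# ImageNet mean pixel in BGR order, the same constants used in deprocess_image\n",
"MEAN_BGR = np.array([103.939, 116.779, 123.68])\n",
"\n",
"def preprocess_np(rgb):\n",
"    # RGB -> BGR, then zero-center each channel (assumed to mirror vgg19.preprocess_input)\n",
"    return rgb[:, :, ::-1].astype('float64') - MEAN_BGR\n",
"\n",
"def deprocess_np(x):\n",
"    # Add the mean back, BGR -> RGB, round, and clip to valid 8-bit pixel values\n",
"    x = x + MEAN_BGR\n",
"    x = x[:, :, ::-1]\n",
"    return np.clip(np.rint(x), 0, 255).astype('uint8')\n",
"\n",
"# Hypothetical 2x3 RGB image with random pixel values\n",
"rgb = np.random.RandomState(0).randint(0, 256, size=(2, 3, 3)).astype('uint8')\n",
"restored = deprocess_np(preprocess_np(rgb))\n",
"print(np.array_equal(rgb, restored))  # True: the round trip recovers the original pixels"
]
},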
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Model loaded\n"
]
}
],
"source": [
"from keras import backend as K\n",
"\n",
"target_image = K.constant(preprocess_image(target_image_path))\n",
"style_reference_image = K.constant(preprocess_image(style_reference_image_path))\n",
"\n",
"target_image, style_reference_image\n",
"\n",
"# This placeholder will contain our generated image\n",
"combination_image = K.placeholder((1, img_height, img_width, 3))\n",
"\n",
"# We conbinde the 3 images into a single batch.\n",
"input_tensor = K.concatenate([target_image, style_reference_image, combination_image], axis=0)\n",
"\n",
"input_tensor\n",
"# We build the VGG19 network with out batch of 3 iamges as input.\n",
"# The model will be loaded whth pre-trained imgeNet weights.\n",
"model = vgg19.VGG19(input_tensor=input_tensor, weights='imagenet', include_top=False)\n",
"\n",
"print('Model loaded')"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"def content_loss(base, combination):\n",
" return K.sum(K.square(combination - base))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"def gram_matrix(x):\n",
" features = K.batch_flatten(K.permute_dimensions(x,(2,0,1)))\n",
" gram = K.dot(features, K.transpose(features))\n",
" return gram\n",
"\n",
"def style_loss(style, combination):\n",
" S = gram_matrix(style)\n",
" C = gram_matrix(combination)\n",
" channels = 3\n",
" size = img_height * img_width\n",
" return K.sum(K.square(S - C)) / (4 * (channels ** 2) * (size ** 2))\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def total_variation_loss(x):\n",
" a = K.square(\n",
" x[:, :img_height- 1, :img_width - 1, :] - x[:, 1:, :img_width -1, :])\n",
" b = K.square(\n",
" x[:, :img_height - 1, :img_width - 1, :] - x[:, :img_height -1, 1:, :])\n",
" return K.sum(K.pow(a + b, 1.25))\n"
]
},
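{
"cell_type": "markdown",
"metadata": {},
"source": [
"The total variation loss above penalizes differences between neighboring pixels, which encourages spatial smoothness in the generated image. A NumPy sketch of the same computation shows the effect on two made-up inputs; `total_variation_loss_np` and the tiny 4x4 images are assumptions for illustration only."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"def total_variation_loss_np(x):\n",
"    # x has shape (1, height, width, channels)\n",
"    a = np.square(x[:, :-1, :-1, :] - x[:, 1:, :-1, :])  # differences between vertical neighbors\n",
"    b = np.square(x[:, :-1, :-1, :] - x[:, :-1, 1:, :])  # differences between horizontal neighbors\n",
"    return np.sum(np.power(a + b, 1.25))\n",
"\n",
"flat = np.full((1, 4, 4, 3), 0.5)                       # constant image\n",
"noisy = np.random.RandomState(0).rand(1, 4, 4, 3)      # random noise image\n",
"print(total_variation_loss_np(flat))        # 0.0: no variation between neighbors\n",
"print(total_variation_loss_np(noisy) > 0)   # True: noise is penalized"
]
},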
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Diot mapping layer to activation Tensers\n",
"outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])\n",
"\n",
"# Name of layer used for content loss\n",
"content_layer = 'block5_conv2'\n",
"\n",
"# name og layer used for style loss\n",
"style_layers = ['block1_conv1', 'block2_conv1', 'block3_conv1', 'block4_conv1', 'block5_conv1']\n",
"\n",
"# Weights in the weighted average of the loss compcents\n",
"total_variation_weight = 1e-4\n",
"style_weight = 1\n",
"content_weight = 0.025\n",
"\n",
"# Define the loss by adding all components to a 'loss' variable\n",
"loss = K.variable(0.)\n",
"layer_features = outputs_dict[content_layer]\n",
"target_image_features = layer_features[1, :, :, :]\n",
"combination_features = layer_features[2, :, :, :]\n",
"loss += content_weight * content_loss(target_image_features, combination_features)\n",
"\n",
"for layer_name in style_layers:\n",
" layer_features = outputs_dict[layer_name]\n",
" style_reference_features = layer_features[1, :, :, :]\n",
" combination_features = layer_features[2, :, :, :]\n",
" sl = style_loss(style_reference_features, combination_features)\n",
" loss += (style_weight / len(style_layers)) + sl\n",
"loss += total_variation_weight * total_variation_loss(combination_image)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"# Get the gradients of the generated image wrt the loss\n",
"grads = K.gradients(loss, combination_image)[0]\n",
"\n",
"# Function to the fetch the value of the current loss and the current gradients\n",
"fetch_loss_and_grads = K.function([combination_image], [loss, grads])\n",
"\n",
"class Evaluator(object):\n",
" \n",
" def __init__(self):\n",
" self.loss_value = None\n",
" self.grads_value = None\n",
" \n",
" def loss(self, x):\n",
" assert self.loss_value is None\n",
" x = x.reshape((1, img_height, img_width, 3))\n",
" outs = fetch_loss_and_grads([x])\n",
" loss_value = outs[0]\n",
" grad_values = outs[1].flatten().astype('float64')\n",
" self.loss_value = loss_value\n",
" self.grad_values = grad_values\n",
" return self.loss_value\n",
" \n",
" def grads(self, x):\n",
" assert self.loss_value is not None\n",
" grad_values = np.copy(self.grad_values)\n",
" self.loss_value = None\n",
" self.grad_values = None\n",
" return grad_values\n",
"\n",
"evaluator = Evaluator()"
]
},
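{
"cell_type": "markdown",
"metadata": {},
"source": [
"The caching pattern used by `Evaluator` can be tried on a toy problem that needs no Keras model. In this sketch, `loss_and_grads` and `ToyEvaluator` are hypothetical names, and the quadratic objective is an assumption chosen only because its minimum (every coordinate equal to 3) is easy to check. The optimizer calls `loss` and then `grads` for the same point, so the gradient computed in `loss` is reused instead of being recomputed."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"from scipy.optimize import fmin_l_bfgs_b\n",
"\n",
"def loss_and_grads(x):\n",
"    # Toy stand-in for fetch_loss_and_grads: f(x) = sum((x - 3)^2) and its gradient\n",
"    diff = x - 3.0\n",
"    return np.sum(diff ** 2), 2.0 * diff\n",
"\n",
"class ToyEvaluator(object):\n",
"    def __init__(self):\n",
"        self.grad_values = None\n",
"\n",
"    def loss(self, x):\n",
"        # Compute loss and gradient together, cache the gradient\n",
"        loss_value, self.grad_values = loss_and_grads(x)\n",
"        return loss_value\n",
"\n",
"    def grads(self, x):\n",
"        # Return the gradient cached by the preceding loss() call\n",
"        return np.copy(self.grad_values)\n",
"\n",
"toy = ToyEvaluator()\n",
"x_opt, min_val, info = fmin_l_bfgs_b(toy.loss, np.zeros(5), fprime=toy.grads, maxfun=20)\n",
"# x_opt should be close to 3 in every coordinate and min_val close to 0\n",
"print(np.round(x_opt, 3), min_val)"
]
},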
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Start of iteration 0\n",
"Current loss value : 9664402000.0\n",
"Image saved as style_transfer_result_at_iteration_0.png\n",
"Iteration 0 completed in 1025s\n",
"Start of iteration 1\n",
"Current loss value : 4350824400.0\n",
"Image saved as style_transfer_result_at_iteration_1.png\n",
"Iteration 1 completed in 1048s\n",
"Start of iteration 2\n",
"Current loss value : 2894161000.0\n",
"Image saved as style_transfer_result_at_iteration_2.png\n",
"Iteration 2 completed in 1052s\n",
"Start of iteration 3\n",
"Current loss value : 2291391200.0\n",
"Image saved as style_transfer_result_at_iteration_3.png\n",
"Iteration 3 completed in 1018s\n",
"Start of iteration 4\n",
"Current loss value : 1886620200.0\n",
"Image saved as style_transfer_result_at_iteration_4.png\n",
"Iteration 4 completed in 964s\n",
"Start of iteration 5\n",
"Current loss value : 1597830000.0\n",
"Image saved as style_transfer_result_at_iteration_5.png\n",
"Iteration 5 completed in 1072s\n",
"Start of iteration 6\n",
"Current loss value : 1432617900.0\n",
"Image saved as style_transfer_result_at_iteration_6.png\n",
"Iteration 6 completed in 1035s\n",
"Start of iteration 7\n",
"Current loss value : 1296190000.0\n",
"Image saved as style_transfer_result_at_iteration_7.png\n",
"Iteration 7 completed in 1040s\n",
"Start of iteration 8\n",
"Current loss value : 1194520600.0\n",
"Image saved as style_transfer_result_at_iteration_8.png\n",
"Iteration 8 completed in 1102s\n",
"Start of iteration 9\n",
"Current loss value : 1102861800.0\n",
"Image saved as style_transfer_result_at_iteration_9.png\n",
"Iteration 9 completed in 1052s\n",
"Start of iteration 10\n",
"Current loss value : 1036696600.0\n",
"Image saved as style_transfer_result_at_iteration_10.png\n",
"Iteration 10 completed in 1010s\n",
"Start of iteration 11\n",
"Current loss value : 979493440.0\n",
"Image saved as style_transfer_result_at_iteration_11.png\n",
"Iteration 11 completed in 1072s\n",
"Start of iteration 12\n",
"Current loss value : 920844800.0\n",
"Image saved as style_transfer_result_at_iteration_12.png\n",
"Iteration 12 completed in 957s\n",
"Start of iteration 13\n",
"Current loss value : 866996350.0\n",
"Image saved as style_transfer_result_at_iteration_13.png\n",
"Iteration 13 completed in 2850s\n",
"Start of iteration 14\n",
"Current loss value : 828140900.0\n",
"Image saved as style_transfer_result_at_iteration_14.png\n",
"Iteration 14 completed in 1581s\n",
"Start of iteration 15\n",
"Current loss value : 792919740.0\n",
"Image saved as style_transfer_result_at_iteration_15.png\n",
"Iteration 15 completed in 1156s\n",
"Start of iteration 16\n"
]
}
],
"source": [
"%%time\n",
"from scipy.optimize import fmin_l_bfgs_b\n",
"from scipy.misc import imsave\n",
"import time\n",
"\n",
"result_prefix = 'style_transfer_result'\n",
"iterations = 20\n",
"\n",
"# Run scipy-based optimization (L-BFGS) over the pixels of the generated image\n",
"# so as to minimize the neural style loss.\n",
"# This is our initial state the target image\n",
"# Note that 'scipy.optimize.fmion_l_bfgs_b' can only process flat vectors.\n",
"x = preprocess_image(target_image_path)\n",
"x =x.flatten()\n",
"for i in range(iterations):\n",
" print('Start of iteration', i)\n",
" start_time = time.time()\n",
" # x, min_val, info = fmin_l_bfgs_b(evaluator, loss, x, fprime=evaluator. grads, maxfun=20)\n",
" x, min_val, info= fmin_l_bfgs_b(evaluator.loss, x, fprime=evaluator.grads, maxfun=20)\n",
" print('Current loss value : ', min_val)\n",
" \n",
" # Save current Generated Image\n",
" img = x.copy().reshape((img_height, img_width, 3))\n",
" img = deprocess_image(img)\n",
" fname = result_prefix + '_at_iteration_%d.png' % i\n",
" imsave(fname, img)\n",
" end_time = time.time()\n",
" print('Image saved as', fname)\n",
" print('Iteration %d completed in %ds' % (i, end_time - start_time))\n",
" "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"[!image]('style_transfer_result_at_iteration_15.png')"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "python3.5",
"language": "python",
"name": "python3.5"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.5.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}