Process of Style Transferring

Last updated on Oct 25 2021
Ashutosh Wakiroo

Table of Contents

Process of Style Transferring

Neural style transfer is an optimization technique that takes two images – a content image and a style reference image – and blends them so that the output image looks like the content image, but "painted" in the style of the style reference image.
To achieve style transfer, the style of an image must be separated from its content. Once separated, the style elements of one image can be transferred onto the content elements of another. This is done mainly by extracting features with a standard pretrained convolutional neural network.
These features are then manipulated to extract either content information or style information. The process involves three images: a style image, a content image, and finally, a target image.
The ultimate goal is to combine the style of the style image with the content of the content image to create the target image.


The process begins by selecting a few layers from within our model to use for feature extraction; choosing those layers gives us a good view of how the image is processed inside the network. We extract feature activations for our style image and our content image. We then extract the same features from our target image and compare them against the style image's style features and the content image's content features.

Working of Style Transferring

As described above, neural style transfer optimizes an output image so that it keeps the content of the content image while appearing "painted" in the style of the style reference image.

Import and configure the modules

Open Google Colab and import TensorFlow:

from __future__ import absolute_import, division, print_function, unicode_literals

try:
    # %tensorflow_version only exists in Colab.
    %tensorflow_version 2.x
except Exception:
    pass
import tensorflow as tf
Output:
TensorFlow 2.x selected.

import IPython.display as display
import matplotlib.pyplot as plt
import matplotlib as mpl
mpl.rcParams['figure.figsize'] = (12,12)
mpl.rcParams['axes.grid'] = False
import numpy as np
import time
import functools

content_path = tf.keras.utils.get_file('nature.jpg','https://www.eadegallery.co.nz/wp-content/uploads/2019/03/626a6823-af82-432a-8d3d-d8295b1a9aed-l.jpg')
style_path = tf.keras.utils.get_file('cloud.jpg','https://i.pinimg.com/originals/11/91/4f/11914f29c6d3e9828cc5f5c2fd64cfdc.jpg')
Output:
Downloading data from https://www.eadegallery.co.nz/wp-content/uploads/2019/03/626a6823-af82-432a-8d3d-d8295b1a9aed-l.jpg
1122304/1117520 [==============================] - 1s 1us/step
Downloading data from https://i.pinimg.com/originals/11/91/4f/11914f29c6d3e9828cc5f5c2fd64cfdc.jpg
49152/43511 [=================================] - 0s 0us/step

Define a function to load an image and limit its maximum dimension to 512 pixels:

def load_img(path_to_img):
    max_dim = 512
    img = tf.io.read_file(path_to_img)
    img = tf.image.decode_image(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    shape = tf.cast(tf.shape(img)[:-1], tf.float32)
    long_dim = max(shape)
    scale = max_dim / long_dim
    new_shape = tf.cast(shape * scale, tf.int32)
    img = tf.image.resize(img, new_shape)
    img = img[tf.newaxis, :]
    return img
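The rescaling arithmetic inside load_img is easy to check by hand. The sketch below (plain Python, no TensorFlow; the helper name is ours, not part of the post) performs the same computation on a couple of example shapes:

```python
# Scale a (height, width) shape so its longest side becomes max_dim,
# preserving the aspect ratio -- the same arithmetic load_img performs.
def scaled_shape(shape, max_dim=512):
    scale = max_dim / max(shape)
    return tuple(int(s * scale) for s in shape)

# An 800x600 image is scaled down; a 300x400 image is scaled up.
print(scaled_shape((800.0, 600.0)))  # (512, 384)
print(scaled_shape((300.0, 400.0)))  # (384, 512)
```

Note that the scale factor is computed from the longest side only, so the other side follows automatically and the aspect ratio never changes.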

Creating a function to show the image

def imshow(image, title=None):
    if len(image.shape) > 3:
        image = tf.squeeze(image, axis=0)

    plt.imshow(image)
    if title:
        plt.title(title)

content_image = load_img(content_path)
style_image = load_img(style_path)

plt.subplot(1, 2, 1)
imshow(content_image, 'Content Image')
plt.subplot(1, 2, 2)
imshow(style_image, 'Style Image')

Output:

To verify the setup, classify the content image with a full VGG19:

x = tf.keras.applications.vgg19.preprocess_input(content_image*255)
x = tf.image.resize(x, (224, 224))
vgg = tf.keras.applications.VGG19(include_top=True, weights='imagenet')
prediction_probabilities = vgg(x)
prediction_probabilities.shape
Output:
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels.h5

574717952/574710816 [==============================] - 8s 0us/step
TensorShape([1, 1000])

predicted_top_5 = tf.keras.applications.vgg19.decode_predictions(prediction_probabilities.numpy())[0]
[(class_name, prob) for (number, class_name, prob) in predicted_top_5]
Output:
Downloading data from https://storage.googleapis.com/download.tensorflow.org/data/imagenet_class_index.json
40960/35363 [==================================] - 0s 0us/step
[('mobile_home', 0.7314594),
('picket_fence', 0.119986326),
('greenhouse', 0.026051044),
('thatch', 0.023595566),
('boathouse', 0.014751049)]

Define style and content representations

Use the intermediate layers of the model to get the content and style representations of the image. Starting from the input layer, the first few layer activations represent low-level features like edges and textures; deeper layers represent increasingly high-level features.
For the input image, try to match the corresponding style and content target representations at these intermediate layers.
Load a VGG19 (without the classification head) and list its layers so we can pick valid layer names:

vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
print()
for layer in vgg.layers:
    print(layer.name)
Output:
Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg19_weights_tf_dim_ordering_tf_kernels_notop.h5
80142336/80134624 [==============================] - 2s 0us/step

input_2
block1_conv1
block1_conv2
block1_pool
block2_conv1
block2_conv2
block2_pool
block3_conv1
block3_conv2
block3_conv3
block3_conv4
block3_pool
block4_conv1
block4_conv2
block4_conv3
block4_conv4
block4_pool
block5_conv1
block5_conv2
block5_conv3
block5_conv4
block5_pool

# Content layer
content_layers = ['block5_conv2']

# Style layers of interest
style_layers = ['block1_conv1',
                'block2_conv1',
                'block3_conv1',
                'block4_conv1',
                'block5_conv1']

num_content_layers = len(content_layers)
num_style_layers = len(style_layers)

Intermediate layers for style and content

At a high level, for a network to perform image classification, it must understand the image: it takes the raw pixels as input and builds an internal representation that converts them into the complex features present within the image.
This is also a reason why convolutional neural networks generalize well: they capture the invariant, defining features within classes (e.g., cats vs. dogs) regardless of where they appear in the image. Between the point where the raw image is fed into the model and the point where the classification label comes out, the model serves as a complex feature extractor. By accessing intermediate layers of the model, we are able to describe the style and content of input images.
Build the model
The networks in tf.keras.applications are designed so that the intermediate layer values can easily be extracted using the Keras functional API.
To define a model using the functional API, specify the inputs and outputs:
model = Model(inputs, outputs)
The following function builds a VGG19 model that returns a list of intermediate layer outputs:

def vgg_layers(layer_names):
    """Creates a VGG model that returns a list of intermediate output values."""
    # Load our model. Load pretrained VGG, trained on ImageNet data
    vgg = tf.keras.applications.VGG19(include_top=False, weights='imagenet')
    vgg.trainable = False

    outputs = [vgg.get_layer(name).output for name in layer_names]

    model = tf.keras.Model([vgg.input], outputs)
    return model

style_extractor = vgg_layers(style_layers)
style_outputs = style_extractor(style_image*255)

# Look at the statistics of each layer's output
for name, output in zip(style_layers, style_outputs):
    print(name)
    print("  shape: ", output.numpy().shape)
    print("  min: ", output.numpy().min())
    print("  max: ", output.numpy().max())
    print("  mean: ", output.numpy().mean())
    print()
Output:
block1_conv1
shape: (1, 427, 512, 64)
min: 0.0
max: 763.51953
mean: 25.987665

block2_conv1
shape: (1, 213, 256, 128)
min: 0.0
max: 3484.3037
mean: 134.27835

block3_conv1
shape: (1, 106, 128, 256)
min: 0.0
max: 7291.078
mean: 143.77878

block4_conv1
shape: (1, 53, 64, 512)
min: 0.0
max: 13492.799
mean: 530.00244

block5_conv1
shape: (1, 26, 32, 512)
min: 0.0
max: 2881.529
mean: 40.596397

Gram matrix

Calculating style
The content of an image is represented by the values of its intermediate feature maps. The style, in turn, can be described by the means and correlations across the different feature maps.
Calculate a Gram matrix, which captures this information, by taking the outer product of the feature vector with itself at each location and averaging that product over all locations.
The Gram matrix for a particular layer l is:

    G^l_cd = ( sum_ij F^l_ijc(x) * F^l_ijd(x) ) / IJ

where F^l_ijc(x) is the activation of channel c at spatial position (i, j), and IJ is the number of spatial locations.
This is implemented concisely using the tf.linalg.einsum function:

def gram_matrix(input_tensor):
    result = tf.linalg.einsum('bijc,bijd->bcd', input_tensor, input_tensor)
    input_shape = tf.shape(input_tensor)
    num_locations = tf.cast(input_shape[1]*input_shape[2], tf.float32)
    return result/(num_locations)
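To sanity-check the einsum contraction, here is the same computation in plain NumPy: flattening the spatial grid and taking an ordinary matrix product gives an identical result. The shapes and helper name below are illustrative only, not part of the original code:

```python
import numpy as np

def gram_matrix_np(feats):
    # feats: (batch, height, width, channels), like a VGG feature map
    b, h, w, c = feats.shape
    result = np.einsum('bijc,bijd->bcd', feats, feats)  # same contraction as tf.linalg.einsum
    return result / (h * w)                             # average over spatial locations

feats = np.random.rand(1, 4, 5, 3)
gram = gram_matrix_np(feats)

# Equivalent: flatten the 4x5 spatial grid, then a channels-by-channels matmul
flat = feats.reshape(1, 4 * 5, 3)
expected = flat[0].T @ flat[0] / (4 * 5)
assert gram.shape == (1, 3, 3)
assert np.allclose(gram[0], expected)
```

The resulting matrix is symmetric and channel-by-channel: entry (c, d) measures how strongly channels c and d co-activate, which is exactly the correlation information used as "style."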

Extracting the style and content of image

Build a model that returns the style and content tensors:

class StyleContentModel(tf.keras.models.Model):
    def __init__(self, style_layers, content_layers):
        super(StyleContentModel, self).__init__()
        self.vgg = vgg_layers(style_layers + content_layers)
        self.style_layers = style_layers
        self.content_layers = content_layers
        self.num_style_layers = len(style_layers)
        self.vgg.trainable = False

    def call(self, inputs):
        "Expects float input in [0,1]"
        inputs = inputs*255.0
        preprocessed_input = tf.keras.applications.vgg19.preprocess_input(inputs)
        outputs = self.vgg(preprocessed_input)
        style_outputs, content_outputs = (outputs[:self.num_style_layers],
                                          outputs[self.num_style_layers:])
        style_outputs = [gram_matrix(style_output)
                         for style_output in style_outputs]

        content_dict = {content_name: value
                        for content_name, value
                        in zip(self.content_layers, content_outputs)}
        style_dict = {style_name: value
                      for style_name, value
                      in zip(self.style_layers, style_outputs)}

        return {'content': content_dict, 'style': style_dict}
When called on an image, this model returns the Gram matrices (style) of the style_layers and the content of the content_layers:
extractor = StyleContentModel(style_layers, content_layers)
results = extractor(tf.constant(content_image))
style_results = results['style']

print('Styles:')
for name, output in sorted(results['style'].items()):
    print("  ", name)
    print("    shape: ", output.numpy().shape)
    print("    min: ", output.numpy().min())
    print("    max: ", output.numpy().max())
    print("    mean: ", output.numpy().mean())
    print()

print("Contents:")
for name, output in sorted(results['content'].items()):
    print("  ", name)
    print("    shape: ", output.numpy().shape)
    print("    min: ", output.numpy().min())
    print("    max: ", output.numpy().max())
    print("    mean: ", output.numpy().mean())
Output:
Styles:
block1_conv1
shape: (1, 64, 64)
min: 0.0055228453
max: 28014.557
mean: 263.79025

block2_conv1
shape: (1, 128, 128)
min: 0.0
max: 61479.496
mean: 9100.949

block3_conv1
shape: (1, 256, 256)
min: 0.0
max: 545623.44
mean: 7660.976

block4_conv1
shape: (1, 512, 512)
min: 0.0
max: 4320502.0
mean: 134288.84

block5_conv1
shape: (1, 512, 512)
min: 0.0
max: 110005.37
mean: 1487.0381

Contents:
block5_conv2
shape: (1, 26, 32, 512)
min: 0.0
max: 2410.8796
mean: 13.764149

Run gradient descent

With this style and content extractor, we can now implement the style transfer algorithm: compute the mean squared error of our image's outputs relative to each target, then take a weighted sum of the losses.
Set the style and content target values:

style_targets = extractor(style_image)['style']
content_targets = extractor(content_image)['content']
Define a tf.Variable to contain the image to optimize. Initialize it with the content image (the tf.Variable must be the same shape as the content image):
image = tf.Variable(content_image)
Since this is a float image, define a function to keep the pixel values between 0 and 1:

def clip_0_1(image):
    return tf.clip_by_value(image, clip_value_min=0.0, clip_value_max=1.0)
Create an optimizer. The paper recommends LBFGS, but Adam works well too:

opt = tf.optimizers.Adam(learning_rate=0.02, beta_1=0.99, epsilon=1e-1)
To optimize, use a weighted combination of the two losses to get the total loss:

style_weight = 1e-2
content_weight = 1e4

def style_content_loss(outputs):
    style_outputs = outputs['style']
    content_outputs = outputs['content']
    style_loss = tf.add_n([tf.reduce_mean((style_outputs[name]-style_targets[name])**2)
                           for name in style_outputs.keys()])
    style_loss *= style_weight / num_style_layers

    content_loss = tf.add_n([tf.reduce_mean((content_outputs[name]-content_targets[name])**2)
                             for name in content_outputs.keys()])
    content_loss *= content_weight / num_content_layers
    loss = style_loss + content_loss
    return loss
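To see how the weights balance the two loss terms, here is a toy calculation with made-up per-layer mean-squared errors (the numbers are purely illustrative; real values come from the feature maps):

```python
style_weight, content_weight = 1e-2, 1e4
num_style_layers, num_content_layers = 5, 1

# Hypothetical per-layer MSEs standing in for the real feature-map errors
style_mses = [0.2, 0.1, 0.4, 0.3, 0.5]
content_mses = [2.0]

# Same arithmetic as style_content_loss: sum, weight, average over layers
style_loss = sum(style_mses) * style_weight / num_style_layers
content_loss = sum(content_mses) * content_weight / num_content_layers
total = style_loss + content_loss
```

Despite the raw MSEs being comparable, the weights here make the content term dominate by several orders of magnitude; tuning style_weight and content_weight is how you trade off stylization against content fidelity.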
Use tf.GradientTape to update the image:

@tf.function()
def train_step(image):
    with tf.GradientTape() as tape:
        outputs = extractor(image)
        loss = style_content_loss(outputs)
    grad = tape.gradient(loss, image)
    opt.apply_gradients([(grad, image)])
    image.assign(clip_0_1(image))
Run a few steps to test:

train_step(image)
train_step(image)
train_step(image)
plt.imshow(image.read_value()[0])

Output:


Transforming the image

Since it is working, perform a longer optimization in this step:

import time
start = time.time()

epochs = 10
steps_per_epoch = 100

step = 0
for n in range(epochs):
    for m in range(steps_per_epoch):
        step += 1
        train_step(image)
        print(".", end='')
    display.clear_output(wait=True)
    imshow(image.read_value())
    plt.title("Train step: {}".format(step))
    plt.show()

end = time.time()
print("Total time: {:.1f}".format(end-start))

Output:

(The notebook displays the stylized image after each epoch of training.)

Total variation loss

One downside of this basic implementation is that it produces many high-frequency artifacts. These can be reduced with an explicit regularization term on the high-frequency components of the image; in style transfer, this is often called the total variation loss. Start by computing the neighboring-pixel differences:

def high_pass_x_y(image):
    x_var = image[:, :, 1:, :] - image[:, :, :-1, :]
    y_var = image[:, 1:, :, :] - image[:, :-1, :, :]
    return x_var, y_var
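To see why these neighboring-pixel differences act like an edge detector, apply the same slicing to a tiny synthetic image with a single vertical step edge. This is a NumPy stand-in for the TF version, for illustration only:

```python
import numpy as np

def high_pass_x_y_np(image):
    # Same slicing as the TF version: differences between neighboring pixels
    x_var = image[:, :, 1:, :] - image[:, :, :-1, :]
    y_var = image[:, 1:, :, :] - image[:, :-1, :, :]
    return x_var, y_var

# 4x4 single-channel image: left half black (0), right half white (1)
img = np.zeros((1, 4, 4, 1))
img[:, :, 2:, :] = 1.0

x_var, y_var = high_pass_x_y_np(img)
# The step shows up only in the horizontal differences, exactly at the edge column
assert np.all(x_var[0, :, 1, 0] == 1.0)
# The image is constant along each column, so the vertical differences vanish
assert np.all(y_var == 0.0)
```

Flat regions contribute nothing, while sharp transitions produce large differences, which is what lets this term penalize high-frequency noise.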

x_deltas, y_deltas = high_pass_x_y(content_image)

plt.figure(figsize=(14,10))
plt.subplot(2,2,1)
imshow(clip_0_1(2*y_deltas+0.5), "Horizontal Deltas: Original")
plt.subplot(2,2,2)
imshow(clip_0_1(2*x_deltas+0.5), "Vertical Deltas: Original")

x_deltas, y_deltas = high_pass_x_y(image)
plt.subplot(2,2,3)
imshow(clip_0_1(2*y_deltas+0.5), "Horizontal Deltas: Styled")
plt.subplot(2,2,4)
imshow(clip_0_1(2*x_deltas+0.5), "Vertical Deltas: Styled")

Output:


This shows how the high-frequency components have increased.
Also, this high-frequency component is essentially an edge detector. For example, the Sobel edge detector gives similar output:

plt.figure(figsize=(14,10))
sobel = tf.image.sobel_edges(content_image)
plt.subplot(1,2,1)
imshow(clip_0_1(sobel[...,0]/4+0.5), "Horizontal Sobel-edges")
plt.subplot(1,2,2)
imshow(clip_0_1(sobel[...,1]/4+0.5), "Vertical Sobel-edges")

Output:


The regularization loss associated with this is the sum of the absolute values of these differences:

def total_variation_loss(image):
    x_deltas, y_deltas = high_pass_x_y(image)
    return tf.reduce_sum(tf.abs(x_deltas)) + tf.reduce_sum(tf.abs(y_deltas))

total_variation_loss(image).numpy()
Output:
99172.59
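On a tiny example the total variation is easy to verify by hand. The snippet below is a NumPy stand-in for the TF ops above, illustrative only:

```python
import numpy as np

def total_variation_np(image):
    # Sum of absolute horizontal and vertical neighbor differences
    x_deltas = image[:, :, 1:, :] - image[:, :, :-1, :]
    y_deltas = image[:, 1:, :, :] - image[:, :-1, :, :]
    return np.abs(x_deltas).sum() + np.abs(y_deltas).sum()

# A 2x2 single-channel image with one bright pixel:
#   0  1
#   0  0
img = np.array([[0.0, 1.0],
                [0.0, 0.0]]).reshape(1, 2, 2, 1)
# Horizontal: |1-0| + |0-0| = 1 ; Vertical: |0-0| + |0-1| = 1 ; total = 2
assert total_variation_np(img) == 2.0
```

A perfectly flat image would score 0, so minimizing this term pushes the optimizer toward smoother images.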
That demonstrates what it does. There is no need to implement it ourselves, though, as TensorFlow includes a standard implementation:

tf.image.total_variation(image).numpy()
Output:
array([99172.59], dtype=float32)

Re-running the optimization function

Pick a weight for the total_variation_loss term:

total_variation_weight = 30

Now include it in the train_step function:
@tf.function()
def train_step(image):
    with tf.GradientTape() as tape:
        outputs = extractor(image)
        loss = style_content_loss(outputs)
        loss += total_variation_weight*tf.image.total_variation(image)
    grad = tape.gradient(loss, image)
    opt.apply_gradients([(grad, image)])
    image.assign(clip_0_1(image))
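The training loop and the save step that follow call a tensor_to_image helper that is never defined in this post. A minimal version (matching the helper in the official TensorFlow tutorial, and assuming Pillow is installed) could look like this:

```python
import numpy as np
import PIL.Image

def tensor_to_image(tensor):
    # Convert a [0, 1] float tensor of shape (1, h, w, 3) into a PIL image
    tensor = np.array(tensor * 255, dtype=np.uint8)
    if np.ndim(tensor) > 3:
        assert tensor.shape[0] == 1
        tensor = tensor[0]          # drop the batch dimension
    return PIL.Image.fromarray(tensor)
```

It accepts either a TensorFlow tensor or a NumPy array, since np.array handles both.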
Reinitialize the optimization variable:

image = tf.Variable(content_image)
And run the optimization:
import time
start = time.time()

epochs = 10
steps_per_epoch = 100

step = 0
for n in range(epochs):
    for m in range(steps_per_epoch):
        step += 1
        train_step(image)
        print(".", end='')
    display.clear_output(wait=True)
    display.display(tensor_to_image(image))
    print("Train step: {}".format(step))

end = time.time()
print("Total time: {:.1f}".format(end-start))

Output:


Finally, save the result:

file_name = 'styletransfer.png'
tensor_to_image(image).save(file_name)

try:
    from google.colab import files
except ImportError:
    pass
else:
    files.download(file_name)

So, this brings us to the end of the blog. This Tecklearn ‘Process of Style Transferring’ blog helps you with commonly asked questions if you are looking for a job in Artificial Intelligence. If you wish to learn Artificial Intelligence and build a career in the AI or Machine Learning domain, then check out our interactive Artificial Intelligence and Deep Learning with TensorFlow Training, which comes with 24*7 support to guide you throughout your learning period. Please find the link for course details:

https://www.tecklearn.com/course/artificial-intelligence-and-deep-learning-with-tensorflow/

Artificial Intelligence and Deep Learning with TensorFlow Training

About the Course

Tecklearn’s Artificial Intelligence and Deep Learning with TensorFlow course is curated by industry professionals as per industry requirements and demands, and aligned with the latest best practices. You’ll master convolutional neural networks (CNN), TensorFlow, TensorFlow code, transfer learning, graph visualization, recurrent neural networks (RNN), Deep Learning libraries, GPUs in Deep Learning, the Keras and TFLearn APIs, backpropagation, and hyperparameters via hands-on projects. You will learn AI by mastering natural language processing, deep neural networks, predictive analytics, reinforcement learning, and the other skills needed to shine in this field.

Why Should you take Artificial Intelligence and Deep Learning with TensorFlow Training?

• According to Paysa.com, an Artificial Intelligence Engineer earns an average of $171,715, ranging from $124,542 at the 25th percentile to $201,853 at the 75th percentile, with top earners earning more than $257,530.
• Worldwide spending on Artificial Intelligence systems will be nearly $98 billion in 2023, growing at a CAGR of 28.5%, according to a new IDC Spending Guide.
• IBM, Amazon, Apple, Google, Facebook, Microsoft, Oracle and almost all the leading companies are working on Artificial Intelligence to innovate future technologies.

What you will Learn in this Course?

Introduction to Deep Learning and AI
• What is Deep Learning?
• Advantage of Deep Learning over Machine learning
• Real-Life use cases of Deep Learning
• Review of Machine Learning: Regression, Classification, Clustering, Reinforcement Learning, Underfitting and Overfitting, Optimization
• Pre-requisites for AI & DL
• Python Programming Language
• Installation & IDE
Environment Set Up and Essentials
• Installation
• Python – NumPy
• Python for Data Science and AI
• Python Language Essentials
• Python Libraries – Numpy and Pandas
• Numpy for Mathematical Computing
More Prerequisites for Deep Learning and AI
• Pandas for Data Analysis
• Machine Learning Basic Concepts
• Normalization
• Data Set
• Machine Learning Concepts
• Regression
• Logistic Regression
• SVM – Support Vector Machines
• Decision Trees
• Python Libraries for Data Science and AI
Introduction to Neural Networks
• Creating Module
• Neural Network Equation
• Sigmoid Function
• Multi-layer Perceptron
• Weights, Biases
• Activation Functions
• Gradient Descent and Error Function
• Epoch, Forward & Backward Propagation
• What is TensorFlow?
• TensorFlow code-basics
• Graph Visualization
• Constants, Placeholders, Variables
Multi-layered Neural Networks
• Error Backpropagation Issues
• Dropouts
Regularization techniques in Deep Learning
Deep Learning Libraries
• Tensorflow
• Keras
• OpenCV
• SkImage
• PIL
Building of Simple Neural Network from Scratch from Simple Equation
• Training the model
Dual Equation Neural Network
• TensorFlow
• Predicting Algorithm
Introduction to Keras API
• Define Keras
• How to compose Models in Keras
• Sequential Composition
• Functional Composition
• Predefined Neural Network Layers
• What is Batch Normalization
• Saving and Loading a model with Keras
• Customizing the Training Process
• Using TensorBoard with Keras
• Use-Case Implementation with Keras
GPU in Deep Learning
• Introduction to GPUs and how they differ from CPUs
• Importance of GPUs in training Deep Learning Networks
• The GPU constituent with simpler core and concurrent hardware
• Keras Model Saving and Reusing
• Deploying Keras with TensorBoard
Keras Cat Vs Dog Modelling
• Activation Functions in Neural Network
Optimization Techniques
• Some Examples for Neural Network
Convolutional Neural Networks (CNN)
• Introduction to CNNs
• CNNs Application
• Architecture of a CNN
• Convolution and Pooling layers in a CNN
• Understanding and Visualizing a CNN
RNN: Recurrent Neural Networks
• Introduction to RNN Model
• Application use cases of RNN
• Modelling sequences
• Training RNNs with Backpropagation
• Long Short-Term memory (LSTM)
• Recursive Neural Tensor Network Theory
• Recurrent Neural Network Model
Application of Deep Learning in image recognition, NLP and more
Real world projects in recommender systems and others

Got a question for us? Please mention it in the comments section and we will get back to you.
