Neural style transfer is a method of blending two images: it creates a new image from a content image by copying the style of another image, called the style image. The resulting image is often referred to as a stylized image.
In this article, we will copy Andy Warhol’s style from Marilyn Diptych to our photographs. Warhol created the Monroe diptych in 1962, first painting the canvas with different colors and then placing the now famous image of Marilyn on top of the canvas. Although Warhol is not the founder of Pop Art, he is one of the most influential figures in the genre.
Figure: 1. “Marilyn Diptych” by Warhol; the lead image shows our result of neural style transfer into pop art, which we achieved using the VGG-19 network
On the technical side of the tutorial, instead of using the off-the-shelf Magenta network, we will use a pretrained VGG-19 computer vision model and fine-tune it. This article is therefore a guide to transfer learning as well as computer vision. With transfer learning, we can achieve better results if we tune the model properly, and we gain a wide range of additional customization options.
Transfer learning is a subfield of machine learning and artificial intelligence that aims to apply the knowledge gained from one task (the source task) to a different but similar task (the target task).
I will briefly talk about the model that we will be tuning: VGG-19.
VGG-19 is a convolutional neural network with a depth of 19 layers. It was built and trained by K. Simonyan and A. Zisserman at Oxford University in 2014. The details are in their paper, Very Deep Convolutional Networks for Large-Scale Image Recognition, published in 2015. The VGG-19 network was trained on more than one million 224×224 pixel color images from the ImageNet database. Naturally, you can import the model with the ImageNet weights already trained. This pretrained network can classify images into a thousand object categories. In this tutorial, we will remove the top part used for classification and add our own layers so that the network can be used for neural style transfer. Here is the official network visualization from the paper:
Figure: 3. Illustration of the VGG-19 network
As I mentioned, whose style could be more iconic and more appropriate for a pop art transfer than Andy Warhol’s? We’ll use his iconic work, Marilyn Diptych, as the style base and a portrait photo from Unsplash as the main content:
Figure: 4. Marilyn Diptych and a photo chosen for the experiment
Setting paths to images
Using TensorFlow, I can call get_file to fetch files from external URLs. With the code below, I will download two images into my Colab notebook, one for style and one for content:
Since our images are high resolution, we need to scale them down so that training does not take too long. The code below converts the image data to a suitable format, scales the image (feel free to change the max_dim parameter), and creates a new object that can be fed to the model:
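The original gist is not reproduced in this copy. As a dependency-free sketch of the arithmetic behind the scaling step, the hypothetical helper scaled_shape below computes a target size whose longest side equals max_dim while keeping the aspect ratio. The helper name and the default of 512 are assumptions, not values taken from the article. In TensorFlow, a shape like this would typically be passed to a resize call such as tf.image.resize.

```python
def scaled_shape(height, width, max_dim=512):
    """Return (new_height, new_width) whose longest side equals max_dim.

    Hypothetical helper: the name and the default max_dim are assumptions.
    Images smaller than max_dim are scaled up by the same rule.
    """
    if height <= 0 or width <= 0 or max_dim <= 0:
        raise ValueError("height, width and max_dim must be positive")
    longest = max(height, width)
    # Both sides use the same factor max_dim / longest, rounded down
    # to whole pixels (and never below one pixel).
    new_height = max(1, height * max_dim // longest)
    new_width = max(1, width * max_dim // longest)
    return new_height, new_width


print(scaled_shape(1024, 768))        # (512, 384)
print(scaled_shape(3000, 2000, 512))  # (512, 341)
```

A smaller max_dim shortens every training step at the cost of detail in the stylized result.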
Now that we have defined our img_scaler function, we can create a wrapper function that loads an image from the image paths we set above, scales it to speed up training (by calling img_scaler()), and creates a 4D tensor that fits VGG-19:
Now we can simply create the content and style image tensors, such as style_image, using the functions defined above:
Using matplotlib, we can easily display content and style images side by side:
And here’s the output:
Figure: 5. Visualization of content and style images
Now that our images are ready for neural style transfer, we can create our VGG-19 model and prepare it for fine-tuning. This process requires more attention, but careful reading and coding will get you to the result. In this section, we will:
- Load VGG-19 with the Keras Functional API from TensorFlow and load it with ImageNet weights.
- Create a Gram matrix function to calculate the style loss.
- Select the layers of the trained VGG-19 model for content and style.
- Create a custom model based on the previously loaded VGG-19 model with Keras Model Subclassing.
- Set up the optimizer and loss functions.
- Define a custom training step.
- Run the custom training loop.
Pay attention to the comments in the gist
Loading VGG from the Functional API
Since Keras hosts a pretrained VGG-19 model, we can load the model from the Keras Applications API. Let’s first create a function to use later in the Model Subclassing section. This function lets us create a custom VGG model with the desired layers while still having access to the model’s attributes:
Basic model with Model Subclassing
Instead of comparing the raw intermediate outputs of the content image and the style image, we compare the Gram matrices of the two outputs using the gram_matrix function, which gives more accurate results:
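To make the idea concrete, here is a minimal pure-Python sketch of a Gram matrix for a single feature map. It is not the article's TensorFlow code: the input layout (a list of spatial positions, each holding one value per channel) and the division by the number of positions are assumptions made for illustration.

```python
def gram_matrix(features):
    """Gram matrix of a feature map given as a list of spatial positions.

    features[p][c] is the activation of channel c at position p (height and
    width flattened into one axis). Dividing by the number of positions is
    an assumed normalization.
    """
    if not features:
        raise ValueError("features must contain at least one position")
    num_positions = len(features)
    num_channels = len(features[0])
    gram = [[0.0] * num_channels for _ in range(num_channels)]
    for position in features:
        for c in range(num_channels):
            for d in range(num_channels):
                # Accumulate the co-activation of channels c and d.
                gram[c][d] += position[c] * position[d]
    return [[value / num_positions for value in row] for row in gram]


# Two channels observed at two spatial positions.
features = [[1, 0], [1, 2]]
print(gram_matrix(features))  # [[1.0, 1.0], [1.0, 2.0]]
```

The result no longer depends on where in the image a feature fired, only on which channels tend to fire together, which is why it works as a description of style. In TensorFlow the same contraction is commonly written as a single einsum over the batch, height, width, and channel axes.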
The VGG-19 model consists of 5 blocks with several layers inside each block, as shown above. We’ll select the first convolutional layer of each block to capture style. Since intermediate-level information is more valuable for transfer learning, we will keep the second convolutional layer of the fifth block as the content layer. The following lines create two lists with these layer names:
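A sketch of what these two lists could look like, assuming the layer naming scheme of the Keras VGG19 model (block<N>_conv<M>). The variable names are assumptions chosen for illustration.

```python
# Layer names follow the Keras VGG19 naming scheme "block<N>_conv<M>".
# Style: the first convolutional layer of each of the five blocks.
style_layers = [f"block{block}_conv1" for block in range(1, 6)]
# Content: the second convolutional layer of the fifth block.
content_layers = ["block5_conv2"]

num_style_layers = len(style_layers)
num_content_layers = len(content_layers)

print(style_layers)
print(content_layers)
```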
Now that we have selected the layers and have the gram_matrix() function to calculate the loss and the vgg_layers() function to load the desired layers from VGG-19, we can create our main model with Keras Model Subclassing. With the following lines, we preprocess_input the data, pass it through our custom VGG model, and apply gram_matrix. We create the model and call it extractor. The model outputs a dictionary that contains the output values for content and style information:
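The shape of that output dictionary can be sketched without TensorFlow. The hypothetical helper split_outputs below groups per-layer outputs into a content part and a style part. It assumes the model returns style outputs first and content outputs last, in the order of the layer lists; both the helper and that ordering are assumptions for illustration.

```python
def split_outputs(outputs, style_layers, content_layers):
    """Group per-layer outputs into {"content": {...}, "style": {...}}.

    outputs is assumed to list the style outputs first, then the content
    outputs, in the same order as the two layer name lists.
    """
    expected = len(style_layers) + len(content_layers)
    if len(outputs) != expected:
        raise ValueError(f"expected {expected} outputs, got {len(outputs)}")
    num_style = len(style_layers)
    style = dict(zip(style_layers, outputs[:num_style]))
    content = dict(zip(content_layers, outputs[num_style:]))
    return {"content": content, "style": style}


# Strings stand in for the tensors a real model would return.
result = split_outputs(
    ["gram1", "gram2", "features5"],
    style_layers=["block1_conv1", "block2_conv1"],
    content_layers=["block5_conv2"],
)
print(result["style"]["block2_conv1"])    # gram2
print(result["content"]["block5_conv2"])  # features5
```

Keying the outputs by layer name makes it easy to compare the same layer between the generated image and its targets when computing the loss.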
Optimizer and loss settings
Now that we can output predictions for both style and content information, it’s time to set up our model’s optimizer with Adam and create a custom loss function:
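As a rough illustration of the loss only (the Adam optimizer is not covered here), the sketch below combines a style error and a content error into one weighted number. The dictionary layout, the use of mean squared error, and the default weights are all assumptions; the article's actual values are not reproduced in this copy.

```python
def mean_squared_error(values, targets):
    """Mean of squared differences between two equally long flat lists."""
    if not values or len(values) != len(targets):
        raise ValueError("values and targets must be non-empty and equal in length")
    return sum((v - t) ** 2 for v, t in zip(values, targets)) / len(values)


def style_content_loss(outputs, targets, style_weight=1e-2, content_weight=1e4):
    """Weighted sum of the layer-averaged style and content errors.

    outputs and targets have the form
    {"style": {layer: values}, "content": {layer: values}},
    where values is a flat list of numbers.
    The default weights are illustrative assumptions.
    """
    def averaged_error(kind):
        layers = targets[kind]
        errors = [mean_squared_error(outputs[kind][name], layers[name])
                  for name in layers]
        return sum(errors) / len(errors)

    return (style_weight * averaged_error("style")
            + content_weight * averaged_error("content"))


outputs = {"style": {"s": [1.0, 3.0]}, "content": {"c": [2.0]}}
targets = {"style": {"s": [0.0, 1.0]}, "content": {"c": [0.0]}}
print(style_content_loss(outputs, targets, style_weight=1.0, content_weight=2.0))  # 10.5
```

The ratio between the two weights decides whether the result leans toward the painting or toward the photograph, so it is a natural parameter to experiment with.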
Custom training step
We will now define a custom train_step function that takes advantage of GradientTape, which provides the automatic differentiation needed to compute the gradients of the loss. GradientTape records the operations during the forward pass and can then compute the gradient of our loss function with respect to the input image during the backward pass. Note that we use the tf.function() decorator so that TensorFlow compiles train_step into a graph. Feel free to experiment with total_variation_weight to get different style transfer results.
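To show what total_variation_weight controls, here is a small pure-Python sketch of total variation on a grayscale image, together with a hypothetical regularized_loss helper. How the article's train_step combines the terms is not visible in this copy, so adding the weighted term to the loss is an assumption. TensorFlow offers tf.image.total_variation for the tensor version of this measure.

```python
def total_variation(image):
    """Sum of absolute differences between neighbouring pixels.

    image is a 2-D list of rows holding one grey value per pixel.
    """
    if not image or not image[0]:
        raise ValueError("image must contain at least one pixel")
    height, width = len(image), len(image[0])
    vertical = sum(abs(image[y + 1][x] - image[y][x])
                   for y in range(height - 1) for x in range(width))
    horizontal = sum(abs(image[y][x + 1] - image[y][x])
                     for y in range(height) for x in range(width - 1))
    return vertical + horizontal


def regularized_loss(base_loss, image, total_variation_weight):
    """Add the weighted total variation to a loss (assumed combination)."""
    return base_loss + total_variation_weight * total_variation(image)


flat = [[0.5, 0.5], [0.5, 0.5]]
noisy = [[0.0, 1.0], [1.0, 0.0]]
print(total_variation(flat))   # 0.0
print(total_variation(noisy))  # 4.0
print(regularized_loss(10.0, noisy, total_variation_weight=0.5))  # 12.0
```

A larger weight penalizes high-frequency noise more strongly and yields a smoother image, while a weight of zero leaves the style and content terms alone.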
Custom training loop
Now that everything is in place, we can run a custom training loop to optimize the image and get the best results. Let’s run the model for 20 epochs with 100 steps_per_epoch. This will give us a nice pop art version of the photo we loaded at the beginning. In addition, our loop will display the stylized photo after each epoch (this output is temporary).
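The structure of such a loop can be sketched with a toy stand-in for the real training step. Here train_step performs one gradient-descent update on a single number instead of an image, and the function names, learning rate, and printed summary are assumptions for illustration. Only the 20 epochs and 100 steps per epoch come from the text.

```python
def train_step(value, target, learning_rate=0.01):
    """Toy stand-in for the real step: one gradient-descent update on the
    squared error (value - target) ** 2."""
    gradient = 2.0 * (value - target)
    return value - learning_rate * gradient


def run_training(value, target, epochs=20, steps_per_epoch=100):
    """Run epochs * steps_per_epoch updates and report once per epoch."""
    step = 0
    for epoch in range(epochs):
        for _ in range(steps_per_epoch):
            step += 1
            value = train_step(value, target)
        # The real loop would display the intermediate stylized image here.
        print(f"epoch {epoch + 1}: step {step}, value {value:.4f}")
    return value, step


final_value, total_steps = run_training(value=0.0, target=1.0)
print(total_steps)  # 2000
```

With these settings the loop performs 2,000 updates in total, so halving either number halves the running time.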
If you are using Google Colab to repeat the steps in the tutorial, make sure you enable the hardware accelerator in your notebook settings. This will significantly reduce the training time.
Save and display the stylized image
Now that our model has finished training, we can save the stylized photo using the TensorFlow preprocessing API. The next line will save the photo to your environment:
Here’s the result:
Figure: 6. Photo and stylized version
You have just built a neural style transfer model using transfer learning. Obviously there are ways to make it better, but if you look closely, you will see that our model copied the way Warhol styled Monroe’s hair. The model also borrowed the background color from the Marilyn Diptych. Experiment with the number of epochs and steps_per_epoch to get different results. You can also use other art styles to get interesting results.