Learn how previously discarded style transfer instabilities can create psychedelic videos.
The first frame of the video, a watercolor, is progressively stirred into a plethora of curvy and colorful patches. It then metamorphoses into a purplish phantasmal coral reef that is itself slowly submerged by an angry puce ocean. The water then calms down as the coral reef disappears and ends up perfectly still. How is this psychedelic animation related to style transfer methods?
Neural style transfer methods are rendering techniques, mostly for images, that seek to stylize a content image with the style of another (see the figure below). More precisely, the algorithm extracts a style representation from one image and a representation of the semantic content from another, and then cleverly constructs a new picture from the two.
While designing new evaluation techniques for style transfer methods (see our paper), we made a simple but crucial observation. Style transfer applied with the same image as both style and content should reasonably output the image itself. However, we observed that many style transfer algorithms do not satisfy this property, and no one seems to have bothered to hard-code it. Here, we show how we leverage that instability to produce animations like the one above.
Formally, a style transfer method is simply a function f that takes a style image s and a content image c and outputs a new image f(s, c). Our observation is that for some style transfer methods f and an initial image x⁰, the equality f(x⁰, x⁰) = x⁰ is not satisfied. The output adds a slightly perceptible flicker, blur or blunder to the initial image x⁰. These instability patterns differ from one method to another but, for a given method, look the same whatever the initial image x⁰.
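To make the property f(x⁰, x⁰) = x⁰ concrete, here is a minimal sketch that measures how far a method is from satisfying it. Everything in it is an assumption made for illustration: `toy_transfer` is a hypothetical stand-in that only stretches contrast around the style's mean intensity (it is not MST, WCT or any real style transfer method), and `self_transfer_residual` is a name we introduce here, not something from our paper.

```python
from typing import Callable, List

# A grayscale image as rows of pixel intensities in [0, 1].
Image = List[List[float]]
# f(style, content) -> stylized image
StyleTransfer = Callable[[Image, Image], Image]


def toy_transfer(style: Image, content: Image) -> Image:
    """Hypothetical stand-in for f, NOT a real style transfer method.

    It stretches the content's contrast around the style's mean intensity,
    so f(x, x) differs from x unless x is a constant image.
    """
    n_pixels = sum(len(row) for row in style)
    mean = sum(sum(row) for row in style) / n_pixels
    return [
        [min(1.0, max(0.0, mean + 1.1 * (p - mean))) for p in row]
        for row in content
    ]


def self_transfer_residual(f: StyleTransfer, x: Image) -> float:
    """Mean absolute difference between f(x, x) and x.

    The residual is zero exactly when x is a fixed point of x -> f(x, x).
    """
    y = f(x, x)
    n_pixels = sum(len(row) for row in x)
    total = sum(abs(a - b) for ry, rx in zip(y, x) for a, b in zip(ry, rx))
    return total / n_pixels


x0 = [[0.2, 0.4], [0.6, 0.8]]
residual = self_transfer_residual(toy_transfer, x0)
print(f"residual of f(x0, x0) against x0: {residual:.4f}")
```

With a real method, `toy_transfer` would be replaced by a wrapper around the model's forward pass, and a small but nonzero residual is the hardly perceptible effect described above.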
Yet these effects are hardly perceptible, so to better understand the phenomenon we need to amplify them. We simply repeat the process: start from an initial image x⁰ and iterate the style transfer operation xᵗ⁺¹ = f(xᵗ, xᵗ).
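The repetition can be sketched in a few lines, reading it as xᵗ⁺¹ = f(xᵗ, xᵗ), where each output serves as both style and content of the next step. As before, this is only a sketch under stated assumptions: `toy_transfer` is a hypothetical contrast-stretching stand-in for f, not an actual style transfer network, and `iterate_self_transfer` is a helper name chosen for this example.

```python
from typing import Callable, List

# A grayscale image as rows of pixel intensities in [0, 1].
Image = List[List[float]]
StyleTransfer = Callable[[Image, Image], Image]


def toy_transfer(style: Image, content: Image) -> Image:
    """Hypothetical stand-in for f, NOT a real style transfer method."""
    n_pixels = sum(len(row) for row in style)
    mean = sum(sum(row) for row in style) / n_pixels
    return [
        [min(1.0, max(0.0, mean + 1.1 * (p - mean))) for p in row]
        for row in content
    ]


def iterate_self_transfer(f: StyleTransfer, x0: Image, steps: int) -> List[Image]:
    """Return the frames [x0, x1, ..., x_steps] with x_{t+1} = f(x_t, x_t)."""
    frames = [x0]
    for _ in range(steps):
        x = frames[-1]
        frames.append(f(x, x))  # the same image is used as style and content
    return frames


def mean_abs_diff(a: Image, b: Image) -> float:
    """Mean absolute pixel difference between two images of the same size."""
    n_pixels = sum(len(row) for row in a)
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)) / n_pixels


x0 = [[0.2, 0.4], [0.6, 0.8]]
frames = iterate_self_transfer(toy_transfer, x0, steps=50)
for t in (1, 5, 50):
    print(t, round(mean_abs_diff(frames[t], x0), 4))
```

In this toy setting the drift away from x⁰ grows with each iteration until the pixels saturate. Played in order, the list `frames` is the kind of sequence an animation can be assembled from.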
Indeed, after a few iterations, the effects become perceptible and particularly stylish. In the images above, when taking the MST style transfer method (thanks to this code), the iterates become tessellated versions of the initial photo. The instabilities amplify all the lines of the pictures. On portraits, they reveal all wrinkles. When taking another algorithm like WCT (thanks to this code), the effects are different, see below.
So far, we have only shown the outputs of the first iterations of this repeated process. The animation above simply collects all the images of the sequence (xᵗ). For many different pictures and methods, we observe an asymptotic type of divergence that we call the psychedelic regime. Once the algorithm loses track of the initial image, it starts raving and then feeds on its own ever more delusional outputs, without ever coming back to our reality! The raving differs from one method to another but, experimentally, does not depend on the initial image.
This playfully shows what a machine can do when it forgets about human inputs or the non-numerical reality. Metaphorically, the same thing happens in many practical uses of algorithms. For instance, recommender systems based on collaborative filtering use new data that come from humans interacting with the algorithm itself. We can no longer assess the choices humans would have made had they never been influenced by algorithms. We have lost this initial input!
Finally, if you are interested in making your own psychedelic videos, each feed-forward neural style transfer approach will likely give its own psychedelic regime. The psychedelic video below was made with the WCT style transfer method.