Creatives need neural "concept" style transfer
Adapting neural style transfer for creatives: Part 2
This is part 2 of my post from last week, Neural Styles and Bioshock's Concepts. TL;DR: “Neural style transfer” is useful in that it transfers colors and shapes, and colors and shapes certainly affect how one feels upon seeing an image.
However, colors and shapes alone are insufficient to completely describe “style,” or whatever one might call the concepts one might wish to transfer from one image to another. They fall short because the concepts you might wish to transfer are abstractions humans employ when they perceive and interact with the world. A model cannot learn these abstractions reliably without heavily curated training data.
Here, I present another thought experiment on a style-transfer use case.
We want “concept transfer.”
Suppose I had a picture of this handsome fellow.
Then suppose I wonder aloud, “What would this face look like if it were artistically composed of fruit?”
In other words, imagine me as a creative trying to recreate something akin to the painting Vertumnus by Giuseppe Arcimboldo.
I decide to use neural style transfer as a tool. I search for a fruit still life to use as my style source. I find this work by Severin Roesen.
And I apply neural style transfer, hoping to achieve something like the following.
But instead, all I get is some horrible skin disease.
The neural style transfer architecture learns color patterns and hierarchies of geometries. Here, it seemed to learn geometries that approximate the concepts of grapes and leaves, though the match is imperfect.
It also fails to partition the face in a way that is meaningful to humans. Humans think of faces as composed of discrete parts like noses, mouths, and foreheads. Each of those concepts is composed of still smaller concepts like eyelids and nostrils. These entities are abstractions we invent; there is no real border separating the upper cheek from the nose; we just imagine there is.
In the hypothetical transfer to Vertumnus, the nose, cheeks, and forehead are discrete fruits and vegetables. The concepts that compose the eyes map to small vegetables. The hypothetical transfer preserves the separation of the concepts that compose the face.
The trouble with the neural style transfer is that you can’t point anywhere in the algorithm’s shape hierarchy and say, “Hey, that pattern there? That’s a pear. It’s good for cheeks and butts.” In other words, you can’t curate the abstractions the model learns as it’s executing the style transfer.
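To make that concrete, here is a minimal sketch of the style loss used in the common Gram-matrix formulation of neural style transfer (Gatys et al.). I am assuming that formulation; the tool behind the images in this post may differ in its details. The random array is a stand-in for one layer of CNN activations, and `gram_matrix` and `style_loss` are illustrative names, not a library API. The point is that the style statistic only records which features co-occur, not where they sit, so nothing in it says "this region is a cheek" or "this pattern is a pear."

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    return flat @ flat.T / (h * w)


def style_loss(feat_a, feat_b):
    """Mean squared difference between the two Gram matrices."""
    diff = gram_matrix(feat_a) - gram_matrix(feat_b)
    return float(np.mean(diff ** 2))


rng = np.random.default_rng(0)

# Stand-in for one layer's activations (assumption: 8 channels, 16x16 grid).
features = rng.normal(size=(8, 16, 16))

# Scramble every spatial position the same way in all channels:
# the "textures" are unchanged, but the layout is destroyed.
perm = rng.permutation(16 * 16)
shuffled = features.reshape(8, -1)[:, perm].reshape(8, 16, 16)

# A genuinely different feature map, for comparison.
other = rng.normal(size=(8, 16, 16))

# The first value differs from zero only by floating-point rounding.
print("same textures, scrambled layout:", style_loss(features, shuffled))
print("different textures:", style_loss(features, other))
```

Because the scrambled map scores as a near-perfect style match, the loss gives the optimizer no reason to keep a pear in one piece, let alone place it on a cheek.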
I brainstormed this post with a cousin of mine who has artistic talent and knowledge of art history. She tried a manual style-transfer based on a painting from one of Arcimboldo’s circle:
She found a stock photo of a woman’s profile online and tried to sketch what the profile might look like if composed of pots, pans, cutlery, a loom, and other tools. She got as far as sketching the image on the right before realizing it was a much more difficult creative task than she expected.
The challenge of the creative task motivates the AI-tool use case; if such a tool could catalyze her creative effort (instead of displacing her as the artist), that would be a win.
I tried applying neural style transfer and got worse results than with Vertumnus.
Notice I used the original painting as the style source, which should have given the algorithm an edge.
While neural style transfer on Vertumnus learned enough geometry to capture leaves and grapes, here it captured none of the objects in the style source.
Could conceptual style transfer work?
I suspect that in some cases, we could solve this problem by pairing standard neural style transfer techniques with training data heavily curated to suit a specific style. But most likely, it would only work for a narrow set of content source images, and it would be tough to predict which source images would work and which would not. It's hard to build an app out of that.
Machine learning researchers are working on variants of this style transfer problem. DeepMind’s SPIRAL system does style transfer when the style is described in terms of simple abstractions like strokes of varying widths, as in the following image.
Work by Iseringhausen et al. effectively simulates parquetry: images and designs made out of pieces of wood with predefined shapes.
These examples are a step in the right direction because the components of the style (strokes and wood pieces) are defined in advance. In both cases we see a mapping to some distinct concepts (e.g. eyes and nose in SPIRAL, pupil in Iseringhausen et al.).
These elements are essential for art and design because artists and designers deal in concepts. Concepts are paramount; shapes and colors are how they render those concepts.
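As a toy illustration of what "components defined in advance" buys you, here is a sketch that rebuilds an image using only tiles from a fixed set of pieces. This is not the method of SPIRAL or of Iseringhausen et al.; it is a greedy nearest-tile match I am assuming purely for illustration, and `compose_from_pieces` is a hypothetical name. What it shares with those systems is the constraint: every part of the output is one of the named pieces, so you can always point at a region and say which piece it is.

```python
import numpy as np


def compose_from_pieces(target, pieces):
    """Rebuild `target` from predefined square tiles.

    target: (H, W) float grayscale image, H and W multiples of the tile size.
    pieces: (n, p, p) array holding the n allowed tiles.
    Returns the composed image and, per tile position, the index of the piece used.
    """
    n, p, _ = pieces.shape
    h, w = target.shape
    out = np.empty_like(target)
    choices = np.empty((h // p, w // p), dtype=int)
    for i in range(0, h, p):
        for j in range(0, w, p):
            patch = target[i:i + p, j:j + p]
            # Squared error between this patch and every allowed piece.
            errors = ((pieces - patch) ** 2).sum(axis=(1, 2))
            k = int(np.argmin(errors))
            out[i:i + p, j:j + p] = pieces[k]
            choices[i // p, j // p] = k
    return out, choices


# Two allowed pieces: a dark tile (index 0) and a light tile (index 1).
pieces = np.stack([np.zeros((2, 2)), np.ones((2, 2))])

# A 4x4 target that is dark on the left half and fairly light on the right.
target = np.zeros((4, 4))
target[:, 2:] = 0.9

out, choices = compose_from_pieces(target, pieces)
print(choices)
```

The `choices` grid is the interesting output: it is an explicit record of which predefined piece ended up where, which is exactly the kind of handle that plain neural style transfer does not give you.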
Iseringhausen, J., Weinmann, M., Huang, W. and Hullin, M.B., 2019. Computational Parquetry: Fabricated Style Transfer with Wood Pixels. arXiv preprint arXiv:1904.04769.
Mellor, J.F., Park, E., Ganin, Y., Babuschkin, I., Kulkarni, T., Rosenbaum, D., Ballard, A., Weber, T., Vinyals, O. and Eslami, S.M., 2019. Unsupervised Doodling and Painting with Improved SPIRAL. arXiv preprint arXiv:1910.01007.