Jonghyun Jee | All Too Human: The Turing Test for Paintings

The web-based project All Too Human displays human-made paintings and AI-generated counterparts in a random order, inviting viewers to identify whether each image was created by an algorithm or a human. By investigating which AI-generated images appear more or less human-made, the project seeks to contribute to contemporary discussions about human perception of machine creativity in the visual arts.
 

[Image: The leaderboard section of the webpage.]
 
All Too Human is a project that investigates human perception in the visual arts, particularly how viewers perceive machine-generated versus human-created images. As GAN-based image synthesis advances, our understanding of visual perception becomes increasingly multifaceted. This fuels debates in the art world around notions such as authorship, vision, intelligence, and, most importantly, creativity, which has long been regarded as a distinctively human capability.
In a similar vein, several versions of the “Turing test” have been proposed to examine whether a machine can demonstrate such human capabilities. The Bot or Not project, for instance, tests whether users can distinguish between AI-written and human-written poems. All Too Human poses a similar question, but with paintings instead of poems. Admittedly, seeing an image is drastically different from reading a text. Human-written and computer-written texts are indistinguishable by their appearance alone, so what matters in a Turing test for poetry is primarily semantics and syntax. A Turing test for paintings, however, is a far more complex subject that cannot be boiled down to any single factor. Hence, this project aims to probe which types of AI-generated images appear more human-made, and what factors influence such visual perceptions.
This web-based project displays AI-generated paintings and human-made paintings in a random sequence, asking viewers to guess whether each image was created by an algorithm or a human. The human image set spans a wide range of styles (Realism, Impressionism, Surrealism, etc.) and genres (portrait, abstract, landscape), whereas the machine image set includes the results of different algorithms and training datasets. I primarily used two image-generating models: StyleGAN2 and CLIP+VQGAN. For StyleGAN2, I used WikiArt data consisting of 1) 40,000 paintings that span a comprehensive anthology of art history and 2) a selection of early twentieth-century abstract paintings. CLIP, on the other hand, was trained on 400 million image-text pairs covering diverse imagery such as photos, paintings, and even screenshots. Since the project presents images without any external context, such as a title or the artist’s name, viewers have to judge the authorship of these images intuitively. To collect more genuine responses, the website does not reveal the creator of each image until the very end, nor does it ask what factors influenced the viewer’s judgment. A separate leaderboard section lists the creator of each painting and the proportion of user responses for each image.
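To make the mechanics concrete, the sketch below shows one way the quiz flow could work. This is a minimal TypeScript illustration, not the site’s actual code; all names (Artwork, recordGuess, humanShare) are hypothetical. It shuffles the combined image set into a random order and tallies guesses so a leaderboard can report, per image, the proportion of viewers who judged it human-made.

```typescript
// Hypothetical sketch of the quiz flow; not the project's actual implementation.

interface Artwork {
  id: string;
  url: string;
  origin: "human" | "machine"; // hidden from the viewer until the end
}

interface Tally {
  humanGuesses: number;
  machineGuesses: number;
}

// Fisher-Yates shuffle: human-made and AI-generated images
// appear in an unbiased random order.
function shuffle<T>(items: T[]): T[] {
  const result = [...items];
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(Math.random() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Record a guess without revealing the answer; answers are
// disclosed only after the viewer finishes the whole sequence.
const tallies = new Map<string, Tally>();

function recordGuess(artwork: Artwork, guess: "human" | "machine"): void {
  const tally = tallies.get(artwork.id) ?? { humanGuesses: 0, machineGuesses: 0 };
  if (guess === "human") tally.humanGuesses++;
  else tally.machineGuesses++;
  tallies.set(artwork.id, tally);
}

// Leaderboard statistic: the share of viewers who judged an image human-made.
function humanShare(id: string): number {
  const t = tallies.get(id);
  if (!t) return 0;
  return t.humanGuesses / (t.humanGuesses + t.machineGuesses);
}
```

Under this sketch, an AI-generated image with a high humanShare would be one that, in the project’s terms, “passes” the test.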
In sum, the focus of this project is less on how adept algorithms have become at creating plausible images; instead, the experiment seeks to uncover insights into common perceptions and expectations about AI in the context of the visual arts. What does it mean if an AI-generated image passes this test? What does it mean for humans to have a visual culture in which we cannot tell whether images were created by humans or machines?

Tags: #ImageSynthesis #TuringTest #VisualPerception