
Loving/Hating DALL-E

I’m fidgeting in a swivel chair as we wait for Craiyon, formerly known as DALL-E mini, to spit out images for the text prompt: “philosopher Jean-Paul Sartre wearing a green crochet bonnet, photorealistic”. Meanwhile, my housemate attempts to explain to me the mechanics of the new DALL-E software—of which Craiyon is a simplified version—but I’m not really listening. I’m just waiting for Sartre to materialise on the screen. He does—and though it’s not exactly photorealistic, the likeness is undeniable.

Image produced using Craiyon

Over the next few days, we enter into a strange co-dependent relationship with Craiyon as we feed more prompts into the machine. Visitors to our house are subjected to demonstrations as though Craiyon were a pet puppy performing tricks. And I, usually annoyed by computers and software, become incredibly invested in the development of DALL-E.

Given how quickly Craiyon ascended to Know Your Meme status in June 2022, I think many were exposed to this new technology in a similar manner: type out the most ridiculous and inconceivable combination of ideas, wait two minutes, and receive an easy and free source of entertainment. But to me, the memeification of the software, though irresistible, displaced our attention from the power, possibilities and potential problems that lie behind these artificial intelligence programs. Behind the funny traces of uncanniness in these generated images is a new future of art and design production.

As the name suggests, DALL-E is a portmanteau of the beloved robot WALL-E and the surrealist painter Salvador Dalí, a nod to its capacity to generate images that expand beyond human conception. Developed by OpenAI, Craiyon’s parent DALL-E produces images in three broad steps: it encodes the input text, maps that text encoding onto a corresponding image encoding, and then decodes the result into a visual representation of the prompt. Compared to Craiyon, the DALL-E 2 model has been trained far more extensively on images paired with text descriptions, learning to match the two with remarkably high accuracy. More importantly, it has been fed “natural language” (colloquial words and phrases), which allows it to respond to a far greater diversity of prompts, recognising the objects in a scene and the relationships between them.
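To make that pipeline concrete, here is a toy sketch in Python. Every name and shape in it is my own stand-in: the real system uses a CLIP-style text encoder, a learned “prior” that maps text embeddings to image embeddings, and a diffusion decoder, each a large neural network rather than the stubs below.

```python
import zlib
import numpy as np

def encode_text(prompt: str) -> np.ndarray:
    # Stand-in for a CLIP-style text encoder: deterministically map the
    # prompt to a 512-dimensional pseudo-embedding.
    seed = zlib.crc32(prompt.encode("utf-8"))
    return np.random.default_rng(seed).standard_normal(512)

def prior(text_embedding: np.ndarray) -> np.ndarray:
    # Stand-in for the learned "prior" that maps a text embedding to a
    # plausible image embedding in the same shared space.
    rng = np.random.default_rng(1)
    return 0.5 * text_embedding + 0.1 * rng.standard_normal(512)

def decode(image_embedding: np.ndarray) -> np.ndarray:
    # Stand-in for the diffusion decoder, which turns an image embedding
    # into pixels (here: 64x64 RGB noise in place of a real picture).
    rng = np.random.default_rng(int(abs(image_embedding[0]) * 1000))
    return rng.random((64, 64, 3))

prompt = "philosopher Jean-Paul Sartre wearing a green crochet bonnet, photorealistic"
pixels = decode(prior(encode_text(prompt)))
print(pixels.shape)  # (64, 64, 3)
```

The structure is the point: text and images share one embedding space, and the decoder never sees the prompt itself, only the image embedding the prior hands it.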

Only a few weeks later, we gained access to DALL-E 2. As we play around with the new model, a whole suite of high-resolution skeleton images quickly materialises on the screen, including “a melancholic skeleton looking front on into a mirror, his reflection resembles a Caravaggio painting”. The despondent, gazing skeleton, whose eyes are sharpened by dark shadows, appears like a chiaroscuro portrait painted by a human hand. It sparks a pang of sympathy within me (why is the skeleton sad?) before I catch myself: this is an image created by a computer. I feel strange.

Towards the end of July, a payment scheme was introduced for DALL-E 2 users, ending the free system for early account holders. Every DALL-E 2 user receives 15 credits per month, with each credit representing one prompt; 115 further credits can be bought for $15. This moment marked a symbolic change: DALL-E 2 is no longer a source of free entertainment for the hyper-online, but software that demands remuneration. DALL-E’s ascent from the sharehouse laptop to the corporate desktop has commenced. But this raises a question: how do we place a dollar value on AI-made art? And what if that price is significantly lower than the cost of commissioning a human artist?
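The back-of-envelope arithmetic makes the comparison stark: $15 for 115 credits works out to roughly 13 cents per prompt ($15 ÷ 115 ≈ $0.13), with each prompt returning a handful of candidate images to choose from. Few human illustrators price their work in cents.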

For contemporary businesses, DALL-E is a cheap and expedient graphic design tool, one that eliminates the time and cost spent on contracting a human designer. There is a skill behind selecting the right input text, though, and often it is necessary to curate or alter the images DALL-E generates; these are tasks that require humans who are literate in the software. But it seems inevitable that this technology will mould the future of design and shape the role artists play in the corporate world, for better or worse.

A common thread in conversations about AI-generated art is the desire to justify an inherent hierarchy between human-made and AI-made art. We sentimentally credit human-made art with a particular quality of humanness, often termed the hand of the artist. This is what AI supposedly lacks: we deem it technological, cold, and uncanny. Sure, this is true of many of Craiyon’s creations, which are imperfect and intuitively feel non-human. But DALL-E 2’s works blend in almost imperceptibly amongst human-made works, and further iterations of the software will only be more successful in doing so. If we can no longer tell whether a human or a computer has “painted” a portrait, does it really matter?

With AI-generated content comes a new host of problems regarding intellectual property and the ease of appropriation. DALL-E 2 accepts images as inputs and produces new variations of them. This allows any user to feed the computer a set of drawings made by a particular artist and receive a set of new drawings inspired by their style and ideas. While I think most would agree that utilising DALL-E in such a way amounts to misuse, art has always been appropriated and imitated. How do we draw the boundaries here, when it’s more ambiguous than ever? And with OpenAI now allowing users full ownership over their AI-generated content, what does this mean for these kinds of appropriative creations?
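For a sense of how low the barrier is, this is roughly what such a request looked like through OpenAI’s Python client once the images API opened to developers later in 2022 (a sketch, not a recommendation: the filename is a placeholder, and the client library’s interface has since changed).

```python
import openai

openai.api_key = "sk-..."  # your OpenAI API key

# The input must be a square PNG; "their_drawing.png" is a placeholder
# standing in for an artist's uploaded work.
with open("their_drawing.png", "rb") as f:
    response = openai.Image.create_variation(image=f, n=4, size="1024x1024")

# Each variation comes back as a URL to a newly generated image.
for item in response["data"]:
    print(item["url"])
```

A few lines, a few cents, and an artist’s style is being riffed on without their knowledge.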

OpenAI has developed a series of restrictions and safeguards to combat some of these ethical problems. For example, those publishing AI-generated content must acknowledge that it was created with AI, a rule intended to curb the proliferation of deepfakes. Additionally, sexually explicit or violent content cannot be produced; this type of content is practically impossible to generate regardless, as DALL-E 2 was not exposed to explicit images during training. Further, to reduce racial and gender biases (for instance, all the people generated for the input “CEO” being white men), the software has been trained to ensure diversity amongst the people in its images.

Though these guidelines seek to tackle inappropriate or problematic content, they leave plenty of ambiguity and unanswered questions. Despite OpenAI’s efforts, the development of DALL-E has simply progressed faster than ethical enquiries or legal responses. At the end of the day, these are merely limitations on use created by a company, not laws. In this rapidly expanding field, it remains unclear who should be trusted to draw the line between appropriate and inappropriate uses of such AI technology.

When I first saw that goofy depiction of Sartre wearing a green crochet bonnet, I laughed and kept drinking my cup of tea. Now I feel increasingly unsettled when I stare into his pixelated eyes. I wonder if our desire to poke fun at not-quite-right AI technology masks a deeper feeling of unease: what will our world look like once DALL-E and its peers are further developed and improved? Will they open up a new realm of possibilities, or distort the world by redefining the whole sphere of art?

It might be fun while it lasts. But with the rapid advancement and increasing prevalence of AI technology, are we equipped to deal with a world where AI-generated content becomes the norm, rather than a gimmick?

 