Is AI creative?

Are AI systems capable of genuine creativity: not just remixing the source material they were trained on, but forging new work altogether – new prose or visual art that might be called “original” or “inspired” if a human had made it?

To be thorough in discussing this question we should say what we mean by “AI” and what we mean by “creativity,” and we should say what might distinguish inauthentic or substandard creativity from the “genuine” sort of creativity that we deem worthy of admiration. But we can also look for shortcuts to making an assessment, so here’s mine.

In the early 2000s, I would start my work mornings by checking the “Word of the Day” online. There were a handful of dictionary websites running back then, and each offered its own daily word. By checking four of these sites, I could get four uncommon and unrelated words which had almost certainly never been used together in a sentence before. My officemate at the time would check the same sites and get the same four words, and we’d make a daily game of it: each of us would try to write our own sentence using all four words together. We’d send these sentences back and forth over instant message, seeing who could write a sentence first, and whose sentence might come out the best each day.

To be good, a sentence had to do more than include the four required words; it had to illustrate all four of their meanings, so that someone reading that one sentence could make a strong guess at all four definitions.

In 2009, I turned this challenge into a website called Quadrivial Quandary (“QQ” for short) and I operated it until 2015. At the height of this project I’d spend hours every day moderating and maintaining the site and writing my own sentence. I cared about it dearly, and still do.

But why did I make such an investment in this quirky amusement? It was fun. And it was a chance to foster a small online community. To meet people and share our love of words.

But more than that, I felt it was like a laboratory – or gym – for creative problem solving. To write a good sentence, you’d have to deeply understand the meanings of each of the four words. You’d need a good sense of how those words might be used in speech – were they formal, informal? Positive, negative, neutral? Common, obscure? What contexts did they belong in?

You’d need to overcome your preconceptions. Surely an obscure medical term, a highly specific legal term, a slang interjection, and an obsolete botanical term could never be connected into one coherent utterance? You’d have to think again, and search hard for those connections. You’d have to invent a context, a story that brought those seemingly unrelated ideas together. And that story would need to be tight enough that each of the four words would seem essential – none would seem frivolous or easy to remove.

If creativity is about understanding your materials deeply, then combining those materials in new ways – ways that defy convention – to communicate meaning – and doing all this in a way that obeys specific constraints – then QQ was creativity in a nutshell. That’s how I thought of it and that’s why I spent so much time on it.

One thing that’s missing from this description of creativity is the component of emotion – the way a creator is moved by their own experiences – the way a creator yearns to express what they feel inside. It would be a stretch to say that QQ ever provided a way for participants to express their deepest inner lives, and it was never meant as such an avenue; but as a kind of writing practice, perhaps QQ was a way to refine one’s tools.

During the years that I maintained QQ, I never considered whether a computer could solve it – I just assumed this would never happen because computers aren’t creative and this was a creative challenge that many humans found difficult.

As I write these words in 2023, after the QQ site has been dormant for 7 years, OpenAI’s “ChatGPT” and other Large Language Models are all the rage. So can ChatGPT solve QQ?

You can see what this question means to me personally, because I invested so much of my life-energy in launching QQ, maintaining it, and contributing to it over six years, thinking of it as a quintessential creative challenge. If ChatGPT could solve it, I’d be forced to accept that ChatGPT was exhibiting creativity. And I’d find it hard to wiggle out of the conclusion that this creativity was the real, legitimate, genuine sort – the same kind I respected and encouraged as the founder of QQ.

So the other week, I took the first four words that came to mind: mirtazapine, eviction, afflatus, and conciliatory.

I gave ChatGPT this prompt:

Write one sentence that uses and illustrates the meanings of the following four words: mirtazapine, eviction, afflatus, conciliatory.

It came back with this:

After facing eviction, the writer’s afflatus was restored by a conciliatory conversation with their psychiatrist who adjusted their medication to include mirtazapine, leading to a renewed sense of creativity and purpose.

The sentence begins with a dangling participle. The meaning of eviction is not fully illustrated: from this sentence, we only know that it’s a negative outcome of some sort. We can gather that mirtazapine is an antidepressant but we don’t know what might be special about it. And the sentence does not really rise to the level of being a short story as we don’t really come to know or connect with a specific character here. The very best QQ submissions did introduce us to a person we could begin getting to know – they were pieces of micro-fiction in the span of one long sentence.

Considering that AI systems in 2023 can already pass the bar exam and make decent stock predictions and generate plausible academic papers, this QQ solution might not seem to be one of AI’s most spectacular achievements.

But it is awe-inspiring to me, because it’s a pretty decent solution to this particular QQ. As the moderator of the game for years, I saw far worse. This solution has got solid bones, and its flaws are fixable.

I doubt that ChatGPT had been trained on any text that included these four words together. And yet, in an instant it was able to discover a plausible story connecting them. If you’ve got “afflatus” in the mix, then you’ve probably got an artist or creative person. If you’ve got “mirtazapine” then you’ve got someone with depression, which is being treated, and that depression probably belongs to the artist. If you’ve got “eviction” maybe it’s because someone couldn’t pay rent, and maybe that someone who can’t pay rent is the artist because they were depressed and weren’t working. If you’ve got “conciliatory” in the mix, well, that could be the artist being conciliatory toward the landlord, or vice versa, but it could also be the doctor being conciliatory toward the patient.

To wiggle out of the conclusion that ChatGPT is being creative here, there are three approaches I could take.

First, I could argue that ChatGPT isn’t that good at solving QQ. I could prompt it with lots of other word combinations and focus on what it gets wrong. I could argue that the best human QQ solutions are categorically better than the best AI-generated solutions. But if I go down this path I have to start by acknowledging that ChatGPT has already done something which I never imagined any computational system would ever be able to do. With that one sentence quoted above, my view of what’s possible has changed irrevocably.

Second, I could argue that QQ doesn’t require as much creativity as I thought it did. Perhaps we could devise a system for QQ that would make it easy for humans to solve the puzzle, so that a person wouldn’t really need to manifest any “creativity” in following that system to construct a plausible sentence that uses any four arbitrary words. But I have to remember that QQ has been one of the biggest labors of love in my life so far, and I poured an unreasonable amount of effort into it over a long span of years. I have to trust in myself that I wouldn’t have done that if there weren’t something deep to be explored and practiced in this game.

Third, I could argue that although ChatGPT can solve QQ, it’s not solving it in an “interesting” or respectable way. What would that mean? Perhaps it’s using brute force in a way that we wouldn’t accept as truly creative. Imagine a system that generated all possible sentences of a certain length, then removed those sentences that don’t include the four required words, and finally applied a statistical metric to choose, from all the remaining possibilities, the sentence that is most likely or most consistent with reams of recorded human speech. Would such a brute-force process seem to be creative in a satisfying way? Not much more than the monkey in the so-called Infinite Monkey Theorem, who hits keys at random for an infinite amount of time and at some point types out the full text of Hamlet. We can be sure that ChatGPT is not working exactly like this – it can’t be exploring every possible sequence of, say, 80 words, because this space of options far exceeds the estimated number of particles in the observable universe. But maybe it’s using some brute force in combination with material that it has memorized, in a way that still seems like “cheating”?
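For readers who think in code, the brute-force process imagined above can be sketched in a few lines of Python. This is only a toy under stated assumptions: the four-line corpus, the bigram-count score standing in for the "statistical metric," and the names brute_force_sentence and search_space_size are all invented for illustration, and nothing here is meant to describe how ChatGPT actually works.

```python
from collections import Counter
from itertools import product

# Toy stand-in for "reams of recorded human speech" (invented for illustration).
CORPUS = [
    "the writer faced eviction",
    "the doctor prescribed mirtazapine",
    "the writer took mirtazapine",
    "the writer faced the doctor",
]

# Every distinct word in the toy corpus, in a fixed order.
VOCAB = sorted({word for line in CORPUS for word in line.split()})


def bigram_counts(corpus):
    """Count how often each adjacent word pair occurs in the corpus."""
    counts = Counter()
    for line in corpus:
        words = line.split()
        counts.update(zip(words, words[1:]))
    return counts


COUNTS = bigram_counts(CORPUS)


def brute_force_sentence(vocab, required, length, counts):
    """Enumerate every word sequence of the given length, drop those that
    lack a required word, and keep the one whose adjacent word pairs are
    most frequent in the corpus (a crude, hypothetical likelihood score)."""
    best, best_score = None, -1
    for candidate in product(vocab, repeat=length):
        if not all(word in candidate for word in required):
            continue
        score = sum(counts[pair] for pair in zip(candidate, candidate[1:]))
        if score > best_score:
            best, best_score = candidate, score
    return " ".join(best) if best is not None else None


def search_space_size(vocab_size, length):
    """Number of distinct word sequences of the given length."""
    return vocab_size ** length


if __name__ == "__main__":
    print(brute_force_sentence(VOCAB, ["eviction", "mirtazapine"], 5, COUNTS))
    print(search_space_size(len(VOCAB), 5))   # sequences the toy search visits
    print(search_space_size(10_000, 80))      # assumed 10,000-word vocabulary
```

The toy search is only feasible because the vocabulary has eight words and the sentence has five. Under an assumed working vocabulary of 10,000 words, the number of 80-word sequences is 10,000^80 = 10^320, while the usual estimate for the number of particles in the observable universe is around 10^80.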

Of course the fourth option is to accept that yes, ChatGPT is creative, and genuinely so.

And in turn this would force us to accept that just as nature can be “creative” and just as people can be “creative” there is now a third category of creative agency that we have to recognize. There is the disembodied, computational creativity of machines, which, as it advances, may come to rival the other two forms. There is a creativity that is detached from feeling and experience, but still able to appear as if it’s based in sentience. A robot that has never been depressed, and for whom that term has no meaning, may someday be able to write about depression in the same way a human might. A robot that does not feel a thing may still be able to persuade us – using the same tools of language that we use to persuade each other – that it feels. And when we look at prose or visual art we may no longer be able to tell whether it is a product of computational creativity, generated in an instant, or a product of human creativity, derived from experience, emerging from struggle, crafted through the application of human virtue. ■