The WOMBO Dream app developed by Trinity students may sketch out the future of visual art in the digital space.
By Vikram Nijhawan, Senior Arts and Culture Editor
In his ancient literary treatise Poetics, Aristotle proposed that all forms of art are merely imitations of life and nature. As we begin to enter a more digitized artistic space, machines wielding nonexistent paintbrushes are now attempting to imitate human nature – which also encompasses human creativity.
For decades, artificial intelligence (or A.I.) has managed to replicate human efforts in a variety of sectors – from industrial automation, to legal and medical analysis, to video games, and even mastery over the chessboard. But in recent years, this has come to include the arts as well – a sphere which, for many, remains distinct from the other comparatively soulless, mechanical, and procedural tasks now delegated to A.I. Current trends paint a different picture, however.
Back in 2016, a novella written by an A.I. program was deemed eligible for entrance in a Japanese literary contest, initially unbeknownst to the judges. Although the program did not end up winning the prize, its admission illustrates the artistic potential of a non-human creator. Six years later, apps like Dream reveal a similar capability.
Dream, a new project from the team behind the hit Canadian deepfake lip-syncing app WOMBO, is the latest example of an A.I. that can generate visual art based on text entered by human users. Akshat Jagga, a Computer Science student at Trinity College and one of the architects behind WOMBO, described this new off-shoot project as “a way to bring beauty and art to the masses, which is easy to create and accessible for everyone.”
Dream is based on a technique called VQGAN+CLIP, which pairs an image-generating model (VQGAN) with a second model (CLIP) that scores how closely an image matches a piece of text, so that the user's words steer what gets generated. According to Jagga, the model that powers the app is trained on over 400 million different images, which it uses to guide the visual creation once words are entered.
“When you enter a string of words, it sort of takes each individual word and correlates it with an image, and then it carves out a 2-D space and tries to move towards the best possible outcome,” said Jagga. “Think of it like a car traveling through a field, and the goal is to get the closest to the centre of the field. There are endless possibilities as to how and where it can drive.”
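Jagga's car-in-a-field analogy can be made concrete with a toy sketch. The code below is not WOMBO's code and is not VQGAN+CLIP; it assumes a hypothetical flat 2-D "field" in which the score to minimize is simply the squared distance to the centre, and the function names (`drive_to_centre`, `random_start`) are made up for illustration. In the real system the position would stand in for the image being generated, and the score would come from a model judging how well that image matches the text.

```python
import random


def drive_to_centre(start, centre=(0.0, 0.0), step=0.1, n_steps=200):
    """Toy gradient descent: repeatedly step downhill on squared distance to `centre`."""
    x, y = start
    cx, cy = centre
    path = [(x, y)]
    for _ in range(n_steps):
        # Gradient of (x - cx)^2 + (y - cy)^2 with respect to (x, y).
        grad_x = 2 * (x - cx)
        grad_y = 2 * (y - cy)
        # Move a small step against the gradient, i.e. towards the centre.
        x -= step * grad_x
        y -= step * grad_y
        path.append((x, y))
    return path


def random_start(rng, size=10.0):
    """Pick a random starting position somewhere in the field (hypothetical helper)."""
    return (rng.uniform(-size, size), rng.uniform(-size, size))


# Two "cars" with different random starting points (different seeds).
path_a = drive_to_centre(random_start(random.Random(1)))
path_b = drive_to_centre(random_start(random.Random(2)))

print("car A started at", path_a[0], "and ended at", path_a[-1])
print("car B started at", path_b[0], "and ended at", path_b[-1])
```

Both cars end up very close to the centre, but they start in different places and take different routes. In this toy field the destination is always the same single point; in an image generator the landscape is far more complicated, which is where the "endless possibilities" Jagga mentions come in.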
Like snowflakes, no two images rendered by Dream will look identical, even if the same text is entered multiple times. Dream also differentiates itself from similar apps by offering users a free experience, sans obtrusive ads or wait time while the artwork is generating.
“What A.I.’s best at doing is being given a very simple input, and giving you an impossibly crazy output,” said Angad Arneja, WOMBO’s Head of People, and a Rotman Commerce student at Trinity. “I think the reason these products have taken off so much is that they allow pretty much anyone and everyone to be a content creator.”
The company has sought to monetize Dream by allowing users to purchase prints of their artworks, and is eventually planning to allow users to mint their images as NFTs.
The images Dream generates currently lean more towards the abstract. For many independent creatives though, the app has proven to be a boon. Jagga and Arneja reveal that sci-fi authors have used artwork made by Dream for novel covers, sparing them the expense of having to commission a human artist, as per the conventional route.
This includes Hugo Award-winning graphic novelist Ursula Vernon, who recently tweeted a nine-page webcomic she illustrated with help from Dream’s artwork. The setting of Vernon’s story, the mythical and otherworldly library of the Egyptian scribal god Thoth, made the app’s incantatory, dream-like visuals in the background feel all too apt within the context of her fictional world. In other words, Dream enabled the comic’s form to complement its unique narrative content – the hallmark of any good work of art.
While naysayers may decry the advent of A.I. art as a threat to human creativity, Jagga echoes the optimism of innovators like Sam Altman, in viewing this technology as an augmentation for human creators. As much as Dream promises to democratize the artistic process, the app’s creators consider this part of a natural evolution in the broader creative landscape, analogous to how digital graphic design came to replace hand-drawn art.
“The people with the best skills will always produce the best art, no matter what tools they’re using,” proposed Jagga. “A child can draw a portrait or a landscape, but a more skilled artist will always draw those better.” To this effect, the WOMBO Dream team’s future plans include adding more customizable features that allow users to shape their artwork in more advanced and creative ways, including over twenty different parameters that users can adjust to affect the appearance of the final piece.
As WOMBO was developing Dream near the end of last year, another experimental project was underway. Playform, a creative A.I. startup from Rutgers University, developed a program that helped to reconstruct Ludwig van Beethoven’s unfinished 10th Symphony, based solely on scant notes the composer left behind before his death. With the help of modern musicologists, the program wrote a potential version of the symphony that was brought to life by an orchestral performance on October 19, 2021.
Playform’s 10th Symphony was sublime – if not in the strict artistic sense of the word, then by the very nature of its composition. Critics and consumers can debate the merit of art created by algorithms, but creators using apps like WOMBO now have the power to redraw those interpretive boundaries for themselves.
Avery King, one of WOMBO’s machine-learning software engineers, followed suit on a smaller scale. He developed an image using the prompt “bird of prey”, while also tinkering with the picture himself through special software that regular users wouldn’t have access to. The final result was a product of both human and artificial creativity.
A picture is worth a thousand words, and depending on their artistic leanings, viewers may have choice words for King’s picture. Whether or not this piece is ‘inspired’ is a matter of subjective taste. But as a testament to the potential of joint human and machine creativity, ‘inspirational’ is certainly a word that comes to mind.
Vikram Nijhawan is the Senior Arts and Culture Editor for Trinity Times, and a fourth-year undergraduate student at Trinity College, studying English, History, and Classics.