Artificial Intelligence: Fast Food for the Soul


Dr. Jan Pinkava, Director of Animationsinstitut, has spent much of his career in the entertainment industry. But in recent years, a new technology has been developed that may change creative work as we know it. In this editorial, Pinkava shares his thoughts on the use of Artificial Intelligence in the world of art.

According to the old saying, “a picture is worth a thousand words”, which is to say that a picture, and especially a moving picture, can not only express a complex idea, but can also tell a story. That is what we do at Filmakademie Baden-Wuerttemberg.


The internet is full of words and pictures. Trillions of them. Such numbers are beyond true human comprehension. This accumulation of human writing and picture-making has been likened to the carbon deposits which, laid down over aeons in the form of coal and oil, have fuelled the industrial revolution and our unsustainable petro-economy. So now, in the information age, when all things are transformed into data, words and pictures are being mined to feed the data-hungry machines with which our modern lives are intertwined. We have had to learn strange new words. The change came slowly, then quickly.


It was the pairing of words and pictures, by humans, through crowdsourcing services, that created the huge databases necessary for training so-called Deep Learning neural networks to reliably recognize cats and dogs and shoes and handbags. That was already impressive.


The breakthrough came with the invention of so-called Transformers that learn not just the patterns in the data, but also wider context. This began the boom in speech recognition, face recognition and machine translation, which has transformed our lives.


From Missing Words to Magic: The Rise of Generative AI

The Large Language Models (LLMs) built on Generative Pre-trained Transformers (GPTs) were trained through “self-supervised learning”, in which the machine teaches itself to guess missing words in astronomical amounts of text data, without any human labelling.
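
To make “self-supervised” concrete: the sketch below, a toy example in Python rather than the training pipeline of any real model, shows how the training examples are derived from the text itself, with no human labelling involved.

```python
# Toy illustration of self-supervised "guess the next word" training data.
# A simplified sketch, not the pipeline of any production LLM.

text = "a picture is worth a thousand words"
tokens = text.split()  # real models use subword tokens, not whole words

# The training examples come from the text itself: every prefix of the
# sentence is an input, and the word that follows it is the target.
examples = [
    (tokens[:i], tokens[i])  # (context so far, word to guess)
    for i in range(1, len(tokens))
]

for context, target in examples:
    print(f"given {context!r} -> predict {target!r}")

# A language model is trained to make these predictions well across
# astronomical amounts of text; no human has to label anything.
```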


Transformer-based LLMs turned out to be capable of unexpected “emergent behaviour” beyond their explicit training. The bigger the models, the more complex the behaviour, and the race was on to mine ever more data. According to one estimate, the high-quality human-created textual data on the internet will be used up by about 2028.


Now AI agents, like ChatGPT and Gemini, can produce any quantity of text, about any subject, in any format, from computer code, to cooking recipes, to screenplays. Very appealing if you need help with that programming assignment, or if you need to know what you can cook with what is in the fridge, or if you’re staring down that deadline for a first draft. They can also produce complete nonsense.


Researchers are still trying to understand how the Large Language Models they created actually work, so they can begin to fix their “hallucinations”.


We have seen how Generative AI systems – with names like Stable Diffusion, DALL-E, Midjourney, Imagen, Runway, and Sora – can turn a few words, in the form of prompts, into impressive pictures, even moving pictures, as if by magic. And magic it is, because we don’t yet understand it, nor do we yet know how to properly control it.
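
For readers curious what “a few words in, a picture out” looks like in practice, here is a minimal sketch using the open-source diffusers library; the checkpoint name, prompt, and settings are illustrative assumptions, not a description of the commercial systems named above.

```python
# Minimal text-to-image sketch with the open-source diffusers library.
# Checkpoint name, prompt, and parameters are illustrative assumptions.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1",  # an example public checkpoint
    torch_dtype=torch.float16,
)
pipe = pipe.to("cuda")  # assumes a CUDA-capable GPU is available

prompt = "a watercolour fox in a misty forest, soft morning light"
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("fox.png")
```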


Sometimes the pictures are uncanny or weird, like the Surrealism of DALL-E’s namesake Salvador Dalí, with refined detail, photographic textures and lighting, suggesting the alternative reality of a dream or nightmare. But weirdness is (mostly) optional, and we can also conjure images of charming subjects, replicating almost any nuance of photography or almost any style of graphic art or painting.


With super-human speed, Generative Artificial Intelligence (GenAI) makes wonderful collages from the works of man and nature. The sense of power is intoxicating. What need have we of artists? Imagery is “democratized.” Surely, we can now create, with the magic of AI, any image that we can imagine.


The diploma project THE AMAZING KITSUVERSE is all about AI: not only did the student team use AI as an assisting tool to create it, but the story also revolves around the beauty and danger of Artificial Intelligence.


So, what can we imagine? Quick, imagine something! Do you see it with your mind’s eye? No? Never mind. The magical machine can imagine the image for us. Just describe what you would like to imagine, in a few words. Make a wish. Let the magic begin!


So it goes. And like the fairytale wish fulfilled, the magic soon turns to dust in our hands.


Not because the images are not appealing. Not because the technology is not astonishing. But because a picture is worth a thousand words, or more, and although you can make a picture with a word or two, still a thousand words, or more, are needed to make the picture that you wanted to make, or something somewhat like it, if you can still remember what that was. Or, you can accept what the machine gives you, and go home early.


Prompting is an Art — but is it enough?

If you only had three wishes, you might end up with a sausage on your nose. But this is not a fairytale. You can keep wishing, discard failures, and refine your prompts to make the image more and more specific, until the job is done well enough, at least to your client’s satisfaction. With persistence and time, your prompting skills improve, and then your deadline arrives. Exit the artist. Enter the prompt engineer.


For some work, like music videos and logo graphics, that may be enough. But professionals, who need to respond to feedback and need reliable tools to achieve specific results, and artists, who know what they want and will not compromise, are still looking for ways to make the machine’s imagination more controllable, more predictable, more editable.


And the Hollywood studios, with their survival at stake, are waiting for the copyright lawsuits to work their sluggish way through the American courts. It’s no good spending less money on artists if you then have to spend many times more on lawyers.


Writers, artists and executives are learning that the promise of GenAI is still in the future.


Tools Don’t Think. That’s Still Our Job.

At the Filmakademie, we learn by doing. Students are encouraged to experiment with AI tools to gain the understanding and competence necessary to make informed, ethical choices in their own work. Our teaching emphasises the asking of questions: Does an AI-generated image or script help you to express your ideas? Can this tool help you to clarify your intention or to better communicate it? What are you trying to say? Why is this subject important to you and to your audience?


When students know why they are creating sound or picture – for what purpose, as part of which artistic intention – then they are better prepared to make use of Generative AI tools, or not, to answer the questions that matter to them and the audience.


For SENSUAL, a single student painted over 800 watercolours by hand. To make a project of this scope possible at all, the team used AI to create the in-between frames.


The “why” questions are not answered by Artificial Intelligence. Not yet.


Much virtual ink has been spilled to remind us that machines have no morals, no taste, no understanding of truth or falsehood; all of which is true, and may remain true until we understand how to perform black-box-brain surgery, or robots learn to live among us and share our human condition.


This reminds us not to expect machine-generated text, or sound, or picture to mean anything more than what we tell it to mean, or what we imagine it to mean.

If all we need is spectacle and entertainment (and who doesn’t sometimes?) then the magic machine can make us fat with it. If we want our art to mean something human, then it is still up to us, the “humans in the loop.”


Will we be trapped in an infinite hall of mirrors, whose fragmented reflections offer no new insights? Or will we draw from life: our inner life and the lives of others? Will writing become multiple-choice recombination of audience-tested IP, or an expression of lived human experience? Will we, or our audience, be able to tell the difference?


At the very least, we should not call the AI’s voice our own, simply because we asked it to say something, or add our signature to a GenAI image, just because we wrote the prompt.


However we employ these powerful tools to amplify our efforts, we must do the thinking and feeling for ourselves, not outsource thinking and feeling to the “efficiency” of AI.

