Everyday Prompting

November 10th, 2022

by Simone Rebaudengo


I found myself today trying to google a book whose title I couldn't remember. Until today my strategy would have always been pretty much the same: trying a combination of mostly grammar-less jumbles of keywords, like "future interfaces mostly blue".

I always felt, like probably everyone, that I had a pretty efficient way of googling as I often find what I need. My mental model in searching is somewhat related to the way I think the algorithm works. It understands words, sometimes it does understand sentences, but it's more about whether I'm using the right word, rather than the right concept. It’s about the minimum viable amount of words that I need to jam in that box to get what I'm looking for.

But because I recently started playing, like many, with all the various prompt-based AI image generators, I found myself writing a whole sentence like "a book about the future telling about how the future is mostly blue".

While I got the result I wanted (by the way, a great book and podcast to check out) it was probably out of sheer luck that the word 'mostly' was part of the result of an article about the book.

A prompt is, in the context of the world of AI, the initial input for an algorithm to generate anything from text to images.

In the studio we first dealt with prompts in the context of text generation, using tools like GPT-2 or GPT-3. The prompt you write here acts as a starting point or as an input to generate text paragraphs.

However, today, with the explosion of text-to-image tools, we have been, like many, using words to instruct algorithms to generate images. We had quite a lot of fun generating memes about Berlusconi and also making assets for projects (without telling the client, of course, that they were generated) and, as in our last post, realising that children are really not that impressed by these AI-made images…

The prompt craft channel on Midjourney Discord

If you look at the way that prompts are used today to interact with tools like DALL-E or Midjourney, it's fascinating how people try to explain what they really want, based on what they mean, using complex grammatical structures or qualitative adjectives to try to generate an image that is futuristic or feels like it was made by Zaha Hadid.

Sometimes a prompt could be as lengthy as a full paragraph. Words, adjectives and grammatical structures are the new 'tools' to interface with a machine and get what you want, or actually what you didn't even know you wanted.

But prompting is not that easy either. It requires a level of understanding of what you could actually ask for, switching from a 1 to 1 pairing of requests and results towards using abstract concepts or even technical terms about photographic angles and render engine styles.

That's why there is already a prompt book for DALL-E, prompt guides for Midjourney and even a full interface tool that structures a sentence for you and gives you a bunch of dropdown options to choose styles or artist inspirations.

A prompt builder from Promptmania

There is already a lot of discussion about how prompting is the next thing in generative art and how generally ‘human speech’ is the natural evolution of giving instructions to machines for building anything from videos and software to 3D shapes. While today there is a literal boom of text-to-anything tools, it made us start to think about how this might be a new metaphor for interactions with Artificial Intelligences that might leak beyond the world of art, creatives and coding. How will this new mental model of talking to machines impact the everyday and more common ways we interact with algorithms?

So here is a thought.

I feel like the more we get accustomed to "prompting" a machine to give us results, the more we will have expectations for the text and search boxes in our everyday life to act in similar ways.

Of course we already have AI assistants and chat bots that engage us in a conversation to try and understand what we mean and what we asked them to do. But a conversation, in the conversational UI way of thinking, is something that requires time and that we might not want to engage in all the time. It’s ‘natural’ and ‘human-like’ but it’s not as natural as just trying to explain in my own words what I want.

Prompting is a new metaphor of interaction that sits in between the short and dry queries of today and the long and seemingly natural conversations with bots.

Prompting is a switch from typing key words to typing key concepts. From writing short and effective queries to writing lengthy and meaningful explanations.

Prompting might not feel efficient, but can be naive and surprising. It leaves space for interpretation and forgiveness. It's about having the machine try to 'get what I mean' and iterate through that. I like prompting, because it's somehow an act of creativity in itself.

It's kind of counterintuitive compared to what we do today with computers, as the more thoroughly and lengthily you write, the closer you might get to what you want. It's about the mutual understanding of pretty abstract concepts like 'a design book' or 'UIs that are mostly blue'.

It’s not a conversation, but could definitely feel like a more natural way of talking to a machine.

So what if we did have prompt-like interactions in a simple interface like a search?

What would autocomplete be like there? Or what would suggested results be? Will it feel more like navigating through interpretations rather than results? Will it be slower or faster? And what if this changes, for instance, the way we even communicate with each other via a chat?

Asking Midjourney about a search interface working with prompts

While I'm not sure about the mechanics and the practicality of it, it’s something that I already instinctively do. I do feel like writing very long sentences in my search box, and I’m not sure I can go back. And while a search is already quite smart in trying to infer and understand what I mean, it still feels like what I'm supposed to do there is structure my words based on keywords and then select results based on data types.

What if, rather than having to choose the type of result, be it a webpage or an image, you could distinguish between looking for a photo of a book about the future, the actual book and a visual interpretation of that book, just by changing the sentence you write? What if you could explain that rough image in your head about what you are looking for and use it to actually find what you are looking for?

A little test to go from a rough image to a prompt to an image search…not that good yet…

So when I find myself now constantly prompting around every input box on the internet, I kind of feel like that grandma who was politely asking Google to translate Roman numerals for her.

I feel stupid or naive, but I also feel like that’s the way I hope to talk to computers in the future. I might not always want to have a back and forth conversation about what I want in a chat with an assistant, and I also might get stuck not finding what I want because I might not remember the right keywords to perform the right query. Sometimes I might just want to write a long and convoluted explanation of what I want, and then try and figure it out together.