Current state-of-the-art AI image generators like Midjourney (i.e. diffusion models, previously GANs and other types of models) create art in a distinctly nonhuman way, essentially by gradually manipulating random noise until it looks more and more like the target image for a given prompt. I'm curious whether it will soon become technically feasible (if not necessarily practical) for an AI to create a similarly wide variety of images from prompts through methods that look a lot closer to what a human artist might do, albeit presumably a lot faster.
This question resolves YES if, by resolution time, I can get access to an AI that can:
Control a virtual mouse and/or keyboard on either my machine or some other (virtual?) machine I can access.
Use said control to open image editing software like Photoshop, GIMP, or Paint.NET. Any single one would do.
Draw at least a basic picture of an arbitrary prompt in said software in a way that's recognizable.
To be clear, it does not have to be anywhere near state-of-the-art. It just has to be capable of drawing any reasonable prompt in a way that someone who hasn't seen the prompt could more or less recognize. It can even be a simple black and white sketch, so long as it's a decent one.
I'd try around 5-10 prompts of no more than a sentence each, something like "a bustling city street under the shine of a full moon". If the AI gets at least half of the prompts correct and recognizable (according to my subjective opinion), that counts for the purposes of this market.
Additional details:
I'm willing to pay a reasonable fee to access the AI that would resolve this market, if needed.
The AI should spend no more than 30 real-time minutes on creating each image. If it goes over, I'll try to cut it off early.
Due to subjective judgment being required, I will not trade on this market.
I can see this being technically doable: RL with the action space being paint-software actions, and the reward coming from an image description model (like the ones that generate prompts from Midjourney images) plus some fuzzy distance between the original prompt and the critic's prompt. It would be hideously expensive though, especially if using real software instead of an idealized set of simple paint actions, so I'm buying No.
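To make the idea above a bit more concrete, here is a rough sketch of what the idealized version might look like. Everything in it is an assumption for illustration: the `Stroke` action, the tiny grid canvas, the `captioner` callable standing in for an image description model, and word-set (Jaccard) overlap as the "fuzzy distance". It is not how any real system does it, and a real setup would need a learned policy on top.

```python
import re
from dataclasses import dataclass
from typing import Callable, List

Canvas = List[List[int]]  # 0 = blank pixel, 1 = painted pixel


@dataclass(frozen=True)
class Stroke:
    """One idealized paint action (hypothetical): a straight line."""
    x0: int
    y0: int
    x1: int
    y1: int


def apply_stroke(canvas: Canvas, s: Stroke) -> None:
    """Paint the stroke onto the canvas by sampling points along the line."""
    steps = max(abs(s.x1 - s.x0), abs(s.y1 - s.y0), 1)
    for i in range(steps + 1):
        x = round(s.x0 + (s.x1 - s.x0) * i / steps)
        y = round(s.y0 + (s.y1 - s.y0) * i / steps)
        if 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
            canvas[y][x] = 1


def tokens(text: str) -> set:
    """Lowercased set of words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def fuzzy_reward(prompt: str, critic_caption: str) -> float:
    """Jaccard overlap between the prompt and the critic's caption, in [0, 1]."""
    a, b = tokens(prompt), tokens(critic_caption)
    union = a | b
    if not union:
        return 0.0  # nothing to compare, so no reward signal
    return len(a & b) / len(union)


def episode_reward(prompt: str, strokes: List[Stroke],
                   captioner: Callable[[Canvas], str], size: int = 32) -> float:
    """Run one drawing episode and score it with the (assumed) captioner."""
    canvas = [[0] * size for _ in range(size)]
    for s in strokes:
        apply_stroke(canvas, s)
    return fuzzy_reward(prompt, captioner(canvas))
```

The expensive part is everything this sketch leaves out: a real captioning model would have to run on every episode, and with real software the actions would be mouse and keyboard events rather than clean strokes.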
@CamillePerrin I was imagining a much more general AI system (e.g. something like Gemini plus agentic capabilities) being able to do this just as a byproduct of being able to do almost anything. But a narrow system would also work for the purposes of this market.