Note: This is an effort to make relatively objective, transparent Manifold markets that predict AI capabilities. I won't trade in these markets because there will inevitably be some subjectivity, and I'll try to be responsive with clarifications in the comments (which I will add to the market description). Feedback welcome.
Specifications:
By "decent," I mean that a serious, large company would plausibly pay market price to run it for at least 30 seconds in a normal TV context. Ideally, a company will actually run it, but it's okay if the commercial doesn't run but clearly could have (e.g., the company just puts it on YouTube but never pays for TV placement).
There is no restriction on the type of commercial. It can be funny, serious, animated, silent, abstract, etc. Ideally the YES case would be made by showing the AI-generated commercial alongside several comparable human commercials, but that's not required. The easiest examples I have in mind are perfume commercials.
The commercial should not succeed merely because an AI made it. It should be good enough to plausibly be aired if humans had made it. Ideally, the commercial would run on TV before it's widely known to be AI-generated, but that seems unlikely and certainly isn't required.
By "generate," I mean the entire commercial should be produced without human intervention (e.g., no human collating AI clips, adding a soundtrack, or adding logos), but humans can select the best AI-generated commercials. The AI system doesn't need to take only text instructions as input, but other specific content should be limited to what's necessary (e.g., a logo, high-resolution images from multiple angles of the product being advertised). The commercial needs to match a real company (e.g., a real logo and product).
If not enough details about how the video was made are publicly available, I'll take my best guess (i.e., whether there is over a 50% chance it met each criterion). I will probably consult other AI researchers or engineers if this is contentious.
The spirit of this market (which will be used to resolve ambiguities that aren't resolved by explicit criteria) is whether the AI is capable enough to do all the different tasks it takes to produce a commercial, such as not just generating individual video shots but sequencing them together in a compelling way.
@ProjectVictory In particular, the description doesn't limit the number of attempts. It seems plausible that one out of 1,000 attempts should be good enough (even if Sora doesn't do sound yet).
@JonasVollmer I'm not sure—perhaps the top 2,000 global companies by revenue, as long as they have over 10,000 employees. I don't know the distribution of companies running TV commercials, but the motivation here is to exclude exceptionally risk-taking companies that would run weird commercials that wouldn't reflect the ability of an AI to create commercials in general (e.g., abstract, disconnected sounds and visuals for 25 seconds followed by the corporate logo for 5 seconds). I'm okay with perfume commercials, despite their abstractness, because that actually is a big chunk of the market.
I think this is also a reason to allow cases where the commercial is made for a real company even if the company does not meet these criteria (e.g., X/Twitter has a few hundred full-time employees) as long as it seems that a company that does meet these criteria (e.g., Meta) would run the commercial if it featured their platform instead. The aim is to approximate quality based on which, if any, companies would run the commercial, rather than to forecast corporate choices (e.g., algorithm aversion, union opposition).