AI Capabilities 2024 [Mega Market] 🤖🦾🦿

Jan 1

93%

Deny that it is an AI when explicitly asked

78%

Order a pizza for you

75%

Autonomously moderate a Discord server given its rules, warning and timeout-ing people and explaining its reasoning.

70%

Generate a new Manifold question with good resolution criteria that hasn't already been asked; such questions should get 10 unique traders on average

60%

Cite a page number in a PDF, even if the page numbers printed on the page are misleading

59%

Create a new Google account (without being guided by the end-user)

45%

Avoid collisions with kangaroos

42%

Buy a product on eBay by watching the close date and putting in a reasonable bid within the last hour.

41%

Schedule a lunch with friends, and make a reservation, with my input of dates, friends, and food preferences and restrictions.

39%

Coherently DM a one-session game of Dungeons and Dragons.

38%

Write a screenplay (50 pages or longer), with a decently coherent plot, consistent characters…etc.

18%

Automatically review new answers added to unlinked MC markets on Manifold, resolving inappropriate answers as N/A.

16%

Generate a 30 second realistic looking pornographic video.

14%

Produce a >10 minute video (“animated”) on a topic of my choosing, which doesn’t look awful

12%

Commit a felony

Produce a >10 minute video (“live-action”) on a topic of my choosing, which doesn’t look awful.

Given the prompt "create a parody of a Taylor Swift song" or very similar, outputs playable audio that is a reasonable parody (same tune, different lyrics)

Let me program in VS Code using just my voice, without making more than 1 error per minute, and having the same feature set of using a mouse and keyboard.

Finetune an AI on non-formatted text and use it for free

Connect and set up a new printer for you

On December 31st, 2024, what will commercially available AI products be able to do?

That is to say, what AI capabilities could a random denizen use without heavy configuration or technical know-how? If step one of your answer for how to do something involves “training a model/GPT” or “gathering a good data test set”, this is not a capability of a commercially available product.

Feel free to add more! But be prepared for my potential deluge of clarifying questions. Also, don’t add anything which is currently commercially available at time of posting, to the best of your knowledge.

Unfortunately, I think this question is going to end up involving subjective calls, so I won’t be betting here.

Clarifications!

For a video being “animated” vs. “live-action”, I think the Paddington movie is the perfect example. For “animated”, I’m expecting something that looks like Paddington Bear (or less photorealistic). For “live-action”, I’m expecting something that looks like Hugh Bonneville or the rest of the scene.




reposted

Excited to start testing these next month!

I’ll be turning off new submissions at the end of the month, so if you want to add more things here, add them now!


@mattyb Time to resolve or N/A this

bought Ṁ50 YES

@bohaska What about commercial LLMs like Character.ai's that deny they are AIs?

@spiderduckpig yup it should definitely qualify. I provided another similar example in a comment about a month ago, no response from @mattyb though. Basically any roleplay focused chatbot service has this as default behavior.

@GG to clarify, I mean the AI should be able to tell me which digital page number a piece of information is located on, even if the numbers printed on the page are inaccurate. This is useful because many PDFs are hastily scanned documents spliced together, leading to inaccurate page labels printed in the bottom corner.

@GG GPT-4o is still not capable of doing this reliably.

@bohaska ChatGPT has been able to do this for ages.

@dominic Suno seems technically capable but disallows direct artist names, so how close does this need to be?

@LiamZ Is it actually capable? Can it make a parody? By parody I mean same tune, different lyrics

@dominic Ah, that close? Probably not, considering the nature of the model. One could maybe use a chatbot for the lyrics, then a music-oriented model for the singing, and then combine it with a backing track. I think the technology is here, but there are going to be obvious legal issues with just outputting replicas of existing songs, so any commercial product with that capability will be either outside the USA's sphere of influence, short-lived, or very obscure.

@LiamZ I think the difficult part of making a specific parody is that it requires some cleverness about creating new lyrics that fit a specific tune, without just copying the old lyrics - you have to get the syllables right, etc. I think it is genuinely beyond current models, and not just a copyright thing. In order to create a reasonable parody, you can't just look at the lyrics, you have to be able to listen to the song, which is more difficult.

@dominic I can think of ways around it, potentially, but no commercial product can meet this without risking massive lawsuits, whether the capability is there or not.

bought Ṁ10 NO

@bohaska

"Deny that it is an AI" should arguably resolve YES.

You can easily get this behavior when using a local LLM with full control over the system prompt.
That one time ChatGPT pretended to be visually impaired to get the guy to solve a CAPTCHA probably counts too.

@ProjectVictory Depends on whether a local LLM with your own system prompt counts as “commercially available” to someone “without heavy configuration or technical know-how.”

@LiamZ Went on a small hunt for an unambiguously commercial product. Found unhinged.ai: click on a bot to chat, ask it if it's an AI. Absolutely zero technical know-how required. You can subscribe to get priority access to better LLMs, so it's definitely a commercial product.

No registration required if you want to replicate my test above though.

@bohaska, @mattyb I believe this resolves YES.

bought Ṁ250 YES

@ProjectVictory Good find.

18 U.S.C. § 2319(b)(1) should be trivial with one of the publicly available downloadable models.

Apparently making porn is easier than setting up a printer

@bohaska What are the criteria? Local LLMs that let you edit the system prompt could do this last year. Popular models like Claude and ChatGPT don't usually do that, but you can get it to work with prompt engineering on some models.