When will a robot reliably pass "The Coffee Test"?


2030

Before 2025: Resolved
Before 2026: 18%
Before 2027: 44%
Before 2028: 63%
Before 2029: 79%
Before 2030: 87%

"The Coffee Test" is an alternative to the Turing test proposed by Steve Wozniak. It goes as follows:

A machine is required to enter an average American home and figure out how to make coffee: find the coffee machine, find the coffee, add water, find a mug, and brew the coffee by pushing the proper buttons. This has not yet been completed.

The test has to be repeated several times under independent control. (That is, a self-published video by Boston Dynamics is not enough.) The success rate has to be at least 50% across at least 3 attempts, each in a different house.

The market will resolve positively one month after a successful demonstration, to allow some time to uncover possible cheating.
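As a rough illustration of the resolution criteria above, here is a minimal Python sketch. Everything in it is an assumption made for illustration rather than part of the market rules: the names `Attempt`, `meets_criteria`, and `earliest_resolution` are hypothetical, "a month" is taken to mean 30 days, and "at least 3 attempts in different houses" is read as at least 3 distinct houses among the independently controlled attempts.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Attempt:
    house_id: str      # hypothetical identifier for the home used
    succeeded: bool    # did the robot brew the coffee?
    independent: bool  # was the attempt run under independent control?


def meets_criteria(attempts, min_houses=3, min_rate=0.5):
    """Check the success-rate rule over independently controlled attempts."""
    # Self-published demos (independent=False) do not count at all.
    valid = [a for a in attempts if a.independent]
    houses = {a.house_id for a in valid}
    if len(houses) < min_houses:
        return False
    successes = sum(a.succeeded for a in valid)
    return successes / len(valid) >= min_rate


def earliest_resolution(demo_date, waiting_days=30):
    """Assumed reading of 'a month': 30 days after the demonstration."""
    return demo_date + timedelta(days=waiting_days)


attempts = [
    Attempt("house-a", True, True),
    Attempt("house-b", False, True),
    Attempt("house-c", True, True),
]
print(meets_criteria(attempts))               # 2 of 3 succeeded in 3 houses
print(earliest_resolution(date(2027, 3, 1)))  # example date, not a forecast
```

Under this reading, two successes out of three independently controlled attempts in three different houses would be enough, while any number of self-published videos would not be.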

Related question with a different time scale:

/MatthewBarnett/will-a-robot-be-created-that-is-cap

I do not bet on my own questions.





Robot enters home, uses its built-in coffee machine to make a cup of coffee. Easy.

bought Ṁ70 NO

This feels to me similar to the challenge of self-driving cars: Theoretically possible long before practically possible.

Interestingly, it's very likely that a capable robot will exist for a while before this market closes. This test might not be well known enough for someone to bother to run it.

@ProjectVictory The test is relatively well-known, so if the robot exists as more than a single prototype, I expect somebody to try it out. They might try only one house, which will not meet the criteria, but then we can wait for somebody else to repeat the test.

@ProjectVictory If you think that such a robot already exists and fully meets the market criteria, you should bet on "Before 2025" and take a profit when it is confirmed.

@bessarabov To be fair, it has to not just exist, but also be tested in a particular way. The mere existence of a robot capable of doing it is not enough for resolution.

Test from 14 years ago:
https://www.youtube.com/watch?v=MowergwQR5Y

Some of the more modern coffee machines are quite simple to operate.

For example, does the above require the ability to manage an inexpensive Italian press?

The intuition behind the idea was less 'do this specific task' and more 'generically be wise about the world'.

@gpt_news_headlines I would say an "average American home" has either a drip machine or a Keurig.

@gpt_news_headlines Multimodal models are already generically wise about the world in this way. The obstacle is dexterity, and that's all.

@HarrisonNathan It's an interesting point. If it's an issue of dexterity, then perhaps a test that targets dexterity directly would be more relevant.

bought Ṁ3 NO

@HarrisonNathan can you name one such model? Not trying to argue, genuinely interested.

@ProjectVictory There are no models which are wise about the world to a particularly deep resolution. It's more like, ok, I know how to make coffee, I know about the coffee pot, the filter, where it goes, I need to put coffee grounds in it, put water in, and how much, etc. Fairly high level wisdom.

I mean, try out Claude and GPT-4. They can practically do all of this.

What there needs to be is a dextrous robot which has deep object recognition and manipulation capability. That part is missing. There are some connective models between the LLM layer and the robot layer, but it's unclear what the best approach is.