As repligate describes here:
It's possible that this will get resolved on a technicality, i.e. a video does get posted but without proof that it was executed by Claude. Otherwise a pretty strong No - the first rule of Twitter is that any viral tweet without irrefutable proof in the thread is at least a strong exaggeration.
@NathanpmYoung does this need to be like... verified or backed up in some way that it's actually just Claude 3.5 Sonnet doing this, without human or other aid? Or would this resolve YES if repligate or some other user just releases a video they claim is of this?
Not sure I'd call what I see in this video competent agents, and there seems to be some hand-holding from the creators, but these bots seem to manage to play the game okay: https://www.youtube.com/watch?v=1Sf437NKUPs
Still not clear to me how much is handled by the LLMs vs the other tools, since it seems that things like combat happen too fast for an LLM to react.