Which of the names below will I receive significant evidence are NOT bound by non-disparagement agreements by EOY 2024
9
Ṁ3654
Jan 1
Bilva Chandra (Senior AI Policy Advisor, NIST): 60%
Charlotte Stix (Head of Governance, Apollo Research): 60%
Jack Clark (Co-Founder [focused on policy and evals], Anthropic): 59%
Jade Leung (CTO, AI Safety Institute): 60%
Paul Christiano (Head of Safety, AI Safety Institute): 70%
Remco Zwetsloot (Executive Director, Horizon Institute for Public Service): 60%
Geoffrey Irving (Research Director, AI Safety Institute): Resolved YES
Beth Barnes (Head of Research, METR): Resolved YES
Chris Painter (Head of Policy, METR): Resolved YES

significant = first-, second-, or third-hand reports

"bound" simply means they haven't been released from their OAI non-disparagement agreement, after having made such an agreement. Not receiving the relevant email after signing such an agreement, or having a mutual agreement, would count as them being bound by an agreement.

Sorry for the negation in the question, but it's hard to receive evidence if they are bound by an NDA, so...

bought Ṁ100 Chris Painter (Head ... YES

Beth Barnes says she is "very confident Chris Painter has never been under any non-disparagement obligation to OpenAI" and chris_painter (who I assume, but can't absolutely verify, is the Chris Painter meant) says "I have never owned equity in OpenAI, and have never to my knowledge been in any nondisparagement agreement with OpenAI"

sounds like Chris should resolve YES?

Should this count as Beth being bound or not?

https://www.lesswrong.com/posts/yRWv5kkDD4YhzwRLq/non-disparagement-canaries-for-openai?commentId=MrJF3tWiKYMtJepgX

I signed the secret general release containing the non-disparagement clause when I left OpenAI. From more recent legal advice I understand that the whole agreement is unlikely to be enforceable, especially a strict interpretation of the non-disparagement clause like in this post. IIRC at the time I assumed that such an interpretation (e.g. where OpenAI could sue me for damages for saying some true/reasonable thing) was so absurd that it couldn't possibly be what it meant. [1]
I sold all my OpenAI equity last year, to minimize real or perceived CoI with METR's work. I'm pretty sure it never occurred to me that OAI could claw back my equity or prevent me from selling it. [2]

OpenAI recently informally notified me by email that they would release me from the non-disparagement and non-solicitation provisions in the general release (but not, as in some other cases, the entire agreement.) They also said OAI "does not intend to enforce" these provisions in other documents I have signed. It is unclear what the legal status of this email is given that the original agreement states it can only be modified in writing signed by both parties.

As far as I can recall, concern about financial penalties for violating non-disparagement provisions was never a consideration that affected my decisions. I think having signed the agreement probably had some effect, but more like via "I want to have a reputation for abiding by things I signed so that e.g. labs can trust me with confidential information". And I still assumed that it didn't cover reasonable/factual criticism.

That being said, I do think many researchers and lab employees, myself included, have felt restricted from honestly sharing their criticisms of labs beyond small numbers of trusted people.  In my experience, I think the biggest forces pushing against more safety-related criticism of labs are:

(1) confidentiality agreements (any criticism based on something you observed internally would be prohibited by non-disclosure agreements - so the disparagement clause is only relevant in cases where you're criticizing based on publicly available information) 
(2) labs' informal/soft/not legally-derived powers (ranging from "being a bit less excited to collaborate on research" or "stricter about enforcing confidentiality policies with you" to "firing or otherwise making life harder for your colleagues or collaborators" or "lying to other employees about your bad conduct" etc)
(3) general desire to be researchers / neutral experts rather than an advocacy group.

To state what is probably obvious: I don't think labs should have non-disparagement provisions. I think they should have very clear protections for employees who want to report safety concerns, including if this requires disclosing confidential information. I think something like the asks here are a reasonable start, and I also like Paul's idea (which I can't now find the link for) of having labs make specific "underlined statements" to which employees can anonymously add caveats or contradictions that will be publicly displayed alongside the statements. I think this would be especially appropriate for commitments about red lines for halting development (e.g. Responsible Scaling Policies) - a statement that a lab will "pause development at capability level x until they have implemented mitigation y" is an excellent candidate for an underlined statement.

Unless argued with, I'll say not for now.

@GarrettBaker A reason to not think so: OAI did not release her from all non-disparagement clauses

@GarrettBaker Nevertheless, she seems unworried about legal fallout from disparaging OAI, so I claim she is not bound.

@GarrettBaker
"OAI did not release her from all non-disparagement clauses"
What do you mean?

@BethBarnes Oops, my bad. I misread this part as talking about the entirety of the nondisparagement agreement:

OpenAI recently informally notified me by email that they would release me from the non-disparagement and non-solicitation provisions in the general release (but not, as in some other cases, the entire agreement.)

Also, thank you for the LessWrong comment you made.

sold Ṁ238 Beth Barnes (Head of... NO

This confused me, because the question inherently supposes that they have non-disparagement agreements, but that is something I'm also uncertain about.

@BenPace Oh I didn't mean it like that. Let me edit the description. I would consider them never having signed such agreements as also them not being bound by a non-disparagement agreement.

@GarrettBaker i.e. no agreement = no binding = YES, if I receive significant evidence of this fact

@GarrettBaker

"bound" simply means ... never made such an agreement in the first place

I don't think the edit quite did what you meant.

@EvanDaniel lol oops