List of discussions between Paul Christiano and Wei Dai

This is a list of discussions between Paul Christiano and Wei Dai, mostly on AI alignment, philosophy, and the far future.

| Start date | End date | Venue | Thread title | Topics covered | Summary |
|------------|----------|-------|--------------|----------------|---------|
| 2011-03-01 | 2011-03-02 | LessWrong | “Some Considerations Against Short-Term and/or Explicit Focus on Existential Risk Reduction” | | |
| 2011-04-02 | | LessWrong | “Anthropics in a Tegmark Multiverse” | Anthropics, UDT | |
| 2011-04-02 | 2011-04-03 | LessWrong | “Where does uncertainty come from?” | | |
| 2011-04-06 | 2011-04-07 | LessWrong | “What Should I Do?” | Boredom, research vs earning to give | |
| 2011-12-29 | | LessWrong | “Negentropy Overrated?” | | |
| 2012-04-21 | 2012-04-29 | Ordinary Ideas | “A formalization of indirect normativity” | | |
| 2012-04-26 | | LessWrong | “Formalizing Value Extrapolation” | | |
| 2013-01-27 | 2013-02-09 | Rational Altruist | “Taxonomy of change” | | |
| 2013-02-27 | 2013-02-28 | LessWrong | “Why might the future be good?” | | |
| 2013-05-08 | 2016-03-19 | LessWrong | “Pascal’s Muggle: Infinitesimal Priors and Strong Evidence” | AI alignment, act-based agents, expected damage of philosophical errors | Wei argues for progress in philosophy and for AI designs that can correct their philosophical errors, and thinks philosophical errors are already causing great damage in expectation. Paul is skeptical that philosophical errors are causing great damage. |
| 2013-06-06 | | LessWrong | “Tiling Agents for Self-Modifying AI (OPFAI #2)” | | |
| 2013-06-14 | | LessWrong | “After critical event W happens, they still won’t believe you” | | |
| 2013-07-18 | | LessWrong | “Three Approaches to ‘Friendliness’” | | |
| 2013-08-28 | 2013-08-31 | LessWrong | “Outside View(s) and MIRI’s FAI Endgame” | Difficulty of an intelligence explosion, philosophical problems, value drift | |
| 2014-03-01 | | LessWrong | “Self-Congratulatory Rationalism” | Rationality, agreement between rational agents | |
| 2014-07-21 | 2014-07-23 | Ordinary Ideas | “Approval-seeking” | | |
| 2014-12-12 | | LessWrong | “Approval-directed agents” | | |
| 2015-04-10 | | LessWrong | “Three Approaches to ‘Friendliness’” | | |
| 2015-04-16 | | Medium | “Handling errors with arguments” | | |
| 2016-02-23 | | Medium | “ALBA: An explicit proposal for aligned AI” | | |
| 2016-03-10 | 2016-03-19 | LessWrong | “AlphaGo versus Lee Sedol” | Paul Christiano’s approach to AI alignment, feasibility of getting an AI to defer to humans for philosophical judgment | |
| 2016-10-14 | | Facebook | “I used to think of AI control as mostly unrelated to AI security. Now I’m not even certain that they should be separate research areas.” | | |
| 2016-10-22 | | Intelligent Agent Foundations Forum | “Control and security” | | |
| 2016-10-26 | | Medium | “Security amplification” | | |
| 2016-11-14 | | Medium | “Handling destructive technology” | | |
| 2016-11-22 | 2016-11-24 | LessWrong | “Less costly signaling” | Signaling, altruism, selfishness | |
| 2016-12-01 | 2017-09-17 | LessWrong | “Optimizing the news feed” | Facebook news feed’s alignment with user preferences, profits vs social welfare, tech companies’ concern about public image | Paul is optimistic about tech companies like Facebook working toward greater alignment with user preferences (as opposed to ad revenue), while Wei is pessimistic. |
| 2016-12-03 | 2016-12-05 | LessWrong | “Crowdsourcing moderation without sacrificing quality” | Potential automated attacks on discussion forums | |
| 2016-12-30 | | Intelligent Agent Foundations Forum | “My current take on the Paul-MIRI disagreement on alignability of messy AI” | | |
| 2017-03-19 | | Medium | “Benign model-free RL” | | |
| 2017-06-10 | | Medium | “Corrigibility” | | |
| 2017-06-21 | 2017-06-26 | LessWrong | “S-risks: Why they are the worst existential risks, and how to prevent them” | Suffering risks, moral uncertainty, suffering-hating civilizations, whether it’s good for certain civilizations to exist (under different value assumptions) | Wei pushes for progress in philosophy (decision theory, meta-ethics, and so on). Paul assumes a purely suffering-focused view to understand its recommendations. |
| 2017-07-09 | 2017-07-15 | Effective Altruism Forum | “My current thoughts on MIRI’s ‘highly reliable agent design’ work” | Maturity and concreteness of MIRI’s highly reliable agent design (HRAD) approach vs Paul Christiano’s approach to AI alignment | Wei is worried that Paul’s approach has not received the level of scrutiny that MIRI’s approach has; Paul defends his approach. |
| 2017-07-17 | | Intelligent Agent Foundations Forum | “Current thoughts on Paul Christiano’s research agenda” | | |
| 2017-08-18 | | Intelligent Agent Foundations Forum | “Autopoietic systems and difficulty of AGI alignment” | | |
| 2018-02-25 | | LesserWrong | “The abruptness of nuclear weapons” | | |
| 2018-03-08 | | LesserWrong | “Prize for probable problems” | Paul Christiano’s approach to alignment | |
| 2018-03-10 | | Medium | “Universality and security amplification” | | |
| 2018-04-01 | | LessWrong | “Can corrigibility be learned safely?” | | |

(I’m not sure I’ve captured all the Medium threads, because I find Medium profiles confusing to browse.)

See also

External links