Wei Dai’s views on AI safety

The following list summarizes Wei Dai’s views on various topics in AI safety.

AI timelines: Wei talks a lot about concerns about shortening AI timelines (e.g. by continuing his decision theory work), but I haven’t really seen him give an estimate of when he expects human-level AI to arrive. “I won’t defend these numbers because I haven’t put much thought into this topic personally (since my own reasons don’t depend on these numbers, and I doubt that I can do much better than deferring to others).” (from this comment)

Value of decision theory work: See here. “For example I’ve mostly stopped working on decision theory because it seems to help UFAI as much as FAI.” (from this comment)

Value of highly reliable agent design work: See comments like this one, this one, this one, and this one.

Difficulty of AI alignment: He seems to be very pessimistic. From this comment: “my probability of a good outcome is closer to 20% (maybe a range of 10-30% depending on my mood) than 1%”.

Shape of takeoff/discontinuities in progress: “I guess I would describe my overall view as being around 50/50 uncertain about whether the Singularity will be Yudkowsky-style (fast local FOOM) or Hanson-style (slower distributed FOOM).”[1] See also the comment thread starting here, where Wei gives some arguments for the plausibility of a FOOM.

Type of AI safety work most endorsed: He has endorsed (1) strategy research; (2) intelligence enhancement/amplification;[2][3] (3) “pushing for a government to try to take an insurmountable tech lead via large scale intelligence enhancement”; (4) philosophy research on topics like consciousness, normative ethics, and meta-ethics, which he thinks will not shorten AI timelines; and (5) advocacy/outreach.[4] He also seems less worried about whole-brain emulation than about de novo AGI.[5] See also this comment and this comment.

How “prosaic” AI will be: He hasn’t said anything about this as far as I can tell.

Kind of AGI we will have first (de novo, neuromorphic, WBE, etc.)

Difficulty of philosophy: Philosophy is hard. Wei has discussed this in many places. See here for one discussion and here for a recent comment. I’m not aware of a single comprehensive overview of his views on the difficulty of philosophy.

How well we need to understand philosophy before building AGI: We need to understand philosophy well. See some of the discussions with Paul Christiano, as well as threads like this one. “I think we need to solve metaethics and metaphilosophy first, otherwise how do we know that any proposed solution to normative ethics is actually correct?”[6] “I guess there is a spectrum of concern over philosophical problems involved in building an FAI/AGI, and I’m on the far end of that spectrum. I think most people building AGI mainly want short term benefits like profits or academic fame, and do not care as much about the far reaches of time and space, in which case they’d naturally focus more on the immediate engineering issues.”[7]

How much alignment work is possible early on: “My model of FAI development says that you have to get most of the way to being able to build an AGI just to be able to start working on many Friendliness-specific problems, and solving those problems would take a long time relative to finishing rest of the AGI capability work.”[8] In a slow-FOOM scenario, it is difficult to work ahead because new AI architectures will continue to be developed.[9]

Hardware/computing overhang: I haven’t seen him talk about this at all.

Relationship between ability of the AI alignment team and the probability of good outcomes: He has a chart in this comment.

See also

External links


References

  1. “Wei_Dai comments on How can we ensure that a Friendly AI team will be sane enough?”. LessWrong. Retrieved March 8, 2018.

  2. “Wei_Dai comments on Cynical explanations of FAI critics (including myself)”. LessWrong. Retrieved March 8, 2018.

  3. Wei Dai (July 13, 2011). “Some Thoughts on Singularity Strategies”. LessWrong. Retrieved March 8, 2018.

  4. “Wei_Dai comments on How does MIRI Know it Has a Medium Probability of Success?”. LessWrong. Retrieved March 8, 2018.

  5. Wei Dai (August 28, 2012). “Wei_Dai comments on Stupid Questions Open Thread Round 4”. LessWrong. Retrieved March 8, 2018.

  6. “Wei_Dai comments on AALWA: Ask any LessWronger anything”. LessWrong. Retrieved March 8, 2018.

  7. “Wei_Dai comments on AALWA: Ask any LessWronger anything”. LessWrong. Retrieved March 8, 2018.

  8. “Wei_Dai comments on How does MIRI Know it Has a Medium Probability of Success?”. LessWrong. Retrieved March 8, 2018.

  9. Wei Dai (August 31, 2013). “Wei_Dai comments on Outside View(s) and MIRI’s FAI Endgame”. LessWrong. Retrieved March 8, 2018.

    The way I model AGI development in a slow-FOOM scenario is that AGI capability will come in spurts along with changing architectures, and it’s hard to do AI safety work “ahead of time” because of dependencies on AI architecture. So each time there is a big AGI capability development, you’ll be forced to spend time to develop new AI safety tech for that capability/architecture, while others will not wait to deploy it. Even a small delay can lead to a large loss since AIs can be easily copied and more capable but uncontrolled AIs would quickly take over economic niches occupied by existing humans and controlled AIs.