Paul Christiano’s views on AI safety

| Topic | View |
|---|---|
| AI timelines | |
| Kind of AGI we will have first (de novo, neuromorphic, WBE, etc.) | |
| Preference ordering between kinds of AGI | |
| Type of AI safety work most endorsed | |
| Value of decision theory work | “On the flip side, I think there is a reasonably good chance that problems like decision theory will be obsoleted by a better understanding of how to build task/act-based AI (and I feel like for the most part people haven’t convincingly engaged with those arguments).” (source) |
| Value of highly reliable agent design work | |
| Value of machine learning safety work | |
| Value of intelligence amplification work | |
| Value of thinking of esoteric failure modes | Endorsed; see this remark and posts like this one. |
| Difficulty of AI alignment | |
| Shape of takeoff/discontinuities in progress | Wei Dai: “Paul seems to be similarly uncertain about the speed and locality of the intelligence explosion, but apparently much more optimistic than me (or Eliezer and Robin) about the outcome of both scenarios. I’m not entirely sure why yet.”[1] |
| How “prosaic” AI will be | Prosaic AGI is possible. See here, where he gives >10% probability to building prosaic AGI. |
| Difficulty of philosophy | |
| How well we need to understand philosophy before building AGI | It’s unlikely, or at least unclear, that we need to solve many philosophical problems first. See here for one expression of the idea. |
| Cooperation vs values spreading/moral advocacy | |
| How much alignment work is possible early on | |
| Hardware/computing overhang | |
| Relationship between the ability of the AI alignment team and the probability of good outcomes | |

See also

External links