Paul Christiano’s views on AI safety

| Topic | View |
|---|---|
| AI timelines | |
| Kind of AGI we will have first (de novo, neuromorphic, WBE, etc.) | |
| Preference ordering between kinds of AGI | |
| Type of AI safety work most endorsed | |
| Value of decision theory work | “On the flip side, I think there is a reasonably good chance that problems like decision theory will be obsoleted by a better understanding of how to build task/act-based AI (and I feel like for the most part people haven’t convincingly engaged with those arguments).” (source) |
| Value of highly reliable agent design work | |
| Value of machine learning safety work | |
| Value of intelligence amplification work | |
| Value of thinking of esoteric failure modes | Endorsed; see this remark and posts like this one. |
| Difficulty of AI alignment | |
| Shape of takeoff/discontinuities in progress | Wei Dai: “Paul seems to be similarly uncertain about the speed and locality of the intelligence explosion, but apparently much more optimistic than me (or Eliezer and Robin) about the outcome of both scenarios. I’m not entirely sure why yet.”[1] |
| How “prosaic” AI will be | Prosaic AGI is possible. See here, where he gives >10% probability to building prosaic AGI. |
| Difficulty of philosophy | |
| How well we need to understand philosophy before building AGI | It’s unlikely, or at least unclear, that we need to solve many philosophical problems first. See here for one expression of the idea. |
| Cooperation vs values spreading/moral advocacy | |
| How much alignment work is possible early on | |
| Hardware/computing overhang | |
| Relationship between the ability of the AI alignment team and the probability of good outcomes | |

See also

External links