Paul Christiano’s views on AI safety
Topic | View |
---|---|
AI timelines | |
Kind of AGI we will have first (de novo, neuromorphic, WBE, etc.) | |
Preference ordering between kinds of AGI | |
Type of AI safety work most endorsed | |
Value of decision theory work | “On the flip side, I think there is a reasonably good chance that problems like decision theory will be obsoleted by a better understanding of how to build task/act-based AI (and I feel like for the most part people haven’t convincingly engaged with those arguments).” (source) |
Value of highly reliable agent design work | |
Value of machine learning safety work | |
Value of intelligence amplification work | |
Value of thinking of esoteric failure modes | Endorsed; see this remark and posts like this one. |
Difficulty of AI alignment | |
Shape of takeoff/discontinuities in progress | Wei Dai: “Paul seems to be similarly uncertain about the speed and locality of the intelligence explosion, but apparently much more optimistic than me (or Eliezer and Robin) about the outcome of both scenarios. I’m not entirely sure why yet.”[1] |
How “prosaic” AI will be | Considers prosaic AGI possible; see here, where he gives a greater than 10% probability to building prosaic AGI. |
Difficulty of philosophy | |
How well we need to understand philosophy before building AGI | Thinks it is unlikely, or at least unclear, that we need to solve many philosophical problems before building AGI; see here for one expression of this view. |
Cooperation vs values spreading/moral advocacy | |
How much alignment work is possible early on | |
Hardware/computing overhang | |
Relationship between the ability of the AI alignment team and the probability of good outcomes | |
See also
External links
- “My current take on the Paul-MIRI disagreement on alignability of messy AI” by Jessica Taylor
- “Current thoughts on Paul Christiano’s research agenda” by Jessica Taylor
1. “Wei_Dai comments on How can we ensure that a Friendly AI team will be sane enough? - Less Wrong”. LessWrong. Retrieved March 8, 2018.