Paths to AGI
De novo
Neuromorphic
Whole brain emulation
Intelligence enhancement
Approaches to alignment
Highly reliable agent design
Task-directed AGI
Paul Christiano’s approach (are there multiple?)
Inverse reinforcement learning
Learning from human preferences
Adversarial examples
Working on philosophical questions
Indirect normativity
Coherent extrapolated volition
https://vkrakovna.wordpress.com/2017/08/16/portfolio-approach-to-ai-safety-research/ also suggests various “properties” to group the different alignment approaches.
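As a concrete illustration of the “learning from human preferences” approach listed above, here is a minimal sketch of fitting a reward model from pairwise comparisons using a Bradley–Terry (logistic) model, the basic setup behind deep RL from human preferences. Everything here is a toy: the linear reward, the synthetic “human” judgments, and all variable names are illustrative assumptions, not any particular paper’s implementation.

```python
import math
import random

# Toy reward learning from pairwise human preferences: the "human" compares
# two trajectories (here, 2-d feature vectors) and picks the better one; we
# fit a linear reward r(x) = w . x so that P(A preferred over B) matches a
# logistic (Bradley-Terry) model. All data is synthetic/illustrative.

random.seed(0)

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hidden "true" reward the simulated human judges by (illustrative).
true_w = [2.0, -1.0]

# Synthetic preference dataset: (features_a, features_b, a_preferred).
data = []
for _ in range(500):
    a = [random.uniform(-1, 1) for _ in range(2)]
    b = [random.uniform(-1, 1) for _ in range(2)]
    data.append((a, b, dot(true_w, a) > dot(true_w, b)))

# Fit w by gradient ascent on the Bradley-Terry log-likelihood:
# P(A > B) = sigmoid(r(A) - r(B)).
w = [0.0, 0.0]
lr = 0.5
for _ in range(200):
    grad = [0.0, 0.0]
    for a, b, a_pref in data:
        p = sigmoid(dot(w, a) - dot(w, b))  # model's P(A preferred)
        err = (1.0 if a_pref else 0.0) - p
        for i in range(2):
            grad[i] += err * (a[i] - b[i])
    for i in range(2):
        w[i] += lr * grad[i] / len(data)

# The learned reward should agree with the human on most pairs.
correct = sum((dot(w, a) > dot(w, b)) == a_pref for a, b, a_pref in data)
accuracy = correct / len(data)
```

The point of the sketch is just that a reward function can be recovered from comparisons alone, without the human ever stating numeric rewards; the learned `w` should roughly point in the same direction as `true_w`.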
The role of philosophy
Eliezer has said something to the effect that “copy-pasting a strawberry hits 95% of the interesting alignment problems”, but he has also said that we can’t make do with anything less than full human morality, or something similar. Wei Dai pointed this out in a Facebook thread. I think this is related to the “how much philosophy do we need to understand?” question, but probably distinct.
Implicit in Wei Dai’s philosophical pessimism seems to be the idea that if we don’t do philosophy right, the expected value of the far future will be catastrophically bad or small, rather than merely “okay” or “pretty good” or “very good, but still far from optimal”. Is the reasoning like the one given here?
Role of philosophy in alignment
How much philosophy do we need to understand? Do we need to specify “all of human morality”?
How benign does the environment need to be to get philosophy right?
Weird failure modes (e.g. siren/marketing worlds, malign prior)
Miscellaneous questions
Hardware overhang; possibly interesting search
Openness vs secrecy |
Race dynamics |
Differential development/stuff about desirability of slow technological development |
Ability to reduce problems to learning problems; see here
Amount of hardware required for first AGI |
State involvement |
Ceiling for artificial intelligence (e.g. some people think AGI isn’t even possible in principle, and even those who believe AGI is possible have different views on how much smarter than a human an AGI could be)
Singleton/multipolar scenarios |
Best-case scenario/mainline success scenario |