Paths to AGI
De novo
Neuromorphic
Whole brain emulation
Intelligence enhancement
Approaches to alignment
Highly reliable agent design
Task-directed AGI
Paul Christiano’s approach (are there multiple?)
Inverse reinforcement learning
Learning from human preferences (see the reward-model sketch below)
Adversarial examples (see the FGSM sketch below)
Working on philosophical questions
Indirect normativity
Coherent extrapolated volition
Victoria Krakovna’s post at
https://vkrakovna.wordpress.com/2017/08/16/portfolio-approach-to-ai-safety-research/
also suggests various “properties” by which the different alignment approaches can be grouped.
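To make the “learning from human preferences” entry more concrete, here is a minimal sketch of fitting a reward model from pairwise comparisons of trajectory segments, in the style of Christiano et al.’s “Deep RL from Human Preferences”. The network shape, segment length, and the synthetic comparison data are illustrative assumptions, not details of any particular system.

```python
# Minimal sketch: fit a reward model from pairwise human preferences using a
# Bradley-Terry (logistic) model over predicted segment returns.
# All shapes, names, and the synthetic data below are illustrative assumptions.

import torch
import torch.nn as nn

OBS_DIM = 8          # assumed observation size
SEGMENT_LEN = 20     # assumed length of each trajectory segment

class RewardModel(nn.Module):
    """Maps a single observation to a scalar reward estimate."""
    def __init__(self, obs_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 1),
        )

    def segment_return(self, segment: torch.Tensor) -> torch.Tensor:
        # segment: (batch, SEGMENT_LEN, OBS_DIM) -> summed predicted reward, shape (batch,)
        return self.net(segment).squeeze(-1).sum(dim=-1)

def preference_loss(model, seg_a, seg_b, prefs):
    """prefs[i] = 1.0 if the human preferred segment A in pair i, else 0.0.
    P(A preferred over B) is modelled as sigmoid(R(A) - R(B))."""
    logits = model.segment_return(seg_a) - model.segment_return(seg_b)
    return nn.functional.binary_cross_entropy_with_logits(logits, prefs)

# Synthetic stand-in for human-labelled comparison data.
seg_a = torch.randn(32, SEGMENT_LEN, OBS_DIM)
seg_b = torch.randn(32, SEGMENT_LEN, OBS_DIM)
prefs = torch.randint(0, 2, (32,)).float()

model = RewardModel(OBS_DIM)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    loss = preference_loss(model, seg_a, seg_b, prefs)
    loss.backward()
    opt.step()
# The learned reward model would then stand in for the environment's reward
# signal when training an RL agent.
```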
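Likewise, for the “adversarial examples” entry, here is a minimal sketch of the fast gradient sign method (FGSM) against a generic differentiable classifier; the toy model, input shape, and epsilon value are placeholders, not a reference implementation.

```python
# Minimal sketch of the fast gradient sign method (FGSM) for crafting an
# adversarial example against a differentiable classifier. The tiny model,
# input shape, and epsilon are illustrative placeholders.

import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10))
loss_fn = nn.CrossEntropyLoss()

def fgsm(model, x, label, epsilon=0.1):
    """Return x perturbed one signed-gradient step in the direction that increases the loss."""
    x = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x), label)
    loss.backward()
    x_adv = x + epsilon * x.grad.sign()
    # Clamp to the valid input range [0, 1].
    return x_adv.clamp(0.0, 1.0).detach()

x = torch.rand(1, 784)         # stand-in for a normalised image
label = torch.tensor([3])      # stand-in for the true class
x_adv = fgsm(model, x, label)
print(model(x).argmax(dim=1), model(x_adv).argmax(dim=1))  # prediction may flip
```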
The role of philosophy
Eliezer has said something to the effect that “copy-pasting a
strawberry hits 95% of the interesting alignment problems”, but he has
also said that we can’t make do with anything less than full human
morality, or something similar. Wei Dai pointed out this tension in a
Facebook thread. I think this is related to the “how much philosophy do
we need to understand?” question, but probably distinct.
Implicit in Wei Dai’s philosophical pessimism seems to be the idea
that if we don’t do philosophy right, the expected value of the far
future will be catastrophically bad or small, rather than merely “okay”
or “pretty good” or “very good, but still far from optimal”. Is the
reasoning like the one given here?
Role of philosophy in alignment
How much philosophy do we need to understand? Do we need to specify “all of human morality”?
How benign does the environment need to be to get philosophy right?
Weird failure modes (e.g. siren/marketing worlds, malign prior)
Miscellaneous questions
Hardware overhang; possibly interesting search
Openness vs secrecy
Race dynamics
Differential development / desirability of slow technological development
Ability to reduce problems to learning problems; see here
Amount of hardware required for the first AGI
State involvement
Ceiling for artificial intelligence (e.g. some people think AGI isn’t even possible in principle, and even those who believe AGI is possible have different views on how much smarter than a human an AGI could be)
Singleton/multipolar scenarios
Best-case scenario/mainline success scenario