Value Systems, AI and Human Existence

The Singularity Institute has recognized that nonhuman intelligences can have – or evolve – arbitrary value systems. And arbitrary value systems seem likely to be incompatible with ours: human value systems are generally based on continuing human existence, which seems fairly constraining in the scope of all value systems. For example, any value system which wants to consume lots of a resource that humans depend on (air, sunlight, carbon atoms, habitable planets, …) will probably conflict with most human values.

So why do we still exist? No intelligence has yet succeeded in consuming our resources. We’ve not even managed to find convincing evidence of the existence of any other intelligence which can compete for the resources we need. Lucky. (Or apply the anthropic principle.)

However, people are working on improving artificial intelligence in software. Currently AI is not very smart or powerful. But contrasted with our own brains, AI is very flexible – it doesn’t come with any assumptions or value systems and it is easy to change by editing the source code. It’s easy to imagine an AI which could edit its own source code.¹

Given that an AI is editing its own code, and doing so in an intelligent manner, with the goal of making itself more intelligent, there is no obvious limit on the rate of improvement and scope of intelligence that such a self-improving entity could achieve. This is what’s known as the technological singularity – an AI that improves itself rapidly until it vastly outstrips human intelligence.

Up to this point, the only thing we are assuming about an intelligent AI’s value system is that it wants to make itself more intelligent. Without any other assumptions, this isn’t likely to be compatible with the human value of continuing existence – for example, to become more intelligent, perhaps the AI needs all the sunlight hitting Earth for energy to run more brain cells. Since the AI is so much smarter than us, it would outsmart us to get itself to that point, extinguishing the human race in the process. Oops.

The Singularity Institute (SIAI) was formed by people who followed this train of thought and consider it a serious risk to continuing human existence. Its purpose is to mitigate the risk of human extinction by technological singularity. Rather than trying to prevent a singularity from occurring, which seems unlikely to work², SIAI is trying to figure out how to make an AI whose values are sufficiently aligned with human values that a singularity doesn’t pose a significant risk of human extinction.

I recently learned that SIAI is sponsoring Visiting Fellows – essentially an internship at the Institute to work on these problems. As part of my application, they asked me to explain SIAI’s purpose and how I can contribute. Having done the former, I will attempt to answer the latter in the next blog post (linked here when it’s ready).

¹ Some humans think editing human code (DNA) is immoral. An AI would presumably have no qualms about editing its own code. That idea, when combined with the strong likelihood that the AI source code is easier to understand and model than the human genetic code, makes AI self-modification seem very easy in comparison.

² The economic benefits of controlling a super-smart AI are so great that many organizations have a strong incentive to attempt to produce one.