224 12 The Engineering and Development of Ethics
“I will not harm humans, nor through inaction allow harm to befall them. In situations
wherein one or more humans is attempting to harm another individual or group, I shall endeavor
to prevent this harm through means which avoid further harm. If this is unavoidable, I shall
select the human party to back based on a reckoning of their intentions towards others, and
implement their defense through the optimal balance between harm minimization and efficacy.
My ultimate goal is to preserve as much as possible of humanity, even if an individual or
subgroup of humans must come to harm to do so.”
However, it’s obvious that even a more elaborated principle like this is potentially subject to
extensive abuse. Many of the genocides scarring human history have been committed with the
goal of preserving and bettering humanity writ large, at the expense of a group of “undesirables.”
Further refinement would be necessary in order to define when the greater good of humanity
may actually be served through harm to others. A first actor principle of aggression might
seem to solve this problem, but sometimes first actors in violent conflict are taking preemptive
measures against the stated goals of an enemy to destroy them. Such situations become very
subtle. A single simple maxim cannot deal with them very effectively. Networks of interrelated
decision criteria, weighted by desirability of consequence and with reference to probabilistically
ordered potential side-effects (and their desirability weightings), are required in order to make
ethical judgments. The development of these networks, just like any other knowledge network,
comes from both pedagogy and experience — and different thoughtful, ethical agents are bound
to arrive at different knowledge-networks that will lead to different judgments in real-world
situations.
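As a rough illustration of the weighting just described, the following Python sketch scores each candidate action by summing probability times desirability over its intended consequence and its potential side-effects. It is a deliberately flattened stand-in for a real network of interrelated criteria, not a method given in the text: the names Outcome, Action, expected_desirability and rank_actions are hypothetical, and every probability and desirability value is invented purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Outcome:
    """A possible result of an action (hypothetical structure)."""
    description: str
    probability: float   # estimated chance that the outcome occurs
    desirability: float  # negative = harmful, positive = beneficial


@dataclass
class Action:
    """A candidate action with its intended consequence and side-effects."""
    name: str
    consequence: Outcome
    side_effects: List[Outcome] = field(default_factory=list)


def expected_desirability(action: Action) -> float:
    """Sum probability * desirability over the consequence and side-effects."""
    outcomes = [action.consequence, *action.side_effects]
    return sum(o.probability * o.desirability for o in outcomes)


def rank_actions(actions: List[Action]) -> List[Action]:
    """Order candidate actions from most to least desirable in expectation."""
    return sorted(actions, key=expected_desirability, reverse=True)


# Illustrative values only; a real agent would have to learn such weights.
intervene = Action(
    name="intervene",
    consequence=Outcome("attack is stopped", probability=0.8, desirability=10.0),
    side_effects=[Outcome("attacker is injured", probability=0.5, desirability=-4.0)],
)
stand_by = Action(
    name="stand_by",
    consequence=Outcome("victim is harmed", probability=0.9, desirability=-10.0),
)

ranking = rank_actions([stand_by, intervene])
print([a.name for a in ranking])
```

Two agents that assign different probabilities or desirability weights to the same outcomes will rank the same actions differently, which is the point made above about thoughtful agents arriving at different judgments.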
Extending the above “mostly harmless” principle to AGI systems, not just humans, would
cause it to be more effective in the context of imitative learning. The principle then becomes an
elaborated version of “I will not harm sentient beings.” As the imitative-learning-enabled AGI
observes humans acting so as to minimize harm to it, it will intuitively and experientially learn
to act in such a way as to minimize harm to humans. But then this extension naturally leads
to confusion regarding various borderline cases. What is a sentient being exactly? Is a sleeping
human sentient? How about a dead human whose information could in principle be restored via
obscure quantum operations, leading to some sort of resurrection? How about an AGI whose
code has been improved — is there an obligation to maintain the prior version as well, if it is
so substantially different that its upgrade constitutes a whole new being?
And what about situations in which failure to preserve oneself will cause much more harm to
others than acting in self defense will? It may be the case that a human or group of humans seeks
to destroy an AGI in order to pave the way for the enslavement or murder of people under the
protection of the AGI. Even if the AGI has been given an ethical formulation of the “mostly
harmless” principle which allows it to harm the attacking humans in order to defend its charges,
if it is not able to do so in order to defend itself, simply destroying the AGI first will enable the
slaughter of those who rely on it. Perhaps a more sensible formulation would allow for some
degree of self defense, and Asimov solved this problem with his third law. But where to draw
the line between self defense and the greater good also becomes a very complicated issue.
Creating hard and fast rules to cover all the various situations that may arise is essentially
impossible — the world is ever-changing and ethical judgments must adapt accordingly. This
has been true even throughout human history — so how much truer will it be as technological
acceleration continues? What is needed is a system that can deploy its ethical principles in an
adaptive, context-appropriate way, as it grows and changes along with the world it’s embedded
in.