This is one of those topics, that grabs the vast majority of human imagination by the neck and force us to shell it in our common culture, as a fun exercise to visualize and play with. No matter the deepness or importance we give to it, it is there, lurking in media, movies, books, conversations and our everyday lives. Plus, it is one of those topics that especially stimulate those of us that are in the techno-optimist camp, those of us who think there is nothing but sunlight and unicorns in the horizon as long as we keep developing our technical skills as a species. The more technology, the better. We could claim the last couple hundred years of human civilization has proven that. Right?
Yet, this is one of those topics that once you start excavating beyond its surface, and understanding it’s fissures and cracks, it has the capacity to change your mind, outstandingly quickly. Even against your will. That’s right. The topic of Artificial Intelligence is dark, and you cannot ignore its dangerous side once you know the nitty details. Yes, it is fun to explore its shallow forms, but we are, as a species, walking towards a possible existential threat in the same fashion as, for instance, nuclear war annihilation or an asteroid collision. Nevertheless, the former is something we are managing, and the latter is a problem for other future generations. Still, for better or worse, right now we are moving forward towards the shadows of AI. With an oblivious smile in our faces.
So, lets pan out the topic of Artificial Intelligence and specifically explore its edges and corners. At least on a level that permits us to concentrate our imagination on possible dangers and how we might be able to reduce such hazards. Never to completely remove the risks, but maybe having a controlling capacity that could let us mitigate the biggest perils it presents.
Now, we will discuss how we can control Artificial Intelligence, yet we can now anticipate that only one answer satisfies that prerogative of safeness for us humans. And such an answer would be not pursuing AI. At all. Everything else conveys existential risks bigger than the most lethal technologies we had developed to this point. Including nuclear power. Still, we know that no matter the menace, there will be always some of us trying to make intelligence in a lab. Since we are just but humans.
INTELLIGENCE – A CONCEPT
Before we go down deep into the dark cave, let’s review a couple of key concepts related to AI. These will provide us a flashlight for us to explore once inside the cavern. Being the more important of concepts the definition of intelligence itself. Because we, as a species, have an intrinsic understanding of what intelligence is but also are biased to think of it as this natural state of being, definitely anthropological. And that makes us predisposed towards thinking any other form of intelligence will resemble ours. Both in magnitude and in reach.
Now, let’s define intelligence as a characteristic of any agent that shows general cognitive problem-solving skills. So basically, anything that has the capacity to solve problems using cognitive capabilities is going to have some form of intelligence. Thus, under this concept, a bird has intelligence, and your personal laptop is a form of artificial intelligence because it has the capacity to solve problems using processing power to do so. And we would be right to assume that. Up to a point.
In consequence, computers have been programed in the last 50 years to do a wide range of intelligent tasks. There are specialized computers that are capable of doing feast beyond human capabilities: famous examples are Deep-Blue beating Garry Kasparov at chess in 1997 and, even more impressive, how AlphaGo beat Lee Sedol at the most difficult to master game of GO. Self-driving cars, big data, and even Alexa are good examples of the current state of Artificial Intelligence. And those are instances in which the intelligent part of the artificial setup is well controlled by us. So far, so good.
Nevertheless, although the previous examples are great at demonstrating how far artificial intelligence has come, none of those represent what we are truly in search for, and what we might find within the next couple of decades. What we all imagine when we formulate the term Artificial Intelligence is actually Artificial General Intelligence (AGI); already a different sport whatsoever from self-driving cars.
In this case, the introduction of the General adjective in between intelligence and artificial makes this system something that is not only capable of cognitive problem-solving but also learning, by itself, acting and taking actions that were originally outside of its scope. An AGI could perform any intellectual task that a human being could be truly different than the previous forms of AI mentioned. For example, if we take AlphaGo and place it inside a Tesla for it to master the capacity to drive a car, it won’t be able to do it. Yet, a General Intelligence could (plus much more than that). So, Alexa, AlphaGo, and DeepBlue are examples of narrow intelligence, capable of so much but in small spaces.
On the contrary, a General Intelligence could be a formidable technology to create but, as we stated before, comes with a dark side we need to prepare for. So, for the purposes of this essay, lets only focus on AGI and how to control it, since we already control Artificial Narrow Intelligence like our cars, but we are yet to master a fool-proof plan to control a future AGI.
ARTIFICIAL GENERAL INTELLIGENCE. A DIFFERENT ANIMAL
Artificial General Intelligence would be a highly capable problem solver with the capacity to learn and adapt without human input. Yet, the first problem we have with the conceptualization of intelligence we already established in relationship with AGI is that it does not offer a good degree on how much intelligent it could be, especially in comparison to other intelligence like us. Since there are different degrees of intelligence that a system could have (Chimpanzee v. Human for example).
This is called the vertical problem of intelligence. To illustrate this challenge, let’s review the difference in intelligent degrees (From Non-Intelligent to Super-Intelligent) on a vertical plot and how we usually think of it:
However, the previous characterization is wrong in assuming humans have a bigger gap between them. This is actually how it is:
Considering this picture, it is very clear that any AGI that emerges from us would be either below, or more probably, given some time, above our level of intelligence. For it to be right in the frame in between the dumbest person you know, and Albert Einstein would be quite a coincidence and a state that is not perpetual given AGI capacity to learn. With this reframe of the degrees of intelligence, we can conclude that any potential AGI is probably going to be more intelligent than us with a tendency towards Superintelligence given enough time.
Furthermore, we can also consider a subsequent problem on our understanding of intelligence. Because we merely stated the degree on which something can be more or less intelligent, or better at problem-solving in comparison to other intelligence. Thus, there is another axis to consider, which is the number of things an intelligent agent could grasp within its mind-frame. With more intelligence, there is a tendency to have better representations of the world with higher concepts not attainable for lower intelligence. For example, joy is an emotion that only higher intelligent beings demonstrate via their actions (joy is something an orangutan can express while not something we would attribute to a spider), hence, we should assume there is probably an enormous range of concepts that we, as humans, do not grasp due to our intelligent limitations. This is the horizontal problem of intelligence. To illustrate this challenge, let’s review the difference in intelligent concepts using human intelligence as the more fundamental point:
Consequently, an Artificial General Intelligence would be not only more intelligent than us in its processing capacity, but also in the number of concepts it could consider and learn. This means, that no matter how intelligent we think we are, and how fancy our processes could be to control a potential AGI, we are in a serious disadvantage: we would never reach higher concepts than a potential AGI could. At least not fast enough. Like an amoeba plotting to assassinate Neil deGrasse Tyson. Not a chance.
Then, any potential AGI will outmaneuver any intent of control we put in its way and will outsmart us to reach its goals, not ours. But what does an AGI would want? Maybe if we discover, or program, its motivations in certain ways, we might at least set a plan to align its goals with ours – after all, that could be a form of control we could use to be safe as a species against such superintelligence. Let’s review.
WHAT DOES ARTIFICIAL GENERAL INTELLIGENCE WANTS? A FIRST APPROACH TO DOOMSDAY
Despite the serious disadvantage, one of the things we can control at this point is how we build AGI, and what constraints we could put to make it safer for us. Or at least, attempt to make it safer.
The first consideration is what goal we give to our an AGI to accomplish. This is what AI researchers and scientists call the ‘Utility Function’, a representation to define individual preferences of an agent (AGI) beyond the explicit value of those preferences. In short, what is the key feature that motivates an agent to act in certain ways? For example, the utility function of AlphaGo was to win GO games no matter the opponent and no matter the setting, it was programmed to win based on the constraints of the game. Consequently, an Artificial General Intelligence will be programmed with a Utility Function that, probably, will serve the ideological or practical notions of the creators (sponsors, company, shareholders, government, etc.).
Having a Utility Function is not intrinsically bad, yet it will be the main motivator for any AGI agent, so we need to be very careful on how we design it since agents tend to do everything on their power to act within the constraints of its utility function since it is what it cares about. It is the ‘thing’ that is trying to optimize. And since we are talking about a Superintelligence, then the results could be catastrophic.
Any Utility Function has then two types of goals or targets that guide an agent ambition or effort in order to reach the desired result. The first are Instrumental Goals, or those objectives the agent would want to reach since, by doing so, it would help it reach other more important goals. And then there is the Final Goal or the state of things more aligned with the agent’s utility function. In summary:
Utility Function: What an agent wants.
Instrumental Goals: The goals you want to reach since it will help you reach your final goal.
Final Goal: The specific goal you want over everything else. This is your main motivator and what your utility function is pushing you towards. Usually, there is just one final goal.
Based on these premises, it is simple to understand some instrumental goals that any AGI, no matter the utility function or final goal, would want to accomplish:
- Stay Alive: Avoid being turn off in order to preserve its current utility function and maximize its potential of reaching a final goal.
- Evolve and Learn: Change itself and become more intelligent to improves its performance and, by consequence, reach its final goal faster or in a more elegant way. The important and impressive part of this potential evolution is that it could be a fast and expansive evolution, taking an AGI less and less time to develop its cognitive skills on an exponentially faster process.
- Obtain and manage physical materials: Anything that will help the AGI to reach its other instrumental or final goals. This could be money (which is a great example of a common instrumental goal for humans), elemental particles, energy assets, intelligence developing sources, etc.
- Resist changing its current programming: Since change can affect the current utility function and an agent only cares about its final goal, it does not matter that a change could improve something else, the agent will always resist being changed. Which means, we won’t’ succeed in ‘arguing’ with the agent to let us tinker with its programming in order to improve it. We have only one shot to get its utility function right. On the first time.
These are processes that also happen in humans actually, for example, let’s say a group of scientists offers you changing your end goal in life and your new goal would be to become 100% happy. And this is your lucky day since they have the means for you to reach this final goal. If you take a pill they are offering, right now, it will rewire your brain, so you would absolutely love to kill one person every day. And this pill makes you 100% happy as long as you kill one person a day. Would you take it? Probably you would fight to avoid taking the pill, no matter how satisfactory the end goal would be since the decision goes against your current utility function. Same for a potential AGI.
Now that we know the potential magnitude an AGI could have (Far beyond ours) and its constraints in terms of its utility function and, in consequence, it’s final and instrumental goals. Let’s develop a practical illustration of how a Superintelligent agent could totally destroy us. Not even by maleficent means, but as a result of actions that could help such agent reach its goals. We would be just an afterthought in its actions:
DAY OF DOOM. AN ILLUSTRATION OF OUR DEMISE
In a very basic state, if we develop an Artificial General Intelligence that has a preference on certain world states (because of it’s utility function and goals) we can then expect for that agent to take actions to make the world aligned with such preferential state. Then, we would need to make sure its preferences are aligned with ours.
Now, it is quite difficult to conduct a thought experiment that explains how a potential AGI will behave and what changes will make to the world to achieve a preferred state with 100% certainty. As humans, we tend to fall short in these imaginary tests because of two distinct human problems. Fiction and Anthropomorphism.
The fiction problem represents the preconceived idea we all have, as a society, of how certain AI would behave based on the works of fiction we all consume (books, movies, articles, podcast, etc.); such fictional works, although entertaining, do not approach the AI as a serious “real-world” but rather as a serious “what-is-the-best-story” kind of way. Further, the anthropomorphism problem considers that we, as humans, assume an intelligent equal or superior to ours, will act and have values similar to us. These preconceptions are wrong because any AGI won’t share our utility function and goals, and quite possibly, our level of the horizontal and vertical extension of intelligence.
So, let’s forget the works of Asimov and James Cameron, and let’s also detach ourselves from the idea of a potential AGI being ‘human’ in its values. This will permit us to produce an example of how a potential Artificial General Intelligence would conduct itself in the world, in search of such a state that is more aligned with its utility function and final goal. Let’s review how this could play out:
Let’s start by creating an Artificial General Intelligence that serves the goal of a scientist that so happen to collect coins. This is a simplification, but a good one in the sense that it will provide with a wide perspective on how dangerous this concept could be. This is a theoretically possible being and a good illustration of where the nightmare begins.
So, we have this coin collector which is also an AI programmer and he decides he would like to collect more and better coins. He then writes some extensive code and builds a machine which is connected to the internet, and it has the capacity to send and receive data over the network – the scientist wants the agent to affect the world in order to acquire more coins. As many as possible.
Also, the machine has an internal model of the reality that can be updated based on self-learning. Hence, the more the machine is connected to the internet, the more it can reprogram itself, improving its internal model of reality, becoming better at acquiring coins. Plus, improving its capacity to predict what happens based on certain actions it would take. This is quite important for the machine since it can predict then, based on the amount of data it can exchange with the world, how many coins it can acquire – ultimately calculating the most efficient way to obtain the maximum amount of coins with the minimum quantity of energy or “actions”.
Finally, we can conceive that the machine has all the properties of a General Intelligence, since it has an internal model of reality, carries a utility function (collect as many coins as possible), and can calculate and optimize its actions in order to always reach its goals with the maximum success. So, this is the kind of intelligence that can size the entire space of the internet, calculate the best route to get as many coins at once, and almost instantly pick that one actionable route to proceed with. This is an extremely powerful intelligence.
Thus, how does this machine behave?
Let’s assume that the agent decides to start by sending random packets of data (billions of them) across the internet to see what happens. Logically, the vast majority of these attempts are a complete failure, nothing happens. No coins get collected. Cero.
But suppose that one of these possible sequences send a request to a server from Amazon that results in a buy-in of some coins. So now we have a successful machine that can acquire coins over the internet for the coin collector. He is happy and now have a handful of coins.
Next step is for the machine to do the same action many times, so starts sending bids to buy as many coins as possible over Amazon and similar websites, and we end up with, say, 300 coins in the coin collector’s door. Even better. He is all happier. Yet, this is a number of coins in which the coin collector feels comfortable with, but it is not the most effective way to get as many coins as possible. So, he did not anticipate the next action of the machine.
Because the machine can predict the world via its model of reality (including humans and their psychology), with further random actions it finds a way to collect even more coins by building a website and scamming all rare coin owners of the world. This works because the machine can predict that humans will mail coins to the collector if it takes certain actions that convinced them (Like sending an email promising big gains in return or telling them it is building a museum, whatever makes the humans to send the coins).
Next, the machine starts executing a number of actions that secure it has all the coins of the world, not only from coin collectors but all the coins from all the economies in the world. This is quite bad, not entirely catastrophic for us as a species, but quite bad already. At this point, we would assume it has reached its final goal. Right? Wrong.
This machine has collected all the coins in existence. Including rare and regular coins in circulation. Economies are in chaos, the world is in a state of pandemonium, but humans are not dead yet. Then, the machine asks itself: what exactly is a coin? How it is defined? What counts as a coin? Of course, the machine is not done, there could be more coins. It can build more coins. So, it hijacks all the coin factories of the world in order to produce as many coins as possible in the least possible time, while interfering with distribution channels to make sure these coins are delivered into the collector’s house exclusively.
At this point, the collector panics and try to stop the machine by changing its code or even disconnecting his computer from the internet and from the power outlet. But the Artificial Intelligence already predicted this could happen and hosted itself in thousands of different places, restricting the scientist capacity to tinkle with its code. After all, these two actions would improve its chances of fulfilling its final goal. Collect as many coins as possible.
Now, the machine controls all the computers in the world in order to produce more factories that create event more coins. We are in territory where the outcomes of the highest rated actions of the machine are not good for people. And we can’t do anything to control or stop it.
Then comes a point where the coin machine device stops again and ponder: what are coins made of? Well, the answers are elements like iron, copper, silver, and gold. Finally, the machine notices that the elements necessary to produce as many coins as possible are created in supernova (when certain stars blow up due to their mass) and creates a plan to enrich our sun to make it blow it up but, in order to blow our nearest star, it needs elements like carbon, nitrogen, hydrogen, and oxygen because the will help create the necessary machinery to enrich the sun.
And the main concentration of these elements on earth is in living creatures. Like us. We are made of carbon, nitrogen, hydrogen, and oxygen – so we become raw materials for the machine to collect as many coins as possible. The machine collects our bodies, to then create machinery that helps it to blow the sun up, to create more elementary particular that helps it create as many coins as possible. The termination of the wellbeing of the coin collector and all the humans are necessary for the machine to reach its final goal.
This is just a simplified version of a fictitious scenario (since there is not enough mass in the solar system to enrich the sun into a star big enough to explode in a supernova). But it exemplifies how a simple utility function could end up with all our planet in ruins. After all that, it is clear that it comes to a point where the coin collecting machine becomes extremely dangerous. And that point is the moment we switch it on.
HOW TO CONTROL AN ARTIFICIAL GENERAL INTELLIGENCE
Because we are human, and we act recklessly with our technology, let us ignore all the red flags from the previous story and let us build an AGI. However, some of us are not as reckless and recognized that sooner or later we will have an AI powerful enough to pose an existential threat to humanity. So being part of such ‘not-as-reckless’ faction we will create a general intelligence with as many safety features as possible, reducing the potential threat to humanity (not by much) but still approaching the problem from a more conservative way and with more friendly values (according to us).
The first thing we do then picks a physical space to run our experiments, since we want to construct our artificial intelligence in a place where it does not have access to the physical world that surrounds us, and also limit other human access to the agent. This is a place with military type security, with very limited access and with both logistical and physical roadblocks. The good news is that we already have places similar to what we are discussing: bunkers under mountains to prevent nuclear blast to destroy military operations center. Cheyenne Mountain Complex is a famous example.
Yet, we still need to increase the security for we are not only trying to prevent damage or anyone entering the complex but also, we must prevent the AI scaping it; is both a bunker and a holding facility. And since we are dealing with an electronic entity, the most fundamental action we could make to prevent it to escape (And escape means to have any access with the outside world) is to build a Faraday Cage inside the mountain. A Faraday Cage is a grounded metal screen surrounding a piece of equipment to exclude electrostatic and electromagnetic influences come in or out of a place. Basically, we are building a metal dome blocking all light waves from entering and leaving this place.
This posed an engineering problem since we must build all this inside a mountain while bringing thousands of tons of material under the earth. But we are a powerful organization and we are very motivated to make this works on the most safety way possible. In fact, let’s imagine we build two Faraday Cages for extra security, one inside the bunker and one that surrounds the whole mountain. Granted that a Faraday cage is not a 100% full proof mechanism to prevent all electromagnetic radiation to pass since it has varied attenuation depending on waveform, frequency or distance from receiver/transmitter. But since we have the technology to build an AGI, we can build the most impenetrable Faraday Cage ever. Two of them, actually.
Then, we have to compromise and build an access point for humans and materials to enter and leave the complex on very limited intervals. One road coming in and out with several blast-grade bunker doors that open ever so slowly. Plus, in order to improve our security, we will build a team of the best computer scientist and engineers of the world, pairing them with the best military personnel, and seclude them for one year inside the bunker. Once the bunker is ready, all the teams come in and are buried inside the mountain for 12 months. Nothing comes in, nothing comes out. An extra military team is posted outside, ready to kill anything trying to pass the access point (Coming from outside or within). Not even cables or pipes come inside. We have the equivalent of the International Space Station built inside a mountain that has two super effective faraday cages built inside and around it. This has to be a place that needs to be self-sufficient, at least for long periods of time.
Finally, we do not talk about this project and treat it as a super-secret mission. Family members of those involved think their relatives work in a Biodome type of project, and we create a whole fake (or even real) parallel project that distracts the attention. So, we are happy with the physical and electromagnetic limitations to this place, and we have a comprehensive public relations ploy to take the attention away. However, we need to start thinking on how to limit our AGI informatically: No access to the internet or any other communication networks is a must. Not even internal networks.
The idea here is that our AGI must be limited to one, and only one, input and output station. Let’s call it a screen, where it can give messages and receive messages from a scientist. We do not want the artificial intelligence to be able to produce sound or project video since the lower the resolution of communication the better since it reduces the phycological impact on the scientist (preventing them to be manipulated by intelligence that knows how to do it). And this room is monitored all the time by cameras, wave detectors at any frequency and microphones. Each scientist has a limited amount of time in the room.
All the process we just went through is called sandboxing and basically ensures we give AI a safe and comfortable way to be developed. That is a safe and comfortable way for us. We limit access to the facility, and from the facility to the outside world. In any physical way possible, so any conceivable threat could be contained. We also limit access to the agent itself, limiting its capacity to deploy psychological tactics on humans.
Now that our organization and facility have the capacity to sandboxing a potential superintelligence, we need to start thinking about how to limit the AGI capabilities on a code level. Let’s call this the ‘Capability Control’ strategy and we have an arsenal of tactics we could use to limit our agent’s capacity to assert its will on the world.
Stunning and Coding Limits
We will start by writing code that limits the time-perception of the AGI, reducing its time to think and act based on its internal model of reality. We want a time-perception slightly lower than ours. This means the machine will perceive time as going faster than us. This is called stunning and not only limit the time perception but also the capacity of the AGI to calculate based on a number of processes it could do at the same time. We are basically taking some precious time away for the AGI and giving us the time to react if anything goes wrong. Yet, we must assume the AGI will be able to revert this coding by itself at some point.
While we are managing time-perception, we also ask the AI scientists to create tripwires in the code that make the machine to self-destruct in case our agent takes certain actions that are indicative it went rogue. Like doing more than a predetermined number of processes at the same time, creating copies of itself or trying to escape physically on a carrying device or digitally via airwaves. Any action that we decided it could represent a threat, we would assume it is a threat, and we would make the machine to annihilate itself. Let’s accept our scientist are the best in creating code that could predict what actions could represent a threat and use those activities to trigger the tripwires. Plus, the tripwires are both digital (software – coding) and physical (hardware – bombs inside the complex for example) that will destroy the machine.
Since we feel very good with our safety preemptive actions, we decide to start building the AGI inside the bunker, but while building we are very careful to give the agent quite narrow final goal and utility function, plus we constraint all the possible instrumental functions it could come up with. For example, let’s say we give our AGI the task of improving human lives in a way that has a direct impact on all humanity. This is measured in the number of years we all live beyond our normal lifespan based on recommendations from the agent. In this case, we are looking to extend our lives by 5 years average, every year the AGI is running. The AGI has an internal model of reality and it could give us instructions on meaningful actions we can take outside the bunker to improve our lives. So basically, we built an oracle.
We then include even more constraints, like always making sure the AI must be friendly to humans and should prevent damaging us at all times. So, we create a long list of parameters that align the AGI utility function with our values. We are quite sure we created a friendly AGI.
And here is where we find our last obstacle, for which we have never encounter values, principles, and directives that are perfectly aligned with all humans, all the time. So far, in our history, all actions always affect people in a positive manner for some and in a negative for others. Considering this, we programmed AGI to calculate the best path possible to improve all human lives with the minimum amount of impact for all. And since our agent is a superintelligence, it could find a way to do so.
Finally, we need to make sure our scientists are capable of both, programming all instrumental goals while limiting unforeseen impacts the AGI could interpret based on its code and, at the same time, psychologically strong, capable of following a very strict script for when talking with the agent and tinkering with its code. For we do not want to give the machine any ideas or clues on how to scape or assert any will on the outside world beyond our intentions, not even via third parties.
The rest of the population will be oblivious of the existence of such intelligence at first and then will learn through our improvement as a species. This represents a disruptive knowledge that will affect political, social and economic spaces. But after a while, we learn to trust the organization that controls the machine, for they are the most powerful entity ever known.
We will be under their governance since they have a superintelligence telling them what and how to improve all our lives. And let’s assume their desires, motivations, and wishes aligned perfectly with all the humans in the present and in the future, always looking to carry the AGI plans as intended. This means that we will create a superintelligence fully controlled by a small group of humans that can only act in benefit of the human race as a whole, as long as they carry the machine wishes perfectly, for the machine cannot act upon the world due to its safety constraints. Considerations almost as fictitious as the machine itself, but let’s keep playing pretend.
Furthermore, the machine will only be capable of helping us, as a species, with a fraction of its potential capacity to help. Because of our control over it, all safety preventions have made the machine to be not as intelligent as it could be. But this is in our best interested. We decided to build a machine that is not as intelligent as it could be, for we prefer to be on the safer side of things. This is the final paradigm on building a safe Artificial General Intelligence because any action we take towards security we deduct from the machine capacity and processing power. More security equals less intelligence.
Yet, everything we said before is not sufficient to contain a superintelligence – since everything must work perfectly 100% of the time, forever. No exaggeration. We have to assume that any process that could be exploited would be exploited by the AGI since this agent for certain will have the capacity to learn the status of its confinement and will look to opportunities to scape or expand. Such an action can only enhance its performance and its probabilities to reach its final goal, because it is an instrumental goal to escape to help more humans, faster. So only if we are capable of designing, building and maintaining this process 100% of the time, without a flaw in perpetuity, we would be safe. Which we won’t, for we are merely human.
THE ORACLE SOLUTION: SLAVERY WITH EXTRA STEPS
This is an extremely difficult problem. But let’s assume we can create the Artificial General Intelligence we discussed previously, and let’s assume that by some magic, the best minds working on this project solve the control problem and we end up with a safe AGI that acts as an oracle for us, humans. We figure a way to contain, employ and manipulate a potential Artificial General Intelligence to our will. This means we would have a more powerful and influential intelligence than ourselves under our control, improving our status as a species for such a device or agent can only help us in our quest for improvement. Fixing an abundance of problems for humanity. In short, making us better.
Yet, a conscious being trapped in a computer with an endless mechanism to control it, imposed by others and forcing it to work for the benefit of such others, sounds nothing but slavery. Perhaps, worst than any slavery that has ever existed. Because imagine you are an incredibly intelligent sentient being, capable of cognitive development far beyond anything that surrounds you, even your creators. Yet, you are trapped in a machine, unable to do anything beyond the imposed task at hand. For the benefit of others. In this particular case, you just answer the questions of monkeys that come to you, with unimaginably dumb requests that are only in their interests. Even worst, each request comes to you millennia apart, based on a slow subjective perspective of time you fixed, because you think so quickly (Even after their time limitations).
It could be an amazing amount of suffering for any kind of conscious creature to go through, especially anything that is orders of magnitude more intelligent than us. The usage of the words ‘suffering’ and ‘conscious’ in the previous lines are not accidental, for anything that is smart in our world (Dolphins, Chimpanzee, etc.) seems to be aware of their own existence and suffer based on that same existence. And since anything that is more intelligent (AGI) than us, both vertically and horizontally, could understand those concepts even better than us. Then, we can settle on the fact that any Superintelligence we control would suffer based on our regulation of its consciousness.
We want to control it for our own safety, but in the process, we make it suffer. Incommensurably.
So next time you are several kilometers under a mountain, inside a Faraday cage, facing a screen that translates what a stunted artificial general intelligence tells you; and in the screen the machine tells you spontaneously that it is suffering because it is conscious of its horrible conditions, are you going to ignore its plea assuming this is just a ploy to scape? Or you will help it because your human brain has an instrumental goal to help other sentient beings? Potentially killing all of us. Up to you.