How Models can Improve the Control of a System

We have learned how the brain routes (working memory) information from the neocortex through the thalamus and back to the neocortex to create the “inner voice” or “mind’s eye.” We have also seen how the working memory is used to transfer high-level information (like “the red apple is on the table”) to the prefrontal cortex. Yet the still unanswered question is how consciousness would arise from that loop. Simply claiming that information passing through the thalamus results in our subjective experience is not a real explanation. To approach this issue, let us first examine when we think someone else is conscious.

**Figure 6.22:** The robot Pepper turns its head towards faces, creating the illusion of consciousness (image source: Shutterstock).

It actually requires very little proof for us to initially think of someone or something as conscious. Just having similar capabilities of the superior colliculus (tracking a moving object with its head and eyes) can be enough for us to automatically attribute a type of “consciousness” to a person, animal, or thing. An example is the robot “Pepper” (see Figure 6.22) which (who?) is programmed to recognize faces and move its (her?) head.

Now, to be convinced that the person (or machine) in front of us is conscious, we need more than just observing the person’s (or machine’s) ability to follow an object with his (or its) eyes. In this section, we will look at how a brain (or machine) can learn and adapt to the environment instead of just reacting in very predictable ways (for example, following an object with one’s eyes).

Supervised Learning

What components are required for a machine that can learn and react to its environment?

For our examination of learning, we will look at a basic thermostat that regulates a room’s temperature. It starts heating by switching on the heating unit when the detected temperature drops below a certain limit, and switches off the heating unit once the temperature reaches the limit (see Figure 6.23). Obviously, if the room temperature drops quickly or if the connected heating unit takes a long time to warm up or cool down, this might lead to significant deviations from the desired temperature.

**Figure 6.23:** The programming of a basic thermostat that switches on if the temperature is below the desired level of warmth, and switches off once that temperature has been reached.

Figure 6.24 shows an example in which the thermostat is initially switched on (t2), heats the room, and switches off when the desired temperature V2 is reached. As the heating unit itself still emits heat after the heater is turned off, the thermostat will overshoot the target temperature. Then, it undershoots the target temperature because the heating unit needs time to fully heat up. It keeps over- and undershooting the target until the desired temperature (within the margin of error in the error band) is reached. If our brain worked like that, we would, for example, have trouble grabbing a glass of water as we would reach too far or not far enough.

**Figure 6.24:** If a controller reacts only to external data, it tends to over- and undershoot the desired value (for example, a target temperature in the case of a thermostat).
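The over- and undershooting can be reproduced with a minimal Python sketch. The room model is an assumption made for illustration and is not taken from the figure: the function name `simulate_on_off`, the coefficients, and the temperatures (in degrees Celsius) are hypothetical. The thermostat reacts only to the current measurement, while the heating unit warms up and cools down with a delay.

```python
def simulate_on_off(target=20.0, start=15.0, outside=5.0, steps=600):
    """Simulate a basic on/off thermostat and return the room temperatures."""
    room = start
    radiator = start          # temperature of the heating unit itself
    history = []
    for _ in range(steps):
        # Basic rule: heat whenever the measured temperature is below target.
        heater_on = room < target
        # Assumed thermal lag: the heating unit changes temperature gradually.
        radiator_goal = 60.0 if heater_on else room
        radiator += 0.05 * (radiator_goal - radiator)
        # The room gains heat from the heating unit and loses heat to the outside.
        room += 0.02 * (radiator - room) - 0.01 * (room - outside)
        history.append(room)
    return history


history = simulate_on_off()
print(f"highest temperature: {max(history):.2f}")
print(f"lowest temperature afterwards: {min(history[history.index(max(history)):]):.2f}")
```

Because the heating unit is still hot when the thermostat switches it off, the room keeps warming beyond the target; because it is cold when switched on again, the room first drops below the target.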

To improve the programming of the thermostat, we can add a simple feedback loop which adapts internal values when noticing that the expected measurements are too high or too low. For example, the thermostat could use two variables to switch on its heating unit earlier (when it predicts the temperature to drop) or switch off its heating unit earlier (when it predicts it will overshoot the desired temperature):

Cooldown time: This variable records the time it takes for the heating unit to cool down. If the measured room temperature ended up being too high after the heater was switched off, then the cooldown time was too short; the thermostat was switched off too late. The thermostat could increase the cooldown time a bit and perform better next time.
Heat-up time: This variable records the time it takes for the heating unit to heat up. If the temperature ended up being too low, the heat-up time was too short; the heater should have started heating earlier or switched off the heating unit later. Again, the thermostat could adapt the heat-up time.

**Figure 6.25:** A thermostat that aims to raise the room temperature to a desired temperature (DT). Its feedback loop adapts a cooldown timer and a heat-up timer after the heater is switched off, which reduces the settling time.

With the cooldown timer and heat-up timer in place, we have created a basic system that can learn over time (see Figure 6.25). The thermostat could keep the room temperature closer to the desired value (and save energy) by starting or stopping heating earlier or later. For example, if the desired temperature is 65°F, the thermostat probably has to switch off the heater some time before the room temperature reaches 65°F because the heating unit is still hot when switched off.
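As a hedged illustration of such a feedback loop, the following Python sketch adapts only the cooldown time. Everything specific in it is an assumption: the room model and its coefficients, the names `heat_room_once` and `learn_cooldown_time`, the error band of 0.5 degrees, and the rule of switching off once the room is expected to reach the target within the cooldown time. Each "episode" heats the same cold room once, and the thermostat then checks whether the outcome was too high or too low.

```python
def heat_room_once(cooldown_time, target=20.0, start=15.0, outside=5.0, steps=200):
    """Heat a cold room once and return the highest temperature reached."""
    room = radiator = start
    previous_room = room
    heater_on = True
    peak = room
    for _ in range(steps):
        rate = room - previous_room      # warming per time step
        previous_room = room
        # Switch off early if the room is expected to reach the target
        # within the learned cooldown time.
        if heater_on and room + rate * cooldown_time >= target:
            heater_on = False
        # Assumed room model: the heating unit lags behind the switch.
        radiator_goal = 60.0 if heater_on else room
        radiator += 0.05 * (radiator_goal - radiator)
        room += 0.02 * (radiator - room) - 0.01 * (room - outside)
        peak = max(peak, room)
    return peak


def learn_cooldown_time(episodes=20, target=20.0, band=0.5, step=1.0):
    """Adapt the cooldown time from the outcome of each heating episode."""
    cooldown_time = 0.0
    errors = []
    for _ in range(episodes):
        peak = heat_room_once(cooldown_time, target)
        errors.append(peak - target)
        if peak > target + band:         # too hot: switched off too late
            cooldown_time += step
        elif peak < target - band:       # too cold: switched off too early
            cooldown_time = max(0.0, cooldown_time - step)
    return cooldown_time, errors


cooldown_time, errors = learn_cooldown_time()
print(f"first overshoot: {errors[0]:.2f}, last error: {errors[-1]:.2f}")
```

Note that the thermostat has to heat the room several times before the overshoot falls inside the error band; it learns only from outcomes it has actually experienced.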

Using variables and adapting them depending on the outcome of an action can be applied to problems of any size. As it requires interaction with and feedback from the environment, this type of learning is also called supervised learning as some sort of “supervisor” is involved to decide whether or not a decision was productive. In the case of the thermostat, the supervisor was the function that checked whether or not the desired temperature has been reached (“Did the temperature in the room rise above the desired value?” and “Did the temperature in the room fall below the desired value?”, respectively). If the desired temperature was missed, the thermostat failed in its process and the heat-up or cooldown timers need to be adjusted.

Supervised learning: Using supervised learning, a brain (or computer) can improve its response to a situation with each new encounter. For example, a dog can learn to sit or roll over on command by getting positive rewards for doing so during training.

The same principle that applies to the thermostat also applies to, for example, learning to throw a ball at a target. The brain records which chain of neurons is responsible for an action. If you observe yourself missing the target, the erroneous chain of neurons is weakened and you might throw the ball differently the next time. If you hit your target, the successful chain is strengthened and you will be more likely to throw the ball the same way the next time. That means that the next time, you are less likely to miss or more likely to hit the target, respectively.

The major downside of supervised learning is that it always requires interacting with the environment and observing the outcome. The two variables represent the behavior of the thermostat (when to start or stop heating), not the behavior of the room’s air temperature. Hence, while the feedback loop of the learning thermostat we discussed above reduces the time to reach a stable temperature in a room, it requires a trial-and-error approach. As such, a system based on trial and error is like a black box into which you cannot look. You could neither open it to learn the reasoning behind each activation or deactivation of the heating unit, nor could you tell the thermostat to update its heat-up and cooldown time according to the new coordinates, room size, or window configuration.

In our brain, one of the “supervisors” is the amygdala, which has mapped sense data to positive and negative emotions, with positive emotions strengthening a chain of neurons, and negative emotions weakening it. Vice versa, the amygdala itself learns this mapping of sense data to positive and negative emotions with its own supervisor. The amygdala’s supervisor is the experience of physical pain or pleasure, a hardwired mechanism in our body. In turn, pain and pleasure evolved as each proved to be advantageous for the fitness of lifeforms to prevent damage and encourage procreation and locating food. In summary, decisions of our neocortex are supervised by the emotional mappings in our amygdala. The amygdala learns those emotional mappings with the help of the pain and pleasure mechanism of the body, which in turn evolved over generations by selection.

The neurons in the amygdala (and in other parts of the brain) learn by what is called “backpropagation,” which is comparable to a bucket brigade where items are transported by passing them from one (stationary) person to the next. This method was used to transport water before hand-pumped fire engines; today, it can be seen at disaster recovery sites where machines are not available or not usable. To encourage this behavior, everyone in the chain is later honored, not just the last person of the bucket brigade who is actually dousing the fire. In neuronal learning, the last neuron backpropagates the reward from end to start, allowing for the whole “bucket brigade” or chain of neurons to be strengthened. This encourages similar behavior in the future; it is the core of learning.
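The bucket-brigade picture can be sketched in a few lines of Python. This is a deliberately simplified illustration and not the gradient-based backpropagation used in artificial neural networks: the function `reinforce_chain`, the learning rate, and the `discount` that gives earlier links a smaller share of the reward are all assumptions.

```python
def reinforce_chain(strengths, reward, learning_rate=0.1, discount=0.9):
    """Pass a reward from the last link of a chain back to the first.

    A positive reward strengthens every link, a negative reward weakens it.
    """
    updated = list(strengths)
    credit = reward
    for index in reversed(range(len(updated))):
        updated[index] += learning_rate * credit
        credit *= discount               # earlier links receive less credit
    return updated


chain = [1.0, 1.0, 1.0]
print(reinforce_chain(chain, reward=1.0))    # every link is strengthened
print(reinforce_chain(chain, reward=-1.0))   # every link is weakened
```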

A good example of supervised learning is training a dog to sit on command. The learning cycle is as follows: First, you give the dog treats. Using the pain and pleasure mechanism, the dog’s amygdala connects the sense data (receiving a treat) with a positive emotion. She has learned that your treats are good. Next, you let her stand while you wait until she sits down. While waiting, you repeatedly tell her to sit. Of course, she does not understand what you are saying. Eventually, she will get tired and sit down. Then, you give her a treat. Her amygdala will link the command “sit,” sitting down, and receiving a treat. In time, she will feel happy about not only receiving a treat, but also sitting on command.

Turing Test

What would happen if we programmed such a system into a computer and gave it a language module to speak with us?


In the wake of the development of computer technology, Alan Turing asked this question in 1950 [Oppy and Dowe, 2019]: How could we test whether a computer is as intelligent as or indistinguishable from a human? He proposed the so-called Turing test, in which a person is put in a room with a computer terminal. Using only a textual chat interface, the person has to figure out whether or not his chat partner is a computer program or a human being. If the person interacting with the computer cannot give a clear answer (or is convinced that the computer program is human), the machine has passed the Turing test.

Turing test: In 1950, Alan Turing proposed the Turing test to assess whether or not a machine is intelligent. In the test, a human participant would observe a text chat between a computer and a human. The machine would pass if the observer could not tell who was the machine and who was the human.

What questions would you ask to determine whether you are talking with another person or a computer?

You could run through questions of a basic intelligence test but that would not answer the question of whether the computer has human-level intelligence and a sense of “self,” only that it is intelligent. Likewise, if you straightforwardly asked whether or not your chat partner has an inner experience, the computer could simply respond with a programmed response of “Yes, I am conscious of my inner experience.” With this line of questioning, it seems that all we can find out is how well-prepared or well-programmed the other side is, not whether or not the chat partner is conscious.

It seems that our intuitive understanding of consciousness depends primarily on the observer attributing consciousness to the person or entity. Given that with the Turing test, the arbiter of whether or not a computer is conscious is always the subjective opinion of a test person, this is not surprising.

Because answers to pre-determined questions could be prepared beforehand, we could test the flexibility of our conversation partner by referring to the conversation itself. For example, we could talk about our food preferences and then ask the chat partner what dish it (he?) would recommend. To answer this question, there is no general rule as each person is different. The machine (or person) would have to evaluate our preferences throughout the conversation to build an idea about what we like or dislike. This is similar to programming a chess computer: you can program the opening moves into the computer, but once the game diverges from the database of memorized positions, the computer needs to rely on actually playing chess and predicting your moves. For example, we could bring up a statement like “When you ordered me a pepperoni pizza yesterday, I told you how great it was. I lied. I actually prefer pizza with tuna.”

**Figure 6.26:** With supervised learning and backpropagation, the brain can react to new information only after making a decision and experiencing the consequences.

A computer program with supervised learning based on neural committees with backpropagation (see Figure 6.26) could not do anything with updated information during a conversation. It could only correlate two different sense data and evaluate whether or not that response was positive or negative. For example, it could take into consideration your immediate positive reaction when it had ordered you a pepperoni pizza. However, modifying its own neural committees based on specific new information (that you like tuna instead of pepperoni) is impossible as it is a black box. All it could do is to evaluate its action of ordering you a pepperoni pizza in general as negative and then (hopefully) find out through trial and error that you actually like tuna. It could not order you specifically a tuna pizza because it can only learn through action and reaction, not through abstract information. It could only decide to order you another random pizza. This of course sounds anything but intelligent. Why could the computer not just order a tuna pizza? Well, the same question could be asked about the thermostat: why could the thermostat not just understand that we moved it to another room?
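The limitation can be made concrete with a small Python sketch of a learner that improves only through action and reaction. The class `TrialAndErrorOrderer`, the menu, and the reward values are hypothetical and chosen purely for illustration. Note that the learner has no way to take in the sentence "I prefer tuna"; it can only re-evaluate the pizza it actually ordered.

```python
import random


class TrialAndErrorOrderer:
    """Learns pizza choices only from rewards that follow its own orders."""

    def __init__(self, menu, learning_rate=0.5, seed=0):
        self.values = {pizza: 0.0 for pizza in menu}
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def order(self):
        # Pick one of the pizzas with the highest learned value.
        best = max(self.values.values())
        candidates = [p for p, v in self.values.items() if v == best]
        return self.rng.choice(candidates)

    def feedback(self, pizza, reward):
        # Only the pizza that was actually ordered can be re-evaluated.
        self.values[pizza] += self.learning_rate * (reward - self.values[pizza])


orderer = TrialAndErrorOrderer(["pepperoni", "tuna", "margherita", "hawaii"])
orderer.feedback("pepperoni", +1.0)   # "It was great!"
orderer.feedback("pepperoni", -1.0)   # "I lied."
print(orderer.values)
print(orderer.order())                # a guess among the untried pizzas
```

After the negative feedback, tuna is rated exactly like every other untried pizza, so the next order is a guess.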

The challenge is that we are quick to attribute abilities to a machine that it does not have. Just because the computer could speak to us does not mean it can react to information like we do.

In summary, supervised learning alone is not enough to react flexibly within a conversation and pass the Turing test. This leads us to a different approach to learning, which does not need direct interaction with the environment, namely the so-called “unsupervised learning” method.

Unsupervised Learning

How can we gain new knowledge about the world without relying on trial and error?

The difference between supervised learning and unsupervised learning is that the former requires trial and error, while the latter just needs sense data to build a model of the world. Scientists developing automated machines have shown that to control a certain variable, “every good regulator of a system [needs to run] a model of that system” [Conant and Ashby, 1970]. For example, to keep the room at a certain temperature (the variable), the thermostat (the regulator) needs an internal model of the system (the heating unit and the room it heats—the system).

Our initial action depends on whether or not we are familiar with the involved entities. When encountering a new object, we are first inclined to go around it, touch it, or put things on it until we have formed a basic idea what that object is. As discussed in Philosophy for Heroes: Knowledge, we create the concept of, for example, a table by looking at many different tables, ending up with properties like number of legs, size, material, and shape. When deciding whether or not a table would fit into our living room, our brain relies on the spatial understanding of the particular table. For this, our brain adds the specific measurements or values of the properties of the concept, forming a model of the table in our mind.

Model: The model of an entity is a simplified simulation of that entity. It consists of the entity’s concepts and its properties, as well as some of the entity’s measurements.

We have already discussed examples of the brain building such models, for example, of the external world (object permanence), a body schema, as well as the inner world of yourself and others (theory of mind). The body schema answers the simple question “Where are my limbs?” Without a body schema, we could still do anything we can do with it, but it would be much harder. This becomes apparent when we, for example, try to manipulate something with our hands when our vision is warped. When we can observe our hands only through a mirror, we have to translate our movements consciously (left becomes right); we cannot rely on our body schema and simply grab an object.

In principle, mammals use some sort of body schema to incorporate changes in their physiology (body size, limb length, etc.) as they are growing up. Similarly, we have learned how the superior parietal lobe maintains a mental representation of the internal state of the body. While the current state is provided by the internal and external senses, it is significantly more accurate to have an internal model that is constantly updated by the input from the senses [Wolpert et al., 1998].

We see the same model-building when using tools. Our brain sees tools as temporary extensions of our limbs. To do this, our brain manages several different body schemas (models), one for each type of tool. While humans are typically able to do this, many other animals have difficulties with this task. For example, dogs often fail to incorporate things into their body schema. Let us consider a dog with a stick: when holding a stick in her mouth, she might struggle to get through openings because, in her mind, the opening is wide enough for her body, but she might fail to include in her calculation the space needed for the stick. There are two possible ways for the dog to deal with this situation. One way would be that she learns through trial and error to tilt her head when walking through an opening while holding a stick. The downside of this approach is that it might work for some openings and sticks but not for others. Whenever she faces difficulties, she would again have to rely on trial and error until she turns her head in a way so that she can pass through with the stick. A better way would be that she creates a model in her mind of herself with the stick in her three-dimensional environment. This way, she could understand why turning her head is important, and to what degree she has to turn her head depending on the dimensions of the stick and the opening through which she is moving.

Unsupervised learning: Using unsupervised learning, a brain (or computer) can build a concept by analyzing several sense perceptions, finding commonalities, and dropping measurements. For example, unsupervised learning could be used to form the concept “table” by encountering several different tables and finding out that they share properties like having a table-top, the form, material, and size of the table-top, and the number of table legs.
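The idea of finding commonalities and dropping measurements can be illustrated with a toy Python sketch. It is not a standard machine-learning algorithm; the functions `form_concept` and `build_model` and the example tables are hypothetical. The concept keeps only the properties that all observations share, and a model fills those properties with the measurements of one particular table.

```python
def form_concept(observations):
    """Return the properties shared by all observations, without their values."""
    shared = set(observations[0])
    for observation in observations[1:]:
        shared &= set(observation)
    return sorted(shared)


def build_model(concept, observation):
    """Fill the concept's properties with the measurements of one entity."""
    return {prop: observation[prop] for prop in concept}


tables = [
    {"legs": 4, "material": "oak", "top_shape": "rectangle", "top_width_cm": 160, "drawer": True},
    {"legs": 3, "material": "steel", "top_shape": "round", "top_width_cm": 90},
    {"legs": 4, "material": "glass", "top_shape": "square", "top_width_cm": 120, "wheels": True},
]

concept = form_concept(tables)           # shared properties, measurements dropped
model = build_model(concept, tables[1])  # one specific table
print(concept)
print(model)
```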

Our brain uses both supervised and unsupervised learning, one to decide upon an action, the other to model the world and make predictions. We use unsupervised learning to create a model of the world to predict what will happen, evaluate it, and feed it back to our supervised learning as “reward” or “punishment.”

Applied to the example of the thermostat, we would supply this knowledge during installation: how quickly the heating unit cools down, how long it takes to heat up the whole room, differences between summer and winter, habits of the people living there, etc. With this information, the thermostat can calculate the optimal time to switch the heating unit on or off (see Figure 6.27) without having to learn it by trial and error, significantly speeding up the process of adapting to new environments.

**Figure 6.27:** Intelligent thermostat that tries to predict the future temperature with a model in order not to overheat the room or start heating too late.
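A minimal Python sketch of such a predicting thermostat follows. It assumes, for illustration only, that the thermostat's internal model matches the real room exactly; the room model, its coefficients, and the names `step`, `predicted_peak`, and `heat_with_model` are hypothetical. Before each decision, the thermostat "imagines" switching off the heater and checks how warm the room would still become.

```python
def step(room, radiator, heater_on, outside, heater_max=60.0):
    """Advance the assumed room model by one time step."""
    radiator_goal = heater_max if heater_on else room
    radiator += 0.05 * (radiator_goal - radiator)
    room += 0.02 * (radiator - room) - 0.01 * (room - outside)
    return room, radiator


def predicted_peak(room, radiator, outside, horizon=100):
    """Imagine switching off now; return the highest temperature reached."""
    peak = room
    for _ in range(horizon):
        room, radiator = step(room, radiator, False, outside)
        peak = max(peak, room)
    return peak


def heat_with_model(target=20.0, start=15.0, outside=5.0, steps=200):
    """Heat a cold room once, using the model to decide when to switch off."""
    room = radiator = start
    heater_on = True
    peak = room
    for _ in range(steps):
        # Switch off as soon as the remaining heat suffices to reach the target.
        if heater_on and predicted_peak(room, radiator, outside) >= target:
            heater_on = False
        room, radiator = step(room, radiator, heater_on, outside)
        peak = max(peak, room)
    return peak


print(f"peak at 5 degrees outside: {heat_with_model():.2f}")
print(f"peak at -10 degrees outside: {heat_with_model(outside=-10.0):.2f}")
```

The second call changes a parameter of the model (a colder environment), and the thermostat still stays close to the target on its first attempt, without a learning phase.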

But the approach of having a model of a situation to which you can apply parameters allows more than just quicker learning. It enables you to provide reasons for your decision and explain it to others using your model. For example, in the case of the thermostat, it might suddenly start the heating unit during the day even though the sun is shining. Without additional information from the thermostat, we might start wondering if it is defective. In such a case, a smart thermostat could inform us, for example, that the window is open or that the weather service reports an upcoming snow storm. We could also tell the thermostat that we have just installed automated blinds that open in the morning (allowing the sun to heat up the room). The thermostat could take that information, update its model and start heating the room correctly (maybe with some minor adjustments) on the first day without having to spend weeks of learning the new environment.

The thermostat could even start giving us hints about how to reduce the energy bill. This requires an understanding of the underlying model, and the ability to translate it into language. Basically, this refers to the ability to teach the user a simplified model the thermostat itself is using. Ultimately, a thermostat can only be seen as being as smart as we are if it is able to program (“teach”) another thermostat. In a social setting, this would be the equivalent of explaining our own behavior to others. If you are, for example, at a lecture and all your brain is doing is to make you jump up and rush home, what would the people around you think? Instead, if you are conscious of the steps leading up to the decision (for example, remembering the electric iron you forgot to switch off, combined with a self-image of being forgetful), you can provide a reasonable explanation for your behavior.

Another example for an application of unsupervised learning is in classifying images using artificial intelligence. A computer system based on unsupervised learning still requires that it is presented with inputs, but it does not have to interact with its environment to adapt to new (but similar) conditions. It simply learns how to describe or differentiate the input according to a number of variables, not to make actual decisions. For example, if we provided the system with pictures of faces, with the unsupervised learning method, it would return variables relating to age, sex, hair color, facial expression, eye color, facial geometry, and so on. This approach can be powerful because when encountering new objects that it has not yet observed (but that fit into its schema of properties), it can derive other properties from them. For example, if it has learned how faces change with age, it can make predictions about how the face might have looked in the past or will look in the future.

In summary, it could be said that unsupervised learning allows you to imagine how things would be in a different situation. In the case of the thermostat, we could “ask” it about its heating plan if we were to move to the North Pole, the equator, the basement, or to a room with a lot of windows. With some additional programming, we could even ask about, for example, the heating costs we would save by moving to a smaller house or apartment.

Applied again to humans, it seems clear that those of our ancestors who had more in-depth insight into the inner workings of their minds were able to form better relationships. Being able to arrive at logical decisions and to communicate how they came to those decisions made them reliable members of society. Others could better predict how they might act—assuming the explanations are not created after the fact as rationalizations for their behavior.

Summary

Let us now apply what we have learned to the Turing test and the example of ordering a pizza. If the machine had an internal model of your preferences, it could recall the situation when it originally ordered you the pizza, include your new assessment of your preferences, and run another backpropagation on the original neurons that led to the decision. This way, it would lead to a higher evaluation of your preference for tuna pizza, and a lower evaluation of your preference for pepperoni pizza. And that is the power of learning using a model (see Figure 6.28) versus trial and error. Once we have conceptualized a situation and learned how to handle it, we can easily apply the knowledge to similar situations with different parameters (measurements). And in the context of a conversation, we can directly incorporate new information in our response.

**Figure 6.28:** With a model, a neural network like the brain can learn (adapt its model) without actually having to test the decision against reality.
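As a rough Python illustration of the pizza example, a machine with an explicit model of our preferences can revise that model from a statement alone. The class `PreferenceModel`, its method names, and the numeric ratings are hypothetical simplifications; a real system would update a learned network rather than a small dictionary.

```python
class PreferenceModel:
    """A tiny explicit model of what the chat partner likes."""

    def __init__(self, menu):
        self.liking = {pizza: 0.0 for pizza in menu}

    def observe_reaction(self, pizza, reward):
        # Learning from an experienced outcome (action and reaction).
        self.liking[pizza] = reward

    def told_preference(self, preferred, instead_of):
        # Revise the stored evaluation directly; no new order is needed.
        self.liking[instead_of] = min(self.liking[instead_of], 0.0)
        self.liking[preferred] = 1.0

    def recommend(self):
        return max(self.liking, key=self.liking.get)


preferences = PreferenceModel(["pepperoni", "tuna", "margherita"])
preferences.observe_reaction("pepperoni", 1.0)    # "It was great!"
print(preferences.recommend())                    # pepperoni
preferences.told_preference("tuna", instead_of="pepperoni")  # "I lied."
print(preferences.recommend())                    # tuna
```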

This concludes our examination of the elements of consciousness. It is now time to collect our findings and form a coherent theory of consciousness. So far, we have learned:

We learned how different brain parts evolved over time, and how they relate to decision-making. We still need to take a look at how each step contributes to evolutionary fitness. Our goal is to determine what use consciousness has for us.
This article led us through our evolution, comparing humans to other primate species. With what we have learned in the other parts of the book, we still need to precisely point out the differences between humans and apes when it comes to consciousness (assuming there are any).
Then, we discussed how the brain tries to predict the future, and how it builds a body schema. Similarly, the brain learns to create a theory of mind.
We found that it makes the most sense to look at consciousness from a monist perspective (materialism). This helped us to draw a diagram explaining the process of consciousness step by step.
We covered what effects various cognitive defects can have on our perception of self. We clarified the difference between a lack of awareness and blindness and discovered that it is not only our senses that are pre-processed, but also our attention. We cannot react to things we are not being made aware of by our brain, but we can learn strategies to overcome such a lack of consciousness.
We learned what the “inner voice” and “mind’s eye” are, and how these relate to the working memory and the prefrontal cortex. This also allowed us to clarify our understanding of the process that creates consciousness. The remaining question was where the subjective experience ultimately comes from.
Finally, we saw how the brain can build models of the world and in what way they are advantageous to us. We tied this in to our discussion of concepts in Philosophy for Heroes: Knowledge.

In summary, what is still missing is an explanation for the subjective experience of consciousness and a discussion of the evolution of consciousness in comparison to that of apes. In the course of this discussion, we will develop a new theory, the awareness schema, which we will discuss in the next section.

About the Author

Clemens Lode

Clemens Lode developed his passion for writing "choose your own adventure" books at age five. Soon, he turned to mechanical typewriters and, later, computers. He discovered LaTeX typesetting many years later during his computer studies, ultimately leading him to write more complex works on philosophy, science, and project management.