Thursday, July 15, 2004

The Epigenesis of Humanism (2001)

A. J. Marr


The core assumptions and values of humanistic psychology refer in principle to processes that supersede or transcend a mechanistic philosophy of behavior, yet they must ultimately be rooted in genetic events and their neural instantiation in the human mind. The properties of behavior that characterize such values are shown to be emergent properties of neurally based incentive motivational systems. It is argued that the maximization of incentive value must ultimately cohere with humanistic values, thus demonstrating that such values are the inexorable byproduct of the Darwinian exigencies of survival.

A credo of humanistic psychology, as stated by the JHP, is that humanistic psychologists encompass a "lifelong learning community that values authenticity, choice, self-determination, self and social responsibility, empathy, mutual respect, and the integration of mind, body and spirit".

But the question arises, why must this be so, and should it be representative of the core values of psychology as a science? Could this be because of a confluence of genetic tendencies selected fortuitously by evolution? Could it be because of an inferred computational mechanics that reconciles such sentiments with the inputs of experience? Is it mere prejudice, or at worst, wishful thinking? Or could it be because it arises like the epiphenomenon of consciousness from the elementary workings of the human mind? Can epigenesis, or the unpredicted side effects of the concerted workings of genetically determined structures, provide us not only with consciousness, but with art, literature, empathy, and the unexplainable and 'transcendent' qualities of human nature?

The answer may be found in the neurobiological bases of incentive motivation, the mechanisms that provide learning and the impetus to decide what to learn. In the last twenty years, bio-behavioral models of learning have been constructed that join molar (overt) and molecular (neural) behavior, and that integrate embodied or affective experience with the heretofore disembodied computational notions of behavioristic and cognitive science. This 'affective neuroscience' (Panksepp, 1998) or second-generation cognitive science (Lakoff and Johnson, 1999) adds back into the equation of learning the non-computational conscious and nonconscious pleasures and pains that are the ultimate measure of our experience. By its very nature it reconciles the subjective with the objective, the metaphorical with the literal, and the behavioral with the neural. It is, in other words, a holistic or integrated psychology that is nonetheless firmly rooted in empirical or inductive principles. But what metaphors can simply explain this momentous shift in psychological thought, one that embodies all our intentions?

Skinnerian and Popperian Machines

The description of incentive motivation or intentionality as derived from the traditions of behavioristic and cognitive science was elegantly phrased by the philosopher Daniel Dennett (1996) using the metaphor of a learning machine. If, as the metaphor holds, we are information processors, then the information we process and attain must first be selected. In its most rudimentary form, information is selected through a simple process of trial and error. The behavior of a Skinnerian machine is selected or shaped through the reinforcement of successive approximations of behavior. The individual learns successful behaviors through real experience with the natural contingencies of reinforcement. Thus a biological organism, as an information processor, learns through actual manipulation of the environment and the positive consequences or reinforcement that follow behavior.

In contrast to the Skinnerian machine, the Popperian machine (named after the philosopher Karl Popper) learns successful behaviors through virtual experience with the natural contingencies of reinforcement. In a Skinnerian sense, a mouse learns to avoid a mousetrap because of its direct experience with mousetraps. In a Popperian sense, a mouse avoids a mousetrap altogether because it can model mousetraps in its mind.
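The contrast between the two machines can be loosely sketched in code. What follows is only a toy illustration, not a model proposed by Dennett or by this essay: the two actions, their payoffs, the learning rate, and the function names are all hypothetical. The Skinnerian learner must actually suffer the trap to learn its value, while the Popperian learner consults an internal model and pays no real cost.

```python
# Hypothetical toy world: one action springs the trap, the other avoids it.
OUTCOMES = {"approach_trap": -1.0, "go_around": 1.0}


def skinnerian_learner(outcomes, trials=4, rate=0.5):
    """Learn action values only by acting and receiving the real consequence."""
    values = {action: 0.0 for action in outcomes}
    real_costs = 0  # number of times the learner was actually hurt
    actions = list(outcomes)
    for t in range(trials):
        # Every action is tried in turn: learning requires real exposure.
        action = actions[t % len(actions)]
        reward = outcomes[action]
        if reward < 0:
            real_costs += 1
        # Nudge the stored value toward the consequence just experienced.
        values[action] += rate * (reward - values[action])
    return max(values, key=values.get), real_costs


def popperian_learner(model):
    """Choose by evaluating each action against an internal model first."""
    # No real trial occurs: outcomes are read from the model, so the
    # hypothetical failures cost nothing.
    return max(model, key=model.get), 0


skinner_choice, skinner_costs = skinnerian_learner(OUTCOMES)
popper_choice, popper_costs = popperian_learner(OUTCOMES)
```

Both learners settle on the same action; under these assumptions they differ only in how many real mishaps it took to get there.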

We are essentially Skinnerian and Popperian creatures. We learn from experience and from experiences that we model in our minds. The behaviors we select in reality and in virtual reality are thus chosen because of associational or computational principles. In Skinnerian and Popperian machines, it is inferred that approach behaviors, whether real or virtual, prepare for real culminations that ultimately provide reinforcement. In other words, the reinforcers lie not in the real or virtual doing, but in the actual accomplishment of the goal. Like walking down a path to a goal, whether the journey occurs virtually or in reality, reinforcement occurs only when the goal is actually reached. However, as humanistic and social psychologists are wont to point out, reinforcement is not that simple. Indeed, we take pleasure and pain in, or are intrinsically reinforced by, events both virtual and real that are far removed from the behavior's substantive products, or 'reinforcers'. Thus we take pleasure in accomplishment, shame in our misdeeds, and pride in our private virtue, all independently of the ultimate results of acts both imaginary and real. By its very nature, reinforcement 'teaches' us which way to go, but if reinforcement is something more than the computational mechanics culminating in the acquisition of an object, then something else, and something more elementary, must be added to the equation of learning.

Teaching Signals

Learning involves teaching, or the feedback that enables our brains to select or modulate the images that are important for problem solving and, in the large, survival. With Skinnerian and Popperian machines, it is commonly inferred that an organism is 'taught' by informative or discriminative feedback from the action itself. In other words, the activity of the computational organ of our brain, the neocortex, suffices entirely for learning. But the expansive neocortex of Homo sapiens is a literal late bloomer, and the product of only the last fifty million or so years of evolution. For our mammalian cousins as well as ancestors, the lack of a substantive neocortex referred choice to more primitive neural systems and to the simple decision or teaching signals that guided the choices those systems mediated. This secondary teaching signal does not, however, reflect computational processes, but a non-computational hedonic sensitivity to abstract qualities of information.

To understand this elementary or ‘affect’ logic, it is important to understand the elementary decision rules that our ancestors had to follow to secure survival. It all had to do with surprise. In the ancient environments our ancestors had to face, survival depended upon their ability to respond to unpredicted changes in their environment. The smell of a new source of food, the sound of a new predator, the sight of a receptive female, or even the exploration of a new territory that presaged such events would require precedence over other aspects of the environment that were more predictable. Hence, our ancestors would be sensitized toward unexpected changes in their environment, a sensitivity that was represented neurally by the release of neuromodulators (neurochemicals that modulate global areas of the brain) that fix attention, improve synaptic or thinking efficiency, and have hedonic value.

From the perspective of moment-to-moment or molecular behavior, this sensitivity may be termed Pavlovian incentive salience (Berridge, 2001) or behavioral discrepancy (Donahoe and Palmer, 1993). In turn, the same behavior over larger or molar time scales may be termed a seeking response, or a universal foraging instinct (Panksepp, 1998). Discrepancy theories of reward (Hollerman and Schultz, 1998) describe a second teaching signal or source of reinforcement that is different psychologically and physiologically from neocortically situated computational or associational processes. Moreover, this secondary teaching signal may cohere with goal states as rationally conceived, or it may be positively or negatively incoherent with them.
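A discrepancy signal of this kind can be illustrated with a few lines of code. This is a minimal sketch that assumes a simple delta-rule update of the expectation; that rule is a common textbook simplification and is not taken from the works cited here, and the function name, learning rate, and reward values are hypothetical. The point is only that the signal is large when a reward is new, shrinks as the reward becomes predictable, and turns negative when an expected reward fails to arrive.

```python
def prediction_errors(rewards, rate=0.5):
    """Return the discrepancy between each reward and the running expectation."""
    expected = 0.0  # the learner starts out expecting nothing
    errors = []
    for reward in rewards:
        error = reward - expected  # surprise: outcome minus expectation
        errors.append(error)
        expected += rate * error   # the expectation moves toward the outcome
    return errors


# A repeated, identical reward becomes less and less surprising.
habituating = prediction_errors([1.0, 1.0, 1.0, 1.0])

# An expected reward that is omitted yields a negative discrepancy.
omitted = prediction_errors([1.0, 1.0, 0.0])
```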

For example, consider a Skinnerian fixed-ratio or piecework schedule of reinforcement in a factory environment. A worker may have to pull a lever on a button-making machine a fixed number of times in order to receive a weekly paycheck. But if a payment of varying size followed each pull rather than arriving on a set weekly basis, payments would come in surprising or unpredicted amounts. With his button machine morphed into a slot machine, the worker would be enthused, excited, and likely unmindful of whether his average weekly winnings would ever match his former paycheck. The reinforcement value of the continuous discrepancy would thus be negatively incoherent with his rational appraisal of the behavior that would otherwise maximize reward. In other words, the affective value of gambling would be incoherent with the rational value of a predictable routine at work.
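The worker's predicament can be put in rough numbers. The sketch below is an illustration under stated assumptions, not a measurement: the payout lists are invented, the expectation is updated with a simple delta rule, and 'surprise' is taken to be the summed absolute discrepancy. Under these assumptions the variable schedule pays less in total yet generates far more discrepancy than the fixed one.

```python
def total_surprise(payouts, rate=0.5):
    """Sum the absolute discrepancies between payouts and a running expectation."""
    expected = 0.0
    surprise = 0.0
    for payout in payouts:
        error = payout - expected
        surprise += abs(error)
        expected += rate * error
    return surprise


# Hypothetical payouts per pull of the lever.
fixed_schedule = [5, 5, 5, 5, 5, 5, 5, 5]       # the predictable routine
variable_schedule = [0, 12, 0, 0, 9, 0, 14, 1]  # the 'slot machine'

fixed_pay = sum(fixed_schedule)
variable_pay = sum(variable_schedule)
fixed_surprise = total_surprise(fixed_schedule)
variable_surprise = total_surprise(variable_schedule)
```

Here the schedule that is rationally worse (lower total pay) is the one that is affectively richer (higher total surprise), which is the negative incoherence described above.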

As another example, consider a playwright with a rather Popperian task of estimating the fruits of an otherwise simple commission. If the playwright were commissioned to write a play filled with sex and violence that meets the undiscerning needs of a popular audience, he might look past this relatively easy and predictable task to its virtual implications: the judgment of posterity, securing a girlfriend’s favor, surpassing a competitor’s talent, or impressing the Queen. These additional uncertain interdependencies are stimulating and exciting, but not ultimately necessary to pay the bills. However, they may be enough to raise an original idea for a banal and popular play about Romeo and Ethel, the Pirate’s Daughter to an inspired masterpiece called Romeo and Juliet that meets all demands, both real and virtual. And as the movie ‘Shakespeare in Love’ so demonstrated, this was indeed the case.

In both examples, the uncertainty of virtual outcomes ‘energizes’ behavior, but deflects its aims from what is logically required. Sometimes this is beneficial (as with our playwright), and sometimes it’s not (as with our gambler). But the illogic of discrepancy is also ubiquitous, and is a near constant aspect of our behavior. Thus our behavior is constantly influenced by daily distractions (e.g. checking email, idle conversations, daydreaming) that are valuable because of their affective rather than logical value. Similarly, when discrepancy is continuous and positive, as when we perceive a moment-to-moment string of positive uncertainties due to a near matching of demand and skill, the corresponding affective states may be continuous, and reported as ‘flow’ states (Csikszentmihalyi, 1990) that may nonetheless bestow value that is also incoherent with logical ends. Thus the successive uncertainties of a rock climber perched precariously on a ledge, a creative artist grasping for inspiration, or a football player driving his team down the field for a last-second score achieve a greater affective value that overshadows the fact that these situations are not logically preferable to surer and safer alternatives.

The Necessary Illogic of Virtue

As Popperian machines, we owe to a substantive neocortex the ability to plan ahead, which allows us to be taught by hypothetical events that are cognitively separable from real events. More primitive teaching signals from deeper neural structures, however, cannot make this distinction. Thus we can be as rewarded by constructing hypothetical castles in the sky as if they were in fact real. Because the discrepancies entailed by hypothetical action plans are themselves reinforcing, behavior necessarily diverges from its logical reinforcers. This divergence or incoherence is ultimately the linguistic root of all evil, and of all good. Thus, if our behavior is positively incoherent, we call such behavior virtuous, and if it is negatively incoherent, it is vice. And so a gambler has an addiction, a disease, or a moral defect, whereas a triumphant playwright has inspiration, a muse, or the spark of genius.

If it is assumed that the purpose of life, or ‘happiness’, is to maximize value, or reinforcers, then the means-ends values that ensure survival must be combined with the discrepant values that allow us to plan for the surprising exigencies of survival. Thus by modeling discrepancy, we maximize reinforcement, and as a matter of course maximize the empathy that enables us to model the minds of other people. But doing that will alter behavior from its logical course, and propel us to acts both hideous and sublime. Thus to fully and virtually prepare for the contingencies of existence is to betray them or transcend them!

Psychology as Humanism

It is in the survival interest of humanity that our ‘affect’ logic and the means-ends logic of our worlds cohere, and multiply each other in a beautiful and infinite synergy. Hence the design and purpose of any culture is to maximize not economic value alone, but also the boundless discrepant value that comes from the stimulation of thought, and not just thought about mechanical universes but the empathic modeling of the minds of men and women.

Humanism is at root an epigenetic system of values that on one level transcends evolution but on another level perfectly coheres with a Darwinian universe. On one level of thinking, we may say like a Dostoyevskian anti-hero that two plus two equals five, but on a lower or neural level, the mechanics of his thinking are as determined as the orbit of a planet about a star. Thus on one level behavior is indeterminate, and on another level it is not. Freedom and mechanism thus depend upon the perspective you take, but neither one is any less ‘real’. The great irony of human existence is that the ‘logical’ aims of our genes entail ‘illogical’ behavior. On one level (of mechanism) it makes Darwinian sense, but on a higher emergent level (of embodied consciousness) it transcends Darwin. So humanistic instincts are right because we must as a species know the value of virtual things, of the beauty and the pleasure of desire. And because we also must know that human nature, to ensure its survival, must drive us slightly, remarkably, and transcendentally mad.


Berridge, K. (2001) Reward Learning: Reinforcement, Incentives, and Expectations, The Psychology of Learning and Motivation, (3), Academic Press, New York

Csikszentmihalyi, M. (1990) Flow: The Psychology of Optimal Experience. New York: Harper Collins

Dennett, D. (1996) Kinds of Minds. New York: Basic Books

Donahoe, J. W. and D. C. Palmer (1993) Learning and Complex Behavior. Needham Heights, MA: Allyn and Bacon

Hollerman, J. R., and W. Schultz (1998) Dopamine neurons report an error in the temporal prediction of reward during learning, Nature Neuroscience, 1(4), 304-309

Lakoff, G. and M. Johnson (1999) Philosophy in the Flesh: The Embodied Mind and Its Challenge to Western Thought. New York: Basic Books

Panksepp, J. (1998) Affective Neuroscience. Oxford: Oxford University Press
