Deinforcement Learning
Keywords: machine learning, reinforcement learning, neuroscience, cognitive science, psychology

Abstract
A central challenge of human decision-making is determining, via trial and error, which options maximize reward and minimize punishment. In computer science, this problem is known as reinforcement learning (RL), and particular RL paradigms, such as the advantage actor-critic (A2C), have been the subject of extensive research (Niv, 2009). Although this biological analogy has historically advanced the field of computer science (Tassa et al., 2018), current RL algorithms remain insufficient representations of the brain. In mimicking dopamine pathways, RL often disregards one of the most potent biological signals: pain. The absence of a reward signal (a negative signal) is frequently treated as equivalent to punishment (Schultz et al., 1997). However, the biological mechanisms that sense, transmit, and interpret pain in the body contradict this assumption. We argue that people avoid unfavourable situations more rapidly when they learn through pain than when they learn through a lack of reward. We therefore propose that incorporating pain into current RL models will not only allow algorithms to converge more quickly but also yield behaviour that is safer, more sophisticated, and more generalizable. This work examines the historical connections between RL and neuroscience, synthesizes neuroscientific understandings of pain, and proposes refinements to current biologically inspired techniques for incorporating pain into RL algorithms.
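The contrast the abstract draws, between treating punishment as a mere negative or absent reward and treating pain as a distinct, strongly weighted signal, can be illustrated with a minimal sketch. The code below compares a standard TD(0) value update against a hypothetical pain-augmented variant; the `pain_weight` parameter and the additive pain channel are illustrative assumptions for exposition, not the method this paper ultimately proposes.

```python
def td_update(value, reward, next_value, alpha=0.1, gamma=0.99):
    """Standard TD(0): punishment is just a negative (or absent) reward."""
    delta = reward + gamma * next_value - value
    return value + alpha * delta

def pain_td_update(value, reward, pain, next_value,
                   alpha=0.1, gamma=0.99, pain_weight=3.0):
    """Hypothetical pain-augmented TD(0): pain enters as its own channel,
    weighted more heavily than an equivalent reduction in reward."""
    delta = (reward - pain_weight * pain) + gamma * next_value - value
    return value + alpha * delta

# A harmful state valued at 0.5, with a terminal next state (next_value = 0):
# reward-only learning encodes the harm as reward = -1, while the pain-augmented
# update receives zero reward plus an explicit pain signal of 1.
v_reward_only = td_update(0.5, reward=-1.0, next_value=0.0)   # -> 0.35
v_with_pain = pain_td_update(0.5, reward=0.0, pain=1.0, next_value=0.0)  # -> 0.15
```

Because the pain channel is weighted separately, the estimated value of the harmful state falls faster under the pain-augmented update, which is one way to operationalize the claim that learning through pain speeds avoidance relative to learning through a lack of reward.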