Cognitive Modeling

ABiALS 2010/11: Spatial Representations and Dynamic Interactions


Presentation Abstracts



Day 1: Development of Interactive Spatial Representations


Exploring Neural Mechanisms for Prediction
Keith L. Downing

Prediction is considered a pivotal brain function by many neuroscientists, and it may even provide a critical link between sensorimotor behavior and cognition, thus giving Artificial Intelligence researchers a focal subtask for building computational models that link motion to cognition. However, there are several notions of prediction, some procedural and others declarative, and the brain appears to house separate neural circuitry for each type, in areas such as the cerebellum, basal ganglia, neocortex and hippocampus. This presentation will compare models of several of these brain regions in an attempt to shed light on the neural basis of prediction and its implications for artificial intelligence.

Overt visual attention as predecessor of conscious perception
Peter König, Tim C. Kietzmann, Stephan Geuter

Our everyday conscious experience of the visual world is fundamentally shaped by the processes of overt visual attention and perception. Although the principal impact of both components is undisputed, it is still unclear whether and how they interact. The common view holds that the conscious percept drives visual attention, i.e. overt visual attention is guided towards crucial local features of an object only after its identity is consciously perceived (action follows perception). However, it is also conceivable that overt attentional selection precedes conscious object perception. In contrast to the first scenario, this one treats pre-consciously attended features as building blocks of the later perceptual outcome (action precedes perception). To investigate these hypotheses, we recorded eye movements and pupil dilation during free inspection of ambiguous and corresponding unambiguous stimuli. Our analyses reveal that differences in overt visual attention do indeed precede the conscious percept. As implied by the action-precedes-perception hypothesis, we demonstrate that fixations recorded prior to conscious perception can predict the later perceptual outcome, and that subjects accumulate more evidence consistent with their later percept than with the alternative. Furthermore, our data contain no evidence that pre-conscious perceptual hypotheses exert influence on overt attention. Control experiments that guided initial fixations confirm the causal influence of overt attention on perception. The current work thus clearly supports the action-precedes-perception hypothesis and emphasizes the crucial importance of unconscious overt visual attention in the formation of our conscious experience of the visual world.

Learning modular, sensorimotor-grounded encodings for goal-directed decision making and control
Martin V. Butz & Yuuya Sugita

The two fundamental sensory processing pathways in the brain are termed the dorsal "where" or "how" stream and the ventral "what" stream. As indicated by the names, the dorsal pathway processes sensory information for motor interaction, while the ventral pathway processes sensory information for identification and decision making. It is still highly unclear, however, how and why these two streams form and interact. We propose that the stream separation is necessary because the stream-specific, optimal encodings have fundamentally different structures, which would interfere with each other. The dorsal pathway requires the formation of pro-motor, goal-oriented sensory encodings to enable the rather direct activation of sensory-guided, goal-directed motor control routines. The ventral pathway, on the other hand, requires the identification of motivation-relevant items in the perceived surroundings to enable the selection of the proper interaction routines. Investigating the structural emergence of these two pathways, we survey our insights gained from several computational models. First, we illustrate how structural interference can prohibit effective learning. Next, we highlight the importance of modular, effector-relative, spatial encodings in the dorsal stream, to enable effective sensory-guided, goal-directed motor control. We also show that the representations in the ventral stream are likely to be correlated with but still strongly different from those in the dorsal stream. Finally, we propose that the distinction of these two streams may set the stage for acquiring higher-level sensorimotor-grounded and goal-oriented cognitive processes.

Dynamic Field Theory as a framework for understanding embodied cognition
Gregor Schöner

Dynamic Field Theory (DFT) is a neurally based set of concepts that has turned out to be useful for understanding how cognition emerges in embodied systems that are embedded in structured environments. The origins of DFT are in movement preparation, but the representation of space has since become a central theme in DFT. Space is used as a substrate for linking many different kinds of cognitive processes, including those subserving object perception, scene representation, sequence generation, and change detection. Prediction is inherent in how DFT frames representations. In fact, the neural representations captured by DFT are themselves really processes that generate and, in a sense, anticipate behavior. I will give a tutorial survey of DFT and its neural foundations and use implementations of DFT models in autonomous robotics to illustrate some of the functional properties that emerge from the neural dynamics of DFT.
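As a purely illustrative aside (not material from the talk), the following Python/NumPy sketch simulates the kind of one-dimensional Amari-type activation field that DFT builds on, with local excitation and surround inhibition; all parameter values and the stimulus are assumptions chosen for demonstration.

    import numpy as np

    # One-dimensional dynamic neural field (Amari-type), the core of DFT:
    #   tau * du/dt = -u + h + S(x) + sum_x' w(x - x') * sigmoid(u(x'))
    # All parameter values are illustrative assumptions.

    N = 101                                  # field sites (discretized positions)
    x = np.arange(N)
    h, tau, dt = -5.0, 10.0, 1.0             # resting level, time constant, step

    def sigmoid(u, beta=1.5):
        return 1.0 / (1.0 + np.exp(-beta * u))

    # Interaction kernel: local excitation with broader surround inhibition.
    dist = np.abs(x[:, None] - x[None, :])
    w = 1.5 * np.exp(-dist**2 / (2 * 3.0**2)) - 0.7 * np.exp(-dist**2 / (2 * 10.0**2))

    u = np.full(N, h)                        # field starts at the resting level
    S = 7.0 * np.exp(-(x - 50)**2 / (2 * 4.0**2))   # localized input at site 50

    for _ in range(300):
        u += (dt / tau) * (-u + h + S + w @ sigmoid(u))

    print(f"self-stabilized activation peak at site {np.argmax(u)}")

Under a localized input, the field relaxes to a self-stabilized activation peak, the basic decision and representation unit of DFT models.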


Prediction is the foundation of cognition
Claes von Hofsten

Adaptive behavior has to deal with the fact that events precede the feedback signals about them. In biological systems, the delays in the control pathways can be substantial. The only way to overcome this problem is to anticipate what is going to happen next and use that information to control one's behavior. Every goal-directed movement has to anticipate its sensory consequences. In addition, most events in the outside world exist independently of ourselves. Interacting with them requires us to move to specific places at specific times while being prepared to do specific things. This entails foreseeing the ongoing stream of events in the world as well as the unfolding of our own actions. We have traced the ontogenetic history of this basic truth in young infants. Very early in development, in fact at the same time as infants begin to master an action, they also carry out this action in a predictive way. Two-month-old infants track objects predictively and 4-month-old infants reach predictively for moving objects. As they come to master their own actions, infants also begin to anticipate the goals of other people's actions: they look at the goal position of an observed action before it is completed. In order to anticipate the fate of one's own and other people's actions, it is necessary to have a theory of what is going to happen next with one's own movements, with events in the world, and with other people's actions. This is one of the most important foundations of cognition.


Cognitive Representation and Learning in Motor Action
Thomas Schack, Bettina Bläsing, André Krause & Matthias Weigelt

To understand the representation of action, we investigated the cognitive architecture of human action, showing how it is organized across several levels and how it is built up. Basic Action Concepts (BACs) are identified as major building blocks on the representation level. These BACs are cognitive tools for mastering the functional demands of movement tasks. Results from different lines of research showed that not only the structure formation of mental representations in long-term memory but also chunk formation in working memory and mental rotation processes build on BACs and relate systematically to movement structures. To learn more about the structure and functioning of such action representations, we have conducted experimental studies of complex movements (for instance windsurfing, dance, and soccer) and of manual action. It is concluded that such movement representations include spatial information and might provide the basis for action control in skilled voluntary movements in the form of cognitive reference structures.
This presentation examines, in a first step, the structure and the feature (e.g. spatial) dimensions of cognitive representation in motor action. In a second step, our interest was to measure such structures in memory and use the results to develop new tools in mental training to support motor learning. The main problem of many traditional procedures is that they try to optimize performance through repeated imagination of the movement without taking the athlete's mental representation of the technique into account (i.e., they are representation-blind). The alternative developed here is to measure the mental representation of the movement before mental training and then integrate these results into the training. The success of this procedure suggests that motor learning is functionally based on the development and change of cognitive representations in memory.
In a last step, the presentation deals with functional links between cognitive psychology and robotics. A modular architecture was designed that couples multiple echo state networks (ESNs) hierarchically and in parallel. On top of that architecture, a hierarchical self-organizing map (HSOM) implements cognitive structures of basic action concepts and provides input and reference values to the ESNs. The HSOM can integrate perceptual features of the environment, proprioceptive sensory data of the robot body, and higher-level commands (intention, affordance) to select a proper motor program. Cluster structures learned in the HSOM are compared to cognitive structures in human long-term memory. Different links between cognitive motion psychology and robotics are designed to study how the development of structured representations (action templates) proceeds in human skill acquisition and how it can be applied in robotics.
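As background for readers unfamiliar with the reservoir-computing component named above, the following minimal Python sketch trains a single echo state network for one-step prediction of a toy trajectory. The task, signal, and parameters are illustrative assumptions; they do not reproduce the coupled ESN/HSOM architecture of the talk.

    import numpy as np

    rng = np.random.default_rng(0)

    # Reservoir: fixed random recurrent weights, rescaled to spectral radius
    # < 1 so the network has the "echo state" (fading-memory) property.
    n_in, n_res = 1, 200
    W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
    W = rng.uniform(-0.5, 0.5, (n_res, n_res))
    W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))

    def run_reservoir(inputs):
        """Drive the reservoir with a 1-D input sequence; collect states."""
        state, states = np.zeros(n_res), []
        for u in inputs:
            state = np.tanh(W_in @ np.atleast_1d(u) + W @ state)
            states.append(state.copy())
        return np.array(states)

    # Toy task: one-step-ahead prediction of a trajectory, standing in for
    # a joint-angle time series of a motor program.
    t = np.arange(0, 60, 0.1)
    signal = np.sin(t) * np.cos(0.3 * t)
    X, Y = run_reservoir(signal[:-1]), signal[1:]
    X, Y = X[100:], Y[100:]                  # discard the initial transient

    # Train only the linear readout (ridge regression); the reservoir is fixed.
    W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ Y)
    print("one-step RMSE:", np.sqrt(np.mean((X @ W_out - Y) ** 2)))

Because only the linear readout is trained, many such networks can be stacked or run in parallel cheaply, which is what makes the modular, hierarchical coupling described above attractive.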

Internal models in the cerebellum - coordination, learning and state estimation
Chris Miall

My research has focused largely on the role of the cerebellum as a predictor: a forward model that can anticipate the outcome of actions. This internal model of the sensorimotor system can be used to help cancel the delays in sensing feedback of ongoing movements, generating an estimate of the current state of the motor system. Without it, motor control is compromised by the delayed feedback, and cerebellar patients display characteristic errors in movement control. The forward model would be critical in coordinating simultaneous movements of multiple effectors, for example synchronous eye and hand actions. In addition, a forward model would need to be learned through experience, and this ties in well with ideas of the cerebellum as a learning machine. I will review our evidence for these predictive functions, from single-unit recording and inactivation methods in monkeys, and from fMRI and interruption of cerebellar function using TMS in humans.
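The delay-cancelling role of such a forward model can be made concrete with a small simulation. The sketch below (my illustration, with assumed dynamics, delay, and noise levels) rolls recent efference copies forward from the last, delayed observation to estimate the current state:

    import numpy as np

    rng = np.random.default_rng(1)
    dt, delay = 0.05, 10                     # feedback arrives 10 steps late

    def plant(x, u):
        """State dynamics: a leaky integrator driven by motor command u."""
        return x + dt * (-0.5 * x + u)

    x_true, log = 0.0, []                    # log of (state, command) pairs

    for k in range(200):
        u = np.sin(0.05 * k)                 # ongoing motor command (efference copy)
        log.append((x_true, u))
        x_true = plant(x_true, u) + 0.001 * rng.standard_normal()

        if k >= delay:
            # Delayed, noisy observation of the state 'delay' steps ago ...
            x_obs = log[k - delay][0] + 0.005 * rng.standard_normal()
            # ... rolled forward through the forward model with stored commands.
            x_hat = x_obs
            for _, u_past in log[k - delay:]:
                x_hat = plant(x_hat, u_past)

    print(f"true state {x_true:+.4f}  forward-model estimate {x_hat:+.4f}")
    print(f"raw delayed feedback would still report {x_obs:+.4f}")

The forward-model estimate tracks the current state, whereas the raw feedback always lags by the loop delay, which is the discrepancy held to underlie the characteristic movement errors of cerebellar patients.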

Manual Interaction for Spatial Cognition
Helge Ritter


Day 2: Development of Dynamic Spatial Interaction Routines


Cognitive representations of tool-use interactions
Cristina Massen

The talk gives an overview of studies addressing action planning and action representation in tool use. In contrast to previous research, which has mainly focused on learning and control of movement transformations that are non-transparent and not immediately comprehensible to the user, the focus is on early processes of action planning and on the representation of transparent relationships between body movements and associated tool movements. The results of a number of studies on precuing effects, sequential effects, and bimanual coordination in tool use suggest that users implement an internal representation of the required tool quite early in action planning. This representation can be described as a motor schema that contains the tool transformation (i.e. the relationship between body movements and associated tool movements) as an invariant. Studies on the observation of tool use show that this motor schema is automatically activated when tool-use actions are observed. In addition, further results suggest that the representation of transparent tool transformations can be as abstract as the representation of non-transparent ones, and thus easily generalizes to other movement-effect instances, to other tools with different mechanical properties, and to actions with a different effector.

Action control under transformed perception-action feedback
Christine Sutter

Modern technologies increasingly create workplaces in which movement execution and observation are spatially separated. Challenging workplaces in which users act via technical equipment in a distant space include, for instance, laparoscopic surgery, teleoperation, and virtual reality. When using a tool, proprioceptive/tactile feedback from the moving hand (the proximal action effect) and visual feedback from the moving cursor on a display (the distal action effect) often do not correspond or are even in conflict. This discrepancy would be a constant source of interference if proximal and distal feedback were equally important for controlling actions. Evidently, the solution of the human information processor is to favor the intended distal action effects while the proximal action effects recede into the background (Kunde et al., 2007; Massen & Prinz, 2008; Sutter & Müsseler, 2010). A series of experiments on the underlying cognitive processes and the limits of this visual predominance is presented. The main finding of these experiments is that, when transformations are in effect, awareness of one's own actions is quite low. This seems to be advantageous when using tools, as it allows for a much wider range of flexible sensorimotor adaptations and, maybe more important, it gives us the feeling of being in control. Thus, the suppression of perceiving one's own actions is an important precondition for using tools successfully. However, this mechanism seems to reach its limits when feature overlap between vision and proprioception is low and when the existence of transformations is quite obvious. Then proximal action effects come to the fore and dominate action control. In conclusion, action-effect control plays an important role in understanding the constraints on the acquisition and application of tool transformations.


Generation of Cognitive Behavior through Top-Down and Bottom-Up Interaction in Hierarchical Cortical Networks: Neuro-Robotics Experiments
Jun Tani

In this talk, I address two essential aspects of understanding the brain mechanisms for generating cognitive behavior. The first aspect concerns a generative model in which sensory-motor sequences can be predicted/generated under top-down intentions, and in which such intentions can be modified by means of bottom-up regression driven by the prediction error against the sensory reality. The second aspect concerns a generative model that self-organizes a functional hierarchy through dense interactions between the prefrontal cortex, characterized by its slower neural activity dynamics, and the posterior cortices, characterized by their faster dynamics. These two aspects are examined by reviewing some of our neuro-robotics experiments involving goal-directed action generation, mental simulation and planning, free decisions, and the delusions of control experienced by schizophrenia patients. The experimental results suggest that interactions between different levels and different modalities involving various local brain regions can lead to the generation of compositional, and yet contextual, cognitive acts.
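A drastically simplified sketch of the first aspect (my illustration, not the recurrent network model of the talk): a generative model predicts sensation from a slow intention state, and bottom-up error regression adapts that intention to the sensory reality.

    import numpy as np

    rng = np.random.default_rng(2)

    def features(t):
        # Two basis motor/sensory patterns; the slow "intention" weights them.
        return np.array([np.sin(0.8 * t), np.cos(0.3 * t)])

    def predict(intention, t):
        # Top-down generative pass: intention -> predicted sensation.
        return intention @ features(t)

    true_intention = np.array([1.5, -0.7])   # hidden cause of the sensory stream
    intention = np.zeros(2)                  # agent's initial intention
    lr = 0.05

    for t in np.arange(0.0, 50.0, 0.1):
        s = predict(true_intention, t) + 0.01 * rng.standard_normal()
        err = predict(intention, t) - s      # prediction error vs. sensory reality
        intention -= lr * err * features(t)  # bottom-up error regression

    print("inferred intention:", np.round(intention, 2), "true:", true_intention)

The same loop run with a frozen intention illustrates pure top-down generation; letting the error update the intention illustrates the bottom-up regression that the talk pairs with it.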

Learning Object-Action Complexes from Sensorimotor Experience in Humanoid Robotics
Tamim Asfour

Cognitive, situated humanoid robots that are able to learn to operate in the real world and to interact and communicate with humans must model and reflectively reason about their perceptions and actions in order to learn, act, predict, and react appropriately. Such capabilities can only be attained by embodied agents through physical interaction with and exploration of the real world, and they require the simultaneous consideration of perception and action. Representations built from such interactions are much better adapted to guiding behaviour than human-crafted rules and allow embodied agents to gradually extend their cognitive horizon.
In the first part of the talk I present the concept of Object-Action Complexes (OAC, pronounced "oak"), which has been introduced by the European project PACO-PLUS (www.paco-plus.org) as the basis for symbolic representations of sensorimotor experience. OACs emphasize the notion that objects and actions are inseparably intertwined and that categories are therefore determined (and also limited) by the actions an agent can perform and by the attributes of the world it can perceive. Entities (things) in the world of a robot (or human) only become semantically useful objects through the actions that the agent can/will perform on them. The second part of the talk presents current results toward the implementation of integrated 24/7 humanoid robots able to 1) perform complex grasping and manipulation tasks in a kitchen environment, 2) autonomously acquire object knowledge through visual and haptic exploration, and 3) learn actions from human observation. The developed capabilities are demonstrated on the humanoid robots ARMAR-IIIa and ARMAR-IIIb.
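As a purely illustrative rendering of the OAC idea in code (the names and the simplistic attribute space are my assumptions, not code from PACO-PLUS), an OAC can be thought of as an action paired with a prediction function over perceived attributes plus statistics about how reliably that prediction holds:

    from dataclasses import dataclass
    from typing import Callable, Dict

    State = Dict[str, object]                # attribute space the agent perceives

    @dataclass
    class OAC:
        """An action paired with its predicted effect and success statistics."""
        name: str
        predict: Callable[[State], State]    # expected effect on world attributes
        executions: int = 0
        successes: int = 0

        def record(self, predicted: State, observed: State) -> None:
            # Track how reliably the prediction matches the observed outcome.
            self.executions += 1
            self.successes += predicted == observed

        @property
        def reliability(self) -> float:
            return self.successes / self.executions if self.executions else 0.0

    # A "grasp" OAC: an entity becomes a graspable object through this action.
    grasp = OAC("grasp", lambda s: {**s, "in_hand": True})

    before = {"in_hand": False, "shape": "cylinder"}
    observed = {"in_hand": True, "shape": "cylinder"}   # outcome of one execution
    grasp.record(grasp.predict(before), observed)
    print(grasp.reliability)                 # 1.0 after one successful grasp

The coupling of action, predicted effect, and experience statistics in one unit is what lets sensorimotor experience be lifted into a symbolic representation.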

How internal modeling can be used to "understand" the external world
Giovanni Pezzulo

Recent evidence suggests that the brain reuses the internal models employed in motor control outside of purely control tasks. One example is the simulation of actions currently afforded by the environment (e.g., grasping, lifting, or navigation actions), or even actions untied to the current context (e.g., in imagery), without actually executing them.
Another example is the emulation of the dynamics of the external world (including actions executed by others). It has been hypothesized that reuse of internal modeling could have many functions, including learning from vicarious trial and error, enhancing perceptual processing, providing understanding of another's actions and associated goals (in terms of one's own action repertoire), memorizing, (motor) training, and performing a form of thinking that consists of the internal manipulation of action-related representations.
By presenting theoretical arguments, empirical evidence, and computational models, we discuss the epistemic role of internal modeling, or how internal models make it possible to "understand" the external world (ultimately, in order to act on it). We discuss the nature and role of the "embodied representations" that derive from internal modeling, distinguishing our view from other representational approaches in cognitive science. As a second step, we discuss how these representational abilities could have provided the foundations for advanced cognitive abilities. We elaborate on the idea that organisms did not evolve to solve cognitive problems but rather problems of motor control and sociality, and that solving these problems also leveraged cognition. In this process, prediction had a key role: it permitted the passage from the control of situated action to the control of mental activities.
A consequence of our view is that, due to their origin as control mechanisms, representational and cognitive abilities retain embodied and sensorimotor vestiges, and reuse existing sensorimotor skills and the same basic mechanisms as those used in the control of overt action and attention. This situates our view within current embodied and motor approaches to cognition.

Application Scenarios for Visual Prediction: From Motor Learning to Mental Imagery
Wolfram Schenck

The prediction of future sensory or system states is important in many areas of motor control, perception, and cognition - both for biological organisms and artificial agents like robots. We present a forward model for the prediction of future visual states in the context of saccade-like camera movements, and describe its application in various task domains: motor learning, covert attention shifts, and mental imagery. In the area of motor learning, we suggest a learning scheme for saccade adaptation in which the re-identification of target objects after a saccade is facilitated by visual prediction. Target re-identification is necessary to determine the sensory error after a camera movement and subsequently the motor error for the adaptation of the saccade controller. By visual prediction, the retinal target position after a camera movement can be directly identified without an elaborate search process, and even "mental practice" without overt camera movements is possible.
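To make the remapping concrete, the following sketch predicts the retinal position of a target after a saccade-like camera rotation. The idealized pinhole geometry is my assumption, standing in for the learned visual forward model of the study:

    import numpy as np

    def visual_forward_model(retinal_pos, pan, tilt, f=1.0):
        """Predict the post-saccade retinal position of a target.

        Assumes an idealized pinhole camera rotated by (pan, tilt) radians;
        a learned forward model would replace this analytic stand-in.
        """
        ray = np.array([retinal_pos[0], retinal_pos[1], f])   # back-project
        Ry = np.array([[np.cos(pan), 0, np.sin(pan)],
                       [0, 1, 0],
                       [-np.sin(pan), 0, np.cos(pan)]])
        Rx = np.array([[1, 0, 0],
                       [0, np.cos(tilt), -np.sin(tilt)],
                       [0, np.sin(tilt), np.cos(tilt)]])
        ray = Rx @ Ry @ ray                                   # rotate the view
        return ray[:2] * f / ray[2]                           # re-project

    # A target 0.2 image units to the right; command an approximately
    # centering saccade (arctan(0.2) ~ 0.197 rad) and predict where the
    # target will land on the retina, without a post-saccadic search.
    pred = visual_forward_model(np.array([0.2, 0.0]), pan=-0.197, tilt=0.0)
    print("predicted post-saccade retinal position:", np.round(pred, 4))

Comparing the predicted and the actually observed retinal position directly yields the sensory error needed to adapt the saccade controller, and running the prediction without moving the camera corresponds to the "mental practice" mentioned above.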
In a second study, we use the visual forward model in combination with the saccade controller and a robot arm controller to explore a system architecture for grasping of both fixated and non-fixated target objects, which is related to the premotor theory of attention. This theory states that shifts in spatial attention are accomplished by the brain preparing, without executing, saccadic motor commands to the attended position in space. In line with this theory, grasping of non-fixated objects is realized in our system architecture by generating such a covert saccade towards the target object. In addition, the object shape after this simulated movement is predicted by the visual forward model. The generated information (gaze direction towards the object, and object shape) is the same as for grasping movements to fixated objects. In this way, the same arm controller can be used for both types of grasping movements.
The third study employs visual prediction in the context of mental imagery. A robotic agent (a robot arm with gripper and a stereo-camera system) learns to simulate internally what its gripper would look like, given specific arm postures and gaze directions of the cameras. For this purpose, several internal sensorimotor models are combined. The two most important are (1) an associator between arm postures and an appearance vector encoding the image of the fixated gripper tip, and (2) the visual forward model, which can transform this image into images of the gripper from any gaze direction. We plan to use the architecture for mental imagery in the broader context of gripper-object interaction, where the mental gripper images can be used to distinguish whether the visual consequences of arm movements are related to the agent's body or to external objects. This ability is an important building block of higher-level perceptual and cognitive abilities.

End-state Comfort Speeds Up Shared Motor Tasks
Ruud Meulenbroek, Oliver Herbort, Arno Koning, & Janet van Uem

From single-subject studies it is known that people are willing to adopt awkward arm postures at the beginning of a motor task to ensure they can end up comfortably when completing the task. We report three studies that investigate the extent to which, in collaborative motor tasks, people are sensitive to the end-state comfort effect when predicting each other's hand movements. In Experiment 1 we asked participants to judge fragments of video clips of the hand of an actor grasping a dial, to verify whether the displayed end-state comfort effect was picked up. Experiment 2 involved a joint-action paradigm in which one participant was asked to undo the rotational action of another participant as quickly as possible. Cueing validity was varied to probe the temporal coordination of the successive hand rotations of the dyads. The results demonstrate that in shared motor tasks people correctly anticipate each other's movements and exploit the end-state comfort effect to speed up their joint performance. Experiment 3 involved a confederate in order to balance the number of bar-handling motions that were congruent and incongruent with the end-state comfort effect. The findings confirm that end-state comfort speeds up joint action.